Token

Definition

The basic unit of text processing in AI models. Roughly 1 token = 4 characters of English text. Used for billing and context limits.

Why It Matters

Every LLM works in tokens, not characters or words. Billing is per token, context windows are measured in tokens, and generation latency is largely determined by tokens per second. Understanding what a token is explains why a "short" prompt can be expensive and why model prices can't be compared directly on character counts.

Key Points

  • GPT-family tokeniser (cl100k_base): ~100K vocabulary built with byte-pair encoding (BPE). Most modern LLMs use BPE vocabularies of 32K–100K entries.
  • Rule of thumb: 100 tokens ≈ 75 words ≈ 400 characters for English prose.
  • Code and numbers are token-expensive: 'print(12345)' can be 6–8 tokens, while the same semantic content in natural language might be 3.
  • GPT-4o launch pricing (May 2024): $5/M input tokens, $15/M output tokens; later list prices were lower, so check current rates. Self-hosted Qwen 2.5 72B: ~$0.30/M tokens.
  • Tokenisation is language-dependent: a 128K-token context holds ~96K words of English but only ~50K–60K words of Chinese or Arabic.
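
The rules of thumb above can be turned into quick arithmetic. The following is a minimal Python sketch, not a real tokeniser: the function names are made up for illustration, and the constants simply restate the 4-characters-per-token and 75-words-per-100-tokens figures, which hold only approximately and only for English prose.

```python
# Rule-of-thumb constants taken from the key points above (English prose only).
CHARS_PER_TOKEN = 4.0    # ~4 characters per token
WORDS_PER_TOKEN = 0.75   # 100 tokens ≈ 75 words


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count; a real tokeniser will differ."""
    if not text:
        return 0
    return max(1, round(len(text) / CHARS_PER_TOKEN))


def cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost of `tokens` tokens at a price quoted in dollars per million tokens."""
    return tokens / 1_000_000 * price_per_million


def words_that_fit(context_tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Approximate number of words that fit in a context window."""
    return int(context_tokens * words_per_token)


# Example: a 2,000-token prompt and a 500-token reply at the $5/M input and
# $15/M output figures quoted above.
total = cost_usd(2_000, 5.0) + cost_usd(500, 15.0)
print(f"estimated request cost: ${total:.4f}")

# A 128K-token window under the English rule of thumb.
print(words_that_fit(128_000))  # 96000
```

For languages that tokenise less efficiently, pass a lower `words_per_token` value; the right value depends on the tokeniser and has to be measured rather than assumed.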

Example

English text averages ~4 characters per token. "Token" is one token; "tokenisation" is three tokens; "日本語" is also about three tokens. A typical chat reply is 100–500 tokens; a paragraph ~50 tokens; this glossary entry is roughly 200 tokens long.

Common Misconception

It is easy to assume that token counts are comparable across model families. They are not: a '128K context window' on GPT-4o and a '128K context window' on Qwen may hold different amounts of actual text, because each model uses its own tokeniser with a different vocabulary size and different merge rules.
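
To see why merge rules change the count, here is a toy byte-pair-encoding sketch in Python. It is an illustration under simplifying assumptions (it starts from characters rather than bytes, splits on whitespace, and trains on a tiny invented corpus), not the tokeniser used by any real model; `train_bpe`, `apply_merge` and `tokenize` are hypothetical helper names.

```python
from collections import Counter


def apply_merge(symbols: list[str], pair: tuple[str, str]) -> list[str]:
    """Replace every adjacent occurrence of `pair` with the merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def train_bpe(corpus: str, num_merges: int) -> list[tuple[str, str]]:
    """Learn up to `num_merges` merge rules by repeatedly merging the most frequent pair."""
    words = [list(w) for w in corpus.split()]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols in words:
            pairs.update(zip(symbols, symbols[1:]))
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        words = [apply_merge(symbols, best) for symbols in words]
    return merges


def tokenize(text: str, merges: list[tuple[str, str]]) -> list[str]:
    """Split text into characters, then apply the learned merges in order."""
    tokens = []
    for word in text.split():
        symbols = list(word)
        for pair in merges:
            symbols = apply_merge(symbols, pair)
        tokens.extend(symbols)
    return tokens


corpus = "token token token tokens"
small_vocab = train_bpe(corpus, num_merges=2)   # few merges, small vocabulary
large_vocab = train_bpe(corpus, num_merges=10)  # more merges, larger vocabulary

# Same text, different merge rules, different token counts.
print(len(tokenize("token", small_vocab)), len(tokenize("token", large_vocab)))
```

The same word costs more tokens under the smaller merge table, which is the effect that makes two "128K" windows hold different amounts of text.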

Related Terms

  • Context Window: The maximum amount of text an AI model can process at once, measured in tokens. GPT-4o has 128K tokens.
  • Inference: The process of running an AI model to generate a response. When you send a message to ChatGPT, the model performs inference.
  • LLM (Large Language Model): A neural network trained on massive text datasets that can generate, understand and manipulate human language. Examples: GPT-4, Qwen, Claude.

Token on Rewind.ai

Rewind.ai prices in tokens. The token counter in chat shows the running cost in real time. For free models, usage counts against your daily pool of 2,500 tokens (anonymous) or 5,000 tokens (free account); for paid models, it counts against your purchased balance.
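
As a rough illustration of how a daily pool behaves, the Python sketch below tracks usage against a fixed limit. It is a hypothetical model for explanation only and does not describe how Rewind.ai implements its counter; the `DailyPool` class and its methods are invented names, and only the 2,500-token limit comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class DailyPool:
    """Hypothetical daily token allowance; not Rewind.ai's actual implementation."""
    limit: int      # tokens available per day, e.g. 2,500 for anonymous use
    used: int = 0   # tokens consumed so far today

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def charge(self, tokens: int) -> bool:
        """Deduct `tokens` if the pool can cover them; otherwise leave it unchanged."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if tokens > self.remaining:
            return False
        self.used += tokens
        return True


pool = DailyPool(limit=2_500)
pool.charge(100)        # a small request fits
print(pool.remaining)   # 2400
pool.charge(5_000)      # too large for what is left; the pool is unchanged
print(pool.remaining)   # 2400
```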

FAQ

Token on Rewind.ai is a free AI tool. There's no charge and no sign-up needed to start.

Yes. You get 2,500 free tokens per day to use Token and every other tool on Rewind.ai. A free account raises that to 5,000 tokens/day. You can buy more starting at $1.

Token runs open-source AI models on our GPU servers. Send your request and the result comes back in seconds.

No. You can use Token right away without signing up. A free account doubles your daily usage to 5,000 tokens and saves your history.

Anonymous users get 2,500 tokens/day. Free accounts get 5,000 tokens/day. Tokens reset every 24 hours. Each generation costs ~100–5,000 tokens depending on the operation.

Your data is processed on our servers and isn't stored permanently unless you choose to save it. We don't sell or share it.

Yes. Content from Token is yours to use for personal or commercial work. The AI models we run are commercially licensed.

Token matches the quality of paid services because it runs the latest open-source AI models. The difference is you don't pay per use.

Token runs open-source AI models including Qwen 2.5, FLUX and Whisper. We update to newer models as they ship.

Yes. Token works in any mobile browser, and the layout adapts to your screen size.

Sign up for a free account to get 5,000 tokens/day, double the anonymous limit. Or buy token packs starting at $5 for 200,000 tokens. See /pricing/ for all options.

Yes. After you generate content, you can download it, copy it, or share it via a unique link. Signed-in users can also view their generation history.
