Token
Definition
The basic unit of text processing in AI models. Roughly 1 token = 4 characters of English text. Used for billing and context limits.
Why It Matters
Every LLM thinks in tokens, not characters or words. Billing is per token, context windows are measured in tokens, and generation latency scales with the number of output tokens (throughput is quoted in tokens per second). Understanding what a token is explains why a "short" prompt can be expensive and why model prices cannot be compared on character counts alone.
Key Points
- GPT-4 tokeniser (cl100k_base): ~100K vocabulary using byte-pair encoding (BPE). GPT-4o uses the larger o200k_base (~200K). Most modern LLMs use BPE vocabularies of roughly 32K–256K entries.
- Rule of thumb: 100 tokens ≈ 75 words ≈ 400 characters for English prose.
- Code and numbers are token-expensive: 'print(12345)' can be 6–8 tokens, while the same semantic content in natural language might be 3.
- GPT-4o launch pricing (May 2024): $5/M input tokens, $15/M output tokens; the list price was later cut to $2.50/M input and $10/M output. Self-hosted Qwen 2.5 72B: ~$0.30/M tokens.
- Tokenisation is language-dependent: a 128K-token context holds ~96K words of English but only ~50K–60K words of Chinese or Arabic.
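The rules of thumb above can be turned into a quick estimator. This is a minimal sketch, not a real tokeniser: the constants are the approximate English ratios quoted in this entry, the function names are hypothetical, and prices are passed in as parameters because they change over time.

```python
import math

# Approximate ratios for English prose, taken from the rule of thumb above.
CHARS_PER_TOKEN = 4.0    # ~4 characters per token
WORDS_PER_TOKEN = 0.75   # 100 tokens ~ 75 words


def estimate_tokens_from_chars(text: str) -> int:
    """Rough token estimate from character count (English prose only)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(word_count: int) -> int:
    """Rough token estimate from word count (English prose only)."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_m: float,
                      output_price_per_m: float) -> float:
    """Cost in USD given per-million-token prices for input and output."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative prices only: $5/M input, $15/M output.
    print(estimate_tokens_from_words(75))              # 100
    print(estimate_cost_usd(2_000, 500, 5.0, 15.0))    # 0.0175
```

Estimates like these are only useful for budgeting; for code, numbers, or non-English text the real count from the model's own tokeniser can be much higher.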
Example
English text averages ~4 characters per token. "Token" is one token; "tokenisation" is three tokens; "日本語" is also about three tokens. A typical chat reply is 100–500 tokens; a paragraph ~50 tokens; this glossary entry is roughly 200 tokens long.
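To see why one word can be a single token while a longer word splits into several, here is a toy sketch of BPE encoding. The merge table is made up for illustration and is not taken from cl100k_base or any real tokeniser; real vocabularies contain tens of thousands of learned merges and usually operate on bytes, so actual splits may differ.

```python
# Hypothetical merge table, listed in priority order (earlier = applied first).
MERGES = [
    ("t", "o"), ("k", "e"), ("to", "ke"), ("toke", "n"),
    ("a", "t"), ("i", "o"), ("io", "n"), ("at", "ion"),
    ("i", "s"),
]


def bpe_encode(word, merges):
    """Greedy BPE: repeatedly merge the highest-priority adjacent pair."""
    ranks = {pair: i for i, pair in enumerate(merges)}
    symbols = list(word)  # start from individual characters
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no known merge applies any more
        merged, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


if __name__ == "__main__":
    print(bpe_encode("token", MERGES))         # ['token']
    print(bpe_encode("tokenisation", MERGES))  # ['token', 'is', 'ation']
```

Text the merge table has never seen falls back to single characters, which is one reason rare words, code, and under-represented languages cost more tokens.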
Common Misconception
The misconception is that a token is a fixed unit across models. In fact, token counts from different model families are not directly comparable. A '128K context window' on GPT-4o and a '128K context window' on Qwen may hold different amounts of actual text because each model uses a different tokeniser with a different vocabulary size and different merge rules.
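A small calculation makes the point concrete. This is a sketch under stated assumptions: the two characters-per-token ratios are hypothetical stand-ins for two different tokenisers, not measured values for GPT-4o or Qwen.

```python
import math


def tokens_needed(num_chars: int, chars_per_token: float) -> int:
    """Approximate tokens a document needs at a given characters-per-token ratio."""
    return math.ceil(num_chars / chars_per_token)


def fits_in_window(num_chars: int, chars_per_token: float,
                   window_tokens: int = 128_000) -> bool:
    """True if the document fits in a context window of `window_tokens`."""
    return tokens_needed(num_chars, chars_per_token) <= window_tokens


if __name__ == "__main__":
    doc_chars = 480_000  # the same document, measured in characters
    # Hypothetical ratios: tokeniser A is more efficient on this text than B.
    for name, ratio in [("tokeniser A", 4.0), ("tokeniser B", 3.2)]:
        print(name, tokens_needed(doc_chars, ratio),
              fits_in_window(doc_chars, ratio))
```

Under these assumed ratios the same document fits one 128K window and overflows the other, even though both windows carry the same nominal size.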
Related Terms
- Context Window: The maximum amount of text an AI model can process at once, measured in tokens. GPT-4o has 128K tokens.
- Inference: The process of running an AI model to generate a response. When you send a message to ChatGPT, the model performs inference.
- LLM (Large Language Model): A neural network trained on massive text datasets that can generate, understand and manipulate human language. Examples: GPT-4, Qwen, Claude.
Token on Rewind.ai
Rewind.ai prices in tokens. The token counter in chat shows the running cost in real time. For free models, usage counts against your daily 2,500 / 5,000 pool; for paid models, it counts against your purchased balance.
Quick Facts
| Term | Token |
| Related | Context Window, Inference, LLM (Large Language Model) |