Hallucination
Definition
When an AI model generates false or fabricated information that sounds confident and plausible.
Why It Matters
LLMs are trained to predict plausible-sounding text, not to refuse when they don't know something. When the answer isn't in their training data, they often invent specific-but-wrong details: fake citations, made-up library functions, invented court cases. Knowing this is the difference between using LLMs productively and getting embarrassed.
Key Points
- Three types: intrinsic hallucination (output contradicts the provided source), extrinsic (fabricates information not in the source), and faithful (correct reasoning but wrong premises).
- Hallucination rates vary widely: GPT-4o hallucinates citations roughly 5–15% of the time; smaller or older models, 20–40%.
- Mitigation hierarchy, roughly in order of effectiveness: RAG with cited sources > output structure enforcement > lower temperature > larger model.
- Chain-of-thought prompting reduces some hallucinations by forcing explicit reasoning before assertion, which makes the model more likely to surface its uncertainty.
- Factual domains with sparse training data (niche historical events, unpublished research, private company details) have systematically higher hallucination rates.
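The top two items in the mitigation hierarchy above (RAG with cited sources, then output structure enforcement) can be sketched together. This is a minimal illustration, not a reference implementation: the function names `build_grounded_prompt` and `check_citations`, the bracketed `[S1]` citation format, and the prompt wording are all assumptions made for the example, and the retrieval step itself is assumed to have already produced the passages.

```python
import re

# Hypothetical citation format: a source ID in square brackets, e.g. [S1].
CITATION = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def build_grounded_prompt(question: str, passages: dict[str, str]) -> str:
    """Assemble a prompt that restricts the model to the supplied passages."""
    sources = "\n".join(f"[{sid}] {text}" for sid, text in passages.items())
    return (
        "Answer using only the sources below. Cite a source ID in square "
        "brackets after every claim. If the sources do not contain the "
        "answer, reply exactly: I don't know.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


def check_citations(answer: str, passages: dict[str, str]) -> dict:
    """Flag answers that cite nothing or cite IDs that were never supplied."""
    cited = set(CITATION.findall(answer))
    unknown = cited - set(passages)
    return {
        "cited": sorted(cited),
        "unknown": sorted(unknown),
        "ok": bool(cited) and not unknown,
    }


# Illustrative passages, standing in for whatever a retriever returned.
passages = {
    "S1": "Canberra is the capital city of Australia.",
    "S2": "Sydney is the most populous city in Australia.",
}
prompt = build_grounded_prompt("What is the capital of Australia?", passages)
print(check_citations("The capital is Canberra [S1].", passages))
print(check_citations("The capital was moved in 1950 [S7].", passages))
```

The check only confirms that each cited ID was actually supplied; it does not confirm that the cited passage supports the claim, and an uncited "I don't know" reply would need to be handled separately by the caller.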
Example
Ask an LLM for the URL of a paper that doesn't exist and it will often fabricate a plausible-looking arXiv ID. Ask for a quote from a real book and it will sometimes invent a passage in the right author's style. Retrieval (RAG) is the main mitigation.
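When the source text is available, an invented quote like the one described above can be caught mechanically. The sketch below assumes you already hold the book's text as a string; `normalize` and `quote_in_source` are hypothetical helper names, and the matching is deliberately simple (case, punctuation, and spacing are ignored, but wording must otherwise match exactly).

```python
import re


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def quote_in_source(quote: str, source_text: str) -> bool:
    """Return True only if the quote appears word-for-word in the source."""
    q = normalize(quote)
    if not q:
        return False  # an empty quote verifies nothing
    # Pad with spaces so matches start and end on word boundaries.
    return f" {q} " in f" {normalize(source_text)} "


source = "It was the best of times, it was the worst of times."
print(quote_in_source("it was the worst of times", source))
print(quote_in_source("it was the happiest of times", source))
```

A check like this cannot help with paraphrases or with sources you do not have on hand, which is why any specific quote still needs verifying against the original.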
Common Misconception
Low temperature does not prevent hallucination. It makes the output more deterministic, but if the model's highest-confidence answer is factually wrong, the model will output that wrong answer confidently every single time. The only structural fix is grounding responses in retrieved source documents with citations.
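The point about low temperature can be seen in a small numerical sketch of temperature-scaled softmax, the standard way sampling temperature is applied to next-token scores. The logits below are invented for illustration and are not taken from any real model; they simply encode a case where the wrong answer has the highest score.

```python
import math


def softmax_with_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical scores for completing "The capital of Australia is ...",
# where the model wrongly prefers "Sydney".
logits = {"Sydney": 2.0, "Canberra": 1.5, "Melbourne": 0.5}

for temperature in (1.0, 0.2):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, {tok: round(p, 3) for tok, p in probs.items()})
```

With these made-up scores, lowering the temperature from 1.0 to 0.2 raises the probability of the wrong answer from roughly 0.55 to roughly 0.92. The ranking of the candidates never changes, so lowering the temperature only makes the model more consistent about whichever answer it already favored.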
Related Terms
- LLM (Large Language Model): A neural network trained on massive text datasets that can generate, understand, and manipulate human language. Examples: GPT-4, Qwen, Claude.
- RAG (Retrieval-Augmented Generation): A technique where AI retrieves relevant documents before generating a response, improving accuracy.
- Temperature: A parameter that controls AI output randomness. Low temperature = more focused. High temperature = more creative.
Hallucination on Rewind.ai
Rewind.ai's search tool grounds responses in cited sources. For raw chat, lower the temperature (0.2–0.3) when accuracy matters more than creativity, and verify any specific claim before quoting it.
Quick Facts
| Term | Hallucination |
| Related | LLM (Large Language Model), RAG (Retrieval-Augmented Generation), Temperature |