The legal research agent summarized five case precedents. Three were real. Two were invented — complete with realistic case names, docket numbers, and plausible rulings. The attorney filed the brief. The judge was not amused.
Hallucination failure is when a model generates factually wrong outputs it presents with full confidence. Not guessing, not hedging. A fluent, well-formatted, completely wrong answer drawn from whatever patterns exist in its training weights.
This is distinct from knowledge base decay, a retrieval failure where the model has access to information but that information is outdated. Hallucination is the model filling knowledge gaps with plausible-sounding fabrications when its training data doesn’t contain the answer.
What Hallucination Looks Like in Production
Hallucinated outputs are often indistinguishable from correct ones:
| Output type | Example |
|---|---|
| Fabricated citations | Real author, plausible title, wrong year, wrong journal |
| Invented case law | Real-sounding case names, docket numbers that don’t exist |
| Wrong technical specs | Product model numbers, API endpoints, version numbers that are close but incorrect |
| False statistics | Made-up percentages that sound reasonable |
| Fictional companies | Real-sounding business names, addresses, contact info |
The pattern: the model “knows” the general shape of the answer and fills in specific details with high-confidence confabulation.
Hallucination vs Knowledge Base Decay
These are often conflated. They have different causes and different fixes.
| | Hallucination | Knowledge Base Decay |
|---|---|---|
| Cause | Model generates from training weights | Retrieval system returns stale data |
| Affected by RAG | Partially (model still hallucinates on details) | Directly (fixing KB fixes the output) |
| Fix | HITL review, output constraints, attribution requirements | Update the knowledge base |
| When it happens | Especially on specific facts, names, numbers | On facts that changed after the KB was last updated |
RAG reduces hallucination by giving the model grounding documents to cite. It doesn’t eliminate it. A model with a retrieval system can still:
- Hallucinate details about events in the retrieved documents
- Cite the document but misquote it
- Blend retrieved facts with invented elaborations
- Hallucinate when the retrieved documents don’t contain a specific detail the user asked about
The Three High-Risk Domains
1. Legal and Medical Information
Models are trained on general legal and medical text. They generate plausible-sounding precedents and diagnoses. The stakes are high — a hallucinated drug dosage or a fabricated court ruling has direct harm potential.
Hallucination rates are highest when the correct answer is specific and verifiable: exact case names, specific statutes, precise dosing protocols. These require exact recall, which language models are structurally bad at.
2. Product and Technical Information
Models hallucinate API endpoints, model numbers, software version numbers, and technical specifications. “Does Product X support Feature Y?” — if the model’s training data contains general discussion of Product X but not the specific version, it will synthesize a confident yes or no based on what would be plausible.
Customer-facing product documentation agents are high-risk for this pattern. A user asks about a specific feature of your SaaS product. If the training data doesn’t include your product (it usually doesn’t), the model generates a plausible response based on similar products it has seen.
3. Financial and Market Information
Models fill in specific financial figures, company valuations, market statistics, and deal terms with plausible-sounding numbers. For financial analysis tasks, every specific figure needs source verification — “roughly $5B” from an LLM with no citation is not a usable data point.
Detection Approaches
Citation Grounding
Require the model to cite a specific source for every specific claim. If it can’t, treat the claim as unverified and a likely hallucination. A useful system prompt instruction: “Cite the exact passage from the provided documents that supports this statement.”
This works well with RAG systems — you can automatically verify that the citation exists in the retrieved documents. A claim with no matching citation is flagged for review.
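The automatic check described above can be sketched in a few lines. This is a minimal sketch, assuming the model has been prompted to return each claim together with a verbatim supporting quote. The `statement` and `quote` keys and both function names are hypothetical and do not come from any particular framework.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences don't block a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_unsupported_claims(claims, documents):
    """Return the claims whose quoted passage appears in none of the retrieved documents.

    claims: list of dicts with the hypothetical keys "statement" and "quote".
    documents: list of retrieved document strings.
    """
    normalized_docs = [normalize(doc) for doc in documents]
    flagged = []
    for claim in claims:
        quote = normalize(claim.get("quote", ""))
        # An empty quote counts as unsupported: the model offered no evidence.
        if not quote or not any(quote in doc for doc in normalized_docs):
            flagged.append(claim)
    return flagged
```

Substring matching is deliberately strict. A paraphrased or slightly altered quote gets flagged, which sends misquotes to review rather than letting them through.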
Factuality Scoring (LLM-as-Judge)
Use a secondary model to evaluate the factual accuracy of outputs. The judge model is given the original context (retrieved documents, known facts) and asked to identify any claims that aren’t supported by the context.
This is the hallucination-specific version of the LLM-as-judge approach in the drift monitoring playbook. With a tool like Langfuse, this can run automatically on production outputs.
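As a rough illustration of the judge step, the sketch below builds a prompt from the context and the answer, then parses the verdict. It assumes the judge model can be asked to reply in JSON. `call_judge` is a hypothetical placeholder for whatever client you use to call the secondary model, and the prompt wording and the `unsupported_claims` key are illustrative choices, not a Langfuse API.

```python
import json
from typing import Callable, List

# Illustrative prompt; doubled braces are literal braces after str.format().
JUDGE_PROMPT = """You are checking an answer against source context.
List every claim in the ANSWER that the CONTEXT does not support.
Respond with JSON only: {{"unsupported_claims": ["..."]}}

CONTEXT:
{context}

ANSWER:
{answer}
"""

PARSE_FAILURE = "<judge output could not be parsed>"


def judge_answer(answer: str, context: str, call_judge: Callable[[str], str]) -> List[str]:
    """Ask a secondary model which claims lack support; return them as a list of strings."""
    prompt = JUDGE_PROMPT.format(context=context, answer=answer)
    raw = call_judge(prompt)  # hypothetical: sends the prompt, returns the reply text
    try:
        claims = json.loads(raw)["unsupported_claims"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # An unreadable verdict is treated as a finding, so the output still gets reviewed.
        return [PARSE_FAILURE]
    if not isinstance(claims, list):
        return [PARSE_FAILURE]
    return [str(claim) for claim in claims]
```

An empty list means the judge found nothing unsupported. The judge is itself a language model and can be wrong, so its verdict is a signal for triage rather than proof of accuracy.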
Human-in-the-Loop for High-Stakes Outputs
For applications where hallucination has real consequences (legal, medical, financial, customer-facing), require human review before outputs reach users. The agent drafts. A human approves.
This doesn’t scale to high volume but is the right default before you’ve built the automated detection layer.
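The "agent drafts, human approves" rule can be expressed as a small routing gate. The sketch below is one possible shape, assuming each draft is tagged with a domain and with any claims that automated checks could not support. The `Draft` type, the domain labels, and the queue names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

# Domains where a human must approve before anything reaches a user (labels are illustrative).
HIGH_STAKES_DOMAINS = {"legal", "medical", "financial", "customer-facing"}


@dataclass
class Draft:
    text: str
    domain: str
    unsupported_claims: List[str] = field(default_factory=list)


def needs_human_review(draft: Draft) -> bool:
    """High-stakes domains always go to review; other drafts only if a check flagged something."""
    return draft.domain in HIGH_STAKES_DOMAINS or bool(draft.unsupported_claims)


def route(draft: Draft) -> str:
    """Return the name of the (hypothetical) queue the draft should be placed on."""
    return "review_queue" if needs_human_review(draft) else "send"
```

With no automated detection in place yet, `unsupported_claims` stays empty and routing depends only on the domain, which matches the default described above.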
What RAG Can and Can’t Do
RAG is the most common recommendation for reducing hallucination. The reduction is real. A model with a well-maintained knowledge base of your company’s policies, products, and procedures hallucinates significantly less on topics covered by that knowledge base.
The limits:
- Coverage gaps. The model hallucinates when the answer isn’t in the retrieved documents. If the user asks about something not in your KB, the model still has to generate a response — and hallucination risk is back.
- Detail fabrication. The model retrieves the right document but hallucinates specific details within it (exact numbers, names, dates).
- Multi-hop reasoning. For complex questions requiring synthesis across multiple documents, the model can hallucinate the synthesis even when the individual facts are correctly retrieved.
A knowledge base covering your specific domain meaningfully reduces hallucination — but doesn’t eliminate it. Build in the backstops.
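One possible backstop for the coverage-gap case is to abstain when retrieval returns nothing relevant enough, instead of letting the model answer from its weights. The sketch below assumes the retriever returns `(text, score)` pairs with higher scores meaning more relevant. The `0.75` threshold, the fallback wording, and the `generate` callable are hypothetical and would need tuning against your own data.

```python
from typing import Callable, List, Tuple

FALLBACK = "I don't have a documented answer for that. Routing to a human."


def select_grounding(chunks: List[Tuple[str, float]], min_score: float = 0.75) -> List[str]:
    """Keep only the retrieved chunks whose relevance score meets the threshold."""
    return [text for text, score in chunks if score >= min_score]


def answer_or_abstain(
    question: str,
    chunks: List[Tuple[str, float]],
    generate: Callable[[str, List[str]], str],
    min_score: float = 0.75,
) -> str:
    """Generate only when there is grounding; otherwise return a fixed fallback message."""
    grounding = select_grounding(chunks, min_score)
    if not grounding:
        # Coverage gap: nothing relevant was retrieved, so do not let the model improvise.
        return FALLBACK
    return generate(question, grounding)
```

This addresses only the first limit. Detail fabrication and faulty multi-hop synthesis can still occur when grounding is present, so citation checks and review remain necessary.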
The Solo Implementer Default
For an SMB deploying AI agents over internal knowledge bases: use RAG, require citations for specific claims, and never ship AI-generated specific facts (numbers, names, dates) to external-facing documents without a human check. For high-stakes domains — legal, medical, financial — human review before the output ships is mandatory.
A hallucinated fact that reaches a customer or appears in a legal document costs more to fix than the time saved by skipping review.
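A lightweight way to enforce the "no unchecked specific facts in external documents" rule is to scan outbound text for numeric specifics and hold anything that contains them. The sketch below is a rough heuristic under stated assumptions: the regular expression only catches numbers, currency amounts, and percentages, and the function names are hypothetical. Names and written-out dates need a different approach, such as a named-entity tagger, which is not shown here.

```python
import re

# Matches integers, comma-grouped numbers, and decimals, with an optional
# leading "$" and an optional trailing "%". It over-flags on purpose.
NUMBER_PATTERN = re.compile(r"\$?\d+(?:,\d{3})*(?:\.\d+)?%?")


def find_specific_facts(text: str) -> list:
    """Return every numeric token in the text, in order of appearance."""
    return NUMBER_PATTERN.findall(text)


def requires_human_check(text: str, external_facing: bool) -> bool:
    """External-facing text containing any numeric specifics is held for a human check."""
    return external_facing and bool(find_specific_facts(text))
```

For example, `find_specific_facts("Revenue was roughly $5B in 2023, up 12.5%.")` returns `["$5", "2023", "12.5%"]`, so that sentence would be held if it were headed for an external document.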
Related
- RAG — Reduces hallucination by providing grounding documents
- LLM Drift — Hallucination rates can increase as models drift
- Human-in-the-Loop — The backstop when automated detection isn’t enough
- Knowledge Base Decay — The retrieval failure that looks like hallucination
- Silent Agent Failure — Hallucination often surfaces as silent wrong answers
- Operations & Maintenance — Where hallucination monitoring lives