Hallucination Failure

The legal research agent summarized five case precedents. Three were real. Two were invented — complete with realistic case names, docket numbers, and plausible rulings. The attorney filed the brief. The judge was not amused.

Hallucination failure occurs when a model generates factually wrong output and presents it with full confidence. Not guessing, not hedging. A fluent, well-formatted, completely wrong answer drawn from whatever patterns exist in its weights.

This is distinct from knowledge base decay or retrieval failure (where the model has access to information but it’s outdated). Hallucination is the model filling knowledge gaps with plausible-sounding fabrications when its training data doesn’t contain the answer.

What Hallucination Looks Like in Production

Hallucinated outputs are often indistinguishable from correct ones:

Output type	Example
Fabricated citations	Real author, plausible title, wrong year, wrong journal
Invented case law	Real-sounding case names, docket numbers that don’t exist
Wrong technical specs	Product model numbers, API endpoints, version numbers that are close but incorrect
False statistics	Made-up percentages that sound reasonable
Fictional companies	Real-sounding business names, addresses, contact info

The pattern: the model “knows” the general shape of the answer and fills in specific details with high-confidence confabulation.

Hallucination vs Knowledge Base Decay

These are often conflated. They have different causes and different fixes.

Aspect	Hallucination	Knowledge Base Decay
Cause	Model generates from training weights	Retrieval system returns stale data
Affected by RAG	Partially (model still hallucinates on details)	Directly (fixing KB fixes the output)
Fix	HITL review, output constraints, attribution requirements	Update the knowledge base
When it happens	Especially on specific facts, names, numbers	On facts that changed after the KB was last updated

RAG reduces hallucination by giving the model grounding documents to cite. It doesn’t eliminate it. A model with a retrieval system can still:

Hallucinate details about events in the retrieved documents
Cite the document but misquote it
Blend retrieved facts with invented elaborations
Hallucinate when the retrieved documents don’t contain a specific detail the user asked about

The Three High-Risk Domains

1. Legal and Medical Information

Models are trained on general legal and medical text. They generate plausible-sounding precedents and diagnoses. The stakes are high — a hallucinated drug dosage or a fabricated court ruling has direct harm potential.

Hallucination risk is highest when the correct answer is specific and verifiable: exact case names, specific statutes, precise dosing protocols. These require exact recall, which language models are structurally bad at.

2. Product and Technical Information

Models hallucinate API endpoints, model numbers, software version numbers, and technical specifications. “Does Product X support Feature Y?” — if the model’s training data contains general discussion of Product X but not the specific version, it will synthesize a confident yes or no based on what would be plausible.

Customer-facing product documentation agents are high-risk for this pattern. A user asks about a specific feature of your SaaS product. If the training data doesn’t include your product (it usually doesn’t), the model generates a plausible response based on similar products it has seen.

3. Financial and Market Information

Models fill in specific financial figures, company valuations, market statistics, and deal terms with plausible-sounding numbers. For financial analysis tasks, every specific figure needs source verification — “roughly $5B” from an LLM with no citation is not a usable data point.
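One way to enforce "every specific figure needs source verification" is to scan generated text for numbers that appear without a citation. The sketch below is illustrative only: the bracketed tag format (`[doc-3]`) and the function name `uncited_figures` are assumptions, and the pattern is deliberately broad, so it also flags years and other harmless numbers.

```python
import re

# Assumed convention: a citation is a bracketed tag such as [doc-3]
# appearing in the same sentence as the figure it supports.
FIGURE = re.compile(
    r"\$?\d+(?:,\d{3})*(?:\.\d+)?\s?(?:%|[BMK]\b|billion\b|million\b)?"
)
CITATION = re.compile(r"\[[\w-]+\]")


def uncited_figures(text: str) -> list[str]:
    """Return every numeric figure found in a sentence that has no citation tag."""
    flagged = []
    # Naive sentence split on terminal punctuation; adequate for a sketch.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if CITATION.search(sentence):
            continue
        flagged.extend(m.group().strip() for m in FIGURE.finditer(sentence))
    return flagged


draft = "Acme was valued at roughly $5B in 2021. Revenue grew 12% [doc-3]."
print(uncited_figures(draft))
```

Anything this returns is a candidate for source verification, not proof of hallucination. A cited figure can still be wrong, so this check complements rather than replaces checking the citation itself.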

Detection Approaches

Citation Grounding

Require the model to cite a specific source for every specific claim. If it can’t cite it, it’s likely hallucinating. A useful system prompt instruction: “Cite the exact passage from the provided documents that supports this statement.”

This works well with RAG systems — you can automatically verify that the citation exists in the retrieved documents. A claim with no matching citation is flagged for review.
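The automatic verification step can be as simple as checking that each quoted passage really occurs in the document it cites. The following is a minimal sketch, assuming the model has been prompted to return claims as dictionaries with `statement`, `doc_id`, and `quote` keys; that schema and the function names are hypothetical, not part of any particular framework.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def verify_citations(claims: list[dict], documents: dict[str, str]) -> list[dict]:
    """Return the claims whose quoted passage is missing from the cited document.

    Each claim is assumed to look like:
        {"statement": "...", "doc_id": "policy-7", "quote": "exact passage"}
    """
    flagged = []
    for claim in claims:
        source = documents.get(claim.get("doc_id", ""), "")
        quote = normalize(claim.get("quote", ""))
        # An empty quote, an unknown document, or a quote that is not found
        # verbatim all count as unsupported.
        if not quote or quote not in normalize(source):
            flagged.append(claim)
    return flagged
```

Exact substring matching is strict on purpose: a paraphrased "quote" gets flagged too, which is usually what you want when the instruction was to cite the exact passage.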

Factuality Scoring (LLM-as-Judge)

Use a secondary model to evaluate the factual accuracy of outputs. The judge model is given the original context (retrieved documents, known facts) and asked to identify any claims that aren’t supported by the context.

This is the hallucination-specific version of the LLM-as-judge approach in the drift monitoring playbook. With a tool like Langfuse, this can run automatically on production outputs.
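A judge pass can be sketched without committing to any vendor. In the example below, `call_model` is a placeholder for whatever client you use to reach the secondary model; the prompt wording and the JSON reply shape are assumptions for illustration, not a Langfuse API.

```python
import json
from typing import Callable

# Assumed prompt and reply format; adjust both to the judge model you use.
JUDGE_PROMPT = """You are a fact-checking judge.
Context:
{context}

Answer to evaluate:
{answer}

List every claim in the answer that the context does not support.
Reply with JSON only: {{"unsupported_claims": ["..."]}}"""

PARSE_FAILURE = "<judge reply could not be parsed; route to human review>"


def judge_answer(answer: str, context: str, call_model: Callable[[str], str]) -> list[str]:
    """Ask a secondary model for unsupported claims; fail closed on bad replies."""
    reply = call_model(JUDGE_PROMPT.format(context=context, answer=answer))
    try:
        claims = json.loads(reply)["unsupported_claims"]
        if not isinstance(claims, list):
            raise TypeError("unsupported_claims must be a list")
        return [str(c) for c in claims]
    except (json.JSONDecodeError, KeyError, TypeError):
        # An unreadable verdict is treated as a failed check, not a pass.
        return [PARSE_FAILURE]
```

An empty list means the judge found nothing unsupported. The judge is itself a language model and can be wrong, so its verdict is best treated as a signal for triage rather than a guarantee.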

Human-in-the-Loop for High-Stakes Outputs

For applications where hallucination has real consequences (legal, medical, financial, customer-facing), require human review before outputs reach users. The agent drafts. A human approves.

This doesn’t scale to high volume but is the right default before you’ve built the automated detection layer.
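The "agent drafts, human approves" rule can be expressed as a small gate in front of delivery. This is a sketch under assumptions: the `Draft` and `ReviewGate` names, the domain labels, and the in-memory queue are all hypothetical stand-ins for whatever ticketing or approval tool you actually use.

```python
from dataclasses import dataclass, field

# Domains the text above names as requiring human review.
HIGH_STAKES = {"legal", "medical", "financial", "customer-facing"}


@dataclass
class Draft:
    text: str
    domain: str
    flags: list[str] = field(default_factory=list)  # issues raised by automated checks


@dataclass
class ReviewGate:
    pending: list[Draft] = field(default_factory=list)

    def submit(self, draft: Draft) -> str:
        """Hold high-stakes or flagged drafts for a human; release the rest."""
        if draft.domain in HIGH_STAKES or draft.flags:
            self.pending.append(draft)
            return "held_for_review"
        return "released"

    def approve(self, draft: Draft) -> str:
        """Called once a human reviewer signs off on a held draft."""
        self.pending.remove(draft)
        return "released"
```

Routing on both domain and flags means the same gate keeps working as automated checks come online: fewer low-risk drafts need a person, while high-stakes domains stay gated regardless.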

What RAG Can and Can’t Do

RAG is the most common recommendation for reducing hallucination. The reduction is real. A model with a well-maintained knowledge base of your company’s policies, products, and procedures hallucinates significantly less on topics covered by that knowledge base.

The limits:

Coverage gaps. The model hallucinates when the answer isn’t in the retrieved documents. If the user asks about something not in your KB, the model still has to generate a response — and hallucination risk is back.
Detail fabrication. The model retrieves the right document but hallucinates specific details within it (exact numbers, names, dates).
Multi-hop reasoning. For complex questions requiring synthesis across multiple documents, the model can hallucinate the synthesis even when the individual facts are correctly retrieved.
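Of the three limits, coverage gaps are the easiest to backstop in code: if nothing relevant was retrieved, return a fixed refusal instead of generating. The sketch below assumes the retriever returns `(passage, score)` pairs with higher scores meaning more relevant; the `0.75` cutoff, the `generate` callable, and the fallback wording are illustrative assumptions.

```python
from typing import Callable

FALLBACK = "I can't find that in the knowledge base, so I won't guess."


def answer_or_abstain(
    question: str,
    hits: list[tuple[str, float]],
    generate: Callable[[str, list[str]], str],
    min_score: float = 0.75,  # assumed cutoff; tune against your own retriever
) -> str:
    """Generate only when at least one retrieved passage clears the relevance cutoff."""
    supported = [text for text, score in hits if score >= min_score]
    if not supported:
        # Coverage gap: refuse rather than let the model improvise an answer.
        return FALLBACK
    return generate(question, supported)
```

This addresses only the first limit. Detail fabrication and multi-hop errors happen after relevant documents have been retrieved, so they still need citation checks, a judge pass, or human review.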

A knowledge base covering your specific domain meaningfully reduces hallucination — but doesn’t eliminate it. Build in the backstops.

The Solo Implementer Default

For an SMB deploying AI agents over internal knowledge bases: use RAG, require citations for specific claims, and never ship AI-generated specific facts (numbers, names, dates) to external-facing documents without a human check. For high-stakes domains — legal, medical, financial — human review before the output ships is mandatory.

A hallucinated fact that reaches a customer or appears in a legal document costs more to fix than the time saved by skipping review.

Related

RAG — Reduces hallucination by providing grounding documents
LLM Drift — Hallucination rates can increase as models drift
Human-in-the-Loop — The backstop when automated detection isn’t enough
Knowledge Base Decay — The retrieval failure that looks like hallucination
Silent Agent Failure — Hallucination often surfaces as silent wrong answers
Operations & Maintenance — Where hallucination monitoring lives

WyrdWerk Deployment Wiki
