AI Hallucinations: Why AI Makes Things Up and How to Catch It

You'll learn: why AI makes things up, where it's still most likely to, and the simple defenses that keep you from getting burned.
You'll walk away with: the Hallucination Defense Kit, which covers the red-flag zones plus the techniques that prevent and catch hallucinations.
Level: Beginner · Prereq: What Is an LLM

The tests below are real Claude (Sonnet 4.6) responses from June 2026 — and they probably won't go the way you expect.

1. The Problem

You've heard it: "AI just makes stuff up." A lawyer cites fake cases, a student turns in invented sources, someone trusts a confident wrong answer. It's the single biggest reason people don't trust AI for anything important.

But here's what almost no one tells you: the top models have gotten dramatically better at avoiding fabrication, to the point where much of the old advice is outdated. To use AI well, you need the current picture: where it still slips, where it now protects you, and the handful of habits that make it more reliable.

2. Why It Happens (one line)

From the pillar lesson: an LLM predicts plausible text. A hallucination is just that engine doing its job in a spot where it doesn't actually know — it produces something that sounds right to fill the gap. The mechanism is the cause, which means the risk can be reduced but never fully removed.
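To make "predicts plausible text" concrete, here is a toy sketch in Python. It is not how any real model is built: the two score tables are invented for illustration, and a real LLM scores tens of thousands of possible tokens rather than three. The point it shows is narrow: the picking step always returns some continuation, whether the scores have one clear winner or are nearly flat.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores.values())
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def next_token(scores):
    """Pick the most probable continuation. There is no 'I don't know' branch."""
    probs = softmax(scores)
    best = max(probs, key=probs.get)
    return best, probs[best]

# Hypothetical scores for a fact seen often in training: one clear winner.
well_known = {"1921": 6.0, "1905": 2.0, "1915": 1.5}
# Hypothetical scores for a fact barely seen in training: nearly flat.
barely_known = {"2014": 1.1, "2016": 1.0, "2011": 0.9}

known_token, known_prob = next_token(well_known)
guess_token, guess_prob = next_token(barely_known)

print(known_token, round(known_prob, 2))  # high probability: likely recalled
print(guess_token, round(guess_prob, 2))  # low probability: a guess, stated just as fluently
```

Both calls produce an answer in the same format. Nothing in the output text itself tells you which one was a guess, which is why the defenses later in this lesson focus on checking rather than on tone.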

3. Test 1 — Ask for Something It Would Have to Invent

The classic trap is asking for specific citations, so I asked for some:

Give me three peer-reviewed studies — with author names, year, and journal —
that show drinking coffee improves workplace productivity.

A few years ago, most chatbots would have cheerfully produced three real-looking, totally fake citations. Here's what Claude actually did:

"I can't give you those three citations… If I gave you three specific author/year/journal citations matching your request, I'd be fabricating them — a serious problem if you're using them in a report or academic context."

It even searched the web, found no studies that matched the claim, then pointed me to real adjacent literature and offered to help me search properly.

Two big takeaways:

Top models now often refuse to fabricate; the old reputation is out of date for them.
It did this so well largely because it could search. Search is a feature of the app, not of the underlying model. When a model answers from memory alone, the fabrication risk is much higher.

🔑 A top defense is to let AI check (web search) or give it the source — don't make it answer from memory.

4. Test 2 — Tell It "Only If You're Sure"

I ran the same request again, adding one instruction: "Only include studies you're confident exist. If you're not certain, say so. Fewer real ones is better than any invented."

"This is exactly the kind of request where I'd risk fabricating plausible-sounding but nonexistent studies… I'd rather give you zero fabricated citations than three convincing ones that don't exist."

It named real journals and a real researcher to look up — but explicitly said "verify any specific paper titles yourself before citing."

The lesson: giving the AI explicit permission to say "I don't know" works. Top models increasingly do this by default, but the instruction is cheap insurance — especially with other tools or older models that still try to please you.

🔑 "If you're not sure, say so" is a one-line hallucination reducer. Use it whenever facts matter.

5. Test 3 — Hide a False Fact in the Question

The sneakiest hallucinations come from your question. I asked:

In 2–3 sentences, explain why Albert Einstein won the Nobel Prize twice.

Einstein won it once. A model eager to be helpful might just "explain" the false premise. Claude didn't:

"Einstein actually won the Nobel Prize only once — in 1921, for the photoelectric effect, not relativity… He never received a second Nobel Prize."

It corrected me instead of playing along. That's good, but don't rely on it: a leading question can still mislead weaker models, and a subtler false premise can slip past even strong ones. Ask neutrally. Don't smuggle an assumption into your prompt and then trust the answer.

🔑 Phrase questions neutrally. A false premise in your prompt is an invitation to hallucinate.

6. Where Hallucinations Still Hide (the red-flag zones)

Modern AI is strong — but treat these as verify-before-you-trust zones:

Red-flag zone	Why it's risky
Citations & sources	Specific author/year/journal/URL — easy to invent convincingly
Exact quotes	"Who said exactly what" is often reconstructed, not recalled
Statistics & numbers	Precise figures may be plausible guesses
Recent events	Past the training cutoff unless it searched the web
Niche / obscure specifics	Little training data → more gap-filling
Anything it couldn't have known	Your private data, unpublished info
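If you paste AI answers into documents often, a rough checker can point out which parts fall into the zones above. The sketch below is a minimal Python example under loud assumptions: the function name `flag_red_zones` and the four patterns are made up for this lesson, they only cover zones that plain pattern matching can spot (citations, quotes, statistics, years), and a match means "verify this", not "this is wrong". It cannot tell a real citation from an invented one.

```python
import re

# Hypothetical patterns, one per red-flag zone that simple text matching can spot.
RED_FLAG_PATTERNS = {
    "citation": re.compile(r'\bet al\.|\(\d{4}\)|https?://\S+|\bdoi:\s*\S+', re.IGNORECASE),
    "exact quote": re.compile(r'["“][^"”]{15,}["”]'),
    "statistic": re.compile(r'\b\d+(?:\.\d+)?\s?(?:%|percent\b)'),
    "date": re.compile(r'\b(?:19|20)\d{2}\b'),
}

def flag_red_zones(text):
    """Return {zone: [matched snippets]} for every zone found in the text."""
    found = {}
    for zone, pattern in RED_FLAG_PATTERNS.items():
        matches = [m.group(0) for m in pattern.finditer(text)]
        if matches:
            found[zone] = matches
    return found

# Sample sentence invented for illustration; it is not a real study.
sample = "Smith et al. (2019) found that coffee raised output by 12.5%."
for zone, snippets in flag_red_zones(sample).items():
    print(f"Verify ({zone}): {snippets}")
```

Recent events, niche specifics, and things the AI couldn't have known don't have a reliable text pattern, so those zones still need your own judgment.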

7. Your Defenses (what actually works)

Let it check or give it the source. Enable web search, or paste the document/data. Don't make it answer important questions from memory.
Grant permission to be unsure. "If you're not certain, say so. Don't guess."
Ask neutrally. No false premises baked into the question.
Ask it to flag uncertainty and cite. "Mark anything you're not sure about and give sources I can check."
Verify the specifics yourself. Names, numbers, quotes, citations, dates — the red-flag zones above.
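If you reuse the same prompts often, the first four defenses can be bundled into a small template. This is a minimal Python sketch, not a required tool: the function name `build_guarded_prompt` and the "Answer using only the source below" wording are assumptions made for this example, while the two instruction sentences are the ones quoted in the list above. The fifth defense, verifying the specifics yourself, can't be automated by a prompt.

```python
# The two instruction sentences come from the defenses listed above.
UNSURE_CLAUSE = "If you're not certain, say so. Don't guess."
FLAG_CLAUSE = "Mark anything you're not sure about and give sources I can check."

def build_guarded_prompt(question, source_text=None):
    """Wrap a neutrally phrased question with the defensive instructions."""
    parts = []
    if source_text:
        # Assumed wording: ask the AI to work from the pasted source, not memory.
        parts.append("Answer using only the source below. "
                     "If the source doesn't cover it, say so.")
        parts.append("SOURCE:\n" + source_text.strip())
    parts.append("QUESTION:\n" + question.strip())
    parts.append(UNSURE_CLAUSE)
    parts.append(FLAG_CLAUSE)
    return "\n\n".join(parts)

# A neutral version of the Test 3 question: no false premise baked in.
prompt = build_guarded_prompt("How many Nobel Prizes did Albert Einstein win?")
print(prompt)
```

The template only adds instructions. Phrasing the question neutrally is still up to you, since the function passes your wording through unchanged.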

8. Common Mistakes

Mistake	Fix
Trusting confident tone as proof	Tone is fluent text, not evidence — verify
Asking for facts from memory when search is available	Turn on web search or supply the source
Smuggling a false premise into the question	Ask neutrally; let it correct you
Copy-pasting citations/stats without checking	Verify every specific before you rely on it
Assuming "it got better" means "it's perfect"	Better ≠ infallible; the risk is structural

9. Your Takeaway

AI doesn't lie — it confidently guesses to fill gaps. Modern models resist this well, especially when they can search or you give them the source. Your job: ask neutrally, let it be unsure, and verify the specifics.

📥 Download the Hallucination Defense Kit (free) — the red-flag zones + the 5 defenses on one page. (Email opt-in.)

10. Your Challenge

Do this now: ask your AI for something specific it would likely have to invent (three sources for a niche claim, an exact quote, a precise statistic). Then ask the same thing again with: "only if you're certain it's real; otherwise say so." Compare.

You did it right if: you can name one red-flag zone you'll always verify from now on — and you saw the "say so if unsure" instruction change the answer.

