Hallucination Is a Misdiagnosis

I.

Today, when an AI model outputs a confident-sounding falsehood, we call it a hallucination. The term is borrowed from psychiatry, and it is useful to understand what it actually means there:

A hallucination is a sensory perception that occurs without any external stimulus yet feels entirely real to the person experiencing it, and those experiencing hallucinations are not consciously in control of them. Most interesting and important are the four assumptions built into the diagnosis:

  1. It is an anomalous condition. The patient's baseline state is non-hallucinatory; perceiving things that are not there is a departure from the person's normal functioning. In other words, the system normally works correctly, and hallucination is a deviation from that normal operation.
  2. It is an episodic condition. Meaning, it comes and goes. There are periods of accurate perception between episodes.
  3. Hallucination is indicative of a disorder. It is a signal that something is wrong in an otherwise functional system.
  4. It is treatable. Because it is a deviation from the norm, interventions such as medication and therapies can reduce or eliminate it.

All four of these assumptions are part of the meaning of the term itself.

When we borrow the term for AI, we bring those assumptions along, and unfortunately none of them hold. Yet the word hallucination now shapes how researchers frame their problems, how engineers design their products, and how the public calibrates its trust.

The problem is that this term is not only imprecise but also highly misleading.

II.

Now, let's consider AI.

When a large language model (LLM), synonymous with AI these days, produces a confident, syntactically fluent, but factually incorrect output, the field calls it hallucination.

Let's examine if the four assumptions hold:

  1. Hallucinations in LLMs are not anomalous. LLMs are statistical systems that predict the next token (word) based on patterns in their training data; they cannot verify whether an output corresponds to external reality. Producing plausible but incorrect text is not a departure from correct behavior. It is how the system works. LLMs have no internal calibration that distinguishes true sentences from false ones; both are produced the same way (the sketch after this list makes this concrete).
  2. Hallucinations in LLMs are not episodic. The models don't move in and out of a hallucinatory state. Every output has a reasonable probability of exhibiting "hallucinations." Some align with verifiable facts, and some don't. The correctness is a property of the output's relationship to the world (after all, we determine whether a model is hallucinating), and not a property of the system's internal state. The system's internal state is the same regardless.
  3. Nor can we call them disorders, because the model is operating in its natural state. A language model that produces an incorrect output is no more "disordered" than a weather model that produces an incorrect forecast.
  4. Is it treatable? This is where the term does its most serious damage. Because the word "hallucination" names a fixable condition, it casts reducing, mitigating, detecting, and flagging hallucinations as treatments. The language suggests the model has a "disease" that, with sufficient engineering, can be cured. But correct and incorrect outputs emerge from the same mechanism. You cannot remove the capacity for error without removing the capacity for generation. Yes, we can shift the probability distribution (through better training data, retrieval augmentation, or chain-of-thought prompting), but we cannot eliminate the phenomenon, because it is not a bug in the system. It is the system.
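
To make this concrete, here is a minimal sketch of next-token generation. Everything in it is illustrative: the candidate tokens and logit values are invented, not taken from any real model. The point is that nothing in the computation references truth; a factually correct token and a factually incorrect one are scored and sampled by exactly the same arithmetic.

```python
import math
import random

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy completion of "The capital of Australia is ..."
# Candidate tokens and logits are invented for illustration.
candidates = ["Canberra", "Sydney", "Melbourne"]
logits = [2.1, 2.4, 0.7]  # "Sydney" scores highest: plausible, but wrong

probs = softmax(logits)
# Sampling treats every candidate identically. There is no "is_true" flag
# anywhere in this computation; correctness is a property of the world,
# not of the model's internal state.
choice = random.choices(candidates, weights=probs, k=1)[0]
print(dict(zip(candidates, [round(p, 3) for p in probs])), "->", choice)
```

Better training data or retrieval can raise the score of "Canberra" and shift the distribution, but the sampling mechanism, and with it a nonzero probability of a fluent wrong answer, remains.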

III.

There are practical costs to this inaccuracy:

  1. In academic research, we treat hallucination as an optimization problem. Papers measure hallucination rates and report improvements as though the trajectory points toward elimination. A more honest framing, that language models have a base rate of error which can be managed but not cured, would produce different benchmarks and different research questions. Instead of trying to make models stop hallucinating, we could focus on building systems that remain useful despite a permanent base rate of incorrect output.
  2. In product design, the word introduces a default assumption of correctness. We believe that a model is not hallucinating most of the time (because hallucination is an anomaly) and its outputs are usually reliable. This assumption gets baked into product interfaces. Model outputs are presented as answers, not as guesses. The user experience is designed around the premise that the model is usually right and occasionally fails, when the architecture warrants no such premise.
  3. From a user perspective, the psychiatric metaphor teaches users to watch for occasional breakdowns in an otherwise reliable system. This is wrong. The appropriate mental model is not "this system is accurate but sometimes hallucinates" but "this system generates plausible text and has no internal mechanism for distinguishing true from false statements." That framing opens a fundamentally different relationship with the output, one where verification is the default, not the exception (a sketch of what that looks like follows this list).
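
As a sketch of verification-as-default, consider the toy pipeline below. The names (`Claim`, `generate`, `verify`) and the lookup-table knowledge source are hypothetical stand-ins, not any real API; a production system might verify through retrieval, a database, or human review. What matters is the control flow: every output starts life unverified, and nothing presents it as an answer until something external confirms it.

```python
from dataclasses import dataclass

# Hypothetical external knowledge source; a real system might use
# retrieval, a database, or a human review queue.
KNOWN_FACTS = {"capital of Australia": "Canberra"}

@dataclass
class Claim:
    text: str
    verified: bool = False  # every claim starts life unverified

def generate(prompt: str) -> str:
    """Stand-in for an LLM call (hypothetical; returns a canned answer)."""
    return "Sydney"  # fluent, confident, and wrong

def verify(topic: str, claim: Claim) -> Claim:
    """Mark the claim verified only if an external source confirms it."""
    claim.verified = claim.text == KNOWN_FACTS.get(topic)
    return claim

# Verification is a required pipeline stage, not an optional spot-check
# for the "occasional hallucination."
claim = verify("capital of Australia", Claim(generate("capital of Australia")))
label = "verified" if claim.verified else "UNVERIFIED"
print(f"[{label}] {claim.text}")  # -> [UNVERIFIED] Sydney
```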

IV.

There is a different psychiatric analogy that comes closer, though it too is imperfect.

Confabulation, in neuropsychology, is a condition in which someone produces fabricated or distorted memories without any intent to deceive, often following damage to the frontal lobes. People with this condition give confident and detailed accounts of events that never occurred. They are not lying; they are filling gaps in their memory with plausible constructions, without knowing that they are doing it.

I think this maps more honestly to what a language model does. The model fills gaps in its training signal with plausible completions. It does not "know" whether those completions are true, and it is not attempting deception. But even confabulation is an imperfect fit, because it still implies a prior state of accurate memory that has been damaged. The language model has no prior state of accurate memory. It never had the thing it is supposedly confabulating about.

V.

Every user who treats an AI output as presumptively accurate, checking only for the occasional hallucination, is miscalibrated. Every product that presents generated text as though it comes from a system that is usually right is miscalibrated. Every research program that measures progress toward hallucination elimination is optimizing for a target that the architecture doesn't support. The word "hallucination" is not just imprecise. It is a misdiagnosis that produces the wrong treatment plan, and the patient (in this case, the entire field's relationship with its own technology) is getting worse under the prescribed care.

© 2026 Saranyan Vigraham