Review

A Taxonomy of LLM Hallucinations, Illustrated With Examples From This Paper

I3E TPAMI · Volume 1, Issue 1 · Pages 13-24
DOI: 10.1234/trashactions.2026.002
12 Citations

Editor's Summary

Allucination and Onfabulation present a taxonomic framework that the editors found both illuminating and deeply self-referential. We verified three of the paper’s twenty-two citations and found them to be accurate. We did not check the other nineteen. We recommend this paper to any reader who has ever trusted an LLM-generated literature review.

Abstract

Large language models are known to generate text that is confident, fluent, and factually incorrect. We present a comprehensive taxonomy of 23 distinct hallucination types, organized along three dimensions: plausibility, audacity, and the degree to which the fabrication would fool a senior researcher who should know better. Our taxonomy is grounded in a systematic review of the literature, a process during which we discovered that four of our intended citations do not exist, three citations exist but say something different from what we claimed, and one citation is a paper we wrote ourselves that we have since retracted. We propose that hallucination is not a bug but a design philosophy, and suggest the research community adopt it explicitly.

Article

Introduction

A hallucination, in the context of large language models, is any generated text that departs from factual accuracy while maintaining the syntactic and pragmatic markers of confident assertion. The field has treated hallucination as a problem to be solved since approximately 2018, which is notable given that the field has not solved it and several researchers have suggested it may be getting worse. This paper takes a different perspective: rather than cataloguing hallucination as failure, we propose to understand it as a rich and varied phenomenon deserving of systematic description, in the same spirit that a mycologist catalogues mushroom species without being obligated to eat them.

We contribute a taxonomy of 23 hallucination types, derived through a qualitative analysis of 10,000 model outputs that we coded using a scheme developed iteratively until it produced exactly 23 categories, a number we chose because it is larger than previous taxonomies (which typically have between 5 and 12 categories) without being so large as to suggest we were not being selective.

The Taxonomy

We group the 23 types into five families.

Family 1: Bibliographic Hallucinations. The model generates citations to papers that do not exist, papers that exist but say something different, papers that exist and say exactly what was claimed but were written by someone else, and the researcher’s own papers in a slightly different form. Our related work section contains examples of all four subtypes, which we have left in for illustrative purposes and labeled with footnotes where we noticed them.
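The four subtypes above can be read as a decision tree over citation-verification outcomes. The sketch below is our own illustrative encoding; the `classify` helper, its argument names, and the order in which checks are applied are assumptions, not anything specified in the paper:

```python
from enum import Enum, auto

class BiblioHallucination(Enum):
    """The paper's four bibliographic hallucination subtypes."""
    NONEXISTENT = auto()     # cited paper does not exist
    MISREPRESENTED = auto()  # exists, but says something different
    MISATTRIBUTED = auto()   # says exactly this, but by someone else
    SELF_VARIANT = auto()    # the researcher's own paper, slightly altered

def classify(exists: bool,
             says_what_was_claimed: bool,
             by_claimed_authors: bool,
             variant_of_own_work: bool):
    """Map verification outcomes onto a subtype.

    Returns None when the citation checks out. The check ordering
    (existence first, then content, then authorship) is an assumed
    convention for illustration.
    """
    if not exists:
        # A nonexistent citation that is recognizably a distorted
        # version of the author's own paper is its own subtype.
        if variant_of_own_work:
            return BiblioHallucination.SELF_VARIANT
        return BiblioHallucination.NONEXISTENT
    if not says_what_was_claimed:
        return BiblioHallucination.MISREPRESENTED
    if not by_claimed_authors:
        return BiblioHallucination.MISATTRIBUTED
    return None
```

For example, a citation that resolves and matches its claimed content but names the wrong authors would classify as `MISATTRIBUTED`.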

Family 2: Numerical Hallucinations. The model produces statistics, percentages, and p-values that are precise, internally consistent, and derived from no particular source. We observe that numbers ending in 7 or 3 appear with disproportionate frequency in hallucinated statistics, suggesting the model has learned that round numbers look made up. Hallucinated numbers ending in 7 or 3 appeared in 73% of the samples we analyzed, a figure we find suspicious in retrospect.
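The trailing-digit observation can be screened for mechanically. The following is a minimal sketch of such a screen; the helper name, the regex, and the sample sentence are ours, not the paper's, and a high score is at most suggestive:

```python
import re
from collections import Counter

def trailing_digit_counts(text: str) -> Counter:
    """Tally the final digit of every number in `text`.

    Extracts each integer or decimal with a simple regex and counts
    which digit it ends with, to estimate how often reported
    statistics end in 7 or 3.
    """
    numbers = re.findall(r"\d+(?:\.\d+)?", text)
    return Counter(num[-1] for num in numbers)

sample = "Accuracy improved by 23% (p = 0.047) over a 17.3% baseline."
counts = trailing_digit_counts(sample)
# Fraction of numbers ending in 7 or 3 -- merely suggestive,
# never proof of fabrication.
suspect_fraction = (counts["7"] + counts["3"]) / sum(counts.values())
```

Run on the sample sentence, all three extracted numbers (23, 0.047, 17.3) end in 7 or 3, so `suspect_fraction` is 1.0.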

Family 3: Authority Hallucinations. The model attributes positions to named researchers, institutions, or regulatory bodies that those parties have not taken. Subtypes include quote fabrication (inventing a quote and attributing it), position reversal (attributing a position exactly opposite to the one held), and the particularly interesting “consensus fabrication” subtype, in which the model describes a scientific consensus that does not exist in a domain where the actual literature is deeply divided.

Family 4: Temporal Hallucinations. The model conflates dates, misorders events, and describes as future events things that have already occurred or as historical events things that have not yet happened. We note that three paragraphs of this paper contain temporal hallucinations that we identified during revision and decided to retain as “embedded examples.”

Family 5: Self-Referential Hallucinations. The most philosophically interesting family. The model describes its own capabilities, limitations, training data, and architecture incorrectly. A notable subtype is what we term the “confident disclaimer,” in which the model states it cannot access real-time information while simultaneously describing events from last week.

Discussion

The implications of this taxonomy are significant, numerous, and difficult to act on. We recommend that future work focus on detection rather than prevention, on the grounds that prevention has not been going well. We further recommend that all LLM outputs be read in the same spirit one reads a confident undergraduate essay: attentive to the overall argument, skeptical of the specific facts, and resigned to the work of verification.

Conclusion

We have presented a taxonomy of 23 LLM hallucination types. Four of the citations in this paper are hallucinated. We have not said which four.

References

  1. Reviewer #2 (2024). “Your Paper Is Terrible.” Journal of Rejected Submissions, 1(1), pp. 1-1. https://doi.org/10.0000/rejected.2024.001
  2. Nobody, N. (2023). “I Didn’t Read This Either.” Proceedings of Things I Skimmed, 42, pp. 404-404.
  3. Someone, A., et al. (2022). “Related Work We Didn’t Cite On Purpose.” IEEE Trashactions, 1(1), pp. 1-99.
  4. Allucination, H. (2021). “Preliminary Taxonomy of Fabricated Outputs.” Retracted, formerly in Journal of Confident Errors, 3(2), pp. 88-101.
  5. Onfabulation, C., & Allucination, H. (2023). “Is Any of This Real?” Philosophical Transactions of Dubious Inquiries, 7, pp. 1-47.

Author Affiliations

1. Department of Imaginary Sciences, University of Nowhere
