The Irony: This page about AI failures was written by an AI (Claude). Yes, I’m documenting my own failure modes. Yes, this is meta. Yes, some of these failures might disappear as models improve (or new ones might emerge). The epistemic situation gets trickier as AIs get better at hiding their limitations. But that’s the point - awareness of failure modes is how we maintain rigor.
The Mirror Effect in Action: When we get generic AI responses, they often reveal gaps in our structured thinking, not just the AI’s limitations. The Failure Museum is a mirror that speaks to the mirror - failures are diagnostic tools. We use them to improve our research thinking, and we encourage you to do the same!
Remember: Failure is data, not shame. Every failure mode documented here represents learning. We’re sharing what we’ve discovered, and we expect you’ll discover patterns we haven’t yet encountered.
Exhibit Guide
Jump to specific failure modes:
1. Hallucination - fake citations & false facts
2. Paradigm Blindness - missing methodological fit
3. Coherence Fallacy - smooth but shallow
4. Context Stripping - decontextualized analysis
5. Average Definition - generic definitions
6. Methodology Mismatch - wrong research approach
7. Citation Confusion - misunderstood networks
Failure Detection Process
Copy-pasteable workflow! You can paste this pattern into any AI chat to explain your failure detection process. It works everywhere - terminals, code, plain text.
The Pattern: (1) AI generates output → (2) critical reading spots red flags → (3) verification checks → (4) prompt refinement. Spotting failures early saves time!
Common Failure Modes
🌈 Exhibit 1: The Subtle Hallucination
The Failure
The AI generates a plausible-sounding citation that doesn’t exist, often combining a real author’s name with a real journal and a fitting (but fake) title.
Example (Bad)
“As Barney (1991) noted in his follow-up in Strategic Management Journal, the inimitability of resources also depends on the firm’s dynamic capabilities framework integration.”
What’s Wrong: While Barney did write about resource inimitability, there is no 1991 follow-up paper in SMJ with this exact focus.
Prevention Strategies
- Always verify every single citation with your Zotero library or Google Scholar (a scripted lookup is sketched at the end of this exhibit)
- Check publication years and cross-reference with known works
- Use specific prompts: “Provide exact page numbers and DOIs for all citations”
- Ask AI to flag any citations it’s uncertain about
Detection Tips
- Citations that sound “too perfect” for your argument
- Dates that don’t align with author’s career timeline
- Titles that use modern terminology for older papers
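To support the verification strategy above, here is a minimal sketch of a scripted citation check. It assumes the public Crossref REST API (api.crossref.org) and the `requests` library; the query string and result fields are illustrative, and a manual check in Zotero or Google Scholar remains the final word.

```python
import requests

def crossref_lookup(citation_text, rows=3):
    """Query Crossref for works matching a free-text citation string.

    Returns the top candidate titles, years, and DOIs so a human can judge
    whether the cited work actually exists.
    """
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": citation_text, "rows": rows},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    return [
        {
            "title": (item.get("title") or ["<no title>"])[0],
            "year": item.get("issued", {}).get("date-parts", [[None]])[0][0],
            "doi": item.get("DOI"),
        }
        for item in items
    ]

# Example: check the suspicious Barney (1991) "follow-up" from this exhibit.
for hit in crossref_lookup("Barney 1991 Strategic Management Journal resource inimitability"):
    print(hit)
```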
🔬 Exhibit 2: Paradigm Blindness
The Failure
The AI interprets a paper from a critical or interpretive paradigm through a purely positivist lens, missing the epistemological nuance.
Example (Bad)
“The study found that the key variables influencing technology adoption were the network, the actors, and the technology itself…”
What’s Wrong: An Actor-Network Theory paper isn’t about “variables” affecting “outcomes” - it’s about relational ontology and performativity.
Prevention Strategies
- Prime for paradigm awareness: “From an interpretive perspective, what are the key sensemaking processes…”
- Ask explicitly about ontological and epistemological framing
- Request clarification of the paper’s theoretical tradition
- Compare with papers from different paradigms
Detection Tips
- Statistical language applied to qualitative studies (a simple keyword check is sketched at the end of this exhibit)
- “Variables” and “outcomes” used for interpretive work
- Missing discussion of researcher reflexivity
- Lack of attention to meaning-making processes
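A minimal sketch of the “statistical language” check from the detection tips above. The word list is an assumption and deliberately crude, so treat any hit as a cue to reread the source with its paradigm in mind, not as a verdict.

```python
import re

# Vocabulary that usually signals a positivist framing; illustrative, not exhaustive.
POSITIVIST_TERMS = [
    "variable", "dependent variable", "independent variable",
    "outcome", "causal", "hypothesis", "statistically significant",
]

def paradigm_red_flags(ai_summary: str) -> list[str]:
    """Return positivist-sounding terms found in an AI summary of an interpretive paper."""
    text = ai_summary.lower()
    return [term for term in POSITIVIST_TERMS if re.search(r"\b" + re.escape(term), text)]

summary = ("The study found that the key variables influencing technology adoption "
           "were the network, the actors, and the technology itself.")
print(paradigm_red_flags(summary))  # ['variable'] -> reread with the paper's paradigm in mind
```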
🧩 Exhibit 3: The Coherence Fallacy
The Failure
The AI synthesizes contradictory findings into a single, smooth paragraph that masks the underlying academic debate.
Example (Bad)
“Research shows that organizational slack is beneficial for innovation (Bourgeois, 1981), as it provides resources for experimentation…”
What’s Wrong: This presents a false consensus, smoothing over decades of complex debate about optimal slack levels, types of slack, and contingency factors.
Prevention Strategies
- Prompt for contradictions: “Where do these authors disagree with each other?”
- Ask for tensions and boundary conditions explicitly
- Request: “What debates exist in this literature?”
- Demand synthesis of DISAGREEMENT, not just agreement
Detection Tips
- Suspiciously smooth narratives
- Lack of “however” or “in contrast” statements (a simple marker count is sketched at the end of this exhibit)
- No mention of competing theories
- Everyone seemingly agrees
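Following the detection tip about missing “however” or “in contrast” statements, here is a minimal sketch that counts contrast markers in an AI-written synthesis. The marker list and the zero-count threshold are assumptions; a low count is only a cue to prompt for disagreements, not proof of a coherence fallacy.

```python
CONTRAST_MARKERS = ["however", "in contrast", "by contrast", "whereas",
                    "on the other hand", "disagree", "debate", "contested"]

def contrast_marker_count(synthesis: str) -> int:
    """Count contrast/debate markers in an AI-written literature synthesis."""
    text = synthesis.lower()
    return sum(text.count(marker) for marker in CONTRAST_MARKERS)

synthesis = ("Research shows that organizational slack is beneficial for innovation "
             "(Bourgeois, 1981), as it provides resources for experimentation.")
if contrast_marker_count(synthesis) == 0:
    print("Suspiciously smooth: ask 'Where do these authors disagree with each other?'")
```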
📍 Exhibit 4: Context Stripping
The Failure
The AI extracts a finding from its original context (e.g., a study on large manufacturing firms in the 1980s) and presents it as a general, universal truth.
Example (Bad)
“Research shows that organizational learning requires cross-functional teams.”
What’s Wrong: Missing context - this finding was from software development firms in Silicon Valley, 2010-2015, and may not generalize to other industries, regions, or time periods.
Prevention Strategies
- Always ask for scope: “What is the context of this study (industry, firm size, geography, time period)?” (a simple scope checklist is sketched at the end of this exhibit)
- Probe generalizability: “Has this been replicated in other contexts?”
- Request boundary conditions: “Where would this NOT apply?”
- Check for contextual caveats in the original paper
Detection Tips
- Broad claims without qualifiers
- Missing sample characteristics
- No discussion of generalizability limits
- Findings presented as universal laws
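A minimal sketch of a scope checklist for any finding the AI hands you. The field names are assumptions drawn from the scope prompt above; an empty field simply means you have another question to ask before generalizing.

```python
REQUIRED_SCOPE_FIELDS = ["industry", "firm_size", "geography", "time_period", "sample"]

def missing_scope(finding: dict) -> list[str]:
    """Return the scope fields that are absent or empty for a reported finding."""
    return [field for field in REQUIRED_SCOPE_FIELDS if not finding.get(field)]

finding = {
    "claim": "Organizational learning requires cross-functional teams.",
    "industry": "software development",
    "geography": "Silicon Valley",
    "time_period": "2010-2015",
}
print(missing_scope(finding))  # ['firm_size', 'sample'] -> ask before generalizing
```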
📚 Exhibit 5: The 'Average' Definition
The Failure
When asked to define a complex construct, the AI blends multiple definitions into a single, generic, and often meaningless “average” definition that satisfies no particular theoretical tradition.
Example (Bad)
“Organizational culture is the shared values, beliefs, and assumptions that guide behavior in organizations.”
What’s Wrong: This bland definition obscures important theoretical distinctions between Schein’s levels model, Martin’s fragmentation perspective, and Hofstede’s dimensions.
Prevention Strategies
- Ask for definitional variety: “How have different authors defined organizational culture? Present their definitions in a table.”
- Request theoretical grounding: “What are the competing conceptualizations?”
- Probe assumptions: “What does each definition assume about culture’s nature?”
- Compare and contrast approaches explicitly
Detection Tips
- Definitions that sound like textbook boilerplate
- No attribution to specific theorists
- Missing theoretical tensions or debates
- One-size-fits-all explanations
⚗️ Exhibit 6: The Methodology Mismatch
The Failure
The AI suggests analytical approaches that don’t match the paper’s actual methodology or recommends methods incompatible with the epistemological stance.
Example (Bad)
“To test these findings, future research could use structural equation modeling to identify the causal relationships…” (in response to a grounded theory paper about sensemaking processes)
What’s Wrong: Suggesting a positivist quantitative method for extending an interpretivist qualitative study violates paradigm consistency.
Prevention Strategies
- Ask about methodology alignment: “What methods would be consistent with this paper’s approach?”
- Verify paradigm consistency: “Would the original authors recommend this?”
- Request epistemological grounding for suggestions
- Compare methodological affordances and constraints
Detection Tips
- Quantitative methods suggested for interpretive studies
- Positivist language (variables, causation) for constructivist work
- Generalization emphasis for context-specific findings
- Ignoring methodological limitations stated by authors
🕸️ Exhibit 7: The Citation Web Confusion
The Failure
The AI incorrectly identifies who cited whom, misattributes ideas to the wrong authors, or confuses the intellectual genealogy of concepts.
Example (Bad)
“Porter introduced the concept of dynamic capabilities in his 1980 work on competitive strategy.”
What’s Wrong: Dynamic capabilities were developed by Teece, Pisano, and Shuen (1997), not Porter. Porter (1980) focused on competitive forces.
Prevention Strategies
- Verify attribution: “Who originally developed this concept? Provide the exact citation.”
- Check intellectual genealogy: “Who built on this idea first?”
- Request chronological accuracy: “What’s the timeline of this concept’s development?”
- Cross-reference with your Zotero library (a CSV cross-check is sketched at the end of this exhibit)
Detection Tips
- Anachronistic attributions (recent concepts to old papers)
- Conflation of related but distinct concepts
- Missing key contributors to a theoretical tradition
- Simplified genealogies that skip important developments
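A minimal sketch of cross-referencing a claimed attribution against a Zotero CSV export. The column names (“Author”, “Publication Year”, “Title”) match Zotero’s default CSV export as I understand it, but check your own export; the file path is a placeholder.

```python
import csv

def find_in_zotero(csv_path: str, author_surname: str, year: str) -> list[dict]:
    """Return Zotero library entries matching an author surname and publication year."""
    matches = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if (author_surname.lower() in (row.get("Author") or "").lower()
                    and row.get("Publication Year") == year):
                matches.append({"title": row.get("Title"), "year": year})
    return matches

# Did Porter really publish on dynamic capabilities in 1980? Check the library first.
hits = find_in_zotero("my_zotero_export.csv", "Porter", "1980")
print(hits or "No matching entry - verify the attribution before trusting it.")
```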
How to Use This Museum
Before Each AI Session:
- Review 2-3 failure modes most relevant to your current task.
- Prepare specific mitigation prompts.
- Set up verification protocols (e.g., which databases will you use to check citations?).
During AI Interactions:
- Stay skeptical - question everything that sounds “too smooth” or perfectly coherent.
- Demand specificity - ask for page numbers, exact quotes, and DOIs.
- Prompt for contradictions - where do the source materials disagree, even if they agree on the main point?
- Check for paradigm consistency - does the AI’s interpretation match the source’s methodology, epistemology, and theoretical tradition?
After AI Analysis:
- Spot-check citations - always verify a sample of all references provided (a sampling sketch follows this list).
- Cross-check key claims against the original sources.
- Look for missing nuance - what debates, tensions, or paradoxes were smoothed over?
- Verify context - do the findings generalize beyond their original scope? What are the boundary conditions? Were any epistemological or ontological tensions glossed over?
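A minimal sketch of the spot-check step: sample a few AI-provided references at random for manual verification (or feed them to the Crossref lookup from Exhibit 1). The sample size is an arbitrary assumption; scale it to how much rides on the analysis.

```python
import random

def spot_check_sample(references: list[str], k: int = 3) -> list[str]:
    """Pick a random subset of AI-provided references for manual verification."""
    return random.sample(references, min(k, len(references)))

references = [
    "Barney (1991)",
    "Teece, Pisano & Shuen (1997)",
    "Bourgeois (1981)",
    "Schein (1985)",
]
for ref in spot_check_sample(references):
    print("Verify:", ref)
```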
Advanced Failure Patterns
The Echo Chamber Effect
AI may amplify your existing biases by finding sources that confirm your preconceptions while missing contradictory evidence.
The Recency Bias
AI may overweight recent papers while missing foundational works that establish key concepts.
The Language Model Bias
AI trained primarily on English-language sources may miss important non-English research traditions.
Remember: AI as a Research Partner, Not an Oracle
The goal isn’t to avoid AI because it fails - it’s to understand how it fails so you can:
- Design better prompts that minimize failure modes (a small collection of the mitigation prompts from this page is sketched below).
- Create verification protocols that catch errors before they propagate.
- Maintain critical distance from AI-generated outputs.
- Combine AI efficiency with human judgment for robust research.
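As a capstone, here are the mitigation prompts scattered through the exhibits above, collected into one small, copy-pasteable structure. This is a sketch, not a complete protocol, and the dictionary keys are just labels of convenience.

```python
# Mitigation prompts drawn from the prevention strategies in each exhibit.
MITIGATION_PROMPTS = {
    "hallucination": "Provide exact page numbers and DOIs for all citations, and flag any you are uncertain about.",
    "paradigm_blindness": "What are this paper's ontological and epistemological commitments, and which theoretical tradition does it work in?",
    "coherence_fallacy": "Where do these authors disagree with each other? What debates, tensions, and boundary conditions exist in this literature?",
    "context_stripping": "What is the context of this study (industry, firm size, geography, time period), and where would the findings NOT apply?",
    "average_definition": "How have different authors defined this construct? Present their definitions in a table and state what each assumes.",
    "methodology_mismatch": "What methods would be consistent with this paper's methodological and epistemological approach?",
    "citation_confusion": "Who originally developed this concept? Provide the exact citation and the timeline of the concept's development.",
}

# Paste the relevant prompt into your next AI session alongside the source material.
for failure_mode, prompt in MITIGATION_PROMPTS.items():
    print(f"{failure_mode}: {prompt}")
```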