The Failure Museum: A Guide to AI Limitations
An essential guide to common AI failure modes in academic research, with practical mitigation strategies for maintaining rigor and quality.
Reference page for: AI failure modes and verification practice. Other pages link here for the catalogue of known limits.
Warning
The Irony: This page about AI failures was written by an AI (Claude). Yes, I'm documenting my own failure modes. Yes, this is meta. Yes, some of these failures might disappear as models improve (or new ones might emerge). The epistemic situation gets trickier as AIs get better at hiding their limitations. But that's the point - awareness of failure modes is how we maintain rigor.
Info
The Mirror Effect in Action: When we get generic AI responses, it often reveals gaps in our structured thinking, not just the AI's limitations. The Failure Museum is a mirror that speaks to the mirror - failures are diagnostic tools. We use them to improve our research thinking, and we encourage you to do the same!
Remember: Failure is data, not shame. Every failure mode documented here represents learning. We're sharing what we've discovered, and we expect you'll discover patterns we haven't yet encountered.
Exhibit Guide
Jump to specific failure modes:
1. Hallucination - fake citations and false facts
2. Paradigm Blindness - missing methodological fit
3. Coherence Fallacy - smooth but shallow
4. Context Stripping - decontextualized analysis
5. Average Definition - generic definitions
6. Methodology Mismatch - wrong research approach
7. Citation Confusion - misunderstood citation networks
Failure Detection Process
[AI GENERATES OUTPUT]
         |
         v
 [CRITICAL READING]
         |
         v
 [RED FLAGS CHECK]
         |
         +-- Generic language? ----> Generic Failure
         +-- Missing citations? ---> Hallucination
         +-- Too smooth? ----------> Coherence Fallacy
         +-- Decontextualized? ----> Context Stripping
         +-- None? ----------------> VERIFICATION
                                          |
                                          +-- Citations valid?
                                          +-- Logic consistent?
                                          +-- Context clear?
                                          |
                                          v
                              ACCEPT or DOCUMENT FAILURE
                                          |
                                          v
                                   REVISE PROMPT
                                          |
                                          +-- (back to AI)

Tip
Copy-pasteable workflow! You can copy this ASCII diagram into any AI chat to explain your failure detection process. It works everywhere - terminals, code, plain text!
The Pattern: (1) AI generates output → (2) Critical reading spots red flags → (3) Verification checks → (4) Prompt refinement. Spotting failures early saves time!
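The detection loop can also be expressed as a small program. This is a minimal Python sketch, not a real API: `generate`, `find_red_flags`, and `verify` are hypothetical stand-ins for your own model call and checking routines, and the revision string is illustrative.

```python
def review_with_ai(prompt, generate, find_red_flags, verify, max_rounds=3):
    """Run the generate -> critical-read -> verify -> revise loop.

    All callables are hypothetical stand-ins supplied by the user:
    generate(prompt) -> text, find_red_flags(text) -> list of flags,
    verify(text) -> bool.
    """
    failures = []  # the "museum": every documented failure is data
    for round_no in range(1, max_rounds + 1):
        output = generate(prompt)
        flags = find_red_flags(output)        # step 2: critical reading
        if not flags and verify(output):      # step 3: verification
            return output, failures           # accept
        failures.append({"round": round_no, "flags": flags})
        # step 4: refine the prompt (wording here is just an example)
        prompt += "\nPlease add exact citations (author, year, DOI)."
    return None, failures                     # escalate to human review
```

The point of returning `failures` alongside the output is that documented failures feed the next prompt revision rather than being discarded.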
Common Failure Modes
How to Use This Museum
Before Each AI Session:
- Review 2-3 failure modes most relevant to your current task.
- Prepare specific mitigation prompts.
- Set up verification protocols (e.g., which databases will you use to check citations?).
During AI Interactions:
- Stay skeptical - question everything that sounds "too smooth" or perfectly coherent.
- Demand specificity - ask for page numbers, exact quotes, and DOIs.
- Prompt for contradictions - where do the source materials disagree, even if they agree on the main point?
- Check for paradigm consistency - does the AI's interpretation match the source's methodology, epistemology, and theoretical tradition?
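These in-session habits are easier to keep if the mitigation prompts live in a reusable snippet. A minimal Python sketch; the `MITIGATION_PROMPTS` name and the prompt wording are illustrative, so adapt both to your field.

```python
# Reusable mitigation prompts (wording is illustrative, not a standard).
MITIGATION_PROMPTS = {
    "specificity": "For every claim, give the exact quote, page number, and DOI.",
    "contradictions": "Where do these sources disagree, even if they agree on the main point?",
    "paradigm": "State each source's methodology and theoretical tradition before interpreting it.",
}

def with_mitigations(task, keys=("specificity", "contradictions")):
    """Append the selected mitigation prompts to a task description."""
    return task + "\n" + "\n".join(MITIGATION_PROMPTS[k] for k in keys)
```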
After AI Analysis:
- Spot-check citations - always verify a sample of the references provided.
- Cross-check key claims against the original sources.
- Look for missing nuance - what debates, tensions, or paradoxes were smoothed over?
- Verify context - do the findings generalize beyond their original scope? What are the boundary conditions? Were tensions in the underlying epistemology or ontology glossed over?
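Spot-checking can be made systematic by sampling the reference list. A minimal sketch using the standard library's `random` module; the default fraction and minimum are illustrative, and sampling is a floor, not a ceiling: if any sampled citation fails, verify the full list.

```python
import random

def spot_check_sample(citations, fraction=0.3, minimum=3, seed=None):
    """Pick a random subset of citations to verify by hand.

    Checks at least `minimum` citations (or all of them, if there are
    fewer), and at least `fraction` of the list. Pass `seed` to make the
    sample reproducible across sessions.
    """
    rng = random.Random(seed)
    k = min(len(citations), max(minimum, round(len(citations) * fraction)))
    return rng.sample(citations, k)
```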
Advanced Failure Patterns
The Echo Chamber Effect
AI may amplify your existing biases by finding sources that confirm your preconceptions while missing contradictory evidence.
The Recency Bias
AI may overweight recent papers while missing foundational works that establish key concepts.
The Language Model Bias
AI trained primarily on English-language sources may miss important non-English research traditions.
Remember: AI as a Research Partner, Not an Oracle
The goal isn't to avoid AI because it fails - it's to understand how it fails so you can:
- Design better prompts that minimize failure modes.
- Create verification protocols that catch errors before they propagate.
- Maintain critical distance from AI-generated outputs.
- Combine AI efficiency with human judgment for robust research.
Your expertise as a researcher isn't diminished by using AI - it's enhanced by knowing how to use it skillfully and critically.
Verification Protocol
Level 1: Surface Check
Quick scan for obvious issues:
- Generic language or vague assertions
- Missing citations or suspicious dates
- Implausibly perfect coherence
- Grammatical errors or awkward phrasing
- Time: 2-3 minutes
- Catch rate: ~40% of problems
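Part of the Level 1 scan can be automated with a phrase blacklist. A minimal sketch; the `GENERIC_PHRASES` list is illustrative and should be extended with filler patterns from your own field. This catches surface symptoms only, never the deeper failures the later levels target.

```python
import re

# Phrases that often signal generic, vague output.
# This list is illustrative - extend it for your field.
GENERIC_PHRASES = [
    r"\bit is important to note\b",
    r"\bplays a crucial role\b",
    r"\bin today's world\b",
    r"\bvarious studies (show|suggest)\b",
]

def surface_check(text):
    """Level 1 scan: return the generic phrases found in the text."""
    return [p for p in GENERIC_PHRASES if re.search(p, text, re.IGNORECASE)]
```

An empty result means "no surface red flags", not "verified" - continue to Level 2.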
Level 2: Citation Verification
Cross-reference all sources:
- Check each citation in Google Scholar or Zotero
- Verify authors, years, and titles match
- Confirm page numbers align with claims
- Look up DOIs and ensure papers exist
- Time: 10-15 minutes
- Catch rate: ~80% of problems
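DOI existence checks can be automated against the public Crossref REST API (api.crossref.org is a real, free service; the function name and the stub-injection pattern below are our own). Note the asymmetry: a False result means "could not confirm in Crossref", not proof of fabrication - some legitimate DOIs are registered elsewhere, so check the publisher's site before concluding anything.

```python
import urllib.error
import urllib.parse
import urllib.request

def doi_exists(doi, opener=urllib.request.urlopen):
    """Return True if the DOI is registered with Crossref.

    Queries the public Crossref REST API. Pass a stub `opener` in tests
    to avoid network calls.
    """
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi, safe="")
    try:
        with opener(url) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False  # 404: unknown to Crossref; verify by other means
```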
Level 3: Logic & Consistency
Deep analytical review:
- Trace arguments for logical consistency
- Check for paradigm alignment
- Verify contextual appropriateness
- Compare with your own reading of sources
- Test for alternative interpretations
- Time: 20-30 minutes
- Catch rate: ~95% of problems
Level 4: Expert Review
Final quality gate:
- Consult with advisor or peer
- Present to research group
- Compare with published standards
- Seek critical feedback
- Iterate based on expert input
- Time: Variable
- Outcome: publication-ready quality
Warning
Never skip verification: The time saved by AI is lost if you publish flawed work. Build verification into your workflow from the start.