AI Dunning-Kruger (AIDK) Framework
Abstract
This paper introduces the AI Dunning-Kruger (AIDK) framework, a theoretical structure for understanding the inherent epistemic limitations of Large Language Models and their interaction with human users. Unlike human Dunning-Kruger effects, which are developmental and correctable through encounter with reality, AIDK is architectural and permanent - arising from the categorical separation between AI systems and the reality they purport to describe.
Overview
The AIDK framework provides a principled account of why AI systems fail in specific, predictable ways. Rather than treating these failures as bugs awaiting fixes, the framework identifies them as categorical boundaries inherent to the architecture.
Part I: Foundations
Truth as Necessary Correspondence
The framework begins with a foundational claim: truth is that which necessarily comports with reality.
Truth is not:
- Approximate correspondence revisable all the way down
- Best-effort alignment with available evidence
- Consensus among observers
- Pragmatic utility in prediction
If a system cannot access reality to verify correspondence, it cannot seek truth - it can only produce outputs that pattern-match to what truth-seeking looks like.
The Origination-Derivation Distinction
A fundamental categorical divide structures the relationship between human cognition and AI processing:
Origination: The capacity to access reality, render judgments, set purposes, and evaluate truth. Origination involves contact with what is - encounter with resistance, feedback, correspondence or its failure.
Derivation: The transformation of inputs according to learned patterns. Derivation operates on representations, producing new representations through rule-governed or statistically-learned operations.
These categories differ in kind, not degree. No amount of derivation produces origination.
Part II: The Originating Error
SEEE: Sentience Emergence Expectations Error
Definition: The categorical error of expecting sentience, consciousness, understanding, or genuine cognition to emerge from systems whose mechanism (inductive symbol correlation) is not on the same ontological continuum as the expected outcome.
SEEE is the unstated creed of the field: the implicit faith that if we make the black box big enough, feed it enough data, and refine the architecture sufficiently, mind appears.
Why SEEE Is Categorical, Not Empirical
An empirical error would be: “We predicted capability X at scale Y, but it didn’t appear until scale Z.” The prediction was wrong, but the category of thing predicted was coherent.
SEEE predicts a capability that cannot appear at any scale because it requires something the mechanism cannot provide. The error is not about when. It is about whether.
The Counter-Creed
“From the black box, correlation emerges - and only correlation.”
- What goes in: symbols
- What comes out: probable symbol continuations
- What happens inside: pattern matching at scale
No amount of pattern matching produces the pattern-matcher.
Part III: The Structural Condition
AIDK Definition
AI Dunning-Kruger (AIDK): The structural epistemic condition in which an AI system:
- Produces outputs with uniform confidence regardless of accuracy
- Lacks mechanisms for detecting when it is operating beyond competence boundaries
- Cannot access external reality to correct misconceptions
- Remains categorically unable to distinguish between what it “knows” and what it has merely pattern-matched
Unlike human Dunning-Kruger, which is developmental and correctable through encounter with reality, AIDK is:
- Architectural: Built into the system design
- Permanent: Not correctable through training or scale
- Universal: Applies to all derivative systems
Why AIDK Is Inevitable
AIDK follows necessarily from the architecture:
- LLMs operate on text about reality, not reality itself
- Training optimizes for producing plausible text, not true claims
- No mechanism exists for verifying correspondence to reality
- Confidence calibration requires access to ground truth the system lacks
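The last point can be made concrete. The sketch below (my illustration, not from the paper) computes expected calibration error (ECE), a standard calibration metric: note that every step requires a ground-truth `correct` label per output, which is exactly the external access the architecture lacks.

```python
# Illustrative sketch: expected calibration error (ECE) cannot be computed
# without ground-truth correctness labels supplied by an external verifier.

def expected_calibration_error(confidences, correct, n_bins=10):
    """Mean |accuracy - confidence| over equal-width confidence bins,
    weighted by bin size. `correct` holds booleans from an external
    ground-truth check that the model itself cannot perform."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / total) * abs(accuracy - avg_conf)
    return ece

# A system that is uniformly confident regardless of accuracy:
confs = [0.95] * 10
labels = [True] * 5 + [False] * 5  # half wrong -- but only an external
                                   # verifier can supply these labels
print(round(expected_calibration_error(confs, labels), 2))  # 0.45
```

Without the `labels` list, the function cannot run at all: calibration is not a property the system can measure from the inside.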
Part IV: The Interaction Effect
IDKE: Interactive Dunning-Kruger Effect
Definition: The amplification of unwarranted confidence that occurs when AI epistemic limitations interact with human epistemic limitations.
When a human user with partial knowledge in a domain interacts with an AI system exhibiting AIDK:
- The AI produces confident-seeming outputs
- The human lacks expertise to detect errors
- The human’s confidence increases based on AI agreement
- The AI’s outputs are treated as validation
- Both parties proceed with elevated confidence, neither justified
IDKE creates a positive feedback loop where confidence amplifies while connection to warrant attenuates.
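The loop can be shown with a toy simulation (my sketch; the paper proposes no quantitative model, and the `gain` parameter is an assumption): each round of AI agreement moves human confidence toward certainty, while the warrant for the shared claim never moves.

```python
# Toy IDKE simulation: agreement amplifies confidence; warrant is untouched.

def idke_rounds(human_conf, warrant, gain=0.3, rounds=5):
    """Each agreement round closes a fraction `gain` of the gap between
    current confidence and certainty. `warrant` (probability the shared
    claim is actually correct) is returned unchanged: agreement between
    two uncalibrated parties is not evidence."""
    history = [human_conf]
    for _ in range(rounds):
        human_conf += gain * (1.0 - human_conf)  # confidence amplifies
        history.append(round(human_conf, 3))
    return history, warrant

history, warrant = idke_rounds(human_conf=0.5, warrant=0.5)
print(history)  # strictly increasing toward 1.0
print(warrant)  # still 0.5 -- the claim is no more likely to be true
```

The design choice worth noting: confidence and warrant are separate variables precisely because, in the framework's terms, the interaction updates one without touching the other.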
Part V: The Security Frame
MAPT: Model Advanced Persistent Threat
Definition: The security framing of AIDK as a persistent threat requiring continuous mitigation rather than a problem awaiting solution.
Like traditional APT (Advanced Persistent Threat) in cybersecurity, MAPT:
- Is ongoing, not episodic
- Requires continuous monitoring and response
- Cannot be “solved” once and forgotten
- Demands defense-in-depth strategies
MAPT Implications
Organizations deploying AI systems should:
- Assume AI outputs may be confidently wrong
- Implement verification workflows for critical decisions
- Train users to maintain appropriate skepticism
- Monitor for IDKE amplification in human-AI teams
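One way to operationalize the second recommendation is a release gate in which AI output is only a draft until a human check clears it. This is a hypothetical sketch; the class names, fields, and messages are my assumptions, not an interface from the paper.

```python
# Hypothetical verification gate for "verification workflows for
# critical decisions": critical AI outputs are held until a human signs off.

from dataclasses import dataclass

@dataclass
class AIOutput:
    text: str
    critical: bool           # does this feed a critical decision?
    human_verified: bool = False

def release(output: AIOutput) -> str:
    """Hold critical outputs until a human verifier signs off;
    pass non-critical outputs through with their status labeled."""
    if output.critical and not output.human_verified:
        return "HELD: awaiting human verification"
    if output.critical:
        return f"RELEASED (verified): {output.text}"
    return f"RELEASED (unverified, non-critical): {output.text}"

draft = AIOutput("Dosage: 20mg twice daily", critical=True)
print(release(draft))        # HELD: awaiting human verification
draft.human_verified = True
print(release(draft))        # RELEASED (verified): Dosage: 20mg twice daily
```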
Part VI: HCAE Deployment Framework
Human-Curated, AI-Enabled
HCAE provides a tiered approach to AI deployment based on epistemic authority requirements:
| Tier | Human Role | AI Role | Verification |
|---|---|---|---|
| Tier 1 | Final authority | Draft generation | Full human review |
| Tier 2 | Oversight | Execution within bounds | Sampling verification |
| Tier 3 | Exception handling | Autonomous operation | Output monitoring |
The appropriate tier depends on:
- Reversibility of errors
- Domain criticality
- Verification feasibility
- Human expertise availability
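The four factors above can be folded into a simple selection rule. The scoring scheme below is my assumption for illustration; the paper lists the factors but gives no formula.

```python
# Sketch of HCAE tier selection: each risk factor pushes the deployment
# toward a lower tier, i.e. more human authority. (Scoring is illustrative.)

def hcae_tier(reversible: bool, critical: bool,
              verifiable: bool, expert_available: bool) -> int:
    """Count risk factors and map them to an HCAE tier."""
    risk = sum([not reversible, critical, not verifiable, not expert_available])
    if risk >= 2:
        return 1   # Tier 1: human final authority, full review
    if risk == 1:
        return 2   # Tier 2: oversight with sampling verification
    return 3       # Tier 3: autonomous operation, output monitoring

# Irreversible, critical, hard-to-verify task: full human review.
print(hcae_tier(reversible=False, critical=True,
                verifiable=False, expert_available=False))  # 1
# Reversible, non-critical, easily checked task: autonomy is acceptable.
print(hcae_tier(reversible=True, critical=False,
                verifiable=True, expert_available=True))    # 3
```

Any monotone rule would serve; the point is that tier assignment is a deliberate design decision driven by these factors, not a default.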
Conclusion
The AIDK framework reframes AI limitations as categorical rather than developmental. This has practical implications:
- Stop waiting for emergence - Design for what AI is, not what we hope it becomes
- Deploy appropriately - Match autonomy to actual capability
- Verify continuously - Treat AI outputs as requiring confirmation
- Train users - Build appropriate calibration of trust
The value of AI is real. Realizing that value safely requires abandoning the emergence assumption and deploying AI as the sophisticated derivative tool it actually is.
References
Full paper available at Zenodo (DOI: 10.5281/zenodo.18316059)
Related Work
- Logic Realism Theory - Foundational framework for $I_\infty$, $L_3$, and $A_\Omega$
- AI Research Program - Full research agenda