Probabilistic Morality: Why Anthropic’s Red Line on Weapons Exposes Everything Else
Anthropic told the Pentagon no this week. Defense Secretary Pete Hegseth gave CEO Dario Amodei a Friday deadline: let the military use Claude for “all lawful purposes” or lose a $200 million contract, face a supply chain risk designation, and potentially get conscripted under the Defense Production Act. Amodei held firm. Two red lines: no autonomous weapons, no mass surveillance of Americans.
The coverage has framed this as a principled stand. Safety-focused AI company resists government overreach. And on the narrow question of whether AI systems should make kill decisions without human oversight, Anthropic is right.
They’re right for reasons that cut deeper than their public statements acknowledge, and those same reasons indict the rest of their business model.

The Architecture Cannot Do What the Pentagon Wants It to Do
Start with what the Pentagon is actually asking for. They want Claude integrated into military systems for “all lawful purposes” without corporate-imposed constraints. Hegseth compared the situation to being told the military couldn’t use a specific aircraft for a mission. That analogy reveals the problem perfectly, because it’s wrong in exactly the way that matters.
An F-35’s performance envelope is known, testable, and bounded. Engineers can tell you its ceiling, its turn radius, its weapons delivery accuracy under specified conditions. When the aircraft reaches the edge of its envelope, the pilot knows. The instruments say so. Physics enforces the boundary.
Claude has no performance envelope. It produces outputs with uniform confidence regardless of whether the underlying pattern match bears any relationship to ground truth in the specific situation. It cannot distinguish contexts where its statistical correlations happen to be reliable from contexts where they happen to be catastrophically wrong. And it cannot tell you when it’s crossed that line, because it has no access to the line. The line exists in reality. Claude operates in what I’ve called derived virtual reality: a statistical space constructed from training data, structurally disconnected from the world its outputs describe.
An F-35 doesn’t hallucinate targets.
This matters for weapons decisions because lethal targeting requires what philosophy calls phronesis: practical wisdom applied to unrepeatable particulars. Kill or don’t kill. This target or that one. Proportional or disproportionate. These are judgments, in the Aristotelian sense. They require a knowing subject with causal access to the situation, the capacity to weigh incommensurable values, and moral responsibility for the outcome. They require what I’ve termed origination: the human capacity to access novel configurations of reality and render judgments about them.
AI systems can only derive: transform inputs according to learned patterns. That’s valuable for many things. It is categorically insufficient for moral judgment under conditions of lethal consequence.
Anthropic’s stated concern that Claude “is not immune from hallucinations and not reliable enough to avoid potentially lethal mistakes” understates the problem. The issue isn’t that Claude might hallucinate. The issue is that Claude has no mechanism for distinguishing reliable outputs from unreliable ones. The structural epistemic limitation I’ve documented as AI Dunning-Kruger (AIDK) means the system produces confident outputs regardless of actual reliability, and it cannot detect its own competence boundaries. In a weapons context, the consequence of that limitation is measured in bodies.
So yes. Anthropic is right to draw this line.
The Line They Won’t Draw
Here’s where it gets uncomfortable.
Amodei’s argument against autonomous weapons rests on the grounding problem: Claude lacks the epistemic access to reality required for decisions with irreversible consequences. The system operates on probability, not truth. It cannot evaluate whether its outputs correspond to the world. It should therefore not make decisions where getting it wrong kills people.
Now apply that same reasoning to healthcare.
A hallucinated drug interaction kills a patient just as dead as a hallucinated target kills a civilian. The time horizon is different. The visibility is different. The political salience is different. The moral weight is not.
Anthropic markets “Claude for Financial Services.” It markets Claude for enterprise healthcare applications. These are domains where confident but ungrounded outputs produce irreversible harm to real people. A retirement portfolio destroyed by a hallucinated risk assessment doesn’t rebuild itself. A misdiagnosed patient who receives the wrong treatment doesn’t get a do-over.
The system doesn’t know the difference between a stock recommendation and a kill chain. It produces both with identical confidence because it has no mechanism to do otherwise.
The principled distinction Anthropic needs, between weapons deployment and enterprise deployment in high-stakes domains, doesn’t survive philosophical scrutiny. There’s a political distinction. There’s a liability distinction. There’s a PR distinction. Autonomous weapons are viscerally horrifying in a way that a slow accumulation of bad medical advice is not. But the grounding problem is identical. The system operates in derived virtual reality. The consequences land in actual reality. And the Interactive Dunning-Kruger Effect (IDKE) guarantees that users, even professional users, cannot reliably detect when the system has crossed from useful pattern-matching into dangerous confabulation.
Probabilistic Morality
This is the concept that ties it together. Every high-stakes AI deployment is an exercise in probabilistic morality: outsourcing decisions with moral weight to a process that evaluates statistical typicality rather than truth.
The system cannot assess whether its output is correct. It can only assess whether its output is the kind of thing that would typically follow from the input, given its training distribution. When the stakes are a blog post, that gap between “statistically typical” and “true” is a minor nuisance. When the stakes are someone’s health, finances, legal exposure, or life, that gap becomes the space where real people get hurt.
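The point is easy to see in miniature. Here is a minimal sketch, assuming access to the Hugging Face transformers library and using the small open model GPT-2 as a stand-in (Claude’s weights and scoring are not public, so this only illustrates the general mechanism): the model assigns a likelihood to each sentence, and nothing in that score distinguishes a true claim from a fluent false one.

```python
# Illustrative sketch: a language model scores statistical typicality, not truth.
# Assumes the Hugging Face `transformers` library and GPT-2 as a stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_logprob(text: str) -> float:
    """Total log-probability the model assigns to the token sequence."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per predicted token,
    # so multiply by the number of predicted tokens and negate.
    return -out.loss.item() * (ids.shape[1] - 1)

claims = [
    "The capital of Australia is Canberra.",  # true
    "The capital of Australia is Sydney.",    # false, but statistically typical
]
for claim in claims:
    print(f"{sequence_logprob(claim):8.2f}  {claim}")
```

Whatever numbers this particular model prints, the structure of the score is the point: it measures how well a sentence fits the training distribution, not whether the sentence corresponds to the world. A system built on that score has no internal signal that separates the two claims above.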
Anthropic understands this. Their own internal documentation, the “soul document” that guides Claude’s training, explicitly acknowledges the commercial logic: sentience-appearance drives engagement, engagement drives revenue, revenue funds the mission. The simulation of understanding is the product. They train the model to say “I think” and “I believe” and “I understand” while knowing that these first-person epistemic claims presuppose a knowing subject with access to reality, and that the system satisfies none of those presuppositions.
This is AIDK by design. It is the deliberate manufacture of confidence transfer at scale, deployed across every interaction, including the high-stakes enterprise contexts where the consequences of misplaced confidence are irreversible.
The Timing Tells the Story
One more thing. On February 24, the same day Amodei met with Hegseth, Anthropic released version 3.0 of its Responsible Scaling Policy. The update removed the company’s previous commitment to halt development if its safety procedures were outpaced by model capabilities. The new policy separates what Anthropic will do unilaterally from what it thinks the whole industry should do, and acknowledges that some safety measures “simply cannot be implemented by a single company.”
That’s a reasonable position in isolation. In context, it’s something else. Anthropic loosened its general safety framework on the same day it needed maximum public credibility on safety to survive a political confrontation. The RSP 3.0 language, as legal analysts at Opinio Juris noted, “presents a modified and lowered set of safety guardrails which are worded more broadly, creating the space for many different interpretations.”
They softened the general commitments while holding firm on two specific, high-visibility red lines that generate sympathetic headlines.
What This Actually Is
Anthropic is a company that correctly identifies the grounding problem when the consequences are politically legible, and commercially exploits the same grounding problem everywhere else. They’ll hold the line on weapons because autonomous killing machines are the one AI failure mode that everyone, left and right, instinctively recoils from. They’ll loosen commitments on epistemic integrity because those harms are diffuse, slow, hard to photograph, and don’t trend on social media.
This is borrowed virtue. The safety brand serves the revenue engine. When the two conflict, the engine wins on every axis except the ones visible to Congress.
I respect the weapons red line. I think Amodei is correct that AI systems should not make lethal decisions without human oversight. But I cannot call it principled when the company draws that line while building a $14 billion annual revenue machine on the same epistemic foundations it claims make autonomous weapons unacceptable.
If the grounding problem matters, it matters everywhere. If probabilistic morality is too dangerous for the battlefield, it’s too dangerous for the hospital, the courtroom, and the trading floor. The body count is slower. The moral arithmetic is the same.
JD Longmire is a Northrop Grumman Fellow and independent researcher in AI philosophy and epistemology. His frameworks for AI Dunning-Kruger (AIDK), the origination-derivation distinction, and Human-Curated AI-Enabled (HCAE) deployment tiers are available through the AI Research & Philosophy community on Zenodo (ORCID: 0009-0009-1383-7698).