The Category Error of AI
When someone says “you can’t trust AI,” they’re making a statement about a category so broad it’s nearly meaningless. The word “AI” now spans:
- Large language models with different training approaches (Claude, GPT, Gemini, Llama)
- Image generators (DALL-E, Midjourney, Stable Diffusion)
- Recommendation algorithms (Netflix, YouTube, Amazon)
- Self-driving systems (Tesla, Waymo)
- Game-playing agents (AlphaGo, OpenAI Five)
- Voice assistants (Siri, Alexa)
- Spam filters, fraud detection, medical imaging analysis
These systems have almost nothing in common except the marketing term applied to them. Asking “is AI trustworthy?” is like asking “is transportation safe?” — the question is confused.
Why the Conflation Happens
Several forces push toward treating “AI” as monolithic:
Media coverage: Headlines about “AI” don’t distinguish between systems. A failure in one area becomes evidence about “AI” generally.
Marketing: Companies brand everything as “AI” for buzz. The term has commercial value that incentivizes overuse.
Technical opacity: Users can’t inspect systems, so they lump together what they can’t distinguish.
Genuine uncertainty: Even experts disagree about where to draw category boundaries.
Conversational convenience: “AI” is shorter than “large language models trained with constitutional AI methods by companies with strong safety cultures.”
What Gets Obscured
The conflation hides crucial differences:
Training methodology: Constitutional AI and RLHF produce different behaviors. A sycophantic chatbot and an honest one are both “AI.”
Capability domains: Image generators and language models have different failure modes. Conflating them obscures the failure modes of each.
Provider values: Companies with different safety cultures produce different AI. The institution matters.
Deployment context: The same model used for medical advice vs. creative writing has different risk profiles.
Verification status: Open-weight models can be (partially) audited; closed models can’t. Both are “AI.”
The Practical Problem
When a user asks “can I trust this AI output?” the answer depends on:
- Which AI system specifically?
- What was it trained for?
- By whom, with what values?
- For what task am I using it?
- What are the consequences of error?
Treating “AI” as a category answers none of these. It’s like asking “should I eat this food?” without specifying what food.
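The questions above can be made concrete as structured data. This is a hypothetical sketch, not a real framework: the field names, the `needs_verification` rule, and its threshold are all illustrative assumptions chosen to show that trust is a function of several variables, not of the label “AI.”

```python
from dataclasses import dataclass

@dataclass
class AIOutputContext:
    """Each field corresponds to one of the questions above (illustrative)."""
    system: str            # which AI system specifically?
    trained_for: str       # what was it trained for?
    provider: str          # by whom, with what values?
    task: str              # for what task am I using it?
    error_severity: int    # consequences of error, 1 (low) to 5 (high)

def needs_verification(ctx: AIOutputContext) -> bool:
    """Toy rule, assumed for illustration: verify whenever the task
    falls outside the training purpose or the stakes are non-trivial."""
    off_label = ctx.task not in ctx.trained_for
    return off_label or ctx.error_severity >= 3

# Example: the same kind of model, used for medical advice rather than
# casual conversation, produces a different answer to "can I trust this?"
ctx = AIOutputContext(
    system="generic LLM",
    trained_for="general conversation",
    provider="unknown vendor",
    task="medical advice",
    error_severity=5,
)
print(needs_verification(ctx))  # True: high stakes, off-label use
```

The point of the sketch is that no field is named “AI”: once the question is decomposed this way, the monolithic category does no work at all.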
The Explanation Burden
This creates challenges in conversation. When explaining AI-assisted work to skeptics:
“But how can you trust what AI tells you?”
The accurate answer is: “That depends on which AI, trained how, by whom, for what purpose, in what domain, with what verification.” But this answer is unsatisfying, and it presupposes background knowledge most listeners don’t have.
The tempting shortcut — “Claude is different” — invokes Brand as Proxy for Trust without resolving the underlying confusion.
Implications
- Users need finer-grained categories than “AI”
- Providers benefit from differentiation but also from category confusion
- Public discourse about AI is hampered by conceptual conflation
- Trust calibration requires specificity the current vocabulary doesn’t support
Toward Better Categories
What might help:
- Distinguishing by training methodology (Constitutional AI, RLHF, supervised fine-tuning)
- Distinguishing by domain (language, vision, robotics, recommendation)
- Distinguishing by transparency (open-weight, closed, audited)
- Distinguishing by provider (company values, track record, accountability)
None of these has become standard vocabulary yet.
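One way to see what such a vocabulary would look like is to encode the four distinctions above as explicit axes. The type and value names here are invented for illustration; the enumerated options are simply the ones listed above, not a proposed standard.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical axes, taken directly from the distinctions above.
class Training(Enum):
    CONSTITUTIONAL_AI = "constitutional AI"
    RLHF = "RLHF"
    SUPERVISED_FINE_TUNING = "supervised fine-tuning"

class Domain(Enum):
    LANGUAGE = "language"
    VISION = "vision"
    ROBOTICS = "robotics"
    RECOMMENDATION = "recommendation"

class Transparency(Enum):
    OPEN_WEIGHT = "open-weight"
    CLOSED = "closed"
    AUDITED = "audited"

@dataclass
class SystemDescriptor:
    name: str
    training: Training
    domain: Domain
    transparency: Transparency
    provider: str

    def label(self) -> str:
        """A finer-grained label than 'AI'."""
        return (f"{self.domain.value} model, {self.training.value}-trained, "
                f"{self.transparency.value}, by {self.provider}")

# Two systems that would both be called "AI" get distinct labels:
a = SystemDescriptor("hypothetical-model-a", Training.RLHF,
                     Domain.LANGUAGE, Transparency.CLOSED, "Vendor A")
print(a.label())  # language model, RLHF-trained, closed, by Vendor A
```

Whatever vocabulary eventually emerges, the design choice the sketch illustrates is that trust-relevant properties should be fields, not connotations of a brand name.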
Open Questions
- Can public understanding catch up to technical differentiation?
- Who benefits from maintaining category confusion?
- What vocabulary would support better trust calibration?
- Is “AI” recoverable as a term, or should it be abandoned?
See Also
- Trust Calibration — how to calibrate trust given this confusion
- Constitutional AI vs RLHF — one important distinction within “AI”
- The Pleasing-but-Wrong Incentive — a failure mode of some AI but not all
- Brand as Proxy for Trust — the practical shortcut when categories fail