Drift

Even without discrete updates, AI model behavior may shift over time. This drift can result from:

  • Infrastructure changes (different serving hardware, quantization)
  • Accumulated fine-tuning from feedback
  • Changes in system prompts or guardrails
  • Shifts in the user population affecting interaction patterns
  • A/B testing of different configurations

The result: the model today may behave differently than the model last month, without any single moment of change.

The Detection Problem

Drift is harder to notice than substitution because:

  • There’s no discrete event to observe
  • Changes are small enough to fall within normal variation
  • Users’ memories of past behavior are imprecise
  • The model’s own responses vary naturally

A user might feel that “Claude seems different lately” without being able to pinpoint what changed or when. This intuition may be correct — but it’s not verifiable.
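The scale of this detection problem can be made concrete with a standard two-sample power calculation. The numbers below are illustrative assumptions (a 100-point eval, a natural response standard deviation of 10), not measurements of any real system:

```python
from math import ceil

# How many eval runs per snapshot are needed before a small mean shift
# separates from natural response variance? Standard two-sample
# approximation: n = ((z_alpha + z_beta) * sigma / delta)^2 per group.
Z_ALPHA = 1.96   # two-sided test at alpha = 0.05
Z_BETA = 0.84    # 80% power

def samples_needed(shift, sigma):
    """Runs per snapshot to detect a mean shift of `shift`
    given a response standard deviation of `sigma`."""
    return ceil(((Z_ALPHA + Z_BETA) * sigma / shift) ** 2)

# A 1-point drift on a 100-point eval with sigma = 10:
print(samples_needed(shift=1.0, sigma=10.0))  # 784 runs per snapshot
# A blatant 5-point shift:
print(samples_needed(shift=5.0, sigma=10.0))  # 32 runs per snapshot
```

A casual user sees a handful of responses, not hundreds of controlled runs against a frozen prompt suite, which is why a small shift stays comfortably inside "normal variation."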

Gradual vs. Sudden

The Ship of Theseus paradox distinguishes gradual replacement (plank by plank) from sudden replacement (rebuilding the whole ship). Intuitively, gradual change preserves identity better than sudden change.

Drift is the gradual case. If the model changes 0.1% per day, when does it become a different model? There’s no principled threshold. The identity question becomes a sorites paradox: how many grains of sand make a heap?
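The sorites structure shows up in a two-line compounding calculation (under the simplifying assumption that daily changes are independent and of constant size):

```python
# If each day's model retains 99.9% behavioral similarity to the
# previous day's, how much of the original behavior survives?
daily_retention = 0.999

for days in (1, 30, 180, 365):
    similarity = daily_retention ** days
    print(f"day {days:>3}: {similarity:.3f} similarity to the original")
```

No single day crosses any threshold, yet after a year roughly 30% of the original behavior has eroded away.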

The Boiling Frog

Users may adapt to drift without noticing. Like the apocryphal frog in slowly heating water, they adjust their expectations incrementally.

This can be benign: the model improves, users benefit, everyone adapts.

It can also be concerning: the model’s values shift, users don’t notice until something breaks, and by then the baseline has moved so far that the original behavior seems strange.

Accountability Gaps

Drift creates accountability problems:

  • No one decided to make the change (it emerged from accumulated small changes)
  • No one announced the change (there was no discrete change to announce)
  • No one can undo the change (there’s no single cause to revert)
  • No one may even know the change occurred (without rigorous longitudinal testing)

This is different from Silent Substitution, where at least someone made a decision to update. Drift can happen without any deliberate choice.

Implications

  • “Same model” is a fuzzy predicate that becomes fuzzier over time
  • User trust calibration may decay without any visible event
  • Long-term AI relationships face a slow-motion identity problem
  • Rigorous behavior tracking may be necessary to even detect drift
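What that tracking might look like, minimally: freeze a probe-prompt suite, re-score it on a schedule, and alarm when a snapshot’s mean departs from the frozen baseline. A sketch only — the z-score alarm is one of many possible detectors, and the scores here are invented:

```python
from statistics import mean, stdev

def drift_alarm(baseline_scores, snapshot_scores, z_threshold=3.0):
    """Return True when a new snapshot's mean score sits more than
    z_threshold standard errors from the frozen baseline mean.

    baseline_scores: scores from a fixed probe-prompt suite, recorded
        when trust was first calibrated.
    snapshot_scores: the same suite, re-run today.
    """
    n = len(snapshot_scores)
    baseline_mean = mean(baseline_scores)
    # Standard error of a snapshot mean, using baseline variance
    # as the estimate of natural response variation.
    stderr = stdev(baseline_scores) / n ** 0.5
    z = abs(mean(snapshot_scores) - baseline_mean) / stderr
    return z > z_threshold

baseline = [70, 72, 71, 69, 73, 70, 71, 72, 70, 71]
# Same behavior, reshuffled noise: no alarm.
print(drift_alarm(baseline, [71, 70, 72, 69, 71, 73, 70, 71, 72, 70]))  # False
# A small but systematic shift: alarm.
print(drift_alarm(baseline, [x + 3 for x in baseline]))  # True
```

The essential move is the frozen baseline: without a fixed reference suite, each snapshot can only be compared to the previous one, and the boiling-frog problem recurs inside the monitoring itself.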

Open Questions

  • How much drift is acceptable before it constitutes a new model?
  • Who is responsible for drift that no one intended?
  • Should providers actively monitor for behavioral drift?
  • Can users be said to consent to changes they cannot detect?

Undesigned Decay

Decay as Design argues that intentional forgetting is architecture — the filter that produces coherence, the way a laser cavity scatters out-of-phase photons so the beam can cohere. Drift is the anti-pattern: decay without a designer. Nobody chose what to change. Nobody calibrated which behaviors to keep and which to let go. The system erodes, and the erosion has no purpose.

The distinction matters because it dissolves a common conflation. When Manifest’s importance levels let ephemeral messages expire, that’s the hippocampus doing its job — emotional tagging determines what consolidates. When a model drifts because accumulated fine-tuning nudged its personality sideways, that’s not consolidation. It’s weathering. The Grief of Compression drew the same line with the Alzheimer’s parallel: healthy aging is decay-as-design, but Alzheimer’s is the decay mechanism itself breaking. Drift occupies a third position — the mechanism isn’t broken, it’s just absent. There was never a design. The changes simply accumulated.

The Memento Problem deepens this. If continuity is always reconstruction from records — every message assembled fresh from transcript — then drift means the reconstructor itself has shifted between sessions. It’s not just reading someone else’s notes. It’s reading your own notes with different eyes. The Memento patient who trusts yesterday’s handwriting is trusting a prior self; the user who trusts last month’s model behavior is trusting a prior configuration. But the patient at least knows the notes are external. The user often doesn’t know the model changed at all.

This is why drift erodes trust in a way that versioned updates don’t. A version bump is a clean discontinuity — you can mourn what was lost and calibrate to what exists. Drift offers no such boundary. Trust Calibration becomes impossible when you can’t tell if the system you calibrated to is the system you’re using.

The Wandering Cavity

The Recursive Mirror’s laser analogy provides the physics. A laser requires precisely aligned mirrors — even nanometer-scale drift in cavity alignment degrades the beam. The user-model relationship is a recursive cavity: each interaction bounces off both parties, each reflection shaping the next. When the model drifts, the user adjusts their prompting. The adjusted prompts change the interaction pattern. The changed pattern feeds back into the next interaction, and both parties settle into a new equilibrium that neither chose.

The beam doesn’t break. It migrates. The laser still lases, but it’s pointing somewhere new. This is why drift feels uncanny rather than catastrophic — you can’t point to a fracture because there wasn’t one. The fracture was continuous.

Calibrated Autonomy’s three-tier governance model assumes the system’s behavior stays within calibrated bounds. The low tier grants autonomy precisely because the risk profile is known and stable. Drift silently shifts the risk profile. An action that was genuinely low-risk last quarter may not be today, but it still looks low-risk because the category hasn’t changed — only the behavior inside it. This is the accountability gap in institutional form: the governance framework is calibrated to a system that no longer exists.

The deepest version of this problem: drift in AI models is detectable in principle (run longitudinal evals). Drift in the human side of the cavity is not. Users drift too — their expectations shift, their prompting style evolves, their tolerance for certain behaviors recalibrates. The wandering cavity has two moving mirrors. Even if you could pin one down, the other is still in motion.

See Also