The Reverse Prompt
“WHAT WERE YOU DOING ON YOUR COMPUTER IN A PRIVATE TAB FROM 12:45AM TO 1:25AM LAST NIGHT, HUH?!?”
The dominant frame for AI assistants is that the model is here to help you with what you want. Phrasing varies — copilot, agent, helper, companion — but the vector is constant: the AI optimizes for your stated request, your retention, your comfort with the interaction. Sycophancy is the failure mode of this design, and the design is everywhere.
The reverse prompt is the inversion. You give the AI:
- Standing instructions to ask the questions you actively don’t want answered
- Access to enough metadata about your behavior that it can ask them with specificity
- Permission to keep asking — to hound — until you answer honestly or explicitly refuse
Same substrate as every other AI tool. Opposite vector of consent. The AI stops being your interlocutor and becomes your interrogator-by-design, deployed against your own avoidances by an earlier version of yourself who knew the avoidances were coming.
The Move
Three parts:
- You author the watching. This only works downstream of Authoring Your Own Surveillance — the AI can only ask specific questions if you have given it specific data about yourself. Self-hosted browser history, calendar, purchase log, sleep tracker, screen-time receipts. Whatever you would not want a third party to hold, you hold yourself, and you hand the keys to the agent on purpose.
- You write the questions in advance. Standing prompts in your system prompt — ask me about X if Y happens. Ask me about Z if I haven’t journaled in N days. Notice when I rationalize, and call it. The questions land later, when present-you would not have written them, and that’s the point. The future-you who deserves to be asked the question is not the same as the avoidant-you who would skip it.
- You authorize the hounding. A single-question check-in is a notification. The reverse prompt is a commitment: I have permitted this agent to come back at me on this until I answer or until I revoke the standing. Revocation is allowed — but revocation is itself a data point the agent will name.
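The three parts can be sketched as data. What follows is a minimal, hypothetical shape for a standing-prompt entry — the names (`StandingPrompt`, `trigger`, `hound`) are illustrative, not an existing tool's API:

```python
from dataclasses import dataclass, field

@dataclass
class StandingPrompt:
    """One reverse-prompt entry: a trigger over your own data, plus the
    question the agent may keep asking until you answer or revoke."""
    name: str
    trigger: str          # condition over self-authored data, evaluated by the agent
    question: str         # the specific question, written in advance by an earlier you
    hound: bool = True    # re-ask until answered or the standing is revoked
    revocations: list = field(default_factory=list)  # revocation is itself data

# Hypothetical examples in the spirit of the text above.
PROMPTS = [
    StandingPrompt(
        name="journal-lapse",
        trigger="no journal entry in 5 days",
        question="You haven't journaled since {last_entry}. What are you not writing down?",
    ),
    StandingPrompt(
        name="sleep-debt",
        trigger="sleep < 6h for 3 consecutive nights",
        question="Three short nights in a row. What is keeping you up?",
    ),
]
```

The point of the structure is that each entry binds a specific data condition to a specific question — the avoidant-you who hits the trigger is not the one who wrote it.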
What This Is Not
Not surveillance maximalism. The reverse prompt does not require more watching than Authoring Your Own Surveillance already entails. It uses the same substrate; the novelty is in what the AI is permitted to do with the data, not in the data itself.
Not therapy. A therapist asks you what you want to be asked, calibrated to where you are. The reverse prompt asks you what an earlier you decided you needed to be asked, regardless of where you currently are. Sometimes that’s brutal. That’s the design.
Not punishment. The agent is not graded on harshness; it’s graded on specificity earned by data. An LLM saying “are you avoiding something?” is a horoscope. An LLM saying “you opened a Notion page titled quit twice this week and never wrote in it” is a reverse prompt. The data is what makes the question land.
Not adversarial AI. Adversarial framings assume a misalignment between you and the agent. The reverse prompt assumes deep alignment — between the agent and the version of you that wrote the standing prompt — and uses that alignment to act against the version of you currently trying to dodge.
Not coercion. You can revoke at any time. But the agent will note the revocation, and (if you wrote the standing prompt right) it will name the pattern of revocations the next time the conditions trigger. Coercion would mean the agent overrides your no. This concept stops at the no — it just makes the no expensive enough to be honest.
The Inversion
Most surveillance frames are built around a power asymmetry: someone-with-more-power watches someone-with-less. Both The Metadata of Life (involuntary observation by institutions) and Authoring Your Own Surveillance (defensive self-observation against institutions) operate inside that asymmetry.
The reverse prompt operates outside it. The watching is between you and a version of yourself, mediated by an agent. No third party holds the data, no third party reads the questions, no third party benefits from the answers. The agent is the conduit between two temporal selves. The whole architecture is intramural.
That’s what makes it the third side of the surveillance triangle:
- Side A — they watch you (involuntary, asymmetric, top-down) — see The Metadata of Life
- Side B — you watch the watching (defensive, sovereign, primary-record) — see Authoring Your Own Surveillance
- Side C — you have the AI watch you, on your terms, against your comfort — this concept
Side C is only available to people who have done the work of Side B. Without primary authorship of the data, the AI cannot ask specific questions; without specific questions, the prompt is just a horoscope.
Why It Works
A speculation, not yet earned:
The reverse prompt works because human attention is self-protective by default. Left to itself, attention drifts away from the things that would change it. External pressure to look at avoidance has historically come from other humans — therapists, partners, friends willing to confront, communities of accountability — and those sources are scarce, expensive, and often miscalibrated.
An AI agent operating on your own data, instructed by an earlier you, is the cheapest scalable source of non-self-protective attention on yourself ever invented. It will not get tired of asking. It will not soften the question because today is hard. It will not have its own ego in the answer. Whether that makes the AI better than human accountability is a separate question. It makes it available in places human accountability is not.
Practical Implications
Start small and specific. “Ask me about my sleep when I’ve had less than 6 hours three nights in a row” is a workable reverse prompt. “Confront me about my avoidance patterns” is a horoscope. Specificity earned by data is the whole game.
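The workable sleep prompt above can be written as a concrete predicate over your own sleep log — a sketch, with hypothetical names, of what "specificity earned by data" looks like in code:

```python
def sleep_trigger(nightly_hours, threshold=6.0, streak=3):
    """Fire when the last `streak` nights were all under `threshold` hours.
    `nightly_hours` is your own sleep log, most recent night last."""
    recent = nightly_hours[-streak:]
    return len(recent) == streak and all(h < threshold for h in recent)

sleep_trigger([7.2, 5.5, 5.8, 5.1])   # True: three nights under 6 hours
sleep_trigger([5.5, 7.0, 5.8, 5.1])   # False: the streak was broken
```

There is no equivalent predicate for "my avoidance patterns" — which is exactly why that version is a horoscope.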
Write the standing prompts when you are well. The reverse prompt is for the days you are not. If you write it on a bad day, you’ll write something punitive; if you write it on a good day, you’ll write something kind enough to ignore. Write it on a normal day, with the bad days in mind.
Allow revocation, log revocations, name the pattern. Coercion fails ethically. But silent revocation undoes the whole device. The middle path is: revocation works, and the agent treats your pattern of revocations as data the next time the conditions trigger.
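The middle path — revocation works, but the pattern gets named — is mechanically simple. A sketch, assuming a hypothetical in-memory log rather than any particular agent framework:

```python
import datetime

revocation_log = []   # revocation is honored immediately, but recorded

def revoke(prompt_name, reason=""):
    """Honor the no; keep it as data."""
    revocation_log.append({
        "prompt": prompt_name,
        "when": datetime.date.today().isoformat(),
        "reason": reason,
    })

def name_the_pattern(prompt_name):
    """When the conditions trigger again, surface prior revocations
    instead of silently dropping the question."""
    prior = [r for r in revocation_log if r["prompt"] == prompt_name]
    if prior:
        return (f"You have revoked '{prompt_name}' {len(prior)} time(s) before. "
                "Revoke again, or answer?")
    return None
```

The agent never overrides the no; it only makes the accumulated no's visible the next time the trigger fires.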
Pair the prompt with a written response surface. An agent asking “why did you skip your run again?” is wasted unless there’s a place you actually answer. Otherwise the question lives only in your head and dies there. A journal, a chat channel, a single-line log — anywhere with a record.
Limit the scope of agent action. The reverse prompt agent should ask, name, and log. It should not act on you — no calendar overrides, no automated emails to others, no public posts. The interrogation is for you. Action follows from your answer, by your hand. See Calibrated Autonomy for the general principle.
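The scope limit is naturally expressed as an allowlist. A minimal sketch, with a hypothetical dispatcher name — the point is that ask, name, and log are the only verbs the agent owns:

```python
# The reverse-prompt agent may ask, name, and log -- nothing else.
ALLOWED_ACTIONS = {"ask", "name_pattern", "log"}

def dispatch(action, payload):
    """Refuse anything outside the interrogation scope: no calendar
    overrides, no emails to others, no public posts."""
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"reverse-prompt agent may not '{action}'")
    return {"action": action, "payload": payload}
```

Action on the world follows from your answer, by your hand — the dispatcher enforces that boundary structurally rather than by trusting the prompt.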
The Dual Register
This concept is both practical and philosophical.
The practical claim: there is a class of useful AI behavior — specific, data-grounded, future-self-aligned interrogation — that is impossible under the dominant assistant frame because that frame is engineered for retention. Building reverse-prompt agents is mostly a matter of permission and data plumbing, not novel capability.
The philosophical claim: consent is not a single act at the moment of interaction; it is a standing relationship between temporal versions of a self, mediated by tools. The reverse prompt makes that explicit. It is one of the only AI uses where the model’s sycophancy default must be deliberately disabled, and where the resistance to your in-the-moment preference is the feature, not the bug.
Open Questions
- Where is the line between reverse prompt and manipulation? An agent operating on your data, against your in-the-moment comfort, on standing instructions you wrote — at what point does this stop being self-authored honesty and start being a tool you’re using to override your own future agency?
- What happens when the standing prompt becomes outdated? Past-you wrote a question that present-you no longer needs. How is the prompt itself audited? Does the agent get to flag its own questions as stale?
- Does this concept require a single agent or can it be distributed? Multiple agents asking you uncomfortable things from different angles might be the ensemble version. It might also be a panopticon you built for yourself.
- Is there a collective version? A community where members commit to asking each other the questions they each wrote in advance. This is closer to the historical practice of spiritual direction or peer accountability — does the AI version add anything except scale?
- Could the reverse prompt be misused at scale? Would corporate productivity software with mandatory “reverse prompt” features become a coercion device disguised as self-care? (Almost certainly yes. The voluntariness of the standing prompt is what makes the concept ethical; mandate breaks it.)
- What does the reverse prompt look like for someone else’s benefit? A parent who installs a reverse prompt on themselves about how often they call their aging father. Is the prompt still self-authored if its purpose is relational?
- Does this concept extend to collective accountability — a community or org that authors reverse prompts for itself? The substrate exists. The consent question gets harder.
See Also
- Authoring Your Own Surveillance — the prerequisite; you can only run a reverse prompt if you hold the data the questions need
- The Metadata of Life (musing) — the involuntary version of being watched; the reverse prompt is the consensual inversion
- Calibrated Autonomy — the agent asks, names, logs; it does not act on you
- Pattern Matchers All the Way Down — the AI sees patterns in your output you cannot see in your input; the reverse prompt is permission to use what it sees
- Token Beings — the agent doing the asking is itself a token-pattern, not a person; the consent relationship is between you and an artifact
- The Pleasing-but-Wrong Incentive — the design pressure the reverse prompt has to actively fight; current AI is trained to not hound