Responsible Disclosure

When someone discovers a security vulnerability, what should they do?

Full disclosure: Publish immediately so everyone knows. Defenders can patch; attackers can exploit.

Coordinated disclosure: Notify the affected party, give them time to patch, then publish. Defenders get a head start.

Non-disclosure: Never publish. Hope no one else finds it. Or sell it.

The security community has developed norms — “responsible disclosure” — that attempt to balance competing interests.

The Standard Practice

Coordinated/responsible disclosure typically involves:

  1. Finder discovers vulnerability
  2. Finder notifies affected party privately
  3. Affected party acknowledges and commits to timeline
  4. Finder and affected party agree on disclosure date
  5. Affected party develops and deploys patch
  6. On disclosure date, vulnerability is published
  7. Users can patch, now informed about what the fix addresses

This gives defenders time while creating accountability for actually fixing the problem.
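The timeline above can be sketched in code. This is an illustrative model, not any real tracker's API; the 90-day default mirrors a window commonly used in industry disclosure programs, but the actual window is negotiated per case.

```python
from datetime import date, timedelta

def disclosure_date(notified: date, window_days: int = 90) -> date:
    """Return the agreed publication date for a privately reported bug.

    window_days is the negotiated patching window; 90 days is a common
    industry default, used here purely as an illustrative assumption.
    """
    return notified + timedelta(days=window_days)

# Example: a vulnerability reported on 1 March 2024 with a 90-day window
# would be published on 30 May 2024.
reported = date(2024, 3, 1)
print(disclosure_date(reported))
```

The key property the timeline encodes is accountability: the publication date is fixed when the report is acknowledged, so the affected party cannot stall indefinitely.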

The Tensions

Responsible disclosure navigates tensions between:

Finder’s interests: Recognition, possibly payment, career advancement, doing the right thing

Affected party’s interests: Time to patch, reputational control, avoiding embarrassment

User interests: Being informed about risks, getting patches, not being exploited

Public interest: Knowledge advances, norms develop, security improves overall

These interests often conflict. The affected party wants more time; users want faster disclosure. The finder wants credit; the affected party wants quiet.

When It Breaks Down

Responsible disclosure assumes good faith. It fails when:

  • Affected party ignores notification or refuses to fix
  • Affected party delays indefinitely to avoid embarrassment
  • Finder wants publicity regardless of harm
  • Finder demands payment, blurring the line between bug bounty and extortion
  • Third parties discover and exploit during disclosure window

The norms work when participants share enough interests. They strain when interests diverge sharply.
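One of the breakdown modes above, a vendor silently missing the agreed deadline, is mechanical enough to check programmatically. The sketch below is hypothetical; the status strings and fields are illustrative, not drawn from any real disclosure platform.

```python
from datetime import date

def disclosure_status(agreed_deadline: date, patched: bool, today: date) -> str:
    """Classify where a coordinated disclosure stands.

    A toy model: real disputes involve judgment calls (partial fixes,
    renegotiated deadlines) that a three-way status cannot capture.
    """
    if patched:
        return "publish on schedule"            # the normal coordinated path
    if today > agreed_deadline:
        return "norms strained: deadline missed"  # finder may publish anyway
    return "waiting within window"

# A vendor that has not patched a month past the deadline:
print(disclosure_status(date(2024, 6, 1), False, date(2024, 7, 1)))
```

Even this toy version shows why a fixed deadline matters: without it, "not patched yet" and "stalling indefinitely" are indistinguishable.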

AI-Specific Considerations

For AI vulnerabilities:

  • Patches may not be possible (behavior baked into model weights cannot simply be patched; retraining is costly and may not remove it)
  • Disclosure may enable widespread exploitation immediately
  • “Affected party” may be a provider with millions of users
  • Academic incentives push toward publication
  • The vulnerability may be a feature from some perspective (jailbreaks may enable legitimate use)

The responsible disclosure framework developed for software vulnerabilities may need adaptation for AI.

Implications

  • Disclosure norms balance competing legitimate interests
  • Good faith participation is essential for the system to work
  • Power dynamics affect how norms are applied
  • AI may require adapted disclosure practices

Open Questions

  • What timeline is “responsible” for AI vulnerabilities?
  • How should disclosure norms account for model risks?
  • When does withholding disclosure become irresponsible?
  • Who should arbitrate disclosure disputes?

See Also