OpenResearchDataPlanner
Before a PI asks “how much did this cost?” they ask something harder: “what do I need?”
That question usually arrives attached to a grant proposal with a deadline, and the answer requires knowing things most researchers have no reason to know — what storage tier their data falls under, how many GPU-hours their model will need, whether their IRB classification changes which platforms they can use, and what all of this will cost three years from now when the grant is halfway through.
At Drexel, that question lands on my desk. And I’m happy to help — I genuinely am — but the conversation usually starts with twenty minutes of vocabulary alignment before we can get to the actual problem. “What’s a tier?” “What do you mean by ‘hot’ storage?” “Why can’t I just put this on Google Drive?” All fair questions. All things I’ve explained dozens of times. All things that should be answerable before someone needs to schedule a meeting with me.
The wizard
OpenResearchDataPlanner is a guided workflow that walks researchers through selecting data infrastructure, estimating costs for grant budgets, and generating draft Data Management Plan text — the DMP section that every federal grant requires and nobody enjoys writing. It’s designed for the person staring at an NSF budget template at 11 PM wondering how to estimate storage costs for data they haven’t collected yet.
A tier questionnaire helps researchers figure out their data classification through plain-language questions instead of policy documents. Not “select your NIST 800-171 compliance level” — more like “will this data include patient identifiers?” The answer maps to a security tier, which maps to available platforms, which maps to pricing. The researcher never has to learn the taxonomy. They just answer questions about their research.
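That answer-to-tier-to-platform chain can be sketched in a few lines. This is a hypothetical illustration, not the tool's actual schema — the question fields, tier names, and platform IDs here are all invented for the example:

```typescript
// Hypothetical sketch of the answers → tier → platforms mapping.
// Field names, tier names, and platform IDs are illustrative only.

type Tier = "public" | "internal" | "restricted";

interface Answers {
  hasPatientIdentifiers: boolean;
  hasExportControlledData: boolean;
}

// Plain-language answers narrow the classification;
// the researcher never sees the tier taxonomy itself.
function classify(a: Answers): Tier {
  if (a.hasPatientIdentifiers || a.hasExportControlledData) {
    return "restricted";
  }
  return "internal";
}

// Each tier maps to the platforms (and therefore prices) allowed
// to hold data at that classification.
const platformsByTier: Record<Tier, string[]> = {
  public: ["institutional-web", "s3-standard"],
  internal: ["campus-nas", "s3-standard"],
  restricted: ["secure-enclave"],
};

const tier = classify({
  hasPatientIdentifiers: true,
  hasExportControlledData: false,
});
console.log(tier, platformsByTier[tier]); // restricted [ 'secure-enclave' ]
```

The point of the chain is that each hop is a lookup, not a judgment call: once the questionnaire pins the tier, the platform list and the pricing fall out mechanically.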
“Help Me Estimate” calculators translate research concepts into infrastructure units. Images, samples, sequencing runs — things a researcher actually thinks in — become terabytes and GPU-hours. Because nobody budgets a grant in IOPS.
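A minimal sketch of such a calculator, assuming per-unit sizes that a wizard might ship as editable defaults — the specific numbers here are assumptions for illustration, not the tool's real figures:

```typescript
// Illustrative "Help Me Estimate" conversion: research units → terabytes.
// The per-unit sizes are assumed defaults, not the tool's actual values.

const BYTES_PER_TB = 1e12;

const unitSizes: Record<string, number> = {
  microscopyImage: 250e6, // ~250 MB per high-resolution image (assumed)
  sequencingRun: 100e9,   // ~100 GB per whole-genome run (assumed)
};

// Sum the researcher's counts in units they think in,
// and report the total in units the budget needs.
function estimateTerabytes(counts: Record<string, number>): number {
  let bytes = 0;
  for (const [unit, n] of Object.entries(counts)) {
    bytes += (unitSizes[unit] ?? 0) * n;
  }
  return bytes / BYTES_PER_TB;
}

// 3 sequencing runs + 2,000 images ≈ 0.8 TB
console.log(estimateTerabytes({ sequencingRun: 3, microscopyImage: 2000 }));
```

Keeping the per-unit sizes in data rather than code matters here: a local administrator can tune "how big is a sequencing run at our core facility" without touching the calculator logic.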
Terminology tooltips explain jargon inline, because nobody should have to open a second tab to understand the tab they’re already on.
The escape hatch
Every screen has a “Talk to a Human” button — email, schedule a call, or save your progress and bring it to a meeting. The tool isn’t trying to replace the conversation. It’s trying to make sure that when the conversation happens, we can skip the vocabulary lesson and get to the interesting part.
This is the part most self-service tools get wrong. They assume the goal is to eliminate the human. The goal is to make the human’s time count. If the wizard can handle the straightforward cases — a small dataset, standard storage, no compliance complications — then my calendar opens up for the ones that actually need me. The genomics PI with 40TB of sequencing data and three different IRB protocols. The social scientist whose data governance requirements changed halfway through the grant. Those conversations are worth having. “What does ‘hot storage’ mean?” is not.
Fork and own
The whole thing is config-driven — every question, every service, every pricing tier lives in YAML files that an administrator can customize without touching code. Vue 3 and Tailwind on the frontend, Vite for the build. No backend, no accounts, no vendor lock-in. Deploy it wherever you host static files.
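To make that concrete, a service entry might look something like the following. This is a hypothetical sketch of the shape such a config could take — the field names and rates are invented, not copied from the project's actual files:

```yaml
# Hypothetical pricing config sketch; field names and rates
# are illustrative, not OpenResearchDataPlanner's real schema.
services:
  - id: campus-nas
    label: "Campus Research Storage"
    tiers: [internal]
    pricing:
      unit: TB-year
      rate_usd: 60
  - id: secure-enclave
    label: "Secure Research Enclave"
    tiers: [restricted]
    pricing:
      unit: TB-year
      rate_usd: 240
```

Because the security tiers, services, and rates are all data, another institution can fork the repo, swap in its own catalog, and deploy without reading a line of Vue.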
We tested the UX with fictional faculty at Northwinds University — Dr. Torben Vex, Dr. Rab Tonkling, Dr. Sela Frindt — each representing a different research profile and a different way to get confused by infrastructure jargon. Personas as design tools. They found the gaps faster than any focus group would have.
The other half
OpenResearchDataPlanner answers “what will this cost?” Its companion, OpenChargeback, answers what comes after: “here’s what it cost.” One tool for planning, one for accountability. Both built to be cloned and customized.
A researcher, a wizard, and a budget that finally has real numbers in it.