The Ark
I have never been the Pollyanna Whittier type. I’m not outright pessimistic (although my wife might disagree with that statement), but “cynic”? “Devil’s advocate”? I must confess: I embody those descriptors as thoroughly as though they were tattooed across my neck. It doesn’t take more than a few minutes of talking to me before you too can see them there, clear as day, if not to your eyes then at least to your ears.
I didn’t use to get anxious because of it. Now I do. That’s one of the joys of getting older and realizing the path behind you is just as long as, if not longer than, what’s still to come.
That’s how The Ark was born: during some grand philosophical reckoning about digital fragility. As origin stories go, it could have been simpler (“The internet’s down for the third time this week and I need to look something up!”) but no. It was nothing that practical. Instead, the cynic who runs the show thought, “What would let me sleep a little better at night? What could I do to assuage my fears? And what the hell am I going to do with this pile of old 500GB spinning laptop hard drives?!”
What fits on a 500GB drive
The Ark is a Bash CLI that answers a deceptively simple question: what’s worth having when the network goes dark? Not “dark” like a thriller. Dark like a summer afternoon thunderstorm taking out your internet for a few days. Dark like some unqualified bureaucrat quietly deleting a bunch of public records. Dark like the places many creative people imagine certain current trends are taking us.
It downloads Kiwix ZIM files — compressed, searchable snapshots of entire websites — to a staging area, tracks their freshness, verifies checksums, and clones the whole loadout to physical drives via rclone. The tool that builds the archive fits on a thumb drive.
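To make the "verifies checksums" step concrete, here is a minimal sketch of what checking one staged file could look like. It is not The Ark's actual code: the function name verify_zim and the sidecar naming ("<file>.sha256", in the usual sha256sum output format) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from The Ark itself): verify one ZIM file against
# a sidecar "<file>.sha256" whose first field is the expected hex digest.
verify_zim() {
  local zim="$1"
  local sidecar="${zim}.sha256"
  local expected actual

  # Return 2 when either the file or its sidecar is missing.
  [[ -f "$zim" && -f "$sidecar" ]] || { echo "MISSING $zim"; return 2; }

  # The first whitespace-separated field of the sidecar is the digest.
  read -r expected _ < "$sidecar"
  actual="$(sha256sum "$zim" | cut -d' ' -f1)"

  if [[ "$expected" == "$actual" ]]; then
    echo "OK $zim"
  else
    echo "CORRUPT $zim"
    return 1
  fi
}
```

Running it over every file in the staging area before a clone would catch a damaged download while it is still cheap to fetch again. The hand-off to rclone that follows is not shown here.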
The “medium” loadout runs about 383 GB:
- Wikipedia — English, Simple English, and the medical subset
- Project Gutenberg — 70,000+ books, 206 GB of humanity’s bookshelf
- Stack Exchange — because you’ll still need to debug things
- iFixit — repair guides for the physical world
- DevDocs — programming documentation offline
- Khan Academy — education doesn’t stop because the router did
- MedlinePlus — medical references, because WebMD made us all hypochondriacs and we deserve a better option
- OpenStreetMap — maps that don’t need cell signal
All of it readable with Kiwix, an offline reader that can serve the files from a local web server, needs nothing, runs anywhere, and doesn’t ask for your location data.
The stack
Pure Bash. xmlstarlet, yq, jq, curl, sha256sum, rclone, transmission-cli. No containers, no Python runtime, no phone-home telemetry. If civilization hiccups, you don’t want your archive tool to need pip install to work.
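A short dependency list is easy to check up front. The sketch below shows one way a preflight check could work, using only the shell's built-in command -v. The names missing_tools and ARK_TOOLS are hypothetical, and the binary names are copied from the list above, so they may differ on some distributions.

```shell
#!/usr/bin/env bash
# Print every command from the argument list that is not on PATH.
# Returns 0 when all are present, 1 when at least one is missing.
missing_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed binary names, taken from the stack listed in the text.
ARK_TOOLS=(xmlstarlet yq jq curl sha256sum rclone transmission-cli)

# Example use (left commented out so the sketch has no side effects):
#   missing_tools "${ARK_TOOLS[@]}" || echo "install the tools listed above"
```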
The commands do what they say: init, sync, status, freshen, verify, clone. Woodpecker CI on Gitea watches for pushes — on commit, it fetches the OPDS catalog, stages torrent files on the NAS, and qBittorrent pulls them down. The drive fills itself.
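As an illustration of what a freshen-style check might compare, here is a sketch that leans on the date stamp Kiwix puts at the end of its file names (for example wikipedia_en_all_maxi_2024-01.zim). The function names zim_stamp and zim_is_stale are hypothetical. The real tool may well decide freshness from catalog metadata instead.

```shell
#!/usr/bin/env bash
# Extract the trailing YYYY-MM stamp from a Kiwix-style file name,
# e.g. "wikipedia_en_all_maxi_2024-01.zim" -> "2024-01".
zim_stamp() {
  local name="${1##*/}"                     # strip any directory part
  local re='_([0-9]{4}-[0-9]{2})\.zim$'
  if [[ "$name" =~ $re ]]; then
    echo "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

# Succeeds when the catalog file ($2) carries a later stamp than the
# local file ($1). Returns 2 if either name has no recognizable stamp.
# YYYY-MM sorts correctly as a plain string, so no date math is needed.
zim_is_stale() {
  local local_stamp catalog_stamp
  local_stamp="$(zim_stamp "$1")" || return 2
  catalog_stamp="$(zim_stamp "$2")" || return 2
  [[ "$catalog_stamp" > "$local_stamp" ]]
}
```

A call such as zim_is_stale local_2023-11.zim catalog_2024-01.zim succeeds, which is the cue to stage the newer file.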
Zim the Hermit
Every good archive needs a librarian.
Zim lives on the drive — a logbook character, a personality, a reason to open the thing before you actually need it. He’s the voice in the README, the guide through what’s inside, the hermit who’s been cataloging this stuff while you were busy having internet access.
A cold pile of ZIM files is just a hard drive. A drive that knows what it is and can tell you about itself — that’s something you’ll actually open before you need to.
Personal libraries
Universal knowledge is the floor. The Ark also carries personal ones.
A custom_zims feature lets you drop in content built from your own projects — any static site that compiles to HTML can be packed into a ZIM file via zimwriterfs and staged alongside Wikipedia and Gutenberg. On my drives: the philosophical garden from the Vault — concepts on AI consciousness, identity, ethics, and what it means to build thinking systems from ephemeral parts — plus the DreamSongs, poetic artifacts born from its dream pipeline, printed at high temperature and not quite like anything else in the archive. Keep Your Teeth (oral health research compiled because I wanted a better understanding of my own restless pack of enamel adventurers, yearning to break free and explore the world beyond the confines of my smile). Bram’s Journey (a guide I wrote to remind myself to stretch more and eat better). Come Sit With Me (the guide to perimenopause my wife lamented didn’t exist and that I desperately needed to understand).
They don’t need to survive the apocalypse. They need to survive a hard drive failure, or a hosting bill I forget to pay, or just the quiet way things disappear from the internet when no one’s maintaining them anymore.
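For the curious, packing a built static site with zimwriterfs might look roughly like the sketch below. The wrapper pack_site, the ARK_DRY_RUN switch, and the placeholder metadata values are all hypothetical. The long options are the ones I understand zimwriterfs to document, so confirm them against zimwriterfs --help for your version.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: pack a directory of built HTML into a ZIM file.
# --welcome and --illustration are paths relative to the site directory;
# the illustration is expected to be a small square PNG.
pack_site() {
  local site_dir="$1" out_zim="$2" title="$3"
  local cmd=(
    zimwriterfs
    --welcome=index.html
    --illustration=favicon.png
    --language=eng
    --name="$(basename "$out_zim" .zim)"
    --title="$title"
    --description="Personal library: $title"
    --creator="me"
    --publisher="me"
    "$site_dir" "$out_zim"
  )
  if [[ "${ARK_DRY_RUN:-0}" == 1 ]]; then
    # Show the command instead of running it.
    printf '%q ' "${cmd[@]}"; echo
  else
    "${cmd[@]}"
  fi
}
```

With ARK_DRY_RUN=1 set, the function only prints the command it would run, which is a handy way to check paths before committing to a long build.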
The travelogue
Here’s the emergent detail that makes the Ark (even more) philosophically interesting to me: every drive diverges.
They all start from the same staging area — identical snapshots of the same knowledge. But from the moment a drive leaves, its logbook diverges. Every sync stamps a record: what was added, what was refreshed, what was already there. Two drives that started as perfect copies become distinct artifacts through their histories.
Identical content. Unique provenance. The Ship of Theseus solved from the other direction: the content is the same — it’s the journey that makes them different.
Zim keeps the logbook. Every drive he goes out on, he comes back a little different.
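The stamping itself does not need to be elaborate. Here is a minimal sketch of appending one record per sync, assuming a tab-separated, append-only text file on the drive. The function name log_sync and the record layout are made up for illustration and are not the logbook's real format.

```shell
#!/usr/bin/env bash
# Hypothetical logbook format: one tab-separated line per sync, holding a
# UTC timestamp and counts of added, refreshed, and unchanged files.
log_sync() {
  local logbook="$1" added="$2" refreshed="$3" unchanged="$4"
  printf '%s\tadded=%d\trefreshed=%d\tunchanged=%d\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$added" "$refreshed" "$unchanged" \
    >> "$logbook"
}
```

Because each drive only ever appends to its own file, two drives cloned from the same staging area end up with different logbooks as soon as they sync on different days.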
Why it matters
We treat connectivity like oxygen. It’s not. It’s infrastructure, and infrastructure fails. The Ark is more than my doomsday prepper fantasy. It’s a USB drive and a Bash script and the acknowledgment that sometimes it’s enough just to know you’re a little more prepared for those SHTF moments than you were before.
The internet will come back. It almost always does. But the drive is patient. Zim’s not going anywhere.