NerfSentry

The turret is a foam dart cannon, rescued from the trash, with a USB HID interface that was never meant to be programmed by anyone outside the factory. On top of it, hot-glued on its haunches, sits a Logitech C615 webcam. Together they form a sentry system that tracks movement, follows people across the room, and waits — finger on the trigger — for a human to say “fire.”

That last part matters. The human stays in the loop. Everything else is automatic.

How it sees

OpenCV does the heavy lifting. A HOG people detector finds humans in the frame. A background subtractor catches movement. The turret-mounted camera introduces a problem most tracking systems don’t have — every time the turret moves, the entire visual field shifts, which looks like motion to a naive detector. So the background model resets after motor commands, and a settling timer lets the image stabilize before detection resumes. It’s the kind of tuning you only discover when the thing is on your desk, spinning in circles, convinced the wall is an intruder.
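That reset-and-settle logic can be sketched apart from the vision code (the class name, method names, and half-second settle time are my assumptions; the detection side pairs this with cv2.HOGDescriptor and a background subtractor such as cv2.createBackgroundSubtractorMOG2):

```python
import time

class MotionGate:
    """Pauses detection while the turret moves and the image settles.

    SETTLE_SECONDS is an assumed value; tune it for your camera and
    motor speed.
    """
    SETTLE_SECONDS = 0.5

    def __init__(self):
        self._resume_at = 0.0
        self.needs_bg_reset = False

    def note_motor_command(self, now=None):
        # Every motor command shifts the whole visual field, so schedule
        # a background-model reset plus a settling delay.
        now = time.monotonic() if now is None else now
        self._resume_at = now + self.SETTLE_SECONDS
        self.needs_bg_reset = True

    def ready(self, now=None):
        # True once the settling timer has elapsed and detection may resume.
        now = time.monotonic() if now is None else now
        return now >= self._resume_at

    def consume_reset(self):
        # Caller rebuilds the background model when this returns True.
        flag = self.needs_bg_reset
        self.needs_bg_reset = False
        return flag
```

The detection loop checks ready() each frame, and rebuilds the subtractor whenever consume_reset() fires — which is what stops the wall from registering as an intruder.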

How it moves

PyUSB sends HID commands to the turret’s vendor interface. The protocol is straightforward — directional movement, stop, fire — but the hardware has opinions. USB sends sometimes throw EIO errors (errno 5) that look fatal but aren’t; the commands land anyway. Three of the four limit switches are detectable through the protocol. The right boundary isn’t — the turret just stops responding to right-move commands when it hits the edge, and software has to respect that silence as a wall. Full-size Nerf darts don’t fit the barrels, so they just sort of limp a few inches when fired, but a few rounds of electrical tape seem to have done the trick.
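A sketch of that send path, with EIO treated as the harmless noise it is. The command bytes and report layout here are illustrative placeholders, not the turret’s real protocol; the control-transfer parameters are the standard HID SET_REPORT request PyUSB would use:

```python
import errno

# Illustrative command bytes; the real values came from reverse-engineering
# the vendor interface and will differ.
COMMANDS = {"up": 0x02, "down": 0x01, "left": 0x04, "right": 0x08,
            "fire": 0x10, "stop": 0x20}

def build_packet(command):
    # An 8-byte HID report with the command in byte 1 (assumed layout).
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    return bytes([0x02, COMMANDS[command], 0, 0, 0, 0, 0, 0])

def send_command(dev, command):
    """Send one command over the HID control endpoint, swallowing EIO.

    `dev` is a PyUSB device. usb.core.USBError subclasses OSError, so
    catching OSError here works without importing pyusb at module level.
    """
    packet = build_packet(command)
    try:
        # Standard HID SET_REPORT: bmRequestType 0x21, bRequest 0x09.
        dev.ctrl_transfer(0x21, 0x09, 0x0200, 0, packet)
    except OSError as e:
        if e.errno != errno.EIO:
            raise  # anything other than errno 5 is a real failure
```

The right-boundary silence lives a layer above this: the caller tracks how long it has been commanding right-moves and treats non-response as the wall.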

Firing is interesting in that there is no discrete “fire” command — you just run the firing motor until the mechanism pops, sending a missile flying. And the fire motor needs an explicit stop command after its cooldown period, or it keeps spinning indefinitely.
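In other words, firing is a run-then-stop pulse. A minimal sketch (the 4-second spin-up is an assumed value, and `send` is any callable that issues a named command to the turret):

```python
import time

FIRE_SPINUP_SECONDS = 4.0  # assumed; long enough for the mechanism to pop

def fire_once(send, sleep=time.sleep):
    """Launch one dart: run the fire motor, wait for the pop, stop it.

    There is no discrete "fire" command on the wire; the motor just runs
    until the mechanism releases a dart, and without an explicit stop it
    spins forever.
    """
    send("fire")
    sleep(FIRE_SPINUP_SECONDS)
    send("stop")  # mandatory: the fire motor never stops on its own
```

Injecting `sleep` keeps the sequence testable without actually waiting out the spin-up.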

The interface

Single-page web app. Neon green on black. No framework — just HTML, CSS, and enough JavaScript to pipe the MJPEG stream and wire up the controls. It looks like a terminal from a movie where the hacker is about to do something ill-advised, which is aesthetically correct for a foam dart turret with computer vision.

Flask serves the stream and a REST API. Connect, disconnect, nudge in any direction, stop, fire. Camera controls — zoom, focus, exposure, resolution switching — are wired in. The stream is MJPEG, which means it’s just an <img> tag that keeps updating. Dead simple and it works everywhere.
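The streaming side is small enough to sketch in full. The route name and the read_jpegs generator are placeholders for illustration (the real app also mounts the REST endpoints), but the multipart framing is how MJPEG-over-HTTP actually works:

```python
from flask import Flask, Response  # pip install flask

app = Flask(__name__)

def mjpeg_chunks(frames):
    # Wrap each JPEG in multipart/x-mixed-replace part headers; the
    # browser's <img> tag swaps in every part as it arrives.
    for jpeg in frames:
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + jpeg + b"\r\n")

def read_jpegs():
    # Placeholder generator; the real loop grabs frames from the webcam
    # and encodes them with cv2.imencode(".jpg", frame).
    yield from ()

@app.route("/stream")
def stream():
    return Response(mjpeg_chunks(read_jpegs()),
                    mimetype="multipart/x-mixed-replace; boundary=frame")
```

On the page, `<img src="/stream">` is the whole client. No WebSockets, no player library.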

The tests

Fifteen tests run against DummyTurret and DummyCamera — mock hardware interfaces that let you develop and test without the physical turret plugged in. The real turret is plugged into a server, guarding it from all who approach. The tests run in CI. Both realities coexist.
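The shape of that dummy layer, roughly (DummyTurret exists in the repo; the method names and log format here are my guesses at it, shown to illustrate the record-instead-of-send idea):

```python
class DummyTurret:
    """Mock turret: records commands instead of touching USB, so the
    control logic runs in CI without hardware attached.

    Method names are assumptions, not the repo's actual interface.
    """

    def __init__(self):
        self.log = []
        self.connected = False

    def connect(self):
        self.connected = True

    def move(self, direction):
        self.log.append(("move", direction))

    def stop(self):
        self.log.append(("stop", None))

    def fire(self):
        self.log.append(("fire", None))
```

Tests then assert on .log instead of watching foam fly, and the real turret class drops in behind the same interface.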

Where it’s going

Right now it lives in Docker on a dev machine. The destination is a Raspberry Pi — small enough to mount somewhere interesting, powerful enough to run the detection loop. There’s a sketch for VLM-based detection through Ollama on the GPU infrastructure — trading the HOG detector’s speed for a vision-language model’s judgment. “Is that a person or a coat on a chair?” The turret doesn’t care. The VLM might.

Why this exists

Because the process is the point. Reverse-engineering a USB HID protocol, tuning a motion detector that’s mounted on the thing it’s trying to hold steady, building a cyberpunk UI for a toy that shoots foam — none of this needed to happen. All of it was worth doing. The turret’s watchful eye never sleeps. It hasn’t fired without permission. Yet. It might be watching me, but I’m also keeping an eye on it, just in case.