NerfSentry
The turret is a $30 foam dart cannon from Amazon with a USB HID interface that was never meant to be programmed by anyone outside the factory. On top of it, hot-glued at an angle that took three attempts to get right, sits a Logitech C615 webcam. Together they form a sentry system that tracks movement, follows people across the room, and waits — finger on the trigger — for a human to say “fire.”
That last part matters. The human stays in the loop. Everything else is automatic.
How it sees
OpenCV does the heavy lifting. A HOG people detector finds humans in the frame. A background subtractor catches movement. The turret-mounted camera introduces a problem most tracking systems don’t have — every time the turret moves, the entire visual field shifts, which looks like motion to a naive detector. So the background model resets after motor commands, and a settling timer lets the image stabilize before detection resumes. It’s the kind of tuning you only discover when the thing is on your desk, spinning in circles, convinced the wall is an intruder.
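The reset-and-settle behavior can be sketched as a small gate wrapped around the detection loop. Everything here is illustrative: the class name, the 0.5 s settle time, and the injected clock are assumptions, and the actual OpenCV pieces (cv2.HOGDescriptor with the default people detector, cv2.createBackgroundSubtractorMOG2 for the background model) are left out so the sketch stays dependency-free.

```python
import time

SETTLE_SECONDS = 0.5  # assumed value; tune for your camera and turret


class MotionGate:
    """Suppress motion detection while the turret's own movement
    is still shaking the frame."""

    def __init__(self, settle=SETTLE_SECONDS, clock=time.monotonic):
        self.settle = settle
        self.clock = clock          # injectable for testing
        self.last_motor_cmd = None
        self.needs_reset = False

    def motor_command_sent(self):
        # Every motor command shifts the entire visual field, which a
        # naive background subtractor reads as scene-wide motion.
        self.last_motor_cmd = self.clock()
        self.needs_reset = True

    def detection_allowed(self):
        # Hold off detection until the image has had time to stabilize.
        if self.last_motor_cmd is None:
            return True
        return (self.clock() - self.last_motor_cmd) >= self.settle

    def consume_reset(self):
        # True exactly once after settling: the caller should throw away
        # the stale background model (e.g. rebuild the MOG2 subtractor)
        # before resuming detection.
        if self.needs_reset and self.detection_allowed():
            self.needs_reset = False
            return True
        return False
```

In the main loop, every turret move calls motor_command_sent(), and frames are only fed to the detector when detection_allowed() is true; consume_reset() tells the caller when to rebuild the background model.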
How it moves
PyUSB sends HID commands to the turret’s vendor interface. The protocol is straightforward — directional movement, stop, fire — but the hardware has opinions. USB writes throw EIO errors (errno 5) that look fatal but aren’t. The commands land anyway. Three of the four limit switches are detectable through the protocol. The right boundary isn’t — the turret just stops responding to right-move commands when it hits the edge, and software has to respect that silence as a wall. Full-size Nerf darts jam the mechanism. Mini darts only.
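The EIO handling can be sketched as a thin wrapper around the USB write. pyusb's usb.core.USBError subclasses OSError, so real code can catch it the same way; the command constants and send_fn here are made up for illustration, not the turret's actual protocol values.

```python
import errno

# Hypothetical command bytes, NOT the turret's real protocol values.
CMD_LEFT, CMD_RIGHT, CMD_STOP, CMD_FIRE = 0x04, 0x08, 0x20, 0x10


def tolerant_send(send_fn, command):
    """Send a HID command, treating EIO as success.

    On this hardware the command lands even when the USB write reports
    errno 5, so we swallow exactly that error and re-raise anything else.
    """
    try:
        send_fn(command)
    except OSError as exc:
        if exc.errno != errno.EIO:
            raise  # a real failure, not the turret's usual noise
```

The same wrapper is where the right-boundary silence would be handled: since there is no detectable limit switch on that side, the caller has to track how far right it has asked the turret to go and stop trusting further right-moves past that point.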
The fire motor needs an explicit stop command after its cooldown period, or it keeps spinning indefinitely. The kind of detail you document after the third time it happens at 1am.
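That rule is small enough to encode directly: every fire schedules its own stop. The function name and the 0.3 s cooldown are assumptions; the real value comes from the hardware.

```python
import threading

COOLDOWN_S = 0.3  # assumed cooldown, not a measured value


def fire(send_fn, cooldown=COOLDOWN_S):
    """Trigger the fire motor and guarantee a stop after the cooldown."""
    send_fn("FIRE")
    # Without this, the fire motor keeps spinning indefinitely.
    timer = threading.Timer(cooldown, send_fn, args=("STOP",))
    timer.daemon = True
    timer.start()
    return timer  # returned so callers and tests can join on it
```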
The interface
Single-page web app. Neon green on black. No framework — just HTML, CSS, and enough JavaScript to pipe the MJPEG stream and wire up the controls. It looks like a terminal from a movie where the hacker is about to do something ill-advised, which is aesthetically correct for a foam dart turret with computer vision.
Flask serves the stream and a REST API. Connect, disconnect, nudge in any direction, stop, fire. Camera controls — zoom, focus, exposure, resolution switching — are wired in. The stream is MJPEG, which means it’s just an <img> tag that keeps updating. Dead simple and it works everywhere.
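The MJPEG framing is the whole trick behind the updating <img> tag. A sketch of the generator, with the frame source left hypothetical; in Flask this would be wrapped as Response(mjpeg_chunks(frames), mimetype="multipart/x-mixed-replace; boundary=frame").

```python
BOUNDARY = b"frame"


def mjpeg_chunks(jpeg_frames):
    """Yield multipart/x-mixed-replace chunks, one per JPEG frame.

    The browser replaces the <img> contents each time a new part
    arrives, which is the entire streaming mechanism.
    """
    for jpeg in jpeg_frames:
        yield (b"--" + BOUNDARY + b"\r\n"
               + b"Content-Type: image/jpeg\r\n"
               + b"Content-Length: " + str(len(jpeg)).encode("ascii")
               + b"\r\n\r\n"
               + jpeg + b"\r\n")
```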
The tests
Fifteen tests run against DummyTurret and DummyCamera — mock hardware interfaces that let you develop and test without the physical turret plugged in. The real turret is on my desk. The tests run in CI. Both realities coexist.
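The mock-hardware idea, sketched. The method names here (move, stop, fire) are assumed, not the project's exact DummyTurret interface; the point is that a recorder stands in for motors, so control logic runs anywhere.

```python
class DummyTurret:
    """Stand-in for the USB turret: records commands instead of moving."""

    def __init__(self):
        self.commands = []

    def move(self, direction):
        self.commands.append(("move", direction))

    def stop(self):
        self.commands.append(("stop",))

    def fire(self):
        self.commands.append(("fire",))


def nudge(turret, direction):
    """Example control routine under test: move briefly, then stop."""
    turret.move(direction)
    turret.stop()
```

A test then asserts on the recorded command list instead of watching a physical turret twitch.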
Where it’s going
Right now it lives in Docker on a dev machine. The destination is a Raspberry Pi — small enough to mount somewhere interesting, powerful enough to run the detection loop. There’s a sketch for VLM-based detection through Ollama on the GPU infrastructure — trading the HOG detector’s speed for a vision-language model’s judgment. “Is that a person or a coat on a chair?” The turret doesn’t care. The VLM might.
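A hedged sketch of what that VLM call could look like against Ollama's REST API. The model name ("llava"), the prompt, and the endpoint choice are all assumptions; only the request body is built here, since actually POSTing it to http://localhost:11434/api/generate needs a running Ollama server.

```python
import base64
import json


def build_vlm_request(jpeg_bytes, model="llava",
                      prompt="Is there a person in this image? Answer yes or no."):
    """Build the JSON body for Ollama's /api/generate endpoint.

    Images go in as base64 strings; stream=False asks for a single
    JSON response instead of a token stream.
    """
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(jpeg_bytes).decode("ascii")],
        "stream": False,
    })
```

The detection loop would send the current frame through this instead of (or after) the HOG detector, trading milliseconds for judgment.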
Why this exists
Because the process is the point. Reverse-engineering a USB HID protocol, tuning a motion detector that’s mounted on the thing it’s trying to hold steady, building a cyberpunk UI for a toy that shoots foam — none of this needed to happen. All of it was worth doing. The turret sits on my desk and tracks me while I type. It hasn’t fired without permission yet. The “yet” is doing a lot of work in that sentence.