How I govern a fleet of AI agents in my own product's repo
It's not vibe-coding. It's a system with contracts, safety gates, and a human review that's never skipped.
- Role: Solo engineering
- Jobs shipped: 230+
- Stack: Astro · React · TS
- Status: Live in production
AI is fast and tireless and, without guardrails, just as fast and tireless at breaking things. What matters isn't the prompt: it's the system around the agent.
Four layers that make autonomy safe.
State lives in the filesystem
Every job is a ticket, and the folder it lives in is its state. No ticket closes without a QA "theater" that proves the work is done. More than 200 jobs have shipped this way.
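A minimal sketch of the folder-as-state idea, in TypeScript. The folder names (`inbox`, `in-progress`, `qa`, `done`), the example paths, and the `hasQaEvidence` flag are all hypothetical; the real layout isn't shown here. The functions only compute where a ticket would move, so the actual file move is left out.

```typescript
// Hypothetical state folders; the real names may differ.
type State = "inbox" | "in-progress" | "qa" | "done";

// Each state has exactly one successor; "done" is terminal.
const NEXT: Record<State, State | null> = {
  inbox: "in-progress",
  "in-progress": "qa",
  qa: "done",
  done: null,
};

function isState(value: string): value is State {
  return Object.prototype.hasOwnProperty.call(NEXT, value);
}

// The state is simply the name of the folder that contains the ticket file.
function stateOf(ticketPath: string): State {
  const parts = ticketPath.split("/");
  const folder = parts[parts.length - 2] ?? "";
  if (!isState(folder)) {
    throw new Error(`Ticket is not in a known state folder: ${ticketPath}`);
  }
  return folder;
}

// Compute the path the ticket would move to next.
// Moving into "done" is refused unless QA evidence exists (assumed flag).
function nextPath(ticketPath: string, hasQaEvidence: boolean): string {
  const next = NEXT[stateOf(ticketPath)];
  if (next === null) {
    throw new Error("Ticket is already done");
  }
  if (next === "done" && !hasQaEvidence) {
    throw new Error("Cannot close a ticket without QA evidence");
  }
  const parts = ticketPath.split("/");
  parts[parts.length - 2] = next;
  return parts.join("/");
}
```

Because the state is derived from the path, there is no separate status field that can drift out of sync with the ticket.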
A binary safety gate
Before touching anything: agent-ok or human-required. Auth, payments, the database, secrets → always human. No grey zone.
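One way such a gate could look, as a sketch under stated assumptions: the path patterns below are hypothetical stand-ins for the four sensitive areas named above, and treating an empty file list as `human-required` is my own fail-closed assumption, not something the text specifies.

```typescript
type Verdict = "agent-ok" | "human-required";

// Hypothetical patterns for auth, payments, the database, and secrets.
const SENSITIVE: RegExp[] = [
  /(^|\/)auth(\/|\.|$)/i,
  /(^|\/)payments?(\/|\.|$)/i,
  /(^|\/)(db|database|migrations)(\/|\.|$)/i,
  /(^|\/)\.env(\.|$)/i,
  /secrets?/i,
];

// Binary by construction: the return type has only two values.
function gate(touchedPaths: string[]): Verdict {
  // Unknown scope is treated as sensitive (assumption: fail closed).
  if (touchedPaths.length === 0) {
    return "human-required";
  }
  for (const path of touchedPaths) {
    for (const pattern of SENSITIVE) {
      if (pattern.test(path)) {
        return "human-required";
      }
    }
  }
  return "agent-ok";
}
```

A single sensitive path is enough to flip the whole job to `human-required`, which is what keeps the gate free of a grey zone.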
The agent never merges
It implements, opens a PR, and stops. I review and merge. If something degrades, I see it as "a PR I reject," never as bad code in production.
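The rule can be pictured as a permission table in which `merge` is simply not available to the agent. This is an illustrative sketch only: the actor and action names are hypothetical, and in a real repo the same rule would more likely be enforced by the hosting platform's permissions than by application code.

```typescript
type Actor = "agent" | "human";
type Action = "push-branch" | "open-pr" | "merge";

// Hypothetical permission table: the agent may propose, never merge.
const AGENT_MAY: Record<Action, boolean> = {
  "push-branch": true,
  "open-pr": true,
  merge: false,
};

function isAllowed(actor: Actor, action: Action): boolean {
  return actor === "human" || AGENT_MAY[action];
}

// Every action goes through this check; a denied action throws
// instead of silently proceeding.
function perform(actor: Actor, action: Action): string {
  if (!isAllowed(actor, action)) {
    throw new Error(`${actor} is not allowed to ${action}`);
  }
  return `${actor}:${action}`;
}
```

With this shape, the worst outcome of a degraded agent is a pull request that gets rejected.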
A test that watches whether the AI decays
Frozen golden cases: when I change the agent's rules, it re-solves them and I check that the tests still pass. Regression testing, but for the agent's behavior.
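A minimal sketch of such a golden-case runner. Everything here is an assumption for illustration: the `GoldenCase` shape, the synchronous `Agent` signature (a real agent call would be asynchronous), and the example case with its stub agent.

```typescript
// A frozen task plus the check its output must keep passing.
interface GoldenCase {
  id: string;
  task: string;
  passes: (output: string) => boolean;
}

// Simplification: a real agent call would be asynchronous.
type Agent = (task: string) => string;

interface Report {
  passed: string[];
  failed: string[];
}

// Re-solve every golden case with the current agent and record the result.
function runGoldenCases(agent: Agent, cases: GoldenCase[]): Report {
  const report: Report = { passed: [], failed: [] };
  for (const golden of cases) {
    let ok = false;
    try {
      ok = golden.passes(agent(golden.task));
    } catch {
      ok = false; // a crash counts as a regression too
    }
    if (ok) {
      report.passed.push(golden.id);
    } else {
      report.failed.push(golden.id);
    }
  }
  return report;
}

// Hypothetical golden case and a stub standing in for the agent.
const cases: GoldenCase[] = [
  {
    id: "slugify-title",
    task: "slugify: Hello World",
    passes: (output) => output === "hello-world",
  },
];

const stubAgent: Agent = (task) =>
  task.replace("slugify: ", "").toLowerCase().split(" ").join("-");

const report = runGoldenCases(stubAgent, cases);
```

After a rule change, any id that lands in `report.failed` signals that the agent's behavior has regressed on a case it used to solve.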