PATCHPILOT
A multi-agent AI cybersecurity platform that decodes, scans, classifies, retrieves, patches, and validates, running as a production-grade service out of my homelab.
Run it from this page
The actual platform, embedded below. Drop in a code snippet or APK — the six-stage pipeline runs against it in real time. If the embed feels cramped on mobile, the launch button up top opens it full-screen in a new tab.
EMBEDDED OVER CLOUDFLARE TUNNEL · IF YOU SEE A 5XX, THE PI IS REBOOTING — TRY AGAIN IN A FEW MINUTES.
SIX AGENTS · ONE PIPELINE
DECODE
Parse the input (APK to smali, web app to AST), then normalize it for the pipeline.
SCAN
Regex + taint-flow analysis across 8 vulnerability categories.
CLASSIFY
CWE ID + severity + context. Catches false positives before the LLM stage.
RAG
Vector retrieval across CWE / OWASP. Grounds the LLM in real reference patches.
PATCH
Local LLM proposes a unified diff using retrieved context, so the output structure is deterministic.
VALIDATE
Re-scan the patched output. If new issues appear or the original issue isn't fixed, reject the patch.
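For a feel of how the SCAN and VALIDATE stages fit together, here is a minimal sketch of a regex scan with a re-scan gate. It is an illustration, not the platform's code: the `RULES` table, the `Finding` type, and the two patterns are hypothetical stand-ins, and the taint-flow half of the scanner is left out entirely.

```python
import re
from dataclasses import dataclass

# Hypothetical rule table (an assumption for this sketch):
# rule name -> (pattern, CWE ID, severity).
RULES = {
    "sql_injection": (
        re.compile(r"execute\(\s*[\"'].*[\"']\s*\+"),
        "CWE-89",
        "high",
    ),
    "hardcoded_secret": (
        re.compile(r"(password|api_key)\s*=\s*[\"'][^\"']+[\"']", re.IGNORECASE),
        "CWE-798",
        "medium",
    ),
}


@dataclass(frozen=True)
class Finding:
    rule: str
    cwe: str
    severity: str
    line: int


def scan(source: str) -> list[Finding]:
    """Run every rule against every line and collect the matches."""
    findings = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        for name, (pattern, cwe, severity) in RULES.items():
            if pattern.search(text):
                findings.append(Finding(name, cwe, severity, lineno))
    return findings


def validate(original: str, patched: str) -> bool:
    """Accept a patch only if the original findings are gone and nothing new fires.

    Findings are compared by rule name rather than line number,
    because a patch usually shifts lines around.
    """
    before = {f.rule for f in scan(original)}
    after = {f.rule for f in scan(patched)}
    unfixed = before & after      # the original issue is still present
    introduced = after - before   # the patch created a new issue
    return not unfixed and not introduced
```

Calling `validate(original, patched)` returns `False` both when the patch leaves the original finding in place and when it trips a rule the original did not, which is the reject condition described above.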
Mistral 7B on Ollama. Source code never leaves the box. There are no per-token API fees, latency is predictable, and the RAG step closes most of the capability gap with frontier models.
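As a rough sketch of what a local PATCH call can look like, the snippet below builds a grounded prompt and posts it to Ollama's `/api/generate` endpoint on its default port. The prompt wording, the `mistral` model tag, the temperature setting, and both function names are assumptions for illustration, not the platform's actual prompt or configuration.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_patch_request(code: str, references: list[str], model: str = "mistral") -> dict:
    """Assemble the JSON body: retrieved reference patches first, then the code."""
    context = "\n\n".join(references)
    prompt = (
        "You are a security patching assistant.\n"
        f"Reference patches:\n{context}\n\n"
        f"Vulnerable code:\n{code}\n\n"
        "Reply with a unified diff only."
    )
    return {
        "model": model,                    # assumed tag; use whatever is pulled locally
        "prompt": prompt,
        "stream": False,                   # ask for one JSON object, not a token stream
        "options": {"temperature": 0},     # reduces sampling variation between runs
    }


def propose_patch(code: str, references: list[str]) -> str:
    """Send the request to the local Ollama server and return the generated text."""
    body = json.dumps(build_patch_request(code, references)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

`propose_patch` needs a running Ollama instance with the model already pulled; `build_patch_request` can be inspected on its own to see exactly what would be sent.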
READ THE DEEP DIVE
14-min build log on why six agents instead of one big LLM call, how the RAG grounding loop works, and what the validate-by-re-scan step caught that single-prompt approaches miss.
READ THE BUILD LOG →