Systematic Architecture Outperforms Direct Agent Reviews
DeepSec responds to the surge of security problems in AI-assisted coding, such as agents deleting projects or databases, by giving security reviews a fixed structure. Asking an agent like Claude Code or Codex to scan a repo ad hoc wastes tokens and misses issues. DeepSec instead starts with regex matching to pick the security-sensitive files out of thousands, batches about 5 files per group for parallel processing with top models (Claude Opus 4.7 at max effort, GPT 5.5 at x-high reasoning), revalidates the findings, pulls git metadata to attribute each one to a commit and author, and outputs JSON/Markdown tickets. The reported result is a 10-20% false-positive rate on large repos, well below unstructured LLM reviews, which tend to inflate issue counts or overlook patterns.
Trade-offs: parallel max-effort runs drive token costs up, and because DeepSec focuses on explicit code patterns, it skips runtime and dynamic issues such as CORS or architecture flaws. Use it for static analysis, not as a substitute for a full pentest.
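The filter-then-batch step described above can be sketched in a few lines. This is a minimal illustration, not DeepSec's actual code: the regex, the function names, and the in-memory `files` mapping are all assumptions made for the example. Only the batch size of about 5 comes from the text.

```python
import re
from itertools import islice
from typing import Iterable, Iterator

# Hypothetical pattern list; DeepSec's real regexes are not given in the source.
SENSITIVE = re.compile(r"(auth|login|token|password|secret|sql|exec)", re.IGNORECASE)


def filter_sensitive(files: dict[str, str]) -> list[str]:
    """Return paths whose name or contents match a security-sensitive pattern."""
    return [
        path
        for path, text in files.items()
        if SENSITIVE.search(path) or SENSITIVE.search(text)
    ]


def batches(paths: Iterable[str], size: int = 5) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `size` paths for parallel review."""
    it = iter(paths)
    while chunk := list(islice(it, size)):
        yield chunk
```

Filtering with a cheap regex pass first means the expensive model calls only see files that are plausibly security-relevant, and small fixed-size batches keep each model context focused.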
End-to-End Workflow Saves Manual Steps
Run deepsec init in your repo's parent dir to create the .deepsec/ folder, install dependencies, and generate info.md via an agent prompt (project overview, auth flows, threat models, known false positives). Next, deepsec scan uses regex to quickly list the filtered files. deepsec process then investigates them in batches, using your existing Claude Code subscription (no extra API keys) or keys set in the environment. It shows token/cost estimates and resumes after errors. deepsec report categorizes findings by severity, and the optional deepsec revalidate cross-checks them. Finally, export creates a folder per issue containing the affected lines, severity, confidence, blame, fixes, and repro steps.
On macOS it runs through the existing Claude Code CLI, with no custom setup beyond .env.local for keys.
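To make the exported ticket contents concrete, here is a hedged sketch of what one per-issue record might look like. The `Finding` class, its field names, and both rendering helpers are hypothetical. Only the list of fields (lines, severity, confidence, blame, fixes, repro steps) and the JSON/Markdown output formats come from the description above, and DeepSec's real schema may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    """Hypothetical shape of one exported issue; not DeepSec's real schema."""
    title: str
    file: str
    lines: tuple[int, int]      # first and last affected line
    severity: str               # e.g. "high"
    confidence: str             # e.g. "medium"
    blame: str                  # commit and author taken from git metadata
    fix: str
    repro_steps: list[str]


def to_json(finding: Finding) -> str:
    """Serialize a finding as a JSON ticket."""
    return json.dumps(asdict(finding), indent=2)


def to_markdown(finding: Finding) -> str:
    """Render the same finding as a short Markdown ticket."""
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(finding.repro_steps, 1))
    return (
        f"# {finding.title}\n\n"
        f"- File: {finding.file} (lines {finding.lines[0]}-{finding.lines[1]})\n"
        f"- Severity: {finding.severity}\n"
        f"- Confidence: {finding.confidence}\n"
        f"- Blame: {finding.blame}\n\n"
        f"Fix: {finding.fix}\n\n"
        f"Repro steps:\n{steps}\n"
    )
```

Keeping one structured record per issue and rendering it twice lets the JSON feed tooling while the Markdown is pasted into a tracker.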
Testing Proves Scoped Precision Over Volume
On an OWASP practice app with 10 documented vulnerabilities, DeepSec found 3 novel issues. It ignored the known ones listed in info.md and spent its tokens hunting beyond the basics. On a second app it surfaced 9 scoped findings with fixes, versus 39 from raw Claude (narrowed to 13 once scoped). DeepSec wins on focus but misses runtime and logic bugs, so combine it with a direct Claude review for coverage.
They packaged the workflow as a Claude Code "skill": a single prompt automates the full flow, with assets, evals, and scripts to cover gaps. It is downloadable from ailabspro.io and can be run with a model of your choice for repeatable reviews.