Frontier AI Powers Offensive Cyber Ops at Scale

Frontier AI models excel at cyber tasks such as zero-day discovery and cryptographic breaks, automating specialist skills in ways that cut attackers' costs and increase their speed and scale. AISI tested 7 models pre-March 2026 on a 32-step enterprise network attack (human expert: 14 hours) and a complex ICS scenario. The top performer, Claude Opus 4.6 (Feb 2026), finished 18 of the 32 steps (56%) autonomously at a cost of £65 per run; 18 months earlier, models made near-zero progress. No model completed a full scenario, but distillation spreads capabilities to cheaper and open models. The skills are dual-use: the same capabilities aid defenders' testing. Drivers: post-training via RLHF/scaffolding, and agentic systems that chain models and tools. Public demos show real misuse in which safeguards were bypassed.

Model Limits Create Defender Detection Windows

Pre-2026 models hit barriers: processing timeouts (which mean the results likely understate true potential), knowledge gaps in reverse engineering, cryptography, and malware, poor multi-step coordination, context loss over long operations, and inconsistency between runs. Their activity also generates detectable alerts in monitored environments, buying defenders time to respond. Purpose-built setups with tools and human oversight would boost attacker performance, but strong monitoring can exploit these limits today. NCSC forecasts how the AI threat will evolve in the near term in its AI-cyber report.

Leverage AI for Hardening, Detection, and Response

Defenders amplify advantages via AI systems (models + tools/workflows/oversight). Top applications:

  • Attack surface reduction: AI tools scan codebases exhaustively, prioritize vulns by exploitability, generate patches (e.g., DARPA AIxCC, Google CodeMender, OpenAI Codex Security)—shrinking attacker windows.
  • Threat detection/investigation: LLMs triage alerts, retain context, probe suspicious activity, deploy honeypots—catching subtle intrusions beyond signature-based methods.
  • Automated mitigation: Quarantine hosts, rotate creds, and block IPs without waiting for a human, slashing response time; miscalibrated automation, however, risks disrupting legitimate operations.
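
The miscalibration risk in the last bullet can be limited by gating automated actions on detection confidence and asset criticality. The following is a minimal sketch under stated assumptions, not a reference implementation: the thresholds, host names, action labels, and the `decide` function are all hypothetical and would need tuning for a real environment.

```python
from dataclasses import dataclass

# Hypothetical values chosen purely for illustration.
LOG_ONLY_BELOW = 0.5        # below this confidence, record the alert and do nothing
AUTO_ACT_THRESHOLD = 0.9    # at or above this confidence, unattended action is allowed
CRITICAL_HOSTS = {"dc01", "payroll-db"}  # assets where a wrong action is costly


@dataclass
class Alert:
    host: str
    confidence: float      # detector's confidence in [0, 1]
    proposed_action: str   # e.g. "quarantine_host", "rotate_credentials", "block_ip"


def decide(alert: Alert) -> str:
    """Return 'auto', 'needs_approval', or 'log_only' for a proposed mitigation."""
    if alert.confidence < LOG_ONLY_BELOW:
        return "log_only"
    # Critical assets and mid-confidence alerts always go to a human first.
    if alert.host in CRITICAL_HOSTS or alert.confidence < AUTO_ACT_THRESHOLD:
        return "needs_approval"
    return "auto"
```

Under these assumptions, a high-confidence alert on an ordinary laptop is handled automatically, while the same alert on a domain controller is routed to an analyst, trading some response speed for a lower chance of disruption.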

AI shifts the defensive paradigm but adds risks of its own, such as over-reliance; secure AI systems in line with the UK's AI security code.

Shape Battlefield with Baselines to Hold Advantage

Defenders hold structural advantages: global collaboration, market-driven defenses, and the ability to 'shape' their own environments (e.g., correlating signals and baselining normal behavior for anomaly detection). AI scales these advantages and forces attackers to be stealthier, but weak foundations erode them fast. Prioritize the basics, which no AI tool replaces: Cyber Essentials controls (MFA everywhere, patched software and devices, network segmentation, controlled privileged access, endpoint security). Invest in these baselines plus targeted AI to amplify strengths as threats scale.
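
As an illustration of baselining behavior for anomaly detection, the sketch below flags an observation that deviates sharply from a host's historical norm using a simple z-score. It is one possible approach under stated assumptions, not a method prescribed by the source: the function name, the threshold of 3 standard deviations, and the sample connection counts are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(baseline: list[float], observed: float, z_threshold: float = 3.0) -> bool:
    """Flag an observation more than z_threshold standard deviations from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is unusual.
        return observed != mu
    return abs(observed - mu) / sigma > z_threshold


# Hypothetical daily outbound-connection counts for a single host.
baseline_counts = [40, 42, 38, 41, 39, 43, 40]

print(is_anomalous(baseline_counts, 41))   # within the normal range
print(is_anomalous(baseline_counts, 400))  # far outside the baseline
```

A real deployment would baseline many signals per host and user and correlate them, but even this simple form shows why an attacker operating in a well-baselined environment must stay close to normal behavior to avoid detection.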