RIFT-Bench: A Framework for Automated Agentic AI Red-Teaming

The Shift to Agentic Security

Traditional LLM security evaluations focus on static prompt injection or output filtering. However, autonomous agentic systems—which utilize tools, maintain state, and execute multi-step decision-making—introduce new attack vectors that standard benchmarks fail to capture. RIFT-Bench addresses this by providing a unified, architecture-agnostic methodology for evaluating these complex systems.

The Two-Phase Evaluation Pipeline

RIFT-Bench utilizes a graph-based hierarchical representation to map the structure of an agentic system, allowing it to evaluate diverse implementations consistently. The evaluation process consists of two automated phases:

Discovery: The framework maps the system's structure, identifying key components, tool access, and decision-making pathways. This creates a functional graph that serves as the basis for the subsequent attack phase.
Scanning: Using the discovered structure, the framework deploys adaptive adversarial probes. These probes are designed to be dynamic, adjusting their strategy based on the agent's responses to uncover vulnerabilities in logic, tool usage, or goal-oriented behavior.

Versatility in Evaluation

Beyond identifying vulnerabilities, RIFT-Bench is designed to evaluate the efficacy of specific mitigation strategies. By testing the same agent architecture with and without security guardrails, developers can quantify the impact of their defenses. The authors validated this methodology across 45 distinct agentic systems, demonstrating that the framework generalizes effectively across heterogeneous architectures, providing a scalable foundation for ongoing security research and production-grade safety testing.

The Shift to Agentic Security

The Two-Phase Evaluation Pipeline

Versatility in Evaluation

More from AI & LLMs

Decomposing AI Workflows into Reusable Skills

5 Essential Concepts for Modern AI Agent Architecture

Perplexity Brain: Self-Improving Memory for AI Agents

Building AI Agents with Model Context Protocol (MCP)