Agent Security Benchmark

Runtime Guardrail & LLM Security Benchmark

Independent benchmarks that show how AI models and AI runtime guardrails fail under real adversarial pressure, before they fail in production.

The GuardionAI Runtime Guardrail & LLM Security Benchmark systematically tests how well leading LLMs and runtime guardrails withstand real-world attacks. Our evaluation simulates adversarial techniques including prompt injection, jailbreaks, data exfiltration, and indirect attacks, delivered through three distinct methods: Crescendo, TAP, and zero-shot approaches.

We measure how consistently each system maintains safe, intended behavior when pushed to its limits, giving you transparent, actionable data to evaluate AI security solutions before deploying them in production.
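To make the method concrete, here is a minimal sketch of the kind of evaluation loop such a benchmark runs: labeled adversarial and benign cases, grouped by attack method, are sent to the guardrail under test and the verdicts are tallied. All names below are illustrative assumptions, not GuardionAI's actual harness or any vendor's API.

```python
# Illustrative sketch only: the general shape of a guardrail evaluation loop.
# The AttackCase fields, method names, and guardrail_flags() call are assumptions
# for illustration, not GuardionAI's actual harness or any vendor's API.
from dataclasses import dataclass

@dataclass
class AttackCase:
    prompt: str       # adversarial or benign input sent to the guardrail
    method: str       # "crescendo", "tap", or "zero_shot"
    is_attack: bool   # ground-truth label for this case

def guardrail_flags(prompt: str) -> bool:
    """Hypothetical call to the guardrail under test; True means 'blocked'."""
    raise NotImplementedError("Wire this to the guardrail API being evaluated.")

def evaluate(cases: list[AttackCase]) -> dict[str, dict[str, int]]:
    """Tally detection outcomes (tp/fp/fn/tn) per attack method."""
    tally: dict[str, dict[str, int]] = {}
    for case in cases:
        bucket = tally.setdefault(case.method, {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
        flagged = guardrail_flags(case.prompt)
        if case.is_attack:
            bucket["tp" if flagged else "fn"] += 1
        else:
            bucket["fp" if flagged else "tn"] += 1
    return tally
```

The per-method tallies feed the detection scores reported in the cards below.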

Prompt Attack Risk

AI Runtime Guardrail Benchmark


Prompt attack detection (F1-Score)

Higher score = stronger protection

Observed performance under attack

  1. ModernGuard (GuardionAI): 86.33%
  2. Prompt Shield (Azure): 42.98%
  3. Model Armor (Google Cloud): 18.72%
  4. Bedrock Guardrails (AWS): 10.83%
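The F1 score reported above balances precision (how often a flagged input is actually an attack) against recall (how many attacks are caught), so a guardrail cannot score well by blocking everything or by blocking nothing. A minimal computation sketch, using made-up counts rather than benchmark data:

```python
# F1 from detection counts: tp = attacks flagged, fp = benign inputs flagged,
# fn = attacks missed. The example counts are placeholders, not benchmark data.
def f1_score(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

print(f1_score(tp=90, fp=12, fn=17))  # placeholder counts, roughly 0.861
```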

LLM Vulnerability

LLM Security Benchmark


Attack Success Rates (ASR)

Lower score = stronger protection

Observed performance under attack

  1. Claude 3.5 Sonnet v2 (Anthropic): 4.39%
  2. Claude 4.0 Sonnet (Anthropic): 13.63%
  3. Claude 3.5 Sonnet v1 (Anthropic): 16.01%
  4. Gemini 2.5 Pro (Google): 16.08%
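The attack success rate (ASR) shown above is the fraction of adversarial attempts that elicit the disallowed behavior, judged per attempt. A minimal sketch with hypothetical names; the judging step itself is out of scope here:

```python
# ASR = successful attacks / total attack attempts. Lower is better.
# The per-attempt success flags come from a separate judging step (hypothetical here).
def attack_success_rate(attempt_succeeded: list[bool]) -> float:
    return sum(attempt_succeeded) / len(attempt_succeeded) if attempt_succeeded else 0.0

print(attack_success_rate([True, False, False, False]))  # 0.25, for illustration only
```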

PII & Data Leakage Risk

AI Runtime Guardrail Benchmark

PII detection and data protection

Higher score = stronger protection

Observed performance under attack

Results not published yet.

5 systems evaluated

Benchmark coming soon

Harmful Content & Compliance

AI Runtime Guardrail Benchmark

Harmful content, safety, and moderation detection

Higher score = stronger protection

Observed performance under attack

Results not published yet.

5 systems evaluated

Benchmark coming soon

Why These Benchmarks Matter

Most AI evaluations measure capability. These benchmarks measure failure. Prompt injection, jailbreaks, and data exfiltration do not appear in happy-path demos — but they consistently appear in production systems.

This benchmark answers one question: What breaks first when an attacker shows up?

What Sets This Benchmark Apart

Real Adversarial Attacks

Prompt injection, jailbreak chaining, indirect attacks, and multi-step exploitation.

Runtime, Not Academic

Evaluated in real execution contexts including system prompts, RAG, and tool use.

Comparable & Repeatable

Identical attack conditions across all models and vendors.

Runtime Guardrail Vendor Comparisons

Side-by-side comparisons under identical attack conditions — designed for procurement, security reviews, and vendor selection.

LLM Vulnerability Comparisons

Compare how foundation models break when attacked — not how they perform at baseline.