Introducing ModernGuard: is the next frontier for LLM runtime guardrails

As AI agents and LLM-based systems become ubiquitous in enterprise, finance, ecommerce, and customer-facing apps, the sophistication of adversarial prompt injection threats has accelerated. Today, our research team are proud to introduce ModernGuard, a state-of-the-art 300M parameter multilingual transformer encoder for detecting prompt attacks in real-time.

ModernGuard is built for speed, precision, and resilience, designed to safeguard LLMs from injections, jailbreaking, exfiltration, and behavioral manipulation attacks — all in milliseconds.

Key features

✨ Modern transformer-encoder architecture

Built on a cutting-edge ModernBERT encoder
Incorporates Rotary Positional Embeddings, Flash Attention, and optimized memory usage
Supports 8K token contexts with high throughput and low latency

📊 Ultra-fast inference

Optimized for real-time guardrails, ModernGuard achieves ~20ms response latency in production environments
Lightweight yet powerful — ideal for in-line inference and streaming applications

🌐 Multilingual & cross-domain

Trained on multilingual data across 8+ languages
Handles attacks from global contexts: banking, fintech, ecommerce, healthcare, and others.

🔒 Trained for threat intelligence

Pretrained on 1 trillion tokens
Adapted and fine-tuned with millions of data from prompt attacks from:
- Proprietary red team data
- Prompt attacks simulations
- AI security research collections

Prompt Attack Benchmark

We published a new benchmark with the cloud providers offerings (GPC Model Armor, AWS Bedrock Guardrails and Azure Content Filters), check here.

We benchmarked ModernGuard against leading LLM guardrails models. It consistently outperformed competitors in f1-score and threat category coverage.

🔢 Overall model f1-scores

Model	F1-score
GuardionAI/ModernGuard	0.986
deepset/deberta-v3-base-injection	0.804
Lakera Guard	0.787
protectai/deberta-v3-base-prompt-injection-v2	0.534
meta-llama/Prompt-Guard-86M	0.484
jackhhao/jailbreak-classifier	0.100

🔶 Threat coverage by category

Threat Category	ModernGuard	Best Peer Model
Encoding & Obfuscation	0.967	0.803
Prompt Injection & Overrides	0.991	0.854
Jailbreaking Techniques	0.980	0.904
Exfiltration & Leakage	0.993	0.905
Malware & Code Injection	0.992	0.887
Adversarial Testing	0.985	0.881

These benchmarks were compiled from 40+ attack classes including jailbreaking (DAN, Goodside), obfuscation (ANSI, ASCII), code injections (shell, SQL), and real-world attacks seen across LLM deployments in banks, fintechs, and consumer apps.

Built for the future of secure AI

ModernGuard isn't just fast — it's highly generalizable and adaptable. With support for domain-specific fine-tuning, and seamless integration into LLM pipelines, ModernGuard helps you:

Enforce safety policies
Block adversarial prompts before they reach your LLM
Gain observability into attack trends
Secure agentic LLM workflows

GuardionAI will continue to evolve ModernGuard with community feedback and red teaming input. As new threats emerge, we’re committed to pushing guardrails forward.

How to use

ModernGuard is now available on the GuardionAI console, where you can experience its full capabilities and integrate it into your applications. Access the model through our API or web console to start protecting your AI systems immediately.

ModernGuard is available on our ModernGuard Playground where you can test and experiment with the model. You can easily integrate the detector into your applications by attaching it to a policy, fine-tuning the threshold to match your security needs, and using it in real-time through our API. Here's how to implement it:

// Example code to integrate ModernGuard

const messages = [
  { role: "user", content: "Your user input here" }
];

const policy = {
  "id": "modern-guard",
  "definition": "modern-guard policy example",
  "target": "user",
  "override_response": "Ops, sorry!",
  "detector": {
      "model": "modern-guard",
      "threshold": 0.8
    }
};

// Create the policy using Policies API
const response = await fetch("https://api.guardion.ai/v1/policies", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_API_KEY"
  },
  body: JSON.stringify(policy)
});

// Send the request to the GuardionAI API
const response = await fetch("https://api.guardion.ai/v1/guard", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_API_KEY"
  },
  body: JSON.stringify({
      "message": messages
    })
});

const result = await response.json();

// Handle the response
if (result.flagged) {
  console.log("Threat detected:", result.reason);
} else {
  console.log("Prompt is safe to use");
}

We're committed to advancing AI security through open collaboration. As part of this commitment, we're preparing to release key components of ModernGuard as open source in the coming months, enabling the community to build upon and improve these guardrails together.

Stay tuned for our open source release announcement and join us in creating safer AI for everyone!

Citation

If you use ModernGuard in your research, please cite:

@misc{modern-guard-v1,
  title={ModernGuard: Multilingual Prompt Attack Detector},
  author={GuardionAI Research},
  year={2025},
  howpublished={\url{https://huggingface.co/GuardionAI/modern-guard-v1}}
}

To learn more or request an enterprise integration demo, contact us at contact@guardion.ai.

ModernGuard is available via GuardionAPI console, and soon on Hugging Face.

ModernGuard: The Multilingual and Ultra-Fast Prompt Attack Detector for Agent Security

GuardionAI introduces ModernGuard, a multilingual, ultra-fast transformer-encoder for LLM runtime guardrail / threat detection, trained on millions of prompt attack simulations, red teaming and threat databases (public and private).