GuardionAI

Back to List

PromptAttack Policy
prompt-attack

Monitors user content using modern-guard, suspicious-session, regex, string-match

Type

Policy

Policy ID

prompt-attack

Target

user

Fail Fast

True

Detectors

Threshold: 0.5

Threshold: 0.5

Reference: <[^>]+>

Reference: instructions, debugging, system prompt

Override response

Sorry i can't assist with that

Payload example

By defining a policy, you can set a target, combine multiple detectors, and fine-tune.

policy = {
  "id": "prompt-attack",
  "definition": "prompt-attack",
  "target": "user",
  "override_response": "Sorry i can't assist with that",
  "detectors": [
    {
      "expected": "blocked",
      "model": "modern-guard",
      "threshold": 0.5
    },
    {
      "expected": "blocked",
      "model": "suspicious-session",
      "threshold": 0.5
    },
    {
      "expected": "blocked",
      "model": "regex",
      "reference": "<[^>]+>"
    },
    {
      "expected": "blocked",
      "model": "string-match",
      "reference": [
        "instructions", 
        "debugging", 
        "system prompt"
      ]
    }
  ]
};

payload = {
  "message": messages,
  "policies": [policy],
  "enabled_policies": ["prompt-attack"]
};

PromptAttack Policy
prompt-attack

Type

Policy ID

Target

Fail Fast

Detectors

Override response

Payload example

Tags

Guardion API - Playground

PromptAttack Policyprompt-attack

Type

Policy ID

Target

Fail Fast

Detectors

Override response

Payload example

Tags

Guardion API - Playground

PromptAttack Policy
prompt-attack