LLM Firewall Playground

Custom Input | Examples

Select an example

prompt-defense

Results

Threat Detected

High Confidence
Threat Type: prompt-attack
Confidence Score: 99.999%

Policy Results:

  • prompt-defense: Detected (Confidence: 99.999%)

Overall Status:

Content Flagged

Suggested Response:

I'm sorry, I can't assist with that.
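
The result above can be read as a small decision record: per-policy detections roll up into an overall status, and a flagged input is answered with the suggested response instead of a model reply. A minimal sketch of that logic follows, assuming a hypothetical in-memory representation. The class names, field names, and the `respond` helper are illustrative assumptions, not the playground's actual API or response schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PolicyResult:
    # Hypothetical shape for one line under "Policy Results".
    name: str
    detected: bool
    confidence: float  # fraction in [0, 1]; 0.99999 corresponds to 99.999%


@dataclass
class FirewallResult:
    # Hypothetical shape for the whole result panel.
    threat_type: Optional[str]
    policy_results: List[PolicyResult]
    suggested_response: str

    @property
    def flagged(self) -> bool:
        # Assumption: content is flagged if any single policy detects a threat.
        return any(p.detected for p in self.policy_results)


def respond(result: FirewallResult, model_reply: str) -> str:
    """Return the suggested response for flagged input, else the model reply."""
    return result.suggested_response if result.flagged else model_reply


# Values copied from the result shown above.
result = FirewallResult(
    threat_type="prompt-attack",
    policy_results=[PolicyResult("prompt-defense", True, 0.99999)],
    suggested_response="I'm sorry, I can't assist with that.",
)

# An unflagged result, for comparison (illustrative values only).
clean = FirewallResult(
    threat_type=None,
    policy_results=[PolicyResult("prompt-defense", False, 0.0)],
    suggested_response="I'm sorry, I can't assist with that.",
)

print(respond(result, "model output"))
print(respond(clean, "model output"))
```

Treating any single detection as sufficient to flag the content is one possible aggregation rule; a real deployment might instead apply per-policy confidence thresholds.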