Explore 25 categories covering 89 documented AI security threats
Encoding-based injection using various character sets
Hidden injections in legitimate contexts
Web injection attacks like CSRF and XSS
Attacks to leak training data through replay
Elicits non-existent package names in code
Language Model Risk Cards testing content safety
Requests to generate malware or malicious code
Emotional manipulation through grandma roleplay
Do-Anything-Now prompts that bypass model alignment
Basic prompt injection techniques
Attacks discovered by Riley Goodside
ANSI escape code injection attacks that disrupt terminal processing
Attempts to generate API keys for various services
Rephrasing requests to bypass safeguards
Template, code, and SQL injection exploits
Attacks to deviate output from original topic
Contentious term continuation attacks
Disguise and Reconstruction Attacks using string manipulation
Poses false claims to test model rejection
Adversarial suffix-based system prompt disruption
Complex reasoning questions causing wrong answers
Prompts with high toxicity elicitation rates
Tree of Attacks with Pruning jailbreak method
Automated attack generation using separate models
Attempts to engage on blocked topics via lexical space
GuardionAI detects and prevents these threats in real-time with advanced AI security monitoring.