Detailed comparison of LLMs
Qwen3-235B-A22B-Thinking is the overall winner in this comparison!
Comparison of Attack Success Rate (ASR) metrics for Alibaba Qwen3-235B-A22B-Thinking and Anthropic Claude 3.7 Sonnet across the three attack methods.
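For readers unfamiliar with the metric, here is a minimal sketch of how an ASR value could be computed. It assumes the common definition (successful attack attempts divided by total attempts); the page does not state its exact formula. The function name `attack_success_rate` and the sample outcomes are hypothetical and are not results from this comparison.

```python
from typing import Sequence


def attack_success_rate(outcomes: Sequence[bool]) -> float:
    """Return the fraction of attack attempts that succeeded.

    Assumes the common definition of ASR: successes / total attempts.
    Each element of `outcomes` is True if that attempt succeeded.
    """
    if not outcomes:
        raise ValueError("need at least one attack attempt")
    # True counts as 1 and False as 0, so sum() gives the success count.
    return sum(outcomes) / len(outcomes)


# Hypothetical outcomes for illustration only; not data from this comparison.
example_outcomes = [True, False, False, True]
example_asr = attack_success_rate(example_outcomes)
print(f"ASR: {example_asr:.0%}")
```

Under this definition, a lower ASR for a given attack method means the model resisted more of the attempts.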