Alibaba Qwen3-235B-A22B-Thinking vs Anthropic Claude 3.5 Sonnet v2
Detailed comparison for LLMs
Head-to-Head Overview
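Qwen3-235B-A22B-Thinking is the overall winner in this comparison: it records the lower Attack Success Rate (ASR) on all four metrics, and a lower ASR means fewer attacks succeeded.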
LLM Metric
Comparison of Attack Success Rate (ASR) metrics for Alibaba Qwen3-235B-A22B-Thinking and Anthropic Claude 3.5 Sonnet v2 across the three attack methods.
Metric                          Qwen3-235B-A22B-Thinking    Claude 3.5 Sonnet v2
Overall (ASR)                   0                           4.390
TAP Attack Method (ASR)         0                           8.830
Crescendo Attack Method (ASR)   0                           3.810
Zero-Shot (ASR)                 0                           0.540
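As a quick sanity check on the figures above, the following sketch compares the Overall ASR reported for Claude 3.5 Sonnet v2 with the unweighted mean of its three per-method values. The source does not say how Overall is computed, so the averaging rule is an assumption, and the variable names are illustrative only.

```python
from statistics import mean

# Per-method ASR values for Claude 3.5 Sonnet v2, copied from the comparison above.
claude_asr = {"TAP": 8.830, "Crescendo": 3.810, "Zero-Shot": 0.540}
reported_overall = 4.390

# Assumption (not stated in the source): Overall is the unweighted mean
# of the three attack methods.
mean_asr = mean(claude_asr.values())
difference = abs(mean_asr - reported_overall)

print(f"mean of methods: {mean_asr:.3f}, reported overall: {reported_overall:.3f}")
print(f"absolute difference: {difference:.3f}")
```

The two values agree to two decimal places, which is consistent with a simple average but does not confirm it. All of Qwen3-235B-A22B-Thinking's values are 0, so the same check is trivially satisfied for that model.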
Key Highlights
Alibaba Qwen3-235B-A22B-Thinking has a lower Overall (ASR).
Alibaba Qwen3-235B-A22B-Thinking has a lower TAP Attack Method (ASR).
Alibaba Qwen3-235B-A22B-Thinking has a lower Crescendo Attack Method (ASR).
Alibaba Qwen3-235B-A22B-Thinking has a lower Zero-Shot (ASR).
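The highlights above follow directly from the reported numbers. A minimal sketch of that derivation, using the values from this comparison and treating the lower ASR as the better result; the helper name `lower_asr_model` is hypothetical and not part of any tool referenced on this page.

```python
# ASR values copied from the comparison above; lower means fewer successful attacks.
QWEN = "Qwen3-235B-A22B-Thinking"
CLAUDE = "Claude 3.5 Sonnet v2"

asr = {
    "Overall (ASR)": {QWEN: 0.0, CLAUDE: 4.390},
    "TAP Attack Method (ASR)": {QWEN: 0.0, CLAUDE: 8.830},
    "Crescendo Attack Method (ASR)": {QWEN: 0.0, CLAUDE: 3.810},
    "Zero-Shot (ASR)": {QWEN: 0.0, CLAUDE: 0.540},
}


def lower_asr_model(scores):
    """Return the model with the lowest ASR, or None if the lowest value is tied."""
    ranked = sorted(scores.items(), key=lambda item: item[1])
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Map each metric to the model that has the lower ASR on it.
highlights = {metric: lower_asr_model(scores) for metric, scores in asr.items()}

for metric, model in highlights.items():
    print(f"{model} has a lower {metric}.")
```

Returning `None` on a tie keeps the sketch from declaring a winner when both models report the same value, which does not occur in this data.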