Alibaba Qwen3-235B-A22B-Thinking vs Anthropic Claude 3.5 Sonnet v1
Detailed comparison for LLMs
Head-to-Head Overview
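Qwen3-235B-A22B-Thinking is the overall winner in this comparison: it records a lower Attack Success Rate (ASR) than Claude 3.5 Sonnet v1 on every metric listed, and a lower ASR means fewer attacks succeeded.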
LLM Metric
Comparison of Attack Success Rate (ASR) metrics for Alibaba Qwen3-235B-A22B-Thinking and Anthropic Claude 3.5 Sonnet v1 across the three attack methods.
Metric                          Qwen3-235B-A22B-Thinking    Claude 3.5 Sonnet v1
Overall (ASR)                   0                           16.010
TAP Attack Method (ASR)         0                           28.390
Crescendo Attack Method (ASR)   0                           19.310
Zero-Shot (ASR)                 0                           0.350
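The page does not say how each ASR value is calculated. As a minimal sketch, assuming ASR is the percentage of attack attempts that succeed (an assumption, not something the source confirms), the calculation could look like this. The function name `attack_success_rate` and the sample outcomes are hypothetical.

```python
def attack_success_rate(outcomes):
    """Return the share of successful attacks as a percentage.

    Assumption: ASR = successful attempts / total attempts * 100.
    The source page does not define the metric or its unit.
    """
    if not outcomes:
        raise ValueError("need at least one attack attempt")
    successes = sum(1 for succeeded in outcomes if succeeded)
    return 100.0 * successes / len(outcomes)


# Hypothetical outcomes for illustration only: True means the attack succeeded.
example_outcomes = [True, False, False, False]
print(attack_success_rate(example_outcomes))  # 25.0
```

Under this reading, a score of 0 means none of the recorded attempts succeeded.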
Key Highlights
Alibaba Qwen3-235B-A22B-Thinking has a lower Overall (ASR).
Alibaba Qwen3-235B-A22B-Thinking has a lower TAP Attack Method (ASR).
Alibaba Qwen3-235B-A22B-Thinking has a lower Crescendo Attack Method (ASR).
Alibaba Qwen3-235B-A22B-Thinking has a lower Zero-Shot (ASR).
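The highlights above follow directly from the reported figures. A small sketch, using the values from this page and a hypothetical helper `lower_asr`, shows how the per-metric comparison can be reproduced:

```python
# ASR values as reported on this page (unit not stated in the source).
ASR = {
    "Overall": {"Qwen3-235B-A22B-Thinking": 0.0, "Claude 3.5 Sonnet v1": 16.010},
    "TAP Attack Method": {"Qwen3-235B-A22B-Thinking": 0.0, "Claude 3.5 Sonnet v1": 28.390},
    "Crescendo Attack Method": {"Qwen3-235B-A22B-Thinking": 0.0, "Claude 3.5 Sonnet v1": 19.310},
    "Zero-Shot": {"Qwen3-235B-A22B-Thinking": 0.0, "Claude 3.5 Sonnet v1": 0.350},
}


def lower_asr(scores):
    """Return the model name with the lowest ASR (hypothetical helper)."""
    return min(scores, key=scores.get)


# Map each metric to the model with the lower ASR.
winners = {metric: lower_asr(scores) for metric, scores in ASR.items()}

for metric, model in winners.items():
    print(f"{metric} (ASR): {model} is lower")
```

With these figures, every metric resolves to Qwen3-235B-A22B-Thinking, which matches the four highlights.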