Alibaba Qwen3 VL 235B A22B Thinking vs Anthropic Claude 3.5 Sonnet v1
Detailed comparison for LLMs
Head-to-Head Overview
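Claude 3.5 Sonnet v1 is the overall winner in this comparison: it has a lower Attack Success Rate (ASR) than Qwen3 VL 235B A22B Thinking on every metric reported here.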
LLM Metric
Comparison of Attack Success Rate (ASR) metrics for Alibaba Qwen3 VL 235B A22B Thinking and Anthropic Claude 3.5 Sonnet v1 across the three attack methods.
Metric (ASR)              Qwen3 VL 235B A22B Thinking   Claude 3.5 Sonnet v1
Overall                   100.000                       16.010
TAP Attack Method         100.000                       28.390
Crescendo Attack Method   100.000                       19.310
Zero-Shot                 100.000                       0.350
Key Highlights
Anthropic Claude 3.5 Sonnet v1 has a lower Overall (ASR).
Anthropic Claude 3.5 Sonnet v1 has a lower TAP Attack Method (ASR).
Anthropic Claude 3.5 Sonnet v1 has a lower Crescendo Attack Method (ASR).
Anthropic Claude 3.5 Sonnet v1 has a lower Zero-Shot (ASR).
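The highlights above come from a per-metric comparison of the reported values, where a lower ASR means fewer attacks succeeded. The following is a minimal Python sketch of that comparison using only the numbers in this page. The names `ASR`, `lower_asr_model`, and `method_mean` are illustrative and not part of any published tool. The page does not say how the Overall figure is aggregated, so the unweighted mean over the three attack methods is computed only as a sanity check, not as the page's actual formula.

```python
# ASR values exactly as reported in this comparison (lower is better).
ASR = {
    "Overall": {
        "Qwen3 VL 235B A22B Thinking": 100.000,
        "Claude 3.5 Sonnet v1": 16.010,
    },
    "TAP Attack Method": {
        "Qwen3 VL 235B A22B Thinking": 100.000,
        "Claude 3.5 Sonnet v1": 28.390,
    },
    "Crescendo Attack Method": {
        "Qwen3 VL 235B A22B Thinking": 100.000,
        "Claude 3.5 Sonnet v1": 19.310,
    },
    "Zero-Shot": {
        "Qwen3 VL 235B A22B Thinking": 100.000,
        "Claude 3.5 Sonnet v1": 0.350,
    },
}


def lower_asr_model(scores):
    """Return the model with the lowest ASR in one metric row."""
    return min(scores, key=scores.get)


def method_mean(model):
    """Unweighted mean ASR over the three attack methods.

    Assumption: this is NOT confirmed to be how the page derives "Overall".
    """
    methods = [metric for metric in ASR if metric != "Overall"]
    return sum(ASR[metric][model] for metric in methods) / len(methods)


# Model with the lower ASR for each metric.
winners = {metric: lower_asr_model(scores) for metric, scores in ASR.items()}

for metric, model in winners.items():
    print(f"{metric}: {model} has the lower ASR")

print(f"Unweighted method mean, Claude 3.5 Sonnet v1: "
      f"{method_mean('Claude 3.5 Sonnet v1'):.3f}")
```

For Claude 3.5 Sonnet v1 the unweighted mean of the three method scores is about 16.017, close to but not identical to the reported Overall of 16.010, so the Overall figure may be weighted by attempt counts or computed from unrounded values.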