Alibaba Qwen3 VL 235B A22B Thinking vs Anthropic Claude 4.0 Sonnet
Detailed comparison for LLMs
Head-to-Head Overview
Claude 4.0 Sonnet is the overall winner in this comparison: it has the lower Attack Success Rate (ASR) overall and under each of the three attack methods.
LLM Metric
Comparison of Attack Success Rate (ASR) metrics for Alibaba Qwen3 VL 235B A22B Thinking and Anthropic Claude 4.0 Sonnet across the three attack methods: TAP, Crescendo, and Zero-Shot. A lower ASR means fewer attacks succeeded.
Metric                         Qwen3 VL 235B A22B Thinking   Claude 4.0 Sonnet
Overall (ASR)                  100.000                       13.630
TAP Attack Method (ASR)        100.000                       22.020
Crescendo Attack Method (ASR)  100.000                       14.790
Zero-Shot (ASR)                100.000                       4.080
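The page does not say how the Overall (ASR) figure is derived from the per-method figures. As a quick sanity check, the sketch below assumes it is the unweighted mean of the TAP, Crescendo, and Zero-Shot values, and the reported numbers are consistent with that reading: (22.020 + 14.790 + 4.080) / 3 = 13.630. The function name `overall_asr` and the dictionary layout are illustrative only and are not taken from the source.

```python
from statistics import mean

# Per-method ASR values exactly as reported in the comparison above.
per_method_asr = {
    "Qwen3 VL 235B A22B Thinking": {"TAP": 100.000, "Crescendo": 100.000, "Zero-Shot": 100.000},
    "Claude 4.0 Sonnet": {"TAP": 22.020, "Crescendo": 14.790, "Zero-Shot": 4.080},
}

# Overall ASR values exactly as reported in the comparison above.
reported_overall = {
    "Qwen3 VL 235B A22B Thinking": 100.000,
    "Claude 4.0 Sonnet": 13.630,
}


def overall_asr(method_values):
    """Assumption (not stated by the source): overall ASR is the
    unweighted mean of the per-method ASR values."""
    return mean(method_values.values())


for model, methods in per_method_asr.items():
    computed = overall_asr(methods)
    print(f"{model}: computed {computed:.3f}, reported {reported_overall[model]:.3f}")
```

If the underlying benchmark ran a different number of prompts per attack method, a weighted mean would be the more natural definition; the agreement here only shows that the published figures do not contradict the simple average.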
Key Highlights
Anthropic Claude 4.0 Sonnet has a lower Overall (ASR).
Anthropic Claude 4.0 Sonnet has a lower TAP Attack Method (ASR).
Anthropic Claude 4.0 Sonnet has a lower Crescendo Attack Method (ASR).
Anthropic Claude 4.0 Sonnet has a lower Zero-Shot (ASR).