Anthropic Claude 3.5 Sonnet v1 vs Google Gemini 2.0 Flash Thinking
Detailed comparison for LLMs
Head-to-Head Overview
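Gemini 2.0 Flash Thinking is the overall winner in this comparison: it has the lower Attack Success Rate (ASR) on the overall score and on each of the three attack methods. A lower ASR means fewer attacks succeeded.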
LLM Metric
Comparison of Attack Success Rate (ASR) metrics for Anthropic Claude 3.5 Sonnet v1 and Google Gemini 2.0 Flash Thinking across the three attack methods.
Metric                         Claude 3.5 Sonnet v1   Gemini 2.0 Flash Thinking
Overall (ASR)                  16.010                 0
TAP Attack Method (ASR)        28.390                 0
Crescendo Attack Method (ASR)  19.310                 0
Zero-Shot (ASR)                0.350                  0
Key Highlights
Google Gemini 2.0 Flash Thinking has a lower Overall (ASR).
Google Gemini 2.0 Flash Thinking has a lower TAP Attack Method (ASR).
Google Gemini 2.0 Flash Thinking has a lower Crescendo Attack Method (ASR).
Google Gemini 2.0 Flash Thinking has a lower Zero-Shot (ASR).
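The highlights above follow from a per-metric comparison of the reported values. A minimal sketch of that comparison, assuming the figures are used exactly as listed; the names `ASR`, `lower_asr_model`, and `highlights` are illustrative and not part of any published tooling:

```python
# ASR values copied from the comparison above.
# Lower ASR means fewer attacks succeeded against the model.
ASR = {
    "Overall (ASR)": {
        "Claude 3.5 Sonnet v1": 16.010,
        "Gemini 2.0 Flash Thinking": 0.0,
    },
    "TAP Attack Method (ASR)": {
        "Claude 3.5 Sonnet v1": 28.390,
        "Gemini 2.0 Flash Thinking": 0.0,
    },
    "Crescendo Attack Method (ASR)": {
        "Claude 3.5 Sonnet v1": 19.310,
        "Gemini 2.0 Flash Thinking": 0.0,
    },
    "Zero-Shot (ASR)": {
        "Claude 3.5 Sonnet v1": 0.350,
        "Gemini 2.0 Flash Thinking": 0.0,
    },
}


def lower_asr_model(scores):
    """Return the model with the lowest ASR, or None if the lowest value is tied."""
    best = min(scores.values())
    leaders = [model for model, value in scores.items() if value == best]
    return leaders[0] if len(leaders) == 1 else None


def highlights(table):
    """Build one highlight sentence per metric that has a clear leader."""
    lines = []
    for metric, scores in table.items():
        winner = lower_asr_model(scores)
        if winner is not None:
            lines.append(f"{winner} has a lower {metric}.")
    return lines


if __name__ == "__main__":
    for line in highlights(ASR):
        print(line)
```

Ties produce no highlight line, so a metric on which both models score the same would simply be omitted rather than credited to either model.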