Model Rankings

LLM rankings, side by side

Benchmarks, cost vs intelligence, context windows, and head-to-head comparisons across every major large language model.

Model | Provider | Overall | $/1M
GPT-5.5 (xhigh) | OpenAI | 60.2 | $11.25
Claude Opus 4.7 (Adaptive Reasoning, Max Effort) | Anthropic | 57.3 | $10.94
Gemini 3.1 Pro Preview | Google DeepMind | 57.2 | $4.50
GPT-5.5 (medium) | OpenAI | 56.7 | $11.25
MiMo-V2.5-Pro | | 53.8 | $1.50
Claude Opus 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | 53.0 | $10.94
Muse Spark | | 52.1 | $0.00
Qwen3.6 Max Preview | | 51.8 | $2.92
Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | 51.7 | $6.56
GLM-5.1 (Reasoning) | | 51.4 | $2.15
GPT-5.5 (low) | OpenAI | 50.8 | $11.25
Qwen3.6 Plus | | 50.0 | $1.13
DeepSeek V4 Pro (Reasoning, High Effort) | | 49.8 | $2.17
Claude Opus 4.5 (Reasoning) | Anthropic | 49.7 | $10.94
MiniMax-M2.7 | | 49.6 | $0.53
GPT-5.4 mini (xhigh) | OpenAI | 48.9 | $1.69
Grok 4.20 0309 (Reasoning) | xAI | 48.5 | $3.00
Gemini 3 Pro Preview (high) | | 48.4 | $4.50
GLM-5-Turbo | | 46.8 | $0.00
DeepSeek V4 Flash (Reasoning, Max Effort) | | 46.5 | $0.17
Gemini 3 Flash Preview (Reasoning) | | 46.4 | $1.13
Qwen3.6 27B (Reasoning) | | 45.8 | $1.35
Qwen3.5 397B A17B (Reasoning) | | 45.0 | $1.35
DeepSeek V4 Flash (Reasoning, High Effort) | | 44.9 | $0.17
MiMo-V2-Omni-0327 | | 44.9 | $0.80
GPT-5.4 nano (xhigh) | OpenAI | 44.0 | $0.46
KAT Coder Pro V2 | | 43.8 | $0.53
GLM-5.1 (Non-reasoning) | | 43.8 | $2.15
Qwen3.6 35B A3B (Reasoning) | | 43.5 | $0.56
Claude 4.5 Sonnet (Reasoning) | Anthropic | 43.0 | $6.56
Kimi K2.6 (Non-reasoning) | | 43.0 | $1.71
GLM 5V Turbo (Reasoning) | | 42.9 | $0.00
Claude Sonnet 4.6 (Non-reasoning, Low Effort) | Anthropic | 42.6 | $6.56
Hy3-preview (Reasoning) | | 41.9 | $0.00
Qwen3.5 122B A10B (Reasoning) | | 41.6 | $1.10
MiMo-V2-Flash (Feb 2026) | | 41.5 | $0.15
Gemini 3 Pro Preview (low) | | 41.3 | $4.50
GPT-5.5 (Non-reasoning) | OpenAI | 40.9 | $11.25
Kimi K2 Thinking | | 40.9 | $1.07
o3-pro | OpenAI | 40.7 | $35.00
Qwen3.5 397B A17B (Non-reasoning) | | 40.1 | $1.35
Qwen3 Max Thinking | | 39.9 | $2.40
DeepSeek V4 Pro (Non-reasoning) | | 39.3 | $2.17
Gemma 4 31B (Reasoning) | | 39.2 | $0.20
MiMo-V2-Flash (Reasoning) | | 39.2 | $0.15
Mistral Medium 3.5 | | 39.2 | $3.00
Grok 4.1 Fast (Reasoning) | xAI | 38.6 | $0.28
Qwen3.5 Omni Plus | | 38.6 | $1.50
GPT-5.1 Codex mini (high) | OpenAI | 38.6 | $0.69
o3 | OpenAI | 38.4 | $3.50
GPT-5.4 nano (medium) | OpenAI | 38.1 | $0.46
Step 3.5 Flash | | 37.8 | $0.15
GPT-5.4 mini (medium) | OpenAI | 37.7 | $1.69
Qwen3.6 27B (Non-reasoning) | | 37.1 | $1.35
Claude 4.5 Haiku (Reasoning) | Anthropic | 37.1 | $2.19
DeepSeek V4 Flash (Non-reasoning) | | 36.5 | $0.17
NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | | 36.0 | $0.41
KAT-Coder-Pro V1 | | 36.0 | $0.53
Qwen3.5 122B A10B (Non-reasoning) | | 35.9 | $1.10
Nova 2.0 Pro Preview (medium) | | 35.7 | $3.44
MiMo-V2.5-Pro (Non-reasoning) | | 35.6 | $1.50
Gemini 3 Flash Preview (Non-reasoning) | | 35.0 | $1.13
Nova 2.0 Lite (high) | | 34.5 | $0.85
DeepSeek V3.1 Terminus (Reasoning) | | 33.9 | $1.91
Hy3-preview (Non-reasoning) | | 33.7 | $0.00
Ling-2.6-1T | | 33.6 | $0.85
Doubao Seed Code | | 33.5 | $0.00
Gemini 3.1 Flash-Lite Preview | | 33.5 | $0.56
gpt-oss-120B (high) | OpenAI | 33.3 | $0.26
o4-mini (high) | OpenAI | 33.1 | $1.93
DeepSeek V3.2 Exp (Reasoning) | | 32.9 | $0.31
Mercury 2 | | 32.8 | $0.38
Qwen3 Max Thinking (Preview) | | 32.5 | $2.40
GLM-4.6 (Reasoning) | | 32.5 | $0.96
Qwen3.5 9B (Reasoning) | | 32.4 | $0.11
Gemma 4 31B (Non-reasoning) | | 32.3 | $0.00
Grok 3 mini Reasoning (high) | xAI | 32.1 | $0.35
DeepSeek V3.2 (Non-reasoning) | | 32.1 | $0.32
K-EXAONE (Reasoning) | | 32.1 | $0.00
Nova 2.0 Pro Preview (low) | | 31.9 | $3.44
Trinity Large Thinking | | 31.9 | $0.40
Qwen3.6 35B A3B (Non-reasoning) | | 31.5 | $0.84
Gemma 4 26B A4B (Reasoning) | | 31.2 | $0.20
Claude 4.5 Haiku (Non-reasoning) | Anthropic | 31.1 | $2.19
Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | | 31.1 | $0.00
Kimi K2 0905 | | 30.9 | $1.07
o1 | OpenAI | 30.8 | $26.25
MiMo-V2-Flash (Non-reasoning) | | 30.4 | $0.15
EXAONE 4.5 33B | | 30.2 | $0.00
GLM-4.7-Flash (Reasoning) | | 30.1 | $0.15
Llama 3.3 Instruct 70B | Meta AI | 14.5 | $0.62
Claude Code | Anthropic | 9.5 |
ElevenLabs Voice (v3) | ElevenLabs | 9.4 |
Midjourney v7 | Midjourney | 9.3 |
FLUX.1 Pro | Black Forest Labs | 9.1 |
Cursor Composer | Anysphere | 9.0 |
Sora | OpenAI | 8.8 |
GPT 5.5 Codex | OpenAI | |
GPT-5.5 Pro (xhigh) | OpenAI | | $0.00
GPT-3.5 Turbo (0613) | OpenAI | | $0.00
Gemini 3 Deep Think | | | $0.00
Cogito v2.1 (Reasoning) | | | $1.25
GPT-4o Realtime (Dec '24) | OpenAI | | $0.00
EXAONE 4.5 33B (Non-reasoning) | | | $0.00
GPT-4o mini Realtime (Dec '24) | OpenAI | | $0.00
Mi:dm K 2.5 Pro Preview | | | $0.00
Grok 4.3 (Beta) | xAI | |
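
One way to read the cost vs intelligence comparison is as overall score per dollar. The Python sketch below works through that arithmetic for a handful of rows copied from the listing above. It rests on assumptions the page does not state: that $/1M is a price in USD per 1M tokens, and that a $0.00 entry means no price is listed rather than a free model. The names `Row`, `points_per_dollar`, and `rank_by_value` are hypothetical, introduced only for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Row(NamedTuple):
    model: str
    overall: float       # value from the Overall column
    price_per_1m: float  # value from the $/1M column (assumed USD per 1M tokens)


# A few rows copied from the listing; not the full table.
ROWS = [
    Row("GPT-5.5 (xhigh)", 60.2, 11.25),
    Row("Gemini 3.1 Pro Preview", 57.2, 4.50),
    Row("MiMo-V2.5-Pro", 53.8, 1.50),
    Row("Muse Spark", 52.1, 0.00),
    Row("MiniMax-M2.7", 49.6, 0.53),
    Row("DeepSeek V4 Flash (Reasoning, Max Effort)", 46.5, 0.17),
]


def points_per_dollar(row: Row) -> Optional[float]:
    """Overall score divided by listed price; None when no usable price."""
    # Assumption: $0.00 is treated as "price not listed", so the row is skipped.
    if row.price_per_1m <= 0:
        return None
    return row.overall / row.price_per_1m


def rank_by_value(rows: List[Row]) -> List[Tuple[Row, float]]:
    """Return priced rows sorted by score per dollar, best value first."""
    scored = [(row, points_per_dollar(row)) for row in rows]
    priced = [(row, value) for row, value in scored if value is not None]
    return sorted(priced, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for row, value in rank_by_value(ROWS):
        print(f"{row.model}: {value:.1f} points per dollar")
```

A ratio like this favors cheap models heavily, so it is best read alongside the raw Overall score rather than in place of it.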

Rankings data by Artificial Analysis. CSV imports cover supplementary benchmarks.