SENTIENT WEEKLY
Signal in the AI noise.
Issue 003
LLM
Claude 4.5 Sonnet (Reasoning)
by Anthropic
Overall Score: 43.0
Released Sep 2025
Benchmark Scores
Reasoning: n/a
Coding: 38.6
Math: 88.0
Creative writing: n/a
Instruction following: n/a
Multimodal: n/a
Standard Benchmarks
All scores below were measured 2026-05-13.
aa_mmlu_pro: 0.9
aa_gpqa: 0.8
aa_hle: 0.2
aa_livecodebench: 0.7
aa_scicode: 0.5
aa_aime_25: 0.9
aa_ifbench: 0.6
aa_lcr: 0.7
aa_terminalbench_hard: 0.4
aa_tau2: 0.8
Price: n/a
Context: n/a
Speed: 53.8 t/s, 8187 ms TTFT