SENTIENT WEEKLY
Signal in the AI noise.
Issue 003
LLM
o4-mini (high)
by OpenAI
Overall Score
33.1
Released Apr 2025
Benchmark Scores
Reasoning
—
Coding
25.6
Math
90.7
Creative writing
—
Instruction following
—
Multimodal
—
Standard Benchmarks
aa_mmlu_pro: 0.8 (measured 2026-05-13)
aa_gpqa: 0.8 (measured 2026-05-13)
aa_hle: 0.2 (measured 2026-05-13)
aa_livecodebench: 0.9 (measured 2026-05-13)
aa_scicode: 0.5 (measured 2026-05-13)
aa_math_500: 1.0 (measured 2026-05-13)
aa_aime: 0.9 (measured 2026-05-13)
aa_aime_25: 0.9 (measured 2026-05-13)
aa_ifbench: 0.7 (measured 2026-05-13)
aa_lcr: 0.6 (measured 2026-05-13)
aa_terminalbench_hard: 0.1 (measured 2026-05-13)
aa_tau2: 0.6 (measured 2026-05-13)
Price
—
Context
—
Speed
158.15 t/s
23417 ms TTFT