LLM

Gemini 3.1 Pro Preview

by Google DeepMind
Overall Score
57.2

Google's flagship reasoning + research model

Released Feb 2026
Benchmark Scores
Reasoning
9.3
Coding
55.5
Math
Creative writing
8.5
Instruction following
9.0
Multimodal
9.5
Standard Benchmarks
aa_gpqaaa_gpqa
0.9
Measured 2026-05-13 · source
aa_hleaa_hle
0.5
Measured 2026-05-13 · source
aa_scicodeaa_scicode
0.6
Measured 2026-05-13 · source
aa_ifbenchaa_ifbench
0.8
Measured 2026-05-13 · source
aa_lcraa_lcr
0.7
Measured 2026-05-13 · source
aa_terminalbench_hardaa_terminalbench_hard
0.5
Measured 2026-05-13 · source
aa_tau2aa_tau2
1.0
Measured 2026-05-13 · source
Price
$7 / 1M input · $21 / 1M output
Paid
Context
2M tokens
Speed
135.33 t/s
29889ms TTFT
Compare this model →

Overview

Google DeepMind's flagship reasoning + research model. The 2M-token context is unmatched for true long-form document and codebase work. Tight integration with Google Search and Workspace.

Strengths

Largest context window (2M) · strong on research + factual grounding · great Search integration · competitive pricing

Known limitations

Slightly weaker on creative writing tasks · Workspace lock-in for some features · less mature agentic tool use

Discussion

0 comments

Sign in to join the conversation.

Be the first to comment.