LLM

Gemini 3.1 Pro Preview

Name: Gemini 3.1 Pro Preview Review & Benchmarks
Item: Gemini 3.1 Pro Preview
Rating: 57.2
Author: Sentient Weekly

by Google DeepMind

Overall Score

57.2

Google's flagship reasoning + research model

Released Feb 2026

Benchmark Scores

Reasoning

9.3

Coding

55.5

Math

—

Creative writing

8.5

Instruction following

9.0

Multimodal

9.5

Standard Benchmarks

aa_gpqaaa_gpqa

0.9

Measured 2026-05-13 · source

aa_hleaa_hle

0.5

Measured 2026-05-13 · source

aa_scicodeaa_scicode

0.6

Measured 2026-05-13 · source

aa_ifbenchaa_ifbench

0.8

Measured 2026-05-13 · source

aa_lcraa_lcr

0.7

Measured 2026-05-13 · source

aa_terminalbench_hardaa_terminalbench_hard

0.5

Measured 2026-05-13 · source

aa_tau2aa_tau2

1.0

Measured 2026-05-13 · source

Price

$7 / 1M input · $21 / 1M output

Paid

Context

2M tokens

Speed

135.33 t/s

29889ms TTFT

Compare this model →

Overview

Google DeepMind's flagship reasoning + research model. The 2M-token context is unmatched for true long-form document and codebase work. Tight integration with Google Search and Workspace.

Strengths

Largest context window (2M) · strong on research + factual grounding · great Search integration · competitive pricing

Known limitations

Slightly weaker on creative writing tasks · Workspace lock-in for some features · less mature agentic tool use

Discussion

0 comments

Be the first to comment.