Llama 3.3 Instruct 70B

Name: Llama 3.3 Instruct 70B Review & Benchmarks
Item: Llama 3.3 Instruct 70B
Rating: 8.6
Author: Sentient Weekly

by Meta AI

Overall Score

8.6

Meta's leading open-weight model — frontier capability, self-hostable

Released Dec 2024 · Open Source

Benchmark Scores

Reasoning: 8.5
Coding: 10.7
Math: 7.7
Creative writing: 8.4
Instruction following: 8.5
Multimodal: 7.0

Standard Benchmarks

Measured 2026-06-28

aa_livecodebench: 0.3
aa_scicode: 0.3
aa_tau2: 0.3
aa_math_500: 0.8
aa_aime: 0.3
aa_aime_25: 0.1
aa_ifbench: 0.5
aa_lcr: 0.1
aa_terminalbench_hard: 0.0
aa_mmlu_pro: 0.7
aa_gpqa: 0.5
aa_hle: 0.0

Price

Free for research and most commercial use (see Community License)

Free

Context

128K tokens

Speed

87.17 tokens/s

658 ms TTFT (time to first token)
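
To put the speed figures above in practical terms, here is a rough sketch of how throughput and time to first token combine into an end-to-end response time. It assumes a simple linear model (TTFT plus output tokens divided by throughput) and ignores prompt length, batching, and hardware differences. The function name and defaults are illustrative, not part of any Llama tooling.

```python
def estimated_latency_seconds(
    output_tokens: int,
    ttft_ms: float = 658.0,            # time to first token, from the card above
    tokens_per_second: float = 87.17,  # decode throughput, from the card above
) -> float:
    """Rough response time: wait for the first token, then stream the rest.

    Assumption: throughput stays constant for the whole response.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n} tokens: about {estimated_latency_seconds(n):.1f} s")
```

Under these assumptions a 500-token reply takes roughly 6.4 seconds. Measured numbers on self-hosted hardware may differ considerably.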

Overview

One of the strongest fully open-weight LLMs in regular use. Comparable to GPT-4-class models on most benchmarks while being free to download, fine-tune, and self-host. It is the foundation for thousands of derivative models.

Strengths

Open weights (Llama Community License) · self-hostable · strong English + multilingual · enormous fine-tune ecosystem · free to use commercially with caveats

Known limitations

Less capable than frontier closed models on the hardest reasoning tasks · 128K context is smaller than some rivals offer · weaker tool use than purpose-built agentic models
