Gemini 3.1 Pro vs Grok 4

Gemini 3.1 Pro (Google DeepMind) and Grok 4 (xAI) compared on benchmarks, pricing, context window, and use-case rankings.

Gemini 3.1 Pro leads on overall intelligence (ECI 156 vs 147).
Gemini 3.1 Pro ranks higher for coding (71st vs 15th percentile).

Scores

| Use case | Gemini 3.1 Pro (Google DeepMind) | Grok 4 (xAI) |
|---|---|---|
| Intelligence (ECI) | 156 | 147 |
| Coding | 71 | 15 |
| Math | 76 | 22 |
| Reasoning & Knowledge | 96 | 48 |
| Agentic & Tools | 88 | 16 |

Specifications

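| Specification | Gemini 3.1 Pro | Grok 4 |
|---|---|---|
| Developer | Google DeepMind | xAI |
| Family | Gemini | Grok |
| Released | Feb 19, 2026 | Jul 9, 2025 |
| Parameters | n/a | 3T |
| Availability | API access | API access |
| Context window | n/a | n/a |
| Price ($/M input) | n/a | n/a |
| Price ($/M output) | n/a | n/a |
| Inputs | n/a | n/a |
| Outputs | n/a | n/a |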

Benchmarks

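| Benchmark | Gemini 3.1 Pro | Grok 4 |
|---|---|---|
| AIME 2024/2025 | 96% | 84% |
| APEX | 34% | 15% |
| ARC-AGI | 98% | 67% |
| ARC-AGI-2 | 77% | 16% |
| FrontierMath | 37% | 20% |
| FrontierMath Tier 4 | 17% | 2% |
| GPQA Diamond | 94% | 87% |
| GSO (code optimization) | 23% | n/a |
| Humanity's Last Exam | 46% | n/a |
| METR task horizon | 6.4 h | 1.8 h |
| SimpleBench | 80% | 61% |
| SimpleQA Verified | 77% | 48% |
| SWE-bench Verified | 76% | n/a |
| Terminal-Bench | 80% | 27% |
| WebDev Arena | 1461 | n/a |
| WeirdML | 72% | 46% |
| GDPval (win/tie rate) | n/a | 24% |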

Use-case scores are 0–100 percentile composites across each area’s benchmarks, ranked against every model from the past year.
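A percentile composite of this kind could be computed roughly as follows. This is a minimal sketch, not the site's actual method: the "share of models scoring strictly below" definition of percentile, the unweighted mean across benchmarks, and every name and score in the example are assumptions made for illustration.

```python
from statistics import mean

def percentile_rank(score, all_scores):
    """Percent of tracked models scoring strictly below `score` (0-100).
    Assumption: the page does not say how ties or the percentile are defined."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)

def composite(model, benchmarks):
    """Unweighted mean of the model's percentile rank over the benchmarks
    it has a score for; None if it has no scores in this area."""
    ranks = [
        percentile_rank(scores[model], list(scores.values()))
        for scores in benchmarks.values()
        if model in scores
    ]
    return round(mean(ranks)) if ranks else None

# Hypothetical scores for four made-up models on two made-up coding benchmarks.
coding = {
    "bench_a": {"m1": 80, "m2": 60, "m3": 40, "m4": 20},
    "bench_b": {"m1": 70, "m2": 75, "m3": 30},  # m4 has no score here
}

for name in ("m1", "m2", "m3", "m4"):
    print(name, composite(name, coding))
```

Benchmarks where a model has no score are skipped rather than counted as zero, which is one plausible way to handle the missing entries in the benchmark table; a real pipeline might instead require a minimum number of benchmarks before reporting a composite.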

Frequently asked questions

Is Gemini 3.1 Pro better than Grok 4?

On Epoch AI's Capabilities Index, Gemini 3.1 Pro scores higher (156) than Grok 4 (147). The right pick depends on your task — compare their coding, math, and reasoning scores in the table above.

Which is better for coding, Gemini 3.1 Pro or Grok 4?

Across coding benchmarks like SWE-bench Verified and Terminal-Bench, Gemini 3.1 Pro ranks higher: 71st vs 15th percentile among the models tracked on Model Beat.

Benchmarks & model data from Epoch AI (CC BY); pricing & specs from OpenRouter. ECI = Epoch Capabilities Index.