GLM-5.1 vs Grok 4

GLM-5.1 (Z.ai (Zhipu AI)) and Grok 4 (xAI) compared on benchmarks, pricing, context window, and use-case rankings.

GLM-5.1 leads on overall intelligence (ECI 150 vs 147).
GLM-5.1 ranks higher for coding (69th vs 15th percentile).

GLM-5.1 (Z.ai (Zhipu AI)) vs Grok 4 (xAI)

Scores

| Score | GLM-5.1 | Grok 4 |
|---|---|---|
| Intelligence (ECI) | 150 | 147 |
| Coding | 69 | 15 |
| Math | 63 | 22 |
| Reasoning & Knowledge | 48 | 48 |
| Agentic & Tools | — | 16 |

Specifications

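| Specification | GLM-5.1 | Grok 4 |
|---|---|---|
| Developer | Z.ai (Zhipu AI) | xAI |
| Family | — | Grok |
| Released | Apr 7, 2026 | Jul 9, 2025 |
| Parameters | 754B | 3T |
| Availability | — | API access |
| Context window | 203K | — |
| Price ($/M input tokens) | $0.98 | — |
| Price ($/M output tokens) | $3.08 | — |
| Inputs | text | — |
| Outputs | text | — |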
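The per-million-token prices listed for GLM-5.1 translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, "203K" is read as 203,000 tokens, and it assumes that input and output tokens both count toward the context window, which the page does not state.

```python
# GLM-5.1 figures from the Specifications table (USD per million tokens).
INPUT_PRICE_PER_M = 0.98
OUTPUT_PRICE_PER_M = 3.08
# Assumption: "203K" means 203,000 tokens.
CONTEXT_WINDOW_TOKENS = 203_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Assumption: prompt and completion share one context window."""
    return input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # A 50,000-token prompt with a 2,000-token reply.
    cost = estimate_cost_usd(50_000, 2_000)
    print(f"estimated cost: ${cost:.5f}")
    print("fits in context:", fits_context(50_000, 2_000))
```

No equivalent calculation is possible for Grok 4 from this page, since its price and context window are not listed.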

Benchmarks

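| Benchmark | GLM-5.1 | Grok 4 |
|---|---|---|
| AIME 2024/2025 | 92% | 84% |
| FrontierMath | 33% | 20% |
| FrontierMath Tier 4 | 13% | 2% |
| GPQA Diamond | 85% | 87% |
| SimpleBench | 59% | 61% |
| SimpleQA Verified | 37% | 48% |
| SWE-bench Verified | 74% | — |
| WebDev Arena | 1534 | — |
| WeirdML | 57% | 46% |
| APEX | — | 15% |
| ARC-AGI | — | 67% |
| ARC-AGI-2 | — | 16% |
| GDPval (win/tie rate) | — | 24% |
| METR task horizon | — | 1.8 h |
| Terminal-Bench | — | 27% |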

Use-case scores are 0–100 percentile composites across each area’s benchmarks, ranked against every model from the past year.
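The page does not say how the percentile composites are computed. One plausible reading, sketched here purely as an assumption, is to take a model's percentile rank on each benchmark in an area and average the ranks over the benchmarks it has a score for. The function names and the toy numbers below are hypothetical and are not taken from Epoch AI or Model Beat.

```python
from bisect import bisect_left
from typing import Dict, List, Optional


def percentile_rank(score: float, population: List[float]) -> float:
    """Percent of population scores strictly below `score` (0 to 100)."""
    ordered = sorted(population)
    return 100.0 * bisect_left(ordered, score) / len(ordered)


def composite(
    model_scores: Dict[str, Optional[float]],
    field_scores: Dict[str, List[float]],
) -> Optional[float]:
    """Mean percentile rank over benchmarks where the model has a score.

    Returns None when the model has no usable score in the area, which
    would correspond to an empty cell in a comparison table.
    """
    ranks = [
        percentile_rank(score, field_scores[bench])
        for bench, score in model_scores.items()
        if score is not None and field_scores.get(bench)
    ]
    return sum(ranks) / len(ranks) if ranks else None


if __name__ == "__main__":
    # Hypothetical scores of all tracked models on two made-up benchmarks.
    field = {"bench_a": [10, 20, 30, 40], "bench_b": [50, 60, 70, 80]}
    # Hypothetical model with one missing benchmark.
    model = {"bench_a": 30, "bench_b": 80, "bench_c": None}
    print(composite(model, field))
```

Under this reading, a composite depends on which benchmarks a model has been run on, so two models can be ranked in the same area without sharing every benchmark.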

Frequently asked questions

Is GLM-5.1 better than Grok 4?

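On Epoch AI's Capabilities Index, GLM-5.1 scores higher (150) than Grok 4 (147), and it also leads on the coding and math use-case scores. Grok 4 is slightly ahead on GPQA Diamond, SimpleBench, and SimpleQA Verified, and the two tie on Reasoning & Knowledge (48 each). The right pick depends on your task, so compare their coding, math, and reasoning scores in the table above.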

Which is better for coding, GLM-5.1 or Grok 4?

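Across coding benchmarks like SWE-bench Verified and Terminal-Bench, GLM-5.1 ranks higher: 69th vs 15th percentile among the models tracked on Model Beat. In the table above, SWE-bench Verified is reported only for GLM-5.1 (74%) and Terminal-Bench only for Grok 4 (27%).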

Benchmarks & model data from Epoch AI (CC BY); pricing & specs from OpenRouter. ECI = Epoch Capabilities Index.