Best AI models for agentic tasks

Models are ranked by a 0–100 composite of agentic and tool-use benchmarks (Terminal-Bench, APEX, METR task horizon, and GDPval win/tie rate), scored as a percentile rank against every model released in the past year.

The Agentic & Tools score is a 0–100 percentile composite across that area’s benchmarks. Raw benchmark scores are shown on each model's own page.
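The page does not spell out how the composite is assembled, so the following is only a minimal sketch of one plausible reading: each benchmark is converted to a percentile rank, the available percentiles are averaged per model, and the averages are ranked again so the top model lands on 100 and the bottom on 0. The function names, the averaging step, the tie handling, and the sample data are all assumptions made for illustration, not the site's published method.

```python
from statistics import mean


def percentile_ranks(scores):
    """Map {model: raw score} to {model: 0-100 percentile rank}.

    Assumed convention: the share of other models scoring strictly lower,
    so the best model gets 100 and the worst gets 0.
    """
    n = len(scores)
    if n < 2:
        return {model: 100.0 for model in scores}
    values = list(scores.values())
    return {
        model: 100.0 * sum(other < score for other in values) / (n - 1)
        for model, score in scores.items()
    }


def composite(benchmarks):
    """Combine {benchmark: {model: raw score}} into one 0-100 score per model.

    A model missing from a benchmark is simply skipped for that benchmark
    (an assumption; the source does not say how gaps are handled).
    """
    per_model = {}
    for scores in benchmarks.values():
        for model, pct in percentile_ranks(scores).items():
            per_model.setdefault(model, []).append(pct)
    averaged = {model: mean(pcts) for model, pcts in per_model.items()}
    # Re-rank the averages so the final scale again spans 0 to 100.
    return {model: round(pct) for model, pct in percentile_ranks(averaged).items()}


# Hypothetical raw scores, not taken from the leaderboard.
example = {
    "bench_a": {"model_x": 0.62, "model_y": 0.48, "model_z": 0.30},
    "bench_b": {"model_y": 70.0, "model_z": 61.0},
}
print(composite(example))  # {'model_x': 100, 'model_y': 50, 'model_z': 0}
```

Because the final step is a rank, the gap between two scores says how many models sit between them, not how far apart their raw benchmark results are.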

Multimodal · 129 models

Model | Released | Developer | Context window | Price per 1M tokens (input / output) | Agentic & Tools score
Gemini 3.5 Flash | May 19, 2026 | Google | 1M | $1.50 / $9.00 | 100
Claude Fable 5 | Jun 9, 2026 · 9 in the news | Anthropic | 1M | $10.00 / $50.00 | 97
GPT-5.5 | Apr 23, 2026 · 14 in the news | OpenAI | 1.1M | $5.00 / $30.00 | 95
Claude Opus 4.8 | May 28, 2026 · 1 in the news | Anthropic | 1M | $5.00 / $25.00 | 94
Claude Opus 4.7 | Apr 16, 2026 · 4 in the news | Anthropic | 1M | $5.00 / $25.00 | 91
Gemini 3.1 Pro | Feb 19, 2026 · 2 in the news | Google DeepMind | — | — | 88
Claude Opus 4.6 | Feb 5, 2026 | Anthropic | 1M | $5.00 / $25.00 | 88
GPT-5.2 | Dec 11, 2025 · 7 in the news | OpenAI | 400K | $1.75 / $14.00 | 88
GPT-5.4 | Mar 5, 2026 · 6 in the news | OpenAI | 1.1M | $2.50 / $15.00 | 84
GPT-5.3 Codex | Feb 5, 2026 · 4 in the news | OpenAI | 400K | $1.75 / $14.00 | 80
GPT-5.2 Codex | Dec 18, 2025 · 3 in the news | OpenAI | 400K | $1.75 / $14.00 | 77
GPT-5.1-Codex-mini | Nov 12, 2025 | OpenAI | 400K | $0.25 / $2.00 | 73
Gemini 3 Flash | Dec 17, 2025 · 2 in the news | Google DeepMind | — | — | 72
GPT-5.4 Mini | Mar 17, 2026 | OpenAI | 400K | $0.75 / $4.50 | 68
Claude Opus 4.5 | Nov 24, 2025 | Anthropic | 200K | $5.00 / $25.00 | 68
Grok 4.20 | Feb 17, 2026 | xAI | 2M | $1.25 / $2.50 | 65
Gemini 3 Pro | Nov 18, 2025 · 1 in the news | Google DeepMind | — | — | 64
GPT-5.1-Codex | Nov 12, 2025 · 2 in the news | OpenAI | 400K | $1.25 / $10.00 | 64
Claude Sonnet 4.6 | Feb 17, 2026 · 1 in the news | Anthropic | 1M | $3.00 / $15.00 | 63
GPT-5.1-Codex-Max | Nov 19, 2025 · 2 in the news | OpenAI | 400K | $1.25 / $10.00 | 54
GPT-5-Codex | Sep 15, 2025 | OpenAI | 400K | $1.25 / $10.00 | 52
GLM-5 | Feb 17, 2026 · 26 in the news | Z.ai (Zhipu AI) | 203K | $0.60 / $1.92 | 51
MiniMax-M2.7 | Mar 18, 2026 | MiniMax | 205K | $0.25 / $1.00 | 50
GPT-5.1 | Nov 13, 2025 · 6 in the news | OpenAI | 400K | $1.25 / $10.00 | 50
Claude Sonnet 4.5 | Sep 29, 2025 · 3 in the news | Anthropic | 1M | $3.00 / $15.00 | 45
GPT-5 | Aug 7, 2025 · 61 in the news | OpenAI | 400K | $1.25 / $10.00 | 42
Kimi K2.5 | Feb 2, 2026 · 10 in the news | Moonshot | 262K | $0.38 / $2.02 | 39
GPT-5.4 Nano | Mar 17, 2026 | OpenAI | 400K | $0.20 / $1.25 | 38
MiniMax-M2.1 | Dec 23, 2025 | MiniMax | 205K | $0.29 / $0.95 | 35
Claude Opus 4.1 | Aug 5, 2025 | Anthropic | 200K | $15.00 / $75.00 | 31
Qwen 3.5 Plus (hosted 397B-A17B) | Feb 16, 2026 | Alibaba | — | — | 29
MiniMax-M2.5 | Feb 12, 2026 | MiniMax | 205K | $0.15 / $0.90 | 29
DeepSeek-V3.2 | Dec 1, 2025 · 1 in the news | DeepSeek | 131K | $0.23 / $0.34 | 29
GPT-5 mini | Aug 7, 2025 · 1 in the news | OpenAI | 400K | $0.25 / $2.00 | 28
Claude Haiku 4.5 | Oct 15, 2025 · 1 in the news | Anthropic | 200K | $1.00 / $5.00 | 26
Gemini 3.1 Flash-Lite | Mar 3, 2026 · 1 in the news | Google | 1M | $0.25 / $1.50 | 24
Grok 4.1 Fast | Nov 19, 2025 | xAI | — | — | 24
MiniMax-M2 | Oct 27, 2025 | MiniMax | 205K | $0.26 / $1.00 | 23
Kimi K2 | Jul 11, 2025 · 27 in the news | Moonshot | 131K | $0.57 / $2.30 | 20
Kimi K2 Thinking | Nov 6, 2025 · 4 in the news | Moonshot | 262K | $0.60 / $2.50 | 17
GLM-4.7 | Dec 22, 2025 · 3 in the news | Z.ai (Zhipu AI) | 203K | $0.40 / $1.75 | 16
Grok 4 | Jul 9, 2025 | xAI | — | — | 16
Qwen3-Coder-480B-A35B | Jul 22, 2025 | Alibaba | — | — | 15
Qwen 3.6 35B-A3B | Apr 14, 2026 · 3 in the news | Alibaba | 262K | $0.14 / $1.00 | 13
GPT-5 nano | Aug 7, 2025 | OpenAI | 400K | $0.05 / $0.40 | 8
GLM-4.6 | Sep 30, 2025 · 3 in the news | Z.ai (Zhipu AI), Tsinghua University | 203K | $0.43 / $1.74 | 7
gpt-oss-120b | Aug 5, 2025 · 2 in the news | OpenAI | 131K | $0.04 / $0.18 | 6
Gemini 2.5 Flash | Sep 25, 2025 · 1 in the news | Google DeepMind | 1M | $0.30 / $2.50 | 2
gpt-oss-20b | Aug 5, 2025 · 1 in the news | OpenAI | 131K | $0.03 / $0.14 | 0
Kimi K2.7 Code | Jun 12, 2026 | Moonshot | 262K | $0.61 / $3.07 | —
Claude Mythos 5 | Jun 9, 2026 · 1 in the news | Anthropic | — | — | —
Nemotron 3 Ultra | Jun 4, 2026 · 1 in the news | NVIDIA | — | — | —
MAI-Image-2.5 | Jun 2, 2026 | Microsoft | — | — | —
MiniMax-M3 | Jun 1, 2026 | MiniMax | 1M | $0.30 / $1.20 | —
Qwen 3.7 Max | May 19, 2026 · 4 in the news | Alibaba | 1M | $1.25 / $3.75 | —
Composer 2.5 | May 18, 2026 | Cursor | — | — | —
TML-Interaction-Small | May 11, 2026 | Thinking Machines | — | — | —
Qwen 3.6 Flash | Apr 27, 2026 | Alibaba | 1M | $0.19 / $1.13 | —
DeepSeek-V4-Pro | Apr 24, 2026 · 3 in the news | DeepSeek | 1M | $0.43 / $0.87 | —
DeepSeek-V4-Flash | Apr 24, 2026 | DeepSeek | 1M | $0.09 / $0.18 | —
MiMo-V2.5-Pro | Apr 23, 2026 | Xiaomi Corp | 1M | $0.43 / $0.87 | —
GPT-5.5 Pro | Apr 23, 2026 | OpenAI | 1.1M | $30.00 / $180.00 | —
GPT Image 2 | Apr 21, 2026 · 1 in the news | OpenAI | — | — | —
Kimi K2.6 | Apr 20, 2026 · 4 in the news | Moonshot | 262K | $0.66 / $3.50 | —
Qwen 3.6 Max (Preview) | Apr 20, 2026 · 2 in the news | Alibaba | 262K | $1.04 / $6.24 | —
Grok 4.3 Beta | Apr 17, 2026 | xAI | — | — | —
Gemini Flash 3.1 TTS | Apr 15, 2026 | Google DeepMind | — | — | —
Muse Spark | Apr 8, 2026 | Meta AI | — | — | —
GLM-5.1 | Apr 7, 2026 · 5 in the news | Z.ai (Zhipu AI) | 203K | $0.98 / $3.08 | —
Gemma 4 31B IT | Apr 2, 2026 | Google DeepMind | 262K | $0.12 / $0.35 | —
Qwen 3.6 Plus | Apr 1, 2026 · 3 in the news | Alibaba | 1M | $0.33 / $1.95 | —
Qwen3.5-Omni-Flash | Mar 29, 2026 | Alibaba | — | — | —
Qwen3.5-Omni-Plus | Mar 29, 2026 | Alibaba | — | — | —
Composer 2 | Mar 19, 2026 | Cursor | — | — | —
MiMo-V2-Pro | Mar 18, 2026 | Xiaomi Corp | — | — | —
Nemotron 3 Super | Mar 11, 2026 | NVIDIA | — | — | —
GPT-5.4 Pro | Mar 5, 2026 | OpenAI | 1.1M | $30.00 / $180.00 | —
Gemini 3.0 Flash-lite | Mar 3, 2026 | Google DeepMind | — | — | —
SWE 1.6 | Mar 1, 2026 | Cognition | — | — | —
Qwen 3.5 Flash (hosted 35B-A3B) | Feb 25, 2026 | Alibaba | — | — | —
Qwen3.5-122B-A10B | Feb 24, 2026 | Alibaba | 262K | $0.26 / $2.08 | —
Qwen3.5 397B-A17B | Feb 13, 2026 · 1 in the news | Alibaba | 256K | $0.39 / $2.45 | —
Seedance 2.0 | Feb 12, 2026 · 3 in the news | ByteDance | — | — | —
Qwen3-Coder-Next | Feb 2, 2026 · 1 in the news | Alibaba | 262K | $0.11 / $0.80 | —
Qwen3-Max-Thinking | Jan 25, 2026 · 2 in the news | Alibaba | 262K | $0.78 / $3.90 | —
Solar Open 100B | Dec 31, 2025 | Upstage | — | — | —
K-EXAONE | Dec 31, 2025 | LG AI Research | — | — | —
VAETKI | Dec 30, 2025 | NC AI | — | — | —
A.X K1 | Dec 30, 2025 | SK Telecom | — | — | —
HyperCLOVA X SEED 32B Think | Dec 29, 2025 | NAVER | — | — | —
Nemotron 3-Nano-30B-A3B | Dec 15, 2025 | NVIDIA | 262K | $0.05 / $0.20 | —
GPT-5.2 Pro | Dec 11, 2025 | OpenAI | 400K | $21.00 / $168.00 | —
SIMA 2 | Dec 4, 2025 · 1 in the news | Google DeepMind | — | — | —
DeepSeekMath-V2 | Nov 27, 2025 | DeepSeek | — | — | —
Olmo 3 | Nov 20, 2025 · 1 in the news | Allen Institute for AI | — | — | —
Gemini 3 Pro Image (Nano Banana Pro) | Nov 20, 2025 | Google DeepMind | — | — | —
π0.6 (pi-0.6) | Nov 17, 2025 | Physical Intelligence | — | — | —
P1-235B-A22B | Nov 17, 2025 | Shanghai AI Lab | — | — | —
Grok 4.1 | Nov 17, 2025 | xAI | — | — | —
GPT-5.1 Instant | Nov 13, 2025 · 1 in the news | OpenAI | — | — | —
Meta's Generative Ads Model (GEM) | Nov 10, 2025 | Meta AI | — | — | —
Tongyi DeepResearch | Oct 28, 2025 | Alibaba | — | — | —
Veo 3.1 | Oct 15, 2025 · 1 in the news | Google DeepMind | — | — | —
Ling-1T | Oct 10, 2025 | Ant Group | — | — | —
GPT-5 Pro | Oct 7, 2025 | OpenAI | 400K | $15.00 / $120.00 | —
Sora 2.0 | Sep 30, 2025 | OpenAI | — | — | —
DeepSeek-V3.2-Exp | Sep 29, 2025 | DeepSeek | 164K | $0.27 / $0.41 | —
Gemini Robotics-ER 1.5 | Sep 25, 2025 | Google DeepMind | — | — | —
Gemini 2.5 Flash-Lite | Sep 25, 2025 · 1 in the news | Google DeepMind | 1M | $0.10 / $0.40 | —
Qwen3-Omni-30B-A3B | Sep 22, 2025 | Alibaba | — | — | —
Grok 4 Fast | Sep 19, 2025 | xAI | — | — | —
AgentFounder-30B | Sep 16, 2025 | Alibaba | — | — | —
Qwen3-Max | Sep 5, 2025 · 3 in the news | Alibaba | 262K | $0.78 / $3.90 | —
LongCat-Flash | Sep 1, 2025 | Meituan Inc | — | — | —
Gemini 2.5 Flash Image (Nano Banana) | Aug 26, 2025 | Google | — | — | —
DeepSeek-V3.1 | Aug 21, 2025 | DeepSeek | — | — | —
GLM-4.5 | Aug 5, 2025 · 7 in the news | Z.ai (Zhipu AI), Tsinghua University | 131K | $0.60 / $2.20 | —
Qwen Image | Aug 4, 2025 · 4 in the news | Alibaba | — | — | —
Hierarchical Reasoning Model (HRM) | Aug 4, 2025 | Sapient Intelligence | — | — | —
MindLink-72B | Aug 1, 2025 | Kunlun Inc. | — | — | —
Gemini 2.5 Deep Think | Aug 1, 2025 | Google, Google DeepMind | — | — | —
Qwen3-235B-A22B-Thinking (Jul 2025) | Jul 25, 2025 | Alibaba | — | — | —
Qwen3-235B-A22B (Jul 2025) | Jul 25, 2025 | Alibaba | — | — | —
GLM-4.5-Air | Jul 20, 2025 | Z.ai (Zhipu AI), Tsinghua University | 131K | $0.13 / $0.85 | —
EXAONE 4.0 (32B) | Jul 15, 2025 | LG AI Research | — | — | —
Gemini Embedding | Jul 14, 2025 | Google DeepMind | — | — | —
Grok 4 Heavy | Jul 10, 2025 | xAI | — | — | —
EXAONE Path 2.0 | Jul 9, 2025 | LG AI Research | — | — | —
Grok-3 mini | Jun 24, 2025 | xAI | — | — | —

Frequently asked questions

What is the best AI model for agentic tasks?

Gemini 3.5 Flash (Google) currently ranks first for agentic tasks on Model Beat, followed by Claude Fable 5 and GPT-5.5. Models are ranked by a 0–100 composite of agentic and tool-use benchmarks (Terminal-Bench, APEX, METR task horizon, and GDPval win/tie rate), scored as a percentile rank against every model released in the past year.

Which strong model for agentic tasks is the most affordable?

Among the top-ranked models for agentic tasks, Gemini 3.5 Flash is the cheapest at $1.50 per million input tokens.
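The comparison can be reproduced from the five highest-scoring rows of the leaderboard. The sketch below is illustrative only: the page does not define "top-ranked", so the cutoff of 90 (`min_score`) is an assumption, as are the `Row` type and the `cheapest` helper. Prices and scores are copied from the listing.

```python
from typing import NamedTuple


class Row(NamedTuple):
    model: str
    input_price: float   # USD per million input tokens
    output_price: float  # USD per million output tokens
    score: int           # Agentic & Tools score, 0-100


# The five highest-scoring rows from the listing on this page.
TOP_ROWS = [
    Row("Gemini 3.5 Flash", 1.50, 9.00, 100),
    Row("Claude Fable 5", 10.00, 50.00, 97),
    Row("GPT-5.5", 5.00, 30.00, 95),
    Row("Claude Opus 4.8", 5.00, 25.00, 94),
    Row("Claude Opus 4.7", 5.00, 25.00, 91),
]


def cheapest(rows, min_score=90):
    """Return the row with the lowest input price among rows scoring >= min_score.

    min_score is a hypothetical cutoff for "top-ranked"; the page gives none.
    """
    candidates = [row for row in rows if row.score >= min_score]
    return min(candidates, key=lambda row: row.input_price)


best = cheapest(TOP_ROWS)
print(best.model, best.input_price)  # Gemini 3.5 Flash 1.5
```

Sorting on input price alone ignores output price, which is several times higher for every model listed, so a workload that generates a lot of output may rank the options differently.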

How are these agentic task rankings calculated?

Benchmark and model data come from Epoch AI (CC BY); pricing and specs come from OpenRouter. ECI = Epoch Capabilities Index.