GPT-5.5 vs Qwen 3.7 Max

GPT-5.5 (OpenAI) and Qwen 3.7 Max (Alibaba) compared on benchmarks, pricing, context window, and use-case rankings.

Values are listed as GPT-5.5 (OpenAI) | Qwen 3.7 Max (Alibaba).

Scores
Intelligence (ECI): 159 | 153
Coding: 91 | 81
Math: 97 | 78
Reasoning & Knowledge: 92 | 85
Agentic & Tools: 87 | 78

Specifications
Developer: OpenAI | Alibaba
Family: GPT | Qwen
Released: Apr 23, 2026 | May 19, 2026
Parameters: not listed
Availability: API access
Context window: 1.1M | 1M
Price, $/M input: $5.00 | $1.25
Price, $/M output: $30.00 | $3.75
Inputs: file, image, text | text
Outputs: text | text

Benchmarks
AIME 2024/2025: 100% | 95%
APEX: 38%
ARC-AGI: 95%
ARC-AGI-2: 85%
FrontierMath: 52%
FrontierMath Tier 4: 35%
GPQA Diamond: 94% | 92%
GSO (code optimization): 40%
Humanity's Last Exam: 44% | 38%
SciCode: 56% | 49%
SimpleBench: 69% | 70%
SimpleQA Verified: 63% | 59%
SWE-bench Verified: 81% | 77%
Terminal-Bench: 85%
WebDev Arena: 1505 | 1541
WeirdML: 85%
τ²-bench: 94% | 95%

Use-case scores are 0–100 percentile composites across each area’s benchmarks, ranked against every model from the past year.
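The page does not say how the percentile composites are calculated. The sketch below shows one plausible reading, assuming each benchmark score is converted to a percentile rank among tracked models and the ranks are then averaged. The function names, the benchmark names (`bench_a`, `bench_b`), and all the numbers are hypothetical and are not taken from Model Beat or Epoch AI.

```python
from statistics import mean


def percentile_rank(score, all_scores):
    """Return the percentage (0-100) of tracked scores at or below `score`."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100.0 * at_or_below / len(all_scores)


def composite(model_scores, field):
    """Average a model's percentile rank over the benchmarks it has scores for.

    model_scores: {benchmark: score} for one model.
    field: {benchmark: [scores of every tracked model, including this one]}.
    """
    ranks = [
        percentile_rank(score, field[bench])
        for bench, score in model_scores.items()
        if bench in field
    ]
    return mean(ranks)


# Made-up data for illustration only.
FIELD = {
    "bench_a": [40, 55, 60, 77, 81],
    "bench_b": [30, 50, 85],
}
MODEL = {"bench_a": 77, "bench_b": 85}

if __name__ == "__main__":
    print(composite(MODEL, FIELD))
```

With this made-up data the model sits at the 80th percentile on `bench_a` and the 100th on `bench_b`, so the composite is 90. A real implementation would also have to decide how to handle ties and benchmarks a model has no score for, which the page does not describe.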

Frequently asked questions

Is GPT-5.5 better than Qwen 3.7 Max?

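On Epoch AI's Capabilities Index, GPT-5.5 scores higher (159) than Qwen 3.7 Max (153), and it leads on most benchmarks where both models have scores. Qwen 3.7 Max is narrowly ahead on SimpleBench (70% vs 69%), τ²-bench (95% vs 94%), and WebDev Arena (1541 vs 1505). The right pick depends on your task, so compare their coding, math, and reasoning scores in the table above.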

Which is cheaper, GPT-5.5 or Qwen 3.7 Max?

Qwen 3.7 Max is cheaper on both input and output tokens: $1.25 vs $5.00 per million input tokens, and $3.75 vs $30.00 per million output tokens (representative OpenRouter pricing).
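To see what the per-token prices mean for a concrete request, here is a minimal sketch that applies the listed $/M rates to a token count. The `request_cost` helper and the example workload (200,000 input tokens, 20,000 output tokens) are assumptions made for illustration. Actual bills may differ, for example because of provider-specific pricing tiers or caching discounts.

```python
# Prices in USD per million tokens, copied from the specifications on this page.
PRICES = {
    "GPT-5.5": {"input": 5.00, "output": 30.00},
    "Qwen 3.7 Max": {"input": 1.25, "output": 3.75},
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of one request at the listed per-million-token rates."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k tokens in, 20k tokens out.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 200_000, 20_000):.3f}")
```

For this hypothetical workload the sketch gives about $1.60 for GPT-5.5 and about $0.33 for Qwen 3.7 Max, roughly a fivefold difference. The gap widens as the share of output tokens grows, because the output price differs by 8x while the input price differs by 4x.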

Which has a larger context window, GPT-5.5 or Qwen 3.7 Max?

GPT-5.5 supports up to 1.1M tokens, compared with 1M for Qwen 3.7 Max.

Which is better for coding, GPT-5.5 or Qwen 3.7 Max?

Across coding benchmarks like SWE-bench Verified and Terminal-Bench, GPT-5.5 ranks higher: 91st vs 81st percentile among the models tracked on Model Beat.

Benchmarks & model data from Epoch AI (CC BY); pricing & specs from OpenRouter. ECI = Epoch Capabilities Index.