Research · Jun 19

New benchmark exposes how badly AI struggles with real knowledge work

A new testing framework designed to simulate professional knowledge tasks found that top-tier AI models successfully completed only three percent of assignments. The results highlight a significant performance gap between current systems and the requirements of complex, multi-step workflows. By using a more rigorous evaluation method than standard benchmarks, this study demonstrates that today’s models frequently struggle with the accuracy and reasoning needed for practical, real-world productivity.

Covered by 1 source

Related stories

Research · Oracle Cut 21,000 Jobs in 12 Months, Says AI Replaced Some Roles · Jun 22 · 11 sources
Research · Google Deepmind loses another top AI researcher as Nobel laureate John Jumper leaves for Anthropic · Jun 19 · 6 sources
Research · Using AI to help physicians diagnose rare genetic diseases affecting children · Jun 18 · 3 sources
Research · More people get news from AI chatbots, but trust remains low · Jun 17 · 3 sources