Daily archive

Thursday December 18, 2025

10 stories — deduplicated across sources, ranked by significance, every source cited.
Significance: 10
★ Top story · Industry · Dec 18

Evaluating chain-of-thought monitorability

OpenAI introduces a new framework and evaluation suite for chain-of-thought monitorability, covering 13 evaluations across 24 environments. The company reports that monitoring a model’s chain-of-thought reasoning is far more effective than monitoring its outputs alone, and presents this as a promising path toward scalable control as AI systems grow more capable.