10 · Research · Aug 7

From hard refusals to safe-completions: toward output-centric safety training

OpenAI's new safe-completions approach in GPT-5 moves beyond hard refusals to output-centric safety training, a more nuanced way of handling dual-use prompts that aims to improve both the safety and the helpfulness of AI responses.

