Models · 5h ago

HaloGuard 1.0: An Open Weights Constitutional Classifier for Multilingual AI Safety

Researchers have released HaloGuard 1.0, an open-weights safety classifier that detects harmful prompts across multiple languages. Built on a constitutional AI framework, the model achieves high performance while using only a fraction of the compute that such safety systems typically require.

Covered by 1 source

arXiv CS.AI · Navaneeth Sangameswaran, Preetham S, Ashmiya Lenin · 5h ago

