Products · Jun 16

First, do NOHARM: towards clinically safe large language models

Researchers have developed NOHARM, a new benchmark designed to systematically evaluate the clinical safety and risk profiles of large language models used in healthcare. Because these systems are increasingly consulted by patients and medical professionals despite lacking standardized safety validation, this tool provides a structured method to quantify potential harm. By establishing rigorous assessment protocols, the framework aims to address the current lack of transparency regarding the reliability of AI-generated medical advice.

Covered by 2 sources

  • arXiv CS.AI · David Wu, Fateme Nateghi Haredasht, Saloni Kumar Maharaj, Priyank Jain, Jessica Tran, Matthew Gwiazdon, Arjun Rustagi, Jenelle Jindal, Jacob M. Koshy, Vinay Kadiyala, Anup Agarwal, Bassman Tappuni, Brianna French, Sirus Jesudasen, Christopher V. Cosgriff, Rebanta Chakraborty, Jillian Caldwell, Susan Ziolkowski, David J. Iberri, Robert Diep, Rahul S. Dalal, Kira L. Newman, Kristin Galetta, J. Carl Pallais, Nancy Wei, Kathleen M. Buchheit, David I. Hong, Vartan Pahalyants, Ernest Y. Lee, Allen Shih, Tamara B. Kaplan, Vishnu Ravi, Sarita Khemani, Thomas A. Buckley, April S. Liang, Daniel Shirvani, Advait Patil, Nicholas Marshall, Kanav Chopra, Joel Koh, Adi Badhwar, Anastasia Perez, Austin J. Schoeffler, Mahbuba Tusty, Chase M. Walton, Liam G. McCoy, David J. H. Wu, Yingjie Weng, Sumant Ranji, Kevin Schulman, Nigam H. Shah, Jason Hom, Arnold Milstein, Arjun K. Manrai, Adam Rodman, Jonathan H. Chen, Ethan Goh · Jun 17
  • Hacker News · benwen · Jun 16
