Opinion · 5h ago

Do LLMs Truly Generalize in the Molecular Domain? A Perturbation-Based Analysis

A new study evaluating large language models in molecular science reveals that these systems often struggle to maintain chemical accuracy when their input strings are perturbed. While LLMs excel at predicting sequences, they frequently fail to respect the underlying physical and topological rules required for valid molecular structures. This finding highlights a critical limitation in applying generative language models to drug discovery and materials science, where precision is essential for functional results.

Covered by 1 source

arXiv CS.AI · Jiatong Li, Weida Wang, Changmeng Zheng, Shufei Zhang, Yatao Bian, Xiao-yong Wei, Qing Li · 5h ago
