Research · Dec 9

FACTS Benchmark Suite: Systematically evaluating the factuality of large language models

Systematically evaluating the factuality of large language models with the FACTS Benchmark Suite.

