FACTS Grounding: A New Benchmark for Evaluating the Factuality of Large Language Models
Google News, Tuesday, December 17th, 2024
Our comprehensive benchmark and online leaderboard offer a much-needed measure of how accurately LLMs ground their responses in provided source material and avoid hallucinations
Large language models (LLMs) are transforming how we access information, yet their grip on factual accuracy remains imperfect. They can 'hallucinate' false information, particularly when given complex inputs. In turn, this can erode trust in LLMs and limit their applications in the real world.
Today, we're introducing FACTS Grounding, a comprehensive benchmark for evaluating the ability of LLMs to generate responses that are not only factually accurate with respect to given inputs, but also sufficiently detailed to provide satisfactory answers to user queries.