Can a technology called RAG keep AI models from making stuff up?

We’ve been living through the generative AI boom for nearly a year and a half now, following the late 2022 release of OpenAI’s ChatGPT. But despite transformative effects on companies’ share prices, generative AI tools powered by large language models (LLMs) still have major drawbacks that have kept them from being as useful as many would like them to be. Retrieval augmented generation, or RAG, aims to fix some of those drawbacks.

A framework for enhancing AI accuracy

Although RAG is now seen as a technique to help fix issues with generative AI, it actually predates ChatGPT. The term was coined in a 2020 academic paper by researchers at Facebook AI Research (FAIR, now Meta AI Research), University College London, and New York University.

As we've mentioned, LLMs struggle with facts. Google’s entry into the generative AI race, Bard, made an embarrassing error about the James Webb Space Telescope during its first public demonstration in February 2023. The error wiped around $100 billion off the value of parent company Alphabet. LLMs produce the most statistically likely response based on their training data and don’t understand anything they output, meaning they can present false information that seems accurate to anyone without expert knowledge of the subject.
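To make the "most statistically likely response" idea concrete, here is a deliberately tiny sketch. It is not how a real LLM works internally (those use neural networks over tokens, not lookup tables); the prompt, the `training_continuations` table, and the function name are all hypothetical, and the toy data is rigged so that the most common continuation is the wrong one.

```python
from collections import Counter

# Hypothetical toy "training data": words seen after the same prompt.
# The wrong answer is deliberately more frequent than the right one.
training_continuations = {
    "the capital of australia is": [
        "sydney", "sydney", "sydney", "canberra", "canberra",
    ],
}

def most_likely_continuation(prompt):
    """Return the most frequent continuation and its relative frequency."""
    counts = Counter(training_continuations[prompt])
    word, count = counts.most_common(1)[0]
    return word, count / sum(counts.values())

word, probability = most_likely_continuation("the capital of australia is")
print(word, probability)  # prints: sydney 0.6
```

The sketch never checks whether "sydney" is true (the capital is Canberra). It only reports what appeared most often, which is the gap between likelihood and accuracy described above.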

LLMs also lack up-to-date knowledge and the ability to identify gaps in their knowledge. “When a human tries to answer a question, they can rely on their memory and come up with a response on the fly, or they could do something like Google it or peruse Wikipedia and then try to piece an answer together from what they find there—still filtering that info through their internal knowledge of the matter,” said Giansiracusa.

But LLMs aren’t humans, of course. Their training data can age quickly, which matters most for time-sensitive queries. In addition, an LLM often can’t identify the specific sources of its knowledge, as all its training data is blended together into a kind of soup.

In theory, RAG should make keeping AI models up to date far cheaper and easier. “The beauty of RAG is that when new information becomes available, rather than having to retrain the model, all that’s needed is to augment the model’s external knowledge base with the updated information,” said Peterson. “This reduces LLM development time and cost while enhancing the model’s scalability.”
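As a rough illustration of that point, the sketch below builds a prompt from an external knowledge base and then adds one new document to it, with no model retraining involved. Everything here is an assumption made for illustration: the function names, the word-overlap scoring (production systems typically use vector embeddings), and the prompt wording. The call to an actual LLM is left out entirely.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(query, document):
    """Count the word tokens shared by the query and the document."""
    shared = Counter(tokenize(query)) & Counter(tokenize(document))
    return sum(shared.values())

def retrieve(query, knowledge_base, k=1):
    """Return up to k documents with a nonzero overlap score, best first."""
    scored = [(overlap_score(query, doc), doc) for doc in knowledge_base]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:k]]

def build_prompt(query, knowledge_base, k=1):
    """Prepend retrieved text to the question (the 'augmented' part of RAG)."""
    docs = retrieve(query, knowledge_base, k)
    context = "\n".join(docs) or "(no relevant documents found)"
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical external knowledge base, kept separate from the model itself.
knowledge_base = [
    "Google demonstrated Bard publicly in February 2023.",
    "Researchers coined the term RAG in a 2020 paper.",
]
query = "When was ChatGPT released?"
prompt_before = build_prompt(query, knowledge_base)

# New information arrives: add it to the knowledge base, leave the model alone.
knowledge_base.append("OpenAI released ChatGPT in late 2022.")
prompt_after = build_prompt(query, knowledge_base)

print(prompt_before)
print(prompt_after)
```

Before the update, nothing in the knowledge base matches the question, so the prompt carries no useful context. After the update, the same question retrieves the new document. Whether the model then uses that context faithfully is a separate question.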

Promoted Comments

Harvesterify

A very recent research paper explored the hypothesis that RAG would reduce hallucinations and improve recall when applied to legal texts and legal tasks (summarizing case law, document drafting, etc.). The conclusion is negative: specialized models still hallucinate between 17 and 33% of the time, which is a slight improvement over general-purpose models but not much, while slightly improving recall.

The paper is "Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools," by Varun Magesh, Faiz Surani, Matthew Dahl, Mirac Suzgun, Christopher D. Manning, and Daniel E. Ho.

June 6, 2024 at 12:38 pm

The framework pulls in external sources to enhance accuracy. Does it live up to the hype?
