VOOZH about

URL: https://thenewstack.io/3-ways-to-stop-llm-hallucinations/

⇱ 3 Ways to Stop LLM Hallucinations - The New Stack


TNS
SUBSCRIBE
Join our community of software engineering leaders and aspirational developers. Always stay in-the-know by getting the most important news and exclusive content delivered fresh to your inbox to learn more about at-scale software development.
REQUIRED
It seems that you've previously unsubscribed from our newsletter in the past. Click the button below to open the re-subscribe form in a new tab. When you're done, simply close that tab and continue with this form to complete your subscription.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.
Welcome and thank you for joining The New Stack community!
Please answer a few simple questions to help us deliver the news and resources you are interested in.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Great to meet you!
Tell us a bit about your job so we can cover the topics you find most relevant.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Welcome!

We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.

What’s next?

Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.

Follow TNS on your favorite social media networks.

Become a TNS follower on LinkedIn.

Check out the latest featured and trending stories while you wait for your first TNS newsletter.

PREV
1 of 2
NEXT
VOXPOP
As a JavaScript developer, what non-React tools do you use most often?
Angular
0%
Astro
0%
Svelte
0%
Vue.js
0%
Other
0%
I only use React
0%
I don't use JavaScript
0%
Thanks for your opinion! Subscribe below to get the final results, published exclusively in our TNS Update newsletter:
NEW! Try Stackie AI
From clobbered drafts to real-time sync
Apr 14th 2026 10:00am, by David Moore
TypeScript 6.0 RC arrives as a bridge to a faster future
Mar 14th 2026 9:00am, by Darryl K. Taft
Mastra empowers web devs to build AI agents in TypeScript
Jan 28th 2026 11:00am, by Loraine Lawson
2023-09-15 10:15:23
3 Ways to Stop LLM Hallucinations
sponsor-datastax,sponsored-post-contributed,
AI / Large Language Models

3 Ways to Stop LLM Hallucinations

How retrieval-augmented generation, reasoning and iterative querying help large language models reply accurately to prompts.
Sep 15th, 2023 10:15am by Alan Ho
👁 Featued image for: 3 Ways to Stop LLM Hallucinations
DataStax sponsored this post.

Large language models have become extremely powerful today; they can help provide answers to some of our hardest questions. But they can also lead us astray: They tend to hallucinate, which means that they give answers that seem right but aren’t.

LLMs hallucinate when they encounter queries that aren’t part of their training data set — or when their training data set contains erroneous information (this can happen when LLMs are trained on internet data, which, as we all know, can’t always be trusted). LLMs also don’t have memory. Finally, “fine tuning” is often regarded as a way to reduce hallucinations by retraining a model on new data — but it has its drawbacks.

Here, we’ll look at three methods to stop LLMs from hallucinating: retrieval-augmented generation (RAG), reasoning and iterative querying.

DataStax, an IBM company, provides the real-time vector data tools that Gen AI apps need, with seamless integration with developers’ stacks of choice.
Learn More
The latest from DataStax

Retrieval-Augmented Generation

With RAG, a query comes into the knowledge base (which, in this case, is a vector database) as a semantic vector — a string of numbers.

The model then retrieves similar documents from the database using vector search, looking for documents whose vectors are close to the vector of the query.

Once the relevant documents have been retrieved, the query, along with these documents, is used by the LLM to summarize a response for the user. This way, the model doesn’t have to rely solely on its internal knowledge but can access whatever data you provide it at the right time. In a sense, it provides the LLM with “long-term memory” that it doesn’t possess on its own. The model can provide more accurate and contextually appropriate responses by including proprietary data stored in the vector database.

👁 Image

An alternate RAG approach incorporates fact-checking. The LLM is prompted for an answer, which is then fact-checked and reviewed against data in the vector database. An answer to the query is produced from the vector database, and then the LLM uses that answer as a prompt to discern whether it’s related to a fact.

👁 Image

Reasoning

LLMs are good at a lot of things. They can predict the next word in a sentence, thanks to advances in “transformers,” which transform how machines understand human language by paying varying degrees of attention to different parts of the input data. LLMs are also good at boiling down a lot of information into a concise answer, and finding and extracting something you’re looking for from a large amount of text. Surprisingly, LLMS can also plan — they can gather data and plan a trip for you.

And maybe even more surprisingly, LLMs can use reasoning to produce an answer, in an almost human-like fashion. Because people can reason, they don’t need tons of data to make a prediction or decision. Reasoning also helps LLMs to avoid hallucinations. An example of this is “chain-of-thought prompting.”

This method helps models to break multistep problems into intermediate steps. With chain-of-thought prompting, LLMs can solve complex reasoning problems that standard prompt methods can’t (for an in-depth look, check out the blog post Language Models Perform Reasoning via Chain of Thought from Google).

If you give an LLM a complicated math problem, it might get it wrong. But if you provide the LLM with the problem as well as the method of solving it, it can produce an accurate answer — and share the reason behind the answer. A vector database is a key part of this method, as it provides examples of questions similar to this and populates the prompt with the example.

Even better, once you have the question and answer, you can store it in the vector database to further improve the accuracy and usefulness of your generative AI applications.

👁 Image

There are a host of other reasoning advancements you can learn about, including tree of thought, least to most, self-consistency and instruction tuning.

Iterative Querying

The third method to help reduce LLM hallucinations is interactive querying. In this case, an AI agent mediates calls that move back and forth between an LLM and a vector database. This can happen multiple times iteratively in order to arrive at the best answer. An example of this is forward-looking active retrieval generation, also known as FLARE.

You take a question and then query your knowledge base for similar questions. You’d get a series of similar questions. Then you query the vector database with all the questions, summarize the answer, and check if the answer looks good and reasonable. If it doesn’t, repeat the steps until it does.

👁 Image

Other advanced interactive querying methods include AutoGPT, Microsoft Jarvis and Solo Performance Prompting.

There are many tools that can help you with agent orchestration. LangChain is a great example that helps you orchestrate calls between an LLM and a vector database. It essentially automates the majority of management tasks and interactions with LLMs and provides support for memory, vector-based similarity search, advanced prompt-templating abstraction and a wealth of other features. It also helps and supports advanced prompting techniques like chain-of-thought and FLARE.

Another such tool is CassIO, which was developed by DataStax as an abstraction on top of our Astra DB vector database, with the idea of making data and memory first-class citizens in generative AI. CassIO is a Python library that makes the integration of Cassandra with generative artificial intelligence and other machine learning workloads seamless by abstracting the process of accessing the database, including its vector search capabilities, and offering a set of ready-to-use tools that minimize the need for additional code.

Putting It All Together: SkyPoint AI

SkyPoint AI is a SaaS provider specializing in data, analytics and AI services for the senior care and living industry. The company leverages generative AI to enable natural and intuitive interactions between seniors, caregivers and software systems. By simplifying complex applications and streamlining the user experience, SkyPoint AI empowers seniors and caregivers to access information and insights effortlessly, which helps enhance care.

The company pulls from a wide variety of data that is both structured and unstructured to provide AI-generated answers to prompts like “How many residents are currently on Medicare?” said SkyPoint chief executive Tisson Mathew. This helps care providers make informed decisions quickly, based on accurate data, he said.

Getting to that point, however, was a process, Mathew said. His team started by taking a standard LLM and fine-tuning it with SkyPoint data. “It came up with disastrous results — random words, even,” he said. Understanding and creating prompts was something SkyPoint could handle, but it needed an AI technology stack to handle generating accurate answers at scale.

SkyPoint ended up building a system that ingested structured data from operators and providers, including electronic health-care record and payroll data, for example. This is stored in a columnar database; RAG is used to query it. Unstructured data, such as policies and procedures and state regulations, is stored in a vector database: DataStax Astra DB.

Tisson posed a question as an example: What if a resident becomes abusive? Astra DB provides an answer that is assembled based on state regulations and the users context, and a variety of different documents and vector embeddings in natural language that’s easy for a senior-care facility worker to understand,

“These are specific answers that have to be right,” Tisson said. “This is information an organization relies on to make informed decisions for their community and for their business.”

Conclusion

SkyPoint AI illustrates the importance of mitigating the risk of AI hallucinations; the consequences could be potentially dire without the methods and tools available to ensure accurate answers.

With RAG, reasoning and iterative querying approaches such as FLARE, generative AI — particularly when fueled by proprietary data — is becoming an increasingly powerful tool to help enterprises serve their customers efficiently and effectively.

Learn more about how DataStax helps you build real-time, generative AI applications.

DataStax, an IBM company, provides the real-time vector data tools that Gen AI apps need, with seamless integration with developers’ stacks of choice.
Learn More
The latest from DataStax
TRENDING STORIES
Alan Ho is a product manager and entrepreneur with a passion for advancing artificial intelligence and computing. He is currently working on adapting DataStax technology for generative AI. Alan founded a startup company that provides application performance management for mobile...
Read more from Alan Ho
DataStax sponsored this post.
SHARE THIS STORY
TRENDING STORIES
TNS owner Insight Partners is an investor in: Pragma.
SHARE THIS STORY
TRENDING STORIES
TNS DAILY NEWSLETTER Receive a free roundup of the most recent TNS articles in your inbox each day.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.