
URL: https://thenewstack.io/get-started-with-aws-bedrock-for-genai-apps/

Get Started With AWS Bedrock for GenAI Apps - The New Stack


2024-08-02 11:06:20
AI / Databases / DevOps

Get Started With AWS Bedrock for GenAI Apps 

Build a retrieval-augmented generation (RAG) framework with AWS Bedrock and a Zilliz Cloud vector database.
Aug 2nd, 2024 11:06am by Jason Myers
Image from GamePixel on Shutterstock.
Zilliz sponsored this post.

Retrieval-augmented generation (RAG) is a widely used technique that augments large language models (LLMs) and GenAI apps by providing contextual information from external sources.

This method can significantly reduce LLM hallucinations. For example, if you ask a GenAI app to write an article about sharks, a RAG approach helps ensure that the AI doesn’t make up a new type of shark or create new “facts” about known species. RAG also lets users draw on domain-specific or private data for content generation while helping keep that data secure.

How does RAG work? Everything starts with a query. The RAG process has three key steps: retrieval, augmentation and generation.


Figure 1: How RAG works

  1. Retrieval. This step identifies and retrieves content relevant to the user query by conducting a semantic search in a vector database. Vector databases store, index and retrieve vector embeddings created from external sources by a pretrained embedding model of your choice or a model that you build.
  2. Augmentation. After retrieving the semantically similar information from the vector database, the augmentation step combines the retrieved data with the original query along with any prompts, organizing them into instructions for the LLM to generate a response.
  3. Generation. In this step, the LLM produces the response from the augmented prompt, using its natural language processing (NLP) capabilities to handle syntax, grammar, structure, etc.
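
The three steps above can be sketched in a few lines of plain Python. This is a toy illustration, not the pipeline built later in this article: the `embed` function is a bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and `echo_llm` is a hypothetical placeholder for an LLM call.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Step 1: return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def augment(query, context_docs):
    """Step 2: combine the retrieved context with the original query."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def generate(prompt, llm):
    """Step 3: hand the augmented prompt to the LLM."""
    return llm(prompt)


DOCS = [
    "Great white sharks can grow to about six meters long.",
    "Vector databases index embeddings for similarity search.",
    "Whale sharks are filter feeders that eat plankton.",
]


def echo_llm(prompt):
    # Hypothetical placeholder; a real app would call a hosted model here.
    return f"[model output for a {len(prompt)}-character prompt]"


query = "How long can great white sharks grow?"
top = retrieve(query, DOCS, k=1)
print(generate(augment(query, top), echo_llm))
```

Note that the same `embed` function is applied to both the query and the documents, which is what makes the similarity scores meaningful.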

Choosing the Right Model and Vector Database for Your GenAI Apps

In a basic RAG system, the embedding model, the vector database and the LLM are the three most crucial building blocks. When you build a RAG framework, you need to decide early on what technologies best suit your application.

You can use any embedding model suited to your application data to create vector embeddings, but each model generates vectors in its own way, so vectors from different models are not comparable. This means you need to use the same model to generate the vector embeddings for both queries and datasets.

Which vector database you choose depends on the size of your data, the purpose of your application, the data requirements you need to meet and other factors. If you have a large dataset and want to build a RAG app for production, it is important to choose a vector database that can handle the scale.

Choosing the right LLM can be challenging as well. Fortunately, AWS Bedrock offers a variety of pretrained models, including embedding models and LLMs, to simplify this process. AWS Bedrock is a cloud service that provides access to these models, allowing you to select the ones that best fit your application. You can use one Bedrock model to generate vector embeddings and another as the LLM component of your RAG framework.

Integrate Zilliz Cloud With AWS Bedrock To Build a RAG Chain

This example shows you how to integrate LangChain, Zilliz Cloud (the managed version of Milvus) and AWS Bedrock. Let’s take a guided tour through the example.

There are four main steps to this integration:

  1. Install the required LangChain and AWS SDK for Python packages.
  2. Connect Zilliz Cloud to AWS Bedrock.
  3. Load and split documents from external sources.
  4. Predefine template guidelines and generate responses.

Install the Required Packages

Start by installing the required LangChain packages and the AWS SDK for Python (boto3) with pip.
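
A minimal sketch of the install step, written in Python so it can run from a script or notebook. The package list is an assumption based on the libraries this article names (LangChain, boto3, BeautifulSoup, Milvus/Zilliz); check the linked notebook for the exact set and versions.

```python
import subprocess
import sys

# Assumed package list; not taken verbatim from the original article.
PACKAGES = [
    "langchain",
    "langchain-core",
    "langchain-text-splitters",
    "langchain-community",
    "langchain-aws",
    "pymilvus",
    "beautifulsoup4",
    "boto3",
]


def pip_install_command(packages):
    """Build a pip command that targets the current Python interpreter."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", "--quiet", *packages]


def install_packages(packages=PACKAGES):
    """Run pip; raises CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(packages))


# Call install_packages() once in your environment to perform the install.
```

Using `sys.executable -m pip` installs into the same interpreter that runs the script, which avoids the common mismatch between `pip` and `python` on the PATH.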

Configure the Zilliz Cloud/AWS Bedrock Connection

Once you’ve installed everything, configure the requisite environment variables to ensure that Zilliz and Bedrock can talk to each other. On the AWS side, you’ll need the AWS region name, access key ID and secret access key. On the Zilliz side, you’ll need the cloud Uniform Resource Identifier (URI) and API key.
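
One way to gather these five values is to read them from environment variables at startup and fail early if any are missing. In this sketch the AWS variable names are the ones boto3 reads by default, while `ZILLIZ_CLOUD_URI`, `ZILLIZ_CLOUD_API_KEY` and the `RagConfig` class are hypothetical names chosen for illustration.

```python
import os
from dataclasses import dataclass

# Maps config fields to environment variable names.
# The two Zilliz names are hypothetical; use whatever your deployment defines.
ENV_NAMES = {
    "aws_region": "AWS_DEFAULT_REGION",
    "aws_access_key_id": "AWS_ACCESS_KEY_ID",
    "aws_secret_access_key": "AWS_SECRET_ACCESS_KEY",
    "zilliz_uri": "ZILLIZ_CLOUD_URI",
    "zilliz_api_key": "ZILLIZ_CLOUD_API_KEY",
}


@dataclass(frozen=True)
class RagConfig:
    aws_region: str
    aws_access_key_id: str
    aws_secret_access_key: str
    zilliz_uri: str
    zilliz_api_key: str


def load_config(env=None):
    """Read all settings, raising KeyError that lists every missing variable."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_NAMES.values() if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return RagConfig(**{field: env[name] for field, name in ENV_NAMES.items()})
```

Keeping credentials in the environment rather than in source code also makes it easier to keep them out of version control.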

The AWS SDK for Python (boto3) lets you create, configure and manage AWS services. Next, you’ll create a boto3 client to connect to the AWS Bedrock Runtime service.
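
A sketch of the client creation, assuming the credentials are passed in explicitly. `bedrock-runtime` is the boto3 service name for the Bedrock Runtime API; the helper function names are illustrative, and `boto3` is imported inside the function so the snippet loads even where the SDK is not yet installed.

```python
def bedrock_client_kwargs(region_name, aws_access_key_id, aws_secret_access_key):
    """Collect the keyword arguments passed to boto3.client()."""
    return {
        "service_name": "bedrock-runtime",
        "region_name": region_name,
        "aws_access_key_id": aws_access_key_id,
        "aws_secret_access_key": aws_secret_access_key,
    }


def create_bedrock_client(region_name, aws_access_key_id, aws_secret_access_key):
    """Create a Bedrock Runtime client (requires boto3 and valid credentials)."""
    import boto3  # lazy import: only needed when a client is actually created

    return boto3.client(
        **bedrock_client_kwargs(region_name, aws_access_key_id, aws_secret_access_key)
    )
```

If your AWS credentials are already configured through the standard boto3 credential chain, the two key arguments can be omitted and boto3 will resolve them itself.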

Use a ChatBedrock instance to gain access to all the Bedrock models. In this example, we’ll link it to `anthropic.claude-3-sonnet-20240229-v1:0`.

You can select any of the other Bedrock chat models instead. ChatBedrock provides the infrastructure for generating text responses and lets you pass model-specific settings, such as a low temperature parameter to limit response variability.
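
A sketch of the ChatBedrock setup using the `langchain-aws` package. The model ID is the one named above; the temperature value of 0.1 is an assumption, picked only to illustrate a low setting, and `create_llm` is an illustrative helper name.

```python
MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"  # model named in this article
MODEL_KWARGS = {"temperature": 0.1}  # assumed value; lower means less variability


def create_llm(client, region_name):
    """Wrap a Bedrock Runtime client in a LangChain chat model.

    `client` is a boto3 "bedrock-runtime" client created elsewhere.
    """
    from langchain_aws import ChatBedrock  # lazy import: needs langchain-aws

    return ChatBedrock(
        client=client,
        model_id=MODEL_ID,
        region_name=region_name,
        model_kwargs=MODEL_KWARGS,
    )
```

The selected model must be enabled for your AWS account and available in the chosen region, or calls to it will fail.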

Load and Split Documents From External Sources

Now that everything is connected, we need to get some data from external sources. In this example, we’re pulling data from a specific web source: a blog post about AI agents.

We’ll use a WebBaseLoader instance to grab that data and pass it a BeautifulSoup SoupStrainer so that only the relevant parts of the web page are parsed. We’re only targeting the following classes: “post-content,” “post-title” and “post-header.”
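
A sketch of the loading step with `WebBaseLoader` from `langchain-community`. The three class names come from the text above; the `url` parameter is a placeholder for the blog post's address, which this article does not spell out, and `load_post` is an illustrative helper name.

```python
# CSS classes to keep when parsing the page, as listed in the article.
TARGET_CLASSES = ("post-content", "post-title", "post-header")


def load_post(url):
    """Fetch a web page and return LangChain documents for the targeted parts.

    `url` is supplied by the caller; it is not defined in this article.
    """
    import bs4  # lazy imports: need beautifulsoup4 and langchain-community
    from langchain_community.document_loaders import WebBaseLoader

    loader = WebBaseLoader(
        web_paths=(url,),
        # SoupStrainer restricts parsing to elements with these classes.
        bs_kwargs={"parse_only": bs4.SoupStrainer(class_=TARGET_CLASSES)},
    )
    return loader.load()
```

These class names match one particular blog layout; a page from a different site will need its own selectors.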

Once that data is loaded, we use a RecursiveCharacterTextSplitter instance to split it into smaller pieces, making it easier to work with and load into other components.
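
A sketch of the splitting step with `RecursiveCharacterTextSplitter` from `langchain-text-splitters`. The chunk size of 2,000 characters and overlap of 200 are assumed values for illustration, and `validate_chunking` is a hypothetical helper added here to catch bad settings early.

```python
def validate_chunking(chunk_size, chunk_overlap):
    """Reject settings that cannot produce sensible chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if chunk_overlap < 0 or chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")


def split_documents(documents, chunk_size=2000, chunk_overlap=200):
    """Split loaded documents into overlapping chunks (sizes are in characters)."""
    from langchain_text_splitters import RecursiveCharacterTextSplitter  # lazy import

    validate_chunking(chunk_size, chunk_overlap)
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
    )
    return splitter.split_documents(documents)
```

The overlap repeats a little text between neighboring chunks so that a sentence cut at a boundary still appears whole in at least one of them.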

Generating Responses

Now we want to use the data we loaded to generate new content. We also want to ensure the output is accurate and mitigates AI hallucination. We instruct the AI to use statistical information and hard data whenever possible to support its claims.

The prompt template closes with this instruction: “The response should be specific and use statistics or numbers when possible.”
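
A sketch of what such a prompt template could look like. The wording is an assumption written for illustration, apart from the closing instruction quoted in this article; the `{context}` and `{question}` placeholders are filled in at query time.

```python
# Illustrative template; only the final sentence is taken from the article.
PROMPT_TEMPLATE = """You are an AI assistant that answers questions using fact-based and statistical information when possible.
Use the following context to answer the question enclosed in <question> tags.
If you don't know the answer, say that you don't know; do not make one up.

<context>
{context}
</context>

<question>
{question}
</question>

The response should be specific and use statistics or numbers when possible."""


def build_prompt():
    """Wrap the template string in a LangChain PromptTemplate."""
    from langchain_core.prompts import PromptTemplate  # lazy import

    return PromptTemplate(
        template=PROMPT_TEMPLATE,
        input_variables=["context", "question"],
    )
```

Telling the model to admit when it does not know is a common guard against hallucination: it gives the model an acceptable alternative to inventing an answer.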

Next, we initialize a Zilliz vector store containing the embeddings of the chunked documents. Having the documents as vectors is what makes it possible for RAG to do a semantic search to find and retrieve documents quickly and efficiently. The output should provide accurate, insightful, relevant and fact-based answers.
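
A sketch of the vector store setup using the `Zilliz` class from `langchain-community`. It assumes `embeddings` is a LangChain embeddings object (for example, a `BedrockEmbeddings` instance from `langchain-aws`) and that the Zilliz Cloud API key is passed as the connection token; the helper names are illustrative.

```python
def zilliz_connection_args(uri, api_key):
    """Connection settings for a Zilliz Cloud cluster."""
    return {"uri": uri, "token": api_key, "secure": True}


def build_retriever(chunks, embeddings, uri, api_key):
    """Embed the chunks, store them in Zilliz Cloud and return a retriever."""
    from langchain_community.vectorstores import Zilliz  # lazy import

    vectorstore = Zilliz.from_documents(
        documents=chunks,
        embedding=embeddings,
        connection_args=zilliz_connection_args(uri, api_key),
        auto_id=True,
        drop_old=True,  # caution: drops any existing collection of the same name
    )
    return vectorstore.as_retriever()
```

Because `drop_old=True` replaces existing data, it suits a tutorial run but should be reconsidered before pointing this at a collection you want to keep.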

To recap, here are the steps for the RAG chain:

  • First, the question is converted to a vector embedding to enable retrieval of relevant documents stored in the vector database.
  • Next, these documents are processed by a retriever and formatter.
  • Then the documents are passed to a prompt template, which structures the input for the model.
  • Finally, a large language model receives this structured input to generate a coherent response, which is parsed into a string format and presented to the user.
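
The recap above maps onto a LangChain Expression Language (LCEL) chain roughly as follows. This is a sketch that assumes `retriever`, `prompt` and `llm` were created as described in the earlier sections; `format_docs` and `build_rag_chain` are illustrative names.

```python
def format_docs(docs):
    """Join the retrieved documents' text into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


def build_rag_chain(retriever, prompt, llm):
    """Compose retrieval, formatting, prompting, generation and parsing."""
    from langchain_core.output_parsers import StrOutputParser  # lazy imports
    from langchain_core.runnables import RunnablePassthrough

    return (
        # The question goes to the retriever and is also passed through as-is.
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()  # convert the model's message into a plain string
    )
```

With the chain built, a call such as `build_rag_chain(retriever, prompt, llm).invoke("your question")` runs all four steps and returns the answer as a string.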

For the full code of this example, please refer to this notebook.

RAG Use Cases

A RAG framework can enhance many different use cases. The brief descriptions below span a variety of industries and verticals. Depending on your goals, you can find or build niche LLMs for these and other use cases.

Question-Answering Systems

RAG frameworks can provide detailed and accurate answers to user questions by retrieving relevant information from a large database and generating a coherent response.

Customer Support

Automated customer support systems can use RAG to find relevant information in support documents, manuals or FAQs and generate helpful responses to customer inquiries.

Content Creation and Summarization

RAG frameworks can help create content by retrieving relevant information from various sources and generating articles, reports or summaries.

Personalized Recommendations

In recommendation systems, RAG can enhance the generation of personalized recommendations by retrieving and synthesizing information based on user preferences and past behavior.

Educational Tools

Educational platforms can use RAG to generate personalized study materials, answer student questions and provide explanations based on a vast pool of educational resources.

Legal and Medical Assistance

RAG frameworks can benefit legal and medical professionals by allowing them to retrieve and synthesize information from case law, medical literature and patient records to assist in decision-making and provide advice.

Interactive Storytelling and Gaming

RAG can be used to create dynamic and interactive storytelling experiences in games, where the system generates plot twists and dialogues based on retrieved story elements and user interactions.

Research and Development

Researchers can use RAG to gather and summarize relevant research papers, patents or technical documents, helping them stay updated with the latest developments and find connections between different pieces of information.

Virtual Assistants

Virtual assistants can use RAG to provide more accurate and contextually relevant responses by retrieving information from a knowledge base and generating appropriate replies.

Market Analysis and Business Intelligence

Businesses can use RAG to analyze market trends, competitor strategies and customer feedback by retrieving relevant data and generating insightful reports and action plans.

Code Generation and Documentation

Developers can use RAG frameworks to generate code snippets, documentation or explanations by retrieving relevant programming information from code repositories and technical documentation.

Final Thoughts

A RAG framework provides developers with a way to leverage large datasets, whether structured or unstructured, to build applications that are accurate and reliable. Pairing Zilliz Cloud with AWS Bedrock in a RAG framework gives you quick access to powerful tools. The prebuilt models in AWS Bedrock give you many options for building a wide range of GenAI applications. This getting-started tutorial is just the tip of the iceberg. To learn more about Zilliz Cloud, visit zilliz.com.

Zilliz is a leading vector database company, offering high-performing and scalable solutions. We’re powered by Milvus, the popular open-source vector database that helps companies of any scale build AI-powered search solutions.
Jason Myers earned a PhD in modern Irish history from Loyola University Chicago. Since then, he has used the writing skills he developed in his academic work to create content for a range of startup and technology companies. When he's...
AWS is a sponsor of The New Stack.
TNS owner Insight Partners is an investor in: Uniform.