1. Overview
In this tutorial, we'll explore the fluent API of ChatClient, a feature of the Spring AI module version 1.0.0 M1.
The ChatClient interface from the Spring AI module enables communication with AI models, allowing users to send prompts and receive structured responses. It follows the builder pattern, offering an API similar to WebClient, RestClient, and JdbcClient.
2. Executing Prompts via ChatClient
We can use the client in Spring Boot as an auto-configured bean, or create an instance programmatically.
First, let's add the spring-ai-openai-spring-boot-starter dependency to our pom.xml:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
With this, we can inject the ChatClient.Builder instance into our Spring-managed components:
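import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("api/articles")
class BlogsController {

    private final ChatClient chatClient;

    // The auto-configured builder is injected by Spring
    public BlogsController(ChatClient.Builder chatClientBuilder) {
        this.chatClient = chatClientBuilder.build();
    }

    // ...
}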
Now, let's create a simple endpoint that accepts a question as a query parameter and forwards the prompt to the AI:
@GetMapping("v1")
String askQuestion(@RequestParam(name = "question") String question) {
    return chatClient.prompt()
        .user(question)
        .call()
        .chatResponse()
        .getResult()
        .getOutput()
        .getContent();
}
As we can see, the fluent ChatClient allows us to easily create a prompt request from the user's input String, call the API, and retrieve the response content as text.
Moreover, if we're only interested in the response content as a String and don't need the metadata carried by the ChatResponse, we can simplify our code by using the content() method, which replaces the last four steps. Let's refactor the code accordingly:
@GetMapping("v1")
String askQuestion(@RequestParam(name = "question") String question) {
    return chatClient.prompt()
        .user(question)
        .call()
        .content();
}
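To try the endpoint from plain Java, a small client based on the JDK's java.net.http.HttpClient could look like the sketch below. The class name ArticleApiClient is hypothetical and not part of the application above. The base URL is supplied by the caller; http://localhost:8080 is only an assumption based on Spring Boot's default port.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for calling the v1 endpoint; not part of the original application.
class ArticleApiClient {

    // Builds the request URI, URL-encoding the question so that spaces and
    // characters such as '?' or '&' are sent as a single query parameter value.
    static URI buildUri(String baseUrl, String question) {
        String encoded = URLEncoder.encode(question, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/articles/v1?question=" + encoded);
    }

    // Sends the GET request and returns the raw response body as text.
    static String ask(String baseUrl, String question) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(buildUri(baseUrl, question))
            .GET()
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

With the application running locally, a call such as ArticleApiClient.ask("http://localhost:8080", "Which articles cover Spring AI?") would return the model's answer as a plain String.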
If we send a GET request now, we'll receive a response without a defined structure, similar to the default output from ChatGPT when accessed through a browser:
[Figure: request article recommendation]
3. Mapping Response to a Specific Format
As we can see, the ChatClient interface simplifies the process of forwarding user queries to a chat model and sending the response back. However, in most cases, we'll want the model's output in a structured format, which can then be serialized to JSON.
The API exposes an entity() method, which allows us to define a specific data structure for the model's output. Let's revise our code to ensure it returns a list of Article objects, each containing a title and a set of tags:
record Article(String title, Set<String> tags) {
}

@GetMapping("v2")
List<Article> askQuestionAndRetrieveArticles(@RequestParam(name = "question") String question) {
    return chatClient.prompt()
        .user(question)
        .call()
        .entity(new ParameterizedTypeReference<List<Article>>() {});
}
If we execute the request now, we can expect the endpoint to return the Article recommendations as a valid JSON list:
[Figure: request articles as JSON]
4. Provide Additional Context
We've learned how to use the Spring AI module to create prompts, send them to an AI model, and receive structured responses. However, the article recommendations returned by our REST API are fictional and may not actually exist on our website.
To address this, we can use the ChatClient together with the Retrieval Augmented Generation (RAG) pattern, which combines data retrieval from a source with a generative model to provide more accurate responses. We'll use a vector store to take advantage of RAG and load it with documents relevant to our use case.
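Conceptually, the retrieval half of RAG ranks stored documents by how similar their embedding vectors are to the embedding of the question. The toy sketch below illustrates that idea with cosine similarity over hand-written vectors. It is not Spring AI's implementation: the class ToyVectorSearch and its methods are hypothetical, and it assumes non-zero vectors of equal length.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Toy illustration of similarity search; not Spring AI's implementation.
class ToyVectorSearch {

    // Cosine similarity: dot product divided by the product of the vector lengths.
    // Assumes both vectors are non-zero and have the same number of dimensions.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the texts of the k stored entries most similar to the query vector,
    // most similar first. The map keys are texts, the values their embeddings.
    static List<String> topK(Map<String, double[]> store, double[] query, int k) {
        return store.entrySet().stream()
            .sorted(Comparator.comparingDouble(
                (Map.Entry<String, double[]> e) -> -cosine(e.getValue(), query)))
            .limit(k)
            .map(Map.Entry::getKey)
            .toList();
    }
}
```

In a real application the vectors come from an embedding model rather than being written by hand, and the vector store takes care of both storing them and running the search.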
First, we'll create a VectorStore and load it with the augmented data from a local file during class initialization:
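import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.SimpleVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("api/articles")
public class BlogsController {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public BlogsController(ChatClient.Builder chatClientBuilder, EmbeddingModel embeddingModel) throws IOException {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = new SimpleVectorStore(embeddingModel);
        initContext();
    }

    // Each line of articles.txt becomes one Document in the vector store
    void initContext() throws IOException {
        List<Document> documents = Files.readAllLines(Path.of("src/main/resources/articles.txt"))
            .stream()
            .map(Document::new)
            .toList();
        vectorStore.add(documents);
    }

    // ...
}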
As we can see, we read all the entries from articles.txt and created a new Document for each line of this file. Needless to say, we don't have to rely on a file; we can use any data source if needed.
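Because every line of the file is mapped to a Document, a blank line would also end up as a Document with no content. A small pre-processing step, sketched below with plain JDK streams, can tidy the entries first. The class ArticleLines and its clean() method are hypothetical names introduced only for this illustration.

```java
import java.util.List;

// Hypothetical pre-processing helper; not part of the original controller.
class ArticleLines {

    // Trims each line, then drops blank lines and exact duplicates,
    // keeping the remaining entries in their original order.
    static List<String> clean(List<String> rawLines) {
        return rawLines.stream()
            .map(String::strip)
            .filter(line -> !line.isEmpty())
            .distinct()
            .toList();
    }
}
```

In initContext(), the result of Files.readAllLines(...) could be passed through ArticleLines.clean(...) before each entry is mapped to a Document.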
After that, we'll provide the augmented data to the model by wrapping the VectorStore in a QuestionAnswerAdvisor:
@GetMapping("v3")
List<Article> askQuestionWithContext(@RequestParam(name = "question") String question) {
    return chatClient.prompt()
        .advisors(new QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()))
        .user(question)
        .call()
        .entity(new ParameterizedTypeReference<List<Article>>() {});
}
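Conceptually, an advisor of this kind looks up documents related to the question in the vector store and adds them to the prompt before it reaches the model. The sketch below shows that augmentation step in plain Java. The class ToyPromptAugmenter and the template wording are assumptions made for illustration; they are not the prompt that QuestionAnswerAdvisor actually sends.

```java
import java.util.List;

// Conceptual illustration only; the template text is an assumption,
// not the prompt used by QuestionAnswerAdvisor.
class ToyPromptAugmenter {

    // Appends the retrieved documents to the user's question as context,
    // followed by an instruction to answer from that context.
    static String augment(String question, List<String> retrievedDocuments) {
        String context = String.join("\n", retrievedDocuments);
        return question
            + "\n\nContext information is below.\n"
            + "---------------------\n"
            + context
            + "\n---------------------\n"
            + "Answer using only the context above.";
    }
}
```

The model then receives both the original question and the retrieved entries, which is why its recommendations can refer to articles that exist in our own data.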
As a result, our application now bases its recommendations on the augmented context:
[Figure: request articles using RAG]
5. Conclusion
In this article, we explored Spring AI's ChatClient. We began by sending simple user queries to the model and reading its responses as plain text. Then, we enhanced our solution by retrieving the model's response in a specific, structured format.
Finally, we learned how to load the model's context with a collection of documents to provide accurate responses based on our own data. We achieved this using a VectorStore and a QuestionAnswerAdvisor.
