URL: https://www.javacodegeeks.com/spring-ai-integration-with-google-cloud.html

Spring AI Integration with Google Cloud - Java Code Geeks


Artificial intelligence is rapidly transforming how modern applications are built, and developers increasingly need tools that simplify integration with cloud-based AI services. Spring AI offers a structured and developer-friendly approach to working with large language models, while Google Cloud Vertex AI provides advanced chat and embedding models, including Gemini and Text Embedding. In this article, we explore the integration of Spring AI with Vertex AI to enhance chat and embedding capabilities in Java applications.

1. Prerequisites

To get started, you must prepare your Google Cloud environment by enabling the Vertex AI API (for example, with gcloud services enable aiplatform.googleapis.com), authenticating locally, and selecting a supported region. These steps ensure your application can securely communicate with the Vertex AI API.

First, set your project ID:

gcloud config set project <YOUR_PROJECT_ID>

Here, <YOUR_PROJECT_ID> refers to the unique identifier of your Google Cloud project. It’s the official ID that all Google Cloud APIs use to link requests to your billing account and resources.

Next, authenticate your local environment:

gcloud auth application-default login <YOUR-ACCOUNT>

<YOUR-ACCOUNT> refers to the email address of the Google Cloud account (or service account) used for authentication. When you run this command, a browser window opens, prompting you to log in with that account, after which the credentials are stored locally. Spring AI, along with the underlying Google client libraries, automatically uses these credentials to sign requests to Vertex AI, eliminating the need to hard-code keys or tokens in your application.
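
To make the credential lookup a little more concrete, the sketch below resolves the file that Application Default Credentials would be read from: the path in the GOOGLE_APPLICATION_CREDENTIALS environment variable if it is set, otherwise the file written by gcloud under the user's home directory. It assumes the Linux/macOS layout (Windows uses a folder under %APPDATA%), and the AdcLocator class and its method are hypothetical names used only for illustration; the Google client libraries perform this lookup internally, so you never need to write this yourself.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Spring AI or the Google libraries.
class AdcLocator {

    // Returns the explicit override if present, otherwise the file created by
    // `gcloud auth application-default login` (Linux/macOS location assumed).
    static Path resolve(Map<String, String> env, String userHome) {
        String override = env.get("GOOGLE_APPLICATION_CREDENTIALS");
        if (override != null && !override.isBlank()) {
            return Path.of(override);
        }
        return Path.of(userHome, ".config", "gcloud", "application_default_credentials.json");
    }

    public static void main(String[] args) {
        Path path = resolve(System.getenv(), System.getProperty("user.home"));
        System.out.println("ADC file: " + path + " (exists: " + Files.exists(path) + ")");
    }
}
```

Running it after the login command is a quick way to confirm that the credentials file is where the libraries are likely to look for it.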

2. Project Setup (Maven)

To get started, we need to configure our Maven project with the dependencies required for Spring AI and Vertex AI.

	<properties>
		<java.version>17</java.version>
		<spring-ai.version>1.0.1</spring-ai.version>
	</properties>
	<dependencies>
		<dependency>
			<groupId>org.springframework.ai</groupId>
			<artifactId>spring-ai-starter-model-vertex-ai-embedding</artifactId>
		</dependency>
		<dependency>
			<groupId>org.springframework.ai</groupId>
			<artifactId>spring-ai-starter-model-vertex-ai-gemini</artifactId>
		</dependency>
	</dependencies>
	<dependencyManagement>
		<dependencies>
			<dependency>
				<groupId>org.springframework.ai</groupId>
				<artifactId>spring-ai-bom</artifactId>
				<version>${spring-ai.version}</version>
				<type>pom</type>
				<scope>import</scope>
			</dependency>
		</dependencies>
	</dependencyManagement>

The pom.xml imports the Spring AI BOM to manage dependency versions and adds two starters: the Vertex AI Gemini starter for chat and the Vertex AI embedding starter for embeddings.

3. Application Configuration

To connect your application with Vertex AI, you need to provide the necessary settings in your configuration file. These properties include the Google Cloud project ID, the region where your Vertex AI models are deployed, and the model IDs for both chat and embeddings.

# Google Cloud Project ID
spring.ai.vertex.ai.gemini.project-id=<YOUR_PROJECT_ID>

# Google Cloud Region 
spring.ai.vertex.ai.gemini.location=us-central1

# Chat Model
spring.ai.vertex.ai.gemini.model=gemini-2.5-flash
spring.ai.vertex.ai.gemini.chat.options.response-timeout=120s
spring.ai.vertex.ai.gemini.chat.options.connect-timeout=120s
spring.ai.vertex.ai.gemini.chat.options.temperature=0.2

# Embedding Model
spring.ai.vertex.ai.embedding.project-id=<YOUR_PROJECT_ID>
spring.ai.vertex.ai.embedding.location=us-central1
spring.ai.vertex.ai.embedding.text.options.model=gemini-embedding-001

The first property, spring.ai.vertex.ai.gemini.project-id, specifies the Google Cloud project that will be billed and where resources are managed. Set it to your actual project identifier, the same value used for <YOUR_PROJECT_ID> in the gcloud command earlier.

The property spring.ai.vertex.ai.gemini.location defines the region in which Vertex AI will serve requests. Not all models are available in every region, so it’s important to select one that supports the models you intend to use.

The spring.ai.vertex.ai.gemini.model property sets the chat model used for conversational AI features. You can switch between different models offered by Vertex AI by changing this value.

Finally, spring.ai.vertex.ai.embedding.text.options.model points to the embedding model, which is responsible for generating vector representations of text.
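
A blank project ID or region is easy to overlook in a properties file. As a minimal sketch, assuming you want to fail fast before the first call to Vertex AI, the plain-Java check below reports which required keys are missing or empty. The VertexConfigCheck class is hypothetical and not part of Spring AI; the two keys are taken from the listing above, and in a real Spring Boot application you would more likely rely on the framework's own property binding and validation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical startup check for illustration; not part of Spring AI.
class VertexConfigCheck {

    // Keys copied from the configuration listing in this section.
    static final List<String> REQUIRED = List.of(
            "spring.ai.vertex.ai.gemini.project-id",
            "spring.ai.vertex.ai.gemini.location");

    // Returns the required keys that are absent or blank, in declaration order.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.isBlank()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Simulates a properties file in which the project ID was left empty.
        props.load(new StringReader("""
                spring.ai.vertex.ai.gemini.project-id=
                spring.ai.vertex.ai.gemini.location=us-central1
                """));
        System.out.println("Missing or blank: " + missingKeys(props));
    }
}
```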

4. Chat with ChatClient

To interact with Vertex AI’s chat models, we can use the ChatClient class provided by Spring AI. This client abstracts away the low-level details of model communication, so you can send prompts and receive responses through a simple fluent API. By combining it with ChatMemory, you can maintain conversational context across multiple interactions.

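import jakarta.validation.constraints.NotNull; // assumed source of @NotNull (Jakarta Bean Validation)
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.MessageChatMemoryAdvisor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.vertexai.gemini.VertexAiGeminiChatModel;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    private final ChatClient chatClient;

    // Build a ChatClient on the Gemini model, with an advisor that replays chat history
    public ChatController(VertexAiGeminiChatModel chatModel, ChatMemory chatMemory) {
        this.chatClient = ChatClient.builder(chatModel)
                .defaultAdvisors(MessageChatMemoryAdvisor.builder(chatMemory).build())
                .build();
    }

    // Send the user's prompt to the model and return the generated text
    public String chat(String prompt) {
        return chatClient.prompt()
                .user(userMessage -> userMessage.text(prompt))
                .call()
                .content();
    }

    @PostMapping("/api/chat")
    public ResponseEntity<String> generateResponse(@RequestBody @NotNull String prompt) {
        String response = chat(prompt);
        return ResponseEntity.ok(response);
    }
}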

In this example, we define a ChatController class that integrates with Spring AI’s ChatClient. This controller acts as the bridge between your application and Vertex AI’s chat models, making it possible to handle conversational interactions with minimal effort.

The constructor sets up a ChatClient instance by passing in a VertexAiGeminiChatModel along with a ChatMemory. The inclusion of MessageChatMemoryAdvisor ensures that the chat history is preserved across different requests, allowing the model to maintain context and support multi-turn conversations.

The chat(String prompt) method is where user input is sent to the model. Using the chatClient.prompt() builder, we define the user message and then invoke the model with .call(). The result is processed, and the .content() method extracts the generated response so that it can be returned as plain text.

Finally, the REST endpoint /api/chat is exposed using the @PostMapping annotation. This endpoint receives a prompt in the request body, passes it to the chat method, and returns the model’s response wrapped in a ResponseEntity.
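
To make the role of the chat memory more concrete, here is a conceptual sketch of a windowed memory keyed by conversation ID: each conversation keeps only its most recent messages, and those are what would be replayed to the model on the next turn. The WindowedMemory class is a hypothetical stand-in written in plain Java, not Spring AI's ChatMemory implementation, and the window size of 2 is chosen only to keep the demo short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual stand-in for a chat memory; NOT Spring AI's ChatMemory implementation.
class WindowedMemory {

    private final int maxMessages;
    private final Map<String, Deque<String>> history = new HashMap<>();

    WindowedMemory(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    // Appends a message to a conversation and evicts the oldest ones beyond the window.
    void add(String conversationId, String message) {
        Deque<String> messages = history.computeIfAbsent(conversationId, id -> new ArrayDeque<>());
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    // Returns the messages that would accompany the next prompt, oldest first.
    List<String> get(String conversationId) {
        return List.copyOf(history.getOrDefault(conversationId, new ArrayDeque<>()));
    }

    public static void main(String[] args) {
        WindowedMemory memory = new WindowedMemory(2);
        memory.add("alice", "user: My name is Alice.");
        memory.add("alice", "assistant: Nice to meet you, Alice.");
        memory.add("alice", "user: What is my name?");
        System.out.println(memory.get("alice")); // only the two most recent messages remain
        System.out.println(memory.get("bob"));   // a different conversation starts empty
    }
}
```

The sketch also shows why the conversation ID matters. The controller above does not set one, so all callers would presumably share a single history; in a multi-user application you would likely want to pass a per-user or per-session ID to the memory advisor.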

Testing the Chat Endpoint

Once your application is running, you can test the /api/chat endpoint by sending a prompt with cURL.

curl -X POST http://localhost:8080/api/chat \
 -H "Content-Type: application/json" \
 -d '"Tell me a fun fact about space."'

5. Text Embeddings

Embeddings convert text into numeric vectors that can be compared for semantic search and retrieval-augmented generation (RAG).

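import java.util.List;

import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.stereotype.Service;

@Service
public class EmbeddingService {

    private final EmbeddingModel embeddingModel;

    public EmbeddingService(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    // Returns one embedding per input text, together with usage metadata
    public EmbeddingResponse embed(List<String> texts) {
        return embeddingModel.embedForResponse(texts);
    }
}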

The EmbeddingService class generates the embeddings using Spring AI’s EmbeddingModel. Through constructor injection, it receives the model, and its embed method takes a list of texts, calling embeddingModel.embedForResponse(texts) to return an EmbeddingResponse containing their vector representations.

To make the embedding functionality accessible through an API, we’ll create an EmbeddingController. This controller will expose a REST endpoint where clients can submit text and receive the corresponding vector embeddings generated by the service.

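import java.util.List;

import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/api/embeddings")
public class EmbeddingController {

    private final EmbeddingService service;

    public EmbeddingController(EmbeddingService service) {
        this.service = service;
    }

    // Accepts a JSON array of strings and returns their embeddings
    @PostMapping
    public EmbeddingResponse embed(@RequestBody List<String> texts) {
        return service.embed(texts);
    }
}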

With this controller in place, your application now exposes a REST API for embeddings. Incoming requests are forwarded to the EmbeddingService, which interacts with the underlying model, and the resulting embeddings are returned to the client.

Testing the Embedding Endpoint

Once your application is running, you can test the /api/embeddings endpoint using cURL.

curl -X POST http://localhost:8080/api/embeddings \
 -H "Content-Type: application/json" \
 -d '["Spring AI makes integration with Vertex AI easy", "Embeddings help power semantic search"]'

If the configuration is correct, you’ll receive a response similar to the following:

{
  "metadata": {
    "model": "gemini-embedding-001",
    "usage": {
      "promptTokens": 12,
      "completionTokens": 0,
      "totalTokens": 12
    },
    "empty": true
  },
  "results": [
    {
      "index": 0,
      "metadata": {
        "text": "Spring AI makes integration with Vertex AI easy"
      },
      "output": [0.0213, -0.0198, 0.0341, ...]
    }
  ],
  "status": "success"
}
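
The output arrays in a response like the one above become useful once you compare them. A common choice for semantic search is cosine similarity, which measures the angle between two vectors regardless of their length. The sketch below is a minimal plain-Java version; the CosineSimilarity class is a hypothetical helper, and it assumes the vectors are available as float arrays of equal length. The sample vectors are made up for the demo and are not real model output.

```java
// Hypothetical helper for comparing two embedding vectors; not part of Spring AI.
class CosineSimilarity {

    // Cosine of the angle between two equal-length vectors: dot(a, b) / (|a| * |b|).
    // Returns NaN if either vector has zero length, since the angle is undefined.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up three-dimensional vectors; real embeddings have far more dimensions.
        float[] query = {0.2f, 0.1f, 0.9f};
        float[] similar = {0.25f, 0.05f, 0.85f};
        float[] unrelated = {0.9f, -0.3f, 0.0f};
        System.out.println("query vs similar:   " + cosine(query, similar));
        System.out.println("query vs unrelated: " + cosine(query, unrelated));
    }
}
```

Values close to 1 indicate texts with similar meaning, while values near 0 indicate unrelated texts. In practice a vector store usually performs this comparison for you.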


6. Multimodal Prompt (Text + Image)

In addition to handling plain text, Vertex AI also supports multimodal input, so Spring AI can generate embeddings for documents, images, and other media resources. This enables your application to capture semantic meaning not only from text but also from other types of content.

To demonstrate this, we’ll create a service and a controller that process documents or media resources and return embeddings.

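import java.util.List;
import java.util.Map;

import org.springframework.ai.content.Media;
import org.springframework.ai.document.Document;
import org.springframework.ai.embedding.DocumentEmbeddingModel;
import org.springframework.ai.embedding.DocumentEmbeddingRequest;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;
import org.springframework.util.MimeType;

@Service
public class DocumentEmbeddingService {

    private final DocumentEmbeddingModel documentEmbeddingModel;

    public DocumentEmbeddingService(DocumentEmbeddingModel documentEmbeddingModel) {
        this.documentEmbeddingModel = documentEmbeddingModel;
    }

    // Wrap the media in a Document (with empty metadata) and request its embedding
    public EmbeddingResponse getEmbedding(MimeType mimeType, Resource resource) {
        Document document = new Document(new Media(mimeType, resource), Map.of());
        DocumentEmbeddingRequest request = new DocumentEmbeddingRequest(List.of(document));
        return documentEmbeddingModel.call(request);
    }
}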

In this code, the DocumentEmbeddingService wraps a DocumentEmbeddingModel, which is responsible for generating embeddings from various types of content. The service accepts both a MimeType and a Resource, creating a Document that pairs the media with metadata. The request is then passed to Vertex AI through Spring AI, and the resulting embeddings are returned in an EmbeddingResponse.

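import java.io.IOException;

import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.core.io.Resource;
import org.springframework.util.MimeType;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api/documents")
public class DocumentEmbeddingController {

    private final DocumentEmbeddingService service;

    public DocumentEmbeddingController(DocumentEmbeddingService service) {
        this.service = service;
    }

    // Accepts a multipart upload under the "file" part and returns its embedding
    @PostMapping("/embed")
    public EmbeddingResponse embed(@RequestParam("file") MultipartFile file) throws IOException {
        Resource resource = file.getResource();
        MimeType mimeType = MimeType.valueOf(file.getContentType());
        return service.getEmbedding(mimeType, resource);
    }
}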

This controller exposes the functionality through a REST endpoint. Using /api/documents/embed, clients can upload files such as images; which media types are accepted depends on the underlying embedding model. The controller extracts the file’s MIME type and converts the upload into a Resource, then delegates the embedding generation to the service. This makes multimodal embedding accessible via a simple HTTP API, enabling downstream applications to consume embeddings for search, semantic understanding, or classification tasks.
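
One detail worth guarding against is an upload that arrives without a content type: MultipartFile.getContentType() may return null, and MimeType.valueOf would then reject it. As a minimal sketch, assuming you prefer a fallback over an error, the helper below keeps the declared type when present, otherwise guesses from the file name using the JDK, and finally falls back to a generic binary type. The ContentTypes class is a hypothetical name, and whether the model accepts the fallback type is a separate question.

```java
import java.net.URLConnection;

// Hypothetical helper for illustration; not part of Spring or Spring AI.
class ContentTypes {

    static final String FALLBACK = "application/octet-stream";

    // Prefers the type declared by the client, then a guess based on the file
    // extension, and finally a generic binary type.
    static String resolve(String declared, String filename) {
        if (declared != null && !declared.isBlank()) {
            return declared;
        }
        String guessed = (filename == null) ? null : URLConnection.guessContentTypeFromName(filename);
        return (guessed != null) ? guessed : FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(resolve("image/png", "photo.png")); // declared type wins
        System.out.println(resolve(null, "photo.png"));        // guessed from the extension
        System.out.println(resolve(null, null));               // generic fallback
    }
}
```

In the controller, the resolved string could then be passed to MimeType.valueOf in place of the raw getContentType() result.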

7. Deployment to GCP

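Deploying the application to Google App Engine is straightforward, provided App Engine is enabled for your project and the project contains an app.yaml deployment descriptor. From the project’s root directory, run: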

gcloud app deploy

This command packages your application, uploads it to App Engine, and provisions the necessary resources. The process may take a few minutes to complete.

8. Conclusion

In this article, we explored how to integrate Spring AI with Google Cloud Vertex AI to unlock powerful chat and embedding capabilities in Java applications. We walked through configuring your environment, setting up application.properties, building REST endpoints for both chat and embeddings, and testing them with curl.

By combining the flexibility of Spring Boot with the scalability of Vertex AI, you can effectively add advanced AI-driven features to your applications, whether running locally or in production on GCP.

Omozegie Aziegbe
September 19th, 2025 (Last Updated: September 19th, 2025)
Omos Aziegbe is a technical writer and web/application developer with a BSc in Computer Science and Software Engineering from the University of Bedfordshire. Specializing in Java enterprise applications with the Jakarta EE framework, Omos also works with HTML5, CSS, and JavaScript for web development. As a freelance web developer, Omos combines technical expertise with research and writing on topics such as software engineering, programming, web application development, computer science, and technology.