1. Overview
Spring Batch is a powerful framework for developing robust batch applications. In our previous tutorial, we introduced Spring Batch.
In this tutorial, we'll build on that foundation by learning how to set up and create a basic batch-driven application using Spring Boot.
2. Maven Dependencies
First, we'll add the spring-boot-starter-batch to our pom.xml:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
    <version>3.0.0</version>
</dependency>
We'll also add the h2 dependency, which is available from Maven Central as well:
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <version>2.1.214</version>
    <scope>runtime</scope>
</dependency>
3. Defining a Simple Spring Batch Job
We're going to build a job that imports a coffee list from a CSV file, transforms it using a custom processor, and stores the final results in an in-memory database.
3.1. Getting Started
Let's start by defining our application entry point:
@SpringBootApplication
public class SpringBootBatchProcessingApplication {
    public static void main(String[] args) {
        SpringApplication.run(SpringBootBatchProcessingApplication.class, args);
    }
}
As we can see, this is a standard Spring Boot application. Since we want to use default configuration values where possible, we'll keep our application configuration properties to a minimum.
We'll define these properties in our src/main/resources/application.properties file:
file.input=coffee-list.csv
This property contains the location of our input coffee list. Each line contains the brand, origin, and some characteristics of our coffee:
Blue Mountain,Jamaica,Fruity
Lavazza,Colombia,Strong
Folgers,America,Smokey
As we can see, this is a flat CSV file, which means Spring can handle it without any special customization.
Next, we'll add a SQL script, schema-all.sql, to create the coffee table that stores the data:
DROP TABLE coffee IF EXISTS;
CREATE TABLE coffee (
    coffee_id BIGINT IDENTITY NOT NULL PRIMARY KEY,
    brand VARCHAR(20),
    origin VARCHAR(20),
    characteristics VARCHAR(30)
);
Conveniently, Spring Boot will run this script automatically during startup.
3.2. Coffee Domain Class
Next, we'll need a simple domain class to hold our coffee items:
public class Coffee {
    private String brand;
    private String origin;
    private String characteristics;
    // No-argument constructor, needed so that the bean-based field mapper can instantiate the class
    public Coffee() {
    }
    public Coffee(String brand, String origin, String characteristics) {
        this.brand = brand;
        this.origin = origin;
        this.characteristics = characteristics;
    }
    // getters and setters
}
As previously mentioned, our Coffee object contains three properties: brand, origin, and additional characteristics.
4. Job Configuration
Now we'll move on to the key component, our job configuration. We'll go step by step, building up our configuration, and explaining each part along the way:
@Configuration
public class BatchConfiguration {
    @Value("${file.input}")
    private String fileInput;
    // ...
}
First, we'll start with a standard Spring @Configuration class. Note that with Spring Boot 3.0, using @EnableBatchProcessing is discouraged, because adding it causes Spring Boot's batch auto-configuration to back off. Also, JobBuilderFactory and StepBuilderFactory are deprecated; the recommended replacements are the JobBuilder and StepBuilder classes, which take the name of the job or step together with the JobRepository.
For the last part of our initial configuration, we'll include a reference to the file.input property we declared previously.
4.1. A Reader and Writer for Our Job
Now we can go ahead and define a reader bean in our configuration:
@Bean
public FlatFileItemReader<Coffee> reader() {
    return new FlatFileItemReaderBuilder<Coffee>().name("coffeeItemReader")
        .resource(new ClassPathResource(fileInput))
        .delimited()
        .names(new String[] { "brand", "origin", "characteristics" })
        .fieldSetMapper(new BeanWrapperFieldSetMapper<Coffee>() {{
            setTargetType(Coffee.class);
        }})
        .build();
}
In short, the reader bean defined above looks for a file called coffee-list.csv and parses each line item into a Coffee object.
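To make the reader's job more concrete, here is a minimal plain-Java sketch of what delimited parsing amounts to conceptually: split a line on commas and map the fields, by position, to brand, origin, and characteristics. This is not Spring Batch's implementation. The CoffeeLineSketch class and the CoffeeRow record are hypothetical names used only for illustration, and the sketch ignores quoted fields.

```java
import java.util.List;

// Hypothetical illustration only; this is not how Spring Batch is implemented.
class CoffeeLineSketch {

    // Stand-in for the Coffee domain class, kept local so the sketch is self-contained.
    record CoffeeRow(String brand, String origin, String characteristics) {}

    // Splits one comma-delimited line and maps the fields by position.
    static CoffeeRow parse(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException(
                "Expected 3 fields but got " + fields.length + ": " + line);
        }
        return new CoffeeRow(fields[0].trim(), fields[1].trim(), fields[2].trim());
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "Blue Mountain,Jamaica,Fruity",
            "Lavazza,Colombia,Strong",
            "Folgers,America,Smokey");
        for (String line : lines) {
            System.out.println(parse(line));
        }
    }
}
```

In the real job, the names we pass to the builder play the role of the positional mapping above, and the field set mapper copies each named field into the matching Coffee property.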
Similarly, we'll define a writer bean:
@Bean
public JdbcBatchItemWriter<Coffee> writer(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Coffee>()
        .itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
        .sql("INSERT INTO coffee (brand, origin, characteristics) VALUES (:brand, :origin, :characteristics)")
        .dataSource(dataSource)
        .build();
}
This time around, we'll include the SQL statement needed to insert a single coffee item into our database, driven by the Java bean properties of our Coffee object.
4.2. Putting Our Job Together
Finally, we'll need to add the actual job steps and configuration:
@Bean
public Job importUserJob(JobRepository jobRepository, JobCompletionNotificationListener listener, Step step1) {
    return new JobBuilder("importUserJob", jobRepository)
        .incrementer(new RunIdIncrementer())
        .listener(listener)
        .flow(step1)
        .end()
        .build();
}
@Bean
public Step step1(JobRepository jobRepository, PlatformTransactionManager transactionManager, JdbcBatchItemWriter<Coffee> writer) {
    return new StepBuilder("step1", jobRepository)
        .<Coffee, Coffee> chunk(10, transactionManager)
        .reader(reader())
        .processor(processor())
        .writer(writer)
        .build();
}
@Bean
public CoffeeItemProcessor processor() {
    return new CoffeeItemProcessor();
}
As we can see, our job is relatively simple and consists of one step defined in the step1 method.
Let's take a look at what this step is doing:
- First, we configure our step so that it'll write up to ten records at a time using the chunk(10, transactionManager) declaration.
- Then we read in the coffee data using our reader bean, which we set using the reader method.
- Next, we pass each of our coffee items to a custom processor where we apply some custom business logic.
- Finally, we write each coffee item to the database using the writer we saw previously.
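The steps above can be sketched in plain Java. This is a simplified illustration of the chunk-oriented pattern under our own assumptions, not Spring Batch's implementation: it leaves out transactions, restart metadata, and error handling, and the ChunkLoopSketch class and its runStep method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch of chunk-oriented processing; not Spring Batch code.
class ChunkLoopSketch {

    // Reads up to chunkSize items, processes them, then hands the whole chunk to the writer.
    // Returns the number of chunks written.
    static <I, O> int runStep(Iterator<I> reader, Function<I, O> processor,
                              Consumer<List<O>> writer, int chunkSize) {
        int chunksWritten = 0;
        while (reader.hasNext()) {
            // 1. Read: collect items until the chunk is full or the input is exhausted.
            List<I> items = new ArrayList<>(chunkSize);
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
            }
            // 2. Process: transform each item that was read.
            List<O> processed = new ArrayList<>(items.size());
            for (I item : items) {
                processed.add(processor.apply(item));
            }
            // 3. Write: one call per chunk (Spring Batch commits one transaction per chunk).
            writer.accept(processed);
            chunksWritten++;
        }
        return chunksWritten;
    }

    public static void main(String[] args) {
        List<String> input = List.of("Blue Mountain", "Lavazza", "Folgers");
        int chunks = runStep(
            input.iterator(),
            (String brand) -> brand.toUpperCase(),
            (List<String> chunk) -> System.out.println("Writing " + chunk),
            2);
        System.out.println("Chunks written: " + chunks);
    }
}
```

With a chunk size of 2 and three input items, the writer in this sketch is called twice. In our job, the chunk size of 10 is larger than the three-line input file, so everything is written in a single chunk.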
Our importUserJob bean, in turn, contains the job definition. It uses the built-in RunIdIncrementer class, which adds an incrementing run.id parameter so that each launch gets a fresh set of job parameters. We also set a JobCompletionNotificationListener, which we'll use to get notified when the job completes.
To complete our job configuration, we list each step (though this job has only one). We now have a fully configured job.
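As a rough illustration of the idea behind RunIdIncrementer, the following plain-Java sketch bumps a run.id entry in a map of parameters on every call. It only models the concept: the RunIdSketch class and its next method are hypothetical, and real job parameters are richer than a Map of Long values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the idea behind a run-id incrementer; not Spring Batch code.
class RunIdSketch {

    static final String RUN_ID_KEY = "run.id";

    // Returns a copy of the parameters with run.id increased by one (starting at 1).
    static Map<String, Long> next(Map<String, Long> previous) {
        Map<String, Long> updated = new HashMap<>(previous);
        updated.merge(RUN_ID_KEY, 1L, Long::sum);
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Long> first = next(Map.of());
        Map<String, Long> second = next(first);
        System.out.println(first + " then " + second);
    }
}
```

Because the parameters differ from one launch to the next, the same job definition can be run repeatedly without being treated as a job that has already completed.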
5. A Custom Coffee Processor
Now let's take a detailed look at the custom processor we defined previously in our job configuration:
public class CoffeeItemProcessor implements ItemProcessor<Coffee, Coffee> {
    private static final Logger LOGGER = LoggerFactory.getLogger(CoffeeItemProcessor.class);
    @Override
    public Coffee process(final Coffee coffee) throws Exception {
        String brand = coffee.getBrand().toUpperCase();
        String origin = coffee.getOrigin().toUpperCase();
        String characteristics = coffee.getCharacteristics().toUpperCase();
        Coffee transformedCoffee = new Coffee(brand, origin, characteristics);
        LOGGER.info("Converting ( {} ) into ( {} )", coffee, transformedCoffee);
        return transformedCoffee;
    }
}
The ItemProcessor interface gives us a mechanism to apply specific business logic during our job execution.
To keep things simple, our CoffeeItemProcessor takes an input Coffee object and transforms each of its properties to uppercase.
6. Job Completion
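We're also going to write a JobCompletionNotificationListener to provide some feedback when our job finishes:
@Component
public class JobCompletionNotificationListener implements JobExecutionListener {
    private static final Logger LOGGER = LoggerFactory.getLogger(JobCompletionNotificationListener.class);
    private final JdbcTemplate jdbcTemplate;
    public JobCompletionNotificationListener(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }
    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
            LOGGER.info("!!! JOB FINISHED! Time to verify the results");
            String query = "SELECT brand, origin, characteristics FROM coffee";
            jdbcTemplate.query(query, (rs, row) -> new Coffee(rs.getString(1), rs.getString(2), rs.getString(3)))
                .forEach(coffee -> LOGGER.info("Found < {} > in the database.", coffee));
        }
    }
}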
In the above example, we overrode the afterJob method and checked that the job completed successfully. Moreover, we ran a trivial query to check that each coffee item was stored in the database successfully.
7. Running Our Job
Now that we have everything in place to run our job, here comes the fun part. Let's go ahead and run our job:
...
17:41:16.336 [main] INFO c.b.b.JobCompletionNotificationListener -
!!! JOB FINISHED! Time to verify the results
17:41:16.336 [main] INFO c.b.b.JobCompletionNotificationListener -
Found < Coffee [brand=BLUE MOUNTAIN, origin=JAMAICA, characteristics=FRUITY] > in the database.
17:41:16.337 [main] INFO c.b.b.JobCompletionNotificationListener -
Found < Coffee [brand=LAVAZZA, origin=COLOMBIA, characteristics=STRONG] > in the database.
17:41:16.337 [main] INFO c.b.b.JobCompletionNotificationListener -
Found < Coffee [brand=FOLGERS, origin=AMERICA, characteristics=SMOKEY] > in the database.
...
As we can see, our job ran successfully, and each coffee item was stored in the database as expected.
8. Virtual Threads Integration
With the release of Spring Batch 5.1 and the introduction of JDK 21's virtual threads from Project Loom, there is a significant enhancement in how concurrency is handled. Virtual threads provide a lightweight alternative to traditional platform threads, allowing scalable and efficient execution of parallel tasks.
We can leverage virtual threads for various parallel processing scenarios, such as running concurrent steps or parallelizing the execution of a single step. This is facilitated by Spring Framework 6.1's VirtualThreadTaskExecutor, which implements TaskExecutor using virtual threads.
First, let's add the spring-context and spring-batch-core dependencies to the pom.xml file:
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-core</artifactId>
    <version>5.1.0</version>
</dependency>
<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-context</artifactId>
    <version>6.1.0</version>
</dependency>
Once our dependencies are set up, we must create a VirtualThreadTaskExecutor bean in the Spring Boot context. This executor runs each submitted task on a new virtual thread:
@Bean
public VirtualThreadTaskExecutor taskExecutor() {
    return new VirtualThreadTaskExecutor("virtual-thread-executor");
}
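To show what such an executor builds on, here is a small plain-JDK sketch (it requires JDK 21 or later and doesn't use Spring at all) that starts a few tasks on named virtual threads and collects the thread names. The VirtualThreadSketch class and the runOnVirtualThreads method are hypothetical names, and the prefix-plus-counter naming is our assumption about how the thread names seen in the logs come about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ThreadFactory;

// Plain-JDK sketch (JDK 21+); it does not use Spring's VirtualThreadTaskExecutor.
class VirtualThreadSketch {

    // Runs taskCount trivial tasks, each on its own named virtual thread,
    // and returns the names of the virtual threads that ran them.
    static List<String> runOnVirtualThreads(String namePrefix, int taskCount) {
        // The factory names threads namePrefix0, namePrefix1, ... in creation order.
        ThreadFactory factory = Thread.ofVirtual().name(namePrefix, 0).factory();
        List<String> threadNames = new CopyOnWriteArrayList<>();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            Thread thread = factory.newThread(() -> {
                Thread current = Thread.currentThread();
                if (current.isVirtual()) {
                    threadNames.add(current.getName());
                }
            });
            threads.add(thread);
            thread.start();
        }
        try {
            // Wait for every task to finish before returning the collected names.
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        }
        return threadNames;
    }

    public static void main(String[] args) {
        System.out.println(runOnVirtualThreads("virtual-thread-executor", 3));
    }
}
```

The order in which the names are collected can vary between runs, since the tasks execute concurrently.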
Now, to enable parallel processing with virtual threads, all we have to do is configure the VirtualThreadTaskExecutor in the batch step:
@Bean
public Step step1(JobRepository jobRepository, PlatformTransactionManager transactionManager, JdbcBatchItemWriter<Coffee> writer, VirtualThreadTaskExecutor taskExecutor) {
    return new StepBuilder("step1", jobRepository)
        .<Coffee, Coffee> chunk(10, transactionManager)
        .reader(reader())
        .processor(processor())
        .writer(writer)
        .taskExecutor(taskExecutor)
        .build();
}
Let's execute the job with the virtual thread configuration:
20:41:32.134 [main] INFO o.s.batch.core.job.SimpleStepHandler - Executing step: [step1]
20:41:32.242 [virtual-thread-executor2] INFO c.baeldung.batch.CoffeeItemProcessor - Converting ( Coffee [brand=Blue Mountain, origin=Jamaica, characteristics=Fruity] ) into ( Coffee [brand=BLUE MOUNTAIN, origin=JAMAICA, characteristics=FRUITY] )
20:41:32.242 [virtual-thread-executor1] INFO c.baeldung.batch.CoffeeItemProcessor - Converting ( Coffee [brand=Folgers, origin=America, characteristics=Smokey] ) into ( Coffee [brand=FOLGERS, origin=AMERICA, characteristics=SMOKEY] )
20:41:32.242 [virtual-thread-executor0] INFO c.baeldung.batch.CoffeeItemProcessor - Converting ( Coffee [brand=Lavazza, origin=Colombia, characteristics=Strong] ) into ( Coffee [brand=LAVAZZA, origin=COLOMBIA, characteristics=STRONG] )
20:41:32.263 [main] INFO o.s.batch.core.step.AbstractStep - Step: [step1] executed in 128ms
As we can see in the logs, the processor logic now runs on virtual threads, whose names carry the virtual-thread-executor prefix we configured.
9. Conclusion
In this article, we learned how to create a simple Spring Batch job using Spring Boot.
We started by defining some basic configurations. Then we explained how to add a file reader and database writer. Finally, we demonstrated how to apply some custom processing and check that our job was executed successfully.
