VOOZH about

URL: https://www.javacodegeeks.com/2015/07/how-to-streamserialize-jpa-result-as-jax-rs-response-for-large-data.html

⇱ How To Stream/Serialize JPA Result As JAX-RS Response For Large Data - Java Code Geeks


There are times that retrieving a large data set through JPA is necessary (e.g. more than 1,000,000 records) and having them stuffed into a single instance of java.util.List is just risky (memory barrier). So, here’s a quick solution of how a JAX-RS REST resource end-point could still give us a timely Response without breaking the memory constrain through streaming or serialization of JPA entity by “pages”.

Example Database Table And JPA Entity

Database Table

To demonstrate how we could achieve outputting of large data, here’s an example MySQL database table that we could use.

create database large_data_test_db;
use large_data_test_db;

create table generated_uuids (
	record_no bigint not null auto_increment,
	uuid varchar(100) not null,
	datetime_generated datetime not null,
	primary key(record_no),
	unique(uuid)
);

JPA Entity

Next, define the JPA entity class which represents the table structure of the above.

Codes for GeneratedUuidEntity.java

package com.developerscrappad;

import java.io.Serializable;
import java.util.Date;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.NamedQueries;
import javax.persistence.NamedQuery;
import javax.persistence.Table;
import javax.persistence.Temporal;
import javax.persistence.TemporalType;

@Entity
@Table( name = "generated_uuids" )
@NamedQueries( {
 @NamedQuery( name = "GeneratedUuidEntity.listAll", query = "SELECT u FROM GeneratedUuidEntity u" ),
 @NamedQuery( name = "GeneratedUuidEntity.queryRecordsSize", query = "SELECT count(u) FROM GeneratedUuidEntity u" )
} )
public class GeneratedUuidEntity implements Serializable {

 private static final long serialVersionUID = 12312312234234123L;

 @Id
 @GeneratedValue( strategy = GenerationType.IDENTITY )
 @Column( name = "record_no" )
 private Long recordNo;

 @Column( name = "uuid" )
 private String uuid;

 @Column( name = "datetime_generated" )
 @Temporal( TemporalType.TIMESTAMP )
 private Date datetimeGenerated;

 public GeneratedUuidEntity() {
 }

 public GeneratedUuidEntity( Long recordNo ) {
 this.recordNo = recordNo;
 }

 public GeneratedUuidEntity( Long recordNo, String uuid, Date datetimeGenerated ) {
 this.recordNo = recordNo;
 this.uuid = uuid;
 this.datetimeGenerated = datetimeGenerated;
 }

 public Long getRecordNo() {
 return recordNo;
 }

 public void setRecordNo( Long recordNo ) {
 this.recordNo = recordNo;
 }

 public String getUuid() {
 return uuid;
 }

 public void setUuid( String uuid ) {
 this.uuid = uuid;
 }

 public Date getDatetimeGenerated() {
 return datetimeGenerated;
 }

 public void setDatetimeGenerated( Date datetimeGenerated ) {
 this.datetimeGenerated = datetimeGenerated;
 }

 @Override
 public int hashCode() {
 int hash = 0;
 hash += ( recordNo != null ? recordNo.hashCode() : 0 );

 return hash;
 }

 @Override
 public boolean equals( Object object ) {
 // TODO: Warning - this method won't work in the case the id fields are not set
 if ( !( object instanceof GeneratedUuidEntity ) ) {
 return false;
 }

 GeneratedUuidEntity other = ( GeneratedUuidEntity ) object;

 if ( ( this.recordNo == null && other.recordNo != null ) || ( this.recordNo != null && !this.recordNo.equals( other.recordNo ) ) ) {
 return false;
 }

 return true;
 }

 @Override
 public String toString() {
 return "com.developerscrappad.GeneratedUuidEntity[ recordNo=" + recordNo + " ]";
 }
}

There are two named queries defined in GeneratedUuidEntity. The GeneratedUuidEntity.queryRecordsSize is to query the total record number of the table, whereas the GeneratedUuidEntity.listAll is to retrieve all of the records in the table.

Implementing The JAX-RS REST Resource (The Java EE way)

Let’s have a JAX-RS REST resource class by the name JPAStreamingRESTResourcewith an available JPA EntityManager (Persistence Unit Name: JPAStreamingPU) to be injected and to be obtained through a protected method getEntityManager().

@Path( "generated-uuids" )
@Stateless( name = "JPAStreamingRESTResource", mappedName = "ejb/JPAStreamingRESTResource" )
public class JPAStreamingRESTResource {

 @PersistenceContext( unitName = "JPAStreamingPU" )
 private EntityManager entityManager;

 protected EntityManager getEntityManager() {
 return entityManager;
 }
 
 /**
 * Say "NO" to response caching
 */
 protected Response.ResponseBuilder getNoCacheResponseBuilder( Response.Status status ) {
 CacheControl cc = new CacheControl();
 cc.setNoCache( true );
 cc.setMaxAge( -1 );
 cc.setMustRevalidate( true );

 return Response.status( status ).cacheControl( cc );
 }
}

Besides, we have a method name getNoCacheResponseBuilder(), which is to obtain a non-caching javax.ws.rs.core.Response.ResponseBuilder, so that we don’t get weird cached results later.

The JPA Invocation Methods

Next, let’s define two methods within the resource class, namely:

queryGeneratedUuidRecordsSize() – to retrieve the total number of records in the table

private int queryGeneratedUuidRecordsSize() {
 return getEntityManager().createNamedQuery( "GeneratedUuidEntity.queryRecordsSize", Long.class )
 .getSingleResult().intValue();
}

listAllGeneratedUuidEntities() – to retrieve the all of the data from the table, but with certain limitation criteria such as the start position of the record (recordPosition) and the maximum number of records per round trip to the database (recordsPerRoundTrip). The intention is to “page” the results so that the result list will not be overly bloated. We’ll see this in action later.

private List<GeneratedUuidEntity> listAllGeneratedUuidEntities( int recordPosition, int recordsPerRoundTrip ) {
 return getEntityManager().createNamedQuery( "GeneratedUuidEntity.listAll" )
 .setFirstResult( recordPosition )
 .setMaxResults( recordsPerRoundTrip )
 .getResultList();
}

Let The Streaming Begin

Now, let’s implement the resource end-point method, which retrieves the data without compromising size, at least in theoretical speaking. This method will return a JSON response with the data format of:

{
 "result": [
 {
 "record_no": 1,
 "uuid": "34d99089-3e36-4f00-ab93-846b61771eb3",
 "datetime_generated": "2015-06-28 21:02:23"
 },
	…
 ]
}
 @GET
 @Path( "list-all" )
 @Produces( "application/json" )
 @TransactionAttribute( TransactionAttributeType.NEVER )
 public Response streamGeneratedUuids() {

 // Define the format of timestamp output
 SimpleDateFormat df = new SimpleDateFormat( "yyyy-MM-dd HH:mm:ss" );

 return getNoCacheResponseBuilder( Response.Status.OK ).entity( new StreamingOutput() {

 // Instruct how StreamingOutput's write method is to stream the data
 @Override
 public void write( OutputStream os ) throws IOException, WebApplicationException {
 int recordsPerRoundTrip = 100; // Number of records for every round trip to the database
 int recordPosition = 0; // Initial record position index
 int recordSize = queryGeneratedUuidRecordsSize(); // Total records found for the query

 // Start streaming the data
 try ( PrintWriter writer = new PrintWriter( new BufferedWriter( new OutputStreamWriter( os ) ) ) ) {

 writer.print( "{\"result\": [" );

 while ( recordSize > 0 ) {
 // Get the paged data set from the DB
 List<GeneratedUuidEntity> generatedUuidEntities = listAllGeneratedUuidEntities( recordPosition, recordsPerRoundTrip );

 for ( GeneratedUuidEntity generatedUuidEntity : generatedUuidEntities ) {
 if ( recordPosition > 0 ) {
 writer.print( "," );
 }

 // Stream the data in Json object format
 writer.print( Json.createObjectBuilder()
 .add( "record_no", generatedUuidEntity.getRecordNo() )
 .add( "uuid", generatedUuidEntity.getUuid() )
 .add( "datetime_generated", df.format( generatedUuidEntity.getDatetimeGenerated() ) )
 .build().toString() );

 // Increase the recordPosition for every record streamed
 recordPosition++;
 }

 // update the recordSize (remaining no. of records)
 recordSize -= recordsPerRoundTrip;
 }

 // Done!
 writer.print( "]}" );
 }
 }
 } ).build();
 }

Cord Explanation:

This is quite simple actually. The trick is to define the expression of the anonymous class StreamingOutput by overriding the write() method, which in it, first query the total record size through queryGeneratedUuidRecordsSize(), then retrieves the records page by page through listAllGeneratedUuidEntities(). This method will make several round trips to the database, depending on the recordsPerRoundTrip value defined.

Full source codes for JPAStreamingRESTResource.java:

package com.developerscrappad;

import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.text.SimpleDateFormat;
import java.util.List;
import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.json.Json;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.WebApplicationException;
import javax.ws.rs.core.CacheControl;
import javax.ws.rs.core.Response;
import javax.ws.rs.core.StreamingOutput;

@Path( "generated-uuids" )
@Stateless( name = "JPAStreamingRESTResource", mappedName = "ejb/JPAStreamingRESTResource" )
public class JPAStreamingRESTResource {

 @PersistenceContext( unitName = "JPAStreamingPU" )
 private EntityManager entityManager;

 private List<GeneratedUuidEntity> listAllGeneratedUuidEntities( int recordPosition, int recordsPerRoundTrip ) {
 return getEntityManager().createNamedQuery( "GeneratedUuidEntity.listAll" )
 .setFirstResult( recordPosition )
 .setMaxResults( recordsPerRoundTrip )
 .getResultList();
 }

 private int queryGeneratedUuidRecordsSize() {
 return getEntityManager().createNamedQuery( "GeneratedUuidEntity.queryRecordsSize", Long.class )
 .getSingleResult().intValue();
 }

 protected EntityManager getEntityManager() {
 return entityManager;
 }
 
 /**
 * Say "NO" to response caching
 */
 protected Response.ResponseBuilder getNoCacheResponseBuilder( Response.Status status ) {
 CacheControl cc = new CacheControl();
 cc.setNoCache( true );
 cc.setMaxAge( -1 );
 cc.setMustRevalidate( true );

 return Response.status( status ).cacheControl( cc );
 }

 @GET
 @Path( "list-all" )
 @Produces( "application/json" )
 @TransactionAttribute( TransactionAttributeType.NEVER )
 public Response streamGeneratedUuids() {

 // Define the format of timestamp output
 SimpleDateFormat df = new SimpleDateFormat( "yyyy-MM-dd HH:mm:ss" );

 return getNoCacheResponseBuilder( Response.Status.OK ).entity( new StreamingOutput() {

 // Instruct how StreamingOutput's write method is to stream the data
 @Override
 public void write( OutputStream os ) throws IOException, WebApplicationException {
 int recordsPerRoundTrip = 100; // Number of records for every round trip to the database
 int recordPosition = 0; // Initial record position index
 int recordSize = queryGeneratedUuidRecordsSize(); // Total records found for the query

 // Start streaming the data
 try ( PrintWriter writer = new PrintWriter( new BufferedWriter( new OutputStreamWriter( os ) ) ) ) {

 writer.print( "{\"result\": [" );

 while ( recordSize > 0 ) {
 // Get the paged data set from the DB
 List<GeneratedUuidEntity> generatedUuidEntities = listAllGeneratedUuidEntities( recordPosition, recordsPerRoundTrip );

 for ( GeneratedUuidEntity generatedUuidEntity : generatedUuidEntities ) {
 if ( recordPosition > 0 ) {
 writer.print( "," );
 }

 // Stream the data in Json object format
 writer.print( Json.createObjectBuilder()
 .add( "record_no", generatedUuidEntity.getRecordNo() )
 .add( "uuid", generatedUuidEntity.getUuid() )
 .add( "datetime_generated", df.format( generatedUuidEntity.getDatetimeGenerated() ) )
 .build().toString() );

 // Increase the recordPosition for every record streamed
 recordPosition++;
 }

 // update the recordSize (remaining no. of records)
 recordSize -= recordsPerRoundTrip;
 }

 // Done!
 writer.print( "]}" );
 }
 }
 } ).build();
 }
}

Watch Out

Do remember to tune the application server’s response connection timeout value to prevent java.io.IOException Premature EOF exception being thrown by the REST or Http Client.

Testing It

To test whether this works, just get the table loaded with just 567 records. Then, have the unit test invoke the end-point URL and save the retrieved JSON data to a file with the below unit test codes (Apache HttpClient is used):

Codes for JPAStreamingUnitTest.java:

package com.developerscrappad;

import java.io.File;
import java.io.FileInputStream;
import static org.junit.Assert.*;

import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.UUID;
import javax.json.Json;
import javax.json.JsonArray;
import javax.json.JsonObject;
import javax.json.JsonReader;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;

public class JPAStreamingUnitTest {

 private static final String dbDriverClassname = "com.mysql.jdbc.Driver";
 private static final String dbUrl = "jdbc:mysql://localhost:3306/large_data_test_db";
 private static final String username = "username";
 private static final String password = "password";
 private static final int numberOfRecords = 567;
 private static final String jsonResultOutputFilename = "testing123.json";

 @BeforeClass
 public static void setUpClass() {
 try {
 Class.forName( dbDriverClassname );

 try ( Connection conn = DriverManager.getConnection( dbUrl, username, password ) ) {
 String insertSQL = "insert into generated_uuids (uuid, datetime_generated) values (?, now())";

 try ( PreparedStatement stmt = conn.prepareStatement( insertSQL ) ) {
 for ( int i = 0; i < numberOfRecords; i++ ) {
 System.out.println( "Inserting row: " + i );

 stmt.setString( 1, UUID.randomUUID().toString() );
 stmt.executeUpdate();
 }
 }
 }
 } catch ( final Exception ex ) {
 ex.printStackTrace();
 fail( ex.getMessage() );
 }
 }

 @AfterClass
 public static void tearDownClass() {
 try {
 Class.forName( dbDriverClassname );

 try ( Connection conn = DriverManager.getConnection( dbUrl, username, password ) ) {
 String truncateSQL = "truncate generated_uuids";
 conn.createStatement().executeUpdate( truncateSQL );
 }

 new File( System.getProperty( "java.io.tmpdir" ), jsonResultOutputFilename ).delete();
 } catch ( final Exception ex ) {
 ex.printStackTrace();
 fail( ex.getMessage() );
 }
 }

 @Test
 public void testJPAStreaming() {
 String url = "http://localhost:8080/JPAStreaming/rest-api/generated-uuids/list-all/";

 try {
 CloseableHttpClient httpclient = HttpClients.createDefault();
 HttpGet httpGet = new HttpGet( url );

 try ( CloseableHttpResponse response1 = httpclient.execute( httpGet ) ) {
 System.out.println( response1.getStatusLine() );
 HttpEntity entity1 = response1.getEntity();
 Files.copy( entity1.getContent(), FileSystems.getDefault().getPath( System.getProperty( "java.io.tmpdir" ), jsonResultOutputFilename ) );
 }

 // Validate
 try ( JsonReader jsonReader = Json.createReader( new FileInputStream( new File( System.getProperty( "java.io.tmpdir" ), jsonResultOutputFilename ) ) ) ) {
 JsonObject jsonObj = jsonReader.readObject();
 assertTrue( jsonObj.containsKey( "result" ) );

 JsonArray jsonArray = jsonObj.getJsonArray( "result" );
 assertEquals( numberOfRecords, jsonArray.size() );

 SimpleDateFormat validationDF = new SimpleDateFormat( "yyyy-MM-dd HH:mm:ss" );

 for ( int i = 0; i < jsonArray.size(); i++ ) {
 JsonObject generatedUuidJsonObj = jsonArray.getJsonObject( i );
 int recordNumber = generatedUuidJsonObj.getInt( "record_no" );
 assertTrue( recordNumber > 0 );

 try {
 UUID.fromString( generatedUuidJsonObj.getString( "uuid" ) );
 } catch ( IllegalArgumentException ex ) {
 fail( "Invalid UUID format at record number: " + recordNumber );
 }

 try {
 validationDF.parse( generatedUuidJsonObj.getString( "datetime_generated" ) );
 } catch ( final NullPointerException | ParseException ex ) {
 fail( "datetime_generated field must not be null and must be of format yyyy-MM-dd HH:mm:ss" );
 }
 }
 }

 } catch ( final Exception ex ) {
 ex.printStackTrace();
 fail( ex.getMessage() );
 }
 }
}

And we are done. Thanks for reading and hope this helps.

Do you want to know how to develop your skillset to become a Java Rockstar?
Subscribe to our newsletter to start Rocking right now!
To get you started we give you our best selling eBooks for FREE!
1. JPA Mini Book
2. JVM Troubleshooting Guide
3. JUnit Tutorial for Unit Testing
4. Java Annotations Tutorial
5. Java Interview Questions
6. Spring Interview Questions
7. Android UI Design
and many more ....
I agree to the Terms and Privacy Policy

Thank you!

We will contact you soon.

👁 Photo of Max Lam
Max Lam
July 1st, 2015Last Updated: July 1st, 2015
1 640 8 minutes read

Max Lam

Born and currently resides in Malaysia, a seasoned Java developer whom had held positions as Senior Developer, Consultant and Technical Architect in various enterprise software development companies.
Subscribe

This site uses Akismet to reduce spam. Learn how your comment data is processed.

1 Comment
Oldest
Newest Most Voted
Seb Trap
9 years ago

The JSR-352 and here the javax.batch.api.chunk was introduced for the handling of large operations.
There are connectors available for JDBC, JPA, and others based on the needs.

0
Reply
Back to top button
Close
wpDiscuz