URL: https://thenewstack.io/real-time-recommendations-with-graph-and-event-streaming/

Real-Time Recommendations with Graph and Event Streaming

How a video service, for example, can architect high-performance persistence to get users happily watching their next movie.
Nov 15th, 2022 6:15am by Aaron Ploetz
DataStax sponsored this post.

Real-time data is becoming increasingly important to enterprise success. Successfully collecting incoming data while reacting both quickly and strategically is paramount.

However, the data collection process is often far from trivial. Write contention is a common bottleneck in many large-scale architectures. As data storage infrastructure evolves, developers are abstracted further away from the critical areas of the write path, which can make it difficult to troubleshoot write durability and performance issues.

The question becomes: How can we ensure that all data is being stored while not overloading the storage layer? Here, we’ll explore using DataStax Astra Streaming to help with some of the pitfalls of ensuring real-time data delivery.

The Tech Stack

Let’s say that we’re supporting a video service. Once a user finishes watching a show or movie, they’re prompted to rate it from one to five stars. From that rating, we take a data-driven approach to infer a few titles that we “think” they might like. This one small rating action also helps improve the accuracy of future recommendations for everyone.

How can we architect a system to accomplish this? We will need some simple components, most notably a database and a service layer to query it.

The Database

For our data storage layer, we will use DataStax Enterprise Graph, a Gremlin/TinkerPop property graph database built on top of Apache Cassandra. Graph databases are perfect for use cases where the relationships between the data are just as important as the data itself.

In our case, we’re focusing on the relationship between our users and the movies they like. This way, we can help to point them toward additional movies they might enjoy.

Using DSE Graph, we can track data about users and movies, storing them as “vertices” in the database (figure 1). Whenever a user rates a movie, we can add a “rated” edge (with the rating value as a property) from the user to the movie.

When we want to get a recommendation, we can use a particular movie as the entry point and “walk” the graph out to other users, then further out to movies that they’ve rated similarly.

Figure 1 – A partial graph showing how “User” and “Movie” vertices are connected by edges containing the user’s rating.
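
To make the model concrete, here is a minimal in-memory sketch in Java of the same shape: “User” and “Movie” vertices joined by “rated” edges that carry the rating as a property. It is not DSE Graph or Gremlin code, and it uses Java records (Java 16 or later). The class, record and field names (RatingGraph, userId, movieId and so on) are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the property graph (not the DSE Graph API). */
final class RatingGraph {
    record User(String userId) {}
    record Movie(String movieId, String title) {}
    /** A "rated" edge from a user to a movie, with the rating stored as a property. */
    record Rated(User from, Movie to, int rating) {}

    private final Map<String, User> users = new HashMap<>();
    private final Map<String, Movie> movies = new HashMap<>();
    private final List<Rated> ratedEdges = new ArrayList<>();

    /** Adds a "User" vertex, or returns the existing one with this id. */
    User addUser(String userId) {
        return users.computeIfAbsent(userId, User::new);
    }

    /** Adds a "Movie" vertex, or returns the existing one with this id. */
    Movie addMovie(String movieId, String title) {
        return movies.computeIfAbsent(movieId, id -> new Movie(id, title));
    }

    /** Adds a "rated" edge between two vertices that already exist. */
    Rated addRating(String userId, String movieId, int rating) {
        User user = users.get(userId);
        Movie movie = movies.get(movieId);
        if (user == null || movie == null) {
            throw new IllegalArgumentException("unknown user or movie");
        }
        Rated edge = new Rated(user, movie, rating);
        ratedEdges.add(edge);
        return edge;
    }

    /** Returns every "rated" edge that points at the given movie. */
    List<Rated> ratingsOf(String movieId) {
        List<Rated> result = new ArrayList<>();
        for (Rated edge : ratedEdges) {
            if (edge.to().movieId().equals(movieId)) {
                result.add(edge);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RatingGraph graph = new RatingGraph();
        graph.addUser("u1");
        graph.addMovie("m1", "Back to the Future");
        graph.addRating("u1", "m1", 5);
        System.out.println(graph.ratingsOf("m1").size() + " rating(s) for m1");
    }
}
```

Keeping the rating on the edge rather than on either vertex is what lets a later traversal filter by rating value as it walks from a movie to its raters and back out to other movies.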

The Service Layer

To interact with the database, we’ll build a simple RESTful service using Java Spring Boot. Inside the controller, we’ll build out two services:

  • addUserRating – Takes a new rating for a movie from a user and adds an edge to the graph.
  • findRealtimeRecommendationsByMovieId – Takes a movie (id) and returns a list of similarly rated movies.

The services are largely self-explanatory. One writes user ratings to the database (each stored as an edge in the graph). The other returns recommendations based on the movie provided, using item-based collaborative filtering to match like-rated movies. (For more on item-based collaborative filtering, check out chapter 10 of “The Practitioner’s Guide to Graph Data.”)
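
As a rough sketch of what the contract for these two services might look like in plain Java (without the Spring annotations), the snippet below declares both operations and validates an incoming rating against the one-to-five star scale. Only the two method names come from this article; the UserRating record, its fields, the interface name and the return type are assumptions.

```java
import java.util.List;

/** Hypothetical request payload; the one-to-five range follows the star scale described earlier. */
record UserRating(String userId, String movieId, int stars) {
    UserRating {
        if (userId == null || userId.isBlank() || movieId == null || movieId.isBlank()) {
            throw new IllegalArgumentException("userId and movieId are required");
        }
        if (stars < 1 || stars > 5) {
            throw new IllegalArgumentException("stars must be between 1 and 5");
        }
    }
}

/** The two operations named above; parameter and return types are assumptions. */
interface RecommendationService {
    /** Accepts a new rating from a user for a movie. */
    void addUserRating(UserRating rating);

    /** Returns titles of movies rated similarly to the given movie. */
    List<String> findRealtimeRecommendationsByMovieId(String movieId);
}

final class RatingValidationDemo {
    public static void main(String[] args) {
        System.out.println(new UserRating("u1", "m1", 5));
        try {
            new UserRating("u1", "m1", 6);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting malformed ratings at the service boundary keeps bad messages out of the write path before they reach the graph.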

Astra Streaming Topic

Astra Streaming is a distributed streaming-as-a-service built atop Apache Pulsar. We’ll use a streaming topic to handle the incoming write traffic. Calls to the addUserRating service will send a user’s new rating for a movie to the topic (as shown in figure 2). We will then have a process “subscribing” to the topic, which will consume the data and write it as an edge into the graph.

Using Astra Streaming in this approach gives us a few advantages:

  • Topics can provide message delivery guarantees – In case of a failure event, this helps to ensure that our “in flight” data is persisted.
  • Protection from write backpressure – If the application has periods of unpredictable or “bursty” write activity, an Astra Streaming topic can help to throttle the write throughput, protecting the underlying storage infrastructure from overutilization.

Figure 2 – A visual representation of how the different components of the recommendation system interact with each other
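
The throttling effect described in the list above can be illustrated with a small, single-threaded simulation. BufferedTopic below is a hypothetical in-memory stand-in, not the Astra Streaming or Pulsar API, and it offers none of a real topic’s durability. It only shows the principle: producers append a burst immediately, while the consumer drains at a rate that it chooses.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical in-memory stand-in for a streaming topic (not the Pulsar API). */
final class BufferedTopic<T> {
    private final Queue<T> backlog = new ArrayDeque<>();

    /** Producers append without waiting on the storage layer. */
    void publish(T message) {
        backlog.add(message);
    }

    /** Hands the consumer at most maxMessages and leaves the rest queued. */
    List<T> drain(int maxMessages) {
        List<T> batch = new ArrayList<>();
        while (batch.size() < maxMessages && !backlog.isEmpty()) {
            batch.add(backlog.poll());
        }
        return batch;
    }

    int backlogSize() {
        return backlog.size();
    }
}

final class BurstDemo {
    /** Publishes a burst, then drains it at a fixed rate; returns the writes made per tick. */
    static List<Integer> writesPerTick(int burstSize, int maxWritesPerTick) {
        BufferedTopic<String> topic = new BufferedTopic<>();
        for (int i = 0; i < burstSize; i++) {
            topic.publish("rating-" + i);
        }
        List<Integer> writes = new ArrayList<>();
        while (topic.backlogSize() > 0) {
            writes.add(topic.drain(maxWritesPerTick).size());
        }
        return writes;
    }

    public static void main(String[] args) {
        System.out.println(writesPerTick(10, 4)); // prints [4, 4, 2]
    }
}
```

A burst of 10 ratings reaches the storage layer as batches of at most four writes, so the database sees a steady load regardless of how spiky the incoming traffic is.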

Streaming Topic Consumer

Once new ratings are sent to the Astra Streaming topic, a consumer process will take over. This process will “subscribe” to the topic and await any messages posted to it. Upon the arrival of a rating message on the topic, the consumer will acknowledge it and write it as an edge into the graph database using the following (Fluent) Gremlin code (figure 3).

Figure 3 – Fluent Gremlin for creating a “rated” edge between “User” and “Movie” nodes.

As the consumer process is running continuously, it will continue to monitor the topic and apply new “rated” edges to the graph as they come in. The great thing about this is that additional ratings messages will queue up and be applied at a consistent level of throughput.
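
The consumer’s receive, write and acknowledge cycle can be sketched as follows. This is an in-memory illustration under stated assumptions, not the article’s consumer code and not the Pulsar client API: RatingMessage, InMemorySubscription and RatingConsumer are hypothetical names, and the edge writer is passed in as a plain callback. One design choice in this sketch is to acknowledge a message only after the edge write succeeds, so that a failed write stays eligible for redelivery.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical payload for a rating message; the field names are assumptions. */
record RatingMessage(String userId, String movieId, int rating) {}

/** Single-threaded stand-in for a topic subscription (not the Pulsar client API). */
final class InMemorySubscription {
    private final Queue<RatingMessage> pending = new ArrayDeque<>();
    private final List<RatingMessage> acknowledged = new ArrayList<>();
    private final List<RatingMessage> awaitingRedelivery = new ArrayList<>();

    void publish(RatingMessage message) { pending.add(message); }

    /** Returns the next message, or null when nothing is waiting. */
    RatingMessage receive() { return pending.poll(); }

    void acknowledge(RatingMessage message) { acknowledged.add(message); }

    /** Marks a message as unprocessed so that it can be delivered again later. */
    void negativeAcknowledge(RatingMessage message) { awaitingRedelivery.add(message); }

    int acknowledgedCount() { return acknowledged.size(); }

    int awaitingRedeliveryCount() { return awaitingRedelivery.size(); }
}

final class RatingConsumer {
    /** Drains the subscription once and returns the number of edges written. */
    static int drain(InMemorySubscription subscription, Consumer<RatingMessage> edgeWriter) {
        int written = 0;
        RatingMessage message;
        while ((message = subscription.receive()) != null) {
            try {
                edgeWriter.accept(message);        // write the "rated" edge
                subscription.acknowledge(message); // acknowledge only after a successful write
                written++;
            } catch (RuntimeException e) {
                subscription.negativeAcknowledge(message); // leave it for redelivery
            }
        }
        return written;
    }

    public static void main(String[] args) {
        InMemorySubscription subscription = new InMemorySubscription();
        subscription.publish(new RatingMessage("u1", "m1", 5));
        List<RatingMessage> edges = new ArrayList<>();
        int written = drain(subscription, edges::add);
        System.out.println(written + " edge(s) written");
    }
}
```

A real consumer would run this loop continuously rather than stopping when the backlog is empty, and would block while waiting for the next message.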

Traversal and Results

The read path will be built using a simple graph traversal. Using the original movie as an entry point, we’ll move along similar ratings edges out to the users who submitted them, and then continue to the adjoining movie nodes. Running this traversal with the movie “Back to the Future” as the entry point for our sample data set produces the following results:

Title | Total # of High Ratings
Indiana Jones and the Raiders of the Lost Ark (1981) | 20
Shawshank Redemption (1994) | 18
Star Wars: Episode IV – A New Hope (1977) | 18
Forrest Gump (1994) | 18
Matrix The (1999) | 17

Table 1 – Traversal results for movies that are similarly rated to “Back to the Future.”
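
The logic of that traversal can be approximated over an in-memory list of ratings. The sketch below is not the Gremlin traversal used in the service: it assumes that a “high” rating means four stars or more, breaks ties alphabetically, and runs on a tiny invented sample, so its counts have no relation to Table 1. The names Recommender, Rating and Recommendation are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class Recommender {
    record Rating(String userId, String title, int stars) {}
    record Recommendation(String title, int highRatings) {}

    /** Assumption: four stars or more counts as a high rating. */
    static final int HIGH_RATING = 4;

    static List<Recommendation> similarlyRated(List<Rating> ratings, String seedTitle, int limit) {
        // Step 1: walk from the seed movie out to the users who rated it highly.
        Set<String> fans = new HashSet<>();
        for (Rating r : ratings) {
            if (r.title().equals(seedTitle) && r.stars() >= HIGH_RATING) {
                fans.add(r.userId());
            }
        }
        // Step 2: walk from those users out to the other movies they rated highly.
        Map<String, Integer> counts = new HashMap<>();
        for (Rating r : ratings) {
            boolean otherMovie = !r.title().equals(seedTitle);
            if (otherMovie && fans.contains(r.userId()) && r.stars() >= HIGH_RATING) {
                counts.merge(r.title(), 1, Integer::sum);
            }
        }
        // Step 3: rank by number of high ratings, then by title for stable output.
        List<Recommendation> ranked = new ArrayList<>();
        counts.forEach((title, n) -> ranked.add(new Recommendation(title, n)));
        ranked.sort(Comparator.comparingInt(Recommendation::highRatings).reversed()
                .thenComparing(Recommendation::title));
        return ranked.subList(0, Math.min(limit, ranked.size()));
    }

    /** Invented sample data, for illustration only. */
    static List<Rating> sampleRatings() {
        return List.of(
                new Rating("alice", "Back to the Future", 5),
                new Rating("alice", "Raiders of the Lost Ark", 5),
                new Rating("alice", "Forrest Gump", 4),
                new Rating("bob", "Back to the Future", 4),
                new Rating("bob", "Raiders of the Lost Ark", 4),
                new Rating("bob", "Forrest Gump", 2),
                new Rating("carol", "Back to the Future", 2),
                new Rating("carol", "The Matrix", 5),
                new Rating("dave", "Back to the Future", 5),
                new Rating("dave", "Raiders of the Lost Ark", 3),
                new Rating("dave", "The Matrix", 4));
    }

    public static void main(String[] args) {
        for (Recommendation r : similarlyRated(sampleRatings(), "Back to the Future", 3)) {
            System.out.println(r.title() + " | " + r.highRatings());
        }
    }
}
```

In this sample, carol’s rating of “The Matrix” is not counted because she did not rate the seed movie highly, which is the filtering step that makes the recommendation item-based rather than a simple popularity count.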

Summary

We discussed steps to improve our write path into a real-time recommendation system. We’ve implemented our main storage model in a graph database, which offers methods and algorithms that can take data discovery to a whole new level. Likewise, we’ve improved our data persistence guarantees while simultaneously protecting the storage layer from becoming overwhelmed in the event of a spike in user traffic.

While the use case of a movie recommendation system was the example here, the concepts discussed can be applied to many types of real-time systems. It’s common to find event processing and graph databases in use cases for many areas, such as supply chain, cybersecurity, and product-data management. Employing the methods discussed above can help ensure real-time data persistence.

The code for the Java Spring Boot service layer can be found in this repository, and the code for the Java consumer can be found

Aaron Ploetz is a developer advocate at DataStax. He's been a professional software developer since 1997 and has several years of experience working on and leading DevOps teams for startups and Fortune 50 enterprises, including several national retailers. He is...