VOOZH about

URL: https://dzone.com/articles/real-time-event-correlation-distributed-systems

⇱ Solving Real-Time Event Correlation in Distributed Systems


Related

  1. DZone
  2. Data Engineering
  3. Databases
  4. Solving Real-Time Event Correlation in Distributed Systems

Solving Real-Time Event Correlation in Distributed Systems

Learn how to join streaming events in real time using Spring Boot with event-time windows, watermarking, and in-memory state β€” no Kafka or Flink needed.

By Nov. 27, 25 Β· Analysis
Likes
Comment
Save
2.0K Views

Join the DZone community and get the full member experience.

Join For Free

Modern digital platforms operate as distributed ecosystems β€” microservices emitting events, APIs exchanging data, and asynchronous communication becoming the norm. In such environments, correlating events across multiple sources in real time becomes a critical requirement.

Think of payments, orders, customer metadata, IoT sensors, logistics tracking β€” all flowing continuously.

These events rarely arrive in order, rarely originate from one system, and often must be joined before downstream processing. This leads to a crucial question:

How do we join two streaming event sources (left and right streams) by key and time window using plain Spring Boot, without Kafka Streams, Spark, or Flink?

This blog explains a clean, production-ready way to implement:

  • Real-time stream join
  • Event-time based correlation
  • Watermarking to drop late events
  • Sliding/Tumbling window joins
  • In-memory state store
  • REST API-based ingestion

Why Do We Need a Streaming Join Engine?

In enterprise systems, operations often depend on combining information from multiple event producers.

1. Order + Customer Metadata

  • Left stream: Order event from OMS.
  • Right stream: Customer country/geo from User Service. This is needed for real-time enrichment before invoicing or fraud detection.

2. Payment Event + Fraud Score Event

  • Payment attempt arrives.
  • Fraud score arrives from ML model. Join is required within 5–10 seconds to allow real-time approval decisions.

3. IoT Sensor Fusion

  • Temperature event
  • Humidity event. Must be joined within milliseconds for accurate predictions.

4. Telecom Real-Time Monitoring

  • Subscriber movement.
  • Tower signal. Join is needed for real-time network-performance metrics.

These joins cannot rely on nightly batch jobs, because decisions must be:

  • Real-time
  • Event-driven
  • Low-latency
  • Correct, even if events arrive out of order

Core Challenge: Event-Time Disorder and Late Arrivals

In real systems:

  • Network jitter delays events
  • Services are produced at different speeds
  • Clock skew causes timestamp variations
  • Event bursts cause uneven flow
  • Events can arrive seconds to minutes late

A naΓ―ve "join the last event you saw" strategy will break.

Example

Left Event (Order)

  • t = 10:00:05

Right Event (CustomerMeta)

  • t = 10:00:02 (arrives late at 10:00:08)

β†’ Without event-time logic, this join fails.
β†’ Business logic says it should match.

This is why we need:

  • Event-time based windows
  • Watermarking
  • Late-event handling

Architecture: Event-Time Join Engine With Watermarking


Event Models

Left Event

Java
public class LeftEvent {
 private String key;
 private long eventTime; // epoch millis
 private String orderId;
 private double amount;
}


Right Event

Java
public class RightEvent {
 private String key;
 private long eventTime; // epoch millis
 private String country;
}


How the Join Works: Step-by-Step

1. Accept Event via REST

Java
@PostMapping("/left") β†’ engineService.acceptLeft(e);
@PostMapping("/right") β†’ engineService.acceptRight(e);


Events are added to:

  • leftStore
  • rightStore

2. Update Maximum Event Time

Python
maxEventTime = max(maxEventTime, event.eventTime)


3. Compute Watermark

Allowed lateness = 10 seconds β†’ 10000 ms

Python
watermark = maxEventTime - 10000


Events older than this are discarded.

4. Perform Join

On the arrival of a new Left Event:

Python
scan RightStore[key]
match if |leftTime - rightTime| <= joinWindow


Same for Right Event.

5. Emit Joined Output

A sample join result:

JSON
{
 "key": "CUST-1002",
 "orderId": "ORD-9002",
 "amount": 1989.75,
 "country": "India",
 "joinedAt": 1700000105000
}


6. Cleanup Using Watermark

Any event older than the watermark is removed from memory.

Example API Requests and Outputs

POST /api/left

JSON
{
 "key": "CUST-1002",
 "eventTime": 1700649000202,
 "orderId": "ORD-9002",
 "amount": 1989.75
}


Response: accepted.


State:

Python
leftStore["CUST-1002"] = [{ORD-9002, 1989.75}]
rightStore["CUST-1002"] = []


POST /api/right

JSON
{
 "key": "CUST-1002",
 "eventTime": 1700649000152,
 "country": "India"
}

Joined Output

JSON
{
 "key": "CUST-1002",
 "orderId": "ORD-9002",
 "amount": 1989.75,
 "country": "India",
 "joinedAt": 1700000105000
}


Watermark Example

  • Max event time: 1700000105000
  • Allowed lateness: 10s 
  • Watermark: 1700000095000

API endpointGET /api/watermark

Response: 1700000095000

REST Controller (Final Version)

Java
@RestController
@RequestMapping("/api")
public class EventController {

 private final EngineService engineService;

 public EventController(EngineService engineService) {
 this.engineService = engineService;
 }

 @PostMapping("/left")
 public ResponseEntity<?> postLeft(@RequestBody LeftEvent e) {
 engineService.acceptLeft(e);
 return ResponseEntity.ok().body("accepted");
 }

 @PostMapping("/right")
 public ResponseEntity<?> postRight(@RequestBody RightEvent e) {
 engineService.acceptRight(e);
 return ResponseEntity.ok().body("accepted");
 }

 @GetMapping("/watermark")
 public ResponseEntity<?> watermark() {
 return ResponseEntity.ok().body(engineService.getWatermark());
 }
}


Endpoints

Endpoint description

POST /api/left

Ingest Left events

POST /api/right

Ingest Right events

GET /api/watermark

Shows computed watermark


Why Not Use a Database JOIN?

Because database joins are:

  • Based on stored/batch data
  • High latency (seconds to minutes)
  • Order-dependent (arrival order matters)
  • Cannot handle late events.
  • Cannot scale for streaming workloads

Our Spring Boot join engine provides:

  • Millisecond joins
  • Accurate event-time semantics
  • In-memory performance
  • Watermark-based cleanup
  • No external streaming infrastructure

Practical Real-Time Use Case: E-Commerce Order Enrichment

Business Flow

1. Order Service emits:

  • orderId
  • amount
  • eventTime
  • customerId (key)

2. Customer Profile Service emits:

  • customerId
  • country
  • eventTime

Business Need

Before routing an order to:

  • Tax engine
  • Fraud decisioning
  • Payment gateway
  • Invoicing pipeline

It must be enriched with the customer’s country.

Join Logic

  • Join window: 10 seconds
  • Join based on: event-time, not arrival time

Why?

  • The customer profile may arrive late.
  • Order service may be faster.
  • Network latency varies.
  • A message queue (Kafka/SQS) might delay only one stream.

Example Timeline

Time (Actual) event arrival time

10:00:02

RightEvent(country=β€œIN”)

arrives at 10:00:05

10:00:05

LeftEvent(order=550)

arrives at 10:00:05


Both timestamps fall within the 10-second window.

β†’ Join event created.

Real-Time Benefits

  • Sub-second join processingAccurate even when events arrive out of order.
  • No Kafka, Flink, Spark required.
  • Stateless REST producers β†’ State held centrally.
  • Runs on any environment (VM, on-prem, Kubernetes).
  • Production-ready with watermark-based cleanup

Where This Is Used in the Real World

  • Fraud detection: Join payment events with risk-score events.
  • Real-time order enrichment:Join orders with customer metadata (country, geo, segment).
  • Logistics/fleet tracking: Join GPS location with route updates.
  • IoT sensor fusion: Combine temperature + humidity into a unified sensor reading.
  • Telecom analytics:Join subscriber session with tower signal events.
  • Online gaming: Join user action events with current session metadata.

Final Summary

You now have a complete:

  • Real-time streaming join engine
  • Event-time window correlation
  • Watermarking and cleanup
  • REST-based ingestion
  • In-memory state store
  • Real-world business use case
  • Ready-to-deploy Spring Boot implementation

This is a lightweight, efficient alternative to Kafka Streams/Flink for small-to-medium-scale real-time joins.

Event Joins (concurrency library) systems

Opinions expressed by DZone contributors are their own.

Related

  • When Events Move Faster Than Your Database: A Resilient Design Pattern
  • Clock Synchronization and Ordering Events in Distributed Systems: Lamport Clocks vs. Vector Clocks
  • Building a Deterministic Event Correlation Engine in Go for High-Volume Alert Systems
  • Queues Don't Absorb Load β€” They Delay Bankruptcy

Partner Resources

Γ—

Comments

The likes didn't load as expected. Please refresh the page and try again.

Let's be friends: