VOOZH about

URL: https://thenewstack.io/building-real-time-location-intelligence-apps/

⇱ Building Real-Time Location Intelligence Apps - The New Stack


TNS
SUBSCRIBE
Join our community of software engineering leaders and aspirational developers. Always stay in-the-know by getting the most important news and exclusive content delivered fresh to your inbox to learn more about at-scale software development.
REQUIRED
It seems that you've previously unsubscribed from our newsletter in the past. Click the button below to open the re-subscribe form in a new tab. When you're done, simply close that tab and continue with this form to complete your subscription.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.
Welcome and thank you for joining The New Stack community!
Please answer a few simple questions to help us deliver the news and resources you are interested in.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Great to meet you!
Tell us a bit about your job so we can cover the topics you find most relevant.
REQUIRED
REQUIRED
REQUIRED
REQUIRED
REQUIRED
Welcome!

We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.

What’s next?

Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.

Follow TNS on your favorite social media networks.

Become a TNS follower on LinkedIn.

Check out the latest featured and trending stories while you wait for your first TNS newsletter.

PREV
1 of 2
NEXT
VOXPOP
As a JavaScript developer, what non-React tools do you use most often?
Angular
0%
Astro
0%
Svelte
0%
Vue.js
0%
Other
0%
I only use React
0%
I don't use JavaScript
0%
Thanks for your opinion! Subscribe below to get the final results, published exclusively in our TNS Update newsletter:
NEW! Try Stackie AI
From clobbered drafts to real-time sync
Apr 14th 2026 10:00am, by David Moore
TypeScript 6.0 RC arrives as a bridge to a faster future
Mar 14th 2026 9:00am, by Darryl K. Taft
Mastra empowers web devs to build AI agents in TypeScript
Jan 28th 2026 11:00am, by Loraine Lawson
2023-03-03 06:08:52
Building Real-Time Location Intelligence Apps
sponsor-kinetica,sponsored-post-contributed,
Data / Software Development

Building Real-Time Location Intelligence Apps

It has become easier, thanks to new specialized technologies designed to handle the processing, analysis and visualization of location-based data.
Mar 3rd, 2023 6:08am by Hari Subhash
👁 Featued image for: Building Real-Time Location Intelligence Apps
Kinetica sponsored this post.
Real-time location intelligence refers to the ability to gather, analyze and act on location-based data in real-time. Real-time location intelligence is used for a range of applications such as fleet management, smart cities, detecting threats in our airspace, smart agriculture, proximity marketing, telco network optimization, supply chain management and more. Over the past few years, building real-time location intelligence solutions has become easier thanks to the introduction of new specialized technologies that are designed to handle the processing, analysis and visualization of location-based data. In this article, I will demonstrate how to set up a location intelligence pipeline that is built on top of real-time data feeds from Apache Kafka. The workbook contains an end-to-end pipeline that connects to streaming data sources via Kafka, performs spatial computations to detect different events and patterns, and then streams these to an external application. All of this is done using simple SQL code without any need for complicated data engineering. You can access the workbook that I used from this link to try this out on your own.
Kinetica is the real-time database platform that leverages generative AI and vectorized processing to let you ask anything of your sensor and machine data. Kinetica offers native vectorized analytics in generative AI, spatial, time-series, and graph.
Learn More
The latest from Kinetica

Setup

I used Kinetica’s free developer edition to run the queries in this article. There are a few things to consider when picking a database for real-time spatial analysis.
  1. Native connectors to streaming data sources like Kafka
  2. Functional coverage: The ST_Geometry library from PostGIS is the gold standard for geospatial functions. Most spatial databases emulate this library.
  3. Spatial joins: The ability to combine multiple tables based on a spatial relationship. This is important for any type of sophisticated analysis of spatial data.
  4. Performance: Real-time applications require spatial functions to be executed on large amounts of data at high speeds.
  5. Reactivity: The system needs to automatically detect and respond to new data without any elaborate engineering on our part.
  6. Integrated capabilities: Location intelligence workloads often have niche analytical requirements such as routing and optimization using graph networks and the ability to visualize large amounts of data.
  7. Stream data out: The final piece of the pipeline is the ability to stream data out of an analytical database into external systems.
See the last section of this article for more information on how to try the queries here on your own.

Data

We will use the following data for this demo.
  1. A stream of GPS coordinates that record the movement of seven trucks in Washington, D.C. The data is being streamed in via a Kafka topic.
  2. A set of polygons that outline different landmarks in D.C.
  3. A stream of taxi trips that are occurring in New York City (a few hundred per second)
All the data feeds are preconfigured so you don’t have to do any setup to replicate the queries here on your own. The maps below are a visual representation of the data.  
👁 Image

Figure A: trucks in Washington, D.C.

👁 Image

Figure B: fences around landmarks in Washington, D.C.

👁 Image

Figure C: NYC taxi pickup points

Let’s dive into a few common analytical tasks that underpin location intelligence.

Spatial Filtering

A spatial filter is used to identify a subset of records from a table that meet some criteria based on a specific spatial query. Let’s start with a really simple query to kick things off. The query below, identifies all the “fences” — the outlines around landmarks that have an area greater than 0.5 km2 (500,000 m2).
CREATE OR REPLACE TABLE fence_filter AS 
SELECT wkt
FROM dc_fences 
WHERE ST_AREA(wkt, 1) > 500000
This is a pretty easy query to implement in any database that supports spatial operations since it is filtering from a small table. Now let’s explore a few use cases that are harder to implement in real time.

Spatial Join of Streaming Locations with the Fences Table

Spatial joins combine two different tables using a spatial relationship. They are used to answer questions such as “which areas overlap,” “where do the boundaries occur” and “what is the area covered by a certain feature.” Spatial joins are typically computationally expensive to perform, especially on larger data. The query below performs an inner join between the `recent_locations` table and the `dc_fences` table that outlines landmarks in D.C. based on the following criteria: Find all records where a truck came within 200 meters of a landmark in D.C.
CREATE OR REPLACE MATERIALIZED VIEW vehicle_fence_join 
REFRESH ON CHANGE AS
SELECT 
 TRACKID, 
 x,
 y,
 DECIMAL(STXY_DISTANCE(x, y, wkt, 1)) AS distance,
 fence_label,
 wkt
FROM recent_locations
INNER JOIN dc_fences 
 ON 
 STXY_DWITHIN(x, y, wkt, 200, 1) = 1;
Everything looks similar to PostGIS but … This query looks remarkably similar to any query that you would see in PostGIS or any other spatial database that uses ST_Geometry. The key difference, however, is the use of the materialized view that is set to refresh on change. This means that the query will be updated in real time as new records with truck location hit the Kafka topic. With this change, the database will constantly monitor and maintain the join view (`vehicle_fence_join`) to reflect the most up-to-date version of truck locations. Or in other words, if there is a new truck location that is close to a landmark, it will be automatically reflected in the join view.

Take It up a Notch with a Time Filter

In the previous query we identified all the records that matched spatial criteria. Now, let’s add time-based criteria. The query below identifies all instances from the last 5 minutes where a truck location was inside one of the “fences” around a landmark in Washington, D.C.
CREATE OR REPLACE MATERIALIZED VIEW fence_events 
REFRESH ON CHANGE AS 
SELECT 
 TRACKID, 
 wkt, 
 x, 
 y, 
 TIMESTAMP, 
 CONCAT(TRACKID, CONCAT(' is at ', fence_label)) AS event_text 
FROM recent_locations, dc_fences
WHERE 
 STXY_CONTAINS(wkt, x, y) = 1 AND 
 TIMEBOUNDARYDIFF('MINUTE', TIMESTAMP, NOW()) < 5
The query above performs a very common location intelligence task: geofencing. Geofencing is a location-based service in which a software application uses GPS, RFID, Wi-Fi or cellular data to trigger a preprogrammed action when an object enters or exits a virtual boundary set up around a geographical location. Geofencing can be used for various purposes, such as sending notifications to customers when they enter or leave an area, providing targeted advertising based on location, and tracking employee time and attendance. Now, there is one particular scenario where the query we used above will fail. Consider a fence that is really small; an object might enter this fence and then subsequently exit before it emits a location signal. Or in other words, if we don’t have data on a geofence event, we won’t be able to detect it. But there is a workaround for this. We can interpolate the “track” between two subsequent points to then infer if the track intersects with the fence. Let’s look at that next.

Add Interpolation with Tracks for More Accurate Detection

The animation below shows the scenario we just described. Without interpolation it might seem that a particular object did not enter a fence. But with interpolation, we can correctly infer that the track did indeed intersect with a fence. Kinetica offers a native track object that does all the interpolation behind the scenes. A track requires the following information: a timestamp (so that we know the order of the points), the X and Y (longitude/latitude) coordinates of the object and an ID variable that identifies each track. We can use this with the `ST_TRACKINTERSECTS` function to infer geofencing events as shown below.
CREATE OR REPLACE MATERIALIZED VIEW fence_trucks
REFRESH EVERY 5 SECONDS AS
SELECT *
FROM TABLE
(
 ST_TRACKINTERSECTS
 (
 TRACK_TABLE => INPUT_TABLE(recent_locations),
 TRACK_ID_COLUMN => 'TRACKID',
 TRACK_X_COLUMN => 'x',
 TRACK_Y_COLUMN => 'y',
 TRACK_ORDER_COLUMN => 'TIMESTAMP',
 GEOFENCE_TABLE => INPUT_TABLE(dc_fences),
 GEOFENCE_ID_COLUMN => 'fence_id',
 GEOFENCE_WKT_COLUMN => 'wkt'
 )
);

There are several other track functions that come with Kinetica, but you can try those out on your own using the workbook.

Binning and Aggregation

The final spatial operation I want to cover is binning and aggregation. We will use the stream of taxi trips from New York City for this section. This data contains information about the pickup and drop-off locations, fare and distance of each taxi trip in New York City. We are interested in finding hotspots in the taxi locations in the city. Spatial binning is a technique used in data analysis to group geographically referenced data into user-defined bins or areas. The technique is used to reduce the resolution of a dataset, making it easier to visualize and analyze large datasets. This can be done by dividing the area into equal-sized rectangular grids, circles, polygons or hexagons. The resulting bins are then used for statistical analysis, such as calculating the frequency of events or measuring average values.
👁 Image

Image 1

Kinetica offers multiple ways to generate grids for binning data: `ST_HEXGRID` or using H3 geohashing functions. Let’s use the latter. Uber developed H3 (Hexagonal Hierarchical Spatial index) for efficiently optimizing ride pricing and visualizing and exploring spatial data. Each grid in the systems has an associated index and a resolution. We can use the `STXY_H3` function to identify the corresponding grid index value for each taxi drop-off and pickup location for the taxi data stream — into which hexagon a particular point falls. Note that the materialized view is set to refresh on change, so as new data hits the system (a few hundred trips every second), the view below is automatically maintained by calculating the H3 index values for all the new records.
CREATE OR REPLACE MATERIALIZED VIEW nytaxi_h3_index
REFRESH ON CHANGE AS 
SELECT 
 pickup_latitude,
 pickup_longitude,
 STXY_H3(pickup_longitude, pickup_latitude, 9) AS h3_index_pickup,
 STXY_H3(dropoff_longitude, dropoff_latitude, 9) AS h3_index_dropff
FROM taxi_data_streaming;
Once we have the index values, we can use those to generate the corresponding bins using the `ST_GEOMFROMH3` function. The data is then grouped by the index to calculate the total number of pickups in each bin. This is another materialized view that is listening for changes to the materialized view we set up in the query above. Again the aggregations are automatically updated and as new records arrive, the view will always reflect the most up-to-date aggregation without any additional work on our part.
CREATE OR REPLACE MATERIALIZED VIEW nytaxi_binned
REFRESH ON CHANGE AS 
SELECT 
 h3_index_pickup,
 ST_GEOMFROMH3(h3_index_pickup) AS h3_cell,
 COUNT(*) AS total_pickups
FROM nytaxi_h3_index 
GROUP BY h3_index_pickup
The map below shows the results. We can see that the hotspots for pickup are mostly in Lower Manhattan. 👁 Image

Try This on Your Own

You can try all of this on your own for free using the Spatial Analytics workbook on Kinetica Cloud, or by downloading the workbook from our GitHub repo here and importing it into Kinetica’s Developer Edition. I have preconfigured all the data streams so you don’t have to do any additional set up to run this workbook. Both Kinetica Cloud and Developer Edition are free to use. The cloud version takes only a few minutes to set up, and it is a great option for quickly touring the capabilities of Kinetica. However, it is a shared multitenant instance of Kinetica, so you might experience slightly slower performance. The developer edition is also easy to set up and takes about 10 to 15 minutes and an installation of Docker. The developer edition is a free personal copy that runs on your computer and can handle real-time computations on high-volume data feeds.
Kinetica is the real-time database platform that leverages generative AI and vectorized processing to let you ask anything of your sensor and machine data. Kinetica offers native vectorized analytics in generative AI, spatial, time-series, and graph.
Learn More
The latest from Kinetica
TRENDING STORIES
Hari Subhash is director of community content strategy and enablement for Kinetica. He has more than 15 years of experience in the data analytics space across a wide range of industry sectors such as financial management, international development and technology....
Read more from Hari Subhash
Kinetica sponsored this post.
SHARE THIS STORY
TRENDING STORIES
TNS owner Insight Partners is an investor in: Real, Pragma, Docker.
SHARE THIS STORY
TRENDING STORIES
TNS DAILY NEWSLETTER Receive a free roundup of the most recent TNS articles in your inbox each day.
The New Stack does not sell your information or share it with unaffiliated third parties. By continuing, you agree to our Terms of Use and Privacy Policy.