URL: https://thenewstack.io/bridging-the-data-gap-real-time-user-facing-analytics/

Bridging the Data Gap: Real-Time User-Facing Analytics - The New Stack



Bridging the Data Gap: Real-Time User-Facing Analytics

This architecture delivers real-time insights to users while preserving data freshness, minimizing latency and sustaining high query throughput.
Jul 8th, 2024 6:43am by Dunith Danushka
Image from MARENZO on Shutterstock
Redpanda sponsored this post.

The word “decision-maker” likely makes you immediately think of “C-suite” or “executive.” But nowadays, we’re all decision-makers. Whether you’re a banker or a blogger, you have important decisions on your plate and need accurate insights to make them.

With the rise of SaaS and self-serve data platforms, organizations are constantly looking for ways to equip their employees with the insights they need to make data-driven decisions. Having access to fast, fresh analytics is expected to speed up decision-making and foster a data-driven culture within the organization.

User-facing analytics provide customers and employees with direct access to data analysis results. This typically involves dashboards that display data using graphs and other easily understandable formats. Essentially, the goal of user-facing analytics is to provide the data users need to make informed decisions without relying on data analysts or other specialists.

There are three crucial aspects of user-facing analytics: data freshness, query latency and query throughput. We’ll introduce a reference architecture for a scalable user-facing analytics solution and explain how each component addresses these aspects.

Considerations When Building User-Facing Analytics

Think of a user logging into their service provider’s mobile application. The user might see a dashboard that shows how much data they’ve used in the current billing cycle, how much data remains in their package and a forecast based on their current consumption rate. This dashboard might also provide insights into their usage patterns, such as peak usage times or the apps that consume the most data.

This real-time information can help the user make informed decisions about their data usage and avoid overage charges.

Typical user-facing dashboard with computed metrics
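
As a rough illustration of the computed metrics on such a dashboard, the sketch below derives the remaining allowance and an end-of-cycle forecast from the usage so far. It assumes a simple linear projection of the current daily rate; the function name, field names and sample figures are hypothetical and not taken from any real provider.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class UsageSummary:
    used_gb: float
    remaining_gb: float
    forecast_gb: float          # projected usage at the end of the cycle
    will_exceed_quota: bool


def summarize_usage(used_gb: float, quota_gb: float,
                    cycle_start: date, cycle_end: date, today: date) -> UsageSummary:
    """Compute dashboard metrics, assuming usage continues at the current daily rate."""
    days_elapsed = (today - cycle_start).days + 1      # count today as a usage day
    cycle_days = (cycle_end - cycle_start).days + 1
    daily_rate = used_gb / days_elapsed
    forecast_gb = daily_rate * cycle_days
    return UsageSummary(
        used_gb=used_gb,
        remaining_gb=max(quota_gb - used_gb, 0.0),
        forecast_gb=forecast_gb,
        will_exceed_quota=forecast_gb > quota_gb,
    )


summary = summarize_usage(
    used_gb=12.0, quota_gb=30.0,
    cycle_start=date(2024, 7, 1), cycle_end=date(2024, 7, 30), today=date(2024, 7, 10),
)
print(summary)
```

With 12 GB used in the first 10 days of a 30-day cycle, the projection lands above the 30 GB quota, which is the kind of signal that helps a user avoid overage charges.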

But what does it take to design and implement such a dashboard? Implementing a user-facing analytics solution requires three critical characteristics:

  • Data freshness is the ability to ingest and process data in real time, so users always see the most current insights.
  • Ultra-low query latency keeps the experience responsive: complex queries return quickly, even over large volumes of data.
  • High query throughput is the system’s capacity to handle a large number of simultaneous queries. It lets the system support a large user base without slowing down or crashing, so the experience stays responsive even under heavy load.
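
One way to make these three characteristics measurable is sketched below. The helper names and sample numbers are illustrative assumptions rather than part of the architecture: freshness is taken as the delay between an event occurring and becoming queryable, latency as a percentile of observed query times, and throughput as queries per second.

```python
import statistics


def freshness_lag_seconds(event_ts: float, queryable_ts: float) -> float:
    """Delay between an event happening and it being visible to queries."""
    return queryable_ts - event_ts


def latency_percentile_ms(latencies_ms, pct: int) -> float:
    """Return the pct-th percentile (1..99) of observed query latencies."""
    # quantiles(n=100) returns 99 cut points; cut point pct-1 is the pct-th percentile.
    cut_points = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return cut_points[pct - 1]


def throughput_qps(query_count: int, window_seconds: float) -> float:
    """Average number of queries served per second over a window."""
    return query_count / window_seconds


latencies = [12, 15, 11, 14, 250, 13, 16, 12, 18, 14]   # one slow outlier
lag = freshness_lag_seconds(event_ts=1000.0, queryable_ts=1001.5)
p95 = latency_percentile_ms(latencies, 95)
qps = throughput_qps(query_count=12000, window_seconds=60)
print(lag, p95, qps)
```

Tracking a high percentile rather than the mean matters for user-facing workloads, because a single slow query is exactly what a user notices.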

Designing a User-Facing Analytics Solution

Now let’s explore a potential reference architecture for user-facing analytics.

The high-level solution architecture

The diagram above packs in various technologies, so we’ll break it down into three layers based on the direction of the data flow (from left to right).

Ingestion Layer

Data ingestion is the first layer of this architecture. It collects data from various sources and delivers it to the analytics data store to run queries later.

The data sources can include anything that users and the business plan to track and monitor for state changes, including change data streams from databases and transactional events from microservices and line-of-business applications.

Once captured from data sources, the data is ingested into a streaming data platform like Apache Kafka or Redpanda, allowing downstream components to process the data in real time. In the architecture diagram above, you might notice that Redpanda sits in the data pipeline rather than the sources writing directly to the analytics data store. There are several benefits to this approach. It:

  • Introduces decoupling. Data sources and the analytics data store are decoupled, allowing both parties to scale and evolve independently.
  • Eliminates synchronous writes and acts as a load leveler. Redpanda can absorb and buffer sudden traffic surges from data sources, preventing the analytics infrastructure from becoming overwhelmed.
  • Introduces publish-subscribe semantics. The data ingested into Redpanda can be routed to multiple destinations in real time, allowing concurrent processing. For example, the data can be synced into a data lakehouse for archival purposes while also feeding the analytics data store.
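
The buffering and publish-subscribe behavior described in the list above can be illustrated with a toy in-memory log. This is a deliberately simplified stand-in for a single-partition topic, not the Kafka API or Redpanda itself; the class, the consumer group names and the batch sizes are all hypothetical.

```python
from collections import defaultdict


class TopicLog:
    """Toy in-memory stand-in for a single-partition topic (not the Kafka API)."""

    def __init__(self):
        self._records = []                    # append-only log of records
        self._offsets = defaultdict(int)      # next offset to read, per consumer group

    def produce(self, record):
        # Producers only append; they never wait for any consumer.
        self._records.append(record)

    def poll(self, group, max_records):
        # Each group reads from its own offset, independently of other groups.
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch

    def lag(self, group):
        # Records buffered in the log that this group has not read yet.
        return len(self._records) - self._offsets[group]


log = TopicLog()
for i in range(10):                           # a sudden burst of 10 events
    log.produce({"event_id": i})

olap_batch = log.poll("olap-ingest", max_records=3)       # slower consumer
lake_batch = log.poll("lakehouse-sync", max_records=10)   # faster consumer
print(len(olap_batch), log.lag("olap-ingest"))        # 3 7
print(len(lake_batch), log.lag("lakehouse-sync"))     # 10 0
```

The burst is absorbed by the log, the slower consumer catches up at its own pace, and both destinations receive the same records without coordinating with each other or with the producer.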

Data sources produce data to Redpanda via the Kafka API. This means clients that already natively support the Kafka API can integrate with Redpanda without any changes to the codebase. They simply continue to produce messages as they would to a Kafka broker, but the endpoint would be a Redpanda broker instead.
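
A minimal producer sketch follows, assuming the `confluent-kafka` Python client is installed and a broker is reachable at `localhost:9092`. The topic name `usage-events` and the event fields are hypothetical. Because the client speaks the Kafka protocol, only the bootstrap address determines whether it talks to a Kafka or a Redpanda broker.

```python
import json


def encode_event(event: dict) -> tuple[bytes, bytes]:
    """Serialize an event into the (key, value) byte pair sent to the broker."""
    # Keying by user ID keeps one user's events in order on a single partition.
    key = str(event["user_id"]).encode("utf-8")
    value = json.dumps(event, separators=(",", ":")).encode("utf-8")
    return key, value


def publish(events, bootstrap_servers="localhost:9092", topic="usage-events"):
    """Send events to the broker (assumed address and topic name)."""
    # Imported lazily so this module loads even where the client is not installed.
    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": bootstrap_servers})
    for event in events:
        key, value = encode_event(event)
        producer.produce(topic, key=key, value=value)
        producer.poll(0)      # serve any pending delivery callbacks
    producer.flush()          # block until all queued messages are delivered


sample = {"user_id": 42, "bytes_used": 1048576, "app": "video"}
key, value = encode_event(sample)
print(key, value)
```

Calling `publish([sample])` would send the event; it is left out of the module-level code here because it needs a running broker.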

Redpanda Connect for Data Integration

What about the clients that don’t natively support the Kafka API? They can use Redpanda Connect to move data from those systems into Redpanda.

Alternatively, because Redpanda is compatible with the Kafka API, you could use other tools from the Kafka ecosystem, such as Kafka Connect.

The ingestion layer collects raw data from different sources and sinks it into Redpanda

Metrics Computation

Once the data arrives, it requires cleansing and transformations before it can be sent to the analytics data store.

For instance, the data may need to be transformed into a format that can be easily queried. This can involve enriching the data with additional context, normalizing values or converting data types. Additionally, you might need to filter out irrelevant data or handle missing or erroneous data points. In some cases, complex calculations may be required to create derived metrics that are more relevant for analysis.
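
The kinds of cleansing and transformation listed above might look like the following per-record function. It is a sketch under assumed field names (`user_id`, `bytes_used`, `ts`, `app`) and a hypothetical in-memory lookup table used for enrichment; a real pipeline would take its schema and reference data from the actual sources.

```python
from datetime import datetime, timezone

# Hypothetical reference data used to enrich events with the user's plan.
PLAN_BY_USER = {42: "unlimited-lite"}


def transform(raw: dict):
    """Return a cleaned, enriched record, or None if the record should be dropped."""
    try:
        user_id = int(raw["user_id"])                    # convert data types
        bytes_used = int(raw["bytes_used"])
        event_time = datetime.fromtimestamp(float(raw["ts"]), tz=timezone.utc)
    except (KeyError, TypeError, ValueError):
        return None                                      # missing or malformed fields
    if bytes_used < 0:
        return None                                      # erroneous data point
    return {
        "user_id": user_id,
        "app": str(raw.get("app") or "unknown").strip().lower(),   # normalize values
        "mb_used": bytes_used / 1_048_576,               # derived metric
        "event_time": event_time.isoformat(),
        "plan": PLAN_BY_USER.get(user_id, "unknown"),    # enrichment
    }


clean = transform({"user_id": "42", "bytes_used": "2097152", "ts": "1720420000", "app": " Video "})
dropped = transform({"user_id": "42", "bytes_used": "-5", "ts": "1720420000"})
print(clean, dropped)
```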

These transformations should be done in real time as data arrives from Redpanda to minimize processing latency. So, we’ll opt for a stream-processing engine over batch- and micro-batch-processing technologies. This stream processor can perform stateful aggregations, lookup joins for enrichment, stream-to-stream joins and window operations to transform the data into a format ready for analytics.

The data-processing layer pre-processes and computes the metrics and sends them downstream to the serving layer.
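
To make "stateful aggregation" and "window operations" concrete, here is a small sketch of a tumbling-window sum that is updated one event at a time. It is plain Python rather than any particular stream-processing engine, and it ignores concerns a real engine handles, such as late events, watermarks and fault-tolerant state. The event fields and the 60-second window are assumptions.

```python
from collections import defaultdict


class TumblingWindowSum:
    """Keeps a running sum of bytes per (user, window) as events arrive."""

    def __init__(self, window_seconds: int):
        self.window_seconds = window_seconds
        self.totals = defaultdict(int)        # state: (user_id, window_start) -> bytes

    def update(self, event: dict):
        # Tumbling windows are fixed-size and non-overlapping, so each event
        # belongs to exactly one window, identified by its start time.
        window_start = event["ts"] - event["ts"] % self.window_seconds
        key = (event["user_id"], window_start)
        self.totals[key] += event["bytes_used"]
        return key, self.totals[key]          # updated metric to emit downstream


agg = TumblingWindowSum(window_seconds=60)
events = [
    {"user_id": 42, "ts": 120, "bytes_used": 100},
    {"user_id": 42, "ts": 150, "bytes_used": 50},
    {"user_id": 42, "ts": 185, "bytes_used": 25},
    {"user_id": 7, "ts": 130, "bytes_used": 10},
]
for event in events:
    print(agg.update(event))
```

The first two events for user 42 fall into the window starting at 120 and are summed; the third opens a new window starting at 180.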

Analytical Data Store

The final stage of the data pipeline is the serving database. This is where we store the transformed data ready to be served to users.

From a technology perspective, we use a real-time online analytical processing (OLAP) database, like Apache Druid, Apache Pinot or ClickHouse, because these databases are designed to meet the user-facing analytics requirements we mentioned above.

Key reasons are:

  • Streaming data ingestion. Real-time OLAP databases natively ingest data from streaming sources, which keeps the served data fresh.
  • Ultra-low query latency. Efficient storage formats and indexing mechanisms let real-time OLAP databases process even complex queries quickly.
  • High query throughput. A distributed architecture allows effective data partitioning and parallel query processing, so these databases can serve a large number of concurrent users without slowing down or crashing, even under heavy load.

If we use Apache Pinot as the analytics data store, it can be configured to ingest data from the topic to which the stream processor writes the transformed data. Once ingested, the data is available in Pinot tables, allowing you to run SQL queries immediately. This approach allows users to access the generated insights while the data is still fresh and relevant.
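
As a rough idea of what that configuration involves, the sketch below builds the stream-ingestion fragment of a Pinot real-time table config as a Python dictionary and prints it as JSON. The property names are the Kafka stream settings as commonly shown in Pinot's documentation, but they vary between Pinot versions and should be verified against the version you run. The topic name and broker address are hypothetical, and a complete table config also needs a schema and further sections that are omitted here.

```python
import json


def build_stream_configs(topic: str, brokers: str) -> dict:
    """Build the stream-ingestion properties for a REALTIME table (partial config)."""
    return {
        "streamType": "kafka",                   # Redpanda is consumed via the Kafka API
        "stream.kafka.topic.name": topic,
        "stream.kafka.broker.list": brokers,
        "stream.kafka.consumer.type": "lowlevel",
        "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
        # Plugin class names depend on the Pinot and Kafka client versions in use.
        "stream.kafka.consumer.factory.class.name":
            "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
        "stream.kafka.decoder.class.name":
            "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
    }


stream_configs = build_stream_configs(topic="usage-metrics", brokers="localhost:9092")
print(json.dumps({"tableType": "REALTIME", "streamConfigs": stream_configs}, indent=2))
```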

Serving Layer

The last mile of the solution is the serving layer. This is where the metrics stored in the analytical database are served to the application’s users.

For user-facing applications, we can expose the analytics database through an API layer. These APIs can fetch specific metrics or aggregated data, which can then be displayed in the application’s user interface.

For example, Apache Pinot provides Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC) query interfaces, as well as language-specific drivers for languages such as Python. This lets frontend application developers build user-facing data products and dashboards with the technologies they are most comfortable with, without being exposed to the complexity of the data-processing layer.
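
The API layer can be as thin as a function that runs a parameterized SQL query and returns JSON-friendly rows. The sketch below uses Python's built-in `sqlite3` module purely as a runnable stand-in for the analytics database; the table name, columns and sample rows are hypothetical. Against a real OLAP store, the connection would come from that store's Python driver, and the placeholder syntax may differ.

```python
import sqlite3


def get_usage_by_app(conn, user_id: int, limit: int = 5):
    """Return a user's top apps by data consumed, as a list of dicts."""
    sql = """
        SELECT app, SUM(mb_used) AS total_mb
        FROM usage_metrics
        WHERE user_id = ?
        GROUP BY app
        ORDER BY total_mb DESC
        LIMIT ?
    """
    cursor = conn.cursor()
    cursor.execute(sql, (user_id, limit))     # parameters are bound, not interpolated
    return [{"app": app, "total_mb": total} for app, total in cursor.fetchall()]


# In-memory stand-in for the serving database, loaded with sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage_metrics (user_id INTEGER, app TEXT, mb_used REAL)")
conn.executemany(
    "INSERT INTO usage_metrics VALUES (?, ?, ?)",
    [(42, "video", 300.0), (42, "music", 120.0), (42, "video", 200.0), (7, "maps", 15.0)],
)

top_apps = get_usage_by_app(conn, user_id=42)
print(top_apps)   # [{'app': 'video', 'total_mb': 500.0}, {'app': 'music', 'total_mb': 120.0}]
```

Keeping the query behind a function like this is what shields frontend developers from the details of the underlying store.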

For internal audiences, such as product managers or data analysts, you can use data visualization tools that connect directly to the analytics database. These tools provide a detailed view of the data, allowing for deep dives and more complex analyses. This is particularly useful for understanding trends and patterns and making data-driven decisions.

Streaming Metrics Serving

Sometimes application clients can’t poll the serving layer but need real-time insights as they are processed. Examples of this include low-latency, real-time applications like stock trading desks, live sports stats, broadcasting apps and so on.

To enable these clients, we can bypass the analytics serving layer and stream directly from the processed insights topic in the streaming data platform. However, the stream should be delivered over a web-friendly streaming protocol such as WebSockets or Server-Sent Events (SSE).

These protocols have been designed with low latency and reliability in mind, which comes in handy when delivering fast-changing content from servers to clients. Furthermore, these protocols are supported and implemented by many web browsers and devices across the internet, reaching a wider audience.

The serving layer provides access to computed metrics/insights.
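
For the streaming path, the sketch below shows how records read from the processed insights topic could be framed as Server-Sent Events. It covers only the wire format (`id`, `event` and `data` fields followed by a blank line); the consumer that reads the topic and the HTTP server that writes these frames with a `text/event-stream` content type are left out, and the event name and sample records are hypothetical.

```python
import json


def sse_frame(data: dict, event: str = "metric", event_id=None) -> str:
    """Format one record as a Server-Sent Events frame."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")       # lets a reconnecting client resume
    lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}") # json.dumps emits a single line
    return "\n".join(lines) + "\n\n"          # a blank line terminates the frame


def stream_metrics(records):
    """Yield SSE frames for an iterable of records, using the position as the ID."""
    for offset, record in enumerate(records):
        yield sse_frame(record, event_id=offset)


records = [{"symbol": "ACME", "price": 101.5}, {"symbol": "ACME", "price": 101.7}]
frames = list(stream_metrics(records))
print(frames[0], end="")
```

SSE suits one-way server-to-client updates like these; WebSockets are the usual choice when the client also needs to send messages over the same connection.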

Conclusion

This architecture can be a handy reference for data engineers, data architects and IT professionals designing and implementing analytics solutions. It can also be useful for product managers or business decision-makers interested in understanding how insights from data can be made directly accessible to end users in a real-time, user-friendly format.

Our proposed solution delivers real-time insights to its users while preserving data freshness, minimizing latency and sustaining high query throughput. We mentioned a few technology choices in the solution, but you’re free to swap them with your preferred alternatives. The key takeaway is understanding the layered architecture and the data flow between layers. This knowledge will help you build a streamlined user-facing analytics solution that bridges the gap between your users and the data they need.

Dunith Danushka is a senior developer advocate at Redpanda Data where he creates developer-friendly content about building modern streaming data applications. He has over 10 years of experience designing, building and operating real-time, event-driven architectures and loves to share his...
TNS owner Insight Partners is an investor in: Real, ClickHouse.