URL: https://thenewstack.io/why-observability-needs-to-go-headless/

Why Observability Needs To Go Headless

By making long-term log data storage less costly, headless observability maximizes the value of telemetry data for answering key business questions.
Feb 18th, 2025 7:01am by Franz Knupfer
Featured image by Peshkova on Shutterstock.
Hydrolix sponsored this post.

Many enterprises generate terabytes of log data every day, resulting in high costs to ingest, store and analyze that data. Even worse, many observability platforms are walled gardens, making it hard to use log data for use cases beyond observability, such as business intelligence, data science and machine learning.

To solve both of these problems, it’s time for headless observability, a fresh approach that decouples the frontend (visualization, querying and analytics) from the backend (data ingestion and storage) — all while keeping operations simple.

A New Approach: Headless Observability

Headless observability combines two core concepts: headless architecture and the decoupled observability stack. With a headless approach, you can have multiple “heads,” or visualization and analytical tools, for your log and telemetry data. In addition to observability tools, your teams can also use cybersecurity, business intelligence and other analytical tools to maximize the value of that data. And all observability components (such as analytics and telemetry data collection) are decoupled instead of consolidated into a single observability platform.

Figure: A headless architecture can have multiple heads, including cybersecurity, data science/BI and observability.

While the term “headless observability” is new, the concepts around it have been established. Confluent has written about the concept of headless data architecture where multiple “heads” can be used to analyze data. As that article discusses, there are some critical differences between headless data architecture and traditional data lakes. Data isn’t limited to one centralized location (as is typically the case with a data lake), and any service should be able to access that data.

Meanwhile, StarTree’s Neha Pawar argued in The New Stack that observability needs to move toward a disaggregated stack where components such as storage, ingest and visualization are decoupled, resulting in greater flexibility and lower costs. While Pawar focuses on data disaggregation through the prism of observability, you can easily apply these concepts to data disaggregation in general, with certain observability components such as analytics added on (“headless observability”).

Ultimately, headless observability and disaggregation can solve similar problems, including reducing costs, increasing scalability and making telemetry data available for other use cases. However, the headless approach has advantages for organizations that don’t want to build a solution from the ground up. With headless, teams can simplify operations, use fewer engineering resources and take on less risk.

Going Headless: Simplifying Disaggregated Observability

Disaggregated observability is about breaking down the observability monolith into smaller, composable services, which can be a daunting task for teams. Headless observability, on the other hand, is about making that decoupling simple — in the same way a headless content management system (CMS) simplifies the process of creating digital experiences.

Pawar’s article shows that building a composable system is possible, but it involves using collection tools like OpenTelemetry, streaming pipelines like Apache Kafka or Apache Flink, and cloud object storage combined with data lake wrappers like Apache Iceberg. Teams can DIY real-time analytics and headless observability if they have the wherewithal and resources needed to build a solution from scratch, but there are risks with this approach.

To make a solution that works for observability and other use cases, you need to build a data lake that’s fast enough for real-time analytics and cost-effective enough for long-term storage. That includes a real-time streaming pipeline with ETL (extract, transform, load) to standardize and contextualize log data; a system for structuring, compacting and merging data in object storage; and analytics platforms that can perform queries and run dashboards with subsecond query latency.
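The "standardize and contextualize" step of such a pipeline can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the field aliases, the assumption that numeric timestamps are epoch seconds, and the `source` label are all hypothetical.

```python
import json
from datetime import datetime, timezone
from typing import Iterable, Iterator

# Hypothetical mapping from source-specific field names to one standard schema.
FIELD_ALIASES = {"ts": "timestamp", "time": "timestamp", "msg": "message", "lvl": "level"}


def standardize(record: dict) -> dict:
    """Rename aliased fields and normalize timestamp and level formats."""
    out = {FIELD_ALIASES.get(key, key): value for key, value in record.items()}
    ts = out.get("timestamp")
    if isinstance(ts, (int, float)):  # assumption: numeric timestamps are epoch seconds
        out["timestamp"] = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
    out["level"] = str(out.get("level", "info")).lower()
    return out


def contextualize(record: dict, source: str) -> dict:
    """Attach context about where the log line came from."""
    return {**record, "source": source}


def transform(lines: Iterable[str], source: str) -> Iterator[dict]:
    """Transform one line at a time, so records are emitted as soon as they arrive."""
    for line in lines:
        try:
            raw = json.loads(line)
        except json.JSONDecodeError:
            continue  # a real pipeline would route bad lines somewhere for inspection
        if isinstance(raw, dict):
            yield contextualize(standardize(raw), source)
```

Because `transform` is a generator, each record is available downstream immediately instead of waiting for a batch to finish, which is the property that keeps added latency low.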

Netflix and other huge enterprises that use log data in many different ways might have the resources to build these kinds of systems, along with the risk tolerance to troubleshoot them when they use more compute than expected (driving up costs) or deliver disappointing performance. Combining tools like Flink or Kafka with a high-performance data lake and real-time analytics is a complex project, and a lot can go wrong. Even for major enterprises, building a system from scratch often isn’t feasible, nor is it the best use of engineering resources.

Headless observability can be much simpler and require much less maintenance for teams that use mature, fully established solutions for both the frontend and the backend. It’s akin to using a content delivery network (CDN) to deliver assets for a headless content management system (CMS) instead of a full-stack web application where engineers have to manage every aspect of delivery, performance and reliability.

The key for headless observability is the backend: a storage solution optimized for logs that handles all the heavy lifting of ingesting, storing and preparing data for analysis.

What’s Required From a Storage Solution for Headless Observability

The cornerstone of headless observability is the storage system. At a high level, it needs to do the following:

  • Manage the backend: The storage solution should handle all aspects of the backend, from streaming to processing to efficiently storing data for analysis.
  • Maximize frontend interoperability: The solution must offer connectors that allow for data federation and querying without moving data. This includes interoperability with dashboard and visualization tools (decoupled dashboards) as well as analytics platforms.

Let’s break down these two attributes to a more granular level.

  • Object storage: In order to store petabytes of data cost-effectively, storage systems need to favor horizontally scalable and cheap object storage over expensive hardware options such as SSDs.
  • Real-time streaming: Real-time streaming is required for use cases like observability, but it doesn’t need to be “true” real time (on the order of milliseconds or nanoseconds). The system should be able to ingest daily loads at terabyte scale and make data available to analytical frontends in seconds. Typically, the data should be partitioned to maximize object storage throughput while staying within typical cloud provider rate limits.
  • Streaming ETL: Compounding the challenge of real-time streaming, an ETL process is typically required for log data. Logs should be standardized and often need additional context (such as information about their source). And this transformation process needs to happen with minimal additional latency.
  • Merge, indexing and compaction services: Subsecond analytical queries with object storage require significant storage optimization. This includes data compaction to minimize the amount of data that must be traversed and then transferred from object storage for queries; a merge service that organizes partitions so the query engine can apply optimizations like partition pruning; and comprehensive indexing strategies.
  • Optimized for log data: A solution used for observability should be optimized for log and event data, which means partitioning data by timestamp. However, solutions specifically optimized for time series are not a good fit for observability because they typically aggregate data, leaving operations teams without the granularity they need to diagnose the root cause of issues.
  • Compatible with dashboards and visualizations: These include tools like Grafana, Superset and even Kibana (which has traditionally only been compatible with the ELK stack), so teams can access log data using the tools that work best for them.
  • Compatible with observability and analytics platforms: This includes observability tools like Splunk as well as analytics platforms like Databricks. You should be able to query data from the storage solution without migrating all your data into a new platform.
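Two of the requirements above, partitioning by timestamp and partition pruning, can be illustrated with a small sketch. It assumes timezone-aware timestamps, an hourly partition layout and a plain dictionary standing in for object storage; none of these details come from a specific product.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def partition_key(ts: datetime) -> str:
    """Hourly partition prefix such as '2025/02/18/07' (the layout is an assumption)."""
    return ts.astimezone(timezone.utc).strftime("%Y/%m/%d/%H")


def write_events(events: list) -> dict:
    """Group events into hourly partitions; the dict stands in for object storage."""
    store = defaultdict(list)
    for event in events:
        store[partition_key(event["timestamp"])].append(event)
    return dict(store)


def prune(store: dict, start: datetime, end: datetime) -> list:
    """Return only the partition keys that can contain rows in [start, end)."""
    keys = []
    hour = start.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    while hour < end:
        key = partition_key(hour)
        if key in store:
            keys.append(key)
        hour += timedelta(hours=1)
    return keys


def query(store: dict, start: datetime, end: datetime) -> list:
    """Read only the pruned partitions, then filter rows by exact timestamp."""
    return [
        event
        for key in prune(store, start, end)
        for event in store[key]
        if start <= event["timestamp"] < end
    ]
```

Raw events are kept as-is inside each partition rather than aggregated, so a time-bounded query touches less data without losing the row-level detail needed for root cause analysis.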

The Benefits of Going Headless

Going headless makes analytics use cases — from observability to business intelligence — much easier. Here are the more general benefits of a headless approach:

  • Simplified log storage: Instead of managing many storage solutions, along with access and security for each of those systems, you can keep all of your log and event data in one solution.
  • Cost-effective storage for petabytes of data: Solutions that use object storage are more cost-effective, and you can negotiate cloud provider discounts for larger volumes of data.
  • Easier to secure and access: Because data is kept in one place, it’s easier to set up security, permissions and access.
  • No complex migrations or data duplicates needed: Instead of duplicating data in multiple places (such as with denormalization) or needing to perform complex, resource-intensive data migrations, you have a single source of truth.
  • Less risk and overhead compared to full DIY: Unlike a fully disaggregated system where engineering builds and optimizes all of the components, the headless approach is much quicker and easier to stand up.
  • Unlock the value of telemetry data: There’s no need for vendors that lock in telemetry data for observability and then discard it after a short period of time.

Let’s also take a look at a specific use case for headless observability.

Value of Unlocking Observability: A Use Case for Headless

Some operations teams and leaders might wonder whether it’s worth unlocking the value of log data for use cases beyond observability in the first place. After all, some services can generate huge volumes of data that don’t seem to have much use beyond immediate observability. As an example, why would a data science team want to analyze the logs of thousands of ephemeral Kubernetes containers?

Not all logs have long-term value, but that’s one of the advantages of headless observability and decoupled storage. Teams have the freedom and flexibility to determine which logs should be retained for longer periods. Web application firewall (WAF) and other security logs can be retained over the long term and made available to cybersecurity teams and threat hunters. Other application logs can provide long-term insights into how resources are being used for capacity planning and anomaly detection.
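One way to express that freedom is a per-log-type retention policy. The sketch below is hypothetical: the log type names and retention windows are illustrative placeholders, not recommendations or values from any product.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention windows per log type; the values are illustrative only.
RETENTION = {
    "waf": timedelta(days=730),           # security logs kept for threat hunting
    "application": timedelta(days=365),   # kept for capacity planning
    "k8s_container": timedelta(days=14),  # ephemeral containers, short-lived value
}
DEFAULT_RETENTION = timedelta(days=30)


def is_expired(log_type: str, written_at: datetime, now: datetime) -> bool:
    """Return True when a log of this type is older than its retention window."""
    return now - written_at > RETENTION.get(log_type, DEFAULT_RETENTION)
```

A cleanup job could call `is_expired` for each stored partition and delete only the log types whose window has passed, leaving long-lived security logs in place.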

Let’s take a closer look at a real, tangible use case where observability data can be valuable for other teams: real user monitoring (RUM). In the realm of observability, RUM allows teams to proactively monitor how end users are experiencing web applications. Issues like slow page loads can be mitigated before they frustrate users.

Beyond observability, RUM data can also provide insights into how your end users are interacting with your brand and your products. This data is invaluable for marketing, advertising and leadership teams that need to plan strategy. For enterprises generating terabytes of telemetry data that show exactly how users are interacting with their web properties, ensuring fast page load times is critical — but it shouldn’t be the only use case for that data.
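To make the idea of multiple "heads" concrete, the sketch below runs two different SQL queries against the same RUM table: one for an operations team and one for a marketing team. An in-memory SQLite database stands in for the shared log store, and the table name, columns and sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the shared log store; the rows are made-up samples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rum_events (page TEXT, load_ms INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO rum_events VALUES (?, ?, ?)",
    [
        ("/home", 420, "US"),
        ("/home", 3100, "DE"),
        ("/pricing", 650, "US"),
        ("/pricing", 700, "US"),
        ("/docs", 2900, "FR"),
    ],
)


def slow_pages(db: sqlite3.Connection, threshold_ms: int) -> list:
    """Observability head: pages whose slowest load exceeds the threshold."""
    return db.execute(
        "SELECT page, MAX(load_ms) FROM rum_events "
        "GROUP BY page HAVING MAX(load_ms) > ? ORDER BY page",
        (threshold_ms,),
    ).fetchall()


def views_by_page(db: sqlite3.Connection) -> list:
    """Business intelligence head: page popularity from the same rows."""
    return db.execute(
        "SELECT page, COUNT(*) AS views FROM rum_events "
        "GROUP BY page ORDER BY views DESC, page"
    ).fetchall()
```

Both functions read the same table without copying it, which is the point of the headless approach: the data stays in one place and each team brings its own query or tool.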

As a real-world example, many enterprises use CDN log data for real user monitoring. In the short term, monitoring CDNs is important for ensuring good user experiences and fast loading times of digital assets. However, being able to retain huge volumes of log data (including CDN data) long term and cost-effectively provides certain advantages to enterprises. For example, major streaming and media broadcasters that previously couldn’t retain log data long term due to costs are now analyzing that data to plan capacity, detect and mitigate stream piracy, and better understand how their end users interact with both live and on-demand streaming content.

For these enterprises, which often generate terabytes of CDN log data per day, even basic monitoring and observability use cases aren’t possible with traditional SaaS observability vendors because of the cost and scale. By using a headless observability approach for CDN logs (such as Hydrolix with Grafana dashboards), they’re able not just to unlock increased event observability and lower costs, but also to use that data for a wide range of other use cases. They also don’t need to build entire disaggregated observability solutions from scratch.

The Barriers to Headless Observability

The benefits of a headless approach are clear for users but potentially difficult for observability and other analytics platforms to implement and monetize. The traditional Software as a Service (SaaS) model of observability calls for storing log data within observability platforms for a short period of time and maximizing the usability and value of that data for operations teams. To that end, traditional platforms have built rich ecosystems of agents and connectors — but only for incoming data. Once the data is at rest, it’s essentially in a walled garden: challenging to migrate and not possible to federate.

Even if data federation were possible, the typical observability platform simply doesn’t store data long enough to maximize its usability with other analytics platforms, and it often uses a proprietary query language instead of SQL. To solve this problem, observability platforms need to make the transition to disaggregated storage, but many have incurred too much technical debt with their current infrastructure, making this transition difficult or even impossible.

As a result, traditional SaaS observability platforms aren’t well-suited for headless observability, leading to a gap in the market that solutions like Iceberg and Hydrolix can fill.

Finally, while a lower total cost of ownership is an immediate advantage of going headless, the longer-term benefits of democratizing and maximizing the value of telemetry data are still intangible for many enterprises. This isn’t a surprise — many teams are stuck answering short-term questions like, “How long should I keep this data?” and “How can I reduce costs?” Long-term retention of high volumes of telemetry data for use cases beyond observability hasn’t been possible for these enterprises.

Leaders and their teams should take the time to consider the future of their businesses, not just the present moment, by asking: “What business-critical questions could we answer if we could keep all this data long term, without worrying about high costs, and make it accessible to all our teams?”

Learn how Hydrolix can help you keep more data longer and more cost-effectively by maximizing the performance of disaggregated object storage.

The Hydrolix streaming data lake powers the industry’s fastest growing observability and security products, transforming the economics of managing high cardinality, high dimensionality log data.
Franz Knupfer is director of Content and Research at Hydrolix, a streaming data lake for log and event data. Prior to Hydrolix, he taught and was director of curriculum at a code school, and has also worked in the observability...
Confluent is also a sponsor of The New Stack.
TNS owner Insight Partners is an investor in: Databricks, Real.