URL: https://thenewstack.io/a-diy-framework-for-optimizing-observability-costs/

A DIY Framework for Optimizing Observability Costs - The New Stack


FinOps / Microservices / Observability

A DIY Framework for Optimizing Observability Costs

A look at the various technological and organizational factors that have driven up observability costs and what to do about them.
Mar 20th, 2024 8:58am by Chris Cooney
Featured image from Dovile Kuusiene on Shutterstock.
Coralogix sponsored this post.

Observability costs are exploding as businesses strive to deliver maximum customer satisfaction with high performance and 24/7 availability.

Global annual spending on observability in 2024 is well over $2.4 billion and is expected to reach $4.1 billion by 2028. At the level of an individual company, observability costs range from 10% to 30% of overall infrastructure spend.

These costs will undoubtedly rise as digital environments expand and become ever more complex. It’s imperative for cost-conscious companies to evaluate how best to reduce this cost while maintaining overall excellence in observability.

Let’s discuss why observability software is in such high demand, how to implement a DIY cost optimization approach and the criteria for selecting an off-the-shelf option that ensures observability costs stay as low as possible.

Why Is Observability So Damn Expensive?

The most obvious cause for the growth of observability costs is that businesses must cater to today’s consumers, who expect lightning-fast, on-demand, 24/7 access to anything digital. Monitoring system health is imperative for modern companies. But alongside that, various technological and organizational factors have driven up observability costs.

Let’s take a look at some of them:

Microservices

Microservices produce more observability data than an equivalent monolithic application. This is especially significant for trace data, which shows how data flows through the application across all of its interconnected interfaces. The more microservices exist, the more data there is, and the more complex the interdependencies become.

Ephemeral Servers

In the past, a server would run for years; but in our cloud-centric world, the ability to spin up servers on demand and the increased use of spot instances, along with the very nature of microservices and containerization, make ephemeral servers quite common. This, too, drives up the complexity of infrastructure and increases data volumes.

SRE and Chaos Engineering

Site reliability engineers (SREs) commonly use chaos engineering to test applications, purposely introducing failures to verify resilience. For example, SREs will destroy a server just to see how the system will respond. The resulting failures are not typically seen in normal day-to-day system behavior, so, once again, more observability data is generated to cover these test modes and scenarios.

Indexing and Hot Storage

As a result of the factors above, observability solutions must ingest and process enormous amounts of data so companies can understand where issues exist and ensure the health of their application or website is not compromised. This typically entails indexing data to speed up search and query operations, then storing the data in hot storage for frequent and fast retrieval. Both steps directly drive up observability costs, particularly because hot storage is extremely expensive.

Data Volume Is Not the Problem — Data Management Is

While some observability vendors will recommend limiting data ingestion to reduce costs, this strategy can hurt observability. It can lead to missed detection of production issues, loss of valuable data needed for root cause analysis and increased risk of noncompliance with various regulatory requirements.

Before we discuss how you can better manage your data and associated costs, let’s look at some eye-popping statistics about the data consumption of over 1,000 companies.

[Image: statistics on the data consumption of over 1,000 companies. Source: Coralogix]

Do-It-Yourself Observability

Taking the DIY approach might work best for many companies with experienced DevOps and SRE teams. Here is what you need to know when building DIY observability.

Start With the Right Framework for DIY Cost-Effective Observability

Given the complexity of data management, it’s easy to get lost in the details. However, to reduce your observability costs and keep them low, you just need to start with the right approach.

Reducing observability costs doesn’t need to be a big or complex consulting project. Here are the key steps to follow:

Determine How the Data Is Used

Here are three categories you can use to get organized:

  • Data you search on a daily basis
  • Data you use for dashboards and alerts but don’t frequently search
  • Data you keep for compliance purposes only

Many open source tools will give you some insight into what is being searched the most. For example, the Prometheus query logs can tell you which queries are running the most and, thus, which time series metrics are most important.
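As a rough illustration of mining a query log, the sketch below counts how often each query expression appears. It assumes the log is written as one JSON object per line with the expression stored under `params.query`; the sample entries are hypothetical and trimmed to the fields used here, so check the field names against your own log before relying on it.

```python
import json
from collections import Counter


def top_queries(log_lines, n=3):
    """Count how often each query expression appears in a JSON-lines query log."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        # Assumed layout: the expression lives under params.query.
        query = entry.get("params", {}).get("query")
        if query:
            counts[query] += 1
    return counts.most_common(n)


# Hypothetical sample entries, not real log output.
sample_log = [
    '{"params": {"query": "up == 0"}, "ts": "2024-03-20T08:00:00Z"}',
    '{"params": {"query": "rate(http_requests_total[5m])"}, "ts": "2024-03-20T08:00:05Z"}',
    '{"params": {"query": "up == 0"}, "ts": "2024-03-20T08:00:10Z"}',
    '{"params": {"query": "up == 0"}, "ts": "2024-03-20T08:00:15Z"}',
    'not json at all',
]

for query, count in top_queries(sample_log):
    print(f"{count:>3}  {query}")
```

The metrics referenced by the most frequent queries are good candidates for the "searched on a daily basis" category; metrics that never show up are candidates for cheaper storage.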

As you go, you may wish to expand on the above categories, as your organization undoubtedly has many different data usage scenarios. However, getting started with this basic categorization is essential as we will require it later.

Abandon the Pattern of Indexing Everything

A typical tendency with observability solutions is to index all ingested data in a tool like OpenSearch and then, over time, move it to less expensive storage options like S3. But not all ingested data will be used in fast searches, and 30% of the data is never used at all. Indexing is very expensive, so it should be limited to data that will be searched frequently.

This pattern is typical because it is easy to set up the flow. However, by defining use cases, teams can create a more intelligent data routing pattern that categorizes the data before determining what should be done with it.

Route Data to the Appropriate Storage

Once data use cases and statistics are in place, categorizing the data becomes more straightforward. The categorizations allow teams to understand which data needs to be queried quickly, which data will never be queried at all and everything in between. Based on the category, you may decide to route your data to be archived, stored in hot storage solid-state disks (SSDs), or perhaps an intermediate option like magnetic Amazon Elastic Block Store (EBS) volumes.

With this flow, only highly important, frequently searched data will be indexed and stored in expensive SSDs (hot storage). On the other hand, compliance data that doesn’t add operational value can be sent directly to inexpensive archive storage. Data required for intermittent usage can be stored in magnetic EBS volumes.
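A minimal sketch of such a routing rule follows. The category names, tier names and data sources are all hypothetical, and the mapping of dashboard and alert data to the intermediate tier is an assumption made for illustration; your own categories may call for different destinations.

```python
from enum import Enum


class Usage(Enum):
    SEARCHED_DAILY = "searched daily"
    DASHBOARDS_AND_ALERTS = "dashboards and alerts"
    COMPLIANCE_ONLY = "compliance only"


class Tier(Enum):
    HOT_SSD_INDEXED = "indexed, hot SSD storage"
    MAGNETIC_EBS = "magnetic EBS volume"
    ARCHIVE = "archive storage"


# Assumed mapping from usage category to storage tier.
ROUTES = {
    Usage.SEARCHED_DAILY: Tier.HOT_SSD_INDEXED,
    Usage.DASHBOARDS_AND_ALERTS: Tier.MAGNETIC_EBS,
    Usage.COMPLIANCE_ONLY: Tier.ARCHIVE,
}


def route(source, usage_by_source):
    """Return the storage tier for a data source.

    Sources with no recorded usage go to the archive (an assumption:
    uncategorized data is not indexed until someone shows a need for it).
    """
    usage = usage_by_source.get(source)
    return ROUTES.get(usage, Tier.ARCHIVE)


# Hypothetical data sources and their categories.
usage_by_source = {
    "checkout-service": Usage.SEARCHED_DAILY,
    "cdn-access-logs": Usage.DASHBOARDS_AND_ALERTS,
    "audit-trail": Usage.COMPLIANCE_ONLY,
}

for source in ["checkout-service", "cdn-access-logs", "audit-trail", "legacy-batch"]:
    print(f"{source} -> {route(source, usage_by_source).value}")
```

Keeping the mapping in one table makes the routing decision easy to review and to change as usage statistics evolve.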

Don’t Do the Reindexing Thing

Reindexing happens when data has already been moved to archive storage, but you need to access it again. For example, regulatory data may be regularly archived, but once a year, you need it to generate a report. Reindexing is very expensive, even though the data is eventually deleted from hot storage again. Further, operational queries slow down while this bulk data is being added back into the index.

As an alternative to this costly and inefficient reindexing, archived data should be saved in an easy-to-access, open format like Parquet or CSV. By doing this, the archive can be queried directly without indexing. This reduces your observability bill, but more importantly, it keeps historical and operational data separate and keeps operational data queries working quickly.
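To make "querying the archive directly" concrete, here is a small sketch that scans an archived CSV file with Python's standard library and filters rows on the fly, with no index involved. The column names and rows are made up for illustration, and an in-memory buffer stands in for a file fetched from archive storage.

```python
import csv
import io


def scan_archive(csv_file, predicate):
    """Yield rows from an archived CSV file that satisfy predicate.

    The file is read sequentially; nothing is loaded into an index.
    """
    for row in csv.DictReader(csv_file):
        if predicate(row):
            yield row


# Hypothetical archive contents; in practice this would be an open file.
archive = io.StringIO(
    "timestamp,service,level,message\n"
    "2023-01-05T10:00:00Z,billing,ERROR,payment declined\n"
    "2023-01-05T10:00:02Z,billing,INFO,payment accepted\n"
    "2023-06-11T14:30:00Z,auth,ERROR,token expired\n"
    "2023-09-01T09:15:00Z,billing,ERROR,gateway timeout\n"
)

billing_errors = list(
    scan_archive(archive, lambda r: r["service"] == "billing" and r["level"] == "ERROR")
)

for row in billing_errors:
    print(row["timestamp"], row["message"])
```

A full scan like this is slower than an indexed search, which is an acceptable trade for a report that runs once a year. A columnar format such as Parquet reduces the cost further because a reader can skip the columns a query does not need.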

Minimize Data Generation Where Possible

Stop producing unnecessary logs, traces and metrics. The categorization we’ve described will help you understand what data is useful and what is not.

Data needed for regulatory compliance or peace of mind should be put directly into low-cost archive storage. Most of the time this data will not be used, but it could be queried directly from the archive, as described in the previous section.

Convert Logs and Spans to Metrics

No rule says you need to ingest data in its original form. Logs are especially expensive to store due to their size, and not all fields in the log data are helpful. If a log has only a few useful fields, consider converting them into time series metrics and dropping the original log from storage. Metrics are small in comparison and less expensive to store. DevOps teams still receive the same insights because this data can still be indexed; there is just significantly less data to index, which optimizes the cost.
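The sketch below shows one way such a conversion might look: JSON log lines are collapsed into per-minute counts keyed by service and status, and everything else in the log is discarded. The field names (`ts`, `service`, `status`) and the sample lines are hypothetical, and `ts` is assumed to be an integer number of seconds.

```python
import json
from collections import Counter


def logs_to_counts(log_lines, window_seconds=60):
    """Collapse JSON log lines into counts keyed by (window_start, service, status)."""
    counts = Counter()
    for line in log_lines:
        entry = json.loads(line)
        # Round the timestamp down to the start of its time window.
        window_start = entry["ts"] - entry["ts"] % window_seconds
        counts[(window_start, entry["service"], entry["status"])] += 1
    return counts


# Hypothetical log lines; the free-text "msg" field is dropped by the conversion.
sample_logs = [
    '{"ts": 120, "service": "checkout", "status": 200, "msg": "order placed"}',
    '{"ts": 130, "service": "checkout", "status": 200, "msg": "order placed"}',
    '{"ts": 175, "service": "checkout", "status": 500, "msg": "db timeout"}',
    '{"ts": 185, "service": "checkout", "status": 200, "msg": "order placed"}',
]

metrics = logs_to_counts(sample_logs)
for (window_start, service, status), count in sorted(metrics.items()):
    print(window_start, service, status, count)
```

Four log lines become three small data points here; with real traffic, thousands of lines per minute would collapse into a handful of counters.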

One exception to metrics being low cost to store is when they are high cardinality. These metrics have a label with many distinct values, such as a label holding client IP addresses in a system that supports millions of users. Each distinct label value produces its own time series that must be stored and queried. This slows your queries, increases costs and results in longer-lasting outages. Metrics generally work better as many separate, simple time series than as a single metric carrying a large number of high-cardinality, high-dimensionality labels.

To avoid high cardinality, teams can aggregate metrics to reduce labels, remove unnecessary labels or generate smaller metrics with lower cardinality. These actions will help reduce costs and are critical to keep performance standards high.
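As an illustration of removing a label through aggregation, the sketch below sums samples after dropping a hypothetical `client_ip` label, so that many per-client series collapse into one series per endpoint. The label names and values are invented, and summing is assumed to be the right aggregation, which holds for counters but not for every metric type.

```python
from collections import defaultdict


def drop_labels(samples, labels_to_drop):
    """Sum sample values after removing the given labels from each series."""
    aggregated = defaultdict(float)
    for labels, value in samples:
        # The remaining labels, in a stable order, identify the aggregated series.
        kept = tuple(sorted((k, v) for k, v in labels.items() if k not in labels_to_drop))
        aggregated[kept] += value
    return dict(aggregated)


# Hypothetical request counts, one series per (endpoint, client_ip) pair.
samples = [
    ({"endpoint": "/login", "client_ip": "10.0.0.1"}, 1.0),
    ({"endpoint": "/login", "client_ip": "10.0.0.2"}, 2.0),
    ({"endpoint": "/pay", "client_ip": "10.0.0.1"}, 3.0),
    ({"endpoint": "/pay", "client_ip": "10.0.0.3"}, 4.0),
]

reduced = drop_labels(samples, {"client_ip"})
print(f"series before: {len(samples)}, after: {len(reduced)}")
for labels, total in sorted(reduced.items()):
    print(dict(labels), total)
```

With millions of distinct clients, the series count before aggregation grows with the number of clients, while the count after aggregation stays bounded by the number of endpoints.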

Off-the-Shelf Observability

Sometimes the operational overhead of managing your own observability solution is too high. This overhead can divert your team’s focus and burden them with laborious maintenance of your observability stack and its underlying infrastructure. If you are considering a managed observability solution, the following sections provide some general guidance.

What To Look For in an Observability Vendor

When looking at SaaS observability options, cost optimization will look different depending on the provider, its architecture and how insights are generated in its proprietary system.

Here are some tips for choosing a cost-efficient solution.

Ask the Right Questions About Cost

For consumers using SaaS observability providers, there should be a way to optimize costs in each system. Whether you have already integrated with a provider or are choosing one for the first time, be sure to ask about cost optimization in a specific way. Ask: “What tools do you offer for customers to optimize costs?”

The vendor’s answer will shed light on what tooling it has invested in and built, rather than putting the cost optimization onus directly on you, the consumer. Since customers don’t have as much control over the data flow when using a third-party solution, a typical response to cost reduction is simply to reduce data volume. As we have discussed, that is not the best option: You can lose insight into your software system’s health, and significant engineering time is required to implement these reductions (further increasing your costs). So if actual cost optimization tools aren’t built into the offering, the provider likely does not want you to optimize your costs and, as such, should be avoided.

Understand the Vendor’s Pricing Model

This might take more effort, but it’s critical to read the fine print. If a vendor has many bundled services that force you to buy features you don’t need, per-host pricing that doesn’t differentiate on host size, or lots of different and nonstandardized fees for every feature, you should think twice.

Look for a vendor that provides straightforward, clear pricing so you can easily estimate costs and avoid costly overages.

Coralogix is a modern, full-stack observability platform specializing in comprehensive monitoring of logs, metrics, traces, and security events. Coralogix’s unique architecture powers in-stream analytics without reliance on indexing or hot storage, reducing the total cost of ownership by up to 70%.
Chris Cooney is the developer advocate for Coralogix, and is passionate about all things observability, organizational leadership and cutting-edge engineering.
TNS owner Insight Partners is an investor in: Shelf.