URL: https://thenewstack.io/large-language-model-observability-the-breakdown/

Large Language Model Observability: The Breakdown - The New Stack


The LLM stack brings a different set of metrics than your team usually tracks. In this Makers episode, co-host Janakiram MSV identifies the new "golden signals."
Mar 28th, 2024 1:46pm by Alex Williams
Getting the most out of a large language model is the point of LLM observability.

“In the last 12 months or so, there is a new stack that has evolved,” noted Janakiram MSV, an independent analyst and frequent contributor to The New Stack, who joined me as co-host for this episode of The New Stack Makers.

“And that is the LLM stack, which has multiple pieces of the puzzle like the large language model, the vector databases, the embedding model, the retrieval systems, the reranker models, and it’s a whole new ecosystem. So making sure that we are monitoring the golden signals that come out of this new stack and making sure that we are getting what we want out of the system is primarily the objective of LLM observability.”

But what is the goal?

“For folks familiar with DevOps- and SRE-based metrics, they already know what infrastructure observability is,” MSV said. “The goal of any observability mechanism is to make sure we have insights into a system. So in infrastructure observability, we look at four golden signals, which are called MELT: metrics, events, logs, and traces.

“Now, if I am a systems administrator or an Ops guy, I am responsible for measuring these four metrics and keeping an eye on them to ensure my systems are delivering the uptime, which is 99.9%,” or whatever the service-level agreement is.
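
To make the uptime figure concrete, here is a minimal sketch that converts an availability target into the downtime it permits. The function name and the 30-day window are assumptions chosen for illustration; they do not come from the episode.

```python
def downtime_budget_minutes(sla_percent: float, window_days: int = 30) -> float:
    """Return the minutes of downtime an availability target permits in the window."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    window_minutes = window_days * 24 * 60
    # The permitted downtime is the share of the window not covered by the target.
    return window_minutes * (1 - sla_percent / 100)


budget = downtime_budget_minutes(99.9)
print(f"{budget:.1f} minutes per 30 days")  # prints "43.2 minutes per 30 days"
```

Under these assumptions, a 99.9% target leaves roughly 43 minutes of downtime in a 30-day month.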

He continued, “Very similar to this, the LLM also has certain metrics entirely different from what we have been tracking for infrastructure, which we will do a deep dive on.”

MSV detailed the critical aspects of LLM observability. He broke it down by starting with the overall GenAI stack, which has several sub-topics, including:

  • GPU.
  • CPU.
  • Storage and vector database.
  • Model serving and model usage.
  • The chains and agents in the application.

Other topics covered in this episode included hallucinations, span traces, relevance, retrieval models, latency, usage, monitoring, and user feedback.
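
As a rough illustration of what a span trace over an LLM request might capture, the sketch below times named stages with a context manager. The `Trace` class and the stage names are hypothetical, and the `time.sleep` calls only stand in for real retrieval and generation work; a production setup would more likely rely on a tracing library.

```python
import time
from contextlib import contextmanager


class Trace:
    """Collects (name, duration_seconds) pairs for the stages of one request."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def span(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the span even if the stage raised an exception.
            self.spans.append((name, time.perf_counter() - start))


trace = Trace()
with trace.span("retrieval"):
    time.sleep(0.01)  # stand-in for a vector search
with trace.span("generation"):
    time.sleep(0.02)  # stand-in for a model call

for name, seconds in trace.spans:
    print(f"{name}: {seconds * 1000:.1f} ms")
```

Per-stage durations like these are what make it possible to tell whether latency comes from retrieval or from the model itself.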

First, MSV said, it is important to examine the overall stack that an enterprise may use on-premises to understand LLM observability.

Accelerated computing sits at the bottom layer, which contains high-end CPUs and GPUs. Monitoring their usage shows whether the infrastructure resources are oversubscribed or undersubscribed.
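
One simple way to act on that check is to average utilization samples per device and compare the result against thresholds. The sketch below assumes samples are already collected as fractions between 0 and 1; the device labels and the 30% and 85% thresholds are hypothetical and would need tuning for a real cluster.

```python
from dataclasses import dataclass


@dataclass
class UtilizationSample:
    device: str         # hypothetical label such as "gpu0"
    utilization: float  # fraction of capacity in use, 0.0 to 1.0


def classify(samples, low=0.30, high=0.85):
    """Label each device by its mean utilization across the samples."""
    totals = {}
    for sample in samples:
        count, total = totals.get(sample.device, (0, 0.0))
        totals[sample.device] = (count + 1, total + sample.utilization)

    labels = {}
    for device, (count, total) in totals.items():
        mean = total / count
        if mean > high:
            labels[device] = "oversubscribed"
        elif mean < low:
            labels[device] = "undersubscribed"
        else:
            labels[device] = "balanced"
    return labels


samples = [
    UtilizationSample("gpu0", 0.95),
    UtilizationSample("gpu0", 0.90),
    UtilizationSample("gpu1", 0.10),
    UtilizationSample("gpu1", 0.20),
    UtilizationSample("cpu0", 0.50),
]
print(classify(samples))
```

Mean utilization is only a proxy here; queue depth or pending requests would be a more direct signal of oversubscription.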

“And we already have enough mechanisms to track that,” MSV said. “So that is the first layer. The second layer is the storage layer, which will be your model catalog or the model garden. Now, this needs to be in sync with an external model provider like Hugging Face, because that’s where you’re going to pull the models from.”

It is important to check for the most up-to-date models, MSV said.

“And then there is the vector database,” he said. “The vector database contains the embeddings and the vectors of your ground truth. Keeping that always highly available is very critical. So you need to treat that the way you treat your Postgres or MySQL, or any other database, and ensure uptime of the vector database.”
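
Treating the vector database like any other database means measuring its availability the same way. The sketch below assumes a health probe already runs on a schedule and records each outcome as a boolean; the function names and the 0.999 target are illustrative assumptions, not part of any vector database's API.

```python
def availability(probe_results):
    """Return the fraction of health probes that succeeded."""
    if not probe_results:
        raise ValueError("at least one probe result is required")
    return sum(1 for ok in probe_results if ok) / len(probe_results)


def meets_target(probe_results, target=0.999):
    """Check measured availability against an assumed uptime target."""
    return availability(probe_results) >= target


# 9,995 successful probes and 5 failures out of 10,000.
probes = [True] * 9995 + [False] * 5
print(f"availability: {availability(probes):.4f}")
print(f"meets target: {meets_target(probes)}")
```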

The inference engine sits at the third layer, combining the model-serving environment and the API server.

MSV went into considerable depth in this episode. We concluded by looking at the companies in the LLM observability space, including Arize.ai (Phoenix), Datadog, Dynatrace, LangChain (LangSmith), New Relic, SigNoz, and TruEra.

We’ll explore these different companies in an upcoming episode of The New Stack Makers.

Alex Williams is founder and publisher of The New Stack. He's a longtime technology journalist who did stints at TechCrunch, SiliconAngle and what is now known as ReadWrite. Alex has been a journalist since the late 1980s, starting at the...