URL: https://thenewstack.io/avoiding-the-ai-agent-reliability-tax-a-developers-guide/


Avoiding the AI Agent Reliability Tax: A Developer’s Guide

An unreliable agent can trigger operational breakdowns, legal exposure and reputational damage. Learn how to reduce your risks.
Aug 26th, 2025 9:00am by Drew Robb
Featured image by The New York Public Library on Unsplash.
Salesforce Developers sponsored this post.

Interest in generative AI (GenAI) is transitioning from developing models to creating agents that autonomously perform a broad range of tasks. But trouble awaits those unleashing the capabilities of agentic AI without a firm grip on end-to-end reliability.

“Unreliable agents are not simply inefficient; they represent a significant source of operational, financial, legal and reputational risk,” Mohith Shrivastava, principal developer advocate at Salesforce, told me in an interview. “With agentic AI being deployed at scale, reliability becomes the central architectural principle.”

Avoiding the AI Agent ‘Reliability Tax’

An unreliable agent introduces far more than mere inefficiency. It is a liability that can trigger operational breakdowns, legal exposure and reputational damage. Shrivastava calls this the “reliability tax.”

Too many of today’s AI applications and agentic AI deployments are brittle and inconsistent, and they demand constant oversight. As a result, organizations must keep investing in guardrails, retrieval pipelines, monitoring, governance and security hardening to fix unforeseen issues and prevent inaccuracies and hallucinations.

“We’ve moved from deterministic automation — where a system executes preprogrammed rules — to probabilistic autonomy, where agents perceive, reason and act on their own,” said Shrivastava. “This brings incredible potential but also introduces entirely new failure modes.”

The 5 Pillars of Agentic AI Success

He stresses that reliability is multidimensional and rests on five pillars:

  • Predictability: Consistent actions within defined bounds.
  • Fidelity: Accuracy grounded in verifiable sources.
  • Controllability: Following explicit instructions and constraints.
  • Robustness: Resilience in messy or adversarial conditions.
  • Safety and security: Avoiding harm and resisting malicious exploitation.

Many designers do well on some of these pillars. But each one is essential. If one fails, cascading breakdowns are inevitable.

Preventing Scope Creep and Hallucinations

Adherence to sound reliability principles must be balanced with the need to avoid scope creep. Shrivastava recommends beginning with a strategic scope definition before you build. That definition can then be enforced with:

  • Zero trust identity and access control
  • Tool use allowlists
  • Human-in-the-loop checkpoints
  • Logging and monitoring
  • Emergency kill switches
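As a rough illustration of how several of these controls might sit in front of an agent's tool calls, here is a minimal Python sketch. Everything in it (the `ToolGate` class, the `deny_all` approver, the tool names) is hypothetical and invented for this example; it is not drawn from Agentforce or any other product, and it leaves zero trust identity checks out entirely.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, Dict, Set

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent.gate")


class ToolCallDenied(Exception):
    """Raised when a guardrail blocks a tool call."""


def deny_all(name: str, args: dict) -> bool:
    """Default human approver (hypothetical): reject everything."""
    return False


@dataclass
class ToolGate:
    tools: Dict[str, Callable[..., object]]        # registered tool implementations
    allowlist: Set[str]                            # tools the agent may call at all
    needs_approval: Set[str] = field(default_factory=set)  # human-in-the-loop tools
    approver: Callable[[str, dict], bool] = deny_all
    killed: bool = False                           # emergency kill switch state

    def kill(self) -> None:
        """Engage the kill switch; every later call is refused."""
        log.warning("kill switch engaged")
        self.killed = True

    def call(self, name: str, **args):
        """Run a tool only if every guardrail passes; log each decision."""
        if self.killed:
            raise ToolCallDenied("kill switch engaged")
        if name not in self.allowlist or name not in self.tools:
            log.warning("blocked tool=%s args=%s", name, args)
            raise ToolCallDenied(f"tool not allowed: {name}")
        if name in self.needs_approval and not self.approver(name, args):
            log.warning("approval refused tool=%s args=%s", name, args)
            raise ToolCallDenied(f"human approval refused: {name}")
        log.info("calling tool=%s args=%s", name, args)
        return self.tools[name](**args)
```

The design choice here is deny by default: a tool that is not explicitly allowlisted, or a sensitive tool without an explicit approval, never runs.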

Hallucination can be addressed with techniques like Retrieval-Augmented Generation (RAG) without requiring model retraining. By grounding responses in retrieved sources, RAG helps reduce both faithfulness errors (contradicting the supplied context) and factuality errors (contradicting reality).
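To make the RAG idea concrete, the sketch below retrieves the most relevant passages for a question and builds a prompt that tells the model to answer only from them. It is a toy under stated assumptions: the word-overlap scoring stands in for the embedding search a real system would likely use, and the function names and prompt wording are invented for this example.

```python
import re
from typing import List, Tuple


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: List[str], k: int = 2) -> List[Tuple[int, str]]:
    """Return up to k (index, doc) pairs ranked by word overlap with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(d)), i, d) for i, d in enumerate(docs)]
    scored = [s for s in scored if s[0] > 0]       # drop docs with no overlap
    scored.sort(key=lambda s: (-s[0], s[1]))       # best score first, stable on index
    return [(i, d) for _, i, d in scored[:k]]


def build_grounded_prompt(query: str, docs: List[str], k: int = 2) -> str:
    """Build a prompt that restricts the model to the retrieved sources."""
    hits = retrieve(query, docs, k)
    if hits:
        sources = "\n".join(f"[{i}] {d}" for i, d in hits)
    else:
        sources = "(no relevant sources found)"
    return (
        "Answer using only the sources below and cite them by number. "
        "If the sources do not contain the answer, say you do not know.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )
```

The resulting string would be sent to whatever model the agent uses. The explicit instruction to admit ignorance targets faithfulness errors, while the retrieved sources themselves are what guard against factuality errors.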

Moving Beyond Prompt Engineering

Prompt engineering techniques such as chain-of-thought and self-consistency are intended to make agents follow commands more dependably. For true instruction adherence, however, developers must embrace context engineering. Prompt engineering already goes well beyond simple prompting by carefully considering context and structure. Context engineering goes further: it architects the full context through a rigorous, iterative process that refines instructions until they achieve the desired result.

“Context engineering is the art and science of giving an AI agent the right information, the right tools and the right instructions, so that the agent is able to accomplish the given goal. Think of context as the agent’s runtime ‘RAM’ — the prompt, instructions, retrieved data and history,” said Shrivastava. “Overload it, poison it or create conflicts, and reliability suffers.”

Developers need tools that put context engineering into practice. Such tools must be able to define topics that capture the exact tasks to be accomplished so that the AI agent understands the scope, triggers and desired outcomes for each scenario. These topics provide the structure for when and how the agent should act, ensuring that responses remain relevant and aligned to business goals.
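One way to picture a topic is as a small record holding a scope, a set of triggers and instructions, plus a router that decides which topic (if any) a message belongs to. The sketch below is hypothetical throughout: the `Topic` fields and the keyword-based `route` function are assumptions made for illustration, and a production platform would more plausibly classify messages with a model than with substring matching.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Topic:
    name: str                   # short identifier for the topic
    scope: str                  # what the agent may do under this topic
    triggers: Tuple[str, ...]   # lowercase keywords that signal the topic
    instructions: str           # guidance injected into the agent's context


def route(message: str, topics: List[Topic]) -> Optional[Topic]:
    """Pick the topic with the most trigger hits, or None if out of scope."""
    text = message.lower()
    best, best_hits = None, 0
    for topic in topics:
        hits = sum(1 for trigger in topic.triggers if trigger in text)
        if hits > best_hits:
            best, best_hits = topic, hits
    return best


TOPICS = [
    Topic(
        name="order_status",
        scope="Look up and report the status of existing orders.",
        triggers=("order", "tracking", "delivery"),
        instructions="Ask for the order number before looking anything up.",
    ),
    Topic(
        name="returns",
        scope="Explain the return policy and open return requests.",
        triggers=("return", "refund"),
        instructions="Never promise a refund amount; quote the policy.",
    ),
]
```

Returning `None` for unmatched messages is deliberate: it gives the agent an explicit out-of-scope signal instead of letting it improvise.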

From there, added Shrivastava, agents should be equipped with effective tools to accomplish a given goal. Memory can then be efficiently managed by summarizing ongoing conversations and reusing that context through prompt templates, conversation variables or context variables. As a result, diligent prompt engineering can refine the agent’s behavior within the framework of topics, instructions and scopes — and retrieval via RAG can dynamically pull in relevant data to deliver precise, context-aware responses while keeping the context window optimized.
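The memory-management step can be sketched as a function that folds older turns into a summary once the conversation exceeds a budget. This is only an illustration: the word count is a crude stand-in for a real tokenizer, `naive_summary` is a placeholder for a model-generated summary, and all names are invented for this example.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


def naive_summary(turns: List[str]) -> str:
    """Placeholder summarizer: keep the first eight words of each turn."""
    clipped = [" ".join(t.split()[:8]) for t in turns]
    return "Summary of earlier conversation: " + " | ".join(clipped)


def compact_history(
    turns: List[str],
    budget: int,
    keep_recent: int = 2,
    summarize: Callable[[List[str]], str] = naive_summary,
) -> List[str]:
    """Replace older turns with one summary when the history exceeds budget."""
    total = sum(count_tokens(t) for t in turns)
    cut = len(turns) - keep_recent        # index where the recent turns start
    if total <= budget or cut <= 0:
        return list(turns)                # nothing to compact
    older, recent = turns[:cut], turns[cut:]
    return [summarize(older)] + recent
```

Note that this sketch does not guarantee the compacted history fits the budget; a real implementation would re-check the size and summarize again or truncate.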

“Enterprises need a platform that provides all of the tools to accomplish the many aspects of context engineering in a way that is simple to manage,” said Shrivastava. “This must include built-in guardrails and governance that evaluates how well agents interpret topic instructions when generating responses.”

With all of these steps in place, it remains critical that agent performance is closely monitored and measurable. Developer tools, therefore, should provide deep observability, live health monitoring, consumption tracking and rich adoption analytics to support the validation and steady improvement of outcomes. For example, service agents configured with Agentforce from Salesforce have a feature to report the percentage of resolved conversations, escalations and abandoned conversations. Similarly, sales agents that come out of the box with Agentforce have analytics to report how sales revenue is impacted by the agent.
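The conversation-level metrics mentioned above reduce to simple arithmetic over outcome labels. The sketch below assumes each conversation has already been labeled with one of three hypothetical outcomes; it does not reflect how Agentforce computes or names its own analytics.

```python
from collections import Counter
from typing import Dict, Iterable

OUTCOMES = ("resolved", "escalated", "abandoned")  # assumed label set


def outcome_rates(outcomes: Iterable[str]) -> Dict[str, float]:
    """Return the percentage of conversations ending in each outcome."""
    counts = Counter(outcomes)
    unknown = set(counts) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return {o: 0.0 for o in OUTCOMES}   # avoid dividing by zero
    return {o: round(100.0 * counts[o] / total, 1) for o in OUTCOMES}
```

Tracking these rates over time, rather than as a single snapshot, is what lets a team see whether a change to prompts, topics or tools actually improved outcomes.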

The Right Platform and the Right Tools = Reliability

For enterprise agentic AI, reliability is no longer an optional feature. It has become a fundamental part of any architecture. This shift is necessary because agents now operate on probabilistic autonomy rather than deterministic scripts.

Achieving agent reliability requires a disciplined, end-to-end approach that goes beyond simply using the model. According to Shrivastava, it involves:

  1. Context engineering: Carefully defining the agent’s scope, actions, memory, prompts and RAG usage.
  2. Tight governance: Implementing strict controls like zero trust security, approved action lists (allowlists), human oversight (human-in-the-loop, or HITL) and comprehensive logging.
  3. Continuous evaluation: Constantly monitoring and testing the agent’s performance in real-world scenarios.
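The third item, continuous evaluation, can be approximated with a small scenario harness that replays fixed inputs against an agent and reports where its behavior deviates. The sketch below is a minimal illustration under stated assumptions: `Scenario`, `AgentResult` and `stub_agent` are hypothetical names, and the stub stands in for a real agent call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AgentResult:
    reply: str                          # text returned to the user
    tool_called: Optional[str] = None   # name of the tool used, if any


@dataclass
class Scenario:
    name: str
    user_input: str
    expected_tool: Optional[str]        # None means no tool should be called
    must_contain: str = ""              # phrase the reply has to include


def evaluate(agent: Callable[[str], AgentResult],
             scenarios: List[Scenario]) -> List[str]:
    """Run every scenario and return a list of human-readable failures."""
    failures = []
    for s in scenarios:
        result = agent(s.user_input)
        if result.tool_called != s.expected_tool:
            failures.append(
                f"{s.name}: expected tool {s.expected_tool!r}, "
                f"got {result.tool_called!r}"
            )
        if s.must_contain.lower() not in result.reply.lower():
            failures.append(f"{s.name}: reply missing {s.must_contain!r}")
    return failures


def stub_agent(text: str) -> AgentResult:
    """Stand-in agent used only to exercise the harness."""
    if "refund" in text.lower():
        return AgentResult("I have escalated your refund request.",
                           "escalate_to_human")
    return AgentResult("I can help with order questions.")
```

Running a harness like this on every change, and again on sampled production traffic, catches tool-use and instruction-following regressions before users do.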

“Organizations must plan for the ongoing cost of maintaining guardrails, data pipelines, testing and observability,” said Shrivastava. “Platforms that provide built-in tools for evaluation, RAG and performance analytics can help reduce this cost and enable the development of more advanced, self-correcting AI systems.”

Agentforce from Salesforce has built-in tools to help enterprises deploy agents at scale. Its Testing Center, for example, lets teams run scenario-based, dataset-driven evaluations (including synthetic test cases) in a sandbox before going live. This lets them surface instruction-following failures and tool-use errors early, shrinking the reliability tax. Agentforce provides all of the tools needed to implement autonomous AI agents at scale while adding the guardrails, governance and control called for in enterprise operations.

For more information, visit Agentforce.

Salesforce helps organizations reimagine their business with AI. Agentforce, the first digital labor solution for enterprises, seamlessly integrates with Customer 360 apps, Data Cloud, and Einstein AI to create a limitless workforce, bringing humans and agents together to deliver customer success.
Drew Robb has been a full-time freelance writer reporting on all areas of IT and energy for more than 25 years. He has written thousands of articles for The New Stack, Information Week, Computerworld, eWeek, Tech Republic, Data Center Knowledge,...