
URL: https://thenewstack.io/ai-alignment-in-practice-what-it-means-and-how-to-get-it/

AI Alignment in Practice: What It Means and How to Get It - The New Stack


AI Alignment in Practice: What It Means and How to Get It

LLMs' wide and variegated training process makes them inherently misaligned with any specific scope and standards. Here’s what you can do about it.
Jan 28th, 2025 5:02am by Yam Marcovitz and Emily Omier
Featured image by Ayush Kumar for Unsplash+.
Parlant by Emcie sponsored this post.

When we talk about artificial intelligence, what we’re really talking about is using computers to automate human intelligence. When we talk about AI alignment and misalignment in practice, we’re really talking about whether the AI application we’re working with acts in alignment with our needs and expectations.

Alignment issues happen in purely human interactions as well. For example, a customer service rep who is given outdated or insufficient training material is likely to give customers inaccurate or wrongly extrapolated information, much like an AI agent trained on outdated or incomplete documents. The difference is that most managers have a fairly good idea of how to correct or otherwise deal with their human agents.

But with AI agents, it’s not always so simple. That’s precisely because large language models (LLMs) are not human, nor is the process of training them.

We Can’t Settle for 70% Accuracy

When humans talk to each other, we intuitively convey a huge number of contextual clues. Beyond these implicit exchanges, we also share a large and mostly unappreciated amount of context with the people in our everyday circles. We have a sense of how to deal with the people and situations around us. Most importantly, we each do it a little differently due to divergences in opinions, needs and circumstances.

But LLMs have to be explicitly provided with our context and directed on how we wish to approach the critical scenarios that arise within it. Otherwise, misalignment is practically guaranteed. A related problem is that LLMs often start to lose focus and behave unpredictably when given more than a few instructions. They also have trouble prioritizing and resolving subtle conflicts between instructions unless they are explicitly trained to do so.

In addition to the challenge of aligning an LLM agent with our own intentions, there is also the problem of getting the agent to align with customers and their context. This is where most AI frameworks today fall short, as they’re built around recognizing and reacting to a user’s so-called intent.

The problem with this approach is threefold. First, even in real-life interactions, intent can take time and help to clarify. Second, a user could have multiple simultaneous intents that should be tied together in a response. Third, there are situations where, instead of reacting directly to the user’s perceived intent, you would want the AI application to counter with guiding questions or otherwise divert the flow of the conversation. An intent-based approach, while simpler to reason about from an engineering standpoint, isn’t well suited to these situations.

Because of these challenges, generative AI applications that get it right even 70% of the time are often portrayed as a success. But, especially in a customer-facing situation, that standard is ridiculously low. It leaves brands open to both reputational and legal risk, especially if they operate in regulated industries.

Common AI Misalignments

Before we dive into how to fix alignment issues, let’s first address the types of misalignment in AI applications.

Factual Misalignment

When a generative AI application hallucinates, offering made-up context, information or services, that is a type of factual misalignment. A simple yet common example is a bank’s client asking its AI agent, “What are my limits?” The agent might respond with anything from made-up facts such as, “Your withdrawal limit is $300 per day,” all the way to entirely decontextualized responses such as, “While knowing your limits can be challenging, stretching our limits is an important aspect of living a full and productive life.”

Factual misalignment can also arise from simpler causes, such as when the AI application conveys incorrect information that was directly provided to it. This generally happens because the knowledge base that the AI is trained on or fed is out of date. The good news is that updating a knowledge base is conceptually straightforward; the bad news is that it often takes considerable time and effort, which is why stale knowledge is such a common cause of factual misalignment in AI applications.

Another kind of factual misalignment is when an AI agent reveals information that it is not supposed to reveal. It might have access to pricing information that isn’t public, for example, and reveal that information if asked in the right way. This pitfall is most commonly found in custom or fine-tuned generative models trained on your private data, which is why extreme care needs to be taken with these approaches.
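One way to reduce this risk is to make sure non-public data never reaches the model's context in the first place. The following is a minimal sketch of that idea, not a feature of any particular framework; the `Field` type, its `public` flag and the sample catalog values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    value: str
    public: bool  # hypothetical flag: True only if customers may see this value


def build_context(fields: list[Field]) -> str:
    """Return only customer-visible facts, one per line, for use in a prompt."""
    return "\n".join(f"{f.name}: {f.value}" for f in fields if f.public)


# Made-up sample data for illustration.
catalog = [
    Field("list_price", "$49/month", public=True),
    Field("partner_discount_floor", "$31/month", public=False),
]

context = build_context(catalog)
print(context)
```

Filtering before the prompt is assembled means the agent cannot be talked into revealing a value it was never shown. This does not help with models fine-tuned on private data, where the information is already in the weights.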

Behavior Misalignment

Beyond whether or not the AI application is hallucinating, you must consider whether it is behaving in a way that could hurt your reputation, land you in legal hot water, or simply fail to engage your users effectively with the services it offers. It’s possible, for example, for an AI agent to get the outcome you want while exhibiting unacceptable behavior.

Brand alignment and outcome alignment, two other types of alignment often discussed in generative AI, both boil down to behavior alignment. Is your AI application behaving in a way that’s in line with your brand? Is it getting the outcomes you want, and is it behaving in a way that you find acceptable to get those outcomes?

For example, your agent could report increased sales — a positive outcome — through over-promising or offering unapproved discounts. Behavioral misalignment can also be more subtle, such as when an agent skips crucial parts of service protocols, like informing customers that calls are recorded or asking them to confirm their identity before proceeding to provide service.
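Protocol lapses like these can be checked mechanically once each agent turn has been labeled. The sketch below assumes some upstream step (not shown) has already tagged the conversation with event labels; the labels `recording_notice`, `identity_confirmed` and `service` are hypothetical names chosen for illustration.

```python
# Steps that must happen before the agent starts providing service (assumed labels).
REQUIRED_BEFORE_SERVICE = ("recording_notice", "identity_confirmed")


def missing_protocol_steps(events: list[str]) -> list[str]:
    """Return required steps that did not occur before the first 'service' event."""
    cutoff = events.index("service") if "service" in events else len(events)
    done = set(events[:cutoff])
    return [step for step in REQUIRED_BEFORE_SERVICE if step not in done]


ok_call = ["recording_notice", "identity_confirmed", "service"]
bad_call = ["recording_notice", "service", "identity_confirmed"]

print(missing_protocol_steps(ok_call))   # []
print(missing_protocol_steps(bad_call))  # ['identity_confirmed']
```

In the second call the identity check does happen, but only after service has begun, so it still counts as a violation. A check like this evaluates behavior independently of whether the conversation reached a good outcome.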

So it’s not enough to simply evaluate whether or not the AI agent is achieving the outcomes you want; you also need to make sure it’s behaving in a way that fits with the image you want for your brand, and isn’t undermining the company’s goals and requirements with its behavior.

How To Get the Best Possible Alignment in Your AI Agents

When you’re working on aligning an LLM agent in realistic, production-grade use cases, you’ll need to give it at least a few dozen instructions, if not hundreds. The big problem here is that LLMs do not reason in a strictly logical way, and giving them too many instructions at once, or instructions that conflict in any way, diminishes their ability to follow them.

So the first step, when working on aligning your LLM’s behavior, is to be able to sort through all of the instructions dynamically and identify which ones are relevant for a particular conversation and which ones are not. If we can eliminate the non-relevant instructions, that will immediately help focus and align the LLM’s behavior. Parlant, the recently launched open source alignment framework, calls this process contextual guideline matching.
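To make the idea concrete, here is a deliberately naive sketch of filtering guidelines by relevance to the current message. It is not Parlant's implementation or API; keyword overlap stands in for whatever matching a real system would use, and the `Guideline` type and sample guidelines are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Guideline:
    condition_keywords: frozenset[str]  # hypothetical trigger words for this guideline
    instruction: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into simple word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def match_guidelines(message: str, guidelines: list[Guideline]) -> list[Guideline]:
    """Keep only guidelines whose trigger words appear in the message."""
    words = tokenize(message)
    return [g for g in guidelines if g.condition_keywords & words]


GUIDELINES = [
    Guideline(frozenset({"limit", "limits", "withdrawal"}),
              "Look up the account's actual limits; never guess a number."),
    Guideline(frozenset({"refund", "chargeback"}),
              "Explain the refund policy and offer to open a ticket."),
    Guideline(frozenset({"mortgage", "loan"}),
              "Hand off to a licensed advisor."),
]

matched = match_guidelines("What are my limits?", GUIDELINES)
for g in matched:
    print(g.instruction)
```

Only the first guideline reaches the prompt for this message, so the model has one instruction to follow instead of three. A production matcher would need to handle paraphrase and conversation history, which keyword overlap cannot.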

The next step is to put in place self-critique, smart prioritization and conflict-resolution mechanisms. The challenge here is that most LLMs are not able to do this particularly well out of the box. They are being asked to pay attention to too many inputs that pull their attention mechanism in different directions, which equalizes output probabilities and consequently diminishes their behavioral consistency. LLMs can also struggle because they tend to give priority to the parts of the prompt that come at the end. So prompts need to be researched and tested accordingly, in specific connection with the models they are used on, keeping in mind the models’ various properties.
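Part of this work can be done in ordinary code before the model ever sees the instructions. The sketch below assumes each rule carries a hypothetical `topic` label and a numeric `priority`; rules that share a topic are treated as conflicting, the highest priority wins, and the survivors are ordered so the most important one lands at the end of the prompt. This is one possible scheme for illustration, not a description of how any specific framework resolves conflicts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    topic: str     # hypothetical label; rules sharing a topic are treated as conflicting
    priority: int  # higher value wins a conflict
    text: str


def resolve(rules: list[Rule]) -> list[Rule]:
    """Keep the highest-priority rule per topic, ordered least to most important."""
    winners: dict[str, Rule] = {}
    for rule in rules:
        current = winners.get(rule.topic)
        if current is None or rule.priority > current.priority:
            winners[rule.topic] = rule
    return sorted(winners.values(), key=lambda r: r.priority)


rules = [
    Rule("discounts", 1, "Offer a discount to hesitant customers."),
    Rule("discounts", 5, "Never offer discounts that are not pre-approved."),
    Rule("identity", 9, "Confirm the customer's identity before discussing the account."),
]

ordered = resolve(rules)
for rule in ordered:
    print(rule.priority, rule.text)
```

Here the permissive discount rule is dropped because a stricter rule on the same topic outranks it, and the identity rule is placed last. Whether end-of-prompt placement actually helps should be tested against the specific model in use.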

One of the most practical ways to handle this is to implement a supervision element in your prompts. Parlant implements a new technique it has developed for this purpose, called attentive reasoning queries (ARQs), that helps divert the LLM’s attention back to the relevant parts of the prompt at the right times, to ensure that critical information and instructions aren’t being overlooked. This leads the LLM to give appropriate weight to each of the provided guidelines, no matter where in the prompt they appear, or how long or complex the prompt is.
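The article does not spell out how ARQs are formatted, so the following is only a rough illustration of the general idea of a supervision element: the prompt asks the model to restate each active guideline and decide whether it applies before it writes a reply. The function name, the JSON field names and the prompt wording are all assumptions made for this sketch and should not be read as Parlant's actual format.

```python
import json


def build_prompt_with_queries(guidelines: list[str], user_message: str) -> str:
    """Build a prompt that makes the model revisit each guideline before replying."""
    # One check per guideline; the model is asked to fill in the empty fields.
    checks = [
        {"guideline": g, "restate_in_own_words": "", "applies_now": None}
        for g in guidelines
    ]
    return "\n".join([
        "Guidelines:",
        *[f"- {g}" for g in guidelines],
        f"Customer message: {user_message}",
        "Before replying, fill in every empty field of this JSON, then write the reply:",
        json.dumps({"checks": checks, "reply": ""}, indent=2),
    ])


prompt = build_prompt_with_queries(
    ["Confirm identity before sharing account details."],
    "What are my limits?",
)
print(prompt)
```

Because each guideline is repeated inside the structure the model must complete, the instruction is brought back into focus right before the reply is generated, regardless of where it first appeared in the prompt.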

Maximize Alignment to Reduce Risks

Companies that are serious about using AI agents in a customer-facing capacity should be aware of the risks of different types of misalignment and keep up with the latest techniques and innovations to maximize alignment in their AI agents. Otherwise, at least for some use cases, the risks are simply too high.

Parlant is an open-source framework that transforms how AI agents make decisions in customer interactions. It replaces prompts with granular guidelines that are easier to enforce consistently and automatically, achieving unparalleled accuracy in instruction-following and adherence to business rules.
An experienced software builder with extensive experience in mission-critical software and system architecture, Yam understands what it takes to create reliable, production-ready software. This background informs his distinctive approach to the development of predictable and aligned AI systems.
Emily helps open source startups accelerate revenue growth with killer positioning. She writes about entrepreneurship for engineers, and hosts The Business of Open Source, a podcast about building open source companies.