URL: https://thenewstack.io/engineering-ai-slop-registry/

Your engineering org needs an AI slop registry - The New Stack


Your engineering org needs an AI slop registry

Stop AI coding tools from scaling mistakes. Discover how an AI slop registry and separate verification enforce repo standards.
Jun 26th, 2026 10:00am by Ankit Jain
Featured image: Naila Conita for Unsplash+
Aviator sponsored this post.

AI coding tools don’t just help engineers write code faster. They help engineers make the same mistake faster, at scale, across every PR that touches a given pattern. I’m not talking about AI code that’s obviously wrong; I’m talking about code that compiles, passes basic checks, and looks plausible but is subtly wrong, bloated, or misaligned with what was actually needed.

In practice, that usually looks like an overengineered abstraction layer for a problem that needs 10 lines; code that ignores your repo’s patterns, naming, or architecture; calls to APIs that don’t exist; or patterns copied without understanding why, such as retry logic where it isn’t needed.

Errors like that are systematic, which is what makes them preventable.

You have CLAUDE.md and Skills, but…

Most teams respond to this by trying to give the AI better instructions. They document their standards in a CLAUDE.md file, configure Skills, and describe the conventions they want the model to follow. This is the right impulse, but it doesn’t always work. 

They’re asking the same non-deterministic agent that generated the code to also catch its own mistakes. It may follow the rules you’ve written. It may not. There’s no evidence either way, no audit trail, and you can’t know in advance which run you’ll get. A CLAUDE.md file is an input to generation. It is not a verification system.

Catching slop reliably requires something structurally separate: a system that independently checks the output, uses a different agent, and produces the same result every time it sees the same code.

The two layers of verification

The shift we’ve been working toward at Aviator is replacing code review with verified intent. Instead of a reviewer reading a diff and asking, “Does this look right?” the team agrees on what the code is supposed to do before it’s written, and a separate verification system checks the output against that agreement.

Think of it like a building inspection. A building isn’t approved by an architect watching every nail get hammered. It’s approved by an inspector evaluating the finished structure against the blueprints. Intent-driven verification follows the same pattern: the spec is the blueprint, the agent’s implementation is the construction, the verifier pipeline produces verdicts and evidence for each criterion, and the reviewer approves based on intent fit and evidence quality.

The model has two layers, and understanding why there are two rather than one is the key to making it work.

User criteria are the acceptance criteria for a specific change, generated by the agent from the expressed intent or written by hand. They’re scoped to this PR only: the endpoint path, the response shape, the behavior under failure, and what’s explicitly out of scope. This is where the intent for a particular task lives.

Invariant criteria come from the team’s Invariants catalog and are rules that automatically apply to every matching change. Where user-supplied acceptance criteria describe what this change should do, invariants describe what every change should respect. They live in your account and update once for everyone.

Your Invariants should be specific about the rule but vague about the implementation:

  • All HTTP handlers must call an authentication middleware before any business logic.
  • All migrations must declare a down block.

These are defined once and checked on every run. Developers don’t need to include them in the specs because the system automatically loads the matching set.

The test for promoting a check to an invariant is recurrence: anything that you post in a review comment multiple times should become an invariant. Aviator actually does this automatically. It auto-creates invariants based on past comments.

When verification runs, both layers are assembled into a single list of acceptance criteria and flow through the same pipeline. A spec adding a subscription status endpoint might contain these user criteria:

# Add subscription status endpoint

## Acceptance Criteria

- [ ] Endpoint: GET /api/v1/subscription/status
- [ ] Response includes: status, renewal_date

The invariant catalog then adds its own criteria automatically, say, a rule that all HTTP handlers must use AuthMiddleware. Verification checks all of them:

  • ✓ Endpoint exists at the correct path (user criterion)
  • ✓ Response includes status, renewal_date (user criterion)
  • ✓ Handler uses AuthMiddleware (invariant)

All must pass. The spec author didn’t need to remember the authentication requirement. It was enforced by the catalog without anyone asking for it.
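
The assembly step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Aviator's implementation: the `Invariant` and `Criterion` types, the `assemble_criteria` function, the checkbox parsing, the glob semantics, and the `migrations/*.sql` path are all hypothetical names and choices made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Invariant:
    rule: str
    file_path_glob: str  # hypothetical condition field


@dataclass(frozen=True)
class Criterion:
    text: str
    source: str  # "user" or "invariant"


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a small glob subset (assumed semantics): '**/' spans zero
    or more directories, '*' stays within a single path segment."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:[^/]+/)*")
            i += 3
        elif pattern[i] == "*":
            parts.append(r"[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def parse_user_criteria(spec: str) -> list[Criterion]:
    """Collect unchecked '- [ ]' items from a Markdown spec."""
    items = re.findall(r"^[ \t]*[-–][ \t]*\[ \][ \t]*(.+)$", spec, flags=re.MULTILINE)
    return [Criterion(text=item.strip(), source="user") for item in items]


def assemble_criteria(spec, catalog, changed_files):
    """Merge the spec's user criteria with every invariant whose glob
    matches at least one changed file."""
    criteria = parse_user_criteria(spec)
    for inv in catalog:
        matcher = glob_to_regex(inv.file_path_glob)
        if any(matcher.match(path) for path in changed_files):
            criteria.append(Criterion(text=inv.rule, source="invariant"))
    return criteria


SPEC = """# Add subscription status endpoint

## Acceptance Criteria

- [ ] Endpoint: GET /api/v1/subscription/status
- [ ] Response includes: status, renewal_date
"""

CATALOG = [
    Invariant("All HTTP handlers must use AuthMiddleware", "src/**/*.go"),
    Invariant("All migrations must declare a down block", "migrations/*.sql"),
]

criteria = assemble_criteria(SPEC, CATALOG, ["src/api/subscription.go"])
for c in criteria:
    print(f"[{c.source}] {c.text}")
```

Only the handler invariant is picked up here, because the single changed file matches its glob and not the migration one. The spec author's list stays two items long while the verifier works from three.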

Invariants as the anti-slop registry

Invariants are what we call the ‘anti-AI slop registry,’ and they are what make this work at scale. They address the most common categories of AI slop: convention blindness, deprecated APIs, module boundaries the model doesn’t know about, and security baselines that should apply everywhere. None of these are in the model’s training data for your specific codebase. They live in the heads of your senior engineers and show up as recurring review comments.

Most invariants worth writing start as a review comment that’s been left more than twice. Here is an example of turning a real review comment into an invariant:

Comment on PR #4173:

“Please don’t write to users directly — go through UserRepository.UpdateProfile. We had a partial-write bug last quarter from a similar pattern.”

Invariant body:

Writes to the users table must go through UserRepository. Direct INSERT,
UPDATE, or DELETE statements against the users table are not allowed
outside the repository package. Schema migrations under src/db/migrations are exempt.

Conditions: file_path_glob: src/**/*.go (skip non-Go files).

Category: functional_correctness.
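
To make the shape of that rule concrete, here is a deliberately naive stand-in written as a regex scan. It is not how the invariant is evaluated in the workflow described here, where a verifying agent reads the code; a regex will miss query builders and ORMs. The `check_users_writes` function and the `src/repository/` prefix are assumptions for illustration, since the invariant names the migrations path but not the repository package path.

```python
import re

# Hypothetical location of the repository package (not given in the invariant).
REPOSITORY_PREFIX = "src/repository/"
# Exempt path taken from the invariant body.
MIGRATIONS_PREFIX = "src/db/migrations/"

# Matches INSERT INTO users, UPDATE users, DELETE FROM users (case-insensitive).
# The trailing \b keeps tables such as users_archive from matching.
DIRECT_WRITE = re.compile(
    r"\b(?:INSERT\s+INTO|UPDATE|DELETE\s+FROM)\s+users\b", re.IGNORECASE
)


def check_users_writes(path: str, source: str) -> list[str]:
    """Return one finding per offending line; an empty list means pass."""
    if not (path.startswith("src/") and path.endswith(".go")):
        return []  # condition: only Go files under src/ are checked
    if path.startswith((REPOSITORY_PREFIX, MIGRATIONS_PREFIX)):
        return []  # the repository package and migrations are exempt
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if DIRECT_WRITE.search(line):
            findings.append(f"{path}:{lineno}: direct write to users table")
    return findings


# Hypothetical Go handler that violates the invariant on its fourth line.
HANDLER = '''package api

func Save(db *sql.DB, name string, id int) {
	db.Exec("UPDATE users SET name = ? WHERE id = ?", name, id)
}
'''

for finding in check_users_writes("src/api/profile.go", HANDLER):
    print(finding)
```

The same source placed under the repository package or the migrations directory produces no findings, which mirrors the exemption written into the invariant body.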

You can mine historical review comments, cluster them, and generate invariant candidates for human approval. Each invariant you codify is a check that will never cost a reviewer time again.
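
One simple way to picture the mining step is a greedy clustering pass over comment text using token overlap. This is a sketch under assumptions, not a description of how Aviator clusters comments: the Jaccard measure, the 0.5 threshold, the function names, and the sample comments are all hypothetical. The `min_occurrences=3` default reflects the "more than twice" rule of thumb above.

```python
import re


def tokens(comment: str) -> set[str]:
    """Lowercase a comment and split it into a set of word tokens."""
    return set(re.findall(r"[a-z_]+", comment.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of tokens the two sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_comments(comments, threshold=0.5):
    """Greedy single pass: each comment joins the first cluster whose first
    member is similar enough, or starts a new cluster."""
    clusters = []  # list of (tokens of first member, member comments)
    for comment in comments:
        toks = tokens(comment)
        for rep, members in clusters:
            if jaccard(rep, toks) >= threshold:
                members.append(comment)
                break
        else:
            clusters.append((toks, [comment]))
    return [members for _, members in clusters]


def invariant_candidates(comments, min_occurrences=3):
    """Clusters that recur often enough to be worth a human look."""
    return [c for c in cluster_comments(comments) if len(c) >= min_occurrences]


# Hypothetical review comments pulled from past PRs.
COMMENTS = [
    "Please go through UserRepository instead of writing to users directly",
    "Nit: rename this variable",
    "Go through UserRepository instead of writing to users directly here",
    "Please use UserRepository instead of writing to users directly",
]

candidates = invariant_candidates(COMMENTS)
for group in candidates:
    print(f"{len(group)} similar comments, e.g. {group[0]!r}")
```

The output is a list of candidates, not invariants. A human still decides which clusters describe a real rule and writes the invariant body.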

I may have said that code review is a historical approval gate that no longer matches the shape of engineering work or that we can stop reading the code, but that will not happen overnight. In practice, over time, we move the human judgment upstream, where it’s more valuable. Not everything has to be reviewed to the same depth. Humans should review specs, plans, constraints, and acceptance criteria, not 500-line diffs.

The other thing that sets this apart from a rules file is what happens at the time of verification. The writing agent and the verifying agent are different. They don’t share context, they don’t share blind spots, and the verifier produces a structured report per criterion — file references, reasoning, pass/fail/partial — not a gut-check opinion from the same model that wrote the code.
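
A per-criterion report of that kind might be modeled as follows. The field names, the `summarize` helper, and the sample verdicts, reasoning, and file paths are assumptions for illustration; only the three verdict values and the idea of file references plus reasoning come from the description above.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    PARTIAL = "partial"


@dataclass
class CriterionResult:
    criterion: str
    verdict: Verdict
    reasoning: str
    file_refs: list[str] = field(default_factory=list)


def summarize(results):
    """Tally verdicts; the change is approvable only if every criterion passed."""
    counts = Counter(r.verdict for r in results)
    return {
        "pass": counts[Verdict.PASS],
        "fail": counts[Verdict.FAIL],
        "partial": counts[Verdict.PARTIAL],
        "all_passed": len(results) > 0 and counts[Verdict.PASS] == len(results),
    }


# Hypothetical report for the subscription status spec.
report = [
    CriterionResult(
        "Endpoint exists at the correct path",
        Verdict.PASS,
        "Route registered for GET /api/v1/subscription/status",
        ["src/api/routes.go"],
    ),
    CriterionResult(
        "Response includes status, renewal_date",
        Verdict.PARTIAL,
        "renewal_date is omitted when the subscription has no renewal",
        ["src/api/subscription.go"],
    ),
    CriterionResult(
        "Handler uses AuthMiddleware",
        Verdict.PASS,
        "Handler is wrapped by AuthMiddleware at registration",
        ["src/api/routes.go"],
    ),
]

summary = summarize(report)
print(summary)
```

Keeping the reasoning and file references next to each verdict is what lets a reviewer judge evidence quality without rereading the whole diff.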

What we built, and what it found

At Aviator, we recently ran an experiment to test the intent-driven verification approach: what if the review happens before the code is written?

Instead of AI writing code and engineers reviewing it, the team spent time writing and reviewing scope, acceptance criteria, and edge cases before any implementation started. Then we handed it to an AI agent and let it build.

The result was about 6,000 lines of code. A second agent then verified the output against the 65 user criteria in the spec, which took six minutes: 60 passed, 4 failed, and 1 was partial.

Human reviewers still found things, but design-level decisions were verified before any code was generated, and org invariants were enforced automatically throughout.

Instead of leaving the same comment for the fifteenth time, you’re identifying the pattern, writing it once, and letting the system enforce it on every change that follows. You’re not building software anymore. You’re building the machine that builds software, and quality control is part of that machine.

Aviator is a developer productivity platform used by modern engineering teams to ship AI-generated code at scale. Shared context, faster reviews, deterministic verification, and merge-to-deploy automation built for the volume AI creates.
Ankit Jain is a cofounder and CEO of Aviator, a developer productivity platform used by modern engineering teams to ship AI-generated code at scale. He also leads The Hangar, a community of senior DevOps and senior software engineers focused on...