
URL: https://thenewstack.io/infrastructure-apis-the-good-the-bad-and-the-ugly/

Infrastructure APIs: The Good, the Bad and the Ugly - The New Stack



Infrastructure APIs: The Good, the Bad and the Ugly

Why do we have so many Terraform orchestration projects? There’s a deeper challenge with the operations contracts (a.k.a. the APIs) between the users and the infrastructure.
Apr 7th, 2023 8:00am by Rob Hirschfeld

This year at SCaLE 20x, I kept overhearing people talk about their struggles to scale with HashiCorp’s open source Infrastructure as Code (IaC) software, Terraform.

True, we can wrap it inside a runner so that we at least get control, shared state and visibility into runs, but that doesn’t address the fundamental reuse and collaboration problems with plans. When I asked myself why, I realized this is not specifically a Terraform problem.

I’m convinced that there’s a deeper challenge with the ops contract (aka API) between the users and the infrastructure.

I’ve been thinking deeply about scalable infrastructure APIs for nearly a decade in my role at RackN, which offers an Infrastructure as Code platform called Digital Rebar. We’ve collaborated with operations teams at banks, service providers, telcos, gaming and media companies to manage globally distributed infrastructure. These operations API challenges are universal.

So what makes an API scalable? It’s not just the number of requests or machines that it can service. It is much more about enabling reliability, consistency and uniformity of the service supporting the API. Even more so, a scalable API empowers the teams supporting it to collaborate openly while invisibly maintaining the infrastructure and systems.

| | Good APIs | Bad APIs | Ugly APIs |
| --- | --- | --- | --- |
| Cognitive impact | Low load | High load | Creates anxiety |
| Reliable | People forget the service actually does complex work; its function is assumed | People resist; requires lots of support on the backend | In practice, hard to troubleshoot and figure out what happened |
| Consistent | People trust the results provided by the API, both when it succeeds and when it fails | People have to create lots of defenses when using it; requires specialists | Users cannot predict which inputs and behaviors are needed |
| Uniform | People can use the API’s abstraction in many scenarios without having to understand the underlying system | People have to create a wrapper layer above it; overly restrictive and cannot innovate | Information means different things depending on the context in which it is used |

The Terraform client tool is an incredibly powerful API abstraction and a brilliant single-pass orchestrator, but now everyone is wrapping it with a service API to improve scale. So we’d better think carefully about what it means to have a good API around the individual Terraform run.

The first thing everyone needs to really understand is what they are using Terraform to try to fix. Each plan is not abstract because it must be specific to the cloud, application and cluster. It doesn’t have any way to provide real feedback about what’s been done, how it’s been done, or what is being updated. It’s not even an addressable API unless you wrap it in something else. And APIs that just wrap Terraform are stuck with the plan’s design contract. They have to maintain an expectation that your unit of work is defined by a plan, not by the target state of the broader system, workflow or IaC process.

To have a good API here, operators need to have a control plane that serves operational interests behind that API.

We need an abstraction that creates improved transparency for the infrastructure. That’s important to provide clear insights into the workflow and the actions it is taking. Unlike with a Terraform plan, the requestor should influence but not be able to redefine the processes. That is what got me thinking about how we’ve been building operational APIs at RackN.

In our product Digital Rebar, we specifically differentiate between intent, workflow and state, with clear APIs for each. For an operator, these differences are important. Intent is your objective and can be described as inputs to a process. These intents are an abstraction that shapes what the system will do, much like configuration, but they cannot (and should not) fully describe the system because many decisions cannot be made until the workflow has started. Because an intent does not have to spell out specific configuration details, operators can work at a higher level of abstraction. The automation fills in details via code or makes assumptions based on defaults.
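
One way to picture the split between an intent and the fully resolved configuration is the minimal Python sketch below. It is an illustration only, not Digital Rebar’s actual API: the `Intent` class, the `resolve` function, the `DEFAULTS` table and every field name are hypothetical, and the precedence order (defaults, then facts discovered during the workflow, then explicit operator overrides) is an assumption made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical fallback values the automation assumes when the operator
# says nothing. These are NOT Digital Rebar's real schema or defaults.
DEFAULTS = {"os": "ubuntu-22.04", "disk_layout": "single-root", "network": "dhcp"}


@dataclass
class Intent:
    """What the operator wants, expressed as high-level inputs."""
    role: str
    overrides: dict = field(default_factory=dict)


def resolve(intent: Intent, discovered: dict) -> dict:
    """Merge defaults, discovered facts and operator overrides.

    Later sources win: defaults < discovered facts < explicit overrides.
    This precedence is an assumption chosen for the sketch.
    """
    resolved = dict(DEFAULTS)
    resolved.update(discovered)
    resolved.update(intent.overrides)
    resolved["role"] = intent.role
    return resolved


# The operator states an objective and one preference, nothing more.
intent = Intent(role="k8s-worker", overrides={"os": "rocky-9"})

# Facts like these are only known once the workflow is running on real hardware.
discovered = {"disk_layout": "raid1", "nic_count": 2}

config = resolve(intent, discovered)
print(config)
```

The intent stays small and portable, while the resolved configuration picks up details that could not have been known when the request was made.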

Once the operator starts a workflow, Digital Rebar collects state information and guides the transformation of the system towards the intent.  The state information is observable, subscribable and addressable via the API throughout the workflow.  This means that operators have the transparency to manage systems throughout the process.  As automation inevitably bumps into unexpected situations, it is possible to understand how we arrived at this state in the process and even recover.  In addition, the workflow artifacts themselves are defined and managed via the API.
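
The properties described above (observable, subscribable and addressable state, plus a record of how the system arrived where it is) can be sketched in a few lines of Python. Everything here is hypothetical and is not Digital Rebar’s implementation: the `Workflow` class, the `REGISTRY` lookup table, the step names and the callback signature are assumptions chosen to keep the example small.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical lookup table that makes each workflow addressable by ID.
REGISTRY: Dict[str, "Workflow"] = {}


class Workflow:
    """Toy workflow whose state can be observed, subscribed to and looked up."""

    def __init__(self, workflow_id: str, steps: List[str]):
        self.id = workflow_id
        self.steps = steps
        self.state = "pending"
        self.history: List[Tuple[str, str]] = []  # (step, outcome) pairs
        self._subscribers: List[Callable[[str, str, str], None]] = []
        REGISTRY[workflow_id] = self

    def subscribe(self, callback: Callable[[str, str, str], None]) -> None:
        """Register a callback invoked as callback(workflow_id, step, outcome)."""
        self._subscribers.append(callback)

    def _record(self, step: str, outcome: str) -> None:
        # Persist the transition first, then notify every subscriber.
        self.history.append((step, outcome))
        for callback in self._subscribers:
            callback(self.id, step, outcome)

    def run(self, actions: Dict[str, Callable[[], None]]) -> None:
        """Run steps in order; stop at the first failure and keep the history."""
        self.state = "running"
        for step in self.steps:
            try:
                actions[step]()
            except Exception as exc:
                self._record(step, f"failed: {exc}")
                self.state = "failed"
                return
            self._record(step, "ok")
        self.state = "complete"


def fail_install() -> None:
    raise RuntimeError("no bootable disk")


events = []
wf = Workflow("wf-1", ["discover", "install-os", "join-cluster"])
wf.subscribe(lambda wid, step, outcome: events.append((wid, step, outcome)))
wf.run({
    "discover": lambda: None,
    "install-os": fail_install,
    "join-cluster": lambda: None,
})

# Anyone holding the ID can ask how the workflow reached its current state.
print(REGISTRY["wf-1"].state)
print(REGISTRY["wf-1"].history)
```

When the second step fails, the history still shows which step succeeded and why the next one did not, which is the kind of feedback a bare plan run does not expose on its own.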

That design provides clear, persistent and strong controls over the infrastructure behind the API.  Even more importantly, it means automation can be secured, repeated and tested using true Infrastructure as Code techniques and Infrastructure Pipelines.

There is significant power in being able to clearly explain what makes APIs effective!  It allows us to emphasize the productive behaviors of the platforms we are using.  It helps us define criteria to select new systems.  And it enables us to ask for targeted improvements to the systems that we are using.

That’s good news, because it is possible to have amazing and productive APIs for infrastructure. We just have to be willing to elevate system and operational needs. When we do, we’ve proven that everyone in the value chain benefits.

(Note: This story was updated from an earlier version posted today.)

Rob Hirschfeld is CEO and co-founder of RackN, leaders in physical and hybrid DevOps software. He has been in the cloud and infrastructure space for nearly 15 years from working with early ESX Betas to serving four terms on the...
HashiCorp is a sponsor of The New Stack.
TNS owner Insight Partners is an investor in: Uniform.