URL: https://thenewstack.io/the-hidden-pain-of-diy-on-premises-k8s-based-software-distribution/


The Hidden Pain of DIY On-Premises K8s-Based Software Distribution

Let’s explore the experience of companies trying to build their own software distribution tools to deliver apps in customer Kubernetes environments in the cloud.
Sep 30th, 2022 6:47am by Nikki Rouda
CNCF sponsored this post.

This is part of a series of contributed articles leading up to KubeCon + CloudNativeCon in October.

Let’s explore the experience of companies trying to build their own software distribution tooling. This hypothetical scenario is based on a Software-as-a-Service (SaaS) company or a traditional on-premises software company that is delivering its app to customer Kubernetes (K8s) environments in the cloud for the first time. Think of it as a composite of many people’s experiences. We hope you don’t make the same mistakes!

A Timeline of Hope and Pain

Day 0 — The sales or product team asks engineering simple-sounding questions: “Can we deliver our SaaS application into our customer’s self-hosted Kubernetes environments?” or “Now that we’ve modernized and containerized our application, can we distribute it to customer-managed clusters in the cloud?” Either way, what they are really saying is, “Our prospects keep asking us to do this, and we’re leaving money on the table every time we say ‘no.’”

Day 1 — How hard can it be? The lead engineer spends a couple of weekends hacking out a rough solution, very excited to build something new. It seems to be fairly straightforward to refactor the app to work in any AWS or customer-hosted environment, right? We could use Terraform, maybe.

Day 30 — The field engineers deliver the app to their first customer-hosted K8s cluster running in an AWS virtual private cloud (VPC). The proof-of-concept (POC) installation doesn’t go as smoothly as hoped, but after a couple of escalations to engineering and some patience from the customer, they finally get the app deployed. High fives!

Day 45 — The lead engineer has shipped several updates and changes to the new “on-premises” K8s installer to make it work. A production install is started in a different environment, but it’s not working the same way, and no one is quite sure why. More and more engineering time is being spent on Zoom with the customer, whose frustration is steadily growing. Other modernization, innovation and/or backlog work is starting to take priority, and this project is starting to look a lot more complicated than expected. The sales team is getting a bit nervous about their account and escalating to management.

Day 60 — The project is no longer fun and continues to suck time and people. The Terraform scripts are failing security reviews at some companies. The lead engineer asks the manager to get them off this ASAP because they are burning out. The company doesn’t want to halt the project because product and sales are close to closing this customer. There are a surprising number of on-premises and K8s cluster-based opportunities in the pipeline, and in this economy, the vice president of sales doesn’t want to turn away any revenue. The head of engineering begrudgingly assigns more engineers to work on the on-premises installer project, delaying the schedule for other planned app features and innovations.

Day 180 — A lot has gone on in the past four months. New customers are running the installer, but each one has a slightly different environment and installation requirements. A few examples:

  • While the first customer accepted the Ubuntu-based installer, the next customer wanted a RHEL installer. So the team spent two weeks building a second package and designing a CI/CD pipeline to build and test it in parallel with the Ubuntu-based package.
  • Two government and financial services customers needed air gap installers. Engineers decided this was too much effort with everything else going on. This represents a substantial hit to the revenue stream that drove the idea in the first place.

Day 270 — With mixed failures and successes, the on-premises K8s install initiative carries on in fits and starts. More issues keep popping up. The install success rate is hovering around 50%, meaning half of the attempted installs end with the customer getting fed up and losing trust. Other customers and prospects keep asking for it, and a number of big accounts are now deployed with it, so it seems impossible to turn back, but the quagmire is getting deeper:

  • One customer runs into some common vulnerabilities and exposures (CVEs), which block an install, and it’s an all-hands-on-deck late-night scramble to patch the vulnerabilities and get everything stable again.
  • Several customers have now (auto-)upgraded their Linux operating systems, which unfortunately broke the app packages, requiring rework and updates to the installer. It looks like this will happen at least once a quarter.
  • Mysterious storage and networking failures have required more than 10 hours of hands-on troubleshooting across several weeks.
  • The first customer to install has yet to upgrade their installer and is at risk due to unpatched bugs, which were fixed long ago in newer versions. Because the first version was not built with a self-serve upgrade path in mind, engineers spend another 10+ hours helping the customer perform a very manual migration to the latest version of the tool.
  • Despite management efforts to bring in other team members to the project, the lead engineer who built v1 is still constantly pulled into on-premises install support escalations.
  • One end customer had modified the base image for Ubuntu to change the names of all the default network interfaces. More mysterious network issues cause problems until this change is discovered.
  • In environments where the customer brings their own Kubernetes cluster, the team encounters 10 different flavors of Kubernetes ingress that need to be supported by the application configuration. Every single one takes hours to fix and takes time away from other engineering work.
  • Several end customers need enterprise long-term support (LTS) versions, which creates internal chaos and more firefighting. There’s a need to hire and train a lot of support engineers on Kubernetes or just keep escalating to engineering.

Day 360 — One year in, the engineering team, exasperated and burnt out, holds another all-hands-on-deck meeting to reset and decide what to do. Everyone dreads doing a rotation on the on-premises installer team; some people actively seek to get off the team. A few veteran engineers sit permanently on the team because they understand that without them, a big source of revenue would be in jeopardy. Engineering and product leadership agree to deemphasize new feature work to give the team up to 50% of their time for three months to invest in the install tooling. While they’re at it, engineering agrees to spend significant time developing the air gap installer that more and more customers are requesting. The team develops a wishlist for everything they’d want:

  • Set up CI/CD and automated testing for all releases of the application in all supported environments.
  • Convert the ragtag collection of hard-to-maintain bash scripts used to collect diagnostic info into a CLI tool that can be delivered with the installer. Consolidate into a framework that allows field engineers to contribute to the list of information that gets collected. Stretch goal: Package the internal scripts used to analyze these log bundles for common errors into a tool that end customers can run in their own environment.
  • Design so that the team can centralize on one architecture and install method, and solutions architects working with customers don’t need to hack a bunch of strange custom configurations for specific customer environments.
  • Give customers the option to bring an external database instead of using a datastore embedded in the application. This should help address some of the catastrophic failures in storage and networking.
  • Offer snapshot and restore functionality that will work in the majority of customer environments, relying on a hunch that this will include SSH File Transfer Protocol (SFTP), Network File System (NFS), storage area network (SAN) and maybe others. Do some discovery with the product team and several key customers to scope it out.
  • Automate scanning for CVEs in all code and enforce a policy of not shipping a release without patching all CVEs for which a patch is available.
  • Invest time in ensuring that the build/test process for developers in local environments can be shortened from 10+ minutes to under 30 seconds.
  • Automate testing for all installer versions on a quickly growing multidimensional support matrix of OS versions, Kubernetes versions, add-ons, cloud providers and other dimensions.
  • Build a specific “area of responsibility” for a product team to ensure that they can support new versions of operating systems within 30 days of release.
  • Adopt an aggressive policy of deprecating old versions to reduce the total number of things that need to be maintained and patched.
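
To see why the “quickly growing multidimensional support matrix” in the wishlist is so painful, here is a minimal Python sketch that expands such a matrix into individual test targets. The dimension names and version values are illustrative assumptions, not taken from any real team’s matrix; only the Ubuntu, RHEL and air gap dimensions are mentioned in the story above.

```python
from itertools import product

# Hypothetical support matrix. The specific versions and ingress
# controllers are assumptions chosen only for illustration.
SUPPORT_MATRIX = {
    "os": ["ubuntu-20.04", "ubuntu-22.04", "rhel-8"],
    "kubernetes": ["1.23", "1.24", "1.25"],
    "ingress": ["nginx", "traefik"],
    "air_gapped": [False, True],
}


def expand_matrix(matrix):
    """Return one dict per combination of values across all dimensions."""
    keys = list(matrix)
    value_lists = [matrix[key] for key in keys]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


def count_combinations(matrix):
    """Count combinations without building them (product of dimension sizes)."""
    total = 1
    for values in matrix.values():
        total *= len(values)
    return total


if __name__ == "__main__":
    targets = expand_matrix(SUPPORT_MATRIX)
    print(f"{len(targets)} install targets to test per release")
    print("first target:", targets[0])
```

Because the count is the product of the dimension sizes, every new OS version, Kubernetes version or add-on multiplies the testing burden rather than adding to it, which is one reason an aggressive deprecation policy appears on the same wishlist.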

Day 390 — The team is making progress, and even the lead engineers who built v1 are engaged again. A few improvements are made and momentum is building, but there’s still so much to do. The most knowledgeable people are still getting pulled into many support escalations with existing and new customers.

Day 480 — The three-month sprint has now sprawled out to six months. With half the team still improving the build/test/distribute/support platform for on-premises installs, app feature development is still behind pace. Work on an air gap installer has not even reached the prototype phase. With half the backend team focused on infrastructure-flavored tasks, frontend engineers staffed to work on a SaaS application or other modernization efforts are consistently running out of things to do. Disillusioned and completely burned out, the two engineers who built v1 of the installer and have the deepest knowledge of the project leave to join small startups founded by former colleagues. This sets back the team even further.

Some might read this and conclude that distributing software to customer-managed on-premises K8s and private cloud environments simply isn’t worth the pain. But 80% of all software spending still goes to applications that aren’t pure SaaS, and most organizations now expect applications to be K8s-friendly. We’re seeing a looming trend of application boomerangs from the cloud for reasons of security, compliance, performance and cost. There’s got to be a better way to solve the hard problems outlined above and still increase your addressable market!

The Cloud Native Computing Foundation (CNCF) hosts critical components of the global technology infrastructure including Kubernetes, OpenTelemetry, and Argo. CNCF is the neutral home for cloud native collaboration, bringing together the industry’s top developers, end users, and vendors.
Nikki Rouda is vice president of marketing at Replicated. He has deep experience leading enterprise modern apps, containers, big data, analytics and data center infrastructure initiatives. Nikki previously held senior positions at Amazon Web Services (AWS), Cloudera, Enterprise Strategy Group...
Image via Pixabay.
TNS owner Insight Partners is an investor in: Pragma.