URL: https://thenewstack.io/a-look-at-the-slacks-new-gitops-based-build-platform/

A Look at Slack's New GitOps-Based Build Platform - The New Stack


CI/CD / DevOps / Kubernetes

A Look at Slack’s New GitOps-Based Build Platform

As its messaging service grew more popular, Slack's Jenkins build platforms proliferated, and grew harder to manage. A modernization effort standardized plug-ins and settled on a GitOps workflow.
Sep 21st, 2022 3:00am by Jessica Wachtel
Enterprise messaging giant Slack has modernized its build platform, forgoing the individual Jenkins clusters for a more standardized model to improve developer efficiency and user experience.

It was a necessary move: Slack is growing at a rapid pace, with its revenue basically doubling every year since 2014. With a company growing at that pace, what once worked may not be the best solution now. In this case, it was Slack’s build platform.

A recent blog post written by a team of Slack engineers went into great detail on this topic.

Slack has used Jenkins as its build platform since the early days. Allowing each team to create its own customized Jenkins cluster, known as a “Snowflake Cluster,” was a solid idea in 2014. But with hyper-growth came an increase in product service dependency on Jenkins, and different teams started tailoring Jenkins to their own needs: plugins, credentials, security practices, backup strategies, job management, package upgrades, and so on.

In plain terms, there were enough “Snowflake Clusters” to cause an avalanche of complications considering that each unique cluster has its own ecosystem rich with plugins to upgrade, vulnerabilities to deal with, and processes around managing them.

There. were. challenges. A long list of challenges. And while every company has a long list of technical challenges unique to it, Slack’s list reads much like the universal reasons companies decide to modernize their tech: the code as it stood was effective but not optimal for the future, and it led to a loss in productivity. And there was technical debt. No one ever wants technical debt.

Though the system wasn’t optimal, a complete rewrite wasn’t needed. The goals of the modernization were to fix key issues, modernize deployed clusters, and standardize the Jenkins inventory.

At a high level, the Build team would provide a platform for “build as a service” with enough knobs for customization of Jenkins clusters.

Where to Start?

Slack did what we all do… they conducted research on what large-scale companies were using for their build systems. Slack engineers had the opportunity to meet with multiple companies to discuss their build systems. These meetings helped them learn from and, where possible, replicate other build systems.

As someone who reads many engineering blog posts each week, I see the same build system requirements come up again and again.

The following is an incomplete list of features and concepts that Slack implemented (the team’s post is much more comprehensive):

Stateless and Immutable CI Service: Separating the business logic from the underlying build infrastructure made the CI service stateless. This led to quicker and safer building and deploying of build infrastructure, the option to adopt shift-left strategies, and an improvement in maintainability. All build-related scripts were moved to a repo independent from where the business logic resided. The team used Kubernetes to help build Jenkins services, which helped address immutable infrastructure, efficient utilization of resources, and high availability. Every service was built from scratch, thus eliminating residual state.

Security Operations as part of the Service Deployment Pipeline: Obvious for many reasons in today’s never-ending blast of cyberattacks. Slack instituted identity and access management (IAM) and role-based access control (RBAC) policies per cluster. Vulnerability scanning takes place each time the Jenkins service is built.
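To make the per-cluster access idea concrete, here is a minimal sketch of a role-based check scoped to individual clusters. It is not Slack's implementation: the cluster names, roles, actions, and the `is_allowed` helper are all hypothetical and exist only to illustrate how a policy can differ from one cluster to the next.

```python
# Hypothetical per-cluster RBAC table: cluster -> role -> allowed actions.
# The cluster names, roles, and actions are illustrative, not Slack's.
POLICIES = {
    "ios-builds": {
        "admin": {"configure", "run_job", "view"},
        "developer": {"run_job", "view"},
    },
    "web-builds": {
        "admin": {"configure", "run_job", "view"},
        "viewer": {"view"},
    },
}


def is_allowed(cluster: str, role: str, action: str) -> bool:
    """Return True only if the role is granted the action on that cluster.

    Unknown clusters and unknown roles are denied by default.
    """
    return action in POLICIES.get(cluster, {}).get(role, set())
```

Because the lookup is keyed by cluster first, a role that exists on one cluster grants nothing on another, and anything not explicitly listed is denied.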

More shift-left to avoid finding issues later: Testing is definitely the move here. This one specifically is coming up more and more. It is always better to find bugs in development than it is to find bugs in production.

Slack used a blanket test cluster and pre-staging area for testing out small/large impact changes to the CI system even before they hit the rest of the staging environments. This also allowed high-risk changes to be baked in for an extended time period before pushing changes to production. The additional testing led to better developer productivity and an improved user experience which is similar to results found in other shift-left articles.
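One way to picture the bake-time idea is a small promotion gate, sketched below. The stage names, the risk levels, and the bake times in hours are assumptions made for illustration; the article does not state the values Slack uses.

```python
# Hypothetical stage order and minimum bake times (hours) per risk level.
# None of these values come from Slack; they only illustrate the gating idea.
STAGES = ["test", "pre-staging", "staging", "production"]
MIN_BAKE_HOURS = {"low": 1, "high": 48}


def next_stage(current: str, risk: str, hours_baked: float) -> str:
    """Return the stage a change may occupy after baking in `current`."""
    index = STAGES.index(current)
    if index == len(STAGES) - 1:
        return current  # already in production, nowhere further to go
    if hours_baked < MIN_BAKE_HOURS[risk]:
        return current  # not baked long enough to promote
    return STAGES[index + 1]
```

A high-risk change that has baked for only a few hours stays where it is, while a low-risk change moves on quickly.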

In Addition to New Features, the Clusters Had to Be Standardized

Standardization in this case meant that a single fix could be applied uniformly to the Jenkins inventory. For this, Slack used CasC (Configuration as Code), a configuration management plugin for Jenkins.

Central storage ensured all Jenkins instances used the same plugins to avoid snowflaking. This allowed for automatic upgrades and alleviated any need for manual intervention or version incompatibility.
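As a rough illustration of how a central plugin list guards against snowflaking, the sketch below compares one cluster's installed plugins with a pinned manifest. The manifest contents, the version strings, and the `plugin_drift` helper are hypothetical; the article does not describe how Slack stores or checks its plugin list.

```python
from __future__ import annotations

# Hypothetical central manifest: plugin name -> pinned version.
# The version strings are made up for illustration.
CENTRAL_PLUGINS = {
    "configuration-as-code": "1.55",
    "kubernetes": "3600.v1",
    "git": "4.11.3",
}


def plugin_drift(installed: dict[str, str]) -> dict[str, list[str]]:
    """Compare one cluster's installed plugins against the central manifest."""
    missing = sorted(p for p in CENTRAL_PLUGINS if p not in installed)
    extra = sorted(p for p in installed if p not in CENTRAL_PLUGINS)
    mismatched = sorted(
        p for p, v in installed.items()
        if p in CENTRAL_PLUGINS and CENTRAL_PLUGINS[p] != v
    )
    return {"missing": missing, "extra": extra, "mismatched": mismatched}
```

An empty report for every cluster means the whole fleet runs the same plugin set, so one upgrade to the manifest applies everywhere.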

GitOps Style Management

Git became the single source of truth. Nothing was built or run on Jenkins controllers. This was enforced with GitOps. Configurations were managed through templates, which made it easy for users to create clusters and to reuse existing configurations when making changes.

The entire build infrastructure could be recreated from scratch with the exact same result every time as all infrastructure operations came from Git using the GitOps model.
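The combination of shared templates and reproducible output can be sketched in a few lines. The example below merges per-cluster overrides onto a base template and fingerprints the result, so the same inputs from Git always yield the same configuration. The template keys, the `render` and `fingerprint` helpers, and the use of JSON are assumptions for illustration, not Slack's tooling.

```python
import hashlib
import json

# Hypothetical shared template, as it might be stored in Git.
BASE_TEMPLATE = {
    "jenkins": {"numExecutors": 0, "systemMessage": "Managed from Git"},
    "metrics": {"enabled": True},
}


def render(base: dict, overrides: dict) -> dict:
    """Recursively merge per-cluster overrides onto the shared template."""
    merged = dict(base)  # shallow copy so the template itself is not modified
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = render(merged[key], value)
        else:
            merged[key] = value
    return merged


def fingerprint(config: dict) -> str:
    """Hash a canonical JSON form so equal configs produce equal hashes."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Rendering the same template and overrides twice gives identical fingerprints, which is the property that lets a cluster be rebuilt from scratch with the same result.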

Configuration Management

Debugging was aided by the enabling of metrics, logging, and tracing on each cluster. The ability to re-use credentials was now available on applicable clusters. Upgrading the Jenkins operating system, packages, and plugins was quick as everything was contained in a Dockerfile.

The features listed here, along with others covered in the original article, come together in this diagram, which represents the flow of the new build system.

[Image: flow of the new build system]

Distributed Ownership

The Build team managed systems in the build platform infrastructure and the remaining systems would be managed by service owner teams using the build platform.

[Image]

Challenges and Conclusion

As we mentioned, there was a long list of challenges. But overall the modernization effort led to a lot of learning, teaching, debugging, and top-notch documentation writing. The graph below shows that the process was well worth the struggle, though.

[Image: graph]

Individual services were built and deployed quickly and in a safe and secure manner. Time to address security vulnerabilities went down, and standardization of the Jenkins inventory reduced the number of code paths required to maintain the fleet. Infrastructure changes could be rolled out quickly and rolled back quickly if required.

The migration started slowly with a few of Slack's existing production build clusters. New clusters are now being built with the new system, and this is what’s helping improve the delivery timelines shown above. The migration of all services into the new build system is underway, and new features are being added.

Tech changes on a dime, and there is a difference between operational and optimal. Sometimes it’s just time to dive in and make the change.

Jessica Wachtel is a developer marketing writer at InfluxData where she creates content that helps make the world of time series data more understandable and accessible. Jessica has a background in software development and technical journalism.