URL: https://thenewstack.io/embracing-testing-in-production/

Embracing Testing in Production - The New Stack


Embracing Testing in Production

While testing in production is no substitute for comprehensive unit and integration tests, it allows you to reproduce and observe quirky behavior.
May 2nd, 2022 9:22am by Matt Casperson
Featured image for: Embracing Testing in Production
Photo by Tianyi Ma on Unsplash.
Octopus Deploy sponsored this post. Insight Partners is an investor in Octopus Deploy and TNS.

At Octopus, we recently embarked on a new initiative to deliver tools that build opinionated GitHub Actions workflows and Jenkins Pipelines to help customers implement their continuous integration and continuous delivery (CI/CD) workflows.

The team was small (me), the deadlines were tight, and the tools were designed to build custom scripts and templates that themselves built yet more scripts and templates to be executed in platforms that were proving impractical to emulate locally.

Matt Casperson
I have loved technology since my first Commodore 64, and that passion remains to this day. As principal content engineer, my days have me integrating enterprise platforms with Octopus, writing guides and books for platforms like Kubernetes, blogging, and training my colleagues, testing bleeding edge open source projects and contributing to various blogs.

I found writing code that writes code to be a uniquely frustrating experience. Linters and compilers are only useful on the final output and offer little insight into my original templates, which are full of markup syntax that often makes them invalid examples of their final output. I also did not have a clear picture of what the final result should look like, and instead iterated with many small changes tested in rapid succession.

Compounding the problem, the target platform executing the templates, GitHub Actions, did not have a robust offline option. Some interesting and active open source projects have sprung up to fill this gap, but I preferred to validate my templates in GitHub directly.

The final hurdle was the microservice architecture with which these tools were developed. Sitting in front of the template generator were convenient and well-tested web-based interfaces and services that pushed files to GitHub on the end users’ behalf. I could, of course, automate the process of typing git push in an isolated testing loop, but I did wonder if there was a better way to reuse the services that had already implemented this process.

Unsurprisingly, a majority of the advice on the internet implored me to isolate, record, replay, mock, substitute and automate my testing and development efforts. A recent Twitter post by Mitchell Hashimoto sums this up nicely:

I often recommend to more junior engineers: build feature X without ever running the software. Open a PR with confidence to say “this works” without having to SEE it work. This helps build understanding, confidence, and forces writing testable code.

I totally agree with this statement. At the same time, what I really wanted to do was leverage the existing microservice stack deployed to a shared environment while locally running the one microservice I was tweaking and debugging. This process would remove the need to reimplement live integrations for the sake of isolated local development, which was appealing because these live integrations would be the first things to be replaced with test doubles in any automated testing anyway. It would also create the tight feedback loop between the code I was working on and the external platforms that validated the output, which was necessary for the kind of “Oops, I used the wrong quotes, let me fix that” workflow I found myself in.

Looking for Inspiration

My Googling led me to “Why We Leverage Multi-tenancy in Uber’s Microservice Architecture,” which provides a fascinating insight into how Uber has evolved its microservice testing strategies.

The post describes parallel testing, which involves creating a complete test environment isolated from the production environment. I suspect most development teams are familiar with test environments. However, the post goes on to highlight the limitations of a test environment, including additional hardware costs, synchronization issues, unreliable testing and inaccurate capacity testing.

The alternative is testing in production. The post identifies the requirements to support this kind of testing:

There are two basic requirements that emerge from testing in production, which also form the basis of multitenant architecture:

  • Traffic Routing: Being able to route traffic based on the kind of traffic flowing through the stack.
  • Isolation: Being able to reliably isolate resources between testing and production, thereby causing no side effects in business-critical microservices.

The ability to route test traffic to a specific and isolated microservice was exactly what I was looking for. It removed the need to recreate the entire microservice stack and supporting platforms locally for testing while leaving any production traffic unaffected.

The only question was how to implement this with AWS Lambdas, which were hosting our microservices.

Looking for an Existing Solution

Unfortunately, while Kubernetes platforms can take this kind of routing for granted with advanced tooling like service meshes, there was no such ecosystem for Lambdas. Lambda extensions come tantalizingly close, but are focused on collecting metrics or modifying the execution environment rather than intercepting and modifying network traffic like Kubernetes does with sidecars:

You can deploy multifunction Lambda Layers to manage large binaries, or (now in preview) Lambda Extensions to plug in third-party agents that I’ve been told should definitely not be thought of as “sidecars for Lambda.”

The AWS App Mesh FAQ makes no mention of Lambdas, and while there are many questions about dynamically routing Lambda traffic on sites like Stack Overflow, the response is always “you’re on your own.”

Defining the Problem

The problem I was trying to solve was traditionally the domain of a reverse proxy. However, unlike traditional reverse proxies, which have rich, static, server-side rules, what I needed was a rather dumb reverse proxy that implemented routing rules embedded in test requests.

This dumb reverse proxy (DRP, or even better, “derp”) needed to be deployed as a Lambda, and required the ability to forward traffic to upstream HTTP servers, Lambdas and even Amazon Simple Queue Service (SQS) queues. Looking further ahead, it would also be nice if the DRP could integrate with other platforms, like Azure or Google Cloud.

Go was the perfect choice to build the DRP. It already has an HTTP reverse proxy included in the standard library, compiles to native binaries with a short cold boot time and is popular enough to have first-class SDKs for major cloud providers.
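To illustrate the standard library support mentioned above, here is a minimal sketch that builds a reverse proxy with net/http/httputil. It is not the DRP's actual code: the NewProxy helper and the upstream address in the usage note are assumptions made for illustration.

```go
package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
)

// NewProxy (a hypothetical helper) returns a handler that forwards every
// incoming request to the given upstream base URL.
func NewProxy(upstream string) (http.Handler, error) {
	target, err := url.Parse(upstream)
	if err != nil {
		return nil, err
	}
	// NewSingleHostReverseProxy rewrites the request URL so that it
	// points at the target before sending it on.
	return httputil.NewSingleHostReverseProxy(target), nil
}
```

A handler built this way, for example with NewProxy("http://127.0.0.1:8080"), can be passed to http.ListenAndServe when running outside of Lambda.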

Deciding on Routing Rules

Given the routing rules are included with each test request, it made sense to include them in an HTTP header. It is certainly possible to send complex objects, like JSON blobs, in HTTP headers, but a better solution was to allow the routing rules to be defined as a simple string.

The rules take the form route[/path/to/resource:METHOD]=destination[destinationname] where:

  • /path/to/resource is an HTTP path, optionally supporting Ant path syntax e.g. /path/**/resource
  • METHOD is the HTTP method such as GET, POST, DELETE, PATCH, etc
  • destination is the upstream service to redirect the traffic to such as sqs, lambda, or http
  • destinationname identifies the upstream service, be it a Lambda name, SQS queue, or HTTP server

Multiple such rules are concatenated with a semicolon, leading to headers like route[/api/audits:GET]=http[https://1198-118-208-16-252.ngrok.io];route[/api/customers:DELETE]=lambda[CustomerLambda-MyFeatureBranch].
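A header in this format can be parsed with a few lines of Go. The following is a sketch under stated assumptions rather than the DRP's real parser: the Rule type, its field names and the ParseRules function are all hypothetical, and splitting on semicolons assumes destination names never contain one.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Rule is one parsed routing rule. The type and field names are
// illustrative and are not taken from the DRP source.
type Rule struct {
	Path        string // HTTP path, possibly with Ant-style wildcards
	Method      string // HTTP method, such as GET or DELETE
	Destination string // upstream type: http, lambda or sqs
	Name        string // upstream identifier: URL, Lambda name or queue
}

// rulePattern matches route[/path:METHOD]=destination[destinationname].
var rulePattern = regexp.MustCompile(`^route\[([^:\]]+):([A-Za-z]+)\]=([a-z]+)\[([^\]]+)\]$`)

// ParseRules splits the header value on semicolons and parses each rule.
// It returns an error for the first rule that does not match the format.
func ParseRules(header string) ([]Rule, error) {
	var rules []Rule
	for _, part := range strings.Split(header, ";") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue // tolerate a trailing semicolon
		}
		m := rulePattern.FindStringSubmatch(part)
		if m == nil {
			return nil, fmt.Errorf("invalid routing rule %q", part)
		}
		rules = append(rules, Rule{
			Path:        m[1],
			Method:      strings.ToUpper(m[2]),
			Destination: m[3],
			Name:        m[4],
		})
	}
	return rules, nil
}
```

Passing the example header above to ParseRules yields two rules: one sending GET /api/audits to the ngrok URL over HTTP, and one sending DELETE /api/customers to the CustomerLambda-MyFeatureBranch Lambda.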

This string is passed to each microservice in the Routing header, and each microservice is expected to pass this header along with each outgoing call. In the absence of any routing rules, the DRP routes traffic to a default upstream service.
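The lookup with a default fallback might be sketched as follows. This is an illustration, not the DRP's implementation: the Rule and Upstream types and the MatchPath and Resolve functions are hypothetical, "first matching rule wins" is an assumption, and the matcher handles only a simplified subset of Ant path syntax in which ** stands for zero or more whole path segments.

```go
package main

import (
	"path"
	"strings"
)

// Rule and Upstream are illustrative types, not taken from the DRP source.
type Rule struct {
	Path, Method, Destination, Name string
}

type Upstream struct {
	Destination, Name string
}

// splitPath breaks a path into segments, ignoring leading and trailing slashes.
func splitPath(p string) []string {
	p = strings.Trim(p, "/")
	if p == "" {
		return nil
	}
	return strings.Split(p, "/")
}

// matchSegments compares pattern segments against request segments.
// "**" matches zero or more segments; any other segment is matched with
// path.Match, so * and ? work within a single segment.
func matchSegments(pattern, segs []string) bool {
	if len(pattern) == 0 {
		return len(segs) == 0
	}
	if pattern[0] == "**" {
		// Try letting "**" consume zero, one, two, ... segments.
		for i := 0; i <= len(segs); i++ {
			if matchSegments(pattern[1:], segs[i:]) {
				return true
			}
		}
		return false
	}
	if len(segs) == 0 {
		return false
	}
	ok, err := path.Match(pattern[0], segs[0])
	if err != nil || !ok {
		return false
	}
	return matchSegments(pattern[1:], segs[1:])
}

// MatchPath reports whether requestPath matches the Ant-style pattern.
func MatchPath(pattern, requestPath string) bool {
	return matchSegments(splitPath(pattern), splitPath(requestPath))
}

// Resolve returns the upstream of the first rule matching the request,
// or the fallback when no rule matches (including when rules is empty).
func Resolve(rules []Rule, method, requestPath string, fallback Upstream) Upstream {
	for _, r := range rules {
		if strings.EqualFold(r.Method, method) && MatchPath(r.Path, requestPath) {
			return Upstream{Destination: r.Destination, Name: r.Name}
		}
	}
	return fallback
}
```

With this sketch, a request that carries no rules, or whose path and method match none of them, resolves to the fallback, which mirrors the default upstream behavior described above.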

Enabling Advanced Deployment and Testing Patterns

These routing rules provide for some interesting deployment and testing scenarios.

Feature branching is supported by deploying a feature branch Lambda with a unique name, like TemplateGenerator-MyFeatureBranch, and routing test requests to the feature branch Lambda.

Blue/green deployments are achieved by deploying the new green microservices parallel to the existing blue microservices, testing the green stack by routing test requests via the DRP, and once the tests pass, reconfiguring the DRP to set the default upstream services to those in the green stack.

Perhaps most exciting of all is the ability to route test traffic from the cloud network back to your local PC. Using services like ngrok to expose a local port via a public hostname or a standard client VPN into an AWS VPC, it is possible to route traffic for a single microservice back to your desktop environment. Much like the Kubernetes offerings Telepresence or Bridge to Kubernetes, this effectively allows a locally run microservice to participate in requests passed around a remote microservice stack.

Thinking about Security

There are obvious issues with allowing anyone to route traffic anywhere based on a well-known header, so the DRP is configured to inspect the Routing header only if the request is accompanied by an OAuth JSON Web Token (JWT) bearer token indicating the sender is a member of a known Cognito group. If the JWT is missing or invalid, the Routing header is ignored. This ensures that only trusted team members can route production traffic.
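The gating logic can be sketched as a small function that returns the routing rules only for trusted callers. Everything here is an assumption made for illustration: the GroupVerifier type, the TrustedRoutingRules and StaticVerifier names, and the idea of injecting token verification as a function. A real verifier must check the token's signature and claims against the Cognito user pool; the StaticVerifier below is a test double that does none of that.

```go
package main

import (
	"errors"
	"strings"
)

// GroupVerifier (hypothetical) validates a bearer token and returns the
// groups it asserts. A production implementation must verify the JWT's
// signature, issuer and expiry before trusting any claim.
type GroupVerifier func(token string) (groups []string, err error)

// TrustedRoutingRules returns the Routing header value only when the
// Authorization header carries a valid bearer token whose groups include
// trustedGroup. In every other case it returns "", so the caller falls
// back to default routing.
func TrustedRoutingRules(authorization, routing string, verify GroupVerifier, trustedGroup string) string {
	const prefix = "Bearer "
	if !strings.HasPrefix(authorization, prefix) {
		return "" // missing or non-bearer credentials: ignore the rules
	}
	groups, err := verify(strings.TrimPrefix(authorization, prefix))
	if err != nil {
		return "" // invalid token: ignore the rules
	}
	for _, g := range groups {
		if g == trustedGroup {
			return routing
		}
	}
	return "" // valid token, but not a member of the trusted group
}

// StaticVerifier is a test double that maps known token strings to
// groups. It performs no cryptographic checks and is for examples only.
func StaticVerifier(known map[string][]string) GroupVerifier {
	return func(token string) ([]string, error) {
		groups, ok := known[token]
		if !ok {
			return nil, errors.New("unknown or invalid token")
		}
		return groups, nil
	}
}
```

In an HTTP handler, the first two arguments would come from the request's Authorization and Routing headers. Returning an empty string rather than an error keeps untrusted requests flowing to the default upstream instead of rejecting them.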

The DevX Impact of Testing in Production

This local development experience was incredibly valuable. It allowed me to use the stable production environment, removing the need to recreate the complete microservice stack and supporting platforms locally, while also allowing me to iterate locally on a single microservice that received test traffic otherwise identical to production traffic. And all of this was done safe in the knowledge that no production traffic was affected.

An additional benefit was that all logic implemented in the API Gateway was respected. API Gateway is a complex platform offering almost unlimited options for manipulating traffic before it reaches upstream services. It is possible to run a local test API Gateway, but setting this up is now unnecessary.

However, this approach does require that each microservice expose both a Lambda event handler to respond to traffic from the API Gateway and an HTTP server to expose the service while debugging locally. This wasn’t a burden though, as all modern frameworks make it easy to spin up an HTTP server. In practice, this means those structuring their code along domain-driven design (DDD) layers will have an application layer exposing both an HTTP and a Lambda interface, with the lower layers being ignorant of how the traffic was received.
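A rough sketch of that dual interface is shown below, using only the standard library. The Generate function, both handlers, and the GatewayRequest and GatewayResponse types are hypothetical; the two structs are simplified stand-ins for the API Gateway proxy event types provided by the aws-lambda-go module, which a real Lambda would use instead.

```go
package main

import (
	"io"
	"net/http"
	"strings"
)

// GatewayRequest and GatewayResponse are simplified stand-ins for the
// API Gateway proxy event types; they are assumptions for this sketch.
type GatewayRequest struct {
	Body string
}

type GatewayResponse struct {
	StatusCode int
	Body       string
}

// Generate is the transport-agnostic core (hypothetical). It knows
// nothing about HTTP servers or Lambda events.
func Generate(input string) (status int, body string) {
	input = strings.TrimSpace(input)
	if input == "" {
		return http.StatusBadRequest, "missing input"
	}
	return http.StatusOK, "generated template for " + input
}

// LambdaHandler adapts a Lambda-style event to the core logic.
func LambdaHandler(req GatewayRequest) (GatewayResponse, error) {
	status, body := Generate(req.Body)
	return GatewayResponse{StatusCode: status, Body: body}, nil
}

// HTTPHandler adapts a plain HTTP request to the same core logic, for
// running the service locally while debugging.
func HTTPHandler(w http.ResponseWriter, r *http.Request) {
	payload, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, "cannot read request body", http.StatusBadRequest)
		return
	}
	status, body := Generate(string(payload))
	w.WriteHeader(status)
	io.WriteString(w, body)
}
```

Locally, HTTPHandler can be registered with http.HandleFunc and served with http.ListenAndServe, while the deployed build would hand an equivalent of LambdaHandler to the Lambda runtime. Both paths call the same Generate function, so the lower layers never learn which transport delivered the request.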

Conclusion

I fully expect all Lambdas we deploy in the future will be hosted behind a DRP. The productivity gains unlocked by the ability to develop and debug individual microservices locally against a stable production microservice stack are undeniable. While testing in production is no substitute for comprehensive unit and integration tests, it does allow you to quickly reproduce and observe quirky behavior, and experiment with new ideas and solutions.

The DRP source code is available on GitHub. Let us know if you find it useful!
