URL: https://thenewstack.io/how-to-deal-with-flaky-tests/

How to Deal with Flaky Tests - The New Stack


How to Deal with Flaky Tests

Flaky tests aren’t trivial. Even if flakiness seems random at first glance, many factors play a role in test functionality.
Mar 30th, 2021 11:00am by Serkan Özal
Thundra sponsored this post.

Serkan Özal
Serkan is co-founder and CTO of Thundra. He has 10+ years of expertise in software development, is an AWS Certified PRO and has a patent on distributed environments. He mainly works on serverless architectures, distributed systems and monitoring tools.

The main goal of building a CI/CD pipeline is to improve developer velocity. The more you can automate your integration and deployment process, the faster you can get new releases out the door. A test suite — a collection of tests that check for bugs that were introduced into your code — is a crucial part of such a pipeline.

Sometimes tests are flaky: they pass or fail seemingly at random without any code changes. To tell a real bug from a flaky test, you have to run the test suite multiple times, which slows down your CI/CD pipeline tremendously. As noted above, these pipelines should improve developer velocity, not hold you back.

If you have too many flaky tests in your suite, they can also wear down the trust you have in those tests. If one test fails at random, how can you trust that another test won’t do the same? Without confidence in your tests, your team could stop taking test results seriously and even stop writing them in the first place.

How to Spot a Flaky Test

There are two ways to get a feel for how flaky your tests are. One is to run a test, or even a whole test suite, multiple times. If you didn’t change any code between test runs and the test suite shows a different number of failed tests at every run, you can be sure that something has gone awry with your test.

The other way is to run them in a different order each time. If your tests fail when their order is changed, it’s a sign that you haven’t accounted for inter-test dependencies.
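Both checks can be combined in a few lines. The following is a minimal sketch, not a replacement for a real test runner: it assumes each test is a plain callable that raises an exception on failure, and the name `find_flaky` and its parameters are made up for this illustration.

```python
import random
from typing import Callable, Dict, List, Set


def find_flaky(
    tests: Dict[str, Callable[[], None]],
    runs: int = 10,
    seed: int = 0,
) -> List[str]:
    """Run every test `runs` times in shuffled order.

    Returns the names of tests that both passed and failed
    even though no code changed between runs.
    """
    rng = random.Random(seed)  # fixed seed, so a suspicious order can be replayed
    outcomes: Dict[str, Set[bool]] = {name: set() for name in tests}
    names = list(tests)
    for _ in range(runs):
        rng.shuffle(names)  # a different execution order on every run
        for name in names:
            try:
                tests[name]()
                outcomes[name].add(True)
            except Exception:
                outcomes[name].add(False)
    # A test with mixed outcomes is a flakiness candidate.
    return sorted(name for name, seen in outcomes.items() if len(seen) > 1)
```

A test that fails on every run is not reported here, since a consistent failure points to a real bug rather than to flakiness.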

Some tools can help with this. For Java developers, there’s a Gradle plugin that will rerun your tests to see if they’re deterministic. While this doesn’t save you time, it at least automates the process of finding flaky tests.

Spotify built a GitHub bot that can run the tests for new code multiple times before a merge; you can start it manually on a pull request. While the bot is not publicly available, it’s easy to build such a tool.

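For JavaScript developers, there is AVA, a test runner that runs test files in parallel and the tests within each file concurrently. Tests that silently depend on execution order or on shared state therefore tend to fail quickly, which makes inter-test dependencies easier to spot.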

Another option is Thundra Sidekick, which enables developers to troubleshoot tests with non-intrusive debugging by letting them set breakpoints. In this way, developers can figure out if the tests are really flaky or not.

How to Address Flaky Tests

Now that you can find the flaky tests, you need to fix them. Here are seven highly effective strategies.

1. Visualizing Test Runs

Test-run visualizations alone can give you an idea of how well your tests work. Combined with multiple randomized test runs, they also show clearly whether flaky tests are increasing or decreasing over time. A simple table with one row per run, ordered by time, and one column per test can be enough for this.
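Such a table needs nothing more than the recorded pass/fail results. Here is a minimal sketch that renders it as plain text; the function name `render_history` and the input shape (one dictionary of test name to result per run) are assumptions made for this example.

```python
from typing import Dict, List


def render_history(history: List[Dict[str, bool]]) -> str:
    """Render one row per run and one column per test.

    '.' marks a pass, 'F' a failure, '-' a test that did not run.
    """
    names = sorted({name for run in history for name in run})
    lines = ["run  " + "  ".join(names)]
    for index, run in enumerate(history, start=1):
        cells = []
        for name in names:
            if name not in run:
                mark = "-"
            else:
                mark = "." if run[name] else "F"
            cells.append(mark.ljust(len(name)))  # align under the column header
        lines.append(f"{index:<3}  " + "  ".join(cells))
    return "\n".join(lines)
```

A column that mixes `.` and `F` while the code stayed the same is the visual signature of a flaky test.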

2. Quarantining Flaky Tests

Once you’ve found your flaky tests, you should create a separate test suite for them to serve as a quarantine. Your non-flaky tests don’t have to be run multiple times, so creating an extra test suite will save you from duplicating part of the work. Google has even created a tool to help with this by automatically putting flaky tests in a separate test suite.

This practice also helps when fixing the flaky tests because it lets you focus on them independently. If isolating the tests in their own suite makes the flakiness disappear, that alone suggests inter-test dependence was the cause.
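Deciding which tests go into quarantine can be automated from recorded results. This is only a sketch under the assumption that you keep a list of pass/fail outcomes per test for an unchanged code base; `split_suites` is a hypothetical helper, not the Google tool mentioned above.

```python
from typing import Dict, List, Tuple


def split_suites(history: Dict[str, List[bool]]) -> Tuple[List[str], List[str]]:
    """Return (stable, quarantined) test names.

    A test is quarantined when it has both passed and failed
    across runs of the same code.
    """
    stable: List[str] = []
    quarantined: List[str] = []
    for name, results in sorted(history.items()):
        mixed = len(set(results)) > 1
        (quarantined if mixed else stable).append(name)
    return stable, quarantined
```

Tests that always fail stay in the stable suite on purpose: they signal a real defect and should keep blocking the pipeline.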

3. Cleaning up State

Remove all state and data generated before a test run, so your test can’t be derailed by existing data you forgot about. This state can live in caches, databases or even variables. You’ll also want to check that your tests clean up correctly after they’re done — clean-up errors are often silently ignored in test suites. In a worst-case scenario, you’ll need to rebuild the whole system for every test run.

For databases, it can be helpful to use transactions. These can be rolled back after a test run, bringing the database back to the state it was in before the test was started.

4. Looking for Timeouts

Asynchronous tests that access network resources are especially prone to flaking due to timeouts. The network can be quick or slow depending on the number of services using it, so a timeout that is too short can cause a test to flake. Defining your timeout values in one central place will allow you to change them quickly in the future.

If you have a complex test relying on asynchronous services, try to check the service for availability before starting the test. That way the test can fail or be skipped right away when a dependency is down, instead of waiting for long timeouts to expire.
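Both ideas fit in a small helper module. In this sketch the environment variable `TEST_TIMEOUT_SCALE`, the constant names and the default values are assumptions chosen for illustration; the availability check only verifies that a TCP connection can be opened, not that the service is healthy.

```python
import os
import socket

# One place to tune every timeout, for example on a slow CI machine.
# TEST_TIMEOUT_SCALE is a hypothetical variable name.
TIMEOUT_SCALE = float(os.environ.get("TEST_TIMEOUT_SCALE", "1.0"))
CONNECT_TIMEOUT = 2.0 * TIMEOUT_SCALE   # seconds
REQUEST_TIMEOUT = 10.0 * TIMEOUT_SCALE  # seconds


def is_reachable(host: str, port: int, timeout: float = CONNECT_TIMEOUT) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections and timeouts
        return False
```

A test setup step could call `is_reachable` first and skip the test with a clear message when it returns `False`, rather than letting the test run into `REQUEST_TIMEOUT`.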

5. Using Test Doubles

If you test against a service that isn’t deterministic, you can replace it with a simplified, predictable version: a test double. A common critique of this practice is that test doubles don’t always accurately mimic the actual service, especially after the service has been updated. Writing contract tests, which check that the double and the original behave the same way, can help mitigate this problem.
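To make the idea concrete, here is a minimal sketch with an invented exchange-rate service. Every name in it is hypothetical, and `LiveLikeRateService` only imitates a non-deterministic dependency so that the example stays self-contained.

```python
import random
from typing import Dict, Protocol


class RateService(Protocol):
    def rate(self, currency: str) -> float: ...


class LiveLikeRateService:
    """Stand-in for the real service: the value changes from call to call."""

    def rate(self, currency: str) -> float:
        if currency != "EUR":
            raise KeyError(currency)
        return round(random.uniform(0.8, 1.2), 4)


class FixedRateService:
    """Test double: always returns the rates it was given."""

    def __init__(self, rates: Dict[str, float]) -> None:
        self._rates = dict(rates)

    def rate(self, currency: str) -> float:
        return self._rates[currency]


def convert(amount: float, currency: str, service: RateService) -> float:
    return round(amount * service.rate(currency), 2)


def check_rate_contract(service: RateService) -> None:
    """Contract test: run it against the double and the original alike."""
    value = service.rate("EUR")
    assert isinstance(value, float) and value > 0
    try:
        service.rate("???")
    except KeyError:
        pass
    else:
        raise AssertionError("unknown currency must raise KeyError")
```

Tests of `convert` use `FixedRateService` and get the same result on every run, while `check_rate_contract` is run against both implementations to notice when the double drifts from the original.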

6. Checking the System Clock

If your code depends on data that can’t be known in advance, such as the system clock, wrap these data sources in your code and don’t rely on them directly. This will allow you to replace their outputs with hard-coded data before running a test.
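For the system clock, the wrapper can be as small as a function that is passed in as a parameter. The function names below are illustrative assumptions; the point of the sketch is only that the code asks the injected `now` callable for the time instead of calling the clock directly.

```python
from datetime import datetime, timezone
from typing import Callable


def utc_now() -> datetime:
    """The only place that touches the real system clock."""
    return datetime.now(timezone.utc)


def is_expired(expires_at: datetime, now: Callable[[], datetime] = utc_now) -> bool:
    """Production code uses the default; tests pass in a fixed clock."""
    return now() >= expires_at


# In a test, the clock is replaced with hard-coded data:
FIXED = datetime(2021, 3, 30, 11, 0, tzinfo=timezone.utc)


def fixed_clock() -> datetime:
    return FIXED
```

With `fixed_clock`, a check such as `is_expired(deadline, now=fixed_clock)` gives the same answer at midnight, on a leap day, or on a slow build machine.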

7. Checking for Memory Leaks

Profile your test code to get a feeling for its memory usage over time. If your code has memory leaks, you’ll see your test suite’s memory usage grow with every test run. Depending on the available resources and other systems running on that hardware, a memory leak could very well be the source of your flakiness problems.
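In Python, the standard library’s `tracemalloc` module is enough for a first look. The helper name `memory_growth` and the run count are assumptions for this sketch, and it only sees memory allocated through the Python allocator, not leaks inside native libraries.

```python
import gc
import tracemalloc
from typing import Callable


def memory_growth(test: Callable[[], None], runs: int = 50) -> int:
    """Return how many bytes of traced memory were gained over `runs` runs."""
    tracemalloc.start()
    try:
        test()  # warm-up, so one-time allocations are not counted
        gc.collect()
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(runs):
            test()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before
```

A result that scales with `runs` suggests that something is retained after each test; a result that stays near zero suggests the test releases what it allocates.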

If you use resource allocation pools as wrappers between your code and the actual memory allocation, requesting too much will cause your code to fail in a deterministic way. You can then try to fix the issue by reducing the amount of memory the code requests.

Summary

Flaky tests aren’t trivial. Even if flakiness seems random at first glance, it’s important to keep in mind that pre-existing state, network problems, timing, or even memory allocation can all play a role in how well a test functions.

Luckily, there are viable methods to eliminate flakiness from your tests. Rerun your tests multiple times, change their execution order, and visualize how they succeed and fail over time. Finally, quarantine flaky tests into a separate test suite and try to fix them by looking at the potential root causes.

Flaky tests cost you time and money by slowing down your CI/CD pipeline with multiple reruns, and they erode the trust your team has in testing. Thundra Sidekick is currently available as an IntelliJ IDEA plugin for Java applications. You can get started in just a few minutes and begin troubleshooting tests today so your team doesn’t lose any more time.

Featured image via Pixabay.
