
URL: https://thenewstack.io/the-unexpected-costs-of-flaky-tests/

The Unexpected Costs of Flaky Tests - The New Stack



The Unexpected Costs of Flaky Tests

By fixing flaky tests right from the start, you’ll be able to correct course before things get out of hand.
Apr 6th, 2021 12:00pm by Serkan Özal
Featured image via Pixabay.
Thundra sponsored this post.
Serkan Özal
Serkan is co-founder and CTO of Thundra. He has 10+ years of expertise in software development, is an AWS Certified PRO and has a patent on distributed environments. He mainly works on serverless architectures, distributed systems and monitoring tools.

With cloud computing and serverless technology, where most systems are distributed by nature, flakiness can become a major concern for companies whose primary products are software. What’s more, the bigger an organization’s systems get, the greater the likelihood that they’ll end up with even more flaky tests.

So, why are flaky tests such a menace to the development process? Flaky tests can render a whole test suite unreliable, which leads to technical and commercial problems of all kinds. If you can’t tell whether a failure comes from an error in your code, a slow network, or a configured test timeout, your team will lose its trust in the test suite.

In a previous article, we talked about what causes flaky tests. Today, we’ll look into their implications for your team and company.

Understanding Flaky Tests

In 2019, a group of researchers from the University of Zürich and Mozilla interviewed and surveyed over a hundred developers to determine the consequences of flaky tests.

The resulting paper, “Understanding Flaky Tests: The Developer’s Perspective,” also looks into the causes and impacts of flaky tests on development and test teams. When they asked developers to categorize flaky tests by their root causes, the researchers found 11 types:

  1. Concurrency: In multithreaded software, the threads rely on an implicit ordering of the data, but race conditions occur.
  2. Async wait: A system starts asynchronous tasks but doesn’t wait for them to finish.
  3. Too restrictive range: Tests define a range of valid outputs, but actual outputs go out of that range while still being valid results.
  4. Test order dependency: The outcome of one test relies on the test running before it.
  5. Test case timeout: The size of a test grew over time, but the timeout wasn’t increased.
  6. Resource leak: Memory isn’t released properly and can overflow in some cases.
  7. Platform dependency: A test relies on platform-specific behavior, such as a task that yields a deterministic result on one operating system and a non-deterministic one on another.
  8. Float precision: Float overflows or underflows were not considered, but are a crucial part of the test result.
  9. Test suite timeout: In contrast to the test case timeout, no single test is responsible for the flakiness here; instead, the tests in aggregate cause the entire test suite to time out.
  10. Time: Here a test relies on the local system clock and becomes flaky. For example, this could happen when two timestamps of different time zones are compared.
  11. Randomness: Sometimes actual randomness is required for a test case, but the developer forgets to check for edge cases.
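
To make one of these categories concrete, the Python sketch below illustrates the async wait pattern. It is only an illustration: the `Worker` class and the two check functions are hypothetical names invented for this example, not code from the paper, and a background thread stands in for any asynchronous task.

```python
import threading


class Worker:
    """Hypothetical component that computes a result on a background thread."""

    def __init__(self) -> None:
        self.result = None
        self._done = threading.Event()

    def start(self, value: int) -> None:
        def run() -> None:
            self.result = value * 2
            self._done.set()  # signal that the result is ready

        threading.Thread(target=run, daemon=True).start()

    def wait(self, timeout: float) -> bool:
        """Block until the result is ready or the timeout expires."""
        return self._done.wait(timeout)


def racy_check(worker: Worker) -> bool:
    # Flaky pattern: reads the result without waiting, so the outcome
    # depends on whether the background thread has already finished.
    return worker.result == 42


def stable_check(worker: Worker) -> bool:
    # Stable pattern: wait explicitly for completion before checking.
    if not worker.wait(timeout=5.0):
        raise TimeoutError("worker did not finish in time")
    return worker.result == 42
```

Calling `racy_check` right after `start(21)` may return either value depending on thread scheduling, while `stable_check` only inspects the result once the worker has signaled completion.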

Flaky Tests Are Ignored in Order to Ship Faster

The focus of developers is usually on adding new features to their software. New and improved features lead to customer acquisition and, in turn, money. Quality assurance (QA) work, such as writing tests and fixing bugs, is often seen as just an annoyance that has to be endured to get new features released.

If developers aren’t incentivized to take QA seriously, they will likely take shortcuts that lead to problems in the long run. This can include — but is not limited to — ignoring flaky tests and simply rerunning the test suite until they get the result they want.

If more features are added without regard for these testing problems, the problems will accumulate. More flaky tests need more reruns of the test suites, which in turn slows down the release of every new feature. In the worst-case scenario, you’ll end up with so many flaky tests that the test suite won’t yield success anymore, no matter how many times you run it. At that point, continuous integration will come to a screeching halt.
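
A back-of-the-envelope sketch shows how quickly this can accumulate. It assumes, purely for illustration, that every flaky test fails independently and with the same probability on each run; real suites rarely behave this neatly, and the 2% failure rate and the test counts are made-up numbers rather than figures from the study.

```python
def suite_pass_probability(flaky_tests: int, failure_rate: float) -> float:
    """Chance that a single run is green when every flaky test must pass."""
    return (1.0 - failure_rate) ** flaky_tests


def expected_runs_until_green(flaky_tests: int, failure_rate: float) -> float:
    """Mean number of runs until the first green run (geometric distribution)."""
    p = suite_pass_probability(flaky_tests, failure_rate)
    if p == 0.0:
        return float("inf")  # the suite can never pass
    return 1.0 / p


if __name__ == "__main__":
    # Hypothetical suite sizes, each flaky test failing 2% of the time.
    for n in (1, 10, 50, 100):
        p = suite_pass_probability(n, failure_rate=0.02)
        runs = expected_runs_until_green(n, failure_rate=0.02)
        print(f"{n:>3} flaky tests: pass chance {p:.2f}, "
              f"about {runs:.1f} runs per green build")
```

Under these assumptions, 50 flaky tests already leave only a little more than a one-in-three chance of a green run, even though each individual test looks almost reliable.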

This halting event is the worst possible outcome. You now have the highest need to get rid of flaky tests and the highest number of them.

If you are just starting a new test suite, there are guidelines to help you avoid getting flaky tests in the first place.

Flaky Tests Increase Costs and Slow Development

In “Understanding Flaky Tests,” the authors report that “according to the opinions of 92 developers (77%), flaky tests are time-consuming since reproducing the test failure is not easy and not always guaranteed to be possible.”

To make a flaky test work without fixing it, developers need to execute the test suite multiple times. This is an easy but time-consuming workaround. The alternative is to dig into the problem and try to fix the root cause, which takes time and experience that not every developer has.
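
The rerun workaround can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular CI tool works: `classify_by_rerun`, `make_scripted_test`, and the `attempts` parameter are hypothetical names, and a test is modeled as a plain callable that raises `AssertionError` on failure.

```python
from typing import Callable, Iterable


def classify_by_rerun(test: Callable[[], None], attempts: int = 5) -> str:
    """Run a test several times and label it by the mix of outcomes."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    passes = 0
    for _ in range(attempts):
        try:
            test()
            passes += 1
        except AssertionError:
            pass  # count this attempt as a failure and keep going
    if passes == attempts:
        return "stable-pass"
    if passes == 0:
        return "stable-fail"
    return "flaky"  # mixed outcomes for the same code


def make_scripted_test(outcomes: Iterable[bool]) -> Callable[[], None]:
    """Build a stand-in test that passes or fails according to a fixed script."""
    script = iter(outcomes)

    def test() -> None:
        assert next(script), "scripted failure"

    return test


if __name__ == "__main__":
    sometimes_fails = make_scripted_test([True, False, True, True, True])
    print(classify_by_rerun(sometimes_fails))
```

Every extra attempt multiplies the run time of the test, which is why this workaround gets expensive at suite scale. A uniform result across a handful of attempts also does not prove that a test is stable; it only means no flakiness showed up in those runs.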

The authors of the study note that “since flaky tests are often intertwined with other tests, they require a certain level of knowledge to be able to fix them and, thus, intermittent tests may lead to problems in the allocation of the available resources.”

In short, fixing the root cause of a flaky test takes a skilled developer away from building new features for an unknown (and possibly extended) period of time.

Flaky Tests Undermine Trust

One of the biggest problems flaky tests create is that, with inconsistent test results, developers lose faith in their validity. According to “Understanding Flaky Tests,” the majority of developers report that they don’t find test results reliable. “Thus,” the paper states, “developers start to trust the test output less and, therefore, may start disregarding it, potentially leading to ignoring an actual failure.”

The impression of randomness in failures would lead any team to doubt their tests’ value. After all, tests are seen as extra work anyway; writing feature code and writing test code can more than double the development effort. If this extra work won’t pay off, nobody is motivated to do it.

If the tests are disregarded, the company loses a crucial part of quality assurance. Without checking their implementation, even developers can’t trust their code anymore. This leads to more time spent on manual testing, because they don’t want to be held liable if something goes wrong. If the developers don’t properly invest in their manual testing, the software’s quality will suffer, and with it the trust of other teams in the developers’ products. Without trust between teams, your company culture will erode.

But the whole chain can also go the other way around. Often developers see the need for fixing tests but don’t have the time to do so. They ask their higher-ups to allocate more resources for quality assurance, but the decision-makers only have new features in mind. In that case, the developers lose faith in their managers to do the right thing.

Conclusion

Flaky tests are notoriously hard to fix and will tie up your best developers, who could otherwise be building your best features. When a team is overburdened with fixing tests, it doesn’t have time to build new components. This is a lose-lose situation, in which your company uses more resources to ship features and still loses potential customers because the software is unstable.

Finally, flaky tests can erode trust in your company if they are ignored for too long. In this destructive chain, developers stop trusting tests, other teams stop trusting the developers, and developers feel bad about their work even if they’re working extra to solve issues. In the end, your developers may even consider switching companies to get out of this vicious cycle. By fixing flaky tests right from the start, you’ll be able to correct course before things get out of hand.

With Thundra Sidekick, developers can determine if their tests are flaky by setting breakpoints that allow them to troubleshoot tests with non-intrusive debugging. Get started with Thundra today.

TNS owner Insight Partners is an investor in: Pragma.