URL: https://thenewstack.io/the-case-of-the-vanishing-performance-improvements/

Trivago Cracks the Case of the Vanishing Performance Improvements

...But it worked just fine in dev!!! Trivago engineers offer a harrowing tale of debugging a front-end migration from Express.js to Fastify.
Sep 28th, 2022 9:00am by Jessica Wachtel

In its staging environment, online travel service Trivago found that the Fastify web framework responded to HTTP requests 107% faster on average (more than twice as fast) than Express.js, as measured in k6 load tests. Great news. But these performance improvements vanished in production. Why?

This conundrum was the subject of a recent blog post written by Abdelrahman Abdelhafez, a Trivago backend software engineer. The post documents a technical challenge and investigation that arose after migrating a monolithic Node.js GraphQL server from Express to Fastify.

Control Environment

To simplify the deployment of the tests, Trivago’s staging environment simulates the production environment to a high degree of parity. To further ensure accurate metrics, both the Express and Fastify staging deployments were:

  • identical. The only point of divergence was the web framework running under the hood.
  • run concurrently to simulate the same running conditions, because the data sources that the GraphQL server fetches data from scale automatically.

With issues in the test environments ruled out, Abdelhafez and team, still hunting for the “Eureka!” moment, investigated the Express-to-Fastify migration pull request diff. They hoped to find something that pinpointed what led to the bug. They did not find the answers they were seeking, so the debugging continued.

Abdelhafez describes the next step as “[bringing] out the big guns: live profiling.”

For this part, Trivago rolled out a single Kubernetes pod equipped with the node-heapdump package, which allowed them to take a snapshot of the V8 heap in a live deployment. Afterward, they directed a tiny portion of their incoming traffic to this special pod.

And for the first time, they laid eyes on the bug: The log redaction logic was consuming “obnoxious” amounts of memory. Why?

Log Redaction

This is the process of overwriting sensitive information (passwords, access tokens, emails, etc.) before it is written to the application logs. Log redaction simplifies Trivago’s compliance with GDPR and makes the logs less confidential, so more developers can view them for monitoring and debugging. It definitely sounds like something they need to keep in the application.

Log Redaction in the Express Days: Trivago used a “super fast” JSON logger called pino. Sensitive information was redacted from incoming HTTP requests manually. Trivago would intercept JavaScript objects before they were logged by pino, traverse them deeply using the deepdash package and overwrite any sensitive information found.
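As a rough illustration of that manual approach, here is a minimal sketch of deep-traversal redaction in plain JavaScript. It does not use deepdash or pino. The `SENSITIVE_KEYS` list and the `redactDeep` helper are hypothetical names invented for this example, and the sketch returns a redacted copy rather than overwriting the object in place.

```javascript
// Hypothetical list of sensitive keys; the article does not list Trivago's actual keys.
const SENSITIVE_KEYS = new Set(["password", "accessToken", "email"]);

// Recursively copy a value, replacing the value stored under any sensitive key.
function redactDeep(value) {
  if (Array.isArray(value)) {
    return value.map(redactDeep);
  }
  if (value !== null && typeof value === "object") {
    const copy = {};
    for (const [key, child] of Object.entries(value)) {
      copy[key] = SENSITIVE_KEYS.has(key) ? "[Redacted]" : redactDeep(child);
    }
    return copy;
  }
  return value; // primitives pass through unchanged
}

const request = {
  method: "POST",
  body: { user: { email: "jane@example.com", password: "hunter2" }, tags: ["a", "b"] },
};

const safe = redactDeep(request);
console.log(JSON.stringify(safe));
```

The cost of this style is that every logged object is walked in full, whether or not it contains anything sensitive.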

Log Redaction in the Fastify Days: Fastify uses pino internally and recommends it, so that didn’t change. The Fastify docs say that pino has built-in low-overhead redaction support, so Trivago explored using pino without deepdash.

The image below shows the redaction feature configured with pino’s redact option:

[Image: pino configured with the redact option]

The code above logs the following to the console:

[Image: console output of the redacted log]

In theory, this is perfect. No object traversal of any kind. One just needs to specify the paths to redact and “voilà,” pino redacts them. But real-life use cases don’t usually work like this. Trivago needs to stringify the GraphQL variables object after redaction but before logging, to prevent log post-processing tools (e.g. Elasticsearch/Kibana) from indexing individual GraphQL variable properties.
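The ordering requirement (redact first, then stringify, then log) can be sketched in plain JavaScript. The `redactVariables` and `toLogEntry` helpers and the field names are assumptions made for illustration, not Trivago's code. The point is only that the logger receives `variables` as a single string, so an indexer sees one field instead of one field per variable.

```javascript
// Hypothetical redaction step; stands in for whatever redaction mechanism is used.
function redactVariables(variables) {
  const copy = { ...variables };
  if ("password" in copy) copy.password = "[Redacted]";
  return copy;
}

// Build the log entry: redact first, then collapse the variables into one string field.
function toLogEntry(operationName, variables) {
  return {
    operationName,
    variables: JSON.stringify(redactVariables(variables)),
  };
}

const entry = toLogEntry("Login", { username: "jane", password: "hunter2" });
console.log(JSON.stringify(entry));
```

When redaction is applied inside the logger as part of writing the line, there is no step between redaction and logging where this stringification could happen, which is the limitation described above.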

Enter fast-redact

This limitation caused Trivago to abandon pino’s baked-in redaction and try something else: fast-redact. fast-redact is a library that pino uses, and it promises “very fast object redaction,” being only ≈ 1% slower than JSON.stringify. Trivago implemented the code.

[Image: Trivago's fast-redact implementation]

But fast-redact did not redact fast. Zooming in on the log redaction showed that the code written using fast-redact was the exact code slowing down the entire Node.js server. Why?

Finally an answer!

The deep dive investigation into fast-redact revealed a little implementation error.

[Image]

When the image above is compared with the previous image of the fast-redact function, it’s right there in plain sight. Rather than generating the redact function once, it was regenerating the redact function on every single request.

Here’s what the correct implementation looks like, per the README.md.

[Image: the corrected implementation]
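The difference between the two versions can be sketched without the library itself. In the example below, `buildRedactor` is a hypothetical stand-in for a redaction library's setup call, and `compileCount` exists only to make the number of rebuilds visible. The slow handler rebuilds the redact function on every request, while the fast handler builds it once and reuses it.

```javascript
let compileCount = 0;

// Hypothetical factory standing in for a redaction library's setup call.
// It does its preparation work up front and returns a reusable function.
function buildRedactor(paths) {
  compileCount += 1;
  const steps = paths.map((path) => path.split("."));
  return function redact(obj) {
    const copy = JSON.parse(JSON.stringify(obj)); // work on a copy of the payload
    for (const keys of steps) {
      let node = copy;
      for (let i = 0; i < keys.length - 1 && node != null; i++) node = node[keys[i]];
      const last = keys[keys.length - 1];
      if (node != null && typeof node === "object" && last in node) node[last] = "[Redacted]";
    }
    return copy;
  };
}

const PATHS = ["variables.password", "variables.user.email"];

// Anti-pattern: the redactor is rebuilt for every request.
function handleRequestSlow(payload) {
  const redact = buildRedactor(PATHS);
  return redact(payload);
}

// Fix: build the redactor once at startup and reuse it for every request.
const redactOnce = buildRedactor(PATHS);
function handleRequestFast(payload) {
  return redactOnce(payload);
}

const payload = { variables: { password: "hunter2", user: { email: "jane@example.com" } } };

const beforeSlow = compileCount;
for (let i = 0; i < 1000; i++) handleRequestSlow(payload);
const slowBuilds = compileCount - beforeSlow;

const beforeFast = compileCount;
for (let i = 0; i < 1000; i++) handleRequestFast(payload);
const fastBuilds = compileCount - beforeFast;

console.log({ slowBuilds, fastBuilds });
```

Both handlers return the same redacted result; only the amount of setup work per request differs.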

The code was updated, and it turns out fast-redact is fast. Here is an isolated stat to prove it.

[Image: isolated benchmark of the corrected code]

Abdelhafez explains this in a way that brings all the drama and I’m here for it: If the distance between the Earth and the Sun is 1 AU (≈ 8.3 light minutes) then 238,756 AU (≈ 3.8 light years) is roughly the distance between Earth and the second nearest star to Earth after the Sun, Proxima Centauri.

The Node.js server’s memory usage dropped from an average of 4GB to 2GB. This made their Kubernetes cluster in the EU region automatically scale down from 210 to 160 pods. Now fewer pods are able to handle the same amount of traffic in the busiest region.

How was this missed in the testing phase? Trivago manually disabled logging in the Express and Fastify staging deployments during load tests, having deemed disk I/O “irrelevant” in the context of load tests. The engineers didn’t consider that disabling logging also disabled the logic surrounding logging (i.e., redaction). Abdelhafez said, “🤦‍♂️.”
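To see why a load test with logging switched off never exercises redaction, consider this minimal sketch. The `createLogger` and `redact` functions are hypothetical stand-ins, not pino's or Trivago's code. When the logger is disabled it returns before any redaction work is done, so the expensive path only runs in the production-like configuration.

```javascript
let redactionCalls = 0;

// Hypothetical redaction step; the counter shows how often it actually runs.
function redact(obj) {
  redactionCalls += 1;
  return { ...obj, password: "[Redacted]" };
}

// Minimal logger sketch: when disabled, it returns before doing any redaction work.
function createLogger({ enabled }) {
  const lines = [];
  return {
    lines,
    info(obj) {
      if (!enabled) return; // the redaction below never runs
      lines.push(JSON.stringify(redact(obj)));
    },
  };
}

const loadTestLogger = createLogger({ enabled: false });
const productionLogger = createLogger({ enabled: true });

for (let i = 0; i < 100; i++) loadTestLogger.info({ user: "jane", password: "hunter2" });
const callsInLoadTest = redactionCalls;

for (let i = 0; i < 100; i++) productionLogger.info({ user: "jane", password: "hunter2" });
const callsInProduction = redactionCalls - callsInLoadTest;

console.log({ callsInLoadTest, callsInProduction });
```

Any cost hidden inside `redact` is invisible to the first loop, which mirrors how the staging benchmarks could look healthy while production did not.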

To summarize: Fastify is superior to Express, performance-wise, and don’t disable any production flags during testing.

Jessica Wachtel is a developer marketing writer at InfluxData where she creates content that helps make the world of time series data more understandable and accessible. Jessica has a background in software development and technical journalism.