
URL: https://thenewstack.io/how-to-manage-45-billion-client-records-with-aerospike/


How To Manage 45 Billion Client Records With Aerospike

At Aerospike's Real-Time Data Summit last week, Adjust's Bubunyo Nyavor explained how the company used Aerospike to help clients track the return on investment of their marketing channels.
Jul 1st, 2024 6:51am by Joab Jackson
Images from Aerospike’s Real-Time Data Summit. 

When your operations outgrow the capabilities of a single database, what are your options?

For the Berlin-based mobile measurement service provider Adjust, the answer came with Aerospike, a real-time, high-performance NoSQL key-value store that can be run across multiple data centers.

At Aerospike’s Real-Time Data Summit last week, Adjust Senior Software Engineer Bubunyo Nyavor explained how the company used Aerospike to help clients track the return on investment of their marketing channels.

Adjust’s service can generate 52 million requests every minute on average. Each request can trigger an operation of some sort, such as a query, and, of course, the need to reconcile state. A customer may post material on Meta, LinkedIn, or some other social media outlet, and Adjust gathers the number of people who viewed the content and how many clicked on it.

“Depending on what operation it is, we fetch some data, we write some data. Sometimes we write in batches, sometimes delete data, and then we return a response for these requests,” Nyavor said.

Overall, the company keeps about 45 billion records in Aerospike, and these only record the states of devices. With an average of 512 bytes per record, this results in 351TB worth of data.

The data is stored in three separate clusters, located in geographically dispersed data centers. Each cluster has 64 nodes and runs on bare metal, with Gentoo Linux serving as the operating system. Each server has about 400GB of RAM, 16TB of solid-state NVMe disk space, and a 10-gigabit network card. Either two or three copies of the data are kept as backup.
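As a rough back-of-envelope check, the figures quoted above can be combined with plain arithmetic. This is only a sketch: the constant names are made up for illustration, and decimal units (1TB = 10**12 bytes) are an assumption. The talk, as reported, does not break the 351TB total down by cluster, replica, or index overhead, so the sketch computes only the single-copy payload and the raw hardware totals and leaves that figure alone.

```python
# Back-of-envelope arithmetic using only the numbers quoted in the article.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 TB = 1000 GB).

REQUESTS_PER_MINUTE = 52_000_000
RECORDS = 45_000_000_000
AVG_RECORD_BYTES = 512
CLUSTERS = 3
NODES_PER_CLUSTER = 64
NVME_TB_PER_NODE = 16
RAM_GB_PER_NODE = 400

# Average request rate per second.
requests_per_second = REQUESTS_PER_MINUTE / 60

# Payload of one copy of every record, before replication or overhead.
single_copy_tb = RECORDS * AVG_RECORD_BYTES / 10**12

# Raw hardware totals across all three clusters.
total_nodes = CLUSTERS * NODES_PER_CLUSTER
raw_nvme_tb = total_nodes * NVME_TB_PER_NODE
total_ram_tb = total_nodes * RAM_GB_PER_NODE / 1000

print(f"requests per second: {requests_per_second:,.0f}")
print(f"single-copy payload: {single_copy_tb:.2f} TB")
print(f"nodes: {total_nodes}, raw NVMe: {raw_nvme_tb} TB, RAM: {total_ram_tb:.1f} TB")
```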

“So that if single rack goes offline, it doesn’t send us into a tailspin,” Nyavor said.


A chart showing the average number of devices connecting to Aerospike.

Beyond Key-Value Stores

The Aerospike key-value store was launched in 2009 (originally as CitrusLeaf) and quickly found an audience in the online advertising industry for storing and subsequently analyzing customer cookies at rapid speed.

Subsequent releases expanded the analytics, incorporated batch processing, and introduced secondary indexes and cross-data center replication.

At the Real-Time Data Summit, Aerospike Senior Developer Experience Engineer Art Anderson discussed how Aerospike can also handle graph and vector data, which can help online shops build out recommendation systems.

For Adjust, low latency was critical. Customers wanted data updated as close to real-time as possible. This is a challenge given the cross-cluster communications.

As with any distributed system that holds duplicate data, Adjust must make trade-offs between the consistency and the availability of its data (two of the three properties in the CAP theorem).

In a consistent mode, accurate data will always be delivered, though it may take some time. In an availability-oriented mode, data will be returned to the requester as quickly as possible, though it may not include the most recent changes (as it takes time to propagate new data across different clusters).
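The trade-off can be illustrated with a toy model. The sketch below is not Aerospike's replication protocol; `ToyReplicatedStore` and its methods are hypothetical names invented for this example. It keeps two in-memory replicas: with `synchronous=False`, a write is acknowledged after updating one replica and is shipped to the other later, so a read from the second replica can be stale; with `synchronous=True`, every write updates both replicas before it is acknowledged.

```python
from collections import deque


class ToyReplicatedStore:
    """Two replicas of a key-value map. A toy model, not Aerospike's protocol."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.local = {}         # replica that receives writes
        self.remote = {}        # replica in another (imaginary) data center
        self.pending = deque()  # writes not yet shipped to the remote replica

    def write(self, key, value):
        """Apply a write; return how many replicas were updated before the ack."""
        self.local[key] = value
        if self.synchronous:
            self.remote[key] = value  # wait for the remote copy: slower, always fresh
            return 2
        self.pending.append((key, value))  # acknowledge now, ship later
        return 1

    def replicate(self):
        """Ship all queued writes to the remote replica."""
        while self.pending:
            key, value = self.pending.popleft()
            self.remote[key] = value

    def read_remote(self, key):
        return self.remote.get(key)


# Availability-oriented: fast acknowledgement, but the remote read can lag.
available = ToyReplicatedStore(synchronous=False)
available.write("device-1", "v1")
available.replicate()
available.write("device-1", "v2")
stale_read = available.read_remote("device-1")      # replication has not run yet
available.replicate()
caught_up_read = available.read_remote("device-1")  # now sees the latest write

# Consistency-oriented: every acknowledged write is visible on both replicas.
consistent = ToyReplicatedStore(synchronous=True)
consistent.write("device-1", "v1")
consistent.write("device-1", "v2")
consistent_read = consistent.read_remote("device-1")

print(stale_read, caught_up_read, consistent_read)
```

In the availability-oriented run, the first remote read still returns the older value until replication catches up, which is the freshness gap described above.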


Operational modes of Aerospike: Consistency and Available.

“You will get fast responses but there’s no guarantee on the freshness of the data,” Nyavor explained, especially since Adjust writes a lot more data to disk than it reads.

There are several tools that help. Aerospike offers an intelligent client driver that knows which nodes in a cluster to send requests to. The database system also allows Adjust to store secondary indexes on fast solid-state drives, an advantage given that it would be cost-prohibitive to keep them in the servers’ main memory.
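The idea behind such a client can be sketched in a few lines. Aerospike divides each namespace into 4,096 partitions and derives a record's partition from a hash digest of its key, so a client holding the partition map can contact the owning node directly. Everything else below is an assumption made for illustration: the function names are hypothetical, SHA-1 stands in for the RIPEMD-160 digest Aerospike uses (only because SHA-1 is always available in Python's `hashlib`), and the round-robin ownership is a placeholder for the real partition assignment.

```python
import hashlib

N_PARTITIONS = 4096  # Aerospike splits each namespace into 4,096 partitions


def partition_id(set_name, user_key):
    """Map a key to a partition number (illustrative, not Aerospike's exact scheme)."""
    # Stand-in hash: Aerospike uses RIPEMD-160; SHA-1 is used here only for portability.
    digest = hashlib.sha1(f"{set_name}:{user_key}".encode()).digest()
    return int.from_bytes(digest[:2], "little") % N_PARTITIONS


def build_partition_map(nodes):
    """Hypothetical ownership table: partitions dealt out to nodes round-robin."""
    return [nodes[p % len(nodes)] for p in range(N_PARTITIONS)]


def route(partition_map, set_name, user_key):
    """Return the node that owns the key, so the request needs a single hop."""
    return partition_map[partition_id(set_name, user_key)]


nodes = [f"node-{i:02d}" for i in range(64)]  # 64 nodes, as in one Adjust cluster
partition_map = build_partition_map(nodes)
target = route(partition_map, "devices", "device-123")
print(target)
```

Because the lookup is a local table read, the client does not need a proxy or coordinator node to find the right server.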

“Aerospike does sufficiently well to be able to help us take advantage of cheaper hardware,” Nyavor said.

Overall, the system handles, on average, about 1.2 million write operations per second and 2 million get operations per second.


Aerospike operations per second at Adjust.

About 50% of all requests take 500 milliseconds or less, an impressive feat given the vastness of the database itself, Nyavor said.


Aerospike operations under 500 milliseconds.

Scanning is one of the larger operations. Scans are needed to delete user records, either on request or when a customer leaves the program. Scanning an entire cluster takes about three days.

“It is a slow and intensive process because it takes a lot of resources to scan,” he said. The good news is that Aerospike can run scan operations as a background task, temporarily suspending them when reads and writes need to be executed.
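The general pattern of a scan that yields to foreground traffic can be sketched as a cooperative loop. This is not how Aerospike schedules scans internally; `background_delete_scan`, its parameters, and the sample records are all hypothetical. The scan works through the keys in small chunks and, before each chunk, checks a caller-supplied `foreground_busy` callback, pausing for that tick if reads or writes are waiting.

```python
def background_delete_scan(records, should_delete, foreground_busy, chunk_size=2):
    """Delete matching records in chunks, pausing whenever foreground work is waiting.

    Generator: each step yields "scanned" or "paused" for one scheduling tick.
    """
    keys = list(records)  # snapshot of the keys to visit
    position = 0
    while position < len(keys):
        if foreground_busy():
            yield "paused"  # let reads and writes go first, retry on the next tick
            continue
        for key in keys[position:position + chunk_size]:
            if key in records and should_delete(key, records[key]):
                del records[key]
        position += chunk_size
        yield "scanned"


# Hypothetical device records; delete everything belonging to customer "a".
records = {
    "d1": {"customer": "a"},
    "d2": {"customer": "b"},
    "d3": {"customer": "a"},
    "d4": {"customer": "b"},
}
busy_ticks = iter([False, True, True, False])  # foreground load on ticks 2 and 3

events = list(
    background_delete_scan(
        records,
        should_delete=lambda key, record: record["customer"] == "a",
        foreground_busy=lambda: next(busy_ticks, False),
    )
)
print(events)
print(sorted(records))
```

The scan still finishes, only later, which mirrors the trade described above: the scan stays slow so that foreground reads and writes are not held up.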


How Aerospike Is Upgraded

There is still work on Aerospike that needs to be done, according to Nyavor.

For instance, the upgrade process is still fairly manual.

It involves going through the change log to ensure that the upgrade will not break anything.

But overall, the database is very configurable, and you need to understand all the options to get the most out of it, Nyavor said.

And if you don’t know something, ask. The Aerospike support team has been really helpful in answering questions, he added.

“Don’t take anything in the documentation that you don’t understand for granted, because it can snowball and bite you in the ass,” he said.

Aerospike is a sponsor of The New Stack. 
TNS owner Insight Partners is an investor in: Real.