
URL: https://thenewstack.io/instagram-supercharges-cassandra-pluggable-rocksdb-storage-engine/

Instagram Supercharges Cassandra with a Pluggable RocksDB Storage Engine

Mar 5th, 2018 11:06am by Joab Jackson
Featured image: photo by RKTKN via Unsplash.

To boost the performance of a mission-critical instance of Cassandra, Instagram engineers replaced the storage engine of this Java-based distributed open source database with a faster C++-based one from another database, RocksDB.

With the resulting high-performance database, dubbed “Rocksandra,” Instagram engineers hope to persuade the Apache Cassandra development team to adopt the pluggable storage engine architecture they developed, which could pave the way for more flexible use of Cassandra. The company has released the Rocksandra code base and benchmark framework as open source.

Rocksandra is a fork, said Francois Deliege, Instagram engineering manager, but not one designed to compete with Cassandra itself. “We are not trying to sell the community on the particular RocksDB implementation, but it’s just a showcase of how it could help. If we had any other database or storage engine supporting this, it would be a great win for the community,” he said.

Instagram uses Cassandra as a general key-value storage service to support the user photo feed and direct messages, as well as fraud detection. While Cassandra is exceptionally fast at writing data to disk, Instagram was finding that read times were starting to increase, resulting in slow performance for users. The lag came from Java’s garbage collection routine, which periodically stops all client requests to clean up unused memory. “Cassandra is really good at writing queries, but on the read side, it’s a little bit of a pain,” Deliege said.

“Cassandra is really good at communication across all the different nodes, but Java is not the most efficient language to interact with the kernel of the OS,” explained Dikang Gu, a Facebook staff software engineer who is working on the project (Facebook owns Instagram).

A storage engine can be seen as the heart of the database. It is responsible for formatting the data so it can be placed on disk, as well as reading the data from disk later. The leaner RocksDB storage engine, written in C++, does not suffer from garbage collection pauses. Developed at Facebook, it was optimized for performance, especially on fast storage such as SSDs. Others have plugged it into MySQL, MongoDB, and other popular databases to speed performance.

The initial challenge was that Cassandra did not have a pluggable storage engine architecture into which the RocksDB engine could be easily embedded. So the engineering team went ahead and defined a new storage engine API that cleanly separated the distribution layers from the storage engine, one that incorporated the most common read/write and streaming interfaces.

“This way we could implement the new storage engine behind the API and inject it into the related code paths inside Cassandra,” the developers wrote in a blog post explaining the work.
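
As a rough illustration of what such a separation can look like, here is a minimal Python sketch. Every name in it (`StorageEngine`, `InMemoryEngine`, `Node`, and the method names) is hypothetical and chosen for this example; it is not Cassandra's or Rocksandra's actual API. The point is only that the distribution layer talks to an interface covering reads, writes, and streaming, so the engine behind it can be swapped.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Iterator, Optional, Tuple

Entry = Tuple[bytes, bytes]


class StorageEngine(ABC):
    """Hypothetical engine contract (an assumption, not Cassandra's real API)."""

    @abstractmethod
    def put(self, key: bytes, value: bytes) -> None:
        """Write one key-value pair."""

    @abstractmethod
    def get(self, key: bytes) -> Optional[bytes]:
        """Return the value for key, or None if it is absent."""

    @abstractmethod
    def stream_out(self) -> Iterator[Entry]:
        """Yield every entry in key order, e.g. to hand data to another node."""

    @abstractmethod
    def stream_in(self, entries: Iterable[Entry]) -> None:
        """Ingest entries received from another node."""


class InMemoryEngine(StorageEngine):
    """Toy engine backed by a dict; a RocksDB-backed class could replace it."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def stream_out(self) -> Iterator[Entry]:
        for key in sorted(self._data):
            yield key, self._data[key]

    def stream_in(self, entries: Iterable[Entry]) -> None:
        for key, value in entries:
            self.put(key, value)


class Node:
    """Stands in for the distribution layer; it only sees StorageEngine."""

    def __init__(self, engine: StorageEngine) -> None:
        self.engine = engine  # the engine is injected, not hard-coded

    def write(self, key: bytes, value: bytes) -> None:
        self.engine.put(key, value)

    def read(self, key: bytes) -> Optional[bytes]:
        return self.engine.get(key)

    def bootstrap_from(self, other: "Node") -> None:
        # Copy another node's data through the streaming interface only.
        self.engine.stream_in(other.engine.stream_out())
```

Because `Node` depends only on the abstract class, replacing `InMemoryEngine` with a different engine would not require changes to the node logic, which is the property the quoted passage describes.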

They faced other challenges in the transplant as well. They had to map Cassandra’s support for rich data types onto RocksDB’s simpler key-value interfaces. This entailed defining encoding/decoding algorithms for these data models.
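
To give a feel for the kind of encoding work involved, the sketch below packs a two-part row identifier (a string partition key plus a signed 64-bit clustering value) into a single byte string and back. The layout and the function names `encode_key` and `decode_key` are assumptions made for illustration; the article does not describe the encoding Rocksandra actually uses.

```python
import struct
from typing import Tuple

SIGN_BIAS = 1 << 63  # shifts signed 64-bit ints into the unsigned range


def encode_key(partition_key: str, clustering: int) -> bytes:
    """Pack (partition_key, clustering) into one byte string.

    Hypothetical layout: 2-byte length, UTF-8 partition key, then the
    clustering value as a biased big-endian unsigned 64-bit integer.
    """
    pk = partition_key.encode("utf-8")
    # The length prefix marks where the partition key ends.
    # Adding SIGN_BIAS makes byte-wise order match numeric order,
    # including for negative clustering values.
    return struct.pack(">H", len(pk)) + pk + struct.pack(">Q", clustering + SIGN_BIAS)


def decode_key(key: bytes) -> Tuple[str, int]:
    """Invert encode_key."""
    (pk_len,) = struct.unpack_from(">H", key, 0)
    pk = key[2:2 + pk_len].decode("utf-8")
    (biased,) = struct.unpack_from(">Q", key, 2 + pk_len)
    return pk, biased - SIGN_BIAS
```

With this layout, keys that share a partition key sort byte-wise in the same order as their clustering values, so a key-value store that keeps keys in byte order can serve range reads within a partition.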

Lastly, the company needed to accommodate its streaming data architecture. Whenever a node is added to a cluster or removed from one, data streams across the nodes. This model had to be changed to accommodate the RocksDB APIs.

Results

The team worked on the project for the better part of a year and was able to test it in production, where they found that read latency with the new architecture dropped from 60ms to 20ms. Moreover, the temporary operational stalls owing to garbage collection dipped from 2.5 percent to 0.3 percent, reported as a 10X reduction (the ratio is closer to 8X).
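
The size of these improvements can be checked with a few lines of Python. The helper name `reduction` is made up for this example; the inputs are the figures quoted above.

```python
from typing import Tuple


def reduction(before: float, after: float) -> Tuple[float, float]:
    """Return (improvement factor, percent reduction) for a metric."""
    factor = before / after
    percent = (before - after) / before * 100.0
    return factor, percent


# Read latency: 60 ms before, 20 ms after.
read_factor, read_pct = reduction(60.0, 20.0)

# Garbage collection stalls: 2.5 percent before, 0.3 percent after.
gc_factor, gc_pct = reduction(2.5, 0.3)

print(f"read latency: {read_factor:.1f}x lower ({read_pct:.0f}% reduction)")
print(f"GC stalls:    {gc_factor:.1f}x lower ({gc_pct:.0f}% reduction)")
```

The read latency works out to a threefold improvement, and the stall figure to a factor of about 8.3.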

The Instagram engineers are looking for ways to add more C/C++ components to the codebase, covering features such as secondary indexes and repair.

Joab Jackson is a senior editor for The New Stack, covering cloud native computing and system operations. He has reported on IT infrastructure and development for over 30 years, including stints at IDG and Government Computer News. Before that, he...