
URL: https://thenewstack.io/slatedb-bottomless-databases-built-on-cloud-object-stores/

SlateDB: 'Bottomless' Databases Built on Cloud Object Stores - The New Stack


Data Streaming / Databases / Storage

SlateDB: ‘Bottomless’ Databases Built on Cloud Object Stores

SlateDB can dramatically cut costs of running a key-value store in the cloud, as long as users don't mind a bit of latency.
Nov 7th, 2024 7:00am by Joab Jackson

Twisting the dials of the infamous CAP theorem, a new open source storage engine called SlateDB can run a key-value store entirely within the object storage of your favorite cloud provider, costing a fraction of what it would on a standard cloud compute engine. And you don’t even have to worry about scalability or the durability of your data.

As long as you don’t mind a little latency.

“It really simplifies the architecture if you can ditch the traditional disks or [block-storage]-styled storage systems and leverage the replication and scalability that [managed] object storage provides,” said software developer and tech investor Chris Riccomini, who is one of the creators of SlateDB, in a talk at ScyllaDB‘s P99Conf last month.

This Zero-Disk Architecture (ZDA) is an idea that Riccomini (and others) have cribbed from WarpStream, a startup purchased earlier this year by enterprise Kafka-provider Confluent.

ZDA spawned not only SlateDB but other database systems as well. TurboPuffer is a vector search service built atop S3; Neon runs a serverless PostgreSQL that stores its page data on S3; the TiDB serverless offering relies on object storage.

What Is Zero Disk Architecture?

WarpStream offers a Kafka-equivalent streaming engine built on Amazon Web Services. But it did not use the cloud giant’s compute instances. Instead, it was built entirely on AWS’ S3 object storage, which the company managed on the customer’s behalf on its own control plane.

True, it wasn’t as fast as an EC2 instance, but S3 had no multizone egress fees, thereby saving a lot of money for users who did not mind a bit of latency that came from running directly on object storage.

“Every time a GiB of data is transferred across zones, it costs $0.02, $0.01 for egress in the source zone and $0.01 for ingress in the destination zone,” a WarpStream blog post noted.

Kafka is nothing if not a machine for generating messages, so moving to a ZDA could provide orders-of-magnitude savings simply by eliminating networking fees between the VMs and storage.
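To make the networking fee concrete, here is a minimal sketch of the arithmetic, using the two $0.01 per-GiB charges from the WarpStream post. The workload figure (50 TiB crossing zones per day) is a hypothetical chosen only for illustration, not a number from the talk or the post.

```python
# Per-GiB inter-zone transfer prices quoted in the WarpStream post.
EGRESS_USD_PER_GIB = 0.01   # charged in the source zone
INGRESS_USD_PER_GIB = 0.01  # charged in the destination zone


def cross_zone_cost(gib_transferred: float) -> float:
    """Return the USD networking cost of moving data between zones."""
    return gib_transferred * (EGRESS_USD_PER_GIB + INGRESS_USD_PER_GIB)


# Hypothetical workload (an assumption): 50 TiB crosses zones every day.
HYPOTHETICAL_GIB_PER_DAY = 50 * 1024
monthly_cost = cross_zone_cost(HYPOTHETICAL_GIB_PER_DAY * 30)
print(f"Cross-zone networking over 30 days: ${monthly_cost:,.2f}")
```

A system that reads and writes only through the object store would not incur this particular line item, which is the saving the post describes.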

It also solved a lot of maintenance headaches.

“AWS employs literally hundreds of engineers whose only job is to make sure that S3 runs reliably and scales infinitely, so you don’t have to,” the WarpStream post noted.

The only drawback? Higher latency.

But many customers, as it turned out, were ready to trade some added latency for significant cost savings.

Bottomless Storage

“I thought it was kind of a brilliant idea because it addresses a lot of the traditional challenges you see when you build a distributed system around dealing with consistency and durability,” Riccomini said.

AWS’ S3, like the object stores of other cloud providers, has built-in scalability, in effect offering “bottomless storage capability.” AWS also offers backup and redundancy safeguards.

Riccomini noted that S3 holds 280 trillion objects and handles 100 million requests per second. So, “They can definitely handle your workload,” he said.

SlateDB gains other advantages from its use of the Log-Structured Merge (LSM) tree, a popular data structure well suited to writing key/value pairs to in-memory, disk or flash-based storage. It’s “essentially an append-only log, where you periodically compact out duplicate writes to get only the latest version,” Riccomini explained.

“LSMs are a good fit for object stores because they sidestep a lot of the inherent drawbacks that object stores have,” further explained software engineer Rohan Desai, who co-presented this talk. “Object stores typically have a constrained API, so they can’t do random writes, and LSMs will only ever write immutable objects so they don’t have to deal with random writes.”
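The idea can be sketched in a few lines of Python. This is a toy illustration of the general LSM pattern the speakers describe (buffer writes, flush them as immutable sorted tables, compact away stale versions), not SlateDB's implementation; the class name `TinyLSM` and the `memtable_limit` parameter are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

Table = Tuple[Tuple[str, str], ...]  # an immutable, key-sorted run of pairs


class TinyLSM:
    """Toy LSM tree: an in-memory buffer plus immutable sorted tables."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable: Dict[str, str] = {}
        self.tables: List[Table] = []  # oldest first, never modified in place
        self.memtable_limit = memtable_limit

    def put(self, key: str, value: str) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Write the buffer out as one new immutable table."""
        if self.memtable:
            self.tables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        """Check the buffer first, then tables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.tables):
            for k, v in table:
                if k == key:
                    return v
        return None

    def compact(self) -> None:
        """Merge all tables into one, keeping only the latest value per key."""
        merged: Dict[str, str] = {}
        for table in self.tables:  # oldest first, so newer values overwrite
            merged.update(table)
        self.tables = [tuple(sorted(merged.items()))] if merged else []


db = TinyLSM(memtable_limit=2)
for key, value in [("a", "1"), ("b", "2"), ("a", "3"), ("c", "4")]:
    db.put(key, value)
print(db.get("a"), len(db.tables))  # latest value of "a", tables before compaction
db.compact()
print(len(db.tables))
```

Note that no table is ever edited after it is written: an update to `"a"` lands in a new table, and compaction replaces old tables with a freshly written one. That maps naturally onto an object store, where objects are written whole rather than modified in place.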

What Is SlateDB?

SlateDB is a storage engine with a key/value interface, perhaps most closely akin to RocksDB. It is written as a set of Rust libraries. It can support only one writer at a time, though it can accommodate multiple simultaneous reads.

To save API costs, SlateDB writes data to storage in (configurable) batches, each stored as a sorted string table (SST).
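Why batching saves API costs can be shown with a small sketch. Object stores bill per request, so packing many key/value writes into one object means fewer billable PUTs. Everything below is an assumption made for illustration: `CountingObjectStore` is an in-memory stand-in for a real object store, and `BatchingWriter` flushes on a size threshold simply because that is easy to demonstrate.

```python
from typing import Dict


class CountingObjectStore:
    """In-memory stand-in for an object store that counts PUT requests."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self.put_requests = 0

    def put_object(self, name: str, body: bytes) -> None:
        self.objects[name] = body
        self.put_requests += 1


class BatchingWriter:
    """Buffers writes and uploads each full batch as one sorted object."""

    def __init__(self, store: CountingObjectStore, batch_size: int) -> None:
        self.store = store
        self.batch_size = batch_size
        self.pending: Dict[str, str] = {}
        self.next_id = 0

    def put(self, key: str, value: str) -> None:
        self.pending[key] = value
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        # Sort by key so the uploaded object is a (toy) sorted string table.
        lines = [f"{k}\t{v}" for k, v in sorted(self.pending.items())]
        body = "\n".join(lines).encode("utf-8")
        self.store.put_object(f"sst-{self.next_id:06d}", body)
        self.next_id += 1
        self.pending = {}


def puts_needed(total_writes: int, batch_size: int) -> int:
    """Count the PUT requests issued for a run of distinct-key writes."""
    store = CountingObjectStore()
    writer = BatchingWriter(store, batch_size)
    for i in range(total_writes):
        writer.put(f"key-{i:05d}", "value")
    writer.flush()  # upload any partial final batch
    return store.put_requests


print(puts_needed(1000, 1), puts_needed(1000, 100))
```

The trade-off is that a write sitting in the buffer has not yet reached the object store, which is where the durability settings discussed later in the article come in.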

SlateDB use cases (chart)

Users can trade off where they want to be in the CAP matrix. CAP stands for consistency, availability, and partition tolerance. The rule of thumb is that database users only get two of these three traits.

With partition tolerance effectively handled by the cloud provider, SlateDB can then adjust the ratio of availability and consistency to the user’s liking.

Those wanting more durability (i.e., guaranteed no data loss) can configure the system not to proceed until it has received an acknowledgement that a write has been committed. Those wanting faster writes can use the asynchronous put method, which returns without waiting for that acknowledgement and so skips the back-and-forth that takes up extra time.
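The trade-off between the two modes can be sketched as follows. This is a simulation, not SlateDB's API: the class `SketchStore`, the `wait_for_durable` flag and the 100 ms flush time are all hypothetical names and numbers chosen for the example.

```python
from typing import Dict

SIMULATED_FLUSH_MS = 100  # assumed object-store round trip, for illustration


class SketchStore:
    """Simulates acknowledged (durable) versus unacknowledged (fast) puts."""

    def __init__(self) -> None:
        self.durable: Dict[str, str] = {}   # acknowledged by the object store
        self.buffered: Dict[str, str] = {}  # held only in local memory
        self.waited_ms = 0                  # simulated time callers spent blocked

    def flush(self) -> None:
        """Move everything buffered into durable storage."""
        self.durable.update(self.buffered)
        self.buffered.clear()

    def put(self, key: str, value: str, wait_for_durable: bool = True) -> None:
        self.buffered[key] = value
        if wait_for_durable:
            # Block until the write is acknowledged: slower, but safe.
            self.flush()
            self.waited_ms += SIMULATED_FLUSH_MS

    def crash(self) -> None:
        """Simulate losing the process: unflushed writes disappear."""
        self.buffered.clear()


store = SketchStore()
store.put("safe", "1", wait_for_durable=True)
store.put("fast", "2", wait_for_durable=False)
store.crash()
print(sorted(store.durable), store.waited_ms)
```

In this sketch the caller that waited paid the simulated round trip but kept its data through the crash, while the fast write cost nothing up front and was lost.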

“You can pick and choose which of these things you care the most about,” Riccomini said.

SlateDB also has the usual assortment of caching techniques to minimize API read time, such as in-memory block caches, compression, bloom filters, and local SST disk caches.

Currently, SlateDB can be run on AWS S3 or Microsoft Azure Blob Storage.

Of course, one could use AWS’ own DynamoDB key-value store, but it’s more expensive. DynamoDB charges $0.10/GiB for storage, while using S3 with SlateDB is more than four times cheaper, at $0.023/GiB.
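The gap is easy to check with the two per-GiB prices quoted above. The 10 TiB dataset below is a hypothetical size picked for illustration, and the sketch covers storage charges only, not request or throughput fees.

```python
# Per-GiB storage prices as quoted in the article.
DYNAMODB_USD_PER_GIB = 0.10
S3_USD_PER_GIB = 0.023


def storage_cost(gib: float, usd_per_gib: float) -> float:
    """Return the USD storage charge for a dataset of the given size."""
    return gib * usd_per_gib


price_ratio = DYNAMODB_USD_PER_GIB / S3_USD_PER_GIB

# Hypothetical dataset size (an assumption): 10 TiB.
HYPOTHETICAL_GIB = 10 * 1024
dynamodb_cost = storage_cost(HYPOTHETICAL_GIB, DYNAMODB_USD_PER_GIB)
s3_cost = storage_cost(HYPOTHETICAL_GIB, S3_USD_PER_GIB)

print(f"Price ratio: {price_ratio:.2f}x")
print(f"DynamoDB: ${dynamodb_cost:,.2f}  S3: ${s3_cost:,.2f}")
```

At these prices the ratio works out to roughly 4.3, so the hypothetical 10 TiB dataset costs about $1,024 to store in DynamoDB versus about $236 in S3.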

SlateDB is still betting on some advanced features from the cloud storage providers, such as multiregion buckets and atomic writes (“compare-and-swap”). Snapshots and transactions may be supported in future editions.

Joab Jackson is a senior editor for The New Stack, covering cloud native computing and system operations. He has reported on IT infrastructure and development for over 30 years, including stints at IDG and Government Computer News. Before that, he...
Amazon Web Services, Confluent and ScyllaDB are sponsors of The New Stack. 