Write-heavy database workloads bring a distinctly different set of challenges than read-heavy ones. For example:
While cost matters — quite a lot, in many cases — it’s not a topic we want to cover here. Rather, let’s focus on the performance-related complexities that teams commonly face and discuss your options for tackling them.
First, let’s clarify what we mean by “real-time write-heavy” workloads. We’re talking about workloads that:
In the wild, they occur across everything from online gaming to real-time stock exchanges. A few specific examples:
Next, let’s look at key architectural and configuration considerations that affect write performance.
The choice of storage engine architecture fundamentally affects write performance in databases. Two primary approaches exist: LSM trees and B-trees.
Figure: Underlying storage engine diagram
Databases known to handle writes efficiently, such as ScyllaDB, Apache Cassandra, HBase and Google Bigtable, use log-structured merge (LSM) trees. This architecture is well suited to large volumes of writes. Because each write is appended to an in-memory structure (the “memtable”) rather than applied in place on disk, the initial write is very fast. Once the memtable fills up, the recent writes are flushed to disk in sorted order, which reduces the need for random I/O.
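To make that flow concrete, here’s a minimal sketch of an LSM-style write path in Python. It’s a toy, not how ScyllaDB or any other database actually implements it: the `TinyLSM` class, the `memtable_limit` parameter and the in-memory list standing in for on-disk files are all our own assumptions for illustration, and there is no commit log or compaction.

```python
from bisect import bisect_left


class TinyLSM:
    """Toy LSM write path: buffer writes in memory, flush them as sorted runs."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}   # in-memory buffer; the latest value per key wins
        self.sstables = []   # stand-in for on-disk files: immutable sorted runs

    def write(self, key, value):
        # A write only touches memory, which is why it is fast.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Sort once and write the whole run out sequentially.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def read(self, key):
        # Check the memtable first, then the runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):
            idx = bisect_left(run, (key,))
            if idx < len(run) and run[idx][0] == key:
                return run[idx][1]
        return None
```

Notice that `write` never searches for an existing record to update in place. The cost shows up on the read side instead, where a lookup may have to check several sorted runs.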
For example, here’s what the ScyllaDB write path looks like:
With B-tree structures, each write operation requires locating and modifying a node in the tree — and that involves both sequential and random I/O. As the data set grows, the tree can require additional nodes and rebalancing, leading to more disk I/O, which can affect performance. B-trees are generally better suited for workloads involving joins and ad-hoc queries.
Payload size also affects performance. With small payloads, throughput is good, but CPU processing becomes the primary bottleneck. As the payload size increases, overall throughput drops and disk utilization rises.
Ultimately, a small write usually fits in all the buffers, and everything can be processed quite quickly. That’s why it’s easy to get high throughput. For larger payloads, you need to allocate larger buffers or multiple buffers. The larger the payloads, the more resources (network and disk) are required to service those payloads.
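A quick back-of-envelope calculation shows how fast the resource needs grow with payload size. The sketch below is only arithmetic; the 4 KiB buffer size and the function names are hypothetical and don’t correspond to any particular database’s internals.

```python
import math


def buffers_needed(payload_bytes: int, buffer_bytes: int = 4096) -> int:
    """How many fixed-size buffers one payload occupies (buffer size is assumed)."""
    return math.ceil(payload_bytes / buffer_bytes)


def write_bandwidth_mb_per_s(writes_per_second: int, payload_bytes: int) -> float:
    """Raw payload bandwidth in MB/s, before replication, compression or overhead."""
    return writes_per_second * payload_bytes / 1_000_000


# The same write rate with two very different payload sizes.
for payload in (512, 1_000_000):
    print(payload, buffers_needed(payload), write_bandwidth_mb_per_s(100_000, payload))
```

At the same write rate, moving from a 512-byte payload to a 1 MB payload multiplies the raw bytes that the network and disks must carry by roughly 2,000, which is why throughput measured in operations per second falls as payloads grow.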
Disk utilization is something to watch closely with a write-heavy workload. Although storage is continuously becoming cheaper, it’s still not free.
Compression can help keep things in check, so choose your compression strategy wisely. Faster compression speeds are important for write-heavy workloads, but also consider your available CPU and memory resources.
Be sure to look at the compression chunk size parameter. Compression basically splits your data into smaller blocks (or chunks) and then compresses each block separately. When tuning this setting, realize that larger chunks are better for reads while smaller ones are better for writes, and take your payload size into consideration.
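To illustrate what “compressing each block separately” means, here’s a small sketch using Python’s standard `zlib` module. It is not the codec or file format any particular database uses; the helper names and the chunk sizes are assumptions chosen for illustration.

```python
import zlib
from typing import List


def compress_in_chunks(data: bytes, chunk_size: int, level: int = 1) -> List[bytes]:
    """Split data into fixed-size chunks and compress each one independently."""
    return [
        zlib.compress(data[i:i + chunk_size], level)
        for i in range(0, len(data), chunk_size)
    ]


def compressed_size(data: bytes, chunk_size: int) -> int:
    """Total bytes stored after chunked compression."""
    return sum(len(c) for c in compress_in_chunks(data, chunk_size))


def read_range(chunks: List[bytes], chunk_size: int, offset: int, length: int) -> bytes:
    """Read a byte range, decompressing only the chunks that overlap it."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    plain = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return plain[start:start + length]


# Hypothetical, highly repetitive payload to compare two chunk sizes.
data = b"sensor-42,temperature=21.5\n" * 10_000
print(compressed_size(data, 4 * 1024), compressed_size(data, 64 * 1024))
```

The trade-off is visible in `read_range`: any access has to decompress every chunk it touches in full, so the chunk size determines how much extra data is processed around each payload. Larger chunks usually compress better, while smaller chunks keep the work per operation closer to the size of the payload itself.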
For LSM-based databases, the compaction strategy you select also influences write performance. Compaction involves merging multiple SSTables into fewer, more organized files, to optimize read performance, reclaim disk space, reduce data fragmentation and maintain overall system efficiency.
When selecting a compaction strategy, you could aim for low read amplification, which makes reads as efficient as possible. Or, you could aim for low write amplification by keeping compaction from being too aggressive. Or, you could prioritize low space amplification and have compaction purge data as efficiently as possible. For example, ScyllaDB offers several compaction strategies (and Cassandra offers similar ones):
For write-heavy workloads, we warn users to avoid leveled compaction at all costs. That strategy is designed for read-heavy use cases. Using it can result in a regrettable 40x write amplification.
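To see how a figure in that range can come about, here’s a rough back-of-envelope model, not a measurement. It assumes (our simplification, not something any database guarantees) that each level is about `fanout` times larger than the one before it, so promoting data into the next level rewrites roughly `fanout` times as much existing data. The function name, the fan-out of 10 and the level count of 4 are all hypothetical.

```python
def leveled_write_amplification(levels: int, fanout: int = 10) -> int:
    """Approximate number of times each byte is rewritten under the model above."""
    # One round of roughly `fanout` rewrites per level the data passes through.
    return levels * fanout


# Four levels with a fan-out of 10 lands at about 40x in this simplified model.
print(leveled_write_amplification(levels=4))
```

Real write amplification depends on the data distribution, the overwrite rate and the implementation, so treat this as intuition for why the number gets large rather than as a prediction.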
In databases like ScyllaDB and Cassandra, batching can actually be a bit of a trap, especially for write-heavy workloads. If you’re used to relational databases, batching might seem like a good option for handling a high volume of writes. But it can actually slow things down if it’s not done carefully. Mainly, that’s because large or unstructured batches end up creating a lot of coordination and network overhead between nodes, and that’s really not what you want in a distributed database.
Here’s how to think about batching when you’re dealing with heavy writes:
So, if you’re in a write-heavy situation, structure your batches carefully to avoid the delays that big, cross-node batches can introduce.
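One common way to structure batches is to group writes by partition key on the client before sending them, so each batch targets a single partition instead of fanning out across nodes. Here’s a minimal sketch in plain Python. It doesn’t call any real driver API; the `group_by_partition` helper, the `max_batch_size` cap and the sample rows are hypothetical.

```python
from collections import defaultdict


def group_by_partition(writes, partition_key, max_batch_size=100):
    """Group writes so that every batch contains rows from one partition only.

    `partition_key` is a function that extracts the partition key from a write.
    Returns a list of (key, rows) tuples, each capped at `max_batch_size` rows.
    """
    groups = defaultdict(list)
    for write in writes:
        groups[partition_key(write)].append(write)

    batches = []
    for key, rows in groups.items():
        # Split large groups so that no single batch grows too big.
        for i in range(0, len(rows), max_batch_size):
            batches.append((key, rows[i:i + max_batch_size]))
    return batches


# Hypothetical sensor readings, partitioned by sensor ID.
readings = [
    {"sensor_id": "s1", "value": 20.1},
    {"sensor_id": "s2", "value": 18.4},
    {"sensor_id": "s1", "value": 20.3},
]
for key, rows in group_by_partition(readings, lambda r: r["sensor_id"]):
    print(key, len(rows))
```

Each resulting batch could then be sent as its own request, which keeps the coordination work for that batch limited to the replicas that own the partition.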
We offered quite a few warnings, but don’t worry. It was easy to compile a list of lessons learned because so many teams are extremely successful working with real-time write-heavy workloads. Now you know many of their secrets — without having to experience their mistakes. :-)
If you want to learn more, here are some firsthand perspectives from teams who tackled quite interesting write-heavy challenges:
Also, take a look at the following video, where we go into even greater depth on these write-heavy challenges, and also walk you through what these workloads look like on ScyllaDB.