![]() |
VOOZH | about |
We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.
Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.
Follow TNS on your favorite social media networks.
Become a TNS follower on LinkedIn.
Check out the latest featured and trending stories while you wait for your first TNS newsletter.
Reading and writing are distinctly different beasts. This is true with reading and writing words, reading and writing code, and also about reading and writing data to a database. So, when it comes to optimizing database performance, your read:write ratio really does matter.
I recently wrote about performance considerations that are important for write-heavy database workloads — covering factors like log-structured merge (LSM) tree vs. B-tree engines, payload size, compression, compaction and batching. But read-heavy database workloads bring a different set of challenges, including:
Let’s explore high-level considerations and best practices for handling latency-sensitive read-heavy workloads.
When we talk about a “real-time read-heavy” workload, we mean workloads that:
Here are a few examples of how they manifest themselves in the wild:
Here are some key considerations that impact read performance in real-time database systems.
To understand how databases like ScyllaDB process read operations, let’s recap its read path. When you submit a read (a SELECT statement), the database first checks for the requested data in memtables, which are in-memory data structures that temporarily hold your recent writes.
Additionally, the database checks whether the data is present in the cache. Why is this extra step necessary? Because the memtable may not always hold the latest data. Sometimes data could be written out of order, especially if applications consume data from unordered sources. As the protocol allows for clients to manipulate record timestamps to prevent correct ordering, checking both the memtable and the cache is necessary to ensure that the latest write gets returned.
Then, the database takes one of two actions:
As memtables flush to disk, the data also gets merged with the cache. That way, the cache ends up reflecting the latest on-disk data.
Reading from cache is always faster than reading from disk. The more data your database can serve directly from cache, the better its performance (since reading data from memory has a practically unlimited fetch ceiling).
But how can you tell whether your reads are going to cache or disk? Monitoring. You can use tools such as the ScyllaDB Monitoring Stack to learn all about your cache hits and misses. The fewer cache misses, the better your read latencies.
ScyllaDB uses a least recently used (LRU) caching strategy, similar to Redis and Memcached. When the cache gets full, the least-accessed data is evicted to make room for new entries. With the LRU approach, you need to be mindful about your reads. You want to avoid situations where a few “expensive” reads end up evicting important items from your cache.
If you don’t optimize cache usage, you might encounter a phenomenon called “cache thrashing.” That’s what happens when you’re continuously evicting and replacing items in your cache, essentially rendering the cache ineffective. For instance, full table scans can create significant cache pressure, particularly when your working set size is larger than your available caching space.
During a scan, if a competing workload relies on reading frequently cached data, its read latency will momentarily increase because those items were evicted. To prevent this situation, expensive reads should specify options like ScyllaDB’s BYPASS_CACHE to prevent its results from evicting important items.
Paging is another important factor to consider. It’s designed to prevent the database from running out of memory when scanning through large results. Basically, rows get split into pages as defined by your page size, and selecting an appropriate page size is essential for minimizing end-to-end latency.
For example, assume you have a quorum read request in a three-node cluster. Two replicas must respond for the request to be successful. Each replica computes a single page, which then gets reconciled by the coordinator before returning data back to the client. Note that:
Testing various page sizes helps with finding the optimal balance. Most drivers default to 5,000 rows per page, which works well in most cases, but you may want to increase from the defaults when scanning through wide rows or during full scans — at the expense of letting the database work more before receiving a response. Sometimes trial and error is needed to get the page size nicely tuned for your application.
While ScyllaDB’s LSM tree storage engine makes it quite well-suited for write-heavy workloads, our engineers have introduced a variety of features that optimize it for ultra-low latency reads as well. These include unified cache, SSTable index caching, workload prioritization, heat-weighted load balancing, prepared statements and high concurrency.
There are many ways you can keep latencies low with read-heavy workloads — even on databases such as ScyllaDB that are also optimized for write-heavy workloads. In fact, ScyllaDB performance is comparable to dedicated caching solutions like Memcached for certain workloads.
If you want to learn more, here are some firsthand perspectives from teams that tackled some interesting read-heavy challenges:
Also, take a look at the following video, which goes into even greater depth on these read-heavy challenges and also walks through what these workloads look like on ScyllaDB.