Redis is the “Swiss Army knife” of in-memory databases with many data types, and it’s often used for caching, but it does even more. It can also function as a loosely coupled distributed message broker, so in this article, we’ll have a look at the original Redis messaging approach, Redis Pub/Sub, explore some use cases and compare it with Apache Kafka.
A Beatles-inspired submarine cocktail. Image used under license from Shutterstock.com. Contributor: Evlakhov Valerii
The theme of “pub” pops up frequently in my articles. In a previous article, “Apache ZooKeeper Meets the Dining Philosophers,” I wrote about a conversation in an outback pub, and in this article, we are investigating Redis Pub/Sub. This sounds like some sort of Beatles-inspired exotic yellow submarine cocktail, but Pub/Sub is actually short for publish/subscribe, the well-known pattern for loosely coupled distributed messaging systems.
Redis Pub/Sub is the oldest style of messaging pattern supported by Redis and uses a data type called a “channel,” which supports typical pub/sub operations, such as publish and subscribe. It’s considered loosely coupled because publishers and subscribers don’t know about each other. Publishers publish messages to a channel, or multiple channels, and subscribers subscribe to one or more channels.
A channel can have zero or more subscribers, and messages are delivered to all currently connected subscribers. Redis Pub/Sub is therefore flexible and supports multiple topologies, including fan-in (multiple publishers, single subscriber), fan-out (single publisher, multiple subscribers) and 1-1 (one publisher, one subscriber).
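To make these topologies concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the Redis client API: `ToyBroker`, its methods and the channel names are all hypothetical. The one detail borrowed from Redis itself is that publishing returns the number of subscribers that received the message, as the PUBLISH command does.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a pub/sub broker (hypothetical, not Redis)."""

    def __init__(self):
        # channel name -> list of subscriber inboxes (plain Python lists)
        self.channels = defaultdict(list)

    def subscribe(self, channel, inbox):
        self.channels[channel].append(inbox)

    def publish(self, channel, message):
        # Deliver to every inbox currently subscribed to the channel and
        # report how many subscribers received the message.
        receivers = self.channels[channel]
        for inbox in receivers:
            inbox.append(message)
        return len(receivers)


broker = ToyBroker()
alice, bob = [], []

# Fan-out: one publisher, two subscribers on the same channel.
broker.subscribe("news", alice)
broker.subscribe("news", bob)
delivered = broker.publish("news", "hello")

# Fan-in: two publishers, one subscriber.
broker.subscribe("alerts", alice)
broker.publish("alerts", "from publisher A")
broker.publish("alerts", "from publisher B")

# A channel with zero subscribers: the message goes nowhere.
dropped = broker.publish("empty", "nobody is listening")
```

Neither side holds a reference to the other; publishers and subscribers only share a channel name, which is what makes the pattern loosely coupled.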
So far, this sounds like a fairly typical pub/sub system. However, one feature is important to highlight: “connected” delivery semantics.
Connected delivery functions like radio. Radio stations are constantly broadcasting on different frequencies (channels), but listeners can only hear a broadcast while their receiver is plugged in, turned on and tuned in to a station. (“Stay tuned” could be the motto for Redis Pub/Sub.)
Redis Pub/Sub: to listen, plug in, turn on and tune in. Image used under license from Shutterstock.com. Contributor: Everett Collection
Connected delivery means that:
This implies that:
Note that “disconnection” may be intentional, but it can also be caused by network or client failures and therefore be unexpected. Either way, it can result in message loss.
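The radio analogy can be sketched in a few lines of Python. Again, this is a toy model under my own assumptions (the `ToyChannel` class and its method names are hypothetical); it only illustrates the idea that nothing is buffered for a subscriber that is not connected at publish time.

```python
class ToyChannel:
    """Toy channel with 'connected' delivery: no buffering, no replay."""

    def __init__(self):
        self.connected = set()  # names of currently connected subscribers
        self.inboxes = {}       # subscriber name -> messages received

    def connect(self, name):
        self.connected.add(name)
        self.inboxes.setdefault(name, [])

    def disconnect(self, name):
        self.connected.discard(name)

    def publish(self, message):
        # Only subscribers connected right now receive the message.
        for name in self.connected:
            self.inboxes[name].append(message)
        return len(self.connected)


radio = ToyChannel()
radio.connect("listener")
radio.publish("song 1")

radio.disconnect("listener")      # intentional, or a network/client failure
missed = radio.publish("song 2")  # no one is connected, so it is lost

radio.connect("listener")         # tuning back in does not recover "song 2"
radio.publish("song 3")
heard = radio.inboxes["listener"]
```

After reconnecting, the listener has heard "song 1" and "song 3" only; the message published during the gap is never delivered.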
From reading the Redis Pub/Sub documentation and other articles, it appears that Redis uses push notifications to ensure messages are delivered to all current subscribers, which can carry a performance penalty for large numbers of subscribers.
Redis Pub/Sub channels can have multiple subscribers, but too many may have a performance impact (unlike real radio which works perfectly for unlimited receivers). Image used under license from Shutterstock.com. Contributor: jimeone
What are appropriate use cases for the Redis Pub/Sub “connected” delivery semantics?
Finally, I wondered if Apache Kafka can do something similar to Redis Pub/Sub? Can Kafka do:
Yes. This corresponds to multiple consumer groups in Kafka. A message sent to a Kafka topic with multiple consumer groups is received by one consumer in each group. So, for every subscriber to receive every message, as in Redis Pub/Sub, you would have a sole consumer per consumer group.
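The consumer group behavior can be sketched with a toy Python model. This is not the Kafka client API; `ToyTopic` and the group names are hypothetical, and real Kafka picks the receiving consumer through partition assignment rather than the simple rotation used here.

```python
from itertools import count


class ToyTopic:
    """Toy topic: each message goes to exactly one member of every group."""

    def __init__(self):
        self.groups = {}     # group id -> list of consumer inboxes
        self._seq = count()  # message sequence number

    def join(self, group_id, inbox):
        self.groups.setdefault(group_id, []).append(inbox)

    def send(self, message):
        n = next(self._seq)
        for members in self.groups.values():
            # Simplification: rotate through the members of each group.
            members[n % len(members)].append(message)


topic = ToyTopic()
a, b = [], []    # each is the sole consumer in its own group
w1, w2 = [], []  # these two share a single group

topic.join("group-a", a)
topic.join("group-b", b)
topic.join("workers", w1)
topic.join("workers", w2)

for m in ["m0", "m1", "m2", "m3"]:
    topic.send(m)
```

Consumers `a` and `b` each see all four messages, which mimics Redis Pub/Sub broadcast, while `w1` and `w2` split the messages between them.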
Redis Pub/Sub messages don’t have a key, just a value, although in Redis the channel is really the key.
Yes, as keys are optional in Kafka. If there’s no key, then Kafka uses a round-robin load-balancing algorithm to distribute the messages sent to a topic among its partitions, and therefore among the available consumers in each group. If there’s only one consumer, then that consumer gets all the messages.
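As a rough illustration of keyless versus keyed distribution, here is a small Python sketch. It assumes three partitions and uses CRC32 as a stand-in hash purely to keep the example deterministic; Kafka's own partitioner uses a different hash, and real clients may apply other strategies for keyless records. `choose_partition` is a hypothetical helper, not a Kafka API.

```python
import zlib
from itertools import cycle

NUM_PARTITIONS = 3  # assumed partition count for this sketch
_round_robin = cycle(range(NUM_PARTITIONS))


def choose_partition(key):
    """Pick a partition: rotate when there is no key, hash when there is."""
    if key is None:
        return next(_round_robin)
    # Stand-in hash (an assumption, not Kafka's algorithm).
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


keyless = [choose_partition(None) for _ in range(6)]
keyed = [choose_partition("sensor-7") for _ in range(3)]
```

Keyless messages rotate evenly through the partitions, while messages with the same key always land on the same partition, and so reach the same consumer within a group.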
Yes, Kafka can do this as well. Kafka consumers can choose which offset (or, alternatively, which timestamp) to read from, which enables tricks such as replaying the same messages, reliable disconnected delivery from the last read message, and skipping messages. Kafka consumers poll for messages, so each time they poll, they can either read from the next (unread) offset or skip the unread messages, jump to the end offset (using seekToEnd()) and read only new messages. This is certainly not the normal model of operation for Kafka, but it is logically possible and fits several use cases and operational requirements. For example, consumers that are falling behind can catch up by skipping messages.
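The offset tricks described above can be sketched with a toy log and consumer in Python. The classes are hypothetical and greatly simplified (one partition, no consumer groups); `seek_to_end` here only mirrors the idea behind the Kafka consumer's seekToEnd().

```python
class ToyLog:
    """Append-only list of records; the index of a record is its offset."""

    def __init__(self):
        self.records = []

    def append(self, value):
        self.records.append(value)

    @property
    def end_offset(self):
        return len(self.records)


class ToyConsumer:
    """Polling consumer that tracks its own read position."""

    def __init__(self, log):
        self.log = log
        self.position = 0

    def poll(self):
        # Return everything from the current position to the end of the log.
        batch = self.log.records[self.position:]
        self.position = self.log.end_offset
        return batch

    def seek(self, offset):
        self.position = offset

    def seek_to_end(self):
        self.position = self.log.end_offset


log = ToyLog()
consumer = ToyConsumer(log)

log.append("a")
log.append("b")
first = consumer.poll()

log.append("c")             # written while the consumer is away
resumed = consumer.poll()   # nothing lost: reading resumes at the last offset

consumer.seek(0)
replayed = consumer.poll()  # replay the whole log

log.append("d")
consumer.seek_to_end()      # skip "d", Redis Pub/Sub style
log.append("e")
latest_only = consumer.poll()
```

The last poll returns only "e": by seeking to the end before polling, the consumer deliberately gives up unread messages and behaves like a freshly connected Redis Pub/Sub subscriber.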
Redis Pub/Sub is designed for speed (low latency), but only with low numbers of subscribers. Subscribers don’t poll and while subscribed/connected are able to receive push notifications very quickly from the Redis broker — in the low milliseconds, even less than 1 millisecond as confirmed by this benchmark.
Also note that some articles report that Redis Pub/Sub performance is sensitive to message size; it works well with small messages, but not large ones.
Average Kafka latency is typically in the low tens of milliseconds. (Our partition benchmarking article reported an average producer latency of 15 to 30 milliseconds.)
Kafka also wasn’t designed for large messages, but it can work with reasonably large messages, even up to 1GB, particularly if compression is enabled and potentially in conjunction with Kafka tiered storage.
To maximize Redis throughput, you need to pipeline the producer’s publish operations, but this increases latency, so you can’t have both low latency and high throughput with Redis Pub/Sub.
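A back-of-the-envelope model in Python shows why the two goals pull against each other. Everything here is an assumption for illustration: a single publisher connection, a fixed 1 ms round trip, messages arriving at a steady 50,000 per second, and a publisher that waits to fill a batch before sending it. The numbers are hypothetical, not measurements.

```python
def throughput_per_sec(batch_size, round_trip_ms):
    """Messages per second if each round trip carries one full batch."""
    return batch_size * 1000.0 / round_trip_ms


def worst_case_wait_ms(batch_size, arrival_rate_per_sec):
    """Time the first message in a batch waits for the batch to fill."""
    return (batch_size - 1) * 1000.0 / arrival_rate_per_sec


ROUND_TRIP_MS = 1.0      # assumed network round trip
ARRIVAL_RATE = 50_000.0  # assumed messages produced per second

# No pipelining: one message per round trip, no extra waiting.
single_throughput = throughput_per_sec(1, ROUND_TRIP_MS)
single_wait_ms = worst_case_wait_ms(1, ARRIVAL_RATE)

# Pipelining 100 messages per round trip.
pipelined_throughput = throughput_per_sec(100, ROUND_TRIP_MS)
pipelined_wait_ms = worst_case_wait_ms(100, ARRIVAL_RATE)
```

Under these assumptions, pipelining lifts the ceiling from 1,000 to 100,000 messages per second, but the first message in each batch waits almost 2 ms before it is even sent, which is more than the sub-millisecond delivery latency Redis Pub/Sub is prized for.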
Redis is mostly single-threaded, so the only way to improve broker concurrency is by increasing the number of nodes in a cluster.
On the other hand, Kafka consumers rely on polling, and potentially batching of messages, so latency is potentially slightly higher, typically tens of milliseconds. However, scalability is better due to Kafka consumer groups and topic partitions, which enable very high consumer concurrency backed by broker concurrency (multiple nodes and partitions).
And just a reminder that Redis Pub/Sub isn’t durable (the channels are in-memory only), but Kafka is highly durable (it’s disk-based, and has configurable replication to multiple different nodes).
Kafka also has automatic failover for consumers in groups — if consumers fail, others take over (but watch out for rebalancing storms). And Kafka Connect enables higher reliability by automatically restarting connector tasks for some failure modes. Given that Redis Pub/Sub doesn’t have the concept of subscriber groups, you are on your own here and would need to handle this differently, perhaps running Redis subscriber clients in Kubernetes pods with automatic restarts and scaling, etc.