URL: https://thenewstack.io/raft-native-the-foundation-for-streaming-datas-best-future/


Raft Native: The Foundation for Streaming Data’s Best Future

KRaft brings more simplicity to Kafka. But truly Raft-native systems show the most promise for tomorrow’s hypergig streaming data workloads.
May 30th, 2023 9:44am by Doug Flora
Redpanda sponsored this post.

Consensus is fundamental to consistent, distributed systems. To guarantee system availability in the event of inevitable crashes, systems need a way to ensure each node in the cluster is in alignment, such that work can seamlessly transition between nodes in the case of failures.

Consensus protocols such as Paxos, Raft and Viewstamped Replication (VSR) help to drive resiliency for distributed systems by providing the logic for processes like leader election, atomic configuration changes and synchronization.

As with all design elements, the different approaches to distributed consensus offer different tradeoffs. Paxos is one of the oldest consensus protocols around and is used in many systems like Google Spanner, Apache Cassandra and Amazon DynamoDB.

Paxos achieves consensus in a three-phased, leaderless, majority-wins protocol. While Paxos is effective in driving correctness, it is notoriously difficult to understand, implement and reason about. This is partly because it obscures many of the challenges in reaching consensus (such as leader election and reconfiguration), making it difficult to decompose into subproblems.

Raft (for reliable, replicated, redundant and fault-tolerant) can be thought of as an evolution of Paxos that is focused on understandability. Raft provides the same correctness guarantees as Paxos but is easier to understand and to implement in the real world, which often translates into more reliable implementations.

For example, Raft uses a stable form of leadership, which simplifies replication log management. And its leader election process, driven through an elegant “heartbeat” system, is more compatible with the Kafka-producer model of pushing data to the partition leader, making it a natural fit for streaming data systems like Redpanda. More on this later.
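
The heartbeat mechanism can be illustrated with a minimal sketch. This is a toy model, not Redpanda's implementation: the `Follower` class, its field names and the timeout range are all assumptions made for illustration. It shows only the core idea that a follower resets a timer on each leader heartbeat and becomes a candidate once the leader has been silent for longer than a randomized election timeout.

```python
import random
from dataclasses import dataclass


@dataclass
class Follower:
    """Toy follower that tracks when it last heard from the leader."""
    election_timeout: float       # seconds of leader silence tolerated
    last_heartbeat: float = 0.0   # timestamp of the latest heartbeat
    role: str = "follower"

    def on_heartbeat(self, now: float) -> None:
        # A heartbeat from the current leader resets the election timer.
        self.last_heartbeat = now
        self.role = "follower"

    def tick(self, now: float) -> str:
        # If the leader has been silent for at least the timeout, the
        # follower becomes a candidate (and would then request votes).
        silent_for = now - self.last_heartbeat
        if self.role == "follower" and silent_for >= self.election_timeout:
            self.role = "candidate"
        return self.role


def random_election_timeout(low: float = 0.15, high: float = 0.30) -> float:
    # Raft randomizes timeouts so followers rarely time out together.
    # The range used here is an illustrative assumption.
    return random.uniform(low, high)
```

With `Follower(election_timeout=0.2)` and a heartbeat received at time 1.0, a call to `tick(now=1.1)` leaves the node a follower, while `tick(now=1.3)` turns it into a candidate. Vote counting and term handling are deliberately left out.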

Because Raft decomposes the consensus problem into distinct logical components, for example by making leader election a separate step that precedes replication, it is a flexible protocol to adapt for complex, modern distributed systems. These systems need to maintain correctness and performance while scaling to petabytes of throughput, and Raft is also simpler to understand for new engineers hacking on the codebase.

For these reasons, Raft has been rapidly adopted for today’s distributed and cloud native systems like MongoDB, CockroachDB, TiDB and Redpanda to achieve greater performance and transactional efficiency.

How Redpanda Implements Raft Natively to Accelerate Streaming Data

When Redpanda founder Alex Gallego determined that the world needed a new streaming data platform — to support the kind of gigabytes-per-second workloads that bring Apache Kafka to a crawl without major hardware investments — he decided to rewrite Kafka from the ground up.

The requirements for what would become Redpanda were: 1) it needed to be simple and lightweight to reduce the complexity and inefficiency of running Kafka clusters reliably at scale; 2) it needed to maximize the performance of modern hardware to provide low latency for large workloads; and 3) it needed to guarantee data safety even for very large throughputs.

The initial design for Redpanda used chain replication: Data is produced to node A, then replicated from A to B, B to C and so on. This was helpful in supporting throughput, but fell short for latency and performance, due to the inefficiencies of chain reconfiguration in the event of node downtime (say B crashes: Do you fail the write? Does A try to write to C?). It was also unnecessarily complex, as it would require an additional process to supervise the nodes and push reconfigurations to a quorum system.

Ultimately, Alex decided on Raft as the foundation for Redpanda consensus and replication, due to its understandability and strong leadership. Raft satisfied all of Redpanda’s high-level design requirements:

  • Simplicity. Every Redpanda partition is a Raft group, so everything in the platform, including both metadata management and partition replication, is built and reasoned about in terms of Raft. This contrasts with Kafka, where data replication is handled by ISR (in-sync replicas) and metadata management is handled by ZooKeeper (or KRaft), leaving two systems that must coordinate with one another.
  • Performance. The Redpanda Raft implementation can tolerate disturbances to a minority of replicas, so long as the leader and a majority of replicas are stable. In cases when a minority of replicas have a delayed response, the leader does not have to wait for their responses to progress, mitigating impact on latency. Redpanda is therefore more fault-tolerant and can deliver predictable performance at scale.
  • Reliability. When Redpanda ingests events, they are written to a topic partition and appended to a log file on disk. Every topic partition then forms a Raft consensus group, consisting of a leader plus a number of followers, as specified by the topic’s replication factor. A Redpanda Raft group can tolerate ƒ failures given 2ƒ+1 nodes; for example, in a cluster with five nodes and a topic with a replication factor of five, two nodes can fail and the topic will remain operational. Redpanda leverages the Raft joint consensus protocol to provide consistency even during reconfiguration.
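
The quorum arithmetic behind the performance and reliability points can be sketched in a few lines. This is a simplified illustration of the standard Raft majority rule, not Redpanda's code: the function names and the `match_index` list are assumptions, and the check that a committed entry belongs to the leader's current term is omitted.

```python
from typing import Sequence


def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Failures f that a group of 2f+1 replicas can survive."""
    return (replicas - 1) // 2


def commit_index(match_index: Sequence[int]) -> int:
    """Highest log index stored on a majority of replicas.

    match_index holds, for each replica (leader included), the last
    log index known to be replicated there.
    """
    ranked = sorted(match_index, reverse=True)
    # The entry at position quorum-1 is held by at least a majority.
    return ranked[quorum_size(len(match_index)) - 1]
```

For a replication factor of five, `quorum_size(5)` is 3 and `tolerated_failures(5)` is 2, matching the example above. With `match_index = [10, 10, 9, 4, 3]`, `commit_index` returns 9: the two lagging replicas do not hold back progress, because the leader only needs a majority.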

Redpanda also extends core Raft functionality in some critical ways to achieve the scalability, reliability and speed required of a modern, cloud native solution. Redpanda enhancements to Raft tend to focus on Day 2 operations, for instance how to ensure the system runs reliably at scale. These innovations include changes to the election process, heartbeat generation and, critically, support for Apache Kafka `acks`.

Redpanda’s optimistic implementation of Raft is what enables it to be significantly faster than Kafka while still guaranteeing data safety. In fact, Jepsen testing has verified that Redpanda is a safe system without known consistency problems and a solid Raft-based consensus layer.

But What about KRaft?

While Redpanda takes a Raft-native approach, the legacy streaming data platforms have been laggards in adopting modern approaches to consensus. Kafka itself is a replicated distributed log, but it has historically relied on yet another replicated distributed log — Apache ZooKeeper — for metadata management and controller election.

This has been problematic for a few reasons: 1) managing multiple systems introduces administrative burden; 2) scalability is limited due to inefficient metadata handling and double caching; and 3) clusters can become bloated and resource intensive. In fact, it is not uncommon to see clusters with equal numbers of ZooKeeper and Kafka nodes.

These limitations have not gone unacknowledged by Apache Kafka’s committers and maintainers, who are in the process of replacing ZooKeeper with a self-managed metadata quorum: Kafka Raft (KRaft).

This event-based flavor of Raft achieves metadata consensus via an event log, called a metadata topic, that improves recovery time and stability. KRaft is a positive development for the upstream Apache Kafka project because it helps alleviate pains around partition scalability and generally reduces the administrative challenges of Kafka metadata management.

Unfortunately, KRaft does not solve the problem of having two different systems for consensus in a Kafka cluster. In the new KRaft paradigm, KRaft partitions handle metadata and cluster management, but replication is handled by the brokers using ISR, so you still have these two distinct platforms and the inefficiencies that arise from that inherent complexity.
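
To make the ISR side of this split concrete, here is a simplified sketch of how a Kafka-style high watermark depends on the current ISR membership. It is an illustration under stated assumptions, not Kafka's source code: the function name, replica names and offsets are hypothetical, and the way replicas enter or leave the ISR is not modeled.

```python
from typing import Dict, Set


def isr_high_watermark(log_end_offset: Dict[str, int], isr: Set[str]) -> int:
    """Offset up to which every in-sync replica has the data.

    log_end_offset maps each replica to its log end offset; isr is the
    current set of in-sync replicas (the leader included).
    """
    return min(log_end_offset[replica] for replica in isr)


# Hypothetical partition with three replicas, one of them lagging.
offsets = {"a": 100, "b": 100, "c": 60}

# While "c" is still in the ISR, it holds the high watermark back.
slow_hw = isr_high_watermark(offsets, {"a", "b", "c"})

# Once "c" is dropped from the ISR, the high watermark can advance.
fast_hw = isr_high_watermark(offsets, {"a", "b"})
```

In this sketch the result depends on who is in the ISR, and that membership is itself a piece of cluster metadata. This is the sense in which replication and metadata management remain two cooperating mechanisms rather than one.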

The engineers behind KRaft are upfront about these limitations, although some exaggerated vendor pronouncements have created ambiguity around the issue, suggesting that KRaft is far more transformative.

Combining Raft with Performance Engineering: A New Standard for Streaming Data

As data industry leaders like CockroachDB, MongoDB, Neo4j and TiDB have demonstrated, Raft-based systems deliver simpler, faster and more reliable distributed data environments. Raft is becoming the standard consensus protocol for today’s distributed data systems because it marries particularly well with performance engineering to further boost the throughput of data processing.

For example, Redpanda combines Raft with speedy architectural ingredients to perform at least 10 times faster than Kafka at tail latencies (p99.99) when processing a 1GBps workload, on one-third the hardware, without compromising data safety.

Traditionally, GBps+ workloads have been a burden for Apache Kafka, but Redpanda can support them with double-digit millisecond latencies, while retaining Jepsen-verified reliability. How is this achieved? Redpanda is written in C++, and uses a thread-per-core architecture to squeeze the most performance out of modern chips and network cards. These elements work together to elevate the value of Raft for a distributed streaming data platform.

[Figure: Redpanda vs. Kafka with KRaft performance benchmark, May 11, 2023]

An example of this in terms of Redpanda internals: Because Redpanda bypasses the page cache and the Java virtual machine (JVM) dependency of Kafka, it can embed hardware-level knowledge into its Raft implementation.

Typically, every time you write in Raft you have to flush to guarantee the durability of writes on disk. In Redpanda’s approach to Raft, smaller intermittent flushes are dropped in favor of a larger flush at the end of a call. While this introduces some additional latency per call, it reduces overall system latency and increases overall throughput, because it is reducing the total number of flush operations.
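
The effect on flush counts can be sketched with a toy log. This is not Redpanda's storage code: `CountingLog` and both write functions are hypothetical names, and `flush` merely stands in for an fsync-style call by counting invocations.

```python
from typing import Iterable, List


class CountingLog:
    """In-memory stand-in for an on-disk log that counts flushes."""

    def __init__(self) -> None:
        self.entries: List[bytes] = []
        self.durable_upto = 0   # number of entries covered by a flush
        self.flushes = 0

    def append(self, record: bytes) -> None:
        self.entries.append(record)

    def flush(self) -> None:
        # Stands in for an fsync-style call: everything appended so far
        # is now considered durable.
        self.durable_upto = len(self.entries)
        self.flushes += 1


def write_flush_each(log: CountingLog, batch: Iterable[bytes]) -> int:
    """Flush after every record, as in the textbook description."""
    for record in batch:
        log.append(record)
        log.flush()
    return log.flushes


def write_flush_once(log: CountingLog, batch: Iterable[bytes]) -> int:
    """Append the whole batch, then flush once at the end of the call."""
    for record in batch:
        log.append(record)
    log.flush()
    return log.flushes
```

For a batch of 100 records, `write_flush_each` performs 100 flushes and `write_flush_once` performs one, yet both leave all 100 records marked durable before returning. The tradeoff is that in the batched version no record is durable until the single flush at the end completes.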

While there are many effective ways to ensure consistency and safety in distributed systems (blockchains do it very well with Proof of Work and Proof of Stake protocols), Raft is a proven approach and flexible enough that it can be enhanced to adapt to new challenges.

As we enter a new world of data-driven possibilities, driven in part by AI and machine learning use cases, the future is in the hands of developers who can harness real-time data streams. Raft-based systems, combined with performance-engineered elements like C++ and thread-per-core architecture, are driving the future of data streaming for mission-critical applications.

Redpanda is the streaming data platform for developers. Built with a native Kafka API, Redpanda eliminates complexity, maximizes performance and reduces costs. Its lean architecture gives you 10x lower latencies and up to a 6x lower cloud spend — without sacrificing reliability or durability.
Doug Flora is a director of product for Redpanda Data, where he focuses on product strategy, go to market and product-led growth for Redpanda’s streaming data platform. He has more than 12 years experience in technology, spanning the analytics, database,...
MongoDB is a sponsor of The New Stack.
TNS owner Insight Partners is an investor in: Pragma, Statement.