URL: https://thenewstack.io/a-closer-look-at-the-portworx-storage-cluster-architecture/

A Closer Look at the Portworx Storage Cluster Architecture

An introduction to the Portworx cloud native storage system
Mar 27th, 2020 9:29am by Janakiram MSV

Portworx is a modern, distributed, cloud native storage platform designed to work with orchestrators such as Kubernetes. The platform, from the company of the same name, brings some of the proven techniques applied to traditional storage architecture to the cloud native environment.

Continuing the series on stateful workloads and the cloud native storage offerings, I will introduce the architecture of Portworx.

What Is Portworx?

Portworx is a software-defined storage platform built for containers and microservices. It abstracts multiple storage devices to expose a unified, overlay storage layer to cloud native applications. Portworx users can deploy highly available stateful applications across multiple physical hosts in a data center, across compute instances running in multiple zones or regions, and even across different cloud providers.

Portworx can be easily installed on any host that runs a container runtime such as Docker. Since the platform relies on its own distributed services, it is possible to configure a multinode Portworx storage cluster without installing Kubernetes. But through tight integration with Kubernetes, Portworx makes it possible to create hyperconverged or disaggregated deployments. In a hyperconverged scenario, compute and storage run on the same node, while in a disaggregated scenario, only designated nodes act as storage nodes.

During the last year, Portworx has matured from being an overlay storage layer to an enterprise data platform. The current offering includes everything from integrated security to business continuity to dynamic scaling of storage pools.

Portworx Architecture

Like most distributed platforms, Portworx implements a control plane and a data plane. The control plane acts as the command and control center for all the storage nodes participating in the cluster. Each storage node runs a data plane responsible for managing the I/O and the attached storage devices.

Both the control plane and the data plane run in a distributed mode. This ensures the high availability of the storage service. To achieve the best uptime, Portworx recommends running at least three storage nodes in a cluster. Depending on the size of the cluster, each node may run the control plane as well as the data plane components. In large clusters, it is possible to have nodes that don’t participate in the data plane, which means they are not designated storage nodes.
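The three-node recommendation is easier to appreciate with a little arithmetic. The sketch below assumes the cluster needs a strict majority of its members to keep making progress, which is how etcd-style systems behave; the exact quorum rule Portworx applies is not stated here, so treat the function names and the rule itself as illustrative.

```python
def quorum_size(members: int) -> int:
    """Smallest strict majority of `members` nodes (assumed quorum rule)."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum_size(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} nodes: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this assumption, a two-node cluster tolerates no failures at all, while three nodes is the smallest size that survives the loss of one member.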

[Illustration: a three-node Portworx storage cluster]

The above illustration depicts a three-node Portworx storage cluster. The control plane runs on three individual nodes that share the same key/value database. The cluster is identified by a unique ID that all the participating nodes of the control plane use. When a new node that runs the control plane joins, it is expected to use the same cluster ID.
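The join rule can be pictured with a toy model. In this sketch, `SharedKvdb` and its `join` method are hypothetical names invented for illustration, not Portworx APIs; the only behavior taken from the description above is that a node must present the cluster's own ID to become a member.

```python
from dataclasses import dataclass, field


@dataclass
class SharedKvdb:
    """Hypothetical stand-in for the key/value database the nodes share."""
    cluster_id: str
    members: set = field(default_factory=set)

    def join(self, node_name: str, cluster_id: str) -> bool:
        """Register a node only if it presents the cluster's own ID."""
        if cluster_id != self.cluster_id:
            return False
        self.members.add(node_name)
        return True


if __name__ == "__main__":
    kvdb = SharedKvdb(cluster_id="px-demo-cluster")
    print(kvdb.join("node-1", "px-demo-cluster"))  # accepted
    print(kvdb.join("node-9", "some-other-id"))    # rejected
```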

The data plane runs on one or more storage nodes that have an attached block storage device. The data plane is responsible for managing node-level operations and I/O redirection across the storage nodes.

The control plane uses gRPC to communicate with the data plane.

Let’s take a closer look at the components of the control plane and the data plane.

Portworx Control Plane

The control plane exposes an external interface for managing the cluster. It is used by the native CLI of Portworx, pxctl, to perform all the storage-related tasks such as the creation of volumes and storage pools. Orchestrators such as Kubernetes use this API to coordinate the placement and scheduling of stateful pods.

The Portworx service API is available as a REST endpoint, gRPC service, and through the Container Storage Interface (CSI). Portworx has open sourced OpenStorage SDK, a specification and a library that defines common storage operations performed in the context of cloud native environments. The SDK has bindings for Golang and Python which makes it easy to invoke the API. There is also a Swagger UI available within a Portworx cluster that can be accessed on port 9021 of any node.

The nodes in the control plane use a gossip protocol to exchange heartbeats and real-time statistics on I/O usage and available CPU and memory. This mechanism ensures the high availability of the control plane. The stats from the control plane are also shared with the data plane, which helps in making scheduling decisions.
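To make the gossip idea concrete, here is a minimal sketch of one common way such protocols work: each node keeps a view of every peer, and when two nodes talk they keep whichever entry carries the newer heartbeat counter. The class names, fields, and merge rule are my own assumptions for illustration and are not taken from the Portworx implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeState:
    heartbeat: int        # counter the owning node bumps on every tick
    cpu_free_pct: float
    mem_free_mb: int


class GossipNode:
    def __init__(self, name):
        self.name = name
        self.view = {name: NodeState(0, 100.0, 0)}

    def tick(self, cpu_free_pct, mem_free_mb):
        """Publish fresh local stats under a higher heartbeat counter."""
        current = self.view[self.name]
        self.view[self.name] = NodeState(
            current.heartbeat + 1, cpu_free_pct, mem_free_mb
        )

    def merge(self, remote_view):
        """Keep, per node, whichever entry carries the newer heartbeat."""
        for node, state in remote_view.items():
            known = self.view.get(node)
            if known is None or state.heartbeat > known.heartbeat:
                self.view[node] = state


def exchange(a, b):
    """One gossip round between two peers: each merges the other's view."""
    a_view, b_view = dict(a.view), dict(b.view)
    a.merge(b_view)
    b.merge(a_view)
```

Note how information spreads without every node contacting every other node: if node a gossips with b, and b later gossips with c, then c learns a's latest stats indirectly.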

The cluster’s metadata is stored in etcd, a distributed key/value database. The root of the KVDB consists of the cluster ID common to all the nodes, along with other information such as volume configuration and node registration status. This acts as a single source of truth reflecting the current state of the cluster.
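One way to picture this layout is a keyspace in which every entry hangs off the cluster ID. The sketch below uses an in-memory dictionary in place of etcd, and the key paths (`volumes/...`, `nodes/...`) are hypothetical; the real key names Portworx writes are not described in this article.

```python
import json


class ClusterKvdb:
    """In-memory stand-in for etcd; every key lives under the cluster ID."""

    def __init__(self, cluster_id):
        self.cluster_id = cluster_id
        self._store = {}

    def _key(self, *parts):
        return "/".join((self.cluster_id,) + parts)

    def put(self, *parts, value):
        # Values are stored as JSON strings, as they would be in a KV store.
        self._store[self._key(*parts)] = json.dumps(value)

    def get(self, *parts):
        raw = self._store.get(self._key(*parts))
        return None if raw is None else json.loads(raw)

    def list_prefix(self, *parts):
        """Return all keys stored below the given path, sorted."""
        prefix = self._key(*parts) + "/"
        return sorted(k for k in self._store if k.startswith(prefix))


if __name__ == "__main__":
    kvdb = ClusterKvdb("px-demo-cluster")
    kvdb.put("volumes", "vol-1", value={"size_gb": 10, "replicas": 3})
    kvdb.put("nodes", "node-1", value={"status": "registered"})
    print(kvdb.list_prefix("volumes"))
```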

The provisioning management component is responsible for configuring the storage pools, provisioning volumes, sending instructions to the data plane for mounting and unmounting volumes, and distributing the replicas of storage blocks across multiple fault domains. Essentially, the provisioning service deals with the lifecycle of storage pools and volumes.
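Spreading replicas across fault domains can be sketched with a simple round-robin policy: use every domain once before placing a second replica in any of them. This is an illustrative policy of my own choosing, not the placement algorithm Portworx ships, and `place_replicas` is a hypothetical helper.

```python
def place_replicas(nodes, replicas):
    """Pick `replicas` nodes, using each fault domain once before reusing any.

    `nodes` maps node name -> fault domain (for example a zone or a rack).
    """
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")

    # Group candidate nodes by fault domain, in a deterministic order.
    by_domain = {}
    for name in sorted(nodes):
        by_domain.setdefault(nodes[name], []).append(name)

    chosen = []
    # Round-robin over domains so replicas land in distinct domains first.
    while len(chosen) < replicas:
        for domain in sorted(by_domain):
            if by_domain[domain] and len(chosen) < replicas:
                chosen.append(by_domain[domain].pop(0))
    return chosen


if __name__ == "__main__":
    cluster = {"n1": "zone-a", "n2": "zone-a", "n3": "zone-b", "n4": "zone-c"}
    print(place_replicas(cluster, 3))
```

With three replicas and three zones, each zone holds exactly one copy, so losing an entire zone still leaves two replicas available.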

Finally, the background tasks component handles RAID scans, increments of the HA count, forced resyncs of replicas, and snapshots of volumes taken on a predefined schedule.

Portworx Data Plane

The control plane and data plane talk to each other over gRPC. All decisions taken by the control plane are sent as instructions to the corresponding node in the data plane.

The data plane performs I/O to the devices through the POSIX interface. Each data plane node communicates with the other nodes over RPC, which is used to replicate data across multiple nodes.

In a Portworx cluster, a disk or a block storage device can be attached to only one node at a time, and that node assumes responsibility for the data path.

When data is written to a volume exposed through a bind mount within the container/pod, it goes to the Linux kernel via the I/O queue. The data then goes to the write-through cache, which eventually commits the data to the disk. If the data is available within the cache, it responds to a read operation without going to the underlying storage. Each write operation is also associated with a timestamp, which helps identify the most recent data across all the nodes. If one of the nodes participating in a replication becomes unavailable, the timestamp helps resync the missing data on that node.
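The interplay between the cache and the timestamps can be sketched in a few lines. This is a simplified model under stated assumptions: the disk is a plain dictionary, the cache commits to it synchronously on every write, and timestamps are logical counters supplied by the caller. `WriteThroughCache` and `resync` are hypothetical names, not Portworx components.

```python
class WriteThroughCache:
    def __init__(self, disk):
        self.disk = disk      # dict: block -> (timestamp, data), stands in for the device
        self.cache = {}
        self.disk_reads = 0   # counts reads that had to go to the device

    def write(self, block, data, timestamp):
        # Write-through: update the cache and the backing store together.
        self.cache[block] = (timestamp, data)
        self.disk[block] = (timestamp, data)

    def read(self, block):
        if block in self.cache:
            return self.cache[block][1]   # cache hit, device untouched
        self.disk_reads += 1
        entry = self.disk[block]
        self.cache[block] = entry
        return entry[1]


def resync(stale, fresh):
    """Replay onto `stale` every block whose timestamp is newer on `fresh`."""
    copied = []
    for block, (timestamp, data) in sorted(fresh.disk.items()):
        current = stale.disk.get(block)
        if current is None or current[0] < timestamp:
            stale.write(block, data, timestamp)
            copied.append(block)
    return copied
```

A replica that missed some writes while it was offline only needs the blocks whose timestamps are behind, rather than a full copy of the volume.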

The I/O dispatch component identifies the node with the target storage volume and redirects the operation to that specific node. If a volume has multiple replicas, the I/O dispatcher ensures that each node receives a replica. The device store acts as an interface between the attached storage and the node.
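As a rough sketch of that routing decision, assume the dispatcher has a map from each volume to the nodes holding its replicas, with the first entry being the node the volume is attached to. Both the map layout and the `dispatch` function are assumptions made for illustration.

```python
def dispatch(replica_map, volume, op):
    """Return the nodes that must handle `op` ("read" or "write") on `volume`.

    `replica_map` maps volume name -> list of node names; by assumption the
    first node in the list is the one the volume is attached to.
    """
    try:
        holders = replica_map[volume]
    except KeyError:
        raise LookupError(f"unknown volume: {volume}") from None

    if op == "write":
        return list(holders)      # every replica node receives the write
    if op == "read":
        return [holders[0]]       # redirect to the node holding the volume
    raise ValueError(f"unsupported operation: {op}")


if __name__ == "__main__":
    replica_map = {"vol-1": ["node-2", "node-3"]}
    print(dispatch(replica_map, "vol-1", "read"))
    print(dispatch(replica_map, "vol-1", "write"))
```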

The node to which the block storage device is attached becomes the transaction coordinator. The I/O targeting the volume always goes through the transaction coordinator. Portworx also supports shared volumes that are visible to all the nodes. Even when shared volumes are used, the transaction coordinator takes responsibility for committing the write and sending an acknowledgment to the upstream components.

When a write operation has been performed across a quorum of replicas, the write is acknowledged by the Linux kernel, and that acknowledgment is forwarded to the user.
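The coordinator's role in the write path can be modeled as follows. The sketch assumes a quorum means a strict majority of replicas and that an offline replica simply fails to apply the write; `Replica` and `TransactionCoordinator` are illustrative classes, not the Portworx implementation.

```python
class Replica:
    def __init__(self, name, online=True):
        self.name = name
        self.online = online
        self.blocks = {}

    def apply(self, block, data):
        """Store the block and report success; offline replicas report failure."""
        if not self.online:
            return False
        self.blocks[block] = data
        return True


class TransactionCoordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.quorum = len(replicas) // 2 + 1   # assumed: strict majority

    def write(self, block, data):
        """Send the write to every replica; acknowledge once a quorum applied it."""
        acks = sum(1 for replica in self.replicas if replica.apply(block, data))
        return acks >= self.quorum


if __name__ == "__main__":
    replicas = [Replica("a"), Replica("b"), Replica("c", online=False)]
    coordinator = TransactionCoordinator(replicas)
    print(coordinator.write(0, b"hello"))   # two of three replicas applied it
```

In this model, a write that fails to reach a quorum is not acknowledged even though some replicas may already hold the data; a real system would reconcile those replicas later, for instance through the timestamp-based resync described earlier.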

Portworx’s data plane also ensures the security of data at rest. It is done through dm-crypt, a disk encryption system built into the Linux kernel based on the crypto subsystem and device-mapper. On Intel CPUs, Portworx takes advantage of hardware acceleration to minimize the burden on hosts. Every write operation goes through the encryption process before being committed to the disk.

Portworx has a fascinating architecture for implementing a modern, distributed, cloud native storage platform. In future articles of this series, I will cover the security, custom scheduling, migration, disaster recovery, and dynamic volume management aspects of Portworx. Stay tuned!

Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar at http://mi2.live.

Portworx is a sponsor of The New Stack.

Feature image by dariasophia from Pixabay.

Janakiram MSV (Jani) is a practicing architect, research analyst, and advisor to Silicon Valley startups. He focuses on the convergence of modern infrastructure powered by cloud-native technology and machine intelligence driven by generative AI. Before becoming an entrepreneur, he spent...
TNS owner Insight Partners is an investor in: Docker.