URL: https://thenewstack.io/distributed-database-architecture-what-is-it/

⇱ Distributed Database Architecture: What Is It? - The New Stack



Distributed Database Architecture: What Is It?

A look at the different types, their benefits and drawbacks, and how to design one.
Apr 18th, 2023 6:20am by Alexander Fridman
InfluxData sponsored this post.

Databases power all modern applications. They’re behind your Angry Birds mobile game as much as they’re behind the space shuttle. In the beginning, databases were hosted on a single physical machine. Basically, it was a computer running only one program: the database. Then we moved to running databases on virtual machines, where resources are shared among multiple operating systems and applications.

In recent years, we moved to running databases in the cloud. And we no longer use a single database instance to store the data. Modern database systems are spread across multiple computers or nodes, which work together to store, manage and access the data.

This post is about distributed database architecture. We’ll cover what a distributed database is, what types exist, their benefits and drawbacks and how to design one.

InfluxData is the creator of InfluxDB, the leading time series platform. More than 1,900 customers use InfluxDB to collect, store, and analyze all time series data at any scale. Developers can query and analyze their time-stamped data to predict, respond, and adapt in real-time.

What Is a Distributed Database?

As stated above, a distributed database is a database design that comprises several nodes working together. A node is basically a computing instance (a physical server, a virtual machine or a container) that's running the database. Depending on the architecture, each node holds either a full copy of the data or a portion of it, and the nodes communicate with each other to keep that data in sync.

Distributed databases offer many benefits over traditional single-server databases, including improved scalability, availability, performance and fault tolerance.

Why Switch from a Single Node to a Multinode Setup?

In the past, when data was measured in megabytes and database users were measured in dozens, a single database node could have done the work. A typical scenario for this kind of architecture was hosting the database on an on-premises mainframe machine. Developers connected to the database, ran queries, received the output, then disconnected. A single system administrator or a database administrator took care of the system in terms of availability, performance and upgrades.

Now take Netflix as an example of a modern database architecture. Hundreds of millions of users all over the world use the application from different devices. Millions use the system at the same time. It must be available 24/7.

In this scenario, Netflix couldn’t possibly rely on a single computer running a single database application. If it goes down, millions of users will suffer a service disruption. In addition, storing all the data in one place is neither economically beneficial nor practical.

Imagine saving all the user data in one database instance running on a single server. The database backend would have to keep growing as more subscribers join the service. Thus, a single on-premises database is simply not practical in terms of availability, scalability and fault tolerance.

Benefits of a Distributed Database Architecture

As mentioned above, distributed databases improve scalability, availability and fault tolerance compared with traditional single-server databases. They can also improve data security and reduce network traffic.

Scalability

Compared to a single database that can only scale vertically, distributed databases can scale horizontally. In other words, if you have a single database, the only way to scale it so it can handle more load is to add CPU, RAM or storage to the server it runs on. With a distributed database, you can add additional nodes.

Availability and Fault Tolerance

If you only have one database and the database goes down, the application will go down with it. But with a distributed database, losing a node won’t affect the whole application, and the service will continue to function.
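
To make the idea concrete, here is a minimal sketch of a client that keeps reading when one node is down. The `Node` class, the `NodeDown` exception and the `read_with_failover` helper are hypothetical names invented for illustration, and the sketch assumes every node holds a full copy of the data.

```python
class NodeDown(Exception):
    """Raised by a (hypothetical) node that cannot serve requests."""


class Node:
    def __init__(self, name, data, up=True):
        self.name = name
        self.data = dict(data)
        self.up = up

    def get(self, key):
        if not self.up:
            raise NodeDown(self.name)
        return self.data[key]


def read_with_failover(nodes, key):
    """Try each replica in order and return the first successful answer."""
    for node in nodes:
        try:
            return node.get(key)
        except NodeDown:
            continue  # this replica is unavailable, try the next one
    raise NodeDown("all replicas are down")


replicas = [
    Node("node-a", {"user:1": "Ada"}, up=False),  # simulated outage
    Node("node-b", {"user:1": "Ada"}),
    Node("node-c", {"user:1": "Ada"}),
]
print(read_with_failover(replicas, "user:1"))  # answered by node-b
```

The read only fails when every replica is down, which is the property the paragraph above describes. A production client would typically add timeouts and health checks as well.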

Data Security

You can split data across multiple nodes. Therefore, if a node is breached, only the data held on that node is exposed, and the rest of the application's data remains secure. The same goes for data corruption. If a node's data is corrupted due to a server or software error, the data stored on other nodes is not directly affected.

Reduced Network Traffic

Distributed databases can reduce network traffic by storing data closer to where it will be used, reducing the need to transmit data over the network.

Drawbacks of Distributed Databases

Designing and implementing a single database instance is much easier than designing and implementing a distributed database architecture. The same applies to monitoring, troubleshooting, maintaining and upgrading. A distributed database requires thorough planning, the right database vendor, the right architecture and so forth.

In addition to the increased complexity, there’s also higher cost as it often requires more hardware, software and skilled personnel. Lastly, there are consistency and coordination issues. Ensuring consistency across all nodes in a distributed database can be challenging, especially in systems with high concurrency or large amounts of data.

Types of Distributed Database Architecture

There are several types of distributed database architectures. Each has its own strengths and weaknesses, and the choice of architecture depends on the application’s specific needs.

Master-Slave Replication

In master-slave architecture, there's a single master database that handles all write operations, while one or more slave databases replicate the data from the master and serve read operations. So all write operations go to one node, and read operations are distributed across nodes. This setup is ideal for read-intensive applications.
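
The routing described above can be sketched in a few lines of Python. This is a toy model rather than any vendor's implementation: `ReplicatedStore` is a hypothetical class, the nodes are plain dictionaries, and replication happens immediately, whereas real systems often replicate asynchronously.

```python
import itertools


class ReplicatedStore:
    """Toy master-slave setup: one master takes writes, replicas serve reads."""

    def __init__(self, replica_count):
        self.master = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # Every write goes to the single master first...
        self.master[key] = value
        # ...and is then copied to each replica. Here the copy is immediate;
        # with asynchronous replication, replicas could briefly lag behind.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Reads are spread across the replicas in round-robin order.
        index = next(self._next_replica)
        return index, self.replicas[index].get(key)


store = ReplicatedStore(replica_count=2)
store.write("title", "Stranger Things")
first = store.read("title")
second = store.read("title")
print(first, second)  # two consecutive reads land on different replicas
```

Because all writes funnel through one node, adding replicas increases read capacity but not write capacity, which is why this layout suits read-heavy workloads.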

Multi-Master Replication

With multi-master replication, every node accepts both read and write operations, so each node acts as both master and slave.
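
When several nodes accept writes, two of them may receive different values for the same key. The sketch below shows one simple way to reconcile such writes, last-write-wins. This strategy is an assumption chosen for illustration and not something the article prescribes. `MasterNode` and its methods are hypothetical, and the timestamps are logical counters supplied by the caller.

```python
class MasterNode:
    """Toy node that accepts writes and can merge state from a peer."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        self.data[key] = (timestamp, value)

    def read(self, key):
        return self.data[key][1]

    def merge_from(self, other):
        # Last-write-wins: keep whichever version has the newer timestamp.
        # Equal timestamps are not resolved in this sketch.
        for key, (timestamp, value) in other.data.items():
            if key not in self.data or timestamp > self.data[key][0]:
                self.data[key] = (timestamp, value)


node_a = MasterNode("a")
node_b = MasterNode("b")

# Two clients update the same key on different nodes.
node_a.write("plan", "basic", timestamp=1)
node_b.write("plan", "premium", timestamp=2)

# The nodes exchange state in both directions and converge.
node_a.merge_from(node_b)
node_b.merge_from(node_a)
print(node_a.read("plan"), node_b.read("plan"))
```

Last-write-wins silently discards the older write, so it only fits data where losing an update is acceptable.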

Shared-Nothing Architecture

In shared-nothing architecture, data is sharded, and each node is responsible for only some of the data. Data is essentially split across nodes, and each node is responsible for both read and write operations on its own portion.
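
A minimal sketch of this kind of partitioning, assuming hash-based placement (one of several possible strategies), might look as follows. `shard_for` and `ShardedStore` are hypothetical names, and each shard is modeled as a plain dictionary.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Each shard owns a disjoint slice of the keys and serves its reads and writes."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(shard_count=3)
for user in ("alice", "bob", "carol", "dave"):
    store.put(user, {"name": user})

print(store.get("carol"))
print([len(shard) for shard in store.shards])  # keys per shard
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash()`, which is randomized per process for strings. With simple modulo placement, changing the number of shards moves most keys, which is one reason production systems often use consistent hashing instead.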

Federated Database Architecture

In a federated database architecture, there are several independent databases (and even several database types) organized as one meta-database. Basically, what you have here is a unified virtual database that you can query. The queries are distributed internally by the virtual database manager.
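
As a rough illustration, the sketch below puts a tiny "virtual database manager" in front of two independent databases. Both members are in-memory SQLite databases here purely to keep the example self-contained; in a real federation they could be different database types. The `FederatedManager` class and the table names are hypothetical.

```python
import sqlite3

# Two independent member databases, each with its own schema.
billing = sqlite3.connect(":memory:")
billing.execute("CREATE TABLE invoices (user_id INTEGER, amount REAL)")
billing.executemany("INSERT INTO invoices VALUES (?, ?)", [(1, 9.99), (2, 15.49)])

profiles = sqlite3.connect(":memory:")
profiles.execute("CREATE TABLE users (id INTEGER, name TEXT)")
profiles.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ada"), (2, "Linus")])


class FederatedManager:
    """Routes each logical table to the member database that owns it."""

    def __init__(self):
        self.sources = {}

    def register(self, table, connection):
        self.sources[table] = connection

    def query(self, table, sql, params=()):
        return self.sources[table].execute(sql, params).fetchall()

    def user_summary(self, user_id):
        # Combine results from two member databases into one answer.
        (name,) = self.query(
            "users", "SELECT name FROM users WHERE id = ?", (user_id,)
        )[0]
        (total,) = self.query(
            "invoices", "SELECT SUM(amount) FROM invoices WHERE user_id = ?", (user_id,)
        )[0]
        return {"name": name, "total_billed": total}


manager = FederatedManager()
manager.register("users", profiles)
manager.register("invoices", billing)
print(manager.user_summary(1))
```

The caller talks only to the manager and never needs to know which member database holds which table.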

Examples of Distributed Databases

Many vendors offer databases that you can deploy in a distributed architecture. The following are among the most popular:

  1. MongoDB, a popular NoSQL document database that you can distribute across multiple servers. It stores data in collections rather than tables and in documents rather than rows.
  2. Apache Cassandra, a highly scalable, distributed database system that’s designed for managing large volumes of structured and unstructured data across multiple data centers.
  3. Amazon DynamoDB, a fully managed NoSQL database service.

Choosing and Designing Your Distributed Database Architecture

When it’s time to choose which database architecture you should use for your organization or application, there are several things to consider. There are no right or wrong answers here. Each architecture has its use cases, so you should choose an architecture that best fits yours. Consider (among other factors) data partitioning, replication and consistency. In more detail, here are some of the steps that you should take:

  1. Identify the data that needs to be stored and accessed in the distributed database. This will help determine the amount of storage, schema design and so forth.
  2. Determine your data partitioning strategy. Decide on the strategy for partitioning across multiple nodes.
  3. Choose your replication strategy. You can choose between master-slave, multi-master or something else.
  4. Decide on a consistency model. Choose whether your data must be strongly consistent across nodes at all times or whether eventual consistency is acceptable.

This is of course not an exhaustive list. You’ll also need to enlist an experienced architect.
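
For step 4, one common way to reason about the trade-off in quorum-based replicated stores is the overlap rule: with N replicas, a write acknowledged by W of them and a read that consults R of them are guaranteed to share at least one replica when R + W > N. The helper below is a hypothetical sketch of that check and is not tied to any particular product.

```python
def quorum_overlaps(n, w, r):
    """Return True when every read quorum must intersect every write quorum."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n


for n, w, r in [(3, 2, 2), (3, 1, 1), (5, 3, 3), (3, 3, 1)]:
    if quorum_overlaps(n, w, r):
        label = "read and write quorums overlap"
    else:
        label = "a read may miss the latest write"
    print(f"N={n} W={w} R={r}: {label}")
```

Lower values of W and R reduce latency and keep the system available when nodes fail, at the cost of possibly stale reads. That is the practical difference between the strong and eventual ends of the consistency spectrum.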

Conclusion

Like any other technology, distributed databases have their advantages and drawbacks. However, for modern use cases, their advantages outweigh the drawbacks. There are several types of distributed database architecture, and you should only choose the one that best fits your needs after careful consideration.

Alexander Fridman is a veteran in the software industry with over 11 years of experience. He worked his way up the corporate ladder and has held the positions of senior software developer, team leader, software Architect and CTO. Alexander is...
TNS owner Insight Partners is an investor in: Pragma.