A distributed transaction spans multiple systems, ensuring all operations either succeed or fail together, crucial for maintaining data integrity and consistency across diverse and geographically separated resources in modern computing environments.
What is the need for a Distributed Transaction?
Distributed transactions are needed to keep data consistent and reliable across multiple independent systems or resources in a distributed computing environment. Specifically:
Atomicity: Ensuring that either all operations within a transaction are completed successfully or none of them are, avoiding partial updates that could lead to inconsistencies.
Consistency: Ensuring that every transaction moves the data from one valid state to another, so that integrity rules (for example, account balances that must add up) hold across all participating systems.
Isolation: Guaranteeing that concurrent transactions do not interfere with each other, preserving data integrity and preventing conflicts.
Durability: Confirming that committed transactions persist even in the event of system failures, ensuring reliability.
Real-Life Examples of Distributed Transactions
E-commerce Checkout: Payment, inventory update, and order creation must all succeed or roll back together to ensure data consistency.
Banking Fund Transfer: Transferring money between accounts in different banks involves debiting one account and crediting another atomically across multiple databases.
Flight Booking System: Booking a flight may involve reserving a seat, processing payment, and updating frequent flyer miles across separate systems; all must succeed or fail together.
Working of Distributed Transactions
Distributed transactions work like ordinary single-database transactions, but they must be carried out across multiple databases. Using multiple nodes or database systems introduces problems such as network failures and the need to keep additional hardware and database servers available. For a distributed transaction to succeed, transaction managers coordinate the participating resources.
Below are some steps to understand how distributed transactions work:
Step 1: Application to Resource - Issues Distributed Transaction
The application initiates the distributed transaction by sending a request to the available resources. The request describes the operations that each resource must perform as part of the transaction.
Step 2: Resource 1 to Resource 2 - Ask Resource 2 to Prepare to Commit
Once Resource 1 receives the transaction request, it contacts Resource 2 and asks it to prepare to commit. This step makes sure that both resources are able to perform their assigned tasks and complete the transaction.
Step 3: Resource 2 to Resource 1 - Acknowledges Preparation to Commit
When Resource 2 receives the request from Resource 1, it prepares for the commit. Resource 2 then responds to Resource 1 with an acknowledgment confirming that it is ready to go ahead with its part of the transaction.
Step 4: Resource 1 to Resource 2 - Ask Resource 2 to Commit
Once Resource 1 receives an acknowledgment from Resource 2, it sends a request to Resource 2 and provides an instruction to commit the transaction. This step makes sure that Resource 1 has completed its task in the given transaction and now it is ready for Resource 2 to finalize the operation.
Step 5: Resource 2 to Resource 1 - Acknowledges Commit
When Resource 2 receives the commit request from Resource 1, it commits its part of the transaction and responds to Resource 1 with an acknowledgment that the commit succeeded. This step confirms that Resource 2 has completed its task and that both resources have synchronized their states.
Step 6: Resource 1 to Application - Receives Transaction Acknowledgement
Once Resource 1 receives an acknowledgment from Resource 2, Resource 1 then sends an acknowledgment of the transaction back to the application. This acknowledgment confirms that the transaction that was carried out among multiple resources has been completed successfully.
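The six steps above can be traced with a minimal in-memory sketch. It assumes two resources in a single process, with network messages replaced by direct method calls; the names `Resource`, `prepare`, `commit`, and `run_transaction` are hypothetical and chosen only for illustration.

```python
class Resource:
    """Hypothetical in-memory stand-in for a database or service."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "idle"

    def prepare(self):
        # Get ready to commit and report whether that is possible.
        self.state = "prepared" if self.can_prepare else "aborted"
        return self.can_prepare

    def commit(self):
        # Finalize the work that was prepared earlier.
        self.state = "committed"


def run_transaction(resource1, resource2):
    """Walk through steps 1-6 and return (success, trace of messages)."""
    trace = ["1: application -> resource 1: issue transaction"]
    trace.append("2: resource 1 -> resource 2: prepare to commit")
    if not resource2.prepare():
        trace.append("3: resource 2 -> resource 1: cannot prepare")
        return False, trace
    trace.append("3: resource 2 -> resource 1: prepared")
    # Resource 1 completes its own part before asking Resource 2 to commit.
    resource1.prepare()
    resource1.commit()
    trace.append("4: resource 1 -> resource 2: commit")
    resource2.commit()
    trace.append("5: resource 2 -> resource 1: committed")
    trace.append("6: resource 1 -> application: transaction acknowledged")
    return True, trace
```

Calling `run_transaction(Resource("orders"), Resource("payments"))` returns a success flag together with the six messages in order. A real system would also need timeouts and failure handling between every pair of messages, which this sketch leaves out.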
Types of Distributed Transactions
Distributed transactions involve coordinating actions across multiple nodes or resources to ensure atomicity, consistency, isolation, and durability (ACID properties). Here are some common types and protocols:
1. Two-Phase Commit (2PC)
Two-Phase Commit is a classic protocol used to achieve atomicity in distributed transactions.
It involves two phases: a prepare phase in which each participant votes to commit or abort the transaction, and a commit phase in which the coordinator's decision is executed across all participants.
2PC ensures that either all involved resources commit the transaction or none do, thereby maintaining atomicity.
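The two phases can be sketched in a few lines. This is a simplified illustration, not a production coordinator: it assumes in-process participants and ignores timeouts, crashes, and durable logging. The `Participant` class and the `two_phase_commit` function are hypothetical names.

```python
class Participant:
    """Hypothetical participant that votes in phase 1 and obeys in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    # Phase 2 (commit or abort): apply the single decision everywhere.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.rollback()
    return decision
```

With two willing participants the function returns `"commit"`. If any participant is created with `can_commit=False`, the decision is `"abort"` and every participant ends in the `"aborted"` state, which is the all-or-nothing behaviour described above.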
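2. Three-Phase Commit (3PC)
Three-Phase Commit extends 2PC with an additional pre-commit phase between the vote and the final commit, so that participants can reach a decision by timeout if the coordinator fails.
Advantages: Reduces indefinite blocking under specific network conditions.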
Limitations: Assumes bounded network delays and no partitions; more complex and rarely used in practice compared to 2PC.
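The way three-phase commit (3PC) reduces blocking can be illustrated with the textbook timeout rule: a participant that has reached the pre-commit state knows that every vote was yes and may commit on its own, while one that has not may abort. The sketch below is a simplification under the same assumptions named above (bounded delays, no partitions); all class and method names are hypothetical.

```python
class Participant:
    """Hypothetical 3PC participant with a coordinator-timeout rule."""

    def __init__(self, vote=True):
        self.vote = vote
        self.state = "init"

    def can_commit(self):
        # Phase 1: vote on whether the local work can be committed.
        self.state = "ready" if self.vote else "aborted"
        return self.vote

    def pre_commit(self):
        # Phase 2: learn that every participant voted yes.
        self.state = "precommitted"

    def do_commit(self):
        # Phase 3: finalize.
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

    def on_coordinator_timeout(self):
        # Pre-committed participants commit; earlier states abort.
        if self.state == "precommitted":
            self.state = "committed"
        elif self.state in ("init", "ready"):
            self.state = "aborted"
        return self.state


def three_phase_commit(participants):
    if not all([p.can_commit() for p in participants]):
        for p in participants:
            p.abort()
        return "abort"
    for p in participants:
        p.pre_commit()
    for p in participants:
        p.do_commit()
    return "commit"
```

The timeout rule is only safe when no participant can be cut off with a different view of the protocol state, which is why the partition assumption matters.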
3. XA Transactions
XA (eXtended Architecture) Transactions are a standard defined by The Open Group for coordinating transactions across heterogeneous resources (e.g., databases, message queues).
XA specifies interfaces between a global transaction manager (TM) and resource managers (RMs).
The TM coordinates the transaction's lifecycle, ensuring that all participating RMs either commit or roll back the transaction atomically.
Best for: Cross-resource atomicity within enterprise systems using heterogeneous databases and message brokers.
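The division of labour between TM and RMs can be sketched as follows. The method names mirror the XA verbs `xa_start`, `xa_end`, `xa_prepare`, `xa_commit`, and `xa_rollback`, but the `ResourceManager` class and `run_global_transaction` function are hypothetical and are not a binding to any real XA library; the transaction identifier is a plain string here rather than a real XID structure.

```python
class ResourceManager:
    """Hypothetical RM that records the XA-style calls it receives."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def xa_start(self, xid):
        self.calls.append(("start", xid))

    def xa_end(self, xid):
        self.calls.append(("end", xid))

    def xa_prepare(self, xid):
        self.calls.append(("prepare", xid))
        return True  # this sketch always votes to commit

    def xa_commit(self, xid):
        self.calls.append(("commit", xid))

    def xa_rollback(self, xid):
        self.calls.append(("rollback", xid))


def run_global_transaction(xid, rms, work):
    # The TM brackets each RM's work with start/end calls.
    for rm in rms:
        rm.xa_start(xid)
        work(rm)
        rm.xa_end(xid)
    # Then it prepares every RM and applies one outcome to all of them.
    if all([rm.xa_prepare(xid) for rm in rms]):
        for rm in rms:
            rm.xa_commit(xid)
        return "committed"
    for rm in rms:
        rm.xa_rollback(xid)
    return "rolled back"
```

For example, `run_global_transaction("tx-1", [ResourceManager("db"), ResourceManager("queue")], lambda rm: None)` drives a database and a message queue through the same start, end, prepare, and commit sequence.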
Implementing Distributed Transactions
Distributed transactions are implemented using the following components:
Transaction Managers (TM): Coordinate transactions across multiple resources, ensuring ACID properties are maintained.
Resource Managers (RM): Manage individual resources and follow the TM’s instructions to commit or roll back transactions.
Coordination Protocols: Use atomic commit protocols such as 2PC or 3PC, in some designs combined with consensus protocols such as Paxos or Raft, to ensure all participants reach a consistent commit or rollback decision.
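One way a TM can keep its decision consistent across a crash is to record the decision before telling any RM about it, then replay the record on restart. The sketch below assumes a plain Python list as a stand-in for a durable log; `LoggingTransactionManager`, `SimpleRM`, and the `crash_before_phase2` flag are hypothetical names used only to simulate the failure.

```python
class SimpleRM:
    """Hypothetical RM that remembers the outcome of each transaction."""

    def __init__(self):
        self.outcome = {}

    def prepare(self, txid):
        self.outcome[txid] = "prepared"
        return True

    def finish(self, txid, decision):
        # Applying the same decision twice is harmless (idempotent).
        self.outcome[txid] = decision


class LoggingTransactionManager:
    def __init__(self, log):
        self.log = log  # stand-in for a durable log; a real TM writes to disk

    def run(self, txid, rms, crash_before_phase2=False):
        votes = [rm.prepare(txid) for rm in rms]
        decision = "commit" if all(votes) else "rollback"
        # Record the decision before any RM is told about it.
        self.log.append((txid, decision))
        if crash_before_phase2:
            return None  # simulated crash: RMs are left in the prepared state
        self._finish(txid, decision, rms)
        return decision

    def recover(self, rms):
        # After a restart, replay every logged decision.
        for txid, decision in self.log:
            self._finish(txid, decision, rms)

    def _finish(self, txid, decision, rms):
        for rm in rms:
            rm.finish(txid, decision)
```

If `run` is called with `crash_before_phase2=True`, the RMs stay in the `"prepared"` state until a new `LoggingTransactionManager` built on the same log calls `recover`, at which point they all receive the logged decision.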
Advantages of Distributed Transactions
Below are the advantages of distributed transactions:
Data Consistency: Distributed transactions provide data consistency across multiple resources by coordinating operations across several databases. This ensures that the system remains in a consistent state even when a failure occurs.
Fault Tolerance: Distributed transactions let a system handle faults without corrupting data. If a participating resource fails during execution, the transaction can be rolled back on the other resources and retried, so no partial update is left behind.
Transactional Guarantees: Distributed transactions provide guarantees such as durability and isolation. Durability ensures that once a transaction is committed, its changes persist even if failures occur later; isolation keeps concurrent transactions from interfering with each other.
Trade-offs and Challenges
Performance Impact: Extra coordination round trips and durable logging add latency, especially across geographic regions.
Blocking Risk: In 2PC, if the coordinator fails, participants may block indefinitely until recovery.
Reduced Availability: Strong consistency via distributed transactions can reduce system availability during network partitions, per the CAP theorem.
Lock Contention: Holding locks during the prepare-to-commit window can reduce concurrency and throughput for hot records.
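The blocking and lock-contention points can be made concrete with a small sketch. It assumes a participant that takes a per-record lock when it prepares and releases it only when the coordinator's decision arrives; `LockingParticipant` and its methods are hypothetical names.

```python
class LockingParticipant:
    """Hypothetical participant that locks a record from prepare until finish."""

    def __init__(self):
        self.locks = {}  # record key -> id of the transaction holding the lock

    def prepare(self, txid, key):
        holder = self.locks.get(key)
        if holder is not None and holder != txid:
            # Another transaction is still waiting for its decision.
            return False
        self.locks[key] = txid
        return True

    def finish(self, txid):
        # Commit and rollback both release every lock held by txid.
        self.locks = {k: t for k, t in self.locks.items() if t != txid}
```

If transaction `t1` prepares on a record and the coordinator then fails, `finish("t1")` is never called, so every later `prepare` on that record is refused until recovery. Even without failures, the lock is held for the whole prepare-to-commit window, which limits throughput on frequently updated records.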
Alternative Approaches: Saga Pattern
For long-running, user-facing workflows across microservices, the Saga pattern offers an alternative to 2PC. Sagas break a transaction into a sequence of local transactions, each with a compensating action to undo it on failure.
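Example steps: create the order, reserve inventory, and take payment, with compensating actions (refund the payment, cancel the reservation or order) that run if a later step fails.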
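A saga can be sketched as a list of (action, compensation) pairs that run in order, with completed steps undone in reverse if a later step fails. This is a minimal in-process illustration; the `run_saga` function, the step descriptions, and the simulated payment failure are all hypothetical, and a real saga would persist its progress and retry failed compensations.

```python
def run_saga(steps):
    """Run (action, compensation) pairs; on failure, undo completed steps in reverse."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Undo the steps that already succeeded, most recent first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log = []


def fail_payment():
    # Simulated failure of the payment step.
    raise RuntimeError("card declined")


steps = [
    (lambda: log.append("create order"), lambda: log.append("cancel order")),
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (fail_payment, lambda: log.append("refund payment")),
]
ok = run_saga(steps)
```

Because the payment step fails before it completes, no refund is issued; the reservation is released and the order is cancelled, in that order. Unlike 2PC, intermediate states (an order that exists before payment succeeds) are visible to other transactions while the saga runs.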
Applications of Distributed Transactions
Below are the applications of distributed transactions:
Enterprise Resource Planning (ERP) Systems: ERP systems integrate many departments within one organization. Distributed transactions keep updates consistent across modules such as sales, inventory, finance, and human resources management.
Cloud Computing: Cloud-based applications use distributed transactions to update multiple data sources while ensuring that data updates and operations are performed consistently.
Healthcare Systems: Healthcare systems use distributed transactions when coordinating patient records, scheduling appointments, and managing billing, so that these records stay consistent across systems.
Financial Services: Banking transfers, payment processing, and stock trades require atomic operations across data centres to prevent double-spending and lost updates.
Distributed transactions ensure atomic operations across multiple databases and services, making them critical for enterprise applications requiring strong data consistency. Understanding the trade-offs between protocols like 2PC and patterns like Sagas enables architects to design systems that balance consistency, performance, and availability.