![]() |
VOOZH | about |
We’re so glad you’re here. You can expect all the best TNS content to arrive Monday through Friday to keep you on top of the news and at the top of your game.
Check your inbox for a confirmation email where you can adjust your preferences and even join additional groups.
Follow TNS on your favorite social media networks.
Become a TNS follower on LinkedIn.
Check out the latest featured and trending stories while you wait for your first TNS newsletter.
Distributed SQL is generating a lot of buzz lately, and for good reason. Relational databases are trusted by every company in the world to run their mission-critical applications. But as the first innovators began shifting to online business and customer engagement, relational databases couldn’t keep up. NoSQL databases captured the spotlight with their ability to support modern web applications with massive workloads, but they did so by sacrificing consistency and data integrity. Distributed SQL is the best of both worlds.
A distributed SQL database is made up of multiple database instances working together (i.e., nodes in a cluster), with each one storing and querying a subset of the data rather than all of it. Simply add more instances to increase storage capacity and/or query performance. And of course, in order to guarantee strong consistency, all copies of data (i.e., replicas) are synchronously written at the same time.
Let’s go through a few examples where distributed SQL is a good fit:
Until recently, there haven’t been a lot of options when it comes to scaling relational databases. The first was to scale up, to replace an existing server with an even bigger one. And when the biggest commodity server was no longer enough, some companies turned to hardware appliances such as Oracle Exadata. The second was to scale out with sharding. This became an alternative for tech giants with deep engineering expertise and anyone else unwilling or unable to spend millions of dollars on a hardware appliance. However, while more cost-effective than scaling up, sharding tends to be brittle, difficult to maintain and limited in scope — often specialized to support specific applications and queries. With distributed SQL you get uncompromising scalability for any application without having to spend a fortune on a hardware appliance or accept the limitations and technical burden of sharding.
Database workloads are often quite variable. Distributed SQL is engineered to ramp up capacity in response to seasonal and business requirements. It lets you run with a few resources when that’s all that is needed, and quickly add more to handle spikes in your business. This elasticity is critical for a myriad of applications.
The ability to adapt to changing conditions on the fly is a key differentiator from other SQL iterations. Distributed SQL allows you to add nodes on-demand and scale-out. It also eliminates the issue of having to change your shard key or modifying how the data is split. Trust me — no one wants to spend their time sharding.
Some distributed SQL databases can also spread data across various database instances, constantly checking to ensure that the most optimal, efficient distribution of data is in use. If there is a change in customer behavior or there are new applications, it’s looking to see if moving the data around a little bit here and a little bit there will enhance performance.
This flexibility also mitigates downtime and cost concerns. For example, it could take 100 instances to meet your peak traffic conditions, which you don’t want to pay for 24/7 if you only need them all running 1% of the time. Customers can, depending on the solution they’re using, scale-out and back on an hourly basis in order to cut their spend as much as possible. In the cloud it’s literally as easy as pushing a button when you need more capacity or to decrease it to save costs when traffic evens out.
Additionally, uniform data distribution is maintained as nodes are added and removed, or if data is inserted unevenly. This automatic data distribution approach removes the need to manage shards, and enables data availability to be maintained in the event of node loss.
Distributed SQL is engineered and designed to assume that the more instances you have in a distributed database, the more likely some are going to fail. That’s just reality. So, if there is an infrastructure failure, the fact that the data is distributed and replicated to a handful of other nodes ensures that it remains available. The database assumes instances will come and go without notice, and it handles topology changes automatically. This is a major advantage for businesses as it eliminates the costs, lost revenue, embarrassment and bad reputation that can result from downtime.
Teams responsible for databases need their data to be durable, safe, consistent and the databases themselves to always be available and scalable. Yes, that’s asking a lot, but it’s all of critical importance to any business’ success.
In the not-too-distant past, few databases were able to guarantee that level of functionality. There were a lot of constraints, limitations and trade-offs — such as giving up data integrity and transactions to get scalability. One reason for these constraints is the amount of technical challenges involved in performing transactions across multiple database instances safely (e.g., being ACID-compliant). With the rise of distributed SQL, there’s no need to compromise anymore.
Distributed SQL continues to evolve, responding to marketplace demands for more reliable, transparent, accurate data, delivered at ever faster speeds, and at a truly global level. Given that it’s a newer technology and has been engineered for the cloud, it’s cloud native by definition. And if there’s anything the tech and business communities can agree on, it’s that the move to the cloud is basically a runaway train, as a recent survey of IT decision-makers demonstrated.
As we head into 2021 and beyond, I see the future for distributed SQL spanning multiple data centers and making that transparent to applications — letting them write anywhere and read anywhere. And not just within a data center, but at a truly global level. That will be the next maturity phase. The future looks bright for distributed SQL and you’ll be hearing a lot more about it.
Feature image via Pixabay.