URL: https://thenewstack.io/data-2023-outlook-rethink-the-modern-data-stack-1/

Data 2023 Outlook: Rethink the Modern Data Stack - The New Stack


2023-01-04 12:00:56
Data 2023 Outlook: Rethink the Modern Data Stack
Cloud Services / Data

Enterprises will continue their cloud journeys, but they want to get smarter with their cloud consumption. That demands simplification, and with it, rethinking how the modern data stack is packaged and delivered.
Jan 4th, 2023 12:00pm by Tony Baer

As we enter the fourth year of the 2020s, it’s no secret that the global economy continues to be in upheaval. Will there be a recession or not? It’s anyone’s guess, and that uncertainty took a toll on what had been hot tech growth stocks like Snowflake, while providing reprieves for less sexy old-guard mainstays like IBM and VMware.

The bulk of the breakneck growth has been in the cloud. Since the dawn of the pandemic, the narrative has centered on COVID accelerating the existing secular trend of growing cloud adoption. Going forward, the usual suspects are still predicting cloud spending growth in 2023, but Snowflake’s lowered Q4 guidance points to a possible narrative of “whatever comes up must (at some point) come down,” or at least slow down.

In this post, we voice our predictions on the operational side of cloud data platforms and analytics. Tomorrow, we’ll direct the focus on what’s coming with the management side of data, in a follow-up post. But first, let’s understand the bigger picture of what and why this is happening.

This won’t be Dot Com Bust 2.0.

Here’s the general context: Despite, or because of, economic uncertainty, cloud adoption will continue to advance.

There’s little debate that the cloud is no longer just a financial budgeting maneuver to move costs from the capital to the operating budget; that makes overall adoption fairly resilient to swings in the overall economy. It is an enabler and accelerator for business transformation, as it removes many of the barriers to launching new apps and business services, and provides the flexibility for changing gears far more readily than systems with their own dedicated infrastructure operating inside the data center.

Economic uncertainty simply ramps up the pressure for businesses to transform. So, nope, this won’t be a repeat of the dot-com bust. Even as internet giants like Amazon, Meta, and maybe even Google might be shedding jobs, enterprises in the mainstream economy will gladly soak up all the cloud infrastructure, cybersecurity, data science, and AI expertise they can find.

While we’re not about to see retrenchment in cloud adoption, we’ll see more scrutiny of cloud spend. While cloud compute and storage might be cheap, a lot of cheap eventually gets expensive. More to the point, the scrutiny will manifest itself in more ways than simply arbitrary budget caps; we expect that it will also surface in platform choices and a desire to streamline the modern data stacks that, until recently, the cloud had been disaggregating. The cloud was supposed to make IT simple, and in the coming year, enterprises will hold the hyperscalers’ feet to the fire to live up to that promise.

Of course, there are multiple planes of attack for optimizing cloud spend. There is a well-established and varied ecosystem of solutions ranging from monitoring tools from the hyperscalers (e.g., AWS CloudWatch) to providers like BMC, CloudBolt, Datadog, Dynatrace, Flexera, Micro Focus, ServiceNow, VMware, Yotascale, Zerto and many, many others addressing cost control, security, governance, observability, and workload optimization. Many of these solutions can drill down to granular reporting of consumption by SaaS service, app, and line organization. Consider this universe of tools as latter-day cloud manifestations of traditional IT service management and IT chargeback solutions.
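
As a rough illustration of the chargeback-style reporting described above, here is a minimal Python sketch that rolls up cost by line organization and by service. The record layout, the names, and the dollar figures are all hypothetical and do not come from any real billing export or vendor API.

```python
from collections import defaultdict

# Hypothetical usage records: (line organization, app, service, cost in USD).
# These values are illustrative assumptions, not real billing data.
USAGE = [
    ("marketing", "campaigns", "warehouse", 120.0),
    ("marketing", "campaigns", "etl", 30.0),
    ("finance", "ledger", "warehouse", 200.0),
    ("finance", "ledger", "oltp", 50.0),
]


def rollup(records, key_index):
    """Sum cost by one dimension of the record (0=org, 1=app, 2=service)."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key_index]] += record[3]
    return dict(totals)


by_org = rollup(USAGE, 0)
by_service = rollup(USAGE, 2)
print(by_org)
print(by_service)
```

The same records can be sliced along any dimension, which is the essence of drilling down by service, app, or organization.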

There’s another side to this coin that goes beyond the traditional pane of glass, and it’s how services are delivered. AWS, for instance, offers hundreds of services addressing everything from analytics to application integration, contact center, containers, databases, gaming, IoT, machine learning, quantum computing, security, storage and others. Trying to optimize from the horn of plenty is challenging enough. Back to our core point: complexity is the enemy of efficiency, and complexity adds cost. As data is the lane that we live in, we’ll focus our spotlight there. And it’s in data where we have an all-too-inviting target: The Modern Data Stack.

Let’s focus our sights on how data management gets rationalized. Time to cut to the chase.

Simplify the Modern Data Stack for the Cloud

If you’re aiming to be smarter about how you use the cloud, complexity is your enemy. We ranted about this last year in our post, When Will the Cloud Get Simpler?

We expect to see a refactoring of what’s been termed the “Modern Data Stack,” as described by providers as diverse as Fivetran and MongoDB. That stack has typically encompassed a data pipeline for harvesting, transforming, and ingesting data (the modern-day successor to ETL tools), the data warehouse, and the various visualization and analytic tools for gaining insights. To all this, we would add the operational or transaction database, which, more often than not, is the primary source for this data.

What made the data stack modern? Well, it is hosted and delivered in the cloud and it takes advantage of the cloud’s elasticity. OK, that’s a start; customers no longer have to worry about provisioning or the housekeeping for patches and upgrades, and with many of these modern data stack services being serverless, there’s a lot less upfront hassle and a lot more flexibility.

But that’s just not enough. The modern data stack boasts an almost too-rich array of data and analytics SaaS services, and while each SaaS service individually makes its own process simpler for customers to launch and manage, they’re still on the hook for integrating them. And did we neglect to mention, these toolchains can get highly complex?

We’ve called on database and analytics SaaS providers to, literally, get it together. Make life simpler for the customer. Simpler is more economical and simpler is smarter. Fewer wasted cycles and expenses for the customer, more consumption of value-added services for the provider. Everybody should win. Over the past year, we saw that a few providers have started feeling your pain, and this is where we expect to see more positive responses in 2023.

Look for Bundling

The low-hanging fruit is offering a combo of services that are frequently used together. This has been a long-established pattern in the on-premises solution world. Here we’re making some predictions and suggestions on where to see more linkages this year.

Bundling presents a golden opportunity for hyperscalers to add new platinum tiers to their partner programs. It would involve extending core database and analytic services by pre-integrating, bundling, and promotionally pricing popular third-party services, stitching them together with under-the-hood orchestration. The goal is to shift the integration burden off the customer’s shoulders by working with your most popular partners, and to price those combos attractively to stimulate adoption.

Let’s throw out some examples. Add light analytics to transaction databases, and for more “serious” use cases (where you don’t want to slow down transaction processing with analytics), prepackage change feeds to data warehousing services where you can perform ELT. And as for ELT, have ready-made integrations in the target. That’s where the opportunity for competition comes in. AWS Glue, Azure Data Factory, and Google Cloud Data Fusion will have their home court advantages with Redshift, Synapse Analytics, and BigQuery, respectively. But developers are not about to abandon their own favorites like dbt or Fivetran. That’s where your partner program kicks in with bundling pre-integrated stacks. And, by the way, the same holds true for analytics and AutoML services.

In-database machine learning has already become a checkbox feature for cloud data warehousing services, although the degree to which data must be moved still varies by provider. Blending of light, operational analytics into transaction databases is also already happening. Oracle and Google introduced API-compatible implementations of MySQL and PostgreSQL, respectively, that combined the trifecta: transaction processing, analytics, and in-database AutoML. Meanwhile, SingleStore reinvented tiered storage and indexing. And even Snowflake has gotten into the act by dipping its toes in the water for lightweight transaction processing with Unistore.

Of course, the so-called (depending on whether you use Gartner’s or Forrester’s terminology) augmented analytics or translytical database is not new. Combining column and row stores is a practice that dates back over a decade with IBM BLU, Oracle Database In-Memory, and MariaDB SkySQL, among others.

But as noted above, let’s not stop there. Blend in ELT. AWS at least simplified the Aurora-to-Redshift data pipeline with a prebuilt Zero ETL change data capture feed. Google already builds change data capture support into BigQuery, while Azure Synapse Analytics pre-integrates Azure Data Factory. For almost every analytic platform, there is plenty of opportunity for blending in streaming as well, through integration with data flow pipelines and Kafka PubSub feeds. The customer should not have to configure each of these integrations themselves and pay a la carte pricing.
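
For readers less familiar with change data capture, the core idea behind such a feed can be sketched in a few lines of Python. This is only a conceptual illustration: the event shape (`op`, `key`, `row`) and the `apply_changes` helper are hypothetical, and real services such as those named above use their own formats and handle ordering, schema changes, and failures that this sketch ignores.

```python
def apply_changes(table, events):
    """Apply insert/update/delete change events, in order, to a dict keyed by primary key.

    The event format here is an assumption made for illustration only.
    """
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("insert", "update"):
            table[key] = event["row"]  # upsert the latest version of the row
        elif op == "delete":
            table.pop(key, None)  # tolerate deletes for rows never seen
        else:
            raise ValueError(f"unknown op: {op}")
    return table


# Hypothetical change feed emitted by a transaction database.
events = [
    {"op": "insert", "key": 1, "row": {"status": "new"}},
    {"op": "insert", "key": 2, "row": {"status": "new"}},
    {"op": "update", "key": 1, "row": {"status": "shipped"}},
    {"op": "delete", "key": 2},
]

replica = apply_changes({}, events)
print(replica)
```

A prebuilt feed spares the customer from writing and operating this replay logic themselves, which is exactly the integration burden in question.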

Separate platforms for transaction databases and data warehouses won’t go away, and end users won’t give up their visualization or reporting tools. But we expect hyperscalers and third-party SaaS services to get more creative with blending, bundling, and pricing. Here’s a potential example. In Google Cloud, several databases (e.g., AlloyDB, BigQuery, Dataproc) share common storage. With unified governance, provided courtesy of Google’s Dataplex data fabric, the data could be selectively surfaced by the engine of choice, paid for on a per-use basis.

Serverless Plays a Supporting Role

Another move toward smarter cloud consumption is the growth of serverless. It provides an obvious form of simplification: customers no longer need to worry about provisioning or capacity sizing. A critical mass of hyperscaler databases already offers it; for instance, with OpenSearch, AWS just addressed the last gap in its analytic portfolio that lacked serverless. We’d like to see new entries, like Oracle MySQL HeatWave and Google AlloyDB, offer serverless options as well. Of course, serverless is not the answer for everything because if your workloads are stable and/or predictable, reserved instances are obviously the better way to go. But for new workloads, serverless removes friction, not to mention the cost of overprovisioning, which should encourage more development for the dollar. Serverless can be an entry-level stage for new workloads that can shift to reserved instances as their uptake matures.

While serverless is pretty well entrenched with cloud data warehouses, analytics, and machine learning services, we expect more transaction and operational services to add the option this year.
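
The serverless-versus-reserved trade-off described above comes down to simple arithmetic, which the following Python sketch illustrates. The hourly rates are made-up assumptions chosen for round numbers, not actual prices from any provider, and the model ignores minimum charges, storage, and commitment terms.

```python
HOURS_PER_MONTH = 730  # common approximation for billing estimates

# Hypothetical rates in USD per hour; not real provider pricing.
RESERVED_RATE = 0.50    # charged for every hour, busy or idle
SERVERLESS_RATE = 2.00  # charged only for hours the workload is active


def monthly_cost_reserved(hourly_rate):
    """Reserved capacity is billed around the clock."""
    return hourly_rate * HOURS_PER_MONTH


def monthly_cost_serverless(active_hourly_rate, utilization):
    """Serverless is billed only for the active fraction of the month."""
    return active_hourly_rate * HOURS_PER_MONTH * utilization


def break_even_utilization(reserved_rate, serverless_rate):
    """Utilization above which reserved capacity becomes cheaper."""
    return reserved_rate / serverless_rate


threshold = break_even_utilization(RESERVED_RATE, SERVERLESS_RATE)
for utilization in (0.10, 0.60):
    serverless = monthly_cost_serverless(SERVERLESS_RATE, utilization)
    reserved = monthly_cost_reserved(RESERVED_RATE)
    cheaper = "serverless" if serverless < reserved else "reserved"
    print(f"utilization={utilization:.0%}: {cheaper} is cheaper")
```

Under these assumed rates, a new workload that is busy only a small fraction of the time favors serverless, while a steady one that crosses the break-even threshold is a candidate for reserved capacity.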

What about Multicloud?

All too much verbiage has been written about a core fact of cloud life: Most organizations are going to use more than one cloud. We’ve railed on and on about the administrative overhead that managing multiple clouds will bring on. But as with plate tectonics, there’s no realistic option of turning back, and besides, for competitive reasons, should any organization tie its fortunes to one cloud?

Our take is that multicloud is about freedom of cloud: the freedom to run the workload on the cloud of your choice. We have not been a heavy believer in running the same workload, or database, across multiple public clouds, given the latencies, varying security and access management structures, and infrastructure differences across hyperscalers. Of course, nature abhors a vacuum, and this is where a variety of third parties are jumping in. With regard to data services, freedom of cloud is prominent among the messaging from the likes of Databricks, Snowflake, MongoDB, and others. They promise that, regardless of the infrastructure and administrative differences for each hyperscaler, their database services will look the same operationally, whichever cloud you run in.

Still, multicloud is the next frontier for simplification. We’ll throw a shout-out to SiliconANGLE for identifying an emerging tier of the cloud ecosystem that they term Supercloud. But in the meantime, hyperscalers will have their hands full simplifying the rats’ nest of connections in their own backyards.

Tomorrow, we’ll take a look at what 2023 will mean for the Data Nerds.

Amazon Web Services, BMC, Dynatrace, MongoDB and VMware are sponsors of The New Stack.
TNS owner Insight Partners is an investor in: SingleStore, Databricks.