
URL: https://thenewstack.io/snowflake-databricks-and-the-fight-for-apache-iceberg-tables/




Snowflake, Databricks and the Fight for Apache Iceberg Tables

The market for data lakes and data lakehouses is clearly being disrupted by open source software, given recent news from Databricks and Snowflake.
Jun 10th, 2024 9:46am by Joab Jackson
For these past two weeks, San Francisco has hosted back-to-back data lake conferences from both Snowflake and Databricks. Feature images by TNS.

SAN FRANCISCO — Last week, Snowflake announced it had adopted Apache Iceberg tables as a native format. Now customers can put their Snowflake data lakes into Iceberg, and even create external tables on a cloud provider of their choice, and have Snowflake manage them.

In addition, Snowflake released Polaris, a catalog for Iceberg tables that can be called by any data processing engine that can read the format (Spark, Dremio, Snowflake).

With the catalog, you can use the engine of your choice to run joins across tables, gathering information that was previously much harder to obtain. Permissions governing who can see what are managed by the catalog itself. And shortly, you will be able to pull in metadata from other catalogs.
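To make the catalog's role concrete, here is a minimal sketch in Python. It is not Polaris's API, only a toy in-memory model of the idea described above: one stored copy of a table, several engines asking the catalog for it, and the catalog deciding who may read. Every name in it (`ToyCatalog`, `register`, `grant`, `load_table`, the `s3://example-bucket/...` path, and the principals) is hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical in-memory stand-in for a shared table catalog."""

    # table name -> location of the single stored copy (made-up paths)
    locations: dict = field(default_factory=dict)
    # table name -> set of principals allowed to read the table
    readers: dict = field(default_factory=dict)

    def register(self, table: str, location: str) -> None:
        """Record where the one copy of a table lives."""
        self.locations[table] = location
        self.readers.setdefault(table, set())

    def grant(self, table: str, principal: str) -> None:
        """Allow a principal (an engine or a user) to read a table."""
        if table not in self.locations:
            raise KeyError(f"unknown table: {table}")
        self.readers[table].add(principal)

    def load_table(self, table: str, principal: str) -> str:
        """Return the table location, but only to permitted principals."""
        if table not in self.locations:
            raise KeyError(f"unknown table: {table}")
        if principal not in self.readers[table]:
            raise PermissionError(f"{principal} may not read {table}")
        return self.locations[table]


catalog = ToyCatalog()
catalog.register("sales.orders", "s3://example-bucket/sales/orders")
catalog.grant("sales.orders", "spark_job")
catalog.grant("sales.orders", "snowflake_user")

# Two different engines resolve the same table to the same stored copy.
spark_view = catalog.load_table("sales.orders", "spark_job")
snowflake_view = catalog.load_table("sales.orders", "snowflake_user")
```

In this sketch, both engines receive the same location, so no second copy of the data is needed, and a principal without a grant gets a `PermissionError` from the catalog rather than from any individual engine.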

The company discussed these interoperability initiatives during its own user conference, the Snowflake Data Cloud Summit, held last week in San Francisco.

But the company was not alone in its eager adoption of Iceberg.

Also last week, chief Snowflake rival Databricks announced it had purchased Tabular, an Iceberg distribution provider founded by the three people who created the technology: Ryan Blue, Daniel Weeks, and Jason Reid.

How did Apache Iceberg become the belle of the ball? Clearly, data lakes and data lakehouses are about to undergo a fundamental shift to open source.

Apache Iceberg Came from Netflix


“I think in this space, we have a classic customer who wants control of their solution.” –Snowflake’s Ron Ortloff

Iceberg grew out of the frustrations Netflix engineers faced in scaling their data operations, as existing file formats proved unreliable in distributed scenarios.

Netflix open sourced the project in 2018 and donated it to the Apache Software Foundation. Since then, Airbnb, Amazon Web Services, Alibaba, Expedia, and others have contributed.

The advantage that Iceberg brings is that it allows data to be stored once — eliminating a whole mess of compliance and security issues around having data copies in multiple places — and queried by any one of a number of Iceberg-compliant engines.

A large number of Iceberg distributions are available these days, from CelerData, ClickHouse, Cloudera, Dremio, Starburst, and of course Tabular. Earlier this month, Microsoft announced that it would support Snowflake’s Iceberg tables on its own Microsoft Fabric, an analytics service on Azure.

Customers are very, very sensitive about lock-in these days, said Ron Ortloff, Snowflake’s senior product manager. “I think in this space, we have a classic customer who wants control of their solution,” he said in an interview with The New Stack. “So we want to give those customers a choice.”

Snowflake has traditionally been a company that manages a client’s data from the cloud, relieving the customer of the considerable burden of managing it themselves. So why risk the customer base with an offer to allow customers to manage their own data?

“We think there’s 100 to 200 times more data outside of Snowflake in data lakes that we can tap into with Iceberg,” Ortloff said. Instead of relying on lock-in, the company sees itself competing on a “great platform experience,” especially as the stakes rise with more enterprises adopting AI in a big way.

“If we build great platform experiences, that data gravity is going to flow right there through,” he said.


A diagram showing how Polaris integrates with the rest of the Snowflake infrastructure. From Ron Ortloff’s presentation.

Databricks Solidifies Its Iceberg Expertise

Databricks’ acquisition of Tabular was spurred by customer demand for better interoperability among data lake formats.

“This is a long journey, one that will likely take several years to achieve in those communities,” the company admits in a blog post. To this end, Databricks has released Delta Lake UniForm, a set of tables that work across Iceberg, Databricks’ own Delta Lake format, and the Apache Hudi transactional data lake format.

Others have weighed in on the significance of the Databricks purchase in light of Snowflake’s activity.

“After storage and compute became decoupled, all of the layers from storage through analytics began to be similarly unbundled, a process currently taking place with tables,” wrote New Relic CTO Siva Padisetty, in a statement. “Databricks seeks to match open-source parity with Iceberg and Tabular is how they expect to achieve it.”

The competition will shift to which company can, in the open source format, process data most quickly and cost-effectively, with all the governance and security safeguards in place, Padisetty summarized.

Last week, we covered the news from the Snowflake event, and this week, TNS will continue its coverage of the Iceberg wars with TNS data correspondent Andrew Brust covering Databricks’ Data + AI Summit, taking place this week in San Francisco. There we will hear more about Databricks’ own plans for the future of Iceberg.

Joab Jackson is a senior editor for The New Stack, covering cloud native computing and system operations. He has reported on IT infrastructure and development for over 30 years, including stints at IDG and Government Computer News. Before that, he...
Microsoft and Snowflake are sponsors of The New Stack. 
TNS owner Insight Partners is an investor in: Dremio, Databricks.