URL: https://www.digitalapplied.com/blog/databricks-customerlake-agentic-cdp-2026-marketing-guide

Databricks CustomerLake: Agentic CDP for Marketers


CRM & Automation · New Release · 11 min read · Published June 17, 2026

Warehouse-native agentic CDP · governed by Unity Catalog · Private Preview

Databricks CustomerLake: an agentic CDP for marketers

Databricks entered the marketing software market on June 16, 2026 with CustomerLake — a customer data platform built natively on the lakehouse and governed by Unity Catalog, with no separate CDP data store. It ships in Private Preview, with named early adopters and a 21-partner launch ecosystem. Here is what changes for marketers and the wider CDP market.

Digital Applied Team
Senior strategists · Published Jun 17, 2026
Sources: Databricks + analyst coverage
Launched: Jun 16, Data + AI Summit 2026 (Private Preview)
Launch partners: 21 (identity, activation, services)
Named early adopters: 4 (HP, Circle K, AB InBev, Getnet)
Gartner forecast: 80% composable CDPs by 2030

Databricks CustomerLake is a warehouse-native agentic customer data platform that Databricks announced on June 16, 2026 at its Data + AI Summit in San Francisco — the company’s formal entry into the marketing software market. Rather than store another copy of your customer data, CustomerLake runs the CDP directly on the Databricks lakehouse, governed by Unity Catalog alongside the AI models and agents that act on that data.

That architectural choice is the whole story. For a decade the CDP category sold a separate system that ingested, unified, and re-stored customer data so marketers could act on it. CustomerLake argues the warehouse already holds the data, the governance, and now the agents — so the CDP should be a front door on the lakehouse, not a second database next to it. The platform is in Private Preview, not generally available, with a handful of named early adopters and no disclosed pricing rates.

This guide covers what actually launched, why warehouse-native architecture matters, the agentic loop Databricks calls Infinity Campaigns, the market context that makes the move land, and the honest open questions a marketing leader should weigh before treating this as an infrastructure decision. Everything below is sourced to Databricks’ own launch materials and independent martech coverage, with vendor claims labelled as such.

Key takeaways
  1. Databricks entered martech with a warehouse-native CDP. Announced June 16, 2026 at Data + AI Summit, CustomerLake runs the CDP natively on the Databricks lakehouse, with no separate data store, governed by Unity Catalog. It is the company's second software application after Lakewatch (March 2026).
  2. It is Private Preview, not generally available. Named early adopters include HP, Circle K, AB InBev (Zé Delivery), and Getnet by Santander. These are vendor-curated early adopters, not GA case studies, and Databricks has disclosed no pricing rates beyond a consumption-based model.
  3. Agents replace the campaign waterfall. Profile Agents unify Customer 360 data via Agentic Identity Resolution; Campaign Agents build audiences, recommend next-best actions, and activate across channels in what Databricks calls Infinity Campaigns: continuous loops rather than plan-build-ship-measure batches.
  4. The composable CDP movement built the on-ramp. Vendors like Hightouch and GrowthLoop spent years arguing the warehouse should be the CDP's source of truth. They made that case so well that the warehouse vendor built its own front end, an irony at the centre of this launch.
  5. Gartner sees the architecture, not the product, winning. Gartner predicts that by 2030, 80% of net-new enterprise CDP deployments will be embedded in or composable with data platforms, and advises CMOs to treat CustomerLake as an infrastructure decision. The figure sits behind a paywall but is widely cited.

01 — What Launched: A CDP that lives inside the lakehouse.

At Data + AI Summit 2026 — a conference Databricks says drew more than 30,000 in-person attendees to the Moscone Center, with tens of thousands more joining virtually from 150-plus countries — the company unveiled CustomerLake, its first move into the marketing software category. It follows Lakewatch, the security lakehouse Databricks shipped in March 2026, and marks the second time the data-platform vendor has packaged a vertical application on top of its core lakehouse.

The defining claim is that CustomerLake is built natively on the Databricks lakehouse and governed by Unity Catalog. There is no separate CDP data store: customer data, AI models, and the agents that act on them co-reside in one governed platform. Two agent families do the work — Profile Agents for data unification and Campaign Agents for activation — and a Genie natural-language interface lets marketers query governed customer data without writing SQL or filing a BI request.

Unify: Profile Agents (Customer 360 · Agentic Identity Resolution)

Build a unified Customer 360 via Agentic Identity Resolution (AIR), combining deterministic matching, probabilistic matching, and LLM-assisted edge-case resolution, with a continuous human-review feedback loop. (Vendor-stated; no independent identity benchmark exists.)

Source: databricks.com/product/customerlake-cdp

Activate: Campaign Agents (Audiences · next-best action · cross-channel)

Build audiences, recommend next-best actions, activate across channels, and continuously optimize, replacing the traditional plan → build → ship → measure sequence with a continuous loop Databricks calls an Infinity Campaign.

Source: databricks.com/blog/introducing-customerlake-agentic-cdp

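
The tiered matching that Profile Agents are described as performing can be illustrated with a minimal sketch. This is not Databricks' implementation or API: the `Record` type, the `resolve` function, the thresholds, and the use of fuzzy name similarity as the probabilistic tier are all assumptions chosen for illustration. It shows only the general pattern of deterministic match first, probabilistic score second, and ambiguous pairs routed to a review step.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Record:
    """Hypothetical customer record; field names are illustrative."""
    record_id: str
    email: str
    name: str


def resolve(a: Record, b: Record,
            auto_merge: float = 0.9, review_floor: float = 0.6) -> str:
    """Return 'match', 'review', or 'no_match' for a pair of records.

    Both thresholds are hypothetical and would need tuning on real data.
    """
    # Tier 1, deterministic: an exact match on normalized email is decisive.
    if a.email and a.email.strip().lower() == b.email.strip().lower():
        return "match"
    # Tier 2, probabilistic: fuzzy name similarity stands in for a real model.
    score = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    if score >= auto_merge:
        return "match"
    # Tier 3: ambiguous pairs go to an LLM-assisted pass or a human reviewer.
    if score >= review_floor:
        return "review"
    return "no_match"


if __name__ == "__main__":
    left = Record("1", "ana@example.com", "Ana Souza")
    right = Record("2", "a.souza@work.example", "A. Souza")
    print(resolve(left, right))
```

In this sketch the review band is where the human feedback loop would sit: decisions made on queued pairs could be fed back to adjust the thresholds over time.
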
Launch snapshot
CustomerLake was announced June 16, 2026 in Private Preview — not generally available. Databricks named four early-adopter customers (HP, Circle K, AB InBev’s Zé Delivery, and Getnet by Santander) and a launch ecosystem of 21 partners across identity, activation, measurement, and services — including Adobe, Meta, Braze, Bloomreach, Iterable, Twilio, The Trade Desk, LiveRamp, and IAS. Pricing is described only as consumption-based; no rates were disclosed.

CustomerLake also ships with Lakehouse Federation, which Databricks says enables cross-platform queries across Databricks, Snowflake, BigQuery, and operational databases without duplicating data, a nod to the reality that few enterprises run a single warehouse. The scale claims around what those agents can do are vendor-stated and worth reading with care: Databricks says the platform is designed to deliver 1:1 personalized experiences at very large scale, a claim that has no independent verification and should be treated as aspirational rather than benchmarked.

02 — Architecture: Why warehouse-native changes the math.

A traditional CDP is a second system of record. It ingests events and attributes from your sources, stitches identities, stores a unified profile, and then ships segments out to channels. Every step in that chain is a copy, a sync lag, and a governance boundary the data crosses. The composable CDP movement attacked the first problem — the extra copy — by activating directly from the warehouse. Databricks attacks the rest of the chain by putting the activation layer, the models, and the governance in the same place the data already lives.

The governance point is the one marketers underrate. Because Unity Catalog governs the data, the AI models, and the agents as one surface, lineage and access control do not stop at the CDP boundary — they extend to the agent that built the audience and the model that scored it. For regulated industries, that single governed plane is a materially different posture than a CDP that holds its own copy of customer data under its own access controls.

Data store · Separate CDP databases: 0 (native to the lakehouse)

CustomerLake holds no separate copy of customer data. The CDP reads and writes against the lakehouse directly, so there is no second system of record to keep in sync. (Vendor-stated architecture.)

Governance · Unity Catalog plane: 1 (data + models + agents)

Data, AI models, and agents are governed together under Unity Catalog. Lineage and access control extend across the unification, scoring, and activation steps rather than stopping at a CDP boundary.

Federation · Query surfaces: 4 (no data duplication)

Lakehouse Federation reaches across Databricks, Snowflake, BigQuery, and operational databases without duplicating data, acknowledging that most enterprises run more than one warehouse.

03 — Infinity Campaigns: From the campaign waterfall to a continuous loop.

The agentic framing is more than branding. Databricks positions Infinity Campaigns as a replacement for the batch campaign waterfall — the plan, build, ship, and measure sequence that has defined lifecycle marketing for years. Instead of a marketer queuing a segment, building creative, shipping a send, and reading the report a week later, Campaign Agents are meant to analyze, decide, and activate against every customer in a continuous loop, with the human setting goals and guardrails rather than operating the machinery.

"Marketing stops being a series of campaigns and becomes a continuous loop: agents that constantly analyze, decide, and act on every customer in real time."
Ali Ghodsi, Co-founder & CEO, Databricks

The honest read is that this is a vision statement attached to a Private Preview, not a measured outcome. The continuous-loop model is genuinely different from batch marketing, but its value depends entirely on whether the agents make good decisions on real customer data at scale — and no independent evaluation of that exists yet. The two-sided framing Databricks puts around it is the more durable idea: marketers will increasingly both use agents internally and need to market to their customers’ AI agents, the ones researching products on a buyer’s behalf. The legacy CDP category was built for a world where a human always sat at the other end of the message. That assumption is the one quietly breaking.

Before: Batch waterfall (plan → build → ship → measure; legacy lifecycle marketing)

A marketer queues a segment, builds creative, schedules a send, and reads the report afterward. Each cycle is discrete, sequential, and slow to react to what the data is telling you mid-flight.

After: Infinity Campaign (analyze ⇄ decide ⇄ act, continuously; agentic engagement loop)

Campaign Agents build audiences, recommend next-best actions, and activate across channels in a loop. The human sets goals and guardrails; the agents run the cycle. (Vendor-described; results unproven at this stage.)
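
The difference between a batch send and a loop with human-set guardrails can be made concrete with a toy sketch. Nothing here reflects how Campaign Agents actually work: the `Customer` and `Guardrails` types, the rule inside `decide`, and the weekly frequency cap are hypothetical stand-ins. The point is only the shape of the cycle, in which the agent proposes an action for each customer and the guardrail, not the agent, has the final say.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Customer:
    """Hypothetical customer state used by the loop."""
    customer_id: str
    days_since_purchase: int
    messages_this_week: int = 0


@dataclass(frozen=True)
class Guardrails:
    """Human-set limit; a real deployment would carry many more."""
    max_messages_per_week: int


def decide(customer: Customer) -> Optional[str]:
    """Toy next-best-action rule; a real agent would call a model here."""
    if customer.days_since_purchase > 60:
        return "win_back_offer"
    if customer.days_since_purchase > 14:
        return "replenish_reminder"
    return None


def run_cycle(customers: List[Customer], rails: Guardrails) -> List[Tuple[str, str]]:
    """One analyze -> decide -> act pass; calling it repeatedly forms the loop."""
    sent = []
    for customer in customers:
        action = decide(customer)
        if action is None:
            continue
        if customer.messages_this_week >= rails.max_messages_per_week:
            continue  # the guardrail overrides the agent's proposal
        customer.messages_this_week += 1
        sent.append((customer.customer_id, action))
    return sent


if __name__ == "__main__":
    audience = [Customer("a", 90), Customer("b", 20, messages_this_week=2), Customer("c", 3)]
    print(run_cycle(audience, Guardrails(max_messages_per_week=2)))
```

Running `run_cycle` on a schedule or on each new event, rather than once per campaign, is what would turn this from a batch into a continuous loop.
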

04 — The Runway: The composable CDP movement built Databricks’ on-ramp.

The most interesting dynamic in this launch is one most coverage skipped: the composable CDP category inadvertently paved the runway for CustomerLake. For years, vendors like Hightouch and GrowthLoop built their entire pitch on a single argument — the warehouse should be the CDP’s source of truth, and activation should happen from there rather than from a separate copy. They were persuasive enough that more than a quarter of CDPs now support a warehouse-centric architecture. The unintended consequence is that they taught the market to want exactly the thing only a warehouse vendor can build best: a CDP that is the warehouse.

Databricks reinforced the build-over-buy signal in how it staffed the effort. Rather than acquire an existing CDP, it recruited founding teams from ActionIQ and Census to build CustomerLake in house — a deliberate choice to own the architecture rather than bolt a packaged product onto the lakehouse. When the data layer decides to ship its own activation front end, the standalone vendors that spent a decade arguing the warehouse should be central find themselves competing against the warehouse itself.

The structural point
You cannot easily win a price war against a company that does not need the product to make money. Databricks can offer CDP functionality on a consumption model because that revenue is additive to its core data-platform business — while standalone CDP vendors must charge full platform fees to survive. That asymmetry is structurally different from feature-parity competition, and it is what will reshape CDP renewal negotiations more than any feature on the spec sheet.

05 — Comparison: Three CDP archetypes, side by side.

Most CDP comparisons stop at two options — packaged versus composable — and assume you need one of them. The table below adds the third tier CustomerLake represents and reads eight operational dimensions across all three archetypes. The legacy column reflects the Segment / mParticle packaged model, the composable column the Hightouch / GrowthLoop warehouse-activation model, and the agentic column CustomerLake as Databricks describes it. Cells in the agentic column are vendor-stated and Private Preview; read them as architecture intent, not proven capability.

Architectural comparison of three CDP archetypes — legacy packaged, composable warehouse-activation, and warehouse-native agentic (CustomerLake) — across eight operational dimensions. Sources: CDP Institute composable primer (cdp.com), Databricks blog and product page, Segment / Twilio documentation, and CMS Wire analysis, retrieved June 17, 2026. Agentic-column cells are vendor-stated and Private Preview.
| Dimension | Legacy packaged | Composable | Warehouse-native agentic |
|---|---|---|---|
| Data residency | Separate CDP data store (a copy outside the warehouse) | Lives in your warehouse; CDP reads from it | Native to the lakehouse; no separate store at all |
| Identity resolution | Deterministic + probabilistic rules, batch-run | Warehouse SQL models you build and maintain | Agentic Identity Resolution (AIR): deterministic, probabilistic, plus LLM-assisted edge cases with human review |
| Governance layer | The CDP's own access controls | Warehouse permissions you wire up yourself | Unity Catalog governs data, models, and agents together |
| Personalization model | Batch segments shipped to channels on a schedule | Reverse-ETL syncs audiences from warehouse to tools | Continuous agentic loops Databricks calls Infinity Campaigns |
| Pricing model | Per-platform software license, often per-profile | License plus your own warehouse compute spend | Consumption-based; no rates disclosed at launch |
| Developer / marketer split | Marketer-led UI; limited engineering touch | Engineer-heavy; data team builds the models | Genie natural-language interface lets marketers query without SQL |
| Third-party enrichment | Built-in connectors to data brokers | Whatever you pipe into the warehouse | Launch partners (e.g. IAS) connect via clean rooms; no third-party cookies |
| Availability today | Generally available, mature | Generally available across multiple vendors | Private Preview only; not yet GA |

Reading down the agentic column, the pattern is consistent: every row collapses a boundary the previous two archetypes preserved. The separate store disappears, governance becomes one plane, and identity resolution gains an LLM-assisted layer on top of the deterministic and probabilistic matching the category already used. The honest asterisk on the whole column is availability — it reads as the most consolidated architecture precisely because it is the least proven, still in Private Preview with no GA date or disclosed rates. For a grounding in the packaged-versus-composable trade-offs before you weigh the third tier, our CDP build-buy-or-skip decision matrix walks the maturity signals that point to each path, and the customer data platform fundamentals guide covers what a CDP actually does before the architecture debate.

06 — Market Context: A growing market with a utilization problem.

CDP market sizing varies widely by analyst firm, so the honest move is to cite the range rather than a single headline. For 2026, Mordor Intelligence puts the market at about $4.58B, Fortune Business Insights at $4.07B, and MarketsandMarkets at $9.72B — the last projecting growth to $37.11B by 2030 at a 30.7% CAGR. The CDP Institute counts roughly 208 active vendors as of July 2025, with a concentrated core where a small group of large vendors accounts for about 67% of CDP employment and 73% of total funding.
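
A quick way to sanity-check a projection like the MarketsandMarkets one is to solve the compound-growth formula for the number of years it implies. The sketch below uses only the three figures quoted above ($9.72B, $37.11B, 30.7% CAGR); the function names are illustrative. Under those inputs the implied window comes out at roughly five compounding years, while four years at the same rate would land nearer $28B, so the base year behind the $9.72B figure is worth confirming against the report itself.

```python
import math


def implied_years(start: float, end: float, cagr: float) -> float:
    """Years of compounding needed to grow start to end at the given CAGR."""
    return math.log(end / start) / math.log(1.0 + cagr)


def project(start: float, cagr: float, years: int) -> float:
    """Value after compounding start at cagr for a whole number of years."""
    return start * (1.0 + cagr) ** years


# Figures as quoted in the text, in billions of US dollars.
years = implied_years(9.72, 37.11, 0.307)
four_year_value = project(9.72, 0.307, 4)

print(f"implied compounding window: {years:.2f} years")
print(f"four years at 30.7% from $9.72B: ${four_year_value:.2f}B")
```
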

The more telling number is who is growing. Composable, warehouse-native vendors posted organic employment growth of about 7.8%, six times the 1.3% industry average, and more than a quarter of CDPs now support warehouse-centric architecture. That is the trend line CustomerLake steps onto, not against.

Figure: Where CDP employment is growing, warehouse-native vs industry average (organic employment growth, 2025). Warehouse-native CDP vendors: 7.8%. CDP industry average: 1.3%. Source: CDP.com industry statistics, retrieved June 17, 2026.

But adoption is not the same as use, and this is the paradox at the heart of the category. The scorecard below pulls the gap together: 41% of companies have implemented a CDP, yet only 22% of marketers report high utilization, and organizations estimate they use roughly 47% of available capabilities. The question CustomerLake does not yet answer is whether moving the CDP inside the warehouse fixes that adoption problem — or simply relocates the under-utilization into a new layer.

CDP adoption-gap scorecard mapping five market metrics to their current figure, source, and implication — showing that implementation outpaces utilization. Sources: CDP.com industry statistics and Databricks press materials, retrieved June 17, 2026.
| Metric | Current figure | Source | Implication |
|---|---|---|---|
| CDP implementation rate | 41% of companies | CDP.com industry stats | Adoption is mainstream; the install base is large |
| High-utilization rate | 22% of marketers | CDP.com industry stats | Most teams barely use what they bought |
| Average capabilities used | ~47% of features | CDP.com industry stats | Roughly half of every CDP investment sits idle |
| Warehouse-native adoption | >25% of CDPs | CDP.com industry stats | Warehouse-centric architecture is already a quarter of the field |
| Fortune 500 on Databricks | >60% penetration | Databricks (Dec 2025) | A vast warehouse install base CustomerLake can land on |

Analyst view
Per coverage in CMS Wire, Gartner predicts that by 2030, 80% of net-new enterprise CDP deployments will be embedded in or composable with data platforms — and advises CMOs to treat CustomerLake as an infrastructure decision before signing long-term CDP contracts. The underlying Gartner document sits behind a paywall, so we cite the figure as widely reported rather than independently verified.

07 — The Threat Model: What it means for incumbents and buyers.

CustomerLake does not threaten every CDP equally. The asymmetric economics hit hardest where a buyer already runs Databricks and the CDP’s main value was unifying and activating data the warehouse already holds. Where the incumbent’s value is real channel orchestration, deep activation integrations, or a 700-plus connector library and 25,000-company network like Twilio Segment’s, the calculus is more nuanced. The matrix below maps who should watch closely and who can keep building.

Already on Databricks · Enterprises with a lakehouse install base · Evaluate as infrastructure

With Fortune 500 penetration above 60%, a vast base already runs the warehouse CustomerLake lands on. If your CDP mainly unifies and activates Databricks data, this is the strongest reason yet to revisit the contract, but Private Preview means evaluate, do not migrate.

Standalone CDP vendors · Packaged-CDP incumbents · Defend on activation depth

The asymmetric price dynamic is the real threat: a consumption model that is additive to a data-platform business is hard to match on cost alone. Differentiation now has to live in orchestration, integrations, and proven outcomes, not in storing another copy of the data.

Multi-warehouse shops · Teams not on Databricks · Weigh composable first

Lakehouse Federation reaches Snowflake and BigQuery, but the native governance benefit is strongest on Databricks itself. If your data lives elsewhere, composable activation from your own warehouse may still be the cleaner path than adopting a second platform.

Risk-aware buyers · Anyone signing a long contract · Plan, don't commit

Private Preview, no disclosed pricing, no GA date, and entirely vendor-stated capability benchmarks. The prudent move is to treat CustomerLake as a roadmap input to your CDP renewal, not a product you can deploy today.

08 — Your Stack: What to actually do about it.

For most marketing teams, the right response to a Private Preview announcement is not to act but to recalibrate. If you are mid-cycle on a CDP renewal and you run Databricks, the smart move is to fold CustomerLake into the renewal conversation as a credible alternative — even if you cannot deploy it yet — because the option alone changes your negotiating position. If you are not on Databricks, the launch is a strong signal that warehouse-native architecture is the direction of travel, which makes the composable path worth a serious look before you sign another packaged-CDP contract.

The cost angle deserves its own line in the model. A consumption model that rides on compute you already pay for is a structurally different line item than a standalone CDP license, and it changes how CDP spend competes with the rest of your channel budget. If you are re-running that math, our marketing budget allocation guide frames where data-platform spend sits against paid, owned, and earned channels. And when the question becomes how to wire agents, identity, and activation into a working customer-data workflow rather than a slide, our CRM & marketing automation engagements start with exactly this kind of architecture decision — mapping your data, governance, and channel reality before any platform commitment.
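
For teams re-running that math, the comparison reduces to a break-even calculation between a fixed license and metered consumption. The sketch below is purely illustrative: Databricks has disclosed no rates, so every number in it (platform fee, per-profile fee, price per compute unit) is a hypothetical placeholder, and the function names are invented for this example. Substitute your own contract terms and observed usage.

```python
def annual_license_cost(platform_fee: float, profiles: int, fee_per_profile: float) -> float:
    """Packaged-CDP style cost: flat platform fee plus a per-profile charge."""
    return platform_fee + profiles * fee_per_profile


def annual_consumption_cost(compute_units: float, rate_per_unit: float) -> float:
    """Consumption-style cost: metered compute units times a unit rate."""
    return compute_units * rate_per_unit


def break_even_units(license_cost: float, rate_per_unit: float) -> float:
    """Compute units at which consumption spend equals the license cost."""
    if rate_per_unit <= 0:
        raise ValueError("rate_per_unit must be positive")
    return license_cost / rate_per_unit


if __name__ == "__main__":
    # All inputs below are hypothetical placeholders, not vendor pricing.
    license_cost = annual_license_cost(100_000.0, 2_000_000, 0.05)
    threshold = break_even_units(license_cost, 0.50)
    print(f"license: ${license_cost:,.0f}; break-even at {threshold:,.0f} compute units")
```

Below the break-even volume the consumption model is cheaper under these assumptions; above it the license wins. The comparison leaves out migration and integration costs, which would need their own line items.
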

09 — Conclusion: An infrastructure decision wearing a martech badge.

The shape of the CDP market, June 2026

The warehouse just became the CDP — but Private Preview means watch, not migrate.

CustomerLake is the clearest sign yet that the CDP category is being absorbed into the data platform rather than competing alongside it. By running the CDP inside the lakehouse and governing data, models, and agents under one plane, Databricks turns a decade of composable-CDP advocacy into its own on-ramp — and brings a structural cost advantage that standalone vendors cannot easily answer.

The honest caveats matter as much as the thesis. CustomerLake is in Private Preview, not generally available. Its scale and identity claims are vendor-stated with no independent verification, no pricing rates are public, and the named customers are early adopters rather than proven case studies. The 80%-by-2030 forecast is an analyst projection behind a paywall. None of that makes the move less significant — it makes it a roadmap input, not a deployable product.

The sharpest unanswered question is the one the market should sit with: 80% of net-new enterprise CDP deployments may be embedded in or composable with data platforms by 2030, yet only 22% of marketers report high utilization of the CDPs they already own. Moving the platform inside the warehouse fixes the architecture. It does not automatically fix adoption, and whether agents close that gap or simply relocate it is the test CustomerLake still has to pass. For now, the right posture is to treat it as the infrastructure decision Gartner calls it: evaluate it against your warehouse and your renewals, and commit only when the preview becomes a product.

FAQ · Databricks CustomerLake

The questions teams are asking this week.

CustomerLake is Databricks' agentic customer data platform (CDP), announced June 16, 2026 at the Data + AI Summit in San Francisco. It is built natively on the Databricks lakehouse and governed by Unity Catalog, which means there is no separate CDP data store — customer data, AI models, and the agents that act on them co-reside in one governed platform. Profile Agents handle Customer 360 unification and Campaign Agents handle audience building and cross-channel activation. It is Databricks' formal entry into the marketing software market and its second software application after Lakewatch, the security lakehouse it shipped in March 2026.