URL: https://dev.to/thasha/multi-level-groupby-in-dataweave-2-traps-that-break-your-nested-aggregations-2944

Multi-Level GroupBy in DataWeave: 2 Traps That Break Your Nested Aggregations


I spent 2 days building a 3-level sales report in DataWeave. Region to country to product, with revenue totals at every level. The code was 10 lines. The debugging was 2 days. Two traps that nobody warns you about.

TL;DR

  • Nested groupBy creates hierarchical structures from flat data — 3 levels in 10 lines
  • Every groupBy returns an Object — you MUST use mapObject at every level, never map
  • Missing groups or revenue fields surface as null, not 0, so your dashboard shows "null" for revenue
  • Handles 50,000 records in under 2 seconds

The Problem: Flat Data to Tree Structure

We had 50,000 order records shaped like this:

[
  {"region":"North America","country":"USA","product":"Laptop","revenue":45000},
  {"region":"North America","country":"USA","product":"Monitor","revenue":12000},
  {"region":"Europe","country":"UK","product":"Laptop","revenue":22000},
  {"region":"Europe","country":"Germany","product":"Monitor","revenue":9500}
]

The dashboard needed a tree: Region → Country → Product → totals.

The 10-Line Solution

%dw 2.0
output application/json
---
payload groupBy $.region mapObject (regionItems, region) -> ({
    (region): regionItems groupBy $.country mapObject (countryItems, country) -> ({
        (country): countryItems groupBy $.product mapObject (productItems, product) -> ({
            (product): {
                totalRevenue: sum(productItems.revenue),
                count: sizeOf(productItems)
            }
        })
    })
})
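To try the pattern outside a Mule flow, here is a minimal self-contained sketch. It assumes the four sample records are inlined as a variable named orders in place of payload, and it pulls the leaf object into a helper called totals. Both names are hypothetical and only there for illustration. The expected output in the trailing comment was worked out by hand from the sample data.

```dataweave
%dw 2.0
output application/json
// Inline copy of the four sample records, standing in for payload (assumption)
var orders = [
  {region: "North America", country: "USA", product: "Laptop", revenue: 45000},
  {region: "North America", country: "USA", product: "Monitor", revenue: 12000},
  {region: "Europe", country: "UK", product: "Laptop", revenue: 22000},
  {region: "Europe", country: "Germany", product: "Monitor", revenue: 9500}
]
// Hypothetical helper: leaf totals for one group of records
fun totals(items) = {
  totalRevenue: sum(items.revenue),
  count: sizeOf(items)
}
---
orders groupBy $.region mapObject ((regionItems, region) -> {
  (region): regionItems groupBy $.country mapObject ((countryItems, country) -> {
    (country): countryItems groupBy $.product mapObject ((productItems, product) -> {
      (product): totals(productItems)
    })
  })
})
// Expected output for the sample data:
// {
//   "North America": {
//     "USA": {
//       "Laptop":  { "totalRevenue": 45000, "count": 1 },
//       "Monitor": { "totalRevenue": 12000, "count": 1 }
//     }
//   },
//   "Europe": {
//     "UK":      { "Laptop":  { "totalRevenue": 22000, "count": 1 } },
//     "Germany": { "Monitor": { "totalRevenue": 9500,  "count": 1 } }
//   }
// }
```

Note that Europe has no Monitor entry under UK and no Laptop entry under Germany: groupBy only creates keys for values that actually occur in the data.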

Trap 1: Cannot Coerce Object to Array

My first version used map at level 2:

regionItems groupBy $.country map (countryItems) -> ...

Error: "Cannot coerce :object to :array"

groupBy ALWAYS returns an Object: the keys are the grouping values and each value is an Array of the matching records. map expects an Array, so this error appears at EVERY level where you use map instead of mapObject. I hit it 3 times, once at each level.

The rule: After groupBy, always mapObject. Never map. Three levels = three mapObject calls.
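The difference is easiest to see on a single level. The sketch below uses a small hypothetical orders variable and shows the Object that groupBy produces being consumed two ways: with mapObject, which keeps the Object shape, and with pluck, a Core function that also iterates an Object but returns an Array. pluck is not used in the original solution; it is included here only as an option for cases where a consumer needs a list rather than keyed data.

```dataweave
%dw 2.0
output application/json
// Hypothetical sample data for illustration
var orders = [
  {country: "USA", revenue: 45000},
  {country: "USA", revenue: 12000},
  {country: "UK", revenue: 22000}
]
// groupBy yields an Object: { USA: [2 records], UK: [1 record] }
var byCountry = orders groupBy $.country
---
{
  // mapObject: Object in, Object out (one key per country)
  asObject: byCountry mapObject ((items, country) -> {
    (country): sum(items.revenue)
  }),
  // pluck: Object in, Array out (one element per country)
  asArray: byCountry pluck ((items, country) -> {
    country: country,
    totalRevenue: sum(items.revenue)
  })
}
// Expected output:
// {
//   "asObject": { "USA": 57000, "UK": 22000 },
//   "asArray": [
//     { "country": "USA", "totalRevenue": 57000 },
//     { "country": "UK", "totalRevenue": 22000 }
//   ]
// }
```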

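Trap 2: Null Revenue Where You Expect 0

A region had no records for one product category, and the dashboard showed "null" in the revenue column.

The finance team reported it as a data quality issue. The fix:

totalRevenue: sum(productItems.revenue) default 0

Two caveats about where the null comes from. groupBy only creates keys for values that occur in the data, so a region with no records for a product gets no entry for that product at all, rather than an entry with an empty array. And per the DataWeave documentation, sum returns 0 for an empty array, while min and max return null and avg raises an error. The null is therefore more likely to come from a missing key, or from records with no revenue field, than from sum([]) itself. Adding default 0 is still a cheap guard on any aggregation or lookup whose result can be null. sizeOf returns 0 for empty arrays.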
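A quick way to check which of these cases actually yields null is a scratch script like the one below. The report variable is a hypothetical stand-in for a fragment of the grouped output, and the comments reflect the behaviour described in the DataWeave Core documentation for sum and max, plus the usual rule that selecting a missing key gives null. Treat it as a sketch to run in your own runtime version rather than a guarantee.

```dataweave
%dw 2.0
output application/json
// Hypothetical fragment of the grouped report: UK sold Laptops but no Monitors
var report = {
  Europe: { UK: { Laptop: { totalRevenue: 22000, count: 1 } } }
}
---
{
  // Documented: sum returns 0 for an empty array
  emptySum: sum([]),
  // Documented: max returns null for an empty array
  emptyMax: max([]),
  // default replaces the null with 0
  guardedMax: max([]) default 0,
  // The Monitor key was never created, so the selector chain yields null
  missingProduct: report.Europe.UK.Monitor.totalRevenue,
  // Guarding the lookup gives the dashboard a number instead of null
  guardedLookup: report.Europe.UK.Monitor.totalRevenue default 0
}
```

If the dashboard reads keys that may not exist, the guard belongs on the lookup side as well as on the aggregation.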

Performance

I tested with 50,000 records across 8 regions, 23 countries, and 400 products. Processing time: under 2 seconds on a 0.1 vCore CloudHub worker.

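Each groupBy is O(n) over its input, and the groups at any one level are disjoint, so each level touches every record exactly once. Three levels cost roughly 3n operations, which is still O(n), not O(n^3). The 10-line DataWeave replaced a 200-line Java implementation that took 3 developers a week.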

When to Use Multi-Level GroupBy

Use it when                                   | Don't use it when
Building hierarchical reports from flat data  | Data is already nested
Dashboard APIs needing tree structures        | Output needs to stay flat
Pivot-table style aggregation                 | Only one grouping dimension
3+ dimensions of analysis                     | Simple count/sum without hierarchy

100 patterns with MUnit tests: github.com/shakarbisetty/mulesoft-cookbook

60-second video walkthroughs: youtube.com/@SanThaParv