Running total for grouped data

Peter Bishop 256 Reputation points

Given input data similar to:

Month	Region	Store	Sales
Jan		R1		S1		10.00
Jan		R1		S1		15.00
Jan		R1		S1		30.00
Jan		R1		S2		 5.00
Jan		R1		S2		 1.00
Jan		R2		S3		17.00
Jan		R2		S3		 5.00
Jan		R2		S4		60.00
Jan		R2		S4		45.00
Jan		R2		S4		11.00
Feb		R1		S1		20.00
Feb		R1		S1		40.00
Feb		R1		S1		60.00
Feb		R2		S3		10.00
Feb		R2		S3		10.00
Feb		R2		S3		10.00
Feb		R2		S3		10.00

I'd like output of the form:

Month	Region	Store	 Sales
Jan		R1		S1		 55.00	 -- 10+15+30
Jan		R1		S2		 6.00	 -- 5+1
Jan		R2		S3		 22.00	 -- 17+5
Jan		R2		S4		116.00	 -- 60+45+11
Feb		R1		S1		175.00	 -- 20+40+60 + 55 (Jan's total)
Feb		R1		S2		 6.00	 -- (Jan's Total)
Feb		R2		S3		 62.00	 -- 10+10+10+10 + 22 (Jan's Total)
Feb		R2		S4		116.00	 -- (Jan's Total)
Mar		R1		S1		???.00	 -- (Mar's sales) + (Feb's total)
Mar		R1		S2		???.00	 -- (Mar's sales) + (Feb's total)
Mar		R2		S3		???.00	 -- (Mar's sales) + (Feb's total)
Mar		R2		S4		???.00	 -- (Mar's sales) + (Feb's total)
etc...

I can get a SUM() / GROUP BY to break down the month, region, store values for each combination but it won't do the ROLLUP.

I've also tried:

SUM([Sales]) OVER(PARTITION BY [Region],[Store] ORDER BY [Month] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

but this gave me a running total for every line rather than grouped by month, region, store. For example:

Month	Region	Store	Sales
Jan		R1		S1		10.00	 -- 10
Jan		R1		S1		25.00	 -- 10+15
Jan		R1		S1		50.00	 -- 10+15+30
Jan		R1		S2		 5.00	 -- 5
Jan		R1		S2		 6.00	 -- 5+1
etc...

Is there a way to achieve this without multiple queries (e.g. SUM() GROUP BY --> "ROWS BETWEEN".

Thanks.

0 comments No comments

Sign in to comment

3 answers

  1. Bruce (SqlWork.com) 84,086 Reputation points

    you can just use a simple group by and computed order by for month:

     select Month, Region, Store, sum(Sales) as Sales
     from Sales
     group by Month, Region, Store
     order by charIndex (Month, 'Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec'), Region, Store
    
    1. Erland Sommarskog 134.7K Reputation points MVP Volunteer Moderator

      Bruce, did you verify that query against Peter's desired result? He wants the running total per store - even if that store does not have any sales a particular month.

    2. Bruce (SqlWork.com) 84,086 Reputation points

      Sorry did not notice the running total.


    Sign in to comment
  2. Erland Sommarskog 134.7K Reputation points MVP Volunteer Moderator

    As Viorel pointed out, it gets less messy with numeric values for months. And presumably you need the year somewhere as well. Furthermore, I assume that you have tables that define the Regions and Stores. Those tables should replace the CTE AllStores.

    An issue with Viorel's query is that performance will be proportional to the square of number of months. This solution avoids this issue:

    DROP TABLE IF EXISTS #Peter
    CREATE TABLE #Peter (Month char(3) NOT NULL,
     Region char(2) NOT NULL,
     Store char(2) NOT NULL,
     Sales decimal(10,2) NOT NULL
    )
    INSERT #Peter(Month, Region, Store, Sales)
    VALUES
    ('Jan', 'R1',	'S1', 10.00),
    ('Jan', 'R1',	'S1', 15.00),
    ('Jan', 'R1',	'S1', 30.00),
    ('Jan', 'R1',	'S2', 5.00),
    ('Jan', 'R1',	'S2', 1.00),
    ('Jan', 'R2',	'S3', 17.00),
    ('Jan', 'R2',	'S3', 5.00),
    ('Jan', 'R2',	'S4', 60.00),
    ('Jan', 'R2',	'S4', 45.00),
    ('Jan', 'R2',	'S4', 11.00),
    ('Feb', 'R1',	'S1', 20.00),
    ('Feb', 'R1',	'S1', 40.00),
    ('Feb', 'R1',	'S1', 60.00),
    ('Feb', 'R2',	'S3', 10.00),
    ('Feb', 'R2',	'S3', 10.00),
    ('Feb', 'R2',	'S3', 10.00),
    ('Feb', 'R2',	'S3', 10.00)
    go
    ; WITH AllStores AS (
    	SELECT DISTINCT Region, Store FROM #Peter
    ), Months AS (
     SELECT value AS monthno, 
    	 choose(value, 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 
    	 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec') AS Month
     FROM generate_series(1, 23)
    )
    SELECT m.Month, a.Region, a.Store, 
     SUM(SUM(isnull(p.Sales, 0))) OVER(PARTITION BY a.Region, a.Store 
    		 ORDER BY m.monthno
    									 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM Months m
    CROSS JOIN AllStores a
    LEFT JOIN #Peter p ON p.Month = m.Month
     AND p.Region = a.Region
    						 AND p.Store = a.Store
    GROUP BY m.Month, m.monthno, a.Region, a.Store
    ORDER BY m.monthno, a.Region, a.Store
    go
    ;WITH MonthlyStoreSales AS
    (
     SELECT
     [Month],
     [Region],
     [Store],
     SUM([Sales]) AS MonthlySales
     FROM #Peter
     GROUP BY [Month], [Region], [Store]
    )
    SELECT
     [Month],
     [Region],
     [Store],
     SUM(MonthlySales) OVER
     (
     PARTITION BY [Region], [Store]
     ORDER BY [Month]
     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
     ) AS Sales
    FROM MonthlyStoreSales
    ORDER BY [Month], [Region], [Store];
    
    
    

    I deleted the AI answer, because it was not producing the desired result.

    0 comments No comments

    Sign in to comment
  3. Viorel 127K Reputation points

    You could also try a longer query:

    ;
    with Q0 as
    (
    	select *, charindex([Month], 'JanFebMarAprMayJunJulAugSepOctNovDec') as MonthIndex
    	from MyTable
    ),
    Q1 as
    (
    	select distinct Region, Store
    	from Q0
    ),
    Q2 as
    (
    	select distinct [Month], MonthIndex, Q1.Region, Q1.Store
    	from Q0, Q1
    )
    select [Month], Region, Store, (select sum(Sales) from Q0 where Region=Q2.Region and Store=Q2.Store and MonthIndex<=Q2.MonthIndex) as Sales
    from Q2
    order by MonthIndex
    

    It would be simpler if the month were numeric.

    1. Lakshmi Narayana Garikapati 1,330 Reputation points Microsoft External Staff Moderator

      Hi @Peter Bishop ,
      Following up to confirm if the earlier responses addressed your query. If not, please share your questions and we'll assist further

      Thanks,

      Lakshmi


    Sign in to comment
Sign in to answer

Your answer