Applies to: 👁 check marked yes
Databricks SQL 👁 check marked yes
Databricks Runtime
Functions that operate on a group of rows, referred to as a window, and calculate a return value for each row based on the group of rows. Window functions are useful for processing tasks such as calculating a moving average, computing a cumulative statistic, or accessing the value of rows given the relative position of the current row.
Syntax
function OVER { window_name | ( window_name ) | window_spec }
function
{ ranking_function | analytic_function | aggregate_function }
over_clause
OVER { window_name | ( window_name ) | window_spec }
window_spec
( [ PARTITION BY partition [ , ... ] ] [ order_by ] [ window_frame ] )
Parameters
-
function
The function operating on the window. Different classes of functions support different configurations of window specifications.
-
ranking_function
Any of the Ranking window functions.
If specified the window_spec must include an ORDER BY clause, but not a window_frame clause.
-
analytic_function
Any of the Analytic window functions.
-
aggregate_function
Any of the Aggregate functions.
If specified the function must not include a FILTER clause.
-
-
window_name
Identifies a named window specification defined by the query.
-
window_spec
This clause defines how the rows will be grouped, sorted within the group, and which rows within a partition a function operates on.
-
partition
One or more expression used to specify a group of rows defining the scope on which the function operates. If no PARTITION clause is specified the partition is comprised of all rows.
-
order_by
The ORDER BY clause specifies the order of rows within a partition.
-
window_frame
The window frame clause specifies a sliding subset of rows within the partition on which the aggregate or analytics function operates.
-
You can specify SORT BY as an alias for ORDER BY.
You can also specify DISTRIBUTE BY as an alias for PARTITION BY. You can use CLUSTER BY as an alias for PARTITION BY in the absence of ORDER BY.
Common error conditions
- DISTINCT_WINDOW_FUNCTION_UNSUPPORTED
- MISSING_WINDOW_SPECIFICATION
- UNSUPPORTED_EXPR_FOR_WINDOW
- WINDOW_FUNCTION_AND_FRAME_MISMATCH
- WINDOW_FUNCTION_NOT_ALLOWED_IN_CLAUSE
- WINDOW_FUNCTION_WITHOUT_OVER_CLAUSE
Examples
>CREATETABLE employees
(name STRING, dept STRING, salary INT, age INT);
>INSERTINTO employees
VALUES('Lisa','Sales',10000,35),
('Evan','Sales',32000,38),
('Fred','Engineering',21000,28),
('Alex','Sales',30000,33),
('Tom','Engineering',23000,33),
('Jane','Marketing',29000,28),
('Jeff','Marketing',35000,38),
('Paul','Engineering',29000,23),
('Chloe','Engineering',23000,25);
>SELECT name, dept, salary, age FROM employees;
Chloe Engineering 2300025
Fred Engineering 2100028
Paul Engineering 2900023
Helen Marketing 2900040
Tom Engineering 2300033
Jane Marketing 2900028
Jeff Marketing 3500038
Evan Sales 3200038
Lisa Sales 1000035
Alex Sales 3000033
>SELECT name,
dept,
RANK()OVER(PARTITIONBY dept ORDERBY salary)AS rank
FROM employees;
Lisa Sales 100001
Alex Sales 300002
Evan Sales 320003
Fred Engineering 210001
Tom Engineering 230002
Chloe Engineering 230002
Paul Engineering 290004
Helen Marketing 290001
Jane Marketing 290001
Jeff Marketing 350003
>SELECT name,
dept,
DENSE_RANK()OVER(PARTITIONBY dept ORDERBY salary
ROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW)AS dense_rank
FROM employees;
Lisa Sales 100001
Alex Sales 300002
Evan Sales 320003
Fred Engineering 210001
Tom Engineering 230002
Chloe Engineering 230002
Paul Engineering 290003
Helen Marketing 290001
Jane Marketing 290001
Jeff Marketing 350002
>SELECT name,
dept,
age,
CUME_DIST()OVER(PARTITIONBY dept ORDERBY age
RANGE BETWEENUNBOUNDEDPRECEDINGANDCURRENTROW)AS cume_dist
FROM employees;
Alex Sales 330.3333333333333333
Lisa Sales 350.6666666666666666
Evan Sales 381.0
Paul Engineering 230.25
Chloe Engineering 250.50
Fred Engineering 280.75
Tom Engineering 331.0
Jane Marketing 280.3333333333333333
Jeff Marketing 380.6666666666666666
Helen Marketing 401.0
>SELECT name,
dept,
salary,
MIN(salary)OVER(PARTITIONBY dept ORDERBY salary)AS min
FROM employees;
Lisa Sales 1000010000
Alex Sales 3000010000
Evan Sales 3200010000
Helen Marketing 2900029000
Jane Marketing 2900029000
Jeff Marketing 3500029000
Fred Engineering 2100021000
Tom Engineering 2300021000
Chloe Engineering 2300021000
Paul Engineering 2900021000
>SELECT name,
salary,
LAG(salary)OVER(PARTITIONBY dept ORDERBY salary)AS lag,
LEAD(salary,1,0)OVER(PARTITIONBY dept ORDERBY salary)AS lead
FROM employees;
Lisa 10000NULL30000
Alex 300001000032000
Evan 32000300000
Fred 21000NULL23000
Chloe 230002100023000
Tom 230002300029000
Paul 29000230000
Helen 29000NULL29000
Jane 290002900035000
Jeff 35000290000
