Window functions execute calculations over a defined range of rows related to the current row, enabling advanced data analysis without requiring subqueries. Unlike aggregate functions, which typically return a single value for a group of rows, Window functions return a calculated value for each row based on the specified window of related rows.
Table of Contents
1. RANK( )
The RANK() function assigns a unique rank to each row within a specified partition of a result set. The rank starts at 1 and increments for each row. If two or more rows have the same rank, the next rank will be skipped.
Syntax:
RANK() OVER (PARTITION BY column1 ORDER BY column2)
Example: Consider the Sales table:
SalesID | SalesPerson | Amount |
1 | Alice | 500 |
2 | Bob | 300 |
3 | Alice | 700 |
4 | Bob | 400 |
Query:
SELECT SalesPerson, Amount,
RANK() OVER (PARTITION BY SalesPerson ORDER BY Amount DESC) AS Rank
FROM Sales;
Output:
SalesPerson | Amount | Rank |
Alice | 700 | 1 |
Alice | 500 | 2 |
Bob | 400 | 1 |
Bob | 300 | 2 |
2. DENSE_RANK( )
The DENSE_RANK() function is similar to RANK(), but it does not skip rank numbers when they are the same rank in more than one row.
Syntax:
DENSE_RANK() OVER (PARTITION BY column1 ORDER BY column2)
Example: Using the same Sales table:
SalesID | SalesPerson | Amount |
1 | Alice | 500 |
2 | Bob | 300 |
3 | Alice | 700 |
4 | Bob | 400 |
Query:
SELECT SalesPerson, Amount,
DENSE_RANK() OVER (PARTITION BY SalesPerson ORDER BY Amount DESC) AS DenseRank
FROM Sales;
Output:
SalesPerson | Amount | Rank |
Alice | 700 | 1 |
Alice | 500 | 2 |
Bob | 400 | 1 |
Bob | 300 | 2 |
3. ROW_NUMBER()
The ROW_NUMBER() function assigns a unique sequential integer to rows within a partition.
Syntax:
SalesID | SalesPerson | Amount |
1 | Alice | 500 |
2 | Bob | 300 |
3 | Alice | 700 |
4 | Bob | 400 |
Query:
SELECT SalesPerson, Amount,
ROW_NUMBER() OVER (PARTITION BY SalesPerson ORDER BY Amount DESC) AS RN
FROM Sales;
Output:
SalesPerson | Amount | RN |
Alice | 700 | 1 |
Alice | 500 | 2 |
Bob | 400 | 1 |
Bob | 300 | 2 |
4. NTILE( )
The NTILE() function evenly distributes the rows in an ordered partition into a specified number of groups, assigning a bucket number (1 through n) to each row.
Syntax:
NTILE(n) OVER (PARTITION BY column1 ORDER BY column2)
Example: Using the same Sales table to create 2 buckets:
Query:
SELECT SalesPerson, Amount,
NTILE(2) OVER (PARTITION BY SalesPerson ORDER BY Amount DESC) AS Bucket
FROM Sales;
Output:
SalesPerson | Amount | Bucket |
Alice | 700 | 1 |
Alice | 500 | 2 |
Bob | 400 | 1 |
Bob | 300 | 2 |
5. LEAD( ) and LAG( )
- LEAD(): Returns the value of a specified column from the next row within the result set.
- LAG(): Returns the value of a specified column from the previous row within the result set.
Syntax:
LEAD(column_name) OVER (PARTITION BY column1 ORDER BY column2)
LAG(column_name) OVER (PARTITION BY column1 ORDER BY column2)
Example: Using the same Sales table:
Query:
SELECT SalesPerson, Amount,
LEAD(Amount) OVER (ORDER BY Amount) AS NextAmount,
LAG(Amount) OVER (ORDER BY Amount) AS PrevAmount
FROM Sales;
Output:
SalesPerson | Amount | NextAmount | PrevAmount |
Bob | 300 | 400 | NULL |
Bob | 400 | 500 | 300 |
Alice | 500 | 700 | 400 |
Alice | 700 | NULL | 500 |
6. OVER( ) Clause
The OVER() clause defines the window (or set of rows) to which the window function is applied. It can include partitioning and ordering.
Syntax :
function_name() OVER (PARTITION BY column1 ORDER BY column2)
Example: To see how the OVER() clause works with RANK():
Query:
SELECT SalesPerson, Amount,
RANK() OVER (ORDER BY Amount DESC) AS OverallRank
FROM Sales;
Output:
SalesPerson | Amount | OverallRank |
Alice | 700 | 1 |
Alice | 500 | 2 |
Bob | 400 | 3 |
Bob | 300 | 4 |
Conclusion
Window functions provide powerful tools for analyzing data in SQL, enabling complex calculations while preserving individual row details. Understanding how to use functions like RANK(), DENSE_RANK(), ROW_NUMBER(), NTILE(), LEAD(), and LAG() can significantly enhance your data analysis capabilities.
Most Commented