SQL Window Functions: A Comprehensive Guide

Window functions execute calculations over a defined range of rows related to the current row, enabling advanced data analysis without requiring subqueries. Unlike aggregate functions, which typically return a single value for a group of rows, Window functions return a calculated value for each row based on the specified window of related rows.

1. RANK( )

The RANK() function assigns a unique rank to each row within a specified partition of a result set. The rank starts at 1 and increments for each row. If two or more rows have the same rank, the next rank will be skipped.

Syntax:

RANK() OVER (PARTITION BY column1 ORDER BY column2)

Example: Consider the Sales table:

SalesIDSalesPersonAmount
1Alice500
2Bob300
3Alice700
4Bob400

Query:

SELECT SalesPerson, Amount, 
       RANK() OVER (PARTITION BY SalesPerson ORDER BY Amount DESC) AS Rank
FROM Sales;

Output:

SalesPersonAmountRank
Alice7001
Alice5002
Bob4001
Bob3002

2. DENSE_RANK( )

The DENSE_RANK() function is similar to RANK(), but it does not skip rank numbers when they are the same rank in more than one row.

Syntax:

DENSE_RANK() OVER (PARTITION BY column1 ORDER BY column2)

Example: Using the same Sales table:

SalesIDSalesPersonAmount
1Alice500
2Bob300
3Alice700
4Bob400

Query:

SELECT SalesPerson, Amount, 
       DENSE_RANK() OVER (PARTITION BY SalesPerson ORDER BY Amount DESC) AS DenseRank
FROM Sales;

Output:

SalesPersonAmountRank
Alice7001
Alice5002
Bob4001
Bob3002

3. ROW_NUMBER()

The ROW_NUMBER() function assigns a unique sequential integer to rows within a partition.

Syntax:

SalesIDSalesPersonAmount
1Alice500
2Bob300
3Alice700
4Bob400

Query:

SELECT SalesPerson, Amount, 
       ROW_NUMBER() OVER (PARTITION BY SalesPerson ORDER BY Amount DESC) AS RN
FROM Sales;

Output:

SalesPersonAmountRN
Alice7001
Alice5002
Bob4001
Bob3002

4. NTILE( )

The NTILE() function evenly distributes the rows in an ordered partition into a specified number of groups, assigning a bucket number (1 through n) to each row.

Syntax:

NTILE(n) OVER (PARTITION BY column1 ORDER BY column2)

Example: Using the same Sales table to create 2 buckets:

Query:

SELECT SalesPerson, Amount, 
       NTILE(2) OVER (PARTITION BY SalesPerson ORDER BY Amount DESC) AS Bucket
FROM Sales;

Output:

SalesPersonAmountBucket
Alice7001
Alice5002
Bob4001
Bob3002

5. LEAD( ) and LAG( )

  • LEAD(): Returns the value of a specified column from the next row within the result set.
  • LAG(): Returns the value of a specified column from the previous row within the result set.

Syntax:

LEAD(column_name) OVER (PARTITION BY column1 ORDER BY column2)
LAG(column_name) OVER (PARTITION BY column1 ORDER BY column2)

Example: Using the same Sales table:

Query:

SELECT SalesPerson, Amount, 
       LEAD(Amount) OVER (ORDER BY Amount) AS NextAmount,
       LAG(Amount) OVER (ORDER BY Amount) AS PrevAmount
FROM Sales;

Output:

SalesPersonAmountNextAmountPrevAmount
Bob300400NULL
Bob400500300
Alice500700400
Alice700NULL500

6. OVER( ) Clause

The OVER() clause defines the window (or set of rows) to which the window function is applied. It can include partitioning and ordering.

Syntax :

function_name() OVER (PARTITION BY column1 ORDER BY column2)

Example: To see how the OVER() clause works with RANK():

Query:

SELECT SalesPerson, Amount, 
       RANK() OVER (ORDER BY Amount DESC) AS OverallRank
FROM Sales;

Output:

SalesPersonAmountOverallRank
Alice7001
Alice5002
Bob4003
Bob3004

Conclusion

Window functions provide powerful tools for analyzing data in SQL, enabling complex calculations while preserving individual row details. Understanding how to use functions like RANK(), DENSE_RANK(), ROW_NUMBER(), NTILE(), LEAD(), and LAG() can significantly enhance your data analysis capabilities.

Knowledge Check

Related Posts

Leave a Reply

Your email address will not be published. Required fields are marked *