SQL Window Functions

Window Functions in SQL scaled Explore and Read Our Blogs Written By Our Insutry Experts Learn From KSR Data Vizon

Window functions execute calculations over a defined range of rows related to the current row, enabling advanced data analysis without requiring subqueries. Unlike aggregate functions, which typically return a single value for a group of rows, Window functions return a calculated value for each row based on the specified window of related rows.

Sample Sales Data

SalesIDSalesPersonAmount
1Alice700
2Alice700
3Alice500
4Alice500
5Bob600
6Bob400
7Bob400
8Bob300

SQL Query for Sample Data:

-- Base table used for all examples
DROP TABLE IF EXISTS Sales;
CREATE TABLE Sales (
  SalesID INT PRIMARY KEY,
  SalesPerson VARCHAR(50),
  Amount INT
);

INSERT INTO Sales (SalesID, SalesPerson, Amount) VALUES
(1, 'Alice', 700),
(2, 'Alice', 700),
(3, 'Alice', 500),
(4, 'Alice', 500),
(5, 'Bob',   600),
(6, 'Bob',   400),
(7, 'Bob',   400),
(8, 'Bob',   300);

1. RANK( )

The RANK() function assigns a unique rank to each row within a specified partition of a result set. The rank starts at 1 and increments for each row. If two or more rows have the same rank, the next rank will be skipped.

Syntax:

RANK() OVER (PARTITION BY column_name ORDER BY column_name [ASC|DESC])

Query:

SELECT
  SalesID, SalesPerson, Amount,
  RANK() OVER (
    PARTITION BY SalesPerson
    ORDER BY Amount DESC
  ) AS rnk
FROM Sales
ORDER BY SalesPerson, rnk, SalesID;

Output:

SalesIDSalesPersonAmountrnk
1Alice7001
2Alice7001
3Alice5003
4Alice5003
5Bob6001
6Bob4002
7Bob4002
8Bob3004

2. DENSE_RANK( )

The DENSE_RANK() function is similar to RANK(), but it does not skip rank numbers when they are the same rank in more than one row.

Syntax:

DENSE_RANK() OVER (PARTITION BY column_name ORDER BY column_name [ASC|DESC])

Query:

SELECT
  SalesID, SalesPerson, Amount,
  DENSE_RANK() OVER (
    PARTITION BY SalesPerson
    ORDER BY Amount DESC
  ) AS dense_rnk
FROM Sales
ORDER BY SalesPerson, dense_rnk, SalesID;

Output:

SalesIDSalesPersonAmountdense_rnk
1Alice7001
2Alice7001
3Alice5002
4Alice5002
5Bob6001
6Bob4002
7Bob4002
8Bob3003

3. ROW_NUMBER()

The ROW_NUMBER() function assigns a unique sequential integer to rows within a partition.

Syntax:

ROW_NUMBER() OVER ( PARTITION BY column_name
    ORDER BY column_name [ASC|DESC] )

Query:

SELECT
  SalesID, SalesPerson, Amount,
  ROW_NUMBER() OVER (
    PARTITION BY SalesPerson
    ORDER BY Amount DESC, SalesID ASC  -- tie-breaker for determinism
  ) AS rn
FROM Sales
ORDER BY SalesPerson, rn;

Output:

SalesIDSalesPersonAmountrn
1Alice7001
2Alice7002
3Alice5003
4Alice5004
5Bob6001
6Bob4002
7Bob4003
8Bob3004

4. NTILE( )

The NTILE() function evenly distributes the rows in an ordered partition into a specified number of groups, assigning a bucket number (1 through n) to each row.

Syntax:

NTILE(n) OVER (PARTITION BY column_name ORDER BY column_name [ASC|DESC])

Query:

SELECT
  SalesID, SalesPerson, Amount,
  NTILE(4) OVER (
    PARTITION BY SalesPerson
    ORDER BY Amount DESC, SalesID ASC
  ) AS bucket
FROM Sales
ORDER BY SalesPerson, bucket, SalesID;

Output:

SalesPersonAmountBucket
Alice7001
Alice5002
Bob4001
Bob3002

5. LEAD( ) and LAG( )

  • LEAD(): Returns the value of a specified column from the next row within the result set.
  • LAG(): Returns the value of a specified column from the previous row within the result set.

Syntax:

LEAD(column_name) OVER (PARTITION BY column_name ORDER BY column_name [ASC|DESC])
LAG(column_name) OVER (PARTITION BY column_name ORDER BY column_name [ASC|DESC])

Query:

SELECT
  SalesID, SalesPerson, Amount,
  LAG(Amount)  OVER (ORDER BY Amount ASC, SalesID ASC) AS prev_amount,
  LEAD(Amount) OVER (ORDER BY Amount ASC, SalesID ASC) AS next_amount
FROM Sales
ORDER BY Amount ASC, SalesID ASC;

Output (first has NULL prev; last has NULL next):

SalesIDSalesPersonAmountprev_amountnext_amount
8Bob300NULL400
6Bob400300400
7Bob400400500
3Alice500400500
4Alice500500600
5Bob600500700
1Alice700600700
2Alice700700NULL

6. OVER( ) Clause

The OVER() clause in SQL is what transforms normal aggregate functions (like SUM, AVG, COUNT, RANK, etc.) into window functions meaning they operate across a set of rows (a window) without collapsing them into one result like GROUP BY does.

Syntax :

function() OVER (
    PARTITION BY column1, column2, ...
    ORDER BY column3 [ASC|DESC]
    ROWS or RANGE frame specification
)

Query: OVER() without PARTITION or ORDER

SELECT
  SalesPerson,
  Amount,
  SUM(Amount) OVER() AS TotalSales
FROM Sales;

Output:

SalesPersonAmountTotalSales
Alice5003600
Bob4003600
Alice7003600
Bob3003600
Carol8003600
Dan9003600

➡️ Here, OVER() means the entire table is the window —
every row “sees” the total of all rows.

Query: OVER(PARTITION BY ...) — Grouped Windows

SELECT
  SalesPerson,
  Amount,
  SUM(Amount) OVER(PARTITION BY SalesPerson) AS TotalPerPerson
FROM Sales;

Output:

SalesPersonAmountTotalPerPerson
Alice5001200
Alice7001200
Bob400700
Bob300700
Carol800800
Dan900900

➡️ The partition restricts the window to each salesperson’s rows.
Equivalent of GROUP BY, but still returns each row individually.

OVER(ORDER BY ...) — Ordered Windows

SELECT
  SalesPerson,
  Amount,
  SUM(Amount) OVER(ORDER BY Amount) AS RunningTotal
FROM Sales;

Output:

SalesPersonAmountRunningTotal
Bob300300
Bob400700
Alice5001200
Alice7001900
Carol8002700
Dan9003600

➡️ The ORDER BY defines sequence inside the window — often used for running totals, cumulative sums, or ranks.

OVER(PARTITION BY … ORDER BY …) — Combined

SELECT
  SalesPerson,
  Amount,
  SUM(Amount) OVER(
    PARTITION BY SalesPerson
    ORDER BY Amount
  ) AS RunningTotalPerPerson
FROM Sales;

Output:

SalesPersonAmountRunningTotalPerPerson
Alice500500
Alice7001200
Bob300300
Bob400700

➡️ Each partition (salesperson) has its own ordered running total.

Query: Using with Ranking Functions

OVER() is also used with analytic functions like RANK(), DENSE_RANK(), and ROW_NUMBER().

SELECT
  SalesPerson,
  Amount,
  RANK() OVER(PARTITION BY SalesPerson ORDER BY Amount DESC) AS RankInTeam
FROM Sales;

➡️ OVER() defines “how to rank” — by grouping and ordering logic.

Query: ROWS BETWEEN — Frame Specification

For fine control within the ordered window:

SELECT
  SalesPerson,
  Amount,
  AVG(Amount) OVER(
    ORDER BY Amount
    ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
  ) AS MovingAvg
FROM Sales;

➡️ Calculates the average using the previous, current, and next row — a sliding window.

Summary of Over Clause

Clause PartPurposeExample
PARTITION BYDivides data into groups“Per SalesPerson”
ORDER BYDefines order of rows in each group“Sort by Amount”
ROWS / RANGEControls the frame (how many rows before/after are visible)“Running or moving totals”

Conclusion

Window functions provide powerful tools for analyzing data in SQL, enabling complex calculations while preserving individual row details. Understanding how to use functions like RANK(), DENSE_RANK(), ROW_NUMBER(), NTILE(), LEAD(), and LAG() can significantly enhance your data analysis capabilities.

Knowledge Check

Related Posts

Leave a Reply

Your email address will not be published. Required fields are marked *