SQL TL;DR: Window Functions (DQL)
Back to SQL TL;DR Hub

Window Functions

/tldr: Performing calculations on a set of related rows without collapsing them.

Analytical Queries Advanced DQL Ranking & Running Totals

THE CORE IDEA: Aggregate without Grouping

Window functions allow you to run aggregate calculations (like SUM or AVG) or ranking operations on a related subset of rows—the "window"—but they do not collapse the result set like GROUP BY does. Every original row remains in the output, and the calculation result is added as a new column for each row.

GROUP BY: Many Rows One Summary Row

WINDOW: Many Rows Many Rows + Calculation

The Engine: The OVER Clause

Every window function must be followed by OVER (...), which defines the boundaries of the window the function will operate on.


FUNCTION(...) OVER (
    [PARTITION BY column_list]
    [ORDER BY column_list]
    [ROWS/RANGE frame_definition]
)

PARTITION BY

Divides the data into non-overlapping groups (the 'windows'). The function restarts its calculation for each new partition. (Similar to `GROUP BY`).

ORDER BY

Defines the sequence within the current partition. Crucial for ranking, running totals, and LAG/LEAD functions.

ROWS/RANGE

Defines the "frame"—the exact set of rows within the partition that the function should act upon (e.g., the last 3 rows, or all rows up to the current one).

Type 1: Ranking Functions

Used to assign a rank or sequential number to rows within each partition.

  • ROW_NUMBER()

    Assigns a unique, sequential integer to each row starting at 1. Doesn't account for ties.

  • RANK()

    Assigns the same rank to ties, but skips the next rank number (e.g., 1, 2, 2, 4).

  • DENSE_RANK()

    Assigns the same rank to ties, but does NOT skip the next rank number (e.g., 1, 2, 2, 3).

-- Goal: Find the highest paid employee in each department.
SELECT
    employee_name,
    salary,
    department,
    DENSE_RANK() OVER (
        PARTITION BY department  -- Group employees by department
        ORDER BY salary DESC     -- Rank them by salary, descending
    ) AS rank_in_dept
FROM
    employees;
-- Filter the result: WHERE rank_in_dept = 1
                    

Type 2: Analytic Functions (LAG/LEAD)

These help compare values across time or sequence, common in time-series analysis.

LAG(column, offset, default)

Retrieves the value of a column from a previous row in the partition (offset defaults to 1).

LEAD(column, offset, default)

Retrieves the value of a column from a subsequent row in the partition (offset defaults to 1).

-- Goal: Calculate the month-over-month sales difference
SELECT
    sale_date,
    revenue,
    LAG(revenue, 1, 0) OVER (ORDER BY sale_date) AS previous_month_revenue,
    revenue - LAG(revenue, 1, 0) OVER (ORDER BY sale_date) AS revenue_change
FROM
    monthly_sales;
            

Type 3: Aggregate Functions with Windows

Running Totals (The Frame)

By adding the `ROWS` clause, you can define a *moving frame* for aggregates. The common use case is calculating a running total or a moving average.

SELECT
    transaction_date,
    amount,
    -- Calculates the sum of all 'amount' rows up to the current row, ordered by date
    SUM(amount) OVER (
        PARTITION BY customer_id
        ORDER BY transaction_date
        ROWS UNBOUNDED PRECEDING  -- Start from the first row in the partition
        -- AND CURRENT ROW        -- End at the current row (This is the default, often omitted)
    ) AS running_total
FROM
    transactions;
                

Start slicing and comparing your data.

Window functions are often solved by subqueries, but the window function syntax is cleaner and much faster.