What are window functions in DB2 and how are they different from GROUP BY?

Window functions in DB2 compute a value for each row based on a related set of rows (the 'window') without collapsing those rows into a single output row. GROUP BY collapses many rows into one aggregate row per group, discarding individual row detail. Window functions preserve every input row and attach the aggregated or ranked value alongside it. This makes them ideal for running totals, rankings, and row-by-row comparisons where you still need access to the individual row's other columns.

Does DB2 for z/OS support all the same window functions as DB2 LUW?

DB2 for z/OS (version 9 and later) and DB2 LUW both support the core OLAP window functions: ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, FIRST_VALUE, LAST_VALUE, and aggregate functions with OVER. There are minor syntax differences and some frame-specification behaviours differ between versions. DB2 for z/OS introduced OLAP specifications (window functions) in version 8, with full frame clause support from version 9. Always check your specific version's SQL Reference for exact capabilities.

What is PARTITION BY in DB2 window functions?

PARTITION BY divides the rows into independent groups (partitions) before the window function is applied. The function is calculated separately within each partition and restarts for each new partition. For example, ROW_NUMBER() OVER (PARTITION BY WORKDEPT ORDER BY SALARY DESC) assigns sequence numbers 1, 2, 3, ... independently within each department. Without PARTITION BY, the function treats the entire result set as a single partition.

What is the difference between RANK and DENSE_RANK in DB2?

Both RANK and DENSE_RANK assign a rank number based on ORDER BY values within a window partition. The difference is in how they handle ties. RANK skips rank numbers after a tie: if two rows share rank 2, the next row gets rank 4 (not 3). DENSE_RANK never skips: after two rows at rank 2, the next row gets rank 3. Use DENSE_RANK when you need a continuous sequence with no gaps, for example when selecting the Nth highest value from a group.

DB2 Window Functions: ROW_NUMBER, RANK, LAG, LEAD Complete Guide

DB2 window functions (also called OLAP functions or analytic functions) are one of the most powerful features of modern SQL. Available in both DB2 for z/OS and DB2 LUW, they allow you to compute rankings, running totals, moving averages, and row-to-row comparisons without resorting to self-joins, correlated subqueries, or procedural cursor loops. Once you understand the OVER clause, you will rewrite dozens of complex queries in far fewer lines.

This guide covers every major DB2 window function with working SQL examples, explains how PARTITION BY, ORDER BY, and frame specifications interact, and shows practical use cases including sales rankings, year-over-year analysis, and gap detection.

What Are Window Functions?

A window function performs a calculation over a set of table rows that are related to the current row. The set of related rows is called the window, and it is defined by the OVER clause that follows every window function call.

The critical distinction from GROUP BY aggregation is that window functions do not collapse rows. Every input row remains in the output, and the window function result appears as an additional column alongside all the original columns.

sql

-- GROUP BY: collapses to one row per department
SELECT WORKDEPT, AVG(SALARY) AS AVG_SAL
FROM   DSN8C10.EMP
GROUP BY WORKDEPT;

-- Window function: keeps every row, adds the department average alongside
SELECT EMPNO,
       LASTNAME,
       WORKDEPT,
       SALARY,
       AVG(SALARY) OVER (PARTITION BY WORKDEPT) AS DEPT_AVG_SAL
FROM   DSN8C10.EMP;

The second query returns one row per employee, with DEPT_AVG_SAL showing the average for that employee's department — something impossible to express cleanly with GROUP BY alone.

The OVER Clause: PARTITION BY and ORDER BY

The OVER clause defines the window (the set of rows) and the ordering within that window. It has three optional components:

text

function_name(...) OVER (
    [PARTITION BY partition_expression, ...]
    [ORDER BY sort_expression [ASC | DESC], ...]
    [frame_clause]
)

PARTITION BY

PARTITION BY divides the result set into independent partitions. The window function resets and is computed independently for each partition. It is analogous to GROUP BY, but without collapsing rows.

If PARTITION BY is omitted, the entire result set is treated as one partition.

ORDER BY Inside OVER

ORDER BY inside the OVER clause determines the logical ordering within each partition for functions that are order-sensitive (ranking functions, LAG/LEAD, running totals). This ORDER BY is completely separate from the ORDER BY at the end of the full SELECT statement.

sql

-- Rank employees by salary within each department
SELECT EMPNO,
       LASTNAME,
       WORKDEPT,
       SALARY,
       RANK() OVER (PARTITION BY WORKDEPT
                    ORDER BY SALARY DESC) AS SAL_RANK
FROM   DSN8C10.EMP
ORDER BY WORKDEPT, SAL_RANK;

Frame Specification: ROWS BETWEEN

For aggregate window functions (SUM, AVG, MAX, MIN, COUNT) used with ORDER BY in the OVER clause, DB2 applies a frame that determines which rows within the partition are included in each calculation. The frame clause uses the syntax:

text

ROWS BETWEEN <start> AND <end>

Common frame boundaries:

Boundary	Meaning
`UNBOUNDED PRECEDING`	First row of the partition
`N PRECEDING`	N rows before the current row
`CURRENT ROW`	The current row
`N FOLLOWING`	N rows after the current row
`UNBOUNDED FOLLOWING`	Last row of the partition

sql

-- Running total: accumulate salary from the first row through the current row
SELECT EMPNO,
       LASTNAME,
       WORKDEPT,
       SALARY,
       SUM(SALARY) OVER (PARTITION BY WORKDEPT
                         ORDER BY HIREDATE
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           AS RUNNING_TOTAL
FROM   DSN8C10.EMP
ORDER BY WORKDEPT, HIREDATE;

-- 3-row moving average (current row + 1 before + 1 after)
SELECT EMPNO,
       SALARY,
       AVG(SALARY) OVER (PARTITION BY WORKDEPT
                         ORDER BY HIREDATE
                         ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
           AS MOVING_AVG_3
FROM   DSN8C10.EMP
ORDER BY WORKDEPT, HIREDATE;

When ORDER BY is present in the OVER clause but no frame clause is specified, DB2 defaults to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW for most aggregate functions — which computes a running total or running aggregate. Always specify the frame clause explicitly in production code to avoid ambiguity.

ROW_NUMBER()

ROW_NUMBER() assigns a unique sequential integer to each row within a partition, starting from 1. No ties are possible — even if two rows have identical ORDER BY values, they receive different row numbers (the ordering between them is arbitrary in that case).

sql

-- Assign a sequence number to each employee within their department,
-- ordered by descending salary
SELECT EMPNO,
       LASTNAME,
       WORKDEPT,
       SALARY,
       ROW_NUMBER() OVER (PARTITION BY WORKDEPT
                          ORDER BY SALARY DESC) AS SEQ_NUM
FROM   DSN8C10.EMP
ORDER BY WORKDEPT, SEQ_NUM;

Selecting the Top N Rows per Group

ROW_NUMBER() is the standard way to select the top N rows per group in DB2 (because DB2 does not support LIMIT in z/OS SQL or the FETCH FIRST clause inside subqueries in older versions).

sql

-- Select the top 2 earners per department
SELECT EMPNO, LASTNAME, WORKDEPT, SALARY
FROM  (
          SELECT EMPNO,
                 LASTNAME,
                 WORKDEPT,
                 SALARY,
                 ROW_NUMBER() OVER (PARTITION BY WORKDEPT
                                    ORDER BY SALARY DESC) AS RN
          FROM   DSN8C10.EMP
      ) AS RANKED
WHERE RN <= 2
ORDER BY WORKDEPT, SALARY DESC;

RANK() and DENSE_RANK()

RANK()

RANK() assigns a rank to each row, with ties receiving the same rank. After a tie of N rows at rank K, the next rank assigned is K+N (ranks are skipped).

DENSE_RANK()

DENSE_RANK() also assigns the same rank to ties, but does not skip rank numbers — the next rank after a tie is always the previous rank plus 1.

sql

-- Compare RANK vs DENSE_RANK for salary ranking
SELECT EMPNO,
       LASTNAME,
       WORKDEPT,
       SALARY,
       RANK()       OVER (PARTITION BY WORKDEPT ORDER BY SALARY DESC) AS RNKK,
       DENSE_RANK() OVER (PARTITION BY WORKDEPT ORDER BY SALARY DESC) AS DENSE_RNK
FROM   DSN8C10.EMP
ORDER BY WORKDEPT, SALARY DESC;

Example output for one department:

LASTNAME	SALARY	RNKK	DENSE_RNK
HAAS	52750	1	1
THOMPSON	46500	2	2
JONES	46500	2	2
STERN	32250	4	3

With RANK, position 3 is skipped because two people share rank 2. With DENSE_RANK, the next rank after the tie is 3.

NTILE()

NTILE(n) divides the rows in each partition into n approximately equal buckets and assigns a bucket number (1 through n) to each row. It is useful for percentile calculations and quartile analysis.

sql

-- Divide employees within each department into salary quartiles
SELECT EMPNO,
       LASTNAME,
       WORKDEPT,
       SALARY,
       NTILE(4) OVER (PARTITION BY WORKDEPT
                      ORDER BY SALARY ASC) AS SALARY_QUARTILE
FROM   DSN8C10.EMP
ORDER BY WORKDEPT, SALARY_QUARTILE, SALARY;

Rows are distributed as evenly as possible; if the partition size is not evenly divisible by n, the first partitions receive one extra row.

LAG() and LEAD()

LAG() and LEAD() are among the most practically useful window functions. They allow you to access data from a previous row (LAG) or a following row (LEAD) within the same partition, without requiring a self-join.

Syntax

sql

LAG(expression  [, offset [, default]]) OVER (partition_and_order)
LEAD(expression [, offset [, default]]) OVER (partition_and_order)

offset: how many rows back (LAG) or forward (LEAD) to look. Defaults to 1.
default: value to return when there is no such row (e.g., the first row has no previous row). Defaults to NULL.

Year-over-Year Sales Comparison

sql

-- Compare this year's monthly sales to the same month last year
SELECT SALE_YEAR,
       SALE_MONTH,
       MONTHLY_AMOUNT,
       LAG(MONTHLY_AMOUNT, 12, 0)
           OVER (ORDER BY SALE_YEAR, SALE_MONTH) AS SAME_MONTH_LAST_YEAR,
       MONTHLY_AMOUNT -
           LAG(MONTHLY_AMOUNT, 12, 0)
               OVER (ORDER BY SALE_YEAR, SALE_MONTH) AS YOY_CHANGE
FROM   MONTHLY_SALES
ORDER BY SALE_YEAR, SALE_MONTH;

Row-to-Row Difference: Salary Changes

sql

-- Show salary change between consecutive hires in the same department
SELECT EMPNO,
       LASTNAME,
       WORKDEPT,
       HIREDATE,
       SALARY,
       LAG(SALARY) OVER (PARTITION BY WORKDEPT
                         ORDER BY HIREDATE)       AS PREV_HIRE_SALARY,
       SALARY - LAG(SALARY, 1, SALARY)
                    OVER (PARTITION BY WORKDEPT
                          ORDER BY HIREDATE)      AS SAL_DIFF
FROM   DSN8C10.EMP
ORDER BY WORKDEPT, HIREDATE;

Gap Analysis with LEAD

sql

-- Find gaps in order numbers (missing order IDs)
SELECT ORDER_ID,
       LEAD(ORDER_ID) OVER (ORDER BY ORDER_ID) AS NEXT_ORDER_ID,
       LEAD(ORDER_ID) OVER (ORDER BY ORDER_ID) - ORDER_ID - 1 AS GAP_SIZE
FROM   ORDERS
WHERE  LEAD(ORDER_ID) OVER (ORDER BY ORDER_ID) - ORDER_ID > 1;

Note: In DB2, you cannot use a window function directly in a WHERE clause. Wrap it in a derived table or CTE:

sql

-- Correct approach: wrap in a derived table
SELECT ORDER_ID, NEXT_ORDER_ID, GAP_SIZE
FROM  (
          SELECT ORDER_ID,
                 LEAD(ORDER_ID) OVER (ORDER BY ORDER_ID) AS NEXT_ORDER_ID,
                 LEAD(ORDER_ID) OVER (ORDER BY ORDER_ID) - ORDER_ID - 1
                     AS GAP_SIZE
          FROM   ORDERS
      ) AS GAPS
WHERE GAP_SIZE > 0
ORDER BY ORDER_ID;

FIRST_VALUE() and LAST_VALUE()

FIRST_VALUE() returns the value of an expression evaluated at the first row of the window frame. LAST_VALUE() returns the value from the last row of the frame.

sql

-- For each employee, show the highest and lowest salary in their department
SELECT EMPNO,
       LASTNAME,
       WORKDEPT,
       SALARY,
       FIRST_VALUE(SALARY) OVER (PARTITION BY WORKDEPT
                                  ORDER BY SALARY DESC
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING) AS DEPT_MAX_SAL,
       LAST_VALUE(SALARY)  OVER (PARTITION BY WORKDEPT
                                  ORDER BY SALARY DESC
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING) AS DEPT_MIN_SAL
FROM   DSN8C10.EMP
ORDER BY WORKDEPT, SALARY DESC;

The ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING frame is important for LAST_VALUE. Without it, the default frame would only include rows up to the current row, making LAST_VALUE return the current row's own value — rarely what you want.

Running Totals and Cumulative Aggregates

Running totals are a classic use case for window functions. Using SUM() OVER with the appropriate frame clause is far more efficient than a correlated subquery for every row.

sql

-- Cumulative salary spend by hire date within each department
SELECT EMPNO,
       LASTNAME,
       WORKDEPT,
       HIREDATE,
       SALARY,
       SUM(SALARY) OVER (PARTITION BY WORKDEPT
                         ORDER BY HIREDATE
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           AS CUMULATIVE_SALARY,
       COUNT(*) OVER (PARTITION BY WORKDEPT
                      ORDER BY HIREDATE
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           AS CUMULATIVE_HEADCOUNT
FROM   DSN8C10.EMP
ORDER BY WORKDEPT, HIREDATE;

Moving Average

sql

-- 5-period moving average of monthly sales
SELECT SALE_MONTH,
       MONTHLY_AMOUNT,
       AVG(MONTHLY_AMOUNT)
           OVER (ORDER BY SALE_MONTH
                 ROWS BETWEEN 4 PRECEDING AND CURRENT ROW)
           AS MOVING_AVG_5
FROM   MONTHLY_SALES
ORDER BY SALE_MONTH;

Window Functions vs GROUP BY

Scenario	Use GROUP BY	Use Window Function
Total per group, no row detail needed	Yes	No
Total per group AND individual row detail	No	Yes
Ranking within a group	No	Yes
Access previous/next row value	No	Yes
Running or cumulative aggregate	No	Yes
Percentage of group total	No	Yes
Multiple different aggregations in one pass	Possible	Yes, cleaner

A common pattern is combining both in the same query — GROUP BY to produce a summary table (or CTE), then window functions on the expanded detail.

Practical Use Case: Sales Rankings and Performance Tiers

sql

WITH SALES_RANKED AS (
    SELECT SALESPERSON_ID,
           REGION,
           SALE_YEAR,
           TOTAL_SALES,
           RANK()       OVER (PARTITION BY REGION, SALE_YEAR
                              ORDER BY TOTAL_SALES DESC)  AS REGION_RANK,
           DENSE_RANK() OVER (ORDER BY TOTAL_SALES DESC)  AS OVERALL_RANK,
           NTILE(4)     OVER (PARTITION BY SALE_YEAR
                              ORDER BY TOTAL_SALES DESC)  AS PERFORMANCE_TIER,
           SUM(TOTAL_SALES) OVER (PARTITION BY REGION, SALE_YEAR)
                                                           AS REGION_TOTAL,
           DECIMAL(TOTAL_SALES /
               SUM(TOTAL_SALES) OVER (PARTITION BY REGION, SALE_YEAR)
               * 100, 5, 2)                                AS PCT_OF_REGION
    FROM   SALES_SUMMARY
    WHERE  SALE_YEAR IN (2025, 2026)
)
SELECT SALESPERSON_ID,
       REGION,
       SALE_YEAR,
       TOTAL_SALES,
       REGION_RANK,
       OVERALL_RANK,
       CASE PERFORMANCE_TIER
           WHEN 1 THEN 'TOP 25%'
           WHEN 2 THEN 'UPPER MIDDLE'
           WHEN 3 THEN 'LOWER MIDDLE'
           WHEN 4 THEN 'BOTTOM 25%'
       END AS TIER_LABEL,
       PCT_OF_REGION
FROM   SALES_RANKED
WHERE  REGION_RANK <= 5
ORDER BY REGION, SALE_YEAR, REGION_RANK;

Performance Considerations

Window Functions Are a Single Pass

A key performance advantage of window functions over equivalent correlated subqueries or self-joins is that DB2 typically processes the data in a single pass. Correlated subqueries for running totals re-scan the table for every row; window functions scan once, sort if necessary, and compute the window aggregate incrementally.

Sorting Cost

Window functions that use ORDER BY in the OVER clause may require sorting. If the underlying data is already sorted (due to an index or a prior sort in the query plan), DB2 can avoid an additional sort step. Include suitable indexes on PARTITION BY and ORDER BY columns to reduce sort overhead.

Multiple Window Functions with Different Partitions

If a query contains several window functions with different PARTITION BY or ORDER BY expressions, DB2 may need to sort the data multiple times. In this case, it can be worth separating the computations into CTEs to give the optimizer more flexibility.

sql

-- Potentially expensive: two different PARTITION BY in one SELECT
SELECT EMPNO,
       SALARY,
       AVG(SALARY) OVER (PARTITION BY WORKDEPT) AS DEPT_AVG,
       AVG(SALARY) OVER (PARTITION BY JOB)      AS JOB_AVG
FROM   DSN8C10.EMP;

Check the EXPLAIN output to see if DB2 performs two sort operations. If performance is critical, separate into CTEs or intermediate result sets.

Use EXPLAIN

Always explain queries with window functions on large tables:

sql

EXPLAIN PLAN SET QUERYNO = 200 FOR
    SELECT EMPNO, SALARY,
           ROW_NUMBER() OVER (PARTITION BY WORKDEPT ORDER BY SALARY DESC) AS RN
    FROM   DSN8C10.EMP;

SELECT PLANNO, TNAME, ACCESSTYPE, SORTC_PREF
FROM   PLAN_TABLE
WHERE  QUERYNO = 200
ORDER BY PLANNO;

Summary

DB2 window functions extend SQL with powerful row-by-row analytical capabilities:

The OVER clause defines the window — use PARTITION BY to create groups and ORDER BY to set the processing order within each group.
Frame clauses (ROWS BETWEEN) control which rows are included in aggregate calculations.
ROW_NUMBER() assigns unique row numbers — ideal for top-N-per-group queries.
RANK() and DENSE_RANK() handle ties differently; DENSE_RANK has no gaps in numbering.
NTILE(n) divides rows into n equal buckets for percentile analysis.
LAG() and LEAD() access previous and next rows without self-joins — perfect for YOY comparisons and gap analysis.
FIRST_VALUE() and LAST_VALUE() return boundary values within the frame.
SUM() OVER and AVG() OVER with frame clauses produce running totals and moving averages in a single table pass.
Window functions cannot be used in WHERE or HAVING clauses — wrap them in a derived table or CTE.

Frequently Asked Questions

Q: What are window functions in DB2 and how do they differ from GROUP BY aggregates? Window functions compute an aggregate or ranking across a sliding "window" of rows defined by the OVER() clause — without collapsing the result set. GROUP BY reduces multiple rows to one row per group; a window function adds a computed column to every row in the result while still returning all rows. This makes window functions ideal for running totals, rankings, and moving averages where you need both the individual row and the aggregate.

Q: What window functions does DB2 support? DB2 supports ranking functions (ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()), aggregate window functions (SUM(), AVG(), COUNT(), MIN(), MAX() with OVER()), and the LAG()/LEAD() navigation functions for accessing prior and next row values. The OVER() clause accepts PARTITION BY (groups rows), ORDER BY (defines row sequence), and a ROWS/RANGE frame clause (defines the sliding window boundary).

Q: How do you compute a running total in DB2 using window functions? Use SUM(amount) OVER (PARTITION BY account_id ORDER BY transaction_date ROWS UNBOUNDED PRECEDING). The ROWS UNBOUNDED PRECEDING frame includes all rows from the start of the partition up to the current row, producing a cumulative sum. Without the frame clause, ORDER BY inside OVER implies a range frame that can produce unexpected results for duplicate dates, so always specify ROWS explicitly for running totals.