What does EXPLAIN do in DB2?

EXPLAIN is a DB2 command that causes the optimizer to analyse a SQL statement and populate the PLAN_TABLE (or EXPLAIN tables) with information about how DB2 plans to execute it — which indexes it will use, the join order and method, estimated cardinalities, and whether sorting is required. EXPLAIN does not execute the query; it shows the planned execution path. Use it to diagnose why a query is slow and to verify that optimization changes take effect.

What is the PLAN_TABLE in DB2?

PLAN_TABLE is the output table for EXPLAIN on DB2 for z/OS. Each row represents one step in the query execution plan. Key columns include ACCESSTYPE (how the table is accessed), MATCHCOLS (how many index columns are matched), ACCESSCREATOR and ACCESSNAME (which index is used), PREFETCH (whether sequential or list prefetch is used), and SORTN_JOIN/SORTU_JOIN (whether sorts are needed). On DB2 LUW, the equivalent tables are the EXPLAIN_ tables in the current schema.

What does ACCESSTYPE 'I' vs 'R' mean in DB2 EXPLAIN?

ACCESSTYPE in the PLAN_TABLE indicates how DB2 accesses the table: I = Index scan (DB2 uses an index to find qualifying rows — generally fast); R = Tablespace scan (DB2 reads every page in the tablespace — slow for large tables); I1 = One-fetch index scan (single row lookup via unique index — very fast); N = Index scan returning non-qualifying rows (followed by a data page access). Index access is almost always preferred over tablespace scans for selective queries.

How do I force DB2 to use a specific index?

DB2 does not have a direct hint syntax like Oracle. The recommended approach is to ensure statistics are current (RUNSTATS) and the index has appropriate statistics. You can influence the optimizer using OPTIMIZE FOR n ROWS (tells DB2 to optimise for fetching the first n rows), FETCH FIRST n ROWS ONLY (which implies OPTIMIZE FOR n ROWS), or by restructuring the query to make the desired index more attractive. On DB2 LUW you can use SELECTIVITY hints in some versions.

DB2 EXPLAIN and Query Optimization

Query performance is one of the most critical skills for any DB2 developer or DBA. A slow query in a CICS transaction can bring an entire system to its knees; a missing index on a high-volume batch program can turn a 2-minute job into a 2-hour one. DB2's EXPLAIN facility exposes exactly how the optimizer plans to execute your SQL — and understanding it is the foundation of all performance tuning.

How the DB2 Optimizer Works

The DB2 optimizer is a cost-based optimizer. When you submit a SQL statement, the optimizer:

Parses and validates the SQL syntax and semantics
Enumerates possible access plans (which indexes to use, join orders, join methods)
Estimates the cost of each plan using catalog statistics (row counts, column cardinalities, index clustering)
Selects the plan with the lowest estimated cost
Stores the plan in a package or plan (at BIND time) or compiles it dynamically

The quality of the optimizer's decisions depends entirely on the quality of the statistics. Outdated statistics lead to poor estimates, which lead to poor access plans and slow queries.

Setting Up EXPLAIN on DB2 LUW

On DB2 LUW, create the EXPLAIN tables in your schema:

sql

-- Create all explain tables in your current schema
CALL SYSPROC.SYSINSTALLOBJECTS('EXPLAIN', 'C', NULL, CURRENT SCHEMA);

Now run EXPLAIN:

sql

EXPLAIN PLAN FOR
SELECT E.EMPNO, E.LASTNAME, D.DEPTNAME
FROM   EMPLOYEE E
JOIN   DEPARTMENT D ON E.WORKDEPT = D.DEPTNO
WHERE  E.SALARY > 70000
ORDER BY E.SALARY DESC;

-- View the plan
SELECT OPERATOR_TYPE, OBJECT_SCHEMA, OBJECT_NAME, ESTIMATED_CARDINALITY, IO_COST
FROM   EXPLAIN_OPERATOR
ORDER BY OPERATOR_ID;

Setting Up EXPLAIN on DB2 for z/OS

On z/OS, PLAN_TABLE must exist in your schema before running EXPLAIN:

sql

CREATE TABLE PLAN_TABLE (
    QUERYNO         INTEGER NOT NULL,
    QBLOCKNO        SMALLINT NOT NULL,
    APPLNAME        CHAR(8),
    PROGNAME        CHAR(8),
    PLANNO          SMALLINT NOT NULL,
    METHOD          SMALLINT NOT NULL,
    CREATOR         CHAR(8),
    TNAME           CHAR(18),
    TABNO           SMALLINT NOT NULL,
    ACCESSTYPE      CHAR(2),
    MATCHCOLS       SMALLINT NOT NULL,
    ACCESSCREATOR   CHAR(8),
    ACCESSNAME      CHAR(18),
    PREFETCH        CHAR(1),
    COLUMN_FN_EVAL  CHAR(1),
    MIXOPSEQ        SMALLINT NOT NULL,
    -- ... additional columns
    PRIMARY KEY (QUERYNO, QBLOCKNO, PLANNO, MIXOPSEQ)
);

Then run:

sql

EXPLAIN PLAN SET QUERYNO = 1 FOR
SELECT EMPNO, LASTNAME, SALARY
FROM   EMPLOYEE
WHERE  WORKDEPT = 'A00'
ORDER BY SALARY DESC;

SELECT TNAME, ACCESSTYPE, MATCHCOLS, ACCESSNAME, PREFETCH
FROM   PLAN_TABLE
WHERE  QUERYNO = 1
ORDER BY PLANNO;

Reading the PLAN_TABLE

ACCESSTYPE — How DB2 Reaches the Data

ACCESSTYPE	Meaning	Performance
I	Index scan	Good — selective queries
I1	One-fetch index scan (unique key)	Excellent
N	Index scan, non-matching rows filtered	OK
R	Tablespace scan (full scan)	Poor for large tables
M	Multiple index access (RID list)	Situational

MATCHCOLS — Index Match Depth

MATCHCOLS tells you how many leading columns of the index are being used to filter data:

MATCHCOLS = 0: The index is used but only for ordering, not filtering
MATCHCOLS = 1: The first index column matches a predicate
MATCHCOLS = n: The first n columns match predicates

Higher MATCHCOLS generally means fewer rows examined.

PREFETCH — I/O Strategy

PREFETCH	Meaning
S	Sequential prefetch (reads ahead in large blocks)
L	List prefetch (collects RIDs first, then fetches in order)
D	Dynamic prefetch
(blank)	No prefetch

Common Performance Problems and Fixes

Problem 1: Tablespace Scan on a Large Table

sql

-- Slow: full tablespace scan
SELECT * FROM EMPLOYEE WHERE SALARY > 70000;

EXPLAIN shows ACCESSTYPE = 'R'. Fix: create an index on SALARY:

sql

CREATE INDEX EMP_SALARY_IDX ON EMPLOYEE (SALARY);

After creating the index and running RUNSTATS, re-EXPLAIN to verify ACCESSTYPE changes to 'I'.

Problem 2: Index Exists but Not Used

If EXPLAIN still shows a tablespace scan after adding an index, possible reasons:

Query is not selective enough: If 40% of rows qualify, a tablespace scan is faster than an index + data page fetch
Statistics are outdated: Run RUNSTATS to refresh
Implicit type conversion: WHERE EMPNO = 10 when EMPNO is CHAR(6) — fix with WHERE EMPNO = '000010'
Function on indexed column: WHERE YEAR(HIREDATE) = 2020 prevents index use — rewrite as WHERE HIREDATE BETWEEN '2020-01-01' AND '2020-12-31'

Problem 3: Sort Required for ORDER BY

When SORTN_JOIN or SORTU_UNIQ is 'Y' in PLAN_TABLE, DB2 is performing an explicit sort. This is expensive for large result sets. Fix: create an index that matches both the filter and the ORDER BY:

sql

-- Query with both filter and sort
SELECT EMPNO, LASTNAME FROM EMPLOYEE
WHERE WORKDEPT = 'A00'
ORDER BY SALARY DESC;

-- Index that avoids the sort (leading column = filter, trailing = sort order)
CREATE INDEX EMP_DEPT_SAL_IDX ON EMPLOYEE (WORKDEPT, SALARY DESC);

Problem 4: Cartesian Product (Missing JOIN Condition)

sql

-- Missing JOIN condition produces a Cartesian product
SELECT E.LASTNAME, D.DEPTNAME
FROM   EMPLOYEE E, DEPARTMENT D;  -- missing WHERE E.WORKDEPT = D.DEPTNO

EXPLAIN shows METHOD = 4 (nested loop join with no matching columns). This returns every combination of rows — exponentially more data than intended.

OPTIMIZE FOR n ROWS

Tell DB2 to optimise for fetching the first n rows quickly rather than minimising overall cost:

sql

-- Optimise for interactive display of top 25 results
SELECT EMPNO, LASTNAME, SALARY
FROM   EMPLOYEE
ORDER BY SALARY DESC
OPTIMIZE FOR 25 ROWS;

This is especially useful in CICS programs where you display the first page of results and users rarely scroll further.

Visual Explain (DB2 LUW)

IBM Data Studio and Db2 Data Management Console provide a Visual Explain feature that renders the access plan as a graphical tree. Each node shows the operation type, estimated cardinality, and cost. This is often faster to interpret than reading the flat PLAN_TABLE output.

To use Visual Explain in IBM Data Studio:

Right-click on a SQL statement in the SQL Editor
Select Run Explain
Switch to the Explain Plan tab to see the graphical view

Explain for Dynamic SQL (RUNSTATS-Aware)

After running RUNSTATS, dynamic SQL queries will use updated statistics on the next execution. For static SQL (embedded in COBOL packages), you must REBIND the package:

jcl

//REBIND   EXEC PGM=IKJEFT01
//SYSTSIN  DD *
  DSN SYSTEM(DSN1)
  REBIND PACKAGE(EMPPKG.EMPRPT(*)) ACTION(REPLACE)
  END
/*

DB2 Accounting and Statistics Traces

For production performance problems, EXPLAIN only shows the planned path. The actual runtime behaviour is captured in DB2 traces:

Accounting Trace (Class 1): Per-transaction summary — DB2 CPU, elapsed time, lock waits, I/O counts.

Accounting Trace (Class 3): Includes wait time broken down by lock waits, I/O waits, and DB2 internal waits.

Statistics Trace: System-level — buffer pool hit rates, total lock waits, log I/O.

On z/OS, these traces write to SMF records processed by IBM Omegamon, OMPE (DB2 Performance Expert), or Query Monitor. On DB2 LUW, use the db2pd command or the activity monitor.

Quick Performance Checklist

Before escalating a slow query, work through this checklist:

Run EXPLAIN and check for tablespace scans on large tables
Run RUNSTATS if statistics are more than a week old for volatile tables
Check that predicates on indexed columns use the correct data type (no implicit conversion)
Verify that functions on indexed columns are not preventing index use
Check MATCHCOLS — is the leading index column being used?
For ORDER BY, check if a composite index can eliminate the sort
For joins, check that JOIN conditions match indexed columns
For z/OS packages, run REBIND after RUNSTATS to apply new statistics

Next Steps

Query optimisation depends heavily on indexes. Read the DB2 Indexes Guide for a complete treatment of index types, creation strategies, and when indexes hurt rather than help. For the full learning path, visit the DB2 Mastery course hub.

Frequently Asked Questions

Q: How do you capture and read a DB2 EXPLAIN output? First create the EXPLAIN tables by running EXPLAIN.DDL (LUW) or the z/OS equivalent DSNTESC. Then prefix your query with EXPLAIN ALL FOR or use SET CURRENT EXPLAIN MODE = YES to auto-explain all queries. DB2 populates EXPLAIN_STATEMENT, EXPLAIN_OPERATOR, and related tables. Use db2exfmt on LUW or the Visual Explain tool in IBM Data Studio to format the output into a readable access plan tree.

Q: What access methods appear in a DB2 EXPLAIN and what do they mean? Common access methods include: TBSCAN (table space scan — reads all rows, usually undesirable for large tables), IXSCAN (index scan — uses an index to locate rows), IXAND/IXOR (multiple index combinations), NLJOIN/MSJOIN/HSJOIN (nested-loop, merge-scan, and hash join algorithms), and SORT (explicit sort required for ORDER BY or merge join). A TBSCAN on a large table is the most common sign that an index is missing or not being chosen.

Q: What does a high TOTALCOST value in EXPLAIN mean and how do you reduce it? TOTALCOST is the optimiser's estimated cost in timerons (an internal unit). A high value indicates the optimiser predicts expensive I/O or CPU. Reduce it by: ensuring RUNSTATS is current (stale stats mislead the optimiser), adding or extending indexes to cover the query's predicates, rewriting correlated subqueries as joins, and avoiding functions on indexed columns in WHERE predicates (which prevent index usage).

DB2 EXPLAIN and Query Optimization: Complete Performance Guide