DB2 JOINs Explained: INNER, LEFT, RIGHT, FULL OUTER with Examples

Introduction: Understanding DB2 JOINs
The DB2 JOIN is the fundamental operation that makes relational databases relational. Instead of storing all related data in one massive flat table, DB2 distributes data across normalised tables — EMPLOYEE, DEPARTMENT, PROJECT — and uses JOINs to bring it back together at query time. Understanding how each JOIN type behaves is not just an academic exercise: choosing the wrong JOIN type produces incorrect results without any error message, which makes JOIN semantics one of the most important concepts in DB2 SQL.
This guide covers every major JOIN type available in DB2: INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, CROSS JOIN, and self JOIN. It also covers joining three or more tables, JOIN with aggregation, JOIN versus subquery trade-offs, and the most common mistakes developers make. All examples use the IBM sample database tables — EMPLOYEE, DEPARTMENT, and PROJECT — which map to real-world data structures you encounter on z/OS and LUW systems.
The Two Table Syntaxes: ANSI JOIN vs Comma Syntax
Before diving into join types, it is worth clarifying the two syntax styles you will encounter in DB2 codebases.
The older comma syntax places all tables in the FROM clause separated by commas, and the join condition goes in the WHERE clause:
-- Old comma syntax (avoid in new code)
SELECT E.LASTNAME, D.DEPTNAME
FROM EMP.EMPLOYEE E, EMP.DEPARTMENT D
WHERE E.WORKDEPT = D.DEPTNO;The modern ANSI JOIN syntax (ISO SQL standard, supported by all DB2 versions) uses explicit JOIN keywords and an ON clause:
-- Modern ANSI syntax (preferred)
SELECT E.LASTNAME, D.DEPTNAME
FROM EMP.EMPLOYEE E
JOIN EMP.DEPARTMENT D
ON E.WORKDEPT = D.DEPTNO;The comma syntax is equivalent to an INNER JOIN in terms of results. However, the ANSI syntax is superior because it clearly separates join conditions from filter conditions, it is required for outer joins (LEFT/RIGHT/FULL), and it is significantly easier to debug when queries grow complex. All examples in this guide use ANSI syntax.
INNER JOIN: The Default Join
An INNER JOIN returns only the rows where the join condition evaluates to TRUE in both tables. Rows from either table that have no matching row in the other table are completely excluded from the result.
-- List every employee alongside their department name
-- (employees with no matching department are excluded)
SELECT E.EMPNO,
E.FIRSTNME || ' ' || E.LASTNAME AS FULL_NAME,
E.WORKDEPT,
D.DEPTNAME,
D.MGRNO
FROM EMP.EMPLOYEE E
INNER JOIN EMP.DEPARTMENT D
ON E.WORKDEPT = D.DEPTNO
ORDER BY D.DEPTNAME, E.LASTNAME;The keyword INNER is optional — a bare JOIN in DB2 is an INNER JOIN by default. Including INNER explicitly makes the intent clear to future readers.
INNER JOIN is the right choice when you can guarantee that every row in the left table has a corresponding row in the right table — for example, when the left table has a foreign key column that is NOT NULL and references the right table's primary key.
LEFT OUTER JOIN: Keep All Left Rows
A LEFT OUTER JOIN (also written as LEFT JOIN) returns all rows from the left table, and the matching rows from the right table. When a left-table row has no match in the right table, the right table's columns appear as NULL in the result.
-- List ALL employees, including those not assigned to any department
SELECT E.EMPNO,
E.LASTNAME,
E.WORKDEPT,
D.DEPTNAME
FROM EMP.EMPLOYEE E
LEFT JOIN EMP.DEPARTMENT D
ON E.WORKDEPT = D.DEPTNO
ORDER BY E.LASTNAME;Employees whose WORKDEPT is NULL or does not match any DEPTNO in DEPARTMENT will still appear in this result, with NULL in the DEPTNAME column. This is the most common outer join in practice because many business queries need to report on all records in a primary entity (employees, orders, customers) while optionally enriching them with data from secondary tables.
You can use the NULL values produced by the outer join to find records that have no match — the "anti-join" pattern:
-- Find employees who are NOT assigned to any department
SELECT E.EMPNO,
E.LASTNAME,
E.WORKDEPT
FROM EMP.EMPLOYEE E
LEFT JOIN EMP.DEPARTMENT D
ON E.WORKDEPT = D.DEPTNO
WHERE D.DEPTNO IS NULL;This is generally more efficient than using NOT IN (SELECT DEPTNO FROM DEPARTMENT) because it avoids the subquery execution for each row and handles NULLs more predictably.
RIGHT OUTER JOIN: Keep All Right Rows
A RIGHT OUTER JOIN is the mirror image of a LEFT OUTER JOIN. It returns all rows from the right table and matching rows from the left table. Right-table rows with no left match appear with NULLs for the left table's columns.
-- List ALL departments, including those with no employees assigned
SELECT D.DEPTNO,
D.DEPTNAME,
E.EMPNO,
E.LASTNAME
FROM EMP.EMPLOYEE E
RIGHT JOIN EMP.DEPARTMENT D
ON E.WORKDEPT = D.DEPTNO
ORDER BY D.DEPTNAME, E.LASTNAME;In practice, RIGHT OUTER JOIN is far less common than LEFT OUTER JOIN, because you can always rewrite a RIGHT JOIN as a LEFT JOIN by swapping the table order. Most developers find queries easier to reason about when the "primary" or "driving" table is on the left. The two queries below produce identical results:
-- These two are equivalent:
SELECT D.DEPTNAME, E.LASTNAME
FROM EMP.EMPLOYEE E RIGHT JOIN EMP.DEPARTMENT D ON E.WORKDEPT = D.DEPTNO;
SELECT D.DEPTNAME, E.LASTNAME
FROM EMP.DEPARTMENT D LEFT JOIN EMP.EMPLOYEE E ON D.DEPTNO = E.WORKDEPT;FULL OUTER JOIN: Keep All Rows from Both Tables
A FULL OUTER JOIN (also written as FULL JOIN) returns all rows from both tables. Where a match exists on the join condition, the columns from both tables are populated. Where a row has no match, the columns from the other table appear as NULL.
-- Show all employees and all departments,
-- even if employees have no dept or depts have no employees
SELECT E.EMPNO,
E.LASTNAME,
E.WORKDEPT,
D.DEPTNO,
D.DEPTNAME
FROM EMP.EMPLOYEE E
FULL OUTER JOIN EMP.DEPARTMENT D
ON E.WORKDEPT = D.DEPTNO
ORDER BY D.DEPTNO NULLS LAST,
E.LASTNAME NULLS LAST;FULL OUTER JOIN is useful for reconciliation queries — finding records in Table A that are missing from Table B, and records in Table B that are missing from Table A, all in one pass:
-- Find unmatched rows on both sides using FULL OUTER JOIN
SELECT E.EMPNO AS ORPHAN_EMPLOYEE,
D.DEPTNO AS EMPTY_DEPARTMENT
FROM EMP.EMPLOYEE E
FULL OUTER JOIN EMP.DEPARTMENT D
ON E.WORKDEPT = D.DEPTNO
WHERE E.EMPNO IS NULL -- Department with no employees
OR D.DEPTNO IS NULL; -- Employee with no departmentCROSS JOIN: The Cartesian Product
A CROSS JOIN produces the cartesian product of two tables — every row from the left table is paired with every row from the right table. With 10 rows on the left and 5 rows on the right, the result contains 50 rows. CROSS JOIN has no ON clause.
-- Generate all possible employee-project combinations
-- (useful for scheduling or capacity planning matrices)
SELECT E.EMPNO,
E.LASTNAME,
P.PROJNO,
P.PROJNAME
FROM EMP.EMPLOYEE E
CROSS JOIN EMP.PROJECT P
ORDER BY E.LASTNAME, P.PROJNO;CROSS JOIN is legitimate for generating combination matrices, test data, or Cartesian lookups against small reference tables (like date or calendar dimension tables in reporting). However, accidentally performing a CROSS JOIN on two large tables — by omitting the ON clause or using the comma syntax without a WHERE condition — is one of the most destructive mistakes in SQL. A 100,000-row table CROSS JOINed to another 100,000-row table produces 10 billion rows.
Self JOIN: Querying a Table Against Itself
A self JOIN joins a table to itself by using two different aliases. The classic use case is querying hierarchical or self-referencing data — for example, finding each employee's manager, where both the employee and the manager are rows in the same EMPLOYEE table.
-- List each employee alongside their manager's name
SELECT E.EMPNO AS EMP_ID,
E.LASTNAME AS EMPLOYEE,
M.EMPNO AS MGR_ID,
M.LASTNAME AS MANAGER
FROM EMP.EMPLOYEE E
LEFT JOIN EMP.EMPLOYEE M
ON E.WORKDEPT = M.WORKDEPT
AND M.JOB = 'MANAGER'
ORDER BY E.LASTNAME;A LEFT JOIN is used here rather than INNER JOIN to preserve employees who have no manager record (e.g., the top-level executive whose own row has no manager above them in the same table).
Another self-join pattern compares rows within the same table — for example, finding pairs of employees in the same department who share the same job title:
-- Find pairs of employees with the same job in the same department
SELECT A.LASTNAME AS EMPLOYEE_A,
B.LASTNAME AS EMPLOYEE_B,
A.JOB,
A.WORKDEPT
FROM EMP.EMPLOYEE A
JOIN EMP.EMPLOYEE B
ON A.WORKDEPT = B.WORKDEPT
AND A.JOB = B.JOB
AND A.EMPNO < B.EMPNO -- Avoid duplicate pairs (A,B) and (B,A)
ORDER BY A.WORKDEPT, A.JOB;The A.EMPNO < B.EMPNO condition is a standard technique to avoid seeing the same pair twice in reversed order.
Joining Three or More Tables
Real-world DB2 queries frequently need to join three, four, or more tables. Each additional JOIN adds a clause that references the already-joined result set. DB2's optimizer is free to reorder joins internally for efficiency, though the logical result is the same.
-- List each employee, their department, and the project they work on
SELECT E.EMPNO,
E.LASTNAME,
D.DEPTNAME,
P.PROJNO,
P.PROJNAME,
PA.ACTNO,
PA.ACSTAFF
FROM EMP.EMPLOYEE E
JOIN EMP.DEPARTMENT D
ON E.WORKDEPT = D.DEPTNO
JOIN EMP.PROJACT PA
ON E.EMPNO = PA.EMPNO
JOIN EMP.PROJECT P
ON PA.PROJNO = P.PROJNO
WHERE P.DEPTNO = 'D01'
AND PA.ACSTAFF > 0.5
ORDER BY E.LASTNAME, P.PROJNO;When joining three or more tables, give every table a short alias immediately in the FROM clause and always prefix every column reference with its alias. This eliminates ambiguous column name errors and makes it immediately clear which table each column comes from.
Verify that every JOIN has an explicit ON clause. If you accidentally omit an ON clause for one of the joins, DB2 will either raise a syntax error or — in some forms — silently produce a cartesian product.
JOIN with Aggregation (GROUP BY)
JOINs are frequently combined with GROUP BY and aggregate functions to produce summary reports. The JOIN happens first (building the combined row set), and then GROUP BY collapses those rows into groups.
-- Count employees and total payroll per department
SELECT D.DEPTNO,
D.DEPTNAME,
COUNT(E.EMPNO) AS HEADCOUNT,
SUM(E.SALARY) AS TOTAL_SALARY,
AVG(E.SALARY) AS AVG_SALARY,
MAX(E.SALARY) AS MAX_SALARY
FROM EMP.DEPARTMENT D
LEFT JOIN EMP.EMPLOYEE E
ON D.DEPTNO = E.WORKDEPT
GROUP BY D.DEPTNO,
D.DEPTNAME
ORDER BY TOTAL_SALARY DESC;Using LEFT JOIN here ensures that departments with zero employees still appear in the output (with headcount 0 and NULL salary figures). With INNER JOIN, empty departments would be silently excluded — which might produce misleading reports.
The HAVING clause filters the grouped results, similar to how WHERE filters individual rows:
-- Only show departments with more than 3 employees and average salary over 60,000
SELECT D.DEPTNO,
D.DEPTNAME,
COUNT(E.EMPNO) AS HEADCOUNT,
AVG(E.SALARY) AS AVG_SALARY
FROM EMP.DEPARTMENT D
JOIN EMP.EMPLOYEE E
ON D.DEPTNO = E.WORKDEPT
GROUP BY D.DEPTNO,
D.DEPTNAME
HAVING COUNT(E.EMPNO) > 3
AND AVG(E.SALARY) > 60000.00
ORDER BY AVG_SALARY DESC;JOIN vs Subquery: When to Use Each
Both JOINs and subqueries can answer the same questions in many cases, and DB2's optimizer often transforms one into the other internally. As a practical guideline: use a JOIN when you need columns from multiple tables in the result set. Use a subquery when you only need to test a condition against another table without selecting its columns.
-- JOIN approach: shows manager name in output
SELECT E.EMPNO, E.LASTNAME, M.LASTNAME AS MGR_LASTNAME
FROM EMP.EMPLOYEE E
JOIN EMP.EMPLOYEE M ON E.WORKDEPT = M.WORKDEPT AND M.JOB = 'MANAGER';
-- Subquery approach: just checks if a manager exists, no output of manager name
SELECT E.EMPNO, E.LASTNAME
FROM EMP.EMPLOYEE E
WHERE EXISTS (
SELECT 1 FROM EMP.EMPLOYEE M
WHERE M.WORKDEPT = E.WORKDEPT
AND M.JOB = 'MANAGER'
);For EXISTS-style filters, DB2 often uses the same access path whether you write it as a JOIN or an EXISTS subquery. However, for NOT EXISTS or NOT IN patterns (anti-joins), the LEFT JOIN / IS NULL idiom tends to perform better because NOT IN has problematic NULL semantics and can produce unexpected results when the subquery returns any NULL values.
Common DB2 JOIN Mistakes
Mistake 1: Cartesian join from missing ON clause. Forgetting the ON clause in a JOIN — or using the comma syntax without a WHERE condition — produces a cartesian product. With large tables, this fills the buffer pool and can crash a CICS region. Always double-check that every JOIN has a corresponding ON clause.
Mistake 2: Using INNER JOIN when LEFT OUTER JOIN is needed. If you need all records from the primary table (like all departments in a headcount report), INNER JOIN will silently drop departments with no employees. Think carefully about which rows you cannot afford to lose.
Mistake 3: Ambiguous column names without aliases. When two joined tables have columns with the same name (e.g., both EMPLOYEE and DEPARTMENT have a DEPTNO column), DB2 raises an ambiguous column error. Always use table aliases and prefix every column reference.
Mistake 4: Filtering on an outer-joined table's column in WHERE instead of ON. Placing a condition on the right table's column in the WHERE clause of a LEFT JOIN converts it to an effective INNER JOIN, because WHERE filters out the NULLs that the LEFT JOIN produced. Move the filter into the ON clause to preserve the outer join semantics:
-- WRONG: This becomes an INNER JOIN because WHERE filters out the NULLs
SELECT E.EMPNO, D.DEPTNAME
FROM EMP.EMPLOYEE E
LEFT JOIN EMP.DEPARTMENT D ON E.WORKDEPT = D.DEPTNO
WHERE D.DEPTNAME LIKE 'A%'; -- Excludes employees with no department
-- CORRECT: Filter in the ON clause preserves left outer join semantics
SELECT E.EMPNO, D.DEPTNAME
FROM EMP.EMPLOYEE E
LEFT JOIN EMP.DEPARTMENT D
ON E.WORKDEPT = D.DEPTNO
AND D.DEPTNAME LIKE 'A%'; -- Employees with no dept still appear, DEPTNAME = NULLMistake 5: Not aliasing self-joins.
A self-join without distinct aliases (FROM EMPLOYEE E JOIN EMPLOYEE M) will fail with a syntax error in DB2. Always assign different aliases to each instance of the table in a self-join.
Comprehensive Example: Employee Project Report
This final example brings together INNER JOIN, LEFT JOIN, three-table joining, aggregation, and a HAVING clause — the kind of query a mainframe reporting tool or a DB2 stored procedure would execute:
-- Full project staffing report with department info and hours summary
SELECT D.DEPTNO,
D.DEPTNAME,
E.EMPNO,
E.FIRSTNME || ' ' || E.LASTNAME AS EMPLOYEE_NAME,
E.JOB,
COUNT(PA.PROJNO) AS PROJECTS_ASSIGNED,
SUM(PA.ACSTAFF) AS TOTAL_STAFF_ALLOC,
COALESCE(MAX(P.PRENDATE), 'N/A') AS LATEST_PROJECT_END
FROM EMP.DEPARTMENT D
JOIN EMP.EMPLOYEE E
ON D.DEPTNO = E.WORKDEPT
LEFT JOIN EMP.PROJACT PA
ON E.EMPNO = PA.EMPNO
LEFT JOIN EMP.PROJECT P
ON PA.PROJNO = P.PROJNO
GROUP BY D.DEPTNO,
D.DEPTNAME,
E.EMPNO,
E.FIRSTNME,
E.LASTNAME,
E.JOB
HAVING COUNT(PA.PROJNO) > 0
ORDER BY D.DEPTNO,
TOTAL_STAFF_ALLOC DESC;This query demonstrates how chaining JOINs with LEFT OUTER options, combining aggregate functions, and using HAVING to filter grouped results produces a rich, accurate report without any application-side processing.
Key Takeaways
DB2 JOINs are the backbone of relational query writing. INNER JOIN restricts to matched rows only; LEFT OUTER JOIN preserves all rows from the left table; RIGHT OUTER JOIN preserves all rows from the right table; FULL OUTER JOIN preserves all rows from both tables; CROSS JOIN creates every possible combination; and self JOIN compares a table to itself. Mastering which type to use in which situation — and avoiding the common pitfall of converting an outer join to an inner join with a misplaced WHERE filter — is what separates a beginner from a proficient DB2 SQL developer.
For a concise syntax reference covering SELECT, DML, and JOIN in one place, visit the DB2 Cheat Sheet. To go deeper into DB2 explain plans, index design, and join optimisation strategies, enrol in the DB2 Mastery Course.
