What does the What column capture in the Zachman Framework?

The What column captures the inventory of things the enterprise cares about — from a high-level business glossary (Planner row) through conceptual business entities (Owner row), logical data models (Designer row), physical database schemas (Builder row), to actual table DDL (Subcontractor row). It traces how business concepts become implemented data.

What is the difference between the What column at the Designer row and Builder row?

The Designer row What column contains a technology-independent logical data model with entities, attributes, and relationships using business names. The Builder row What column translates this into a physical schema for a specific database platform, adding data types, indexes, constraints, and partition strategies.

Why is the What column often the first to be populated?

Data is tangible and familiar — architects, analysts, and developers all understand entity-relationship concepts. The What column also has the most mature tooling (ERD tools, data catalogues) which lowers the effort to produce and maintain it. Populating it first gives the team shared vocabulary that other columns can reference.

Zachman 'What' Column: The Same Data Question, 6 Different Answers

← Back to Zachman Hub

Zachman What Column: Data and Inventory Architecture Explained

The "What" interrogative forms the first column of the Zachman Framework matrix. It answers the fundamental question: "What does the enterprise consist of?" In other words, what data entities, objects, and inventory items are central to the enterprise's operation?

The What column represents data architecture across all six perspectives - from the executive planner's simple entity list down to the actual working database containing millions of records. Understanding this column is essential for enterprise data governance, data quality, and ensuring data consistency across the enterprise.

What Does the What Column Address?

The What column specifically focuses on:

Entities and objects: All the "nouns" the enterprise cares about (customers, orders, products, accounts, employees, assets)
Entity relationships: How entities relate to each other (a customer has multiple accounts, an account has multiple transactions)
Data attributes: Properties of each entity (a customer has a name, email, phone)
Inventory management: Classification and counting of business objects
Information classification: Categorising data by type and sensitivity
Master data: Reference data that is reused across the enterprise (product codes, customer segments)

The What column is intentionally NOT about processes, locations, timing, or business rules. That is the domain of the other interrogatives. The What column focuses purely on "what things exist and how they relate."

The What Column Across Six Perspectives

Row 1 (Planner): Entity List / Data Scope

Question: What types of entities does the business deal with?

Artefacts:

Simple list of entities (nouns): Customer, Order, Product, Account, Invoice, Employee, etc.
Brief description of each entity (one or two sentences)
Scope boundaries (e.g., "We track customers globally but products only in North America")

Characteristics:

Executive summary level - one or two pages
Business terminology only, no technical jargon
Stable and rarely changing

Example:

text

Entities:
- Customer: A person or organisation who purchases products
- Order: A request to purchase one or more products
- Product: An item available for sale
- Account: A billing account associated with a customer
- Invoice: A bill sent to a customer for an order

Why it matters: Establishes shared vocabulary. Everyone agrees on what "customer," "order," and "product" mean.

Row 2 (Owner): Entity-Relationship Model / Business Data Model

Question: How do entities relate to each other from a business perspective?

Artefacts:

Entity-Relationship diagram (ER diagram) with business entities and relationships
Entity definitions including attributes (non-technical)
Cardinalities and constraints (one-to-many, many-to-many)
Business rules governing relationships (e.g., "An order must have at least one line item")

Characteristics:

5-30 pages depending on complexity
Business owner's view - understandable without technical training
Reflects current business model - changes when business processes change

Example ER relationships:

text

Customer ---> (has many) ---> Account
Account ---> (has many) ---> Transaction
Customer ---> (places) ---> Order
Order ---> (contains many) ---> LineItem
LineItem ---> (references) ---> Product

Business rules:

A customer can have 0 to unlimited accounts
An order must contain at least 1 line item and at most 100 line items
An order must reference at least one customer

Why it matters: Ensures business stakeholders agree on data relationships. Prevents misalignment between business model and technical implementation.

Row 3 (Designer): Logical Data Model

Question: What is the system's logical data structure, independent of technology choice?

Artefacts:

Normalised Entity-Relationship diagram (third-normal form or higher)
Logical entity definitions with attributes and data types (logical types: text, number, date, boolean)
Logical keys and constraints
Relationship specifications without database-specific syntax

Characteristics:

10-50 pages depending on complexity
System designer's view - technical but not technology-specific
Detailed and comprehensive - includes all entities that the system must track
Changes when system requirements change, not when technology changes

Example logical entity:

text

Customer entity:
  - CustomerID (unique identifier)
  - FirstName (text, required)
  - LastName (text, required)
  - Email (text, required, unique)
  - Phone (text)
  - Address (text)
  - DateOfBirth (date)
  - CustomerSegment (text) [references Segment]
  - CreatedDate (date)
  - ModifiedDate (date)

Logical relationships:

text

Customer --< 1:M >-- Account
  A customer has 0 to many accounts
  An account belongs to exactly 1 customer

Account --< 1:M >-- Transaction
  An account has 1 to many transactions
  A transaction belongs to exactly 1 account

Why it matters: Ensures system design is sound before committing to specific technologies. Can be implemented using any database technology.

Row 4 (Builder): Physical Data Model / Database Schema

Question: What is the actual database design using specific technology?

Artefacts:

Physical database schema (Oracle schema, PostgreSQL schema, MongoDB schema, etc.)
Table/collection definitions with column types specific to the chosen database
Indexes and partitioning strategy
Performance optimisation decisions (e.g., denormalisation for speed)
Storage parameters (tablespace, file locations)

Characteristics:

20-100 pages depending to database complexity
Technical and technology-specific
Designed for performance, not pure normalisation
Changes when technology or performance requirements change

Example for Oracle:

sql

CREATE TABLE CUSTOMER (
  CUSTOMER_ID NUMBER PRIMARY KEY,
  FIRST_NAME VARCHAR2(100) NOT NULL,
  LAST_NAME VARCHAR2(100) NOT NULL,
  EMAIL VARCHAR2(255) NOT NULL UNIQUE,
  PHONE VARCHAR2(20),
  ADDRESS VARCHAR2(500),
  DATE_OF_BIRTH DATE,
  CUSTOMER_SEGMENT VARCHAR2(50),
  CREATED_DATE DATE DEFAULT SYSDATE,
  MODIFIED_DATE DATE DEFAULT SYSDATE
);

CREATE INDEX IDX_CUSTOMER_EMAIL ON CUSTOMER(EMAIL);
CREATE INDEX IDX_CUSTOMER_SEGMENT ON CUSTOMER(CUSTOMER_SEGMENT);

Why it matters: Ensures database is designed for performance and configured correctly for the chosen technology.

Row 5 (Sub-Contractor): DDL Scripts and Data Loading

Question: What are the specific database creation and data loading scripts?

Artefacts:

Complete DDL (Data Definition Language) scripts (CREATE TABLE, CREATE INDEX, etc.)
Data loading scripts (INSERT statements or ETL scripts)
Migration scripts (ALTER TABLE for schema changes)
Backup and recovery procedures
Data validation scripts

Characteristics:

Hundreds to thousands of lines of code
Fully executable and version-controlled
Highly volatile (changes constantly as requirements evolve)
"Out of context" without explanation - difficult to understand without deep technical knowledge

Example:

sql

-- Create main tables
CREATE TABLE CUSTOMER (...);
CREATE TABLE ACCOUNT (...);
CREATE TABLE TRANSACTION (...);

-- Load initial data
INSERT INTO CUSTOMER VALUES (1, 'John', 'Doe', 'john@example.com', ...);
INSERT INTO CUSTOMER VALUES (2, 'Jane', 'Smith', 'jane@example.com', ...);

-- Create indexes
CREATE INDEX IDX_CUSTOMER_EMAIL ON CUSTOMER(EMAIL);

-- Create views
CREATE VIEW ACTIVE_CUSTOMERS AS SELECT * FROM CUSTOMER WHERE status = 'ACTIVE';

-- Grant permissions
GRANT SELECT ON CUSTOMER TO app_user;
GRANT INSERT, UPDATE ON CUSTOMER TO app_user;

Why it matters: Makes the database operational. Scripts are version-controlled and repeatable.

Row 6 (Enterprise): Live Database

Question: What data actually exists in the running system?

Artefacts:

Actual database contents (live records)
Data volumes and growth metrics
Data quality metrics
Sample queries and data extraction
Backup and recovery status
Data age and refresh rates

Characteristics:

Constantly changing (new data arrives every second)
Measured and observable
The source of truth for "what actually exists"
Monitored for quality and compliance

Example metrics:

text

CUSTOMER table: 145.2 million records
  - Active customers: 142.1 M
  - Inactive: 3.1 M
  - Average growth: 2.1% per year
  - Data freshness: Real-time (updated via API)

ACCOUNT table: 287.4 million records
  - Average per customer: 1.97 accounts
  - Growth rate: 2.3% per year

Data quality:
  - Duplicate email addresses: 0 (checked daily)
  - Null phone numbers: 12.3% (acceptable)
  - Data freshness: Last update 34 seconds ago

Why it matters: Validates that the system is operating correctly and data quality is maintained.

Data Governance: Using the What Column

The What column is the foundation for enterprise data governance:

Row 1 alignment: Business and IT agree on what entities matter.
Row 2 specification: Business defines entity definitions and relationships.
Row 3 design: Architects design system-independent logical model.
Row 4 implementation: DBAs implement in specific technology.
Row 5 deployment: Operations deploy the database.
Row 6 monitoring: Data stewards monitor for quality and compliance.

This traceability ensures data governance policies flow from business definition through technical implementation to operational monitoring.

Common Mistakes in the What Column

Skipping Row 2 (Business ER Model): Going directly from scope (Row 1) to logical design (Row 3). Result: Requirements misaligned with business.
Over-normalisation: Normalising to 5NF or higher in Row 3, then de-normalising for performance in Row 4. Result: Confusion about the "true" model.
Denormalisation gone wrong: De-normalising in Row 4 to the point where the model is unrecognisable. Result: Data anomalies and maintainability problems.
Ignoring Row 6 validation: Never comparing Rows 1-5 (designs) with Row 6 (actual data). Result: Designs that don't match reality.
Mixing interrogatives: Including process information (How column) in the What column. Result: Confusion about what the What column is supposed to do.

What Column vs Other Columns

What vs How: What is "nouns" (entities), How is "verbs" (processes). A customer (What) undergoes a billing process (How).
What vs Where: What is "what exists," Where is "where it's located." Customer data (What) is stored in us-east-1 and replicated to eu-west-1 (Where).
What vs Who: What is "objects," Who is "actors." Customer records (What) are owned by the sales department (Who).
What vs When: What is "things," When is "timing." Customer records (What) are updated nightly (When).
What vs Why: What is "inventory," Why is "motivation." We track customers (What) because we need billing information (Why).

Industry Examples: The What Column

Banking

What entities: Customer, Account, Transaction, Card, Loan, Collateral, Payment
Key relationships: Customer has many accounts; Account has many transactions; Loan has collateral

Healthcare

What entities: Patient, Provider, Appointment, Diagnosis, Medication, Lab Result, Insurance
Key relationships: Patient has many appointments; Patient has many diagnoses; Prescription contains many medications

Retail

What entities: Customer, Order, Product, Inventory, Supplier, Category, Promotion
Key relationships: Order contains many products; Product belongs to many categories; Inventory tracks products across locations

Data Architecture Best Practices

Maintain traceability: Ensure each Row 4-5 table maps back to a Row 3 logical entity.
Version your models: Use version control for Rows 3-5. Track changes over time.
Separate logical from physical: Keep Row 3 (logical) independent of Row 4 (physical). This enables technology changes without breaking the model.
Validate against reality: Periodically compare Rows 1-5 with Row 6. Do they align?
Document relationships: Clearly document why entities relate the way they do (the "Why" of the What column).

Key Takeaways

The What column classifies enterprise data and inventory: It answers "what entities and data must the enterprise track?"
Six perspectives provide a complete data view: From business scope to live database.
Traceability from Row 1 to Row 6 is critical: Ensure business definitions flow through design and implementation to actual data.
The What column is foundational for data governance: It defines the universe of data the enterprise manages.
Do not skip any rows: Each row adds value and catches different types of data architecture errors.

Test Your Understanding

In the Zachman Framework, which row in the What column would contain an Entity-Relationship Diagram with logical entities and named attributes?

No quiz options available