
COBOL String Handling: STRING, UNSTRING, INSPECT, and Functions


String manipulation in COBOL uses dedicated verbs — STRING, UNSTRING, INSPECT — plus a catalogue of intrinsic functions. The design reflects COBOL's fixed-length record heritage: rather than dynamic string objects, COBOL works with fixed-length fields and explicit length control. Understanding these tools lets you parse CSV data, format output lines, validate input, and build dynamic messages efficiently.

MOVE and String Basics

The foundation of COBOL string handling is the MOVE statement. When moving between alphanumeric fields of different lengths:

  • Moving to a longer field: the value is left-justified and padded with spaces on the right
  • Moving to a shorter field: the value is left-justified and truncated on the right
cobol
01 WS-SHORT     PIC X(5)  VALUE 'HELLO'.
01 WS-LONG      PIC X(10) VALUE SPACES.
01 WS-LONGER    PIC X(15) VALUE SPACES.

MOVE WS-SHORT TO WS-LONG.     *> 'HELLO     ' (padded 5 spaces)
MOVE WS-LONG  TO WS-LONGER.   *> 'HELLO          ' (padded 10 spaces)
MOVE WS-LONGER TO WS-SHORT.   *> 'HELLO' (truncated to 5; only trailing spaces are lost)
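
Truncation is only harmless when the dropped characters are spaces. The following minimal sketch shows a case where real data is lost; the program name and field names are illustrative, and it assumes free-format source (for example, compiled with `cobc -x -free` under GnuCOBOL).

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. MOVEDEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-SOURCE   PIC X(11) VALUE 'HELLO WORLD'.
01 WS-TARGET   PIC X(5)  VALUE SPACES.
PROCEDURE DIVISION.
    *> Only the leftmost 5 characters fit; ' WORLD' is dropped
    *> and no condition is raised.
    MOVE WS-SOURCE TO WS-TARGET
    DISPLAY '[' WS-TARGET ']'
    STOP RUN.
```

This displays `[HELLO]`. Since MOVE gives no warning, compare field lengths yourself when truncation matters.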

For reference within a field, COBOL supports reference modification using (start:length) notation:

cobol
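01 WS-DATE-STRING   PIC X(8) VALUE '20260120'.
01 WS-YEAR          PIC X(4).
01 WS-MONTH         PIC X(2).
01 WS-DAY           PIC X(2).

*> (start:length), with positions counted from 1
MOVE WS-DATE-STRING(1:4) TO WS-YEAR.    *> '2026'
MOVE WS-DATE-STRING(5:2) TO WS-MONTH.   *> '01'
MOVE WS-DATE-STRING(7:2) TO WS-DAY.     *> '20'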

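Reference modification is cheap. When the start and length are literals, as above, the compiler resolves the offset at compile time. When they are data items or expressions, the offset is computed at run time, and out-of-range values are only detected if the compiler's range checking is enabled.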
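
The start and length can also be data items, and the length can be omitted. The sketch below illustrates both forms; WS-START and WS-LEN are hypothetical names introduced for the example, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. REFMOD.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-DATE-STRING   PIC X(8) VALUE '20260120'.
01 WS-START         PIC 9(4) COMP VALUE 5.
01 WS-LEN           PIC 9(4) COMP VALUE 2.
01 WS-MONTH         PIC X(2).
PROCEDURE DIVISION.
    *> Start and length come from data items, so the position
    *> is worked out when the statement executes.
    MOVE WS-DATE-STRING(WS-START:WS-LEN) TO WS-MONTH
    DISPLAY WS-MONTH
    *> Omitting the length takes everything to the end of the field.
    DISPLAY WS-DATE-STRING(7:)
    STOP RUN.
```

The two DISPLAY statements show `01` and `20`.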

STRING Statement

STRING concatenates multiple source fields or literals into a single destination field.

Basic STRING

cobol
01 WS-FIRST-NAME    PIC X(15) VALUE 'JOHN           '.
01 WS-LAST-NAME     PIC X(20) VALUE 'SMITH              '.
01 WS-FULL-NAME     PIC X(36) VALUE SPACES.
01 WS-STR-POINTER   PIC 9(4) COMP VALUE 1.

STRING
    WS-FIRST-NAME DELIMITED BY SPACE
    ' '           DELIMITED BY SIZE
    WS-LAST-NAME  DELIMITED BY SPACE
    INTO WS-FULL-NAME
    WITH POINTER WS-STR-POINTER
ON OVERFLOW
    MOVE 'NAME TOO LONG' TO WS-ERROR-MSG
END-STRING.

After this statement, WS-FULL-NAME contains 'JOHN SMITH' and WS-STR-POINTER holds 11 (the position after the last character written).

DELIMITED BY Options

  • DELIMITED BY SPACE: copies characters up to (not including) the first space. This drops trailing padding, but it also cuts a value that contains an embedded space ('MARY ANN' is copied as 'MARY').
  • DELIMITED BY SIZE: copies the entire field, regardless of content. Use it for literals and for fixed-length fields where you want every character, padding included.
  • DELIMITED BY 'char': copies characters up to (not including) the first occurrence of the specified character.
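
The three options can be compared side by side. This is a minimal sketch with made-up sample values (WS-NAME, WS-EMAIL, and WS-OUT are illustrative names), assuming free-format source. WS-OUT is cleared before each STRING because STRING does not pad the destination.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. DELIMS.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-NAME     PIC X(10) VALUE 'MARY ANN'.
01 WS-EMAIL    PIC X(20) VALUE 'mary@example.com'.
01 WS-OUT      PIC X(30).
PROCEDURE DIVISION.
    *> SPACE stops at the first space, so only 'MARY' is copied.
    MOVE SPACES TO WS-OUT
    STRING WS-NAME DELIMITED BY SPACE
        INTO WS-OUT
    END-STRING
    DISPLAY '[' WS-OUT ']'
    *> SIZE copies all 10 characters, trailing pad included,
    *> so the '|' lands in position 11.
    MOVE SPACES TO WS-OUT
    STRING WS-NAME DELIMITED BY SIZE
           '|'     DELIMITED BY SIZE
        INTO WS-OUT
    END-STRING
    DISPLAY '[' WS-OUT ']'
    *> A literal delimiter stops just before the '@'.
    MOVE SPACES TO WS-OUT
    STRING WS-EMAIL DELIMITED BY '@'
        INTO WS-OUT
    END-STRING
    DISPLAY '[' WS-OUT ']'
    STOP RUN.
```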

STRING with POINTER

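The WITH POINTER clause specifies where in the receiving field to start writing. Initialize it to 1 (or to the position where you want to begin) before the STRING statement executes; STRING never sets it for you. After STRING, it holds the position after the last character written, which is useful for chaining multiple STRING operations and for finding the length of the text built (pointer minus 1):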

cobol
*> Build a formatted date string: DD/MM/YYYY
MOVE 1 TO WS-PTR
STRING
    WSD-DAY   DELIMITED BY SIZE
    '/'       DELIMITED BY SIZE
    WSD-MONTH DELIMITED BY SIZE
    '/'       DELIMITED BY SIZE
    WSD-YEAR  DELIMITED BY SIZE
    INTO WS-FORMATTED-DATE
    WITH POINTER WS-PTR
END-STRING.
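
Chaining itself can be sketched as follows: the second STRING reuses the pointer left by the first and therefore appends rather than overwrites. The literals and the names WS-LINE and WS-PTR are illustrative, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. CHAINSTR.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-LINE   PIC X(40) VALUE SPACES.
01 WS-PTR    PIC 9(4) COMP VALUE 1.
PROCEDURE DIVISION.
    MOVE 1 TO WS-PTR
    STRING 'ACCOUNT ' DELIMITED BY SIZE
        INTO WS-LINE WITH POINTER WS-PTR
    END-STRING
    *> WS-PTR is now 9, so the next STRING appends at position 9.
    STRING 'ACC001' DELIMITED BY SIZE
           ' OK'    DELIMITED BY SIZE
        INTO WS-LINE WITH POINTER WS-PTR
    END-STRING
    *> WS-PTR is now 18; WS-PTR - 1 is the length of the text built.
    DISPLAY '[' WS-LINE(1:WS-PTR - 1) ']'
    STOP RUN.
```

This displays `[ACCOUNT ACC001 OK]`.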

ON OVERFLOW

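The overflow condition occurs if the destination field fills up before all source data has been transferred, or if the pointer value is less than 1 or greater than the destination length plus 1 when STRING starts. Writing stops, the field keeps whatever was written before the overflow, and the ON OVERFLOW statements run. Without ON OVERFLOW, the data is silently truncated and no error is raised.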
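
A small sketch of the overflow behavior, using a deliberately undersized destination. WS-SMALL and WS-PTR are illustrative names, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. OVERFLOW1.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-SMALL  PIC X(8) VALUE SPACES.
01 WS-PTR    PIC 9(4) COMP VALUE 1.
PROCEDURE DIVISION.
    *> 12 characters are offered to an 8-character field.
    MOVE 1 TO WS-PTR
    STRING 'ABCDEF' DELIMITED BY SIZE
           'GHIJKL' DELIMITED BY SIZE
        INTO WS-SMALL
        WITH POINTER WS-PTR
        ON OVERFLOW
            DISPLAY 'OVERFLOW, KEPT [' WS-SMALL ']'
        NOT ON OVERFLOW
            DISPLAY 'ALL DATA FIT'
    END-STRING
    STOP RUN.
```

This displays `OVERFLOW, KEPT [ABCDEFGH]`: the first eight characters are kept and the rest are discarded.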

UNSTRING Statement

UNSTRING is the inverse of STRING — it splits a source field into multiple receiving fields based on delimiters.

Basic UNSTRING

cobol
01 WS-CSV-LINE    PIC X(100) VALUE 'JOHN,SMITH,ACC001,1250.00'.
01 WS-FIELD1      PIC X(20).
01 WS-FIELD2      PIC X(20).
01 WS-FIELD3      PIC X(10).
01 WS-FIELD4      PIC X(10).

UNSTRING WS-CSV-LINE
    DELIMITED BY ','
    INTO WS-FIELD1
         WS-FIELD2
         WS-FIELD3
         WS-FIELD4
END-UNSTRING.

*> WS-FIELD1 = 'JOHN                '
*> WS-FIELD2 = 'SMITH               '
*> WS-FIELD3 = 'ACC001    '
*> WS-FIELD4 = '1250.00   '

COUNT IN and TALLYING IN

COUNT IN captures how many characters were placed in each receiving field — essential for knowing the actual length of variable-length parsed values:

cobol
01 WS-COUNT1    PIC 9(4) COMP.
01 WS-COUNT2    PIC 9(4) COMP.
01 WS-FIELDS-FILLED PIC 9(4) COMP VALUE ZERO.

UNSTRING WS-CSV-LINE
    DELIMITED BY ','
    INTO WS-FIELD1 COUNT IN WS-COUNT1
         WS-FIELD2 COUNT IN WS-COUNT2
    TALLYING IN WS-FIELDS-FILLED
END-UNSTRING.

*> WS-COUNT1 = 4 (length of 'JOHN')
*> WS-COUNT2 = 5 (length of 'SMITH')
*> WS-FIELDS-FILLED = 2

Multiple Delimiters

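UNSTRING can split on any of several delimiters by joining them with OR. Each delimiter occurrence ends a field, so two adjacent delimiters (for example, a comma followed by a space) produce an empty field unless you write ALL before the delimiter: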

cobol
UNSTRING WS-ADDRESS-LINE
    DELIMITED BY ',' OR ' ' OR ';'
    INTO WS-PART1 WS-PART2 WS-PART3 WS-PART4
END-UNSTRING.
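
The effect of adjacent delimiters, and of the ALL keyword that collapses them, can be sketched like this. The sample value and field names are illustrative, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ALLDELIM.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-LINE   PIC X(20) VALUE 'A,,B'.
01 WS-F1     PIC X(5).
01 WS-F2     PIC X(5).
01 WS-F3     PIC X(5).
01 WS-N      PIC 9 VALUE 0.
PROCEDURE DIVISION.
    *> Plain ',': the two adjacent commas enclose an empty field,
    *> so WS-N becomes 3 and WS-F2 is all spaces.
    MOVE SPACES TO WS-F1 WS-F2 WS-F3
    MOVE 0 TO WS-N
    UNSTRING WS-LINE DELIMITED BY ','
        INTO WS-F1 WS-F2 WS-F3
        TALLYING IN WS-N
    END-UNSTRING
    DISPLAY WS-N ' [' WS-F1 '] [' WS-F2 '] [' WS-F3 ']'
    *> ALL ',': a run of commas counts as one delimiter,
    *> so WS-N becomes 2 and 'B' lands in WS-F2.
    MOVE SPACES TO WS-F1 WS-F2 WS-F3
    MOVE 0 TO WS-N
    UNSTRING WS-LINE DELIMITED BY ALL ','
        INTO WS-F1 WS-F2 WS-F3
        TALLYING IN WS-N
    END-UNSTRING
    DISPLAY WS-N ' [' WS-F1 '] [' WS-F2 '] [' WS-F3 ']'
    STOP RUN.
```

TALLYING IN adds to the counter rather than setting it, which is why WS-N is reset before each UNSTRING.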

UNSTRING with POINTER

Like STRING, UNSTRING supports WITH POINTER to start parsing from a specific position within the source field:

cobol
MOVE 6 TO WS-PARSE-PTR
UNSTRING WS-RECORD
    DELIMITED BY ':'
    INTO WS-VALUE
    WITH POINTER WS-PARSE-PTR
END-UNSTRING.
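
Because UNSTRING leaves the pointer just past the delimiter it stopped at, the pointer can also drive a loop that pulls one token per pass from a list of unknown length. The sketch below assumes free-format source; WS-LIST, WS-TOKEN, and the sample data are illustrative.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. TOKENS.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-LIST    PIC X(30) VALUE 'RED;GREEN;BLUE'.
01 WS-TOKEN   PIC X(10).
01 WS-PTR     PIC 9(4) COMP VALUE 1.
PROCEDURE DIVISION.
    MOVE 1 TO WS-PTR
    *> Each pass extracts one token and advances WS-PTR past the
    *> delimiter; the last pass consumes the rest of the field.
    PERFORM UNTIL WS-PTR > FUNCTION LENGTH(WS-LIST)
        UNSTRING WS-LIST DELIMITED BY ';'
            INTO WS-TOKEN
            WITH POINTER WS-PTR
        END-UNSTRING
        DISPLAY '[' WS-TOKEN ']'
    END-PERFORM
    STOP RUN.
```

The loop displays RED, GREEN, and BLUE in turn, each padded to 10 characters, and ends when the pointer passes the end of WS-LIST.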

INSPECT Statement

INSPECT examines a field character by character to count or replace occurrences.

INSPECT TALLYING

Counts occurrences of characters or strings:

cobol
01 WS-TEXT       PIC X(50) VALUE 'HELLO WORLD HELLO'.
01 WS-COUNT      PIC 9(5) COMP VALUE ZERO.

*> Count all spaces:
INSPECT WS-TEXT
    TALLYING WS-COUNT FOR ALL SPACES.
*> WS-COUNT = 35 (2 embedded spaces + 33 trailing pad spaces in the 50-byte field)

*> Count leading spaces (from left until non-space):
MOVE ZERO TO WS-COUNT
INSPECT WS-TEXT
    TALLYING WS-COUNT FOR LEADING SPACES.
*> WS-COUNT = 0 (starts with 'H')

*> Count a specific string:
MOVE ZERO TO WS-COUNT
INSPECT WS-TEXT
    TALLYING WS-COUNT FOR ALL 'HELLO'.
*> WS-COUNT = 2

Multiple TALLYING clauses in one INSPECT scan the field once:

cobol
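*> Zero both counters first. INSPECT has no range syntax, so list each letter.
INSPECT WS-ACCOUNT-NUMBER
    TALLYING WS-DIGIT-COUNT FOR ALL '0' '1' '2' '3' '4' '5' '6' '7' '8' '9'
             WS-ALPHA-COUNT FOR ALL 'A' 'B' 'C' 'D' 'E' 'F' 'G' 'H' 'I'
                                    'J' 'K' 'L' 'M' 'N' 'O' 'P' 'Q' 'R'
                                    'S' 'T' 'U' 'V' 'W' 'X' 'Y' 'Z'.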

INSPECT REPLACING

Replaces occurrences in place — modifies the field directly:

cobol
*> Replace all commas with spaces:
INSPECT WS-CSV-DATA
    REPLACING ALL ',' BY ' '.

*> Replace leading zeros with spaces:
INSPECT WS-NUMERIC-STRING
    REPLACING LEADING '0' BY ' '.

*> Replace specific string:
INSPECT WS-MESSAGE
    REPLACING ALL 'ERROR' BY 'WARN '.    *> same length required

The replacement string must be the same length as the search string — INSPECT operates in-place on a fixed-length field.

INSPECT CONVERTING

Converts characters using a translation table — the mainframe equivalent of tr:

cobol
*> Convert lowercase to uppercase (explicit table, so it works in EBCDIC and ASCII alike):
INSPECT WS-INPUT-DATA
    CONVERTING 'abcdefghijklmnopqrstuvwxyz'
            TO 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.

For modern code, FUNCTION UPPER-CASE is cleaner — but INSPECT CONVERTING remains useful for custom character translations.
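
As an illustration of a custom translation, the sketch below masks every digit in a field while leaving the punctuation alone. The field name and sample value are made up for the example, and free-format source is assumed.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. MASKDIG.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-PHONE   PIC X(12) VALUE '555-867-5309'.
PROCEDURE DIVISION.
    *> Each character in the first string maps to the character
    *> at the same position in the second; both strings must be
    *> the same length. Characters not listed are left unchanged.
    INSPECT WS-PHONE
        CONVERTING '0123456789'
                TO '##########'
    DISPLAY WS-PHONE
    STOP RUN.
```

This displays `###-###-####`.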

Intrinsic Functions for Strings

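IBM Enterprise COBOL provides intrinsic functions invoked with the FUNCTION keyword. No CALL statement or special setup is needed. Which functions exist depends on the compiler and release: UPPER-CASE, LOWER-CASE, LENGTH, and REVERSE are long-established, while TRIM, SUBSTITUTE, and CONCATENATE are newer or compiler-specific, so check your compiler's language reference before relying on them.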

FUNCTION UPPER-CASE and LOWER-CASE

cobol
MOVE FUNCTION UPPER-CASE(WS-INPUT-NAME) TO WS-UPPER-NAME.
MOVE FUNCTION LOWER-CASE(WS-CODE-VALUE)  TO WS-LOWER-CODE.

*> Use directly in a conditional:
IF FUNCTION UPPER-CASE(WS-USER-INPUT) = 'YES'
    PERFORM CONFIRM-ACTION
END-IF.

FUNCTION TRIM

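Removes leading or trailing spaces (or both). Because a MOVE into a fixed-length field pads the result with spaces again, trimming trailing spaces only matters when the result feeds STRING, LENGTH, or a comparison; trimming leading spaces effectively left-justifies the value: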

cobol
MOVE FUNCTION TRIM(WS-PADDED-VALUE LEADING)  TO WS-CLEAN.
MOVE FUNCTION TRIM(WS-PADDED-VALUE TRAILING) TO WS-CLEAN.
MOVE FUNCTION TRIM(WS-PADDED-VALUE)          TO WS-CLEAN.  *> both ends
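
TRIM is most useful as an operand of STRING, where it removes the padding without cutting the value at an embedded space the way DELIMITED BY SPACE does. This is a sketch under the assumption that your compiler supports FUNCTION TRIM and free-format source; the names and sample values are illustrative.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. TRIMSTR.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-FIRST   PIC X(15) VALUE 'MARY ANN'.
01 WS-LAST    PIC X(20) VALUE 'VAN DYKE'.
01 WS-FULL    PIC X(40) VALUE SPACES.
PROCEDURE DIVISION.
    *> TRIM removes the padding; DELIMITED BY SIZE then copies
    *> the whole trimmed value, embedded spaces included.
    STRING FUNCTION TRIM(WS-FIRST) DELIMITED BY SIZE
           ' '                     DELIMITED BY SIZE
           FUNCTION TRIM(WS-LAST)  DELIMITED BY SIZE
        INTO WS-FULL
    END-STRING
    DISPLAY '[' WS-FULL ']'
    STOP RUN.
```

WS-FULL then begins with `MARY ANN VAN DYKE`, followed by the spaces it was initialized with.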

FUNCTION LENGTH

Returns the defined (allocated) length of a field — not the content length:

cobol
01 WS-BUFFER     PIC X(100).
01 WS-BUF-LEN    PIC 9(4) COMP.

MOVE FUNCTION LENGTH(WS-BUFFER) TO WS-BUF-LEN.  *> = 100

For the actual character count (up to the last non-space), combine TRIM and LENGTH:

cobol
MOVE FUNCTION LENGTH(FUNCTION TRIM(WS-BUFFER TRAILING)) TO WS-CONTENT-LEN.

FUNCTION REVERSE

Reverses the characters in a string:

cobol
MOVE FUNCTION REVERSE(WS-FORWARD-STRING) TO WS-REVERSED.
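
A common practical use of REVERSE is finding the content length of a field without TRIM: the trailing spaces of the original become leading spaces of the reversed value, where INSPECT can count them. This sketch uses illustrative names (WS-PAD, WS-LEN) and assumes free-format source.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. CONTLEN.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-BUFFER  PIC X(20) VALUE 'HELLO WORLD'.
01 WS-PAD     PIC 9(4) COMP VALUE 0.
01 WS-LEN     PIC 9(4) COMP VALUE 0.
PROCEDURE DIVISION.
    *> Count the leading spaces of the reversed value, which are
    *> the trailing spaces of WS-BUFFER (9 here).
    MOVE 0 TO WS-PAD
    INSPECT FUNCTION REVERSE(WS-BUFFER)
        TALLYING WS-PAD FOR LEADING SPACES
    COMPUTE WS-LEN = FUNCTION LENGTH(WS-BUFFER) - WS-PAD
    *> WS-LEN is 11. Guard against an all-space field, where it
    *> would be 0 and a zero-length reference is not allowed.
    IF WS-LEN > 0
        DISPLAY '[' WS-BUFFER(1:WS-LEN) ']'
    END-IF
    STOP RUN.
```

This displays `[HELLO WORLD]`.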

FUNCTION SUBSTITUTE

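Replaces all occurrences of a substring with another. SUBSTITUTE is not available in every compiler or release (GnuCOBOL provides it), so confirm it in your compiler's language reference before relying on it: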

cobol
MOVE FUNCTION SUBSTITUTE(WS-TEXT 'ERROR' 'WARN') TO WS-OUTPUT-TEXT.

Unlike INSPECT REPLACING, SUBSTITUTE can handle replacement strings of different lengths when the result is stored in a field large enough to accommodate the change.

FUNCTION CONCATENATE

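Joins multiple strings. Fixed-length arguments are joined at their full length, trailing spaces included, so trim them first if padding should not appear between the parts. Like SUBSTITUTE, it is not available in every compiler: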

cobol
MOVE FUNCTION CONCATENATE(WS-PATH '/' WS-FILENAME) TO WS-FULL-PATH.

Practical Pattern: Parsing a Pipe-Delimited Record

cobol
01 WS-PIPE-RECORD  PIC X(200).
01 WS-ACCOUNT-ID   PIC X(10).
01 WS-CUST-NAME    PIC X(40).
01 WS-BALANCE      PIC X(15).
01 WS-STATUS       PIC X(2).
01 WS-FIELD-COUNT  PIC 9(4) COMP VALUE ZERO.
01 WS-COUNTS.
   05 WS-CNT1      PIC 9(4) COMP.
   05 WS-CNT2      PIC 9(4) COMP.
   05 WS-CNT3      PIC 9(4) COMP.
   05 WS-CNT4      PIC 9(4) COMP.

PARSE-PIPE-RECORD.
    MOVE ZERO TO WS-FIELD-COUNT
    UNSTRING WS-PIPE-RECORD
        DELIMITED BY '|'
        INTO WS-ACCOUNT-ID   COUNT IN WS-CNT1
             WS-CUST-NAME    COUNT IN WS-CNT2
             WS-BALANCE      COUNT IN WS-CNT3
             WS-STATUS       COUNT IN WS-CNT4
        TALLYING IN WS-FIELD-COUNT
    END-UNSTRING
    IF WS-FIELD-COUNT < 4
        MOVE 'INCOMPLETE RECORD' TO WS-ERROR-MSG
        SET WS-ERROR-FOUND TO TRUE
    END-IF.
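
The COUNT IN values can do more than report lengths. COUNT IN records how many characters were found in the source for each field, so a count larger than the receiving field means the value was truncated, and a count of zero means the field was empty. The sketch below checks two fields with a made-up record whose account ID is too long; it assumes free-format source.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. CNTCHECK.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-PIPE-RECORD  PIC X(200)
       VALUE 'ACC0000000001|JOHN SMITH|1250.00|AC'.
01 WS-ACCOUNT-ID   PIC X(10).
01 WS-CUST-NAME    PIC X(40).
01 WS-CNT1         PIC 9(4) COMP.
01 WS-CNT2         PIC 9(4) COMP.
PROCEDURE DIVISION.
    UNSTRING WS-PIPE-RECORD DELIMITED BY '|'
        INTO WS-ACCOUNT-ID COUNT IN WS-CNT1
             WS-CUST-NAME  COUNT IN WS-CNT2
    END-UNSTRING
    *> WS-CNT1 is 13 (characters found in the source), although
    *> only 10 of them fit in WS-ACCOUNT-ID.
    IF WS-CNT1 > FUNCTION LENGTH(WS-ACCOUNT-ID)
        DISPLAY 'ACCOUNT ID TRUNCATED: ' WS-ACCOUNT-ID
    END-IF
    IF WS-CNT2 = 0
        DISPLAY 'CUSTOMER NAME MISSING'
    END-IF
    STOP RUN.
```

This displays `ACCOUNT ID TRUNCATED: ACC0000000`. For the last field of a record with no trailing delimiter, the count also includes the record's trailing pad spaces, so apply the length check only to fields that are followed by a delimiter.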

Practical Pattern: Building a Dynamic SQL Predicate

cobol
01 WS-SQL-PRED    PIC X(200) VALUE SPACES.
01 WS-SQL-PTR     PIC 9(4) COMP VALUE 1.

BUILD-SQL-PREDICATE.
    MOVE 1 TO WS-SQL-PTR
    STRING
        'WHERE ACCOUNT_ID = ''' DELIMITED BY SIZE
        WS-ACCOUNT-ID            DELIMITED BY SPACE
        ''''                     DELIMITED BY SIZE
        ' AND STATUS = '''       DELIMITED BY SIZE
        WS-STATUS                DELIMITED BY SPACE
        ''''                     DELIMITED BY SIZE
        INTO WS-SQL-PRED
        WITH POINTER WS-SQL-PTR
    END-STRING.
*> Result: WHERE ACCOUNT_ID = 'ACC001' AND STATUS = 'AC'

Next Steps

With string handling covered, the next major topic is file I/O: reading and writing the VSAM and sequential (QSAM) datasets that form the backbone of mainframe batch processing. See COBOL File Handling, or return to the COBOL Mastery course.