What is the main difference between x86 assembly and HLASM?

x86-64 uses a CISC (Complex Instruction Set Computer) design on Intel/AMD processors with 16 general-purpose 64-bit registers and little-endian byte ordering. HLASM targets IBM z/Architecture, which is also a CISC design with 16 general-purpose registers but uses big-endian byte ordering, relies mainly on base-displacement addressing for memory references, and has an entirely different instruction encoding. Both are assembly languages, but the syntax, mnemonics, and addressing models are completely different.

What assembler should I use for x86-64?

NASM (Netwide Assembler) is the most popular cross-platform x86 assembler and the recommended starting point. It uses Intel syntax, runs on Linux, Windows, and macOS, and has excellent documentation. GNU AS (GAS) uses AT&T syntax by default and is part of the Linux GNU toolchain. MASM is Microsoft's assembler, tightly integrated with Visual Studio on Windows.

What calling convention does x86-64 Linux use?

Linux and macOS x86-64 use the System V AMD64 ABI. Function arguments are passed in registers RDI, RSI, RDX, RCX, R8, R9 (in that order). The return value is in RAX. Registers RBX, RBP, R12–R15 must be preserved by the called function (callee-saved). The stack must be 16-byte aligned before a CALL instruction.

How do system calls work in x86-64 Linux?

Place the system call number in RAX, arguments in RDI, RSI, RDX, R10, R8, R9, then execute the SYSCALL instruction. The return value appears in RAX, and the instruction itself overwrites RCX and R11. For example, to call write(): RAX=1, RDI=file descriptor, RSI=buffer pointer, RDX=byte count, then SYSCALL. This is conceptually similar to z/OS SVC calls, but uses SYSCALL instead of SVC and passes arguments in different registers.

Why should an HLASM developer learn x86 assembly?

Understanding x86-64 assembly broadens your perspective on assembly programming as a discipline. It helps you understand GCC and Clang compiler output, write performance-critical C extensions, reverse-engineer software, and contribute to open-source systems projects. The mental model of registers, stack frames, and instruction-level programming transfers directly from HLASM — only the syntax and architecture specifics differ.

x86 Assembly Programming: An HLASM Developer's Guide

If you already know HLASM, learning x86-64 assembly is significantly easier than starting from scratch. The fundamental concepts are identical: registers hold working data, instructions manipulate those registers and memory, a stack manages function calls, and every conditional branch tests a flags register. What changes is the architecture, the syntax, the toolchain, and the OS interface.

This guide bridges the gap. It introduces x86-64 assembly with explicit comparisons to z/Architecture and HLASM throughout, so you can build on what you already know.

x86-64 Architecture Overview

x86-64 (also called AMD64 or Intel 64) is the 64-bit extension of the original Intel 8086 architecture from 1978. Like z/Architecture, it has accumulated decades of backward-compatible extensions. Unlike z/Architecture, it is little-endian — the least significant byte of a multi-byte integer is stored at the lowest memory address.
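The byte-order difference is easy to observe directly. The following is a minimal sketch, assuming a Linux x86-64 system with NASM and ld installed; the label value is a name made up for this example. It stores a 4-byte integer and returns the byte found at its lowest address as the process exit status.

```nasm
; endian.asm: minimal sketch showing little-endian storage on x86-64
; Build: nasm -f elf64 endian.asm -o endian.o && ld -o endian endian.o
; The label "value" is a hypothetical name chosen for this example.

section .data
    value   dd  0x11223344          ; 4-byte integer; in memory: 44 33 22 11

section .text
    global _start

_start:
    movzx   edi, byte [value]       ; load only the byte at the lowest address
    mov     eax, 60                 ; syscall number: exit
    syscall                         ; exit status = 0x44
```

Running ./endian and then echo $? prints 68 (0x44), the least significant byte. A big-endian machine such as z/Architecture would keep 0x11 at the lowest address instead.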

Key architectural differences from z/Architecture

Feature	z/Architecture (HLASM)	x86-64
Byte order	Big-endian	Little-endian
General registers	16 × 64-bit (GR0–GR15)	16 × 64-bit (RAX–R15)
Addressing	Base + index + displacement (12-bit unsigned or 20-bit signed disp)	Many modes, up to 32-bit displacement
Condition codes	2-bit CC in PSW	RFLAGS register (multiple flag bits)
System call	SVC instruction	SYSCALL instruction
Memory model	Virtual, flat per address space	Virtual, flat
Character encoding	EBCDIC (z/OS)	ASCII / UTF-8 (Linux/Windows)

x86-64 Registers

x86-64 has 16 general-purpose 64-bit registers. Their names carry historical baggage from 16-bit and 32-bit predecessors:

64-bit	32-bit	16-bit	8-bit (low)	Common Use
RAX	EAX	AX	AL	Accumulator, return value
RBX	EBX	BX	BL	Callee-saved, base register (legacy)
RCX	ECX	CX	CL	Counter, 4th argument
RDX	EDX	DX	DL	Data, 3rd argument
RSI	ESI	SI	SIL	Source index, 2nd argument
RDI	EDI	DI	DIL	Dest index, 1st argument
RSP	ESP	SP	SPL	Stack pointer
RBP	EBP	BP	BPL	Base pointer (stack frame)
R8–R15	R8D–R15D	R8W–R15W	R8B–R15B	Additional arguments / scratch

Compared to HLASM: z/Architecture's GR0–GR15 are symmetrical — any register can be used for any purpose (with the convention exceptions covered in the architecture fundamentals article). x86-64 registers are more asymmetric; RSP is always the stack pointer and cannot be repurposed freely.

Writing to a 32-bit sub-register (e.g., MOV EAX, 1) zero-extends the result into the full 64-bit register on x86-64. This is different from writing to a 16-bit sub-register, which leaves the upper 48 bits unchanged.
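A short program makes the two behaviours concrete. This is a sketch under the same assumptions as the other examples in this guide (Linux x86-64, NASM, ld); it counts how many of the two expectations hold and returns that count as the exit status.

```nasm
; subreg.asm: sketch of 32-bit versus 16-bit sub-register writes
; Build: nasm -f elf64 subreg.asm -o subreg.o && ld -o subreg subreg.o

section .text
    global _start

_start:
    mov     rax, -1                 ; RAX = 0xFFFFFFFFFFFFFFFF
    mov     eax, 1                  ; 32-bit write: RAX = 0x0000000000000001
    mov     rbx, -1                 ; RBX = 0xFFFFFFFFFFFFFFFF
    mov     bx, 1                   ; 16-bit write: RBX = 0xFFFFFFFFFFFF0001

    xor     edi, edi                ; EDI = number of checks passed so far
    cmp     rax, 1
    jne     .skip1
    inc     edi                     ; check 1: upper 32 bits were cleared
.skip1:
    mov     rcx, 0xFFFFFFFFFFFF0001 ; expected RBX value
    cmp     rbx, rcx
    jne     .skip2
    inc     edi                     ; check 2: upper 48 bits were preserved
.skip2:
    mov     eax, 60                 ; syscall number: exit
    syscall                         ; exit status = checks passed
```

After ./subreg, echo $? prints 2. HLASM developers will recognise the 16-bit case: it behaves like the 32-bit z/Architecture instructions that leave the high half of a 64-bit register untouched.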

The RFLAGS Register

x86-64's equivalent of HLASM's condition code is the RFLAGS register, a 64-bit register with individual flag bits:

Flag	Meaning
ZF (Zero Flag)	Set if the result of an operation is zero
SF (Sign Flag)	Set if the result is negative
CF (Carry Flag)	Set on unsigned overflow / borrow
OF (Overflow Flag)	Set on signed overflow
PF (Parity Flag)	Set if the low byte of the result has an even number of 1 bits

Conditional jump instructions test specific combinations of these flags — analogous to BC in HLASM testing specific condition code values.
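The difference between the carry and overflow flags is the part that most often surprises newcomers, so here is a small sketch (assuming Linux x86-64, NASM, and ld). It performs two 8-bit additions, captures OF and CF after each with SETcc instructions, and packs the four bits into the exit status.

```nasm
; flags.asm: sketch contrasting the carry flag (CF) and overflow flag (OF)
; Build: nasm -f elf64 flags.asm -o flags.o && ld -o flags flags.o

section .text
    global _start

_start:
    mov     al, 0x7F
    add     al, 1                   ; 127 + 1 = 0x80: signed overflow, no carry
    seto    bl                      ; BL  = OF = 1
    setc    cl                      ; CL  = CF = 0

    mov     al, 0xFF
    add     al, 1                   ; 255 + 1 wraps to 0: carry, no signed overflow
    seto    dl                      ; DL  = OF = 0
    setc    sil                     ; SIL = CF = 1

    movzx   edi, bl                 ; bit 0: OF after first add
    movzx   ecx, cl
    lea     edi, [rdi + rcx*2]      ; bit 1: CF after first add
    movzx   edx, dl
    lea     edi, [rdi + rdx*4]      ; bit 2: OF after second add
    movzx   esi, sil
    lea     edi, [rdi + rsi*8]      ; bit 3: CF after second add

    mov     eax, 60                 ; syscall number: exit
    syscall                         ; exit status = packed flag bits
```

The exit status is 9 (binary 1001): overflow without carry for the first addition, carry without overflow for the second. On z/Architecture the same distinction shows up as the different condition codes set by the signed (A, AR) and logical (AL, ALR) add instructions.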

The NASM Assembler and Toolchain

NASM (Netwide Assembler) is a widely used assembler for x86-64 on Linux. It fills the role that ASMA90, the HLASM assembler program, fills on z/OS.

Installing and using NASM on Linux

bash

sudo apt install nasm          # Debian/Ubuntu
sudo dnf install nasm          # RHEL/Fedora

nasm -f elf64 program.asm -o program.o   # assemble to ELF object
ld -o program program.o                   # link to executable
./program                                 # run

Compared to HLASM: On z/OS you submit JCL to invoke ASMA90 and BINDER as batch jobs. On Linux, NASM and ld are command-line tools you run interactively from a terminal.

A first x86-64 program

nasm

; hello.asm: write "Hello, World!\n" to stdout using Linux system calls
; Assembled with: nasm -f elf64 hello.asm -o hello.o && ld -o hello hello.o

section .data
    msg     db  "Hello, World!", 10    ; string + newline (LF = 0x0A)
    msglen  equ $ - msg                ; calculate length at assemble time

section .text
    global _start

_start:
    mov     rax, 1          ; syscall number: write (sys_write)
    mov     rdi, 1          ; file descriptor: 1 = stdout
    mov     rsi, msg        ; pointer to the message
    mov     rdx, msglen     ; number of bytes to write
    syscall                 ; invoke the Linux kernel

    mov     rax, 60         ; syscall number: exit (sys_exit)
    xor     rdi, rdi        ; exit code: 0
    syscall

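Compared to HLASM: The HLASM equivalent uses WTO for console output and RETURN for exit. Both are macros, but only WTO traps to the operating system (it expands to SVC 35); RETURN restores the caller's registers and branches back through R14. The x86-64 program issues SYSCALL directly for both the write and the exit, passing arguments in registers. The concept of trapping to the kernel is the same; the registers and system call numbers differ.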

x86-64 Memory Addressing

x86-64 uses a more flexible addressing model than z/Architecture's strict base-displacement scheme.

Addressing mode syntax (NASM Intel syntax)

nasm

mov rax, [rbx]              ; load from address in RBX
mov rax, [rbx + 8]          ; base + displacement
mov rax, [rbx + rcx]        ; base + index
mov rax, [rbx + rcx*8]      ; base + index * scale (scale: 1,2,4,8)
mov rax, [rbx + rcx*8 + 16] ; base + index * scale + displacement
mov rax, [rel symbol]       ; RIP-relative: address relative to instruction pointer

The last form — RIP-relative addressing — is crucial in x86-64. Because executables can be loaded at arbitrary addresses (Position Independent Code / PIE), data and code references use addresses relative to the current instruction pointer rather than absolute addresses. HLASM uses USING and base registers to achieve the same goal.

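Compared to HLASM: With the classic 12-bit unsigned displacement, a single base register covers 4096 bytes; the long-displacement instruction forms extend this to a 20-bit signed displacement (±512 KB). x86-64 allows a 32-bit signed displacement, so a single reference can reach ±2 GB from the base register, and there is no USING-style bookkeeping to tell the assembler which register covers which storage.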
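To see two of these modes working together, the sketch below sums a small array using RIP-relative addressing to obtain the base address and base + index*scale addressing to walk the elements. It assumes Linux x86-64 with NASM and ld; the names table and count are invented for the example.

```nasm
; sumarr.asm: sketch combining RIP-relative and scaled-index addressing
; Build: nasm -f elf64 sumarr.asm -o sumarr.o && ld -o sumarr sumarr.o
; "table" and "count" are hypothetical names chosen for this example.

section .data
    table   dq  10, 20, 30, 40      ; four 8-byte integers
    count   equ ($ - table) / 8     ; element count, computed at assemble time

section .text
    global _start

_start:
    lea     rbx, [rel table]        ; RIP-relative: RBX = address of table
    xor     eax, eax                ; running sum = 0
    xor     ecx, ecx                ; index = 0
.next:
    add     rax, [rbx + rcx*8]      ; base + index*8: add table[index]
    inc     rcx                     ; next element
    cmp     rcx, count
    jb      .next                   ; loop while index < count (unsigned)

    mov     rdi, rax                ; exit status = sum
    mov     eax, 60                 ; syscall number: exit
    syscall
```

After ./sumarr, echo $? prints 100. The closest HLASM shape would be an indexed RX-format reference such as L R1,0(R2,R3), with the multiplication by the element size done by hand in the index register.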

Arithmetic and Logical Instructions

x86-64 arithmetic instructions are similar in concept to HLASM but use two-operand syntax where the first operand is both source and destination:

nasm

; NASM (Intel syntax)                  ; HLASM equivalent
add     rax, rbx        ; RAX += RBX   -- AR  R1,R2
sub     rax, 10         ; RAX -= 10    -- S   R1,=F'10'
imul    rax, rbx        ; RAX *= RBX   -- MSR R1,R2 (single-register product; MR needs an even-odd pair)
inc     rcx             ; RCX++        -- A   R1,=F'1'
dec     rsi             ; RSI--        -- S   R1,=F'1'
neg     rax             ; RAX = -RAX   -- LCR R1,R1
and     rax, 0xFF       ; mask bits    -- N   R1,=X'000000FF'
or      rax, rbx        ; set bits     -- OR  R1,R2
xor     rax, rax        ; zero RAX     -- XR  R1,R1 (fastest zero idiom)
shl     rax, 3          ; shift left   -- SLL R1,3
shr     rax, 1          ; logical shr  -- SRL R1,1
sar     rax, 1          ; arith shr    -- SRA R1,1

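The xor rax, rax idiom to zero a register is universal in x86-64. It encodes in fewer bytes than mov rax, 0, and modern CPUs recognise it as a zeroing operation that does not depend on the register's previous value. Compilers usually emit the 32-bit form, xor eax, eax, which is shorter still and clears all 64 bits because 32-bit writes zero-extend. Unlike MOV, XOR modifies the flags.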

Control Flow and Branching

Control flow in x86-64 uses CMP to set flags and conditional jump instructions to branch:

nasm

    cmp     rax, 10         ; sets flags based on RAX - 10 (result discarded)
    je      equal           ; jump if equal (ZF = 1)
    jne     not_equal       ; jump if not equal (ZF = 0)
    jl      less_than       ; jump if less (signed, SF ≠ OF)
    jg      greater_than    ; jump if greater (signed, ZF=0 and SF=OF)
    jb      below           ; jump if below (unsigned, CF = 1)
    ja      above           ; jump if above (unsigned, CF=0 and ZF=0)

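Compared to HLASM: CMP in x86-64 plays the role of the HLASM compare instructions, but a single CMP sets the flags for both interpretations. Where HLASM chooses between CR (signed) and CLR (unsigned, "logical") at the compare, x86-64 chooses at the branch: JL/JG for signed, JB/JA for unsigned. The conditional jump mnemonics (JE, JNE, JL, JG) correspond to HLASM's extended branch mnemonics (BE, BNE, BL, BH). The underlying mechanism of comparing, setting status bits, and branching on the result is the same in both architectures.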
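The signed and unsigned jumps can give opposite answers for the same comparison. The sketch below (assuming Linux x86-64, NASM, and ld) compares the bit pattern 0xFFFFFFFFFFFFFFFF against 1 twice and records which branches were taken in the exit status.

```nasm
; signed.asm: sketch of signed (JL) versus unsigned (JB) branching
; Build: nasm -f elf64 signed.asm -o signed.o && ld -o signed signed.o

section .text
    global _start

_start:
    mov     rax, -1                 ; -1 when signed, 2^64 - 1 when unsigned
    xor     edi, edi                ; result bits start at 0

    cmp     rax, 1
    jl      .signed_less            ; taken: -1 < 1 as signed values
    jmp     .check_unsigned
.signed_less:
    or      edi, 1                  ; bit 0: signed "less" was true

.check_unsigned:
    cmp     rax, 1                  ; compare again (OR changed the flags)
    jb      .unsigned_below         ; not taken: 2^64 - 1 is not below 1
    jmp     .done
.unsigned_below:
    or      edi, 2                  ; bit 1: unsigned "below" was true

.done:
    mov     eax, 60                 ; syscall number: exit
    syscall                         ; exit status = result bits
```

The exit status is 1: only the signed branch was taken. In HLASM terms, JL after CMP behaves like BL after a signed compare, and JB behaves like BL after a logical compare.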

Implementing a loop

nasm

    mov     rcx, 10         ; loop counter
.loop:
    ; ... loop body ...
    dec     rcx             ; decrement
    jnz     .loop           ; jump back if not zero

Equivalent HLASM:

hlasm

         LA    R5,10         LOAD COUNTER
LOOP     DS    0H
*        ... LOOP BODY ...
         BCT   R5,LOOP       DECREMENT R5, BRANCH IF NOT ZERO
*        (BCT/BCTR DO NOT SET THE CC, SO BCTR + BNZ WOULD NOT WORK)

The Stack and Calling Conventions

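The x86-64 stack grows downward (toward lower addresses): PUSH decrements RSP, and CALL pushes the return address. z/Architecture has no hardware stack and no PUSH or POP instructions. z/OS standard linkage instead chains save areas through R13, while stack-based conventions such as XPLINK and the Linux on IBM Z ABI maintain a downward-growing stack purely in software.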
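The direction of growth can be checked with a few instructions. This is a minimal sketch, assuming Linux x86-64 with NASM and ld: it records RSP, pushes one 64-bit value, and reports how far RSP moved.

```nasm
; stackdir.asm: sketch showing that PUSH moves RSP toward lower addresses
; Build: nasm -f elf64 stackdir.asm -o stackdir.o && ld -o stackdir stackdir.o

section .text
    global _start

_start:
    mov     rbx, rsp                ; remember the initial stack pointer
    push    rax                     ; RSP -= 8, then store RAX at [RSP]
    sub     rbx, rsp                ; RBX = old RSP - new RSP
    pop     rax                     ; load from [RSP], then RSP += 8

    mov     rdi, rbx                ; exit status = distance RSP moved
    mov     eax, 60                 ; syscall number: exit
    syscall
```

The exit status is 8: one push lowered RSP by eight bytes.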

System V AMD64 ABI (Linux/macOS)

This is the calling convention for Linux and macOS x86-64. It defines:

Arguments: integer and pointer arguments are passed in RDI, RSI, RDX, RCX, R8, R9 (left to right); floating-point arguments use XMM0–XMM7. Additional arguments go on the stack.
Return value: in RAX (and RDX for 128-bit values).
Callee-saved registers: RBX, RBP, R12–R15. The called function must preserve these.
Caller-saved registers: RAX, RCX, RDX, RSI, RDI, R8–R11. May be clobbered by a called function.
Stack alignment: RSP must be 16-byte aligned when the CALL instruction executes.

nasm

; a function that receives two arguments and returns their sum
; long add(long a, long b)  -- a in RDI, b in RSI, result in RAX
add_func:
    push    rbp             ; save caller's base pointer
    mov     rbp, rsp        ; establish stack frame
    ; function body
    mov     rax, rdi        ; return value = first argument
    add     rax, rsi        ; + second argument
    pop     rbp             ; restore base pointer
    ret                     ; return (pop return address, jump to it)

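Compared to HLASM: Under z/OS standard linkage the called routine typically saves R14 through R12 (15 registers) with STM 14,12,12(13), storing them at offset 12 of the caller's 72-byte save area addressed by R13. The System V ABI only requires the callee to preserve RBX, RBP, and R12–R15 (plus RSP); anything else the caller wants to keep, it must save itself. The mechanisms differ, but both conventions define exactly which registers the caller can rely on after a call returns.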
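Seen from the caller's side, the convention looks like the sketch below. It assumes Linux x86-64 with NASM and ld; times3 is a hypothetical function written for this example, and it deliberately uses RBX as scratch so that the save and restore obligation is visible.

```nasm
; callconv.asm: sketch of a System V call with a callee-saved register
; Build: nasm -f elf64 callconv.asm -o callconv.o && ld -o callconv callconv.o

section .text
    global _start

; long times3(long x): hypothetical function; x in RDI, result in RAX.
; It uses RBX as scratch, so it must save and restore it.
times3:
    push    rbx                     ; preserve the caller's RBX
    mov     rbx, rdi                ; RBX = x
    lea     rax, [rbx + rbx*2]      ; RAX = x + 2*x = 3*x
    pop     rbx                     ; restore the caller's RBX
    ret

_start:
    ; The ABI guarantees RSP is 16-byte aligned at process entry,
    ; which is the alignment CALL requires.
    mov     rbx, 5                  ; caller keeps a value in callee-saved RBX
    mov     edi, 7                  ; first argument
    call    times3                  ; RAX = 21
    add     rax, rbx                ; RBX survived the call: 21 + 5

    mov     rdi, rax                ; exit status = 26
    mov     eax, 60                 ; syscall number: exit
    syscall
```

The exit status is 26. Had the caller kept its value in a caller-saved register such as RCX, it would have needed to save it around the call itself.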

System Calls on x86-64 Linux

Linux x86-64 system calls use the SYSCALL instruction. Arguments go in RDI, RSI, RDX, R10, R8, R9. R10 replaces RCX because SYSCALL itself overwrites RCX (with the return address) and R11 (with the saved RFLAGS):

nasm

; Common Linux x86-64 system call numbers
; sys_read   = 0
; sys_write  = 1
; sys_open   = 2
; sys_close  = 3
; sys_exit   = 60
; sys_mmap   = 9

; Read from stdin (fd=0) into buffer (assumes "buffer resb 256" in section .bss)
    mov     rax, 0          ; sys_read
    mov     rdi, 0          ; fd = stdin
    mov     rsi, buffer     ; buffer address
    mov     rdx, 256        ; max bytes to read
    syscall
    ; RAX now holds number of bytes actually read (or negative error code)

Compared to HLASM: z/OS uses SVC instructions with macro-generated parameter lists. The concept — trap to the kernel with a function number and arguments — is the same. The details of which registers carry what, and the system call table, are OS-specific.
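Because errors come back as a negative value in RAX rather than through a separate status field, every system call result should be tested before it is used. The following sketch (assuming Linux x86-64, NASM, and ld; the buffer size of 256 is arbitrary) reads one chunk from stdin, writes it back to stdout, and exits with status 1 if either call fails.

```nasm
; echo1.asm: sketch of system calls with error checking
; Build: nasm -f elf64 echo1.asm -o echo1.o && ld -o echo1 echo1.o

section .bss
    buffer  resb 256                ; uninitialised 256-byte buffer

section .text
    global _start

_start:
    xor     eax, eax                ; syscall number: read (0)
    xor     edi, edi                ; fd 0 = stdin
    lea     rsi, [rel buffer]       ; destination buffer
    mov     edx, 256                ; maximum bytes to read
    syscall
    test    rax, rax
    js      .error                  ; negative RAX means -errno

    mov     rdx, rax                ; bytes to write = bytes read (0 at EOF)
    mov     eax, 1                  ; syscall number: write
    mov     edi, 1                  ; fd 1 = stdout
    lea     rsi, [rel buffer]       ; source buffer
    syscall
    test    rax, rax
    js      .error

    xor     edi, edi                ; exit status 0: success
    jmp     .exit
.error:
    mov     edi, 1                  ; exit status 1: a system call failed
.exit:
    mov     eax, 60                 ; syscall number: exit
    syscall
```

Running echo hi | ./echo1 prints hi. The test/js pair is the x86-64 counterpart of checking the return code in R15 after a z/OS service call.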

Debugging x86-64 Assembly with GDB

GDB (GNU Debugger) is the standard Linux debugger for x86-64 assembly — the equivalent of IBM's Debug Tool / IPCS on z/OS.

bash

nasm -f elf64 -g -F dwarf program.asm -o program.o   # assemble with debug info
ld -o program program.o
gdb ./program

Essential GDB commands for assembly:

text

(gdb) layout asm          # show disassembly pane
(gdb) layout regs         # show registers alongside disassembly
(gdb) si                  # step one instruction (step into calls)
(gdb) ni                  # step one instruction (step over calls)
(gdb) info registers      # display all register values
(gdb) x/8xg $rsp          # examine 8 quadwords at stack pointer
(gdb) break _start        # set breakpoint at label
(gdb) run                 # start execution

Compared to HLASM debugging: HLASM debugging under IPCS involves reading a formatted storage dump — a text representation of memory and register contents at the time of an ABEND. GDB is interactive: you step through instructions while the program is live, watching register and memory state change in real time.

Conclusion

x86-64 assembly is a natural extension for any HLASM developer. The core concepts — registers, memory addressing, condition flags, the call stack, and system calls — are universal. The differences lie in syntax, calling conventions, byte order, and the specific instruction mnemonics.

The most valuable skill is learning to read compiler output. When you compile a C function with gcc -O2 -S on an x86-64 machine, the result is x86-64 assembly, in AT&T syntax by default (add -masm=intel for the Intel syntax used in this guide). Reading that output fluently lets you see exactly what the compiler did with your code, and your HLASM background gives you a substantial head start.

Continue the track by exploring ARM Assembly Programming: A Comparative Guide for Assembler Developers.