x86 Assembly Programming: HLASM Developer's Guide to Intel Architecture

If you already know HLASM, learning x86-64 assembly is significantly easier than starting from scratch. The fundamental concepts carry over: registers hold working data, instructions manipulate those registers and memory, a linkage convention manages calls and returns, and conditional branches test status bits set by earlier instructions. What changes is the architecture, the syntax, the toolchain, and the OS interface.
This guide bridges the gap. It introduces x86-64 assembly with explicit comparisons to z/Architecture and HLASM throughout, so you can build on what you already know.
x86-64 Architecture Overview
x86-64 (also called AMD64 or Intel 64) is the 64-bit extension of the original Intel 8086 architecture from 1978. Like z/Architecture, it has accumulated decades of backward-compatible extensions. Unlike z/Architecture, it is little-endian — the least significant byte of a multi-byte integer is stored at the lowest memory address.
Key architectural differences from z/Architecture
| Feature | z/Architecture (HLASM) | x86-64 |
|---|---|---|
| Byte order | Big-endian | Little-endian |
| General registers | 16 × 64-bit (GR0–GR15) | 16 × 64-bit (RAX–R15) |
| Addressing | Base + index + displacement (12-bit unsigned, or 20-bit signed in long-displacement formats) | Base + scaled index + displacement (up to 32-bit signed), plus RIP-relative |
| Condition codes | 2-bit CC in PSW | RFLAGS register (multiple flag bits) |
| System call | SVC instruction | SYSCALL instruction |
| Memory model | Virtual, flat per address space | Virtual, flat |
| Character encoding | EBCDIC (z/OS) | ASCII / UTF-8 (Linux/Windows) |
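The byte-order row is the difference most likely to surprise an HLASM programmer, and it can be observed directly. The following is a minimal sketch, assuming a Linux x86-64 system and the nasm and ld commands described later in this guide; the file name endian.asm and the label value are illustrative only.

```nasm
; endian.asm - observe little-endian byte order (illustrative sketch)
section .data
value   dq 0x1122334455667788       ; one 64-bit integer in memory

section .text
global _start
_start:
    movzx edi, byte [rel value]     ; load only the byte at the LOWEST address
    mov   eax, 60                   ; sys_exit
    syscall                         ; exit status = that byte
```

After running the program, `echo $?` prints 136 (0x88), the least significant byte. On a big-endian machine such as z/Architecture, the byte at the lowest address of the same constant would be X'11'.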
x86-64 Registers
x86-64 has 16 general-purpose 64-bit registers. Their names carry historical baggage from 16-bit and 32-bit predecessors:
| 64-bit | 32-bit | 16-bit | 8-bit (low) | Common Use |
|---|---|---|---|---|
| RAX | EAX | AX | AL | Accumulator, return value |
| RBX | EBX | BX | BL | Callee-saved, base pointer (legacy) |
| RCX | ECX | CX | CL | Counter, 4th argument |
| RDX | EDX | DX | DL | Data, 3rd argument |
| RSI | ESI | SI | SIL | Source index, 2nd argument |
| RDI | EDI | DI | DIL | Dest index, 1st argument |
| RSP | ESP | SP | SPL | Stack pointer |
| RBP | EBP | BP | BPL | Base pointer (stack frame) |
| R8–R15 | R8D–R15D | R8W–R15W | R8B–R15B | Additional arguments / scratch |
Compared to HLASM: z/Architecture's GR0–GR15 are symmetrical — any register can be used for any purpose (with the convention exceptions covered in the architecture fundamentals article). x86-64 registers are more asymmetric; RSP is always the stack pointer and cannot be repurposed freely.
Writing to a 32-bit sub-register (e.g., MOV EAX, 1) zero-extends the result into the full 64-bit register on x86-64. This is different from writing to a 16-bit sub-register, which leaves the upper 48 bits unchanged.
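The sub-register rule is easy to misremember, so here is a small self-checking sketch of it. It assumes Linux x86-64 and uses the exit status as the only output: 0 means both registers held the values predicted in the comments, 1 means they did not.

```nasm
; subreg.asm - 32-bit writes zero-extend, 16-bit writes do not (illustrative sketch)
section .text
global _start
_start:
    mov rax, -1                     ; RAX = 0xFFFFFFFFFFFFFFFF
    mov eax, 1                      ; 32-bit write: upper 32 bits cleared, RAX = 1

    mov rbx, -1                     ; RBX = 0xFFFFFFFFFFFFFFFF
    mov bx, 1                       ; 16-bit write: upper 48 bits kept,
                                    ; RBX = 0xFFFFFFFFFFFF0001

    xor edi, edi                    ; exit status 0 unless a check fails
    cmp rax, 1
    jne .fail
    mov rcx, 0xFFFFFFFFFFFF0001     ; CMP has no 64-bit immediate form,
    cmp rbx, rcx                    ; so the expected value goes via RCX
    je  .done
.fail:
    mov edi, 1
.done:
    mov eax, 60                     ; sys_exit
    syscall
```

An HLASM programmer can think of the 32-bit case as behaving like LLGFR (load logical, zero-filling the high half) and the 16-bit case as behaving like an insert that leaves the remaining bits alone.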
The RFLAGS Register
x86-64's equivalent of HLASM's condition code is the RFLAGS register, a 64-bit register with individual flag bits:
| Flag | Meaning |
|---|---|
| ZF (Zero Flag) | Set if the result of an operation is zero |
| SF (Sign Flag) | Set if the result is negative |
| CF (Carry Flag) | Set on unsigned overflow / borrow |
| OF (Overflow Flag) | Set on signed overflow |
| PF (Parity Flag) | Set if the result has even parity |
Conditional jump instructions test specific combinations of these flags — analogous to BC in HLASM testing specific condition code values.
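A single x86-64 ADD reports the signed and the unsigned outcome at the same time, where z/Architecture uses separate instructions (for example AR versus ALR), each with its own condition-code meaning. The sketch below, assuming Linux x86-64, performs two 8-bit additions and checks the flags named in the table; the exit status is 0 if every flag matches the comment and 1 otherwise.

```nasm
; flags.asm - how one ADD sets OF, CF, SF and ZF (illustrative sketch)
section .text
global _start
_start:
    xor edi, edi        ; exit status 0 unless a check fails

    mov al, 0x7F
    add al, 1           ; 0x7F + 1 = 0x80: OF=1 (signed overflow), SF=1, CF=0
    jno .fail           ; OF must be set
    jns .fail           ; SF must be set
    jc  .fail           ; CF must be clear

    mov al, 0xFF
    add al, 1           ; 0xFF + 1 = 0x00: CF=1 (unsigned carry), ZF=1, OF=0
    jnc .fail           ; CF must be set
    jnz .fail           ; ZF must be set
    jo  .fail           ; OF must be clear
    jmp .done
.fail:
    mov edi, 1
.done:
    mov eax, 60         ; sys_exit
    syscall
```

Conditional jumps do not modify RFLAGS, which is why several of them can test the result of one ADD in sequence.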
The NASM Assembler and Toolchain
NASM (Netwide Assembler) is a widely used assembler for x86-64 on Linux and the one used throughout this guide. It plays the role that the High Level Assembler (program ASMA90) plays on z/OS.
Installing and using NASM on Linux
sudo apt install nasm # Debian/Ubuntu
sudo dnf install nasm # RHEL/Fedora
nasm -f elf64 program.asm -o program.o # assemble to ELF object
ld -o program program.o # link to executable
./program # run
Compared to HLASM: On z/OS you submit JCL to invoke ASMA90 and BINDER as batch jobs. On Linux, NASM and ld are command-line tools you run interactively from a terminal.
A first x86-64 program
; hello.asm - write "Hello, World!\n" to stdout using Linux system calls
; Assembled with: nasm -f elf64 hello.asm -o hello.o && ld -o hello hello.o
section .data
msg db "Hello, World!", 10 ; string + newline (LF = 0x0A)
msglen equ $ - msg ; calculate length at assemble time
section .text
global _start
_start:
mov rax, 1 ; syscall number: write (sys_write)
mov rdi, 1 ; file descriptor: 1 = stdout
mov rsi, msg ; pointer to the message
mov rdx, msglen ; number of bytes to write
syscall ; invoke the Linux kernel
mov rax, 60 ; syscall number: exit (sys_exit)
xor rdi, rdi ; exit code: 0
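syscall
Compared to HLASM: The HLASM equivalent uses WTO for console output and RETURN for exit. Both are macros: WTO expands to a parameter list plus an SVC 35, while RETURN restores the caller's registers and branches back through R14 without issuing an SVC. The x86-64 program has no macro layer; it loads the arguments into registers and issues SYSCALL directly, both to write and to exit. The idea of trapping to the operating system is the same; the registers and call numbers differ.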
x86-64 Memory Addressing
x86-64 uses a more flexible addressing model than z/Architecture's strict base-displacement scheme.
Addressing mode syntax (NASM Intel syntax)
mov rax, [rbx] ; load from address in RBX
mov rax, [rbx + 8] ; base + displacement
mov rax, [rbx + rcx] ; base + index
mov rax, [rbx + rcx*8] ; base + index * scale (scale: 1,2,4,8)
mov rax, [rbx + rcx*8 + 16] ; base + index * scale + displacement
mov rax, [rel symbol] ; RIP-relative: address relative to instruction pointer
The last form, RIP-relative addressing, is crucial in x86-64. Because executables can be loaded at arbitrary addresses (position-independent code, as in PIE executables), data and code references are encoded as offsets from the current instruction pointer rather than as absolute addresses. HLASM achieves the same relocatability with USING and base registers.
Compared to HLASM: HLASM's 12-bit displacement means a single base register covers 4096 bytes. x86-64 allows a 32-bit signed displacement, so a single reference can reach ±2 GB from the base register — much more flexible, but also more implicit.
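To see two of these modes working together, the following sketch walks a small table using RIP-relative addressing to find it and base + index*scale to step through it. It assumes Linux x86-64; the names table and count are illustrative.

```nasm
; sumtable.asm - RIP-relative LEA plus scaled-index addressing (illustrative sketch)
section .data
table   dq 10, 20, 30, 40           ; four 64-bit integers
count   equ ($ - table) / 8         ; number of elements, computed at assemble time

section .text
global _start
_start:
    lea rbx, [rel table]            ; RBX = address of table (RIP-relative)
    xor ecx, ecx                    ; element index = 0
    xor edi, edi                    ; running sum = 0
.next:
    add rdi, [rbx + rcx*8]          ; base + index * 8: the CPU scales the index
    inc rcx
    cmp rcx, count
    jb  .next                       ; loop while index < count (unsigned)

    mov eax, 60                     ; sys_exit
    syscall                         ; exit status = sum
```

After running it, `echo $?` prints 100. In HLASM the index register of an RX-format instruction is not scaled, so the equivalent loop would typically advance the index by 8 on each pass, as in A R5,0(R4,R3) followed by LA R4,8(,R4).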
Arithmetic and Logical Instructions
x86-64 arithmetic instructions are similar in concept to HLASM but use two-operand syntax where the first operand is both source and destination:
; NASM (Intel syntax) ; HLASM equivalent
add rax, rbx ; RAX += RBX -- AR R1,R2
sub rax, 10 ; RAX -= 10 -- S R1,=F'10'
imul rax, rbx ; RAX *= RBX -- MSR R1,R2 (signed multiply, result in one register)
inc rcx ; RCX++ -- A R1,=F'1'
dec rsi ; RSI-- -- S R1,=F'1'
neg rax ; RAX = -RAX -- LCR R1,R1
and rax, 0xFF ; mask bits -- N R1,=X'000000FF'
or rax, rbx ; set bits -- OR R1,R2
xor rax, rax ; zero RAX -- XR R1,R1 (fastest zero idiom)
shl rax, 3 ; shift left -- SLL R1,3
shr rax, 1 ; logical shr -- SRL R1,1
sar rax, 1 ; arith shr -- SRA R1,1
The xor rax, rax idiom to zero a register is universal in x86-64. It encodes in fewer bytes than mov rax, 0 and is recognised by the CPU as a register-zeroing operation. Because 32-bit writes zero-extend, xor eax, eax has the same effect with an even shorter encoding.
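The difference between SHR and SAR (SRL and SRA in HLASM terms) only shows up on negative values. The following self-checking sketch, assuming Linux x86-64, shifts the same negative number both ways and exits with status 0 if the results match the comments.

```nasm
; shifts.asm - logical versus arithmetic right shift (illustrative sketch)
section .text
global _start
_start:
    mov rax, -8                     ; 0xFFFFFFFFFFFFFFF8
    sar rax, 1                      ; arithmetic: sign bit copied in, RAX = -4

    mov rbx, -8
    shr rbx, 1                      ; logical: zero shifted in,
                                    ; RBX = 0x7FFFFFFFFFFFFFFC

    xor edi, edi                    ; exit status 0 unless a check fails
    cmp rax, -4
    jne .fail
    mov rcx, 0x7FFFFFFFFFFFFFFC     ; expected logical-shift result
    cmp rbx, rcx
    je  .done
.fail:
    mov edi, 1
.done:
    mov eax, 60                     ; sys_exit
    syscall
```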
Control Flow and Branching
Control flow in x86-64 uses CMP to set flags and conditional jump instructions to branch:
cmp rax, 10 ; sets flags based on RAX - 10 (result discarded)
je equal ; jump if equal (ZF = 1)
jne not_equal ; jump if not equal (ZF = 0)
jl less_than ; jump if less (signed, SF ≠ OF)
jg greater_than ; jump if greater (signed, ZF=0 and SF=OF)
jb below ; jump if below (unsigned, CF = 1)
ja above ; jump if above (unsigned, CF=0 and ZF=0)
Compared to HLASM: CMP sets the flags for both interpretations at once, so the choice between signed and unsigned is made by the jump (JL/JG versus JB/JA). In HLASM the choice is made by the compare: CR for signed, CLR for logical (unsigned), and CLC for storage-to-storage. The conditional jump mnemonics (JE, JNE, JL, JG) correspond to HLASM's extended branch mnemonics (BE, BNE, BL, BH). The underlying mechanism (compare, set status bits, branch on them) is the same in both architectures.
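Picking the wrong family of jumps is a classic bug, so it is worth seeing one comparison give opposite answers. In this sketch, assuming Linux x86-64, the value -1 is compared with 1: as a signed number it is less, as an unsigned number (all bits set) it is above. The exit status is 0 if both jumps behave as the comments claim.

```nasm
; signcmp.asm - one CMP, two interpretations (illustrative sketch)
section .text
global _start
_start:
    xor edi, edi        ; exit status 0 unless a check fails
    mov rax, -1         ; signed: -1   unsigned: 0xFFFFFFFFFFFFFFFF

    cmp rax, 1
    jl  .signed_ok      ; signed view: -1 < 1, so JL is taken
    jmp .fail
.signed_ok:
    cmp rax, 1
    ja  .done           ; unsigned view: 0xFFFF...FFFF > 1, so JA is taken
.fail:
    mov edi, 1
.done:
    mov eax, 60         ; sys_exit
    syscall
```

The HLASM counterpart would be two different compares, CGR for the signed view and CLGR for the logical view, each followed by the same BL or BH.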
Implementing a loop
mov rcx, 10 ; loop counter
.loop:
; ... loop body ...
dec rcx ; decrement
jnz .loop ; jump back if not zero
Equivalent HLASM:
LA R5,10 LOAD COUNTER
LOOP DS 0H
* ... LOOP BODY ...
* BCT/BCTR DO NOT SET THE CONDITION CODE, SO BCT BRANCHES BY ITSELF
BCT R5,LOOP DECREMENT R5, BRANCH IF RESULT IS NOT ZERO
The Stack and Calling Conventions
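The x86-64 stack grows downward (toward lower addresses), and PUSH, POP, CALL, and RET all use RSP implicitly. z/Architecture has no hardware stack pointer: z/OS standard linkage chains save areas through R13 instead, while Linux on IBM Z keeps a downward-growing stack addressed by R15.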
System V AMD64 ABI (Linux/macOS)
This is the calling convention for Linux and macOS x86-64. It defines:
- Arguments: passed in RDI, RSI, RDX, RCX, R8, R9 (left to right). Additional arguments go on the stack.
- Return value: in RAX (and RDX for 128-bit values).
- Callee-saved registers: RBX, RBP, R12–R15. The called function must preserve these.
- Caller-saved registers: RAX, RCX, RDX, RSI, RDI, R8–R11. May be clobbered by a called function.
- Stack alignment: RSP must be 16-byte aligned when the CALL instruction executes.
; a function that receives two arguments and returns their sum
; long add(long a, long b) -- a in RDI, b in RSI, result in RAX
add_func:
push rbp ; save caller's base pointer
mov rbp, rsp ; establish stack frame
; function body
mov rax, rdi ; return value = first argument
add rax, rsi ; + second argument
pop rbp ; restore base pointer
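ret ; return (pop return address, jump to it)
Compared to HLASM: Under z/OS standard linkage, the called routine typically stores R14 through R12 with STM 14,12,12(13) into the 72-byte save area that R13 points to. The System V ABI only requires the callee to preserve RBX, RBP, and R12–R15 (and to return with RSP restored); the caller must save any other register whose value it still needs. The mechanisms differ, but both conventions define exactly which registers a caller can rely on after a call returns.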
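To show the convention from the caller's side, the sketch below adds a hypothetical add3 routine that calls add_func twice and needs one value to survive the first call, so it parks that value in callee-saved RBX. It assumes Linux x86-64; add_func is written here as a leaf routine without a frame pointer, which the ABI permits.

```nasm
; callconv.asm - System V argument passing and a callee-saved register (illustrative sketch)
section .text
global _start

; long add_func(long a, long b)   a in RDI, b in RSI, result in RAX
add_func:
    mov rax, rdi
    add rax, rsi
    ret

; long add3(long a, long b, long c)   hypothetical helper, c arrives in RDX
add3:
    push rbx            ; preserve the caller's RBX; this also brings RSP
                        ; back to 16-byte alignment before the CALLs below
    mov  rbx, rdx       ; keep c where add_func is not allowed to clobber it
    call add_func       ; RAX = a + b (RDI and RSI already hold a and b)
    mov  rdi, rax       ; first argument  = a + b
    mov  rsi, rbx       ; second argument = c
    call add_func       ; RAX = a + b + c
    pop  rbx            ; restore the caller's RBX
    ret

_start:
    mov edi, 30         ; a
    mov esi, 5          ; b
    mov edx, 7          ; c
    call add3           ; RAX = 42
    mov edi, eax        ; exit status = result
    mov eax, 60         ; sys_exit
    syscall
```

After running it, `echo $?` prints 42. The push rbx / pop rbx pair is the x86-64 counterpart of saving and restoring a register in the save area, except that only the registers actually used need to be saved.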
System Calls on x86-64 Linux
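Linux x86-64 system calls use the SYSCALL instruction. The call number goes in RAX and the arguments in RDI, RSI, RDX, R10, R8, R9 (note: R10, not RCX, for syscalls). The result comes back in RAX, and the SYSCALL instruction itself overwrites RCX and R11: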
; Common Linux x86-64 system call numbers
; sys_read = 0
; sys_write = 1
; sys_open = 2
; sys_close = 3
; sys_exit = 60
; sys_mmap = 9
; Read from stdin (fd=0) into buffer
mov rax, 0 ; sys_read
mov rdi, 0 ; fd = stdin
mov rsi, buffer ; buffer address
mov rdx, 256 ; max bytes to read
syscall
; RAX now holds number of bytes actually read (or negative error code)
Compared to HLASM: z/OS uses SVC instructions with macro-generated parameter lists. The concept is the same: trap to the kernel with a function number and arguments. The details of which registers carry what, and the system call table, are OS-specific.
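The read fragment above leaves out the buffer definition and the error check. A complete version might look like the following sketch, which assumes Linux x86-64, reads one chunk from stdin, and writes the same bytes back to stdout. The 256-byte buffer size comes from the fragment; everything else is illustrative.

```nasm
; echo.asm - read once from stdin, write the bytes back to stdout (illustrative sketch)
section .bss
buffer  resb 256                    ; uninitialised 256-byte buffer

section .text
global _start
_start:
    xor eax, eax                    ; sys_read = 0
    xor edi, edi                    ; fd 0 = stdin
    lea rsi, [rel buffer]           ; buffer address
    mov edx, 256                    ; max bytes to read
    syscall                         ; RAX = bytes read, 0 at EOF, negative on error
                                    ; (RCX and R11 are overwritten by SYSCALL)
    test rax, rax
    js  .error                      ; negative return value: report failure

    mov rdx, rax                    ; write exactly as many bytes as were read
    mov eax, 1                      ; sys_write = 1
    mov edi, 1                      ; fd 1 = stdout
    lea rsi, [rel buffer]           ; reload the buffer address for clarity
    syscall

    xor edi, edi                    ; exit status 0
    jmp .exit
.error:
    mov edi, 1                      ; exit status 1
.exit:
    mov eax, 60                     ; sys_exit = 60
    syscall
```

Running `echo hi | ./echo` should print hi. A production program would also loop on short writes and check the return value of sys_write; this sketch omits both for brevity.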
Debugging x86-64 Assembly with GDB
GDB (GNU Debugger) is the standard Linux debugger for x86-64 assembly — the equivalent of IBM's Debug Tool / IPCS on z/OS.
nasm -f elf64 -g -F dwarf program.asm -o program.o # assemble with debug info
ld -o program program.o
gdb ./program
Essential GDB commands for assembly:
(gdb) layout asm # show disassembly pane
(gdb) layout regs # show registers alongside disassembly
(gdb) si # step one instruction (step into calls)
(gdb) ni # step one instruction (step over calls)
(gdb) info registers # display all register values
(gdb) x/8xg $rsp # examine 8 quadwords at stack pointer
(gdb) break _start # set breakpoint at label
(gdb) run # start execution
Compared to HLASM debugging: HLASM debugging under IPCS means reading a formatted storage dump, a text representation of memory and register contents at the time of an ABEND. GDB is interactive: you step through instructions while the program is live, watching register and memory state change in real time.
Conclusion
x86-64 assembly is a natural extension for any HLASM developer. The core concepts — registers, memory addressing, condition flags, the call stack, and system calls — are universal. The differences lie in syntax, calling conventions, byte order, and the specific instruction mnemonics.
The most valuable skill is learning to read compiler output. When you compile a C function with gcc -O2 -S on an x86-64 system, the result is x86-64 assembly, in AT&T syntax by default (add -masm=intel to get the Intel operand order used in this guide). Reading that output fluently lets you verify what the compiler actually did with your code, and your HLASM background gives you a substantial head start.
Continue the track by exploring ARM Assembly Programming: A Comparative Guide for Assembler Developers.
