
C Structs, Unions & Data Alignment: Memory Layout Mastery

TopicTrick Team




Structs: Custom Data Types

Arrays store many items of the same type contiguously. Structs store items of different types grouped under one name. This is the primary building block for complex data modeling in C:


C23 Designated Initializers

C99 introduced designated initializers, and C23 keeps them while also allowing the empty initializer = {}. They let you initialize specific fields by name, in any order, leaving unlisted fields zero-initialized:


Memory Layout: Alignment and Padding Explained

CPUs read memory in fixed-size chunks aligned to their bus width. Reading a 4-byte int is fastest when its address is divisible by 4. If the int starts at address 0x003, it occupies bytes 0x003 through 0x006 and straddles two aligned words, so the CPU must perform two reads (one for 0x000-0x003 and one for 0x004-0x007) and then combine them. This is called a misaligned access. It can cost roughly twice as much as an aligned read, and some architectures trap on it entirely.

To prevent misaligned access, C compilers add padding bytes between struct fields to ensure each field starts at an aligned address:


Optimizing Struct Field Order

The general rule for minimizing padding: declare fields from largest to smallest alignment requirement:

| Type | Alignment |
| --- | --- |
| double, int64_t, pointers | 8 bytes |
| int32_t, float | 4 bytes |
| int16_t | 2 bytes |
| char, int8_t | 1 byte |

This optimization matters significantly in situations where millions of structs are stored (database records, network packet metadata, game entity systems).


The offsetof Macro

The offsetof(type, member) macro (from <stddef.h>) returns the byte offset of a member within a struct. This is essential for generic serialization code, binary protocol implementation, and container_of patterns:


The container_of pattern (fundamental in the Linux kernel) uses offsetof to recover a parent struct's pointer from a member pointer:


Unions: Multiple Interpretations of the Same Memory

A union stores all its members at the same starting address. The union is at least as large as its largest member (the compiler may round the size up to satisfy alignment). Only one member contains valid data at a time:


Unions are essential for:

  • Type punning: Reinterpreting the bits of a float as an integer (or vice versa), as in the fast inverse square root algorithm and other IEEE 754 bit manipulation.
  • Protocol parsing: A network packet header that can be viewed as raw bytes or as a structured layout.
  • Register overlays: Mapping hardware registers that share bits between fields.

Tagged Unions: Type-Safe Discriminated Unions

Raw unions are unsafe — you can accidentally read the wrong member. A tagged union (discriminated union) wraps the union in a struct with a "tag" field indicating which member is currently valid:


This pattern is used in scripting language interpreters, JSON parsers, and any system that handles heterogeneous data.


Bit-Fields: Sub-Byte Precision

C allows you to specify how many bits a field should occupy using : in a struct definition. This is invaluable in embedded systems where RAM is measured in kilobytes:


[!WARNING] Bit-field layout is implementation-defined — the standard does not specify byte order or bit ordering within a field. Never serialize a struct with bit-fields directly to a network socket or file if portability is required.


Flexible Array Members (FAMs)

C99 introduced flexible array members: an array declared without a size (written data[], not the zero-length data[0] of the older GNU extension) as the last member of a struct. This allows variable-size struct instances with a single allocation:


FAMs are used extensively in file system structures, network packet buffers, and any case where a header needs to be followed by variable-length data in a single contiguous allocation.


Packed Structs: Overriding Alignment

You can force the compiler to remove all padding using __attribute__((packed)) (GCC/Clang) or #pragma pack(1) (MSVC, also accepted by GCC and Clang):


[!CAUTION] Packed structs can place fields at misaligned addresses. Accessing them may be slower on x86-64, and on strict-alignment architectures like ARM Cortex-M0 a misaligned load or store through an ordinary pointer causes a hardware fault. Only use __attribute__((packed)) when you need exact binary layout (network protocols, file format headers) and are aware of the performance cost.


Frequently Asked Questions

Why does the compiler add padding and not the hardware? The hardware requires aligned access, but has no concept of C structs. The compiler, during compilation, has full knowledge of each field's type and alignment requirement. It inserts padding bytes precisely to ensure every field will be aligned correctly at runtime. You can inspect padding with pahole or -Wpadded.

How can I see the actual memory layout of my struct? Use pahole ./program (part of the dwarves package on Linux) to see a detailed breakdown of every struct's padding and total size, derived from DWARF debug information. Alternatively, print offsetof for each field by hand.

Should I always reorder fields to minimize padding? Not always. Sometimes related fields should be grouped together for code readability, even if they waste a byte or two. Prioritize optimization when: storing millions of structs in memory, the struct is in a performance-critical hot path, or you're targeting memory-constrained embedded systems.

What is alignas in C23? C11 added the _Alignas specifier, with alignas available as a macro from <stdalign.h>; C23 makes alignas a keyword. It explicitly sets the alignment of a variable or struct member: alignas(64) int cache_line_data[16]; places the array on a 64-byte boundary, which matches the cache line size on many CPUs and helps avoid false sharing in multi-threaded code.


Key Takeaway

Structs and unions are C's primary tools for modeling structured data. By understanding data alignment and optimizing field order, you ensure your struct layouts are compact, cache-friendly, and hardware-efficient. By using unions and tagged variants, you gain C's closest equivalent to polymorphism: flexible, type-aware data containers whose only runtime cost is a tag check.



Part of the C Mastery Course — 30 modules from systems fundamentals to production-grade engineering.