C Structs, Unions & Data Alignment: Memory Layout Mastery

Table of Contents
- Structs: Custom Data Types
- C23 Designated Initializers
- Memory Layout: Alignment and Padding Explained
- Optimizing Struct Field Order
- The offsetof Macro
- Unions: Multiple Interpretations of the Same Memory
- Tagged Unions: Type-Safe Discriminated Unions
- Bit-Fields: Sub-Byte Precision
- Flexible Array Members (FAMs)
- Packed Structs: Overriding Alignment
- Frequently Asked Questions
- Key Takeaway
Structs: Custom Data Types
Arrays store many items of the same type contiguously. Structs store items of different types grouped under one name. This is the primary building block for complex data modeling in C:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
// Define a new type: NetworkPacket
struct NetworkPacket {
    uint8_t version;        // 1 byte
    uint16_t source_port;   // 2 bytes
    uint16_t dest_port;     // 2 bytes
    uint32_t sequence_num;  // 4 bytes
    uint8_t payload[1024];  // 1024 bytes
};
// Use typedef to avoid writing 'struct' every time
typedef struct NetworkPacket NetworkPacket;
int main(void) {
    NetworkPacket pkt;
    memset(&pkt, 0, sizeof(pkt)); // Zero-initialize all fields
    pkt.version = 4;
    pkt.source_port = 8080;
    pkt.dest_port = 443;
    pkt.sequence_num = 1;
    printf("Packet size: %zu bytes\n", sizeof(pkt));
    printf("Version: %u, Destination: %u\n", pkt.version, pkt.dest_port);
    return 0;
}
C23 Designated Initializers
C99 introduced designated initializers, and C23 builds on them (for example, it allows the empty initializer {} and makes bool, true, and false keywords). They let you initialize specific fields by name, in any order; fields you do not list are zero-initialized:
#include <stdbool.h> // Needed before C23; in C23, bool/true/false are keywords
#include <stdint.h>
#include <stdio.h>
typedef struct {
    char name[64];
    int32_t age;
    double salary;
    bool is_active;
} Employee;
int main(void) {
    // Designated initializer: explicit, readable, safe
    Employee alice = {
        .name = "Alice Johnson",
        .age = 32,
        .salary = 85000.0,
        .is_active = true,
    };
    // Partial initialization: unlisted fields are zero
    Employee temp = { .name = "Temp User" };
    // temp.age = 0, temp.salary = 0.0, temp.is_active = false
    printf("%s earns $%.2f\n", alice.name, alice.salary);
    printf("%s: age %d, active %d\n", temp.name, (int)temp.age, temp.is_active);
    return 0;
}
Memory Layout: Alignment and Padding Explained
CPUs read memory in fixed-size chunks aligned to their bus width. Reading a 4-byte int is fastest when its address is divisible by 4. If it sits at address 0x003, the CPU may have to perform two reads (one for bytes 0x000-0x003 and one for 0x004-0x007) and then combine them. This is called a misaligned access; it is typically slower than an aligned one, and some architectures trap on it entirely.
To prevent misaligned access, C compilers add padding bytes between struct fields to ensure each field starts at an aligned address:
#include <stdio.h>
// Poorly ordered struct: wastes 14 bytes of padding
struct BadLayout {
    char a;   // 1 byte at offset 0
              // 7 bytes padding (to align 'b' to 8-byte boundary)
    double b; // 8 bytes at offset 8
    char c;   // 1 byte at offset 16
              // 7 bytes padding (to bring total size to multiple of 8)
}; // Total: 24 bytes (wasted 14 bytes!)
// Well-ordered struct: only 6 bytes of trailing padding
struct GoodLayout {
    double b; // 8 bytes at offset 0
    char a;   // 1 byte at offset 8
    char c;   // 1 byte at offset 9
              // 6 bytes padding (to bring size to multiple of 8 for array use)
}; // Total: 16 bytes
int main(void) {
    printf("Bad layout: %zu bytes\n", sizeof(struct BadLayout));   // 24
    printf("Good layout: %zu bytes\n", sizeof(struct GoodLayout)); // 16
    return 0;
}
Optimizing Struct Field Order
The general rule for minimizing padding is to declare fields from largest to smallest alignment requirement. Typical alignments on 64-bit platforms:
| Type | Alignment |
|---|---|
| double, int64_t, pointers | 8 bytes |
| int32_t, float | 4 bytes |
| int16_t | 2 bytes |
| char, int8_t | 1 byte |
// OPTIMAL: Sorted by alignment (largest first)
struct OptimalEmployee {
    double salary;           // 8 bytes, align 8, offset 0
    char *name_ptr;          // 8 bytes (pointer), align 8, offset 8
    int64_t employee_id;     // 8 bytes, align 8, offset 16
    int32_t age;             // 4 bytes, align 4, offset 24
    int16_t department_code; // 2 bytes, align 2, offset 28
    int8_t level;            // 1 byte, align 1, offset 30
    bool is_active;          // 1 byte, align 1, offset 31
}; // Total: 32 bytes, ZERO padding!
This optimization matters significantly in situations where millions of structs are stored (database records, network packet metadata, game entity systems).
The offsetof Macro
The offsetof(type, member) macro (from <stddef.h>) returns the byte offset of a member within a struct. This is essential for generic serialization code, binary protocol implementation, and container_of patterns:
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
struct Sensor {
    uint32_t id;
    float temperature;
    float humidity;
    uint64_t timestamp;
};
int main(void) {
    printf("Sensor struct size: %zu\n", sizeof(struct Sensor));
    printf("offset of id: %zu\n", offsetof(struct Sensor, id));
    printf("offset of temperature: %zu\n", offsetof(struct Sensor, temperature));
    printf("offset of humidity: %zu\n", offsetof(struct Sensor, humidity));
    printf("offset of timestamp: %zu\n", offsetof(struct Sensor, timestamp));
    return 0;
}
The container_of pattern (fundamental in the Linux kernel) uses offsetof to recover a parent struct's pointer from a member pointer:
#define container_of(ptr, type, member) \
    ((type*)((char*)(ptr) - offsetof(type, member)))
Unions: Multiple Interpretations of the Same Memory
A union stores all its members at the same starting address. The size of the union is at least the size of its largest member (the compiler may add trailing padding to satisfy alignment). Only one member contains valid data at a time:
#include <stdio.h>
#include <stdint.h>
union DataView {
    uint32_t as_uint;
    float as_float;
    uint8_t as_bytes[4];
};
int main(void) {
    union DataView view;
    view.as_uint = 0x3F800000; // IEEE 754 encoding of 1.0f
    printf("As uint: 0x%08X\n", view.as_uint); // 0x3F800000
    printf("As float: %f\n", view.as_float);   // 1.000000
    printf("Bytes: %02X %02X %02X %02X\n",
           view.as_bytes[0], view.as_bytes[1],
           view.as_bytes[2], view.as_bytes[3]); // 00 00 80 3F (little-endian)
    printf("Union size: %zu (largest member)\n", sizeof(union DataView)); // 4
    return 0;
}
Unions are essential for:
- Type punning: Reinterpreting the bits of a float as an integer (or vice versa) — used in fast inverse square root algorithms, IEEE 754 manipulation.
- Protocol parsing: A network packet header that can be viewed as raw bytes or as a structured layout.
- Register overlays: Mapping hardware registers that share bits between fields.
Tagged Unions: Type-Safe Discriminated Unions
Raw unions are unsafe — you can accidentally read the wrong member. A tagged union (discriminated union) wraps the union in a struct with a "tag" field indicating which member is currently valid:
#include <stdio.h>
#include <stdint.h>
#include <string.h>
typedef enum { VALUE_INT, VALUE_FLOAT, VALUE_STRING } ValueType;
typedef struct {
    ValueType type; // The tag: tells us what's in the union
    union {
        int64_t integer;
        double floating;
        char string[64];
    };
} Value;
void print_value(const Value *v) {
    switch (v->type) {
    // Cast so the argument matches %lld even where int64_t is 'long'
    case VALUE_INT:    printf("int: %lld\n", (long long)v->integer); break;
    case VALUE_FLOAT:  printf("float: %f\n", v->floating); break;
    case VALUE_STRING: printf("string: %s\n", v->string); break;
    }
}
int main(void) {
    Value values[3] = {
        { .type = VALUE_INT, .integer = 42 },
        { .type = VALUE_FLOAT, .floating = 3.14159 },
        { .type = VALUE_STRING, .string = "Hello, C" },
    };
    for (int i = 0; i < 3; i++) print_value(&values[i]);
    return 0;
}
This pattern is used in scripting language interpreters, JSON parsers, and any system that handles heterogeneous data.
Bit-Fields: Sub-Byte Precision
C allows you to specify how many bits a field should occupy using : in a struct definition. This is invaluable in embedded systems where RAM is measured in kilobytes:
#include <stdio.h>
#include <stdint.h>
// Status register for an embedded device: fits in 1 byte
// (bit positions below are what GCC/Clang typically produce on little-endian targets)
struct StatusRegister {
    uint8_t power_on : 1;   // Bit 0
    uint8_t data_ready : 1; // Bit 1
    uint8_t error_flag : 1; // Bit 2
    uint8_t mode : 2;       // Bits 3-4 (0-3 range)
    uint8_t reserved : 3;   // Bits 5-7 (unused)
};
int main(void) {
    struct StatusRegister sr = { .power_on = 1, .data_ready = 1, .mode = 2 };
    printf("Size: %zu bytes\n", sizeof(sr)); // 1 byte!
    printf("Mode: %u\n", sr.mode);           // 2
    printf("Error: %u\n", sr.error_flag);    // 0
    return 0;
}
[!WARNING] Bit-field layout is implementation-defined: the standard does not specify byte order or bit ordering within a field. Never serialize a struct with bit-fields directly to a network socket or file if portability is required.
Flexible Array Members (FAMs)
C99 introduced flexible array members: an array with no specified size (declared as data[]) at the end of a struct, which allows variable-size struct instances with a single allocation:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
    size_t length;
    char data[]; // Flexible array member: must be last
} Buffer;
Buffer* create_buffer(const char *src) {
    size_t len = strlen(src);
    Buffer *buf = malloc(sizeof(Buffer) + len + 1); // +1 for null terminator
    if (!buf) return NULL;
    buf->length = len;
    memcpy(buf->data, src, len + 1);
    return buf;
}
int main(void) {
    Buffer *b = create_buffer("Hello, FAM!");
    if (!b) return 1; // Allocation failed
    printf("Length: %zu, Data: %s\n", b->length, b->data);
    free(b);
    return 0;
}
FAMs are used extensively in file system structures, network packet buffers, and any case where a header needs to be followed by variable-length data in a single contiguous allocation.
Packed Structs: Overriding Alignment
You can force the compiler to remove all padding using __attribute__((packed)) (GCC/Clang) or #pragma pack:
#include <stdio.h>
#include <stdint.h>
// Standard: has padding
struct Standard {
    uint8_t a;
    uint32_t b;
    uint8_t c;
}; // sizeof = 12 (with padding)
// Packed: no padding (but beware misaligned access on some CPUs)
struct __attribute__((packed)) Packed {
    uint8_t a;
    uint32_t b;
    uint8_t c;
}; // sizeof = 6
int main(void) {
    printf("Standard: %zu bytes\n", sizeof(struct Standard)); // 12
    printf("Packed: %zu bytes\n", sizeof(struct Packed));     // 6
    return 0;
}
[!CAUTION] Packed structs can place fields at misaligned addresses. The compiler handles direct member access for you, but it may need slower byte-by-byte code to do so, and dereferencing a pointer taken to a packed member can cause a hardware fault on strict-alignment architectures like ARM Cortex-M0. Only use __attribute__((packed)) when you need an exact binary layout (network protocols, file format headers) and are aware of the performance cost.
Frequently Asked Questions
Why does the compiler add padding and not the hardware?
The hardware prefers (or, on some architectures, requires) aligned access, but it has no concept of C structs. The compiler has full knowledge of each field's type and alignment requirement at compile time, so it inserts padding bytes to ensure every field is correctly aligned at runtime. You can inspect padding with pahole or with the GCC/Clang warning flag -Wpadded.
How can I see the actual memory layout of my struct?
Use pahole ./program (part of the dwarves package on Linux) to see a detailed breakdown of every struct's padding and total size, derived from DWARF debug information. Alternatively, print offsetof for each field by hand.
Should I always reorder fields to minimize padding?
Not always. Sometimes related fields should be grouped together for code readability, even if they waste a byte or two. Prioritize optimization when: storing millions of structs in memory, the struct is in a performance-critical hot path, or you're targeting memory-constrained embedded systems.
What is alignas in C23?
C11 introduced _Alignas, with alignas available as a macro from <stdalign.h>; C23 makes alignas a keyword. It explicitly sets the alignment of a variable or struct member: alignas(64) int cache_line_data[16]; places the array on a 64-byte boundary (a common cache-line size), which helps avoid false sharing in multi-threaded code.
Key Takeaway
Structs and unions are C's primary tools for modeling structured reality. By understanding data alignment and optimizing field order, you ensure your struct layouts are compact, cache-friendly, and hardware-efficient. By using unions and tagged variants, you gain C's closest equivalent to polymorphism — flexible, type-aware data containers without runtime overhead.
Part of the C Mastery Course — 30 modules from systems fundamentals to production-grade engineering.
