C Arrays & Buffer Management: Contiguous Memory, Strings & Buffer Safety

TopicTrick Team


What Is a C Array at the Memory Level?

Unlike Python lists or JavaScript arrays — which are heap-allocated objects with dynamic sizes and type flexibility — a C array is the simplest possible data structure: a contiguous block of identically-sized elements in memory.


Accessing data[i] is an O(1) operation: the address is computed as base_address + i * sizeof(element_type). Contiguity also gives arrays excellent cache locality. Memory is fetched in whole cache lines (64 bytes on most current x86-64 and ARM CPUs), so reading data[0] pulls its neighbors into the L1 cache as well. With 8-byte elements and a line-aligned array, that covers data[0] through data[7]. Iterating over an array sequentially is one of the fastest access patterns a CPU supports.
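
The address formula can be checked directly. The sketch below is illustrative only (the function name index_matches_address_formula is made up for this example): it walks an int array and confirms that each element sits exactly i * sizeof(int) bytes past the base.

```c
#include <stddef.h>

/* Returns 1 when &data[i] lies exactly i * sizeof(int) bytes past the
 * base address for every index below count, 0 otherwise. */
int index_matches_address_formula(const int *data, size_t count)
{
    const unsigned char *base = (const unsigned char *)data;

    for (size_t i = 0; i < count; i++) {
        const unsigned char *expected = base + i * sizeof(int);
        if ((const unsigned char *)&data[i] != expected) {
            return 0;
        }
    }
    return 1;
}
```

Because the offset is a single multiply and add, the cost of data[i] does not depend on i or on the array length.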


Stack Arrays vs Heap Arrays
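
A minimal sketch of the two allocation styles, with hypothetical function names (sum_stack_array and make_heap_array are not from any library). The stack array disappears when its function returns; the heap array survives until the caller frees it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stack array: size fixed at compile time, released automatically on return. */
int sum_stack_array(void)
{
    int values[4] = {1, 2, 3, 4};
    int total = 0;

    for (size_t i = 0; i < 4; i++) {
        total += values[i];
    }
    return total;
}

/* Heap array: size chosen at run time. The caller owns the result and
 * must free() it. Returns NULL if count is 0 or the allocation fails. */
int *make_heap_array(size_t count)
{
    if (count == 0) {
        return NULL;
    }

    int *values = calloc(count, sizeof *values);  /* calloc checks for overflow */
    if (values == NULL) {
        return NULL;
    }

    for (size_t i = 0; i < count; i++) {
        values[i] = (int)i + 1;
    }
    return values;
}
```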


Key differences:

| | Stack Array | Heap Array |
|---|---|---|
| Size | Compile-time constant | Runtime-determined |
| Speed | Extremely fast (RSP adjustment) | Slightly slower (malloc call) |
| Max size | ~1-8 MB (stack limit) | Limited by RAM/virtual memory |
| Lifetime | Current scope | Until free() is called |
| Safety | Automatic cleanup | Manual: must free |

Array Initialization Patterns
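
The common initializer forms are sketched below. The array names are illustrative; the behavior shown (zero-filling of unlisted elements, size inference, C99 designated initializers) is standard C.

```c
/* Full initializer: every element is listed. */
const int full[3] = {1, 2, 3};

/* Partial initializer: unlisted elements are zero, so partial[2..4] == 0. */
const int partial[5] = {1, 2};

/* Size inferred from the initializer: the compiler counts 4 elements. */
const int inferred[] = {10, 20, 30, 40};

/* Designated initializers (C99): set chosen indices, the rest are zero. */
const int designated[6] = {[1] = 7, [4] = 9};

/* Common idiom for zeroing a whole array. */
const int zeroed[8] = {0};
```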


The element count idiom (sizeof(arr) / sizeof(arr[0])) is the canonical way to compute an array's element count without hardcoding a magic number. Define it as a macro for reuse:
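
One possible form of that macro is sketched here. ARRAY_SIZE is a conventional name rather than a standard one, and sum_primes exists only to show the macro in a loop. Note that it is only valid on a true array: given a pointer, it silently produces a meaningless value.

```c
#include <stddef.h>

/* Element count of a true array. Do not pass a pointer: after decay,
 * sizeof(arr) is the pointer size and the result is meaningless. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

int sum_primes(void)
{
    static const int primes[] = {2, 3, 5, 7, 11};
    int total = 0;

    /* The loop bound follows the initializer; no magic number to update. */
    for (size_t i = 0; i < ARRAY_SIZE(primes); i++) {
        total += primes[i];
    }
    return total;
}
```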


Pointer Decay: Arrays Are Not Pointers

This is one of C's most commonly misunderstood rules: an array "decays" to a pointer to its first element in most expression contexts. The array and the pointer are not the same thing, but they behave similarly:
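
A small sketch of the difference, using made-up function names. In the scope that declares it, sizeof sees the whole array; once the array is passed to a function, only a pointer arrives. (A parameter written as int arr[] means exactly the same as int *arr.)

```c
#include <stddef.h>

/* The callee receives a pointer, so sizeof reports the pointer size,
 * not the size of the caller's array. */
size_t size_seen_by_callee(const int *arr)
{
    return sizeof arr;              /* same as sizeof(const int *) */
}

/* In the declaring scope the array has not decayed. */
size_t size_seen_by_owner(void)
{
    int arr[5] = {1, 2, 3, 4, 5};
    return sizeof arr;              /* 5 * sizeof(int) */
}

/* Decay in an expression: arr converts to &arr[0].
 * Returns 1 when the pointer and the array agree. */
int decays_to_first_element(void)
{
    int arr[5] = {1, 2, 3, 4, 5};
    const int *p = arr;

    return p == &arr[0] && p[2] == arr[2];
}
```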


The three contexts where arrays do NOT decay:

  1. When used with sizeof: sizeof(arr) gives the full array size in bytes.
  2. When used with &: &arr gives the address of the entire array, with type int(*)[5] for an int arr[5].
  3. When a string literal initializes a char array: char name[] = "Alice" copies the literal into name.

Multidimensional Arrays

Multidimensional arrays in C are stored in row-major order — all elements of row 0 come first, then all of row 1, etc. This is a physical memory layout, not an abstraction:
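
To make the layout concrete, the sketch below (ROWS, COLS, and both function names are illustrative) checks that matrix[row][col] lives at byte offset (row * COLS + col) * sizeof(int) from the start of the array.

```c
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };

/* Row-major position of (row, col) counted in elements from the start. */
size_t flat_index(size_t row, size_t col)
{
    return row * COLS + col;
}

/* Returns 1 when every matrix[row][col] sits at the byte offset that
 * the row-major formula predicts, 0 otherwise. */
int layout_is_row_major(void)
{
    int matrix[ROWS][COLS] = {{0}};
    const unsigned char *base = (const unsigned char *)matrix;

    for (size_t r = 0; r < ROWS; r++) {
        for (size_t c = 0; c < COLS; c++) {
            const unsigned char *expected =
                base + flat_index(r, c) * sizeof(int);
            if ((const unsigned char *)&matrix[r][c] != expected) {
                return 0;
            }
        }
    }
    return 1;
}
```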


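Cache-friendliness matters: Iterating with the column index in the inner loop (matrix[row][col], one row at a time) is fast because consecutive accesses touch adjacent memory. Putting the row index in the inner loop instead walks down a column, jumping a full row's worth of bytes on every step. For matrices too large to fit in cache, that can miss the cache on almost every access and run several times slower; the exact factor depends on the hardware and the matrix size.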
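
The two traversal orders look like this. It is a sketch with assumed names (grid, DIM, and the three functions); both sums produce the same value, and only the order in which memory is touched differs. No timings are claimed here, since they vary by machine.

```c
enum { DIM = 256 };

static int grid[DIM][DIM];

/* Fill the grid with a simple pattern so the sums are predictable. */
void fill_grid(void)
{
    for (int r = 0; r < DIM; r++) {
        for (int c = 0; c < DIM; c++) {
            grid[r][c] = r + c;
        }
    }
}

/* Cache-friendly: the inner loop moves along a row, touching adjacent ints. */
long sum_row_first(void)
{
    long total = 0;
    for (int r = 0; r < DIM; r++) {
        for (int c = 0; c < DIM; c++) {
            total += grid[r][c];
        }
    }
    return total;
}

/* Cache-hostile: the inner loop moves down a column, so each step skips
 * DIM * sizeof(int) bytes. */
long sum_column_first(void)
{
    long total = 0;
    for (int c = 0; c < DIM; c++) {
        for (int r = 0; r < DIM; r++) {
            total += grid[r][c];
        }
    }
    return total;
}
```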


C Strings: The Null-Terminator Contract

C does not have a built-in string type. A C "string" is simply a char array with a sentinel value — '\0' (null byte, ASCII 0) — marking the end:
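
A hand-written length function shows the contract at work. This is a sketch (my_strlen and storage_for_hi are illustrative names, and real code should call strlen): the loop has no size to consult, so it simply runs until it meets '\0'.

```c
#include <stddef.h>

/* Counts characters up to, but not including, the terminating '\0'.
 * If the terminator is missing, this reads past the buffer. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* "Hi" needs three bytes of storage: 'H', 'i', and the terminator. */
size_t storage_for_hi(void)
{
    char word[] = "Hi";
    return sizeof word;
}
```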


Every C string function assumes the null terminator exists. strlen, printf("%s"), strcpy, strcat — all terminate when they find '\0'. Forgetting the null terminator causes reads far past the intended buffer boundary (undefined behavior).


Safe String Handling

The Problem with strcpy and strcat

strcpy(dest, src) copies until it finds a null terminator in src — with no regard for the size of dest. If src is larger than dest, adjacent memory is overwritten:
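
The missing check is easy to state. The helper below is a hypothetical wrapper (copy_if_fits is not a library function) that performs the size comparison strcpy skips and refuses the copy instead of overflowing.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dest only if it fits, terminator included.
 * Returns 0 on success, -1 if src would overflow dest (dest is untouched). */
int copy_if_fits(char *dest, size_t dest_size, const char *src)
{
    size_t needed = strlen(src) + 1;    /* +1 for the '\0' */

    if (needed > dest_size) {
        return -1;                      /* strcpy would write past dest here */
    }
    memcpy(dest, src, needed);
    return 0;
}
```

Rejecting the input outright is one design choice; truncating is another, and which is right depends on whether a shortened string is still meaningful to the caller.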


Safer Alternatives
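
Two bounded patterns are sketched here with illustrative wrapper names (copy_truncating and format_greeting are assumptions, not standard functions). The first uses strncpy plus a manual terminator; the second uses snprintf and inspects its return value.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* strncpy does not add '\0' when src fills the buffer, so terminate by hand. */
void copy_truncating(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0) {
        return;
    }
    strncpy(dest, src, dest_size - 1);
    dest[dest_size - 1] = '\0';
}

/* Builds "Hello, <name>!" in dest.
 * Returns 1 if the output was truncated or formatting failed, 0 if it fit. */
int format_greeting(char *dest, size_t dest_size, const char *name)
{
    int written = snprintf(dest, dest_size, "Hello, %s!", name);

    return written < 0 || (size_t)written >= dest_size;
}
```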


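The snprintf pattern is the modern, safe approach for string building in C. Given a nonzero buffer size it always null-terminates, and its return value (the length the complete output would have had) tells you whether truncation occurred: a result greater than or equal to the buffer size means the output was cut short.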

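On BSD/macOS systems, strlcpy(dest, src, sizeof(dest)) is available and always null-terminates (for a nonzero size) without the manual dest[n-1] = '\0' step that strncpy needs. On Linux, glibc 2.38 and later declare it in <string.h>; on older systems, use libbsd's <bsd/string.h> (linking with -lbsd) or define your own.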


Variable-Length Arrays (VLAs)

C99 introduced Variable-Length Arrays — stack-allocated arrays whose size is determined at runtime:
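
A minimal sketch, assuming a compiler with VLA support. The function name and the 256-element cap are choices made for this example: the cap keeps stack use bounded, which is the usual precondition for using a VLA safely.

```c
#include <stddef.h>

/* Sums 1..n using a VLA as scratch space.
 * Returns -1 if n is 0 or above the cap chosen for this example (256). */
long sum_with_vla(size_t n)
{
    if (n == 0 || n > 256) {
        return -1;                  /* reject sizes that could strain the stack */
    }

    int scratch[n];                 /* size chosen at run time, lives on the stack */
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        scratch[i] = (int)i + 1;
    }
    for (size_t i = 0; i < n; i++) {
        total += scratch[i];
    }
    return total;
}
```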


[!WARNING] VLAs are an optional feature in C11 through C23: a conforming compiler may decline to support them, in which case it defines __STDC_NO_VLA__. For sizes above a few KB, VLAs risk stack overflow without any warning. For production systems code, prefer malloc for dynamic sizing. VLAs are most useful in embedded system functions where the size is small and bounded.


Buffer Overflow: The Most Dangerous Bug in C

A buffer overflow occurs when you write more bytes into a buffer than it can hold, corrupting adjacent memory. It is one of the most common sources of CVEs in C codebases and the root cause of countless security exploits:
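
The pattern is sketched below with hypothetical function names. The first function is deliberately vulnerable and is defined only while its input is shorter than 16 characters; the second adds the single length check that closes the hole.

```c
#include <string.h>

/* VULNERABLE: any input of 16 or more characters writes past buffer
 * and corrupts whatever sits next to it on the stack. */
int name_length_unsafe(const char *input)
{
    char buffer[16];

    strcpy(buffer, input);              /* no length check */
    return (int)strlen(buffer);
}

/* Fixed: rejects input that does not fit. Returns -1 on rejection. */
int name_length_checked(const char *input)
{
    char buffer[16];

    if (strlen(input) >= sizeof buffer) {
        return -1;                      /* would overflow, so refuse */
    }
    strcpy(buffer, input);              /* safe: length verified above */
    return (int)strlen(buffer);
}
```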


A malicious user can craft an input that overwrites the return address to point to their own shellcode — this is a classic stack-smashing exploit. Modern mitigations include:

  • Stack Canaries (-fstack-protector-strong): Place a random value between local buffers and the return address; check it before the function returns.
  • ASLR: Randomize memory addresses to make exploitation unreliable.
  • NX bit: Mark stack as non-executable (prevents shellcode execution).
  • _FORTIFY_SOURCE: Add compile-time and runtime bounds checks to calls such as memcpy and strcpy when the compiler can determine the destination size.

Bounds-Checked Access Patterns
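
One way to get checked access is to carry the length with the pointer and route reads and writes through small helpers. The IntSpan type and the span_get and span_set functions below are assumptions made for this sketch, not part of any standard API.

```c
#include <stdbool.h>
#include <stddef.h>

/* A pointer bundled with its element count. */
typedef struct {
    int *data;
    size_t count;
} IntSpan;

/* Reads span->data[index] into *out.
 * Returns false, leaving *out unchanged, if the index is out of range. */
bool span_get(const IntSpan *span, size_t index, int *out)
{
    if (span == NULL || out == NULL || index >= span->count) {
        return false;
    }
    *out = span->data[index];
    return true;
}

/* Writes value to span->data[index]. Returns false if out of range. */
bool span_set(IntSpan *span, size_t index, int value)
{
    if (span == NULL || index >= span->count) {
        return false;
    }
    span->data[index] = value;
    return true;
}
```

Using size_t for the index means a negative value converted by the caller becomes a very large number and fails the same index >= count test, so one comparison covers both directions.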


Frequently Asked Questions

Why doesn't C check array bounds automatically? By design. C's philosophy is "trust the programmer and don't pay for what you don't need." Runtime bounds checking on every array access adds overhead. C gives you the tools (ASan, Valgrind, safe access wrappers) to check when you need to. Languages that check bounds by default (Java, Python, Rust) accept a small runtime cost on every access the compiler cannot prove safe; C avoids that cost, and gives up the safety net along with it.

What is the difference between char[] and char* for strings? char name[] = "Alice" creates a mutable array initialized with a copy of the string literal (on the stack when declared inside a function). const char *name = "Alice" creates a pointer to the literal itself, which typically lives in the program's read-only data segment. Modifying a string literal through a non-const char * is undefined behavior (usually a segfault); declaring the pointer as const char * turns that mistake into a compile-time error. Always use const char* for string literals you don't intend to modify.

Is gets() really banned? Yes — gets() was removed from the C11 standard entirely. It reads into a buffer with absolutely no size limit, making buffer overflow guaranteed for any input longer than the buffer. It is the most dangerous function in the C standard library's history. Use fgets(buf, sizeof(buf), stdin) instead.

Can I use sizeof to get the length of an array passed to a function? No. When an array is passed to a function, it decays to a pointer. Inside the function, sizeof then reports the size of that pointer (8 bytes on typical 64-bit systems), regardless of the original array size. You must always pass the element count as a separate argument.


Key Takeaway

C arrays represent the Physical Reality of Memory — raw, contiguous bytes with zero overhead. Their performance is unmatched precisely because there is no wrapper, no metadata, no reference counting. The trade-off is that safety is entirely your responsibility.

By using bounds checking wrappers, snprintf for string building, ARRAY_SIZE macros for element counting, and ASan during development, you get C's raw speed without sacrificing correctness. This discipline is what separates professional systems hackers from beginners.



Part of the C Mastery Course — 30 modules of expert C systems programming.