C Binary Trees & Recursive Search: BST, Tree Traversals & Balancing (Complete Guide)

C Binary Trees & Recursive Search: BST, Tree Traversals & Balancing (Complete Guide)
Table of Contents
- What Is a Binary Tree?
- The BST Property
- Node Structure and Creation
- Recursive Insertion
- Recursive Search: O(log n)
- Tree Traversal Orders
- Finding Min, Max, and Successor
- BST Deletion: The Hardest Operation
- Tree Height and Balance
- Memory Management: Freeing the Entire Tree
- Real-World Applications
- Frequently Asked Questions
- Key Takeaway
What Is a Binary Tree?
A binary tree is a linked data structure where each node has:
- A data value.
- A
leftchild pointer (orNULL). - A
rightchild pointer (orNULL).
Unlike arrays (contiguous memory) or linked lists (linear chain), trees represent hierarchical relationships. The height of a balanced binary tree with n nodes is O(log n) — for 1 million nodes, the height is only ~20. This logarithmic depth is what makes tree operations so efficient.
Terminology:
- Root: The top node with no parent.
- Leaf: A node with no children (
left = NULLandright = NULL). - Height: The maximum number of edges from root to a leaf.
- Depth: The number of edges from the root to a specific node.
- Subtree: Any node viewed as the root of the sub-structure below it.
The BST Property
A Binary Search Tree maintains a critical invariant:
For every node N: all values in N's left subtree are less than N's value, and all values in N's right subtree are greater than N's value.
This property holds recursively for every node in the tree. It means:
- Searching: at each node, eliminate half the remaining tree.
- In-order traversal: produces values in sorted ascending order.
- Insert/search/delete: O(log n) for balanced trees, O(n) worst case when degenerate.
Node Structure and Creation
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
typedef struct BST_Node {
int32_t value;
struct BST_Node *left;
struct BST_Node *right;
} BST_Node;
typedef struct {
BST_Node *root;
size_t count;
} BST;
// Initialize an empty BST
void bst_init(BST *tree) {
tree->root = NULL;
tree->count = 0;
}
// Create a new node (returns NULL on allocation failure)
static BST_Node *node_create(int32_t value) {
BST_Node *n = malloc(sizeof(BST_Node));
if (!n) return NULL;
n->value = value;
n->left = NULL;
n->right = NULL;
return n;
}Recursive Insertion
Recursion is the natural idiom for tree operations. The recursive structure directly mirrors the tree's recursive structure:
// Recursive helper: inserts value into subtree rooted at 'node'
// Returns the new root of the subtree (may be the new node if tree was empty)
static BST_Node *node_insert(BST_Node *node, int32_t value, bool *inserted) {
if (node == NULL) {
// Base case: found the insertion point
*inserted = true;
return node_create(value);
}
if (value < node->value) {
node->left = node_insert(node->left, value, inserted);
} else if (value > node->value) {
node->right = node_insert(node->right, value, inserted);
}
// If value == node->value: duplicate, do not insert (return existing node)
return node; // Return unchanged node pointer
}
int bst_insert(BST *tree, int32_t value) {
bool inserted = false;
tree->root = node_insert(tree->root, value, &inserted);
if (inserted) tree->count++;
return inserted ? 0 : 1; // 0=success, 1=duplicate
}Each recursive call either inserts into the left subtree or the right subtree, consuming one level of the tree per call. For a balanced tree with n nodes, this recurses at most logâ‚‚(n) times.
Recursive Search: O(log n)
// Search: returns the node containing 'value', or NULL if not found
BST_Node *bst_search(const BST_Node *node, int32_t value) {
if (node == NULL) return NULL; // Not found
if (value == node->value) return (BST_Node*)node; // Found
if (value < node->value) {
return bst_search(node->left, value); // Search left subtree
} else {
return bst_search(node->right, value); // Search right subtree
}
}
// Iterative version (avoids recursion overhead, preferred for very deep trees)
BST_Node *bst_search_iter(const BST *tree, int32_t value) {
BST_Node *current = tree->root;
while (current) {
if (value == current->value) return current;
current = (value < current->value) ? current->left : current->right;
}
return NULL; // Not found
}
void test_search(void) {
BST tree; bst_init(&tree);
bst_insert(&tree, 50); bst_insert(&tree, 30);
bst_insert(&tree, 70); bst_insert(&tree, 20);
BST_Node *found = bst_search_iter(&tree, 30);
printf("Found 30: %s\n", found ? "yes" : "no"); // yes
printf("Found 99: %s\n", bst_search_iter(&tree, 99) ? "yes" : "no"); // no
}Tree Traversal Orders
Three fundamental traversal orders, each with different applications:
// IN-ORDER (Left → Root → Right): produces sorted ascending output
void inorder(const BST_Node *node) {
if (!node) return;
inorder(node->left);
printf("%d ", node->value);
inorder(node->right);
}
// PRE-ORDER (Root → Left → Right): useful for serializing/cloning a tree
void preorder(const BST_Node *node) {
if (!node) return;
printf("%d ", node->value);
preorder(node->left);
preorder(node->right);
}
// POST-ORDER (Left → Right → Root): correct order for freeing tree memory
void postorder(const BST_Node *node) {
if (!node) return;
postorder(node->left);
postorder(node->right);
printf("%d ", node->value); // Process node AFTER children
}
// Level-order (BFS): visits nodes level by level using a queue
// Requires a queue implementation — see Module 10 (Linked Lists)Mnemonic: Position of "Root" in the traversal name tells you when the root is processed:
- Pre-order: Root is first (before children).
- In-order: Root is in the middle (between left and right).
- Post-order: Root is last (after children).
For a BST with values :
- In-order:
10 20 30 40 50 60 70(sorted!) - Pre-order:
50 30 20 10 40 70 60(root first) - Post-order:
10 20 40 30 60 70 50(leaves first)
Finding Min, Max, and Successor
// Find minimum: always in the leftmost position
BST_Node *bst_minimum(BST_Node *node) {
if (!node) return NULL;
while (node->left) node = node->left;
return node;
}
// Find maximum: always in the rightmost position
BST_Node *bst_maximum(BST_Node *node) {
if (!node) return NULL;
while (node->right) node = node->right;
return node;
}These O(log n) operations (O(h) where h = tree height) are fundamental to:
- Database range queries:
WHERE price >= 100 AND price <= 500traverses from the minimum ≥ 100 to maximum ≤ 500 in-order. - Deletion of nodes with two children: Replace with the in-order successor (minimum of right subtree).
BST Deletion: The Hardest Operation
Deletion has three cases:
- Node is a leaf (no children): Simply remove it.
- Node has one child: Replace the node with its child.
- Node has two children: Replace with in-order successor, then delete the successor.
static BST_Node *node_delete(BST_Node *node, int32_t value, bool *deleted) {
if (!node) return NULL; // Value not found
if (value < node->value) {
node->left = node_delete(node->left, value, deleted);
} else if (value > node->value) {
node->right = node_delete(node->right, value, deleted);
} else {
// Found the node to delete
*deleted = true;
// Case 1: Leaf node
if (!node->left && !node->right) {
free(node);
return NULL;
}
// Case 2: One child
if (!node->left) {
BST_Node *temp = node->right;
free(node);
return temp;
}
if (!node->right) {
BST_Node *temp = node->left;
free(node);
return temp;
}
// Case 3: Two children — replace with in-order successor
BST_Node *successor = bst_minimum(node->right);
node->value = successor->value; // Copy successor's value
// Delete the successor from the right subtree
bool ignored = false;
node->right = node_delete(node->right, successor->value, &ignored);
}
return node;
}
int bst_delete(BST *tree, int32_t value) {
bool deleted = false;
tree->root = node_delete(tree->root, value, &deleted);
if (deleted) tree->count--;
return deleted ? 0 : -1;
}Tree Height and Balance
// Compute tree height: O(n) — must visit every node
int bst_height(const BST_Node *node) {
if (!node) return -1; // Empty tree: height = -1 by convention
int left_h = bst_height(node->left);
int right_h = bst_height(node->right);
return 1 + (left_h > right_h ? left_h : right_h);
}
// Detect if tree is balanced: height difference between subtrees ≤ 1
int bst_is_balanced(const BST_Node *node) {
if (!node) return 1;
int left_h = bst_height(node->left);
int right_h = bst_height(node->right);
int diff = left_h - right_h;
if (diff < -1 || diff > 1) return 0; // Unbalanced
return bst_is_balanced(node->left) && bst_is_balanced(node->right);
}The degenerate BST problem: If you insert already-sorted values (1, 2, 3, 4, 5...) into a BST, every node goes to the right child, producing a linked list with O(n) search. Self-balancing trees solve this:
- AVL Tree: Maintains height balance factor within ±1 using rotations after each insert/delete.
- Red-Black Tree: Used in C++ STL
std::map, Java TreeMap, Linux kernel's CFS scheduler. Relaxed balance (up to 2× height difference) for faster insertions. - B-Tree: Used in databases (SQLite, PostgreSQL). Multi-way branching with large node capacities matching disk block sizes.
Memory Management: Freeing the Entire Tree
Always use post-order traversal to free a tree — you must free children before the parent:
void bst_destroy_node(BST_Node *node) {
if (!node) return;
bst_destroy_node(node->left); // Free left subtree first
bst_destroy_node(node->right); // Free right subtree
free(node); // Then free this node
}
void bst_destroy(BST *tree) {
bst_destroy_node(tree->root);
tree->root = NULL;
tree->count = 0;
}Complete usage example:
int main(void) {
BST tree;
bst_init(&tree);
int values[] = {50, 30, 70, 20, 40, 60, 80};
for (size_t i = 0; i < 7; i++) bst_insert(&tree, values[i]);
printf("Count: %zu\n", tree.count); // 7
printf("Height: %d\n", bst_height(tree.root)); // 2
printf("Balanced: %d\n", bst_is_balanced(tree.root)); // 1
printf("In-order: "); inorder(tree.root); printf("\n"); // 20 30 40 50 60 70 80
printf("Pre-order: "); preorder(tree.root); printf("\n"); // 50 30 20 40 70 60 80
printf("Post-order: "); postorder(tree.root); printf("\n"); // 20 40 30 60 80 70 50
bst_delete(&tree, 30);
printf("After deleting 30: "); inorder(tree.root); printf("\n"); // 20 40 50 60 70 80
bst_destroy(&tree);
return 0;
}Real-World Applications
| Application | Tree Type | Why Trees |
|---|---|---|
| Linux CFS scheduler | Red-Black Tree | O(log n) task selection by virtual runtime |
| Linux kernel memory mgmt | Red-Black Tree | VMA (Virtual Memory Area) range queries |
| PostgreSQL/MySQL indexes | B-Tree | Disk block-aligned, supports range scans |
| SQLite | B-Tree | Single-file database with page-aligned nodes |
C++ STL std::map | Red-Black Tree | Ordered key-value with guaranteed O(log n) |
Java TreeMap | Red-Black Tree | Sorted map with navigable floor/ceiling |
| Compiler symbol tables | Hash + BST hybrid | Scope-local lookup by name |
Frequently Asked Questions
What is an "unbalanced" BST and why is it a problem? If you insert items in sorted order, each new item is always larger than the previous root and goes to the right child, creating a degenerate tree that looks like a linked list. Search degrades from O(log n) to O(n). Self-balancing trees (AVL, Red-Black) use rotations after each insert to maintain height balance.
How is a Red-Black tree different from an AVL tree? AVL trees maintain strict height balance (height difference ≤ 1), making lookups slightly faster but insertions slower (more rotations needed). Red-Black trees allow a more relaxed balance (longest path ≤ 2× shortest) but have fewer rotations on average, making insertions and deletions faster. Red-Black trees are preferred in most production systems.
Why do databases use B-Trees instead of BSTs? BSTs have 1 key per node — with millions of records, the tree can be 20+ levels deep, each level potentially requiring a disk read (millisecond latency). B-Trees have many keys per node (hundreds to thousands), reducing tree height to 3-5 levels. Each node is sized to fit exactly one disk block (4KB-16KB), maximizing I/O efficiency.
When should I use a recursive tree traversal vs an iterative one? Recursive traversals are cleaner code. The stack depth equals the tree height: for a balanced tree with 1M nodes, that's ~20 levels — perfectly safe. Iterative traversals (using an explicit stack) are preferred when: the tree might be degenerate (O(n) depth), you need to pause/resume traversal (iterator pattern), or you're working in a stack-constrained environment (embedded systems).
Key Takeaway
Binary Search Trees provide Logarithmic Performance for search, insertion, and deletion with the bonus of ordered data. This combination — which neither arrays nor hash tables provide — is what makes trees the backbone of database indexing, OS scheduling, and compiler construction.
By mastering tree node structure, recursive operations, and the three traversal orders, you gain a mental model that extends directly to B-Trees, Red-Black Trees, tries, and segment trees — the data structures powering the world's most critical software infrastructure.
Read next: Bitwise Operations & Low-Level I/O: Bit Manipulation in C →
Part of the C Mastery Course — 30 modules from C basics to expert systems engineering.
