Why use std::unordered_map for an in-memory key-value store in C++?

std::unordered_map provides O(1) average-case lookup, insertion, and deletion - the same complexity as Redis for single-key operations. A sorted std::map has O(log n) per operation, which is measurably slower at large scales. For a key-value store, hash-based lookup is the correct default. If you need range queries or ordered iteration, consider a std::map or a skip list, accepting the log n overhead.

How do I implement TTL (time-to-live) expiry in a C++ key-value store?

Store an expiry timestamp (std::chrono::steady_clock::time_point) alongside each value. Check expiry lazily on access - if the key is expired, delete it and return not-found. For proactive cleanup, run a background thread or use a priority queue (min-heap by expiry time) to schedule deletions. Lazy expiry is simpler and scales better; eager expiry keeps memory usage bounded but adds complexity.

How do I make a C++ in-memory store thread-safe?

For a simple store, use a single std::shared_mutex - readers acquire a shared lock, writers acquire an exclusive lock. This allows concurrent reads while serialising writes. For higher concurrency, use striped locking - divide the key space into N buckets, each with its own mutex, reducing contention proportionally. For maximum throughput, explore lock-free hash maps (tbb::concurrent_hash_map from Intel TBB or libcuckoo).

How does a C++ in-memory store compare to Redis performance?

A well-implemented C++ in-memory store can match or exceed Redis throughput for local, single-process access - there is no network stack overhead. Redis processes commands serially in a single thread (with I/O multiplexing), so a multi-threaded C++ store with fine-grained locking can achieve higher parallelism for read-heavy workloads. The trade-off is that Redis provides persistence, replication, clustering, and a rich command set that an in-house store would need to replicate.

Project: Building a High-Performance In-Memory Key-Value Store (TopicKV) - Phase 2 Capstone

← Back to C++ Mastery

Project: Building a High-Performance In-Memory Key-Value Store (TopicKV) - Phase 2 Capstone

Architecture: Sharded Store Design
Core Data Model: Entry with TTL
Sharded Storage for Scalability
Thread-Safe GET with std::optional
Thread-Safe SET with RAII
TTL Expiry and Lazy Deletion
LRU Eviction Policy
Command-Line REPL with std::format
Statistics and Metrics
Performance Benchmarks
Extension Challenges
Phase 2 Reflection

Architecture: Sharded Store Design

To avoid a single global lock bottlenecking all operations, TopicKV uses sharding - 16 independent sub-maps each with their own mutex:

Core Data Model: Entry with TTL

cpp

// include/topickv.hpp
#pragma once
#include <string>
#include <string_view>
#include <optional>
#include <memory>
#include <chrono>
#include <unordered_map>
#include <shared_mutex>
#include <array>
#include <atomic>

using Clock     = std::chrono::steady_clock;
using TimePoint = std::chrono::time_point<Clock>;
using Duration  = std::chrono::milliseconds;

// Internal entry - owns the value and tracks TTL:
struct Entry {
    std::shared_ptr<std::string> value;     // Shared ownership for snapshot reads
    std::optional<TimePoint>     expires_at; // nullopt = no expiry
    TimePoint                    last_access; // For LRU tracking
    
    explicit Entry(std::string val, std::optional<Duration> ttl = std::nullopt)
        : value(std::make_shared<std::string>(std::move(val)))
        , last_access(Clock::now())
    {
        if (ttl) expires_at = Clock::now() + *ttl;
    }
    
    bool is_expired() const {
        if (!expires_at) return false;
        return Clock::now() >= *expires_at;
    }
    
    void touch() { last_access = Clock::now(); }
};

Sharded Storage for Scalability

cpp

class TopicKV {
    static constexpr size_t SHARD_COUNT = 16;
    static constexpr size_t MAX_ENTRIES = 10'000; // Per shard
    
    struct Shard {
        mutable std::shared_mutex          mtx;
        std::unordered_map<std::string, Entry> data;
    };
    
    std::array<Shard, SHARD_COUNT> shards_;
    std::atomic<size_t> total_gets_{0};
    std::atomic<size_t> total_sets_{0};
    std::atomic<size_t> cache_hits_{0};
    std::atomic<size_t> cache_misses_{0};
    
    // Determine which shard owns a key:
    Shard& shard_for(std::string_view key) {
        size_t h = std::hash<std::string_view>{}(key);
        return shards_[h % SHARD_COUNT];
    }
    const Shard& shard_for(std::string_view key) const {
        size_t h = std::hash<std::string_view>{}(key);
        return shards_[h % SHARD_COUNT];
    }

Thread-Safe GET with std::optional

cpp

public:
    // GET: returns shared_ptr for snapshot safety (value stays valid even if key deleted)
    std::optional<std::shared_ptr<std::string>> get(std::string_view key) {
        auto& shard = shard_for(key);
        std::shared_lock lock(shard.mtx); // Multiple concurrent readers OK
        
        total_gets_.fetch_add(1, std::memory_order_relaxed);
        
        auto it = shard.data.find(std::string(key));
        if (it == shard.data.end()) {
            cache_misses_.fetch_add(1, std::memory_order_relaxed);
            return std::nullopt; // Key not found
        }
        
        // Lazy TTL check:
        if (it->second.is_expired()) {
            // Can't erase under shared_lock - caller should call del() or GC handles it
            cache_misses_.fetch_add(1, std::memory_order_relaxed);
            return std::nullopt; // Pretend it's gone
        }
        
        it->second.touch(); // Mark LRU access time
        cache_hits_.fetch_add(1, std::memory_order_relaxed);
        return it->second.value; // Return shared_ptr - safe even after shard unlock
    }
    
    // GET with fallback value:
    std::string get_or(std::string_view key, std::string_view fallback) {
        auto result = get(key);
        return result ? **result : std::string(fallback);
    }

Thread-Safe SET with RAII

cpp

    // SET: insert or update, with optional TTL
    void set(std::string_view key, std::string value,
             std::optional<Duration> ttl = std::nullopt) {
        auto& shard = shard_for(key);
        std::unique_lock lock(shard.mtx); // Exclusive write access
        
        total_sets_.fetch_add(1, std::memory_order_relaxed);
        
        // Evict if at capacity:
        if (shard.data.size() >= MAX_ENTRIES) {
            evict_lru(shard);
        }
        
        // Insert or update:
        auto [it, inserted] = shard.data.try_emplace(
            std::string(key),
            std::move(value),
            ttl
        );
        if (!inserted) {
            // Key exists - update in-place:
            it->second = Entry(std::move(value), ttl);
        }
    }
    
    // DEL: remove a key:
    bool del(std::string_view key) {
        auto& shard = shard_for(key);
        std::unique_lock lock(shard.mtx);
        return shard.data.erase(std::string(key)) > 0;
    }
    
    // MSET: set multiple at once (atomically within each shard):
    void mset(std::initializer_list<std::pair<std::string_view, std::string>> entries) {
        for (auto& [key, value] : entries) set(key, value);
    }

TTL Expiry and Lazy Deletion

cpp

    // Garbage collect all expired entries (call periodically):
    size_t gc() {
        size_t removed = 0;
        for (auto& shard : shards_) {
            std::unique_lock lock(shard.mtx);
            auto it = shard.data.begin();
            while (it != shard.data.end()) {
                if (it->second.is_expired()) {
                    it = shard.data.erase(it);
                    removed++;
                } else {
                    ++it;
                }
            }
        }
        return removed;
    }

LRU Eviction Policy

cpp

private:
    // Evict the least-recently-used entry from a shard (must hold unique_lock):
    static void evict_lru(Shard& shard) {
        if (shard.data.empty()) return;
        
        // Find the entry with the oldest last_access:
        auto oldest = std::ranges::min_element(shard.data, {},
            [](const auto& entry) { return entry.second.last_access; });
        
        if (oldest != shard.data.end()) {
            shard.data.erase(oldest);
        }
    }

Command-Line REPL with std::format

cpp

// src/repl.cpp
#include <print>
#include <string>
#include <sstream>
#include "topickv.hpp"

void run_repl(TopicKV& store) {
    std::println("TopicKV v1.0 - type HELP for commands");
    std::string line;
    
    while (std::print("> "), std::getline(std::cin, line)) {
        std::istringstream iss(line);
        std::string cmd;
        iss >> cmd;
        
        if (cmd == "SET") {
            std::string key, val;
            iss >> key >> val;
            store.set(key, val);
            std::println("OK");
            
        } else if (cmd == "SETEX") {  // SET with TTL
            std::string key, val;
            int ms;
            iss >> key >> ms >> val;
            store.set(key, val, std::chrono::milliseconds(ms));
            std::println("OK (expires in {}ms)", ms);
            
        } else if (cmd == "GET") {
            std::string key;
            iss >> key;
            auto result = store.get(key);
            result ? std::println("{}", **result) : std::println("(nil)");
            
        } else if (cmd == "DEL") {
            std::string key;
            iss >> key;
            std::println("{}", store.del(key) ? "1" : "0");
            
        } else if (cmd == "GC") {
            std::println("Removed {} expired entries", store.gc());
            
        } else if (cmd == "STATS") {
            auto [gets, sets, hits, misses] = store.stats();
            std::println("GETs: {} | SETs: {} | Hits: {} | Misses: {} | Rate: {:.1f}%",
                         gets, sets, hits, misses,
                         gets > 0 ? 100.0 * hits / gets : 0.0);
        } else if (cmd == "QUIT" || cmd == "EXIT") {
            break;
        } else {
            std::println("Unknown command. Commands: SET SETEX GET DEL GC STATS QUIT");
        }
    }
}

Extension Challenges

Persistence: Serialize the entire store to a binary file on SIGTERM (use std::ofstream + std::bit_cast for POD types)
Pub/Sub: Add a SUBSCRIBE key / PUBLISH key msg command using std::condition_variable and a std::deque<std::string> channel
Transactions: Implement MULTI / EXEC by buffering commands in a std::vector<Command> and applying atomically
Network interface: Wrap the store in a TCP server using std::jthread (from Module 14) - one thread per connection using the REPL dispatch logic

Phase 2 Reflection

Phase 2 Concept	Used In TopicKV
`std::unordered_map` (Module 10)	Core key-value storage per shard
`std::shared_ptr` (Module 3)	Entry value - snapshot safety
`std::shared_mutex` (Module 14)	Concurrent reader/writer safety
`std::optional` (Module 6)	GET return type - explicit "not found"
`std::chrono`	TTL timestamps, expiry comparison
RAII locks (Module 8)	`shared_lock` / `unique_lock` auto-release
`std::atomic` (Module 14)	Lock-free statistics counters
`std::ranges::min_element` (Module 15)	LRU eviction algorithm

Proceed to Phase 3: Inheritance & Composition ->

Frequently Asked Questions

Q: What C++ data structures are best suited for building an in-memory key-value store? std::unordered_map<Key, Value> provides O(1) average-case lookup and is the natural choice for a hash-based key-value store. For ordered range queries, std::map (red-black tree, O(log n)) is better. For a concurrent store, wrap with std::shared_mutex for read-heavy workloads or use a sharded approach (multiple maps each with their own lock) to reduce contention. For extremely high throughput, consider lock-free hash maps from libraries like libcuckoo or Junction.

Q: How do you implement expiry (TTL) for keys in a C++ in-memory store? Store the expiry timestamp alongside each value - std::chrono::steady_clock::time_point. On every read, check whether the current time exceeds the stored expiry and return "not found" if so (lazy expiration). For active cleanup, maintain a priority queue (std::priority_queue) sorted by expiry time and run a background thread that periodically pops and deletes expired keys. This two-pronged approach (lazy + active) balances memory usage and read latency.

Q: How does an in-memory datastore project demonstrate important C++ concepts? It exercises RAII for resource management (automatic cleanup of allocated entries), templates and type erasure for storing heterogeneous value types, concurrency primitives (std::mutex, std::shared_mutex, std::condition_variable) for thread safety, move semantics for zero-copy value insertion, and std::chrono for TTL management. It also naturally motivates benchmarking with std::chrono::high_resolution_clock and profiling to understand cache behaviour - making it an excellent capstone project for systems-level C++ skills.

Part of the C++ Mastery Course - 30 modules from modern C++ basics to expert systems engineering.

Project: Building a High-Performance In-Memory Key-Value Store (TopicKV) - Phase 2 Capstone

Project: Building a High-Performance In-Memory Key-Value Store (TopicKV) - Phase 2 Capstone

Table of Contents

Architecture: Sharded Store Design

Core Data Model: Entry with TTL

Sharded Storage for Scalability

Thread-Safe GET with std::optional

Thread-Safe SET with RAII

TTL Expiry and Lazy Deletion

LRU Eviction Policy

Command-Line REPL with std::format

Extension Challenges

Phase 2 Reflection

Frequently Asked Questions