Software ArchitectureSystem Design

Database Sharding: Horizontal Partitioning Strategies for Petabyte-Scale Systems

Complete guide to database sharding patterns. Understand the difference between vertical and horizontal scaling, choose the right sharding key to avoid hot shards, compare Range vs Hash vs Directory sharding strategies, implement consistent hashing to minimise resharding cost, use Vitess and Citus for managed sharding, handle cross-shard queries and distributed transactions, and learn when sharding is the wrong answer.

TT
Emily Ross
8 min read
Database Sharding: Horizontal Partitioning Strategies for Petabyte-Scale Systems

Database Sharding: Horizontal Partitioning Strategies for Petabyte-Scale Systems


Table of Contents


Vertical vs Horizontal Scaling: When Each Fails

Before sharding, exhaust these options:

  1. Query optimisation (indexes, query plans, avoid N+1)
  2. Read replicas (offload 80%+ of traffic)
  3. Vertical scaling (larger instance type)
  4. Caching layer (Redis - serve hot data without DB)
  5. Database partitioning (single DB, multiple files - simpler than sharding)

Sharding should be the last resort - it fundamentally changes how your application queries data.


What Sharding Actually Does

Sharding horizontally partitions rows across multiple database instances. Each shard is an independent database that owns a subset of the data:

text
Unsharded (1 server):     Sharded (3 servers):
+------------------+      +-------------+  +-------------+  +-------------+
| orders table     |      | Shard 1     |  | Shard 2     |  | Shard 3     |
| id=1  user=alice |  ->   | id=1 alice  |  | id=6 carol  |  | id=4 eve    |
| id=2  user=bob   |      | id=2 bob    |  | id=7 dave   |  | id=5 frank  |
| id=3  user=carol |      | id=3 carol? |  +--------------+  +--------------+
| ...10M rows...   |      +--------------+
+-------------------+      Each shard handles 3.3M rows
                          Each accepts writes independently

Your application layer (or a sharding proxy) must determine which shard to route each query to based on a sharding key.


The Sharding Key Decision: Most Critical Choice

The sharding key is the column used to determine which shard owns a row. A poor sharding key causes hot shards - one shard receiving disproportionate traffic while others sit idle.

Sharding KeyDistributionQuery PatternRisk
user_idEven (high cardinality)All user data on one shard✅ Good default
tenant_idUneven (large tenants)Complete tenant isolationHot shard for large tenants
created_atAll new data to one shardTime-range queries land on one shard❌ Always hot
random UUIDPerfect distributionNo locality - every query may hit any shardCross-shard joins nightmare
regionDepends on geographyGeographic localityHot if users cluster by region

The hot shard problem in detail:

text
created_at-based sharding (2026 data):
Shard 1: 2024 data  -> 0% of current writes
Shard 2: 2025 data  -> 5% of current writes
Shard 3: 2026 data  -> 95% of all current writes (BURNING HOT)

user_id-based sharding (hash to 3 shards):
Shard 1: users where hash(user_id) % 3 == 0 -> 33% of writes
Shard 2: users where hash(user_id) % 3 == 1 -> 33% of writes
Shard 3: users where hash(user_id) % 3 == 2 -> 33% of writes (BALANCED)

Strategy 2: Hash Sharding

Hash sharding applies a hash function to the sharding key to determine the shard:

python
def get_shard(user_id: int, num_shards: int) -> int:
    return hash(user_id) % num_shards
    # hash("user-12345") % 3 -> Shard 1
    # hash("user-99999") % 3 -> Shard 0

# Routing in application code:
def get_order(user_id: str, order_id: str):
    shard_index = get_shard(hash(user_id), NUM_SHARDS)
    db = shard_connections[shard_index]
    return db.query("SELECT * FROM orders WHERE id = %s", [order_id])

Advantage: Perfectly even distribution - no hot shards.

Disadvantage: Adding a shard requires re-hashing all data. If you go from 3 shards to 4 shards, hash(user_id) % 4 produces different shard assignments - most data must move.


Strategy 3: Consistent Hashing - Minimising Resharding

Consistent hashing places both shards and keys on a virtual "ring". When a shard is added or removed, only keys near that shard's position on the ring need to move:

text
Consistent Hash Ring (simplified):

         Key A -> Shard 1
    +------------------------+
    |   Shard 3    Shard 1   |  Adding Shard 4:
    |         ←‘              |  Only 25% of keys move
    |   Key D   Key B        |  (compared to 75% with simple modulo)
    |         ←"              |
    |        Shard 2         |
    +-------------------------+

Simple modulo (3->4 shards): 75% of keys move to different shard
Consistent hashing (3->4 shards): ~25% of keys move

Used by: DynamoDB, Cassandra, Redis Cluster, Riak

Implementation: Use a sorted array of virtual nodes. Libraries: ketama (Python), HashRing (Java), or use a database with built-in consistent hashing (Redis Cluster, Cassandra).


Strategy 4: Directory Sharding (Lookup Table)

A central lookup table maps each entity ID to its shard:

sql
-- Shard directory table (in a dedicated metadata DB):
CREATE TABLE shard_directory (
    user_id    BIGINT PRIMARY KEY,
    shard_id   SMALLINT NOT NULL,
    created_at TIMESTAMP NOT NULL
);

-- Routing in application:
SELECT shard_id FROM shard_directory WHERE user_id = $1;
-- Then query the appropriate shard

-- Rebalancing is simple: UPDATE shard_directory SET shard_id = 3 WHERE user_id = 12345;
-- Then migrate that user's data to shard 3
-- No formula change needed!

Advantage: Maximum flexibility - entries can be moved to different shards at will. Easy to handle "whale" tenants (move large tenants to dedicated shards).

Disadvantage: The directory table is a critical single point of failure and a performance bottleneck for every query. Must be heavily cached with Redis and designed for HA.


Cross-Shard Queries: The Pain of Sharding

After sharding, queries that previously used JOIN become complex:

python
# ❌ BEFORE sharding - simple join:
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.region = 'EU'
GROUP BY u.id;

# ❌ AFTER sharding - impossible as a single query:
# users are on Shard 1, 2, 3... orders are on Shard 1, 2, 3...
# You can't JOIN across shards in a single SQL statement

# ✅ Application-level join (scatter-gather):
eu_users = []
for shard in all_shards:
    eu_users.extend(shard.query("SELECT id, name FROM users WHERE region = 'EU'"))

user_ids = [u.id for u in eu_users]

# For each shard, get order counts for users that live there:
order_counts = {}
for shard in all_shards:
    shard_counts = shard.query(
        f"SELECT user_id, COUNT(*) AS cnt FROM orders WHERE user_id IN ({','.join(user_ids)}) GROUP BY user_id"
    )
    order_counts.update({row.user_id: row.cnt for row in shard_counts})

# Merge in application memory - expensive and slow!
result = [(u.name, order_counts.get(u.id, 0)) for u in eu_users]

Mitigation strategies:

  • Design your sharding key so that related entities (users + their orders) live on the same shard
  • Use a data warehouse (BigQuery, Redshift) for cross-shard analytics queries
  • Denormalise data - store copies on both shards for frequently-joined tables

Managed Sharding: Vitess and Citus

Building sharding middleware yourself is an enormous engineering investment. Use managed solutions:

Vitess (YouTube/MySQL): A proxy layer + orchestration system for MySQL sharding. Your app connects to Vitess as if it's a single MySQL server; Vitess routes queries to the right shard, handles resharding, manages replication, and provides a web UI for operations.

Citus (PostgreSQL): A PostgreSQL extension that transforms a single Postgres instance into a sharded cluster. Native PostgreSQL protocol - apps connect as if to a single database. Supports hash and range distribution, parallel query execution across shards, and transparent resharding.

sql
-- Citus: Mark a table for sharding
SELECT create_distributed_table('orders', 'user_id');
-- Citus handles all routing transparently from now on
-- All queries via psql/JDBC work as normal - Citus routes internally

Frequently Asked Questions

How do I handle transactions across shards? You generally can't use database-level transactions across shards - each shard has an independent transaction manager. Use the Saga Pattern for cross-shard operations: break the transaction into compensatable local transactions on each shard. For financial systems requiring strict atomicity, consider keeping all financial data for one entity on the same shard.

Can I shard without changing application code? With Vitess or Citus, you can shard transparently to the application for most query types. However, cross-shard JOINs and some aggregate queries require application changes. The rule: if you design your sharding key so queries always specify the sharding key, most queries route to a single shard with zero application changes.


Key Takeaway

Database sharding is a last-resort scaling strategy that comes with a high permanent complexity cost - loss of cross-shard joins, loss of multi-row transactions, need for scatter-gather queries for analytics. The sharding key decision is irreversible in practice - choosing user_id vs tenant_id vs region determines your system's scaling characteristics for years. Use read replicas, caching, and vertical scaling first. When you genuinely need sharding, prefer managed solutions (Vitess, Citus) over hand-rolled middleware, and invest heavily in choosing a sharding key that colocates related data on the same shard.

Read next: Leader-Follower Replication: High-Availability Database Patterns ->


Part of the Software Architecture Hub - comprehensive guides from architectural foundations to advanced distributed systems patterns.