Database Sharding: Horizontal Partitioning Strategies for Petabyte-Scale Systems
Complete guide to database sharding patterns. Understand the difference between vertical and horizontal scaling, choose the right sharding key to avoid hot shards, compare Range vs Hash vs Directory sharding strategies, implement consistent hashing to minimise resharding cost, use Vitess and Citus for managed sharding, handle cross-shard queries and distributed transactions, and learn when sharding is the wrong answer.

Database Sharding: Horizontal Partitioning Strategies for Petabyte-Scale Systems
Table of Contents
- Vertical vs Horizontal Scaling: When Each Fails
- What Sharding Actually Does
- The Sharding Key Decision: Most Critical Choice
- Strategy 1: Range Sharding
- Strategy 2: Hash Sharding
- Strategy 3: Consistent Hashing - Minimising Resharding
- Strategy 4: Directory Sharding (Lookup Table)
- Cross-Shard Queries: The Pain of Sharding
- Managed Sharding: Vitess and Citus
- When NOT to Shard
- Resharding: The Hardest Operation
- Frequently Asked Questions
- Key Takeaway
Vertical vs Horizontal Scaling: When Each Fails
Before sharding, exhaust these options:
- Query optimisation (indexes, query plans, avoid N+1)
- Read replicas (offload 80%+ of traffic)
- Vertical scaling (larger instance type)
- Caching layer (Redis - serve hot data without DB)
- Database partitioning (single DB, multiple files - simpler than sharding)
Sharding should be the last resort - it fundamentally changes how your application queries data.
What Sharding Actually Does
Sharding horizontally partitions rows across multiple database instances. Each shard is an independent database that owns a subset of the data:
Unsharded (1 server): Sharded (3 servers):
+------------------+ +-------------+ +-------------+ +-------------+
| orders table | | Shard 1 | | Shard 2 | | Shard 3 |
| id=1 user=alice | -> | id=1 alice | | id=6 carol | | id=4 eve |
| id=2 user=bob | | id=2 bob | | id=7 dave | | id=5 frank |
| id=3 user=carol | | id=3 carol? | +--------------+ +--------------+
| ...10M rows... | +--------------+
+-------------------+ Each shard handles 3.3M rows
Each accepts writes independentlyYour application layer (or a sharding proxy) must determine which shard to route each query to based on a sharding key.
The Sharding Key Decision: Most Critical Choice
The sharding key is the column used to determine which shard owns a row. A poor sharding key causes hot shards - one shard receiving disproportionate traffic while others sit idle.
| Sharding Key | Distribution | Query Pattern | Risk |
|---|---|---|---|
user_id | Even (high cardinality) | All user data on one shard | ✅ Good default |
tenant_id | Uneven (large tenants) | Complete tenant isolation | Hot shard for large tenants |
created_at | All new data to one shard | Time-range queries land on one shard | ❌ Always hot |
random UUID | Perfect distribution | No locality - every query may hit any shard | Cross-shard joins nightmare |
region | Depends on geography | Geographic locality | Hot if users cluster by region |
The hot shard problem in detail:
created_at-based sharding (2026 data):
Shard 1: 2024 data -> 0% of current writes
Shard 2: 2025 data -> 5% of current writes
Shard 3: 2026 data -> 95% of all current writes (BURNING HOT)
user_id-based sharding (hash to 3 shards):
Shard 1: users where hash(user_id) % 3 == 0 -> 33% of writes
Shard 2: users where hash(user_id) % 3 == 1 -> 33% of writes
Shard 3: users where hash(user_id) % 3 == 2 -> 33% of writes (BALANCED)Strategy 2: Hash Sharding
Hash sharding applies a hash function to the sharding key to determine the shard:
def get_shard(user_id: int, num_shards: int) -> int:
return hash(user_id) % num_shards
# hash("user-12345") % 3 -> Shard 1
# hash("user-99999") % 3 -> Shard 0
# Routing in application code:
def get_order(user_id: str, order_id: str):
shard_index = get_shard(hash(user_id), NUM_SHARDS)
db = shard_connections[shard_index]
return db.query("SELECT * FROM orders WHERE id = %s", [order_id])Advantage: Perfectly even distribution - no hot shards.
Disadvantage: Adding a shard requires re-hashing all data. If you go from 3 shards to 4 shards, hash(user_id) % 4 produces different shard assignments - most data must move.
Strategy 3: Consistent Hashing - Minimising Resharding
Consistent hashing places both shards and keys on a virtual "ring". When a shard is added or removed, only keys near that shard's position on the ring need to move:
Consistent Hash Ring (simplified):
Key A -> Shard 1
+------------------------+
| Shard 3 Shard 1 | Adding Shard 4:
| ←‘ | Only 25% of keys move
| Key D Key B | (compared to 75% with simple modulo)
| ←" |
| Shard 2 |
+-------------------------+
Simple modulo (3->4 shards): 75% of keys move to different shard
Consistent hashing (3->4 shards): ~25% of keys move
Used by: DynamoDB, Cassandra, Redis Cluster, RiakImplementation: Use a sorted array of virtual nodes. Libraries: ketama (Python), HashRing (Java), or use a database with built-in consistent hashing (Redis Cluster, Cassandra).
Strategy 4: Directory Sharding (Lookup Table)
A central lookup table maps each entity ID to its shard:
-- Shard directory table (in a dedicated metadata DB):
CREATE TABLE shard_directory (
user_id BIGINT PRIMARY KEY,
shard_id SMALLINT NOT NULL,
created_at TIMESTAMP NOT NULL
);
-- Routing in application:
SELECT shard_id FROM shard_directory WHERE user_id = $1;
-- Then query the appropriate shard
-- Rebalancing is simple: UPDATE shard_directory SET shard_id = 3 WHERE user_id = 12345;
-- Then migrate that user's data to shard 3
-- No formula change needed!Advantage: Maximum flexibility - entries can be moved to different shards at will. Easy to handle "whale" tenants (move large tenants to dedicated shards).
Disadvantage: The directory table is a critical single point of failure and a performance bottleneck for every query. Must be heavily cached with Redis and designed for HA.
Cross-Shard Queries: The Pain of Sharding
After sharding, queries that previously used JOIN become complex:
# ❌ BEFORE sharding - simple join:
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.region = 'EU'
GROUP BY u.id;
# ❌ AFTER sharding - impossible as a single query:
# users are on Shard 1, 2, 3... orders are on Shard 1, 2, 3...
# You can't JOIN across shards in a single SQL statement
# ✅ Application-level join (scatter-gather):
eu_users = []
for shard in all_shards:
eu_users.extend(shard.query("SELECT id, name FROM users WHERE region = 'EU'"))
user_ids = [u.id for u in eu_users]
# For each shard, get order counts for users that live there:
order_counts = {}
for shard in all_shards:
shard_counts = shard.query(
f"SELECT user_id, COUNT(*) AS cnt FROM orders WHERE user_id IN ({','.join(user_ids)}) GROUP BY user_id"
)
order_counts.update({row.user_id: row.cnt for row in shard_counts})
# Merge in application memory - expensive and slow!
result = [(u.name, order_counts.get(u.id, 0)) for u in eu_users]Mitigation strategies:
- Design your sharding key so that related entities (users + their orders) live on the same shard
- Use a data warehouse (BigQuery, Redshift) for cross-shard analytics queries
- Denormalise data - store copies on both shards for frequently-joined tables
Managed Sharding: Vitess and Citus
Building sharding middleware yourself is an enormous engineering investment. Use managed solutions:
Vitess (YouTube/MySQL): A proxy layer + orchestration system for MySQL sharding. Your app connects to Vitess as if it's a single MySQL server; Vitess routes queries to the right shard, handles resharding, manages replication, and provides a web UI for operations.
Citus (PostgreSQL): A PostgreSQL extension that transforms a single Postgres instance into a sharded cluster. Native PostgreSQL protocol - apps connect as if to a single database. Supports hash and range distribution, parallel query execution across shards, and transparent resharding.
-- Citus: Mark a table for sharding
SELECT create_distributed_table('orders', 'user_id');
-- Citus handles all routing transparently from now on
-- All queries via psql/JDBC work as normal - Citus routes internallyFrequently Asked Questions
How do I handle transactions across shards? You generally can't use database-level transactions across shards - each shard has an independent transaction manager. Use the Saga Pattern for cross-shard operations: break the transaction into compensatable local transactions on each shard. For financial systems requiring strict atomicity, consider keeping all financial data for one entity on the same shard.
Can I shard without changing application code? With Vitess or Citus, you can shard transparently to the application for most query types. However, cross-shard JOINs and some aggregate queries require application changes. The rule: if you design your sharding key so queries always specify the sharding key, most queries route to a single shard with zero application changes.
Key Takeaway
Database sharding is a last-resort scaling strategy that comes with a high permanent complexity cost - loss of cross-shard joins, loss of multi-row transactions, need for scatter-gather queries for analytics. The sharding key decision is irreversible in practice - choosing user_id vs tenant_id vs region determines your system's scaling characteristics for years. Use read replicas, caching, and vertical scaling first. When you genuinely need sharding, prefer managed solutions (Vitess, Citus) over hand-rolled middleware, and invest heavily in choosing a sharding key that colocates related data on the same shard.
Read next: Leader-Follower Replication: High-Availability Database Patterns ->
Part of the Software Architecture Hub - comprehensive guides from architectural foundations to advanced distributed systems patterns.
