SQL Transactions: Concurrency and Isolation

SQL Transactions: Concurrency and Isolation
Imagine you are transferring 100 from your savings account to your checking account. The database has to do two things:
- Subtract 100 from Savings.
- Add 100 to Checking.
The Nightmare: What if the power goes out after Step 1? The 100 vanishes into thin air. You have lost the customer's money, and your database is in a "Corrupt State." To prevent this, we use Transactions. This 1,500+ word guide covers the "Fortress Tier" of database engineering.
1. ACID: The Four Pillars of Reliability
A transaction is a single "Atomic" unit of work. It either succeeds 100% or fails 100%, with zero in-between. To achieve this in physical silicon, the database follows the ACID protocol.
1. Atomicity: The "All or Nothing" Fuse
If one query in a 10-statement transaction fails, the entire transaction is Rolled Back.
- The Mirror: The engine keeps a "Transaction Log" of every change. If a failure occurs, it reads the log in reverse and un-does every write operation until the data is exactly as it was before the transaction started.
2. Consistency: The "Rule of Law"
The database moves from one valid state to another. Constraints (like NOT NULL, UNIQUE, or FOREIGN KEY) are enforced.
- The Physics: Consistency is not just about data types; it's about business integrity. If a transaction would result in a negative bank balance (and you have a constraint preventing that), the hardware triggers an immediate abort.
3. Isolation: The "Solitude" Mirror
Even if 1,000 people are using the database at the same time, their transactions shouldn't "Leak" into each other.
- The Challenge: True isolation is computationally expensive. As we will see, databases use "Isolation Levels" to balance safety with performance.
4. Durability: The "Permanent Record"
Once the database says "Success," the data is physically saved. If the server loses power 1 millisecond later, the data must still be there when it reboots.
- The Physics: As we will explore in the WAL section, this is achieved through sequential disk appends rather than random file writes.
2. Hardware-Mirror: MVCC vs. 2PL
How does a database handle multiple people editing the same row at the same time? There are two primary schools of thought: Locking vs. Versioning.
The 2PL (Two-Phase Locking) Physics
Older databases (and some modern ones like SQL Server in certain modes) use Two-Phase Locking.
- Phase 1: Expansion: The transaction acquires all the locks it needs (Shared locks for reading, Exclusive locks for writing).
- Phase 2: Shrinking: The transaction releases all locks at once upon COMMIT or ROLLBACK.
- The Mirror Limit: In a 2PL system, "Readers block Writers" and "Writers block Readers." If you are running a long report (Read), nobody can update those rows (Write). This causes massive performance bottlenecks in high-traffic mirrors.
The MVCC (Multi-Version Concurrency Control) Mirror
Modern engines (Postgres, Oracle, MySQL/InnoDB) use MVCC.
- The Concept: Instead of locking a row, the database creates a New Version of the row.
- The Snapshot: When you start a transaction, the engine gives you a "Snapshot" of the database at that exact moment. You see the world as it was when you started.
- The Win: In MVCC, "Readers NEVER block Writers" and "Writers NEVER block Readers." You can run a 24-hour report while millions of updates happen in the background without any performance impact.
3. The WAL Internals: Write-Ahead Logging
How does a database guarantee Durability? If it had to write to the massive data files on the SSD every time you hit "Save," it would be too slow.
The Physics of the WAL
- The Log First: When you change a row, the database writes the change to a lightweight, sequential file called the WAL (Write-Ahead Log). This is very fast because the disk head doesn't have to move (Sequential I/O).
- The Success: Once the change is safely in the WAL (flushed to the silicon), the database tells you "Success!"
- The Checkpoint: Later, a background process (the Checkpointer) carries the changes from the WAL to the actual data tables.
- Crash Recovery: If the power fails, the database re-reads the WAL upon reboot and "Replays" any changes that haven't made it to the main tables yet. This is the physical secret of ACID.
4. The 4 Isolation Levels: The Safety/Speed Trade-off
This is the most complex part of SQL. You must choose how much "Isolation" you want. The ANSI standard defines four levels to prevent three specific "Phenomena."
The Phenomena
- Dirty Read: Seeing data that was written by another transaction but not yet committed.
- Non-Repeatable Read: Reading a row, then reading it again and seeing it has changed.
- Phantom Read: Reading a set of rows, then reading again and seeing new rows have appeared.
The Levels
| Level | Dirty Reads | Non-Repeatable | Phantoms | Hardware Cost |
|---|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible | Lowest (Memory Chaos) |
| Read Committed | Not Possible | Possible | Possible | Low (Postgres Default) |
| Repeatable Read | Not Possible | Not Possible | Possible | Medium (Snapshot Sync) |
| Serializable | Not Possible | Not Possible | Not Possible | Highest (Total Locking) |
Architect's Standard:
- In 2026, Read Committed is the industry standard for 95% of tasks.
- For financial transfers or inventory logic (the "Double-Spend" problem), you must use Serializable or explicit locking.
5. Case Study: The "Double-Spend" Disaster
A crypto exchange was using Read Committed for their withdrawals.
- The Scenario: A user has 100. They send two withdrawal requests for 100 at the exact same millisecond.
- The Failure:
- Transaction A checks balance: "It's 100. OK."
- Transaction B checks balance: "It's 100. OK."
- Transaction A subtracts 100 and sends money.
- Transaction B subtracts 100 and sends money.
- The Result: The user sent 200 even though they only had 100. The exchange lost millions in a "Race Condition."
The Architect's Fix
- Pessimistic Locking: Use
SELECT ... FOR UPDATE. This tells the database: "Lock this row. Nobody else can even READ it until I'm finished." This forces Transaction B to wait until A is done. - Serializable Isolation: Let the database detect the conflict. If Transaction B tries to commit, the database will say "Serialization Failure" and force B to retry.
6. Deadlock Physics: The Infinite Wait
A Deadlock happens when Transaction A is waiting for a lock held by Transaction B, and Transaction B is waiting for a lock held by Transaction A. They are stuck in a logical loop.
The Graph Detection Mirror
Modern databases maintain a Wait-for Graph.
- Nodes: Current Transactions.
- Edges: "A is waiting for a lock held by B."
- The Detection: The engine runs a cycle-detection algorithm on this graph. If it finds a loop (A -> B -> A), it has found a deadlock.
- The Sacrifice: The engine picks one transaction (usually the one that did the least work) and "Kills" it (ABORT), effectively breaking the cycle and allowing the other to survive.
How to Prevent Deadlocks
- Consistent Order: Always access your tables in the Same Order (e.g., always update
usersthenorders). - Short Transactions: Never put "Human Input" or "External API Calls" inside a transaction. Keep it strictly database-to-database to minimize lock duration.
7. The Savepoint Mirror: Partial Rollbacks
In complex batch processing, you don't always want to roll back the entire transaction if one small step fails.
The Physics: Savepoints create "Snapshots" of the transaction's internal state. They allow for complex error-handling logic without the heavy performance cost of restarting a massive transaction from scratch.
8. Summary: The Transaction Checklist
- Keep it Short: Locks are expensive. The longer a transaction lives, the more it slows down the entire system.
- Prefer MVCC: Understand that Postgres/Oracle readers don't block writers. If you are on SQL Server, ensure
READ_COMMITTED_SNAPSHOTis enabled to match this performance. - Isolation Awareness: Don't use
REPEATABLE READunless you specifically need to avoid non-repeatable reads; it increases the risk of serialization failures. - Audit the WAL: Monitor your WAL write latency. High WAL latency usually indicates a slow disk or a bottleneck in the disk I/O mirror.
- Clean Up: Always ensure your application code has a
finallyblock that callsROLLBACKif the transaction wasn't committed. "Zombies" (Idle in Transaction) are the silent killers of enterprise databases.
Transactions are the "Conscience" of your application. By mastering the WAL, the nuances of MVCC, and the physical reality of deadlocks, you gain the power to build systems that are fundamentally Unbreakable. You graduate from "Gathering data" to "Architecting Integrity."
Phase 17: Transaction Mastery Checklist
- Audit Lock Wait Times: Monitor system views (like
pg_stat_activity) for transactions waiting on locks. - Implement Sovereign Savepoints: Use savepoints in long batch processes to handle partial failures.
- Configure WAL Geometry: Ensure your WAL segments are sized correctly for your write throughput to minimize checkpoint spikes.
- Verify Isolation Levels: Ensure financial logic uses
SERIALIZABLEor explicit row-locking (FOR UPDATE).
Read next: SQL ACID Properties: The Mechanics of Truth →
Part of the SQL Mastery Course — engineering the truth.
