
Rust Concurrency: Threads, Channels, and Shared State

TopicTrick Team

For over a decade, software engineering has marched steadily toward distributed, multi-core architectures. If your application runs on a single core of a modern 12-core CPU, you are leaving more than 90% of your available compute power on the table.

Concurrent code is highly performant but brutally difficult to get right. When multiple threads access the same memory at the same time, and at least one of them writes, a data race can occur, producing state corruption that is notoriously hard to reproduce during debugging.

In C++ and Java, you fight data races with disciplined code reviews and careful manual locking. In Rust, the compiler rejects code that could contain a data race. The community affectionately calls this Fearless Concurrency.


1. Creating Threads with thread::spawn

In Rust, the standard library provides native access to 1:1 operating system threads via the std::thread module. When you spawn a thread, the OS schedules a separate thread of execution that runs concurrently with your main thread.

```rust
use std::thread;
use std::time::Duration;

fn main() {
    let handle = thread::spawn(|| { // Spawns a closure onto a new thread!
        for i in 1..5 {
            println!("Hi number {} from the spawned thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
    });

    for i in 1..3 {
        println!("Hi number {} from the main thread!", i);
        thread::sleep(Duration::from_millis(1));
    }

    // Force the main thread to wait for the spawned thread to finish!
    handle.join().unwrap();
}
```

If you do not call .join() on the returned handle, main may return as soon as its own loop finishes, ending the process and killing the spawned thread before it completes all of its iterations!
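Incidentally, join() also carries the closure's result back to the parent thread, so a spawned thread can return a computed value. A minimal sketch (the sum here is just an illustrative workload):

```rust
use std::thread;

fn main() {
    // The closure's final expression becomes the thread's return value.
    let handle = thread::spawn(|| {
        (1..=100).sum::<u32>() // computed on the worker thread
    });

    // join() yields a Result; Ok holds the closure's return value.
    let total = handle.join().unwrap();
    println!("Sum computed off the main thread: {}", total); // 5050
}
```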

The move Keyword

What happens if the spawned thread wants to use a piece of data that belongs to the main thread?

```rust
use std::thread;

fn main() {
    let vector = vec![1, 2, 3];

    // ERROR! The closure borrows `vector`, but the spawned thread might
    // outlive the main thread, resulting in a dangling pointer!
    // let handle = thread::spawn(|| {
    //     println!("Here's a vector: {:?}", vector);
    // });

    // CORRECTION: We use `move` to transfer ownership into the thread!
    let handle = thread::spawn(move || {
        println!("Here's a vector: {:?}", vector);
    });

    handle.join().unwrap();
}
```

Placing the move keyword before the closure's parameter list transfers ownership of vector into the spawned thread. The main thread can never use vector again, so the compiler can prove that no data race is possible.


2. Message Passing: Channels (mpsc)

There is a famous proverb from the Go language community that Rust heavily adopted: "Do not communicate by sharing memory; instead, share memory by communicating."

Instead of building a large block of shared memory that every thread competes for, you can use channels. Think of a channel as a pipe connecting two threads: one thread pushes data in at the transmitter end, and the other pulls it out at the receiver end.

Rust implements a Multiple Producer, Single Consumer (mpsc) channel by default.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Create the channel! tx = transmitter, rx = receiver.
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        let val = String::from("Hello from the backend!");

        // Push the payload into the transmitter.
        // Note: `send` MOVES ownership of `val` into the channel!
        tx.send(val).unwrap();

        // println!("{}", val); // Compile error! `val` has been moved.
    });

    // The receiver pulls the payload out safely on the main thread.
    // `.recv()` blocks the main thread until a message arrives.
    let received = rx.recv().unwrap();
    println!("Got: {}", received);
}
```

Channels are safe by construction because the send method takes ownership of the data. Once thread A sends the string to thread B, thread A can no longer access it. There can be no race condition because the data never has two owners.
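The "multiple producer" half of mpsc deserves a quick illustration: the transmitter can be cloned, one clone per worker thread, while the single receiver collects everything. A minimal sketch (the worker ids are invented for the example):

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    // One cloned transmitter per worker: "multiple producers" in action.
    for id in 0..3 {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(format!("worker {} done", id)).unwrap();
        });
    }

    // Drop the original transmitter; the channel closes once every clone is gone.
    drop(tx);

    // Iterating the receiver yields messages until the channel closes.
    for msg in rx {
        println!("Got: {}", msg);
    }
}
```

The messages arrive in whatever order the threads happen to run, which is why each one carries its own id.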


3. Shared State Concurrency

Sometimes, passing messages is too rigid. For instance, you might have a global hit counter in a highly concurrent web API: every incoming HTTP connection thread needs to increment the exact same counter in memory.

You must share memory. You must use a Mutex.

As we briefly explored in the Smart Pointers module, a Mutex (mutual exclusion) guards data with a lock. To access the data, a thread must first acquire the lock; when the guard goes out of scope, the lock is released.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // We wrap a Mutex inside an Arc (Atomic Reference Counting)
    // because multiple threads need shared ownership of the Mutex itself!
    let hit_counter = Arc::new(Mutex::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter_clone = Arc::clone(&hit_counter);

        let handle = thread::spawn(move || {
            // .lock() blocks this thread until no one else holds the lock.
            let mut num = counter_clone.lock().unwrap();

            // Mutate the inner data!
            *num += 1;
        }); // The lock guard is dropped here, releasing the lock!

        handles.push(handle);
    }

    // Wait for all 10 threads to finish competing for the lock.
    for handle in handles {
        handle.join().unwrap();
    }

    println!("Total hits: {}", *hit_counter.lock().unwrap()); // Prints 10
}
```

Notice that we had to wrap the Mutex in an Arc. If we had used Rc (the non-atomic reference counter) instead, the compiler would have recognized that Rc is not safe to share between threads and rejected the build immediately!
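To see the difference for yourself, try swapping Arc for Rc in a spawn call. A minimal sketch of both outcomes:

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // Arc<T> can cross threads (when T is Send + Sync), so this compiles and runs:
    let shared = Arc::new(vec![1, 2, 3]);
    let clone = Arc::clone(&shared);
    let handle = thread::spawn(move || clone.len());
    assert_eq!(handle.join().unwrap(), 3);

    // Rc<T> is not Send; uncommenting this fails the build with
    // "`Rc<Vec<i32>>` cannot be sent between threads safely":
    // let rc = std::rc::Rc::new(vec![1, 2, 3]);
    // thread::spawn(move || rc.len());
}
```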


4. The Magic Behind the Curtain: Send and Sync

How does the compiler "know" that Rc is unsafe across threads, but Arc is perfectly fine? Does it simulate multi-threading during compilation? No.

Rust's entire Fearless Concurrency model is powered by exactly two marker traits: Send and Sync.

The Send Trait

If a type implements Send, it means that Ownership of the type can be safely transferred to another thread.

Almost all primitive types implement Send, as do String, Vec, and Box. Because Rc<T> updates its reference count with non-atomic operations, it does not implement Send. Since thread::spawn requires its closure (and everything it captures) to be Send, the compiler immediately flags any captured Rc as an error.

The Sync Trait

If a type implements Sync, it is safe to reference it (&T) from multiple threads at the same time.

Essentially, a type is Sync if &T (an immutable reference to it) is Send. Standard primitive types are natively Sync.
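A quick illustration of Sync in practice: a static &str is both Sync and 'static, so any number of threads may read it simultaneously without locks. A minimal sketch (the CONFIG name is invented for the example):

```rust
use std::thread;

// &str is Sync, and a static lives for the whole program,
// so every thread may hold a reference to it at once.
static CONFIG: &str = "production";

fn main() {
    let mut handles = vec![];
    for i in 0..3 {
        handles.push(thread::spawn(move || {
            println!("thread {} sees mode: {}", i, CONFIG);
        }));
    }
    for handle in handles {
        handle.join().unwrap();
    }
}
```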

However, RefCell<T> (which provides interior mutability) does not implement Sync, because its runtime borrow counting is not atomic; concurrent access could corrupt it. Wrapping a RefCell in an Arc therefore does not make it thread-safe. You need Mutex<T>, which is Sync precisely because it enforces mutual exclusion at runtime.

| | Channels (mpsc) | Mutex + Arc |
| --- | --- | --- |
| Safety mechanism | Ownership transfer (move semantics) | Exclusive locking (threads wait/block) |
| Data shape | One-way stream of discrete messages | Centralized shared state |
| Primary benefit | No shared mutable state; scales naturally | Full synchronous control of shared state |

Summary and Next Steps

The Rust compiler does not simply provide "guidelines" for concurrency—it enforces a strict set of compile-time guarantees.

If your code compiles, the compiler has proven, via Send, Sync, and the borrow checker, that data races are structurally impossible. This lets developers introduce thread::spawn throughout a codebase without the risk profile typical of C++.

Up to this point, whenever we encountered an error in our threads or files, we used .unwrap(), which crashes the application immediately (a panic). That is unacceptable for production software. In the next module, we investigate how to structure professional, non-panicking error APIs.

Read next: Rust Error Handling: Custom Errors, thiserror, and anyhow



Quick Knowledge Check

Why does the thread::spawn(|| { ... }) closure often explicitly require the move keyword before its declaration brackets?

  1. To tell the Operating System to move the thread's execution away from CPU Core 0.
  2. To transfer ownership of referenced outer-scope variables into the closure, ensuring the spawned thread cannot access a dangling pointer if the main thread finishes first. ✓
  3. To bypass the 'Send' trait for non-atomic smart pointers like Rc<T>.
  4. It is syntactic sugar denoting that the execution space is logically nested.

Explanation: Because a spawned thread may outlive the function that created it, any variables it captures from the environment must be owned by the thread. The 'move' keyword enforces an ownership transfer, so the data migrates into the thread's own scope.