Go Goroutines Explained: Concurrency for Beginners

What is a Goroutine?
A goroutine is a lightweight, concurrent function managed by the Go runtime rather than the operating system. Starting a goroutine requires only the go keyword before a function call and uses approximately 2KB of initial stack memory — compared to ~1MB for an OS thread. Go applications routinely run tens of thousands of goroutines simultaneously, making Go ideal for highly concurrent servers, pipelines, and microservices.
Concurrency 1: Goroutines & The Scheduler
Concurrency is Go's "killer feature." While other languages struggle with complex thread pools, heavy memory overhead, or a single-threaded event loop, Go makes high-performance concurrent programming as easy as typing two letters: go.
In this module, we will explore the Goroutine—a lightweight execution thread managed entirely by the Go runtime rather than the operating system.
Concurrency vs. Parallelism
Before diving in, a quick distinction: concurrency is about structuring a program to deal with many things at once, while parallelism is about actually executing many things at once on multiple cores. As Rob Pike famously put it, "concurrency is not parallelism." Go lets you write concurrent code, and the runtime decides how much of it runs in parallel on the available cores.
What is a Goroutine?
A Goroutine is a function that is executing concurrently with other Goroutines in the same address space. They are incredibly "cheap"—a typical Goroutine starts with only 2KB of stack space and is dynamically resized. You can easily run hundreds of thousands of them on a single laptop without crashing the system.
Launching Your First Goroutine
To start a Goroutine, simply prefix a function call with the keyword go.
The Magic: The Go Scheduler
The Go runtime includes its own scheduler that employs an M:N scheduling technique: it multiplexes M goroutines onto N OS threads.
The scheduler uses a strategy called "Work Stealing" to ensure that no CPU core remains idle while others are overloaded. This is why Go is the language of choice for cloud-native infrastructure like Docker, Kubernetes, and Terraform.
The Fork-Join Model
When you use the go keyword, you are "forking" a new branch of execution. However, the go keyword doesn't provide a way to "join" or wait for that execution to finish. To do that safely, we need synchronization tools like Channels or WaitGroups (which we will cover in the next modules).
The Power of Goroutines
- **~2KB stack**: Unlike OS threads, which require ~1MB of memory, goroutines allow massive density on minimal hardware.
- **User-space scheduling**: Because the Go runtime schedules goroutines itself, switching between them is orders of magnitude faster than an OS thread context switch.
- **Self-governing memory**: Goroutine stacks grow and shrink as needed, preventing stack-overflow errors while remaining memory-efficient.
| Task / Feature | OS Threads | Go Goroutines |
|---|---|---|
| Memory Cost | High (~1MB) | Low (~2KB) |
| Creation Time | Slow (System call) | Fast (Runtime allocation) |
| Management | Kernel/OS Scheduler | Go Runtime Scheduler |
| Quantity | Hundreds or Thousands | Millions |
Goroutines in Practice: HTTP Server Example
Every incoming HTTP request in a Go web server is handled in its own goroutine automatically. This means a Go server handling 10,000 simultaneous connections runs 10,000 goroutines, a load that would be prohibitively expensive with one OS thread per connection.
Here is a simple demonstration of launching goroutines to process tasks concurrently:
Without goroutines, processing 100 orders sequentially at 100ms each would take 10 seconds. With goroutines, all 100 complete in approximately 100ms.
Detecting Race Conditions
Go ships with a built-in race detector. Run your code or tests with the -race flag (for example, `go run -race main.go` or `go test -race ./...`) to catch data races during development.
The race detector instruments your code to detect when two goroutines access the same memory location concurrently without proper synchronisation. It adds runtime overhead, so use it during testing and development, not in production binaries.
A common race condition looks like this:
The fix is to use either a sync.Mutex to protect the shared variable or a channel to coordinate access.
Anonymous Goroutines
A very common Go pattern is launching an anonymous function as a goroutine. This is useful for fire-and-forget background tasks:
Be careful with closures and goroutines. Variables captured by a goroutine closure must not be mutated by the main goroutine without synchronisation. When launching goroutines in a loop, pass the loop variable as a function argument to avoid the classic closure-over-loop-variable bug (Go 1.22 changed for loops so each iteration gets its own variable, but passing the value explicitly remains the clearest style and protects older codebases):
Why Go's Concurrency Model Matters for Production
The reason companies like Cloudflare, Uber, Dropbox, and Docker chose Go as their primary language is directly attributable to goroutines. A Node.js server handles I/O-bound work on a single-threaded event loop, so one slow, CPU-heavy callback stalls every connection. A Python server using OS threads pays megabytes of stack per thread, contends with the Global Interpreter Lock for CPU-bound work, and switches between threads via the OS scheduler with significant overhead.
Go's goroutine model hits the sweet spot: true parallelism across multiple CPU cores (unlike Node.js) with the lightweight scheduling overhead of a green thread system (unlike OS threads). The result is servers that can handle extraordinary concurrency with predictable, low latency.
Further Reading
This is the first module in the Go concurrency series. Continue with Go channels and communication and then select statements and WaitGroups. For the broader context of what makes Go powerful, see what is Go programming language.
Next Steps
Goroutines are powerful, but they are dangerous if they can't communicate. If two Goroutines try to access the same variable at the same time, you'll get a Race Condition, which can lead to unpredictable crashes. In the next tutorial, we will explore Channels—the safe, elegant way for Goroutines to talk to each other.
Common Goroutine Mistakes
1. Launching goroutines without a way to wait for them
go doWork() starts a goroutine and returns immediately. If main() exits, all goroutines are killed. Use a sync.WaitGroup to wait for goroutines to finish before the program exits.
2. Goroutine leaks
A goroutine blocked on a channel read or waiting for a mutex that is never released leaks forever. Always ensure goroutines have a way to exit — pass a context.Context and select on ctx.Done(). The Go blog on concurrency patterns covers leak-free patterns.
3. Race conditions on shared memory
Two goroutines reading and writing the same variable without synchronisation is a data race. Run go test -race or go run -race to detect races. Use sync.Mutex or channels to protect shared state.
4. Closing a channel from the receiver side
Only the sender should close a channel. Closing from the receiver, or closing an already-closed channel, panics. Establish a clear ownership rule: the goroutine that creates and sends on a channel is responsible for closing it.
5. Using goroutines for CPU-bound work without limiting parallelism
Spawning thousands of goroutines for CPU-bound tasks creates more OS threads than CPU cores and causes thrashing. Use a worker pool pattern — a fixed number of goroutines reading from a shared job channel — to limit parallelism to runtime.NumCPU().
Frequently Asked Questions
How many goroutines can a Go program run simultaneously? Go can comfortably run millions of goroutines. Each starts with a small stack (~2KB) that grows dynamically as needed. The Go runtime multiplexes goroutines onto OS threads using its M:N scheduler. See the Go runtime scheduler documentation for the mechanics.
What is the difference between a goroutine and a thread? OS threads are managed by the kernel and typically have a fixed 1–8MB stack. Goroutines are managed by the Go runtime, start with ~2KB, and are much cheaper to create and context-switch. A typical Go server runs thousands of goroutines on a handful of OS threads.
When should I use a goroutine vs a channel vs a mutex? Use goroutines to express concurrency. Use channels to communicate between goroutines and to transfer ownership of data. Use mutexes to protect shared state that multiple goroutines access. The Go proverb applies: "Do not communicate by sharing memory; instead, share memory by communicating."
