C++20 Coroutines: co_await, co_yield, co_return, Generators & Async I/O Deep Dive

Table of Contents
- What Makes a Function a Coroutine
- The Coroutine Frame and promise_type
- Awaitables: suspend_always, suspend_never, co_await
- Building Generator<T> with co_yield
- std::generator<T> (C++23): Out of the Box
- Building Task<T> for Async I/O
- Coroutine Lifetime and Heap Allocation
- Coroutines in Production: asio and libcoro
- Coroutines vs Threads: When to Use Which
- Frequently Asked Questions
- Key Takeaway
What Makes a Function a Coroutine
A function becomes a coroutine the moment it contains any of these three keywords in its body:
| Keyword | Meaning | Use Case |
|---|---|---|
| co_await expr | Suspend until expr completes | Waiting for I/O, timers, other tasks |
| co_yield value | Suspend and produce a value | Generators, lazy sequences |
| co_return value | Complete the coroutine with a result | Return final value from async task |
The coroutine's return type must satisfy the coroutine protocol, typically by providing a nested promise_type, which the compiler looks up through std::coroutine_traits.
The Coroutine Frame and promise_type
Every coroutine has a coroutine frame, which is typically heap-allocated storage for:
- All local variables that must survive suspension
- The promise_type instance
- The resume and destroy function pointers
- The suspend point (where it last suspended)
#include <coroutine>
#include <exception>
#include <utility>

// Minimal coroutine return type: a task that produces a single int
struct SimpleTask {
    struct promise_type {
        int result = 0;
        std::exception_ptr exception;

        // Called to create the return object (SimpleTask instance):
        SimpleTask get_return_object() {
            return SimpleTask{
                std::coroutine_handle<promise_type>::from_promise(*this)
            };
        }
        // Called BEFORE the coroutine body runs:
        std::suspend_always initial_suspend() { return {}; }   // Lazy start
        // std::suspend_never initial_suspend() { return {}; } // Eager start
        // Called AFTER co_return (coroutine done):
        std::suspend_always final_suspend() noexcept { return {}; } // Keep frame alive
        // co_return value; → sets promise.result, then suspends at final_suspend
        void return_value(int val) { result = val; }
        // Capture exceptions that escape the coroutine body:
        void unhandled_exception() { exception = std::current_exception(); }
    };

    std::coroutine_handle<promise_type> handle;

    explicit SimpleTask(std::coroutine_handle<promise_type> h) : handle(h) {}
    // Move transfers ownership. A defaulted move would copy the handle,
    // and both objects would then destroy the same frame.
    SimpleTask(SimpleTask&& other) noexcept : handle(std::exchange(other.handle, {})) {}
    SimpleTask(const SimpleTask&) = delete;
    ~SimpleTask() { if (handle) handle.destroy(); } // Free the coroutine frame

    // Resume and get result:
    int get() {
        if (!handle.done()) handle.resume(); // Run to completion (or to next suspension)
        if (handle.promise().exception)
            std::rethrow_exception(handle.promise().exception);
        return handle.promise().result;
    }
};

// Usage:
SimpleTask my_computation() {
    int a = 1 + 1;
    co_return a * 21; // 42
}

SimpleTask t = my_computation(); // Coroutine suspended at initial_suspend
int result = t.get();            // Resume → runs body → returns 42
Building Generator<T> with co_yield
A generator is a coroutine that lazily produces a sequence of values using co_yield:
#include <coroutine>
#include <cstddef>
#include <exception>
#include <fstream>
#include <iostream>
#include <iterator>
#include <ranges>
#include <string>
#include <utility>

template<typename T>
class Generator {
public:
    struct promise_type {
        T value;
        Generator get_return_object() {
            return Generator{std::coroutine_handle<promise_type>::from_promise(*this)};
        }
        std::suspend_always initial_suspend() { return {}; }
        std::suspend_always final_suspend() noexcept { return {}; }
        void return_void() {}
        void unhandled_exception() { std::terminate(); }
        // co_yield value; → stores value, suspends
        std::suspend_always yield_value(T val) {
            value = std::move(val);
            return {};
        }
    };

    // Iterator support. The member types, const operator* and post-increment
    // are what std::input_iterator (and therefore std::views) requires:
    struct iterator {
        using iterator_concept = std::input_iterator_tag;
        using value_type = T;
        using difference_type = std::ptrdiff_t;

        std::coroutine_handle<promise_type> handle;
        bool done = true;

        iterator& operator++() {
            handle.resume();
            done = handle.done();
            return *this;
        }
        void operator++(int) { ++*this; }
        T& operator*() const { return handle.promise().value; }
        bool operator==(std::default_sentinel_t) const { return done; }
    };

    iterator begin() {
        handle_.resume(); // Start the coroutine
        return {handle_, handle_.done()};
    }
    std::default_sentinel_t end() { return {}; }

    // Move-only: exactly one Generator owns (and destroys) the frame
    Generator(Generator&& other) noexcept : handle_(std::exchange(other.handle_, {})) {}
    Generator& operator=(Generator&& other) noexcept {
        if (this != &other) {
            if (handle_) handle_.destroy();
            handle_ = std::exchange(other.handle_, {});
        }
        return *this;
    }
    ~Generator() { if (handle_) handle_.destroy(); }
private:
    explicit Generator(std::coroutine_handle<promise_type> h) : handle_(h) {}
    std::coroutine_handle<promise_type> handle_;
};

// Using the generator:
Generator<int> fibonacci() {
    int a = 0, b = 1;
    while (true) {
        co_yield a;
        auto next = a + b;
        a = b;
        b = next;
    }
}
// The filename is taken by value: the coroutine starts lazily, so a reference
// to a temporary string could dangle before the body ever runs.
Generator<std::string> read_lines(std::string filename) {
    std::ifstream file(filename);
    std::string line;
    while (std::getline(file, line)) {
        co_yield line; // Yield one line at a time; the file stays open between yields
    }
}

int main() {
    // Range-compatible: works with range-based for and with range adaptors
    for (int fib : fibonacci() | std::views::take(10)) {
        std::cout << fib << ' '; // 0 1 1 2 3 5 8 13 21 34
    }
    for (const auto& line : read_lines("huge_file.txt") | std::views::take(100)) {
        process(line); // process() is your own handler; the file is never loaded whole
    }
}
std::generator<T> (C++23): Out of the Box
C++23 standardizes generator coroutines — no more boilerplate:
#include <generator> // C++23
#include <print>     // C++23
#include <ranges>
#include <tuple>
#include <utility>

std::generator<int> iota(int start = 0) {
    while (true) co_yield start++;
}
std::generator<int> fibonacci() {
    auto [a, b] = std::pair{0, 1};
    while (true) {
        co_yield a;
        std::tie(a, b) = std::pair{b, a + b};
    }
}
// Flattening nested generators. C++ has no co_yield* syntax;
// std::ranges::elements_of plays that role:
std::generator<int> flatten(std::generator<std::generator<int>> nested) {
    for (auto&& gen : nested) { // auto&&: the outer generator yields rvalues
        co_yield std::ranges::elements_of(std::move(gen)); // C++23: yield from sub-generator
    }
}
int main() {
    // Works directly with ranges:
    auto first_10_fibs = fibonacci() | std::views::take(10);
    for (int n : first_10_fibs) std::print("{} ", n);
}
Building Task<T> for Async I/O
A Task<T> coroutine represents a future value that becomes available when async I/O completes:
#include <coroutine>
#include <exception>
#include <string>
#include <utility>
#include <vector>

// Conceptual Task<T>. In production use asio or libcoro instead of this:
template<typename T>
class Task {
public:
    struct promise_type {
        T result;
        std::coroutine_handle<> continuation; // Who to resume when done

        Task get_return_object() {
            return Task{std::coroutine_handle<promise_type>::from_promise(*this)};
        }
        std::suspend_always initial_suspend() { return {}; }

        // When the task finishes, hand control to the awaiting coroutine
        struct FinalAwaiter {
            bool await_ready() noexcept { return false; }
            // Symmetric transfer: returning a handle resumes it without nesting
            // resume() calls, so long chains of tasks do not grow the stack
            std::coroutine_handle<> await_suspend(std::coroutine_handle<promise_type> h) noexcept {
                if (h.promise().continuation)
                    return h.promise().continuation;
                return std::noop_coroutine();
            }
            void await_resume() noexcept {}
        };
        FinalAwaiter final_suspend() noexcept { return {}; }
        void return_value(T val) { result = std::move(val); }
        void unhandled_exception() { std::terminate(); }
    };

    // Move-only owner of the frame; the destructor frees it
    Task(Task&& other) noexcept : handle_(std::exchange(other.handle_, {})) {}
    ~Task() { if (handle_) handle_.destroy(); }

    // Awaitable interface, which allows co_await task:
    bool await_ready() { return false; }
    std::coroutine_handle<> await_suspend(std::coroutine_handle<> awaiter) {
        handle_.promise().continuation = awaiter;
        return handle_; // Start the task
    }
    T await_resume() { return std::move(handle_.promise().result); }
private:
    std::coroutine_handle<promise_type> handle_;
    explicit Task(std::coroutine_handle<promise_type> h) : handle_(h) {}
};

// Usage with async operations (Asio-style). async_open, async_read, async_close,
// Socket and process are placeholders for awaitable I/O primitives.
// Parameters are taken by value so the lazily started frame owns them:
Task<std::vector<char>> read_file_async(std::string path) {
    auto fd = co_await async_open(path);
    auto data = co_await async_read(fd, 1024 * 1024);
    co_await async_close(fd);
    co_return data;
}
// Task<void> additionally needs a specialization whose promise defines return_void()
Task<void> handle_request(Socket socket) {
    auto request = co_await socket.read_async(); // Non-blocking: the thread stays free
    auto response = process(request);
    co_await socket.write_async(response);
    co_await socket.close_async();
    // No thread blocked during any of the I/O operations above
}
Coroutines vs Threads: When to Use Which
| Aspect | Threads | Coroutines |
|---|---|---|
| Memory per unit | 1-8 MB (stack) | 50-500 bytes (frame) |
| Creation cost | ~10µs (OS syscall) | ~30ns (heap alloc) |
| Max concurrent | ~1000-10000 (practical) | Millions |
| Context switch | ~1-5µs (OS scheduler) | ~10ns (function call) |
| CPU parallelism | Yes (real parallel) | Not by themselves (cooperative); only if a scheduler resumes them on several threads |
| Best for | CPU-bound parallel work | I/O-bound concurrent work |
// Use threads for: parallel CPU computation (parallel matrix multiply, image processing)
std::vector<std::jthread> workers;
for (unsigned i = 0; i < std::thread::hardware_concurrency(); i++)
    workers.emplace_back(process_chunk, i);
// Use coroutines for: I/O-bound concurrency (e.g. 100k simultaneous connections)
// One thread can multiplex all of them cooperatively
// (schedule and accept_connection stand in for your event loop's API):
for (int i = 0; i < 100'000; i++)
    schedule(handle_request(accept_connection()));
// All 100k connections are handled by coroutines on a single thread
Frequently Asked Questions
Why is coroutine promise_type so complex? Can't the library hide this?
The C++20 coroutine mechanism is deliberately low-level: it provides the machinery (frame, handle, suspend/resume) but not the policy (when to resume, how to schedule). This lets library authors build any async model: generators, tasks, actors, fibers. Production code should generally rely on a library type (asio::awaitable, coro::task from libcoro, cppcoro::task) rather than a hand-written promise_type.
Do coroutines always allocate on the heap?
By default, the coroutine frame is heap-allocated. However, the standard allows the compiler to elide the allocation, an optimization known as HALO (Heap Allocation eLision Optimization): if the compiler can prove that the coroutine's lifetime is fully contained in the caller's and the frame size is known at the call site, it can place the frame in the caller's stack frame. This is most likely for short-lived, fully inlined generator patterns, but it depends on the compiler and optimization level, so it should not be relied on.
Can coroutines be used in embedded systems?
Yes — with custom allocators. The coroutine frame is allocated via operator new by default, but you can override this by providing operator new/delete in the promise_type to use a custom pool allocator. This makes coroutines viable on platforms with limited heap.
Key Takeaway
C++20 coroutines fundamentally change how you write async code. Instead of callback chains, futures/promises, or thread-per-connection models, you write linear code that reads synchronously but executes asynchronously. The key insight: co_await doesn't block a thread — it suspends the coroutine frame and returns the thread to the scheduler, which can run other coroutines. For I/O-bound systems, this enables orders-of-magnitude better scalability than thread-per-request models.
*Part of the C++ Mastery Course — 30 modules from modern C++ basics to expert systems engine
