C++20 Coroutines: co_await, co_yield, co_return, Generators & Async I/O Deep Dive

TopicTrick Team

What Makes a Function a Coroutine

A function becomes a coroutine the moment it contains any of these three keywords in its body:

| Keyword | Meaning | Use Case |
| --- | --- | --- |
| co_await expr | Suspend until expr completes | Waiting for I/O, timers, other tasks |
| co_yield value | Suspend and produce a value | Generators, lazy sequences |
| co_return value | Complete the coroutine with a result | Return final value from async task |

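The coroutine's return type must satisfy the coroutine protocol, most commonly by providing a nested promise_type (alternatively, through a specialization of std::coroutine_traits). The promise_type tells the compiler how to create the return object, whether to suspend at the start and at the end of the body, and what to do with co_return values and unhandled exceptions.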

The Coroutine Frame and promise_type

Every coroutine has a coroutine frame, storage that is heap-allocated by default and holds:

  • Copies of the parameters and all local variables that must survive suspension
  • The promise_type instance
  • The resume and destroy function pointers (in typical implementations)
  • The suspend point (where it last suspended)

Building Generator<T> with co_yield

A generator is a coroutine that lazily produces a sequence of values: each co_yield hands one value to the caller and suspends the coroutine until the next value is requested.

std::generator<T> (C++23): Out of the Box

C++23 standardizes generator coroutines as std::generator<T> in the <generator> header, so the hand-written promise boilerplate is no longer needed for this pattern.
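A short sketch of std::generator in use follows. Because standard library support for <generator> is still uneven, the example is guarded with the __cpp_lib_generator feature-test macro and falls back to a plain loop when the header is unavailable; squares and collect_squares are hypothetical names.

```cpp
#include <cassert>
#include <vector>
#include <version>  // defines __cpp_lib_generator when <generator> is available

#if defined(__cpp_lib_generator)
#include <generator>

// No promise_type to write: std::generator supplies it.
std::generator<int> squares(int count) {
    for (int i = 1; i <= count; ++i) co_yield i * i;
}
#endif

std::vector<int> collect_squares(int count) {
    assert(count >= 0);
    std::vector<int> out;
#if defined(__cpp_lib_generator)
    // std::generator models an input range, so range-for works directly.
    for (int sq : squares(count)) out.push_back(sq);
#else
    // Fallback for toolchains without <generator>; same result, no coroutine.
    for (int i = 1; i <= count; ++i) out.push_back(i * i);
#endif
    return out;
}
```

collect_squares(4) returns {1, 4, 9, 16} on either path. On a toolchain with C++23 library support, the values are produced one at a time as the range-for loop advances.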

Building Task<T> for Async I/O

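A Task<T> coroutine represents a value that becomes available later, for example when an asynchronous I/O operation completes. The caller obtains the value with co_await, which suspends the caller instead of blocking the thread.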

Coroutines vs Threads: When to Use Which

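| Aspect | Threads | Coroutines |
| --- | --- | --- |
| Memory per unit | 1-8 MB (default stack reservation) | 50-500 bytes (frame) |
| Creation cost | ~10µs (OS syscall) | ~30ns (heap alloc) |
| Max concurrent | ~1000-10000 (practical) | Millions |
| Context switch | ~1-5µs (OS scheduler) | ~10ns (function call) |
| CPU parallelism | Yes (real parallel) | Not by themselves (cooperative; parallel only if a scheduler resumes them on several threads) |
| Best for | CPU-bound parallel work | I/O-bound concurrent work |

These figures are rough orders of magnitude; actual values depend on the platform, the allocator, and what each coroutine frame contains.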

Frequently Asked Questions

Why is coroutine promise_type so complex? Can't the library hide this?

The C++20 coroutine mechanism is deliberately low level: it provides the machinery (frame, handle, suspend/resume) but not the policy (when to resume, how to schedule). This lets library authors build any async model, such as generators, tasks, actors, or fibers. Production code should generally rely on an existing library type (asio::awaitable, libcoro's coro::task, cppcoro::task) rather than a hand-written promise_type.

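Do coroutines always allocate on the heap?

Not necessarily. By default the coroutine frame is allocated dynamically, but the standard permits the compiler to elide that allocation, an optimization known as HALO (Heap Allocation eLision Optimization). If the compiler can prove that the coroutine's lifetime is fully contained in the caller and can see the frame size (typically after inlining), it can place the frame in the caller's stack frame. This is most likely for short-lived generators, but it is never guaranteed and varies by compiler and optimization level.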

Can coroutines be used in embedded systems?

Yes, with custom allocators. The coroutine frame is allocated via the global operator new by default, but you can override this by declaring operator new and operator delete in the promise_type, for example to draw frames from a fixed pool. This makes coroutines viable on platforms with a limited heap.


Key Takeaway

C++20 coroutines fundamentally change how you write async code. Instead of callback chains, futures/promises, or thread-per-connection models, you write linear code that reads synchronously but executes asynchronously. The key insight is that co_await doesn't block a thread: it suspends the coroutine and returns control to whoever resumed it (typically a scheduler or event loop), which can then run other coroutines. For I/O-bound systems, this can support far more concurrent operations than thread-per-request models.


Part of the C++ Mastery Course — 30 modules from modern C++ basics to expert systems engineering.