Structured Concurrency with asyncio.TaskGroup

The code that brings most teams here looks innocent: a handful of asyncio.create_task() calls fired off, then a bare await asyncio.gather(*tasks). It works until one task raises. Then gather() propagates that first exception but leaves the other tasks running — orphaned, holding connections, sometimes logging Task exception was never retrieved minutes later. Cancellation does not flow to siblings, cleanup does not happen, and a failure in one branch silently leaks the rest. asyncio.TaskGroup (Python 3.11+) replaces this ad-hoc pattern with nursery-style structured concurrency: tasks created in the group are owned by it, the first failure cancels every sibling, and all errors surface together as an ExceptionGroup when the async with block exits. This guide walks from a basic fan-out to nested groups with timeouts, with a verification step after each.
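The leak described above can be reproduced in a few lines. The following is a minimal sketch, not taken from any real codebase: the coroutine names `failing` and `slow` and the module-level `log` list are hypothetical, chosen only to make the behavior observable.

```python
import asyncio

log: list[str] = []  # hypothetical recorder, used only for illustration


async def failing() -> None:
    await asyncio.sleep(0.01)
    raise ValueError("boom")


async def slow() -> None:
    await asyncio.sleep(0.05)
    log.append("slow finished anyway")


async def main() -> None:
    tasks = [asyncio.create_task(failing()), asyncio.create_task(slow())]
    try:
        await asyncio.gather(*tasks)
    except ValueError:
        # gather() has re-raised, but the sibling was not cancelled.
        log.append(f"sibling done at failure time: {tasks[1].done()}")
    # Awaited here only so the sketch can observe the survivor finishing.
    await tasks[1]


asyncio.run(main())
print(log)
```

In this sketch the sibling is still pending when the exception reaches the caller, and it later runs to completion on its own. Real code that returns or re-raises at that point leaves the task running with no owner.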

Prerequisites

  • Python 3.11+. asyncio.TaskGroup, asyncio.timeout(), and the except* syntax for exception groups are all 3.11 features. None of this works on 3.10 or earlier.
  • Comfort with task ownership. This is a detail page under Coroutine Design Patterns; the catalogue there explains where TaskGroup fits relative to gather() and as_completed(). For the scheduler model underneath, see Asyncio Fundamentals & Event Loop Architecture.
  • A running loop. Every snippet runs under asyncio.run().

Figure: TaskGroup as a nursery scope. A TaskGroup parent scope owns three child tasks (child A runs to done, child B raises first, child C is cancelled); while the block is open it awaits all of them, and the first failure cancels the surviving siblings before the combined ExceptionGroup leaves the scope, where it is handled with except*.

1. Basic TaskGroup fan-out

Create tasks with tg.create_task() inside the async with block. The block does not exit until every child has finished — there is no separate await of a task list, and no chance of forgetting one.

import asyncio


async def fetch(name: str, delay: float) -> None:
    await asyncio.sleep(delay)
    print(f"fetched {name}")


async def main() -> None:
    async with asyncio.TaskGroup() as tg:
        tg.create_task(fetch("users", 0.2))
        tg.create_task(fetch("orders", 0.1))
        tg.create_task(fetch("inventory", 0.3))
    # Control reaches here only after ALL three complete.
    print("all done")


asyncio.run(main())

Verify: the three fetched ... lines print (in completion order), then all done prints exactly once, after the slowest task. The group blocks at the end of the with until the last child finishes.

2. Collecting results

create_task() returns the Task, but you must read .result() after the block exits — inside the block the task may not be done. Keep references and harvest them afterward.

import asyncio


async def price(symbol: str) -> float:
    await asyncio.sleep(0.1)
    return len(symbol) * 10.0


async def main() -> dict[str, float]:
    async with asyncio.TaskGroup() as tg:
        tasks = {sym: tg.create_task(price(sym)) for sym in ("AAPL", "MSFT", "GOOG")}
    # Safe now: the group guarantees every task is done.
    return {sym: t.result() for sym, t in tasks.items()}


print(asyncio.run(main()))  # {'AAPL': 40.0, 'MSFT': 40.0, 'GOOG': 40.0}

Verify: the dict prints with one entry per symbol. Calling t.result() inside the block, before the task has finished, would raise InvalidStateError, which is the reminder to harvest after exit.

3. A failure cancels siblings

This is the behavior gather() lacks. When one child raises, the group cancels the others, waits for their cancellation to finish, and re-raises. The cancelled siblings see asyncio.CancelledError and run their cleanup.

import asyncio


async def worker(name: str, delay: float, fail: bool) -> None:
    try:
        await asyncio.sleep(delay)
        if fail:
            raise ValueError(f"{name} failed")
        print(f"{name} finished")
    except asyncio.CancelledError:
        print(f"{name} cancelled, cleaning up")
        raise  # always re-raise CancelledError


async def main() -> None:
    async with asyncio.TaskGroup() as tg:
        tg.create_task(worker("fast-fail", 0.1, fail=True))
        tg.create_task(worker("slow", 1.0, fail=False))  # will be cancelled


asyncio.run(main())

Verify: fast-fail raises at 0.1 s, slow prints slow cancelled, cleaning up (not slow finished), and the program ends by raising an ExceptionGroup. The slow task never runs to completion — proof that the sibling was cancelled.

4. Handle the ExceptionGroup with except*

The group always wraps child failures in an exception group, even for a single failure: an ExceptionGroup when every member is an Exception subclass, otherwise a BaseExceptionGroup. Use except* to match by member type; each matching clause runs at most once, receiving a sub-group that contains all the matching exceptions, and any unmatched exceptions are re-raised. The mechanics of grouping and except* are covered in depth under Exception Groups & TaskGroups, with a focused walkthrough in handling ExceptionGroup from a TaskGroup.

import asyncio


async def flaky(name: str, exc: Exception) -> None:
    await asyncio.sleep(0.05)
    raise exc


async def main() -> None:
    try:
        async with asyncio.TaskGroup() as tg:
            tg.create_task(flaky("a", ValueError("bad value")))
            tg.create_task(flaky("b", KeyError("missing")))
    except* ValueError as eg:
        print("value errors:", [str(e) for e in eg.exceptions])
    except* KeyError as eg:
        print("key errors:", [str(e) for e in eg.exceptions])


asyncio.run(main())

Verify: both handlers fire — value errors: ['bad value'] and key errors: ["'missing'"] — because two siblings failed and except* dispatches by type across the group. A plain except ValueError would not catch it; the group is a BaseExceptionGroup, not a ValueError.

5. Nesting and combining with asyncio.timeout

TaskGroup composes: nest groups for sub-phases, and wrap a group in asyncio.timeout() to give the whole phase a deadline. On timeout, the timeout() context cancels the group, which cancels every child — one deadline, full structured teardown.

import asyncio


async def stage(name: str, delay: float) -> None:
    await asyncio.sleep(delay)
    print(f"{name} done")


async def main() -> None:
    try:
        async with asyncio.timeout(0.25):           # deadline for the whole block
            async with asyncio.TaskGroup() as outer:
                outer.create_task(stage("ingest", 0.1))
                async with asyncio.TaskGroup() as inner:  # nested sub-phase
                    inner.create_task(stage("parse", 0.5))   # too slow
                    inner.create_task(stage("validate", 0.05))
    except TimeoutError:
        print("phase exceeded deadline; all tasks cancelled")


asyncio.run(main())

Verify: validate done and ingest done print, the deadline fires before parse finishes, and you see phase exceeded deadline; all tasks cancelled. Both the inner and outer groups tear down: cancellation propagates inward through the nested scopes, and no task outlives them.

Verification

A correct migration shows, end to end: a successful run reaches the code after the async with only when every task succeeded; a single failure cancels all siblings (their cleanup runs) and raises a BaseExceptionGroup; except* clauses dispatch by member type; and len(asyncio.all_tasks()) returns to baseline after the block — no orphans survive the scope. Under PYTHONASYNCIODEBUG=1 you should see no Task exception was never retrieved messages, the hallmark of the old gather() leak this pattern eliminates.

Diagnostic Hook. In production, alert on any Task exception was never retrieved log line — with TaskGroup it should never appear, so its presence means a task escaped the group (created with bare create_task() instead of tg.create_task()). Track len(asyncio.all_tasks()) as a gauge; a sawtooth that returns to baseline per request confirms structured teardown, while a rising floor signals tasks escaping their group.

Pitfalls & edge cases

  • except does not catch a group. A plain except ValueError will not catch a TaskGroup's failure because the ValueError is wrapped in a BaseExceptionGroup. Use except*, or catch the group itself and inspect its .exceptions attribute.
  • Reading .result() too early. Calling task.result() inside the async with on a task that has not finished raises InvalidStateError; harvest results only after the block exits.
  • Adding tasks after the block. Once the group has finished, or has begun shutting down after a failure, tg.create_task() raises RuntimeError; create tasks while the group is still active.
  • Swallowing CancelledError. A sibling cancelled by the group must re-raise CancelledError after cleanup. Catching and suppressing it breaks the group's teardown contract and can hang the exit.
  • Expecting best-effort semantics. TaskGroup is all-or-nothing: the first failure cancels everyone. For independent batches where partial success is fine, use asyncio.gather(*coros, return_exceptions=True) instead.

Frequently Asked Questions

How is asyncio.TaskGroup different from asyncio.gather()?

TaskGroup owns its child tasks: the first failure cancels all siblings, waits for their teardown, and surfaces every error as a BaseExceptionGroup. gather() propagates the first exception but leaves the other tasks running unless you manually cancel them, so it is best-effort rather than structured. Use TaskGroup for owned, all-or-nothing work and gather(return_exceptions=True) for independent batches.

Why doesn't my except ValueError catch a TaskGroup failure?

TaskGroup wraps all child failures in a BaseExceptionGroup, which is not itself a ValueError, so a plain except clause does not match. Use the except* syntax (Python 3.11+) to dispatch by member type, or unwrap the group via its .exceptions attribute.

Can I read a task's result inside the TaskGroup block?

Not safely. Inside the async with block a task may not be done yet, and task.result() on an unfinished task raises InvalidStateError. Keep the Task references and read their results only after the block exits, where the group guarantees completion.

How do I put a deadline on a whole TaskGroup?

Wrap the async with asyncio.TaskGroup() block in async with asyncio.timeout(seconds). When the deadline expires the timeout context cancels the group, which cancels every child, and a TimeoutError surfaces after structured teardown completes.