Cancellation Patterns¶
This reference covers asyncio's cooperative cancellation model and the discipline required to use it without corrupting shutdown. Cancellation in asyncio is not a kill signal — it is a CancelledError injected into a coroutine at its next suspension point, which the coroutine is expected to observe, clean up after, and let propagate. The scope here is narrow on purpose: not error handling in general, but the concrete mechanics of task.cancel(), the rules for catching versus re-raising CancelledError, asyncio.shield() for protecting finalizers, the 3.11 uncancel()/cancellation-count machinery, and the cancel-and-drain pattern for retiring a set of tasks. Get these wrong and you produce the canonical cancellation bugs: zombie tasks that survive loop.close(), cleanup that blocks teardown forever, and the dreaded Task was destroyed but it is pending warning at process exit.
Everything here is downstream of one fact carried over from how the loop schedules work: a Task is a self-rescheduling callback, and cancel() does not interrupt it — it arranges for CancelledError to be thrown into the coroutine the next time the loop resumes it. Cancellation is therefore as cooperative as the rest of asyncio. This guide sits under Resilience, Cancellation & Error Handling, and the lifecycle it builds on is detailed in Task Scheduling & Lifecycle.
Architectural principles¶
- CancelledError is control flow, not an error. Since Python 3.8 it derives from BaseException, not Exception, precisely so a broad except Exception cannot swallow it. Treat it as a signal to unwind, never as a failure to log-and-continue.
- Catch only to clean up, then re-raise. A coroutine may intercept CancelledError to release resources, but it must re-raise (or let it propagate) so the task actually reaches the CANCELLED state. Swallowing it leaves a zombie that the loop still considers alive.
- Always observe a cancelled task. After task.cancel(), you must await the task (directly or via gather(..., return_exceptions=True)) to let the cancellation complete and to retrieve the result. A cancel that is never awaited is a cancel that may never finish.
- Cancelling is a request, not a guarantee. A task can suppress one cancellation, finish normally despite a cancel, or re-cancel itself during cleanup. task.cancelled() is the only authority on whether it truly ended cancelled; the fact that you called cancel() is not.
- Cleanup runs in a hostile window. Code in finally can itself be cancelled, especially under a deadline-driven shutdown. Must-run finalizers belong inside asyncio.shield() with a bounded timeout, never as an unprotected await in a finally.
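The first principle can be checked directly. The following is a minimal sketch, assuming Python 3.8 or later; the coroutine name careless is hypothetical and only illustrates that a broad except Exception does not intercept a cancellation.

```python
import asyncio


async def careless() -> str:
    try:
        await asyncio.sleep(3600)
    except Exception:
        # Not reached on cancellation: CancelledError is a BaseException.
        return "swallowed"
    return "finished"


async def main() -> tuple[bool, bool]:
    task = asyncio.create_task(careless())
    await asyncio.sleep(0)            # let the task reach its await
    task.cancel()
    try:
        await task                    # observe the cancellation
    except asyncio.CancelledError:
        pass
    return issubclass(asyncio.CancelledError, Exception), task.cancelled()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The broad handler is skipped, so the task ends in the CANCELLED state rather than returning "swallowed".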
How cancellation integrates with the event loop¶
task.cancel(msg=None) does not raise anything synchronously. In CPython's implementation, if the task is currently suspended awaiting a future, cancel() requests cancellation of that future; otherwise it sets a _must_cancel flag on the task. Either way, the next time the loop runs the task's __step, asyncio throws CancelledError(msg) into the coroutine at the exact line where it last yielded (the open await). The coroutine's normal exception machinery then takes over: except/finally blocks run, and if the error propagates out of the coroutine, the task transitions to CANCELLED. This is the same step-and-reschedule cycle that drives every task, which is why cancellation only ever takes effect at an await and never inside a synchronous run of code. For the full picture of how __step, the ready queue, and futures compose, start from Asyncio Fundamentals & Event Loop Architecture and the scheduling detail in Task Scheduling & Lifecycle.
Two consequences follow. First, a task running a long synchronous (CPU-bound) section cannot be cancelled until it yields — cancel() is buffered until the next suspension. Second, because the injection point is "the next await," cleanup code that itself awaits is reachable by a second cancellation if the deadline that triggered the first is still firing. Python 3.11 made this observable with Task.cancelling() and reversible with Task.uncancel(), which decrements the cancellation count so a coroutine that legitimately handled one cancel (for example, asyncio.timeout() converting it to TimeoutError) can continue rather than being treated as cancelled. The same lifecycle governs cleanup inside async context managers and iterators, whose __aexit__/aclose() run in this window.
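The deferred nature of cancel() can be observed with a short sketch. The worker coroutine and the log list are hypothetical names used only for illustration; the point is that cancel() returns before the coroutine has seen anything, and the unwinding happens only once the loop resumes the task.

```python
import asyncio


async def worker(log: list[str]) -> None:
    log.append("started")
    try:
        await asyncio.sleep(3600)     # suspension point: the cancel lands here
    finally:
        log.append("unwinding")


async def demo() -> tuple[list[str], bool, bool]:
    log: list[str] = []
    task = asyncio.create_task(worker(log))
    await asyncio.sleep(0)            # let the worker reach its await
    task.cancel()                     # a request; nothing is raised here
    log.append("cancel() returned")
    done_right_after = task.done()    # the coroutine has not been resumed yet
    try:
        await task                    # now the loop delivers CancelledError
    except asyncio.CancelledError:
        pass
    return log, done_right_after, task.cancelled()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The log order shows that the finally block runs after cancel() has already returned to the caller.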
Pattern catalogue¶
Each pattern is a different answer to one question: who owns the CancelledError, and how does it leave the system cleanly? Choose by who issues the cancel and what must survive it.
Graceful cancel with suppress¶
Use when you own a background task and want to stop it from the outside, expecting it to end cancelled. Cancel it, then await it through contextlib.suppress(asyncio.CancelledError) so the propagated cancellation is observed without re-raising into your shutdown path.
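A minimal sketch of this pattern follows. The heartbeat coroutine, its interval, and the beats list are assumptions made for illustration; only the cancel, suppress, and await sequence is the pattern itself.

```python
import asyncio
import contextlib


async def heartbeat(beats: list[int], interval: float = 0.01) -> None:
    # Hypothetical background task that runs until cancelled.
    n = 0
    while True:
        beats.append(n)
        n += 1
        await asyncio.sleep(interval)


async def main() -> bool:
    beats: list[int] = []
    hb = asyncio.create_task(heartbeat(beats))
    await asyncio.sleep(0.05)         # stand-in for the real work
    hb.cancel()                       # request the stop
    with contextlib.suppress(asyncio.CancelledError):
        await hb                      # let the cancellation run to completion
    return hb.cancelled()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

One caveat worth keeping in mind: if the caller is itself cancelled while parked on the await, suppress absorbs that cancellation too, so this form fits best in shutdown code that is not expected to be cancelled from outside.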
The await hb is non-negotiable: it gives the loop the chance to run the task's cancellation to completion. suppress is correct here because you are the canceller and the cancellation was expected — never use it inside the cancelled coroutine itself.
Cleanup in finally, then re-raise¶
Use inside a coroutine that holds a resource and must release it on cancellation. Put the release in finally so it runs on every exit path, and do not swallow the error — finally lets CancelledError continue propagating automatically.
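Both variants can be sketched as follows. FakeConnection is a hypothetical stand-in for a socket, lock, or database handle, and the "rollback" event is an assumed placeholder for whatever the except branch would really do.

```python
import asyncio


class FakeConnection:
    """Hypothetical resource with a synchronous release."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


async def hold(conn: FakeConnection) -> None:
    try:
        await asyncio.sleep(3600)
    finally:
        conn.close()                  # runs on every exit path; nothing to re-raise


async def hold_with_rollback(conn: FakeConnection, events: list[str]) -> None:
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        events.append("rollback")     # cancellation-specific cleanup
        raise                         # mandatory: keep the cancellation moving
    finally:
        conn.close()


async def main():
    a, b, events = FakeConnection(), FakeConnection(), []
    tasks = [
        asyncio.create_task(hold(a)),
        asyncio.create_task(hold_with_rollback(b, events)),
    ]
    await asyncio.sleep(0)            # let both tasks reach their await
    for t in tasks:
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return a.closed, b.closed, events, [t.cancelled() for t in tasks]


if __name__ == "__main__":
    print(asyncio.run(main()))
```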
If you genuinely need an except asyncio.CancelledError (to log or roll back), the raise at the end is mandatory. Omitting it converts a clean cancellation into a zombie task. A try/finally with no except is the safer default because it cannot forget to re-raise.
Shield a non-cancellable finalizer¶
Use when a finalizer (a transaction commit, a "goodbye" frame, an audit write) must complete even though the surrounding task is being cancelled. asyncio.shield() protects the inner awaitable from the outer cancellation; wrap it in a bounded asyncio.timeout() so it cannot block teardown forever.
shield only deflects cancellation aimed at the awaiting task; if the shielded coroutine is itself cancelled directly, it still stops. The bounded timeout is what prevents a stuck finalizer from turning a graceful shutdown into a hang.
Cancel-and-drain a set of tasks¶
Use to retire a pool of workers or in-flight requests deterministically. Cancel every task, then gather them with return_exceptions=True so every cancellation (and any error raised during cleanup) is observed in one place.
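The drain step might look like the sketch below. The worker coroutine and the deliberate RuntimeError raised by one worker during cleanup are illustrative assumptions, included to show that cleanup errors are collected alongside the cancellations.

```python
import asyncio


async def worker(i: int) -> None:
    try:
        await asyncio.sleep(3600)
    finally:
        if i == 2:
            # Simulated cleanup failure, to show it surfaces in the results.
            raise RuntimeError("cleanup failed in worker 2")


async def cancel_and_drain(tasks: list[asyncio.Task]) -> list:
    for t in tasks:
        t.cancel()
    # Every outcome is returned in task order instead of being raised.
    return await asyncio.gather(*tasks, return_exceptions=True)


async def main() -> list:
    tasks = [asyncio.create_task(worker(i)) for i in range(3)]
    await asyncio.sleep(0)            # let every worker enter its try block
    return await cancel_and_drain(tasks)


if __name__ == "__main__":
    for outcome in asyncio.run(main()):
        print(type(outcome).__name__, outcome)
```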
return_exceptions=True is essential: without it, the first CancelledError would re-raise and you would never await the rest, leaving them pending. This is the same drain step that retires the losers in a wait(FIRST_COMPLETED) race; for the scheduling side see Task Scheduling & Lifecycle.
Timeout-driven cancellation handoff¶
Use when a deadline, not you, is the canceller. asyncio.timeout() cancels the body when the deadline passes and converts the internal CancelledError into a TimeoutError at the block boundary — a handoff from cancellation to a normal exception you can catch.
The body's CancelledError is consumed by the timeout context (via uncancel() internally), so it surfaces as TimeoutError and does not mark the calling task cancelled. The deadline mechanics and the timeout()-vs-wait_for() choice are covered in Timeouts and deadlines.
Resource boundaries¶
Cancellation has no built-in time budget — a task can spend arbitrarily long in cleanup. Bounding that window is the operational discipline of safe shutdown.
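Bound total cleanup time. Wrap the cancel-and-drain in a single outer deadline so a misbehaving finalizer cannot stall the process. Tasks still unwinding when the budget expires get a second cancel() request, which interrupts whatever await their cleanup is parked on; any that have still not finished after a short grace period are logged and abandoned rather than awaited indefinitely.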
Separate the shielded budget from the drain budget. Must-run finalizers get their own small shield(...) + timeout(...) allowance inside each task; the outer drain budget caps the aggregate. The shielded budget should be a fraction of the drain budget so the outer deadline is never the thing that interrupts a critical commit.
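One way to bound the drain is sketched below using asyncio.wait() with a timeout, which returns stragglers instead of raising. The function name drain_with_budget, the two worker coroutines, and the budget values are assumptions for illustration only.

```python
import asyncio


async def tidy_worker() -> None:
    await asyncio.sleep(3600)


async def slow_cleanup_worker() -> None:
    try:
        await asyncio.sleep(3600)
    finally:
        await asyncio.sleep(3600)     # unbounded finalizer: stalls the drain


async def drain_with_budget(tasks, budget: float, grace: float):
    pending = set(tasks)
    if not pending:
        return set(), set()           # asyncio.wait() rejects an empty set
    for t in pending:
        t.cancel()
    done, pending = await asyncio.wait(pending, timeout=budget)
    if pending:
        for t in pending:
            t.cancel()                # second request interrupts stuck cleanup
        late, pending = await asyncio.wait(pending, timeout=grace)
        done |= late
    return done, pending              # pending = abandoned after both budgets


async def main() -> tuple[int, int, bool]:
    tasks = [
        asyncio.create_task(tidy_worker()),
        asyncio.create_task(slow_cleanup_worker()),
    ]
    await asyncio.sleep(0)            # let both workers reach their await
    done, abandoned = await drain_with_budget(tasks, budget=0.05, grace=0.05)
    return len(done), len(abandoned), all(t.cancelled() for t in tasks)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Here the second cancel is enough to retire the stalled worker. A task that swallows every cancel would end up in the abandoned set, which is the one worth logging.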
Cap the depth of nested shields. Every shield you stack widens the window in which a second cancellation can be absorbed. Keep shielded finalizers shallow (one level, one bounded await) so that the cancellation count tracked by Task.cancelling() stays interpretable and uncancel() accounting does not drift.
Integrated production example¶
The following is a queue worker that runs until it receives a shutdown signal, drains its in-flight item, commits progress under a shield, and retires cleanly — exposing a diagnostic snapshot of pending tasks throughout.
Diagnostic Hook: The diagnostics coroutine samples len([t for t in asyncio.all_tasks() if not t.done()]) every 200 ms. During normal operation this is flat; during shutdown it must fall to zero (or to just the main task) within the cleanup budget. A count that plateaus above zero after the drain deadline is the signature of a swallowed CancelledError — a task that ignored its cancel. Pair this with the final drained/total ratio: anything less than the full set means at least one task did not reach CANCELLED.
Diagnostic Hook — detecting zombie and pending tasks
Watch three signals around any cancellation path.
- Pending-on-shutdown audit: immediately before loop.close(), log asyncio.all_tasks(); a non-empty set (minus the runner task) means tasks survived their cancel, which is the classic zombie.
- "Task was destroyed but it is pending": this warning at GC/finalization is asyncio telling you a task was dropped while still alive, almost always because its CancelledError was swallowed or it was never awaited; run with PYTHONASYNCIODEBUG=1 to get the creation traceback.
- Cleanup duration: time each finalizer (loop.time() before/after the shielded await) and alert when it approaches the shielded budget, because a finalizer creeping toward its timeout is the leading indicator of a future shutdown hang.
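The pending-on-shutdown audit can be sketched like this. The audit_pending helper and both coroutines are hypothetical; swallower deliberately contains the bug the audit is meant to expose.

```python
import asyncio


async def well_behaved() -> None:
    await asyncio.sleep(3600)


async def swallower() -> None:
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        pass                          # bug: the cancellation is swallowed
    await asyncio.sleep(0.2)          # the task keeps running after its cancel


def audit_pending() -> list[str]:
    # Names of unfinished tasks other than the one running the audit.
    current = asyncio.current_task()
    return sorted(t.get_name() for t in asyncio.all_tasks() if t is not current)


async def main() -> tuple[list[str], list[bool]]:
    tasks = [
        asyncio.create_task(well_behaved(), name="well-behaved"),
        asyncio.create_task(swallower(), name="zombie"),
    ]
    await asyncio.sleep(0)            # let both tasks reach their await
    for t in tasks:
        t.cancel()
    await asyncio.wait(tasks, timeout=0.05)   # bounded drain
    return audit_pending(), [t.cancelled() for t in tasks]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The surviving name in the audit together with a False from task.cancelled() points at the task that ignored its cancel.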
Failure modes¶
| Failure mode | Root cause | Detection | Fix |
|---|---|---|---|
| Zombie task survives shutdown | CancelledError caught and not re-raised (often a broad except BaseException) | asyncio.all_tasks() non-empty after drain; task.cancelled() is False | Use try/finally, or catch CancelledError narrowly and raise after cleanup |
| Cleanup blocks teardown forever | An unbounded await in a finally is re-cancelled or stalls on a dead peer | Shutdown hangs; cleanup duration grows without bound | Wrap finalizers in asyncio.shield() + asyncio.timeout() with a fixed budget |
| Task was destroyed but it is pending | A task was garbage-collected while still alive (never awaited, or its cancel swallowed) | Warning at finalization; run with PYTHONASYNCIODEBUG=1 for the traceback | Await every cancelled task via gather(*tasks, return_exceptions=True) |
| Double-cancel breaks timeout() logic | A second cancel() arrives during cleanup, raising CancelledError where code expected TimeoutError | Unexpected CancelledError escaping a timeout() block | Account for it with Task.cancelling()/uncancel(); keep finalizers shallow |
| Shield leaks the protected task | shield() result discarded; the shielded coroutine keeps running after the parent returns | Side effect completes after the caller has moved on; rising task count | Retain and await the shielded task, or give it a bounded lifetime and drain it |
| Cancel ignored on a CPU-bound task | The task never reaches an await, so CancelledError is never injected | cancel() returns but the task keeps running synchronously | Offload to an executor via asyncio.to_thread(); insert await asyncio.sleep(0) in hot loops |
Frequently Asked Questions¶
Why does asyncio.CancelledError inherit from BaseException instead of Exception?
Since Python 3.8, CancelledError derives from BaseException so that a broad except Exception block cannot accidentally swallow it. This makes cancellation behave as control flow rather than an error: it propagates past ordinary error handling unless code explicitly catches it to clean up and then re-raises.
Do I have to re-raise CancelledError after catching it?
Yes, if you catch it inside the cancelled coroutine. Catch CancelledError only to release resources or log, then raise it again so the task actually reaches the CANCELLED state. Swallowing it leaves a zombie task the loop still considers alive, which hangs shutdown and triggers the 'Task was destroyed but it is pending' warning. A try/finally with no except is safer because it cannot forget to re-raise.
How does asyncio.shield() protect a finalizer during cancellation?
asyncio.shield() deflects cancellation aimed at the awaiting (outer) task so the shielded coroutine can finish, which is how a must-run commit or audit write survives the surrounding task being cancelled. It does not protect against the shielded coroutine being cancelled directly. Always pair it with asyncio.timeout() so a stuck finalizer cannot block teardown forever.
What is the correct way to cancel and drain a set of tasks?
Call cancel() on every task, then await them with asyncio.gather(*tasks, return_exceptions=True). The return_exceptions flag is essential: without it the first CancelledError re-raises and the remaining tasks are never awaited, leaving them pending. Awaiting every task lets each cancellation run to completion and surfaces any error raised during cleanup.
What does Task.uncancel() do in Python 3.11?
Task.uncancel() decrements the task's cancellation count, which Python 3.11 tracks via Task.cancelling(). It lets a coroutine that legitimately handled a cancellation — for example, asyncio.timeout() converting CancelledError into TimeoutError — continue running rather than being treated as cancelled. It is the mechanism that makes nested timeouts and cancellation handoffs behave correctly.
Why is my task not responding to cancel()?
cancel() only schedules CancelledError to be injected at the task's next suspension point. A task running a long synchronous or CPU-bound section never reaches an await, so the cancellation is buffered indefinitely. Offload blocking work with asyncio.to_thread() or insert await asyncio.sleep(0) in hot loops so the task yields and the cancellation can take effect.
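A cooperative yield inside a hot loop might look like the sketch below. The crunch coroutine, its yield_every parameter, and the progress list are hypothetical; the arithmetic is just a stand-in for CPU-bound work.

```python
import asyncio


async def crunch(n: int, progress: list[int], yield_every: int = 1000) -> int:
    total = 0
    for i in range(n):
        total += i * i
        if i % yield_every == 0:
            progress.append(i)
            await asyncio.sleep(0)    # suspension point where a cancel can land
    return total


async def main() -> tuple[bool, int]:
    progress: list[int] = []
    task = asyncio.create_task(crunch(10**9, progress))
    await asyncio.sleep(0)            # let the first chunk start
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return task.cancelled(), len(progress)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Without the periodic sleep(0), the loop would run all iterations before the cancel could be delivered. For work that cannot be chunked this way, asyncio.to_thread() keeps the event loop responsive, although the thread itself is not interrupted by cancelling the awaiting task.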
Related¶
- Resilience, Cancellation & Error Handling: up to the overview for how cancellation, timeouts, retries, and exception groups fit the failure-handling model.
- Timeouts and deadlines: the deadline machinery that drives most production cancellations, and how timeout() hands off to a TimeoutError.
- Exception groups and TaskGroups: how a failing child cancels its siblings and surfaces the combined fault as an ExceptionGroup.
- Task Scheduling & Lifecycle: the step-and-reschedule cycle that determines exactly when a cancel takes effect.
- Preventing CancelledError leaks in cleanup: a step-by-step fix for the swallowed-cancel zombie-task failure.