Event Loop Configuration

A production-grade asyncio deployment requires deliberate configuration of the underlying event loop. The defaults that ship with CPython optimise for first-run ergonomics, not for throughput, observability, or fault isolation. A service that runs the loop with default settings will silently swallow exceptions in detached tasks, leave debug instrumentation off when you need it and on when you cannot afford it, block the reactor on a synchronous call nobody noticed, and drop in-flight requests when Kubernetes sends SIGTERM. This reference is the narrow set of decisions that turn a working script into a hardened daemon: how to choose an entrypoint, how to select and install a faster loop backend, how to install a hard error boundary, how to tune debug and slow-callback detection, how to size the executor that absorbs blocking work, and how to wire signals into a deterministic shutdown.

The scope here is configuration of one loop instance for one process. The decision of which API actually drives that loop — asyncio.run() versus a manually managed loop.run_until_complete() — and the step-by-step hardening checklist both have their own dedicated guides, linked throughout. Everything below assumes a Python 3.11+ runtime so that asyncio.Runner, asyncio.TaskGroup, and asyncio.timeout() are available.

Architectural principles

  • Configure before the loop spins, validate after it runs. Loop backend, debug flag, and policy must be set before the first iteration; an exception handler and signal handlers must be attached from inside the running loop. Ordering errors fail silently — the runtime falls back to defaults rather than raising.
  • Every detached task needs an error boundary. A task with no awaiter and no exception handler logs its traceback only when garbage-collected, often long after the failure. A loop-level set_exception_handler is the single hook that guarantees uncaught task exceptions reach your logging pipeline.
  • Debug instrumentation is a runtime cost, not a constant. Debug mode and a low slow_callback_duration are diagnostic tools with measurable per-tick overhead. They belong behind an environment flag, not baked into the production image.
  • The loop runs on one thread; blocking work must leave it. Any synchronous call — requests, sqlite3, bcrypt, a vendor SDK — stalls every coroutine until it returns. A bounded executor is the pressure-relief valve, and its size is a tuning parameter, not a constant.
  • Shutdown is part of the contract. An orchestrator sends SIGTERM and waits a fixed grace period before SIGKILL. A correct service intercepts the signal, stops accepting work, cancels in-flight tasks with a deadline, drains async generators, and closes the loop — all inside that window.

How configuration integrates with the loop scheduler

Configuration is not a layer on top of the loop; each knob mutates a specific stage of the loop's core iteration. That iteration runs ready callbacks from loop._ready, polls the selector (or IOCP, or libuv backend) for I/O that became ready, and moves due timers onto the ready queue, then repeats. Choosing a backend swaps the multiplexer that the poll stage drives: the pure-Python SelectorEventLoop wraps epoll/kqueue through the selectors module, while uvloop replaces the whole loop implementation with one built on libuv. The exception handler is called from two places: immediately, from the handle that runs a plain callback (call_soon, call_later, a transport callback) when that callback raises, and later, from the finaliser of a Task or Future whose stored exception was never retrieved. slow_callback_duration is checked against loop.time() deltas around each callback execution while debug mode is on, which is why it measures synchronous stalls rather than total task latency. The default executor is the bridge from this single thread into a thread pool: run_in_executor submits work and returns a future that the loop resolves, through a thread-safe callback, when the pool thread finishes.
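The executor bridge can be observed directly. The following is a minimal sketch, assuming the stock default executor; blocking_call and its 0.2 s sleep are hypothetical stand-ins for a real synchronous library call. It records which thread ran the blocking work and counts how often the loop kept iterating in the meantime.

```python
import asyncio
import threading
import time


def blocking_call() -> int:
    # Hypothetical stand-in for a synchronous library call.
    time.sleep(0.2)
    return threading.get_ident()


async def demo() -> tuple[int, int, int]:
    loop = asyncio.get_running_loop()
    loop_thread = threading.get_ident()
    # run_in_executor returns an asyncio future tied to the pool job.
    future = loop.run_in_executor(None, blocking_call)
    ticks = 0
    while not future.done():
        ticks += 1  # the loop stays responsive while the pool thread sleeps
        await asyncio.sleep(0.01)
    return loop_thread, future.result(), ticks


if __name__ == "__main__":
    loop_thread, worker_thread, ticks = asyncio.run(demo())
    print(f"loop thread={loop_thread} worker thread={worker_thread} ticks={ticks}")
```

The two thread identifiers differ, and the tick counter keeps advancing during the sleep, which is the behaviour the offloading rule relies on.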

For the full picture of how the selector, timers, and executors compose into one loop iteration, start from the overview at Asyncio Fundamentals & Event Loop Architecture. Because configuration determines how fast and how cleanly tasks move through the ready queue, it is tightly coupled to Task Scheduling & Lifecycle, which covers the state transitions a Task undergoes once the loop is running.

Figure: Event loop bootstrap and shutdown lifecycle. (1) Bootstrap: select backend (uvloop), set_debug, loop_factory. (2) Attach hooks: set_exception_handler, add_signal_handler. (3) Run loop: ready → poll → timers, with slow_callback_duration checks; SIGTERM / SIGINT triggers shutdown. (4) Cancel + drain: all_tasks().cancel(), gather. (5) Close: shutdown_asyncgens, close(). Steps 1-2 run before the loop iterates; steps 4-5 must complete inside the orchestrator's grace period. A dropped error boundary at step 2 means failures in step 3 vanish until GC.

Pattern catalogue

The configuration surface decomposes into a handful of patterns. Each is independently useful; the integrated example at the end composes them into one bootstrap.

The asyncio.run() entrypoint

For any standalone process (a CLI, a worker, a microservice main), asyncio.run() is the correct entrypoint. It creates a fresh loop, runs the coroutine, cancels leftover tasks, drains async generators, and closes the loop, all in one call. On 3.11+ it is a thin wrapper over asyncio.Runner, which exposes loop_factory so you can inject a backend without touching the deprecated policy system.

import asyncio


async def main() -> None:
    loop = asyncio.get_running_loop()
    print(f"backend: {type(loop).__module__}.{type(loop).__name__}")
    await asyncio.sleep(0.1)


if __name__ == "__main__":
    # asyncio.run owns loop creation, task cleanup, and deterministic close.
    asyncio.run(main())

Use asyncio.Runner directly only when you need to run several top-level coroutines on one configured loop (the classic REPL/test-harness case). The full decision — and the legacy cases where manual loop control is still correct — is laid out in when to use asyncio.run vs loop.run_until_complete.

Installing uvloop as the backend

uvloop replaces the CPython loop with a libuv core and typically improves network throughput by 2–4x by cutting syscall and Python-level dispatch overhead. The forward-compatible installation on 3.11+ is loop_factory, which avoids the policy API that is deprecated since 3.14 and slated for removal in 3.16.

# pip install uvloop
import asyncio

try:
    import uvloop
    loop_factory = uvloop.new_event_loop
except ImportError:  # Windows, Alpine without build deps
    loop_factory = asyncio.new_event_loop


async def main() -> None:
    await asyncio.sleep(0)


if __name__ == "__main__":
    with asyncio.Runner(loop_factory=loop_factory) as runner:
        runner.run(main())

The try/except is mandatory: uvloop has no Windows wheels and may fail to build on minimal images, so the selector loop must remain a working fallback rather than a crash.

A loop-level exception handler

Detached tasks that raise without an awaiter only surface their traceback when the task object is garbage-collected. A loop exception handler is the hook that receives that report (and, immediately, any exception raised by a plain callback), serialises the context, and forwards it to your logging pipeline. Install it from inside the running loop, before any detached task is created.

import asyncio
import logging
import traceback
from typing import Any

logger = logging.getLogger("asyncio.errors")


def exception_handler(loop: asyncio.AbstractEventLoop, context: dict[str, Any]) -> None:
    exc = context.get("exception")
    if isinstance(exc, asyncio.CancelledError):
        return  # cancellation during shutdown is expected, not an error
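    # Unretrieved task exceptions arrive under "future"; other reports use "task".
    task = context.get("task") or context.get("future")
    logger.error(
        "loop exception: %s | task=%s | %s",
        context.get("message", "unhandled"),
        task.get_name() if isinstance(task, asyncio.Task) else "n/a",
        "".join(traceback.format_exception(exc)) if exc else "",
    )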
    loop.default_exception_handler(context)

The default_exception_handler fallback preserves CPython's built-in diagnostics; dropping it means losing context fields the handler did not explicitly copy. This boundary is the configuration counterpart to the patterns in Coroutine Design Patterns, where the goal is to never let a task fail unobserved.

Debug mode and slow-callback detection

loop.set_debug(True) enables coroutine creation-site tracking, resource-leak warnings for unclosed transports, and logging of any callback that exceeds slow_callback_duration. It adds roughly 10–30% per-tick overhead, so gate it behind an environment flag and only lower the slow-callback threshold when actively hunting stalls.

import asyncio
import os


async def main() -> None:
    loop = asyncio.get_running_loop()
    loop.set_debug(os.getenv("PYTHONASYNCIODEBUG") == "1")
    # Flag any synchronous callback that stalls the loop beyond 50 ms.
    loop.slow_callback_duration = 0.05
    print(f"debug={loop.get_debug()} slow_callback={loop.slow_callback_duration}s")
    await asyncio.sleep(0.1)


asyncio.run(main())

A logged Executing <Handle ...> took 0.120 seconds line is the loop telling you exactly which callback blocked it — the cheapest stall detector asyncio offers.
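To see that log line without waiting for a production stall, the sketch below injects a deliberate 0.2 s synchronous sleep and captures what the asyncio logger emits. ListHandler, stall, and capture_slow_callbacks are hypothetical names; debug mode is forced on through asyncio.run(debug=True) so the timing check is active from the first callback.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects formatted log messages so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


async def stall() -> None:
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.05
    await asyncio.sleep(0)  # resume in a fresh callback so the new threshold applies
    time.sleep(0.2)         # deliberate synchronous stall on the loop thread
    await asyncio.sleep(0)


def capture_slow_callbacks() -> list[str]:
    handler = ListHandler()
    asyncio_logger = logging.getLogger("asyncio")
    asyncio_logger.addHandler(handler)
    try:
        asyncio.run(stall(), debug=True)
    finally:
        asyncio_logger.removeHandler(handler)
    # Keep only the slow-callback reports.
    return [m for m in handler.messages if "took" in m]


if __name__ == "__main__":
    for message in capture_slow_callbacks():
        print(message)
```

The same capture pattern can be reused in a test suite to assert that a code path does not block the loop beyond a chosen threshold.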

Signals and graceful shutdown

loop.add_signal_handler (Unix only) runs a callback in the loop thread when a signal arrives, unlike signal.signal, whose handler interrupts whatever frame the main thread is executing and is unsafe to mix with async state. The handler should set an Event (or cancel a sentinel) rather than do the teardown inline, so cancellation happens in coroutine context.

import asyncio
import signal


async def serve(stop: asyncio.Event) -> None:
    while not stop.is_set():
        await asyncio.sleep(1)  # real work goes here


async def main() -> None:
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    for sig in (signal.SIGINT, signal.SIGTERM):
        loop.add_signal_handler(sig, stop.set)
    await serve(stop)


if __name__ == "__main__":
    asyncio.run(main())

Signal handling and cancellation are two halves of one mechanism; the deeper treatment of cancel-safe teardown lives under Cancellation Patterns.

Resource boundaries

The single hard limit a configured loop must respect is the executor that absorbs blocking work. The loop runs on one thread, so every synchronous call must be offloaded through run_in_executor, and the pool behind it has finite capacity. Over-provisioning past 2 × CPU_COUNT for CPU-adjacent work invites GIL thrashing and context-switch overhead; for I/O-bound blocking calls, min(32, (os.cpu_count() or 1) * 4) is a safe starting heuristic. The pool's work queue is unbounded, so backpressure must come from the caller: bound in-flight submissions with a Semaphore (a TaskGroup scopes task lifetimes but does not by itself limit how many tasks run at once) so the queue cannot grow without limit.

import asyncio
import os
from concurrent.futures import ThreadPoolExecutor


async def main() -> None:
    loop = asyncio.get_running_loop()
    max_workers = min(32, (os.cpu_count() or 1) * 4)
    executor = ThreadPoolExecutor(
        max_workers=max_workers,
        thread_name_prefix="io-worker",
    )
    loop.set_default_executor(executor)
    # Cap in-flight submissions at the pool size so the work queue stays empty.
    sem = asyncio.Semaphore(max_workers)

    async def offload(fn, *args):
        async with sem:
            return await loop.run_in_executor(None, fn, *args)

    await offload(lambda: sum(range(1000)))
    # No explicit executor.shutdown(): asyncio.run() shuts down the default
    # executor on exit, and a blocking shutdown(wait=True) here would stall the loop.


asyncio.run(main())

The semaphore is the boundary: without it, a burst of callers can enqueue tens of thousands of jobs faster than the pool drains them, and memory grows until the OOM killer intervenes.

Integrated production bootstrap

The following composes every pattern above into one reusable bootstrap: backend selection, debug gating, error boundary, signal-driven shutdown, executor sizing, and deterministic close — driven by asyncio.Runner so the configured loop is created before the first iteration.

import asyncio
import logging
import os
import signal
import traceback
from concurrent.futures import ThreadPoolExecutor
from typing import Any

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("service")

SHUTDOWN_GRACE = 25.0  # seconds; keep under the orchestrator's terminationGracePeriod


def make_loop_factory():
    try:
        import uvloop
        return uvloop.new_event_loop
    except ImportError:
        logger.warning("uvloop unavailable; using selector loop")
        return asyncio.new_event_loop


def exception_handler(loop: asyncio.AbstractEventLoop, context: dict[str, Any]) -> None:
    exc = context.get("exception")
    if isinstance(exc, asyncio.CancelledError):
        return
    logger.error(
        "loop exception: %s | %s",
        context.get("message", "unhandled"),
        "".join(traceback.format_exception(exc)) if exc else "",
    )
    loop.default_exception_handler(context)


async def worker(worker_id: int, stop: asyncio.Event) -> None:
    while not stop.is_set():
        try:
            await asyncio.sleep(2)
            logger.info("worker %d tick", worker_id)
        except asyncio.CancelledError:
            logger.info("worker %d cancelled", worker_id)
            raise


async def graceful_shutdown(stop: asyncio.Event) -> None:
    stop.set()
    tasks = [t for t in asyncio.all_tasks() if t is not asyncio.current_task()]
    for t in tasks:
        t.cancel()
    try:
        async with asyncio.timeout(SHUTDOWN_GRACE):
            await asyncio.gather(*tasks, return_exceptions=True)
    except TimeoutError:
        logger.error("shutdown exceeded %.0fs; %d tasks may be orphaned", SHUTDOWN_GRACE, len(tasks))


async def main() -> None:
    loop = asyncio.get_running_loop()
    loop.set_debug(os.getenv("PYTHONASYNCIODEBUG") == "1")
    loop.slow_callback_duration = 0.1
    loop.set_exception_handler(exception_handler)
    loop.set_default_executor(
        ThreadPoolExecutor(max_workers=min(32, (os.cpu_count() or 1) * 4), thread_name_prefix="io")
    )

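    stop = asyncio.Event()
    shutdown_tasks: set[asyncio.Task[None]] = set()

    def on_signal() -> None:
        if stop.is_set():
            return  # shutdown already in progress; ignore repeated signals
        stop.set()
        # Hold a strong reference: the loop only keeps weak references to tasks.
        task = loop.create_task(graceful_shutdown(stop), name="graceful-shutdown")
        shutdown_tasks.add(task)
        task.add_done_callback(shutdown_tasks.discard)

    for sig in (signal.SIGINT, signal.SIGTERM):
        loop.add_signal_handler(sig, on_signal)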

    logger.info("service up | backend=%s | debug=%s", type(loop).__module__, loop.get_debug())
    async with asyncio.TaskGroup() as tg:
        for i in range(4):
            tg.create_task(worker(i, stop))


if __name__ == "__main__":
    with asyncio.Runner(loop_factory=make_loop_factory()) as runner:
        try:
            runner.run(main())
        except asyncio.CancelledError:  # runner.run re-raises a bare CancelledError, not a group
            logger.info("clean shutdown complete")

Diagnostic Hook: On startup, log type(loop).__module__ (expect uvloop in production), loop.get_debug() (expect False), and loop.slow_callback_duration. In production, emit the executor's _work_queue.qsize() and len(_threads) as gauges (both are private CPython attributes, so guard the access); a queue depth that climbs while threads stay pinned at max_workers is the precursor to memory blow-up. During a deploy, time the gap between SIGTERM receipt and the final clean shutdown complete line: if it approaches SHUTDOWN_GRACE, your tasks are not yielding on cancel.
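A minimal sketch of the executor gauges follows. The function name executor_gauges is hypothetical, and because _work_queue and _threads are private attributes of CPython's ThreadPoolExecutor, the access goes through getattr and falls back to a sentinel instead of raising.

```python
from concurrent.futures import ThreadPoolExecutor


def executor_gauges(executor: ThreadPoolExecutor) -> dict[str, int]:
    # _work_queue and _threads are private CPython details; getattr keeps the
    # gauge from raising if a future release renames them.
    queue = getattr(executor, "_work_queue", None)
    threads = getattr(executor, "_threads", ())
    return {
        "queue_depth": queue.qsize() if queue is not None else -1,
        "threads": len(threads),
    }


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(executor_gauges(pool))
```

Sampling this from a periodic task and exporting both values lets an alert fire on sustained queue growth rather than on the eventual memory symptom.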

Diagnostic Hook (debug session): Set PYTHONASYNCIODEBUG=1 and drop slow_callback_duration to 0.02. The loop will log Executing <Handle ...> took N seconds for every blocking callback and emit coroutine ... was never awaited and unclosed-transport warnings. Profile the offending callback with py-spy dump --pid <pid> to see the synchronous frame stalling the reactor.

Failure modes

| Failure mode | Root cause | Detection | Fix |
| --- | --- | --- | --- |
| Detached task error vanishes | No awaiter and no loop exception handler; traceback only logs at GC | Errors appear minutes late or never; Task exception was never retrieved or Task was destroyed but it is pending warnings | Install loop.set_exception_handler; retain task handles or use TaskGroup |
| Latency spikes across all coroutines | Synchronous call blocking the single loop thread | slow_callback_duration logs Executing <Handle> took N s; py-spy shows a sync frame | Offload via run_in_executor; bound with a Semaphore |
| Config silently ignored | Backend/debug/policy set after the loop was created | Logged type(loop) is _UnixSelectorEventLoop, not uvloop; debug stays default | Use Runner(loop_factory=...); call set_debug before any await |
| Pod killed on deploy with dropped requests | No SIGTERM handler; abrupt exit mid-request | Connection resets at deploy time; no clean shutdown log line | add_signal_handler → cancel + gather inside the grace period |
| OOM under burst load | Unbounded executor work queue or unbounded task creation | _work_queue.qsize() climbs while _threads is pinned; RSS grows monotonically | Bound submissions and task creation with a Semaphore |
| RuntimeError: Event loop is closed at exit | Tasks or executor submissions outlive loop.close() | Traceback during teardown; ResourceWarning for unclosed transports | await loop.shutdown_asyncgens() and loop.shutdown_default_executor() before close (both handled by Runner) |

Frequently Asked Questions

How do I configure a custom event loop in Python 3.12+ without the deprecated policy API?

Pass loop_factory to asyncio.run(main(), loop_factory=...) (available from 3.12) or use asyncio.Runner(loop_factory=...) (available from 3.11). For uvloop specifically, loop_factory=uvloop.new_event_loop. The policy system (asyncio.set_event_loop_policy()) is deprecated since 3.14 and slated for removal in 3.16, so reserve it for libraries that still support older runtimes.

What is the actual performance impact of loop.set_debug(True)?

Roughly 10–30% added latency per loop iteration from coroutine creation-site tracking, slow-callback timing, and resource-leak detection, plus higher memory from retained reference chains. Keep it behind an environment flag and enable it only when diagnosing a stall, leak, or race.
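The figure depends heavily on workload, so it is worth measuring on your own code path. The sketch below is a deliberately crude micro-benchmark, assuming a loop that does nothing but yield; spin and time_loop are hypothetical names, and the ratio it prints is only indicative.

```python
import asyncio
import time


async def spin(iterations: int) -> None:
    # Each sleep(0) forces one trip through the ready queue.
    for _ in range(iterations):
        await asyncio.sleep(0)


def time_loop(debug: bool, iterations: int = 20_000) -> float:
    start = time.perf_counter()
    asyncio.run(spin(iterations), debug=debug)
    return time.perf_counter() - start


if __name__ == "__main__":
    plain = time_loop(debug=False)
    traced = time_loop(debug=True)
    print(f"debug off: {plain:.3f}s  debug on: {traced:.3f}s  ratio: {traced / plain:.2f}x")
```

Replacing spin with a representative slice of real work gives a more honest number than any general estimate.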

When should I swap to uvloop versus tuning the default selector loop?

Use uvloop for I/O-bound network services where epoll/kqueue dispatch dominates — it typically yields 2–4x throughput. Stay on the default loop on Windows (no native uvloop), on minimal images where it will not build, or when a dependency relies on selector-loop internals. Always keep the selector loop as a fallback in the factory.

Why does my exception handler never fire for failing tasks?

It is attached too late, the task is awaited, or a reference to the finished task is still alive. The handler fires only for exceptions with no consumer, and for a task that report is produced when the task object is garbage-collected; if you await or gather a task, the exception propagates to that awaiter instead. Attach the handler from inside the running loop before creating detached tasks, and confirm with a deliberately raising fire-and-forget task.
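A minimal sketch of that confirmation step, assuming CPython's reference-counting behaviour: the task is created without keeping a reference, and gc.collect() is called as a safety net in case a cycle delays finalisation. The names boom and check_handler are hypothetical.

```python
import asyncio
import gc
from typing import Any


async def boom() -> None:
    raise ValueError("deliberate failure")


async def check_handler() -> list[dict[str, Any]]:
    seen: list[dict[str, Any]] = []
    loop = asyncio.get_running_loop()
    loop.set_exception_handler(lambda _loop, context: seen.append(context))
    asyncio.create_task(boom())  # fire-and-forget: no reference is kept on purpose
    await asyncio.sleep(0.05)    # let the task run and fail
    gc.collect()                 # force finalisation in case a cycle delays it
    await asyncio.sleep(0)
    return seen


if __name__ == "__main__":
    for context in asyncio.run(check_handler()):
        print(context["message"], "|", repr(context["exception"]))
```

If the list comes back empty in your service, look for a container or closure that still holds the finished task.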

How do I prevent dropped requests when Kubernetes sends SIGTERM?

Register loop.add_signal_handler(SIGTERM, ...) to stop accepting work, cancel in-flight tasks, and await asyncio.gather(..., return_exceptions=True) inside an asyncio.timeout() shorter than the pod's terminationGracePeriodSeconds. Then let Runner/asyncio.run drain async generators and close the loop.