Avoiding Event Loop Blocking with asyncpg¶

You migrated to asyncpg precisely because it is non-blocking, yet p99 latency still spikes in bursts: most requests are fast, then a burst of them stalls for hundreds of milliseconds for no obvious reason. The query plans are fine and the database is not overloaded. The cause is almost always something else on the event loop — a synchronous psycopg2 call left in a legacy module, a CPU-heavy transform over a large result set, or a single connection shared across concurrent tasks — that freezes the loop while asyncpg waits its turn. This guide is a repeatable workflow to find that hidden blocker and remove it.

Prerequisites¶

Python 3.11+ (for asyncio.timeout() and asyncio.TaskGroup; asyncio.to_thread() has been available since 3.9).
asyncpg: pip install asyncpg.
Familiarity with the driver and pool model from Async Database Drivers and the loop's I/O-multiplexing model in Network I/O & Protocol Handling.

Step 1 — Detect Loop Blocking¶

Turn on debug mode and lower the slow-callback threshold so the loop tells you when a single callback ran too long. Each warning means one callback or task step held the loop past the threshold, and the message names the task or handle responsible, including the coroutine it was running.

import asyncio
import logging

logging.basicConfig(level=logging.WARNING)

async def main() -> None:
    loop = asyncio.get_running_loop()
    loop.set_debug(True)
    # Default is 0.1s; lower it so even modest stalls are flagged.
    loop.slow_callback_duration = 0.05  # 50 ms
    await run_workload()

asyncio.run(main())

Verify: Under load you should see Executing <...> took 0.123 seconds warnings. The task or callback named in the warning contains your blocking call. If you see none and latency is still spiky, the stall is probably not a single long callback. Proceed to the next steps anyway, as a too-small pool also masquerades as latency.
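Debug mode is usually too noisy to leave on in production. As a complement, a minimal sketch of a loop-lag monitor follows: a background task sleeps for a fixed interval and records how late it wakes up. The names monitor_loop_lag and demo are hypothetical, and time.sleep() stands in for whatever hidden synchronous call you are hunting.

```python
import asyncio
import time


async def monitor_loop_lag(interval: float, samples: list[float]) -> None:
    """Record how late each wake-up is; lateness means the loop was busy."""
    loop = asyncio.get_running_loop()
    while True:
        scheduled = loop.time() + interval
        await asyncio.sleep(interval)
        # On a free loop we wake almost exactly on schedule.
        samples.append(max(0.0, loop.time() - scheduled))


async def demo() -> float:
    samples: list[float] = []
    monitor = asyncio.create_task(monitor_loop_lag(0.01, samples))
    await asyncio.sleep(0.05)   # let the monitor take a few clean samples
    time.sleep(0.2)             # stand-in for a hidden synchronous call
    await asyncio.sleep(0.05)   # let the monitor observe the stall
    monitor.cancel()
    try:
        await monitor
    except asyncio.CancelledError:
        pass
    return max(samples)


worst_lag = asyncio.run(demo())
print(f"worst loop lag: {worst_lag * 1000:.0f} ms")
```

Unlike the slow-callback warning, this does not say which code blocked. It only confirms that the loop stalled and for roughly how long, which makes it a reasonable metric to export and alert on.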

Step 2 — Set Up the Pool Correctly¶

A per-request connection pays a TCP and auth round-trip every time, which looks exactly like latency. Create one pool at startup with explicit sizing and a command_timeout so a stuck query cannot hang a connection forever.

import asyncpg

async def make_pool(dsn: str) -> asyncpg.Pool:
    return await asyncpg.create_pool(
        dsn,
        min_size=5,            # warm connections at startup
        max_size=20,           # cap fan-out; see Step 5 and server limits
        command_timeout=10.0,  # per-statement deadline
        max_inactive_connection_lifetime=300.0,
    )

Verify: Log pool.get_size() at startup; it should equal min_size immediately, proving connections are warmed, not opened lazily on the hot path.
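When choosing max_size, it can help to work backwards from the server limit. The following is a rough sketch of that arithmetic, not a sizing rule: pool_max_size is a hypothetical helper, and the numbers (100 server connections, 10 held back for admin tools and migrations, 4 application instances) are made-up assumptions.

```python
def pool_max_size(server_max_connections: int, reserved: int, instances: int) -> int:
    """Largest per-instance max_size that keeps all pools within the server limit."""
    if instances < 1:
        raise ValueError("instances must be >= 1")
    usable = server_max_connections - reserved
    if usable < instances:
        raise ValueError("server limit too low for this many instances")
    return usable // instances


# Hypothetical deployment: 100 server slots, 10 reserved, 4 app instances.
per_instance = pool_max_size(100, 10, 4)
print(per_instance)  # 22
```

Integer division rounds down on purpose, so the combined pools can never exceed the usable budget.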

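Step 3 — Acquire a Connection per Operation¶

An asyncpg connection runs one operation at a time. Sharing one connection across concurrent tasks either fails with InterfaceError (cannot perform operation: another operation is in progress) or, if you guard it with a lock, serializes all the work behind that one connection. Borrow per operation and release promptly.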

async def fetch_user(pool: asyncpg.Pool, user_id: int) -> asyncpg.Record | None:
    async with pool.acquire() as conn:          # one conn per call
        return await conn.fetchrow(
            "SELECT id, email FROM users WHERE id = $1", user_id
        )
    # released here automatically, even on exception

# WRONG: one conn captured and reused by many tasks
# conn = await pool.acquire()
# await asyncio.gather(*(work(conn) for _ in range(50)))  # concurrent use of one conn fails

Verify: Remove the anti-pattern, then run your concurrency test. Intermittent InterfaceError failures (another operation is in progress) should disappear entirely.

Step 4 — Move CPU-Heavy Row Processing Off the Loop¶

asyncpg's network wait yields, but the Python code that runs after fetch() does not. Deserializing 50k rows or running a heavy transform inline blocks the loop just like a sync driver would. Push it to a thread with asyncio.to_thread so the loop can keep scheduling other tasks. Because of the GIL, a thread keeps the loop responsive but does not add CPU parallelism, so use a process pool for truly CPU-bound work, per CPU-bound task offloading.

import asyncio

import asyncpg

def _transform(rows: list[asyncpg.Record]) -> list[dict]:
    # CPU-heavy: parsing, aggregation, regex over every row.
    # expensive_score is a placeholder for your own per-row function.
    return [{"id": r["id"], "score": expensive_score(r)} for r in rows]

async def report(pool: asyncpg.Pool) -> list[dict]:
    async with pool.acquire() as conn:
        rows = await conn.fetch("SELECT * FROM events LIMIT 50000")
    # Fetch on the loop (I/O), transform off the loop (CPU).
    return await asyncio.to_thread(_transform, rows)

Verify: Re-run Step 1's debug loop. The slow-callback warning that pointed at _transform should be gone; the CPU time now lives on a worker thread.

Step 5 — Bound Concurrency to the Pool Size¶

If you launch 1000 tasks against a 20-connection pool, 980 of them queue inside acquire() and your acquire-wait p99 explodes. Gate callers with a Semaphore sized to the pool so excess load waits at one observable choke point instead of piling up inside the pool.

import asyncio

import asyncpg  # fetch_user() is the per-call helper defined earlier

async def run_bounded(pool: asyncpg.Pool, ids: list[int]) -> list:
    gate = asyncio.Semaphore(pool.get_max_size())  # match the pool

    async def one(uid: int):
        async with gate:                            # cap in-flight queries
            return await fetch_user(pool, uid)

    async with asyncio.TaskGroup() as tg:
        tasks = [tg.create_task(one(uid)) for uid in ids]
    return [t.result() for t in tasks]

Verify: Time the acquire(): record time.perf_counter() before and after pool.acquire(), and wrap the call in asyncio.timeout() so a starved caller fails fast instead of hanging. With the Semaphore matched to the pool, acquire wait should sit near zero even when you submit far more tasks than connections.

Verification¶

After applying all five steps, confirm the fix holds under load:

The slow-callback warnings from Step 1 no longer appear during a sustained load test.
p99 latency stabilizes — the bursty spikes flatten because no single coroutine monopolizes the loop.
Pool acquire wait (p99) stays low; idle connections are non-zero between bursts, proving the pool is no longer the bottleneck.
Concurrency tests run clean: no intermittent InterfaceError (another operation is in progress) from a shared connection.

Pitfalls & Edge Cases¶

A sync psycopg2 call sneaking in. A "quick" synchronous query in an admin endpoint or a logging hook blocks the loop for everyone. Grep for non-async drivers in the codebase; route any survivor through asyncio.to_thread.
Sharing a connection via a captured variable. Storing an acquired connection on self or a module global and reusing it across tasks is the same bug as Step 3, just hidden. Acquire inside each operation, never above the fan-out.
Giant result sets. A SELECT returning millions of rows materializes them all in memory and the transform stalls the loop. Use a server-side cursor (conn.cursor() inside a transaction) to stream in batches, and offload the per-batch work.
A transaction spanning a slow await. Doing unrelated slow I/O (an HTTP call, a sleep) inside async with conn.transaction() pins the connection and holds locks for the whole duration. Keep only the related writes inside the transaction.
Pool max_size greater than the server max_connections. The pool happily tries to open more backends than the server allows and PostgreSQL rejects them with FATAL: sorry, too many clients already. Count every instance's pool against the server budget, as covered in Async Database Drivers.
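For the giant-result-set pitfall above, the batching half can be sketched independently of the driver. The helper below, process_in_batches, is hypothetical: it accepts any async iterator of rows, groups them, and runs a per-batch transform through asyncio.to_thread. The fake_cursor generator and the total function are stand-ins so the example runs without a database.

```python
import asyncio
from collections.abc import AsyncIterator, Callable


async def process_in_batches(
    rows: AsyncIterator,
    batch_size: int,
    transform: Callable[[list], list],
) -> AsyncIterator[list]:
    """Group an async row stream into batches and transform each off the loop."""
    batch: list = []
    async for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            yield await asyncio.to_thread(transform, batch)
            batch = []
    if batch:  # final partial batch
        yield await asyncio.to_thread(transform, batch)


async def fake_cursor(n: int):
    """Stand-in for a server-side cursor; yields plain (id, value) tuples."""
    for i in range(n):
        yield (i, i * i)


def total(batch: list) -> list:
    # Stand-in for a CPU-heavy per-batch transform.
    return [sum(value for _, value in batch)]


async def demo() -> list:
    out: list = []
    async for result in process_in_batches(fake_cursor(10), 4, total):
        out.extend(result)
    return out


results = asyncio.run(demo())
print(results)  # [14, 126, 145]
```

With asyncpg, the rows argument would be a conn.cursor(...) iterated inside async with conn.transaction(). The connection stays checked out while batches are processed, so keep the per-batch work short or copy the rows out first.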

Frequently Asked Questions¶

Why does asyncpg still cause latency spikes if it is non-blocking?

asyncpg yields only during network waits. Latency spikes come from something else on the same loop: a synchronous psycopg2 or sqlite3 call left in another module, CPU-heavy processing of a large result set running inline, or a connection shared across tasks. Enable loop debug with a low slow_callback_duration to find the offending callback.

How do I detect that something is blocking my asyncio event loop?

Call loop.set_debug(True) and set loop.slow_callback_duration to about 0.05 seconds. The loop then logs a warning whenever a single callback runs longer than that threshold, and the warning names the task or callback that held the loop, which points you at the blocking code.

How do I keep heavy row processing from blocking the loop with asyncpg?

Do the network fetch on the loop, then hand the rows to asyncio.to_thread for CPU-heavy transforms, or to a ProcessPoolExecutor for truly CPU-bound work. This keeps the parsing and aggregation off the loop thread so other coroutines continue to run.

Async Database Drivers — up to the overview for pools, transactions, and sizing across native and blocking drivers.
Network I/O & Protocol Handling — up to the overview for the loop's I/O-multiplexing model that asyncpg relies on.
Running Blocking SDK Calls with asyncio.to_thread — the pattern for any synchronous driver that must coexist with the loop.