Asyncio Synchronization Primitives

This guide covers four asyncio coordination primitives: asyncio.Lock, asyncio.Semaphore (and its stricter sibling BoundedSemaphore), asyncio.Event, and asyncio.Condition. (Python 3.11 also added asyncio.Barrier, which is out of scope here.) Engineers routinely reach for the wrong one. A Lock gets used to cap concurrency (it can't), an Event gets used as a work queue (it loses items), a Condition gets used where a queue would be simpler and safer. This guide draws the boundaries: what each primitive actually guarantees, how it interacts with the event loop scheduler, and how each one fails in production.

The single most important fact: these primitives are loop-affine and not thread-safe. They coordinate coroutines running on one event loop. They do nothing to protect state shared with OS threads or processes — for that you need threading primitives or thread-safe queues, covered under hybrid concurrency models.

The reason these primitives feel deceptively simple is that cooperative scheduling does most of the work for free. A synchronous critical section in a threaded program needs a lock because the OS can preempt a thread at any bytecode boundary; an asyncio coroutine can only lose control at an explicit await. That single constraint reshapes when coordination is necessary — much of the "obvious" locking you would write in threaded code is dead weight here, while a different, smaller set of hazards (await-spanning state, unbounded fan-out, missed signals) takes its place. The rest of this guide is about matching each of those hazards to exactly one primitive, and recognizing the production symptoms when you get the match wrong.

Architectural Principles

A critical section only needs a Lock if it awaits. Code that runs straight through without yielding is already atomic under cooperative scheduling — no other coroutine can interleave. You only need mutual exclusion when the section suspends at an await and another coroutine could observe or mutate half-updated state in the gap.
A Semaphore caps concurrency; it is not mutual exclusion. A semaphore with a count of N admits N coroutines into the protected region simultaneously. Use it to bound fan-out (open sockets, in-flight requests), never to serialize a single shared resource — that's a Lock (a semaphore of 1).
None of these are thread-safe. Calling .acquire(), .set(), or .notify() from a thread other than the loop's thread is a race. Cross-thread coordination belongs to threading.Lock, queue.Queue, or loop.call_soon_threadsafe().
Prefer a Queue to a Condition when you can. A Condition is the most error-prone primitive here (missed wakeups, predicate re-checks, lock coupling). If your real problem is "hand work items between producers and consumers," an asyncio.Queue is correct, simpler, and backpressure-aware.
Wakeups are FIFO and fair. Waiters are queued in arrival order and released in that order, so no coroutine starves under contention — but fairness does not prevent deadlock if you acquire multiple locks in inconsistent orders.
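The first principle is easy to check empirically. The sketch below is an illustration only (the `Counter` class and its method names are hypothetical, not part of asyncio): it runs the same read-modify-write three ways, once with an await in the middle and no lock, once with the await guarded by a Lock, and once with no await at all.

```python
import asyncio


class Counter:
    """Hypothetical shared counter used only to illustrate the principle."""

    def __init__(self) -> None:
        self.value = 0
        self.lock = asyncio.Lock()

    async def unsafe_increment(self) -> None:
        current = self.value          # read
        await asyncio.sleep(0)        # yield: other coroutines run here
        self.value = current + 1      # write based on a possibly stale read

    async def locked_increment(self) -> None:
        async with self.lock:         # serializes the read-await-write span
            current = self.value
            await asyncio.sleep(0)
            self.value = current + 1

    async def atomic_increment(self) -> None:
        self.value += 1               # no await between read and write


async def run_demo(n: int = 100) -> tuple[int, int, int]:
    unsafe, locked, atomic = Counter(), Counter(), Counter()
    await asyncio.gather(*(unsafe.unsafe_increment() for _ in range(n)))
    await asyncio.gather(*(locked.locked_increment() for _ in range(n)))
    await asyncio.gather(*(atomic.atomic_increment() for _ in range(n)))
    return unsafe.value, locked.value, atomic.value


if __name__ == "__main__":
    print(asyncio.run(run_demo()))
```

The unlocked, await-spanning version loses updates because every coroutine reads before any of them writes. The other two both reach `n`, and only one of them needed a lock to get there.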

Execution Model: How a Coroutine Suspends on a Primitive

All four primitives are built on the same mechanism: an internal collection of Future objects. When a coroutine calls await lock.acquire() and the lock is held, the lock creates a Future, appends it to its internal waiter deque, and awaits it. Awaiting an unresolved future yields control to the event loop, which is then free to run other ready tasks. The calling coroutine is now suspended — it consumes no CPU and is invisible to the scheduler until its future resolves.

When the holder calls lock.release(), the lock pops the oldest waiter future off the deque and calls set_result(True) on it. That schedules a callback (via loop.call_soon) to wake the corresponding task on the next loop iteration. Because the deque is FIFO, the longest-waiting coroutine is always served first. This is the same machinery that powers raw futures and callbacks — see the parent Asyncio Fundamentals & Event Loop Architecture for how the loop turns resolved futures into scheduled task steps.
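The FIFO hand-off described above can be observed directly. This is a minimal sketch, assuming nothing beyond the public `asyncio.Lock` API; the `waiter` and `fifo_demo` names are made up for the illustration.

```python
import asyncio


async def waiter(name: str, lock: asyncio.Lock, order: list[str]) -> None:
    async with lock:
        order.append(name)                     # record the order of admission


async def fifo_demo() -> list[str]:
    lock = asyncio.Lock()
    order: list[str] = []
    await lock.acquire()                       # hold the lock so every task queues
    tasks = [asyncio.create_task(waiter(f"t{i}", lock, order)) for i in range(5)]
    await asyncio.sleep(0)                     # let each task run up to acquire()
    lock.release()                             # wakes the oldest waiter first
    await asyncio.gather(*tasks)
    return order


if __name__ == "__main__":
    print(asyncio.run(fifo_demo()))            # ['t0', 't1', 't2', 't3', 't4']
```

Each waiter releases the lock as it leaves the `async with` block, which wakes the next future in the deque, so the admission order matches the arrival order.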

The consequence for Semaphore is identical, just counted: acquire() decrements an internal value and only suspends when the value hits zero; release() increments it and wakes one waiter. Event and Condition use the same future-queue pattern but with broadcast (Event.set() resolves all waiters) and predicate re-check (Condition) semantics layered on top.

Two properties fall directly out of this design. First, waiting is passive: there is no busy-spinning, no polling, and no wasted loop iterations, so a parked coroutine costs nothing but the memory of a suspended frame and a future. This is why you can have tens of thousands of coroutines blocked on a single primitive without the scheduler degrading. Second, fairness is structural, not heuristic. Because waiters live in an ordered deque and are popped from the front, the wakeup order is deterministic FIFO. A coroutine cannot be starved by a steady stream of newcomers, which is a stronger guarantee than most threading runtimes give you. The flip side is that fairness says nothing about liveness across multiple primitives: if coroutine A holds lock 1 and waits on lock 2 while coroutine B holds lock 2 and waits on lock 1, both are fairly queued and both are deadlocked forever. Ordering guarantees within one primitive never rescue you from inconsistent acquisition order across several.
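One way to rule out the two-lock deadlock is to route every multi-lock acquisition through a helper that imposes a single global order. The sketch below is one possible approach, not a standard-library facility: `acquire_in_order` is a hypothetical helper, and sorting by `id()` is an arbitrary but consistent ordering chosen for the example.

```python
import asyncio
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager


@asynccontextmanager
async def acquire_in_order(*locks: asyncio.Lock) -> AsyncIterator[None]:
    """Hypothetical helper: take several locks in one global order (by id())."""
    acquired: list[asyncio.Lock] = []
    try:
        for lock in sorted(locks, key=id):
            await lock.acquire()
            acquired.append(lock)
        yield
    finally:
        for lock in reversed(acquired):        # release in reverse order
            lock.release()


async def transfer(a: asyncio.Lock, b: asyncio.Lock, log: list[str], name: str) -> None:
    async with acquire_in_order(a, b):
        await asyncio.sleep(0.01)              # await-spanning work under both locks
        log.append(name)


async def ordering_demo() -> list[str]:
    lock1, lock2 = asyncio.Lock(), asyncio.Lock()
    log: list[str] = []
    # The callers name the locks in opposite orders; the helper normalizes them.
    await asyncio.wait_for(
        asyncio.gather(
            transfer(lock1, lock2, log, "A"),
            transfer(lock2, lock1, log, "B"),
        ),
        timeout=1.0,
    )
    return log


if __name__ == "__main__":
    print(sorted(asyncio.run(ordering_demo())))
```

Without the helper, the same two calls written as nested `async with` blocks in opposite orders would park forever. Here both complete inside the timeout.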

Pattern Catalogue

Lock for an Await-Spanning Critical Section

Use a Lock when a shared resource is read-modified-written across an await. The classic case is a lazily-initialized singleton (a connection, a cached token) where two coroutines could both observe "not yet initialized," both start initializing, and double-create.

import asyncio


class TokenCache:
    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._token: str | None = None
        self._expiry: float = 0.0

    async def get(self) -> str:
        # Fast path: no await, no lock needed to *read* the reference.
        now = asyncio.get_running_loop().time()
        if self._token and now < self._expiry:
            return self._token
        # Slow path: refresh must be serialized because it awaits.
        async with self._lock:
            # Re-check inside the lock — a peer may have refreshed while we waited.
            now = asyncio.get_running_loop().time()
            if self._token and now < self._expiry:
                return self._token
            self._token = await self._fetch_token()
            self._expiry = now + 3000
            return self._token

    async def _fetch_token(self) -> str:
        await asyncio.sleep(0.2)  # stand-in for an auth round-trip
        return "tok-" + str(id(self))

The double-checked pattern matters: without the re-check inside the lock, every coroutine that queued behind the first refresh would needlessly refresh again. The Lock only guards the await-spanning region; the fast-path read needs no protection because it never yields. A subtle point worth internalizing: the assignment self._token = await self._fetch_token() looks like one statement, but the await splits it — _fetch_token runs, suspends, and only after it resolves is the result bound to self._token. Any coroutine that checks self._token during that window sees the stale value, which is precisely why the entire refresh, not just the assignment, lives inside the lock.

asyncio.Lock is also non-reentrant: a coroutine that already holds the lock and calls async with self._lock again will deadlock against itself. Unlike threading.RLock, there is no recursive variant in the standard library, so structure your code so a lock is acquired at exactly one level and never re-entered by a nested helper.
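A common way to honor the "one level only" rule is to take the lock in public methods and put the shared logic in helpers that assume the lock is already held. The sketch below illustrates that structure with a hypothetical `Registry` class; `reentry_hangs` uses a short timeout to show that a second acquire by the same coroutine never succeeds.

```python
import asyncio


class Registry:
    """Hypothetical example: lock at the public boundary, helpers run unlocked."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self._items: dict[str, int] = {}

    async def add(self, key: str) -> int:
        async with self._lock:
            return await self._add_unlocked(key)

    async def add_many(self, keys: list[str]) -> list[int]:
        async with self._lock:
            # Calling self.add() here would re-enter the lock and hang.
            return [await self._add_unlocked(k) for k in keys]

    async def _add_unlocked(self, key: str) -> int:
        # Caller must already hold self._lock.
        await asyncio.sleep(0)                 # stand-in for I/O
        self._items[key] = self._items.get(key, 0) + 1
        return self._items[key]


async def reentry_hangs() -> bool:
    lock = asyncio.Lock()
    async with lock:
        try:
            await asyncio.wait_for(lock.acquire(), timeout=0.05)
        except asyncio.TimeoutError:
            return True                        # the second acquire never succeeded
    return False


if __name__ == "__main__":
    print(asyncio.run(Registry().add_many(["a", "a", "b"])))
    print(asyncio.run(reentry_hangs()))
```

The underscore-suffixed naming (`_add_unlocked`) is only a convention, but it makes the locking contract visible at every call site.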

Semaphore to Bound Concurrency

A Semaphore(N) is the standard tool for capping in-flight work — fan-out to an API, parallel file reads, concurrent DB queries. It does not serialize; it admits up to N coroutines at once and makes the rest wait.

import asyncio


async def fetch(client, url: str, sem: asyncio.Semaphore) -> int:
    async with sem:                      # blocks once N are in flight
        # stand-in for client.get(url); returns a status code
        await asyncio.sleep(0.1)
        return 200


async def fan_out(urls: list[str], limit: int = 20) -> list[int]:
    sem = asyncio.Semaphore(limit)
    async with asyncio.TaskGroup() as tg:
        tasks = [tg.create_task(fetch(None, u, sem)) for u in urls]
    return [t.result() for t in tasks]

Note that all tasks are created immediately, but only limit of them progress past async with sem at any moment. This is the right way to throttle a fan-out: the semaphore, not the task count, is your concurrency knob. A common alternative, slicing the input into batches of N and awaiting each batch with gather, is worse for throughput, because a single slow item in a batch stalls the whole batch behind it (head-of-line blocking). The semaphore keeps the pipeline full: the instant any in-flight call finishes, the next queued coroutine is admitted, so you sustain N concurrent calls continuously rather than in lurching waves.
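To confirm that the semaphore really is the concurrency knob, you can count how many coroutines are inside the protected region at once. This is a small measurement sketch; `tracked_call`, `measure_peak`, and the `stats` dictionary are illustrative names, and the sleep stands in for a real request.

```python
import asyncio


async def tracked_call(sem: asyncio.Semaphore, stats: dict[str, int]) -> None:
    async with sem:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        try:
            await asyncio.sleep(0.01)          # stand-in for the real request
        finally:
            stats["in_flight"] -= 1


async def measure_peak(total: int = 50, limit: int = 5) -> int:
    sem = asyncio.Semaphore(limit)
    stats = {"in_flight": 0, "peak": 0}
    # All `total` coroutines start at once; the semaphore admits `limit` at a time.
    await asyncio.gather(*(tracked_call(sem, stats) for _ in range(total)))
    return stats["peak"]


if __name__ == "__main__":
    print(asyncio.run(measure_peak()))
```

The peak never exceeds `limit`, regardless of how many tasks were created up front.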

The semaphore also composes cleanly with cancellation. If the surrounding TaskGroup is cancelled, every parked acquire() is interrupted with CancelledError, and any coroutine inside async with sem releases its permit on the way out as the context manager unwinds — so a cancelled fan-out does not leak permits. This is one more reason to always use the async with form rather than manual acquire/release; the cleanup is automatic and exception-safe.
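The no-leak claim can be checked with a short experiment: cancel a mix of permit holders and parked waiters, then verify that every permit can be re-acquired. The sketch below uses plain tasks rather than a TaskGroup to keep it short; `hold` and `cancel_demo` are hypothetical names.

```python
import asyncio


async def hold(sem: asyncio.Semaphore) -> None:
    async with sem:
        await asyncio.sleep(3600)              # would hold the permit for an hour


async def cancel_demo(limit: int = 2, task_count: int = 6) -> int:
    sem = asyncio.Semaphore(limit)
    tasks = [asyncio.create_task(hold(sem)) for _ in range(task_count)]
    await asyncio.sleep(0.01)                  # `limit` tasks hold permits, the rest are parked
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    # If no permit leaked, `limit` fresh acquires succeed without waiting.
    reacquired = 0
    for _ in range(limit):
        await asyncio.wait_for(sem.acquire(), timeout=0.1)
        reacquired += 1
    return reacquired


if __name__ == "__main__":
    print(asyncio.run(cancel_demo()))
```

Had `hold` used a bare `await sem.acquire()` without a `finally: sem.release()`, the cancelled holders would have taken their permits with them and the re-acquire loop would time out.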

BoundedSemaphore to Catch Over-Release

A plain Semaphore lets release() push the counter above its initial value — a silent bug if your acquire/release accounting drifts. BoundedSemaphore raises ValueError the moment a release would exceed the original count, turning a latent leak into a loud failure at the offending call site.

import asyncio


async def main() -> None:
    sem = asyncio.BoundedSemaphore(2)
    await sem.acquire()
    sem.release()
    try:
        sem.release()                    # one release too many
    except ValueError as exc:
        print(f"caught over-release: {exc}")   # fires immediately


asyncio.run(main())

Prefer BoundedSemaphore whenever you call acquire/release manually rather than via async with; it is the cheapest available guard against an accounting error that would otherwise inflate your concurrency cap unnoticed.

Event for One-Shot Signaling and Readiness

An Event is a sticky boolean broadcast: many coroutines await event.wait(), one coroutine calls event.set(), and all waiters wake. Once set, it stays set, so any coroutine that waits afterward returns immediately. Use it for readiness gates ("config loaded," "warm-up done") and shutdown fan-out — never as a queue, because it carries no payload and merges multiple signals into one.

The stickiness is what makes Event safe for startup gates and dangerous for repeated pulses. A latecomer to a readiness gate should sail through immediately — the system is ready, there is nothing to wait for. But if you try to reuse one Event as a recurring "tick," you must clear() it between pulses, and now you have a race: a waiter that has not yet looped back to wait() when you set()-then-clear() misses the pulse entirely. The rule of thumb is blunt: one Event, one edge. If the signal fires more than once in the object's lifetime, you have outgrown Event and want a queue (one sentinel per pulse) or a Condition over an explicit counter.

import asyncio


async def worker(name: str, ready: asyncio.Event) -> None:
    await ready.wait()                    # parks until startup completes
    print(f"{name} started processing")


async def main() -> None:
    ready = asyncio.Event()
    workers = [asyncio.create_task(worker(f"w{i}", ready)) for i in range(3)]
    await asyncio.sleep(0.5)              # simulate warm-up / config load
    ready.set()                           # release every waiter at once
    await asyncio.gather(*workers)


asyncio.run(main())
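The set()-then-clear() race that makes Event unsuitable for recurring pulses can be reproduced in a few lines. In this sketch (all names hypothetical), a listener handles the first pulse, is still busy when the second pulse fires, and therefore never sees it.

```python
import asyncio


async def slow_listener(tick: asyncio.Event, seen: list[int]) -> None:
    for _ in range(2):
        await tick.wait()
        seen.append(len(seen) + 1)
        await asyncio.sleep(0.05)              # busy handling the pulse; not in wait()


async def pulse_demo() -> int:
    tick = asyncio.Event()
    seen: list[int] = []
    listener = asyncio.create_task(slow_listener(tick, seen))
    await asyncio.sleep(0)                     # listener is now parked in wait()
    for _ in range(2):
        tick.set()                             # pulse...
        tick.clear()                           # ...and immediately re-arm
        await asyncio.sleep(0.01)
    # The second pulse fired while the listener was busy, so it was lost and
    # the listener is now parked on an Event that will never be set again.
    try:
        await asyncio.wait_for(listener, timeout=0.2)
    except asyncio.TimeoutError:
        pass
    return len(seen)                           # pulses actually observed


if __name__ == "__main__":
    print(asyncio.run(pulse_demo()))           # 1, although two pulses were sent
```

Replacing the Event with an asyncio.Queue that receives one item per pulse would let the listener see both, because queued items persist until they are consumed.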

Condition for Wait-for-State

A Condition couples a lock with notification: a coroutine holds the lock, evaluates a predicate over shared state, and awaits condition.wait() if the predicate is false, which releases the lock while parked and re-acquires it on wakeup. A producer mutates the state under the same lock and calls notify()/notify_all(). Always re-check the predicate in a while loop, because a wakeup does not guarantee the predicate now holds.

import asyncio
from collections import deque


class Buffer:
    def __init__(self) -> None:
        self._cond = asyncio.Condition()
        self._items: deque[int] = deque()

    async def put(self, item: int) -> None:
        async with self._cond:
            self._items.append(item)
            self._cond.notify()           # wake one waiter

    async def get(self) -> int:
        async with self._cond:
            while not self._items:        # re-check, never `if`
                await self._cond.wait()
            return self._items.popleft()

This hand-rolled buffer reproduces behavior that asyncio.Queue already provides (CPython's Queue implements it with its own waiter futures rather than a Condition), which is why a Queue is the better default. Reach for Condition only when your wait predicate is richer than "is there an item" (e.g. "are there at least K items" or "is the system in state X"). Cancellation while parked in wait() interacts subtly with the lock; see cancellation patterns for safe teardown.

The while-not-if rule is not optional defensive style; it is correctness. Condition.wait() releases the lock, suspends, and re-acquires the lock on wakeup — and between the notify() and your coroutine actually resuming, another awakened waiter may have already consumed the state your predicate was checking for. With notify_all() this is the norm rather than the exception: every waiter wakes, races to re-acquire the lock one at a time, and all but the lucky ones must find the predicate false again and loop back into wait(). Use the narrowest notification you can — notify(1) when exactly one waiter can make progress — to avoid the thundering-herd re-acquire storm that notify_all() triggers under heavy contention.
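For a richer predicate such as "at least K items", `Condition.wait_for(predicate)` packages the while-loop re-check for you. The sketch below assumes a hypothetical `BatchBuffer` class; only `asyncio.Condition`, `wait_for`, and `notify` come from the standard library.

```python
import asyncio
from collections import deque


class BatchBuffer:
    """Hypothetical buffer whose consumers wait for a full batch."""

    def __init__(self, batch_size: int) -> None:
        self._cond = asyncio.Condition()
        self._items: deque[int] = deque()
        self._batch_size = batch_size

    async def put(self, item: int) -> None:
        async with self._cond:
            self._items.append(item)
            if len(self._items) >= self._batch_size:
                self._cond.notify()            # one consumer can now make progress

    async def get_batch(self) -> list[int]:
        async with self._cond:
            # wait_for() re-evaluates the predicate after every wakeup.
            await self._cond.wait_for(lambda: len(self._items) >= self._batch_size)
            return [self._items.popleft() for _ in range(self._batch_size)]


async def batch_demo() -> list[int]:
    buf = BatchBuffer(batch_size=3)
    consumer = asyncio.create_task(buf.get_batch())
    await asyncio.sleep(0)                     # let the consumer park in wait_for()
    for n in range(3):
        await buf.put(n)
    return await asyncio.wait_for(consumer, timeout=1.0)


if __name__ == "__main__":
    print(asyncio.run(batch_demo()))           # [0, 1, 2]
```

The producer only notifies once the predicate can actually be true, which keeps wakeups narrow in the way the paragraph above recommends.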

Resource Boundaries: Sizing a Semaphore

A semaphore's count is a capacity decision, and the wrong number is a production incident. Size it to the narrowest downstream limit, not to how much parallelism your loop can nominally sustain. If your database connection pool holds 10 connections, a Semaphore(50) guarding queries just moves the queue from your semaphore to the pool's checkout wait — or worse, exhausts the pool and raises timeouts. Match the semaphore to the pool, not the pool to the semaphore. The same logic ties a semaphore to socket and keepalive limits described in connection pooling and keepalive.

Three sizing rules:

Bound to the scarcest resource. Connections, file descriptors, or a vendor's documented rate limit — whichever is smallest sets the count.
Leave headroom for retries. If failed work is retried on the same semaphore, a burst of retries competes for the same slots; size for steady-state plus retry overhead, or use a separate semaphore for retries.
Make it observable. A semaphore at its limit with a growing waiter queue is invisible unless you measure it (see the Diagnostic Hook below).
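The retry rule can be sketched as two semaphores whose counts together stay within the downstream limit. Everything here is an assumption made for illustration: the `call_with_retry` helper, the choice of `ConnectionError` as the retryable error, and the 8 + 2 split of a hypothetical 10-connection pool.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def call_with_retry(
    op: Callable[[], Awaitable[int]],
    primary: asyncio.Semaphore,
    retry: asyncio.Semaphore,
    attempts: int = 3,
) -> int:
    """First attempt uses the primary budget; retries draw from a smaller one."""
    last_error: Exception | None = None
    for attempt in range(attempts):
        sem = primary if attempt == 0 else retry
        async with sem:
            try:
                return await op()
            except ConnectionError as exc:     # assumed retryable error type
                last_error = exc
    raise last_error if last_error else RuntimeError("attempts must be >= 1")


async def retry_demo() -> int:
    pool_size = 10                              # hypothetical downstream limit
    primary = asyncio.Semaphore(pool_size - 2)  # steady-state slots
    retry = asyncio.Semaphore(2)                # retry slots; the two sum to pool_size
    calls = {"n": 0}

    async def flaky() -> int:
        calls["n"] += 1
        await asyncio.sleep(0)
        if calls["n"] == 1:
            raise ConnectionError("transient")  # fail once, then succeed
        return 200

    return await call_with_retry(flaky, primary, retry)


if __name__ == "__main__":
    print(asyncio.run(retry_demo()))
```

With this split, a burst of retries can occupy at most two downstream slots, so first attempts keep most of the capacity even while the downstream is flapping.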

A frequent anti-pattern is treating the semaphore count as a performance dial to be turned up when throughput is low. It rarely is one. If a Semaphore(10) in front of a 10-connection pool shows constant waiters, raising it to 50 does not create 40 more connections — it just relocates the queue from the (observable) semaphore into the (opaque) pool checkout, where requests now wait on pool.acquire() and eventually time out. Throughput is set by the slowest real resource in the chain; the semaphore's only job is to make sure you queue politely in front of it rather than overwhelming it. When you genuinely need more throughput, raise the downstream limit first and the semaphore second, in that order. The same chain reasoning governs database driver pools, where over-subscription manifests as checkout timeouts rather than refused connections.

Integrated Example: Bounded Workers with Event-Based Shutdown

This combines a Semaphore-bounded worker set draining an asyncio.Queue, an Event for cooperative shutdown, and a contention probe. It demonstrates the production shape: caps on concurrency, a clean signal to stop, and instrumentation that surfaces back-pressure.

import asyncio
import time


class BoundedWorkerSystem:
    def __init__(self, concurrency: int = 5) -> None:
        self.queue: asyncio.Queue[int] = asyncio.Queue(maxsize=1000)
        self.sem = asyncio.Semaphore(concurrency)
        self.shutdown = asyncio.Event()
        self._wait_ns = 0          # cumulative time spent waiting to acquire a slot
        self._acquired = 0         # number of acquisitions

    async def _handle(self, item: int) -> None:
        start = time.perf_counter_ns()
        async with self.sem:                       # concurrency cap
            # Time from requesting a slot to being admitted (wait time, not hold time).
            self._wait_ns += time.perf_counter_ns() - start
            self._acquired += 1
            await asyncio.sleep(0.05)              # stand-in for I/O

    async def worker(self, wid: int) -> None:
        while not self.shutdown.is_set():
            try:
                item = await asyncio.wait_for(self.queue.get(), timeout=0.2)
            except asyncio.TimeoutError:           # same class as TimeoutError on 3.11+
                continue                           # re-check shutdown flag
            try:
                await self._handle(item)
            finally:
                self.queue.task_done()

    async def probe(self) -> None:
        """Diagnostic Hook: report waiter pressure and mean acquire latency."""
        while not self.shutdown.is_set():
            await asyncio.sleep(1.0)
            # _waiters and _value are private; _waiters may be None before first use.
            waiters = len(self.sem._waiters) if self.sem._waiters else 0
            mean_wait_us = (self._wait_ns / self._acquired / 1000) if self._acquired else 0
            print(
                f"qsize={self.queue.qsize()} sem_value={self.sem._value} "
                f"sem_waiters={waiters} mean_acquire_us={mean_wait_us:.1f}"
            )

    async def run(self, items: int, workers: int = 8) -> None:
        tasks = [asyncio.create_task(self.worker(i)) for i in range(workers)]
        probe = asyncio.create_task(self.probe())
        for n in range(items):
            await self.queue.put(n)
        await self.queue.join()                    # wait for full drain
        self.shutdown.set()                        # broadcast stop to all
        await asyncio.gather(*tasks, probe)


asyncio.run(BoundedWorkerSystem(concurrency=5).run(items=200, workers=8))

The shutdown Event lets the producer signal every worker and the probe with one call. Each loop re-checks is_set(): the bounded wait_for on queue.get() guarantees a worker observes the flag within 200 ms even when the queue is empty, and the probe observes it at the end of its current one-second sleep.

Diagnostic Hook. The probe coroutine reads sem._value (free slots) and len(sem._waiters) (coroutines parked on acquire) once per second. A sustained sem_value == 0 with a non-zero, growing sem_waiters is the unambiguous signature of a concurrency cap that is too tight, or of a downstream that has slowed. Export both as gauges. Pair them with acquire latency: mean_acquire_us is the average time a coroutine waited for a slot, not the time it held one. If it climbs while the cap and the arrival rate are unchanged, slots are being held longer, which points at slow work inside the critical section rather than at the cap itself. Confirming that requires stamping hold time separately.

Diagnostic Hook: Detecting Contention and Deadlock

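The internal attributes are not public API, so they can change between Python versions, but in practice they are stable enough for diagnostics. Read them; don't mutate them. In recent CPython versions _waiters can be None until the first coroutine has parked, so guard the len() call.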

Contention: len(lock._waiters) for a Lock, or len(sem._waiters) for a Semaphore, is the live waiter count. Track its high-water mark per primitive. Track hold time by stamping acquire and release; a long mean hold is a critical section doing too much (often blocking I/O, which should be pushed to a worker, per hybrid concurrency models).
Deadlock signs: the loop is alive (asyncio.all_tasks() returns tasks) but no task makes progress, and waiter queues are non-empty and static. Two locks acquired in opposite orders by two coroutines is the textbook cause; enforce a global lock-ordering or collapse to a single lock.
Hangs: an Event whose wait()-ers never wake means set() was never reached — usually on an error path that skipped the signal. Always set() shutdown/error events in a finally.
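Stamping acquire and release, as the contention item suggests, can be done without touching private attributes. The sketch below is a hypothetical wrapper (`TimedSemaphore` and its `slot()` method are not part of asyncio) that accumulates wait time and hold time as separate counters.

```python
import asyncio
import time
from collections.abc import AsyncIterator
from contextlib import asynccontextmanager


class TimedSemaphore:
    """Hypothetical wrapper: records wait time and hold time separately."""

    def __init__(self, limit: int) -> None:
        self._sem = asyncio.Semaphore(limit)
        self.wait_ns = 0                       # time spent parked before admission
        self.hold_ns = 0                       # time spent inside the critical section
        self.acquisitions = 0

    @asynccontextmanager
    async def slot(self) -> AsyncIterator[None]:
        requested = time.perf_counter_ns()
        async with self._sem:
            acquired = time.perf_counter_ns()
            self.wait_ns += acquired - requested
            self.acquisitions += 1
            try:
                yield
            finally:
                self.hold_ns += time.perf_counter_ns() - acquired


async def timing_demo() -> TimedSemaphore:
    sem = TimedSemaphore(limit=1)

    async def job() -> None:
        async with sem.slot():
            await asyncio.sleep(0.02)          # stand-in for work under the cap

    await asyncio.gather(*(job() for _ in range(3)))
    return sem


if __name__ == "__main__":
    s = asyncio.run(timing_demo())
    print(s.acquisitions, s.wait_ns // 1_000_000, s.hold_ns // 1_000_000)
```

Rising `wait_ns` with flat `hold_ns` suggests the cap is the constraint. Rising `hold_ns` suggests the work inside the section has slowed.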

Failure Modes

Failure mode	Root cause	Detection	Fix
Race on shared state	Used an `asyncio.Lock` to guard state touched from a worker thread; the lock only excludes coroutines, not threads	Intermittent corruption that correlates with thread-pool use; clean under single-thread load	Use `threading.Lock` or a thread-safe `queue.Queue`; see sharing state between tasks and threads
Semaphore leak	`acquire()` without a matching `release()` on an exception path (manual acquire, no `async with`)	`sem._value` trends down over time; effective concurrency shrinks toward zero	Always use `async with sem:`; for manual use, release in `finally`; switch to `BoundedSemaphore` to catch the inverse error
Hang on Event never set	`event.set()` lives only on the happy path; an early return or exception skips it	Waiters parked forever in `event.wait()`; loop alive, task stuck	Move `set()` into a `finally` so failures still release waiters
Missed wakeup (Condition)	`notify()` fired before the waiter reached `wait()`, or `if` used instead of `while` to test the predicate	A consumer sleeps despite state satisfying its predicate	Hold the lock around state change and `notify`; always re-test the predicate in a `while` loop
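The fix in the "Hang on Event never set" row looks like this in practice. The sketch is illustrative: `load_config`, `consumer`, and the shared `state` dictionary are hypothetical, and the failure is simulated.

```python
import asyncio


async def load_config(ready: asyncio.Event, state: dict[str, object]) -> None:
    try:
        await asyncio.sleep(0.01)
        raise RuntimeError("config source unreachable")   # simulated failure
    except RuntimeError as exc:
        state["error"] = exc                   # record why startup failed
    finally:
        ready.set()                            # runs on success, failure, and cancellation


async def consumer(ready: asyncio.Event, state: dict[str, object]) -> str:
    await ready.wait()                         # always released, even on failure
    return "failed" if "error" in state else "ok"


async def finally_demo() -> str:
    ready = asyncio.Event()
    state: dict[str, object] = {}
    loader = asyncio.create_task(load_config(ready, state))
    result = await asyncio.wait_for(consumer(ready, state), timeout=1.0)
    await loader
    return result


if __name__ == "__main__":
    print(asyncio.run(finally_demo()))         # failed
```

Because the Event carries no payload, waiters need a separate place (here, `state`) to learn whether the signal means success or failure.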

Frequently Asked Questions

Are asyncio synchronization primitives thread-safe?

No. asyncio.Lock, Semaphore, Event, and Condition are loop-affine: they coordinate coroutines running on a single event loop and are not safe to call from other OS threads. To share state with threads, use threading primitives or a thread-safe queue.Queue, or marshal calls back to the loop with loop.call_soon_threadsafe().
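As a minimal sketch of marshalling a call back to the loop, the example below has an OS thread signal an asyncio.Event through `loop.call_soon_threadsafe()` rather than calling `set()` directly. The function names are hypothetical; the thread does no real work.

```python
import asyncio
import threading


def background_work(loop: asyncio.AbstractEventLoop, done: asyncio.Event) -> None:
    # Runs in an OS thread. Calling done.set() directly here would be a race;
    # call_soon_threadsafe() schedules it to run on the loop's own thread.
    loop.call_soon_threadsafe(done.set)


async def thread_demo() -> bool:
    loop = asyncio.get_running_loop()
    done = asyncio.Event()
    worker = threading.Thread(target=background_work, args=(loop, done))
    worker.start()
    await asyncio.wait_for(done.wait(), timeout=1.0)
    worker.join()                              # thread has already finished its work
    return done.is_set()


if __name__ == "__main__":
    print(asyncio.run(thread_demo()))          # True
```

The same pattern applies to any primitive method: pass the bound method to `call_soon_threadsafe()` so it runs on the loop's thread.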

When do I actually need an asyncio.Lock?

Only when a critical section awaits. Code that runs straight through without yielding is already atomic under cooperative scheduling, so no lock is needed. A Lock matters when the shared region suspends at an await and another coroutine could observe or mutate half-updated state in that gap.

What is the difference between Semaphore and BoundedSemaphore?

A plain Semaphore allows release() to raise its internal counter above the initial value, silently inflating your concurrency cap if acquire/release accounting drifts. BoundedSemaphore raises ValueError immediately when a release would exceed the original count, turning a latent leak into a loud, locatable failure.

Should I use an asyncio.Condition or an asyncio.Queue?

Prefer a Queue. An asyncio.Queue gives you correct, backpressure-aware producer-consumer semantics out of the box, which is what you would otherwise hand-build with a Condition. Reach for Condition only when your wait predicate is richer than 'is there an item', for example 'are there at least K items' or 'is the system in state X'.

How do I size an asyncio.Semaphore?

Bind the count to the scarcest downstream resource — the connection pool size, file-descriptor ceiling, or a vendor rate limit — not to how much parallelism the loop could sustain. Leave headroom for retries and export sem._value and len(sem._waiters) as gauges so a too-tight cap is visible.

Asyncio Fundamentals & Event Loop Architecture — up to the overview for how the loop schedules and wakes future-based waiters.
Choosing asyncio Lock vs Semaphore vs Event — a symptom-to-primitive decision guide when you are unsure which one to reach for.
Hybrid concurrency models — what to use instead of these primitives when state is shared with threads or processes.
Connection pooling and keepalive — the downstream limits that should set your semaphore size.
Cancellation patterns — safe teardown when a coroutine is cancelled while parked on a lock or condition.