Production Guide¶
This guide collects the operational concerns that come up when running
aiographql-client in a long-lived service: session lifecycle, connection
pool sizing, timeouts, retries, observability, and shutdown.
Sections build on Configuring Transport, Authentication, and Errors and Exceptions; each one is referenced where relevant rather than restated.
Session Lifecycle¶
For any process that handles more than a handful of requests, provide your own transport session and reuse it across the lifetime of the client. The client does not close sessions it did not create; ownership stays with the caller.
The recommended shape is an application-scoped session bound to the
application lifecycle (FastAPI startup/shutdown, asyncio.run entry
function, contextlib.AsyncExitStack, etc.):
import aiohttp
from aiographql.client import GraphQLClient

async def main():
    async with aiohttp.ClientSession(
        connector=aiohttp.TCPConnector(limit=200),
    ) as session:
        client = GraphQLClient(
            endpoint="https://api.example.com/graphql",
            session=session,
        )
        await run_application(client)
    # Session closes here; in-flight requests are aborted.
For httpx:
import httpx
from aiographql.client import GraphQLClient

async def main():
    async with httpx.AsyncClient(
        limits=httpx.Limits(max_connections=200, max_keepalive_connections=50),
    ) as session:
        client = GraphQLClient(
            endpoint="https://api.example.com/graphql",
            session=session,
        )
        await run_application(client)
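When several resources share the application lifecycle, contextlib.AsyncExitStack keeps ownership in one place. The following is a minimal sketch of that shape, not a prescribed layout. FakeSession is a hypothetical stand-in for aiohttp.ClientSession or httpx.AsyncClient so that the example runs without network access, and AppResources is an illustrative name.

```python
import asyncio
from contextlib import AsyncExitStack


class FakeSession:
    """Hypothetical stand-in for aiohttp.ClientSession / httpx.AsyncClient."""

    def __init__(self):
        self.closed = False

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.closed = True


class AppResources:
    """Owns the transport session for the lifetime of the application."""

    async def __aenter__(self):
        self._stack = AsyncExitStack()
        await self._stack.__aenter__()
        # A real service would also build GraphQLClient(session=self.session) here.
        self.session = await self._stack.enter_async_context(FakeSession())
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Unwinds everything registered on the stack, in reverse order.
        return await self._stack.__aexit__(exc_type, exc, tb)


async def demo():
    async with AppResources() as resources:
        open_during_run = not resources.session.closed
    return open_during_run, resources.session.closed
```

The session stays open while the application body runs and is closed exactly once when the outer block exits, which matches the ownership rule above: whoever creates the session closes it.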
Connection Pool Sizing¶
The default aiohttp connector limit is 100. Two patterns push past it:
High request concurrency. Size the pool above your steady-state request fan-out, leaving headroom for retries.
Subscriptions. Each subscription holds an open WebSocket for its lifetime and counts against the pool. Add the expected concurrent subscription count to the request fan-out estimate.
Set the limit on the connector for aiohttp or on Limits for
httpx. Going too low causes new requests to queue; going too high wastes
file descriptors and lets a stalled backend retain more in-flight work.
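The sizing rule above can be written down as a small calculation. This is only a sketch: the function name, the 25% retry headroom, and the sample numbers are assumptions for illustration, not recommendations from the library.

```python
import math


def estimate_pool_limit(request_fanout, subscriptions, retry_headroom=0.25):
    """Return a connection limit covering requests, retries, and subscriptions."""
    if request_fanout < 0 or subscriptions < 0 or retry_headroom < 0:
        raise ValueError("inputs must be non-negative")
    # Steady-state request concurrency plus headroom for retried requests.
    request_slots = math.ceil(request_fanout * (1 + retry_headroom))
    # Each subscription pins one connection for its whole lifetime.
    return request_slots + subscriptions


# 120 concurrent requests with 25% headroom, plus 30 long-lived subscriptions.
limit = estimate_pool_limit(request_fanout=120, subscriptions=30)
```

The result would then be passed as aiohttp.TCPConnector(limit=...) or httpx.Limits(max_connections=...).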
Timeouts¶
The library does not impose timeouts. Configure them on the session:
timeout = aiohttp.ClientTimeout(total=30, connect=5, sock_read=10)
async with aiohttp.ClientSession(timeout=timeout) as session:
    ...
For httpx:
timeout = httpx.Timeout(30.0, connect=5.0, read=10.0)
async with httpx.AsyncClient(timeout=timeout) as session:
    ...
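In aiohttp, total caps end-to-end latency for a request. httpx has no
total timeout: the first argument to httpx.Timeout is the default for
each phase (connect, read, write, pool), so the httpx example allows up to
30 seconds each for writing and for acquiring a pooled connection. connect
and read (or sock_read) catch unresponsive peers without waiting out the
full budget. Pick values that match the slowest acceptable user-facing
latency for the operation.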
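Because httpx applies its timeouts per phase, an end-to-end cap has to come from the caller when that transport is used. One way to do this, sketched below with the hypothetical helper call_with_deadline, is to wrap the call in asyncio.wait_for. The fast and slow coroutines stand in for client calls.

```python
import asyncio


async def call_with_deadline(make_call, total_seconds):
    """Await make_call() but give up after total_seconds overall."""
    # wait_for cancels the inner awaitable once the deadline passes.
    return await asyncio.wait_for(make_call(), timeout=total_seconds)


async def demo():
    async def fast():
        await asyncio.sleep(0.01)
        return "ok"

    async def slow():
        await asyncio.sleep(1)
        return "late"

    result = await call_with_deadline(fast, 0.5)
    try:
        await call_with_deadline(slow, 0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return result, timed_out
```

In a service the callable would be something like lambda: client.query(request). The deadline then covers connection acquisition, the request, and reading the response together.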
Retries¶
The client does not retry. Wrap calls with tenacity (or any retry
library) and limit retries to transient failures only: connection errors,
timeouts, and HTTP 5XX. Do not retry validation errors or 4XX responses.
import asyncio

from tenacity import (
    retry,
    retry_if_exception_type,
    stop_after_attempt,
    wait_exponential,
)
from aiographql.client import GraphQLRequestException
from aiographql.client.exceptions import GraphQLTransportException

@retry(
    # Retry only transient transport failures and timeouts.
    retry=retry_if_exception_type((GraphQLTransportException, asyncio.TimeoutError)),
    wait=wait_exponential(multiplier=0.5, min=0.5, max=8),
    stop=stop_after_attempt(4),
    reraise=True,  # surface the original exception after the last attempt
)
async def query_with_retry(client, request):
    return await client.query(request)
For HTTP 5XX-specific retries, catch GraphQLRequestException and inspect
exc.response.json to decide whether to retry.
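The retry policy described in this section (transient failures only, capped exponential backoff, bounded attempts) can also be sketched without a retry library. Everything named here is an assumption for illustration: UpstreamError is a hypothetical exception carrying an HTTP status, and is_transient and retry_transient are not part of aiographql-client or tenacity.

```python
import asyncio


class UpstreamError(Exception):
    """Hypothetical error carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"upstream returned {status}")
        self.status = status


def is_transient(exc):
    """Retry connection errors, timeouts, and 5XX; never 4XX."""
    if isinstance(exc, (ConnectionError, asyncio.TimeoutError)):
        return True
    return isinstance(exc, UpstreamError) and 500 <= exc.status < 600


async def retry_transient(call, attempts=4, base=0.5, cap=8.0, sleep=asyncio.sleep):
    """Await call() up to `attempts` times with capped exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except Exception as exc:
            # Give up on the last attempt or on any non-transient failure.
            if attempt == attempts or not is_transient(exc):
                raise
            # Delays grow as base, 2*base, 4*base, ... up to `cap`.
            await sleep(min(cap, base * 2 ** (attempt - 1)))
```

The sleep parameter is injectable so the backoff schedule can be checked in tests without waiting. With the defaults, the delays between the four attempts are 0.5, 1, and 2 seconds.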
Schema Validation¶
Pre-flight validation catches programmer errors before they hit the network, but it requires introspection on the server.
Development: keep validate=True. Bad queries fail fast with a clear message.
Production with introspection disabled: set validate=False on the client. Surface server-side validation as GraphQLRequestException or via response.errors.
Production with introspection enabled: consider setting schema_ttl on the client so the introspected schema refreshes periodically without restarting the service.
See Errors and Exceptions for handling each layer.
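The three cases above amount to a small decision table. The sketch below encodes it as a helper that returns keyword arguments for the client. The function name, the environment labels, and the 300 default are assumptions, and the unit of schema_ttl is assumed to be seconds.

```python
def validation_kwargs(environment, introspection_enabled, schema_ttl=300):
    """Pick validation settings for the client from the deployment context."""
    if environment == "development":
        # Fail fast on bad queries while developing.
        return {"validate": True}
    if not introspection_enabled:
        # No schema to validate against; rely on server-side errors.
        return {"validate": False}
    # schema_ttl unit is assumed to be seconds here.
    return {"validate": True, "schema_ttl": schema_ttl}
```

The returned dictionary would be unpacked into the client constructor alongside endpoint and session.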
Observability¶
Logging and tracing belong on the transport session, not on the GraphQL
client. Use aiohttp’s TraceConfig or
httpx’s event hooks to capture request/response metadata for every call,
including ones that fail before the GraphQL layer sees them.
Minimal aiohttp example:
import asyncio
import logging

import aiohttp
from aiographql.client import GraphQLClient

log = logging.getLogger("graphql")

async def on_request_start(session, ctx, params):
    # ctx is a per-request namespace shared by the trace callbacks.
    ctx.start = asyncio.get_running_loop().time()

async def on_request_end(session, ctx, params):
    elapsed = asyncio.get_running_loop().time() - ctx.start
    log.info(
        "graphql %s %s %.3fs",
        params.method,
        params.url,
        elapsed,
    )

trace_config = aiohttp.TraceConfig()
trace_config.on_request_start.append(on_request_start)
trace_config.on_request_end.append(on_request_end)

async with aiohttp.ClientSession(trace_configs=[trace_config]) as session:
    client = GraphQLClient(endpoint="...", session=session)
For tracing, install an OpenTelemetry instrumentation for the underlying
transport (opentelemetry-instrumentation-aiohttp-client or
opentelemetry-instrumentation-httpx); spans cover every call the client
makes.
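A comparable sketch for httpx uses its event hooks. It assumes the start time can be stashed in request.extensions under a key chosen here (started_at). The hook and builder names are illustrative. Note that the response hook runs when response headers arrive, before the body is read, and does not run for requests that fail before a response is received.

```python
import asyncio
import logging
import time

log = logging.getLogger("graphql")


async def on_request(request):
    # Stash the start time on the request so the response hook can read it.
    request.extensions["started_at"] = time.perf_counter()


async def on_response(response):
    request = response.request
    elapsed = time.perf_counter() - request.extensions["started_at"]
    log.info(
        "graphql %s %s %s %.3fs",
        request.method,
        request.url,
        response.status_code,
        elapsed,
    )


def build_session():
    # Imported lazily so the hooks can be exercised without httpx installed.
    import httpx

    return httpx.AsyncClient(
        event_hooks={"request": [on_request], "response": [on_response]},
    )
```

The object returned by build_session would be passed to the client as session, in the same way as in the aiohttp example.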
Graceful Shutdown¶
The client itself is cheap to construct and discard. The expensive part is the session. On shutdown:
1. Stop accepting new work that would call the client.
2. Cancel or drain any subscriptions (GraphQLSubscription.unsubscribe).
3. Await in-flight queries with a deadline.
4. Exit the async with block (or call await session.close()) so the transport closes its connector and releases file descriptors.
Skipping step 4 in long-running services leaks connections. In short-lived
scripts the operating system reclaims them at process exit, but the run
still produces Unclosed client session warnings.
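Awaiting in-flight queries with a deadline can be sketched with asyncio.wait. The helper name drain and the task bookkeeping are assumptions; a real service would track the tasks it spawns for client calls.

```python
import asyncio


async def drain(tasks, deadline_seconds):
    """Wait for in-flight tasks up to a deadline, then cancel the stragglers."""
    if not tasks:
        return set(), set()
    done, pending = await asyncio.wait(tasks, timeout=deadline_seconds)
    for task in pending:
        task.cancel()
    # Let cancelled tasks unwind before the session is closed.
    await asyncio.gather(*pending, return_exceptions=True)
    return done, pending


async def demo():
    fast = asyncio.create_task(asyncio.sleep(0.01, result="done"))
    slow = asyncio.create_task(asyncio.sleep(5))
    done, pending = await drain({fast, slow}, deadline_seconds=0.1)
    return fast in done, slow.cancelled()
```

Running drain before leaving the async with block lets the session close with no requests still using its connections.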
Health Checks¶
A liveness probe should not call the upstream GraphQL endpoint. A readiness
probe can, but should issue a minimal query (often { __typename }) and
treat any GraphQLClientException subclass, or a timeout, as “not ready”
without bringing the pod down.
import asyncio

async def graphql_ready(client) -> bool:
    try:
        await asyncio.wait_for(
            client.query("{ __typename }"),
            timeout=2,
        )
    except Exception:
        # Timeouts and client errors alike mean "not ready".
        return False
    return True
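If the readiness endpoint should also report why the upstream is unavailable, the probe can return a reason instead of a boolean. The sketch below is one possible variant. The probe function and the stub clients are hypothetical and exist only so the example runs without a server.

```python
import asyncio


class HealthyClient:
    """Stub client whose query succeeds immediately (hypothetical)."""

    async def query(self, request):
        return {"data": {"__typename": "Query"}}


class HangingClient:
    """Stub client simulating an unresponsive upstream (hypothetical)."""

    async def query(self, request):
        await asyncio.sleep(60)


class FailingClient:
    """Stub client simulating a client-side error (hypothetical)."""

    async def query(self, request):
        raise RuntimeError("upstream rejected the request")


async def probe(client, timeout=2.0):
    """Return "ready", "timeout", or "error" for a readiness endpoint."""
    try:
        await asyncio.wait_for(client.query("{ __typename }"), timeout=timeout)
    except asyncio.TimeoutError:
        return "timeout"
    except Exception:
        return "error"
    return "ready"
```

Only "ready" should map to a successful readiness response. The other two values are meant for logs and metrics.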