.. _production:

Production Guide
================

This guide collects the operational concerns that come up when running
``aiographql-client`` in a long-lived service: session lifecycle, connection
pool sizing, timeouts, retries, observability, and shutdown. Sections build on
:ref:`transport`, :ref:`authentication`, and :ref:`errors`; each one is
referenced where relevant rather than restated.

Session Lifecycle
-----------------

For any process that handles more than a handful of requests, provide your own
transport session and reuse it across the lifetime of the client. The client
does not close sessions it did not create; ownership stays with the caller.

The recommended shape is an application-scoped session bound to the application
lifecycle (FastAPI startup/shutdown, ``asyncio.run`` entry function,
``contextlib.AsyncExitStack``, etc.):

.. code-block:: python

   import aiohttp
   from aiographql.client import GraphQLClient

   async def main():
       async with aiohttp.ClientSession(
           connector=aiohttp.TCPConnector(limit=200),
       ) as session:
           client = GraphQLClient(
               endpoint="https://api.example.com/graphql",
               session=session,
           )
           await run_application(client)
       # Session closes here; in-flight requests are aborted.

For ``httpx``:

.. code-block:: python

   import httpx
   from aiographql.client import GraphQLClient

   async def main():
       async with httpx.AsyncClient(
           limits=httpx.Limits(max_connections=200, max_keepalive_connections=50),
       ) as session:
           client = GraphQLClient(
               endpoint="https://api.example.com/graphql",
               session=session,
           )
           await run_application(client)

Connection Pool Sizing
----------------------

The default ``aiohttp`` connector limit is 100. Two patterns push past it:

* High request concurrency. Size the pool above your steady-state request
  fan-out, leaving headroom for retries.
* Subscriptions. Each subscription holds an open WebSocket for its lifetime and
  counts against the pool. Add the expected concurrent subscription count to
  the request fan-out estimate.

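
The sizing rule above can be turned into a small calculation. The sketch below
is illustrative only: ``estimate_pool_limit`` and its ``retry_headroom``
parameter are hypothetical names, not part of ``aiographql-client``, and the
25% default headroom is an assumed starting point rather than a value the
library recommends.

.. code-block:: python

   import math


   def estimate_pool_limit(
       request_fanout: int,
       concurrent_subscriptions: int = 0,
       retry_headroom: float = 0.25,
   ) -> int:
       """Estimate a connector limit from expected concurrency (hypothetical helper)."""
       if request_fanout < 0 or concurrent_subscriptions < 0 or retry_headroom < 0:
           raise ValueError("inputs must be non-negative")
       # Headroom is applied to short-lived requests only, since those are retried.
       requests_with_headroom = math.ceil(request_fanout * (1 + retry_headroom))
       # Each subscription pins one connection for its whole lifetime.
       return requests_with_headroom + concurrent_subscriptions


   # Assumed workload: 120 concurrent requests and 30 open subscriptions.
   limit = estimate_pool_limit(request_fanout=120, concurrent_subscriptions=30)

The resulting number would then be passed as ``aiohttp.TCPConnector(limit=...)``
or ``httpx.Limits(max_connections=...)``; treat it as a first estimate to be
adjusted against observed queueing.
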

Set the limit on the connector for ``aiohttp`` or on ``Limits`` for ``httpx``.
Going too low causes new requests to queue; going too high wastes file
descriptors and lets a stalled backend retain more in-flight work.

Timeouts
--------

The library does not impose timeouts. Configure them on the session:

.. code-block:: python

   import aiohttp

   timeout = aiohttp.ClientTimeout(total=30, connect=5, sock_read=10)
   async with aiohttp.ClientSession(timeout=timeout) as session:
       ...

For ``httpx``:

.. code-block:: python

   import httpx

   timeout = httpx.Timeout(30.0, connect=5.0, read=10.0)
   async with httpx.AsyncClient(timeout=timeout) as session:
       ...

A ``total`` timeout caps end-to-end latency for a request. ``connect`` and
``read`` (or ``sock_read``) catch unresponsive peers without waiting out the
full budget. Note that ``httpx`` has no ``total`` setting: the first positional
argument to ``httpx.Timeout`` is the default applied to each phase (connect,
read, write, pool), so wrap the call in ``asyncio.wait_for`` if you need an
end-to-end cap there. Pick values that match the slowest acceptable user-facing
latency for the operation.

Retries
-------

The client does not retry. Wrap calls with ``tenacity`` (or any retry library)
and limit retries to transient failures only: connection errors, timeouts, and
HTTP 5XX. Do not retry validation errors or 4XX responses.

.. code-block:: python

   import asyncio

   from tenacity import (
       retry,
       retry_if_exception_type,
       stop_after_attempt,
       wait_exponential,
   )
   from aiographql.client.exceptions import GraphQLTransportException

   @retry(
       # Retry only transport-level failures and timeouts.
       retry=retry_if_exception_type((GraphQLTransportException, asyncio.TimeoutError)),
       wait=wait_exponential(multiplier=0.5, min=0.5, max=8),
       stop=stop_after_attempt(4),
       reraise=True,
   )
   async def query_with_retry(client, request):
       return await client.query(request)

For HTTP 5XX-specific retries, catch
:class:`~aiographql.client.GraphQLRequestException` and inspect
``exc.response.json`` to decide whether to retry.

Schema Validation
-----------------

Pre-flight validation catches programmer errors before they hit the network,
but it requires introspection on the server.

* **Development:** keep ``validate=True``. Bad queries fail fast with a clear
  message.
* **Production with introspection disabled:** set ``validate=False`` on the
  client. Surface server-side validation as
  :class:`~aiographql.client.GraphQLRequestException` or via
  ``response.errors``.
* **Production with introspection enabled:** consider setting ``schema_ttl`` on
  the client so the introspected schema refreshes periodically without
  restarting the service.

See :ref:`errors` for handling each layer.

Observability
-------------

Logging and tracing belong on the transport session, not on the GraphQL client.
Use ``aiohttp``'s :py:class:`~aiohttp.TraceConfig` or ``httpx``'s event hooks
to capture request/response metadata for every call, including ones that fail
before the GraphQL layer sees them.

Minimal ``aiohttp`` example:

.. code-block:: python

   import asyncio
   import logging

   import aiohttp
   from aiographql.client import GraphQLClient

   log = logging.getLogger("graphql")

   async def on_request_start(session, ctx, params):
       # Record the start time on the per-request trace context.
       ctx.start = asyncio.get_running_loop().time()

   async def on_request_end(session, ctx, params):
       elapsed = asyncio.get_running_loop().time() - ctx.start
       log.info(
           "graphql %s %s %.3fs",
           params.method, params.url, elapsed,
       )

   trace_config = aiohttp.TraceConfig()
   trace_config.on_request_start.append(on_request_start)
   trace_config.on_request_end.append(on_request_end)

   async with aiohttp.ClientSession(trace_configs=[trace_config]) as session:
       client = GraphQLClient(endpoint="...", session=session)

For tracing, install an OpenTelemetry instrumentation for the underlying
transport (``opentelemetry-instrumentation-aiohttp-client`` or
``opentelemetry-instrumentation-httpx``); spans cover every call the client
makes.

Graceful Shutdown
-----------------

The client itself is cheap to construct and discard. The expensive part is the
session. On shutdown:

1. Stop accepting new work that would call the client.
2. Cancel or drain any subscriptions
   (:meth:`GraphQLSubscription.unsubscribe`).
3. Await in-flight queries with a deadline.
4. Exit the ``async with`` block (or call ``await session.close()``) so the
   transport closes its connector and releases file descriptors.

Skipping step 4 in long-running services leaks connections. In short-lived
scripts the connections are released when the process exits, but the run still
produces ``Unclosed client session`` warnings.

Health Checks
-------------

A liveness probe should not call the upstream GraphQL endpoint. A readiness
probe can, but should issue a minimal query (often ``{ __typename }``) and
treat any :class:`~aiographql.client.GraphQLClientException` subclass as "not
ready" without bringing the pod down.

.. code-block:: python

   import asyncio

   async def graphql_ready(client) -> bool:
       try:
           await asyncio.wait_for(
               client.query("{ __typename }"),
               timeout=2,
           )
       except Exception:
           # Client errors and timeouts both mean "not ready".
           return False
       return True

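
To see how the readiness probe behaves without a live endpoint, the sketch
below runs the same ``graphql_ready`` logic against stand-in clients.
``FakeClient`` and its ``delay`` and ``fail`` parameters are hypothetical test
doubles, not part of ``aiographql-client``, and the ``timeout`` argument is
added here only so the slow case finishes quickly.

.. code-block:: python

   import asyncio


   class FakeClient:
       """Hypothetical stand-in exposing only the ``query`` coroutine the probe uses."""

       def __init__(self, delay: float = 0.0, fail: bool = False):
           self.delay = delay
           self.fail = fail

       async def query(self, request):
           await asyncio.sleep(self.delay)
           if self.fail:
               raise RuntimeError("simulated upstream failure")
           return {"data": {"__typename": "Query"}}


   async def graphql_ready(client, timeout: float = 2) -> bool:
       # Same logic as the probe above, with the deadline made configurable.
       try:
           await asyncio.wait_for(client.query("{ __typename }"), timeout=timeout)
       except Exception:
           return False
       return True


   async def demo():
       healthy = await graphql_ready(FakeClient())
       failing = await graphql_ready(FakeClient(fail=True))
       # The simulated upstream takes longer than the probe's deadline.
       slow = await graphql_ready(FakeClient(delay=0.2), timeout=0.05)
       return healthy, failing, slow


   results = asyncio.run(demo())

Only the healthy client reports ready; the failing and slow clients both map
to "not ready", and neither raises out of the probe.
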