Context
You may experience PostgreSQL connection pool contention that results in 503 errors, high latency (up to 50 seconds), and requests being queued even when the pool is configured with a high maximum size. This typically occurs when the connection pool appears stuck at a low number of connections despite being configured for much higher limits, and can affect both thread search operations and overall system performance.
Answer
This issue is commonly caused by mismatched worker pool and database pool configurations, along with insufficient database resources. Here's how to resolve it:
1. Configure Environment Variables
Add these environment variables to your values.yaml file:
```yaml
apiServer:
  deployment:
    replicaCount: 3
    extraEnv:
      # Worker pool must match DB pool size
      - name: ASYNC_WORKER_POOL_SIZE
        value: "150"
      - name: THREAD_POOL_SIZE
        value: "150"
      - name: N_JOBS_PER_WORKER
        value: "10"
      # Database pool scaling
      - name: POSTGRES_POOL_MAX_SIZE
        value: "150"
      - name: POSTGRES_POOL_MIN_SIZE
        value: "10"
      # Query timeouts (milliseconds)
      - name: POSTGRES_STATEMENT_TIMEOUT
        value: "30000"
      - name: POSTGRES_IDLE_IN_TRANSACTION_TIMEOUT
        value: "60000"
    readinessProbe:
      exec:
        command:
          - /bin/sh
          - -c
          - exec python /api/healthcheck.py
      initialDelaySeconds: 30
      timeoutSeconds: 10
      failureThreshold: 5
    livenessProbe:
      exec:
        command:
          - /bin/sh
          - -c
          - exec python /api/healthcheck.py
      initialDelaySeconds: 60
      periodSeconds: 30
      timeoutSeconds: 10
      failureThreshold: 6
```

Apply the changes using:
```shell
helm upgrade your-deployment-name <path-to-your-chart> \
  -f values.yaml \
  -n your-namespace \
  --wait
```

2. Scale Your Database Resources
Ensure your PostgreSQL instance has sufficient resources:
- Use a larger database instance size (refer to the scaling documentation for recommended specifications)
- Ensure your database can handle the maximum number of connections (e.g., Aurora RDS db.r6g.xlarge supports up to 2000 connections)
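As a quick back-of-the-envelope check, total connection demand is the per-pod pool size multiplied by the number of API server replicas; the numbers below are illustrative, taken from the configuration shown above:

```python
# Each API server pod can open up to POSTGRES_POOL_MAX_SIZE connections,
# so total demand is replicas * per-pod pool size.
replica_count = 3          # apiServer.deployment.replicaCount
pool_max_per_pod = 150     # POSTGRES_POOL_MAX_SIZE
db_max_connections = 2000  # e.g. Aurora db.r6g.xlarge

total_demand = replica_count * pool_max_per_pod
print(total_demand)                       # 450
print(total_demand < db_max_connections)  # True: comfortable headroom
```

If this check fails for your numbers, either lower the per-pod pool size or move to a larger instance.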
3. Remove Read Replicas
LangGraph and LangSmith do not support read-only database endpoints. If you're using Aurora RDS with read replicas:
- Remove read replicas, as they are not utilized
- Use only the write endpoint
This will also reduce unnecessary resource overhead.
4. Scale API Servers
Instead of using multiple uvicorn workers on a single API server:
- Deploy multiple API server instances
- Use the default single uvicorn worker per pod configuration
This provides better resource distribution and fault tolerance.
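In values.yaml terms, scaling out means raising the replica count rather than adding workers per pod; this is a sketch, and the value of 5 here is only an example:

```yaml
apiServer:
  deployment:
    replicaCount: 5   # scale out with more pods, each running the default single uvicorn worker
```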
5. Optimize Thread Search Performance
For high-volume thread search operations:
- Implement pagination with reasonable limits
- Use more granular filtering in your search queries
- Ensure proper indexing on the metadata fields you filter by
Key Understanding: The connection pool issue occurs when the job limit (N_JOBS_PER_WORKER) is reached before the database pool can scale up. The worker pool settings must be aligned with database pool settings to allow proper scaling.
These configurations apply to both API servers and queue workers when using the combined API/Queue pattern, as they share the same connection pool.
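The alignment rule can be expressed as a simple invariant; this is purely illustrative (the names mirror the environment variables above, but the check is not part of the product):

```python
# Illustrative configuration mirroring the environment variables above.
config = {
    "ASYNC_WORKER_POOL_SIZE": 150,
    "THREAD_POOL_SIZE": 150,
    "POSTGRES_POOL_MAX_SIZE": 150,
    "POSTGRES_POOL_MIN_SIZE": 10,
}

def pools_aligned(cfg):
    """Worker-side concurrency must not exceed the DB pool ceiling."""
    worker_ceiling = max(cfg["ASYNC_WORKER_POOL_SIZE"], cfg["THREAD_POOL_SIZE"])
    return cfg["POSTGRES_POOL_MAX_SIZE"] >= worker_ceiling

print(pools_aligned(config))  # True for the values above
```

If the worker pool exceeds the database pool, workers queue waiting for connections, which is exactly the contention described in the Context section.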