Context
When scaling your platform to a higher number of instances (e.g., 32), you may encounter PostgreSQL "too many connections" or "too many clients already" errors. This occurs because each backend pod establishes multiple database connections, and the total number of connections can exceed PostgreSQL's connection limit.
Answer
This issue happens when your application pods try to establish more database connections than PostgreSQL allows. Each backend pod uses connection pooling with a default pool size, so when you run many pods (backend, platform-backend, and queue pods), the total number of connections can exceed the database limit. **Note:** For LangGraph API deployments, the default `LANGGRAPH_POSTGRES_POOL_MAX_SIZE` is 150 connections per replica, which can quickly exhaust database connections when scaling.
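To see why scaling exhausts connections, here is a rough worst-case budget calculation (a sketch; the pod counts and pool size below are illustrative, not taken from your cluster):

```python
def total_connections(pods_by_type: dict[str, int], pool_max_size: int) -> int:
    """Worst-case estimate: every pod fills its connection pool."""
    return sum(pods_by_type.values()) * pool_max_size

# Illustrative numbers: 32 instances of each pod type, pool size 3.
pods = {"backend": 32, "platform-backend": 32, "queue": 32}
print(total_connections(pods, pool_max_size=3))  # 288, well above PostgreSQL's default max_connections of 100
```

Even a small per-pod pool multiplies quickly once replica counts grow.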
Short-term solutions:
1. Increase PostgreSQL connection limit:

```sql
ALTER SYSTEM SET max_connections = 200;
SELECT pg_reload_conf();
```

2. Reduce connection pool size per pod:

For LangGraph API deployments, set `LANGGRAPH_POSTGRES_POOL_MAX_SIZE`:

```
# Example: for 100 max_connections with buffer
LANGGRAPH_POSTGRES_POOL_MAX_SIZE=40
```

For other deployments, set `ASYNCPG_POOL_MAX_SIZE`. First inspect the current settings:

```bash
kubectl get pods -l app.kubernetes.io/name=langsmith
kubectl exec -it <any-backend-pod> -- env | grep ASYNCPG
kubectl exec -it <any-platform-backend-pod> -- env | grep ASYNCPG
```

Then modify your helm values.yaml:

```yaml
queue:
  deployment:
    extraEnv:
      - name: "ASYNCPG_POOL_MAX_SIZE"
        value: "2"
```

3. Add connection management:
Set ASYNCPG_POOL_MIN_SIZE = 1 to reduce idle connections and review PostgreSQL connection timeouts.
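The minimum pool size can be set with the same `extraEnv` pattern (a sketch against the `queue` deployment; apply the equivalent to the other deployments as needed):

```yaml
queue:
  deployment:
    extraEnv:
      - name: "ASYNCPG_POOL_MIN_SIZE"
        value: "1"
```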
Long-term recommendations:
Use external PostgreSQL instead of in-cluster PostgreSQL, as managed services handle connection limits better and are more production-ready
Implement connection pooling with PgBouncer
Consider adding PostgreSQL read replicas to distribute read load
Implement connection retry logic with exponential backoff on the application side
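The retry recommendation above can be sketched as follows (a hypothetical helper, not part of any library here; `connect` stands in for a wrapper around your driver's connect call):

```python
import random
import time

def connect_with_backoff(connect, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry `connect()` with exponential backoff and jitter.

    `connect` is any callable that raises on failure (hypothetical
    stand-in for your database driver's connect call).
    """
    for attempt in range(max_attempts):
        try:
            return connect()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            # Double the delay each attempt, with up to 2x random jitter.
            delay = base_delay * (2 ** attempt) * (1 + random.random())
            sleep(delay)

# Example: a connect call that fails twice, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("too many clients already")
    return "connection"

print(connect_with_backoff(flaky, sleep=lambda _: None))  # connection
```

Jitter spreads retries out so that many pods recovering at once do not stampede the database simultaneously.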
The connection calculation varies by deployment type:
LangGraph API: Total connections = (number of replicas) × LANGGRAPH_POSTGRES_POOL_MAX_SIZE (default 150)
Other deployments: Total connections = (number of backend pods + platform-backend pods + queue pods) × ASYNCPG_POOL_MAX_SIZE
For example, 10 LangGraph API replicas with default settings can establish up to 1,500 connections. Ensure your total stays below PostgreSQL's max_connections setting, leaving buffer for superuser connections and overhead.
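To pick a safe per-pod pool size, you can invert the formula above (a sketch; the 20-connection superuser/overhead buffer is an assumption, adjust for your setup):

```python
def max_pool_size(max_connections: int, total_pods: int, buffer: int = 20) -> int:
    """Largest per-pod pool size keeping total connections within budget."""
    return (max_connections - buffer) // total_pods

# 10 replicas against max_connections=100 with a 20-connection buffer:
print(max_pool_size(100, 10))  # 8
```

The result is what you would set as `LANGGRAPH_POSTGRES_POOL_MAX_SIZE` (or `ASYNCPG_POOL_MAX_SIZE`) for that replica count.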