## Context
Users running LangGraph deployments may encounter Redis memory-related issues, particularly in environments with high load or limited resources. These issues can manifest as unexplained latencies, failed runs, or Out of Memory (OOM) errors when Redis reaches its maximum memory limit.
## Answer
There are several steps you can take to resolve Redis memory issues in your LangGraph deployment:
**Check Deployment Type:** Ensure you're using the appropriate deployment type:

- Dev deployments have hard compute and memory constraints
- Production deployments auto-scale database size based on need and provide higher resource limits
**Redis Configuration:**

- For non-clustered Redis, set the environment variable `REDIS_CLUSTER=false`
- For enterprise/production use, enable clustering with `REDIS_CLUSTER=true`
**Memory Management:**

- Monitor Redis memory usage through your monitoring dashboard
- Configure TTL (time to live) settings in your `langgraph.json` to manage data retention
- Consider increasing Redis memory limits if you consistently hit capacity
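As a sketch of the TTL approach above: recent LangGraph Platform versions let you attach TTL policies to the checkpointer in `langgraph.json` so stale checkpoints are swept automatically instead of accumulating in Redis/Postgres. The exact schema can vary by platform version, so verify the field names below against your version's configuration reference before relying on them.

```json
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./agent.py:graph"
  },
  "checkpointer": {
    "ttl": {
      "strategy": "delete",
      "sweep_interval_minutes": 60,
      "default_ttl": 43200
    }
  }
}
```

Here `default_ttl` is in minutes (43200 ≈ 30 days) and `sweep_interval_minutes` controls how often expired data is cleaned up; shorter TTLs trade retention for a smaller steady-state memory footprint.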
**Important:** If you're running production workloads, it's strongly recommended to use a production-type deployment instead of a dev deployment. Dev deployments cannot be upgraded to production; you'll need to create a new production deployment.
**Immediate Resolution Steps for Memory Issues:**

- If possible, restart your Redis instance as a temporary measure
- Monitor memory usage patterns to identify memory leaks or usage spikes
- Consider using async calls for API operations to reduce memory pressure
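To illustrate the async suggestion above, here is a minimal stdlib-only sketch. `call_api` is a hypothetical stand-in for your real async client call; the point is processing results as they complete rather than buffering a large blocking batch, which keeps peak memory lower.

```python
import asyncio

async def call_api(i: int) -> str:
    # Stand-in for a real async API call (e.g. an HTTP request);
    # replace with your client's async method.
    await asyncio.sleep(0.01)
    return f"result-{i}"

async def main() -> list[str]:
    # Handle each result as soon as it finishes instead of waiting
    # for the whole batch: completed responses can be processed and
    # released early, reducing peak memory pressure.
    tasks = [asyncio.create_task(call_api(i)) for i in range(5)]
    results = []
    for finished in asyncio.as_completed(tasks):
        results.append(await finished)
    return results

if __name__ == "__main__":
    print(asyncio.run(main()))
```

The same pattern applies to any batched API work: create tasks, then drain them with `asyncio.as_completed` so memory tracks in-flight work rather than total batch size.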
## Advanced Resource Management and Compatibility
### Understanding LangGraph Container Resource Constraints

Deployments run with the following resource profile:

- 2 CPU cores and 2 GB RAM per container
- Autoscaling up to 10 containers based on 75% CPU/memory utilization targets

Monitor these metrics under Deployments → Monitoring to review CPU, memory, and pending runs.
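Alongside the dashboard, a small client-side check against the 75% autoscaling target can flag containers that are about to trigger a scale-up. This helper is hypothetical (not part of any LangGraph API); only the 2 GB limit and 75% target come from the constraints above.

```python
SCALE_TARGET = 0.75  # CPU/memory autoscaling target described above

def approaching_limit(used_mb: float, limit_mb: float = 2048.0,
                      target: float = SCALE_TARGET) -> bool:
    """Return True when memory utilization meets or exceeds the target."""
    return (used_mb / limit_mb) >= target

# 1600 MB of the 2048 MB container limit is ~78%, above the 75% target.
print(approaching_limit(1600))  # True
print(approaching_limit(1000))  # False (~49%)
```

Feeding this from your own metrics pipeline (e.g. values scraped from Redis `INFO memory` or container stats) gives an early warning before pending runs start piling up.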
### Preventing Memory Issues Through Workload Management
**Configure Concurrent Operations:**

- Set the `N_JOBS_PER_WORKER` environment variable to limit concurrent runs per worker (the default is 10)
- Reduce this value if parallel operations are causing resource pressure
- Add delays between task submissions when running many parallel tasks
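The two client-side tactics above (capping concurrency and staggering submissions) can be sketched with a semaphore plus a submission delay. `run_task` is a hypothetical placeholder for whatever submits a run to your deployment; the cap simply reuses `N_JOBS_PER_WORKER` as a sensible client-side default.

```python
import asyncio
import os

# Cap in-flight work with a semaphore and stagger submissions with a
# small delay, mirroring the server-side N_JOBS_PER_WORKER limit.
MAX_IN_FLIGHT = int(os.environ.get("N_JOBS_PER_WORKER", "10"))
SUBMIT_DELAY = 0.05  # seconds between submissions; tune per workload

async def run_task(i: int) -> int:
    await asyncio.sleep(0.01)  # placeholder for the real run submission
    return i

async def submit_all(n: int) -> list[int]:
    sem = asyncio.Semaphore(MAX_IN_FLIGHT)

    async def bounded(i: int) -> int:
        async with sem:  # never more than MAX_IN_FLIGHT running at once
            return await run_task(i)

    tasks = []
    for i in range(n):
        tasks.append(asyncio.create_task(bounded(i)))
        await asyncio.sleep(SUBMIT_DELAY)  # stagger submissions
    return await asyncio.gather(*tasks)

if __name__ == "__main__":
    print(asyncio.run(submit_all(20)))
```

Throttling at the client keeps bursts from stacking up as pending runs on the server, which is often what precedes a Redis memory spike.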
### Critical Compatibility Requirements
- Use Redis instead of Valkey: AWS ElastiCache with Valkey can cause hanging requests and connection issues
- If you experience unexplained hangs with Valkey, switch to Redis for reliable operation
- This incompatibility can manifest as silent failures that are difficult to diagnose
## Proactive Monitoring and Prevention
**Resource Pattern Analysis:**

- Track the correlation between pending runs and memory spikes
- Monitor autoscaling triggers to understand when you're approaching limits
- Identify optimal `N_JOBS_PER_WORKER` values based on your workload patterns
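One way to turn workload observations into a starting value: size `N_JOBS_PER_WORKER` from observed peak memory per run so concurrent runs stay under the 75% utilization target of a 2 GB container. This is a back-of-envelope heuristic of my own, not an official formula; validate any suggested value against real monitoring data.

```python
def suggest_n_jobs(peak_mb_per_run: float,
                   container_mb: float = 2048.0,
                   target: float = 0.75,
                   default: int = 10) -> int:
    # Hypothetical sizing heuristic: how many runs fit in the memory
    # budget before the autoscaling target is reached, clamped to the
    # documented default of 10 on the high end and 1 on the low end.
    budget = container_mb * target         # 1536 MB budget at defaults
    fit = int(budget // peak_mb_per_run)   # runs that fit in the budget
    return max(1, min(default, fit))

print(suggest_n_jobs(120))   # 12 would fit; clamped to the default 10
print(suggest_n_jobs(400))   # 1536 // 400 = 3
print(suggest_n_jobs(4000))  # doesn't fit at all; floor at 1
```

Start from the suggested value, then watch the pending-runs and memory panels: if spikes persist, lower it; if containers sit well under target, raise it gradually.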