Summary
When deploying updates to a LangGraph application on LangSmith Deployment, existing threads with saved checkpoints may resume on the new version. This article explains what happens when state schemas change between versions and provides safe patterns for schema evolution.
How Checkpoint Restoration Works
When a thread resumes, LangGraph deserializes the saved checkpoint and restores the state. If your graph's state schema has changed since the checkpoint was created:
New fields in the schema won't exist in the checkpoint data
Removed fields in the checkpoint won't have a corresponding schema definition
Renamed fields appear as a removal + addition (old data orphaned)
Type changes may cause deserialization errors if incompatible
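The cases above can be made concrete by comparing the keys in a saved checkpoint with the fields of the current schema. This is a minimal sketch, not LangGraph's actual restore logic: `compare_checkpoint_to_schema`, `AgentStateV2`, and the sample `old_checkpoint` are hypothetical names, and the checkpoint is modeled as a plain dict of state values.

```python
from typing import TypedDict, get_type_hints


class AgentStateV2(TypedDict):
    messages: list
    context: str   # added in v2
    summary: str   # renamed from "notes" in v2


def compare_checkpoint_to_schema(checkpoint_values: dict, schema: type) -> dict:
    """Report how saved keys line up with the current schema's fields."""
    schema_fields = set(get_type_hints(schema))
    saved_fields = set(checkpoint_values)
    return {
        # Fields the new schema declares but the checkpoint never stored.
        "missing_from_checkpoint": sorted(schema_fields - saved_fields),
        # Keys the checkpoint stored but the new schema no longer declares.
        "orphaned_in_checkpoint": sorted(saved_fields - schema_fields),
    }


# Hypothetical state values saved by v1 of the graph.
old_checkpoint = {"messages": ["hi"], "notes": "draft"}

report = compare_checkpoint_to_schema(old_checkpoint, AgentStateV2)
print(report)
```

In this sketch the rename of `notes` to `summary` shows up exactly as described: `notes` is orphaned and `summary` is missing, alongside the newly added `context`.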
Schema Change Compatibility Matrix
Change Type | Safe | Behavior |
Add new field with default | Safe | Existing checkpoints get the default value |
Remove unused field | Safe | Old checkpoint data is ignored |
Rename field | Unsafe | Old field data is lost; new field gets default |
Change field type (incompatible) | Unsafe | Deserialization fails or type errors |
Add required field (no default) | Unsafe | Existing checkpoints will error |
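The difference between the first and last rows can be illustrated with a stand-in. The sketch below uses plain dataclasses in place of a real state schema, so it only approximates what happens during checkpoint restoration; `StateWithDefault`, `StateWithRequired`, and `old_checkpoint` are hypothetical names. A Pydantic state model would be expected to raise a validation error, not a `TypeError`, in the analogous case.

```python
from dataclasses import dataclass, field


@dataclass
class StateWithDefault:
    messages: list = field(default_factory=list)
    context: str = ""   # new field with a default


@dataclass
class StateWithRequired:
    messages: list
    context: str        # new field without a default


# Hypothetical values saved before "context" existed.
old_checkpoint = {"messages": ["hi"]}

# With a default, the old data loads and the new field is filled in.
restored = StateWithDefault(**old_checkpoint)
print(restored)

# Without a default, constructing the state from old data fails.
try:
    StateWithRequired(**old_checkpoint)
    required_ok = True
except TypeError as exc:
    required_ok = False
    print(f"restore failed: {exc}")
```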
Safe Patterns
Adding new fields - always provide a default and use defensive access:
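from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    messages: Annotated[list, add_messages]
    context: str  # New field in v2; absent from checkpoints saved by v1

def my_node(state: AgentState):
    # Use .get() with default for backwards compatibility
    context = state.get("context", "")

Removing fields - safe, old checkpoint data is ignored.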
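Renames are listed as unsafe because the old data is orphaned. One possible way to bridge a rename is to read the new name first and fall back to the old one. This sketch assumes the old field is kept in the schema for a transition period so that its saved value is still restored. The field names `notes` and `summary` and the helper `read_summary` are hypothetical.

```python
from typing import TypedDict


class AgentState(TypedDict, total=False):
    messages: list
    notes: str     # old name, kept temporarily so old checkpoints still load it
    summary: str   # new name introduced in v2


def read_summary(state: AgentState) -> str:
    """Prefer the new field; fall back to the old one, then to a default."""
    if "summary" in state:
        return state["summary"]
    return state.get("notes", "")


def my_node(state: AgentState) -> dict:
    # Write the value back under the new name so later steps
    # no longer depend on the fallback.
    return {"summary": read_summary(state)}


old_state: AgentState = {"messages": [], "notes": "draft"}
new_state: AgentState = {"messages": [], "summary": "final"}
print(my_node(old_state))
print(my_node(new_state))
```

Once no remaining threads carry the old field, it can be removed from the schema like any other unused field.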
Handling type changes with Pydantic - use validators for existing checkpoints:
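from datetime import date
from pydantic import BaseModel, field_validator

class AgentState(BaseModel):
    due_date: date

    @field_validator('due_date', mode='before')
    @classmethod
    def parse_date(cls, v):
        # Old checkpoints stored the date as a [year, month, day] list
        if isinstance(v, list):
            return date(*v)  # Handle old format
        return v

Interrupted Threads and Node Changes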
Threads that are interrupted (paused mid-execution) have additional constraints. An interrupted thread may be waiting to enter a specific node. If that node is renamed or removed, the thread cannot resume.
Completed threads can survive topology changes (node renames, additions, removals). Interrupted threads cannot.
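A pre-deployment check for this constraint might look like the sketch below. It assumes you have already collected, for each thread, the names of the nodes it is waiting to enter (in LangGraph, the state snapshot returned by `get_state` exposes a `next` tuple, which is the kind of value assumed here). How that data is gathered is outside the sketch, and `find_stranded_threads`, `pending_nodes`, and `new_graph_nodes` are hypothetical names.

```python
def find_stranded_threads(pending_nodes: dict, new_graph_nodes: set) -> dict:
    """Return threads whose pending node(s) no longer exist in the new graph."""
    stranded = {}
    for thread_id, next_nodes in pending_nodes.items():
        missing = [name for name in next_nodes if name not in new_graph_nodes]
        if missing:
            stranded[thread_id] = missing
    return stranded


# Hypothetical data: thread id -> nodes the thread is waiting to enter.
# An empty tuple means the thread has completed.
pending_nodes = {
    "thread-a": ("human_review",),
    "thread-b": (),
    "thread-c": ("call_tools",),
}

# Node names in the version about to be deployed
# ("human_review" was renamed to "approval").
new_graph_nodes = {"agent", "call_tools", "approval"}

stranded = find_stranded_threads(pending_nodes, new_graph_nodes)
print(stranded)
```

Here only `thread-a` is flagged: the completed thread and the thread waiting on an unchanged node are unaffected by the rename.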
Deployment Best Practices
Always use defaults for new fields and access them with state.get("field", default).
Configure checkpoint TTL to clean up old threads (default_ttl is in minutes, so 43200 is 30 days):
{ "checkpointer": { "ttl": { "strategy": "delete", "default_ttl": 43200 } } }
For breaking changes, either drain existing threads before deploying or accept that in-progress threads will need to be restarted.