Context
After upgrading to LangSmith SDK >= 0.3.78, you may notice a significant increase in token usage and evaluation costs. This release adds an opt-in "steps" feature that includes intermediate messages in traced outputs, so when evaluators process traces containing multiple roundtrips or tool calls, the input they receive can grow much larger.
Answer
The increased token usage is caused by the new "steps" feature in LangSmith SDK 0.3.78, which includes intermediate messages in traced outputs. Here's how to reduce costs by optimizing your evaluators:
For evaluators that only need model output:
Edit your evaluator configuration
Change the input mapping from using the full input to using only output.content
This works well for evaluators like Style and Intent classifiers that don't need the conversation history or tool calls
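For illustration, here is a minimal sketch of an output-only evaluator input. The trace shape (`outputs.content` plus `outputs.steps`) and the helper function are assumptions for this sketch, not the exact LangSmith API; the point is that a Style or Intent judge only ever sees the final answer:

```typescript
// Hypothetical trace output shape: final content plus intermediate steps.
interface TraceOutputs {
  content: string;
  steps?: { role: string; content: string }[];
}

// Output-only mapping: ignores outputs.steps entirely, so the judge
// prompt receives just the final answer, not the full conversation.
function styleEvaluatorInput(outputs: TraceOutputs): string {
  return outputs.content; // analogous to mapping "output.content" in the UI
}

const outputs: TraceOutputs = {
  content: "The capital of France is Paris.",
  steps: [
    { role: "assistant", content: "Calling search tool..." },
    { role: "tool", content: "...large tool payload..." },
  ],
};

console.log(styleEvaluatorInput(outputs));
```

The intermediate steps are still recorded in the trace; they simply never reach the evaluator, so they don't count against its token usage.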
For evaluators that need tool information:
Consider adjusting your sampling rate to evaluate fewer traces, since they now include much larger inputs
Focus on whether tools were called rather than their full outputs if that's sufficient for your use case
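As a sketch of that idea, the snippet below extracts only the names of the tools that were invoked, discarding their (potentially very large) outputs before anything is passed to an evaluator. The step-message shape and `tool_calls` field are assumptions for this sketch:

```typescript
// Hypothetical step-message shape with optional tool calls.
interface StepMessage {
  role: string;
  content: string;
  tool_calls?: { name: string }[];
}

// Report only which tools were invoked, not their payloads.
function toolsCalled(steps: StepMessage[]): string[] {
  const names = new Set<string>();
  for (const msg of steps) {
    for (const call of msg.tool_calls ?? []) names.add(call.name);
  }
  return [...names];
}

const steps: StepMessage[] = [
  { role: "assistant", content: "", tool_calls: [{ name: "search" }] },
  { role: "tool", content: "...thousands of tokens of search results..." },
  { role: "assistant", content: "Here is the answer." },
];

console.log(toolsCalled(steps)); // tool names only, no tool outputs
```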
Alternative solutions:
Remove the steps parameter: If you don't need intermediate steps in certain places, you can remove this parameter from your code at those call sites
Use array/index notation: You can pick out individual messages using array notation if you need specific parts of the conversation
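The following sketch shows what selecting a single message by index looks like in code, roughly mirroring an `output.steps[2].content`-style mapping. The path syntax and message shape here are assumptions for illustration, not a documented LangSmith mapping format:

```typescript
// Hypothetical: pull one message out of the steps array
// instead of forwarding the whole conversation.
interface Message {
  role: string;
  content: string;
}

function pickStep(steps: Message[], index: number): Message | undefined {
  return steps[index];
}

const steps: Message[] = [
  { role: "assistant", content: "Let me check." },
  { role: "tool", content: "result: 42" },
  { role: "assistant", content: "The answer is 42." },
];

// Select only the final assistant message for the evaluator.
console.log(pickStep(steps, 2)?.content);
```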
The steps feature is opt-in and designed for cases where the AI SDK takes multiple roundtrips to respond to a query. If your evaluators don't require this level of detail, switching to output-only evaluation can significantly reduce token usage and costs.
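To make the cost difference concrete, this illustrative comparison (with made-up payload sizes, not real trace data) serializes an output-only payload versus the same payload with several large intermediate steps attached:

```typescript
// Illustrative only: compare the payload an evaluator would receive
// with and without intermediate steps included.
const finalAnswer = { content: "Paris is the capital of France." };

const withSteps = {
  ...finalAnswer,
  steps: Array.from({ length: 5 }, (_, i) => ({
    role: "tool",
    content: `tool output ${i}: ` + "x".repeat(2000), // stand-in for a large tool payload
  })),
};

const outputOnlyChars = JSON.stringify(finalAnswer).length;
const fullTraceChars = JSON.stringify(withSteps).length;

console.log({ outputOnlyChars, fullTraceChars });
```

Since evaluator token counts scale roughly with serialized input size, the gap between these two numbers is the cost you recover by mapping only output.content.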