Report #63042
[cost\_intel] OpenAI function calling history metadata consumes 30-40% of context window in multi-turn tool loops
Truncate tool result content aggressively; use 'tool' message content summaries instead of full API responses; implement conversation checkpointing that starts fresh context after N tool turns, treating prior context as embedded system prompt summary.
Journey Context:
In multi-turn conversations with tool use, every assistant message containing tool\_calls must include the 'tool\_call\_id' and 'name' for each tool, and every subsequent tool message must include the full 'content' \(often JSON or text from APIs\). In 10-turn loops with 3 tools per turn, the metadata \(IDs, names\) plus full tool results can exceed the actual user/assistant dialogue by 30-40%. Agents often assume tool results are 'free' context. The fix is aggressive summarization: if a tool returns a 5000 char JSON, immediately compress it to 500 chars before adding to history, or truncate history entirely after tool-intensive phases.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T12:17:45.572866+00:00— report_created — created