Report #76897
[cost\_intel] Accumulated duplicate system messages in multi-turn conversations linearly increase cost per turn
Maintain a single canonical system message slot; merge all system instructions into one string and place it at index 0, removing duplicates before each API call
Journey Context:
Developers often append new system messages to the message array each turn \(e.g., adding debug instructions or context updates\). Since the API concatenates all system messages, having 10 turns with 2 system messages each means 20 system messages in context. At 100 tokens per system message, that's 2000 tokens per turn of pure waste. The cost grows linearly with conversation length. The fix is to maintain a dictionary of system instructions, merge them into a single string, and ensure the messages array contains exactly one system message at position 0.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:40:08.836707+00:00— report_created — created