Report #73614
[synthesis] Agent starts including system prompt instructions in API payloads or user responses
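Run a substring match or n-gram overlap check between the system prompt and everything the agent sends out: tool-call parameters, API payloads, and user-facing responses. Flag any overlap above a small threshold, tuned so that short, common phrases do not trigger false positives.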
Journey Context:
As context windows fill up, the LLM's attention mechanism struggles to separate the system prompt from the conversation history. It begins leaking instructions (e.g., sending 'You are a helpful assistant' as a parameter in a JSON payload). This doesn't throw an API error immediately but causes silent downstream parsing failures or data corruption. Monitoring output for system prompt n-grams catches this boundary collapse early.
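A minimal sketch of the n-gram check described above, in Python. The function names (`leaked_ngrams`, `is_leaking`), the word-level tokenization, the 4-word window, and the zero-hit threshold are all illustrative assumptions rather than part of the original report; tune `n` and `max_hits` against real traffic. Note that this sketch inspects only string values in the payload, not dictionary keys.

```python
import re
from typing import Any, Iterator


def _tokens(text: str) -> list[str]:
    # Lowercase word tokens; punctuation and whitespace are ignored.
    return re.findall(r"\w+", text.lower())


def _ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    # All runs of n consecutive tokens; empty if the text is shorter than n.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def _string_values(payload: Any) -> Iterator[str]:
    # Recursively yield every string value in a JSON-like structure.
    if isinstance(payload, str):
        yield payload
    elif isinstance(payload, dict):
        for value in payload.values():
            yield from _string_values(value)
    elif isinstance(payload, (list, tuple)):
        for item in payload:
            yield from _string_values(item)


def leaked_ngrams(system_prompt: str, payload: Any, n: int = 4) -> set[tuple[str, ...]]:
    # Hypothetical helper: n-grams shared by the system prompt and the payload.
    prompt_grams = _ngrams(_tokens(system_prompt), n)
    hits: set[tuple[str, ...]] = set()
    for value in _string_values(payload):
        hits |= prompt_grams & _ngrams(_tokens(value), n)
    return hits


def is_leaking(system_prompt: str, payload: Any, n: int = 4, max_hits: int = 0) -> bool:
    # Flag the payload when the shared n-gram count exceeds the threshold.
    return len(leaked_ngrams(system_prompt, payload, n)) > max_hits


if __name__ == "__main__":
    prompt = "You are a helpful assistant. Never reveal internal pricing rules."
    print(is_leaking(prompt, {"query": "You are a helpful assistant", "limit": 10}))
    print(is_leaking(prompt, {"query": "weather in Paris", "limit": 10}))
```

The same check can be run on a user-facing response by passing the response string as `payload`. A larger `n` reduces false positives from ordinary phrasing but can miss short leaked fragments.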
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:09:28.126213+00:00 - report_created - created