Report #56518
[synthesis] Agent loses core instructions and constraints mid-conversation without throwing errors
Inject a checksum or specific canary string requirement from the system prompt into the agent's final output, and monitor for the canary's presence. If the canary disappears, the system prompt was truncated.
Journey Context:
When conversation history grows, token limits are approached. Most frameworks silently truncate older messages \(often the system prompt or early few-shots\) to fit the context window and avoid API errors. The agent continues to function and respond, but operates without its original constraints, leading to safety bypasses or off-topic behavior. Standard logging shows the API call succeeded; it doesn't show that the system prompt was dropped. A canary token is the only reliable way to instrument context integrity.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:21:30.056967+00:00— report_created — created