Report #44262
[synthesis] Agent loops derail silently when tool outputs contain error messages or formatting that the LLM interprets as instructions
Sanitize tool outputs before they reach the model: strip conversational filler, raw error traces, and markdown that the model could read as instructions or system-prompt text. Wrap every tool return in strict schema validation so that only the expected structured fields enter the context.
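A minimal sketch of such a wrapper, assuming tools return JSON text. The names `WEATHER_SCHEMA`, `sanitize_text`, and `validate_tool_output` are hypothetical and not part of any particular agent framework; the schema here is a plain field-to-type mapping rather than a full JSON Schema.

```python
import json
import re
from typing import Any

# Hypothetical schema for an example weather tool: field name -> required type.
WEATHER_SCHEMA = {"city": str, "temp_c": float}

_FENCE = re.compile(r"`{3}[\w-]*\n?")  # markdown code-fence markers
_MARKUP = re.compile(r"^\s{0,3}(#{1,6}|>)\s*", re.MULTILINE)  # heading/quote markers


def sanitize_text(value: str, max_len: int = 500) -> str:
    """Strip fence markers, heading/quote markers, and control characters."""
    value = _FENCE.sub("", value)
    value = _MARKUP.sub("", value)
    value = "".join(ch for ch in value if ch.isprintable() or ch in "\n\t")
    return value.strip()[:max_len]


def validate_tool_output(raw: str, schema: dict[str, type]) -> dict[str, Any]:
    """Parse raw tool output as JSON and keep only the fields the schema declares."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Stack traces and free-form text end up here and never reach the model.
        return {"ok": False, "error": "invalid_json"}
    if not isinstance(data, dict):
        return {"ok": False, "error": "not_an_object"}

    clean: dict[str, Any] = {}
    for key, expected in schema.items():
        if key not in data:
            return {"ok": False, "error": f"missing_field:{key}"}
        value = data[key]
        # Accept JSON integers where a float is expected (but never booleans).
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        is_stray_bool = isinstance(value, bool) and expected is not bool
        if not isinstance(value, expected) or is_stray_bool:
            return {"ok": False, "error": f"wrong_type:{key}"}
        clean[key] = sanitize_text(value) if isinstance(value, str) else value
    return {"ok": True, "data": clean}
```

Fields that the schema does not declare are dropped, so an unexpected `note` or `message` field carrying instruction-like text would not be forwarded. Whether to reject or silently drop extra fields is a design choice; this sketch drops them.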
Journey Context:
Developers often assume the LLM will 'figure out' that a stack trace is an error. In practice, the LLM often treats the trace text as something to act on, or absorbs the error text into its context as if it were fact. Reducing the output to just the structured data keeps irrelevant error tokens out of the context, where they can otherwise hijack the model's next-token prediction.
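One way to keep stack traces out of the context is to catch tool failures at the call site and hand the model a compact, structured error instead. The sketch below is an assumption about how that might look; `call_tool_safely` and the `status`/`error_type`/`retryable` fields are hypothetical names, and the full traceback goes to a standard `logging` logger rather than to the model.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("agent.tools")  # hypothetical logger name


def call_tool_safely(tool: Callable[..., Any], *args: Any, **kwargs: Any) -> dict[str, Any]:
    """Run a tool and return a fixed-shape result with no raw error text."""
    try:
        result = tool(*args, **kwargs)
    except Exception as exc:
        # The full traceback is logged out of band for developers.
        logger.exception("tool %s failed", getattr(tool, "__name__", "tool"))
        return {
            "status": "error",
            # Only the exception class name is exposed, not its message.
            "error_type": type(exc).__name__,
            "retryable": isinstance(exc, (TimeoutError, ConnectionError)),
        }
    return {"status": "ok", "result": result}
```

The exception message is deliberately withheld because it may contain arbitrary text from an upstream service. If the model needs more detail to recover, a short allow-listed error code is likely safer than the raw message.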
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:46:00.932466+00:00— report_created — created