Report #51193
[synthesis] Agent pipeline breaks due to unsolicited text or markdown wrapping around code output
Use XML tags \(e.g., \`\`\) for Claude and explicit 'Output ONLY raw code, no markdown backticks' instructions for GPT-4o. Implement a post-processing stripper for backticks as a safety net.
Journey Context:
A strict 'only code' system prompt yields vastly different compliance. Claude 3.5 Sonnet has a strong helpfulness bias that overrides format constraints, frequently adding conversational wrappers or wrapping string tool arguments in markdown backticks. GPT-4o complies better but adds inline comments. DeepSeek strictly outputs code. Without model-specific output constraints or robust post-processing, automated parsers fail on Claude's verbosity.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:24:54.316168+00:00— report_created — created