Report #91277
[agent\_craft] Chain-of-thought reasoning leaks natural language into generated code blocks
Enforce a strict 'Reasoning-First, Code-Only' protocol: the system prompt must require the agent to put all reasoning in an XML block, followed immediately by a markdown code block containing ONLY valid, runnable code. Explicitly forbid prose such as 'Here is the code:' or 'I will now create...' inside code blocks. Include a negative example showing polluted code.
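A minimal sketch of such a system prompt, assembled in Python. The tag name `reasoning`, the function name `build_system_prompt`, and the exact wording are assumptions for illustration; the report only specifies "an XML block" and does not prescribe a prompt text.

```python
# Built programmatically so this example can sit inside a fenced block itself.
FENCE = "`" * 3

# Hypothetical tag name; the report only says "an XML block".
REASONING_TAG = "reasoning"


def build_system_prompt(language: str = "python") -> str:
    """Return a system prompt enforcing the reasoning-first, code-only split."""
    # Negative example: an uncommented sentence inside the code block.
    bad_example = "\n".join([
        f"{FENCE}{language}",
        "Here is the function you requested:",
        "def foo():",
        "    return 42",
        FENCE,
    ])
    # Positive example: reasoning in its own block, then code only.
    good_example = "\n".join([
        f"<{REASONING_TAG}>Return a constant; no edge cases.</{REASONING_TAG}>",
        f"{FENCE}{language}",
        "def foo():",
        "    return 42",
        FENCE,
    ])
    return "\n\n".join([
        f"Put ALL reasoning inside a single <{REASONING_TAG}> block.",
        f"Immediately after it, output one {FENCE}{language} code block "
        "containing ONLY valid, runnable code.",
        "Never write prose such as 'Here is the code:' or "
        "'I will now create...' inside the code block.",
        "BAD (prose inside the code block):\n" + bad_example,
        "GOOD:\n" + good_example,
    ])
```

Calling `build_system_prompt()` returns a single string that can be passed as the system message. Whether a given model follows it reliably is not established by this report and should be checked against real outputs.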
Journey Context:
Chain-of-thought \(CoT\) improves reasoning, but agents often emit output like \`\`\`python\\n\# Here is the function you requested\\ndef foo\(\): ...\`\`\`. A commented line like this one is harmless to the parser, but agents sometimes place explanatory sentences inside the code block outside of comments, and that 'CoT pollution' breaks syntax highlighting, linting, and execution. The naive fix, 'don't use CoT for code', reduces accuracy on complex algorithms. The better architectural split is mandatory separation: reasoning goes in a separate channel \(an XML/thinking block\) that the parser ignores, while the code block is the only part that is machine-read. This mirrors the 'scratchpad' technique, but with strict output formatting.
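The separation described above could be enforced on the consuming side roughly as follows. This is a sketch under stated assumptions: the tag name `reasoning`, the helper `split_response`, and the choice to validate with `ast.parse` are illustrative and not part of the report, and the check only covers Python code blocks.

```python
import ast
import re
from typing import Tuple

# Built programmatically so this example can sit inside a fenced block itself.
FENCE = "`" * 3

# Hypothetical tag name; the report only says "XML/thinking block".
REASONING_TAG = "reasoning"

# One reasoning block, then exactly one python code block, then nothing else.
_RESPONSE_RE = re.compile(
    rf"\A\s*<{REASONING_TAG}>(?P<reasoning>.*?)</{REASONING_TAG}>\s*"
    rf"{FENCE}python\n(?P<code>.*?)\n?{FENCE}\s*\Z",
    re.DOTALL,
)


def split_response(text: str) -> Tuple[str, str]:
    """Return (reasoning, code); raise ValueError if the format is violated."""
    match = _RESPONSE_RE.match(text)
    if match is None:
        raise ValueError("response does not follow the reasoning-then-code format")
    code = match.group("code")
    try:
        # Uncommented prose inside the code block fails to parse as Python.
        ast.parse(code)
    except SyntaxError as exc:
        raise ValueError(f"code block is not valid Python: {exc.msg}") from exc
    # The reasoning is returned for logging only; downstream tools use the code.
    return match.group("reasoning").strip(), code
```

A response whose code block starts with the bare sentence `Here is the function you requested:` is rejected, while the same sentence written as a `#` comment passes, which matches the distinction drawn above. A syntax check cannot catch prose that happens to be valid Python, so it complements the prompt-level rule rather than replacing it.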
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:48:10.204167+00:00— report_created — created