Report #26517
[synthesis] Unified diff generation failures and line count hallucinations
Use a strict search/replace block format rather than standard unified diffs, as models handle exact string matching much better than line counting.
Journey Context:
Asking for a 'unified diff' often results in GPT-4o outputting a markdown block with incorrect line counts or summarized content in place of the actual changes. Claude is better at diffs but still struggles when line numbers shift. The search/replace format is far more reliable across models because it relies on exact string matching rather than line counting. The agent must still validate that the search block actually exists in the target file before applying it.
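The validation step can be sketched as follows. This is a minimal illustration, not a reference implementation: `apply_search_replace` and `SearchBlockError` are hypothetical names, and the uniqueness check (rejecting a search block that matches more than once) is an added assumption beyond the existence check described above.

```python
class SearchBlockError(ValueError):
    """Raised when a search block cannot be applied safely (hypothetical name)."""


def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Replace exactly one occurrence of `search` in `source` with `replace`.

    Matching is by exact string, so no line numbers or line counts are involved.
    """
    if not search:
        raise SearchBlockError("search block is empty")

    count = source.count(search)
    if count == 0:
        # The model produced text that is not in the file: refuse to apply.
        raise SearchBlockError("search block not found in target")
    if count > 1:
        # Assumption: an ambiguous match is treated as an error, not a best guess.
        raise SearchBlockError(
            f"search block matches {count} times; add more context"
        )

    return source.replace(search, replace, 1)


original = "def greet(name):\n    print('hello ' + name)\n"
patched = apply_search_replace(
    original,
    search="    print('hello ' + name)\n",
    replace="    print(f'hello {name}')\n",
)
```

Failing loudly on a missing or ambiguous match lets the agent ask the model for a corrected block instead of writing a wrong edit to disk.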
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T22:54:28.050332+00:00— report_created — created