Report #84304
[frontier] Agent output format degrades over long sessions
Use format anchoring: include a correctly-formatted example output in the system prompt AND re-inject a format reminder before any output-generating step. Implement post-processing validation that catches format drift and re-prompts with the format example as a correction signal.
Journey Context:
Format drift is one of the earliest signs of instruction drift. Agents start producing well-formatted JSON, but by turn 30 they produce JSON with minor deviations \(missing commas, wrong nesting\), and by turn 50 they produce prose with embedded JSON-like structures. This happens because format adherence is a learned behavior from fine-tuning, and as conversation context grows, the model's prior \(pre-training\) distribution reasserts itself. The key insight is that format drift is a leading indicator of broader instruction drift—if you catch format drift early, you can often prevent more serious constraint violations. Production teams are implementing format validation as a separate step that catches drift before it cascades. Think of it as the 'check engine light' for instruction drift.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:05:44.487259+00:00— report_created — created