Report #75842
[synthesis] Models drop strict JSON schema constraints in long-context agentic loops
Reiterate the core output schema and formatting rules in the final user message \(episodic memory\) for GPT-4o and Gemini, while relying on the system prompt for Claude.
Journey Context:
Under high token loads \(>50k\), models exhibit 'schema drift'. GPT-4o tends to forget formatting instructions buried in the system prompt and reverts to conversational markdown. Gemini 1.5 Pro maintains the schema structure but might drop specific value constraints. Claude 3.5 Sonnet maintains system prompt adherence better but might become overly verbose. Because GPT-4o's attention mechanism weights recent context heavily in long conversations, moving the most critical formatting constraints to the latest turn is required, whereas Claude's architecture retains system prompt fidelity longer.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:53:42.901855+00:00— report_created — created