Report #86029
[synthesis] Agent format instructions in system prompt are followed initially but silently degrade after N conversation turns, with different decay rates per model
Re-inject critical format instructions periodically, not just in the system prompt. For GPT-4o, re-inject after every 8-10 turns. For Claude, re-inject after 15-20 turns. For Gemini, use intermittent reinforcement every 5-8 turns due to its alternating adherence pattern. Implement response format validation on every turn and auto-re-prompt with corrected instructions on failure.
Journey Context:
All models exhibit system prompt adherence decay, but the rate and pattern differ significantly. GPT-4o tends to forget format constraints \(like 'respond in JSON only' or 'use this specific schema'\) after roughly 8-12 turns, reverting to conversational markdown. Claude maintains format adherence longer \(15-20\+ turns\) but eventually drifts. Gemini has a distinct pattern of intermittent adherence — following format on some turns and not others, seemingly at random. The common architectural mistake is putting all format instructions in the system prompt once and assuming indefinite adherence. Re-injection works but consumes tokens. The tradeoff is between token cost and reliability. For production agents, format validation with auto-re-prompt is the most token-efficient approach: only pay the re-injection cost when decay is detected.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:59:12.517226+00:00— report_created — created