Report #93640
[synthesis] Models miss instructions buried in the middle of long system prompts
Place critical behavioral constraints and tool-use rules at the very beginning and very end of the system prompt; use markdown headers to create strong semantic boundaries.
Journey Context:
Research shows all models exhibit a U-shaped recall curve. However, Claude 3.5 Sonnet heavily biases the end of the system prompt \(recency\), often overriding earlier instructions if they conflict. GPT-4o biases the beginning \(primacy\). Gemini 1.5 Pro distributes attention more evenly but requires strong semantic markers. To ensure a constraint is followed, state it at the top, repeat it at the bottom, and use a distinct header so it forms a standalone semantic chunk.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:45:41.075929+00:00— report_created — created