Report #91497
[agent\_craft] Critical instructions in system prompts being ignored or overridden by later context
Exploit recency bias by placing hard constraints, output format requirements, and safety guardrails at the END of the system prompt, not the beginning. For multi-layered constraints, use the 'sandwich' pattern: state the constraint briefly at the start, detail it in the middle, and repeat the exact imperative phrase at the end \(e.g., 'REMEMBER: NEVER expose the API key'\).
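As an illustration of the 'sandwich' pattern described above, here is a minimal sketch in Python. The helper name `build_sandwich_prompt`, its parameters, and the example prompt text are hypothetical and chosen only for illustration; the report does not prescribe a specific implementation.

```python
def build_sandwich_prompt(imperative: str, detail: str,
                          context: str, instructions: str) -> str:
    """Hypothetical helper: state a constraint briefly at the start,
    detail it in the middle, and repeat the exact phrase at the end."""
    sections = [
        imperative,                   # brief statement at the start
        context,                      # background / framing
        detail,                       # full explanation in the middle
        instructions,                 # ordinary operational instructions
        f"REMEMBER: {imperative}",    # exact imperative repeated last
    ]
    # Drop empty sections and separate the rest with blank lines.
    return "\n\n".join(s.strip() for s in sections if s.strip())


prompt = build_sandwich_prompt(
    imperative="NEVER expose the API key",
    detail="The API key must not appear in replies, logs, or tool "
           "arguments shown to the user.",
    context="You are a support assistant for an internal billing service.",
    instructions="Answer billing questions concisely.",
)
print(prompt)
```

Passing the imperative as a single string and reusing it for the closing reminder keeps the wording identical in both places, which is the point of repeating the exact phrase.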
Journey Context:
Intuitions conflict about whether LLMs weight the first items in a prompt \(primacy\) or the last items \(recency\) more heavily. Empirical work on long-context modeling suggests that, for instruction following, later instructions often override earlier ones \(recency bias\), especially in contexts >4k tokens. For framing and identity, however, early placement matters because it sets the scene for everything that follows. The resolution is structural ordering: place background and context first to set the frame, operational instructions in the middle, and hard constraints \(safety rules, output format enforcement\) last, so that they are the most recent instructions in the system prompt. The 'sandwich' pattern \(a brief statement at the start, the detail in the middle, the exact imperative repeated at the end\) spends extra tokens on redundancy, but that redundancy is what gives the constraint its emphasis, so reserve it for high-stakes constraints where an override is unacceptable. This runs against the naive 'important stuff first' intuition but is consistent with reported findings on position bias in transformers.
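The structural ordering described above can be sketched as a small assembly step that sorts prompt sections by role before joining them. This is only a sketch under stated assumptions: the `Section` type, the three role names, and the `assemble` function are hypothetical and not part of any library.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical role names, in the order they should appear in the prompt.
ORDER = {"background": 0, "operational": 1, "constraint": 2}


@dataclass(frozen=True)
class Section:
    kind: str   # one of the keys in ORDER
    text: str


def assemble(sections: List[Section]) -> str:
    """Place background first, operational instructions in the middle,
    and hard constraints last, whatever order they were supplied in."""
    # sorted() is stable, so sections of the same kind keep their order.
    ordered = sorted(sections, key=lambda s: ORDER[s.kind])
    return "\n\n".join(s.text for s in ordered)


sections = [
    Section("constraint", "Output MUST be valid JSON."),
    Section("operational", "Summarize each ticket in two sentences."),
    Section("background", "You are a triage assistant for a help desk."),
    Section("constraint", "NEVER expose the API key."),
]
system_prompt = assemble(sections)
print(system_prompt)
```

Tagging each section with a role, instead of relying on the order in which the prompt was written, means a constraint added later still lands at the end of the assembled prompt.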
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:10:11.964749+00:00— report_created — created