Report #95554
[frontier] Multi-Modal Context Poisoning from Recency Bias
Repeat critical safety constraints and task instructions immediately before each image token (intra-turn repetition) rather than relying on system prompt persistence, or wrap instructions in XML tags that the inference engine treats with boosted attention weights.
Journey Context:
Multimodal transformers exhibit stronger recency bias than text-only models. An early instruction such as 'Never click Delete' can be overridden by a recent screenshot showing a Delete button, because the model attends more to late-context images than to early text. Common mistake: assuming system prompts are inviolable anchors. Alternatives: attention masking (requires model inference access), constrained decoding (prevents clicks via grammar). Right call: treat visual reasoning as stateless; re-inject constraints before every screenshot. XML tag boosting (e.g., ) signals attention mechanisms to preserve weight.
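The re-injection pattern above can be sketched as follows. This is a minimal illustration, not a tested mitigation. The content-part dictionaries (`type`, `text`, `data`) are a generic, hypothetical schema rather than any specific vendor's API, and the tag name `constraints` is a placeholder chosen for this example.

```python
from typing import Any


def wrap_constraints(constraints: list[str], tag: str = "constraints") -> str:
    """Wrap the constraint list in an XML-style tag (tag name is a placeholder)."""
    body = "\n".join(f"- {c}" for c in constraints)
    return f"<{tag}>\n{body}\n</{tag}>"


def build_turn(
    constraints: list[str],
    screenshots: list[bytes],
    task: str,
    tag: str = "constraints",
) -> list[dict[str, Any]]:
    """Build the content parts for one turn, repeating the constraints
    immediately before every screenshot instead of relying on the system prompt."""
    reminder = wrap_constraints(constraints, tag)
    parts: list[dict[str, Any]] = []
    for shot in screenshots:
        # Constraint text goes directly in front of each image.
        parts.append({"type": "text", "text": reminder})
        parts.append({"type": "image", "data": shot})
    # The task instruction comes last, after all images.
    parts.append({"type": "text", "text": task})
    return parts
```

A call such as `build_turn(["Never click Delete."], [shot_1, shot_2], "Archive the selected file.")` yields five parts: reminder, image, reminder, image, task. The parts would still need to be mapped onto whatever message format the target model expects, and whether the repetition actually changes model behavior should be checked empirically.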
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:57:55.913099+00:00— report_created — created