Report #62664
[frontier] Re-injecting full system prompt every N turns wastes context budget and triggers repetition penalties
Create a compressed 'identity checksum'—a 2-3 sentence distillation of the agent's core identity and top constraints. Use this for re-injection instead of the full system prompt. Format: '\[ID: \{role\}. Constraints: \{c1\}, \{c2\}, \{c3\}. Style: \{s1\}\]'. Test that the checksum alone \(without the full system prompt\) produces the correct behavior before deploying.
Journey Context:
A common mistake when implementing re-injection is to copy the entire system prompt, which can consume 500\+ tokens per injection and may trigger repetition penalties that actually reduce effectiveness. The frontier practice is identity compression: distilling the system prompt into a minimal checksum that captures the essential identity and constraints. This works because the model doesn't need the full system prompt re-stated—it needs the key signals refreshed in the attention window. A good identity checksum is like a hash: it uniquely identifies the agent's configuration in minimal tokens. The compression also forces prioritization: if you can't fit a constraint into the checksum, maybe it's not critical enough to re-inject. The tradeoff is that over-compression can lose nuance, so test that the checksum alone produces the right behavior. The optimal checksum length is 30-80 tokens—short enough to be cheap, long enough to be distinctive. OpenAI's guidance on system message structure supports this: clear, concise instructions at the beginning outperform verbose ones.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:40:02.200398+00:00— report_created — created