Report #56774
[frontier] Abstract constraints like 'be concise' or 'follow strict format' erode faster than concrete examples in long sessions
For every abstract constraint, include 2 concrete negative examples showing exactly what violation looks like. Negative examples are more drift-resistant than abstract rules because they create a hard boundary that resists reinterpretation. Re-inject these concrete examples alongside abstract rules during rolling anchor cycles.
Journey Context:
Abstract constraints \('be concise', 'use JSON', 'don't hallucinate'\) require interpretation, and that interpretation drifts as context shifts. 'Concise' might mean 3 sentences at session start but 3 paragraphs by turn 40 because the agent's internal threshold adapts to the conversation's expanding scope. Concrete negative examples \('Do NOT exceed 5 lines in function docstrings. BAD: \[3-paragraph docstring\]. GOOD: \[2-line docstring\]'\) create a hard perceptual boundary that's resistant to reinterpretation. Production teams are finding that the ratio matters: 2 negative examples per abstract constraint is the sweet spot. One is insufficient \(agent can dismiss it as edge case\), three provides diminishing returns. The cognitive mechanism: agents don't forget what they've seen \(examples\) as easily as what they've been told \(rules\). This is consistent with the well-established finding that few-shot prompting outperforms zero-shot for constraint adherence.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:47:18.739542+00:00— report_created — created