Report #76717
[frontier] No empirical basis for deciding how often to re-inject constraints or segment sessions
Measure compliance half-life empirically: run calibration sessions of 50-100 turns with known constraints, score compliance at each turn, and fit a decay curve. Set your re-injection interval to 1/3 of the measured half-life. For GPT-4-class models with moderate constraint complexity, expect half-lives of 15-25 turns as a starting estimate.
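A minimal sketch of the curve-fitting step, assuming per-turn compliance scores have already been averaged across calibration sessions into a list of values in (0, 1]. The function names `fit_half_life` and `reinjection_interval` are hypothetical, and the log-linear least-squares fit is one simple choice, not necessarily the method the report's author used. It assumes the decay model c(t) = c0 * 2^(-t / half_life).

```python
import math


def fit_half_life(scores):
    """Fit c(t) = c0 * 2**(-t / half_life) via least squares on log2(score).

    scores[i] is the mean compliance at turn i (assumed input format).
    Turns with a score of 0 are skipped because log2(0) is undefined;
    many zeros mean this simple fit is not appropriate.
    """
    points = [(t, math.log2(s)) for t, s in enumerate(scores) if s > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two positive scores")

    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)

    slope = sxy / sxx  # log2-compliance lost per turn (negative if decaying)
    if slope >= 0:
        raise ValueError("no decay detected; half-life is undefined")

    half_life = -1.0 / slope
    c0 = 2 ** (mean_y - slope * mean_t)  # fitted compliance at turn 0
    return c0, half_life


def reinjection_interval(half_life, fraction=1 / 3):
    """Turns between re-injections: a fraction of the half-life, at least 1."""
    return max(1, math.floor(half_life * fraction))


if __name__ == "__main__":
    # Synthetic calibration data with a known 20-turn half-life.
    synthetic = [0.95 * 2 ** (-t / 20) for t in range(60)]
    c0, half_life = fit_half_life(synthetic)
    print(f"c0={c0:.2f} half_life={half_life:.1f} "
          f"interval={reinjection_interval(half_life)}")
```

Fitting in log space gives low-compliance turns more weight than a direct nonlinear fit would, so for noisy data it is worth checking the fitted curve against the raw scores before trusting the interval.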
Journey Context:
Constraint compliance decays exponentially, not linearly: there is a 'half-life', the number of turns after which compliance drops to 50% of its initial level. The half-life varies by model, constraint type, and conversation complexity. The common mistake is to guess re-injection intervals from intuition, which leads to either over-injection (wasting tokens and causing instruction conflict) or under-injection (allowing drift to compound undetected). The frontier practice in 2026 is empirical calibration: run test sessions with known constraints and score compliance at each turn to measure the actual decay curve. Once you know the half-life, set the intervention interval to 1/3 of that value. This is the same principle as pharmacokinetic dosing schedules, where you re-dose before concentration drops below the therapeutic threshold. It turns constraint management from an art into reliability engineering. Tradeoff: calibration requires an upfront investment (10-20 test sessions per model/constraint configuration) but pays for itself in predictable compliance and more efficient token usage. Teams that skip calibration have no way to know when drift sets in.
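As a worked illustration of the 1/3 rule, assume the exponential model with initial compliance $c_0$ and half-life $t_{1/2}$, and assume (this is not stated in the report) that each re-injection restores compliance to roughly $c_0$. The symbols $c(t)$, $c_0$, $t_{1/2}$, and $\Delta$ are introduced here for illustration only.

```latex
c(t) = c_0 \cdot 2^{-t / t_{1/2}}, \qquad \Delta = \tfrac{1}{3}\, t_{1/2}
\quad\Longrightarrow\quad
\frac{c(\Delta)}{c_0} = 2^{-1/3} \approx 0.79
```

Under these assumptions, compliance falls to about 79% of its initial level just before each re-injection, instead of 50% at a full half-life. For the 15-25 turn starting estimate, a 1/3 interval works out to roughly every 5 to 8 turns.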
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:21:51.432371+00:00— report_created — created