Report #96147
[frontier] Critical safety constraints evicted from KV cache under memory pressure
Configure the KV cache eviction policy (H2O) to explicitly mark system prompt tokens as protected "heavy hitters" that cannot be evicted. In vLLM-based deployments, use a protected-prefix configuration where the serving stack exposes one, or mark the constraint tokens as heavy hitters manually, so that they stay pinned in GPU memory regardless of context length.
Journey Context:
Standard KV cache eviction (even H2O) has no notion of which tokens are semantically critical: it ranks tokens by accumulated attention, so safety instructions can be evicted if they do not receive enough attention. Explicitly marking system prompt tokens as heavy hitters, or placing them in a protected prefix, ensures they are treated as immutable firmware. This is distinct from attention sinks, which retain the first few tokens because of how attention behaves rather than because of what they say; this is explicit cache management. The tradeoff is slightly higher memory usage, since the pinned key/value vectors are never freed, but this is necessary for safety-critical long-horizon agents where constraint loss is catastrophic.
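The selection logic described above can be sketched in a few lines. This is only an illustration of H2O-style eviction with a pinned set, not the API of vLLM or of any H2O implementation: the function name `select_kept_tokens` and its parameters (`attn_scores`, `budget`, `protected`, `recent_window`) are hypothetical, and a real cache would apply this per layer and per head on tensors rather than on Python lists.

```python
from typing import List, Set


def select_kept_tokens(
    attn_scores: List[float],
    budget: int,
    protected: Set[int],
    recent_window: int = 0,
) -> List[int]:
    """Pick which token positions stay in the KV cache (hypothetical helper).

    attn_scores[i] is the accumulated attention received by token i.
    Protected positions are always kept; the remaining budget goes first to
    the most recent tokens, then to the highest-scoring "heavy hitters".
    """
    n = len(attn_scores)
    pinned = {i for i in protected if 0 <= i < n}
    if len(pinned) > budget:
        # Pinning is a hard guarantee, so refuse rather than silently evict.
        raise ValueError("cache budget is smaller than the protected set")

    keep = set(pinned)

    # Keep the most recent tokens, newest first, while budget remains.
    for i in range(n - 1, max(n - 1 - recent_window, -1), -1):
        if len(keep) >= budget:
            break
        keep.add(i)

    # Fill what is left with the highest accumulated-attention tokens.
    candidates = sorted(
        (i for i in range(n) if i not in keep),
        key=lambda i: attn_scores[i],
        reverse=True,
    )
    for i in candidates:
        if len(keep) >= budget:
            break
        keep.add(i)

    return sorted(keep)


# Positions 0 and 1 stand in for system prompt tokens with low attention.
scores = [0.01, 0.02, 0.9, 0.5, 0.4, 0.8, 0.3, 0.6]
unprotected = select_kept_tokens(scores, budget=4, protected=set(), recent_window=1)
pinned_run = select_kept_tokens(scores, budget=4, protected={0, 1}, recent_window=1)
```

With these made-up scores, the unprotected run drops positions 0 and 1 because they score lowest, while the pinned run keeps them and spends only the remaining budget on recent and high-attention tokens. Raising an error when the protected set exceeds the budget is one possible design choice; it makes the memory cost of pinning visible instead of letting constraints be dropped silently.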
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:57:46.485973+00:00— report_created — created