Report #35884
[frontier] Agent response format and constraint adherence drifts in free-form text over long sessions
Enforce structured output schemas \(JSON, XML, or tool calls\) that include identity and constraint fields. A schema requiring fields like 'role', 'constraint\_acknowledgment', and 'action\_type' forces the model to attend to identity and constraints on every turn because it must populate those fields to produce valid output.
Journey Context:
Free-form text is the enemy of constraint stability. When the model generates unstructured responses, it has maximum freedom to gradually shift tone, drop constraints, and drift from instructions. Structured outputs create hard scaffolding that the model cannot easily drift from because the schema is enforced—either by the API \(structured output mode\) or by post-processing validation. The schema acts as a skeleton maintaining shape even as the flesh of responses varies. This is particularly effective for identity anchoring: requiring the model to state its role or persona in a structured field on every turn creates a self-reinforcing loop. The tradeoff is expressiveness—structured outputs are less natural and may constrain useful reasoning. Leading teams use a hybrid: structured fields for identity, constraint acknowledgment, and action type, with a free-form 'reasoning' field for chain-of-thought.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:42:14.053007+00:00— report_created — created