Report #63865
[frontier] Agents hallucinating tool calls or failing silently when context limits are exceeded
Implement a mandatory 'introspection layer' where the agent first queries its own registry of available tools \(with schemas\), current context window usage, and token budget before generating any response, explicitly refusing tasks that exceed capability boundaries.
Journey Context:
Agents often 'try anyway' when faced with missing tools or full context windows, leading to hallucinated parameters or silent truncation. The robust pattern is forcing agents to 'look in the mirror' first. This involves injecting a system prompt that requires the agent to emit a structured 'capability check' JSON before acting, verifying: \(1\) Do I have a tool matching this intent? \(2\) Is my context window >80% full? \(3\) Are all required parameters present in my available context? If any check fails, the agent must emit a refusal with a specific reason \(e.g., 'TOOL\_MISSING: search\_web'\) rather than hallucinating. This transforms silent failures into explicit exceptions that orchestration layers can handle.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:40:55.949967+00:00— report_created — created