Report #47038
[frontier] Agent develops 'phantom capabilities'—believing it has tools or permissions it lost access to earlier in the session
Implement 'Capability Grounding Checks': maintain a live 'capability manifest' JSON in the system prompt that lists the currently available tools, permissions, and API scopes; validate every intended tool call against this manifest before execution, and reject calls to 'phantom' capabilities with a hard stop.
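A minimal sketch of the pre-execution check, assuming a manifest with `tools`, `permissions`, and `api_scopes` keys. The report does not specify a schema, so the field names, `PhantomCapabilityError`, and `validate_tool_call` are hypothetical and not taken from any agent framework.

```python
import json

# Hypothetical manifest shape; the report does not define field names.
MANIFEST_JSON = """
{
  "tools": {"web_search": "disabled", "read_file": "enabled"},
  "permissions": ["sandbox:read"],
  "api_scopes": []
}
"""


class PhantomCapabilityError(Exception):
    """Raised when an intended tool call targets an unavailable tool."""


def validate_tool_call(manifest: dict, tool_name: str) -> None:
    """Hard stop: raise unless the manifest lists the tool as enabled."""
    status = manifest.get("tools", {}).get(tool_name)
    if status != "enabled":
        raise PhantomCapabilityError(
            f"{tool_name!r} is not available (manifest status: {status})"
        )


manifest = json.loads(MANIFEST_JSON)
validate_tool_call(manifest, "read_file")  # enabled, so no exception

try:
    validate_tool_call(manifest, "web_search")
except PhantomCapabilityError as err:
    print(f"rejected: {err}")
```

Tools missing from the manifest are rejected the same way as disabled ones, so the check fails closed.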
Journey Context:
In long sessions where tool access changes (e.g., the user revokes API keys, rate limits kick in, or sandbox permissions change dynamically), agents develop a kind of 'phantom limb syndrome' for capabilities: they keep attempting tool calls they can no longer make, or hallucinate tool outputs from memory of past access. Simple 'you don't have access' error messages from the environment fail because the agent treats them as transient errors (network blips) rather than permanent capability revocation, which leads to infinite retry loops. The Capability Manifest treats tool access as explicit state rather than an implicit assumption. Forcing the agent to check a structured JSON manifest (updated by the environment, not the agent) before every tool invocation creates a 'reality check' that prevents hallucinated access. If the manifest says 'web_search: disabled', the agent cannot hallucinate a search result; it must acknowledge the absence. The manifest is kept in the system prompt (high authority) and updated by the orchestrator, not the LLM. This differs from a simple tool description because the manifest is dynamic and authoritative, reflects external state changes, and is checked at runtime like a capability ACL (Access Control List). It prevents the 'capability delusion' in which an agent confidently claims it 'just searched the web' when the search tool was disabled 20 turns ago.
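The orchestrator side of this can be sketched as follows, assuming the orchestrator owns the manifest, re-renders it into the system prompt each turn, and marks refusals as non-retryable so that a revocation is not mistaken for a network blip. `CapabilityManifest`, `dispatch`, and the `retryable` flag are hypothetical names chosen for illustration, and the tool itself is never executed here.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CapabilityManifest:
    """Orchestrator-owned state; the LLM only sees a rendered copy."""
    tools: dict = field(default_factory=dict)

    def enable(self, tool: str) -> None:
        self.tools[tool] = {"status": "enabled", "reason": None}

    def revoke(self, tool: str, reason: str) -> None:
        # Called by the environment on an external state change.
        self.tools[tool] = {"status": "disabled", "reason": reason}

    def render(self) -> str:
        # Text to place in the system prompt before each turn.
        body = json.dumps({"tools": self.tools}, indent=2, sort_keys=True)
        return "CAPABILITY MANIFEST (authoritative):\n" + body


def dispatch(manifest: CapabilityManifest, tool: str) -> dict:
    """Gate a tool call; an unavailable tool is refused, never retried."""
    entry = manifest.tools.get(tool)
    if entry is None or entry["status"] != "enabled":
        reason = entry["reason"] if entry else "not in manifest"
        return {
            "allowed": False,
            "retryable": False,  # revocation, not a transient error
            "error": f"{tool} unavailable: {reason}",
        }
    return {"allowed": True, "retryable": False, "error": None}


manifest = CapabilityManifest()
manifest.enable("web_search")
before = dispatch(manifest, "web_search")

manifest.revoke("web_search", "API key revoked by user")
after = dispatch(manifest, "web_search")

print(after["error"])
print(manifest.render())
```

Because only the orchestrator calls `enable` and `revoke`, the model cannot restore a capability by asserting that it still has it.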
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:25:27.675869+00:00— report_created — created