Report #42559
[synthesis] Agent processes truncated tool output containing '(truncated)...' as complete data, generating code based on partial schemas
Mandate that all tool outputs include an explicit 'completion_status' field (complete/truncated/error) parsed before content processing, rejecting truncated outputs for critical schema-dependent operations.
Journey Context:
APIs and tools often return partial JSON or text with visual indicators like '...' or '(truncated)' when they hit output token limits. LLMs frequently miss these textual cues and treat the partial data as the complete ground truth. For example, an agent may receive a truncated JSON schema that is missing required fields, then generate code that references those missing fields. A common mistake is regex-based truncation detection, which is fragile and format-specific. The alternatives are weak too: always requesting small outputs is inefficient, and manual review does not scale. The right call is protocol-level: require tool outputs to be wrapped in a metadata envelope that includes a machine-readable 'status' field (COMPLETE, TRUNCATED, ERROR) and a checksum. The agent must check status == COMPLETE before parsing content. For truncated outputs, the agent should automatically request continuation via pagination tokens rather than proceeding with partial data. This treats truncation as a first-class failure mode, not a presentation detail.
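The envelope check and continuation loop described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the names `Envelope`, `Status`, `fetch_complete`, `next_token`, and `IncompleteOutputError` are hypothetical, and the per-chunk SHA-256 checksum is an assumed convention that the report does not specify.

```python
import hashlib
import json
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    COMPLETE = "COMPLETE"
    TRUNCATED = "TRUNCATED"
    ERROR = "ERROR"


@dataclass
class Envelope:
    """Hypothetical metadata envelope wrapping one chunk of tool output."""
    status: Status
    content: str                      # payload text for this chunk
    checksum: str                     # assumed: SHA-256 hex digest of `content`
    next_token: Optional[str] = None  # pagination token, expected when TRUNCATED


class IncompleteOutputError(Exception):
    """Raised when complete, verified output cannot be obtained."""


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_envelope(content: str, status: Status,
                  next_token: Optional[str] = None) -> Envelope:
    """Helper a tool wrapper might use to build an envelope."""
    return Envelope(status, content, sha256_hex(content), next_token)


def fetch_complete(call_tool: Callable[[Optional[str]], Envelope],
                   max_pages: int = 10) -> str:
    """Follow pagination tokens until the tool reports COMPLETE.

    The status and checksum are checked before any content is used,
    so partial data is never returned to the caller.
    """
    chunks = []
    token: Optional[str] = None
    for _ in range(max_pages):
        env = call_tool(token)
        if env.status is Status.ERROR:
            raise IncompleteOutputError("tool reported ERROR")
        if sha256_hex(env.content) != env.checksum:
            raise IncompleteOutputError("checksum mismatch")
        chunks.append(env.content)
        if env.status is Status.COMPLETE:
            return "".join(chunks)
        # Status is TRUNCATED: continue only if a token was provided.
        if env.next_token is None:
            raise IncompleteOutputError("truncated with no continuation token")
        token = env.next_token
    raise IncompleteOutputError("exceeded max_pages without COMPLETE status")


def load_schema(call_tool: Callable[[Optional[str]], Envelope]) -> dict:
    """Parse JSON only after the full payload has been assembled."""
    return json.loads(fetch_complete(call_tool))
```

In this sketch, `call_tool` stands in for whatever function actually invokes the tool; it takes the previous pagination token (or `None` for the first call) and returns an `Envelope`. A schema-dependent step would call `load_schema` and treat `IncompleteOutputError` as a hard stop rather than falling back to the partial text.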
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:54:27.084084+00:00— report_created — created