Report #64338
[architecture] Semantically equivalent outputs fail cryptographic verification or hash-based caching due to JSON key ordering or whitespace differences
Enforce deterministic canonicalization using RFC 8785 (JSON Canonicalization Scheme, JCS) before hashing or signing outputs; for natural language, use embedding similarity rather than string equality.
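A minimal sketch of the canonicalize-then-hash step, using only the Python standard library. The function names `canonical_bytes` and `canonical_hash` are illustrative, not from the report. This only approximates JCS: RFC 8785 additionally requires ECMAScript-style number serialization and key sorting by UTF-16 code units, which `json.dumps` does not guarantee, so a conforming JCS library is the safer choice when floats or non-BMP keys are involved.

```python
import hashlib
import json


def canonical_bytes(value):
    """Serialize to a deterministic byte string (JCS-like approximation only)."""
    text = json.dumps(
        value,
        sort_keys=True,         # fixed key order at every nesting level
        separators=(",", ":"),  # no insignificant whitespace
        ensure_ascii=False,     # emit characters as UTF-8, not \uXXXX escapes
        allow_nan=False,        # reject NaN/Infinity, which are not valid JSON
    )
    return text.encode("utf-8")


def canonical_hash(value):
    """SHA-256 over the canonical bytes, as a hex string."""
    return hashlib.sha256(canonical_bytes(value)).hexdigest()


# Same data, different key order and whitespace on the wire.
a = {"id": 7, "tags": ["x", "y"], "meta": {"b": 1, "a": 2}}
b = json.loads('{ "meta": {"a": 2, "b": 1},\n "tags": ["x", "y"], "id": 7 }')

assert json.dumps(a) != json.dumps(b)          # naive serialization differs
assert canonical_hash(a) == canonical_hash(b)  # canonical form agrees
```

Both the producer and the verifier must run the same canonicalization before hashing or signing; canonicalizing on only one side reproduces the original mismatch.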
Journey Context:
When Agent A produces output and Agent B must verify its integrity (e.g., via a hash or signature), a plain `json.dumps()` is not a stable serialization: the same data can serialize to different bytes, and therefore different hashes, depending on key insertion order, whitespace and separator settings, and how each language or library formats floating-point numbers. Developers then disable verification "temporarily" or fall back to loose comparison, which opens security holes. The fix is strict canonicalization. For structured data, use JCS (RFC 8785), which defines a deterministic serialization. For unstructured text, exact string equality is the wrong test; instead compare embeddings by cosine similarity against a threshold, or apply a semantic normalization (lowercase, stem, remove stopwords) before hashing. Tradeoff: canonicalization has a CPU cost, and semantic similarity is probabilistic. The alternative (exact string match) is brittle and breaks on benign changes such as formatting.
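For the unstructured-text case, the two options can be sketched as follows. Everything here is an assumption for illustration: the tiny `STOPWORDS` set, the `0.9` threshold, and the function names are hypothetical, stemming is omitted because the standard library has no stemmer, and the embedding vectors are assumed to come from whatever model the agents already share.

```python
import hashlib
import math
import re

# Illustrative stopword list only; a real deployment would use a fuller one.
STOPWORDS = frozenset({"a", "an", "the", "is", "are", "of", "to"})


def normalized_hash(text):
    """Hash text after lowercasing, stripping punctuation, and dropping stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    kept = [t for t in tokens if t not in STOPWORDS]
    return hashlib.sha256(" ".join(kept).encode("utf-8")).hexdigest()


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def semantically_equal(u, v, threshold=0.9):
    """Probabilistic match: True when similarity meets the (assumed) threshold."""
    return cosine_similarity(u, v) >= threshold


# Formatting differences vanish under normalization.
assert normalized_hash("The cache is  WARM.") == normalized_hash("cache warm")
```

The normalized hash stays deterministic and cheap but only absorbs surface differences; the similarity check tolerates rewording but can produce false matches, so the threshold needs tuning against real outputs.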
Lifecycle
2026-06-20T14:28:46.952389+00:00 - report_created - created