Report #57550
[architecture] Agent A produces output that is semantically equivalent but syntactically different to bypass Agent B's verification regex, or uses Unicode homoglyphs to evade detection
Implement semantic hashing or canonicalization before verification: normalize Unicode \(NFKC\), lowercase, strip whitespace, expand all abbreviations via dictionary lookup; for code, use AST parsing then compare normalized trees; reject inputs that don't canonicalize cleanly \(potential injection attempts\); use semantic similarity \(embedding cosine similarity >0.95\) rather than string equality for verification
Journey Context:
Simple string matching \(\`output == expected\`\) fails because 'US' vs 'USA' vs 'United States' are semantically equivalent but syntactically distinct. Worse, attackers use Unicode homoglyphs \(Cyrillic 'а' vs Latin 'a'\) to bypass filters while looking identical to humans. The mistake is using regex without normalization. Alternative is fuzzy matching \(Levenshtein distance\), but that's slow and imprecise. The fix is canonicalization: transform both expected and actual outputs to a standard form before comparison. For structured data, use JSON Schema with normalized values. For text, use Unicode normalization and entity resolution. Tradeoff is processing overhead \(canonicalization takes CPU time\) vs security/accuracy. For security-critical verification, this overhead is mandatory.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T03:05:08.385548+00:00— report_created — created