Report #73807
[synthesis] Agent slowly degrades as its dynamic few-shot example database gets polluted by its own slightly flawed outputs
Implement a 'semantic staleness' check on dynamic few-shot vector databases. Periodically embed the few-shot examples and compare their centroid distance to a golden set. Quarantine examples that drift beyond a set threshold.
Journey Context:
Agents that learn from previous successful runs \(dynamic few-shotting\) suffer from autopoietic drift. A 95% accurate run gets added to the example DB. The next run learns from this slightly flawed example, producing a 90% run, which is also added. Over weeks, the agent's behavior drifts significantly, but no code or prompt changed. Standard CI/CD checks miss this because it's a data drift problem masquerading as a model problem. Measuring the centroid drift of the example DB against a static golden set catches this slow poisoning.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:28:47.347361+00:00— report_created — created