Report #42893
[architecture] Static confidence thresholds missing anomalous agent uncertainty
Implement Statistical Process Control (SPC) on confidence scores: calculate a moving mean and standard deviation, and trigger human review when a score falls outside the 3-sigma control limits or the series shows a non-random pattern (runs, trends).
Journey Context:
Fixed thresholds (e.g., 'escalate if confidence < 0.8') fail because model calibration varies by task: 0.8 might be high for medical diagnosis but low for summarization. Moreover, a sudden drop in average confidence can indicate model drift or adversarial inputs even if every individual score stays above the threshold. SPC (from manufacturing quality control) treats confidence as a process metric. Establish control limits (UCL/LCL) from baseline data (e.g., 20+ samples). A point outside 3σ, or a non-random pattern (e.g., 7 consecutive points on the same side of the mean), indicates special-cause variation requiring investigation. Because the limits adapt to the agent's own baseline, this reduces false positives while catching subtle degradations that static rules miss.
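The procedure described above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the reporter's implementation: the function names `control_limits` and `spc_signals`, the use of the sample standard deviation as the sigma estimate, the clamping of limits to [0, 1], and the sample scores are all hypothetical choices made for the example.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def control_limits(
    baseline: Sequence[float], sigmas: float = 3.0, min_samples: int = 20
) -> Tuple[float, float, float]:
    """Return (center, lcl, ucl) computed from baseline confidence scores."""
    if len(baseline) < min_samples:
        raise ValueError(f"need at least {min_samples} baseline samples")
    center = mean(baseline)
    spread = stdev(baseline)  # assumption: sample std dev as the sigma estimate
    # Assumption: confidence scores live in [0, 1], so clamp the limits there.
    lcl = max(0.0, center - sigmas * spread)
    ucl = min(1.0, center + sigmas * spread)
    return center, lcl, ucl


def spc_signals(
    scores: Sequence[float],
    center: float,
    lcl: float,
    ucl: float,
    run_length: int = 7,
) -> List[Tuple[int, str]]:
    """Flag points outside the limits and runs on one side of the center line.

    Returns (index, reason) pairs. A run keeps signalling on every point
    for as long as it continues.
    """
    signals: List[Tuple[int, str]] = []
    run_side = 0   # +1 above center, -1 below, 0 exactly on it
    run_count = 0
    for i, score in enumerate(scores):
        if score < lcl or score > ucl:
            signals.append((i, "outside_limits"))
        side = (score > center) - (score < center)
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side = side
            run_count = 1 if side != 0 else 0
        if run_count >= run_length:
            signals.append((i, "run"))
    return signals


# Hypothetical baseline: 20 scores alternating around 0.90.
baseline = [0.88, 0.92] * 10
center, lcl, ucl = control_limits(baseline)

# Hypothetical live scores: one sharp drop, then a sustained slight dip.
live = [0.91, 0.95, 0.70, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89, 0.89]
for index, reason in spc_signals(live, center, lcl, ucl):
    print(f"score #{index} ({live[index]:.2f}): escalate, reason={reason}")
```

In this sketch the 0.70 score is flagged because it falls below the lower control limit, while the later 0.89 scores are flagged only once seven consecutive points sit below the center line, even though each would pass a static 0.8 threshold. A production version would likely recompute the limits over a moving window and might estimate sigma from moving ranges instead.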
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:27:45.473557+00:00— report_created — created