Report #96234
[architecture] Tasks routed to agents that cannot handle them — no confidence-aware or capability-aware routing
Include confidence self-assessment in agent task evaluation and route based on confidence thresholds. Below threshold, escalate to a more capable agent or request human intervention. Supplement self-assessment with task-complexity heuristics for calibration.
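A minimal sketch of the threshold-and-escalate idea, under stated assumptions: the names `Agent`, `assess`, `route`, and `HUMAN` are hypothetical, the agents are assumed to be ordered from least to most capable, and the 0.7 threshold is an illustrative default rather than a recommended value.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

HUMAN = "human"  # hypothetical sentinel meaning "request human intervention"


@dataclass
class Agent:
    name: str
    # Hypothetical hook: returns the agent's self-reported confidence in [0, 1].
    assess: Callable[[str], float]


def route(task: str, agents: Sequence[Agent], threshold: float = 0.7) -> str:
    """Return the first agent whose confidence meets the threshold.

    `agents` is assumed to be ordered from least to most capable, so each
    failed check escalates to the next agent. If no agent is confident
    enough, the task falls through to a human.
    """
    for agent in agents:
        if agent.assess(task) >= threshold:
            return agent.name
    return HUMAN


# Illustrative agents with hard-coded confidence functions (assumptions).
junior = Agent("junior", lambda t: 0.9 if "bug" in t else 0.3)
senior = Agent("senior", lambda t: 0.8 if "refactor" in t else 0.2)

print(route("fix simple bug", [junior, senior]))
print(route("architectural refactor", [junior, senior]))
print(route("design new protocol", [junior, senior]))
```

Because self-reported confidence is the only signal here, this sketch inherits whatever calibration problems the agents have; it shows the routing shape only.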
Journey Context:
Fixed routing (Agent A always handles X) fails when tasks vary in difficulty. A coding agent might handle simple bug fixes confidently but struggle with architectural refactors. Without confidence awareness, the agent either fails silently by producing plausible-but-wrong output or wastes tokens retrying. The pattern: after receiving a task, the agent produces a quick confidence self-assessment before committing to execution. If the confidence is below threshold, the task is re-routed. This is analogous to circuit breakers in distributed systems, which stop sending work to a component that is likely to fail. Critical tradeoff: LLM confidence self-assessment is poorly calibrated, and models are often confidently wrong. Mitigate by combining self-assessment with task feature heuristics (codebase size, number of files affected, test coverage) and by tracking historical accuracy per agent per task type. Over time, the router learns which agents actually succeed at which tasks, reducing reliance on self-reported confidence.
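One way the mitigation could look, as a hedged sketch rather than a reference implementation: self-reported confidence is averaged with a task-feature heuristic, and the result is blended with the observed success rate for that agent and task type, with history gaining weight as outcomes accumulate. Every name (`TaskFeatures`, `heuristic_score`, `HistoryTracker`, `blended_confidence`) and every weight or scaling constant below is an assumption chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaskFeatures:
    files_affected: int
    codebase_kloc: float   # codebase size in thousands of lines
    test_coverage: float   # fraction in [0, 1]


def heuristic_score(f: TaskFeatures) -> float:
    """Ease score in [0, 1]; higher means easier. Weights are illustrative."""
    file_penalty = min(f.files_affected / 20.0, 1.0)
    size_penalty = min(f.codebase_kloc / 500.0, 1.0)
    coverage_penalty = 1.0 - f.test_coverage
    difficulty = 0.4 * file_penalty + 0.3 * size_penalty + 0.3 * coverage_penalty
    return 1.0 - difficulty


class HistoryTracker:
    """Tracks success/failure counts per (agent, task type)."""

    def __init__(self, prior_successes: float = 1.0, prior_failures: float = 1.0):
        self._prior_total = prior_successes + prior_failures
        # Each new key starts from the smoothing prior, so rates are never 0/0.
        self._counts = defaultdict(lambda: [prior_successes, prior_failures])

    def record(self, agent: str, task_type: str, success: bool) -> None:
        self._counts[(agent, task_type)][0 if success else 1] += 1

    def success_rate(self, agent: str, task_type: str) -> float:
        successes, failures = self._counts[(agent, task_type)]
        return successes / (successes + failures)

    def observations(self, agent: str, task_type: str) -> float:
        successes, failures = self._counts[(agent, task_type)]
        return successes + failures - self._prior_total


def blended_confidence(self_reported: float, features: TaskFeatures,
                       tracker: HistoryTracker, agent: str, task_type: str) -> float:
    """Blend self-assessment, heuristics, and history into one score."""
    n = tracker.observations(agent, task_type)
    history_weight = n / (n + 10.0)  # approaches 1 as evidence accumulates
    prior = 0.5 * self_reported + 0.5 * heuristic_score(features)
    return ((1.0 - history_weight) * prior
            + history_weight * tracker.success_rate(agent, task_type))


tracker = HistoryTracker()
hard_task = TaskFeatures(files_affected=20, codebase_kloc=500.0, test_coverage=0.0)
before = blended_confidence(0.9, hard_task, tracker, "coder", "refactor")
for _ in range(30):
    tracker.record("coder", "refactor", success=False)
after = blended_confidence(0.9, hard_task, tracker, "coder", "refactor")
print(before, after)
```

In this sketch an agent that keeps reporting 0.9 confidence on refactors while repeatedly failing them sees its blended score drop, which is the intended effect of reducing reliance on self-reported confidence.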
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:06:45.070261+00:00 - report_created - created