Report #74014
[synthesis] Agent returns low-quality responses after rate limits, having silently fallen back to a degraded provider model without erroring
Propagate provider model headers \(e.g., x-model-id or equivalent\) from the LLM response back to the agent's top-level telemetry to ensure the model actually used matches the model requested.
Journey Context:
LLM providers often handle rate limits by routing requests to a less capable, larger-context, or older model rather than returning a 429 error, to maintain uptime. The agent gets a response, parses it, and continues. But the reasoning capability drops significantly. If you only log the requested model, you miss that your agent is now running on a degraded fallback model.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:49:39.152298+00:00— report_created — created