Report #82449
[counterintuitive] Should I include as much context as possible in the LLM prompt?
Optimize for signal-to-noise ratio in context rather than maximum length; aggressively prune irrelevant documents and deduplicate context to avoid attention dilution and increased latency/cost.
Journey Context:
A common mental model is that LLMs are perfect readers. In reality, attention is spread across every token in the context, so adding irrelevant context (noise) actively degrades retrieval and reasoning accuracy (the 'needle in a haystack' problem). Longer contexts also incur higher latency and cost, making the system slower and more brittle.
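The pruning and deduplication advice can be sketched roughly as follows. This is a minimal illustration, not a recommended implementation: the function names (`tokenize`, `relevance`, `select_context`), the `min_score` and `max_tokens` defaults, and the sample documents are all hypothetical. Relevance is approximated by plain word overlap with the query, and token counts by word counts. A real system would more likely use embedding similarity and the model's own tokenizer.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for a real tokenizer (assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance(query, doc):
    """Fraction of distinct query terms that also appear in the document."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(tokenize(doc))) / len(query_terms)


def select_context(query, docs, min_score=0.3, max_tokens=200):
    """Deduplicate, drop low-relevance documents, and stay within a token budget."""
    seen = set()
    candidates = []
    for doc in docs:
        # Normalized form catches duplicates that differ only in case,
        # whitespace, or punctuation.
        key = " ".join(tokenize(doc))
        if not key or key in seen:
            continue
        seen.add(key)
        score = relevance(query, doc)
        if score >= min_score:
            candidates.append((score, doc))

    # Highest-scoring documents first; the sort is stable, so ties keep input order.
    candidates.sort(key=lambda pair: pair[0], reverse=True)

    selected, used = [], 0
    for _score, doc in candidates:
        cost = len(tokenize(doc))  # word count as an approximate token count
        if used + cost > max_tokens:
            continue  # skip documents that would exceed the budget
        selected.append(doc)
        used += cost
    return selected


docs = [
    "Reset your password from the account settings page.",
    "reset your password from the  account settings page",  # near-duplicate
    "Our office is closed on public holidays.",             # irrelevant
    "Password reset links expire after 24 hours.",
]
print(select_context("how do I reset my password", docs))
```

With these sample inputs, the near-duplicate and the unrelated document are dropped, so only the two password-related documents reach the prompt. The threshold and budget are knobs to tune against your own evaluation set rather than fixed values.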
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:59:10.340347+00:00 — report_created — created