Report #12497
[research] Agent outputs a common programming idiom or fact that is statistically likely but contextually wrong for the specific user query
Lower the sampling temperature and use a strict system prompt that prioritizes the provided context over general knowledge. Add a 'context adherence' classifier that rejects answers relying on parametric memory rather than the provided prompt.
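A minimal sketch of the reject-and-retry gate described above. It assumes a hypothetical `generate(prompt, temperature)` callable for the model and a hypothetical `adheres(context, answer)` callable for the context adherence classifier. Neither name comes from a real library, and the system prompt wording and default values are illustrative only.

```python
from typing import Callable, Optional

# Hypothetical system prompt wording; adjust for the model in use.
SYSTEM_PROMPT = (
    "Follow the constraints in the user's request exactly. "
    "Prefer the provided context over general knowledge."
)


def answer_with_gate(
    user_query: str,
    generate: Callable[[str, float], str],   # assumed model wrapper
    adheres: Callable[[str, str], bool],     # assumed adherence classifier
    temperature: float = 0.2,                # illustrative low temperature
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first candidate the classifier accepts, or None."""
    prompt = f"{SYSTEM_PROMPT}\n\n{user_query}"
    for _ in range(max_attempts):
        candidate = generate(prompt, temperature)
        # Reject candidates that the classifier flags as ignoring the context.
        if adheres(user_query, candidate):
            return candidate
    return None
```

Returning `None` after the last attempt leaves the caller to decide whether to escalate or report a failure, rather than silently passing through a rejected answer.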
Journey Context:
LLMs learn shortcuts. If 'numpy' almost always appears near 'array' in the training data, the model may import numpy even when the user explicitly asked for a pure Python list implementation. This is a factuality error: the model's prior (from training data) overpowers the evidence in the prompt. Standard decoding amplifies these priors. Grounding classifiers explicitly measure whether the output is entailed by the input, which catches this failure mode.
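For the numpy example specifically, a simple rule-based check can stand in for a learned grounding classifier. The sketch below is not an entailment model: it only parses generated Python with the standard `ast` module and reports imports the user ruled out. The function names and the `forbidden` set are hypothetical, and extracting that set from the user's query is assumed to happen elsewhere.

```python
import ast
from typing import Set


def imported_modules(code: str) -> Set[str]:
    """Collect the top-level names of all modules imported in the code."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            # "import numpy.linalg as la" -> "numpy"
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # "from numpy.linalg import norm" -> "numpy"
            # Relative imports ("from . import x") have no module and are skipped.
            modules.add(node.module.split(".")[0])
    return modules


def violates_constraints(code: str, forbidden: Set[str]) -> Set[str]:
    """Return the forbidden modules that the generated code imports."""
    return imported_modules(code) & forbidden


generated = "import numpy as np\narr = np.array([1, 2, 3])\n"
violations = violates_constraints(generated, {"numpy"})
```

A non-empty result means the answer leaned on the common idiom despite the stated constraint, so it can be rejected or regenerated. This catches only import-level violations; a broader adherence check would still need a learned classifier.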
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T16:12:34.427973+00:00— report_created — created