Report #52913
[counterintuitive] The model knows the right answer — it just needs better instructions to output it
Test the verification-generation gap: give the model a candidate answer and ask whether it is correct. If it can verify the answer but cannot generate it, no prompt refinement will bridge the gap; you need tool use, retrieval, or architectural changes. Stop iterating on instructions once the failure turns out to be a capability limitation.
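A minimal sketch of this test, under stated assumptions: `ask` is a hypothetical callable that sends a prompt to your model and returns its text reply, and the prompt wording, the label strings, and `stub_model` are illustrative only, not part of any real API. The check also shows the model a known-wrong candidate, so a model that answers "yes" to everything is not mistaken for one that can verify.

```python
from typing import Callable

# Hypothetical model interface: takes a prompt, returns the model's text reply.
Model = Callable[[str], str]


def diagnose(ask: Model, question: str, correct: str, wrong: str) -> str:
    """Compare free generation against verification for one question."""
    generated = ask(f"{question}\nAnswer with the final value only.").strip()
    if generated == correct:
        return "no_failure"

    def verifies(candidate: str) -> bool:
        reply = ask(
            f"{question}\nCandidate answer: {candidate}\n"
            "Is the candidate correct? Reply yes or no."
        )
        return reply.strip().lower().startswith("yes")

    # Must accept the correct answer AND reject the wrong one.
    if verifies(correct) and not verifies(wrong):
        return "verification_generation_gap"
    return "no_gap_detected"


def stub_model(prompt: str) -> str:
    """Toy stand-in for a real model: generates 54 but verifies correctly."""
    if "Candidate answer:" in prompt:
        candidate = prompt.split("Candidate answer:")[1].split()[0]
        return "yes" if candidate == "56" else "no"
    return "54"


result = diagnose(stub_model, "What is 7 * 8?", correct="56", wrong="54")
print(result)
```

A result of `verification_generation_gap` is the signal to stop refining instructions and change the task instead. In practice you would run this over many questions rather than one, since a single sample says little.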
Journey Context:
When a model fails, developers often assume it 'knows' the answer and only needs better prompting to express it. This conflates instruction-following failures with capability failures. There is a basic asymmetry: verification is easier than generation. A model can correctly confirm that '7 * 8 = 56' when shown it, yet still output '7 * 8 = 54' when generating freely. This isn't stubbornness; recognizing a correct answer and producing one draw on different pathways. If you give the model the answer and it can validate it, the knowledge is present but the generative pathway is weak or missing. No amount of 'think carefully' or 'be precise' creates a generative pathway that doesn't exist. The fix is to restructure the task: instead of asking the model to generate the answer from scratch, provide candidates for verification, or route to tools that handle the weak capability.
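The two restructurings named above could look roughly like the following. This is a sketch under assumptions, not a definitive implementation: `ask` is again a hypothetical prompt-to-reply callable, and `pick_verified` and `multiply_tool` are illustrative names. Where the candidates come from (retrieval, a tool, repeated sampling) is left open.

```python
from typing import Callable, Iterable, Optional

# Hypothetical model interface: takes a prompt, returns the model's text reply.
Model = Callable[[str], str]


def pick_verified(ask: Model, question: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate the model accepts, or None if it rejects all."""
    for candidate in candidates:
        reply = ask(
            f"{question}\nCandidate answer: {candidate}\n"
            "Is the candidate correct? Reply yes or no."
        )
        if reply.strip().lower().startswith("yes"):
            return candidate
    return None


def multiply_tool(a: int, b: int) -> str:
    """Tool route: compute the product in code instead of asking the model."""
    return str(a * b)
```

With a model that verifies reliably, `pick_verified(ask, "What is 7 * 8?", ["54", "56"])` uses the model only for the step it is good at. The tool route removes the weak generative step entirely, which is usually preferable when an exact tool exists.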
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:18:33.878615+00:00— report_created — created