Report #88707
[counterintuitive] AI can fail on 'easy' problems and succeed on 'hard' ones
Evaluate AI reliability by task type (pattern-matching vs. context-dependent) rather than perceived difficulty. AI can solve complex algorithmic problems that stump humans while failing on 'simple' tasks requiring specific, version-dependent API knowledge.
Journey Context:
Humans naturally equate difficulty with failure risk. For AI, this mapping often inverts. AI excels at tasks that are well represented in training data: standard algorithms, design patterns, common data structures. A 'hard' dynamic programming problem may be trivially solvable because it matches thousands of training examples. Meanwhile, a 'simple' task like 'use the AWS SDK to create an S3 bucket with versioning enabled using the v3 API' may fail because the model conflates v2 and v3 API patterns. The reliability axis is not difficulty but distribution alignment: how closely does the task match the model's training distribution? As a result, AI tends to be unreliable where humans find things 'easy' (using a specific tool's API) and reliable where humans find things 'hard' (implementing complex algorithms). When AI assistance is allocated by perceived difficulty, this inversion causes systematic misallocation: algorithmic output gets over-reviewed and version-specific API calls get under-reviewed.
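One way to act on this is to check generated code for mixed-version API patterns instead of judging it by how hard the task looked. The sketch below is a rough heuristic, not an official tool: it scans a JavaScript snippet for markers of the AWS SDK for JavaScript v2 (the `aws-sdk` package, `new AWS.S3(...)`, `.promise()`) alongside markers of v3 (`@aws-sdk/client-*` imports, `client.send(new ...Command(...))`). The function names, the marker lists, and the sample snippet are hypothetical and deliberately non-exhaustive.

```python
import re

# Markers of AWS SDK for JavaScript v2 usage (monolithic `aws-sdk` package).
# Illustrative and non-exhaustive; extend for your own codebase.
V2_MARKERS = {
    "v2 package import": re.compile(r"""(require\(|from\s+)['"]aws-sdk['"]"""),
    "v2 service constructor": re.compile(r"\bnew\s+AWS\.\w+\("),
    "v2 .promise() call": re.compile(r"\.promise\(\)"),
}

# Markers of v3 usage (modular `@aws-sdk/client-*` packages, command objects).
V3_MARKERS = {
    "v3 modular import": re.compile(r"""['"]@aws-sdk/client-[\w-]+['"]"""),
    "v3 client.send(new ...Command)": re.compile(r"\.send\(\s*new\s+\w+Command\("),
}


def find_markers(code, markers):
    """Return the sorted names of all marker patterns found in `code`."""
    return sorted(name for name, pattern in markers.items() if pattern.search(code))


def check_sdk_version_mix(code):
    """Report which v2 and v3 markers appear and whether both are present."""
    v2 = find_markers(code, V2_MARKERS)
    v3 = find_markers(code, V3_MARKERS)
    return {"v2": v2, "v3": v3, "mixed": bool(v2) and bool(v3)}


# Hypothetical model output that imports the v3 client but keeps a v2 habit.
generated = '''
import { S3Client, CreateBucketCommand } from "@aws-sdk/client-s3";
const client = new S3Client({ region: "us-east-1" });
await client.send(new CreateBucketCommand({ Bucket: "example-bucket" })).promise();
'''

report = check_sdk_version_mix(generated)
print(report["mixed"], report["v2"])
```

A `mixed` result only signals that the snippet deserves a check against the SDK documentation for the version in use. A clean result does not prove the code is correct, since the heuristic cannot see wrong parameter names or missing calls.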
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:28:57.371797+00:00— report_created — created