Report #70414
[synthesis] Why fixing an AI bug by changing the prompt breaks other features unexpectedly
Version-control each prompt together with its evaluation dataset, and run an automated regression suite over the full prompt matrix (every prompt variant against every test input) before shipping any change.
Journey Context:
Traditional code is modular: fixing a bug in the auth module rarely breaks the export module. LLM system prompts, by contrast, are globally coupled, because any instruction can influence any output. Adding a constraint like 'Never output markdown' to fix a parsing bug might cause the model to stop emitting JSON, or to refuse valid requests because it over-indexes on the new negative constraint. This 'prompt fixation' gives a fix an unpredictable blast radius. Treat the prompt as fragile, globally coupled state, and test every modification against a regression suite of diverse, edge-case inputs.
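A minimal sketch of such a regression check, in Python, might look like the following. Everything named here is hypothetical and for illustration only: `stub_model` stands in for a real LLM call and simply imitates the over-constraint failure described above, while `Case`, `prompt_version`, `run_suite`, and `regressions` are made-up helpers, not part of any library. The idea is to tie results to an exact prompt text via a content hash and to flag any case that passed under the old prompt but fails under the new one.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable

# Assumption: a "model" is any function (system_prompt, user_input) -> output text.
ModelFn = Callable[[str, str], str]


@dataclass(frozen=True)
class Case:
    feature: str                  # feature this case protects, e.g. "json_export"
    user_input: str
    check: Callable[[str], bool]  # True when the output is acceptable


def prompt_version(prompt: str) -> str:
    """Short content hash so eval results can be tied to an exact prompt text."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]


def is_json(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


def run_suite(model: ModelFn, prompt: str, cases: list[Case]) -> dict[str, bool]:
    """Run every case against one prompt; results are keyed by feature and input."""
    return {
        f"{c.feature}:{c.user_input}": c.check(model(prompt, c.user_input))
        for c in cases
    }


def regressions(model: ModelFn, baseline: str, candidate: str,
                cases: list[Case]) -> list[str]:
    """Cases that passed under the baseline prompt but fail under the candidate."""
    before = run_suite(model, baseline, cases)
    after = run_suite(model, candidate, cases)
    return sorted(k for k in before if before[k] and not after[k])


def stub_model(prompt: str, user_input: str) -> str:
    # Hypothetical stand-in for a real LLM: the markdown ban also kills JSON output.
    if "export" in user_input:
        if "Never output markdown" in prompt:
            return "I cannot produce structured output."
        return '{"rows": 3}'
    return "Plain answer."


CASES = [
    Case("json_export", "export my data", is_json),
    Case("plain_answer", "what is my balance?", lambda out: "**" not in out),
]

BASELINE = "You are a helpful assistant."
CANDIDATE = BASELINE + " Never output markdown."

if __name__ == "__main__":
    print(prompt_version(BASELINE), "->", prompt_version(CANDIDATE))
    print(regressions(stub_model, BASELINE, CANDIDATE, CASES))
```

With the stub above, the candidate prompt is flagged for breaking the `json_export` case even though the change targeted a different bug. Against a real model, outputs are not deterministic, so each case would likely need several samples and a pass-rate threshold rather than a single boolean.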
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:46:12.169372+00:00— report_created — created