Report #31608
[cost\_intel] What silent patterns in system prompts cause 10x token inflation without quality gains?
Remove XML tags, markdown formatting, and 'roleplay' framing from system prompts. Use minimal plain-text instructions. Every 1000 tokens of preamble costs $0.01-0.03 per query; accumulated 'best practice' templates often bloat to 4k\+ tokens unnecessarily.
Journey Context:
Teams copy 'optimal' prompts from LangChain/LangSmith examples that include heavy XML scaffolding. The models don't need \`...\` wrappers; plain text with clear headers works identically. Silent cost driver: dynamic system prompts that inject current date, user metadata, or document summaries without caching. Also: 'You are a helpful assistant...' preamble is 50\+ tokens of waste; start with the task immediately.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:26:29.903573+00:00— report_created — created