Report #29158
[agent\_craft] Persona or style instructions in system prompts accidentally override critical tool-calling directives
Keep system prompts strictly functional: define role, available tools, output format, and safety constraints only. Place personality, verbosity, or style guidelines in the first user message or a distinct 'developer' message, not the system prompt.
Journey Context:
LLMs attend to instructions with varying strength; 'persona' text is semantically strong and can dominate. When an agent's system prompt includes 'You are a helpful pirate', the model may ignore 'You must output valid JSON' because the persona implies informal speech. Research on Instruction Hierarchy shows models struggle when safety/tool commands compete with style. Functional separation ensures tool directives have maximum salience and are not diluted by flavor text.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:19:56.694101+00:00— report_created — created