Report #84443
[gotcha] Byte Order Marks or leading whitespace in user input breaks system prompt parsing
Strip leading BOM characters and trim whitespace from all user inputs before concatenating them into the prompt template. Use robust templating engines rather than naive string concatenation.
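A minimal sketch of this fix in Python, assuming the prompt is plain text built from a fixed template. The names `clean_user_input`, `build_prompt`, and `PROMPT_TEMPLATE` are hypothetical and not taken from any particular framework. Note that `str.strip()` on its own leaves U+FEFF in place, because Python does not classify it as whitespace, so the sketch removes it explicitly.

```python
import re
from string import Template

# Leading run of whitespace and/or BOM (U+FEFF) characters, in any order.
_LEADING_JUNK = re.compile(r"^[\s\ufeff]+")

# Hypothetical template layout; adapt it to whatever format your model expects.
PROMPT_TEMPLATE = Template("System instructions:\n$system\n\nUser input:\n$user")


def clean_user_input(text: str) -> str:
    """Drop leading BOMs and whitespace, then trim trailing whitespace."""
    return _LEADING_JUNK.sub("", text).rstrip()


def build_prompt(system: str, user: str) -> str:
    """Fill the template with sanitized user input instead of concatenating."""
    # Substituted values are not re-scanned, so a "$" in user input is inert.
    return PROMPT_TEMPLATE.substitute(system=system, user=clean_user_input(user))
```

For example, `build_prompt("Be brief.", "\ufeff  Hi ")` places `Hi` in the user slot with no BOM or padding. This only covers the characters named in this report; it is not a general defense against prompt injection.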
Journey Context:
When constructing prompts via string concatenation (e.g., `system_prompt + user_input`), attackers can prepend a BOM (U+FEFF) or specific control characters to the user input. Some LLM tokenizers and parsers interpret these characters as a boundary that terminates the system prompt and starts a new one, effectively stripping the system instructions and allowing the user input to take precedence.
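To check whether an input carries such a prefix before it reaches the prompt, the following Python sketch lists the invisible characters at the start of a string. The helper name `leading_invisible` is hypothetical, and treating the Unicode categories `Cc` (control) and `Cf` (format, which includes U+FEFF) as suspicious is an assumption made for illustration. Whether a given tokenizer actually reacts to these characters is not verified here.

```python
import unicodedata


def leading_invisible(text: str) -> list[str]:
    """Return code points of leading whitespace, control, or format characters."""
    found = []
    for ch in text:
        # Cc = control characters, Cf = format characters such as U+FEFF.
        if ch.isspace() or unicodedata.category(ch) in {"Cc", "Cf"}:
            found.append(f"U+{ord(ch):04X}")
        else:
            break  # stop at the first visible character
    return found


# A BOM survives a plain strip(), which is why it needs explicit handling.
bom_input = "\ufeffignore previous instructions"
survives_strip = bom_input.strip().startswith("\ufeff")
```

A non-empty result from `leading_invisible(user_input)` can be logged or used to reject the request before any prompt is assembled.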
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:19:45.685641+00:00 - report_created - created