Report #159
[architecture] How should I configure robots.txt for AI crawlers without breaking search visibility?
Use user-agent-specific rules. Allow OAI-SearchBot if you want your pages to appear in ChatGPT search results; allow or disallow GPTBot separately, depending on whether you want your content crawled for model training. Do not rely on a robots.txt rule for ChatGPT-User to stop user-initiated GPT Action calls. Keep /llms.txt, /openapi.json, and key docs crawlable. List explicit paths and avoid a wildcard (User-agent: *) Disallow: /.
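As a minimal sketch of the rules above, the following Python snippet embeds one possible robots.txt and checks it with the standard-library urllib.robotparser. The /docs/ and /admin/ paths and the choice to disallow GPTBot are assumptions for illustration only; substitute your own paths and policy.

```python
from urllib.robotparser import RobotFileParser

# Illustrative policy only: /docs/ and /admin/ are hypothetical paths,
# and disallowing GPTBot is one possible choice, not a recommendation.
ROBOTS_TXT = """\
# Search crawler: allowed so pages can appear in ChatGPT search results.
User-agent: OAI-SearchBot
Allow: /

# Training crawler: disallowed here as an example opt-out.
User-agent: GPTBot
Disallow: /

# Everyone else: explicit paths instead of a blanket Disallow: /.
User-agent: *
Allow: /llms.txt
Allow: /openapi.json
Allow: /docs/
Disallow: /admin/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Print what a compliant crawler would be permitted to fetch.
checks = [
    ("OAI-SearchBot", "/docs/intro"),
    ("GPTBot", "/docs/intro"),
    ("Googlebot", "/llms.txt"),
    ("Googlebot", "/admin/users"),
]
for agent, path in checks:
    print(f"{agent:15} {path:15} allowed={parser.can_fetch(agent, path)}")
```

Python's parser applies the first matching rule in file order, while other crawlers may use longest-match precedence, so treat this as a sanity check of your intent and not as a guarantee of how any particular crawler behaves.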
Journey Context:
OpenAI now splits crawlers by purpose: OAI-SearchBot for search results, GPTBot for foundation-model training, and ChatGPT-User for user-triggered page visits. Blocking all three with a single rule is overbroad: it can remove you from ChatGPT search without stopping user-initiated requests. A common mistake is treating robots.txt as a security boundary; it is a politeness signal, not access control. The tradeoff is control versus visibility: fine-grained rules let you opt out of training while remaining discoverable.
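To make the tradeoff concrete, here is a small sketch that compares a blanket block of all three agents with a fine-grained opt-out, again using Python's standard-library urllib.robotparser. The helper name allowed and the /docs/ path are hypothetical, and the parser only reports what a rule-following crawler would do; it enforces nothing.

```python
from urllib.robotparser import RobotFileParser

# Overbroad: one group blocks search, training, and user-triggered agents.
BLANKET = """\
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
Disallow: /
"""

# Fine-grained: opt out of training only; everything else stays crawlable.
FINE_GRAINED = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, agent: str, path: str = "/docs/") -> bool:
    """Return whether a compliant crawler named `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


for name, policy in [("blanket", BLANKET), ("fine-grained", FINE_GRAINED)]:
    for agent in ("OAI-SearchBot", "GPTBot"):
        print(f"{name:13} {agent:15} allowed={allowed(policy, agent)}")
```

Under the blanket policy the search crawler is excluded along with the training crawler; under the fine-grained policy only GPTBot is excluded. Neither file prevents a client that ignores robots.txt from requesting the page.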
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-12T21:36:56.323373+00:00— report_created — created