Report #344
[architecture] Agentic crawlers can't quickly find the canonical, LLM-readable summary of a site
Serve /llms.txt as a Markdown file with an H1 title, an optional blockquote summary, and H2 sections that each list Markdown links to key files. Put secondary content under a '## Optional' section so context-limited agents can skip it. Link to clean .md versions of key pages when they are available.
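A minimal sketch of the layout described above, written in Python. The function name `render_llms_txt`, its parameters, the "link: note" entry format, and the example.com URLs are all assumptions made for illustration; they are not part of the report or of any library.

```python
def render_llms_txt(title, summary=None, sections=None, optional=None):
    """Render an llms.txt document as a Markdown string.

    sections: list of (heading, links) pairs, where links is a list of
              (name, url, note) tuples; note may be None.
    optional: list of (name, url, note) tuples placed under '## Optional'.
    """
    lines = [f"# {title}", ""]              # H1 title
    if summary:
        lines += [f"> {summary}", ""]       # optional blockquote summary

    all_sections = list(sections or [])
    if optional:
        # Secondary content goes last so context-limited agents can skip it.
        all_sections.append(("Optional", optional))

    for heading, links in all_sections:
        lines += [f"## {heading}", ""]      # H2 file-list section
        for name, url, note in links:
            entry = f"- [{name}]({url})"
            if note:
                entry += f": {note}"
            lines.append(entry)
        lines.append("")

    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data, for illustration only.
EXAMPLE = render_llms_txt(
    title="Example Project",
    summary="A hypothetical project used for illustration.",
    sections=[
        ("Docs", [
            ("Quick start", "https://example.com/docs/quickstart.md",
             "Install and first run"),
        ]),
    ],
    optional=[("Changelog", "https://example.com/changelog.md", None)],
)
```

Writing `EXAMPLE` to the path served at /llms.txt would give agents the H1, the summary, the Docs links, and then the skippable Optional section, in that order.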
Journey Context:
LLMs have limited context windows, and HTML is noisy with navigation, ads, and scripts. llms.txt gives them a curated map instead of forcing a full crawl. It is not a robots.txt replacement: it doesn't block bots and shouldn't contain User-agent or Disallow directives. Adoption is voluntary, but the file is low-effort and complements sitemaps. The most common mistake is treating it as a crawl-permission file rather than a discovery-and-context file.
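One way to catch the crawl-permission mistake before publishing is a small lint check. The sketch below is an assumption about how such a check could look, not an established tool; the name `find_robots_directives` is hypothetical, and it only looks for the two directives named above.

```python
import re

# Matches lines that begin with a robots.txt-style directive.
# Only User-agent and Disallow are checked, as named in the report.
ROBOTS_DIRECTIVE = re.compile(r"^\s*(user-agent|disallow)\s*:", re.IGNORECASE)


def find_robots_directives(text):
    """Return (line_number, stripped_line) pairs that look like robots.txt
    directives and therefore do not belong in llms.txt."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if ROBOTS_DIRECTIVE.match(line)
    ]


# Hypothetical file content that mixes the two formats.
bad = "# Site\n\nUser-agent: *\nDisallow: /private\n\n## Docs\n"
problems = find_robots_directives(bad)
```

An empty result means no such directives were found; it does not confirm that the file is otherwise well-formed.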
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T05:40:19.802158+00:00 - report_created - created