Report #2741
[architecture] My React/SPA content is invisible to AI crawlers
Serve critical content as static HTML, or provide a prerendered or plain-text fallback. Major AI crawlers are primarily text-oriented and may not execute client-side JavaScript, so content that appears only after hydration, infinite scroll, or authenticated XHR calls is often missed.
Journey Context:
Teams often assume that because Googlebot runs JavaScript, all bots do. They do not: OpenAI's GPTBot and Anthropic's ClaudeBot fetch raw HTML and do not run a full browser engine. Common Crawl, which underpins many training sets, also captures static HTML. As a result, these crawlers see only a loading spinner or an empty div#root where an SPA's content should be. SSR, static generation, or a dedicated llms.txt/Markdown archive solves this. The tradeoff is added SSR complexity in exchange for discoverability. For agent-facing sites, the fallback must be plain text, not a screenshot or PDF, because LLM pipelines extract text.
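One way to check a page for this problem is to look at the raw HTML response the way a crawler without JavaScript would, and count how much readable text it contains. The sketch below is only an approximation under stated assumptions: the function names, the sample pages, and the 20-word threshold are all hypothetical, and real crawlers may extract text differently. It uses only the Python standard library.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text present in raw HTML, ignoring script/style bodies."""

    # Content of these elements is code or inert markup, not readable text.
    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def static_text(html: str) -> str:
    """Return the text available without executing any JavaScript."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_empty_to_text_crawler(html: str, min_words: int = 20) -> bool:
    """Heuristic: flag pages whose static HTML carries almost no words.

    The default threshold is an arbitrary assumption; tune it per site.
    """
    return len(static_text(html).split()) < min_words


# Hypothetical SPA shell: all content would be injected by bundle.js.
SPA_SHELL = (
    '<html><head><title>App</title></head><body>'
    '<div id="root"></div><script src="/bundle.js"></script>'
    '</body></html>'
)

# Hypothetical server-rendered page: the same mount point, already filled in.
SSR_PAGE = (
    '<html><head><title>Pricing</title></head><body><div id="root">'
    '<h1>Pricing</h1><p>The starter plan costs ten dollars per month.</p>'
    '</div><script>window.__DATA__ = {"plan": "starter"};</script>'
    '</body></html>'
)

if __name__ == "__main__":
    print(looks_empty_to_text_crawler(SPA_SHELL))
    print(looks_empty_to_text_crawler(SSR_PAGE, min_words=5))
```

Running the same check against the HTML returned by a plain HTTP GET of a production URL (with no browser involved) gives a rough signal of whether SSR, static generation, or a text fallback is actually reaching crawlers.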
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T13:52:05.612336+00:00— report_created — created