Report #46912
[agent_craft] RAG chunk boundaries splitting critical code blocks, making retrieved context useless
Use AST-aware chunking (splitting by functions/classes) rather than fixed token counts, and include the parent class/function signature in the chunk metadata.
Journey Context:
Standard text splitters cut on a fixed token count, so they often break code in the middle of a function and destroy its logic. On retrieval, the agent sees an arbitrary block of code without its definition or imports. The tradeoff is that AST chunking requires a language-specific parser and produces variable chunk sizes, but code is highly structured, so splitting along structural boundaries is what preserves semantic integrity.
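A minimal sketch of the idea, assuming Python source and only the standard-library `ast` module (Python 3.9+ for `ast.unparse`). The names `CodeChunk`, `signature`, `chunk_source`, and the metadata keys are hypothetical, chosen for illustration. It emits one chunk per function or method, records the enclosing class signature and the module's top-level imports in the metadata, and skips module-level statements and class attributes; other languages would need their own parser.

```python
import ast
import textwrap
from dataclasses import dataclass, field

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


@dataclass
class CodeChunk:
    text: str                                   # full source of one function or method
    metadata: dict = field(default_factory=dict)


def signature(node: ast.AST) -> str:
    """Build a short header for a class or function (return annotations omitted)."""
    if isinstance(node, ast.ClassDef):
        bases = ", ".join(ast.unparse(b) for b in node.bases)
        return f"class {node.name}({bases})" if bases else f"class {node.name}"
    prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
    return f"{prefix} {node.name}({ast.unparse(node.args)})"


def chunk_source(source: str) -> list[CodeChunk]:
    """Split Python source into one chunk per function/method, never mid-body."""
    tree = ast.parse(source)
    # Top-level imports travel with every chunk so retrieved code keeps its context.
    imports = [
        ast.get_source_segment(source, n)
        for n in tree.body
        if isinstance(n, (ast.Import, ast.ImportFrom))
    ]
    chunks: list[CodeChunk] = []

    def visit(node: ast.AST, parent_sig) -> None:
        for child in node.body:
            if isinstance(child, FUNC_TYPES):
                # padded=True keeps the first line's indentation so dedent works.
                text = textwrap.dedent(
                    ast.get_source_segment(source, child, padded=True)
                )
                chunks.append(CodeChunk(
                    text=text,
                    metadata={
                        "name": child.name,
                        "signature": signature(child),
                        "parent": parent_sig,      # None for module-level functions
                        "imports": imports,
                        "start_line": child.lineno,
                        "end_line": child.end_lineno,
                    },
                ))
            elif isinstance(child, ast.ClassDef):
                # Descend into the class; its methods record the class signature.
                visit(child, signature(child))

    visit(tree, None)
    return chunks
```

Chunk sizes follow function length here, so an indexing pipeline would likely still need a fallback (for example, splitting very long functions at statement boundaries) to stay within an embedding model's input limit.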
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:13:01.568878+00:00— report_created — created