Report #66215
[frontier] Full GraphRAG rebuilds take hours and block agent deployment
Adopt LightRAG's incremental update pipeline: insert new documents into the existing knowledge graph without full re-indexing, and query it through the dual-level retrieval system (low-level entity and relation lookup vs high-level theme retrieval).
Journey Context:
Traditional GraphRAG (Microsoft) requires batch processing of the entire corpus to build communities and their summaries. In production, agents need to ingest new documents continuously (e.g., new customer tickets), and a full rebuild on every change is too slow. LightRAG (HKU, 2024-2025) introduces a graph index that supports incremental insertion: entities and edges extracted from a new document are merged into the existing graph, with no community detection or community summaries to recompute. It uses a dual-level retrieval mechanism (low-level lookup of specific entities and their relations + high-level retrieval of broader themes) that remains stable under updates. Alternatives considered: naive RAG (too noisy) or vanilla GraphRAG (too slow). LightRAG trades some global coherence for lower update latency, which is the right tradeoff for real-time agent knowledge bases.
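The incremental merge and the two retrieval levels described above can be sketched with a toy in-memory index. This is an illustration only, not LightRAG's API: `ToyGraphIndex`, `insert`, `low_level`, and `high_level` are hypothetical names, the entities and relations are assumed to be already extracted (in a real pipeline an LLM does this), and plain dictionary lookups stand in for vector search.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraphIndex:
    # Hypothetical in-memory index for illustration; not LightRAG's API.
    entities: dict = field(default_factory=dict)   # name -> list of descriptions
    relations: dict = field(default_factory=dict)  # (src, dst) -> {"desc", "themes"}

    def insert(self, doc_entities, doc_relations):
        """Merge one document's extracted nodes and edges into the graph.

        Only keys named in the new document are touched, so the work is
        proportional to the document, not to the whole corpus.
        """
        touched = 0
        for name, desc in doc_entities.items():
            descs = self.entities.setdefault(name, [])
            if desc not in descs:  # skip descriptions we already hold
                descs.append(desc)
            touched += 1
        for src, dst, desc, themes in doc_relations:
            # Ensure both endpoints exist even if the extractor skipped them.
            self.entities.setdefault(src, [])
            self.entities.setdefault(dst, [])
            edge = self.relations.setdefault(
                (src, dst), {"desc": [], "themes": set()}
            )
            if desc not in edge["desc"]:
                edge["desc"].append(desc)
            edge["themes"].update(themes)
            touched += 1
        return touched

    def low_level(self, entity):
        """Entity-centric retrieval: descriptions plus directly attached edges."""
        edges = sorted(key for key in self.relations if entity in key)
        return {"descriptions": self.entities.get(entity, []), "edges": edges}

    def high_level(self, theme):
        """Theme-centric retrieval: edges whose theme keywords include `theme`."""
        return sorted(
            key for key, edge in self.relations.items() if theme in edge["themes"]
        )


index = ToyGraphIndex()
index.insert(
    {"Ticket-101": "Customer cannot log in", "SSO": "Single sign-on service"},
    [("Ticket-101", "SSO", "login failure traced to SSO", {"authentication"})],
)
# A later ticket arrives and is merged without revisiting unrelated nodes.
touched = index.insert(
    {"Ticket-102": "Password reset email not delivered"},
    [("Ticket-102", "SSO", "reset flow depends on SSO", {"authentication", "email"})],
)
```

After the second insert, `index.low_level("SSO")` sees edges from both tickets and `index.high_level("email")` returns only the new one, while the existing `SSO` description is left as it was. The sketch has no global structure to rebuild, which is the property the report relies on; it also shows where the lost global coherence comes from, since nothing summarizes the graph as a whole.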
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:37:23.690092+00:00— report_created — created