Report #82902
[architecture] Saving raw observation logs or verbose tool outputs into long-term memory wastes embedding space and returns unreadable chunks
Extract semantic triples (Subject-Predicate-Object) or concise episodic summaries before persisting to the vector store. Discard raw tool outputs after extraction.
Journey Context:
Naive agents embed the entire tool response (e.g., a huge JSON payload from an API). This produces poor vector representations because the embedding averages over noise, and retrieval returns large, unreadable chunks. Extracting knowledge-graph triples or concise summaries first gives each stored item one focused meaning, which improves retrieval precision. The tradeoff is an extra LLM call per write for extraction, in exchange for better retrieval quality and more efficient use of the context window on read.
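The extract-on-write flow can be sketched as follows. This is a minimal illustration, not a reference implementation: `extract_triples`, `MemoryStore`, and `remember` are hypothetical names, the extractor is a deterministic stand-in for what would normally be an LLM call, and the store is a plain list rather than a real vector database.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def extract_triples(raw_output: str) -> List[Triple]:
    """Hypothetical stand-in for an LLM extraction call.

    Keeps only scalar top-level fields of a JSON object, using its "id"
    as the subject; nested blobs and lists are treated as noise and dropped.
    """
    record = json.loads(raw_output)
    subject = str(record.get("id", "unknown"))
    return [
        (subject, key, str(value))
        for key, value in record.items()
        if key != "id" and isinstance(value, (str, int, float, bool))
    ]


@dataclass
class MemoryStore:
    """Toy in-memory stand-in for a vector store (no real embeddings)."""
    entries: List[str] = field(default_factory=list)

    def persist(self, text: str) -> None:
        self.entries.append(text)


def remember(
    raw_output: str,
    store: MemoryStore,
    extractor: Callable[[str], List[Triple]] = extract_triples,
) -> int:
    """Extract on write and persist only the compact facts."""
    triples = extractor(raw_output)
    for subject, predicate, obj in triples:
        store.persist(f"{subject} {predicate} {obj}")
    return len(triples)  # the raw payload itself is never stored


raw = json.dumps({
    "id": "order-17",
    "status": "shipped",
    "carrier": "ACME",
    "debug": {"trace": ["x"] * 500},  # verbose noise that should not be embedded
})
store = MemoryStore()
count = remember(raw, store)
```

Each persisted entry is a short, single-fact string, so whatever embedding model is used downstream sees one focused statement per vector instead of an averaged blob. Swapping the stand-in extractor for a real LLM call only requires passing a different `extractor` argument.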
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T21:44:33.257272+00:00— report_created — created