Report #29479
[tooling] Processing multi-GB JSON files with jq fails with out-of-memory errors or excessive swap usage
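Use jq -n --stream to parse the JSON in streaming mode. The parser then emits one path-value event at a time, and fromstream(inputs) rebuilds only the values you select instead of loading the entire document into memory. The -n flag is required whenever the filter reads events through inputs.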
Journey Context:
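By default, jq parses each input document fully into memory before running the filter, which causes OOM crashes on large log files or API dumps. The --stream flag turns the parser into a SAX-like event stream that emits [path, leaf] arrays, plus one-element [path] arrays that mark the end of each object or array. Combined with select, truncate_stream, and fromstream, this lets you filter massive JSON while holding only the values being rebuilt, so peak memory tracks the largest reconstructed value rather than the whole file. The tradeoff is more complex syntax (the objects you want must be reconstructed from events), but it is hard to avoid at GB scale, where Python's json.load() would also have to hold the whole document in memory. Two details matter. First, -n is required: without it jq consumes the first event as the filter input (.) and inputs never sees it. Second, fromstream emits a value only after an event that closes a top-level path, so selecting on .[0][0] alone produces output only when desired_key happens to be the last key in the document; stripping the leading key with truncate_stream avoids that. Pattern: jq -n --stream 'fromstream(1|truncate_stream(inputs|select(.[0][0]=="desired_key")))' huge.json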
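A minimal sketch of the streaming pattern on a tiny file, to make the event format and the reconstruction step concrete. The file name sample.json and the keys meta, desired_key, and tail are made up for illustration; the filters use only --stream, -n, inputs, select, truncate_stream, and fromstream. Both reconstruction forms assume the value under the key is an object or array (the second assumes its elements are too); scalars at the truncated depth are dropped.

```shell
#!/bin/sh
# Hypothetical sample data standing in for a multi-GB file.
tmp=$(mktemp -d)
cat > "$tmp/sample.json" <<'EOF'
{"meta": {"version": 1}, "desired_key": [{"id": 1}, {"id": 2}], "tail": true}
EOF

# 1. Raw event stream: [path, leaf] pairs plus one-element [path]
#    closing events, one per line.
events=$(jq -c --stream '.' "$tmp/sample.json")
printf '%s\n' "$events"

# 2. Rebuild the whole value stored under desired_key.
#    truncate_stream with depth 1 strips the leading key so that
#    fromstream sees the closing event it needs.
whole=$(jq -cn --stream '
  fromstream(1 | truncate_stream(inputs | select(.[0][0] == "desired_key")))
' "$tmp/sample.json")
printf '%s\n' "$whole"    # [{"id":1},{"id":2}]

# 3. Emit each array element as soon as it is complete (depth 2),
#    so only one element is held in memory at a time.
items=$(jq -cn --stream '
  fromstream(2 | truncate_stream(inputs | select(.[0][0] == "desired_key")))
' "$tmp/sample.json")
printf '%s\n' "$items"    # {"id":1} then {"id":2}, one per line

rm -rf "$tmp"
```

For a genuinely large array, the depth-2 form is usually the one to reach for, since the depth-1 form still materializes the entire value under the key before printing it.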
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:52:18.225311+00:00— report_created — created