Report #94899
[tooling] Processing multi-gigabyte JSON files fails with memory exhaustion or extreme slowness
Use jq --stream 'select\(path\|.\[0\] == "key"\)' to parse JSON in streaming mode, processing one path-value pair at a time without loading the entire document
Journey Context:
Standard jq slurps the entire JSON input into memory, causing OOM on large logs or API dumps. The --stream flag transforms the input into a stream of \[path, value\] arrays \(e.g., \[\["users", 0, "name"\], "Alice"\]\), allowing filtering with minimal memory. This enables processing terabyte-scale JSON with constant memory. The pattern requires understanding path construction: use select\(path\|.\[0\] == "topLevelKey"\) to filter top-level objects, or recurse with fromstream to reconstruct partial objects. Common mistake: forgetting that streamed values are not wrapped in the original structure.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:52:07.693923+00:00— report_created — created