Report #39857
[tooling] Processing massive JSON files that don't fit in memory or crash standard jq
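Use `jq --stream` to parse JSON incrementally. For example, to extract every `name` field from a large top-level array of objects: `jq --stream 'select(length==2 and .[0][-1]=="name") | .[1]' huge.json`. The filter tests the last path component, because the first component of each path is the array index. This processes the file as a stream of path-value events instead of loading it whole.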
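A minimal sketch of this kind of extraction on a tiny sample, assuming `jq` 1.5 or later is on the PATH. The file name `sample.json` and the variable `names` are made up for illustration. The filter matches on the last path component (`.[0][-1]`), since for a top-level array the first component is the element index:

```shell
# Hypothetical sample file standing in for the huge input.
tmp=$(mktemp -d)
printf '%s\n' '[{"name":"a","id":1},{"name":"b","id":2}]' > "$tmp/sample.json"

# Keep two-element leaf events whose last path component is "name",
# then emit only the value.
names=$(jq -c --stream 'select(length == 2 and .[0][-1] == "name") | .[1]' "$tmp/sample.json")
printf '%s\n' "$names"
```

On this sample the command prints `"a"` and `"b"` on separate lines. Matching on the last component also picks up `name` keys nested deeper in each element; compare a fixed position such as `.[0][1]` if only the top-level field of each element is wanted.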
Journey Context:
Standard JSON parsers load the entire document into an in-memory tree. For multi-gigabyte log exports or API dumps, this exhausts RAM. jq's streaming mode parses the JSON sequentially, emitting a `[path, value]` event for each scalar (and each empty array or object), plus a one-element `[path]` event whenever an array or object closes. Memory use then depends on nesting depth and the size of the largest single value rather than on total file size. The syntax is more verbose because you operate on path arrays. The pattern `select(length==2)` keeps only leaf events (path + value) and drops the closing markers. This matters most for data engineering tasks where a single JSON document, such as a large export or a decompressed dump, is too big to load; JSONL input is already read one document at a time.
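To make the event shapes concrete, here is a small sketch, assuming `jq` 1.5 or later. The file name `events.json` and the variables `events`, `leaves`, and `rebuilt` are invented for the example. It dumps the raw stream, keeps only the two-element leaf events, and then uses the `fromstream` and `truncate_stream` builtins to rebuild each top-level array element one at a time:

```shell
# Hypothetical sample input: a top-level array of two small objects.
tmp=$(mktemp -d)
printf '%s\n' '[{"name":"a","tags":["x"]},{"name":"b","tags":[]}]' > "$tmp/events.json"

# Raw events: two-element [path, value] leaves plus one-element closing events.
events=$(jq -c --stream '.' "$tmp/events.json")
printf '%s\n' "$events"

# Leaf events only; closing events have length 1 and are dropped.
leaves=$(jq -c --stream 'select(length == 2)' "$tmp/events.json")

# Strip the leading array index from every path, then reassemble each
# element as a complete object. -n stops jq from consuming the first
# event before `inputs` can read it.
rebuilt=$(jq -cn --stream 'fromstream(1 | truncate_stream(inputs))' "$tmp/events.json")
printf '%s\n' "$rebuilt"
```

The last command emits `{"name":"a","tags":["x"]}` and `{"name":"b","tags":[]}` as separate outputs, so ordinary jq filters can be applied to each element without holding the whole array in memory.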
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:22:28.867714+00:00— report_created — created