I get a very large JSON stream (several GB) from `curl` and try to process it with `jq`.
The relevant output I want to parse with `jq` is packed in a document representing the result structure:

```
{
  "results": [
    {
      "columns": ["n"], // get this
      "data": [
        {"row": [{"key1": "row1", "key2": "row1"}], "meta": [{"key": "value"}]},
        {"row": [{"key1": "row2", "key2": "row2"}], "meta": [{"key": "value"}]}
        // ... millions of rows
      ]
    }
  ],
  "errors": []
}
```
I want to extract the `row` data with `jq`. This is simple:

```
curl XYZ | jq -r -c '.results[0].data[].row[]'
```
Result:
{"key1": "row1", "key2": "row1"}{"key1": "row2", "key2": "row2"}
However, this always waits until `curl` has completed, since without `--stream`, `jq` has to read the entire multi-gigabyte document before it can apply the filter.
I played with the `--stream` option, which is made for dealing with this. I tried the following command, but it also waits until the full object is returned from `curl`:

```
curl XYZ | jq -n --stream 'fromstream(1|truncate_stream(inputs)) | .[].data[].row[]'
```
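For reference, this is what the raw `--stream` events look like for a cut-down, one-row version of the document; each event is a `[path, leaf]` pair or a closing `[path]`:

```
$ echo '{"results":[{"columns":["n"],"data":[{"row":[{"key1":"row1"}],"meta":[{"key":"value"}]}]}],"errors":[]}' \
    | jq -c --stream .
[["results",0,"columns",0],"n"]
[["results",0,"columns",0]]
[["results",0,"data",0,"row",0,"key1"],"row1"]
[["results",0,"data",0,"row",0,"key1"]]
[["results",0,"data",0,"row",0]]
[["results",0,"data",0,"meta",0,"key"],"value"]
[["results",0,"data",0,"meta",0,"key"]]
[["results",0,"data",0,"meta",0]]
[["results",0,"data",0,"meta"]]
[["results",0,"data",0]]
[["results",0,"data"]]
[["results",0]]
[["errors"],[]]
[["errors"]]
```

As far as I can tell, `1|truncate_stream` only strips the leading `"results"` path component, so the first value `fromstream` can reassemble is the complete `results[0]` object — meaning it still buffers every row.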
Is there a way to 'jump' to the `data` field and start parsing the `row` objects one by one, without waiting for the closing brackets?
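To make the goal concrete, this is the kind of invocation I am hoping for (an untested sketch: the `select` path test and the truncation depth of 6 are my own guesses derived from the structure above, on the theory that stripping the path down past the `row` array index would let `fromstream` emit each row object as soon as it closes):

```
curl XYZ | jq -cn --stream '
  fromstream(6|truncate_stream(inputs
    | select(.[0][2] == "data" and .[0][4] == "row")))'
```

If that reading of `truncate_stream` is right, each `{"key1": ..., "key2": ...}` object would be printed while `curl` is still downloading.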