There are multiple questions on this topic already, and they all look closely related:
- "jq merge values of same key into an array?" requires knowledge of the keys in each object, as far as I can tell; we won't have that in advance.
- "How to combine JSON objects with the same key, using jq" mentions "Note that if two or more objects share the same key and if that key refers to a scalar or array, then the later objects in the input will overwrite the value", which is exactly the problem I'm trying to avoid.
- "How to combine the sequence of objects in jq into one object?" looks hopeful but requires more jq syntax knowledge than I have.
So while there are related examples, it's not clear how to modify the reduce statement from those examples to do what I want.
Problem: We've got a file containing multiple JSON object-blobs. Each top-level object has a single key, with an array as its value. Essentially it's
{"SomeCategory": [
  {"Key": "value1"   # very first entry
  },
  {"Key": "value2"
  },
  { ... },
  {"Key": "valueA"
  }
]}
{"SomeCategory": [
  {"Key": "valueA+1"
  },
  {"Key": "valueA+2"
  },
  { ... },
  {"Key": "valueA+B"   # very last entry
  }
]}
{ ... repeat ad nauseam with different values ...}
Goal: Here's what I'm hoping to end up with:
{"SomeCategory": [
  {"Key": "value1"   # very first entry
  },
  {"Key": "value2"
  },
  { ... },
  {"Key": "valueA+B"   # very last entry
  }
]}
That is, all the top-level blobs have been merged into a single top-level blob, and the arrays contained in the blobs' single key are all merged.
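To make the intended semantics concrete, here is a minimal Python sketch of the merge I'm describing. It is only an illustration: the function names and the sample data are made up, and what I'm actually after is the jq equivalent.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List


def iter_json_objects(text: str) -> Iterator[Any]:
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value


def merge_blobs(blobs: Iterable[Dict[str, List[Any]]]) -> Dict[str, List[Any]]:
    """Concatenate the arrays under each top-level key, in input order."""
    merged: Dict[str, List[Any]] = {}
    for blob in blobs:
        for key, items in blob.items():
            # Append instead of overwriting; the key name is never hard-coded.
            merged.setdefault(key, []).extend(items)
    return merged


# Hypothetical sample input: two blobs back to back, as in the file.
sample = (
    '{"SomeCategory": [{"Key": "value1"}, {"Key": "value2"}]}'
    '{"SomeCategory": [{"Key": "value3"}]}'
)
result = merge_blobs(iter_json_objects(sample))
print(json.dumps(result))
```

The inner objects are never merged or compared; they are just appended to one growing array under whatever the single top-level key happens to be.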
Attempts: The linked examples all recommend things along the lines of
jq -n 'reduce inputs as $foo ({}; . *= $foo)'
which runs into the problem of "if that key refers to a scalar or array, then the later objects in the input will overwrite the value":
{"SomeCategory": [
  {"Key": "valueY+1"
  },
  {"Key": "valueY+2"
  },
  { ... },
  {"Key": "valueY+Z"   # very last entry
  }
]}
i.e., only the last top-level blob survives.
Other things that might matter:
- There's only a single key ("SomeCategory") in all the top-level object blobs, and it's always the same key, but how that key is spelled isn't known in advance.
- There are lots of individual object keys; I've only shown one ("Key") here.
- The individual object keys ("Key") are the same in every individual object. I'm not looking to merge any of those objects, since their values will be violently unique.
I'm guessing that some nested merge in the "UPDATE" clause of reduce EXP as $var (INIT; UPDATE) will do this, but I cannot figure out from the jq man page what that syntax is supposed to look like (all its examples are extremely contrived and simplistic). The instances of reduce that Google can find don't use any nested update pipelines, so perhaps it's not even supported and that idea was dumb. Normally the right answer is some form of "pipelined expressions in a single jq invocation", but we haven't figured out how to avoid replacing the array value on each update.