Given input JSON like this (there's a lot more to it, but I've stripped the fields that aren't of interest):
{
  "modules": {
    "data": [
      {
        "id": "aod_play_area",
        "data": [
          { "titles": { "primary": "Primary", "secondary": "Secondary" } }
        ]
      },
      {
        "id": "aod_tracks",
        "data": [
          {
            "titles": { "primary": "First Artist name here", "secondary": "First Track title here" },
            "uris": [
              { "id": "commercial-music-service-spotify", "uri": "https://open.spotify.com/track/1234567890" },
              { "id": "commercial-music-service-apple", "uri": "https://music.apple.com/gb/album/xyz/1234?i=9876" }
            ]
          },
          {
            "titles": { "primary": "Second Artist name here", "secondary": "Second Track title here" },
            "uris": [
              { "id": "commercial-music-service-spotify", "label": "Spotify", "uri": "https://open.spotify.com/track/555555555555" },
              { "id": "commercial-music-service-apple", "label": "Apple Music", "uri": "https://music.apple.com/gb/album/abc/5555?i=5555" }
            ]
          }
        ]
      }
    ]
  }
}
... and desired output which has two top-level properties, each populated from a different element of the modules.data[] array, identified by its .id:
{
  "title": "Primary - Secondary",
  "tracks": [
    {
      "title": "First Track title",
      "artist": "First Artist name",
      "start": 3645,
      "end": 3820,
      "apple": "https://music.apple.com/gb/album/xyz/1234?i=9876",
      "spotify": "https://open.spotify.com/track/1234567890"
    },
    {
      "title": "Second Track title",
      "artist": "Second Artist name",
      "start": 3645,
      "end": 3820,
      "apple": "https://music.apple.com/gb/album/abc/5555?i=5555",
      "spotify": "https://open.spotify.com/track/555555555555"
    }
  ]
}
... what should my jq query look like to pull data from those two objects within modules.data? I can write queries to do one or the other, but not both, presumably because my first query has caused jq to walk down one branch of the structure and I don't know how to make it "unwind" so that the second query still works.
Extracting the titles:
cat sample.json | jq '.modules.data.[] | { title: select(.id == "aod_play_area").data[0] | "\(.titles.primary) - \(.titles.secondary)", tracks: []}'
Produces:
{"title": "Primary - Secondary","tracks": []}
Extracting just the tracks:
cat sample.json | jq '.modules.data.[] | { title: "title", tracks: select(.id == "aod_tracks").data | map({ title: .titles.primary, artist: .titles.secondary, start: .offset.start, end: .offset.end, apple: .uris[] | select(.id =="commercial-music-service-apple").uri, spotify: .uris[] | select(.id =="commercial-music-service-spotify").uri })}'
Produces:
{
  "title": "title",
  "tracks": [
    {
      "title": "First Artist name here",
      "artist": "First Track title here",
      "start": null,
      "end": null,
      "apple": "https://music.apple.com/gb/album/xyz/1234?i=9876",
      "spotify": "https://open.spotify.com/track/1234567890"
    },
    {
      "title": "Second Artist name here",
      "artist": "Second Track title here",
      "start": null,
      "end": null,
      "apple": "https://music.apple.com/gb/album/abc/5555?i=5555",
      "spotify": "https://open.spotify.com/track/555555555555"
    }
  ]
}
Combining the two:
cat sample.json | jq '.modules.data.[] | { title: select(.id == "aod_play_area").data[0] | "\(.titles.primary) - \(.titles.secondary)", tracks: select(.id == "aod_tracks").data | map({ title: .titles.primary, artist: .titles.secondary, start: .offset.start, end: .offset.end, apple: .uris[] | select(.id =="commercial-music-service-apple").uri, spotify: .uris[] | select(.id =="commercial-music-service-spotify").uri })}'
... produces no output at all. I believe this is because the first select has taken us down one "branch" of the outermost data, so the second select doesn't find what it's looking for (as children of where it's ended up down that first branch). How should I rewrite my query to successfully extract all of the data of interest?
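For what it's worth, the same behaviour seems to show up in a much smaller case. This is only a sketch: the inline sample and the ids "a" and "b" are made up for illustration (they are not from the real data), and it assumes jq is on the PATH.

```shell
# Hypothetical reduced input: two modules with made-up ids "a" and "b".
input='{"modules":{"data":[{"id":"a","data":[1]},{"id":"b","data":[2]}]}}'

# One select per query: the matching element produces an object.
only_a=$(printf '%s' "$input" | jq -c '.modules.data[] | {a: select(.id == "a").data[0]}')

# Two selects in the same object: every element fails one of them,
# and jq then emits no object at all for that element.
both=$(printf '%s' "$input" | jq -c '.modules.data[] | {a: select(.id == "a").data[0], b: select(.id == "b").data[0]}')

echo "only_a=$only_a"
echo "both=[$both]"
```

With this input the single-field query prints an object for the "a" module, while the two-field query prints nothing, which matches what I see with the real data.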
(I'm new to jq, so apologies if I've misused any terminology.)