Channel: Active questions tagged jq - Stack Overflow

Optimising object retrieval from a large JSON array using jq

I need to retrieve an object at a specific index from a massive JSON array. The array contains 2,000,000 objects and the file size is around 5GB.

I've experimented with various approaches using jq in combination with Python, but performance remains an issue. Here are some of the methods I've tried:

  1. Direct indexing:

    jq -c '.[100000]' Movies.json
  2. Slurping and indexing:

    jq --slurp '.[0][100000]' Movies.json
  3. Using nth():

    jq -c 'nth(100000; .[])' Movies.json

While these methods seem to work, they are too slow for my requirements. I've also tried using streams, which significantly improves performance:

    jq -cn --stream 'nth(100000; fromstream(1|truncate_stream(inputs)))' Movies.json

However, retrieval time still grows with the index, which I suspect is because streaming has to parse the file sequentially from the beginning.

I understand that one option is to divide the file into chunks, but I'd rather avoid creating additional files by doing so.
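To make the trade-off concrete, here is a rough Python sketch of a middle ground I have considered: scan the file once, remember the byte offsets of each top-level element in memory, then seek straight to the requested one. The function names `build_offset_index` and `get_element` are my own placeholders, and the sketch assumes every top-level element is an object or array (as in my data), not a bare string or number. The byte-by-byte loop is pure Python, so the one-time scan would itself be slow on a 5GB file; it is meant only to illustrate the idea.

```python
import json


def build_offset_index(path, chunk_size=1 << 20):
    """Scan the file once; return (start, end) byte offsets of each top-level element.

    Assumption: the file holds one JSON array whose elements are objects or arrays.
    """
    spans = []
    depth = 0                      # 1 means "directly inside the outer array"
    in_string = escaped = False    # brackets inside strings must be ignored
    start = 0
    pos = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for b in chunk:
                if in_string:
                    if escaped:
                        escaped = False
                    elif b == 0x5C:            # backslash starts an escape
                        escaped = True
                    elif b == 0x22:            # closing quote
                        in_string = False
                elif b == 0x22:                # opening quote
                    in_string = True
                elif b == 0x7B or b == 0x5B:   # '{' or '['
                    if depth == 1:
                        start = pos            # a top-level element begins here
                    depth += 1
                elif b == 0x7D or b == 0x5D:   # '}' or ']'
                    depth -= 1
                    if depth == 1:
                        spans.append((start, pos + 1))
                pos += 1
    return spans


def get_element(path, spans, n):
    """Seek directly to element n and parse only that slice of the file."""
    start, end = spans[n]
    with open(path, "rb") as f:
        f.seek(start)
        return json.loads(f.read(end - start))
```

With this, `spans = build_offset_index("Movies.json")` would run once per process, and each `get_element("Movies.json", spans, 100000)` afterwards would cost one seek plus parsing a single object, independent of the index. The catch is that the index lives only in memory unless I persist it, which brings back the extra file I wanted to avoid.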

JSON structure example:

    [
        {
            "Item": {
                "Name": "Darkest Legend",
                "Year": 1992,
                "Genre": ["War"],
                "Director": "Sherill Eal Eisenberg",
                "Producer": "Arabella Orth",
                "Screenplay": ["Octavia Delmer"],
                "Cast": ["Johanna Azar", "..."],
                "Runtime": 161,
                "Rate": "9.0",
                "Description": "Robin Northrop Cymbre",
                "Reviews": "Gisela Seumas"
            },
            "Similars": [
                {
                    "Name": "Smooth of Edge",
                    "Year": 1985,
                    "Genre": ["Western"],
                    "Director": "Vitoria Eustacia",
                    "Producer": "Auguste Jamaal Corry",
                    "Screenplay": ["Jaquenette Lance Gibe"],
                    "Cast": ["Althea Nicole", "..."],
                    "Runtime": 96,
                    "Rate": "6.5",
                    "Description": "Ashlan Grobe",
                    "Reviews": "Annnora Vasquez"
                }
            ]
        },
        ...
    ]

How could I improve the efficiency of object retrieval from such a large array?

