There exists a really cool tool called jq that makes it a little easier to see the data structure. It takes a while to learn the syntax, but it is pretty powerful. Below are a few of the commands I use regularly to learn how my data is structured.
Assume we have a file named largeResponse.json with a structure similar to:
{
  "data_available": true,
  "query": "The query is here",
  "results": {
    "details": [
      { "id": 1, "data1": "data1", "data2": "data2" },
      { "id": 2, "data1": "data1", "data2": "data2" }
      ... Many more rows here
    ]
  }
}
Getting the keys
jq 'keys' largeResponse.json
This command lists all of the keys of the root object. We can also see the keys of child objects using
jq '.results | keys' largeResponse.json
Assuming the root object has a child object named 'results', this shows the list of keys in the results object.
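Here is a minimal sketch of both commands, assuming jq is installed. The inline sample is a cut-down, hypothetical stand-in for largeResponse.json, and -c only switches jq to compact one-line output. One thing worth knowing: keys returns the names sorted alphabetically, not in the order they appear in the file.

```shell
# Cut-down stand-in for largeResponse.json (hypothetical sample data).
sample='{"query":"The query is here","data_available":true,"results":{"details":[{"id":1,"data1":"data1","data2":"data2"}]}}'

# Keys of the root object; jq sorts them alphabetically.
root_keys=$(printf '%s' "$sample" | jq -c 'keys')
echo "$root_keys"    # ["data_available","query","results"]

# Keys of the child object under .results.
child_keys=$(printf '%s' "$sample" | jq -c '.results | keys')
echo "$child_keys"   # ["details"]
```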
Get an object
jq '.results' largeResponse.json
will show all of the results object, which in this case would be a LOT of information. However, for smaller values, like the query, it can be very helpful.
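For a small value such as query, a sketch like the following shows the difference between jq's default JSON output and raw output. The sample string is a hypothetical stand-in for the real file, and it assumes jq is installed.

```shell
# Hypothetical stand-in for largeResponse.json with an empty details array.
sample='{"data_available":true,"query":"The query is here","results":{"details":[]}}'

# Default output is JSON, so strings keep their quotes.
query_json=$(printf '%s' "$sample" | jq '.query')
echo "$query_json"   # "The query is here"

# -r (raw output) prints the bare string, which is handy in shell scripts.
query_raw=$(printf '%s' "$sample" | jq -r '.query')
echo "$query_raw"    # The query is here
```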
Working with Arrays
With a large list of data it is useful to see a sample of the data.
Length. We can see the length of an array with something like this
jq '.results.details | length' largeResponse.json
Notice that details is an array.
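As a quick sketch (assuming jq is installed, with a hypothetical two-row sample standing in for the real file), length counts the elements of an array, and on an object it counts the keys instead:

```shell
# Hypothetical stand-in with two rows in the details array.
sample='{"results":{"details":[{"id":1},{"id":2}]}}'

# On an array, length is the number of elements.
detail_count=$(printf '%s' "$sample" | jq '.results.details | length')
echo "$detail_count"   # 2

# On an object, length is the number of keys (here only "details").
key_count=$(printf '%s' "$sample" | jq '.results | length')
echo "$key_count"      # 1
```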
Get a single Object. We can get an item from an array with the index.
jq '.results.details[0]' largeResponse.json
This would return
{
  "id": 1,
  "data1": "data1",
  "data2": "data2"
}
Get a range of objects. jq can also get a range from an array using a slice.
jq '.results.details[0:10]' largeResponse.json
This would return the first ten objects in the details array (indexes 0 through 9).
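The sketch below, which assumes jq is installed and uses a hypothetical four-row sample, shows how slices behave. A slice is written with a colon; a comma inside the brackets selects only the individual elements at the listed indexes, so the two forms are easy to mix up.

```shell
# Hypothetical stand-in with four rows in the details array.
sample='{"results":{"details":[{"id":1},{"id":2},{"id":3},{"id":4}]}}'

# Slice with a colon: elements from index 0 up to, but not including, 2.
first_two=$(printf '%s' "$sample" | jq -c '.results.details[0:2]')
echo "$first_two"    # [{"id":1},{"id":2}]

# A negative start counts from the end: the ids of the last two rows.
last_two_ids=$(printf '%s' "$sample" | jq -c '.results.details[-2:] | map(.id)')
echo "$last_two_ids" # [3,4]

# A comma is not a range: it picks only the elements at index 0 and index 2.
picked_ids=$(printf '%s' "$sample" | jq -c '.results.details | [.[0,2] | .id]')
echo "$picked_ids"   # [1,3]
```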
Tons more
And of course you can do a million other things that I haven't come close to exploring. To see all that you can do, take a look at the jq manual.
Misc
To print out multiple fields:
jq '.users[] | .first + " " + .last'
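Here is a small sketch of that idea, assuming jq is installed. The users data is made up for illustration and only mirrors the .first and .last field names used in the command. String interpolation is an alternative to + that also accepts non-string values such as numbers.

```shell
# Hypothetical input shaped like the .users example (names are made up).
users='{"users":[{"first":"Jane","last":"Doe"},{"first":"John","last":"Smith"}]}'

# Concatenation with +; -r drops the JSON quotes around each line.
names=$(printf '%s' "$users" | jq -r '.users[] | .first + " " + .last')
echo "$names"

# String interpolation produces the same two lines.
names_interp=$(printf '%s' "$users" | jq -r '.users[] | "\(.first) \(.last)"')
echo "$names_interp"
```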
Nice post on parsing JSON. The only thing I dislike about JSON is that in order to determine the schema, the entire file has to be parsed, because the schema can change per object. This is the issue I've been running into recently on massive JSON files I'm parsing. The same issue exists with any open format language like XML.