Converting query on large JSON file to use a stream - jq [duplicate] - json

This question already has answers here:
Process large JSON stream with jq
(3 answers)
Improving performance when using jq to process large files
(3 answers)
Closed 4 months ago.
I have a large JSON file (2.7 GB) that I would like to filter down to only the data I'm interested in, to make the file smaller. The data consists of an array of objects. The following query works on a small subset of the JSON file, but when I try to run it on the 2.7 GB file it doesn't work. How can I convert this into a stream query so that it can process the entire file?
.[] | {
  food: .foodClass,
  description: .description,
  foodNutrients: [.foodNutrients[] | { nutrients: .nutrient, amount: .amount }],
  upc: .gtinUpc,
  servingSize: .servingSize,
  servingSizeUnit: .servingSizeUnit,
  ingredients: .ingredients,
  fdcId: .fdcId,
  dataType: .dataType,
  brandOwner: .brandOwner,
  marketCountry: .marketCountry,
  brandedFootCategory: .brandedFoodCategory
}
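For comparison, the same idea can be sketched outside jq. The following is a minimal Python sketch, not a jq stream query: it reads the input in chunks, decodes one array element at a time with the standard library's json.JSONDecoder.raw_decode, and applies the same projection as the filter above. The names iter_array_items and slim are made up for illustration, and the sketch assumes the top level is an array whose elements are objects.

```python
import io
import json


def iter_array_items(fp, chunk_size=1 << 16):
    """Yield the elements of a top-level JSON array one at a time.

    Assumes each element is an object (or array), so a partially read
    element can never be mistaken for a complete one.
    """
    decoder = json.JSONDecoder()
    buf, pos = "", 0

    def fill():
        # Drop the consumed prefix and append the next chunk of text.
        nonlocal buf, pos
        chunk = fp.read(chunk_size)
        if not chunk:
            raise ValueError("input ended before the array was closed")
        buf, pos = buf[pos:] + chunk, 0

    def skip_ws():
        # Advance to the next non-whitespace character, reading more if needed.
        nonlocal pos
        while True:
            while pos < len(buf) and buf[pos] in " \t\r\n":
                pos += 1
            if pos < len(buf):
                return
            fill()

    skip_ws()
    if buf[pos] != "[":
        raise ValueError("expected a top-level JSON array")
    pos += 1
    while True:
        skip_ws()
        if buf[pos] == "]":
            return
        if buf[pos] == ",":
            pos += 1
            skip_ws()
        while True:
            try:
                item, pos = decoder.raw_decode(buf, pos)
                break
            except json.JSONDecodeError:
                fill()  # element not complete yet: read more and retry
        yield item


def slim(food):
    """Apply the projection from the question's jq filter to one element."""
    return {
        "food": food.get("foodClass"),
        "description": food.get("description"),
        # A missing foodNutrients yields an empty list here.
        "foodNutrients": [
            {"nutrients": n.get("nutrient"), "amount": n.get("amount")}
            for n in food.get("foodNutrients") or []
        ],
        "upc": food.get("gtinUpc"),
        "servingSize": food.get("servingSize"),
        "servingSizeUnit": food.get("servingSizeUnit"),
        "ingredients": food.get("ingredients"),
        "fdcId": food.get("fdcId"),
        "dataType": food.get("dataType"),
        "brandOwner": food.get("brandOwner"),
        "marketCountry": food.get("marketCountry"),
        # Output key spelled as in the question's filter.
        "brandedFootCategory": food.get("brandedFoodCategory"),
    }


# Tiny made-up sample; chunk_size is set very low to exercise chunk boundaries.
sample = io.StringIO('[{"foodClass": "Branded", "fdcId": 1, "foodNutrients": '
                     '[{"nutrient": {"name": "Protein"}, "amount": 2.5}]}, '
                     '{"foodClass": "Branded", "fdcId": 2}]')
for item in iter_array_items(sample, chunk_size=8):
    print(json.dumps(slim(item)))
```

With the real file, the StringIO would be replaced by the input opened in text mode, and each slimmed object could be written out as one JSON line. Only the current chunk and element are then held in memory. Note that a malformed file makes this sketch read to the end before it reports an error.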

Related

Exporting Data from List of JSON Objects to CSV using Powershell [duplicate]

This question already has answers here:
Convert nested JSON array into separate columns in CSV file
(2 answers)
Powershell nested JSON to csv conversion
(4 answers)
Closed 5 months ago.
I have been trying to export multiple data fields from a single JSON file using PowerShell and place them neatly into an Excel document. Each JSON file contains the same data fields with contact information, and we are trying to export a list of that data to load into a new record system.
The flow is as follows: export ClientID, ContactInfo, and CustomerWebsite from each JSON object (some have multiple entries; it's archaic) to a CSV for Excel that lists File Name | ClientID, ContactInfo, CustomerWebsite.
I have been using the following code block in PowerShell to do it manually for each field I am trying to export:
Select-String -Path .\*.json -Pattern 'ClientID' | Select-Object -Property Filename,Line | Export-Csv "C:\Users\User\appdev\powershell-scripts\ClientID.csv"
This exports the data in the following fashion:
[screenshot: exported data]
I'm really looking for a way to streamline this into one call and a single output file so I can automate it. I've looked everywhere. I am open to using Python instead of PowerShell; PowerShell is just something I know a lot better.
EDIT:
I have tried to use the JSON psobject properties, but am running into issues there as well.
Running
$json = (Get-Content "test.json" -Raw) | ConvertFrom-Json
$json.psobject.properties.name
Will only output the first line of the code, which is
sfg_ping::qa::standard_sp_connections
Any sub-tag or object listed under that key seemingly isn't accessible through the psobject properties. I am not sure whether I am doing it incorrectly.
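Since the question is open to Python, here is a rough sketch of a single-call export. It assumes the three fields sit together in the same object at some depth of each file and hold plain strings or numbers; the real layout is not shown in the question, so that is a guess. The function names find_records and export and the FileName header are my own.

```python
import csv
import json
from pathlib import Path

FIELDS = ["ClientID", "ContactInfo", "CustomerWebsite"]  # names from the question


def find_records(node):
    """Yield every JSON object, at any depth, that has at least one wanted field."""
    if isinstance(node, dict):
        if any(field in node for field in FIELDS):
            yield node
        for value in node.values():
            yield from find_records(value)
    elif isinstance(node, list):
        for value in node:
            yield from find_records(value)


def export(json_dir, csv_path):
    """Write one CSV row per matching object found in json_dir/*.json."""
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["FileName", *FIELDS])
        for path in sorted(Path(json_dir).glob("*.json")):
            data = json.loads(path.read_text(encoding="utf-8"))
            for record in find_records(data):
                # Missing fields become empty cells.
                writer.writerow([path.name, *(record.get(f, "") for f in FIELDS)])
```

Calling export(".", "contacts.csv") (both arguments are placeholders) would then produce one file with a row for every entry, including files that hold several entries.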

How to get the value of a key in a JSON? [duplicate]

This question already has answers here:
Parsing JSON with Unix tools
(45 answers)
Closed 3 years ago.
I have a JSON body and would like to obtain the value of "access_token" using grep in bash. The grep I used does not work.
I tried the following, but it gives me a blank result.
HTTP_BODY=$(echo $HTTP_RESPONSE | grep ^"access_token":" ","experies_in$ )
I want only to obtain the access_token value, i.e.:
eyJhbGciOiJSUzI1NiIsImtpZCI6IkE4M0UwQTFEQTY1MzE0NkZENUQxOTFDMzRDNTQ0RDJDODYyMzMzMzkiLCJ0eXAiOiJKV1QiLCJ4NXQiOiJxRDRLSGFaVEZHX1YwWkhEVEZSTkxJWWpNemsifQ.eyJuYmYiOjE1NTkzMTYzNTYsImV4cCI6MTU1OTMxOTk1NiwiaXNzIjoiaHR0cHM6Ly9jaW5jaHktbnByLmNsb3VkLnJlcy5ibmdmLmxvY2FsL2NpbmNoeXNzbyIsImF1ZCI6WyJodHRwczovL2NpbmNoeS1ucHIuY2xvdWQucmVzLmJuZ2YubG9jYWwvY2luY2h5c3NvL3Jlc291cmNlcyIsImpzX2FwaSJdLCJjbGllbnRfaWQiOiJhcGkiLCJzdWIiOiIxIiwiYXV0aF90aW1lIjoxNTU5MzE2MzU2LCJpZHAiOiJsb2NhbCIsInByb2ZpbGUiOiJBZG1pbmlzdHJhdG9yIiwiZW1haWwiOiJhZG1pbkBjaW5jaHkuY28iLCJyb2xlIjoiQ2luY2h5IFVzZXIgQWNjb3VudCIsImlkIjoiYWRtaW4iLCJzY29wZSI6WyJqc19hcGkiXSwiYW1yIjpbImN1c3RvbSJdfQ.O0--cahxPKlwHp-7fP0CMgSJTaXleupH32x7vVoxe8THVdeRIgyuZoKWPAK9p10PMO9a5Mi3N0t1Nqut5-dUS7lmeUfNKe25K1got9de7ghQ56QQXnL2SWd6g4I8Zi1R9fZsln7bZCIJvnG3_wIWHKGHBco9jEvKtO3AdYF4T9LAbdpT51SDzKPhX16BPc0Do6KfNImQpPQdK4fP3-JqxD4sOBldUg-g3aau2F_DmapEd0p5hTI4qeKgORnXJ3NadwWscQREGWVXhIRu_BF_cmEoIfPNyJI7D_L7EWn8XcFa2Gu-8khQ-WDpVUcpyidF_VHYRkMtYwpJ2dcYUaLILQ
Input:
HTTP_RESPONSE="{"access_token":"eyJhbGciOiJSUzI1NiIsImtpZCI6IkE4M0UwQTFEQTY1MzE0NkZENUQxOTFDMzRDNTQ0RDJDODYyMzMzMzkiLCJ0eXAiOiJKV1QiLCJ4NXQiOiJxRDRLSGFaVEZHX1YwWkhEVEZSTkxJWWpNemsifQ.eyJuYmYiOjE1NTkzMTYzNTYsImV4cCI6MTU1OTMxOTk1NiwiaXNzIjoiaHR0cHM6Ly9jaW5jaHktbnByLmNsb3VkLnJlcy5ibmdmLmxvY2FsL2NpbmNoeXNzbyIsImF1ZCI6WyJodHRwczovL2NpbmNoeS1ucHIuY2xvdWQucmVzLmJuZ2YubG9jYWwvY2luY2h5c3NvL3Jlc291cmNlcyIsImpzX2FwaSJdLCJjbGllbnRfaWQiOiJhcGkiLCJzdWIiOiIxIiwiYXV0aF90aW1lIjoxNTU5MzE2MzU2LCJpZHAiOiJsb2NhbCIsInByb2ZpbGUiOiJBZG1pbmlzdHJhdG9yIiwiZW1haWwiOiJhZG1pbkBjaW5jaHkuY28iLCJyb2xlIjoiQ2luY2h5IFVzZXIgQWNjb3VudCIsImlkIjoiYWRtaW4iLCJzY29wZSI6WyJqc19hcGkiXSwiYW1yIjpbImN1c3RvbSJdfQ.O0--cahxPKlwHp-7fP0CMgSJTaXleupH32x7vVoxe8THVdeRIgyuZoKWPAK9p10PMO9a5Mi3N0t1Nqut5-dUS7lmeUfNKe25K1got9de7ghQ56QQXnL2SWd6g4I8Zi1R9fZsln7bZCIJvnG3_wIWHKGHBco9jEvKtO3AdYF4T9LAbdpT51SDzKPhX16BPc0Do6KfNImQpPQdK4fP3-JqxD4sOBldUg-g3aau2F_DmapEd0p5hTI4qeKgORnXJ3NadwWscQREGWVXhIRu_BF_cmEoIfPNyJI7D_L7EWn8XcFa2Gu-8khQ-WDpVUcpyidF_VHYRkMtYwpJ2dcYUaLILQ","expires_in":360,"token_type":"Bearer"}"
Expected Output:
eyJhbGciOiJSUzI1NiIsImtpZCI6IkE4M0UwQTFEQTY1MzE0NkZENUQxOTFDMzRDNTQ0RDJDODYyMzMzMzkiLCJ0eXAiOiJKV1QiLCJ4NXQiOiJxRDRLSGFaVEZHX1YwWkhEVEZSTkxJWWpNemsifQ.eyJuYmYiOjE1NTkzMTYzNTYsImV4cCI6MTU1OTMxOTk1NiwiaXNzIjoiaHR0cHM6Ly9jaW5jaHktbnByLmNsb3VkLnJlcy5ibmdmLmxvY2FsL2NpbmNoeXNzbyIsImF1ZCI6WyJodHRwczovL2NpbmNoeS1ucHIuY2xvdWQucmVzLmJuZ2YubG9jYWwvY2luY2h5c3NvL3Jlc291cmNlcyIsImpzX2FwaSJdLCJjbGllbnRfaWQiOiJhcGkiLCJzdWIiOiIxIiwiYXV0aF90aW1lIjoxNTU5MzE2MzU2LCJpZHAiOiJsb2NhbCIsInByb2ZpbGUiOiJBZG1pbmlzdHJhdG9yIiwiZW1haWwiOiJhZG1pbkBjaW5jaHkuY28iLCJyb2xlIjoiQ2luY2h5IFVzZXIgQWNjb3VudCIsImlkIjoiYWRtaW4iLCJzY29wZSI6WyJqc19hcGkiXSwiYW1yIjpbImN1c3RvbSJdfQ.O0--cahxPKlwHp-7fP0CMgSJTaXleupH32x7vVoxe8THVdeRIgyuZoKWPAK9p10PMO9a5Mi3N0t1Nqut5-dUS7lmeUfNKe25K1got9de7ghQ56QQXnL2SWd6g4I8Zi1R9fZsln7bZCIJvnG3_wIWHKGHBco9jEvKtO3AdYF4T9LAbdpT51SDzKPhX16BPc0Do6KfNImQpPQdK4fP3-JqxD4sOBldUg-g3aau2F_DmapEd0p5hTI4qeKgORnXJ3NadwWscQREGWVXhIRu_BF_cmEoIfPNyJI7D_L7EWn8XcFa2Gu-8khQ-WDpVUcpyidF_VHYRkMtYwpJ2dcYUaLILQ
jq -r '.access_token' file
Output:
eyJhbGciOiJSUzI1NiIsImtpZCI6IkE4M0UwQTFEQTY1MzE0NkZENUQxOTFDMzRDNTQ0RDJDODYyMzMzMzkiLCJ0eXAiOiJKV1QiLCJ4NXQiOiJxRDRLSGFaVEZHX1YwWkhEVEZSTkxJWWpNemsifQ.eyJuYmYiOjE1NTkzMTYzNTYsImV4cCI6MTU1OTMxOTk1NiwiaXNzIjoiaHR0cHM6Ly9jaW5jaHktbnByLmNsb3VkLnJlcy5ibmdmLmxvY2FsL2NpbmNoeXNzbyIsImF1ZCI6WyJodHRwczovL2NpbmNoeS1ucHIuY2xvdWQucmVzLmJuZ2YubG9jYWwvY2luY2h5c3NvL3Jlc291cmNlcyIsImpzX2FwaSJdLCJjbGllbnRfaWQiOiJhcGkiLCJzdWIiOiIxIiwiYXV0aF90aW1lIjoxNTU5MzE2MzU2LCJpZHAiOiJsb2NhbCIsInByb2ZpbGUiOiJBZG1pbmlzdHJhdG9yIiwiZW1haWwiOiJhZG1pbkBjaW5jaHkuY28iLCJyb2xlIjoiQ2luY2h5IFVzZXIgQWNjb3VudCIsImlkIjoiYWRtaW4iLCJzY29wZSI6WyJqc19hcGkiXSwiYW1yIjpbImN1c3RvbSJdfQ.O0--cahxPKlwHp-7fP0CMgSJTaXleupH32x7vVoxe8THVdeRIgyuZoKWPAK9p10PMO9a5Mi3N0t1Nqut5-dUS7lmeUfNKe25K1got9de7ghQ56QQXnL2SWd6g4I8Zi1R9fZsln7bZCIJvnG3_wIWHKGHBco9jEvKtO3AdYF4T9LAbdpT51SDzKPhX16BPc0Do6KfNImQpPQdK4fP3-JqxD4sOBldUg-g3aau2F_DmapEd0p5hTI4qeKgORnXJ3NadwWscQREGWVXhIRu_BF_cmEoIfPNyJI7D_L7EWn8XcFa2Gu-8khQ-WDpVUcpyidF_VHYRkMtYwpJ2dcYUaLILQ
See: man jq
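Where jq is not available but Python is, the same lookup can be sketched with the standard json module. This assumes the response is stored with its inner double quotes intact (for example inside single quotes in the shell). The function name get_value and the shortened token are placeholders of mine.

```python
import json


def get_value(document: str, key: str):
    """Parse a JSON object and return the value stored under key."""
    return json.loads(document)[key]


# Shortened stand-in for the response in the question; the token is a placeholder.
http_response = '{"access_token":"eyJhbGciOi.placeholder","expires_in":360,"token_type":"Bearer"}'
print(get_value(http_response, "access_token"))  # prints eyJhbGciOi.placeholder
```

Unlike a grep pattern, this does not depend on where the key sits in the line or on the order of the fields.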

Rewriting a JSON file into a CSV efficiently in Bash [duplicate]

This question already has answers here:
Use jq to Convert json File to csv
(1 answer)
Converting json map to csv using jq
(3 answers)
Closed 4 years ago.
I want to efficiently rewrite a large JSON file, which always has the same field names, into a CSV, ignoring its keys.
To give a concrete example, here is a large JSON file (tempST.json):
https://gist.githubusercontent.com/pedro-roberto/b81672a89368bc8674dae21af3173e68/raw/e4afc62b9aa3092c8722cdbc4b4b4b6d5bbc1b4b/tempST.json
If I rewrite just the fields time, ancestorcount and descendantcount from this JSON into a CSV, I should get:
1535995526,1,1
1535974524,1,1
1535974528,1,2
...
1535997274,1,1
The following script tempSpeedTest.sh writes the value of the fields time, ancestorcount and descendantcount into each line of the csv:
rm tempOutput.csv
jq -c '.[]' < tempST.json | while read line; do
  descendantcount=$(echo $line | jq '.descendantcount')
  ancestorcount=$(echo $line | jq '.ancestorcount')
  time=$(echo $line | jq '.time')
  echo "${time},${ancestorcount},${descendantcount}" >> tempOutput.csv
done
However, the script takes around 3 minutes to run, which is unsatisfactory:
>time bash tempSpeedTest.sh
real 2m50.254s
user 2m43.128s
sys 0m34.811s
What is a faster way to achieve the same result?
jq -r '.[] | [.time, .ancestorcount, .descendantcount] | @csv' <tempST.json >tempOutput.csv
See this running at https://jqplay.org/s/QJz5FCmuc9
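The script in the question is slow mainly because it starts three jq processes per record. The single-pass idea can also be sketched in Python, which may help where the CSV needs further processing. The function name records_to_csv and the sample records are made up; only the three field names and their order come from the question.

```python
import csv
import io
import json

COLUMNS = ["time", "ancestorcount", "descendantcount"]  # order requested in the question


def records_to_csv(records, out):
    """Write one CSV row per record, keeping only COLUMNS, in a single pass."""
    writer = csv.writer(out, lineterminator="\n")
    for record in records:
        writer.writerow([record[column] for column in COLUMNS])


# Made-up records; the "other" field stands in for keys that should be ignored.
sample = json.loads(
    '[{"time": 1535995526, "ancestorcount": 1, "descendantcount": 1, "other": "x"},'
    ' {"time": 1535974528, "ancestorcount": 1, "descendantcount": 2}]'
)
buffer = io.StringIO()
records_to_csv(sample, buffer)
print(buffer.getvalue(), end="")
# prints:
# 1535995526,1,1
# 1535974528,1,2
```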

JSON object returned to bash need to extract key value [duplicate]

This question already has answers here:
Parsing JSON with Unix tools
(45 answers)
Closed 4 years ago.
I have a curl command that returns the JSON response below. I need to get the value of data.value.
How can I do this without a "hacky" solution?
{
"request_id":"50aaabe7-d01b-0a83-da86-8f01cb1da74b",
"lease_id":"",
"renewable":false,
"lease_duration":2764800,
"data":{"value":"randomBinaryString"},
"wrap_info":null,
"warnings":null,
"auth":null
}
If you install jq, then it is as easy as:
<curl-command> | jq .data.value
If you don't want to install extra software and the response is always in that format, you can do some dirty tricks:
<curl-command> | grep data | tr '":{' ' ' | awk '{print $3}'

Turning a JSON object into a BASH array [duplicate]

This question already has answers here:
Bash Store Curl Results into Array
(1 answer)
Accessing a JSON object in Bash - associative array / list / another model
(7 answers)
How to get key names from JSON using jq
(9 answers)
Closed 4 years ago.
I have this URL:
http://localhost/status?json
It is PHP-FPM's status page, and it outputs this:
{
"pool":"www",
"process manager":"dynamic",
"start time":1526919087,
"start since":69780,
"accepted conn":403320,
"listen queue":0,
"max listen queue":0,
"listen queue len":0,
"idle processes":21,
"active processes":6,
"total processes":27,
"max active processes":200,
"max children reached":1,
"slow requests":0
}
I want to turn this JSON into an array in Bash so I can loop over it and check values. I've heard jq can parse this, but I'm unsure how to convert it into a usable array in Bash.
Any ideas?
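One possible approach, sketched here in Python rather than jq, is to flatten the object into one tab-separated "key, value" line per entry, which a shell loop can then read. This assumes the status object stays flat, as in the output above. The function name status_lines is made up.

```python
import json


def status_lines(document: str):
    """Return one "key<TAB>value" line per top-level entry of a JSON object."""
    return [f"{key}\t{value}" for key, value in json.loads(document).items()]


# Excerpt of the status output from the question.
sample = '{"pool":"www","process manager":"dynamic","idle processes":21,"active processes":6}'
for line in status_lines(sample):
    print(line)
```

In Bash 4 or later, such lines could be read with `while IFS=$'\t' read -r key value` into an associative array created with `declare -A`. The tab separator keeps keys that contain spaces, such as "process manager", intact.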