JSON parsing with Linux shell script

Here is the JSON file I want to parse. I specifically need the JSON objects inside the JSON array. A shell script is the only tool I can use right now.
{
"entries": [
{
"author": {
"value": "plugin-demo Administrator",
"origin": "http://localhost:8080/webservice/person/18"
},
"creator": {
"value": "plugin-demo Administrator",
"origin": "http://localhost:8080/webservice/person/18"
},
"creationDate": "2015-11-04T15:14:18.000+0600",
"lastModifiedDate": "2015-11-04T15:14:18.000+0600",
"model": "http://localhost:8080/plugin-editorial/model/281/football",
"payload": [
{
"name": "basic",
"value": "Real Madrid are through"
}
],
"publishDate": "2015-11-04T15:14:18.000+0600"
},
{
"author": {
"value": "plugin-demo Administrator",
"origin": "http://localhost:8080/webservice/person/18"
},
"creator": {
"value": "plugin-demo Administrator",
"origin": "http://localhost:8080/webservice/person/18"
},
"creationDate": "2015-11-04T15:14:18.000+0600",
"lastModifiedDate": "2015-11-04T15:14:18.000+0600",
"model": "http://localhost:8080/plugin-editorial/model/281/football",
"payload": [
{
"name": "basic",
"value": "Real Madrid are through"
}
],
"publishDate": "2015-11-04T15:14:18.000+0600"
}
]
}
How can I do it in shell script?

Use something, anything other than shell.
Since the original answer I've found jq:
jq '.entries[0].author.value' /tmp/data.json
"plugin-demo Administrator"
Or install Node.js:
node -e 'console.log(require("./data.json").entries[0].author.value)'
Or install jsawk:
cat data.json | jsawk 'return this.entries[0].author.value'
Or install Ruby:
ruby -e 'require "json"; puts JSON.parse(File.read("data.json"))["entries"][0]["author"]["value"]'
Just don't try to parse it in a shell script.
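Python is another option in the same spirit, assuming it is installed on the machine. The sketch below is only an illustration: the inline sample is a trimmed copy of the question's data, and author_values is a made-up helper name, not part of any library.

```python
import json

# Trimmed copy of the question's data (assumption: in practice you would
# read the real file, e.g. with json.load(open("data.json"))).
SAMPLE = """
{
  "entries": [
    {
      "author": {
        "value": "plugin-demo Administrator",
        "origin": "http://localhost:8080/webservice/person/18"
      },
      "payload": [{"name": "basic", "value": "Real Madrid are through"}]
    },
    {
      "author": {
        "value": "plugin-demo Administrator",
        "origin": "http://localhost:8080/webservice/person/18"
      },
      "payload": [{"name": "basic", "value": "Real Madrid are through"}]
    }
  ]
}
"""


def author_values(text):
    """Return author.value for every object in the "entries" array."""
    data = json.loads(text)
    return [entry["author"]["value"] for entry in data["entries"]]


if __name__ == "__main__":
    for value in author_values(SAMPLE):
        print(value)
```

From a shell, the same lookup as the jq example can be written as a one-liner:
python3 -c 'import json; print(json.load(open("data.json"))["entries"][0]["author"]["value"])'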

Related

Storing JSON output in bash from CloudFormation

I am using an AWS ECS query to get the list of properties used by the currently running task.
Command:
cft="aws ecs describe-tasks --cluster arn:aws:ecs:us-west-2:4984314772:cluster/secrets --tasks arn:aws:ecs:us-west-2:4984314772:task/secrets/86855757eec4487f9d4475a1f7c4cb0b"
I am storing the result in an output variable:
output=$(eval "$cft")
Output:
{
"tasks": [
{
"attachments": [
{
"id": "da8a1312-8278-46d5-8e3b-6b6a1d96f820",
"type": "ElasticNetworkInterface",
"status": "ATTACHED",
"details": [
{
"name": "subnetId",
"value": "subnet-0a151f2eb959ad4"
},
{
"name": "networkInterfaceId",
"value": "eni-081948e3666253f"
},
{
"name": "macAddress",
"value": "02:2a:9i:5c:4a:77"
},
{
"name": "privateDnsName",
"value": "ip-172-56-17-177.us-west-2.compute.internal"
},
{
"name": "privateIPv4Address",
"value": "172.56.17.177"
}
]
}
],
"availabilityZone": "us-west-2a",
"clusterArn": "arn:aws:ecs:us-west-2:4984314772:cluster/secrets",
"containers": [
{
"taskArn": "arn:aws:ecs:us-west-2:4984314772:task/secrets/86855757eec4487f9d4475a1f7c4cb0b",
"name": "nginx",
"image": "nginx",
"lastStatus": "PENDING",
"networkInterfaces": [
{
"attachmentId": "da8a1312-8278-46d5-6b6a1d96f820",
"privateIpv4Address": "172.31.17.176"
}
],
"healthStatus": "UNKNOWN",
"cpu": "0"
}
],
"cpu": "256",
"createdAt": "2020-12-10T18:00:16.320000+05:30",
"desiredStatus": "RUNNING",
"group": "family:nginx",
"healthStatus": "UNKNOWN",
"lastStatus": "PENDING",
"launchType": "FARGATE",
"memory": "512",
"overrides": {
"containerOverrides": [
{
"name": "nginx"
}
],
"inferenceAcceleratorOverrides": []
},
"platformVersion": "1.4.0",
"tags": [],
"taskArn": "arn:aws:ecs:us-west-2:4984314772:task/secrets/86855757eec4487f9d4475a1f7c4cb0b",
"taskDefinitionArn": "arn:aws:ecs:us-west-2:4984314772:task-definition/nginx:17",
"version": 2
}
],
"failures": []
}
Now, if I echo $output.tasks[0].containers[0], it just prints the entire thing again. I want to store the result in the output variable and refer to individual parameters, as we do with JSON.
You will need to use a JSON parser such as jq:
eval $cft | jq '.tasks[].containers[]'
To avoid using eval, you could simply pipe the aws command into jq:
aws ecs describe-tasks --cluster arn:aws:ecs:us-west-2:4984314772:cluster/secrets --tasks arn:aws:ecs:us-west-2:4984314772:task/secrets/86855757eec4487f9d4475a1f7c4cb0b | jq '.tasks[].containers[]'
or store the raw output in a variable and query it afterwards:
output=$(aws ecs describe-tasks --cluster arn:aws:ecs:us-west-2:4984314772:cluster/secrets --tasks arn:aws:ecs:us-west-2:4984314772:task/secrets/86855757eec4487f9d4475a1f7c4cb0b)
echo "$output" | jq '.tasks[].containers[]'
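If a script needs several different fields from the same response, parsing it once in Python may be easier than repeated jq calls. This is a rough sketch, not the AWS tooling itself: containers is a hypothetical helper name, and RAW is a cut-down stand-in for the real command output.

```python
import json


def containers(raw):
    """Return every container object found in describe-tasks JSON text."""
    data = json.loads(raw)
    return [
        container
        for task in data.get("tasks", [])
        for container in task.get("containers", [])
    ]


# Cut-down stand-in for the real output (assumption: the real text would
# come from running the aws command shown above).
RAW = """
{
  "tasks": [
    {
      "containers": [
        {"name": "nginx", "image": "nginx", "lastStatus": "PENDING"}
      ],
      "launchType": "FARGATE"
    }
  ],
  "failures": []
}
"""

for container in containers(RAW):
    print(container["name"], container["lastStatus"])
```

In a real script the text could be captured with subprocess.run([...], capture_output=True, text=True, check=True).stdout, passing the same aws arguments as a list.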

Parse and filter complex JSON Response in Python (the easy compute method)

I have a list of dictionaries (basically the JSON response of an endpoint).
I need to parse these 16000 lines of JSON objects and filter the objects whose leaf field statusInfo/status is not "UP", and for those filtered objects return just "name", "serviceUrl", and "status".
Example:
"ADMIN-V1" "http://aws-ec2.aws.com:4435" "Warning"
I have been researching the JSONPath module, but there is no good documentation for it, and I could not find an easier way.
Any guidance is highly appreciated.
Here is a snippet from the 16000-line JSON response.
[
{
"id": "9c108ec5",
"name": "USER-V2",
"managementUrl": "http://aws-ec2.aws.com:5784/",
"healthUrl": "http://aws-ec2.aws.com:5784/health",
"serviceUrl": "http://aws-ec2.aws.com:5784/",
"statusInfo": {
"status": "UP",
"timestamp": 1566663146681,
"details": {
"description": " Eureka Discovery Client",
"status": "UP"
}
},
"source": "discovery",
"metadata": {},
"info": {
"component": "user",
"description": "User REST Resource",
"version": "2.2.1",
"git": {
"commit": {
"time": "07/27/2018 # 15:06:55 CDT",
"id": "b2a1b37"
},
"branch": "refs/tags/v2.2.1"
}
}
},
{
"id": "1a381f20",
"name": "ADMIN-V1",
"managementUrl": "http://aws-ec2.aws.com:4435/",
"healthUrl": "http://aws-ec2.aws.com:4435/health",
"serviceUrl": "http://aws-ec2.aws.com:4435/",
"statusInfo": {
"status": "Warning",
"timestamp": 1566663146682,
"details": {
"description": "Spring Cloud Eureka Discovery Client",
"status": "Warning"
}
},
"source": "discovery",
"metadata": {},
"info": {
"description": "Exchange Admin REST Resource",
"api": {
"version": "1.2.1",
"name": "admin",
"link": "https://app.swaggerhub.com/apis/AWSExchange/admin/1.2.1"
},
"implementation": "admin",
"version": "1.1.0",
"git": {
"commit": {
"time": "01/04/2019 # 15:36:48 UTC",
"id": "39d5551"
},
"branch": "refs/tags/v1.1.0"
}
}
}
]
If your JSON file contains one big array, you'll want to stream the file and truncate away the top-level array, then use fromstream/1 to rebuild the objects and filter them as you go.
I don't have a representative file to test out the performance myself, but give this a try:
$ jq --stream -n 'fromstream(1|truncate_stream(inputs))
| select(.statusInfo.status != "UP")
| .name, .serviceUrl, .statusInfo.status
' input.json
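Since the question asks about Python, here is a minimal sketch using only the standard json module. It assumes the whole response fits in memory, which is usually the case for a file of this size; not_up is a hypothetical helper name, and SAMPLE is a trimmed version of the snippet above.

```python
import json


def not_up(services):
    """Yield (name, serviceUrl, status) for each service whose status is not "UP"."""
    for service in services:
        status = service.get("statusInfo", {}).get("status")
        if status != "UP":
            yield service.get("name"), service.get("serviceUrl"), status


# Trimmed version of the question's snippet (assumption: the real data would
# come from json.load() on the file or from the HTTP response body).
SAMPLE = json.loads("""
[
  {"name": "USER-V2",
   "serviceUrl": "http://aws-ec2.aws.com:5784/",
   "statusInfo": {"status": "UP", "timestamp": 1566663146681}},
  {"name": "ADMIN-V1",
   "serviceUrl": "http://aws-ec2.aws.com:4435/",
   "statusInfo": {"status": "Warning", "timestamp": 1566663146682}}
]
""")

for name, url, status in not_up(SAMPLE):
    print(name, url, status)
```

Using .get() means an object that lacks statusInfo is reported with a status of None rather than raising an error; whether that is the desired behaviour depends on the data.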

How to find a key-value pair in json text using shell scripting with in-built linux tools like sed?

I have a JSON file abc.json containing text:
{
"size": 3,
"limit": 25,
"isLastPage": true,
"values": [
{
"slug": "docker_apache_customised",
"id": 234889,
"name": "docker_apache_customised",
"scmId": "git",
"state": "AVAILABLE",
"statusMessage": "Available",
"forkable": true,
"project": {
"key": "UFD",
"id": 36239,
"name": "UF_docker",
"public": false,
"type": "NORMAL",
"links": {
"self": [{
"href": "https://rndwww.abc.xxx.net/git/projects/UFD"
}]
}
},
"public": false,
"links": {
"clone": [{
"href": "https://rndwww.abc.xxx.net/git/scm/ufd/docker_apache_customised.git",
"name": "http"
}, {
"href": "ssh://git@git.rnd.xxx.net/ufd/docker_apache_customised.git",
"name": "ssh"
}],
"self": [{
"href": "https://rndwww.abc.xxx.net/git/projects/UFD/repos/docker_apache_customised/browse"
}]
}
},
{
"slug": "web-software",
"id": 241533,
"name": "web-software",
"scmId": "git",
"state": "AVAILABLE",
"statusMessage": "Available",
"forkable": true,
"project": {
"key": "UFD",
"id": 36239,
"name": "UF_docker",
"public": false,
"type": "NORMAL",
"links": {
"self": [{
"href": "https://rndwww.abc.xxx.net/git/projects/UFD"
}]
}
},
"public": false,
"links": {
"clone": [{
"href": "https://rndwww.abc.xxx.net/git/scm/ufd/web-software.git",
"name": "http"
}, {
"href": "ssh://git@git.rnd.xxx.net/ufd/web-software.git",
"name": "ssh"
}],
"self": [{
"href": "https://rndwww.abc.xxx.net/git/projects/UFD/repos/web-software/browse"
}]
}
},
{
"slug": "web-loy-conf",
"id": 240959,
"name": "web-loy-conf",
"scmId": "git",
"state": "AVAILABLE",
"statusMessage": "Available",
"forkable": true,
"project": {
"key": "UFD",
"id": 36239,
"name": "UF_docker",
"public": false,
"type": "NORMAL",
"links": {
"self": [{
"href": "https://rndwww.abc.xxx.net/git/projects/UFD"
}]
}
},
"public": false,
"links": {
"clone": [{
"href": "ssh://git@git.rnd.xxx.net/ufd/web-loy-conf.git",
"name": "ssh"
}, {
"href": "https://rndwww.abc.xxx.net/git/scm/ufd/web-loy-conf.git",
"name": "http"
}],
"self": [{
"href": "https://rndwww.abc.xxx.net/git/projects/UFD/repos/web-loy-conf/browse"
}]
}
}
],
"start": 0
}
This text describes three repositories (named docker_apache_customised, web-software, web-loy-conf) in a git project. There may be more repos containing web as a substring.
I want to perform some operation on the repositories whose names contain web as a substring, and for that I think I need a for loop in a shell script.
I wrote a script using the external tool jq, but I want to do it with Linux built-in tools only. The script using jq works fine:
for k in $(jq '.values | keys | .[]' abc.json); do
    value=$(jq -r ".values[$k]" abc.json)
    name=$(jq -r '.name' <<< "$value")
    if [[ $name == *"web"* ]]; then
        : # MYLOGIC
    fi
done
The expected result is the names (web-software, web-loy-conf), and being able to loop through them.
You can run jq from its current path in your git repository; there's no need to copy it to a directory in the PATH. After adding execute permission:
value=$(<path to jq in git dir>/jq -r ".values[$k]" abc.json);
You can make the path relative to the git repository root:
value=$(./<path to jq from git repo root>/jq -r ".values[$k]" abc.json);
You can also store the path in a variable:
jqbin='./<path to jq from git repo root>/jq'
value=$("$jqbin" -r ".values[$k]" abc.json);
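If Python happens to be available on the machine (an assumption; it is not guaranteed on every Linux install), the names can be selected without jq at all. A minimal sketch, where repos_matching is a hypothetical helper and SAMPLE keeps only the fields needed from abc.json:

```python
import json


def repos_matching(text, needle="web"):
    """Return the name of every repository in "values" whose name contains needle."""
    data = json.loads(text)
    return [repo["name"] for repo in data["values"] if needle in repo["name"]]


# Reduced stand-in for abc.json (assumption: the real script would read the file).
SAMPLE = """
{
  "size": 3,
  "values": [
    {"slug": "docker_apache_customised", "name": "docker_apache_customised"},
    {"slug": "web-software", "name": "web-software"},
    {"slug": "web-loy-conf", "name": "web-loy-conf"}
  ],
  "start": 0
}
"""

# One name per line, so a shell "while read -r name" loop can consume the output.
print("\n".join(repos_matching(SAMPLE)))
```

A real JSON parser is used here on purpose: sed or grep patterns tend to break as soon as the formatting or nesting of the file changes.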

Passing JSON value using jq command to a new JSON file

I ran a curl command and then parsed out the value of "id".
Request:
curl "http://192.168.22.22/test/index/limit:1/page:1/sort:id/pag1.json" | jq -r '.[0].id'
curl response:
[
{
"id": "381",
"org_id": "9",
"date": "2018-10-10",
"info": "THIS IS TEST",
"uuid": "5bbd1b41bc",
"published": 1,
"an": "2",
"attribute_count": "4",
"orgc_id": "8",
"timestamp": "1",
"dEST": "0",
"sharing": "0",
"proposal": false,
"locked": false,
"level_id": "1",
"publish_timestamp": "0",
"disable_correlation": false,
"extends_uuid": "",
"Org": {
"id": "5",
"name": "test",
"uuid": "5b9bc"
},
"Orgc": {
"id": "1",
"name": "test",
"uuid": "5b9f93bdeac1b41bc"
},
"ETag": []
}
]
jq response:
381
Now I want to take the "id" value 381 and create a new JSON file on disk with that "id" in the right place.
The new JSON file, for example:
{
"request": {
"Event": {
"id": "381",
"task": "new"
}
}
}
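Given your input, this works when the curl output is piped into it:
curl "http://192.168.22.22/test/index/limit:1/page:1/sort:id/pag1.json" | jq -r '{"request": {"Event": {"id": .[0].id, "task": "new"}}}' > file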
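For comparison, the same transformation can be sketched in Python. This is an illustration only: build_request is a hypothetical helper, and RESPONSE is a shortened stand-in for the curl output shown in the question.

```python
import json


def build_request(response_text):
    """Build the new document from the "id" of the first element of the response."""
    events = json.loads(response_text)
    return {"request": {"Event": {"id": events[0]["id"], "task": "new"}}}


# Shortened stand-in for the curl response (assumption: the real text would be
# the body returned by the endpoint).
RESPONSE = '[{"id": "381", "org_id": "9", "date": "2018-10-10", "info": "THIS IS TEST"}]'

document = build_request(RESPONSE)
print(json.dumps(document, indent=2))
```

To write the result to disk, json.dump(document, f, indent=2) inside a with open(..., "w") as f: block would replace the print call.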

Add the same element of array in a existing JSON using jq

I have a JSON file and I want to copy a value from the top level into another place in the JSON.
I am trying to use the jq command-line tool.
{
"channel": "mychannel",
"videos": [
{
"id": "10",
"url": "youtube.com"
},
{
"id": "20",
"url": "youtube.com"
}
]
}
The output would be:
{
"channel": "mychannel",
"videos": [
{
"channel": "mychannel",
"id": "10",
"url": "youtube.com"
},
{
"channel": "mychannel",
"id": "20",
"url": "youtube.com"
}
]
}
In my JSON the "channel" is static, always the same value. I need a way to add it to each element of the videos array.
Can someone help me?
jq .videos + channel
Use a variable to remember .channel in the later stages of the pipeline.
$ jq '.channel as $ch | .videos[].channel = $ch' tmp.json
{
"channel": "mychannel",
"videos": [
{
"id": "10",
"url": "youtube.com",
"channel": "mychannel"
},
{
"id": "20",
"url": "youtube.com",
"channel": "mychannel"
}
]
}
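The same idea, remembering the top-level value and then assigning it to each array element, can be sketched in Python for readers who want to compare. add_channel is a hypothetical helper name; DOC mirrors the question's input.

```python
import json


def add_channel(doc):
    """Copy the top-level "channel" value into every object in "videos"."""
    channel = doc["channel"]  # plays the role of $ch in the jq filter
    for video in doc["videos"]:
        video["channel"] = channel
    return doc


DOC = json.loads("""
{
  "channel": "mychannel",
  "videos": [
    {"id": "10", "url": "youtube.com"},
    {"id": "20", "url": "youtube.com"}
  ]
}
""")

print(json.dumps(add_channel(DOC), indent=2))
```

Note that this version modifies the document in place, whereas the jq filter produces a new document and leaves tmp.json untouched.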