I have a file, my_file.json, which contains invalid JSON as shown below, and I need to delete the lines from "{" up to "}," whenever the block contains "name": my_script.py.
[
{
"use": abcd
"name": my_script.py
"contact": xyz
"time": 11:22:33
},
{
"use": abcd
"name": some_other_script.py
"contact": xyz
"time": 11:22:33
},
{
"use": abcd
"name": my_script.py
"contact": xyz
"time": 11:22:33
}
]
I tried the sed command below,
sed '//{/ {:a;/}/!{N;ba};/my_script/d}' my_file.json
but it does not work and gives me this error:
"sed: -e expression#1, char 11: `}' doesn't want any addresses".
This might work for you (GNU sed):
sed '/{/{:a;N;/}/!ba;/my_script\.py/d}' file
Gather up the lines between { and } and if those lines contain my_script.py delete them.
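The current example is not JSON; however, this is probably a more robust solution. The sed step quotes the bare values and inserts the missing commas so that the input becomes valid JSON, and jq then filters out the unwanted objects: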
sed 's/\S\+/"&"/2;T;N;/}/!s/\n/,&/;P;D' file |
jq '[ .[]|select(.name!="my_script.py") ]'
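If neither GNU sed nor jq is at hand, the same block-wise deletion can be sketched in Python. This is only a rough sketch under some assumptions: every block opens with a line holding just { and closes with a line holding } or }, as in the sample, and the function name drop_blocks is made up for illustration. It compares the whole "name" line, so a name such as not_my_script.py would not be deleted by accident.

```python
def drop_blocks(text, needle="my_script.py"):
    """Remove every { ... } block that contains the line '"name": <needle>'."""
    kept, block = [], None
    for line in text.splitlines():
        stripped = line.strip()
        if block is None:
            if stripped == "{":
                block = [line]          # start collecting a block
            else:
                kept.append(line)       # line outside any block
        else:
            block.append(line)
            if stripped in ("}", "},"):  # block is complete
                wanted = f'"name": {needle}'
                if not any(row.strip() == wanted for row in block):
                    kept.extend(block)
                block = None
    if block:                            # unterminated block: keep it as is
        kept.extend(block)
    return "\n".join(kept) + "\n"


sample = """[
{
"use": abcd
"name": my_script.py
"contact": xyz
"time": 11:22:33
},
{
"use": abcd
"name": some_other_script.py
"contact": xyz
"time": 11:22:33
},
{
"use": abcd
"name": my_script.py
"contact": xyz
"time": 11:22:33
}
]
"""

result = drop_blocks(sample)
print(result, end="")
```

Like the first sed command, this leaves the trailing comma of the surviving block in place, so the result is still not valid JSON.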
I have been struggling with this for some time and need some help with the following operation.
I have a JSON file and I would like to replace a string with something a bit more complex.
This is a snippet of my json file:
{ "AWS679f53fac002430cb0da5b7982bd22872D164C4C": {
"Type": "AWS::Lambda::Function",
"Properties": {
"Code": {
"S3Bucket": "hnb659fds-assets-xxccddff",
"S3Key": "68b4ffa1c39cb3733535725f85311791c09eab53b7ab8efa5152e68f8abdb005.zip"
},
"Role": {
"Fn::GetAtt": [
"AWS679f53fac002430cb0da5b7982bd2287ServiceRoleC1EA0FF2",
"Arn"
]
},
"Handler": "index.handler",
"Runtime": "nodejs12.x",
"Timeout": 120
},
"DependsOn": [
"AWS679f53fac002430cb0da5b7982bd2287ServiceRoleC1EA0FF2"
],
"Metadata": {
"aws:cdk:path": "CODE/AWS679f53fac002430cb0da5b7982bd2287/Resource",
"aws:asset:path": "asset.68b4ffa1c39cb3733535725f85311791c09eab53b7ab8efa5152e68f8abdb005",
"aws:asset:is-bundled": false,
"aws:asset:property": "Code"
}
}
}
What I need is to replace this part
"S3Bucket": "hnb659fds-assets-xxccddff",
and have the following result
"S3Bucket": {"Fn::Sub": "AAA-${AWS::Region}" },
I don't know the key AWS679f53fac002430cb0da5b7982bd22872D164C4C in advance. It is generated randomly, and the string to replace appears several times in my JSON file.
The value to be replaced and the new value to use in its place are stored in variables as follows:
cdk_bucket_name=hnb659fds-assets-xxccddff
OUTPUT_BUCKET=AAA
I need these variables because this is part of a bigger script.
So I tried sed, but it does not work:
new_bucket_name="{"Fn::Sub\": \"$OUTPUT_BUCKET-${AWS::Region}\" }"
sed -i "s#$cdk_bucket_name#$new_bucket_name#g" my.template.json
One issue is that ${AWS::Region} gets interpreted by the shell, so it ends up empty.
The second is that I cannot get the quoting right to produce the desired result.
Using sed
$ output_bucket=AAA
$ new_bucket_name="{\"Fn::Sub\": \"$output_bucket-\${AWS::Region}\" }"
$ cdk_bucket_name=hnb659fds-assets-xxccddff
$ sed s"/\"$cdk_bucket_name\"/$new_bucket_name/" input_file
{ "AWS679f53fac002430cb0da5b7982bd22872D164C4C": {
"Type": "AWS::Lambda::Function",
"Properties": {
"Code": {
"S3Bucket": {"Fn::Sub": "AAA-${AWS::Region}" },
"S3Key": "68b4ffa1c39cb3733535725f85311791c09eab53b7ab8efa5152e68f8abdb005.zip"
},
"Role": {
"Fn::GetAtt": [
"AWS679f53fac002430cb0da5b7982bd2287ServiceRoleC1EA0FF2",
"Arn"
]
},
"Handler": "index.handler",
"Runtime": "nodejs12.x",
"Timeout": 120
},
"DependsOn": [
"AWS679f53fac002430cb0da5b7982bd2287ServiceRoleC1EA0FF2"
],
"Metadata": {
"aws:cdk:path": "CODE/AWS679f53fac002430cb0da5b7982bd2287/Resource",
"aws:asset:path": "asset.68b4ffa1c39cb3733535725f85311791c09eab53b7ab8efa5152e68f8abdb005",
"aws:asset:is-bundled": false,
"aws:asset:property": "Code"
}
}
}
Using a proper JSON parser shell tool like jq:
jq '
(
.[].Properties.Code.S3Bucket |
select(. == "hnb659fds-assets-xxccddff")
) = $newS3Bucket
' input_file.json \
--argjson newS3Bucket '{"Fn::Sub":"AAA-${AWS::Region}"}'
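Since the question says the bucket name occurs in several places, a recursive walk over the parsed document is another option. Below is a minimal sketch in Python, assuming the template is valid JSON; the helper name replace_bucket is made up for illustration and the embedded template is a trimmed copy of the one in the question. Python does not expand ${AWS::Region} inside a string, so no escaping is needed.

```python
import json


def replace_bucket(node, old, new):
    """Return a copy of node where every "S3Bucket": old becomes "S3Bucket": new."""
    if isinstance(node, dict):
        return {
            key: new if key == "S3Bucket" and value == old
            else replace_bucket(value, old, new)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [replace_bucket(item, old, new) for item in node]
    return node  # strings, numbers, booleans, null


cdk_bucket_name = "hnb659fds-assets-xxccddff"
output_bucket = "AAA"
new_value = {"Fn::Sub": output_bucket + "-${AWS::Region}"}

# Trimmed copy of the template from the question.
template = json.loads("""
{ "AWS679f53fac002430cb0da5b7982bd22872D164C4C": {
    "Type": "AWS::Lambda::Function",
    "Properties": {
      "Code": {
        "S3Bucket": "hnb659fds-assets-xxccddff",
        "S3Key": "68b4ffa1c39cb3733535725f85311791c09eab53b7ab8efa5152e68f8abdb005.zip"
      },
      "Handler": "index.handler"
    }
  }
}
""")

result = replace_bucket(template, cdk_bucket_name, new_value)
print(json.dumps(result, indent=1))
```

Because the walk does not depend on the randomly generated key, it finds the bucket wherever it is nested.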
I am attempting to iterate through all my JSON files and add properties, but I am relatively new to jq.
Here is what I am attempting:
find hashlips_art_engine/build -type f -name '*.json' | jq '. + {
"creators": [
{
"address": "4iUFmB3H3RZGRrtuWhCMtkXBT51iCUnX8UV7R8rChJsU",
"share": 10
},
{
"address": "2JApg1AXvo1Xvrk3vs4vp3AwamxQ1DHmqwKwWZTikS9w",
"share": 45
},
{
"address": "Zdda4JtApaPs47Lxs1TBKTjh1ZH2cptjxXMwrbx1CWW",
"share": 45
}
]
}'
However this is returning an error:
parse error: Invalid numeric literal at line 2, column 0
I have around 10,000 JSON files that I need to iterate over, adding the following to each:
{
"creators": [
{
"address": "4iUFmB3H3RZGRrtuWhCMtkXBT51iCUnX8UV7R8rChJsU",
"share": 10
},
{
"address": "2JApg1AXvo1Xvrk3vs4vp3AwamxQ1DHmqwKwWZTikS9w",
"share": 45
},
{
"address": "Zdda4JtApaPs47Lxs1TBKTjh1ZH2cptjxXMwrbx1CWW",
"share": 45
}
]
}
Is this possible, or am I barking up the wrong tree?
Thanks for your assistance with this. I have been searching the web for several hours now, but either my terminology is incorrect or there isn't much out there regarding this issue.
The problem is that you are piping the filenames to jq rather than making the contents available to jq.
Most likely you could use the following approach, e.g. if you want the augmented contents of each file to be handled separately:
find ... | while IFS= read -r f ; do jq ... "$f" ; done
An alternative that might be relevant, provided the file names contain no whitespace and the whole list fits on one command line, would be:
jq ... $(find ...)
If you have 2 files:
file01.json :
{"a":"1","b":"2"}
file02.json :
{"x":"10","y":"12","z":"15"}
you can run jq on each file directly (no cat needed):
for f in file*.json ; do jq '. + { creators:[{address: "xxx",share:1}] } ' "$f" ; done
Result:
{
"a": "1",
"b": "2",
"creators": [
{
"address": "xxx",
"share": 1
}
]
}
{
"x": "10",
"y": "12",
"z": "15",
"creators": [
{
"address": "xxx",
"share": 1
}
]
}
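Note that the loops above only print the augmented JSON; they do not change the files. If the files themselves should be updated, something along these lines might do. It is a sketch in Python rather than jq, assuming every file holds a single JSON object; the function name add_creators is made up, and the demo runs on a throwaway directory rather than hashlips_art_engine/build.

```python
import json
import tempfile
from pathlib import Path

CREATORS = [
    {"address": "4iUFmB3H3RZGRrtuWhCMtkXBT51iCUnX8UV7R8rChJsU", "share": 10},
    {"address": "2JApg1AXvo1Xvrk3vs4vp3AwamxQ1DHmqwKwWZTikS9w", "share": 45},
    {"address": "Zdda4JtApaPs47Lxs1TBKTjh1ZH2cptjxXMwrbx1CWW", "share": 45},
]


def add_creators(root):
    """Add the "creators" key to every *.json file under root, in place."""
    updated = 0
    for path in sorted(Path(root).rglob("*.json")):
        if not path.is_file():
            continue
        data = json.loads(path.read_text(encoding="utf-8"))
        data["creators"] = CREATORS  # same effect as jq's `. + {creators: [...]}`
        path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
        updated += 1
    return updated


# Demo on a temporary directory with the two small files from above.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "file01.json").write_text('{"a":"1","b":"2"}', encoding="utf-8")
    Path(tmp, "file02.json").write_text('{"x":"10","y":"12","z":"15"}', encoding="utf-8")
    updated = add_creators(tmp)
    first = json.loads(Path(tmp, "file01.json").read_text(encoding="utf-8"))

print(updated, "files updated")
print(json.dumps(first, indent=2))
```

For the real data you would call add_creators("hashlips_art_engine/build"), ideally on a copy first, since the files are overwritten.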
Hi, I am new to jq and JSON. I am using
$ jq --version
jq-1.5
I am having a heck of a time trying to figure out how to select the values for id, attributes.name, attributes.albumName, and attributes.artistName.
I am using the Terminal app on a Mac. I am running into some sort of strange parsing problem:
$ jq '.results.songs.data[0] | {id, attributes.name } ' t
jq: error: syntax error, unexpected FIELD, expecting '}' (Unix shell quoting issues?) at <top-level>, line 1:
.results.songs.data[0] | {id, attributes.name }
jq: 1 compile error
$
This example shows what the structure of the data I am trying to filter looks like:
$ jq '.results.songs.data[0] | {id, attributes } ' t
{
"id": "152471393",
"attributes": {
"previews": [
{
"url": "https://audio-ssl.itunes.apple.com/apple-assets-us-std-000001/AudioPreview71/v4/7d/c5/68/7dc56849-29b8-bd90-2bb1-51750e479569/mzaf_4742389090778091050.plus.aac.p.m4a"
}
],
"artwork": {
"width": 1449,
"height": 1449,
"url": "https://is5-ssl.mzstatic.com/image/thumb/Music/v4/7d/01/56/7d0156be-12cd-8724-a0ca-727b1013a81d/source/{w}x{h}bb.jpeg",
"bgColor": "ddcfc4",
"textColor1": "010100",
"textColor2": "422f10",
"textColor3": "2d2a27",
"textColor4": "614f34"
},
"artistName": "Gnarls Barkley",
"url": "https://itunes.apple.com/us/album/crazy/152471339?i=152471393",
"discNumber": 1,
"genreNames": [
"Alternative",
"Music",
"R&B/Soul",
"Rock",
"Soul",
"Hip-Hop/Rap",
"Rap",
"Hip-Hop",
"Adult Alternative",
"Neo-Soul",
"Alternative Rap",
"Underground Rap"
],
"durationInMillis": 178387,
"releaseDate": "2006-03-13",
"name": "Crazy",
"isrc": "USAT20611041",
"albumName": "St. Elsewhere",
"playParams": {
"id": "152471393",
"kind": "song"
},
"trackNumber": 2
}
}
Thanks
Andy
With your sample JSON as input, the following invocation:
jq '{id, name: .attributes.name }' input.json
produces:
{
"id": "152471393",
"name": "Crazy"
}
The filter above is short for:
{"id" : .id, "name": .attributes.name }
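In any case, the keys must be appropriately specified: the {id} shorthand only works for a plain key name, so a path such as .attributes.name needs an explicit key.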
For future reference, when asking questions on stackoverflow.com, please adhere to the http://stackoverflow.com/help/mcve guidelines as much as possible.
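For comparison, here is the same selection for all four fields from the question, written as a small Python sketch. The embedded document is a trimmed stand-in for the file t, keeping only the fields of interest, and the variable names are just for illustration.

```python
import json

# Trimmed stand-in for the file `t` from the question.
doc = json.loads("""
{"results": {"songs": {"data": [
  {"id": "152471393",
   "attributes": {"artistName": "Gnarls Barkley",
                  "name": "Crazy",
                  "albumName": "St. Elsewhere",
                  "trackNumber": 2}}]}}}
""")

song = doc["results"]["songs"]["data"][0]
attributes = song["attributes"]

# Each output key is named explicitly, just as in the jq object above.
picked = {
    "id": song["id"],
    "name": attributes["name"],
    "albumName": attributes["albumName"],
    "artistName": attributes["artistName"],
}
print(json.dumps(picked, indent=2))
```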
I have a file:
{
"test_data": [
{
"id": "1",
"pm": "30",
"mp": "40"
}
],
"test": [
"id",
"pm",
"mp"
]
}
I want to extract test_data. Desired output:
"test_data": [
{
"id": "1",
"pm": "30",
"mp": "40"
}
],
I tried this command: cat myFile | sed -n '/^"test_data": \[$/,/^\],$/p'
But it doesn't work. Any ideas?
Thank you!
jq seems like the right tool for the job:
$ jq '.|{test_data:.test_data}' filename
{
"test_data": [
{
"id": "1",
"pm": "30",
"mp": "40"
}
]
}
Solution 1: with sed
sed -n '/"test_data"/,/],/p' Input_file
Or, since the OP also needs to append a string after the line that matches:
sed -n '/"test_data"/,/],/p;/],/s/$/"test"/p' Input_file
Or, if you want to add another file's content after the match, the following may help:
sed -n '/"test_data"/,/],/p;/],/r another_file' Input_file
Solution 2: the following simple awk may also help.
awk '/test_data/, /],/' Input_file
The output will be as follows.
"test_data": [
{
"id": "1",
"pm": "30",
"mp": "40"
}
],
Logic for the above solutions:
For sed: the -n option turns off automatic printing, so a line is printed only when explicitly requested. The range /"test_data"/,/],/ tells sed to select the lines from the one matching "test_data" through the next one matching ],, and the p after it prints every line that falls in that range.
For awk: the range /test_data/,/],/ is given without an action, so for every line inside the range the condition is true and awk performs its default action, which is to print the line.
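To make the range logic concrete, here is a rough Python sketch of what a /start/,/end/ range does. It follows sed's rule that the end pattern is only looked for on lines after the one that opened the range. The function name select_range is made up for illustration, and the input is the file from the question.

```python
import re


def select_range(lines, start, end):
    """Return the lines from each start match through the next end match."""
    selected, inside = [], False
    for line in lines:
        if inside:
            selected.append(line)
            if re.search(end, line):    # the range closes on this line
                inside = False
        elif re.search(start, line):    # the range opens on this line
            selected.append(line)
            inside = True
    return selected


input_file = """{
"test_data": [
{
"id": "1",
"pm": "30",
"mp": "40"
}
],
"test": [
"id",
"pm",
"mp"
]
}
"""

selected = select_range(input_file.splitlines(), r'"test_data"', r"\],")
print("\n".join(selected))
```

The flag inside plays the role of the range state that sed and awk keep between lines.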
You can try this with GNU csplit:
csplit -s -z infile %test_data%-1 /],/1;rm xx01;echo "Add Text here" >> xx00;cat xx00
The right way is to use the jq tool:
jq 'del(.test)' inputfile
The output:
{
"test_data": [
{
"id": "1",
"pm": "30",
"mp": "40"
}
]
}
I have JSON in the following format that I need to convert to CSV:
[{
"name": "joe",
"age": 21,
"skills": [{
"lang": "spanish",
"grade": "47",
"school": {
"name": "my school",
"url": "example.com/sp-school"
}
}, {
"lang": "english",
"grade": "87"
}]
},
{
"name": "sarah",
"age": 34,
"skills": [{
"lang": "french",
"grade": "47",
"school": {
"name": "my school",
"url": "example.com/sp-school"
}
}, {
"lang": "english",
"grade": "87"
}]
}, {
"name": "jim",
"age": 26,
"skills": [{
"lang": "spanish",
"grade": "60"
}, {
"lang": "english",
"grade": "66",
"school": {
"name": "eg school",
"url": "eg-school.com"
}
}]
}
]
The CSV I want is:
name,age,grade,school,url,file,line_number
joe,21,47,"my school","example.com/sp-school",sample.json,1
jim,26,60,"","",sample.json,3
So I want the top-level fields, plus the object from the skills array where lang is "spanish", plus the school hash from that Spanish skills object if it exists.
I'd also like to add the file and the line number it came from.
I would like to use jq for the job, but I can't figure out the syntax. Can anyone help me out?
With your data in input.json, and the following jq program in tocsv.jq:
.[]
| [.name, .age] +
(.skills[]
| select(.lang == "spanish")
| [.grade, .school.name, .school.url, input_filename, input_line_number] )
| @csv
the invocation:
jq -r -f tocsv.jq input.json
yields:
"joe",21,"47","my school","example.com/sp-school","input.json",51
"jim",26,"60",,,"input.json",51
If you want the number-valued strings converted to numbers, you could use the "tonumber" filter. If you want the null-valued fields replaced by strings, use e.g. .school.name // ""
Of course this approach doesn't yield a very useful line number. One approach that would yield higher granularity would be to stream the individual objects into jq, but then you'd lose the filename. To recover the filename you could pass it in as an argument. So you would have a pipeline like so:
jq -c '.[]' input.json | jq -r --arg file input.json -f tocsv2.jq
where tocsv2.jq would be like tocsv.jq above but without the initial .[] |, and with $file instead of input_filename.
Finally, please also consider using the TSV format (@tsv) rather than the rather messy CSV format (@csv).
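If a script is acceptable instead of jq, the same conversion can be sketched in Python. One assumption to flag loudly: line_number is taken here to be the 1-based position of the record in the top-level array, which is what the expected output in the question (1 for joe, 3 for jim) suggests. The function name spanish_rows is made up, and sample.json is just the file name from the question passed in as a string.

```python
import csv
import io
import json

SAMPLE = """
[{"name": "joe", "age": 21, "skills": [
    {"lang": "spanish", "grade": "47",
     "school": {"name": "my school", "url": "example.com/sp-school"}},
    {"lang": "english", "grade": "87"}]},
 {"name": "sarah", "age": 34, "skills": [
    {"lang": "french", "grade": "47",
     "school": {"name": "my school", "url": "example.com/sp-school"}},
    {"lang": "english", "grade": "87"}]},
 {"name": "jim", "age": 26, "skills": [
    {"lang": "spanish", "grade": "60"},
    {"lang": "english", "grade": "66",
     "school": {"name": "eg school", "url": "eg-school.com"}}]}]
"""


def spanish_rows(people, filename):
    """Yield one CSV row per Spanish skill, with empty strings for a missing school."""
    for position, person in enumerate(people, start=1):
        for skill in person.get("skills", []):
            if skill.get("lang") != "spanish":
                continue
            school = skill.get("school", {})
            yield [person["name"], person["age"], skill["grade"],
                   school.get("name", ""), school.get("url", ""),
                   filename, position]  # position = assumed meaning of line_number


rows = list(spanish_rows(json.loads(SAMPLE), "sample.json"))

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["name", "age", "grade", "school", "url", "file", "line_number"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text, end="")
```

The csv module only quotes fields that need it, so the quoting will differ slightly from jq's @csv output, which quotes every string.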