I've been struggling with this for some time and would appreciate some help with the following operation.
I have a JSON file and I would like to replace a string with something a bit more complex.
This is a snippet of my json file:
{ "AWS679f53fac002430cb0da5b7982bd22872D164C4C": {
"Type": "AWS::Lambda::Function",
"Properties": {
"Code": {
"S3Bucket": "hnb659fds-assets-xxccddff",
"S3Key": "68b4ffa1c39cb3733535725f85311791c09eab53b7ab8efa5152e68f8abdb005.zip"
},
"Role": {
"Fn::GetAtt": [
"AWS679f53fac002430cb0da5b7982bd2287ServiceRoleC1EA0FF2",
"Arn"
]
},
"Handler": "index.handler",
"Runtime": "nodejs12.x",
"Timeout": 120
},
"DependsOn": [
"AWS679f53fac002430cb0da5b7982bd2287ServiceRoleC1EA0FF2"
],
"Metadata": {
"aws:cdk:path": "CODE/AWS679f53fac002430cb0da5b7982bd2287/Resource",
"aws:asset:path": "asset.68b4ffa1c39cb3733535725f85311791c09eab53b7ab8efa5152e68f8abdb005",
"aws:asset:is-bundled": false,
"aws:asset:property": "Code"
}
}
}
What I need is to replace this part
"S3Bucket": "hnb659fds-assets-xxccddff",
and have the following result
"S3Bucket": {"Fn::Sub": "AAA-${AWS::Region}" },
I don't know the AWS679f53fac002430cb0da5b7982bd22872D164C4C key in advance: it is generated randomly, and the string to replace appears several times in my JSON file.
The value to be replaced is stored in a variable, as is the new value to use in the replacement:
cdk_bucket_name=hnb659fds-assets-xxccddff
OUTPUT_BUCKET=AAA
I need these variables because this is part of a bigger script.
So I tried sed, but it does not work:
new_bucket_name="{"Fn::Sub\": \"$OUTPUT_BUCKET-${AWS::Region}\" }"
sed -i "s#$cdk_bucket_name#$new_bucket_name#g" my.template.json
One issue is that ${AWS::Region} gets expanded by the shell, so it ends up empty.
The second is that I cannot get the quoting right to produce my desired result.
Using sed
$ output_bucket=AAA
$ new_bucket_name="{\"Fn::Sub\": \"$output_bucket-\${AWS::Region}\" }"
$ cdk_bucket_name=hnb659fds-assets-xxccddff
$ sed s"/\"$cdk_bucket_name\"/$new_bucket_name/" input_file
{ "AWS679f53fac002430cb0da5b7982bd22872D164C4C": {
"Type": "AWS::Lambda::Function",
"Properties": {
"Code": {
"S3Bucket": {"Fn::Sub": "AAA-${AWS::Region}" },
"S3Key": "68b4ffa1c39cb3733535725f85311791c09eab53b7ab8efa5152e68f8abdb005.zip"
},
"Role": {
"Fn::GetAtt": [
"AWS679f53fac002430cb0da5b7982bd2287ServiceRoleC1EA0FF2",
"Arn"
]
},
"Handler": "index.handler",
"Runtime": "nodejs12.x",
"Timeout": 120
},
"DependsOn": [
"AWS679f53fac002430cb0da5b7982bd2287ServiceRoleC1EA0FF2"
],
"Metadata": {
"aws:cdk:path": "CODE/AWS679f53fac002430cb0da5b7982bd2287/Resource",
"aws:asset:path": "asset.68b4ffa1c39cb3733535725f85311791c09eab53b7ab8efa5152e68f8abdb005",
"aws:asset:is-bundled": false,
"aws:asset:property": "Code"
}
}
}
Using a proper JSON parser shell tool like jq:
jq '
(
.[].Properties.Code.S3Bucket |
select(. == "hnb659fds-assets-xxccddff")
) = $newS3Bucket
' input_file.json \
--argjson newS3Bucket '{"Fn::Sub":"AAA-${AWS::Region}"}'
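If Python happens to be available in the bigger script, the same replacement could be sketched as below. This is only a minimal illustration, not a tested drop-in: the helper name replace_value, the key "SomeGeneratedId" and the file name "example.zip" are made up for the example, and the template is inlined instead of being read from my.template.json. Because the JSON is parsed rather than edited as text, the randomly generated key does not need to be known, and a plain Python string leaves ${AWS::Region} untouched.

```python
import json


def replace_value(node, old, new):
    """Return a copy of node in which every value equal to old is replaced by new."""
    if isinstance(node, dict):
        return {key: replace_value(value, old, new) for key, value in node.items()}
    if isinstance(node, list):
        return [replace_value(item, old, new) for item in node]
    return new if node == old else node


cdk_bucket_name = "hnb659fds-assets-xxccddff"
output_bucket = "AAA"
# Python does not expand ${...} in ordinary strings, so no escaping is needed here.
new_bucket = {"Fn::Sub": output_bucket + "-${AWS::Region}"}

# Hypothetical, shortened stand-in for the real template file.
template = json.loads("""
{ "SomeGeneratedId": {
    "Type": "AWS::Lambda::Function",
    "Properties": {
      "Code": {
        "S3Bucket": "hnb659fds-assets-xxccddff",
        "S3Key": "example.zip"
      }
    }
  }
}
""")

updated = replace_value(template, cdk_bucket_name, new_bucket)
print(json.dumps(updated, indent=1))
```

For a real file, the inlined string would be replaced by json.load on the opened template, and the result written back with json.dump.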
I have a json file that looks like so:
[
{
"code": "1234",
"files": [
{
"fileType": "pdf",
"url": "http://.../a.pdf"
},
{
"fileType": "video",
"url": "http://.../b.mp4"
}
]
},
{
"code": "4321",
"files": [
{
"fileType": "pdf",
"url": "http://.../c.pdf"
},
{
"fileType": "video",
"url": "http://.../d.mp4"
}
]
},
{
"code": "9999",
"files": [
{
"fileType": "pdf",
"url": "http://.../e.pdf"
}
]
}
]
I would like to print out only the files that are of fileType == video in the files array such that I end up with output that looks like so:
1234, "http://.../b.mp4"
4321, "http://.../d.mp4"
So far I am only able to output something that looks like this:
1234, "http://.../a.pdf", "http://.../b.mp4",
4321, "http://.../c.pdf", "http://.../d.mp4"
Using the following:
jq -r '.[] | select(.files[]?.fileType == "video") | [.code, .files[].url] | @csv'
I was wondering how I can filter the .files[] based on the fileType as I am outputting them?
The following pipeline makes the solution fairly self-explanatory, assuming one understands the basic syntax and the -r command-line option:
< input.json jq -r '
.[]
| .code as $code
| .files[]
| select(.fileType == "video")
| "\($code), \"\(.url)\""
'
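For comparison, the same idea (remember the code, iterate over the files, keep only the videos) can be sketched in Python. This is just an illustration under the assumption that the input has the shape shown in the question; the sample data is inlined and the variable names are my own.

```python
import json

# Sample data copied from the question, inlined so the sketch is self-contained.
data = json.loads("""
[
  {"code": "1234", "files": [
    {"fileType": "pdf", "url": "http://.../a.pdf"},
    {"fileType": "video", "url": "http://.../b.mp4"}]},
  {"code": "4321", "files": [
    {"fileType": "pdf", "url": "http://.../c.pdf"},
    {"fileType": "video", "url": "http://.../d.mp4"}]},
  {"code": "9999", "files": [
    {"fileType": "pdf", "url": "http://.../e.pdf"}]}
]
""")

# One output line per video file; entries without a video produce nothing.
lines = [
    f'{entry["code"]}, "{item["url"]}"'
    for entry in data
    for item in entry.get("files", [])
    if item.get("fileType") == "video"
]
print("\n".join(lines))
```

The filter is applied to each file individually, which is the difference from the attempt in the question, where select was applied to the whole entry.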
I'm not sure how to name these elements properly, so it will be easier just to show it. I have the following JSON:
{
"DEV": [
{
"GitEmail": "asd#asd.com"
}
],
"TEST": [
{
"GitEmail": "asd1#asd.com"
}
],
"PROD": [
{
"GitEmail": "asd2#asd.com"
}
]
}
I would like to get the "DEV" by providing its email. How can I implement that in PowerShell?
Something like the following can help:
PS> $json = '{
"DEV": [
{
"GitEmail": "asd#asd.com"
}
],
"TEST": [
{
"GitEmail": "asd1#asd.com"
}
],
"PROD": [
{
"GitEmail": "asd2#asd.com"
}
]
}' | ConvertFrom-Json
PS> ($json.psobject.Properties | ? {$_.Value -match "asd#asd.com"}).Name
Depending on which email matches, this returns the corresponding environment name.
I can't promise there is an easier method, but this here is one way:
Given that your JSON is stored in a variable $json:
You can get every top-level key with $json.psobject.properties.name:
Input:
$json.psobject.properties.name
Output:
DEV
TEST
PROD
With this we can create a foreach loop and search for the Email:
foreach ($dev in $json.psobject.properties.name)
{
if($json.$dev.GitEmail -eq "asd#asd.com") {
echo $dev
}
}
I do not know of any elegant way of doing it. ConvertFrom-Json does not create neat objects that are easy to traverse, like convertfrom-xml does; instead, the result is just a PSObject with a bunch of NoteProperties.
What I do in such cases is:
$a= @"
{
"DEV": [
{
"GitEmail": "asd#asd.com"
}
],
"TEST": [
{
"GitEmail": "asd1#asd.com"
}
],
"PROD": [
{
"GitEmail": "asd2#asd.com"
}
]
}
"@
$JsonObject= ConvertFrom-Json -InputObject $a
$NAMES= $JsonObject|Get-Member |WHERE MemberType -EQ NOTEPROPERTY
$NAMES|Foreach-Object {IF($JsonObject.$($_.NAME).GITEMAIL -EQ 'asd#asd.com'){$_.NAME}}
Result of above is
DEV
Not pretty, not really re-usable but works.
If anyone knows a better way of going about it - I'll be happy to learn it:)
I am trying to convert the CSV below into JSON format.
Africa,Kenya,NAI,281
Africa,Kenya,NAI,281
Asia,India,NSI,100
Asia,India,BSE,160
Asia,Pakistan,ISE,100
Asia,Pakistan,ANO,100
European Union,United Kingdom,LSE,100
This is the desired JSON format, and I just cannot manage to create it. I will post my work in progress below. Any help or direction would be appreciated.
{"name":"Africa",
"children":[
{"name":"Kenya",
"children":[
{"name":"NAI","size":"109"},
{"name":"NAA","size":"160"}]}]},
{"name":"Asia",
"children":[
{"name":"India",
"children":[
{"name":"NSI","size":"100"},
{"name":"BSE","size":"60"}]},
{"name":"Pakistan",
"children":[
{"name":"ISE","size":"120"},
{"name":"ANO","size":"433"}]}]},
{"name":"European Union",
"children":[
{"name":"United Kingdom",
"children":[
{"name":"LSE","size":"550"},
{"name":"PLU","size":"123"}]}]}
Work in Progress.
$1 is the file with the csv values pasted above.
#!/bin/bash
pcountry=$(head -1 $1 | cut -d, -f2)
cat $1 | while read line ; do
region=$(echo $line|cut -d, -f1)
country=$(echo $line|cut -d, -f2)
code=$(echo $line|cut -d, -f3-)
size=$(echo $line|cut -d, -f4)
if test "$pcountry" == "$country" ;
then
echo -e {\"name\":\"$region\", '\n' \"children\": [ '\n'{\"name\":\"$country\",'\n'\"children\": [ '\n' \{\"name\":\"NAI\",\"size\":\"$size\"\}
else
if test "$pregion" == "$region"
then :
else
echo -e ,'\n'{\"name\":\""$region\", '\n' \"children\": [ '\n'{\"name\":\"$country\",'\n'\"children\": [ '\n' \{\"name\":\"NAI\",\"size\":\"$size\"\},
pcountry=$country
pregion=$region
fi ; done
The problem is that I cannot seem to find a way to detect when a country's values end.
As a number of the commenters have said, using the shell for this kind of conversion is a horrible idea. It would be nigh impossible to do with just bash builtins, and shell scripts exist mainly to combine standard Unix commands like sed, awk, and cut anyway. You should choose a language that's built for this kind of iterative parsing and processing to solve your problem.
However, because it's late and I've had too much coffee, I threw together a bash script (with a few bits of sed thrown in for parsing help) that takes the example .csv data you have and outputs the JSON in the format you noted. Here's the script:
#! /bin/bash
# Initial input file format:
#
# Africa,Kenya,NAI,281
# Africa,Kenya,NAA,281
# Asia,India,NSI,100
# Asia,India,BSE,160
# Asia,Pakistan,ISE,100
# Asia,Pakistan,ANO,100
# European Union,United Kingdom,LSE,100
#
# Intermediate file format for parsing to JSON:
#
# Africa|Kenya:NAI=281&NAA=281
# Asia|India:NSI=100&BSE=160|Pakistan:ISE=100&ANO=100
# European Union|United Kingdom:LSE=100
#
# Call as:
#
# $ ./script INPUTFILE.csv >OUTPUTFILE.json
#
# temporary files for output/parsing
TMP="./tmp.dat"
TMP2="./tmp2.dat"
>$TMP
>$TMP2
# read through initial file and output intermediate format
while read line
do
region=$(echo $line | cut -d, -f1)
country=$(echo $line | cut -d, -f2)
code=$(echo $line | cut -d, -f3)
size=$(echo $line | cut -d, -f4)
# region record already started
if grep "^$region" $TMP 2>&1 >/dev/null ;then
>$TMP2
while read rec
do
if echo $rec | grep "^$region" 2>&1 >/dev/null
then
# -F matches the literal string "|country:" (GNU grep would treat \| as alternation)
if echo "$rec" | grep -qF "|$country:"
then
echo "$rec" | sed -e 's/\('"$country"':[^\|][^\|]*\)/\1\&'"$code"'='"$size"'/' >>$TMP2
else
echo "$rec|$country:$code=$size" >>$TMP2
fi
else
echo $rec >>$TMP2
fi
done < $TMP
mv $TMP2 $TMP
else
# new region
echo "$region|$country:$code=$size" >>$TMP
fi
done < $1
# Parse through our intermediary format and output JSON to standard out
echo "["
country_count=$(cat $TMP | wc -l)
while read line
do
country=$(echo $line | cut -d\| -f1)
echo "{ \"name\": \"$country\", "
echo " \"children\": ["
region_count=$(echo $line | cut -d\| -f2- | sed -e 's/|/\n/g' | wc -l)
echo $line | cut -d\| -f2- | sed -e 's/|/\n/g' |
while read region
do
name=$(echo $region | cut -d: -f1)
echo " { \"name\": \"$name\", "
echo " \"children\": ["
code_count=$(echo $region | sed -e 's/^'"$name"'://' -e 's/&/\n/g' | wc -l)
echo $region | sed -e 's/^'"$name"'://' -e 's/&/\n/g' |
while read code_size
do
code=$(echo $code_size | cut -d= -f1)
size=$(echo $code_size | cut -d= -f2)
code_count=$((code_count - 1))
COMMA=""
if [ $code_count -gt 0 ]; then
COMMA=","
fi
echo " { \"name\": \"$code\", \"size\": \"$size\" }$COMMA "
done
echo " ]"
region_count=$((region_count - 1))
if [ $region_count -gt 0 ]; then
echo " },"
else
echo " }"
fi
done
echo " ]"
country_count=$((country_count - 1))
COMMA=""
if [ $country_count -gt 0 ]; then
COMMA=","
fi
echo "}$COMMA"
done < $TMP
echo "]"
exit 0
And, here's the resulting output from the above script:
[
{ "name": "Africa",
"children": [
{ "name": "Kenya",
"children": [
{ "name": "NAI", "size": "281" },
{ "name": "NAA", "size": "281" }
]
}
]
},
{ "name": "Asia",
"children": [
{ "name": "India",
"children": [
{ "name": "NSI", "size": "100" },
{ "name": "BSE", "size": "160" }
]
},
{ "name": "Pakistan",
"children": [
{ "name": "ISE", "size": "100" },
{ "name": "ANO", "size": "100" }
]
}
]
},
{ "name": "European Union",
"children": [
{ "name": "United Kingdom",
"children": [
{ "name": "LSE", "size": "100" }
]
}
]
}
]
Please don't use code like the above in any production environment.
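To illustrate what "a better language" might look like, here is a minimal Python sketch of the same grouping. It is only a sketch under a few assumptions: the CSV is inlined (using the sample with NAI and NAA from the script comments above), every row has exactly four fields, and the function name build_tree is my own. It relies on dictionaries keeping insertion order, so regions and countries come out in the order they first appear.

```python
import csv
import io
import json

# Sample input, inlined so the sketch runs on its own.
csv_text = """\
Africa,Kenya,NAI,281
Africa,Kenya,NAA,281
Asia,India,NSI,100
Asia,India,BSE,160
Asia,Pakistan,ISE,100
Asia,Pakistan,ANO,100
European Union,United Kingdom,LSE,100
"""


def build_tree(rows):
    """Group (region, country, code, size) rows into nested name/children objects."""
    regions = {}  # region -> {country -> [leaf, ...]}
    for region, country, code, size in rows:
        leaves = regions.setdefault(region, {}).setdefault(country, [])
        leaves.append({"name": code, "size": size})
    return [
        {
            "name": region,
            "children": [
                {"name": country, "children": leaves}
                for country, leaves in countries.items()
            ],
        }
        for region, countries in regions.items()
    ]


# Skip blank lines; csv.reader yields one list of fields per row.
rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
tree = build_tree(rows)
print(json.dumps(tree, indent=1))
```

Because the rows are grouped in a dictionary first, there is no need to detect where a country's values end, which was the sticking point in the shell attempt.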
Here is a solution using jq.
If filter.jq contains the following filter
reduce (
split("\n")[] # split string into lines
| split(",") # split data
| select(length>0) # eliminate blanks
) as [$c1,$c2,$c3,$c4] ( # convert to object
{} # e.g. "Africa": { "Kenya": {
; setpath([$c1,$c2,"name"];$c3) # "name": "NAI",
| setpath([$c1,$c2,"size"];$c4) # "size": "281"
) # }, }
| [ # then build final array of objects format:
keys[] as $k1 # [ {
| {name: $k1, children: ( # "name": "Africa",
.[$k1] # "children": {
| keys[] as $k2 # "name": "Kenya",
| {name: $k2, children:.[$k2]} # "children": { "name": "NAI", "size": "281" }
)} # ...
]
and data contains the sample data then the command
$ jq -M -Rsr -f filter.jq data
produces
[
{
"name": "Africa",
"children": {
"name": "Kenya",
"children": {
"name": "NAI",
"size": "281"
}
}
},
{
"name": "Asia",
"children": {
"name": "India",
"children": {
"name": "BSE",
"size": "160"
}
}
},
{
"name": "Asia",
"children": {
"name": "Pakistan",
"children": {
"name": "ANO",
"size": "100"
}
}
},
{
"name": "European Union",
"children": {
"name": "United Kingdom",
"children": {
"name": "LSE",
"size": "100"
}
}
}
]
You'd be much better off using a tool like xidel that can manipulate csv / raw text and understands JSON:
I'm going to assume so_24300508.csv contains:
Africa,Kenya,NAI,109
Africa,Kenya,NAA,160
Asia,India,NSI,100
Asia,India,BSE,60
Asia,Pakistan,ISE,120
Asia,Pakistan,ANO,433
European Union,United Kingdom,LSE,550
European Union,United Kingdom,PLU,123
(this is extracted from your JSON sample instead of the CSV sample you provided)
xidel -s so_24300508.csv --json-mode=deprecated --xquery '
[
let $csv:=x:lines($raw)
for $region in distinct-values($csv ! tokenize(.,",")[1])
return {
"name":$region,
"children":[
for $country in distinct-values($csv[starts-with(.,$region)] ! tokenize(.,",")[2]) return {
"name":$country,
"children":for $data in $csv[starts-with(.,$region) and contains(.,$country)]
let $value:=tokenize($data,",")
return {
"name":$value[3],
"size":$value[4]
}
}
]
}
]
'
(without --json-mode=deprecated replace [ ] with array{ })
See this code snippet for intermediate steps leading to this query.
Also see this online xidelcgi demo.
Output:
[
{
"name": "Africa",
"children": [
{
"name": "Kenya",
"children": [
{
"name": "NAI",
"size": "109"
},
{
"name": "NAA",
"size": "160"
}
]
}
]
},
{
"name": "Asia",
"children": [
{
"name": "India",
"children": [
{
"name": "NSI",
"size": "100"
},
{
"name": "BSE",
"size": "60"
}
]
},
{
"name": "Pakistan",
"children": [
{
"name": "ISE",
"size": "120"
},
{
"name": "ANO",
"size": "433"
}
]
}
]
},
{
"name": "European Union",
"children": [
{
"name": "United Kingdom",
"children": [
{
"name": "LSE",
"size": "550"
},
{
"name": "PLU",
"size": "123"
}
]
}
]
}
]