Shell: Replacing each New Line "\n" character with "\\n" - json

I'm inserting a git diff of changed files into a JSON object to send using a curl request.
The problem is that the JSON doesn't accept the raw new-line characters being inserted, and I'm not sure how to get around that. The translate tool (tr) didn't work; the perl solution I'm using is close, but it just replaces the new lines with spaces:
changedfiles=$(git diff --name-only $3..$4 | perl -p -e 's/\n/ /')
and changing it to this didn't help:
changedfiles=$(git diff --name-only $3..$4 | perl -p -e 's/\n/\\n/')
Can anyone point me in the right direction? It doesn't need to use perl, it just needs to work
(...being simple would be nice too)

Instead of trying to do ad-hoc escaping for characters that your immediate testing finds problematic, how about using an actual JSON library that handles all of them in a solid way?
Here's an example in bash using inlined python:
python -c '
import json
import sys
print(json.dumps({"data": sys.argv[1]}))
' "$(git diff --name-only $3..$4)"
It prints the JSON object {"data": "your command output here"} with standards-compliant escaping.
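As a minimal sketch of what that escaping looks like, the snippet below serializes a multi-line string. The `diff_output` value is a made-up sample standing in for the output of git diff --name-only; it is not taken from the question.

```python
import json

# Hypothetical sample standing in for the output of `git diff --name-only`.
diff_output = "src/main.c\ndocs/read me.txt"

# json.dumps escapes newlines, double quotes, and backslashes as JSON requires.
payload = json.dumps({"data": diff_output})
print(payload)
```

The printed payload contains the two characters \n where the input had a real newline, so it can be embedded in a curl request body as a single line.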

This is what I think you want to do to get a quoted list of files separated by commas (i.e. for inserting into a JSON string):
git diff --name-only $3..$4 | perl -p -e 's/(.*)/"$1",/;s/\n//;s/""/","/'
This works if your files don't contain double quotes or special characters that need to be JSON escaped.
First, we put the files in quotes followed by a comma, then remove newlines, then change the "" between files to ",". Although, this is kind of a hack. Somewhat better might be:
git diff --name-only $3..$4 | perl -0777 -p -e 's/(.*)\n/"$1",/g;s/,$//'
Here we read in the whole input, newlines and all, do our substitution and remove the final comma.
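If file names may contain double quotes or other characters that need escaping, the same list can be built with a JSON library instead. This is only a sketch: the function name `files_to_json_array` and the sample input are hypothetical.

```python
import json


def files_to_json_array(diff_output: str) -> str:
    """Return a JSON array string with one element per non-empty line."""
    names = [line for line in diff_output.splitlines() if line]
    return json.dumps(names)


# Hypothetical sample; the second name contains double quotes on purpose.
print(files_to_json_array('a.txt\ndir/b "quoted".txt\n'))
```

Unlike the regex approach, this leaves no trailing comma to clean up and escapes embedded quotes.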

Related

How can I replace everything after a string using Bash?

I have a Perl script that uses some local variables as per below:
my $cool_variable="Initial value";
COOLVAR="Initial value for COOLVAR"
I would like to replace the content between the quotes using a bash script.
I got it to work for a non-variable like below:
#!/bin/sh
dummy_var="Replaced value"
sed -i -r "s#^(COOLVAR=).*#\1$dummy_var#" perlscript.pl
But if I replace it with cool_variable or $cool_variable:
sed -i -r "s#^($cool_variable=).*#\1$dummy_var#" perlscript.pl
It does not work.
There are multiple code injection bugs in that snippet. You shouldn't be generating code from the shell or sed.
Say you have
var=COOLVAR
val=coolval
As per How can I process options using Perl in -n or -p mode?, you can use any of
perl -spe's{^$var=\K.*}{"\Q$val\E";};' -- -var="$var" -val="$val" perlscript.pl
var="$var" val="$val" perl -pe's{^$ENV{var}=\K.*}{"\Q$ENV{val}\E";};' perlscript.pl
export var
export val
perl -pe's{^$ENV{var}=\K.*}{"\Q$ENV{val}\E";};' perlscript.pl
to transform
COOLVAR="dummy";
HOTVAR="dummy";
into
COOLVAR="coolval";
HOTVAR="dummy";
The values are passed to the program as arguments or environment variables, rather than being interpolated into its source, to avoid injecting them into the fixer, and the fixer uses Perl's quotemeta (aka \Q..\E) to quote special characters.
Note that $var is assumed to be a valid identifier. No validation checks are performed. This program is absolutely unsafe using untrusted input.
Use -i to modify the file in place.
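For comparison, here is a rough Python sketch of the same idea: the variable name and value are passed in as data, never pasted into the pattern or replacement unescaped. The function name `set_assignment` is hypothetical, and the escaping only approximates Perl's quotemeta for ASCII input.

```python
import re


def set_assignment(source: str, var: str, val: str) -> str:
    """Replace the right-hand side of every line that starts with `var=`."""
    pattern = re.compile(r"^(" + re.escape(var) + r"=).*$", re.MULTILINE)
    # Backslash-escape every non-word character, roughly like Perl's \Q..\E.
    quoted = '"' + re.sub(r"([^A-Za-z0-9_])", r"\\\1", val) + '";'
    # A function replacement keeps `quoted` from being read as a regex template.
    return pattern.sub(lambda m: m.group(1) + quoted, source)


before = 'COOLVAR="dummy";\nHOTVAR="dummy";\n'
print(set_assignment(before, "COOLVAR", "coolval"), end="")
```

As with the Perl version, `var` is assumed to be a plain identifier at the start of a line, and no validation is performed.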

Calling Imagemagick from awk?

I have a CSV of image details I want to loop over in a bash script. awk seems like an obvious choice to loop over the data.
For each row, I want to take the values, and use them to do Imagemagick stuff. The following isn't working (obviously):
awk -F, '{ magick "source.png" "$1.jpg" }' images.csv
GNU AWK excels at processing structured text data. Although it can launch commands using the system function, it is less handy for that than some other languages; Python, for example, has a standard-library module called subprocess which is more feature-rich.
If you wish to use awk for this task anyway, then I suggest preparing output to be fed into a bash command. Say you have file.txt with the following content
file1.jpg,file1.bmp
file2.png,file2.bmp
file3.webp,file3.bmp
and the files listed in the 1st column are in the current working directory, you wish to convert them to the files shown in the 2nd column, and you have access to the convert command; then you might do
awk 'BEGIN{FS=","}{print "convert \"" $1 "\" \"" $2 "\""}' file.txt | bash
which is equivalent to starting bash and doing
convert "file1.jpg" "file1.bmp"
convert "file2.png" "file2.bmp"
convert "file3.webp" "file3.bmp"
Observe that I have used literal " to enclose filenames, so it should work with names containing spaces. Disclaimer: it might fail if a name contains a special character, e.g. ".
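The subprocess route mentioned above could look roughly like the sketch below. It passes each file name as a separate argument without going through a shell, so spaces need no quoting. The function names and the sample rows are hypothetical, and `run_all` is defined but not called here because it needs the convert command to be installed.

```python
import csv
import io
import subprocess


def build_commands(csv_text: str) -> list[list[str]]:
    """Turn each `source,target` row into an argument list for `convert`."""
    rows = csv.reader(io.StringIO(csv_text))
    return [["convert", row[0], row[1]] for row in rows if len(row) >= 2]


def run_all(commands: list[list[str]]) -> None:
    """Run each command without a shell; arguments are passed through as-is."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Hypothetical sample; the second source name contains a space.
sample = "file1.jpg,file1.bmp\nfile 2.png,file2.bmp\n"
for argv in build_commands(sample):
    print(argv)
```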

Grep ignore special characters before applying regular expression

General
I am trying to recursively search through hundreds of JSON files under a specific directory for lines that match a specific regular expression.
grep -rh works great for searching recursively for specific lines. I am having a problem applying a regular expression with the search because all the lines in the JSON files begin with a " and end in either ", or ".
Example: If I want to apply a regular expression to get all the lines that begin with zxc I will not be able to do it because the lines actually begin with "zxc
Code
The following command would work if the lines had no " at the beginning.
/bin/grep -rh -E "^(zxc)" "/etc/json_dir/"
The following command works, but I do not want grep to get hundreds of thousands of lines from all the JSON files and then apply a regular expression afterwards.
/bin/grep -rh -E ".*" "/etc/json_dir/" | /bin/sed -e 's/^"//g' -e 's/,$//g' -e 's/"$//g' | /bin/grep -E "^(zxc)"
Question
Is there a way for grep to ignore the " character at the beginning and " and ", characters at the end of the lines before it applies a regular expression ?
If there's no way, is there a way to do it with some other bash command, perl, python or some other language.
You can use awk, if I understand your question properly:
awk '{gsub(/^"|"$/,"") } # this part removes all the "s from the start and end of line
/^WHAT/ { print } # or any other processing
' **/*.json
Note: the **/* requires the globstar recursive globbing option (shopt -s globstar) in (modern) bash.
See it in action at Ideone.
You can shorten it somewhat to:
awk '/^"?WHAT/' **/* # this executes the default printing action
But awk|sed|grep might not be the right tool to search JSON.
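If a scripting language is acceptable, the strip-then-match step can be sketched in Python as follows. The function name `matching_lines` and the sample lines are hypothetical, and this still treats the files as plain text rather than parsing them as JSON.

```python
import re


def matching_lines(lines, pattern):
    """Yield lines whose content, minus the JSON quoting, matches `pattern`."""
    regex = re.compile(pattern)
    for line in lines:
        # Drop one leading quote and a trailing `"` or `",` before matching.
        bare = re.sub(r'^"|",?$', "", line.rstrip("\n"))
        if regex.search(bare):
            yield bare


# Hypothetical sample lines in the shape described in the question.
sample = ['"zxc one",\n', '"abc two",\n', '"zxc three"\n']
print(list(matching_lines(sample, r"^zxc")))
```

Because it is a generator, it can be fed an open file object and will process one line at a time.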

How can I format a json file into a bash environment variable?

I'm trying to take the contents of a config file (JSON format), strip out extraneous new lines and spaces to be concise and then assign it to an environment variable before starting my application.
This is where I've got so far:
pwr_config=`echo "console.log(JSON.stringify(JSON.parse(require('fs').readFileSync(process.argv[2], 'utf-8'))));" | node - config.json | xargs -0 printf '%q\n'` npm run start
This pipes a short node.js app into the node runtime taking an argument of the file name and it parses and stringifies the JSON file to validate it and remove any unnecessary whitespace. So far so good.
The result of this is then piped to printf, or at least it would be but printf doesn't support input in this way, apparently, so I'm using xargs to pass it in in a way it supports.
I'm using the %q formatter to format the string escaping any characters that would be a problem as part of a command, but when calling printf through xargs, printf claims it doesn't support %q. I think this is perhaps because there is more than one version of printf but I'm not exactly sure how to resolve that.
Any help would be appreciated, even if the solution is completely different from what I've started :) Thanks!
Update
Here's the output I get on MacOS:
$ cat config.json | xargs -0 printf %q
printf: illegal format character q
My JSON file looks like this:
{
"hue_host": "192.168.1.2",
"hue_username": "myUsername",
"port": 12000,
"player_group_config": [
{
"name": "Family Room",
"player_uuid": "ATVUID",
"hue_group": "3",
"on_events": ["media.play", "media.resume"],
"off_events": ["media.stop", "media.pause"]
},
{
"name": "Lounge",
"player_uuid": "STVUID",
"hue_group": "1",
"on_events": ["media.play", "media.resume"],
"off_events": ["media.stop", "media.pause"]
}
]
}
Two ways:
Use xargs to pick up bash's printf builtin instead of the printf(1) executable, probably in /usr/bin/printf (thanks to @GordonDavisson):
pwr_config=`echo "console.log(JSON.stringify(JSON.parse(require('fs').readFileSync(process.argv[2], 'utf-8'))));" | node - config.json | xargs -0 bash -c 'printf "%q\n" "$@"' _` npm run start
Simpler: you don't have to escape the output of a command if you quote it. In the same way that echo "<|>" is OK in bash, this should also work:
pwr_config="$(echo "console.log(JSON.stringify(JSON.parse(require('fs').readFileSync(process.argv[2], 'utf-8'))));" | node - config.json )" npm run start
This uses the newer $(...) form instead of `...`, and so the result of the command is a single word stored as-is into the pwr_config variable.*
Even simpler: if your npm run start script cares about the whitespace in your JSON, it's fundamentally broken :) . Just do:
pwr_config="$(< config.json)" npm run start
The $(<...) returns the contents of config.json. They are all stored as a single word ("") into pwr_config, newlines and all.* If something breaks, either config.json has an error and should be fixed, or the code you're running has an error and needs to be fixed.
* You actually don't need the "" around $(). E.g., foo=$(echo a b c) and foo="$(echo a b c)" have the same effect. However, I like to include the "" to remind myself that I am specifically asking for all the text to be kept together.
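If validating and compacting the JSON is still wanted, a small Python sketch can do the same job as the node one-liner. The function names here are hypothetical; only the pwr_config variable name comes from the question.

```python
import json
import os


def minify_json(text: str) -> str:
    """Parse `text` (raising ValueError if invalid) and re-serialize compactly."""
    return json.dumps(json.loads(text), separators=(",", ":"))


def env_with_config(text: str) -> dict:
    """Copy the current environment and add the minified config to it."""
    env = dict(os.environ)
    env["pwr_config"] = minify_json(text)
    return env


# Hypothetical sample using two keys from the config shown in the question.
sample = '{\n  "port": 12000,\n  "hue_group": "3"\n}\n'
print(minify_json(sample))
```

The dictionary returned by `env_with_config` could then be passed as the env argument of subprocess.run when launching the application, which avoids shell quoting altogether.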

Force mongodb to output strict JSON

I want to consume the raw output of some MongoDB commands in other programs that speak JSON. When I run commands in the mongo shell, they output Extended JSON in "shell mode", with special fields like NumberLong, Date, and Timestamp. I see references in the documentation to "strict mode", but I see no way to turn it on for the shell, or a way to run commands like db.serverStatus() in things that do output strict JSON, like mongodump. How can I force Mongo to output standards-compliant JSON?
There are several other questions on this topic, but I don't find any of their answers particularly satisfactory.
The MongoDB shell speaks Javascript, so the answer is simple: use JSON.stringify(). If your command is db.serverStatus(), then you can simply do this:
JSON.stringify(db.serverStatus())
This won't output the proper "strict mode" representation of each of the fields ({ "floatApprox": <number> } instead of { "$numberLong": "<number>" }), but if what you care about is getting standards-compliant JSON out, this'll do the trick.
I have not found a way to do this in the mongo shell, but as a workaround, mongoexport can run queries and its output uses strict mode and can be piped into other commands that expect JSON input (such as json_pp or jq). For example, suppose you have the following mongo shell command to run a query, and you want to create a pipeline using that data:
db.myItemsCollection.find({creationDate: {$gte: ISODate("2016-09-29")}}).pretty()
Convert that mongo shell command into this shell command, piping for the sake of example to json_pp:
mongoexport --jsonArray -d myDbName -c myItemsCollection -q '{"creationDate": {"$gte": {"$date": "2016-09-29T00:00Z"}}}' | json_pp
You will need to convert the query into strict mode format, and pass the database name and collection name as arguments, as well as quote properly for your shell, as shown here.
In case of findOne
JSON.stringify(db.Bill.findOne({'a': '123'}))
In case of a cursor
db.Bill.find({'a': '123'}).forEach(r=>print(JSON.stringify(r)))
or
print('[') + db.Bill.find().limit(2).forEach(r=>print(JSON.stringify(r) + ',')) + print(']')
will output
[{a:123},{a:234},]
The last form leaves a ',' after the final item, which you need to remove.
To build on the answer from @jbyler, you can strip out the numberLongs using sed after you get your data, provided you're using Linux.
mongoexport --jsonArray -d dbName -c collection -q '{fieldName: {$regex: ".*turkey.*"}}' | sed -r 's/\{ "[$]numberLong" : "([0-9]+)" }/"\1"/g' | json_pp
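Where sed -r is not available, the same unwrapping can be sketched in Python on the parsed JSON. The function name `unwrap_number_long` and the sample document are hypothetical. Note one deliberate difference: the sed command above leaves the number as a quoted string, while this sketch converts it to an integer.

```python
import json


def unwrap_number_long(value):
    """Recursively replace {"$numberLong": "<digits>"} wrappers with ints."""
    if isinstance(value, dict):
        if set(value) == {"$numberLong"}:
            return int(value["$numberLong"])
        return {key: unwrap_number_long(item) for key, item in value.items()}
    if isinstance(value, list):
        return [unwrap_number_long(item) for item in value]
    return value


# Hypothetical sample in the shape mongoexport --jsonArray might emit.
sample = '[{"name": "turkey", "count": {"$numberLong": "42"}}]'
print(json.dumps(unwrap_number_long(json.loads(sample))))
```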
EDIT: This will transform a given document, but will not work on a list of documents. Changed find to findOne.
Adding
.forEach(function(results){results._id=results._id.toString();printjson(results)})
to a findOne() will output valid JSON.
Example:
db
.users
.findOne()
.forEach(function (results) {
results._id = results._id.toString();
printjson(results)
})
Source: https://www.mydbaworld.com/mongodb-shell-output-valid-json/