HTML to text conversion in shell script

I wrote a shell script to convert HTML source to plain text using lynx.
Here it is:
#!/bin/sh
if [ -f = "/usr/bin/lynx" ]
then
if [ -f = "$1" ]
then
lynx -dump $1 > $2
else
echo "File $1 does not exist!"
fi
else
echo "Lynx is not installed!"
fi
Now, although lynx exists in the right directory and I pass correct arguments, I get either the "Lynx is not installed!" message or (if I comment out the first test) "File $1 does not exist!". I'm not too good at sh, so could someone tell me what is wrong with the script?

I think the first if is wrong and should be replaced with
if [ -f /usr/bin/lynx ]

Try removing the "-f =" and keeping it just "-f"

It's like that now:
#!/bin/sh
if [ -f /usr/bin/lynx ]
then
if [ -f $1 ]
then
lynx -dump $1 > $2
else
echo "File $1 does not exist!"
fi
else
echo "Lynx is not installed!"
fi
And the problem with tests is gone, but now I get this error:
line 7: $2: ambiguous redirect
although
lynx -dump site.html > site.txt works fine if run from console

LYNX=`which lynx`
if [ -x "$LYNX" ]; then
...
else
...
fi
What if it's not installed in /usr/bin/? What if for some reason it's not executable?
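As for the "ambiguous redirect": that error usually means $2 expanded to nothing (or to several words) when the script ran, i.e. the output file name was missing or contained spaces. Quoting the parameters avoids it. A minimal sketch of the whole script with quoting, using which so lynx can live anywhere on the PATH (messages kept from the original):
#!/bin/sh
LYNX=`which lynx`
if [ -x "$LYNX" ]; then
    if [ -f "$1" ] && [ -n "$2" ]; then
        "$LYNX" -dump "$1" > "$2"
    else
        echo "File $1 does not exist (or no output file was given)!"
    fi
else
    echo "Lynx is not installed!"
fi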

variable expansion in subshell in makefile fails

I have a makefile function inside a makefile (myfunction.mk):
.ONESHELL:
define call_script
set +x
mkdir -p $$(dirname $(2))
if [ ! -f $(2) ]; then
echo "" > $(2)
fi
REDIRECT='| tee -a'
echo '>> $(1)'
($(1) ???????? $(2))
RET_CODE=$$?
echo "exit_code is: $$RET_CODE"
if [ ! $$RET_CODE = 0 ]; then
echo "$(3) terminated with error $$RET_CODE"
exit $$RET_CODE
else
if [ ! -z "$(strip $(3))" ]; then
echo "$(3) done"
fi
fi
endef
This function calls a script and appends its output to a log file (which is created, along with its folder, if it doesn't exist). The output is appended only if the makefile variable whose name is given as the 4th argument ($(4)) is equal to 'yes'.
You call it like this:
include myfunction.mk
OUTPUT_ENABLED ?= yes
target:
$(call call_script, echo "test", reports/mylog.log, "doing test", OUTPUT_ENABLED)
This works for the most part:
If I replace '????????' with '| tee -a', it works.
If I replace '????????' with $(REDIRECT), it fails.
If I replace '????????' with $$REDIRECT, it fails.
Why?
Note: the shell is /bin/sh, a symbolic link to dash.
Note: of course I want to add an ifeq that checks $(4) and replaces | tee -a with &>>.
I'll assume that you use call in a recipe, not flat in your Makefile. There are a few problems with your shell script. First, if you try the following on the command line:
mkdir -p reports
REDIRECT='| tee -a'
echo '>> echo "test"'
(echo "test" $REDIRECT reports/mylog.log)
you'll see that echo considers:
"test" $REDIRECT reports/mylog.log
as its arguments. They are expanded and echoed, which prints:
test | tee -a reports/mylog.log
on the standard output, not the effect you expected, I guess. You could, for instance, use eval. On the command line:
eval "echo "test" $REDIRECT reports/mylog.log"
Which, in your Makefile, would become:
eval "$(1) $$REDIRECT $(2)"
Next you should not quote the third parameter of call because the quotes will be passed unmodified and your script will be expanded by make as:
echo " "doing test" terminated with error $RET_CODE"
Again probably not what you want.
Third, you should avoid useless spaces in the parameters of call because they are preserved too (as you can see above between the first 2 double quotes):
.PHONY: foo
foo:
$(call call_script,echo "test",reports/mylog.log,doing test,OUTPUT_ENABLED)
And for your last desired feature, it would be slightly easier to pass the value of OUTPUT_ENABLED to call instead of its name, but let's go this way:
$ cat myfunction.mk
define call_script
set +x
mkdir -p $$(dirname $(2))
if [ ! -f $(2) ]; then
echo "" > $(2)
fi
if [ "$($(4))" = "yes" ]; then
REDIRECT='| tee -a'
else
REDIRECT='&>>'
fi
echo '>> $(1)'
eval "$(1) $$REDIRECT $(2)"
RET_CODE=$$?
echo "exit_code is: $$RET_CODE"
if [ ! $$RET_CODE = 0 ]; then
echo "$(3) terminated with error $$RET_CODE"
exit $$RET_CODE
else
if [ ! -z "$(strip $(3))" ]; then
echo "$(3) done"
fi
fi
endef
$ cat Makefile
.ONESHELL:
include myfunction.mk
OUTPUT_ENABLED ?= yes
target:
$(call call_script,echo "test",reports/mylog.log,doing test,OUTPUT_ENABLED)
Note that I moved the .ONESHELL: in the main Makefile because it is probably better to not hide it inside an included file. Up to you.
The most problematic issue here is that if you pipe your commands, the exit code is the exit code of the last command in the pipe, e.g. false | tee foo.log will exit with 0 because tee will most probably succeed. Note also that a pipe only redirects stdout, so your log will not contain any stderr messages unless they are explicitly redirected.
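A quick illustration in bash (PIPESTATUS is the bash-only array mentioned below):
false | tee foo.log
# $? is tee's status; PIPESTATUS (bash only) holds the status of every command in the last pipeline
echo "exit=$? pipestatus=${PIPESTATUS[*]}"    # prints: exit=0 pipestatus=1 0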
Considering that piping commands influences the exit code, and given the lack of portability of $PIPESTATUS (most notably it is not supported in dash), I would try to avoid piping commands and use a temporary file to gather the output, i.e.:
$ cat Makefile
# $(1) - script to execute
# $(2) - log file
# $(3) - description
define call_script
echo '>> $(1)'
$(if $(OUTPUT_ENABLED), \
$(1) > $@.log 2>&1; RET_CODE=$$?; mkdir -p $(dir $(2)); cat $@.log >> $(2); cat $@.log; rm -f $@.log, \
$(1); RET_CODE=$$? \
); \
echo "EXIT_CODE is: $${RET_CODE}"; \
if [ $${RET_CODE} -ne 0 ]; then $(if $(3),echo "$(3) terminated with error $${RET_CODE}";) exit $${RET_CODE}; fi; \
$(if $(3), echo "$(3) done.")
endef
good:
$(call call_script,echo "test",reports/mylog.log,doing test)
bad:
$(call call_script,mkdir /root/foo,reports/mylog.log,intentional fail)
ugly:
$(call call_script,bad_command,reports/mylog.log)
Regular call will not create the logs and will stop on errors:
$ make good bad ugly
echo '>> echo "test"'
>> echo "test"
echo "test"; RET_CODE=$? ; echo "EXIT_CODE is: ${RET_CODE}"; if [ ${RET_CODE} -ne 0 ]; then echo "doing test terminated with error ${RET_CODE}"; exit ${RET_CODE}; fi; echo "doing test done."
test
EXIT_CODE is: 0
doing test done.
echo '>> mkdir /root/foo'
>> mkdir /root/foo
mkdir /root/foo; RET_CODE=$? ; echo "EXIT_CODE is: ${RET_CODE}"; if [ ${RET_CODE} -ne 0 ]; then echo "intentional fail terminated with error ${RET_CODE}"; exit ${RET_CODE}; fi; echo "intentional fail done."
mkdir: cannot create directory ‘/root/foo’: Permission denied
EXIT_CODE is: 1
intentional fail terminated with error 1
make: *** [Makefile:19: bad] Error 1
Note that ugly was not built due to failure on bad. Now the same with the log:
$ make good bad ugly OUTPUT_ENABLED=1
echo '>> echo "test"'
>> echo "test"
echo "test" > good.log 2>&1; RET_CODE=$?; mkdir -p reports/; cat good.log >> reports/mylog.log; cat good.log; rm -f good.log; echo "EXIT_CODE is: ${RET_CODE}"; if [ ${RET_CODE} -ne 0 ]; then echo "doing test terminated with error ${RET_CODE}"; exit ${RET_CODE}; fi; echo "doing test done."
test
EXIT_CODE is: 0
doing test done.
echo '>> mkdir /root/foo'
>> mkdir /root/foo
mkdir /root/foo > bad.log 2>&1; RET_CODE=$?; mkdir -p reports/; cat bad.log >> reports/mylog.log; cat bad.log; rm -f bad.log; echo "EXIT_CODE is: ${RET_CODE}"; if [ ${RET_CODE} -ne 0 ]; then echo "intentional fail terminated with error ${RET_CODE}"; exit ${RET_CODE}; fi; echo "intentional fail done."
mkdir: cannot create directory ‘/root/foo’: Permission denied
EXIT_CODE is: 1
intentional fail terminated with error 1
make: *** [Makefile:19: bad] Error 1
$ cat reports/mylog.log
test
mkdir: cannot create directory ‘/root/foo’: Permission denied
Note that this time ugly was also not run. But if run later, it will correctly append to the log:
$ make ugly OUTPUT_ENABLED=1
echo '>> bad_command'
>> bad_command
bad_command > ugly.log 2>&1; RET_CODE=$?; mkdir -p reports/; cat ugly.log >> reports/mylog.log; cat ugly.log; rm -f ugly.log; echo "EXIT_CODE is: ${RET_CODE}"; if [ ${RET_CODE} -ne 0 ]; then exit ${RET_CODE}; fi;
/bin/sh: 1: bad_command: not found
EXIT_CODE is: 127
make: *** [Makefile:22: ugly] Error 127
$ cat reports/mylog.log
test
mkdir: cannot create directory ‘/root/foo’: Permission denied
/bin/sh: 1: bad_command: not found
Personally I am not a fan of implementing logging this way. It is complicated, it only logs the output of commands (not make's own output), and only of those commands that are explicitly wrapped in the macro. I'd rather keep the Makefile clean and simple and just run make 2>&1 | tee log to have the whole output logged.

Getting unexpected end of file with control-M characters

I have a simple script which gives me an unexpected end of file. Everything seems good to me:
#!/bin/bash
me="$(basename "$(test -L "$0" && readlink "$0" || echo "$0")")"
if [ $# -ge 5 ]; then
echo "OK"
else
echo "$me <arg1> <arg2> <arg3> <arg4> <arg5>"
fi
After checking with the OP in the comments, it turned out the file may contain control-M (carriage return) characters. Use tr -d '\r' < Input_file > temp_file && mv temp_file Input_file, putting your script's actual name in place of Input_file, and you should be good then.
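To confirm that this is the problem first, cat -A shows carriage returns as ^M at the end of each line. A rough sketch, assuming the script is saved as myscript.sh:
cat -A myscript.sh | head          # DOS line endings show up as ^M$ at the end of each line
tr -d '\r' < myscript.sh > myscript.tmp && mv myscript.tmp myscript.sh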

How delete last comma in json file using Bash?

I wrote a script that takes all the user data of an AWS EC2 instance and echoes it to local.json. All this happens when I install my node.js modules.
I don't know how to delete the last comma in the JSON file. Here is the bash script:
#!/bin/bash
export DATA_DIR=/data
export PATH=$PATH:/usr/local/bin
#install package from git repository
sudo -- sh -c "export PATH=$PATH:/usr/local/bin; export DATA_DIR=/data; npm install git+https://reader:secret@bitbucket.org/somebranch/$1.git#$2"
#update config files from instance user-data
InstanceConfig=`cat /instance-config`
echo '{' >> node_modules/$1/config/local.json
while read line
do
if [ ! -z "$line" -a "$line" != " " ]; then
Key=`echo $line | cut -f1 -d=`
Value=`echo $line | cut -f2 -d=`
if [ "$Key" = "Env" ]; then
Env="$Value"
fi
printf '"%s" : "%s",\n' "$Key" "$Value" >> node_modules/*/config/local.json
fi
done <<< "$InstanceConfig"
sed -i '$ s/.$//' node_modules/$1/config/local.json
echo '}' >> node_modules/$1/config/local.json
I run it this way: ./script
I get JSON (output), but with a comma at the end of every line. Here is the local.json that I get:
{
"Env" : "dev",
"EngineUrl" : "engine.url.net",
}
All I am trying to do is delete the comma (",") on the last line of the JSON file.
I tried many ways that I found on the internet. I know it should go after the last "fi" (end of the loop). And I know that it should be something like this line:
sed -i "s?${Key} : ${Value},?${Key} : ${Value}?g" node_modules/$1/config/local.json
Or this:
sed '$ s/,$//' node_modules/$1/config/local.json
But they do not work for me.
Can someone who knows Bash scripting well help me with that?
Thanks!
If you know that it is the last comma that needs to be replaced, a reasonably robust way is to use GNU sed in "slurp" mode like this:
sed -zr 's/,([^,]*$)/\1/' local.json
Output:
{
"Env" : "dev",
"EngineUrl" : "engine.url.net"
}
If you'd just post some sample input/output it'd remove the guess-work but IF this is your input file:
$ cat file
Env=dev
EngineUrl=engine.url.net
Then IF you're trying to do what I think you are then all you need is:
$ cat tst.awk
BEGIN { FS="="; sep="{\n" }
{
printf "%s \"%s\" : \"%s\"", sep, $1, $2
sep = ",\n"
}
END { print "\n}" }
which you'd execute as:
$ awk -f tst.awk file
{
"Env" : "dev",
"EngineUrl" : "engine.url.net"
}
Or you can execute the awk script inline within a shell script if you prefer:
awk '
BEGIN { FS="="; sep="{\n" }
{
printf "%s \"%s\" : \"%s\"", sep, $1, $2
sep = ",\n"
}
END { print "\n}" }
' file
{
"Env" : "dev",
"EngineUrl" : "engine.url.net"
}
The above is far more robust, portable, efficient and better in every other way than the shell script you posted because it's using the right tool for the job. A UNIX shell is an environment from which to call tools with a language to sequence those calls. It is NOT a language to process text which is why it's so difficult to get it right. The UNIX tool for general text processing is awk so when you need to process text in UNIX, you just have shell call awk, that's all.
Here is a jq version, if it's available:
jq --raw-input 'split("=") | {(.[0]):.[1]}' /instance-config | jq --slurp 'add'
There might be a way to do it with one jq pass, but I couldn't see it.
You can remove all trailing commas from invalid JSON with:
sed -i.bak ':begin;$!N;s/,\n}/\n}/g;tbegin;P;D' FILE
sed -i.bak = creates a backup of the original file, then applies the changes to the file
':begin;$!N;s/,\n}/\n}/g;tbegin;P;D' = finds anything ending with , followed by a newline and }, and removes the , on the previous line
FILE = the file you want to make the change to
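For example, applied to the local.json shown in the question (GNU sed, with -i.bak left off so the result is just printed):
$ sed ':begin;$!N;s/,\n}/\n}/g;tbegin;P;D' local.json
{
"Env" : "dev",
"EngineUrl" : "engine.url.net"
}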
If you're willing to use it, xidel is rather forgiving for trailing commas:
xidel -s local.json -e '$json'
{
"Env": "dev",
"EngineUrl": "engine.url.net"
}
xidel - -se '$json' <<< '{"Env":"dev","EngineUrl":"engine.url.net",}'
#or
xidel - -se 'parse-json($raw,{"liberal":true()})' <<< '{"Env":"dev","EngineUrl":"engine.url.net",}'
{
"Env": "dev",
"EngineUrl": "engine.url.net"
}
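If you would rather not post-process the file at all, the same separator trick used in the awk answer works in plain shell too: print the comma before every entry except the first, so no trailing comma is ever written. A rough sketch, assuming the same Key=Value lines in /instance-config and the paths from the original script:
#!/bin/bash
out="node_modules/$1/config/local.json"
printf '{\n' > "$out"
first=1
while IFS='=' read -r Key Value; do
    # skip empty lines
    [ -z "$Key" ] && continue
    # the separator goes before every entry except the first
    if [ "$first" -eq 1 ]; then first=0; else printf ',\n' >> "$out"; fi
    printf '"%s" : "%s"' "$Key" "$Value" >> "$out"
done < /instance-config
printf '\n}\n' >> "$out"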

conditional statement in bash with mysql query

I'm trying to write a bash script that will do a MySQL query and, if the number of results is 1, do something. I can't get it to work though.
#!/bin/sh
file=`mysql -uroot -proot -e "select count(*) from MyTable.files where strFilename='file.txt'"`
if [[ $file == "count(*) 1" ]];
then
echo $file
else
echo $file
echo "no"
fi
I verified the query works. I keep getting this returned:
count(*) 1
no
I'm not sure why, but I think it might have something to do with what type of variable $file is. Any ideas?
To prevent exposing your database credentials in the script you can store them in the .my.cnf file located in your home directory.
This technique will allow your script to work on any server without modification.
Path: /home/youruser/.my.cnf
Content:
[client]
user="root"
password="root"
host="localhost"
[mysql]
database="MyTable"
So, Renato's code could be rewritten as follows:
#!/bin/sh
file=`mysql -e "select count(*) as count from files where strFilename='file.txt'" | cut -d \t -f 2`
if [ $file == "1" ];
then
echo $file
else
echo $file
echo "no"
fi
I rewrote your script, it works now:
#!/bin/sh
file=`mysql -uroot -proot -e "select count(*) as count from MyTable.files where strFilename='file.txt'" | cut -d \t -f 2`
if [ $file == "1" ];
then
echo $file
else
echo $file
echo "no"
fi
I'm giving a better name to the count field, using 'cut' to split the mysql output into two fields and putting the content of the second field into $file. Then you can test $file for "1". Hope it helps you.
I'm guessing that isn't actually count(*) 1 but instead count(*)\n1 or something. echo $file will convert all the characters in IFS to a space, but == will distinguish between those whitespace characters. If this is the case, echo "$file" will give you a different result. (Notice the quotes.)
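Another option is to ask the mysql client not to print the header at all, so the variable holds just the number; the -N (--skip-column-names) and -s (silent) flags do that. A minimal sketch along the lines of the original script:
#!/bin/sh
count=$(mysql -uroot -proot -N -s -e "select count(*) from MyTable.files where strFilename='file.txt'")
if [ "$count" = "1" ]; then
    echo "$count"
else
    echo "$count"
    echo "no"
fi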

MySQL error in Bash

I'm trying to load some tables into MySQL from a bash script, so I have the following code:
DOMY="$MYSQL --user=xxxxx --password=xxxxx --database=$DBNAME"
for filename in $(cat $HPATH/toload.tables)
do
$DOMY < $filename 2>/dev/null
if [ $? -ne 0 ]
then
echo "#003|Error loading $filename"
exit 1
fi
done
If I look at $? (echo $?) it gives me 0 (zero), but the exit 1 is executed.
What am I doing wrong?
You can't see what's wrong because you have /dev/null-ed standard error.
Also, you are using an unnecessarily complex and somewhat error-prone code pattern where you examine $? later. It would be better to just write something like:
for i in "$@"; do
if mysql -u root < "$i"; then
echo ok # do "ok" processing here
else
echo not so ok # error path
fi
done
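Applied to the original toload.tables loop, that pattern might look roughly like this (a sketch using the same variable names as in the question; stderr is left alone so mysql's error message stays visible):
#!/bin/sh
DOMY="$MYSQL --user=xxxxx --password=xxxxx --database=$DBNAME"
while IFS= read -r filename; do
    if $DOMY < "$filename"; then
        : # "ok" processing goes here
    else
        echo "#003|Error loading $filename" >&2
        exit 1
    fi
done < "$HPATH/toload.tables"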