I'm trying to get the output of an awk command formatted as JSON. I don't have much experience with awk or regex, so I can't get the format right. The script takes raw data from Apache by tailing the access log and parsing out the fields I want with awk.
I've spent about an hour googling and trying different formats. The best setup I currently have is the following code.
awk:
awk -F " " '{
print "{\"ipAddr\":\"" $1 "\",\"reqType\":\"" $4 "\",\"reqItem\":\"" $5 \
"\",\"reqStatus\":\"" $7 "\",\"reqUrl\":\"" $8 "\"}"
}' temp1.log >> hmReq.log;
The output I'm currently getting is:
{"{\"ipAddr\":\"my.ip.Addr\",
\"reqType\":\"GET\",
\"reqItem\":\"/contact\",
\"reqStatus\":\"200\",
\"reqUrl\":\"\"https://www.example.com/\"\"}":""}
I'm trying to get it to be like so:
{
"ipAddr":"123.456.789.0",
"reqType":"GET",
"reqItem":"/contact",
"reqStatus":"200",
"reqUrl":"www.com"
}
Any help is greatly appreciated!
EDIT:
This is the entire script:
inotifywait -m ~/hmLogs/a2access.log -e modify | while read path action file; do
cd ~/hmLogs
tail -n1 a2access.log >> temp1.log;
awk -F " " '{
print "{\"ipAddr\":\"" $1 "\",\"reqType\":\"" $4 "\",\"reqItem\":\"" \
$5 "\",\"reqStatus\":\"" $7 "\",\"reqUrl\":\"" \
$8 "\"}"}' temp1.log >> hmReq.log;
curl -s -w "\n" -d @hmReq.log -X POST http://localhost:8080/logs >> hmRes.log;
rm hmReq.log;
rm temp1.log;
cd -;
done;
Sample of raw data input from apache2/access.log:
161.69.99.11 [01/Aug/2019:03:59:35 +0000] GET /static/js/2.59e222c7.chunk.js HTTP/1.1 200 "https://www.tjbrackett.com/"
Literally my first Linux script so I'm sure I'm doing something wrong.
EDIT 2:
This is the contents of temp1.log before and after the awk.
Before:
76.20.106.208 [01/Aug/2019:17:08:40 +0000] GET /static/media/rock.1bacda84.jpg HTTP/1.1 200 "https://www.tjbrackett.com/about"
After:
76.20.106.208 [01/Aug/2019:17:08:40 +0000] GET /static/media/rock.1bacda84.jpg HTTP/1.1 200 "https://www.tjbrackett.com/about"
EDIT 3:
It seems the output is correct, but for whatever reason the response from the Express app is off. The only remaining issue is an extra set of quotation marks around the URL.
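For context, a minimal sketch of how those extra quotes could be stripped inside awk itself, using gsub on field 8 before printing; the sample line and URL here are made up:

```shell
# Hypothetical sketch: field 8 arrives as "https://..." including the
# literal quotes, so strip them with gsub before emitting the JSON.
echo '76.20.106.208 [01/Aug/2019:17:08:40 +0000] GET /contact HTTP/1.1 200 "https://www.example.com/"' |
awk '{
  gsub(/"/, "", $8)
  print "{\"ipAddr\":\"" $1 "\",\"reqType\":\"" $4 "\",\"reqItem\":\"" $5 \
        "\",\"reqStatus\":\"" $7 "\",\"reqUrl\":\"" $8 "\"}"
}'
```

This prints one JSON object per input line, with reqUrl no longer wrapped in a second set of quotes.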
Related
I am performing a curl test using a shell script, where I need to save the curl server response as JSON along with curl information such as time_connect, http_code, etc. I am trying the following code to write the output to JSON:
HOST_ADDR="http://$mynds/"
do_request(){
echo $(curl $HOST_ADDR --silent -w "\n{\n
\"HttpCode\": %{http_code},\n
\"NumRedirects\":%{num_redirects},\n
\"NumConnects\":%{num_connects},\n
\"SizeDownloadInBytes\":%{size_download},\n
\"SizeHeaderInBytes\":%{size_header},\n
\"SizeRequestInBytes\":%{size_request},\n
\"SizeUploadInBytes\":%{size_upload},\n
\"SpeedUploadBPS\":%{speed_upload},\n
\"SpeedDownloadBPS\":%{speed_download},\n
\"TimeAppConnectSec\":%{time_appconnect},\n
\"TimeConnectSec\":%{time_connect},\n
\"TimeNamelookupSec\":%{time_namelookup},\n
\"TimePreTransferSec\":%{time_pretransfer},\n
\"TimeRedirectSec\":%{time_redirect},\n
\"TimeStartTransferSec\":%{time_starttransfer},\n
\"TimeTotalSec\":%{time_total},\n
\"UrlEffective\":\"%{url_effective}\"
}" -s)
}
do_request
The output I am getting:
{"hostIPAddr":"0.0.0.0","hostname":"vm01","text":"Hello World from vm01"}
and
{ "HttpCode": 200, "NumRedirects":0, "NumConnects":1, "SizeDownloadInBytes":85, "SizeHeaderInBytes":263, "SizeRequestInBytes":99, "SizeUploadInBytes":0, "SpeedUploadBPS":0.000, "SpeedDownloadBPS":14.000, "TimeAppConnectSec":0.000000, "TimeConnectSec":5.553587, "TimeNamelookupSec":5.097553, "TimePreTransferSec":5.553868, "TimeRedirectSec":0.000000, "TimeStartTransferSec":5.827584, "TimeTotalSec":5.827704, "UrlEffective":"http://dns" }
I am getting two JSON outputs: one with the curl information and one from the server. How do I combine these two outputs into a single JSON variable? Please help.
guid_id=$(uuidgen)
file_1="curlJsonRes_$guid_id.json"
file_2="curlMetaRes_$guid_id.json"
do_request(){
echo $(curl $HOST_ADDR --silent --output $file_1 -w "\n{\n
\"HttpCode\": %{http_code},\n
\"NumRedirects\":%{num_redirects},\n
\"NumConnects\":%{num_connects}\n
}")
} > $file_2
do_request
# Merging the two file outputs into one JSON document
jq -s '.[0] * .[1]' "$file_1" "$file_2" > "$RESULTS_FILE"
rm $file_1
rm $file_2
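For anyone checking the merge step in isolation, here is a tiny standalone demo of jq's slurp-merge, with made-up file names and contents:

```shell
# jq -s (slurp) reads all inputs into one array; '.[0] * .[1]'
# deep-merges the two objects into a single document.
printf '%s' '{"HttpCode":200,"TimeTotalSec":5.8}' > /tmp/meta.json
printf '%s' '{"hostname":"vm01","text":"Hello"}'  > /tmp/body.json
jq -c -s '.[0] * .[1]' /tmp/meta.json /tmp/body.json
```

This prints the single combined object {"HttpCode":200,"TimeTotalSec":5.8,"hostname":"vm01","text":"Hello"}.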
My overall task is to continuously collect data from a UNIX system log file, filter it, prepare a JSON payload based on the filtered data, and process the data by sending a POST API call to another server.
I wonder if that can be done with, say, a shell script that monitors the log file with tail and filters with grep to get the specific lines dumped into another file, plus a cron job that runs another script which constructs the JSON and sends the curl request to the external server.
Some details:
In the log file - connector.log I am interested in lines like:
2020-09-16T15:14:37,337 INFO (tomcat-http--131) [tenant-test;-;138.188.247.4;] com.vmware.horizon.adapters.passwordAdapter.PasswordIdpAdapter - Login: user123 - SUCCESS
These lines, I can collect by the below command:
tailf connector.log | grep 'PasswordIdpAdapter - Login\|FAILURE\|SUCCESS'
and probably dump them into a file:
tailf connector.log | grep 'PasswordIdpAdapter - Login\|FAILURE\|SUCCESS' > log_data.txt
I wonder at this point: is it possible to extract only specific fields from a line (not the whole line) of connector.log, so that one line in log_data.txt would look like (fields 1, 4, 6, 7, 8):
1 2020-09-29T07:15:13,881 [tenant1;usrname@tenant1;10.93.231.5;] - username - SUCCESS
From that point, I need to write a script (maybe run by a cron job every minute) or a command to construct the JSON below and send the request. One line, one request.
This is the example of the json:
{
"timestamp": "2020-09-16T15:24:35,377",
"tenant_name": "tenant-test",
"log_type": "SERVICE",
"log_entry": "Login: user123 - SUCCESS"
}
The field values that should be replaced already exist in the log line: timestamp(the 1st field, e.g. 2020-09-16T15:14:37,337), tenant_name(the 1st part of the 4th field, tenant-test) and the log_entry(the last four fields, e.g. Login: user123 - SUCCESS).
When the json is constructed, I'll send it by:
curl --header "Content-Type: application/json" --request POST --data \
$payload http://myservert:8080/api/requests
What is not clear to me is how this script should get the data line by line from log_data.txt, populate some of the fields to create the JSON, and send it to the server.
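A rough sketch of the field extraction with awk, assuming the log line layout shown above (the tenant sits inside the bracketed fourth field, and the log entry is the last four fields); this is only an illustration, not the final script:

```shell
# Rough sketch (not the final script): split the bracketed field on
# "[" and ";" to get the tenant, and take the last four fields as the
# log entry.
line='2020-09-16T15:14:37,337 INFO (tomcat-http--131) [tenant-test;-;138.188.247.4;] com.vmware.horizon.adapters.passwordAdapter.PasswordIdpAdapter - Login: user123 - SUCCESS'
echo "$line" | awk '{
  split($4, t, /[\[;]/)   # t[2] becomes the tenant name
  printf "{\"timestamp\":\"%s\",\"tenant_name\":\"%s\",\"log_entry\":\"%s %s %s %s\"}\n",
         $1, t[2], $(NF-3), $(NF-2), $(NF-1), $NF
}'
```

For the sample line this emits the timestamp, tenant-test, and "Login: user123 - SUCCESS" in one JSON object.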
Thanks for your answers in advance,
Petko
Thanks @shellter for the awk idea. So bash, awk, grep, cat, cut and curl did the job.
I've created a cronjob to execute the bash script on 5 min interval.
The script gets the last 5 minutes of log data, dumps it to another file, reads the filtered data, prepares the payload, and then executes the API call. Maybe it is stupid, but it works.
#!/bin/bash
MONITORED_LOG="/var/logs/test.log"
FILTERED_DATA="/tmp/login/login_data.txt"
REST_HOST="https://rest-host/topics/logs-"
# dump the last 5 mins of log data(date format: 2020-09-28T10:52:28,334)
# to a file, filter for keywords FAILURE\|SUCCESS and NOT having 'lookup|SA'
# an example of a data record taken: 1 2020-09-29T07:15:13,881 [tenant1;usrname@tenant1;10.93.231.5;] - username - SUCCESS
awk -v d1="$(date --date="-5 min" "+%Y-%m-%dT%H:%M:%S")" -v d2="$(date "+%Y-%m-%dT%H:%M:%S")" '$0 > d1 && $0 < d2' $MONITORED_LOG | grep 'FAILURE\|SUCCESS' | grep -v 'lookup\|SA-' | awk '{ print $2, $3, $5, $7}' | uniq -c > $FILTERED_DATA
## loop through all the filtered records and send an API call
cat $FILTERED_DATA | while read LINE; do
## preparing the variables
timestamp=$(echo $LINE | cut -f2 -d' ')
username=$(echo $LINE | cut -f5 -d' ')
log_entry=$(echo $LINE | cut -f7 -d' ')
# get the tenant name: take field 3, split on ; and remove the leading [
tenant_name=$(echo $LINE | cut -f3 -d' ' | cut -f1 -d';')
tenant_name="${tenant_name:1}"
# preparing the payload
payload=$'{"records":[{"value":{"timestamp":"'
payload+=$timestamp
payload+=$'","tenant_name":"'
payload+=$tenant_name
payload+=$'","log_entry":"'
payload+=$log_entry
payload+=$'"}}]}'
echo 'payload: ' $payload
# send the api call to the server with dynamic construction of tenant name
curl -i -k -u 'api_user:3494ssdfs3' --request POST --header "Content-type:application/json" --data "$payload" "$REST_HOST$tenant_name"
done
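A possibly safer variant of the payload assembly above would let jq handle the quoting and escaping instead of concatenating strings by hand (assuming jq is available; the variable values here are hard-coded samples):

```shell
# Hedged alternative to the manual concatenation: jq -n builds the JSON
# and escapes the values itself (sample values hard-coded here).
timestamp='2020-09-16T15:14:37,337'
tenant_name='tenant-test'
log_entry='Login: user123 - SUCCESS'
payload=$(jq -c -n \
  --arg ts "$timestamp" --arg tn "$tenant_name" --arg le "$log_entry" \
  '{records:[{value:{timestamp:$ts, tenant_name:$tn, log_entry:$le}}]}')
echo "$payload"
```

The advantage is that a quote or backslash inside a log entry can no longer break the JSON.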
I run the curl command $(curl -i -o - --silent -X GET --cert "${CERT}" --key "${KEY}" "$some_url") and save the response in the variable response. ${response} is as shown below
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Length: 34
Connection: keep-alive
Keep-Alive: timeout=5
X-XSS-Protection: 1;
{"status":"running","details":"0"}
I want to parse the JSON {"status":"running","details":"0"} and assign the status and details values to two different variables so I can print both. Also, if the status equals error, the script should exit. I am doing the following to achieve this -
status1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.status')
details1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.details')
echo "Status: ${status1}"
echo "Details: ${details1}"
if [[ $status1 == 'error' ]]; then
exit 1
fi
Instead of parsing the JSON twice, I want to do it only once. Hence I want to combine the following lines but still assign the status and details to two separate variables -
status1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.status')
details1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.details')
First, stop using the -i argument to curl. That takes away the need for awk (or any other pruning of the header after-the-fact).
Second:
{
IFS= read -r -d '' status1
IFS= read -r -d '' details1
} < <(jq -r '.status + "\u0000" + .details + "\u0000"' <<<"$response")
The advantage of using a NUL as a delimiter is that it's the sole character that can't be present in the value of a C-style string (which is how shell variables' values are stored).
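A quick self-contained check of that NUL-delimited read, with a hard-coded response standing in for the curl output:

```shell
# Hard-coded response in place of the curl output; bash and jq assumed.
response='{"status":"running","details":"0"}'
{
  IFS= read -r -d '' status1
  IFS= read -r -d '' details1
} < <(jq -r '.status + "\u0000" + .details + "\u0000"' <<<"$response")
echo "status=$status1 details=$details1"
```

Both variables are populated in one pass over the JSON.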
You can use a construction like:
read status1 details1 < <(jq -r '.status + " " + .details' <<< "${response}")
You use read to assign the different inputs to two variables (or an array, if you want), and use jq to print the data you need separated by whitespace.
As Benjamin already suggested, only retrieving the JSON is a better way to go. Poshi's solution is solid.
However, if you're looking for the most compact way to do this, there's no need to save the response in a variable if the only thing you're going to do with it is extract other variables from it on a one-time basis. Just pipe curl directly into:
curl "whatever" | jq -r '[.status, .details] | @tsv'
or
curl "whatever" | jq -r '[.status, .details] |join("\t")'
and you'll get your values fielded for you.
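If it helps, the @tsv variant pairs naturally with read to land the two fields straight into variables (hard-coded response in place of the curl call):

```shell
# Hard-coded response for illustration; jq emits "running<TAB>0" and
# read splits it on the tab into two variables.
response='{"status":"running","details":"0"}'
IFS=$'\t' read -r status1 details1 < <(jq -r '[.status, .details] | @tsv' <<<"$response")
echo "$status1 $details1"
```

Tab is a safer separator than a plain space when a value might itself contain spaces.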
I was wondering how to parse the CURL JSON output from the server into variables.
Currently, I have -
curl -X POST -H "Content: agent-type: application/x-www-form-urlencoded" https://www.toontownrewritten.com/api/login?format=json -d username="$USERNAME" -d password="$PASSWORD" | python -m json.tool
But it only outputs the JSON from the server and then have it parsed, like so:
{
"eta": "0",
"position": "0",
"queueToken": "6bee9e85-343f-41c7-a4d3-156f901da615",
"success": "delayed"
}
But how do I put, for example, the success value returned from the server into a variable $SUCCESS with the value delayed, and queueToken into a variable $queueToken with the value 6bee9e85-343f-41c7-a4d3-156f901da615?
Then when I use-
echo "$SUCCESS"
it shows this as the output -
delayed
And when I use
echo "$queueToken"
and the output as
6bee9e85-343f-41c7-a4d3-156f901da615
Thanks!
Find and install jq (https://stedolan.github.io/jq/). jq is a JSON parser. JSON is not reliably parsed by line-oriented tools like sed because, like XML, JSON is not a line-oriented data format.
In terms of your question:
source <(
curl -X POST -H "$content_type" "$url" -d username="$USERNAME" -d password="$PASSWORD" |
jq -r '. as $h | keys | map(. + "=\"" + $h[.] + "\"") | .[]'
)
The jq syntax is a bit weird; I'm still working on it. It's basically a series of filters, each pipe taking the previous input and transforming it. In this case, the end result is a set of lines that look like variable="value".
This answer uses bash's "process substitution" to take the results of the jq command, treat it like a file, and source it into the current shell. The variables will then be available to use.
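As a rough illustration of what that filter emits before sourcing, with a made-up response in place of the real curl call (note this is only safe when the values contain no shell metacharacters):

```shell
# Made-up response; the filter prints one key="value" line per key
# (keys come out sorted), which "source" then evaluates in the shell.
json='{"success":"delayed","queueToken":"6bee9e85-343f-41c7-a4d3-156f901da615"}'
jq -r '. as $h | keys | map(. + "=\"" + $h[.] + "\"") | .[]' <<<"$json"
source <(jq -r '. as $h | keys | map(. + "=\"" + $h[.] + "\"") | .[]' <<<"$json")
echo "$success"
```

After the source, $success and $queueToken are ordinary shell variables.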
Here's an example of Extract a JSON value from a BASH script
#!/bin/bash
function jsonval {
temp=`echo $json | sed 's/\\\\\//\//g' | sed 's/[{}]//g' | awk -v k="text" '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}' | sed 's/\"\:\"/\|/g' | sed 's/[\,]/ /g' | sed 's/\"//g' | grep -w $prop`
echo ${temp##*|}
}
json=`curl -s -X GET http://twitter.com/users/show/$1.json`
prop='profile_image_url'
picurl=`jsonval`
`curl -s -X GET $picurl -o $1.png`
A bash script which demonstrates parsing a JSON string to extract a
property value. The script contains a jsonval function which operates
on two variables, json and prop. When the script is passed the name of
a twitter user it attempts to download the user's profile picture.
You could use a Perl module on the command line.
First, ensure it is installed; under Debian-based systems, you could:
sudo apt-get install libjson-xs-perl
For other OSes, you could install Perl modules via CPAN (the Comprehensive Perl Archive Network):
cpan App::cpanminus
cpan JSON::XS
Note: You may have to run this with superuser privileges.
then:
curlopts=(-X POST -H
"Content: agent-type: application/x-www-form-urlencoded"
-d username="$USERNAME" -d password="$PASSWORD")
curlurl=https://www.toontownrewritten.com/api/login?format=json
. <(
perl -MJSON::XS -e '
$/=undef;my $a=JSON::XS::decode_json <> ;
printf "declare -A Json=\047(%s)\047\n", join " ",map {
"[".$_."]=\"".$a->{$_}."\""
} qw|queueToken success eta position|;
' < <(
curl "${curlopts[@]}" $curlurl
)
)
The qw|...| line lets you specify which variables you want extracted. It could be replaced by keys $a, but that might need debugging, since some characters are forbidden in associative array key names.
echo ${Json[queueToken]}
6bee9e85-343f-41c7-a4d3-156f901da615
echo ${Json[eta]}
0
I am trying to catch the log. Serverlist.txt contains server details like root 10.0.0.1 22 TestServer. When I run the script it only reads the first line and exits; it does not work for the further lines. Below is my script.
newdate1=`date -d "yesterday" '+%b %d' | sed 's/0/ /g'`
newdate2=`date -d "yesterday" '+%d/%b/%Y'`
newdate3=`date -d "yesterday" '+%y%m%d'`
DL=/opt/$newdate3
Serverlist=/opt/Serverlist.txt
serverlog()
{
mkdir -p $DL/$NAME
ssh -p$PORT $USER@$IP "cat /var/log/messages*|grep '$newdate1'"|cat > $DL/$NAME/messages.log
}
while read USER IP PORT NAME
do
serverlog
sleep 1;
done <<<"$Serverlist"
Use < instead of <<<. <<< is a here-string: the right side is evaluated, and then the result is read by the loop as standard input:
$ FILE="my_file"
$ cat $FILE
First line
Last line
$ while read LINE; do echo $LINE; done <$FILE
First line
Last line
$ set -x
$ while read LINE; do echo $LINE; done <<<$FILE
+ read LINE
+ echo my_file
my_file
+ read LINE
$ while read LINE; do echo $LINE; done <<<$(ls /home)
++ ls /home
+ read LINE
+ echo antxon install lost+found
antxon install lost+found
+ read LINE
$
I got the answer from another link.
You can use the "-n" option of ssh: it stops ssh from reading the loop's standard input, so the loop is not cut short and you get the desired result.
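The effect is easy to reproduce without ssh: any command in the loop body that reads standard input will drain the rest of the input file, which is exactly what ssh does without -n. A toy demonstration with cat standing in for ssh:

```shell
# cat stands in for ssh here: it reads (and discards) the rest of the
# server list from the loop's shared stdin, so only one iteration runs.
printf 'line1\nline2\nline3\n' > /tmp/servers.txt
count=0
while read -r name; do
  cat > /dev/null    # like ssh without -n: swallows the remaining lines
  count=$((count + 1))
done < /tmp/servers.txt
echo "$count"
```

This prints 1; redirecting the inner command's stdin from /dev/null (the analogue of ssh -n) lets all three iterations run.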