JMESPath filter with >1 match ANDING - json

I saw the post about ORing; this one should cover ANDing, which I struggled with.
Given this while loop:
while read -r resourceID resourceName; do
pMsg "Processing: $resourceID with $resourceName"
aws emr describe-cluster --cluster-id="$resourceID" --output table > "${resourceName}.md"
done <<< "$(aws emr list-clusters --active --query='Clusters[].Id' \
--output text | sortExpression)"
I need to feed my loop with the ID AND Name of the clusters. One is easy; two is eluding me. Any help is appreciated.

If your goal is to end up with output from list-clusters looking like this:
1 ABCD
2 EFGH
so that you can feed it to describe-cluster, then you should create a multiselect list.
Something like:
Clusters[].[Id, Name]
This is actually described in the user guide's section on the text output format, where they show that:
'Reservations[*].Instances[*].[Placement.AvailabilityZone, State.Name,
InstanceId]' --output text
Gives
us-west-2a running i-4b41a37c
us-west-2a stopped i-a071c394
us-west-2b stopped i-97a217a0
us-west-2a running i-3045b007
us-west-2a running i-6fc67758
Source: https://docs.aws.amazon.com/cli/latest/userguide/cli-usage-output-format.html#text-output
So you should end up with
while read -r resourceID resourceName; do
pMsg "Processing: $resourceID with $resourceName"
aws emr describe-cluster \
--cluster-id="$resourceID" \
--output table > ${resourceName}.md"
done <<< "$(aws emr list-clusters \
--active \
--query='Clusters[].[Id, Name]' \
--output text | sortExpression \
)"

Can you separate distinct JSON attributes into two files using jq?

I am following this tutorial from Vault about creating your own certificate authority. I'd like to split the response (switching the output to an API call using cURL to see the response) into two distinct files: one file holding the certificate and issuing_ca attributes, the other containing the private_key. The tutorial uses jq to parse the JSON objects, but my unfamiliarity with jq isn't helping here, and most searches return info on how to merge JSON using jq.
I've tried running something like
vault write -format=json pki_int/issue/example-dot-com \
common_name="test.example.com" \
ttl="24h" \
format=pem \
jq -r '.data.certificate, .data.issuing_ca' > test.cert.pem \
jq -r '.data.private_key' > test.key.pem
or
vault write -format=json pki_int/issue/example-dot-com \
common_name="test.example.com" \
ttl="24h" \
format=pem \
| jq -r '.data.certificate, .data.issuing_ca' > test.cert.pem \
| jq -r '.data.private_key' > test.key.pem
but no dice.
It is not an issue with the jq invocation, but with the way the output files get written. With the usage you show, after the file test.cert.pem is written, the contents at the read end of the pipe (the JSON output) are no longer available to extract the private_key from.
To duplicate the contents at the write end of the pipe, use tee along with process substitution. The following should work in bash, zsh, or ksh93, but not in a POSIX Bourne shell (sh):
vault write -format=json pki_int/issue/example-dot-com \
common_name="test.example.com" \
ttl="24h" \
format=pem \
| tee >( jq -r '.data.certificate, .data.issuing_ca' > test.cert.pem) \
>(jq -r '.data.private_key' > test.key.pem) \
>/dev/null
See this in action
jq -n '{data:{certificate: "foo", issuing_ca: "bar", private_key: "zoo"}}' \
| tee >( jq -r '.data.certificate, .data.issuing_ca' > test.cert.pem) \
>(jq -r '.data.private_key' > test.key.pem) \
>/dev/null
and then observe the contents of both files.
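With this sample input, test.cert.pem should end up containing foo and bar on separate lines, and test.key.pem should contain zoo.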
You could abuse jq's ability to write to standard error (version 1.6 or later) separately from standard output.
vault write -format=json pki_int/issue/example-dot-com \
common_name="test.example.com" \
ttl="24h" \
format=pem \
| jq -r '.data as $f | ($f.private_key | stderr) | ($f.certificate, $f.issuing_ca)' > test.cert.pem 2> test.key.pem
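As with the tee approach, you can try this on sample data first; note that exactly how stderr renders the value (quoting, trailing newline) has varied a little across jq releases, so check the resulting files:
jq -n '{data:{certificate: "foo", issuing_ca: "bar", private_key: "zoo"}}' \
| jq -r '.data as $f | ($f.private_key | stderr) | ($f.certificate, $f.issuing_ca)' \
  > test.cert.pem 2> test.key.pem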
There's a general technique for this type of problem that is worth mentioning because it has minimal prerequisites (just jq and awk), and because it scales well with the number of files. It is also quite efficient, in that only one invocation each of jq and awk is needed. The idea is to set up a pipeline of the form jq ... | awk ... There are many variants of the technique, but in the present case the following would suffice:
jq -rc '
.data
| "test.cert.pem",
"\t\(.certificate)",
"\t\(.issuing_ca)",
"test.key.pem",
"\t\(.private_key)"
' | awk -F\\t 'NF == 1 {fn=$1; next} {sub(/^\t/, ""); print > fn}'
Notice that this works even if the items of interest are strings with embedded tabs.
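As with the tee variant, it can be exercised on the same sample data first:
jq -n '{data:{certificate: "foo", issuing_ca: "bar", private_key: "zoo"}}' \
| jq -rc '.data | "test.cert.pem", "\t\(.certificate)", "\t\(.issuing_ca)", "test.key.pem", "\t\(.private_key)"' \
| awk -F\\t 'NF == 1 {fn=$1; next} {sub(/^\t/, ""); print > fn}'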

Create Azure EventHub via CLI with Capture

Scenario: I am putting together a repeatable script that creates, among other things, an Azure EventHub. My code looks like:
az eventhubs eventhub create \
--name [name] \
--namespace-name [namespace] \
--resource-group [group] \
--status Active \
--enable-capture true \
--archive-name-format "{Namespace}/{EventHub}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}/{PartitionId}" \
--storage-account [account] \
--blob-container [blob] \
--capture-interval 300 \
--partition-count 10 \
--skip-empty-archives true
If I run the code as written, I get the error "Required property 'name' not found in JSON. Path 'properties.captureDescription.destination', line 1, position 527."
However, if I remove the --enable-capture true parameter, the EventHub is created, albeit with Capture not enabled. If I enable Capture, none of the capture-related parameters other than the interval are set.
Is there a typo in there that I'm not seeing?
Try providing the --destination-name. For reference, the parameters accepted by az eventhubs eventhub create are:
az eventhubs eventhub create --name
--namespace-name
--resource-group
[--archive-name-format]
[--blob-container]
[--capture-interval]
[--capture-size-limit]
[--destination-name]
[--enable-capture {false, true}]
[--message-retention]
[--partition-count]
[--skip-empty-archives {false, true}]
[--status {Active, Disabled, SendDisabled}]
[--storage-account]
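For capture into Blob Storage the destination is typically named EventHubArchive.AzureBlockBlob (verify against the Capture docs for your environment); a sketch of the command from the question with that flag added, untested, placeholders in brackets as before:
# EventHubArchive.AzureBlockBlob is the usual blob-capture destination name; confirm it for your setup
az eventhubs eventhub create \
  --name [name] \
  --namespace-name [namespace] \
  --resource-group [group] \
  --status Active \
  --enable-capture true \
  --destination-name EventHubArchive.AzureBlockBlob \
  --archive-name-format "{Namespace}/{EventHub}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}/{PartitionId}" \
  --storage-account [account] \
  --blob-container [blob] \
  --capture-interval 300 \
  --partition-count 10 \
  --skip-empty-archives true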

How to store a JSON key-value pair in a variable in Linux

I am calling a Python command which returns data as JSON key-value pairs.
I have put the Python command and other commands in one shell script named a.sh.
Code (a.sh):
cd /home/drg/Code/dth
a=$(python3 main.py -z shell -y droub -i 56)
echo "$a"
When I call this script I get output like:
{'password': 'XYZ', 'name': 'Stguy', 'port': '5412', 'host': 'igtet', 'db_name': 'test3'}
After getting this output, I want to pass values like password and name to the psql command to run a PostgreSQL query.
So what I want is to be able to store the password value in one variable, the name in another, and so on, like:
a=xyz
b=Stguy
p=port
so that I can use these variables in a psql query like:
psql -h $a -p $p -U $b -d $db -c "CREATE SCHEMA IF NOT EXISTS ${sname,,};"
Can someone please help me with this?
Note: Env is linux(Centos 8)
Thanks in advance!
One way of solving this could be a combination of jq for value extraction and the shell builtin read for multiple variable assignment:
JSON='{"name": "Stguy", "port": 5412, "host": "igtet", "db_name": "test3"}'
read -r a b c <<<$( echo $JSON | jq -r '"\(.host) \(.port) \(.name)"' )
echo "a: $a, b: $b, c: $c"
This uses jq string interpolation, "\(...)", to print the result on one line.
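Wiring that into psql could then look like the sketch below. Note the sample output above uses Python-style single quotes, which jq will not parse, so this assumes main.py can emit real JSON (double quotes); the variable names are illustrative, ${sname,,} is assumed to be set elsewhere as in the question, and the password is better supplied separately (e.g. via ~/.pgpass) than on the command line:
# assumes main.py prints valid JSON (double-quoted keys/values)
read -r host port user db <<<"$(python3 main.py -z shell -y droub -i 56 \
  | jq -r '"\(.host) \(.port) \(.name) \(.db_name)"')"
psql -h "$host" -p "$port" -U "$user" -d "$db" -c "CREATE SCHEMA IF NOT EXISTS ${sname,,};"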
You can also go with sed or awk:
PSQL="$( python3 main.py -z shell -y droub -i 56 | sed "s/^[^:]*: *'\([^']*\)'[^:]*: *'\([^']*\)'[^:]*: *'\([^']*\)'[^:]*: *'\([^']*\)'[^:]*: *'\([^']*\)'}/psql -h '\4' -p '\1' -U '\2' -d '\5'/")"
[ "${PSQL:0:5}" = "psql " ] && ${PSQL} -c "CREATE SCHEMA IF NOT EXISTS ${sname,,};"
For security reasons, I urge you anyway to avoid passing account data (user, password) through environment variables.
It would be better if your python script had an option to directly launch psql with required parameters.

Extract data from unix log file, construct JSON and perform post request using curl

My overall task is to continuously collect data from a UNIX system log file, filter it, prepare a JSON payload based on the filtered data, and process the data by sending a POST API call to another server.
I wonder if that can be done using, say, a shell script that monitors the log file with tail and filters with grep to get the specific lines dumped into another file, with a cron job running another script that constructs the JSON and sends the curl request with it to the external server.
Some details:
In the log file, connector.log, I am interested in lines like:
2020-09-16T15:14:37,337 INFO (tomcat-http--131) [tenant-test;-;138.188.247.4;] com.vmware.horizon.adapters.passwordAdapter.PasswordIdpAdapter - Login: user123 - SUCCESS
These lines, I can collect by the below command:
tailf connector.log | grep 'PasswordIdpAdapter - Login\|FAILURE\|SUCCESS'
and probably dump them into a file:
tailf connector.log | grep 'PasswordIdpAdapter - Login\|FAILURE\|SUCCESS' > log_data.txt
I wonder at this point: is it possible to extract only specific fields from a line (not the whole line) of connector.log, so that one line in log_data.txt looks like this (fields 1, 4, 6, 7, 8):
1 2020-09-29T07:15:13,881 [tenant1;usrname#tenant1;10.93.231.5;] - username - SUCCESS
From that point, I need to write a script (which could maybe be run by a cron job every minute) or a command to construct the JSON below and send the request. One line, one request.
This is the example of the json:
{
"timestamp": "2020-09-16T15:24:35,377",
"tenant_name": "tenant-test",
"log_type": "SERVICE",
"log_entry": "Login: user123 - SUCCESS"
}
The field values that should be replaced already exist in the log line: timestamp(the 1st field, e.g. 2020-09-16T15:14:37,337), tenant_name(the 1st part of the 4th field, tenant-test) and the log_entry(the last four fields, e.g. Login: user123 - SUCCESS).
When the json is constructed, I'll send it by:
curl --header "Content-Type: application/json" --request POST --data \
"$payload" http://myservert:8080/api/requests
What is not clear to me is how this script should get the data line by line from log_data.txt, populate some of the fields to create the JSON, and send it to the server.
Thanks for your answers in advance,
Petko
Thanks @shellter for the awk idea. So bash, awk, grep, cat, cut and curl did the job.
I've created a cron job to execute the bash script at a 5-minute interval.
The script gets the last 5 minutes of log data, dumps it to another file, reads the filtered data, prepares the payload, and then executes the API call. Maybe it is stupid, but it works.
#!/bin/bash
MONITORED_LOG="/var/logs/test.log"
FILTERED_DATA="/tmp/login/login_data.txt"
REST_HOST="https://rest-host/topics/logs-"
# dump the last 5 mins of log data(date format: 2020-09-28T10:52:28,334)
# to a file, filter for keywords FAILURE\|SUCCESS and NOT having 'lookup|SA'
# an example of data record taken: 1 2020-09-29T07:15:13,881 [tenant1;usrname#tenant1;10.93.231.5;] - username - SUCCESS
awk -v d1="$(date --date="-5 min" "+%Y-%m-%dT%H:%M:%S")" -v d2="$(date "+%Y-%m-%dT%H:%M:%S")" '$0 > d1 && $0 < d2' $MONITORED_LOG | grep 'FAILURE\|SUCCESS' | grep -v 'lookup\|SA-' | awk '{ print $2, $3, $5, $7}' | uniq -c > $FILTERED_DATA
## loop through all the filtered records and send an API call
cat $FILTERED_DATA | while read LINE; do
## preparing the variables
timestamp=$(echo $LINE | cut -f2 -d' ')
username=$(echo $LINE | cut -f5 -d' ')
log_entry=$(echo $LINE | cut -f7 -d' ')
# get the tenant name, split by ; and remove the first char [
tenant_name=$(echo $LINE | cut -f3 -d' ' | cut -f1 -d';')
tenant_name="${tenant_name:1}"
# preparing the payload
payload=$'{"records":[{"value":{"timestamp":"'
payload+=$timestamp
payload+=$'","tenant_name":"'
payload+=$tenant_name
payload+=$'","log_entry":"'
payload+=$log_entry
payload+=$'"}}]}'
echo 'payload: ' $payload
# send the api call to the server with dynamic construction of tenant name
curl -i -k -u 'api_user:3494ssdfs3' --request POST --header "Content-type:application/json" --data "$payload" "$REST_HOST$tenant_name"
done
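If jq is available on the host, building the payload with it instead of string concatenation keeps the JSON valid even when a field contains quotes or backslashes. This is a drop-in sketch for the payload block above, not part of the original script:
# requires jq; produces the same structure as the hand-built payload
payload=$(jq -n \
  --arg ts "$timestamp" \
  --arg tenant "$tenant_name" \
  --arg entry "$log_entry" \
  '{records: [{value: {timestamp: $ts, tenant_name: $tenant, log_entry: $entry}}]}')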

arangoimp of graph from CSV file

I have a network scan in a TSV file that contains data in a form like the following sample
source IP target IP source port target port
192.168.84.3 192.189.42.52 5868 1214
192.168.42.52 192.189.42.19 1214 5968
192.168.4.3 192.189.42.52 60680 22
....
192.189.42.52 192.168.4.3 22 61969
Is there an easy way to import this using arangoimp into the (pre-created) edge collection networkdata?
You could use the TSV importer directly if it didn't fail converting the IPs (fixed in ArangoDB 3.0), so you need a bit more conversion logic to get valid CSV. You then use the edge attribute conversion options (--from-collection-prefix / --to-collection-prefix) to convert the first two columns into valid _from and _to attributes during the import.
You shouldn't specify column headers with blanks in them, and the separators should really be tabs or a constant number of columns. We also need to specify a _from and a _to field in the header line.
In order to make it work, you would pipe the above through sed to get valid CSV and proper column names like this:
cat /tmp/test.tsv | \
sed -e "s;source IP;_from;g;" \
-e "s;target IP;_to;" \
-e "s; port;Port;g" \
-e 's;  *;",";g' \
-e 's;^;";' \
-e 's;$;";' | \
arangoimp --file - \
--type csv \
--from-collection-prefix sourceHosts \
--to-collection-prefix targetHosts \
--collection "ipEdges" \
--create-collection true \
--create-collection-type edge
Sed with these regular expressions will create an intermediate representation looking like this:
"_from","_to","sourcePort","targetPort"
"192.168.84.3","192.189.42.52","5868","1214"
The generated edges will look like this:
{
"_key" : "21056",
"_id" : "ipEdges/21056",
"_from" : "sourceHosts/192.168.84.3",
"_to" : "targetHosts/192.189.42.52",
"_rev" : "21056",
"sourcePort" : "5868",
"targetPort" : "1214"
}
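If you want to inspect the intermediate CSV yourself before running the import, the sed stage can be run on its own (same expressions as above, assuming the sample sits in /tmp/test.tsv and is space-separated as shown):
sed -e "s;source IP;_from;g" \
    -e "s;target IP;_to;" \
    -e "s; port;Port;g" \
    -e 's;  *;",";g' \
    -e 's;^;";' \
    -e 's;$;";' /tmp/test.tsv | head -3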