Regex for string starting with double quote and ending with [ - json

I have to update a dictionary with a new value for an existing key in a JSON file.
I need to write a new line after an existing string using a regex.
Current file:
{
"id": "aaaa",
"desc": "Service aaa",
"boss":"user#company.de",
"email": [
"user#company.de"
],
Desired file:
{
"id": "aaaa",
"desc": "Service aaa",
"boss":"user#company.de",
"email": [
"user#company.de"
"another_user#company.de"
]
I have this Ansible playbook using the lineinfile module, but I'm struggling to find a working regex. Everything I try just adds the new line at the very end of the file.
---
- hosts: localhost
  gather_facts: no
  tasks:
    - name: insert line
      lineinfile:
        path: /home/file.json
        state: present
        insertafter: " ^ "email": [ "
        line: 'another_user#company.de'
How should I write the regex so that the line is inserted after the string "email": [ ?

Quick comment:
The JSON spec mandates a comma (",") between the values of an array (plus whatever whitespace you like), so to be compliant, the output of the proposed solutions would have to resemble this instead (snippet taken directly from jq):
{
"id": "aaaa",
"desc": "Service aaa",
"boss": "user#company.de",
"email": [
"user#company.de",
"newline"
]
}
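Because the target is JSON, one alternative to a line-oriented regex is to parse the file, append to the list, and serialize it again, which keeps the commas valid automatically. The following is only a minimal Python sketch of that idea, not the lineinfile approach from the question. It assumes the document is complete, valid JSON (the sample in the question is truncated), and `add_email` is a hypothetical helper name.

```python
import json


def add_email(doc: dict, address: str) -> dict:
    """Append address to doc["email"] unless it is already present."""
    emails = doc.setdefault("email", [])
    if address not in emails:
        emails.append(address)
    return doc


# Sample document mirroring the question, closed so that it parses.
text = """
{
  "id": "aaaa",
  "desc": "Service aaa",
  "boss": "user#company.de",
  "email": [
    "user#company.de"
  ]
}
"""

doc = add_email(json.loads(text), "another_user#company.de")
result = json.dumps(doc, indent=2)
print(result)
```

Running the helper twice with the same address leaves the list unchanged, which is the same idempotent behaviour one would normally want from a playbook task.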

Let's say the current file is aa.txt, as follows:
{
"id": "aaaa",
"desc": "Service aaa",
"boss":"user#company.de",
"email": [
"user#company.de"
],
Use sed command
sed '/"email.*\[/!{p;d;};n;a new line' aa.txt
Output
{
"id": "aaaa",
"desc": "Service aaa",
"boss":"user#company.de",
"email": [
"user#company.de"
new line
],
Alternatively use AWK
awk '1;/"email.*\[/{c=2}c&&!--c{print "new text"}' aa.txt
Output
{
"id": "aaaa",
"desc": "Service aaa",
"boss":"user#company.de",
"email": [
"user#company.de"
new text
],
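The awk one-liner works with a countdown: a match sets the counter to 2, every line decrements it, and the text is printed when the counter reaches zero, that is, one line after the match. The same logic can be sketched in Python. `insert_after_match` and its `offset` parameter are hypothetical names used for illustration only, and like the sed and awk commands it treats the file as plain lines, so it does not add the comma JSON requires.

```python
import re


def insert_after_match(lines, pattern, new_line, offset=1):
    """Return a copy of lines with new_line inserted `offset` lines after
    each line matching pattern (offset=0 inserts directly after the match)."""
    out = []
    countdown = 0  # 0 means no insertion is pending
    for line in lines:
        out.append(line)
        if re.search(pattern, line):
            countdown = offset + 1  # same role as c=2 in the awk script
        if countdown:
            countdown -= 1
            if countdown == 0:
                out.append(new_line)
    return out


lines = ['{', '"email": [', '"user#company.de"', '],']
result = insert_after_match(lines, r'"email.*\[', 'new text')
print("\n".join(result))
```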

Related

Extracting values from nested arrays

I'm trying to extract values from nested arrays in JSON below and output as CSV.
Fields to extract:
templates.name
items.name
triggers.name
Output as:
templates.name; items.name; triggers.name
Anticipated output something like:
"Template App Agent"; "Host name of zabbix_agentd running"; "Host name of zabbix_agentd was changed on {HOST.NAME}"
"Template App Agent"; "Agent ping"; "Zabbix agent on {HOST.NAME} is unreachable for 5 minutes"
"Template App Agent"; "Version of zabbix_agent(d) running"; ""
Note:
Not every item has a trigger.
Several triggers may exist for an item.
I'm new to jq. So far my only success is extracting the template name:
jq '.[] | {templates: [.templates[].name]}'
Data:
{
"zabbix_export": {
"version": "5.4",
"date": "2022-05-17T06:25:59Z",
"groups": [
{
"uuid": "7df96b18c230490a9a0a9e2307226338",
"name": "Templates"
}
],
"templates": [
{
"uuid": "e60e6598cf19448089a5f5a6c5d796a2",
"template": "Template App Agent",
"name": "Template App Agent",
"groups": [
{
"name": "Templates"
}
],
"items": [
{
"uuid": "24c03ed734d54dc8868a282a83a02200",
"name": "Host name of zabbix_agentd running",
"key": "agent.hostname",
"delay": "1h",
"history": "1w",
"trends": "0",
"value_type": "CHAR",
"request_method": "POST",
"tags": [
{
"tag": "Application",
"value": "Zabbix agent"
}
],
"triggers": [
{
"uuid": "d2d12d9e7dfe4fedb252f19b85e5e6aa",
"expression": "(last(/Template App Agent/agent.hostname,#1)<>last(/Template App Agent/agent.hostname,#2))>0",
"name": "Host name of zabbix_agentd was changed on {HOST.NAME}",
"priority": "INFO"
}
]
},
{
"uuid": "abacad4ca5eb46d29864d8a4998f1cbb",
"name": "Agent ping",
"key": "agent.ping",
"history": "1w",
"description": "The agent always returns 1 for this item. It could be used in combination with nodata() for availability check.",
"valuemap": {
"name": "Zabbix agent ping status"
},
"request_method": "POST",
"tags": [
{
"tag": "Application",
"value": "Zabbix agent"
}
],
"triggers": [
{
"uuid": "6d2a73199f3b4288bf36331a142c1725",
"expression": "nodata(/Template App Agent/agent.ping,5m)=1",
"name": "Zabbix agent on {HOST.NAME} is unreachable for 5 minutes",
"priority": "AVERAGE"
}
]
},
{
"uuid": "2cc337555efd43d181c28c792f8cbbdb",
"name": "Version of zabbix_agent(d) running",
"key": "agent.version",
"delay": "1h",
"history": "1w",
"trends": "0",
"value_type": "CHAR",
"request_method": "POST",
"tags": [
{
"tag": "Application",
"value": "Zabbix agent"
}
]
}
],
"valuemaps": [
{
"uuid": "3d66c59a28c04b0ca8227c87902ddb4d",
"name": "Zabbix agent ping status",
"mappings": [
{
"value": "1",
"newvalue": "Up"
}
]
}
]
}
]
}
}
.zabbix_export.templates[] | .name as $tn | .items[] | [ $tn, .name, .triggers[]?.name? ] | join("; ")
Loop over the templates
.zabbix_export.templates[]
Save the template name in a var
.name as $tn
Loop over the items
.items[]
Create an array with the fields you want (including the template name saved in $tn)
[ $tn, .name, .triggers[]?.name? ]
Join the array to a string
join("; ")
Will output:
"Template App Agent; Host name of zabbix_agentd running; Host name of zabbix_agentd was changed on {HOST.NAME}"
"Template App Agent; Agent ping; Zabbix agent on {HOST.NAME} is unreachable for 5 minutes"
"Template App Agent; Version of zabbix_agent(d) running"
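For comparison, the same level-by-level walk can be sketched in Python. This is only an illustration of the traversal the jq filter performs, not a replacement for it. The `export` dictionary is a trimmed stand-in for the data in the question, and `rows` is a hypothetical helper name.

```python
# Trimmed stand-in for the export in the question (only the fields used here).
export = {
    "zabbix_export": {
        "templates": [
            {
                "name": "Template App Agent",
                "items": [
                    {
                        "name": "Host name of zabbix_agentd running",
                        "triggers": [
                            {"name": "Host name of zabbix_agentd was changed on {HOST.NAME}"}
                        ],
                    },
                    {
                        "name": "Agent ping",
                        "triggers": [
                            {"name": "Zabbix agent on {HOST.NAME} is unreachable for 5 minutes"}
                        ],
                    },
                    # This item has no "triggers" key at all.
                    {"name": "Version of zabbix_agent(d) running"},
                ],
            }
        ]
    }
}


def rows(export):
    """Yield one joined line per item: template name; item name; trigger names."""
    for template in export["zabbix_export"]["templates"]:
        for item in template.get("items", []):
            # A missing "triggers" key simply contributes no extra fields.
            triggers = [t["name"] for t in item.get("triggers", [])]
            yield "; ".join([template["name"], item["name"], *triggers])


lines = list(rows(export))
for line in lines:
    print(line)
```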
This is a nested structure, so you need to iterate level by level and collect the items you want in one output line. Store values from previous levels in variables.
To account for a missing .triggers array, you can use the error suppression operator ? in combination with the alternative operator //.
Finally, wrap the items in quotes (here using map), join them using join, and output them as raw text using the -r option:
jq -r '
.[].templates[] | .name as $t
| .items[] | .name as $i
| [$t, $i, (.triggers[].name)? // ""]
| map("\"\(.)\"") | join("; ")
'
"Template App Agent"; "Host name of zabbix_agentd running"; "Host name of zabbix_agentd was changed on {HOST.NAME}"
"Template App Agent"; "Agent ping"; "Zabbix agent on {HOST.NAME} is unreachable for 5 minutes"
"Template App Agent"; "Version of zabbix_agent(d) running"; ""
Also consider using the @csv builtin, which gives you valid CSV right away (the items are properly encoded, not just quoted, but they are separated with commas, not semicolons):
jq -r '
.[].templates[] | .name as $t
| .items[] | .name as $i
| [$t, $i, (.triggers[].name)? // ""]
| @csv
'
"Template App Agent","Host name of zabbix_agentd running","Host name of zabbix_agentd was changed on {HOST.NAME}"
"Template App Agent","Agent ping","Zabbix agent on {HOST.NAME} is unreachable for 5 minutes"
"Template App Agent","Version of zabbix_agent(d) running",""
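The difference between quoting and proper encoding shows up when a field itself contains a double quote, which has to be doubled in CSV. The Python sketch below illustrates this with the standard csv module rather than jq. `to_csv` is a hypothetical helper, and the last row is made-up sample data, not taken from the export. Note that with a semicolon delimiter the csv module writes no space after the separator, unlike the "; " join shown earlier.

```python
import csv
import io


def to_csv(rows, delimiter=","):
    """Serialize rows with every field quoted and embedded quotes doubled."""
    buf = io.StringIO()
    writer = csv.writer(
        buf, delimiter=delimiter, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    writer.writerows(rows)
    return buf.getvalue()


rows = [
    ("Template App Agent", "Agent ping",
     "Zabbix agent on {HOST.NAME} is unreachable for 5 minutes"),
    ("Template App Agent", "Version of zabbix_agent(d) running", ""),
    # Made-up row: a field containing a double quote.
    ("Example", 'say "hi"', ""),
]

comma_text = to_csv(rows)
semicolon_text = to_csv(rows, delimiter=";")
print(comma_text, end="")
print(semicolon_text, end="")
```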

Targeting sensitivity labels with New-DlpComplianceRule -ContentContainsSensitiveInformation via JSON file

Trying to import a DLP rule via a JSON file - current contents below:
{
"DlpRules":
[
{
"Name": "Example",
"Comment": "THis is an Example",
"Policy": "[TEST] EXAMPLE POLICY",
"Disabled": "false",
"Priority": "0",
"BlockAccess": "true",
"BlockAccessScope": "All",
"AlertProperties": {
"AggregationType": "None"
},
"GenerateAlert": "true",
"NotifyUser": [
"example#example.com"
],
"NotifyEmailCustomText": "Test",
"ReportSeverityLevel": "Medium",
"ContentContainsSensitiveInformation": [{
"Groups": [{
"operator": "Or",
"labels": [
{
"name": "EXAMPLE - LABEL",
"id": "[PRETEND GUID IS HERE]",
"type": "Sensitivity"
}
],
"name": "Default"
}]
}]
}
]
}
When running it, I get the error below:
The value specified in sensitive information is invalid.
+ CategoryInfo : NotSpecified: (:) [Set-DlpComplianceRule], CompliancePolicyValidationException
+ FullyQualifiedErrorId : Microsoft.Office.CompliancePolicy.PolicyEvaluation.CompliancePolicyValidationException,Microsoft.Office.CompliancePolicy.Tasks.SetDlpComplianceRule
+ PSComputerName : aus01b.ps.compliance.protection.outlook.com
Replacing the labels block with 'sensitivetypes' targeting a sensitive information type is successful.
The current file is based on creating the rule manually in the compliance portal, exporting it, and expanding the 'System.Collections.Hashtable' values with the commands below. It's possible I'm doing something daft when combining these inputs:
(get-dlpcompliancerule "Example").ContentContainsSensitiveInformation | ConvertTo-Json
(get-dlpcompliancerule "Example").ContentContainsSensitiveInformation.groups | ConvertTo-Json
(get-dlpcompliancerule "Example").ContentContainsSensitiveInformation.groups.labels | ConvertTo-Json

Read json values using sed or awk. I am not allowed to use jq

For the following JSON data, I need to retrieve the value of "status". I looked for examples online and tried to adapt them, but couldn't get them to work because this JSON has arrays. Can you please help me retrieve the "status" from the following JSON?
This is how the jq version looks: echo $JSON | jq -r .data.affected_items[].status
I need the same using sed or awk.
{
"data": {
"affected_items": [
{
"os": {
"arch": "x86_64",
"major": "2",
"name": "Amazon Linux",
"platform": "amzn",
"uname": "Linux |ip-10-179-120-6.vpc.internal |4.14.256-197.484.amzn2.x86_64 |#1 SMP Tue Nov 30 00:17:50 UTC 2021 |x86_64",
"version": "2"
},
"manager": "wazuh-manager-worker-0",
"dateAdd": "2022-02-24T08:42:52Z",
"lastKeepAlive": "2022-03-08T04:33:44Z",
"group": [
"default"
],
"name": "ec2_us-west-2_279976188247_i-030ccd7d70b84f0ee",
"ip": "10.179.120.6",
"configSum": "ab73af41699f13fdd81903b5f23d8d00",
"node_name": "wazuh-manager-worker-0",
"status": "active",
"version": "Wazuh v4.1.5",
"mergedSum": "56dfa0edef630b932284df2f81bf4a1c",
"id": "006",
"registerIP": "any"
}
],
"total_affected_items": 1,
"total_failed_items": 0,
"failed_items": []
},
"message": "All selected agents information was returned",
"error": 0
}
If this isn't all you need:
$ sed -n 's/.*"status": \("[^"]*"\).*/\1/p' file
"active"
then edit your question to contain a better explanation of your requirements and more truly representative sample input/output that the above doesn't work for.
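To make explicit what the sed expression captures, here is a small Python sketch that applies a comparable regex and, for contrast, reads the same value through a JSON parser. It assumes Python is available where jq is not, which the question does not state, and `text` is a trimmed stand-in for the sample data. The regex variant keeps the surrounding quotes, like the sed output above, while the parser variant returns the bare string, like jq -r.

```python
import json
import re

# Trimmed stand-in for the response in the question.
text = """
{
  "data": {
    "affected_items": [
      {
        "name": "ec2_us-west-2_279976188247_i-030ccd7d70b84f0ee",
        "status": "active",
        "version": "Wazuh v4.1.5"
      }
    ],
    "total_affected_items": 1
  },
  "error": 0
}
"""

# Regex approach, similar in spirit to the sed command: capture the quoted value.
quoted = re.findall(r'"status": ("[^"]*")', text)

# Parser approach, equivalent to .data.affected_items[].status in jq.
statuses = [item["status"] for item in json.loads(text)["data"]["affected_items"]]

print(quoted)
print(statuses)
```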

AWS Data Pipeline - Set Hive site values during EMR Creation

We are upgrading our Data Pipeline version from 3.3.2 to 5.8, so the settings that bootstrap actions applied on the old AMI release now have to be set up through configuration objects, specified as classification/property definitions.
So my JSON looks like this:
{
"enableDebugging": "true",
"taskInstanceBidPrice": "1",
"terminateAfter": "2 Hours",
"name": "ExportCluster",
"taskInstanceType": "m1.xlarge",
"schedule": {
"ref": "Default"
},
"emrLogUri": "s3://emr-script-logs/",
"coreInstanceType": "m1.xlarge",
"coreInstanceCount": "1",
"taskInstanceCount": "4",
"masterInstanceType": "m3.xlarge",
"keyPair": "XXXX",
"applications": ["hadoop","hive", "tez"],
"subnetId": "XXXXX",
"logUri": "s3://pipelinedata/XXX",
"releaseLabel": "emr-5.8.0",
"type": "EmrCluster",
"id": "EmrClusterWithNewEMRVersion",
"configuration": [
{ "ref": "configureEmrHiveSite" }
]
},
{
"myComment": "This object configures hive-site xml.",
"name": "HiveSite Configuration",
"type": "HiveSiteConfiguration",
"id": "configureEmrHiveSite",
"classification": "hive-site",
"property": [
{"ref": "hive-exec-compress-output" }
]
},
{
"myComment": "This object sets a hive-site configuration property value.",
"name":"hive-exec-compress-output",
"type": "Property",
"id": "hive-exec-compress-output",
"key": "hive.exec.compress.output",
"value": "true"
}
],
"parameters": []
With the above JSON file, the definition loads into Data Pipeline but throws these errors:
Object:HiveSite Configuration
ERROR: 'HiveSiteConfiguration'
Object:ExportCluster
ERROR: 'configuration' values must be of type 'null'. Found values of type 'null'
I am not sure what this really means. Could you please let me know whether I am specifying this correctly? I think I am, according to http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html
The block below must have the name "EMR Configuration"; only then is it recognized correctly by AWS Data Pipeline and hive-site.xml set accordingly. Note that the type in this block is also "EmrConfiguration" rather than "HiveSiteConfiguration".
{
"myComment": "This object configures hive-site xml.",
"name": "EMR Configuration",
"type": "EmrConfiguration",
"id": "configureEmrHiveSite",
"classification": "hive-site",
"property": [
{"ref": "hive-exec-compress-output" }
]
},

Creating a CSV from json using jq, based on elements in array

I have the following JSON that I need to convert to CSV:
[{
"name": "joe",
"age": 21,
"skills": [{
"lang": "spanish",
"grade": "47",
"school": {
"name": "my school",
"url": "example.com/sp-school"
}
}, {
"lang": "english",
"grade": "87"
}]
},
{
"name": "sarah",
"age": 34,
"skills": [{
"lang": "french",
"grade": "47",
"school": {
"name": "my school",
"url": "example.com/sp-school"
}
}, {
"lang": "english",
"grade": "87"
}]
}, {
"name": "jim",
"age": 26,
"skills": [{
"lang": "spanish",
"grade": "60"
}, {
"lang": "english",
"grade": "66",
"school": {
"name": "eg school",
"url": "eg-school.com"
}
}]
}
]
This is the CSV I want to get:
name,age,grade,school,url,file,line_number
joe,21,47,"my school","example.com/sp-school",sample.json,1
jim,26,60,"","",sample.json,3
So for each person, output the top-level fields plus the object from the skills array where lang is "spanish", and the school hash from that skills object if it exists.
I'd also like to add the file and line number each row came from.
I would like to use jq for the job, but I can't figure out the syntax. Can anyone help me out?
With your data in input.json, and the following jq program in tocsv.jq:
.[]
| [.name, .age] +
(.skills[]
| select(.lang == "spanish")
| [.grade, .school.name, .school.url, input_filename, input_line_number] )
| @csv
the invocation:
jq -r -f tocsv.jq input.json
yields:
"joe",21,"47","my school","example.com/sp-school","input.json",51
"jim",26,"60",,,"input.json",51
If you want the number-valued strings converted to numbers, you could use the "tonumber" filter. If you want the null-valued fields replaced by strings, use e.g. .school.name // ""
Of course this approach doesn't yield a very useful line number. One approach that would yield higher granularity would be to stream the individual objects into jq, but then you'd lose the filename. To recover the filename you could pass it in as an argument. So you would have a pipeline like so:
jq -c '.[]' input.json | jq -r --arg file input.json -f tocsv2.jq
where tocsv2.jq would be like tocsv.jq above but without the initial .[] |, and with $file instead of input_filename.
Finally, please also consider using the TSV format (@tsv) rather than the rather messy CSV format (@csv).
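For readers who want to check the selection logic outside jq, the same filtering can be sketched in Python. This is an illustration, not part of the jq answer. `spanish_rows` is a hypothetical helper, the `people` list is a trimmed copy of the input, and the last column is assumed to mean the 1-based position of the object in the top-level array, which is what the desired output in the question suggests.

```python
import csv
import io

# Trimmed copy of the input from the question.
people = [
    {"name": "joe", "age": 21, "skills": [
        {"lang": "spanish", "grade": "47",
         "school": {"name": "my school", "url": "example.com/sp-school"}},
        {"lang": "english", "grade": "87"}]},
    {"name": "sarah", "age": 34, "skills": [
        {"lang": "french", "grade": "47"},
        {"lang": "english", "grade": "87"}]},
    {"name": "jim", "age": 26, "skills": [
        {"lang": "spanish", "grade": "60"},
        {"lang": "english", "grade": "66",
         "school": {"name": "eg school", "url": "eg-school.com"}}]},
]


def spanish_rows(people, filename):
    """Yield one row per spanish skill; a missing school becomes empty strings."""
    for index, person in enumerate(people, start=1):
        for skill in person.get("skills", []):
            if skill.get("lang") != "spanish":
                continue
            school = skill.get("school", {})
            yield [person["name"], person["age"], skill["grade"],
                   school.get("name", ""), school.get("url", ""),
                   filename, index]


buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["name", "age", "grade", "school", "url", "file", "line_number"])
writer.writerows(spanish_rows(people, "sample.json"))
csv_text = buf.getvalue()
print(csv_text, end="")
```

The csv module quotes fields only when necessary by default, so the school name and URL appear unquoted here, unlike in the @csv output above.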