Merging two JSONs into one single JSON with value parsing in bash - json

I have two JSONs:
{
"name": "paypal_modmon",
"description": "Role For Paypal admin-service box",
"run_list": [
"recipe[djcm_paypal_win::sslVerify]"
]
}
and
{
"name": "paypal_dev",
"default_attributes": {
"7-zip": {
"home": "%SYSTEMDRIVE%\\7-zip"
},
"modmon": {
"env": "dev"
},
"paypal": {
"artifact": "%5BINTEGRATION%5D"
}
},
"override_attributes": {
"default": {
"env": "developmen"
},
"windows": {
"password": "Pib1StheK1N5"
},
"task_sched":{
"credentials": "kX?rLQ4XN$q"
},
"seven_zip": {
"url": "https://djcm:Pib1StheK1N5#artifactory.dowjones.io/artifactory/djcm-zip-local/djcm/chef/paypal/7z1514-x64.msi"
}
},
"chef_type": "environment"
}
I want to read the values of "default_attributes" and "override_attributes" from the second JSON and merge them with the first JSON into an output like:
{
"description": "Role For Paypal admin-service box",
"run_list": [
"recipe[djcm_paypal_win::sslVerify]"
],
"chef_type": "environment",
"seven_zip": {
"url": "https://djcm:Pib1StheK1N5#artifactory.dowjones.io/artifactory/djcm-zip-local/djcm/chef/paypal/7z1514-x64.msi"
},
"task_sched": {
"credentials": "kX?rLQ4XN$q"
},
"windows": {
"password": "Pib1StheK1N5"
},
"paypal": {
"artifact": "%5BINTEGRATION%5D"
},
"modmon": {
"env": "dev"
},
"7-zip": {
"home": "%SYSTEMDRIVE%\\7-zip"
},
"default": {
"env": "developmen"
},
"name": "paypal_modmon"
}
Is there a way to do this in bash, and how would I go about achieving it?

Generally, if you're reading in multiple files, you should use the --argfile option so you can reference the contents of a file by name. And judging by the names of the attributes you wish to merge, you should be wary of the different merging options you have: default_attributes suggests attributes that should be used if omitted, while override_attributes suggests attributes that should force their values in.
$ jq --argfile merge input2.json \
'($merge.default_attributes * .) + $merge.override_attributes' input1.json
By merging the input with the default_attributes using *, you start with the defaults and add your actual values in place. That way, missing values end up being provided by the default object.
Then, adding the override_attributes object, those values are completely replaced and not just merged.
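The difference between the two operators can be checked in isolation; here is a minimal sketch with made-up objects:

```shell
# '*' merges objects recursively: nested keys are combined, the right side wins on conflicts.
echo '{"a":{"x":1,"y":2}}' | jq -c '. * {"a":{"y":9}}'
# -> {"a":{"x":1,"y":9}}

# '+' is a shallow merge: the right-hand value replaces the whole nested object.
echo '{"a":{"x":1,"y":2}}' | jq -c '. + {"a":{"y":9}}'
# -> {"a":{"y":9}}
```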

Got it. With jq it seems super simple:
jq -s '.[0] + .[1].default_attributes + .[1].override_attributes' a-roles.json a-env.json > manifest.json
manifest.json ->
{
"default": {
"env": "developmen-jq"
},
"7-zip": {
"home": "%SYSTEMDRIVE%\\7-zip"
},
"name": "paypal_modmon",
"description": "Role For Paypal admin-service box",
"run_list": [
"recipe[djcm_paypal_win::sslVerify]"
],
"seven_zip": {
"url": "https://djcm:Pib1StheK1N5#artifactory.dowjones.io/artifactory/djcm-zip-local/djcm/chef/paypal/7z1514-x64.msi"
},
"task_sched": {
"credentials": "kX?rLQ4XN$q"
},
"windows": {
"password": "Pib1StheK1N5"
},
"paypal": {
"artifact": "%5BINTEGRATION%5D"
},
"modmon": {
"env": "dev"
}
}
EDIT 1:
I also need to parse out the run_list key/value pair from a-roles.json and ignore all other info, to end up with something like:
{
"default": {
"env": "developmen-jq"
},
"7-zip": {
"home": "%SYSTEMDRIVE%\\7-zip"
},
"run_list": [
"recipe[djcm_paypal_win::sslVerify]"
],
"seven_zip": {
"url": "https://djcm:Pib1StheK1N5#artifactory.dowjones.io/artifactory/djcm-zip-local/djcm/chef/paypal/7z1514-x64.msi"
},
"task_sched": {
"credentials": "kX?rLQ4XN$q"
},
"windows": {
"password": "Pib1StheK1N5"
},
"paypal": {
"artifact": "%5BINTEGRATION%5D"
},
"modmon": {
"env": "dev"
}
}
Is that possible with jq?
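It is. One way, sketched along the lines of the earlier command (assuming the same a-roles.json / a-env.json files), is to keep only run_list from the first file before merging:

```shell
# Keep only run_list from the roles file, then merge in the environment attributes.
jq -s '{run_list: .[0].run_list} + .[1].default_attributes + .[1].override_attributes' \
  a-roles.json a-env.json > manifest.json
```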

Related

Extract JSON including key with jq command

Here is a sample JSON file.
sample.json
{
"apps": [
{
"name": "app1"
},
{
"name": "app2"
},
{
"name": "app3"
}
],
"test": [
{
"name": "test1"
},
{
"name": "test2"
}
]
}
I want to divide the above JSON file into the following two files.
I want to manage the entire configuration in one JSON file, splitting it out when necessary to hand to the tool.
apps.json
{
"apps": [
{
"name": "app1"
},
{
"name": "app2"
},
{
"name": "app3"
}
]
}
test.json
{
"test": [
{
"name": "test1"
},
{
"name": "test1"
}
]
}
jq .apps sample.json outputs only the value.
[
// does not contain the key
{
"name": "app1"
},
{
"name": "app2"
},
{
"name": "app3"
}
]
Do you have any idea?
Construct a new object using {x} which is a shorthand for {x: .x}.
jq '{apps}' sample.json
{
"apps": [
{
"name": "app1"
},
{
"name": "app2"
},
{
"name": "app3"
}
]
}
And likewise with {test}.
You can do
{apps}, {test}
Demo: https://jqplay.org/s/P_9cc2uANV
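Putting both together, the split into two files could be scripted as (a sketch using the file names from the question):

```shell
# Each invocation picks one top-level key and writes it, key included, to its own file.
jq '{apps}' sample.json > apps.json
jq '{test}' sample.json > test.json
```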

Delete json block with jq command

I have a JSON file with multiple domains, formatted as shown below. How can I delete the whole block for a domain? For example, how would I delete the whole block in the JSON for the domain domain.tld?
I tried this, but the output is an error:
jq '."http-01"."domain"[]."main"="domain.tld"' acme.json
jq: error (at acme.json:11483): Cannot iterate over null (null)
Formatting of the example file:
{
"http-01": {
"Account": {
"Email": "mail#placeholder.tld",
"Registration": {
"body": {
"status": "valid",
"contact": [
"mailto:mail#placeholder.tld"
]
},
"uri": "https://acme-v02.api.letsencrypt.org/acme/acct/110801506"
},
"PrivateKey": "main_priv_key_string",
"KeyType": "4096"
},
"Certificates": [
{
"domain": {
"main": "www.some_domain.tld"
},
"certificate": "cert_string",
"key": "key_string",
"Store": "default"
},
{
"domain": {
"main": "some_domain.tld"
},
"certificate": "cert_string",
"key": "key_string",
"Store": "default"
},
{
"domain": {
"main": "www.some_domain2.tld"
},
"certificate": "cert_string",
"key": "key_string",
"Store": "default"
},
{
"domain": {
"main": "some_domain2.tld"
},
"certificate": "cert_string",
"key": "key_string",
"Store": "default"
}
]
}
}
To delete domain block "www.some_domain.tld" :
jq '."http-01".Certificates |= map(select(.domain.main != "www.some_domain.tld"))' input.json
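On a reduced version of the file (shortened here for illustration), the filter keeps every certificate entry except the matching one:

```shell
# Only the entry whose domain.main matches is dropped; everything else is kept.
echo '{"http-01":{"Certificates":[{"domain":{"main":"a.tld"}},{"domain":{"main":"b.tld"}}]}}' \
  | jq -c '."http-01".Certificates |= map(select(.domain.main != "a.tld"))'
# -> {"http-01":{"Certificates":[{"domain":{"main":"b.tld"}}]}}
```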
Your question is quite broad. What is a "block"?
Let's assume you want to delete, from within the object under http-01, each field that is of type array and has at index 0 an object satisfying .domain.main == "domain.tld". Then first navigate to where you want to delete from, and update it (|=) using del and select, which performs the filtered deletion.
jq '
."http-01" |= del(
.[] | select(arrays[0] | objects.domain.main == "domain.tld")
)
' acme.json
{
"http-01": {
"Account": {
"Email": "email#domain.tld",
"Registration": {
"body": {
"status": "valid",
"contact": [
"mailto:email#domain.tld"
]
},
"uri": "https://acme-v02.api.letsencrypt.org/acme/acct/110801506"
},
"PrivateKey": "long_key_string",
"KeyType": "4096"
}
}
}
If your "block" is deeper, go deeper before updating. If it is higher, the whole document for instance, there's no need to update, just start with del.

Selecting multiple conditionals in JQ

I've just started using the jq JSON parser. Is there any way to select on multiple conditions?
I have this:
cat file | jq -r '.Instances[] | {ip: .PrivateIpAddress, name: .Tags[]}
| select(.name.Key == "Name")'
And I need to also include the .name.Key == "Type"
This is the JSON:
{
"Instances": [
{
"PrivateIpAddress": "1.1.1.1",
"Tags": [
{
"Value": "Daily",
"Key": "Backup"
},
{
"Value": "System",
"Key": "Name"
},
{
"Value": "YES",
"Key": "Is_in_Domain"
},
{
"Value": "PROD",
"Key": "Type"
}
]
}
]
}
And this is the current output:
{
"ip": "1.1.1.1",
"name": "System"
}
{
"ip": "2.2.2.2",
"name": "host"
}
{
"ip": "3.3.3.3",
"name": "slog"
}
Desired output:
{
"ip": "1.1.1.1",
"name": "System",
"type": "PROD"
}
{
"ip": "2.2.2.2",
"name": "host",
"type": "PROD"
}
{
"ip": "3.3.3.3",
"name": "slog",
"type": "PROD"
}
What is the right way to do it? Thanks.
There's no "right" way to do it, but there are approaches to take that can make things easier for you.
The tags are already in a format that makes converting to objects simple (they're object entries). Convert the tags to an object for easy access to the properties.
$ jq '.Instances[]
| .Tags |= from_entries
| {
ip: .PrivateIpAddress,
name: .Tags.Name,
type: .Tags.Type
}' file
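The key step is what from_entries does: it turns an array of {Key, Value} entries (it also accepts lowercase key/value and name) into a single object:

```shell
echo '[{"Key":"Name","Value":"System"},{"Key":"Type","Value":"PROD"}]' \
  | jq -c 'from_entries'
# -> {"Name":"System","Type":"PROD"}
```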

Elasticsearch : Default template does not detect date

I have a default template in place which looks like
PUT /_template/abtemp
{
"template": "abt*",
"settings": {
"index.refresh_interval": "5s",
"number_of_shards": 5,
"number_of_replicas": 1,
"index.codec": "best_compression"
},
"mappings": {
"_default_": {
"_all": {
"enabled": false
},
"_source": {
"enabled": true
},
"dynamic_templates": [
{
"message_field": {
"match": "message",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "analyzed",
"omit_norms": true,
"fielddata": {
"format": "disabled"
}
}
}
},
{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "analyzed",
"omit_norms": true,
"fielddata": {
"format": "disabled"
},
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
}
}
}
]
}
}
}
The idea here is this:
apply the template for all indices whose name matches abt*
Only analyze a string field if it is named message. All other string fields will be not_analyzed and will have a corresponding .raw field
Now I try to index some data into this as:
curl -s -XPOST hostName:port/indexName/_bulk --data-binary @myFile.json
and here is the file
{ "index" : { "_index" : "abtclm3","_type" : "test"} }
{ "FIELD1":1, "FIELD2":"2015-11-18 15:32:18"", "FIELD3":"MATTHEWS", "FIELD4":"GARY", "FIELD5":"", "FIELD6":"STARMX", "FIELD7":"AL", "FIELD8":"05/15/2010 11:30", "FIELD9":"05/19/2010 7:00", "FIELD10":"05/19/2010 23:00", "FIELD11":3275, "FIELD12":"LC", "FIELD13":"WIN", "FIELD14":"05/15/2010 11:30", "FIELD15":"LC", "FIELD16":"POTUS", "FIELD17":"WH", "FIELD18":"S GROUNDS", "FIELD19":"OFFICE", "FIELD20":"VISITORS", "FIELD21":"STATE ARRIVAL - MEXICO**", "FIELD22":"08/27/2010 07:00:00 AM +0000", "FIELD23":"MATTHEWS", "FIELD24":"GARY", "FIELD25":"", "FIELD26":"STARMX", "FIELD27":"AL", "FIELD28":"05/15/2010 11:30", "FIELD29":"05/19/2010 7:00", "FIELD30":"05/19/2010 23:00", "FIELD31":3275, "FIELD32":"LC", "FIELD33":"WIN", "FIELD34":"05/15/2010 11:30", "FIELD35":"LC", "FIELD36":"POTUS", "FIELD37":"WH", "FIELD38":"S GROUNDS", "FIELD39":"OFFICE", "FIELD40":"VISITORS", "FIELD41":"STATE ARRIVAL - MEXICO**", "FIELD42":"08/27/2010 07:00:00 AM +0000" }
Note that there are a few fields, such as FIELD2, that should be classified as dates. Also, FIELD31 should be classified as long. The indexing happens, and when I look at the data I see that the numbers have been correctly classified, but everything else has been put under string. How do I make sure that the fields holding timestamps get classified as dates?
You have a lot of date formats there. You need a template like this one:
{
"template": "abt*",
"settings": {
"index.refresh_interval": "5s",
"number_of_shards": 5,
"number_of_replicas": 1,
"index.codec": "best_compression"
},
"mappings": {
"_default_": {
"dynamic_date_formats":["dateOptionalTime||yyyy-mm-dd HH:mm:ss||mm/dd/yyyy HH:mm||mm/dd/yyyy HH:mm:ss aa ZZ"],
"_all": {
"enabled": false
},
"_source": {
"enabled": true
},
"dynamic_templates": [
{
"message_field": {
"match": "message",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "analyzed",
"omit_norms": true,
"fielddata": {
"format": "disabled"
}
}
}
},
{
"dates": {
"match": "*",
"match_mapping_type": "date",
"mapping": {
"type": "date",
"format": "dateOptionalTime||yyyy-mm-dd HH:mm:ss||mm/dd/yyyy HH:mm||mm/dd/yyyy HH:mm:ss aa ZZ"
}
}
},
{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"type": "string",
"index": "analyzed",
"omit_norms": true,
"fielddata": {
"format": "disabled"
},
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed",
"ignore_above": 256
}
}
}
}
}
]
}
}
}
This probably doesn't cover all the formats you have in there; you need to add the remaining ones. The idea is to specify them under dynamic_date_formats separated by || and then to specify them, also, under the format field for the date field itself.
To get an idea of how to define them, see the Elasticsearch reference documentation on built-in date formats, and its section on custom formats for any you'd plan on using.

How to match an array value by its key in a key-value pair Elasticsearch array?

I have an array of key-value pairs. Is it possible to exact-match the value of a key and then do a range check on its value?
Example: In the doc below, oracle_props is an array of name/value pairs. I need to check whether it has the "oracle_cursors" key and, if so, whether its value is less than 1000.
GET /eg/message/_percolate
{
"doc": {
"client": {
"name": "Athena",
"version": 1,
"db": {
"#type": "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 64bit",
"oracle_props": [
{
"#name": "open_cursors",
"#value": 4000
},
{
"#name": "USER_ROLE_PRIVS_COUNT",
"#value": 1
},
{
"#name": "CREATE_PERMISSION",
"#value": "Y"
}
]
}
}
}
}
Below is my percolator.
I also need to check the following so that it gives back 3 (the percolator id) as my result:
"client.name" must be "Athena"
"client.db.#type" must be "Oracle"; only then go ahead and check the points below
"client.db.oracle_props.#name" field is not found
check if it has the "oracle_cursors" key, and then check if its value is < 1000
Points 1 & 2 are AND operations, and if either 3 or 4 is satisfied it should result in 3. I need help with point 4; below is my query. Also, please suggest if there is a better way.
PUT /eg/.percolator/3
{
"query": {
"filtered": {
"filter": {
"or": [
{
"missing": {
"field": "client.db.oracle_props.#name"
}
}
]
},
"query": {
"bool": {
"must": [
{
"match": {
"client.name": "Athena"
}
},
{
"match": {
"client.db.#type": "Oracle"
}
}
]
}
}
}
}
}
Update
Can I have something like the following?
{
"match": {
"client.db.oracle_props[name='open_cursors'].value": 4000
}
}
More tries
I followed elasticsearch nested query and changed the mapping to the nested type by re-indexing. Can anyone find the problem? Why am I getting nested: NullPointerException;?
PUT /eg/.percolator/3
{
"nested" : {
"path" : "client.db.oracle_props",
"score_mode" : "avg",
"query" : {
"bool" : {
"must" : [
{
"match" : {"client.db.oracle_props.#name" : "open_cursors"}
},
{
"range" : {"client.db.oracle_props.#value" : {"lt" : 4000}}
}
]
}
}
}
}
mapping change
...
"properties": {
"#type": {
"type": "string"
},
"oracle_props": {
"type" : "nested",
"properties": {
"#name": {
"type": "string"
},
"#value": {
"type": "long"
}
}
}
}
...
Let's get into it:
You seem to map your nested path wrong: oracle_props is a child of db in your example document, but not in your mapping, where it appears directly as a child of the root.
You are mapping oracle_props.#value as long, but assign the text "Y" to it in the CREATE_PERMISSION nested doc.
You query for a range with lt 4000, which excludes 4000; lte would fit for you.
I didn't get your requirement for the missing value, hence I skipped that.
To get you onto the right path, I had to simplify it a bit (since I couldn't follow everything in your question, sorry).
I'm not going into percolation either, and I renamed everything to twitter/tweet, since this was easier for me to copy from my examples.
1) Create empty index "twitter"
curl -XDELETE 'http://localhost:9200/twitter/'
curl -XPUT 'http://localhost:9200/twitter/'
2) Create the nested mapping for the actual "tweet"
curl -XPUT 'http://localhost:9200/twitter/tweet/_mapping' -d '
{
"tweet": {
"properties": {
"db": {
"type": "object",
"properties": {
"#type": {
"type": "string"
},
"oracle_props": {
"type": "nested",
"properties": {
"#name": {
"type": "string"
},
"#value": {
"type": "string"
}
}
}
}
}
}
}
}'
3) Let's check if the mapping was set
curl -XGET 'http://localhost:9200/twitter/tweet/_mapping?pretty=true'
4) Post some tweets, with nested data
curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
"name": "Athena",
"version": 1,
"db": {
"#type": "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 64bit",
"oracle_props": [
{
"#name": "open_cursors",
"#value": 4000
},
{
"#name": "USER_ROLE_PRIVS_COUNT",
"#value": 1
},
{
"#name": "CREATE_PERMISSION",
"#value": "Y"
}
]
}
}'
5) Query nested only
curl -XGET localhost:9200/twitter/tweet/_search -d '{
"query": {
"nested" : {
"path" : "db.oracle_props",
"score_mode" : "avg",
"query" : {
"bool" : {
"must" : [
{
"term": {
"db.oracle_props.#name": "open_cursors"
}
},
{
"range": {
"db.oracle_props.#value": {
"lte": 4000
}
}
}
]
}
}
}
}
}';
6) Query "Athena" and "Oracle"
curl -XGET localhost:9200/twitter/tweet/_search -d '{
"query" : {
"bool" : {
"must" : [
{
"match" : {"tweet.name" : "Athena"}
},
{
"match" : {"tweet.db.#type" : "Oracle"}
}
]
}
}
}'
7) Combine the former two queries
curl -XGET localhost:9200/twitter/tweet/_search -d '{
"query" : {
"bool" : {
"must" : [
{
"match" : {"tweet.name" : "Athena"}
},
{
"match" : {"tweet.db.#type" : "Oracle"}
},
{
"nested" : {
"path" : "db.oracle_props",
"score_mode" : "avg",
"query" : {
"bool" : {
"must" : [
{
"term": {
"db.oracle_props.#name": "open_cursors"
}
},
{
"range": {
"db.oracle_props.#value": {
"lte": 4000
}
}
}
]
}
}
}
}
]
}
}
}'
Results as
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 2.462332,
"hits": [
{
"_index": "twitter",
"_type": "tweet",
"_id": "1",
"_score": 2.462332,
"_source": {
"name": "Athena",
"version": 1,
"db": {
"#type": "Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 64bit",
"oracle_props": [
{
"#name": "open_cursors",
"#value": 4000
},
{
"#name": "USER_ROLE_PRIVS_COUNT",
"#value": 1
},
{
"#name": "CREATE_PERMISSION",
"#value": "Y"
}
]
}
}
}
]
}
}
You need to use a Nested Document. See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-nested-type.html