How to remove duplicate dicts in Ansible?

So I have dict data that looks like this:
{
"json_response": [
{
"message": "The Value is 1505",
"Value": "1505"
},
{
"message": "The Value is 1534",
"Value": "1534"
},
{
"message": "The Value is 1505",
"Value": "1505"
},
{
"message": "The Value is 1534",
"Value": "1534"
}
]
}
What I'm trying to achieve is something like this:
{
"json_response": [
{
"message": "The Value is 1505",
"Value": "1505"
},
{
"message": "The Value is 1534",
"Value": "1534"
}
]
}
I have tried:
remove_duplicate: "{{ json_response('\n') | unique | join('\n') }}"
and also:
remove_duplicate: "{{ json_response('\n') | unique | list }}"
But it returned an error like this:
{
"msg": "Unexpected templating type error occurred on ({{ json_response('\n') | unique | join('\n') }}): 'list' object is not callable",
"_ansible_no_log": false
}
How can I get only the unique dicts from my data?
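For reference, a minimal sketch of the usual fix, assuming json_response holds the list shown above (the unique filter is applied to the variable rather than calling it like a function):

# Sketch: json_response is assumed to already contain the list of dicts
- name: Deduplicate the list of dicts
  set_fact:
    remove_duplicate: "{{ json_response | unique }}"

The original error comes from json_response('\n'), which tries to call the list as if it were a function.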

Related

Count number of json objects in output

I am using Ansible's vmware_cluster_info module, which returns the following JSON. I need to be able to count the number of hosts in a cluster. Cluster names will change.
{
"ansible_facts": {
"discovered_interpreter_python": "/usr/bin/python"
},
"changed": false,
"clusters": {
"Cluster1": {
"drs_default_vm_behavior": "fullyAutomated",
"ha_restart_priority": [
"medium"
],
"ha_vm_failure_interval": [
50
],
"ha_vm_max_failure_window": [
0
],
"ha_vm_max_failures": [
10
],
"ha_vm_min_up_time": [
90
],
"ha_vm_monitoring": "vmMonitoringOnly",
"ha_vm_tools_monitoring": [
"vmAndAppMonitoring"
],
"hosts": [
{
"folder": "/Datacenter/host/Cluster1",
"name": "host1"
},
{
"folder": "/Datacenter/host/Cluster1",
"name": "host2"
},
{
"folder": "/Datacenter/host/Cluster1",
"name": "host3"
},
{
"folder": "/Datacenter/host/Cluster1",
"name": "host4"
}
],
"resource_summary": {
"cpuCapacityMHz": 144000,
},
"tags": [],
"vsan_auto_claim_storage": false
}
},
"invocation": {
"module_args": {
"cluster_name": "Cluster1",
}
}
}
I have tried:
- debug:
    msg: "{{ cluster_info.clusters.[].hosts.name | length }}"
- debug:
    msg: "{{ cluster_info.clusters.*.hosts.name | length }}"
Both give me a "template error while templating string" message. Also, any suggestions on tutorials, etc. to learn how to parse JSON would be appreciated. It seems to be a difficult topic for me. Any ideas?
One approach similar to what you were trying is to use json_query. I understand you want to print the count of hosts, so the .name is not needed in your code:
- name: print count of hosts
  debug:
    var: cluster_info.clusters | json_query('*.hosts') | first | length
json_query returns a list of host lists (one per cluster), which is why we pass it through first before length.
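If you would rather avoid json_query (which requires the jmespath Python library where templating runs), a plain-Jinja sketch along the same lines, assuming the structure shown above, could be:

# Sketch without json_query: take the first cluster's host list and count it
- name: print count of hosts (plain Jinja)
  debug:
    msg: "{{ cluster_info.clusters | dict2items | map(attribute='value.hosts') | first | length }}"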

Leveling select fields

I am fetching a JSON response with the following structure:
{
"data": {
"children": [
{
"data": {
"id": "abcdef",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_1.jpg"
}
}
]
},
"title": "Boring Title One"
}
},
{
"data": {
"id": "ghijkl",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_2.jpg"
}
}
]
},
"title": "Boring Title Two"
}
},
{
"data": {
"id": "mnopqr",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_3.jpg"
}
}
]
},
"title": "Boring Title Three"
}
},
{
"data": {
"id": "stuvwx",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_4.jpg"
}
}
]
},
"title": "Boring Title Four"
}
}
]
}
}
Ideally I would like to have a shortened json like this:
{
"data": [
{
"id": "abcdef",
"title": "Boring Title One",
"url": "https://example.com/somefiles_1.jpg"
},
{
"id": "ghijkl",
"title": "Boring Title Two",
"url": "https://example.com/somefiles_2.jpg"
},
{
"id": "mnopqr",
"title": "Boring Title Three",
"url": "https://example.com/somefiles_3.jpg"
},
{
"id": "stuvwx",
"title": "Boring Title Four",
"url": "https://example.com/somefiles_4.jpg"
}
]
}
If this is not possible, I can work with joining those three values into a single string and later splitting it when necessary, like this:
abcdef#Boring Title One#https://example.com/somefiles_1.jpg
ghijkl#Boring Title Two#https://example.com/somefiles_2.jpg
mnopqr#Boring Title Three#https://example.com/somefiles_3.jpg
stuvwx#Boring Title Four#https://example.com/somefiles_4.jpg
This is where I am. I was using jq with select() and then piping the results to to_entries, like this:
jq -r '.data.children[] | select(.data.post_type|test("image")?) | .data | to_entries[] | [ .value.title , .value.preview.images[0].source.url ] | join("#")' ~/Documents/json/sample.json
I don't understand what goes after to_entries[]; I have tried multiple variations of .key and .values. Mostly I don't get any result, but sometimes I get key pairs I did not intend to select. How do I learn the proper syntax for this?
Is creating a flat JSON out of a nested JSON like this a good approach, or is it better to create the string outputs? I feel the string might be error-prone, especially when spaces or special characters are present.
Apparently what you're looking for is the {field} syntax. You don't need to resort to string outputs.
{ data: [
    .data.children[].data
    | select(has("post_type") and (.post_type | index("image")))
    | {id, title} + (.preview.images[].source | {url})
    # or, if images array always contains one element:
    # | {id, title, url: .preview.images[0].source.url}
  ]
}
A simple solution to the main question is:
{data: [.data.children[]
| .data
| {id, title, url: .preview.images[0].source.url} ]}
(The "post_type" seems to have disappeared, but hopefully if it's relevant, you will be able to adapt the above as required. Likewise if .images[1] and beyond are relevant.)
String Output
If you want linear output, you should probably consider CSV or TSV, both of which are supported by jq via @csv and @tsv respectively.
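For example, a sketch of the TSV variant, reusing the sample path from the question (the field order here is an assumption):

jq -r '.data.children[].data
       | [.id, .title, .preview.images[0].source.url]
       | @tsv' ~/Documents/json/sample.json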

Remove parent elements with certain key-value pairs using JQ

I need to remove elements from a json file based on certain key values. Here is the file I am trying to process.
{
"element1": "Test Element 1",
"element2": {
"tags": "internal",
"data": {
"data1": "Test Data 1",
"data2": "Test Data 2"
}
},
"element3": {
"function1": {
"tags": [
"new",
"internal"
]
},
"data3": "Test Data 3",
"data4": "Test Data 4"
},
"element4": {
"function2": {
"tags": "new"
},
"data5": "Test Data 5"
}
}
I want to remove all elements that have a "tags" key with the value "internal". So the result should look like this:
{
"element1": "Test Element 1",
"element4": {
"function2": {
"tags": "new"
},
"data5": "Test Data 5"
}
}
I have tried various approaches, but I just can't get it done using jq. Any ideas? Thanks.
Just to add some more complexity, let's assume the JSON is:
{
"element1": "Test Element 1",
"element2": {
"tags": "internal",
"data": {
"data1": "Test Data 1",
"data2": "Test Data 2"
}
},
"element3": {
"function1": {
"tags": [
"new",
"internal"
]
},
"data3": "Test Data 3",
"data4": "Test Data 4"
},
"element4": {
"function2": {
"tags": "new"
},
"data5": "Test Data 5"
},
"structure1" : {
"substructure1": {
"element5": "Test Element 5",
"element6": {
"tags": "internal",
"data6": "Test Data 6"
}
}
}
}
and I want to get
{
"element1": "Test Element 1",
"element4": {
"function2": {
"tags": "new"
},
"data5": "Test Data 5"
},
"structure1" : {
"substructure1": {
"element5": "Test Element 5",
}
}
}
Not easy. Reliably finding elements which somewhere have a tags key whose value is either the string internal, or an array containing the string internal, requires a complex boolean expression like the one below.
Once found, deleting them can be done using the del built-in.
del(.[] | first(select(recurse
  | objects
  | has("tags") and (.tags
      | . == "internal" or (
          type == "array" and index("internal")
        )
    )
)))
I think I figured out how to also solve the more complex case. I am now running:
walk(if type == "object" and has("tags") and (.tags | . == "internal" or (type == "array" and index("internal"))) then del(.) else . end) | delpaths([paths as $path | select(getpath($path) == null) | $path])
This will remove all elements that contain 'internal' as a 'tags' value.
The following solution is written with a helper function for clarity. The helper function uses any for efficiency and is defined so as to add a dash of generality.
To understand the solution, it will be helpful to know about with_entries and the infix // operator, both of which are explained in the jq manual.
# Does the incoming JSON value contain an object which has a .tags
# value that is equal to $value or to an array containing $value ?
def hasTag($value):
any(.. | select(type=="object") | .tags;
. == $value or (type == "array" and index($value)));
Assuming the top-level JSON entity is a JSON object, we can now simply write:
with_entries( select( .value | hasTag("internal") | not) )
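Put together as a single invocation against the original sample, this might look like the following (the input filename is hypothetical):

jq 'def hasTag($value):
      any(.. | select(type=="object") | .tags;
          . == $value or (type == "array" and index($value)));
    with_entries( select( .value | hasTag("internal") | not ) )' input.json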

How to add a JSON object to multiple documents in an Elastic index using _update_by_query?

I need to update several documents in my Elasticsearch index, and I tried the following using the _update_by_query API.
What I need to do is to add a new field to several existing documents matching a certain condition. The new field is a nested JSON object, so after adding it the document source should look like:
_source: {
...existing fields,
"new_field" : {
"attrName1" : "value",
"attrName2" : "value",
}
}
I tried using the _update_by_query API to get this done, but so far I could only add string fields and arrays with it. When trying to add a JSON object with the following query, it gives me an error.
Query
curl -XPOST "http://xxx.xxx.xxx.xxx:pppp/my_index_name/_update_by_query" -d'
{
"query": {
"bool": {
"must": [
{
"term": {
"team.keyword": "search_phrase"
}
}
]
}
},
"script" : {
"inline":"ctx._source.field_name = {\"a\":\"b\"}"
}
}'
Error
{
"error": {
"root_cause": [
{
"type": "script_exception",
"reason": "compile error",
"script_stack": [
"ctx._source.field_name = {\"a\":\"b\"}",
" ^---- HERE"
],
"script": "ctx._source.field_name = {\"a\":\"b\"}",
"lang": "painless"
}
],
"type": "script_exception",
"reason": "compile error",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "invalid sequence of tokens near ['{'].",
"caused_by": {
"type": "no_viable_alt_exception",
"reason": null
}
},
"script_stack": [
"ctx._source.field_name = {\"a\":\"b\"}",
" ^---- HERE"
],
"script": "ctx._source.field_name = {\"a\":\"b\"}",
"lang": "painless"
},
"status": 500
}
So far I could only add strings as a new field. What is the correct way to achieve this?
Painless does not accept a JSON-style { ... } literal in that position; instead of direct assignment, pass the object through the script's params to achieve the same:
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "team.keyword": "search_phrase"
          }
        }
      ]
    }
  },
  "script": {
    "inline": "ctx._source.field_name = params.new_field",
    "params": {
      "new_field": {
        "a": "b"
      }
    }
  }
}
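For completeness, the same body wrapped in the curl call from the question might look like this (host, port, and index are placeholders as before; the Content-Type header is required on newer Elasticsearch versions):

curl -XPOST "http://xxx.xxx.xxx.xxx:pppp/my_index_name/_update_by_query" \
  -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "term": { "team.keyword": "search_phrase" } }
      ]
    }
  },
  "script": {
    "inline": "ctx._source.field_name = params.new_field",
    "params": {
      "new_field": { "a": "b" }
    }
  }
}'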

RestKit mapKeyOfNestedDictionaryToAttribute with array

I have a JSON response which looks like the sample below. I have added some // comments to highlight my question.
I have no idea how to build the RKObjectMapping for the dynamic keys ("FieldNameA", "FieldNameB" - these could be anything) in combination with an array as the value. Each item of the array is of type FieldResult.
I already learned how to handle varying key names here, but I don't get how I could properly map the array item type.
{
"result": {
"status": "FAILURE",
"details": {
"FieldNameA": [ // dynamic key name here, array of objects as a value
{
"details": {
"errorName": "InvalidField",
"errorNumber": 123
},
"status": "FAILURE"
}
],
"FieldNameB": [ // multiple values in this array, all of same type FieldResult
{
"details": {
"errorName": "UpdateRequired",
"errorNumber": 321
},
"status": "UPDATE_REQUIRED",
"suggestion": {
"update": "UpdatedInputValue"
}
},
{
"details": {
"errorName": "TooShort",
"errorNumber": 1
},
"status": "FAILURE"
}
]
}
}
}
Any help appreciated!