How to remove duplicates from a JSON defaultdict? - json

(Re-post with accurate data sample)
I have a json dictionary where each value in turn is a defaultdict as follows:
"Parent_Key_A": [{"a": 1.0, "b": 2.0}, {"a": 5.1, "c": 10}, {"b": 20.3, "a": 1.0}]
I am trying to remove both duplicate keys and values so that each element of the json has unique values. So for the above example, I am looking for output something like this:
"Parent_Key_A": {"a":[1.0,5.1], "b":[2.0,20.3], "c":[10]}
Then I need to write this output to a JSON file. I tried using a set to handle duplicates, but a set is not JSON serializable.
Any suggestions on how to handle this?

A solution using itertools.chain.from_iterable() and itertools.groupby():
import itertools, json
input_d = { "Parent_Key_A": [{"a": 1.0, "b": 2.0}, {"a": 5.1, "c": 10}, {"b": 20.3, "a": 1.0}] }
items = itertools.chain.from_iterable(list(d.items()) for d in input_d["Parent_Key_A"])
# dict comprehension: group the sorted (key, value) pairs by key, then dedupe and sort each group's values
input_d["Parent_Key_A"] = { k:[i[1] for i in sorted(set(g))]
for k,g in itertools.groupby(sorted(items), key=lambda x: x[0]) }
print(input_d)
The output:
{'Parent_Key_A': {'a': [1.0, 5.1], 'b': [2.0, 20.3], 'c': [10]}}
Printing to json file:
with open('output.json', 'w') as f:  # the context manager closes the file after writing
    json.dump(input_d, f, indent=4)
output.json contents (key order may vary on Python versions before 3.7, as in this run):
{
"Parent_Key_A": {
"a": [
1.0,
5.1
],
"c": [
10
],
"b": [
2.0,
20.3
]
}
}
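A variant that sidesteps the set serialization problem mentioned in the question: collect values into sets while merging, then convert each set to a sorted list just before dumping. This is a minimal sketch rather than the method used in the answer above. The helper name merge_unique is made up for illustration, and it assumes the values under each key are mutually comparable so they can be sorted.

```python
import json
from collections import defaultdict

def merge_unique(dicts):
    """Merge a list of dicts into {key: sorted list of unique values}."""
    merged = defaultdict(set)
    for d in dicts:
        for key, value in d.items():
            merged[key].add(value)
    # Sets are not JSON serializable, so convert each one to a sorted list.
    return {key: sorted(values) for key, values in merged.items()}

input_d = {"Parent_Key_A": [{"a": 1.0, "b": 2.0}, {"a": 5.1, "c": 10}, {"b": 20.3, "a": 1.0}]}
result = {parent: merge_unique(dicts) for parent, dicts in input_d.items()}
print(json.dumps(result))
```

Because the sets are converted before serialization, json.dumps (or json.dump to a file) works without a custom encoder.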

Related

How do I collect values by keys from different levels in JQ?

Let's suppose I have a JSON like this:
[
{
"a": 1,
"l": [
{"b": "z"},
{"b": "x"}
]
},
{
"a": 2,
"l": [
{"b": "c"}
]
}
]
I want to collect the data from all embedded arrays and to get an array of all objects with "a" and "b" values. For the JSON above the result should be:
[
{"a": 1, "b": "z"},
{"a": 1, "b": "x"},
{"a": 2, "b": "c"}
]
What JQ expression do I need to try to solve the issue?
Use .l[] inside the expression to iterate over each element of the embedded array; adding {a} to each element produces one merged object per element. So, use this:
map({a} + .l[])
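For readers more familiar with Python than jq, the same flattening can be sketched as a nested comprehension. This is only an illustration of what the expression does, using the sample data from the question; the variable names are arbitrary.

```python
data = [
    {"a": 1, "l": [{"b": "z"}, {"b": "x"}]},
    {"a": 2, "l": [{"b": "c"}]},
]

# For every outer object, emit one merged object per element of its "l" array,
# which mirrors {a} + .l[] applied to each element by map(...).
flattened = [{"a": outer["a"], **inner} for outer in data for inner in outer["l"]]
print(flattened)
```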

How can I convert to Json Object to Json Array in Karate?

I want to convert a JSON object to a JSON array in Karate so that I can use 'match each'.
When I use 'match each' with a JSON object, I get the error: 'match each' failed, not a json array.
Here is my JSON object:
{
{ "a": "q"
"b": "w",
"c": "t"
},
{ "a": "x"
"b": "y",
"c": "z"
}
}
And here is what I need:
[
{
{ "a": "q"
"b": "w",
"c": "t"
},
{ "a": "x"
"b": "y",
"c": "z"
}
}
]
Try this approach, which wraps the object in an array using embedded expressions: https://github.com/intuit/karate#embedded-expressions
* def foo = { a: 1 }
* def list = [ '#(foo)' ]
* match each list == foo

How to apply left join on two JSON objects in groovy script

I have two JSON files:
first.json
[
{"a":"1", "b": "tmp"},
{"a":"2", "b": "tmp"},
{"a":"3", "b": "tmp"}
]
second.json
[
{"c":"1", "d": "tmp"},
{"c":"2", "d": "tmp"},
{"c":"4", "d": "tmp"}
]
output.json
[
{"a":"1", "b": "tmp", "c": "1" , "d": "tmp"},
{"a":"2", "b": "tmp", "c": "2" , "d": "tmp"},
{"a":"3", "b": "tmp", "c": "" , "d": ""}
]
I want to left-join first.json and second.json on field "a" of first.json and field "c" of second.json to produce output.json. How can I do this in a Groovy script?
NOTE: I would like to achieve this in a single line if possible.
You would need to do something like this:
def firstJson = '''[
{"a":"1", "b": "tmp"},
{"a":"2", "b": "tmp"},
{"a":"3", "b": "tmp"}
]'''
def secondJson = '''[
{"c":"1", "d": "tmp"},
{"c":"2", "d": "tmp"},
{"c":"4", "d": "tmp"}
]'''
import groovy.json.JsonSlurper
import groovy.json.JsonOutput
def slurpy = new JsonSlurper()
def first = slurpy.parseText(firstJson)
def second = slurpy.parseText(secondJson)
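// For each left row, merge in the matching right row, or a blank row (empty strings for every right-hand key) if there is no match
def result = first.collect { f ->
f + (second.find { it.c == f.a } ?: second[0].keySet().collectEntries { [it, ''] })
}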
println JsonOutput.toJson(result)
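The same left-join logic, sketched in Python for comparison. This is an illustrative equivalent, not part of the Groovy answer. It assumes the join keys in the second list are unique (the lookup dict keeps the last row for a repeated key, whereas find above returns the first), and the names by_c and blank are made up for the example.

```python
import json

first = [{"a": "1", "b": "tmp"}, {"a": "2", "b": "tmp"}, {"a": "3", "b": "tmp"}]
second = [{"c": "1", "d": "tmp"}, {"c": "2", "d": "tmp"}, {"c": "4", "d": "tmp"}]

# Index the right-hand rows by their join key for constant-time lookup.
by_c = {row["c"]: row for row in second}
# Filler row used when a left-hand row has no match: every right-hand key maps to "".
blank = {key: "" for key in second[0]}

# Left join: keep every row of `first`, merged with its match or the filler row.
joined = [{**row, **by_c.get(row["a"], blank)} for row in first]
print(json.dumps(joined))
```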

KarateException Missing Property in path - JSON

I was trying to match a particular variable from the response, as shown below, but I am getting the error KarateException Missing Property in path $['Odata']. How can I modify this so that the error does not occur?
Feature:
And match response.#odata.context.a.b contains '<b>'
Examples:
|b|
|b1 |
|b2 |
Response is
{
"#odata.context": "$metadata#Accounts",
"a": [
{
"c": 145729,
"b": "b1",
"d": "ON",
},
{
"c": 145729,
"b": "b2",
"d": "ON",
}
]
}
I think you are confused about the structure of your JSON. Also note that when a JSON key contains special characters, you need to change the way you refer to it in path expressions. Try pasting the following into a new Scenario to see it work:
* def response =
"""
{
"#odata.context": "$metadata#Accounts",
"a": [
{
"c": 145729,
"b": "b1",
"d": "ON",
},
{
"c": 145729,
"b": "b2",
"d": "ON",
}
]
}
"""
* match response['#odata.context'] == '$metadata#Accounts'
* match response.a[0].b == 'b1'
* match response.a[1].b == 'b2'
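The structural point can also be seen outside Karate. In the Python sketch below (an illustration only, using the response from the question), "#odata.context" is a single top-level key that has to be addressed with bracket notation, and "a" is its sibling rather than something nested under it.

```python
response = {
    "#odata.context": "$metadata#Accounts",
    "a": [
        {"c": 145729, "b": "b1", "d": "ON"},
        {"c": 145729, "b": "b2", "d": "ON"},
    ],
}

# The key contains "#" and ".", so it is looked up as one string, not as a dotted path.
context = response["#odata.context"]

# "a" sits at the top level next to "#odata.context"; collect "b" from each element.
b_values = [item["b"] for item in response["a"]]
print(context, b_values)
```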

Julia | DataFrame conversion to JSON

I have a dataframe in Julia like df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"]). I have to convert it into a JSON like
{
"nodes": [
{
"A": "1",
"B": "M"
},
{
"A": "2",
"B": "F"
},
{
"A": "3",
"B": "F"
},
{
"A": "4",
"B": "M"
}
]
}
Please help me in this.
There isn't a method in DataFrames to do this. A GitHub issue offers the following snippet, using JSON.jl, as a way to write JSON:
using JSON
using DataFrames
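# Build one Dict per row (missing values become JSON null), then serialize the array of rows.
# Updated for current DataFrames.jl: ismissing replaces isna, and cells are indexed as df[row, col].
function df2json(df::DataFrame)
cols = names(df)
jsonarray = [Dict(string(col) => (ismissing(df[i, col]) ? nothing : df[i, col])
for col in cols)
for i in 1:nrow(df)]
return JSON.json(jsonarray)
end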
function writejson(path::String,df::DataFrame)
open(path,"w") do f
write(f,df2json(df))
end
end
The JSONTables package provides JSON conversion to and from Tables.jl-compatible sources such as DataFrame.
using DataFrames
using JSONTables
df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
jsonstr = objecttable(df)
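The shape asked for in the question (an array of row objects under a "nodes" key, with values as strings) comes down to transposing columns into rows. A minimal Python sketch of that transformation follows, assuming the table is held as a plain dict of columns. It illustrates the row-wise conversion only and does not use either Julia package.

```python
import json

# Column-oriented data matching the DataFrame in the question.
columns = {"A": [1, 2, 3, 4], "B": ["M", "F", "F", "M"]}

# Transpose columns into rows; values are stringified to match the desired output.
rows = [
    {name: str(value) for name, value in zip(columns, values)}
    for values in zip(*columns.values())
]

payload = json.dumps({"nodes": rows}, indent=2)
print(payload)
```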