I'm unable to convert this JSON:
{
  "profiles": {
    "1": {
      "id": "1",
      "property1": "value1",
      "property2": "value2"
    },
    "2": {
      "id": "2",
      "property1": "value21",
      "property2": "value22"
    }
  }
}
To this format (desired output):
Id  Property1  Property2
1   Value1     Value2
2   Value21    Value22
I've attempted different approaches, but they all result in a single column containing all the data.
Can someone point me in the right direction?
Based on this example:
data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
pd.DataFrame.from_dict(data)
   col_1 col_2
0      3     a
1      2     b
2      1     c
3      0     d
I would suggest something like:
import pandas as pd

your_json = { ... }  # the parsed JSON from the question

ids = []
property1 = []
property2 = []
for key, value in your_json.items():      # key is 'profiles'
    for k, v in value.items():            # k is '1', '2', ...
        ids.append(v['id'])
        property1.append(v['property1'])
        property2.append(v['property2'])

data = {'id': ids, 'property1': property1, 'property2': property2}
tt = pd.DataFrame.from_dict(data)
print(tt)
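For what it's worth, pandas can also build this directly from the nested dict, no loops needed (a shorter sketch, reusing the your_json dict from above):
tt = pd.DataFrame.from_dict(your_json['profiles'], orient='index')
print(tt)
#   id property1 property2
# 1  1    value1    value2
# 2  2   value21   value22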
I have 2 arrays:
array1 = ['a','b']
array2 = [1,2]
I want to merge these 2 arrays and convert them to a map like below:
[
  {
    "firstparam": 'a',
    "secondparam": 1
  },
  {
    "firstparam": 'b',
    "secondparam": 2
  }
]
I am trying this code:
* def map1 = array1
* def map1 = karate.mapWithKey(map1, 'firstparam')
* def map2 = array2
* def map2 = karate.mapWithKey(map2, 'secondparam')
This code creates map1 and map2. Now I want to merge these 2 maps into the format above. How do I do that?
Basically, I want to send this map to a feature file that expects 2 parameters:
* def result = karate.call('*.feature', map)
'*.feature' expects 2 parameters per call, i.e. firstparam and secondparam.
Here you go:
* def array1 = ['a', 'b']
* def array2 = [1, 2]
* def array3 = array1.map((x, i) => ({ firstparam: x, secondparam: array2[i] }))
* match array3 == [{ firstparam: 'a', secondparam: 1 }, { firstparam: 'b', secondparam: 2 }]
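Since karate.call runs the called feature once per element when you pass it a JSON array, the merged list can be fed to it directly (a sketch, keeping the '*.feature' placeholder from the question; each iteration sees firstparam and secondparam as variables):
* def result = karate.call('*.feature', array3)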
I am quite new to PySpark. Here is what I am trying to do. Below is the table; the column types are ArrayType(DoubleType) and ArrayType(DecimalType):
A          B
[1,2]      [2,4]
[1,2,4]    [1,3,3]
What I want to do is treat A and B as np.arrays, then pass a function to do the calculation:
def func(row):
    a = row.A
    b = row.B
    res = some_function(a, b)  # placeholder for the actual calculation
    return res
What I am trying now is
res = a.rdd.map(func)
resDF = res.toDF(res)
resDF.show()
But I am receiving the following error. Could someone guide me a bit here? Thank you.
TypeError: schema should be StructType or list or None, but got: PythonRDD[167] at RDD at PythonRDD.scala:53
You can use pandas_udf.
Sample data:
df = spark.createDataFrame([
    ([1,2], [2,4]),
    ([1,2,4], [1,3,3]),
], 'a array<int>, b array<int>')
df.show()
+---------+---------+
|a |b |
+---------+---------+
|[1, 2] |[2, 4] |
|[1, 2, 4]|[1, 3, 3]|
+---------+---------+
Create a new column with pandas_udf:
import pyspark.sql.functions as F

@F.pandas_udf("array<int>")
def func(a, b):
    return a * b

df.withColumn('c', func('a', 'b')).show()
+---------+---------+----------+
| a| b| c|
+---------+---------+----------+
| [1, 2]| [2, 4]| [2, 8]|
|[1, 2, 4]|[1, 3, 3]|[1, 6, 12]|
+---------+---------+----------+
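If the calculation genuinely needs numpy (as in the original question), note that inside a pandas_udf each element of the series is already a numpy array, so a per-row function can be applied with an ordinary comprehension. A sketch, assuming some_function is a dot product and the result type is double:
import numpy as np
import pandas as pd
import pyspark.sql.functions as F

@F.pandas_udf("double")
def dot_udf(a: pd.Series, b: pd.Series) -> pd.Series:
    # each element of a and b arrives as a numpy array
    return pd.Series([float(np.dot(x, y)) for x, y in zip(a, b)])

df.withColumn('c', dot_udf('a', 'b')).show()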
I have a dataframe df:
d = {'col1': [1, 2,0,55,12,3], 'col3': ['A','A','A','B','B','B'] }
df = pd.DataFrame(data=d)
df
   col1 col3
0     1    A
1     2    A
2     0    A
3    55    B
4    12    B
5     3    B
and want to build JSON from it, so the result looks like this:
json_result = { 'A' : [1,2,0], 'B': [55,12,3] }
Basically, for each group in col3 I would like to get an array of its corresponding col1 values from the dataframe.
Aggregate list and then use Series.to_json:
print (df.groupby('col3')['col1'].agg(list).to_json())
{"A":[1,2,0],"B":[55,12,3]}
or if need dictionary use Series.to_dict:
print (df.groupby('col3')['col1'].agg(list).to_dict())
{'A': [1, 2, 0], 'B': [55, 12, 3]}
I want to merge 2 arrays in the following format.
array1 = [ "a" , "b" , "c"]
array2 = [ 1 , 2 , 3]
merged_array = [ {"a",1} , {"b",2} , {"c",3}]
The goal is to use this as the values of 2 columns and write it back to the Google Sheet.
Is my format correct, and if yes, how should I merge the arrays as described above?
EDIT:
I decided to use this:
var output = [];
for (var a = 0; a < array1.length; a++)
  output.push([array1[a], array2[a]]);
How would this compare to the map function, performance-wise?
array1 = [ "a" , "b" , "c"]
array2 = [ 1 , 2 , 3]
merged_array = []
for index, value in enumerate(array1): merged_array.append({value,array2[index]})
print (merged_array)
-> [{'a', 1}, {'b', 2}, {'c', 3}]
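Note that {value, array2[index]} builds a Python set, which is unordered, so a pair may come back as {1, 'a'}. If the pairs are destined for two spreadsheet columns, a list of two-element lists (as in the Apps Script answer below) is a safer shape. A minimal sketch of the same merge:
merged_array = [[v, array2[i]] for i, v in enumerate(array1)]
print(merged_array)
# -> [['a', 1], ['b', 2], ['c', 3]]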
Merging two arrays into an array of arrays
function myFunk() {
let array1 = ["a", "b", "c"];
let array2 = [1, 2, 3];
let a = array1.map((e,i) => {return [e,array2[i]];})
Logger.log(JSON.stringify(a));
}
Execution log
4:17:09 PM Notice Execution started
4:17:08 PM Info [["a",1],["b",2],["c",3]]
Array.map()
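Since the stated goal is writing the pairs back to the spreadsheet, the array-of-arrays shape above drops straight into Range.setValues (a sketch, assuming the active sheet and a destination starting at A1):
function writePairs() {
  let array1 = ["a", "b", "c"];
  let array2 = [1, 2, 3];
  let a = array1.map((e, i) => [e, array2[i]]);  // one row per pair, two columns
  SpreadsheetApp.getActiveSheet().getRange(1, 1, a.length, 2).setValues(a);
}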
I have a JSON data source that is a list of objects. Some of the object properties are themselves lists. I want to turn the whole thing into a data frame, preserving the lists as data frame values.
Example JSON data:
[{
  "id": "A",
  "p1": [1, 2, 3],
  "p2": "foo"
},{
  "id": "B",
  "p1": [4, 5, 6],
  "p2": "bar"
}]
Desired data frame:
  id  p2      p1
1  A foo 1, 2, 3
2  B bar 4, 5, 6
Failed attempt 1
I have found this nicely straightforward way of parsing my JSON:
unlisted_data <- lapply(fromJSON(json_str), function(x){unlist(x)})
data.frame(do.call("rbind", unlisted_data))
However, the unlisting process spreads my repeated value across multiple columns:
  id p11 p12 p13  p2
1  A   1   2   3 foo
2  B   4   5   6 bar
I expected that calling unlist with the recursive = FALSE option would take care of this, but it doesn't.
Failed attempt 2
I noticed that I can almost do this with the I function:
> data.frame(I(parsed_json[[1]]))
   parsed_json..1..
id                A
p1          1, 2, 3
p2              foo
But the rows and columns are reversed. Transposing the result mangles the repeated data:
> t(data.frame(I(parsed_json[[1]])))
                 id  p1        p2
parsed_json..1.. "A" Numeric,3 "foo"
The jsonlite package can handle this just fine:
library(jsonlite)
fromJSON(txt)
#  id      p1  p2
#1  A 1, 2, 3 foo
#2  B 4, 5, 6 bar
fromJSON(txt)$p1
#[[1]]
#[1] 1 2 3
#
#[[2]]
#[1] 4 5 6
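For completeness, a minimal reproduction (assuming txt holds the JSON string from the question):
library(jsonlite)
txt <- '[{"id":"A","p1":[1,2,3],"p2":"foo"},{"id":"B","p1":[4,5,6],"p2":"bar"}]'
df <- fromJSON(txt)
df$p1  # remains a list column, one numeric vector per row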