Pandas Dataframe row as a formatted JSON output

I have a data frame that I am trying to group by customer and print as formatted JSON, but to_json is not giving me the format I need. I also need to create a separate JSON file for each customer. I don't think this kind of custom JSON formatting is possible with the generic pandas methods, so what direction should I be looking in?
I tried grouping by customer_id, first_name and last_name, setting them as the index, and using orient='index', but that didn't really work out.
import pandas as pd
data = [{'customer_id': 1, 'first_name':'John', 'last_name':'Doe', 'amount':100, 'sub_amount':50,'total': 150,'product':'tool box'},
{'customer_id': 1, 'first_name':'John', 'last_name':'Doe', 'amount':50, 'sub_amount':50,'total': 100,'product':'light'},
{'customer_id': 2, 'first_name':'Jane', 'last_name':'Doe', 'amount':200, 'sub_amount':50,'total': 250,'product':'iron box'},
{'customer_id': 2, 'first_name':'Jane', 'last_name':'Doe', 'amount':50, 'sub_amount':50,'total': 100,'product':'led'}
]
df = pd.DataFrame(data)
df
customer_id first_name last_name amount sub_amount total product
0 1 John Doe 100 50 150 tool box
1 1 John Doe 50 50 100 light
2 2 Jane Doe 200 50 250 iron box
3 2 Jane Doe 50 50 100 led
Expected output:
{
    "first_name": "John",
    "last_name": "Doe",
    "Product_Details": {
        "tool box": {
            "total": 150,
            "amount": 100
        },
        "light": {
            "total": 100,
            "amount": 50
        }
    }
}

import json

clients = {}
for index, row in df.iterrows():
    # Create the customer entry the first time this customer_id is seen
    clients.setdefault(row['customer_id'], {'first_name': row['first_name'],
                                            'last_name': row['last_name']})
    # File this row's product under the customer's Product_Details
    clients[row['customer_id']].setdefault('Product_Details', {})[row['product']] = \
        {'total': row['total'], 'amount': row['amount']}
print(json.dumps(clients[1], indent=4))
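The question also asks for a separate JSON file per customer, which the loop above does not cover. A minimal sketch of that step follows, using only the standard library. It assumes the rows are available as a list of dicts (with pandas, df.to_dict(orient="records") would produce one); the helper names and the customer_<id>.json file naming are hypothetical, not something from the original post.

```python
import json
import tempfile
from pathlib import Path

# Same rows as in the question, as plain dicts (assumed input shape).
records = [
    {"customer_id": 1, "first_name": "John", "last_name": "Doe",
     "amount": 100, "sub_amount": 50, "total": 150, "product": "tool box"},
    {"customer_id": 1, "first_name": "John", "last_name": "Doe",
     "amount": 50, "sub_amount": 50, "total": 100, "product": "light"},
    {"customer_id": 2, "first_name": "Jane", "last_name": "Doe",
     "amount": 200, "sub_amount": 50, "total": 250, "product": "iron box"},
    {"customer_id": 2, "first_name": "Jane", "last_name": "Doe",
     "amount": 50, "sub_amount": 50, "total": 100, "product": "led"},
]


def build_clients(rows):
    """Nest the flat rows into one dict per customer_id."""
    clients = {}
    for row in rows:
        client = clients.setdefault(
            row["customer_id"],
            {"first_name": row["first_name"],
             "last_name": row["last_name"],
             "Product_Details": {}},
        )
        client["Product_Details"][row["product"]] = {
            "total": row["total"],
            "amount": row["amount"],
        }
    return clients


def write_client_files(clients, out_dir):
    """Write one customer_<id>.json file (hypothetical naming) per customer."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for customer_id, payload in clients.items():
        path = out_dir / f"customer_{customer_id}.json"
        path.write_text(json.dumps(payload, indent=4), encoding="utf-8")
        paths.append(path)
    return paths


clients = build_clients(records)
# A temporary directory keeps the sketch side-effect free; use any folder you like.
written = write_client_files(clients, tempfile.mkdtemp())
print(json.dumps(clients[1], indent=4))
```

Building the nested dicts first and writing them afterwards keeps the formatting logic separate from the file handling, so the same build_clients result can be printed, written, or sent elsewhere.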

Related

How to create a CSV File that will look Like This JSON File

I am basically wanting to update multiple scholars for an NFT game (axie infinity). It requires a JSON file that looks like this:
{
"name": "Scholar 1",
"ronin": "ronin:<account_s1_address>",
"splits": [
{
"persona": "Manager",
"percentage": 44,
"ronin": "ronin:<manager_address>"
},
{
"persona": "Scholar",
"percentage": 40,
"ronin": "ronin:<scholar_1_address>"
},
{
"persona": "Other Person",
"percentage": 6,
"ronin": "ronin:<other_person_address>"
},
{
"persona": "Trainer",
"percentage": 10,
"ronin": "ronin:<trainer_address>"
}
]
}
But since there are multiple scholars/players, I wanted to know if there is any way to lay out a CSV file so that, when I convert or import it with a JSON tool, the result looks like the JSON above.
Your help is much appreciated. Thank you!
PS:
The first lines:
"name": "Scholar 1",
"ronin": "ronin:<account_s1_address>",
"splits":
Would need to be repeated since again there are multiple scholars, i.e. Scholar 1, Scholar 2, Scholar 3...

A CSV file is column-based. If Axie Infinity requires a JSON file, you can create the CSV file in Excel or Google Sheets and then convert it to JSON.
There is a similar answer on converting CSV to JSON.

Starting from a CSV that has this structure:
name      | ronin                      | id_persona | persona      | percentage | split_ronin
----------+----------------------------+------------+--------------+------------+-----------------------------
Scholar 1 | ronin:<account_s1_address> | 1          | Manager      | 44         | ronin:<manager_address>
Scholar 1 | ronin:<account_s1_address> | 2          | Scholar      | 40         | ronin:<scholar_1_address>
Scholar 1 | ronin:<account_s1_address> | 3          | Other Person | 6          | ronin:<other_person_address>
Scholar 1 | ronin:<account_s1_address> | 4          | Trainer      | 10         | ronin:<trainer_address>
you can run this Miller command:
mlr --c2j reshape -r "^(p|s)" -o k,v then \
put '$k="splits".".".${id_persona}.".".$k' then \
cut -x -f id_persona then \
reshape -s k,v out.csv
to get:
[
{
"name": "Scholar 1",
"ronin": "ronin:<account_s1_address>",
"splits": [
{
"persona": "Manager",
"percentage": 44,
"split_ronin": "ronin:<manager_address>"
},
{
"persona": "Scholar",
"percentage": 40,
"split_ronin": "ronin:<scholar_1_address>"
},
{
"persona": "Other Person",
"percentage": 6,
"split_ronin": "ronin:<other_person_address>"
},
{
"persona": "Trainer",
"percentage": 10,
"split_ronin": "ronin:<trainer_address>"
}
]
}
]
Some notes:
reshape -r "^(p|s)" -o k,v transforms the input from wide to long;
put '$k="splits".".".${id_persona}.".".$k' builds the values that will be used as field names (splits.1.persona, splits.1.percentage, splits.1.split_ronin, splits.2.persona, splits.2.percentage, ...);
cut -x -f id_persona removes the id_persona field;
reshape -s k,v transforms everything back from long to wide.
The real goal is to build, starting from that input, this kind of CSV:
+-----------+----------------------------+------------------+---------------------+-------------------------+------------------+---------------------+---------------------------+------------------+---------------------+------------------------------+------------------+---------------------+-------------------------+
| name | ronin | splits.1.persona | splits.1.percentage | splits.1.split_ronin | splits.2.persona | splits.2.percentage | splits.2.split_ronin | splits.3.persona | splits.3.percentage | splits.3.split_ronin | splits.4.persona | splits.4.percentage | splits.4.split_ronin |
+-----------+----------------------------+------------------+---------------------+-------------------------+------------------+---------------------+---------------------------+------------------+---------------------+------------------------------+------------------+---------------------+-------------------------+
| Scholar 1 | ronin:<account_s1_address> | Manager | 44 | ronin:<manager_address> | Scholar | 40 | ronin:<scholar_1_address> | Other Person | 6 | ronin:<other_person_address> | Trainer | 10 | ronin:<trainer_address> |
+-----------+----------------------------+------------------+---------------------+-------------------------+------------------+---------------------+---------------------------+------------------+---------------------+------------------------------+------------------+---------------------+-------------------------+
and then use it to create the final JSON output.
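If Miller is not available, the same nesting can be sketched in Python with the standard csv and json modules. This is only an illustration under a few assumptions: the CSV uses the column names from the table in this answer, the rows for each scholar already appear in id_persona order, and the function name csv_to_scholars is hypothetical. Unlike the Miller output, it renames split_ronin to ronin inside each split, to match the JSON shown in the question.

```python
import csv
import io
import json

# Inline copy of the CSV from the answer; in practice, read it from a file instead.
CSV_TEXT = """name,ronin,id_persona,persona,percentage,split_ronin
Scholar 1,ronin:<account_s1_address>,1,Manager,44,ronin:<manager_address>
Scholar 1,ronin:<account_s1_address>,2,Scholar,40,ronin:<scholar_1_address>
Scholar 1,ronin:<account_s1_address>,3,Other Person,6,ronin:<other_person_address>
Scholar 1,ronin:<account_s1_address>,4,Trainer,10,ronin:<trainer_address>
"""


def csv_to_scholars(text):
    """Group the flat CSV rows into one object per scholar with a 'splits' list."""
    scholars = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["name"], row["ronin"])
        scholar = scholars.setdefault(
            key, {"name": row["name"], "ronin": row["ronin"], "splits": []}
        )
        scholar["splits"].append({
            "persona": row["persona"],
            "percentage": int(row["percentage"]),  # CSV values are strings
            "ronin": row["split_ronin"],
        })
    return list(scholars.values())


scholars = csv_to_scholars(CSV_TEXT)
print(json.dumps(scholars, indent=2))
```

Adding Scholar 2, Scholar 3 and so on only means appending more rows with a different name and ronin value; each distinct pair becomes its own object in the output list.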

Sort and Select Top 5 JSON values

I have a two-fold issue and am looking for clues as to how to approach it.
I have a JSON file that is formatted like this:
{
"code": 2000,
"data": {
"1": {
"attribute1": 40,
"attribute2": 1.4,
"attribute3": 5.2,
"attribute4": 124,
"attribute5": "65.53%"
},
"94": {
"attribute1": 10,
"attribute2": 4.4,
"attribute3": 2.2,
"attribute4": 12,
"attribute5": "45.53%"
},
"96": {
"attribute1": 17,
"attribute2": 9.64,
"attribute3": 5.2,
"attribute4": 62,
"attribute5": "51.53%"
}
},
"message": "SUCCESS"
}
My goals are to:
Sort the data by any of the attributes.
Grab the top 5 entries (there are around 100 of them), depending on how they are sorted.
Output the data in a table, e.g.:
These are sorted by: attribute5
---
attribute1 | attribute2 | attribute3 | attribute4 | attribute5
40 |1.4 |5.2|124|65.53%
17 |9.64|5.2|62 |51.53%
10 |4.4 |2.2|12 |45.53%
*also, attribute5 above is a string value
Admittedly, my knowledge here is very limited.
I attempted to mimic the method used here:
python sort list of json by value
I managed to open the file and I can extract the key values from a sample row:
import json

jsonfile = "path-to-my-file.json"
with open(jsonfile) as j:
    data = json.load(j)

k = data["data"]["1"].keys()
print(k)
total = data["data"]
for row in total:
    v = data["data"][str(row)].values()
    print(v)
this outputs:
dict_keys(['attribute1', 'attribute2', 'attribute3', 'attribute4', 'attribute5'])
dict_values([1, 40, 1.4, 5.2, 124, '65.53%'])
dict_values([94, 10, 4.4, 2.2, 12, '45.53%'])
dict_values([96, 17, 9.64, 5.2, 62, '51.53%'])
Any point in the right direction would be GREATLY appreciated.
Thanks!

If you don't mind using pandas you could do it like this
import pandas as pd
rows = [v for k,v in data["data"].items()]
df = pd.DataFrame(rows)
# then to get the top 5 values by attribute can choose either ascending
# or descending with the ascending keyword and head prints the top 5 rows
df.sort_values('attribute1', ascending=True).head()
This lets you sort by any attribute you need at any time and print the result as a table.
It produces output like this, with the row order depending on what you sort by:
attribute1 attribute2 attribute3 attribute4 attribute5
0 40 1.40 5.2 124 65.53%
1 10 4.40 2.2 12 45.53%
2 17 9.64 5.2 62 51.53%

I'll leave this answer here in case you don't want to use pandas, but the answer from @MatthewBarlowe is way less complicated and I recommend that.
For sorting by a specific attribute, this should work:
import json

SORT_BY = "attribute4"
with open("test.json") as j:
    data = json.load(j)
items = data["data"]
sorted_keys = list(sorted(items, key=lambda key: items[key][SORT_BY], reverse=True))
Now, sorted_keys is a list of the keys in order of the attribute they were sorted by.
Then, to print this as a table, I used the tabulate library. The final code for me looked like this:
from tabulate import tabulate
import json

SORT_BY = "attribute4"
with open("test.json") as j:
    data = json.load(j)
items = data["data"]
sorted_keys = list(sorted(items, key=lambda key: items[key][SORT_BY], reverse=True))
print(f"\nSorted by: {SORT_BY}")
print(
    tabulate(
        [
            [sorted_keys[i], *items[sorted_keys[i]].values()]
            for i, _ in enumerate(items)
        ],
        headers=["Column", *items["1"].keys()],
    )
)
When sorting by 'attribute5', this outputs:
Sorted by: attribute5
Column attribute1 attribute2 attribute3 attribute4 attribute5
-------- ------------ ------------ ------------ ------------ ------------
1 40 1.4 5.2 124 65.53%
96 17 9.64 5.2 62 51.53%
94 10 4.4 2.2 12 45.53%
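Two details from the question are worth spelling out: keeping only the top 5 entries, and the fact that attribute5 is a string. Compared as strings, "9.5%" sorts above "10.0%", so a percentage column is safer to sort after converting it to a number. The following standard-library sketch shows one way to do both; the helper names sort_value, top_n and format_table are hypothetical, and the sample data is the three entries from the question.

```python
import json

# The sample from the question (with valid commas), inlined so the sketch runs as-is.
RAW = """
{"code": 2000,
 "data": {
   "1":  {"attribute1": 40, "attribute2": 1.4,  "attribute3": 5.2, "attribute4": 124, "attribute5": "65.53%"},
   "94": {"attribute1": 10, "attribute2": 4.4,  "attribute3": 2.2, "attribute4": 12,  "attribute5": "45.53%"},
   "96": {"attribute1": 17, "attribute2": 9.64, "attribute3": 5.2, "attribute4": 62,  "attribute5": "51.53%"}
 },
 "message": "SUCCESS"}
"""


def sort_value(value):
    """Turn '65.53%' into 65.53 so percentages compare numerically."""
    if isinstance(value, str) and value.endswith("%"):
        return float(value.rstrip("%"))
    return value


def top_n(items, sort_by, n=5):
    """Return up to n (key, attributes) pairs, largest sort_by value first."""
    ranked = sorted(items.items(),
                    key=lambda kv: sort_value(kv[1][sort_by]),
                    reverse=True)
    return ranked[:n]


def format_table(rows, columns):
    """Render the rows as simple pipe-separated text lines."""
    lines = [" | ".join(columns)]
    for _key, attrs in rows:
        lines.append(" | ".join(str(attrs[c]) for c in columns))
    return "\n".join(lines)


data = json.loads(RAW)
columns = ["attribute1", "attribute2", "attribute3", "attribute4", "attribute5"]
top = top_n(data["data"], "attribute5")
print("These are sorted by: attribute5")
print(format_table(top, columns))
```

With only three sample entries the slice returns all of them; with the roughly 100 entries mentioned in the question it would keep the first five.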

Karate API framework how to match the response values with the table columns?

I have below API response sample
{
"items": [
{
"id":11,
"name": "SMITH",
"prefix": "SAM",
"code": "SSO"
},
{
"id":10,
"name": "James",
"prefix": "JAM",
"code": "BBC"
}
]
}
As per the above response, my test expects that whenever I hit the API, the item with id 11 is SMITH and the item with id 10 is James.
So I thought I would store this in a table and assert against the actual response:
* table person
| id | name |
| 11 | SMITH |
| 10 | James |
| 9 | RIO |
Now how would I match these one by one? That is, take the first id and name from the API response and match them against the first id and name in the table, and so on.
Please share any convenient way of doing this in Karate.

There are a few possible ways, here is one:
* def lookup = { 11: 'SMITH', 10: 'James' }
* def items =
"""
[
{
"id":11,
"name":"SMITH",
"prefix":"SAM",
"code":"SSO"
},
{
"id":10,
"name":"James",
"prefix":"JAM",
"code":"BBC"
}
]
"""
* match each items contains { name: "#(lookup[_$.id+''])" }
And you already know how to use table instead of JSON.
Please read the docs and other Stack Overflow answers to get more ideas.

Python csv conversion to specific nested json

I have a dataframe (csv file loaded into Pandas) as below :
col1 col2 col3 col4 col5 name amount
1 USA 4000 Air 60 Education 200
1 USA 4000 Air 60 Car 100
1 USA 4000 Air 60 Restaurant 100
2 UK 5000 Cash 50 Government 125
2 UK 5000 Cash 50 Restaurant 135
Now I need to convert it into a nested JSON format, with one record per group (col1, col2, col3 and col4 are the grouping columns).
The expected JSON output is below:
{
    "col5": 60,
    "col4": [
        {
            "name": "Air"
        }
    ],
    "expenses": [
        {
            "amount": 200,
            "name": "Education"
        },
        {
            "amount": 100,
            "name": "Car"
        },
        {
            "amount": 100,
            "name": "Restaurant"
        }
    ],
    "col1": 1,
    "col2": "USA",
    "col3": "4000"
}
I understand this is going to be somewhat complex code, but can someone help?
Thanks in advance!

I believe you need:
For a dictionary:
d = (df.groupby(['col1','col2','col3','col4','col5'])
       .apply(lambda x: dict(zip(x['name'], x['amount'])))
       .reset_index(name='expenses')
       .to_dict(orient='records')
     )
print(d)
For JSON:
j = (df.groupby(['col1','col2','col3','col4','col5'])
       .apply(lambda x: dict(zip(x['name'], x['amount'])))
       .reset_index(name='expenses')
       .to_json(orient='records')
     )
print(j)
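Note that the groupby above stores expenses as a single name-to-amount mapping, whereas the expected output in the question uses a list of {amount, name} objects and wraps col4 in a list. A rough standard-library sketch of that exact shape follows. It assumes the rows are plain dicts (for example from df.to_dict(orient="records")), keeps col3 numeric rather than the quoted "4000" shown in the question, and uses the hypothetical helper name nest_expenses.

```python
import json

# The rows from the question as plain dicts (assumed input shape).
rows = [
    {"col1": 1, "col2": "USA", "col3": 4000, "col4": "Air", "col5": 60, "name": "Education", "amount": 200},
    {"col1": 1, "col2": "USA", "col3": 4000, "col4": "Air", "col5": 60, "name": "Car", "amount": 100},
    {"col1": 1, "col2": "USA", "col3": 4000, "col4": "Air", "col5": 60, "name": "Restaurant", "amount": 100},
    {"col1": 2, "col2": "UK", "col3": 5000, "col4": "Cash", "col5": 50, "name": "Government", "amount": 125},
    {"col1": 2, "col2": "UK", "col3": 5000, "col4": "Cash", "col5": 50, "name": "Restaurant", "amount": 135},
]

GROUP_KEYS = ("col1", "col2", "col3", "col4", "col5")


def nest_expenses(rows):
    """Build one record per distinct GROUP_KEYS combination, in first-seen order."""
    grouped = {}
    for row in rows:
        key = tuple(row[k] for k in GROUP_KEYS)
        record = grouped.setdefault(key, {
            "col5": row["col5"],
            "col4": [{"name": row["col4"]}],
            "expenses": [],
            "col1": row["col1"],
            "col2": row["col2"],
            "col3": row["col3"],
        })
        record["expenses"].append({"amount": row["amount"], "name": row["name"]})
    return list(grouped.values())


nested = nest_expenses(rows)
print(json.dumps(nested[0], indent=4))
```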

How to format the structure of data returned from SQL query

I got this data from my SQL query:
addon_id | addon_name | addon_category_id
---------+------------+------------------
1 | abc | 10
2 | def | 20
3 | ghi | 10
Now I have to send this in the following JSON format and group the addons based on addon_category_id:
[
{
addon_category_id: 10,
addons:
[
{
addon_id: 1,
addon_name: abc
},
{
addon_id: 3,
addon_name: ghi
}
]
},
{
addon_category_id: 20,
addons:
[
{
addon_id: 2,
addon_name: def
}
]
}
]
How can I do this? What is the logic behind that? Do I have to do it programmatically using a for loop or is there any other way?

As mentioned in the comments, it depends on the programming language you use. In SQL Server 2016 you can use FOR JSON AUTO:
SELECT a.addon_category_id, addons.addon_id, addons.addon_name
FROM addon a
JOIN addon addons
ON a.addon_category_id = addons.addon_category_id
FOR JSON AUTO;
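If the grouping has to happen in application code instead, a short loop is enough. The sketch below uses Python purely as an illustration, since the question does not say which language is in use. It assumes the query result arrives as (addon_id, addon_name, addon_category_id) tuples, and group_addons is a hypothetical helper name.

```python
import json

# Rows as they might come back from the query in the question (assumed tuple order).
rows = [
    (1, "abc", 10),
    (2, "def", 20),
    (3, "ghi", 10),
]


def group_addons(rows):
    """Collect addons under their addon_category_id, keeping first-seen order."""
    grouped = {}
    for addon_id, addon_name, category_id in rows:
        grouped.setdefault(category_id, []).append(
            {"addon_id": addon_id, "addon_name": addon_name}
        )
    return [
        {"addon_category_id": category_id, "addons": addons}
        for category_id, addons in grouped.items()
    ]


result = group_addons(rows)
print(json.dumps(result, indent=2))
```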
