From DataFrame to a particular JSON format

I would like to convert this DataFrame (as dict):
{'bisac_code1': {2: {'BIO016000': 0.8,
'CKB041000': 0.30000000000000004}},
'bisac_code2': {2: {'CKB049000': 0.3,
'BIO028000': 0.8}},
'bisac_code3': {2: {'SPO058000': 0.8,
'CKB030000': 0.3}}}
to this JSON format:
"bisac_code_1": { [
{"code": …
"weight": ….
},
{"code": …
"weight": ….
}
]
},
"bisac_code_2": {
[
{"code": …
"weight": ….
},
{"code": …
"weight": ….
}
]
},
"bisac_code_3": {
[
{"code": …
"weight": ….
},
{"code": …
"weight": ….
}
]
I could solve it, but not in a very Pythonic way (some for loops and a lot of string formatting). Is there a nicer way to do it?

I'm not sure what is unpythonic about using a loop. Use a defaultdict from the collections module and you're good to go.
Assuming your dict is called d:
from collections import defaultdict
import json
target_ = defaultdict(list)
for code, vals in d.items():
    for _, items in vals.items():
        # each_item is a (code, weight) tuple
        for each_item in items.items():
            target_[code].append(dict(zip(['code', 'weight'], each_item)))
print(json.dumps(target_, indent=1))
{
"bisac_code1": [
{
"code": "BIO016000",
"weight": 0.8
},
{
"code": "CKB041000",
"weight": 0.30000000000000004
}
],
"bisac_code2": [
{
"code": "CKB049000",
"weight": 0.3
},
{
"code": "BIO028000",
"weight": 0.8
}
],
"bisac_code3": [
{
"code": "SPO058000",
"weight": 0.8
},
{
"code": "CKB030000",
"weight": 0.3
}
]
}
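For comparison, the same reshaping can also be written as a nested comprehension without defaultdict. This is only a sketch: it assumes the dict is called d, as above, and that every inner value maps codes to weights. The name target is illustrative and not part of the original answer.

```python
import json

# Same structure as the DataFrame-as-dict shown in the question.
d = {
    'bisac_code1': {2: {'BIO016000': 0.8, 'CKB041000': 0.30000000000000004}},
    'bisac_code2': {2: {'CKB049000': 0.3, 'BIO028000': 0.8}},
    'bisac_code3': {2: {'SPO058000': 0.8, 'CKB030000': 0.3}},
}

# For each column, flatten every row's {code: weight} mapping
# into a list of {"code": ..., "weight": ...} records.
target = {
    column: [
        {'code': code, 'weight': weight}
        for row in rows.values()
        for code, weight in row.items()
    ]
    for column, rows in d.items()
}

print(json.dumps(target, indent=1))
```

Whether this reads better than the explicit loops is a matter of taste; both produce the same structure.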

Get Object from http get API result

I know this question has already been asked, but I can't get it to work with my GET result.
My service fetches data from an API that returns this kind of result:
{
"nhits": 581,
"parameters": {
"dataset": "communes-belges-2019",
"rows": 1,
"start": 0,
"facet": [
"niscode",
"region",
"nis_code_region"
],
"format": "json",
"timezone": "UTC"
},
"records": [
{
"datasetid": "communes-belges-2019",
"recordid": "65d40b7bc42f766b4fdb04c4a985766dc8b51717",
"fields": {
"shape_area": 79397718.576,
"mun_name_upper_fr": "ÉTALLE",
"arr_name_fr": "Virton",
"region": "Région wallonne",
"niscode": "85009",
"mun_off_language": "FR",
"geo_shape": {
"coordinates": [
[
[
5.678490965,
49.687217222
],
[
5.678462422,
49.6873304
]
]
],
"type": "Polygon"
},
"prov_name_fr": "Luxembourg",
"namefre": "Étalle",
"nom_commune": "Étalle",
"nis_code_region": "03000",
"mun_name_fr": "Étalle",
"reg_name_fr": "Région wallonne",
"mun_area_code": "BEL",
"modifdate": "2007-01-05",
"mun_type": "Commune/Gemeente/Gemeinde",
"mun_name_lower_fr": "étalle",
"prov_code": "80000",
"year": "2021-01-01",
"geo_point_2d": [
49.6639160352,
5.60896600843
]
},
"geometry": {
"type": "Point",
"coordinates": [
5.60896600843,
49.6639160352
]
},
"record_timestamp": "2019-05-24T09:44:14.333000+00:00"
}
],
"facet_groups": [
{
"name": "niscode",
"facets": [
{
"name": "11001",
"count": 1,
"state": "displayed",
"path": "11001"
},
{
"name": "33021",
"count": 1,
"state": "displayed",
"path": "33021"
},
{
"name": "33029",
"count": 1,
"state": "displayed",
"path": "33029"
},
{
"name": "33037",
"count": 1,
"state": "displayed",
"path": "33037"
}
]
},
{
"name": "region",
"facets": [
{
"name": "Région flamande",
"count": 299,
"state": "displayed",
"path": "Région flamande"
},
{
"name": "Région wallonne",
"count": 262,
"state": "displayed",
"path": "Région wallonne"
},
{
"name": "Région de Bruxelles-Capitale",
"count": 19,
"state": "displayed",
"path": "Région de Bruxelles-Capitale"
}
]
},
{
"name": "nis_code_region",
"facets": [
{
"name": "02000",
"count": 299,
"state": "displayed",
"path": "02000"
},
{
"name": "03000",
"count": 262,
"state": "displayed",
"path": "03000"
},
{
"name": "01000",
"count": 19,
"state": "displayed",
"path": "01000"
}
]
}
]
}
At the moment, my service gets the data like this:
getProvinces(): Observable<any[]> {
  return this._http.get<any>(this.apiProvinces)
    .pipe(
      map((response: any) => response.records as any[]),
      catchError(this.handleError)
    );
}
It returns an Observable<any[]>, but I would like to get typed objects.
Therefore I defined a class with the properties below.
export class Record {
  region: string;
  nom_commune: string;
  prov_name_fr: string;
  records?: [];
  constructor(reg: string, commune: string, prov: string) {
    this.region = reg;
    this.nom_commune = commune;
    this.prov_name_fr = prov;
  }
}
To achieve that, I tried replacing any[] with Record[] as shown below, but it doesn't work.
getProvinces(): Observable<Record[]> {
  return this._http.get<Record[]>(this.apiProvinces)
    .pipe(
      map(response => response as Record[]),
      catchError(this.handleError)
    );
}
In my component I defined an Observable<Record[]> like this:
public results$!: Observable<Record[]>;
I call the service like this:
this.results$ = this.apiService.getProvinces();
And I try to display the content in the HTML template:
<div *ngFor="let p of (results$ | async)">
  {{p | json}}
</div>
I get the following error message: "Cannot find a differ supporting object '[object Object]' of type 'object'. NgFor only supports binding to Iterables such as Arrays."
As a result, I can't access my object.
Any suggestions would be helpful, because I'm absolutely new to Angular.
Thanks
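Creating an observable does not by itself send a request to your API. To make the request and get the data out of the observable, something has to subscribe to it (the async pipe in a template does this for you; in component code you subscribe explicitly):
this.apiService.getProvinces().subscribe((res) => (this.results = res));
This executes the GET request and passes your data to the callback function. HttpClient GET requests complete after emitting their response, so you don't need to unsubscribe, but you do need to subscribe again if you want to make another request.
Also note that the error message means the value reaching *ngFor is a single object rather than an array. The first version of the service, which maps the response to response.records, does emit an array.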

How to avoid generating all combinations of selected data while constructing an object?

My original JSON is given below.
[
{
"id": "1",
"name": "AA_1",
"total": "100002",
"files": [
{
"filename": "8665b987ab48511eda9e458046fbc42e.csv",
"filename_original": "some.csv",
"status": "3",
"total": "100002",
"time": "2020-08-24 23:25:49"
}
],
"status": "3",
"created": "2020-08-24 23:25:49",
"filenames": "8665b987ab48511eda9e458046fbc42e.csv",
"is_append": "0",
"is_deleted": "0",
"comment": null
},
{
"id": "4",
"name": "AA_2",
"total": "43806503",
"files": [
{
"filename": "1b4812fe634938928953dd40db1f70b2.csv",
"filename_original": "other.csv",
"status": "3",
"total": "21903252",
"time": "2020-08-24 23:33:43"
},
{
"filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"filename_original": "some.csv",
"status": "2",
"total": 0,
"time": "2020-08-24 23:29:30"
}
],
"status": "2",
"created": "2020-08-24 23:35:51",
"filenames": "1b4812fe634938928953dd40db1f70b2.csv&&63ab85fef2412ce80ae8bd018497d8bf.csv",
"is_append": "0",
"is_deleted": "0",
"comment": null
}
]
From this JSON I want to create new objects by combining fields from the objects that have status: 2 with fields from those of their files that also have status: 2.
So I am expecting a JSON array like the one below.
[
{
"id": "4",
"name": "AA_2",
"file_filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"file_status": 2
}
]
So far I have tried this jq filter:
.[]|select(.status=="2")|[{id:.id,file_filename:.files[].filename,file_status:.files[].status}]
But this produces some unwanted data:
[
{
"id": "4", # want to remove this as file.status != 2
"file_filename": "1b4812fe634938928953dd40db1f70b2.csv",
"file_status": "3"
},
{
"id": "4",
"file_filename": "1b4812fe634938928953dd40db1f70b2.csv",
"file_status": "2"
},
{
"id": "4", # Repeat
"file_filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"file_status": "3"
},
{
"id": "4", # Repeat
"file_filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"file_status": "2"
}
]
How do I filter the new JSON using jq and remove these duplicate objects?
By applying the [] operator to files twice, you're running into a combinatorial explosion: each .files[] iterates independently, so an object is built for every filename/status pair. That needs to be avoided, for example:
[ .[] | select(.status == "2") | {id, name} + (.files[] | select(.status == "2") | {file_filename: .filename, file_status: .status}) ]
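The same idea, iterating over the files once and filtering inside that single iteration, can be sketched in Python for readers who are not using jq. This is an illustrative equivalent, not part of the original answer; the function name status_two_files and the trimmed sample records are assumptions made for the example.

```python
def status_two_files(records, status="2"):
    """Pair each matching record with each of its matching files, once."""
    return [
        {
            "id": record["id"],
            "name": record["name"],
            "file_filename": file["filename"],
            "file_status": file["status"],
        }
        for record in records
        if record["status"] == status
        # A single pass over the files, so no filename/status cross product.
        for file in record["files"]
        if file["status"] == status
    ]


# Trimmed-down version of the JSON from the question.
records = [
    {"id": "1", "name": "AA_1", "status": "3", "files": [
        {"filename": "8665b987ab48511eda9e458046fbc42e.csv", "status": "3"},
    ]},
    {"id": "4", "name": "AA_2", "status": "2", "files": [
        {"filename": "1b4812fe634938928953dd40db1f70b2.csv", "status": "3"},
        {"filename": "63ab85fef2412ce80ae8bd018497d8bf.csv", "status": "2"},
    ]},
]

result = status_two_files(records)
print(result)
```

As in the jq filter, file_status stays a string here because the source data stores it as "2".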

JsonPath - Extract object meeting multiple criteria?

In the JSON string given below, I want to find all elements whose category = m AND whose "middle" array contains elements matching this condition: the element's own "middle" array has objects whose itemType = Executable.
I would like to use JSONPath to get the desired objects. I prefer not to use JMESPath because it can be too complex for my purpose. However, I am new to JSONPath, and I cannot figure out the query from online tutorials, which are too trivial or basic. I wonder if it's better to use a programming language instead to get the data I need. Please advise.
So far, I have only been able to extract the elements whose category = m, using the JSONPath query $.[?(@.category=="m")]. How do I do the remaining part?
JSON:
Overview - Every object has a "contents" object. Each contents object generally has a start, middle and end array besides other fields. Middle arrays can have multiple contents objects inside them, and so on. Some of the contents objects have only a middle array. I am interested in locating items in such contents objects, as mentioned above.
Note that this is not the actual json which I have to process. It is an imitation which has been sanitized for SO.
{
"id": "123",
"contents": {
"title": "B1",
"start": [],
"middle": [
{
"level": "1",
"contents": {
"title": "C1",
"category": "c",
"start": [],
"middle": [
{
"level": "2",
"contents": {
"title": "M1",
"category": "m",
"start": [],
"middle": [
{
"level": "3",
"contents": {
"title": "MAT1",
"middle": [
{
"itemType": "Data"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT2",
"middle": [
{
"itemType": "Executable",
"id": "exec1"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT3",
"middle": [
{
"itemType": "Data"
}
]
}
}
],
"end": []
}
},
{
"level": "2",
"contents": {
"title": "M2",
"category": "m",
"start": [],
"middle": [
{
"level": "3",
"contents": {
"title": "MAT1",
"middle": [
{
"itemType": "Data"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT2",
"middle": [
{
"itemType": "Executable",
"id": "exec2"
}
]
}
}
],
"end": []
}
}
],
"end": []
}
},
{
"level": "1",
"contents": {
"title": "C2",
"category": "c",
"start": [],
"middle": [
{
"level": "2",
"contents": {
"title": "M1",
"category": "m",
"start": [],
"middle": [
{
"level": "3",
"contents": {
"title": "MAT1",
"middle": [
{
"itemType": "Data"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT2",
"middle": [
{
"itemType": "Executable",
"id": "exec3"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT3",
"middle": [
{
"itemType": "Data"
}
]
}
}
],
"end": []
}
},
{
"level": "2",
"contents": {
"title": "M2",
"category": "m",
"start": [],
"middle": [
{
"level": "3",
"contents": {
"title": "MAT1",
"middle": [
{
"itemType": "Data"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT2",
"middle": [
{
"itemType": "Executable",
"id": "exec4"
}
]
}
},
{
"level": "3",
"contents": {
"title": "MAT3",
"middle": [
{
"itemType": "Data"
}
]
}
}
],
"end": []
}
}
],
"end": []
}
}
],
"end": []
}
}
Context
JSON with nested objects [1]
jsonpath expression language
choosing between jsonpath and jmespath (or other JSON expression engine)
Problem
DeveMasterJoe2 wants to extract some values from nested JSON
Discussion
There are lots of implementations of jsonpath out there, and they do not all support the same features
The structure and normalization of the source JSON is going to influence how easily this can be done with pure jsonpath
In choosing a JSON expression engine, one has to weigh multiple factors
how consistent are the implementations across languages?
how many choices are there within a given language?
how clear is the specification?
how many examples, unit-tests or tutorials are available?
who is supporting it?
Example solution using Python and jsonpath-ng
Here is an example solution using python 3.7 and jsonpath-ng
This example uses a mix of jsonpath and python instead of just pure jsonpath, because of the heavily-nested JSON
I will leave it for someone else to provide an answer that relies on pure jsonpath
Note that the source JSON arguably could stand to be cleaned up a bit
(for example, why is there no id field attached to itemType==Data elements?)
(for example, why is category not found on all contents elements?)
(for example, if you expressly specify level why complicate things with heavily nested objects when you can determine depth by level ?)
This example:
## import libraries
import codecs
import json
import jsonpath_ng
from jsonpath_ng.ext import parse
##;;
## init vars
href = "path/to/my/jsonfile/nested_dict.json"
json_string = codecs.open(href, 'rb', encoding='utf8').read()
json_dataroot = json.loads(json_string)
final_result = []
##;;
## init jsonpath outer-query
match = parse('$..contents.middle[*]').find(json_dataroot)
##;;
## iterate through outer-query and gather subelements
for ijj, item in enumerate(match):
    ## restrict to desired category == 'm'
    if match[ijj].value.get('contents', {}).get('category', '') == 'm':
        ## extract out desired subelements (first entry of each nested middle array)
        json_datafrag001 = [
            sub.get('contents', {}).get('middle', {})[0]
            for sub in match[ijj].value.get('contents', {}).get('middle', {})
        ]
        ## jsonpath filters refer to the current node as @
        match001 = parse("$[?(@.itemType=='Executable')]").find(json_datafrag001)
        final_result.extend(m.value for m in match001)
##;;
## show final result
vout = json.dumps(final_result, sort_keys=True, indent=4, separators=(',', ': '))
print(vout)
##;;
... produces this result ...
[
{
"id": "exec1",
"itemType": "Executable"
},
{
"id": "exec2",
"itemType": "Executable"
},
{
"id": "exec3",
"itemType": "Executable"
},
{
"id": "exec4",
"itemType": "Executable"
}
]
[1] aka dictionary, associative array, hash

Aggregate and sum json data in python

I am new to python, using python3. I have json data like:
{
"message": {
"count": 46,
"limit": 1000,
"schools": [
{
"class": "1",
"class_id": "1c8***",
"charges": [
{
"cost": 10,
"breakdown": [
{
"books": "1",
"unitQuantity": "10"
}
]
}
],
"area": "maccau"
},
{
"class": "2",
"class_id": "1c3***",
"charges": [
{
"cost": 100,
"breakdown": [
{
"books": "1",
"unitQuantity": "100"
}
]
}
],
"area": "maccau"
},
{
"class": "1",
"class_id": "1c3***",
"charges": [
{
"cost": 10,
"breakdown": [
{
"books": "1",
"unitQuantity": "10"
}
]
}
],
"area": "maccau"
},
{
"class": "2",
"class_id": "1c8***",
"charges": [
{
"cost": 50,
"breakdown": [
{
"books": "1",
"unitQuantity": "50"
}
]
}
],
"area": "maccau"
}
],
"url": {
"link": "/"
}
}
}
I was able to use json.loads to load data and I am trying to get results like:
class Cost
1 20
2 150
I tried converting json to a dictionary:
item_dict = json.load(json_data)
I then tried to get the data out using a for loop, checking whether class == 1 and summing up the cost. But I feel that is not the best approach. Can someone please tell me what would be the best way of doing this?
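One common way to do this kind of grouping is to accumulate the totals in a dictionary keyed by class, so no per-class if checks are needed. The sketch below assumes the structure shown above (message -> schools -> charges -> cost); the function name cost_per_class and the trimmed sample are illustrative only.

```python
from collections import defaultdict


def cost_per_class(payload):
    """Sum every charge's cost, grouped by the school's class value."""
    totals = defaultdict(int)
    for school in payload["message"]["schools"]:
        for charge in school.get("charges", []):
            totals[school["class"]] += charge["cost"]
    return dict(totals)


# Trimmed-down version of the JSON from the question.
sample = {
    "message": {
        "schools": [
            {"class": "1", "charges": [{"cost": 10}]},
            {"class": "2", "charges": [{"cost": 100}]},
            {"class": "1", "charges": [{"cost": 10}]},
            {"class": "2", "charges": [{"cost": 50}]},
        ]
    }
}

totals = cost_per_class(sample)
print("class Cost")
for class_name in sorted(totals):
    print(class_name, totals[class_name])
```

With real data, payload would be the dictionary returned by json.loads (for a string) or json.load (for an open file).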

Convert nested json to csv to sheets json api

I want to convert my JSON to CSV so that I can upload it to Google Sheets and serve it as a JSON API. Whenever the data changes, I will just update it in Google Sheets. But I'm having problems converting my JSON file to CSV, because the variables change whenever I convert it. I'm using https://toolslick.com/csv-to-json-converter to convert my JSON file to CSV.
What is the best way to convert nested JSON to CSV?
JSON
{
"options": [
{
"id": "1",
"value": "Jumbo",
"shortcut": "J",
"textColor": "#FFFFFF",
"backgroundColor": "#00000"
},
{
"id": "2",
"value": "Hot",
"shortcut": "D",
"textColor": "#FFFFFF",
"backgroundColor": "#FFFFFF"
}
],
"categories": [
{
"id": "1",
"order": 1,
"name": "First Category",
"active": true
},
{
"id": "2",
"order": 2,
"name": "Second Category",
"shortcut": "MT",
"active": true
}
],
"products": [
{
"id": "03c6787c-fc2a-4aa8-93a3-5e0f0f98cfb2",
"categoryId": "1",
"name": "First Product",
"shortcut": "First",
"options": [
{
"optionId": "1",
"price": 23
},
{
"optionId": "2",
"price": 45
}
],
"active": true
},
{
"id": "e8669cea-4c9c-431c-84ba-0b014f0f9bc2",
"categoryId": "2",
"name": "Second Product",
"shortcut": "Second",
"options": [
{
"optionId": "1",
"price": 11
},
{
"optionId": "2",
"price": 20
}
],
"active": true
}
],
"discounts": [
{
"id": "1",
"name": "S",
"type": 1,
"amount": 20,
"active": true
},
{
"id": "2",
"name": "P",
"type": 1,
"amount": 20,
"active": true
},
{
"id": "3",
"name": "G",
"type": 2,
"amount": 5,
"active": true
}
]
}
With Python this can be done fairly easily. Maybe this code will help you understand the approach.
import csv
import json
# Assumes one JSON object per line (JSON Lines); for a single
# multi-line JSON document, use json.load(file) instead.
data = []
with open('your_json_file_here.json') as file:
    for line in file:
        data.append(json.loads(line))
# newline='' avoids blank rows between records on Windows.
with open('create_new_file.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['header1', 'header2'])  # header row
    for row in data:
        writer.writerow((row['specific_col_name1'], row['specific_col_name2']))
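For the nested JSON in the question, one possible layout is a separate CSV (one sheet) per top-level key, with nested values such as a product's options kept as JSON strings inside a single cell. The sketch below only illustrates that idea under those assumptions; rows_to_csv and csv_by_sheet are hypothetical names, and the sample is a subset of the question's data.

```python
import csv
import io
import json


def rows_to_csv(rows):
    """Render a list of dicts as CSV text; nested values become JSON strings."""
    # Collect column names in first-seen order, since records may differ in keys.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        flat = {
            key: json.dumps(value) if isinstance(value, (list, dict)) else value
            for key, value in row.items()
        }
        # Keys missing from a record are written as empty cells.
        writer.writerow(flat)
    return buffer.getvalue()


# Subset of the JSON from the question.
sample = {
    "categories": [
        {"id": "1", "order": 1, "name": "First Category", "active": True},
        {"id": "2", "order": 2, "name": "Second Category",
         "shortcut": "MT", "active": True},
    ],
}

# One CSV document per top-level key.
csv_by_sheet = {name: rows_to_csv(rows) for name, rows in sample.items()}
print(csv_by_sheet["categories"])
```

Each CSV string could then be written to its own file and imported as a separate sheet; how the nested cells are expanded again on the API side is left open here.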