Building JSON object from recursive directory tree traversal

I'm traversing a directory tree, which contains directories and files. I know I could use os.walk for this, but this is just an example of what I'm doing, and the end result has to be recursive.
The function to get the data out is below:
import os
def walkfn(dirname):
    for name in os.listdir(dirname):
        path = os.path.join(dirname, name)
        if os.path.isdir(path):
            print(name)
            walkfn(path)
        elif os.path.isfile(path):
            print(name)
Assuming we had a directory structure such as this:
testDir/
    a/
        1/
        2/
            testa2.txt
        testa.txt
    b/
        3/
            testb3.txt
        4/
The code above would print the following:
a
testa.txt
1
2
testa2.txt
c
d
b
4
3
testb3.txt
It's doing what I would expect at this point, and the values are all correct, but I'm trying to get this data into a JSON object. I've seen that I can add these into nested dictionaries, and then convert it to JSON, but I've failed miserably at getting them into nested dictionaries using this recursive method.
The JSON I'm expecting out would be something like:
{
    "test": {
        "b": {
            "4": {},
            "3": {
                "testb3.txt": null
            }
        },
        "a": {
            "testa.txt": null,
            "1": {},
            "2": {
                "testa2.txt": null
            }
        }
    }
}

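Pass the dictionary being built (json_data) into the recursive function:
import os
from pprint import pprint
from typing import Dict, Optional
def walkfn(dirname: str, json_data: Optional[Dict] = None) -> Dict:
    # Create the dictionary on the first call; recursive calls pass their own.
    if json_data is None:
        json_data = dict()
    for name in os.listdir(dirname):
        path = os.path.join(dirname, name)
        if os.path.isdir(path):
            # A directory becomes a nested dictionary filled by the recursive call.
            json_data[name] = dict()
            json_data[name] = walkfn(path, json_data=json_data[name])
        elif os.path.isfile(path):
            # A file becomes a key whose value is None (null in JSON).
            json_data.update({name: None})
    return json_data
json_data = walkfn(dirname="your_dir_name")
pprint(json_data)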
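The answer stops at a nested dictionary, while the question asks for JSON. A minimal sketch of the last step, assuming the layout from the question is recreated in a temporary directory; tree_to_dict and build_sample_tree are hypothetical helper names, not part of the original answer:

```python
import json
import os
import tempfile

def tree_to_dict(dirname):
    """Return a nested dict: directories map to dicts, files map to None."""
    result = {}
    for name in sorted(os.listdir(dirname)):
        path = os.path.join(dirname, name)
        if os.path.isdir(path):
            result[name] = tree_to_dict(path)
        elif os.path.isfile(path):
            result[name] = None
    return result

def build_sample_tree(root):
    """Recreate the example layout from the question under root (hypothetical helper)."""
    for sub in ("a/1", "a/2", "b/3", "b/4"):
        os.makedirs(os.path.join(root, *sub.split("/")))
    for rel in ("a/testa.txt", "a/2/testa2.txt", "b/3/testb3.txt"):
        with open(os.path.join(root, *rel.split("/")), "w") as handle:
            handle.write("")

with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "testDir")
    os.makedirs(root)
    build_sample_tree(root)
    # Wrap the result so the root directory name becomes the top-level key.
    tree = {os.path.basename(root): tree_to_dict(root)}

# json.dumps turns None into null and nested dicts into nested objects.
json_text = json.dumps(tree, indent=2, sort_keys=True)
print(json_text)
```

Wrapping the result under the root directory's name gives the single top-level key shown in the expected output; json.dump can be used in place of json.dumps to write straight to a file.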

Related

Python - parsing JSON and f string

I am trying to write a small script that parses JSON files. I need to include multiple variables in the code, but I'm currently stuck because the f-string does not work as I expected. Here is some example code:
import json
test = 10
json_data = f'[{"ID": {test},"Name":"Pankaj","Role":"CEO"}]'
json_object = json.loads(json_data)
json_formatted_str = json.dumps(json_object, indent=2)
print(json_formatted_str)
The above code returns an error:
json_data = f'[{"ID": { {test} },"Name":"Pankaj","Role":"CEO"}]'
ValueError: Invalid format specifier
Could you please let me know how I can add variables to the JSON?
Thank you.
In an f-string, literal braces must be doubled ({{ and }}) so that only {test} is treated as a placeholder:
import json
test = 10
json_data = f'[{{"ID": {test},"Name":"Pankaj","Role":"CEO"}}]'
json_object = json.loads(json_data)
json_formatted_str = json.dumps(json_object, indent=2)
print(json_formatted_str)
Prints:
[
  {
    "ID": 10,
    "Name": "Pankaj",
    "Role": "CEO"
  }
]
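An alternative that avoids brace escaping altogether is to build the data as ordinary Python objects and let json.dumps produce the text. This is a sketch of that approach rather than part of the original answer; the name records is an assumption:

```python
import json

test = 10

# Build the data as ordinary Python objects instead of a hand-written string.
records = [{"ID": test, "Name": "Pankaj", "Role": "CEO"}]

# json.dumps handles quoting and escaping of every value.
json_formatted_str = json.dumps(records, indent=2)
print(json_formatted_str)
```

This also keeps the output valid when a variable holds a string containing quotes or backslashes, which a hand-assembled f-string would not escape.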

How do I read data in sub-dictionary in python

My problem is that I cannot see the individual fields of the data I pulled from the Mongo database. The data comes as a dictionary, and when I try to read it with pandas, the nested dictionary ends up as a single value.
import pandas
dic = {
    "value1" : "a",
    "value2" : {
        "subvalue1" : "sub-a",
        "subvalue2" : "sub-b"
    },
    "value3" : "c"
}
df = pandas.DataFrame(dic)
df = pandas.DataFrame(list(dic.items()), columns=["value1","subvalue1"])
print(df)
When I run the code, the output I get is as follows.
   value1                                     subvalue1
0  value1                                             a
1  value2  {'subvalue1': 'sub-a', 'subvalue2': 'sub-b'}
2  value3                                             c
What I want is an output whose columns are the names in the "columns" list, using code like the following.
import pandas
dic = {
    "value1" : "a",
    "value2" : {
        "subvalue1" : "sub-a",
        "subvalue2" : "sub-b"
    },
    "value3" : "c"
}
df = pandas.DataFrame(dic)
df = pandas.DataFrame(list(dic.items()), columns=["value1","subvalue1","subvalue2","value3"])
print(df)
output sample I want
How can I do this?
Thank you all.
You can do that by flattening the dictionary like this:
def flatten_dict(d):
    flattened = {}
    for key, val in d.items():
        if isinstance(val, dict):
            # Merge the keys of the nested dictionary into the top level.
            flattened.update(flatten_dict(val))
        else:
            flattened.update({key: val})
    return flattened
I don't recommend this way however. If you happen to have a dictionary of the form {"a":{"same_key_name":"value_a"}, "b":{"same_key_name":"value_b"}} then the flattened dictionary would be {'same_key_name': 'value_b'}
A safer and more canonical way to do it is to flatten the dictionary but keep the key names in a concatenated form:
import pandas as pd
df = pd.json_normalize(dic, sep='_')
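To see what "concatenated" key names look like without pandas, here is a minimal pure-Python sketch of the same idea. The function name flatten_with_prefix and its parameters are assumptions made for illustration:

```python
def flatten_with_prefix(d, parent_key="", sep="_"):
    """Flatten nested dicts, joining parent and child keys with sep."""
    flattened = {}
    for key, val in d.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(val, dict):
            # Recurse, carrying the joined key down as the new prefix.
            flattened.update(flatten_with_prefix(val, full_key, sep))
        else:
            flattened[full_key] = val
    return flattened

dic = {
    "value1": "a",
    "value2": {"subvalue1": "sub-a", "subvalue2": "sub-b"},
    "value3": "c",
}
flat = flatten_with_prefix(dic)
print(flat)

# Keys that would collide in a naive flatten stay distinct.
clash = {"a": {"same_key_name": "value_a"}, "b": {"same_key_name": "value_b"}}
flat_clash = flatten_with_prefix(clash)
print(flat_clash)
```

Because each nested key carries its parent's name, the two same_key_name entries no longer overwrite each other.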

Take Input Dynamically from user in Python Dictionary

I've created a Python Dictionary Structure as below:
import pprint
log_data = {
    'Date':'',
    'Prayers':{
        'Fajr':'',
        'Dhuhr/Jumu\'ah':'',
        'Asr':'',
        'Maghrib':'',
        'Isha\'a':''
    },
    'Task List':[{
        'Task':'',
        'Timeline':'',
        'Status':''
    }],
    'Meals':{
        'Breakfast':{
            'Menu':'',
            'Place':'',
            'Time':''
        },
        'Lunch':{
            'Menu':'',
            'Place':'',
            'Time':''
        },
        'Evening Snacks':{
            'Menu':'',
            'Place':'',
            'Time':''
        },
        'Dinner':{
            'Menu':'',
            'Place':'',
            'Time':''
        }
    },
    'Exercises':[{
        'Exercise':'',
        'Duration':''
    }]
}
pprint.pprint(log_data)
As you can see, this is just a dictionary structure without data. I want to iterate over all the keys and take the value for each one from the user using input().
Then I would like to save this dictionary as a JSON file.
Could you please help with how I can iterate over all keys and take input from the user?
Thanks.
I searched but couldn't find the exact kind of help that I need.
For this kind of thing, one needs to use recursion.
This is not fancy, but will get the job done:
from copy import deepcopy
import json
import pprint
log_data = {
    'Date':'',
    'Prayers':{
        'Fajr':'',
        'Dhuhr/Jumu\'ah':'',
        'Asr':'',
        'Maghrib':'',
        'Isha\'a':''
    },
    'Task List':[{
        'Task':'',
        'Timeline':'',
        'Status':''
    }],
    # ...
}
def input_fields(substruct, path=""):
    # Recursively prompt for every leaf value; path tracks the position in the structure.
    print(f"Inputting values '{path}':")
    for fieldname, value in substruct.items():
        if isinstance(value, (str, int)):
            substruct[fieldname] = input(f"{path}.{fieldname}: ")
        elif isinstance(value, dict):
            input_fields(value, f"{path}.{fieldname}")
        elif isinstance(value, list):
            # The first list item serves as the template for every entry the user adds.
            original = value[0]
            if not isinstance(original, dict):
                raise ValueError("Not supported: A list should contain a dictionary-substructure")
            value.pop()
            counter = 0
            while True:
                item = deepcopy(original)
                input_fields(item, f"{path}.{fieldname}.[{counter}]")
                value.append(item)
                # startswith avoids an IndexError when the user just presses Enter.
                continue_ = input(f"Enter one more {path}.{fieldname} item? (y/n) ").lower().strip().startswith("y")
                if not continue_:
                    break
                counter += 1
    return substruct
def main():
    values = input_fields(deepcopy(log_data))
    # The with block closes the file once the JSON has been written.
    with open("myfile.json", "wt") as f:
        json.dump(values, f, indent=4)
if __name__ == "__main__":
    main()
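The recursion is easier to follow, and to test, when the prompting step is passed in as a function instead of calling input() directly. The following is a simplified sketch under that assumption: fill_fields, scripted_prompt, and the sample answers are all hypothetical, and lists are limited to a single item to keep it short:

```python
def fill_fields(template, prompt, path=""):
    """Return a filled copy of template; prompt(label) supplies each leaf value."""
    filled = {}
    for fieldname, value in template.items():
        label = f"{path}.{fieldname}" if path else fieldname
        if isinstance(value, dict):
            # Nested dictionary: recurse with the extended label.
            filled[fieldname] = fill_fields(value, prompt, label)
        elif isinstance(value, list):
            # Single-item lists only, to keep the sketch short.
            filled[fieldname] = [fill_fields(value[0], prompt, f"{label}[0]")]
        else:
            filled[fieldname] = prompt(label)
    return filled

template = {
    "Date": "",
    "Meals": {"Breakfast": {"Menu": "", "Time": ""}},
    "Exercises": [{"Exercise": "", "Duration": ""}],
}

# Scripted answers stand in for a user typing at the keyboard (made-up sample values).
answers = iter(["2024-01-01", "Eggs", "08:00", "Running", "30 min"])
asked = []

def scripted_prompt(label):
    asked.append(label)
    return next(answers)

result = fill_fields(template, scripted_prompt)
print(asked)
print(result)
```

For interactive use, a prompt such as `lambda label: input(f"{label}: ")` could be passed instead of scripted_prompt.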

Deserialise a JSON string to nested objects using jsons

I am trying to deserialise a JSON string to an object using jsons, but I'm having problems with nested objects and can't work out the syntax.
As an example, the following code attempts to define the data structure as a series of dataclasses but fails to deserialise the nested objects C and D. The syntax is clearly wrong, but it's not clear to me how it should be structured.
import jsons
from dataclasses import dataclass
@dataclass
class D:
    E: str
class C:
    id: int
    name:str
@dataclass
class test:
    A: str
    B: int
    C: C()
    D: D()
jsonString = {"A":"a","B":1,"C":[{"id":1,"name":"one"},{"id":2,"name":"two"}],"D":[{"E":"e"}]}
instance = jsons.load(jsonString, test)
Can anyone indicate the correct way to deserialise the objects from json ?
There are two relatively simple problems with your attempt:
1. You forgot to decorate C with @dataclass.
2. Test.C and Test.D aren't defined with types, but with instances of the types. (Further, you want both fields to be lists of the given type, not single instances of each.)
Given the code
import jsons
from dataclasses import dataclass
from typing import List
@dataclass
class D:
    E: str
@dataclass  # Problem 1 fixed
class C:
    id: int
    name: str
@dataclass
class Test:
    A: str
    B: int
    C: List[C]  # Problem 2 fixed; List[C], not C() or even C
    D: List[D]  # Problem 2 fixed; List[D], not D() or even D
Then
>>> obj = {"A":"a", "B":1, "C": [{"id": 1,"name": "one"}, {"id": 2, "name": "two"}], "D":[{"E": "e"}]}
>>> jsons.load(obj, Test)
Test(A='a', B=1, C=[C(id=1, name='one'), C(id=2, name='two')], D=[D(E='e')])
from dataclasses import dataclass
from typing import List
from validated_dc import ValidatedDC
@dataclass
class D(ValidatedDC):
    E: str
@dataclass
class C(ValidatedDC):
    id: int
    name: str
@dataclass
class Test(ValidatedDC):
    A: str
    B: int
    C: List[C]
    D: List[D]
jsonString = {
    "A": "a",
    "B": 1,
    "C": [{"id": 1, "name": "one"}, {"id": 2, "name": "two"}],
    "D": [{"E": "e"}]
}
instance = Test(**jsonString)
assert instance.C == [C(id=1, name='one'), C(id=2, name='two')]
assert instance.C[0].id == 1
assert instance.C[1].name == 'two'
assert instance.D == [D(E='e')]
assert instance.D[0].E == 'e'
ValidatedDC: https://github.com/EvgeniyBurdin/validated_dc
You can do something like this:
from collections import namedtuple
# First parameter is the class/tuple name, second parameter
# is a space-delimited string of variables.
# Note that the variable names should match the keys from
# your dictionary of arguments unless only one argument is given.
A = namedtuple("A", "a_val")  # Here the argument `a_val` can be called something else
B = namedtuple("B", "num")
C = namedtuple("C", "id name")
D = namedtuple("D", "E")  # This must be `E` since E is the key in the dictionary.
# If you don't want immutable objects, you can use full classes
# instead of namedtuples.
# A dictionary which matches the name of an object seen in a payload
# to the object we want to create for that name.
object_options = {
    "A": A,
    "B": B,
    "C": C,
    "D": D
}
my_objects = []  # This is the list of objects we get from the payload
jsonString = {"A":"a","B":1,"C":[{"id":1,"name":"one"},{"id":2,"name":"two"}],"D":[{"E":"e"}]}
for key, val in jsonString.items():
    if key in object_options:  # If this is a valid object
        if isinstance(val, list):  # If it is a list of this object
            for v in val:  # Then we need to add each object in the list
                my_objects.append(object_options[key](**v))
        elif isinstance(val, dict):  # If the object requires a dict, pass the whole dict as arguments
            my_objects.append(object_options[key](**val))
        else:  # Else just add this object with a singular argument.
            my_objects.append(object_options[key](val))
print(my_objects)
Output:
[A(a_val='a'), B(num=1), C(id=1, name='one'), C(id=2, name='two'), D(E='e')]
I've finally managed to get this to work by removing the dataclass definitions and writing the class definitions out the old-school way. The code is as follows:
import jsons
class D:
    def __init__(self, E = ""):
        self.E = E
class C:
    def __init__(self, id = 0, name=""):
        self.id = id
        self.name = name
class test:
    def __init__(self, A = "", B = 0, C = C(), D = D()):
        self.A = A
        self.B = B
        self.C = C
        self.D = D
jsonString = {"A":"a","B":1,"C":[{"id":1,"name":"one"},{"id":2,"name":"two"}],"D":[{"E":"e"}]}
instance = jsons.load(jsonString, test)
It now works but is not as clean as with a dataclass. I'd be grateful if anyone can indicate how the original post can be written with dataclass definitions.
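For comparison, the nested structure can also be built with standard-library dataclasses alone, by constructing the inner objects explicitly. This is a sketch that does not use jsons; the from_dict classmethod is a hypothetical helper written for this example, not a dataclasses feature:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class D:
    E: str

@dataclass
class C:
    id: int
    name: str

@dataclass
class Test:
    A: str
    B: int
    C: List[C]
    D: List[D]

    @classmethod
    def from_dict(cls, data):
        """Build a Test, converting each nested dict into its dataclass (hypothetical helper)."""
        return cls(
            A=data["A"],
            B=data["B"],
            C=[C(**item) for item in data["C"]],
            D=[D(**item) for item in data["D"]],
        )

payload = {"A": "a", "B": 1, "C": [{"id": 1, "name": "one"}, {"id": 2, "name": "two"}], "D": [{"E": "e"}]}
instance = Test.from_dict(payload)
print(instance)
```

The trade-off is that every nested field has to be converted by hand, which is the work a library such as jsons performs from the type annotations.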

Scala: parse JSON file into List[DBObject]

1. The input is a JSON file that contains multiple records. Example:
[
{"user": "user1", "page": 1, "field": "some"},
{"user": "user2", "page": 2, "field": "some2"},
...
]
2. I need to load each record from the file as a Document into a MongoDB collection.
Using Casbah to interact with Mongo, inserting data may look like this:
def saveCollection(inputListOfDbObjects: List[DBObject]) = {
  val xs = inputListOfDbObjects
  xs foreach (obj => {
    Collection.save(obj)
  })
}
Question: What is the correct way (using scala) to parse JSON to get data as List[DBObject] at output?
Any help is appreciated.
You could use the parser combinator library in Scala.
Here's some code I found that does this for JSON: http://booksites.artima.com/programming_in_scala_2ed/examples/html/ch33.html#sec4
Step 1. Create a class named JSON that contains your parser rules:
import scala.util.parsing.combinator._
class JSON extends JavaTokenParsers {
  def value: Parser[Any] = obj | arr |
    stringLiteral |
    floatingPointNumber |
    "null" | "true" | "false"
  def obj: Parser[Any] = "{"~repsep(member, ",")~"}"
  def arr: Parser[Any] = "["~repsep(value, ",")~"]"
  def member: Parser[Any] = stringLiteral~":"~value
}
Step 2. In your main function, read in your JSON file, passing the contents of the file to your parser.
import java.io.FileReader
object ParseJSON extends JSON {
  def main(args: Array[String]): Unit = {
    // args(0) is the path to the JSON file to parse.
    val reader = new FileReader(args(0))
    println(parseAll(value, reader))
  }
}