Uploading JSON to firebase from dataframe - json

Having a puzzling problem posting JSON to Firebase programmatically:
Original JSON retrieved from Firebase:
{'recipe1': {'abbie':2,'ben':0,'chris':1},'recipe2': {'abbie':1,'ben': 5,'chris':5}}
I then convert it into a dataframe using pandas to manipulate the data, before turning it back into JSON. Here is where I'm getting stuck.
Convert dataframe to JSON:
out = df.to_json()
Result printed in terminal:
{"recipe1":{"abbie":2,"ben":0,"chris":1},"recipe2":{"abbie":1,"ben":5,"chris":5}}
I then post it:
firebase.post("/testupdate", out)
Yet if I manually assign out to the same JSON structure:
out = {"recipe1":{"abbie":2,"ben":0,"chris":1},"recipe2":{"abbie":1,"ben":5,"chrisy":5}}
and post that,it works perfectly.
If anyone can help me out here it would be greatly appreciated!

Actually, I've just figured it out myself; it was a pretty simple fix, as I assumed it would be.
Anyone else having this difficulty should simply use:
out = df.to_dict()
Instead of:
out = df.to_json()
When converting the dataframe.
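The difference is that to_json() returns a string, while to_dict() returns a nested dict; the client presumably re-serializes whatever it is given, so the string ends up stored as a single escaped value rather than as a tree. A minimal sketch of the working flow, assuming the python-firebase client (the FirebaseApplication URL below is just a placeholder):

import pandas as pd
from firebase import firebase  # python-firebase, assumed to be the client in use

# Placeholder URL/auth -- substitute whatever the original app was configured with
fb = firebase.FirebaseApplication('https://<your-app>.firebaseio.com', None)

raw = {'recipe1': {'abbie': 2, 'ben': 0, 'chris': 1},
       'recipe2': {'abbie': 1, 'ben': 5, 'chris': 5}}
df = pd.DataFrame(raw)   # manipulate the data here as needed

out = df.to_dict()       # nested dict -- posts as the expected JSON tree
# out = df.to_json()     # a string -- would be stored as one escaped value
fb.post("/testupdate", out)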

Related

Converting CSV to Parquet in spark without writing/saving it

I am trying to convert a CSV to Parquet and then use it for other purposes, but I couldn't find a way to do it without saving it. I've only seen this kind of structure for converting, and it always writes/saves the data:
data = spark.read.load(sys.argv[1], format="csv", sep="|", inferSchema="true", header="true")
data.write.format("parquet").mode("overwrite").save(sys.argv[2])
Am I missing something? I am also happy with Scala solutions.
Thanks!
I've tried playing with the mode option in write, but it didn't help.
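For what it's worth, the DataFrame read from the CSV is already usable in memory without writing it out first; here is a rough sketch of that idea (the cached view name and the follow-up query are purely hypothetical placeholders, not from the thread):

import sys
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Same read as in the question
data = spark.read.load(sys.argv[1], format="csv", sep="|",
                       inferSchema="true", header="true")

# The DataFrame can be cached and reused for "other purposes" directly;
# writing Parquet is only needed if a columnar copy must live on disk.
data.cache()
data.createOrReplaceTempView("data")                # hypothetical view name
spark.sql("SELECT COUNT(*) AS n FROM data").show()  # hypothetical follow-up query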

Nested JSON to dataframe in Scala

I am using Spark/Scala to make an API request and parse the response into a dataframe. Following is the sample JSON response I am using for testing purposes:
API Request/Response
However, I tried to use the following answer from Stack Overflow to convert the JSON, but the nested fields are not being processed. Is there any way to convert the JSON string to a dataframe with columns?
I think the problem is that the JSON you have attached, when read as a DataFrame, gives a single, very large row, so Spark might be truncating the result.
If this is what you want, you can try setting the Spark property spark.debug.maxToStringFields to a higher value (the default is 25):
spark.conf.set("spark.debug.maxToStringFields", 100)
However, if you want to process the Results from the JSON, it would be better to get them as a DataFrame and do the processing there. Here is how you can do it:
import com.google.gson.JsonParser  // assuming Gson's JsonParser is what is intended here

val results = JsonParser.parseString(<json content>).getAsJsonObject().get("Results").getAsJsonArray.toString
import spark.implicits._
val df = spark.read.json(Seq(results).toDS)
df.show(false)

Python - How to update a value in a json file?

I hate json files. They are unwieldy and hard to handle :( Please tell me why the following doesn't work:
with open('data.json', 'r+') as file_object:
    data = json.load(file_object)[user_1]['balance']
    am_nt = 5
    data += int(am_nt['amount'])
    print(data)
    file_object[user_1]['balance'] = data
Through trial and error (and many print statements), I have discovered that it opens the file, goes to the correct place, and actually adds the am_nt, but I can't make the original JSON file update. Please help me :( :( I get:
2000
TypeError: '_io.TextIOWrapper' object is not subscriptable
JSON is fun to work with, as it is similar to Python data structures.
The error is: object is not subscriptable
It is raised on this line:
file_object[user_1]['balance'] = data
file_object is the file handle, not the JSON/dictionary data, so it cannot be updated like that. Hence the error.
Instead, read the JSON data:
data = json.load(file_object)
Then manipulate the data as a Python dictionary, and save the file.
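Concretely, that read / modify / write cycle could look like the sketch below (user_1 and the am_nt structure are guesses based on the question):

import json

user_1 = 'some_user_id'   # placeholder for the key used in the question
am_nt = {'amount': 5}     # assuming am_nt is really a dict with an 'amount' key

# Read: load the whole file into a plain Python dict
with open('data.json', 'r') as file_object:
    data = json.load(file_object)

# Modify: work on the dict, not on the file handle
data[user_1]['balance'] += int(am_nt['amount'])

# Write: dump the updated dict back to the same file
with open('data.json', 'w') as file_object:
    json.dump(data, file_object, indent=4)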

How to represent nested json objects in a cucumber feature file

I need to represent a JSON object in the feature file. I could use a JSON file for the code to pick up, but that would mean I can't pass values from the feature file.
Scenario: Test
Given a condition is met
Then the following json response is sent
| json |
| {"dddd":"dddd","ggggg":"ggggg"}|
The above works for flat JSON. However, if there are nested objects, writing the JSON in a single line like that makes the feature very difficult to read and to maintain.
Please let me know.
You can use a doc string (a triple-quoted block) to do that; it makes the JSON much more legible.
Then the following json response is sent
"""
{
  "dddd": "dddd",
  "ggggg": "ggggg",
  "somethingelse": {
    "thing": "thingvalue",
    "thing2": "thing2value"
  }
}
"""
In the code, you can use it directly:
Then(/^the following json response is sent$/) do |message|
  expect(rest_stub.body).to eq(message)
end
or something like that.

Spark exception handling for json

I am trying to catch/ignore a parsing error when reading a JSON file:
val DF = sqlContext.jsonFile("file")
There are a couple of lines that aren't valid JSON objects, but the data is too large to go through individually (~1 TB).
I've come across exception handling for mapping using import scala.util.Try and in.map(a => Try(a.toInt)), referencing:
how to handle the Exception in spark map() function?
How would I catch an exception when reading a json file with the function sqlContext.jsonFile?
Thanks!
Unfortunately, you are out of luck here. DataFrameReader.json, which is used under the hood, is pretty much all-or-nothing. If your input contains malformed lines, you have to filter them out manually. A basic solution could look like this:
import scala.util.parsing.json._
val df = sqlContext.read.json(
  sc.textFile("file").filter(JSON.parseFull(_).isDefined)
)
Since the above validation is rather expensive, you may prefer to drop jsonFile / read.json completely and use the parsed JSON lines directly.