Convert Response Variable to JSON file - json

I'm trying to convert the page at https://lpl.qq.com/web201612/data/LOL_MATCH_DETAIL_7325.js to a json file so that I can parse it into a pandas table.
I'm using the python requests library and my function looks like this:
[1]def lpl_get_match_data(url, test):
[2] good_url = "https://lpl.qq.com/web201612/data/LOL_MATCH_DETAIL_7325.js"
[3] json_file = requests.get(good_url)
[4] json_content = json_file.json()
[5] if test == True:
[6] pprint.pprint(json_content)
[7] return(json_content)
When I run this function I get an error, "JSONDecodeError: Expecting value: line 1 column 1 (char 0)".
I did the obvious "google the error" and checked a bunch of the responses and it looked like in a lot of cases people had empty files or weren't getting responses. I tried changing line 3 to requests.get(good_url).content to confirm that I was getting the proper response and it looked like I was. I'm wondering if anyone could point me to why this page isn't readable as a json object.

The problem is that the page you are trying to read constains a Javascript object, instead of a JSON file. You can convert the object you are getting with demjson library, using: demjson.decode(json_file.content[12:-1]). Note that content is a string, and therefore you can slice the parts "var dataObj = {" and "}", with the given indexes. You can use this instead of your third line, and this should fix you problem.

Related

Processing JSON from a .txt file and converting to a DataFrame in Julia

Cross posting from Julia Discourse in case anyone here has any leads.
I’m just looking for some insight into why the below code is returning a dataframe containing just the first line of my json file. If you’d like to try working with the file I’m working with, you can download the aminer_papers_0.zip from the Microsoft Open Academic Graph site, I’m using the first file in that group of files.
using JSON3, DataFrames, CSV
file_name = "path/aminer_papers_0.txt"
json_string = read(file_name, String)
js = JSON3.read(json_string)
df = DataFrame([js])
The resulting DataFrame has just one line, but the column titles are correct, as is the first line. To me the mystery is why the rest isn’t getting processed. I think I can rule out that read() is only reading the first JSON object, because I can index into the resulting object and see many JSON objects:
enter image description here
My first guess was maybe the newline \n was causing escape issues, and tried to use chomp to get rid of them, but couldn’t get it to work.
Anyway - any help would be greatly appreciated!
I think the problem is that the file is in JSON Lines format, and the JSON3 library only returns the first valid JSON value that it finds at the start of a string unless told otherwise.
tl;dr
Call JSON3.read with the keyword argument jsonlines=true.
Why?
By default, JSON3 interprets a string passed to its read function as a single "JSON text", defined by RFC 8259 section 1.3.2:
A JSON text is a serialized value....
(My emphasis on the use of the indefinite singular article "a.") A "JSON value" is defined in section 1.3.3:
A JSON value MUST be an object, array, number, or string, or one of the following three literal names: false, null, true.
A string with multiple JSON values in it is technically multiple "JSON texts." It is up to the parser to determine what part of the string argument you give it is a JSON text, and the authors of JSON3 chose as the default behavior to parse from the start of the string to the end of the first valid JSON value.
In order to get JSON3 to read the string as multiple JSON values, you have to give it the keyword option jsonlines=true, which is documented as:
jsonlines: A Bool indicating that the json_str contains newline delimited JSON strings, which will be read into a JSON3.Array of the JSON values. See jsonlines for reference. [default false]
Example
Take for example this simple string:
two_values = "3.14\n2.72"
Each one of these lines is a valid JSON serialization of a number. However, when passed to JSON3.read, only the first is parsed:
using JSON3
#assert JSON3.read(two_values) == 3.14
Using jsonlines=true, both values are parsed and returned as a JSON3.Array struct:
#assert JSON3.read(two_values, jsonlines=true) == [3.14, 2.72]
Other Packages
The JSON.jl library, which people might use by default given the name, does not implement parsing of JSON Lines strings at all, leaving it up to the caller to properly split the string as needed:
using JSON
JSON.parse(two_values)
# ERROR: Expected end of input
# Line: 1
# Around: ...3.14 2.72...
# ^
A simple way to implement reading multiple values is to use eachline:
#assert [JSON.parse(line) for line in eachline(IOBuffer(two_values))] == [3.14, 2.72]

Python - How to update a value in a json file?

I hate json files. They are unwieldy and hard to handle :( Please tell me why the following doesn't work:
with open('data.json', 'r+') as file_object:
data = json.load(file_object)[user_1]['balance']
am_nt = 5
data += int(am_nt['amount'])
print(data)
file_object[user_1]['balance'] = data
Through trial and error (and many print statements), I have discovered that it opens the file, goes to the correct place, and then actually adds the am_nt, but I can't make the original json file update. Please help me :( :( . I get:
2000
TypeError: '_io.TextIOWrapper' object is not subscriptable
json is fun to work with as it is similar to python data structures.
The error is: object is not subscriptable
This error is for this line:
file_object[user_1]['balance'] = data
file_object is not json/dictionary data that can be updated like above. Hence the error.
Try to read the json data:
data=json.load(file_object)
Then manipulate the data as python dictionary. And save the file.

SurveyMonkey API Error400 on bad JSON string when JSON string for creating survey poll is valid

I was looking into ways to create surveys programmatically and came across SurveyMonkey APIs. I manually created an empty survey on my account and created a draft app and got an access token from there. And then I copied & pasted the example code from the API doc on the section of creating a survey question to try it out.
You can also see the code here where the survey ID and the page ID were obtained by using the /surveys and /surveys/{id}/pages end points.
However, I get this following error.
{u'error': {u'docs': u'https://developer.surveymonkey.com/api/v3/#error-codes', u'message': u'The body provided was not a proper JSON string.', u'http_status_code': 400, u'id': u'1001', u'name': u'Bad Request'}}
I checked that the JSON object is valid and contains all the required parameters as specified in the doc. I also tried some smaller, simpler questions and still no success. If I passed in an empty JSON object, it does complain about the missing required parameters. I was wondering if someone could point out what I'm doing wrong. Thank you!
The issue is you aren't actually sending the body as json properly.
url = "https://api.surveymonkey.net/v3/surveys/%s/pages/%s/questions" % (survey_id, page_id)
r = s.post(url, data=payload)
print r.json()
You need to json.dumps(payload)
import json
url = "https://api.surveymonkey.net/v3/surveys/%s/pages/%s/questions" % (survey_id, page_id)
r = s.post(url, data=json.dumps(payload))
print r.json()
In newer version of the requests library you can use the json kwarg for that.
url = "https://api.surveymonkey.net/v3/surveys/%s/pages/%s/questions" % (survey_id, page_id)
r = s.post(url, json=payload)
print r.json()

Problems importing JSON data with R

I'm using the jsonlite library to try to get some JSON data and turn it into a data frame that I could use, but so far it hasn't worked. I've tried three methods so far:
testData = fromJSON("<json file location>")
Which outputs:
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
parse error: trailing garbage
n_reply_to_status_id":null} {"contributors":null,"text":"Te
(right here) ------^
So I figured that if the quotes caused error, I'd just have to remove them:
singleString <- paste(readLines("<json file location>"), collapse=" ")
singleString = gsub('"', "",singleString)
singleString = fromJSON(singleString)
Which outputs:
Error: lexical error: invalid char in json text.
{contributors:null,text:A colour
(right here) ------^
It seems to be pointing to an 'o' in the text. If I delete that 'o' it just points to the 'n' that comes after it.
My last attempt I read on a forum post of someone having a similar problem that it should solve it:
stream = stream_in(file("<json file location>"))
I was pretty happy to see that this did not return an error and that a data frame named "stream" had been added to memory, but when I tried to view that data frame...
> View(stream)
Error in View : 'names' attribute [1] must be the same length as the vector [0]
I'm not exactly sure what this means or what I could do about it in this situation.
If you have an idea of I could get this data loaded correctly, I would deeply appreciate it. Thanks!
EDIT: Here is the JSON data in question: https://gist.github.com/geocachecs/f68d769aeed8e019a26cc230559bbf7f

Parsing large JSON file with Scala and JSON4S

I'm working with Scala in IntelliJ IDEA 15 and trying to parse a large twitter record json file and count the total number of hashtags. I am very new to Scala and the idea of functional programming. Each line in the json file is a json object (representing a tweet). Each line in the file starts like so:
{"in_reply_to_status_id":null,"text":"To my followers sorry..
{"in_reply_to_status_id":null,"text":"#victory","in_reply_to_screen_name"..
{"in_reply_to_status_id":null,"text":"I'm so full I can't move"..
I am most interested in a property called "entities" which contains a property called "hastags" with a list of hashtags. Here is an example:
"entities":{"hashtags":[{"text":"thewayiseeit","indices":[0,13]}],"user_mentions":[],"urls":[]},
I've browsed the various scala frameworks for parsing json and have decided to use json4s. I have the following code in my Scala script.
import org.json4s.native.JsonMethods._
var json: String = ""
for (line <- io.Source.fromFile("twitter38.json").getLines) json += line
val data = parse(json)
My logic here is that I am trying to read each line from twitter38.json into a string and then parse the entire string with parse(). The parse function is throwing an error claiming:
"Type mismatch, expected: Nothing, found:String."
I have seen examples that use parse() on strings that hold json objects such as
val jsontest =
"""{
|"name" : "bob",
|"age" : "50",
|"gender" : "male"
|}
""".stripMargin
val data = parse(jsontest)
but I have received the same error. I am coming from an object oriented programming background, is there something fundamentally wrong with the way I am approaching this problem?
You have most likely incorrectly imported dependencies to your Intellij project or modules into your file. Make sure you have the following lines imported:
import org.json4s.native.JsonMethods._
Even if you correctly import this module, parse(String: json) will not work for you, because you have incorrectly formed a json. Your json String will look like this:
"""{"in_reply_...":"someValue1"}{"in_reply_...":"someValues2"}"""
but should look as follows to be a valid json that can be parsed:
"""{{"in_reply_...":"someValue1"},{"in_reply_...":"someValues2"}}"""
i.e. you need starting and ending brackets for the json, and a comma between each line of tweets. Please read the json4s documenation for more information.
Although being almost 6 years old, I think this question deserves another try.
JSON format has a few misunderstandings in people's minds, especially how they are stored and how they are read back.
JSON documents, are stored as either a single object having all the other fields, or an array of multiple object possibly in same format. this second part is important because arrays in almost every programming language are defined by angle brackets and values separated by commas (note here I used a person object as my single value):
[
{"name":"John","surname":"Doe"},
{"name":"Jane","surname":"Doe"}
]
also note that everything except brackets, numbers and booleans are enclosed in quotes when written into file.
however, there is another use that is not official but preferred to transfer datasets easily where every object, or document as in nosql/mongo language, are stored in a new line like this:
{"name":"John","surname":"Doe"}
{"name":"Jane","surname":"Doe"}
so for the question, OP has a document written in this second form, but tries an algorithm written to read the first form. following code has few simple changes to achieve this, and the user must read the file knowing that:
var json: String = "["
for (line <- io.Source.fromFile("twitter38.json").getLines) json += line + ","
json=json.splitAt(json.length()-1)._1
json+= "]"
val data = parse(json)
PS: although #sbrannon, has the correct idea, the example he/she gave has mistakenly curly braces instead of angle brackets to surround the data.
EDIT: I have added json=json.splitAt(json.length()-1)._1 because the code above ends with a trailing comma which will cause parse error per the JSON format definition.