JSONDecodeError: Expecting value: line 1 column 1 - json

I am receiving this error in Python 3.5.1.
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Here is my code:
import json
import urllib.request
connection = urllib.request.urlopen('http://python-data.dr-chuck.net/comments_220996.json')
js = connection.read()
print(js)
info = json.loads(str(js))

If you look at the output you receive from print() and also in your Traceback, you'll see the value you get back is not a string, it's a bytes object (prefixed by b):
b'{\n "note":"This file .....
If you fetch the URL using a tool such as curl -v, you will see that the content type is
Content-Type: application/json; charset=utf-8
So it's JSON, encoded as UTF-8, and Python is considering it a byte stream, not a simple string. In order to parse this, you need to convert it into a string first.
Change the last line of code to this:
info = json.loads(js.decode("utf-8"))

in my case, some characters like " , :"'{}[] " maybe corrupt the JSON format, so use try json.loads(str) except to check your input

Related

python error on string format with "\n" exec(compile(contents+"\n", file, 'exec'), glob, loc)

i try to construct JSON with string that contains "\n" in it like this :
ver_str= 'Package ID: version_1234\nBuild\nnumber: 154\nBuilt\n'
proj_ver_str = 'Version_123'
comb = '{"r_content": {0}, "s_version": {1}}'.format(ver_str,proj_ver_str)
json_content = json.loads()
d =json.dumps(json_content )
getting this error:
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Dev/python/new_tester/simple_main.py", line 18, in <module>
comb = '{"r_content": {0}, "s_version": {1}}'.format(ver_str,proj_ver_str)
KeyError: '"r_content"'
The error arises not because of newlines in your values, but because of { and } characters in your format string other than the placeholders {0} and {1}. If you want to have an actual { or a } character in your string, double them.
Try replacing the line
comb = '{"r_content": {0}, "s_version": {1}}'.format(ver_str,proj_ver_str)
with
comb = '{{"r_content": {0}, "s_version": {1}}}'.format(ver_str,proj_ver_str)
However, this will give you a different error on the next line, loads() missing 1 required positional argument: 's'. This is because you presumably forgot to pass comb to json.loads().
Replacing json.loads() with json.loads(comb) gives you another error: json.decoder.JSONDecodeError: Expecting value: line 1 column 15 (char 14). This tells you that you've given json.loads malformed JSON to parse. If you print out the value of comb, you see the following:
{"r_content": Package ID: version_1234
Build
number: 154
Built
, "s_version": Version_123}
This isn't valid JSON, because the string values aren't surrounded by quotes. So a JSON parsing error is to be expected.
At this point, let's take a look at what your code is doing and what you seem to want it to do. It seems you want to construct a JSON string from your data, but your code puts together a JSON string from your data, parses it to a dict and then formats it back as a JSON string.
If you want to create a JSON string from your data, it's far simpler to create a dict with your values and use json.dumps on that:
d = json.dumps({"r_content": ver_str, "s_version": proj_ver_str})

Cannot send base64 String to PubNub

I am using PyCamera module of Raspberry Pi to capture an image and store as .jpg.
first encoding the image using base64.encodestring(). but while sending encoded string to PubNub server, I get error on my_publish_callback as
('ERROR: ', 'Expecting value: line 1 column 1 (char 0)')
('ERROR: ', JSONDecodeError('Expecting value: line 1 column 1 (char 0)',))
I have tried using base64.b64encode() but still get the same errors. I have tried the script in python 2 and 3;
def my_publish_callback(envelope, status):
if not status.is_error():
pass # Message successfully published to specified channel.
else:
#print("recv: ", envelope)
print("ERROR: ", status.error_data.information)
print("ERROR: ", status.error_data.exception)
def publish(channel, msg):
pubnub.publish().channel(channel).message(msg).async(my_publish_callback)
def captureAndSendImage():
camera.start_preview()
time.sleep(2)
camera.capture("/home/pi/Desktop/image.jpg")
camera.stop_preview()
with open("/home/pi/Desktop/image.jpg", "rb") as f:
encoded = base64.encodestring(f.read())
publish(myChannel, str(encoded))
I am not able to find or print full error traceback so that I can get some more clues about where the error is occurring. But it looks like PubNub is trying to parse the data in JSON, and its failing.
I realized the .jpg file size is 154KB, whereas PubNub max packet size is 32KB, so that should clearly say it all. PubNub recommends to send large messages by splitting them and re-arranging them in subscriber-end. Thanks #Craig for referring to that link, Its useful though support.pubnub.com/support/discussions/topics/14000006326

How to load a json file with strings including double quotes (")

I've been given a load of JSON files which I'm trying to load into python 3.5
I've already had to do some clean up work, removing double backslashes and extra quotations, however I've run into an issue I don't know how to solve.
I'm running the following code:
with open(filepath,'r') as json_file:
reader = json_file.readlines()
for row in reader:
row = row.replace('\\', '')
row = row.replace('"{', '{')
row = row.replace('}"', '}')
response = json.loads(row)
for i in response:
responselist.append(i['ActionName'])
However it's throwing up the error:
JSONDecodeError: Expecting ',' delimiter: line 1 column 388833 (char 388832)
The part of the JSON that's causing the issue is the status text entry below:
"StatusId":8,
"StatusIdString":"UnknownServiceError",
"StatusText":"u003cCompany docTypeu003d"Mobile.Tile" statusIdu003d"421" statusTextu003d"Start time of 11/30/2015 12:15:00 PM is more than 5 minutes in the past relative to the current time of 12/1/2015 12:27:01 AM." copyrightu003d"Copyright Company Inc." versionNumberu003d"7.3" createdDateu003d"2015-12-01T00:27:01Z" responseIdu003d"e74710c0-dc7c-42db-b608-bf905d95d153" /u003e",
"ActionName":"GetTrafficTile"
I added the line breaks to illustrate my point, it looks like python is unhappy that the string contains double quotes.
I have a feeling this may be to do with my replacing '\ \' with '' messing with the unicode characters in the string. Is there any way to repair these nested strings? I don't mind if the StatusText field is deleted completely, all I'm after is a list of the ActionName fields.
EDIT:
I've hosted an example file here:
https://www.dropbox.com/s/1oanrneg3aqandz/2015-12-01T00%253A00%253A42.527Z_2015-12-01T00%253A01%253A17.478Z?dl=0
This is exactly as I received, before I've replaced the extra backslashes and quotations
Here is a pared down version of the sample with one bad entry
["{\"apiServerType\":0,\"RequestId\":\"52a65260-1637-4653-a496-7555a2386340\",\"StatusId\":0,\"StatusIdString\":\"Ok\",\"StatusText\":null,\"ActionName\":\"GetCameraImage\",\"Url\":\"http://mosi-prod.cloudapp.net/api/v1/GetCameraImage?AuthToken=vo*AB57XLptsKXf0AzKjf1MOgQ1hZ4BKipKgYl3uGew%7C&CameraId=13782\",\"Lat\":0.0,\"Lon\":0.0,\"iVendorId\":12561,\"iConsumerId\":2986897,\"iSliverId\":51846,\"UserId\":\"2986897\",\"HardwareId\":null,\"AuthToken\":\"vo*AB57XLptsKXf0AzKjf1MOgQ1hZ4BKipKgYl3uGew|\",\"RequestTime\":\"2015-12-01T00:00:42.5278699Z\",\"ResponseTime\":\"2015-12-01T00:01:02.5926127Z\",\"AppId\":null,\"HttpMethod\":\"GET\",\"RequestHeaders\":\"{\\\"Connection\\\":[\\\"keep-alive\\\"],\\\"Via\\\":[\\\"HTTP/1.1 nycnz01msp1ts10.wnsnet.attws.com\\\"],\\\"Accept\\\":[\\\"application/json\\\"],\\\"Accept-Encoding\\\":[\\\"gzip\\\",\\\"deflate\\\"],\\\"Accept-Language\\\":[\\\"en-us\\\"],\\\"Host\\\":[\\\"mosi-prod.cloudapp.net\\\"],\\\"User-Agent\\\":[\\\"Traffic/5.4.0\\\",\\\"CFNetwork/758.1.6\\\",\\\"Darwin/15.0.0\\\"]}\",\"RequestContentHeaders\":\"{}\",\"RequestContentBody\":\"\",\"ResponseBody\":null,\"ResponseContentHeaders\":\"{\\\"Content-Type\\\":[\\\"image/jpeg\\\"]}\",\"ResponseHeaders\":\"{}\",\"MiniProfilerJson\":null}"]
The problem is a little different than you think. Whatever program built these files used data that was already json-encoded and ended up double and even triple encoding some of the information. I peeled it apart in a shell session and got usable python data. You can (1) go dope-slap whoever wrote the program that built this steaming pile of... um... goodness? and (2) manually scan through and decode inner json strings.
I decoded the data and it was a list of strings, but those strings looked suspiciously like json
>>> data = json.load(open('test.json'))
>>> type(data)
<class 'list'>
>>> d0 = data[0]
>>> type(d0)
<class 'str'>
>>> d0[:70]
'{"apiServerType":0,"RequestId":"52a65260-1637-4653-a496-7555a2386340",'
Sure enough, I can decode it
>>> d0_1 = json.loads(d0)
>>> type(d0_1)
<class 'dict'>
>>> d0_1
{'ResponseBody': None, 'StatusText': None, 'AppId': None, 'ResponseTime': '2015-12-01T00:01:02.5926127Z', 'HardwareId': None, 'RequestTime': '2015-12-01T00:00:42.5278699Z', 'StatusId': 0, 'Lon': 0.0, 'Url': 'http://mosi-prod.cloudapp.net/api/v1/GetCameraImage?AuthToken=vo*AB57XLptsKXf0AzKjf1MOgQ1hZ4BKipKgYl3uGew%7C&CameraId=13782', 'RequestContentBody': '', 'RequestId': '52a65260-1637-4653-a496-7555a2386340', 'MiniProfilerJson': None, 'RequestContentHeaders': '{}', 'ActionName': 'GetCameraImage', 'StatusIdString': 'Ok', 'HttpMethod': 'GET', 'iSliverId': 51846, 'ResponseHeaders': '{}', 'ResponseContentHeaders': '{"Content-Type":["image/jpeg"]}', 'apiServerType': 0, 'AuthToken': 'vo*AB57XLptsKXf0AzKjf1MOgQ1hZ4BKipKgYl3uGew|', 'iConsumerId': 2986897, 'RequestHeaders': '{"Connection":["keep-alive"],"Via":["HTTP/1.1 nycnz01msp1ts10.wnsnet.attws.com"],"Accept":["application/json"],"Accept-Encoding":["gzip","deflate"],"Accept-Language":["en-us"],"Host":["mosi-prod.cloudapp.net"],"User-Agent":["Traffic/5.4.0","CFNetwork/758.1.6","Darwin/15.0.0"]}', 'iVendorId': 12561, 'Lat': 0.0, 'UserId': '2986897'}
Picking one of the entries, that looks like more json
>>> hdrs = d0_1['RequestHeaders']
>>> type(hdrs)
<class 'str'>
Yep, it decodes to what I want
>>> hdrs_0 = json.loads(hdrs)
>>> type(hdrs_0)
<class 'dict'>
>>>
>>> hdrs_0["Via"]
['HTTP/1.1 nycnz01msp1ts10.wnsnet.attws.com']
>>>
>>> type(hdrs_0["Via"])
<class 'list'>
Here you are :) :
responselist = []
with open('dataFile.json','r') as json_file:
reader = json_file.readlines()
for row in reader:
strActNm = 'ActionName":"'; lenActNm = len(strActNm)
actionAt = row.find(strActNm)
while actionAt > 0:
nxtQuotAt = row.find('"',actionAt+lenActNm+2)
responselist.append( row[actionAt-1: nxtQuotAt+1] )
actionAt = row.find('ActionName":"', nxtQuotAt)
print(responselist)
which gives:
>python3.6 -u "dataFile.py"
['"ActionName":"GetTrafficTile"']
>Exit code: 0
where dataFile.json is the file with the line you provided and dataFile.py the code provided above.
It's the hard tour, but if the files are in a bad format you have to find a way around and a simple pattern matching works in any case. For more complex cases you will need regex (regular expressions), but in this case a simple .find() is enough to do the job.
The code finds also multiple "actions" in the line (if the line would contain more than one action).
Here the result for the file you provided in your link while using following small modification of the code above:
responselist = []
with open('dataFile1.json','r') as json_file:
reader = json_file.readlines()
for row in reader:
strActNm='\\"ActionName\\":\\"'
# strActNm = 'ActionName":"'
lenActNm = len(strActNm)
actionAt = row.find(strActNm)
while actionAt > 0:
nxtQuotAt = row.find('"',actionAt+lenActNm+2)
responselist.append( row[actionAt: nxtQuotAt+1].replace('\\','') )
actionAt = row.find('ActionName":"', nxtQuotAt)
print(responselist)
gives:
>python3.6 -u "dataFile.py"
['"ActionName":"GetCameraImage"']
>Exit code: 0
where dataFile1.json is the file you provided in the link.

Json Type Provider: Parsing Valid Json Fails

I have the following code block in my REPL
#r "../packages/FSharp.Data.2.2.1/lib/net40/FSharp.Data.dll"
open FSharp.Data
[<Literal>]
let uri = "http://www.google.com/finance/option_chain?q=AAPL&output=json"
type OptionChain = JsonProvider<uri>
When I run it, FSI is returning
Error 1 The type provider 'ProviderImplementation.JsonProvider'
reported an error: Cannot read sample JSON from
'http://www.google.com/finance/option_chain?q=AAPL&output=json':
Invalid JSON starting at character 1, snippet =
---- {expiry:{y:2
----- json =
------ {expiry:{y:2015,m:5,d:8},expirations: [{y:2015,m:5,d:8},{y:2015,m:5,d:15},
This json is valid according to two other sites. Is it a bug in the TP?
The output isn't valid JSON because some keys are not quoted.
{expiry:{y:2015,m:5,d:8},expirations:[{y:2015,m:5,d:8},{y:2015,m:5,d:15},{y:2015,m:5,d:22},{y:2015,m:5,d:29},{y:2015,m:6,d:5},{y:2015,m:6,d:12},{y:2015,m:6,d:19},{y:2015,m:6,d:26},{y:2015,m:7,d:17},{y:2015,m:8,d:21},{y:2015,m:10,d:16},{y:2016,m:1,d:15},{y:2017,m:1,d:20}],
puts:[{cid:"43623726334021",s:"AAPL150508P00085000",e:"OPRA",p:"-",c:"-",b:"-",a:"-",oi:"-",vol:"-",strike:"85.00",expiry:"May 8, 2015"},
...

Json parse error using POST in django rest api

I am trying to implement a simple GET/POST api via Django REST framework
views.py
class cuser(APIView):
def post(self, request):
stream = BytesIO(request.DATA)
json = JSONParser().parse(stream)
return Response()
urls.py
from django.conf.urls import patterns, url
from app import views
urlpatterns = patterns('',
url(r'^challenges/',views.getall.as_view() ),
url(r'^cuser/' , views.cuser.as_view() ),
)
I am trying to POST some json to /api/cuser/ (api is namespace in my project's urls.py ) ,
the JSON
{
"username" : "abhishek",
"email" : "john#doe.com",
"password" : "secretpass"
}
I tried from both Browseable API page and httpie ( A python made tool similar to curl)
httpie command
http --json POST http://localhost:58601/api/cuser/ username=abhishek email=john#doe.com password=secretpass
but I am getting JSON parse error :
JSON parse error - Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
Whole Debug message using --verbose --debug
POST /api/cuser/ HTTP/1.1
Content-Length: 75
Accept-Encoding: gzip, deflate
Host: localhost:55392
Accept: application/json
User-Agent: HTTPie/0.8.0
Connection: keep-alive
Content-Type: application/json; charset=utf-8
{"username": "abhishek", "email": "john#doe.com", "password": "aaezaakmi1"}
HTTP/1.0 400 BAD REQUEST
Date: Sat, 24 Jan 2015 09:40:03 GMT
Server: WSGIServer/0.1 Python/2.7.9
Vary: Accept, Cookie
Content-Type: application/json
Allow: POST, OPTIONS
{"detail":"JSON parse error - Expecting property name enclosed in double quotes: line 1 column 2 (char 1)"}
The problem that you are running into is that your request is already being parsed, and you are trying to parse it a second time.
From "How the parser is determined"
The set of valid parsers for a view is always defined as a list of classes. When request.data is accessed, REST framework will examine the Content-Type header on the incoming request, and determine which parser to use to parse the request content.
In your code you are accessing request.DATA, which is the 2.4.x equaivalent of request.data. So your request is being parsed as soon as you call that, and request.DATA is actually returning the dictionary that you were expecting to parse.
json = request.DATA
is really all you need to parse the incoming JSON data. You were really passing a Python dictionary into json.loads, which does not appear to be able to parse it, and that is why you were getting your error.
I arrived at this post via Google for
"detail": "JSON parse error - Expecting property name enclosed in double-quotes":
Turns out you CANNOT have a trailing comma in JSON.
So if you are getting this error you may need to change a post like this:
{
"username" : "abhishek",
"email" : "john#doe.com",
"password" : "secretpass",
}
to this:
{
"username" : "abhishek",
"email" : "john#doe.com",
"password" : "secretpass"
}
Note the removed comma after the last property in the JSON object.
Basically, whenever you are trying to make a post request with requests lib, This library also contains json argument which is ignored in the case when data argument is set to files or data. So basically when json argument is set with json data. Headers are set asContent-Type: application/json. Json argument basically encodes data sends into a json format. So that at DRF particularly is able to parse json data. Else in case of only data argument it is been treated as form-encoded
requests.post(url, json={"key1":"value1"})
you can find more here request.post complicated post methods