python error on string format with "\n" exec(compile(contents+"\n", file, 'exec'), glob, loc) - json

I'm trying to construct JSON from a string that contains "\n", like this:
ver_str= 'Package ID: version_1234\nBuild\nnumber: 154\nBuilt\n'
proj_ver_str = 'Version_123'
comb = '{"r_content": {0}, "s_version": {1}}'.format(ver_str,proj_ver_str)
json_content = json.loads()
d =json.dumps(json_content )
I'm getting this error:
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "C:/Dev/python/new_tester/simple_main.py", line 18, in <module>
comb = '{"r_content": {0}, "s_version": {1}}'.format(ver_str,proj_ver_str)
KeyError: '"r_content"'

The error arises not because of newlines in your values, but because of { and } characters in your format string other than the placeholders {0} and {1}. If you want to have an actual { or a } character in your string, double them.
Try replacing the line
comb = '{"r_content": {0}, "s_version": {1}}'.format(ver_str,proj_ver_str)
with
comb = '{{"r_content": {0}, "s_version": {1}}}'.format(ver_str,proj_ver_str)
However, this will give you a different error on the next line, loads() missing 1 required positional argument: 's'. This is because you presumably forgot to pass comb to json.loads().
Replacing json.loads() with json.loads(comb) gives you another error: json.decoder.JSONDecodeError: Expecting value: line 1 column 15 (char 14). This tells you that you've given json.loads malformed JSON to parse. If you print out the value of comb, you see the following:
{"r_content": Package ID: version_1234
Build
number: 154
Built
, "s_version": Version_123}
This isn't valid JSON, because the string values aren't surrounded by quotes. So a JSON parsing error is to be expected.
At this point, let's take a look at what your code is doing and what you seem to want it to do. It seems you want to construct a JSON string from your data, but your code puts together a JSON string from your data, parses it to a dict and then formats it back as a JSON string.
If you want to create a JSON string from your data, it's far simpler to create a dict with your values and use json.dumps on that:
d = json.dumps({"r_content": ver_str, "s_version": proj_ver_str})
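As a minimal sketch of the two routes described above (the variable name template is only illustrative): the first escapes the literal braces and lets json.dumps quote each value, the second builds the dict and serialises it once.

```python
import json

ver_str = 'Package ID: version_1234\nBuild\nnumber: 154\nBuilt\n'
proj_ver_str = 'Version_123'

# Doubled braces come out as literal { and }; {0} and {1} stay placeholders.
template = '{{"r_content": {0}, "s_version": {1}}}'
# json.dumps on each value adds the surrounding quotes and escapes the newlines.
comb = template.format(json.dumps(ver_str), json.dumps(proj_ver_str))

# Simpler: build the dict and serialise it in one step.
d = json.dumps({"r_content": ver_str, "s_version": proj_ver_str})

# Both routes describe the same data.
assert json.loads(comb) == json.loads(d)
print(d)
```

The second route is usually preferable, since it never requires hand-written quoting or escaping.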

Related

ERROR: invalid input syntax for type json DETAIL: Token "'" is invalid. while importing csv in pgadmin

I have made a new table with three columns
customer_id,media_urls,survey_taste
in a db in pgadmin with attributes as
int,text[],jsonb
respectively.
I have a csv that I was trying to import into this table using pgadmin and
the contents of that file are like this
1,"{'http://example.com','http://example.com'}","{'taste':[1,2,3,4]}"
but I am getting this error
ERROR: invalid input syntax for type json
DETAIL: Token "'" is invalid.
CONTEXT: JSON data, line 1: '...
COPY survey_taste, line 2, column survey_taste: "{'taste': [-0.19101654669350904, 0.08575981750112513, 0.07133783942655376, -0.10579014363010293, 0.0..."
To address your comments in reverse order: to have this entered in one field, you would need to have it as:
'[{"http":"abc","http":"abc"},{"taste":[1,2,3,4]}]'
Per:
select '[{"http":"abc","http":"abc"},{"taste":[1,2,3,4]}]'::json;
json
---------------------------------------------------
[{"http":"abc","http":"abc"},{"taste":[1,2,3,4]}]
As to the quoting issue:
When you pass a dict to csv you will get:
d = {"taste":[1,2,3,4]}
print(d)
{'taste': [1, 2, 3, 4]}
What you need is:
import json
json.dumps(d)
'{"taste": [1, 2, 3, 4]}'
Using json.dumps will turn the dict into a proper JSON string representation.
Putting it all together:
# Create list of dicts
l = [{'http': 'abc', 'http': 'abc'}, {'taste': [1,2,3,4]}]
# Create JSON string representation
json.dumps(l)
'[{"http": "abc"}, {"taste": [1, 2, 3, 4]}]'
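If you would rather keep the three-column layout from the question, a rough sketch of writing one CSV row might look like the following. The helper pg_text_array is hypothetical (it is not part of any library) and assumes the usual PostgreSQL array literal form with double-quoted elements; the jsonb column is produced with json.dumps as described above.

```python
import csv
import io
import json

def pg_text_array(items):
    """Hypothetical helper: format strings as a PostgreSQL array literal."""
    escaped = ['"' + s.replace('\\', '\\\\').replace('"', '\\"') + '"' for s in items]
    return '{' + ','.join(escaped) + '}'

urls = ['http://example.com', 'http://example.com']
survey = {"taste": [1, 2, 3, 4]}

# Write to an in-memory buffer; a real script would open a file instead.
buf = io.StringIO()
writer = csv.writer(buf)
# csv.writer quotes the fields and doubles any embedded double quotes.
writer.writerow([1, pg_text_array(urls), json.dumps(survey)])
line = buf.getvalue()
print(line)
```

Letting the csv module do the quoting avoids the single-quote problem, because neither the array literal nor the JSON text is ever built from Python's repr of a dict or list.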

Why do I always get a "trailing characters" error when trying to parse data with serde_json?

I have a server that returns requests in a JSON format. When trying to parse the data I always get "trailing characters" error. This happens only when getting the JSON from postman
let type_of_request = parsed_request[1];
let content_of_msg: Vec<&str> = msg_from_client.split("\r\n\r\n").collect();
println!("{}", content_of_msg[1]);
// Will print "{"username":"user","password":"password","email":"dwadwad"}"
let res: serde_json::Value = serde_json::from_str(content_of_msg[1]).unwrap();
println!("The username is: {}", res["username"]);
when getting the data from postman this happens:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error("trailing characters", line: 1, column: 60)', src\libcore\result.rs:997:5
but when having the string inside Rust:
let j = "{\"username\":\"user\",\"password\":\"password\",\"email\":\"dwadwad\"}";
let res: serde_json::Value = serde_json::from_str(j).unwrap();
println!("The username is: {}", res["username"]);
it works like a charm:
The username is: "user"
EDIT: Apparently as I read the message into a buffer and turned it into a string it saved all the NULL characters the buffer had which are of course the trailing characters.
Looking at the serde json code, one finds the following comment above the relevant ErrorCode enum element:
/// JSON has non-whitespace trailing characters after the value.
TrailingCharacters,
So as the error code implies, you've got some trailing character which is not whitespace. In your snippet, you say:
println!("{}", content_of_msg[1]);
// Will print "{"username":"user","password":"password","email":"dwadwad"}"
If you literally copied and pasted the printed output here, I'd note that I wouldn't expect the output to be wrapped in the leading and trailing quotation marks. Did you include these yourself or were they part of what was printed? If they were printed, I suspect that's the source of your problem.
Edit:
In fact, I can nearly recreate this using a raw string with leading/trailing quotation marks in Rust:
extern crate serde_json;
#[cfg(test)]
mod tests {
    #[test]
    fn test_serde() {
        let s =
            r#""{"username":"user","password":"password","email":"dwadwad"}""#;
        println!("{}", s);
        let _res: serde_json::Value = serde_json::from_str(s).unwrap();
    }
}
Running it via cargo test yields:
test tests::test_serde ... FAILED
failures:
---- tests::test_serde stdout ----
"{"username":"user","password":"password","email":"dwadwad"}"
thread 'tests::test_serde' panicked at 'called `Result::unwrap()` on an `Err` value: Error("trailing characters", line: 1, column: 4)', src/libcore/result.rs:997:5
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
failures:
tests::test_serde
Note that my printed output also includes leading/trailing quotation marks and I also get a TrailingCharacters error, albeit at a different column.
Edit 2:
Based on your comment that you've added the wrapping quotations yourself, you've got a known good string (the one you've defined in Rust), and one which you believe should match it but doesn't (the one from Postman).
This is a data problem and so we should examine the data. You can adapt the below code to check the good string against the other:
#[test]
fn test_str_comp() {
    // known good string we'll compare against
    let good =
        r#"{"username":"user","password":"password","email":"dwadwad"}"#;
    // lengthened string, additional characters
    // also n and a in username are transposed
    let bad =
        r#"{"useranme":"user","password":"password","email":"dwadwad"}abc"#;
    let good_size = good.chars().count();
    let bad_size = bad.chars().count();
    for (idx, (c1, c2)) in (0..)
        .zip(good.chars().zip(bad.chars()))
        .filter(|(_, (c1, c2))| c1 != c2)
    {
        println!(
            "Strings differ at index {}: (good: `{}`, bad: `{}`)",
            idx, c1, c2
        );
    }
    if good_size < bad_size {
        let trailing = bad.chars().skip(good_size);
        println!(
            "bad string contains extra characters: `{}`",
            trailing.collect::<String>()
        );
    } else if good_size > bad_size {
        let trailing = good.chars().skip(bad_size);
        println!(
            "good string contains extra characters: `{}`",
            trailing.collect::<String>()
        );
    }
    assert!(false);
}
For my example, this yields the failure:
test tests::test_str_comp ... FAILED
failures:
---- tests::test_str_comp stdout ----
Strings differ at index 6: (good: `n`, bad: `a`)
Strings differ at index 7: (good: `a`, bad: `n`)
bad string contains extra characters: `abc`
thread 'tests::test_str_comp' panicked at 'assertion failed: false', src/lib.rs:52:9
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
failures:
tests::test_str_comp
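The asker's edit points at NUL padding from the read buffer as the actual cause. As an illustration only, and in Python rather than Rust, the same failure mode can be reproduced and worked around like this (the padding length of 8 is an arbitrary assumption):

```python
import json

payload = '{"username":"user","password":"password","email":"dwadwad"}'
# Simulate a zero-filled read buffer that was converted to text wholesale.
buffered = payload + '\x00' * 8

try:
    json.loads(buffered)
    failed = False
except json.JSONDecodeError as exc:
    # NUL is not JSON whitespace, so the parser rejects what follows the value.
    failed = True
    print("parse failed:", exc.msg)

# Strip the padding before parsing.
cleaned = buffered.rstrip('\x00')
res = json.loads(cleaned)
print("The username is:", res["username"])
```

In the Rust code from the question, the analogous step would presumably be to parse only the bytes that were actually read, or to trim the padding (for example with trim_end_matches('\0')) before calling serde_json::from_str.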

Decoding Dict in Elm failing due to extra backslashes

I'm trying to send a dict to javascript via port for storing the value in localStorage, and retrieve it next time the Elm app starts via flag.
Below code snippets show the dict sent as well as the raw json value received through flag. The Json decoding fails showing the error message at the bottom.
The issue seems to be the extra backslashes (as in \"{\\"Left\\") contained in the raw flag value. Interestingly, console.log shows that the flag value passed by javascript is "dict1:{"Left":"fullHeightVerticalCenter","Right":"fullHeightVerticalCenter","_default":"fullHeightVerticalBottom"}" as intended, so the extra backslashes seem to be added by Elm, but I can't figure out why. Also, I'd be interested to find out a better way to pass a dict to and from javascript.
import Json.Decode as JD
import Json.Encode as JE
dict1 = Dict.fromList[("_default", "fullHeightVerticalBottom")
, ("Left", "fullHeightVerticalCenter")
, ("Right", "fullHeightVerticalCenter")]
type alias FlagsJEValue =
{dict1: String}
port setStorage : FlagsJEValue -> Cmd msg
-- inside Update function Cmd
setStorage {dict1 = JE.encode 0 (dictEncoder JE.string model.dict1)}
dictEncoder enc dict =
Dict.toList dict
|> List.map (\(k,v) -> (k, enc v))
|> JE.object
--
type alias Flags =
{dict1: Dict String String}
flagsDecoder : Decoder Flags
flagsDecoder =
JD.succeed Flags
|> required "dict1" (JD.dict JD.string)
-- inside `init`
case JD.decodeValue MyDecoders.flagsDecoder raw_flags of
Err e ->
_ = Debug.log "raw flag value" (Debug.toString (JE.encode 2 raw_flags) )
_ = Debug.log "flags error msg" (Debug.toString e)
... omitted ...
Ok flags ->
... omitted ...
-- raw flag value
"{\n \"dict1\": \"{\\\"Left\\\":\\\"fullHeightVerticalCenter\\\",\\\"Right\\\":\\\"fullHeightVerticalCenter\\\",\\\"_default\\\":\\\"fullHeightVerticalBottom\\\"}\"\n}"
--flags error msg
"Failure \"Json.Decode.oneOf failed in the following 2 ways:\\n\\n\\n\\n
(1) Problem with the given value:\\n \\n \\\"{\\\\\\\"Left\\\\\\\":\\\\\\\"fullHeightVerticalCenter\\\\\\\",\\\\\\\"Right\\\\\\\":\\\\\\\"fullHeightVerticalCenter\\\\\\\",\\\\\\\"_default\\\\\\\":\\\\\\\"fullHeightVerticalBottom\\\\\\\"}\\\"\\n \\n Expecting an OBJECT\\n\\n\\n\\n
(2) Problem with the given value:\\n \\n \\\"{\\\\\\\"Left\\\\\\\":\\\\\\\"fullHeightVerticalCenter\\\\\\\",\\\\\\\"Right\\\\\\\":\\\\\\\"fullHeightVerticalCenter\\\\\\\",\\\\\\\"_default\\\\\\\":\\\\\\\"fullHeightVerticalBottom\\\\\\\"}\\\"\\n \\n Expecting null\" <internals>”
You don't need to use JE.encode there.
You can just use your dictEncoder to produce a Json.Encode.Value and pass that directly to setStorage.
The problem you're encountering is that you've encoded the dict to a json string (using JE.encode) and then sent that string over a port, and the port has encoded that string as json again. You see extra slashes because the json string is double encoded.
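Double encoding is not specific to Elm. As a language-neutral illustration (written in Python, with a shortened version of the dict from the question), compare encoding once with encoding an already-encoded string:

```python
import json

dict1 = {"Left": "fullHeightVerticalCenter", "_default": "fullHeightVerticalBottom"}

# Encoded once: dict1 is nested as a JSON object.
once = json.dumps({"dict1": dict1})
# Encoded twice: dict1 is first turned into a string, then that string is encoded again.
twice = json.dumps({"dict1": json.dumps(dict1)})

print(once)
print(twice)  # the inner quotes are now preceded by backslashes

# A decoder expecting an object succeeds on the first and sees a string in the second.
assert isinstance(json.loads(once)["dict1"], dict)
assert isinstance(json.loads(twice)["dict1"], str)
```

This mirrors the "Expecting an OBJECT" failure in the error message: the decoder receives a string where it expects an object.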

How to load a json file with strings including double quotes (")

I've been given a load of JSON files which I'm trying to load into python 3.5
I've already had to do some clean up work, removing double backslashes and extra quotations, however I've run into an issue I don't know how to solve.
I'm running the following code:
with open(filepath,'r') as json_file:
    reader = json_file.readlines()
    for row in reader:
        row = row.replace('\\', '')
        row = row.replace('"{', '{')
        row = row.replace('}"', '}')
        response = json.loads(row)
        for i in response:
            responselist.append(i['ActionName'])
However it's throwing up the error:
JSONDecodeError: Expecting ',' delimiter: line 1 column 388833 (char 388832)
The part of the JSON that's causing the issue is the status text entry below:
"StatusId":8,
"StatusIdString":"UnknownServiceError",
"StatusText":"u003cCompany docTypeu003d"Mobile.Tile" statusIdu003d"421" statusTextu003d"Start time of 11/30/2015 12:15:00 PM is more than 5 minutes in the past relative to the current time of 12/1/2015 12:27:01 AM." copyrightu003d"Copyright Company Inc." versionNumberu003d"7.3" createdDateu003d"2015-12-01T00:27:01Z" responseIdu003d"e74710c0-dc7c-42db-b608-bf905d95d153" /u003e",
"ActionName":"GetTrafficTile"
I added the line breaks to illustrate my point, it looks like python is unhappy that the string contains double quotes.
I have a feeling this may be to do with my replacing '\ \' with '' messing with the unicode characters in the string. Is there any way to repair these nested strings? I don't mind if the StatusText field is deleted completely, all I'm after is a list of the ActionName fields.
EDIT:
I've hosted an example file here:
https://www.dropbox.com/s/1oanrneg3aqandz/2015-12-01T00%253A00%253A42.527Z_2015-12-01T00%253A01%253A17.478Z?dl=0
This is exactly as I received, before I've replaced the extra backslashes and quotations
Here is a pared down version of the sample with one bad entry
["{\"apiServerType\":0,\"RequestId\":\"52a65260-1637-4653-a496-7555a2386340\",\"StatusId\":0,\"StatusIdString\":\"Ok\",\"StatusText\":null,\"ActionName\":\"GetCameraImage\",\"Url\":\"http://mosi-prod.cloudapp.net/api/v1/GetCameraImage?AuthToken=vo*AB57XLptsKXf0AzKjf1MOgQ1hZ4BKipKgYl3uGew%7C&CameraId=13782\",\"Lat\":0.0,\"Lon\":0.0,\"iVendorId\":12561,\"iConsumerId\":2986897,\"iSliverId\":51846,\"UserId\":\"2986897\",\"HardwareId\":null,\"AuthToken\":\"vo*AB57XLptsKXf0AzKjf1MOgQ1hZ4BKipKgYl3uGew|\",\"RequestTime\":\"2015-12-01T00:00:42.5278699Z\",\"ResponseTime\":\"2015-12-01T00:01:02.5926127Z\",\"AppId\":null,\"HttpMethod\":\"GET\",\"RequestHeaders\":\"{\\\"Connection\\\":[\\\"keep-alive\\\"],\\\"Via\\\":[\\\"HTTP/1.1 nycnz01msp1ts10.wnsnet.attws.com\\\"],\\\"Accept\\\":[\\\"application/json\\\"],\\\"Accept-Encoding\\\":[\\\"gzip\\\",\\\"deflate\\\"],\\\"Accept-Language\\\":[\\\"en-us\\\"],\\\"Host\\\":[\\\"mosi-prod.cloudapp.net\\\"],\\\"User-Agent\\\":[\\\"Traffic/5.4.0\\\",\\\"CFNetwork/758.1.6\\\",\\\"Darwin/15.0.0\\\"]}\",\"RequestContentHeaders\":\"{}\",\"RequestContentBody\":\"\",\"ResponseBody\":null,\"ResponseContentHeaders\":\"{\\\"Content-Type\\\":[\\\"image/jpeg\\\"]}\",\"ResponseHeaders\":\"{}\",\"MiniProfilerJson\":null}"]
The problem is a little different than you think. Whatever program built these files used data that was already json-encoded and ended up double and even triple encoding some of the information. I peeled it apart in a shell session and got usable python data. You can (1) go dope-slap whoever wrote the program that built this steaming pile of... um... goodness? and (2) manually scan through and decode inner json strings.
I decoded the data and it was a list of strings, but those strings looked suspiciously like json
>>> data = json.load(open('test.json'))
>>> type(data)
<class 'list'>
>>> d0 = data[0]
>>> type(d0)
<class 'str'>
>>> d0[:70]
'{"apiServerType":0,"RequestId":"52a65260-1637-4653-a496-7555a2386340",'
Sure enough, I can decode it
>>> d0_1 = json.loads(d0)
>>> type(d0_1)
<class 'dict'>
>>> d0_1
{'ResponseBody': None, 'StatusText': None, 'AppId': None, 'ResponseTime': '2015-12-01T00:01:02.5926127Z', 'HardwareId': None, 'RequestTime': '2015-12-01T00:00:42.5278699Z', 'StatusId': 0, 'Lon': 0.0, 'Url': 'http://mosi-prod.cloudapp.net/api/v1/GetCameraImage?AuthToken=vo*AB57XLptsKXf0AzKjf1MOgQ1hZ4BKipKgYl3uGew%7C&CameraId=13782', 'RequestContentBody': '', 'RequestId': '52a65260-1637-4653-a496-7555a2386340', 'MiniProfilerJson': None, 'RequestContentHeaders': '{}', 'ActionName': 'GetCameraImage', 'StatusIdString': 'Ok', 'HttpMethod': 'GET', 'iSliverId': 51846, 'ResponseHeaders': '{}', 'ResponseContentHeaders': '{"Content-Type":["image/jpeg"]}', 'apiServerType': 0, 'AuthToken': 'vo*AB57XLptsKXf0AzKjf1MOgQ1hZ4BKipKgYl3uGew|', 'iConsumerId': 2986897, 'RequestHeaders': '{"Connection":["keep-alive"],"Via":["HTTP/1.1 nycnz01msp1ts10.wnsnet.attws.com"],"Accept":["application/json"],"Accept-Encoding":["gzip","deflate"],"Accept-Language":["en-us"],"Host":["mosi-prod.cloudapp.net"],"User-Agent":["Traffic/5.4.0","CFNetwork/758.1.6","Darwin/15.0.0"]}', 'iVendorId': 12561, 'Lat': 0.0, 'UserId': '2986897'}
Picking one of the entries, that looks like more json
>>> hdrs = d0_1['RequestHeaders']
>>> type(hdrs)
<class 'str'>
Yep, it decodes to what I want
>>> hdrs_0 = json.loads(hdrs)
>>> type(hdrs_0)
<class 'dict'>
>>>
>>> hdrs_0["Via"]
['HTTP/1.1 nycnz01msp1ts10.wnsnet.attws.com']
>>>
>>> type(hdrs_0["Via"])
<class 'list'>
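The manual peeling above could be automated along these lines. This is only a sketch: deep_decode is a made-up helper name, the sample data is a small stand-in for the real file, and the helper assumes that any string starting with { or [ is meant to be nested JSON.

```python
import json

def deep_decode(value):
    """Recursively decode strings that themselves hold JSON objects or arrays."""
    if isinstance(value, str):
        stripped = value.strip()
        if stripped[:1] in ('{', '['):
            try:
                return deep_decode(json.loads(stripped))
            except json.JSONDecodeError:
                return value  # looked like JSON but is not; keep the text
        return value
    if isinstance(value, list):
        return [deep_decode(v) for v in value]
    if isinstance(value, dict):
        return {k: deep_decode(v) for k, v in value.items()}
    return value

# Stand-in for the file: a list of JSON strings with a further-encoded field.
inner = {"ActionName": "GetCameraImage",
         "RequestHeaders": json.dumps({"Via": ["HTTP/1.1 example"]})}
raw = json.dumps([json.dumps(inner)])

decoded = deep_decode(json.loads(raw))
responselist = [entry["ActionName"] for entry in decoded]
print(responselist)
```

Restricting the recursion to strings that start with { or [ keeps ordinary values such as '2986897' as strings instead of turning them into numbers.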
Here you are :) :
responselist = []
with open('dataFile.json','r') as json_file:
    reader = json_file.readlines()
    for row in reader:
        strActNm = 'ActionName":"'; lenActNm = len(strActNm)
        actionAt = row.find(strActNm)
        while actionAt > 0:
            nxtQuotAt = row.find('"',actionAt+lenActNm+2)
            responselist.append( row[actionAt-1: nxtQuotAt+1] )
            actionAt = row.find('ActionName":"', nxtQuotAt)
print(responselist)
which gives:
>python3.6 -u "dataFile.py"
['"ActionName":"GetTrafficTile"']
>Exit code: 0
where dataFile.json is the file with the line you provided and dataFile.py the code provided above.
It's the hard way, but if the files are badly formatted you have to find a way around that, and simple pattern matching works in any case. For more complex cases you will need regex (regular expressions), but in this case a simple .find() is enough to do the job.
The code also finds multiple "actions" in a line (if the line contains more than one action).
Here is the result for the file you provided in your link, using the following small modification of the code above:
responselist = []
with open('dataFile1.json','r') as json_file:
    reader = json_file.readlines()
    for row in reader:
        strActNm='\\"ActionName\\":\\"'
        # strActNm = 'ActionName":"'
        lenActNm = len(strActNm)
        actionAt = row.find(strActNm)
        while actionAt > 0:
            nxtQuotAt = row.find('"',actionAt+lenActNm+2)
            responselist.append( row[actionAt: nxtQuotAt+1].replace('\\','') )
            actionAt = row.find('ActionName":"', nxtQuotAt)
print(responselist)
gives:
>python3.6 -u "dataFile.py"
['"ActionName":"GetCameraImage"']
>Exit code: 0
where dataFile1.json is the file you provided in the link.
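For the regex route mentioned above, one possible sketch is shown below. The pattern and the name action_names are illustrative assumptions, and the two sample strings are shortened stand-ins for the real rows; the pattern tolerates any number of backslashes before each quote, so it covers both the cleaned and the still-escaped form.

```python
import re

# Optional backslashes before each quote, then capture up to the next quote or backslash.
ACTION_RE = re.compile(r'\\*"ActionName\\*":\\*"([^"\\]*)')

def action_names(text):
    """Return every ActionName value found in the text, escaped or not."""
    return ACTION_RE.findall(text)

plain = '"StatusIdString":"UnknownServiceError","ActionName":"GetTrafficTile"'
escaped = r'\"StatusIdString\":\"Ok\",\"ActionName\":\"GetCameraImage\",\"Url\":\"x\"'

print(action_names(plain))
print(action_names(escaped))
```

Unlike the .find() loop, this returns only the values and needs no separate variant for the escaped file.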

JSONDecodeError: Expecting value: line 1 column 1

I am receiving this error in Python 3.5.1.
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Here is my code:
import json
import urllib.request
connection = urllib.request.urlopen('http://python-data.dr-chuck.net/comments_220996.json')
js = connection.read()
print(js)
info = json.loads(str(js))
If you look at the output you receive from print() and also in your Traceback, you'll see the value you get back is not a string, it's a bytes object (prefixed by b):
b'{\n "note":"This file .....
If you fetch the URL using a tool such as curl -v, you will see that the content type is
Content-Type: application/json; charset=utf-8
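So it's JSON, encoded as UTF-8, and Python treats it as a byte stream, not a string. Calling str() on a bytes object does not decode it; it returns the printable representation, which begins with b', so the parser fails at the very first character. In order to parse this, you need to decode it into a string first.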
Change the last line of code to this:
info = json.loads(js.decode("utf-8"))
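To see the difference without touching the network, here is a small sketch that uses a hard-coded bytes value as a stand-in for connection.read() (the payload is invented for illustration):

```python
import json

# Stand-in for connection.read(); urlopen(...).read() returns bytes.
js = b'{\n "note": "This file ...", "comments": []}'

# str() gives the repr of the bytes object, starting with the letter b.
as_text = str(js)
# decode() gives the actual text.
decoded = js.decode("utf-8")

try:
    json.loads(as_text)
    str_worked = True
except json.JSONDecodeError:
    str_worked = False

info = json.loads(decoded)
print(as_text[:2], str_worked, info["comments"])
```

From Python 3.6 onward json.loads also accepts bytes directly, but on 3.5.1, as in the question, the explicit decode is needed.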
In my case, some characters in the input, such as , : " ' { } [ ], may corrupt the JSON format, so wrap json.loads(str) in try/except to check your input.
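A minimal sketch of that kind of check, assuming a hypothetical helper named try_loads that returns either the parsed value or a short error description:

```python
import json

def try_loads(text):
    """Return (value, None) on success or (None, error description) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        # msg, lineno and colno locate the first offending character.
        return None, "{} at line {} column {}".format(exc.msg, exc.lineno, exc.colno)

value, error = try_loads('{"ok": true}')
bad_value, bad_error = try_loads("{'ok': true}")  # single quotes are not valid JSON

print(value, error)
print(bad_value, bad_error)
```

Reporting the line and column makes it easier to find the stray character in a long input.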