Readthedocs build gives AttributeError: '_NamespacePath' object has no attribute 'sort'

When I try to build the docs of a project (https://github.com/devolksbank/dvb.datascience/tree/master/docs), I get the following error:
python3.6 -mvirtualenv --no-site-packages --no-download /home/docs/checkouts/readthedocs.org/user_builds/dvbdatascience/envs/latest
Using base prefix '/home/docs/.pyenv/versions/3.6.2'
New python executable in /home/docs/checkouts/readthedocs.org/user_builds/dvbdatascience/envs/latest/bin/python3.6
Not overwriting existing python script /home/docs/checkouts/readthedocs.org/user_builds/dvbdatascience/envs/latest/bin/python (you must use /home/docs/checkouts/readthedocs.org/user_builds/dvbdatascience/envs/latest/bin/python3.6)
Installing setuptools, pip, wheel... Complete output from command /home/docs/checkouts...latest/bin/python3.6 - setuptools pip wheel:
Traceback (most recent call last):
  File "<stdin>", line 8, in <module>
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 646, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 616, in _load_backward_compatible
  File "/home/docs/.pyenv/versions/3.6.2/lib/python3.6/site-packages/virtualenv_support/pip-9.0.3-py2.py3-none-any.whl/pip/__init__.py", line 43, in <module>
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 646, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 616, in _load_backward_compatible
  File "/home/docs/.pyenv/versions/3.6.2/lib/python3.6/site-packages/virtualenv_support/pip-9.0.3-py2.py3-none-any.whl/pip/utils/__init__.py", line 27, in <module>
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 646, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 616, in _load_backward_compatible
  File "/home/docs/.pyenv/versions/3.6.2/lib/python3.6/site-packages/virtualenv_support/pip-9.0.3-py2.py3-none-any.whl/pip/_vendor/pkg_resources/__init__.py", line 3018, in <module>
  File "/home/docs/.pyenv/versions/3.6.2/lib/python3.6/site-packages/virtualenv_support/pip-9.0.3-py2.py3-none-any.whl/pip/_vendor/pkg_resources/__init__.py", line 3004, in _call_aside
  File "/home/docs/.pyenv/versions/3.6.2/lib/python3.6/site-packages/virtualenv_support/pip-9.0.3-py2.py3-none-any.whl/pip/_vendor/pkg_resources/__init__.py", line 3046, in _initialize_master_working_set
  File "/home/docs/.pyenv/versions/3.6.2/lib/python3.6/site-packages/virtualenv_support/pip-9.0.3-py2.py3-none-any.whl/pip/_vendor/pkg_resources/__init__.py", line 2578, in activate
  File "/home/docs/.pyenv/versions/3.6.2/lib/python3.6/site-packages/virtualenv_support/pip-9.0.3-py2.py3-none-any.whl/pip/_vendor/pkg_resources/__init__.py", line 2152, in declare_namespace
  File "/home/docs/.pyenv/versions/3.6.2/lib/python3.6/site-packages/virtualenv_support/pip-9.0.3-py2.py3-none-any.whl/pip/_vendor/pkg_resources/__init__.py", line 2092, in _handle_ns
  File "/home/docs/.pyenv/versions/3.6.2/lib/python3.6/site-packages/virtualenv_support/pip-9.0.3-py2.py3-none-any.whl/pip/_vendor/pkg_resources/__init__.py", line 2121, in _rebuild_mod_path
AttributeError: '_NamespacePath' object has no attribute 'sort'
How can I solve this?

Related

"trailing data" error when reading json to Pandas dataframe

I have a Python 3.8.5 script that gets JSON from an API, saves it to disk, and reads the JSON into a DataFrame. It works.
df = pd.io.json.read_json('json_file', orient='records')
I want to try an in-memory buffer instead so I don't have to read/write to disk, but I am getting an error. The code is like this:
import json
import pandas as pd
from io import StringIO

io = StringIO()
json_out = []
# some code to append API results to json_out
json.dump(json_out, io)
df = pd.io.json.read_json(io.getvalue())
On that last line I get the error
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\util\_decorators.py", line 199, in wrapper
return func(*args, **kwargs)
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\util\_decorators.py", line 296, in wrapper
return func(*args, **kwargs)
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 618, in read_json
result = json_reader.read()
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 755, in read
obj = self._get_object_parser(self.data)
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 777, in _get_object_parser
obj = FrameParser(json, **kwargs).parse()
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 886, in parse
self._parse_no_numpy()
File "C:\Users\chap\Anaconda3\lib\site-packages\pandas\io\json\_json.py", line 1119, in _parse_no_numpy
loads(json, precise_float=self.precise_float), dtype=None
ValueError: Trailing data
The JSON is in list format. This is not the actual JSON, but it looks like this when I write it to disk:
json = [
    {"state": "North Dakota",
     "address": "123 30th st E #206",
     "account": "123"
    },
    {"state": "North Dakota",
     "address": "456 30th st E #206",
     "account": "456"
    }
]
Given that it worked in the first case (write/read from disk), I don't know how to troubleshoot. How do I troubleshoot something in the buffer? The actual data is mostly text but has some number fields.
Don't know what's going wrong for you, this works for me:
import json
import pandas as pd
from io import StringIO

json_out = [
    {"state": "North Dakota",
     "address": "123 30th st E #206",
     "account": "123"
    },
    {"state": "North Dakota",
     "address": "456 30th st E #206",
     "account": "456"
    }
]
io = StringIO()
json.dump(json_out, io)
df = pd.io.json.read_json(io.getvalue())
print(df)
That leads me to believe there's something wrong with the code that appends the API data...
However, if you have a list of dictionaries, you don't need the IO step. You can just do:
pd.DataFrame(json_out)
EDIT: I think I remember this error when there was a comma at the end of my json like so:
[
    {
        "hello": "world",
    },
]
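For what it's worth, strict JSON parsers, including Python's standard json module, reject trailing commas like those. A quick check with toy data (just a sketch, not the asker's actual API output):
import json

bad = '[{"hello": "world",},]'  # trailing commas, as in the snippet above
try:
    json.loads(bad)
except json.JSONDecodeError as err:
    print(err)  # e.g. "Expecting property name enclosed in double quotes: line 1 column 21 (char 20)"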

Why can't python parse this json string into a dictionary?

This is the string I'm trying to parse into a dictionary variable in python 3.8:
str = '{"asdasd": {"username": "asdsad", "filename": "asdsad", "password": "asdsa", "accounts": "11", "headless": True}}'
and when I run:
import json
json.loads(str)
This is the error which is raised:
Traceback (most recent call last):
File "<pyshell#3>", line 1, in <module>
json.loads(str)
File "C:\Users\kaspe\AppData\Local\Programs\Python\Python38-32\lib\json\__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "C:\Users\kaspe\AppData\Local\Programs\Python\Python38-32\lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Users\kaspe\AppData\Local\Programs\Python\Python38-32\lib\json\decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 108 (char 107)
Any help would be appreciated!
Your str variable doesn't contain valid JSON.
It should have been
str = '{"asdasd": {"username": "asdsad", "filename": "asdsad", "password": "asdsa", "accounts": "11", "headless": true}}'
In JSON, booleans are represented as true/false; in Python, as True/False.
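If the string was produced by printing a Python dict (which the capitalized True suggests), ast.literal_eval can parse it as a Python literal instead of hand-fixing the value. A small sketch, with the dict shortened for brevity:
import ast
import json

s = '{"asdasd": {"username": "asdsad", "headless": True}}'  # Python literal, not JSON
data = ast.literal_eval(s)         # understands Python's True/False/None
print(data["asdasd"]["headless"])  # True
print(json.dumps(data))            # serializes back to valid JSON: ... "headless": true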

Why does Sublime Text 3 allow comments in JSON configuration files?

Using comments in JSON configuration files in Sublime Text can make JSON objects unable to be decoded. Here is my story.
I newly installed the SublimeREPL plugin in my Sublime Text 3. Soon I discovered that it ran Python 2.7 instead of 3.5 by default, so I added my own Python 3.5 configuration files according to the SublimeREPL Docs to make it support Python 3.5.
My Packages/SublimeREPL/config/Python3.5/Main.sublime-menu JSON config file looks like this:
[{
    "id": "tools",
    "children": [{
        "caption": "SublimeREPL",
        "mnemonic": "R",
        "id": "SublimeREPL",
        "children": [
            {"caption": "Python3.5",
             "id": "Python3.5",
             "children": [
                {"command": "repl_open",
                 "caption": "Python3.5",
                 "id": "repl_python3.5",
                 "mnemonic": "P",
                 "args": {
                    "type": "subprocess",
                    "encoding": "utf8",
                    "cmd": ["python3", "-i", "-u"],
                    "cwd": "$file_path",
                    "syntax": "Packages/Python/Python.tmLanguage",
                    "external_id": "python3",
                    "extend_env": {"PYTHONIOENCODING": "utf-8"}
                 }
                },
                // run files
                {"command": "repl_open",
                 "caption": "Python3.5 - RUN current file",
                 "id": "repl_python3.5_run",
                 "mnemonic": "R",
                 "args": {
                    "type": "subprocess",
                    "encoding": "utf8",
                    "cmd": ["python3", "-u", "$file_basename"],
                    "cwd": "$file_path",
                    "syntax": "Packages/Python/Python.tmLanguage",
                    "external_id": "python3",
                    "extend_env": {"PYTHONIOENCODING": "utf-8"}
                 }
                }
             ]}
        ]
    }]
}]
Note there is a comment // run files in this file. This config works fine from the menu bar tools->SublimeREPL->Python3.5. However, when I tried to bind the F5 key with repl_python3.5_run to have easier access to 3.5, the following exception was thrown in the console:
Traceback (most recent call last):
File "./python3.3/json/decoder.py", line 367, in raw_decode
StopIteration
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/sublime_text/sublime_plugin.py", line 551, in run_
return self.run(**args)
File "/home/ubuntu/.config/sublime-text-3/Packages/SublimeREPL/run_existing_command.py", line 32, in run
json_cmd = self._find_cmd(id, path)
File "/home/ubuntu/.config/sublime-text-3/Packages/SublimeREPL/run_existing_command.py", line 41, in _find_cmd
return self._find_cmd_in_file(id, file)
File "/home/ubuntu/.config/sublime-text-3/Packages/SublimeREPL/run_existing_command.py", line 53, in _find_cmd_in_file
data = json.loads(bytes)
File "./python3.3/json/__init__.py", line 316, in loads
File "./python3.3/json/decoder.py", line 351, in decode
File "./python3.3/json/decoder.py", line 369, in raw_decode
ValueError: No JSON object could be decoded
After I removed the // run files comment, the F5 key worked fine. It's exactly the comment that causes the problem.
Sublime Text uses JSON for config files, and lots of those config files come with //-style comments. As we know, comments were removed from JSON by design.
Then how can Sublime Text allow comments in config files? Is it using a pipe? If it is, how can my key binding fail?
Sublime itself (the core program, not plugins like SublimeREPL) uses an internal JSON library for parsing config files like .sublime-settings, .sublime-menu, .sublime-build, etc. This (most likely customized) parser allows comments.
However, plugins are run through a version of Python (currently 3.3.6 for the dev builds) linked to the Sublime plugin_host executable. Any plugin that imports the standard library's json module (such as run_existing_command.py) has to obey the restrictions of that module, and that includes failing to recognize JavaScript-style comments like // in JSON.
One workaround to this would be to import an external module like commentjson that strips various types of comments, including //, before passing the data on to the standard json module. Since it is a pure Python module, you could just copy the source directory into the main SublimeREPL dir, then edit run_existing_command.py appropriately - change line 6 to import commentjson as json and you're all set.
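A rough sketch of that approach, assuming the commentjson package has been copied into the plugin directory (the file path here is just an example):
import commentjson  # tolerates // and /* */ comments, then delegates to the json module

with open('Main.sublime-menu', 'r', encoding='utf-8') as f:
    data = commentjson.loads(f.read())  # succeeds where json.loads() raises ValueError

print(data[0]['id'])  # 'tools'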

loading a json file using python [duplicate]

I am getting some data from a JSON file "new.json", and I want to filter some data and store it into a new JSON file. Here is my code:
import json

with open('new.json') as infile:
    data = json.load(infile)

for item in data:
    iden = item.get("id")
    a = item.get("a")
    b = item.get("b")
    c = item.get("c")
    if c == 'XYZ' or "XYZ" in item["text"]:
        filename = 'abc.json'
        try:
            outfile = open(filename, 'ab')
        except:
            outfile = open(filename, 'wb')
        obj_json = {}
        obj_json["ID"] = iden
        obj_json["VAL_A"] = a
        obj_json["VAL_B"] = b
And I am getting an error, the traceback is:
File "rtfav.py", line 3, in <module>
data = json.load(infile)
File "/usr/lib64/python2.7/json/__init__.py", line 278, in load
**kw)
File "/usr/lib64/python2.7/json/__init__.py", line 326, in loads
return _default_decoder.decode(s)
File "/usr/lib64/python2.7/json/decoder.py", line 369, in decode
raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 88 column 2 - line 50607 column 2 (char 3077 - 1868399)
Here is a sample of the data in new.json; there are about 1500 more such dictionaries in the file:
{
    "contributors": null,
    "truncated": false,
    "text": "#HomeShop18 #DreamJob to professional rafter",
    "in_reply_to_status_id": null,
    "id": 421584490452893696,
    "favorite_count": 0,
    "source": "Mobile Web (M2)",
    "retweeted": false,
    "coordinates": null,
    "entities": {
        "symbols": [],
        "user_mentions": [
            {
                "id": 183093247,
                "indices": [0, 11],
                "id_str": "183093247",
                "screen_name": "HomeShop18",
                "name": "HomeShop18"
            }
        ],
        "hashtags": [
            {
                "indices": [12, 21],
                "text": "DreamJob"
            }
        ],
        "urls": []
    },
    "in_reply_to_screen_name": "HomeShop18",
    "id_str": "421584490452893696",
    "retweet_count": 0,
    "in_reply_to_user_id": 183093247,
    "favorited": false,
    "user": {
        "follow_request_sent": null,
        "profile_use_background_image": true,
        "default_profile_image": false,
        "id": 2254546045,
        "verified": false,
        "profile_image_url_https": "https://pbs.twimg.com/profile_images/413952088880594944/rcdr59OY_normal.jpeg",
        "profile_sidebar_fill_color": "171106",
        "profile_text_color": "8A7302",
        "followers_count": 87,
        "profile_sidebar_border_color": "BCB302",
        "id_str": "2254546045",
        "profile_background_color": "0F0A02",
        "listed_count": 1,
        "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
        "utc_offset": null,
        "statuses_count": 9793,
        "description": "Rafter. Rafting is what I do. Me aur mera Tablet. Technocrat of Future",
        "friends_count": 231,
        "location": "",
        "profile_link_color": "473623",
        "profile_image_url": "http://pbs.twimg.com/profile_images/413952088880594944/rcdr59OY_normal.jpeg",
        "following": null,
        "geo_enabled": false,
        "profile_banner_url": "https://pbs.twimg.com/profile_banners/2254546045/1388065343",
        "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
        "name": "Jayy",
        "lang": "en",
        "profile_background_tile": false,
        "favourites_count": 41,
        "screen_name": "JzayyPsingh",
        "notifications": null,
        "url": null,
        "created_at": "Fri Dec 20 05:46:00 +0000 2013",
        "contributors_enabled": false,
        "time_zone": null,
        "protected": false,
        "default_profile": false,
        "is_translator": false
    },
    "geo": null,
    "in_reply_to_user_id_str": "183093247",
    "lang": "en",
    "created_at": "Fri Jan 10 10:09:09 +0000 2014",
    "filter_level": "medium",
    "in_reply_to_status_id_str": null,
    "place": null
}
As you can see in the following example, json.loads (and json.load) does not decode multiple JSON objects:
>>> json.loads('{}')
{}
>>> json.loads('{}{}') # == json.loads(json.dumps({}) + json.dumps({}))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\json\__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "C:\Python27\lib\json\decoder.py", line 368, in decode
raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 3 - line 1 column 5 (char 2 - 4)
If you want to dump multiple dictionaries, wrap them in a list and dump the list (instead of dumping each dictionary separately):
>>> dict1 = {}
>>> dict2 = {}
>>> json.dumps([dict1, dict2])
'[{}, {}]'
>>> json.loads(json.dumps([dict1, dict2]))
[{}, {}]
Iterate over the file, loading each line as JSON in the loop:
tweets = []
for line in open('tweets.json', 'r'):
    tweets.append(json.loads(line))
This avoids storing intermediate python objects. As long as you write one full tweet per append() call, this should work.
I came across this because I was trying to load a JSON file dumped from MongoDB. It was giving me an error
JSONDecodeError: Extra data: line 2 column 1
The MongoDB JSON dump has one object per line, so what worked for me is:
import json
data = [json.loads(line) for line in open('data.json', 'r')]
This may also happen if your JSON file is not just 1 JSON record.
A JSON record looks like this:
[{"some data": value, "next key": "another value"}]
It opens and closes with brackets [ ]; within the brackets are the braces { }. There can be many pairs of braces, but it all ends with a closing bracket ].
If your json file contains more than one of those:
[{"some data": value, "next key": "another value"}]
[{"2nd record data": value, "2nd record key": "another value"}]
then loads() will fail.
I verified this with my own file that was failing.
import json
guestFile = open("1_guests.json",'r')
guestData = guestFile.read()
guestFile.close()
gdfJson = json.loads(guestData)
This works because 1_guests.json has one record []. The original file I was using, all_guests.json, had 6 records separated by newlines. I deleted 5 records (which I had already checked to be bookended by brackets) and saved the file under a new name. Then the loads statement worked.
Error was
raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 2 column 1 - line 10 column 1 (char 261900 - 6964758)
PS. I use the word record, but that's not the official name. Also, if your file has newline characters like mine, you can loop through it to loads() one record at a time into a json variable.
I just got the same error when my JSON file was like this:
{"id":"1101010","city_id":"1101","name":"TEUPAH SELATAN"}
{"id":"1101020","city_id":"1101","name":"SIMEULUE TIMUR"}
And I found it malformed, so I changed it to:
{
    "datas": [
        {"id": "1101010", "city_id": "1101", "name": "TEUPAH SELATAN"},
        {"id": "1101020", "city_id": "1101", "name": "SIMEULUE TIMUR"}
    ]
}
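Once wrapped like that, the whole file parses in one call (a small sketch; the filename is my own example):
import json

with open('districts.json', 'r') as f:  # hypothetical filename
    payload = json.load(f)

for row in payload["datas"]:
    print(row["id"], row["name"])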
One-liner for your problem:
data = [json.loads(line) for line in open('tweets.json', 'r')]
If you want to solve it in a two-liner you can do it like this:
with open('data.json') as f:
    data = [json.loads(line) for line in f]
The error is due to the \n symbol if you use the read() method of the file descriptor... so don't bypass the problem by using readlines() & co; just remove that character!
import json

path = 'old.json'   # example input file containing, e.g., {"c": 4} spread over multiple lines
path2 = 'new.json'  # example output file
new_d = {'new': 5}

with open(path, 'r') as fd:
    d_old_str = fd.read().replace('\n', '')  # remove all \n

old_d = json.loads(d_old_str)
new_d |= old_d  # update new_d (Python 3.9+; otherwise new_d.update(old_d))

with open(path2, 'w') as fd:
    fd.write(json.dumps(new_d))  # save the dictionary to file (in case needed)
... and if you really, really want to use readlines(), here is an alternative solution:
new_d = {'new': 5}
with open('some_path', 'r') as fd:
    d_old_str = ''.join(fd.readlines()).replace('\n', '')  # concatenate the lines and drop the newlines
d_old = json.loads(d_old_str)
# then as above
I think saving dicts in a list, as proposed by @falsetru, is not an ideal solution here. A better way is to iterate through the dicts and save them to .json, adding a new line after each one.
Our two dictionaries are:
d1 = {'a':1}
d2 = {'b':2}
You can write them to .json like this:
import json

with open('sample.json', 'a') as sample:
    for d in [d1, d2]:
        sample.write('{}\n'.format(json.dumps(d)))
And you can read the JSON file back without any issues:
with open('sample.json', 'r') as sample:
    for line in sample:
        line = json.loads(line.strip())
Simple and efficient
My JSON file was formatted exactly like the one in the question, but none of the solutions here worked out. Finally I found a workaround on another Stack Overflow thread. Since this post is the first link in a Google search, I put that answer here so that other people who come to this post in the future will find it more easily.
As it's been said there, a valid JSON file needs a "[" at the beginning and a "]" at the end of the file. Moreover, each JSON item must end with "}," instead of "}", except the last one. All brackets without quotation marks! This piece of code just modifies the malformed JSON file into its correct format.
https://stackoverflow.com/a/51919788/2772087
If your data comes from a source outside your control, use something like this:
import json
from json import JSONDecodeError


def load_multi_json(line: str) -> list:
    """Fix files that have multiple JSON objects on one line."""
    try:
        return [json.loads(line)]
    except JSONDecodeError as err:
        if err.msg == 'Extra data':
            head = [json.loads(line[0:err.pos])]    # parse the leading object
            tail = load_multi_json(line[err.pos:])  # recurse on the remainder
            return head + tail
        else:
            raise
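For example, given two objects jammed onto one line (toy input of my own, not from the question):
print(load_multi_json('{"a": 1}{"b": 2}'))
# [{'a': 1}, {'b': 2}]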

Avro Namespace Error

I have a question.
I made the following Avro schema:
{
    "namespace": "foo",
    "fields": [
        {
            "type": ["string", "null"],
            "name": "narf"
        },
        {
            "namespace": "foo.run",
            "fields": [
                {
                    "type": ["string", "null"],
                    "name": "baz"
                }
            ],
            "type": "record",
            "name": "foo"
        }
    ],
    "type": "record",
    "name": "run"
}
When I try to compile this I get the following error:
/usr/bin/python3.4 /home/marius/PycharmProjects/AvroTest/avroTest.py
Traceback (most recent call last):
File "/home/marius/PycharmProjects/AvroTest/avroTest.py", line 11, in
schema = avro.schema.Parse(open("simple.avsc").read())
File "/usr/local/lib/python3.4/dist-packages/avro_python3_snapshot-1.7.7-py3.4.egg/avro/schema.py", line 1283, in Parse
return SchemaFromJSONData(json_data, names)
File "/usr/local/lib/python3.4/dist-packages/avro_python3_snapshot-1.7.7-py3.4.egg/avro/schema.py", line 1254, in SchemaFromJSONData
return parser(json_data, names=names)
File "/usr/local/lib/python3.4/dist-packages/avro_python3_snapshot-1.7.7-py3.4.egg/avro/schema.py", line 1182, in _SchemaFromJSONObject
other_props=other_props,
File "/usr/local/lib/python3.4/dist-packages/avro_python3_snapshot-1.7.7-py3.4.egg/avro/schema.py", line 1061, in init
fields = make_fields(names=nested_names)
File "/usr/local/lib/python3.4/dist-packages/avro_python3_snapshot-1.7.7-py3.4.egg/avro/schema.py", line 1173, in MakeFields
return tuple(RecordSchema._MakeFieldList(field_desc_list, names))
File "/usr/local/lib/python3.4/dist-packages/avro_python3_snapshot-1.7.7-py3.4.egg/avro/schema.py", line 986, in _MakeFieldList
yield RecordSchema._MakeField(index, field_desc, names)
File "/usr/local/lib/python3.4/dist-packages/avro_python3_snapshot-1.7.7-py3.4.egg/avro/schema.py", line 957, in _MakeField
names=names,
File "/usr/local/lib/python3.4/dist-packages/avro_python3_snapshot-1.7.7-py3.4.egg/avro/schema.py", line 1254, in SchemaFromJSONData
return parser(json_data, names=names)
File "/usr/local/lib/python3.4/dist-packages/avro_python3_snapshot-1.7.7-py3.4.egg/avro/schema.py", line 1135, in _SchemaFromJSONString
% (json_string, sorted(names.names)))
avro.schema.SchemaParseException: Unknown named schema 'record', known names: ['foo.run'].
And I have no idea why. In my mind the error is in the record called "foo", but the namespace I gave ("foo.run") is in the namespace list, yet it raises an error anyway. I guess I misunderstand something regarding namespaces, but I could not figure out what.
Greetings
Marius
OK, found the error: I generated this schema with my own program, and it had a bug when a nested record appears within a nested record.
See: How to nest records in an Avro schema?
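For reference, a nested record has to appear as the type of a field, not as a field object itself. A hedged sketch of a shape that parses (the "inner" field and "inner_record" names are my own inventions), using the same avro.schema.Parse entry point seen in the traceback:
import avro.schema

# The inner record goes in the field's "type", not alongside "name" at the field level.
schema_json = """
{
    "type": "record",
    "name": "run",
    "namespace": "foo",
    "fields": [
        {"name": "narf", "type": ["string", "null"]},
        {"name": "inner",
         "type": {
             "type": "record",
             "name": "inner_record",
             "fields": [
                 {"name": "baz", "type": ["string", "null"]}
             ]
         }}
    ]
}
"""

schema = avro.schema.Parse(schema_json)
print(schema.name)  # run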