I am trying to train a YOLOv5 model, but I'm getting an exception when I try to execute the training module. The error occurs after the model is loaded, when it tries to read the training images. Below is my code and an excerpt of the error. Any help would be appreciated.
!python train.py --img 640 --batch 16 --epochs 150 --data pollen_data.yaml --weights yolov5x.pt
Model summary: 567 layers, 86217814 parameters, 86217814 gradients, 204.2 GFLOPs
Transferred 739/745 items from yolov5x.pt
Scaled weight_decay = 0.0005
optimizer: SGD with parameter groups 123 weight (no decay), 126 weight, 126 bias
albumentations: version 1.0.3 required by YOLOv5, but version 0.1.12 is currently installed
Traceback (most recent call last):
File "/content/yolov5/utils/datasets.py", line 405, in __init__
t = t.read().strip().splitlines()
File "/usr/lib/python3.7/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 643, in <module>
main(opt)
File "train.py", line 539, in main
train(opt.hyp, opt, device, callbacks)
File "train.py", line 227, in train
prefix=colorstr('train: '), shuffle=True)
File "/content/yolov5/utils/datasets.py", line 110, in create_dataloader
prefix=prefix)
File "/content/yolov5/utils/datasets.py", line 415, in __init__
raise Exception(f'{prefix}Error loading data from {path}: {e}\nSee {HELP_URL}')
Exception: train: Error loading data from /content/datasets/images/training/im0.jpg: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
The training images I have (im0.jpg and im1.jpg) are two large files. The first has dimensions of 9058 x 11185, and the second file is 13385 x 12832. I realize they are not square but I'm assuming that the train.py module will make them square, so it's okay. Is that right?
Or could the non-square dimensions be causing the choke?
Also, what is the meaning of the exception "error loading data from /content/datasets/images/training/im0.jpg: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte"?
Thank you.
I've been using YOLOv5 for the past month, and I must say your error is weird.
Also, you can't train your model with an image size of 12000. By default it should be 640. In your case it might change based on your dataset, but I'm quite sure it won't be 12000.
There is also a mistake in your data directory:
--data /content/datasets/annotations/dataset.yaml.txt
The data file won't have a '.txt' extension; it should be a '.yaml' file. So change that to
--data /content/datasets/annotations/dataset.yaml
It should start training after these changes. If not, close this question, provide additional information, and ask another one.
The error
'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
is raised when you use an image whose format is not in the default list:
IMG_FORMATS = 'bmp', 'dng', 'jpeg', 'jpg', 'mpo', 'png', 'tif', 'tiff', 'webp' # include image suffixes
But you have mentioned that it is a jpg, so I'm confused now. Also, if it helps, please try the solution provided in this issue: link
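A side note on what the message actually means: the traceback shows datasets.py calling t.read().strip().splitlines() on the path, i.e. it is opening im0.jpg as if it were a text file listing image paths. JPEG files begin with the bytes 0xFF 0xD8, and 0xFF can never start a valid UTF-8 sequence, hence the error. A minimal sketch (the path is the one from the traceback):

with open('/content/datasets/images/training/im0.jpg', 'rb') as f:
    print(f.read(2))   # b'\xff\xd8' -- JPEG files always start with the SOI marker

with open('/content/datasets/images/training/im0.jpg') as f:   # text mode, UTF-8 here
    f.read()   # UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0

So the likely culprit is the train entry in the data yaml: it probably resolves to the image file itself rather than to the images directory or a .txt list of paths.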
Related
I've been working on writing some MkDocs documentation which includes Mermaid diagrams that I'd like to keep in the markdown files instead of turning into images and embedding them.
I came across this great solution here: https://github.com/squidfunk/mkdocs-material/issues/693#issuecomment-411885426, which uses the SuperFences feature of the pymdown-extensions plugin to create a custom code block that renders the Mermaid diagrams inside the code block.
It works in MkDocs running locally, but when I submit the configuration file to Read the Docs it fails the YAML validation:
Your mkdocs.yml could not be loaded, possibly due to a syntax error (line 18, column 19)
Line 18 in the mkdocs.yml config file is the section which calls the SuperFences Python class:
format: !!python/name:pymdownx.superfences.fence_div_format
Looking at the YAML specification (https://yaml.org/spec/1.2/spec.html) shows that !! denotes an explicit tag, and it seems to have been part of the spec for quite some time (back to version 1). I've tried making the value a string, but that then causes issues with Python reading it as a string.
Does anyone know if readthedocs supports this or have you been able to get this working some other way?
Read the Docs parses the mkdocs.yml file using PyYAML, and it seems that it does not recognize !!.
For example:
>>> import yaml
>>> document = """
... a: 1
... b:
...   c: 3
...   d: !!4
... """
>>> print(yaml.dump(yaml.load(document)))
<stdin>:1: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.8/site-packages/yaml/__init__.py", line 114, in load
return loader.get_single_data()
File "/usr/lib/python3.8/site-packages/yaml/constructor.py", line 51, in get_single_data
return self.construct_document(node)
File "/usr/lib/python3.8/site-packages/yaml/constructor.py", line 60, in construct_document
for dummy in generator:
File "/usr/lib/python3.8/site-packages/yaml/constructor.py", line 413, in construct_yaml_map
value = self.construct_mapping(node)
File "/usr/lib/python3.8/site-packages/yaml/constructor.py", line 218, in construct_mapping
return super().construct_mapping(node, deep=deep)
File "/usr/lib/python3.8/site-packages/yaml/constructor.py", line 143, in construct_mapping
value = self.construct_object(value_node, deep=deep)
File "/usr/lib/python3.8/site-packages/yaml/constructor.py", line 100, in construct_object
data = constructor(self, node)
File "/usr/lib/python3.8/site-packages/yaml/constructor.py", line 427, in construct_undefined
raise ConstructorError(None, None,
yaml.constructor.ConstructorError: could not determine a constructor for the tag 'tag:yaml.org,2002:4'
in "<unicode string>", line 5, column 8:
d: !!4
^
>>>
See: https://github.com/readthedocs/readthedocs.org/issues/6889
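For completeness, a hedged sketch of the difference (assuming PyYAML >= 5.1 and pymdown-extensions installed): safe loading rejects the !!python/name: tag just as the Read the Docs validator does, while the unsafe loader resolves it to the named Python object:

import yaml

doc = 'format: !!python/name:pymdownx.superfences.fence_div_format'

try:
    yaml.safe_load(doc)   # rejects the tag, like the validator
except yaml.YAMLError as e:
    print(e)              # could not determine a constructor for the tag ...

print(yaml.load(doc, Loader=yaml.UnsafeLoader))   # the actual function object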
Explanation
This link is where you are sent after entering your hardware stats (hashrate, power, power cost, etc.). On the top bar (below the blue Twitter follow button) there is a link to a JSON file created after the page loads with the entered hardware stats; clicking on that JSON link redirects you to another URL (https://whattomine.com/asic.json).
Goal
My goal is to access that JSON file directly, after manipulating the values in the URL string, via the terminal. For example, I would like to change the hashrate from 100 to 150 in this portion of the URL:
[sha256_hr]=100& ---> [sha256_hr]=150&
After the URL manipulations (like the one above, but not limited to it), I would like to receive the JSON output so that I can pick out the desired data.
My Code
Advisory: I started Python programming around June 2017, so please forgive me.
import json
import pandas as pd
import urllib2
import requests
hashrate_ghs = float(raw_input('Hash Rate (TH/s): '))
power_W = float(raw_input('Power of Miner (W): '))
electric_cost = float(raw_input('Cost of Power ($/kWh): '))
hashrate_ths = hashrate_ghs * 1000
initial_request = ('https://whattomine.com/asic?utf8=%E2%9C%93&sha256f=true&factor[sha256_hr]={0}&factor[sha256_p]={1}&factor[cost]={2}&sort=Profitability24&volume=0&revenue=24h&factor[exchanges][]=&factor[exchanges][]=bittrex&dataset=Main&commit=Calculate'.format(hashrate_ths, power_W, electric_cost))
data_stream_mine = urllib2.Request(initial_request)
json_data = requests.get('https://whattomine.com/asic.json')
print json_data
Error from My Code
I am getting an HTTPS handshake error. This is where my Python freshness is second most blatantly visible:
Traceback (most recent call last):
File "calc_1.py", line 16, in <module>
s.get('https://whattomine.com/asic.json')
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 521, in get
return self.request('GET', url, **kwargs)
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Python/2.7/site-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/Library/Python/2.7/site-packages/requests/adapters.py", line 506, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='whattomine.com', port=443): Max retries exceeded with url: /asic.json (Caused by SSLError(SSLError(1, u'[SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:590)'),))
Thank you for your help and time!
Please advise me of any changes, or ask for any further information concerning this question.
This is just a comment. The following approach would suffice (Python 3):
import requests
initial_request = 'http://whattomine.com/asic.json?utf8=1&dataset=Main&commit=Calculate'
json_data = requests.get(initial_request)
print(json_data.json())
The key point here is this part: put .json into your initial_request and it will be enough.
You may add all your parameters, as you did originally, in the query part after the ? sign (a sketch follows).
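For instance, a minimal sketch using the params argument of requests.get; the parameter names are the factor fields from the question's URL, and the numeric values are placeholders:

import requests

params = {
    'utf8': 1,
    'factor[sha256_hr]': 150,   # hashrate, changed from 100 to 150 as in the example
    'factor[sha256_p]': 1000,   # power (W), placeholder
    'factor[cost]': 0.1,        # power cost ($/kWh), placeholder
    'dataset': 'Main',
    'commit': 'Calculate',
}
json_data = requests.get('https://whattomine.com/asic.json', params=params)
print(json_data.json())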
It looks like a few others have faced similar problems. For some it seemed to be a pyOpenSSL version issue, and uninstalling and reinstalling it fixed the problem. Another, older answer on SO asks to do the following.
I am trying to read Twitter data from a JSON file using Python 2.7.12.
The code I used is this:
import json
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
def get_tweets_from_file(file_name):
tweets = []
with open(file_name, 'rw') as twitter_file:
for line in twitter_file:
if line != '\r\n':
line = line.encode('ascii', 'ignore')
tweet = json.loads(line)
if u'info' not in tweet.keys():
tweets.append(tweet)
return tweets
Result I got:
Traceback (most recent call last):
File "twitter_project.py", line 100, in <module>
main()
File "twitter_project.py", line 95, in main
tweets = get_tweets_from_dir(src_dir, dest_dir)
File "twitter_project.py", line 59, in get_tweets_from_dir
new_tweets = get_tweets_from_file(file_name)
File "twitter_project.py", line 71, in get_tweets_from_file
line = line.encode('ascii', 'ignore')
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 3131: invalid start byte
I went through all the answers to similar issues and came up with this code, and it worked last time. I have no clue why it isn't working now.
In my case (macOS), there was a .DS_Store file in my data folder, a hidden and auto-generated file, and it caused the issue. I was able to fix the problem after removing it.
It doesn't help that you have sys.setdefaultencoding('utf-8'), which is confusing things further. It's a nasty hack and you need to remove it from your code.
See https://stackoverflow.com/a/34378962/1554386 for more information
The error is happening because line is a byte string and you're calling encode(). encode() only makes sense if the string is Unicode, so Python tries to convert it to Unicode first using the default encoding, which in your case is UTF-8 but should be ASCII. Either way, 0x80 is not valid ASCII or UTF-8, so the conversion fails.
0x80 is valid in some character sets. In windows-1252/cp1252 it's €.
The trick here is to understand the encoding of your data all the way through your code. At the moment, you're leaving too much up to chance. Unicode string types are a handy Python feature that lets you decode encoded strings and forget about the encoding until you need to write or transmit the data.
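A quick Python 2 sketch of that implicit conversion: calling .encode() on a byte string first decodes it with the default codec, which is why the reported error is a decode error even though the code only ever calls encode():

# Python 2
s = '\x80abc'                  # a byte string containing the byte 0x80
s.encode('ascii', 'ignore')    # implicitly runs s.decode('ascii') first
# UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0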
Use the io module to open the file in text mode and decode it as it goes; no more .decode()! You need to make sure the encoding of your incoming data is consistent. You can either re-encode it externally or change the encoding in your script. Here I've set the encoding to windows-1252:
import io
import json

with io.open(file_name, 'r', encoding='windows-1252') as twitter_file:
    for line in twitter_file:
        # line is now a <type 'unicode'>
        tweet = json.loads(line)
The io module also provides universal newlines. This means \r\n sequences are detected as newlines, so you don't have to watch for them.
For others who come across this question because of the error message: I ran into this error while trying to open a pickle file, because I had opened the file in text mode instead of binary mode.
This was the original code:
import pickle as pkl
with open(pkl_path, 'r') as f:
obj = pkl.load(f)
And this fixed the error:
import pickle as pkl
with open(pkl_path, 'rb') as f:
obj = pkl.load(f)
I got a similar error by accidentally trying to read a parquet file as a CSV:
import pandas as pd

pd.read_csv('file.parquet')      # wrong: raises the UnicodeDecodeError
pd.read_parquet('file.parquet')  # correct
The error occurs when you are trying to read a tweet containing a sentence like
"#Mike http:\www.google.com \A8&^)((&() how are&^%()( you "
which cannot be read as a plain string; you are supposed to read it as a raw string. But converting to a raw string still gives an error, so I suggest you read the JSON file something like this:
import codecs
import json

keys, fulldata = [], []   # collect the ids and tweet texts
with codecs.open('tweetfile', 'rU', 'utf-8') as f:
    for line in f:
        data = json.loads(line)
        print data["tweet"]
        keys.append(data["id"])
        fulldata.append(data["tweet"])
which will load the data from the JSON file.
You can also write it to a CSV using pandas:
import pandas as pd

output = pd.DataFrame(data={"tweet": fulldata, "id": keys})
output.to_csv("tweets.csv", index=False, quoting=1)
Then read from the CSV to avoid the encoding and decoding problems.
I hope this helps you solve your problem.
Midhun
I'm using a program called Wrapper.py (this), but there's some kind of error. Because it's Python, I've tried to find the error myself. As far as I can tell, the error is that when the program tries to write and load some JSON, it receives strings like "Közép-európai nyelvezet". This causes a UnicodeDecodeError:
>>>import json
>>>out={"a":"Közép-európai nyelvterület"}
>>>json.dumps(out)
Traceback (the path, etc.)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x94 in position 1: Invalid start byte
Then I googled, & found that solution for encoding:
>>>a=json.dumps(out,ensure_ascii=False)
>>>a
'{"a":"K\x94z\x82p-eur\xarpai nyelvter\x81let"}'
Then I wanted to load it:
>>>json.loads(a)
Traceback, etc.
UnicodeDecodeError: 'utf8' codec can't decode byte 0x94 in position 1: Invalid start byte
>>>json.load(a,ensure_ascii=False)
Traceback
TypeError: __init__() got an unexpected keyword argument 'ensure_ascii'
How can I load my data back?
Thanks in advance for your help!
Use text (unicode) instead of byte strings:
out = {u"a":u"Közép-európai nyelvterület"}
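A short sketch of the round trip with that change (Python 2; the key and value are the ones from the question, and the source file needs a coding declaration for the literal):

# -*- coding: utf-8 -*-
import json

out = {u"a": u"Közép-európai nyelvterület"}
a = json.dumps(out, ensure_ascii=False)   # already unicode, nothing to decode
print json.loads(a)                       # round-trips without UnicodeDecodeError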
How can I solve a NetLogo error like
Extension exception: invalid cell size on line 5
when I try to load an ASCII grid (.asc) raster with:
set slope gis:load-dataset "data_carto/DTMBanyulsEPSG2154/small_slope.asc"
I have found the GitHub extension code (line 88), but I don't really understand how it works.
Thanks!
UPDATE:
The header of my .asc file:
ncols 346
nrows 270
xllcorner 3.087906007412
yllcorner 42.451833343014
dx 0.000106344549
dy 0.000106459930
0 27.467638015747070312 31.712091445922851562 35.38886260986328125 36.1437835693359375 36.798412322998046875 36.798412322998046875 36.37$
0 26.552234649658203125 31.561212539672851562 35.23743438720703125 35.762996673583984375 35.20586395263671875 35.20586395263671875 34.34$
0 27.206226348876953125 29.196367263793945312 30.581308364868164062 29.855892181396484375 29.219537734985351562 29.219537734985351562 29$
Is there something wrong?
The GIS extension is expecting line 5 of your .asc file to start with "CELLSIZE" (the value of the CELL_SIZE constant here), in either upper or lower case. If line 5 doesn't start with that value, the extension reports an error as you're seeing. If your .asc file doesn't have cellsize on line 5, you may need to re-arrange the lines of the .asc file.
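If you do end up editing the header by hand, here is a hedged Python sketch of that re-arrangement (fix_asc_header is a hypothetical helper; the file name is taken from the question, and averaging dx and dy is a crude shortcut that is only reasonable for near-square cells):

# Rewrite an .asc header that uses separate dx/dy lines (as in the question)
# into one with the single cellsize line the GIS extension expects.
def fix_asc_header(src='small_slope.asc', dst='small_slope_fixed.asc'):
    with open(src) as f:
        lines = f.readlines()
    header = dict(line.split() for line in lines[:6])
    cellsize = (float(header['dx']) + float(header['dy'])) / 2.0
    with open(dst, 'w') as f:
        for key in ('ncols', 'nrows', 'xllcorner', 'yllcorner'):
            f.write('%s %s\n' % (key, header[key]))
        f.write('cellsize %.12f\n' % cellsize)
        f.writelines(lines[6:])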
Finally, I have found where my error came from... :-)
@Eric Russell was of course right!
My error came from the GDAL conversion of my .tif file into an .asc file.
After version 1.9 (I believe), we need to add a special option to the gdal_translate command: -co FORCE_CELLSIZE=TRUE.
With:
gdal_translate -of "AAIGrid" -b 1 -co FORCE_CELLSIZE=TRUE DTMBanyulsEPSG2154/small_slope.tif DTMBanyulsEPSG2154/small_slope.asc
It works, and the header is:
ncols 321
nrows 250
xllcorner 3.087906007412
yllcorner 42.451920815321
cellsize 0.000114626835
I had a similar problem while rasterizing a shapefile. It was solved by reprojecting the rasterized file (.asc) to my target SRS in QGIS. I hope this helps ;)