Error when importing CSV files into pandas

I am trying to import a single CSV file, but I'm getting the following error:
"pandas.parser.CParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 16"
This is the code I'm running:
http://nbviewer.ipython.org/urls/bitbucket.org/hrojas/learn-pandas/raw/master/lessons/01%20-%20Lesson.ipynb
from pandas import read_csv
Location = r'path'
df = read_csv(Location)
print(df)

I was able to correct the error by adding skiprows=1:
df = read_csv(Location, skiprows=1)

Related

Error while running a python-Storing Data object in JSON

I've extracted data via an API, and I had to transform it to read it in tabular format. Sample code:
import json
import ast
import requests
import pandas as pd
from pandas import json_normalize
result = requests.get('https://website.com/api')
data = result.json()
df = pd.DataFrame(data['result']['records'])
Every time I run the above Python (.py) file in a terminal, I get an error on this line:
in <module>
data = result.json()
followed by:
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I'm not sure why I am getting this error. Can anyone tell me how to fix this?
Any help would be appreciated.
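"Expecting value: line 1 column 1 (char 0)" is what the json decoder reports when the text it is given does not start with a JSON value, for example an empty body or an HTML error page. A minimal sketch of a check that could run before parsing is below. The helper name parse_json_body and the simulated inputs are assumptions for illustration, not part of the original code.

```python
import json


def parse_json_body(status_code, content_type, text):
    """Parse a response body as JSON, or raise ValueError explaining why it is not JSON.

    Hypothetical helper: the three arguments stand in for values taken
    from an HTTP response object.
    """
    if status_code != 200:
        raise ValueError(f"unexpected HTTP status {status_code}")
    if not text.strip():
        raise ValueError("response body is empty")
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Show the start of the body so an HTML page or error text is easy to spot.
        raise ValueError(
            f"body is not JSON (Content-Type {content_type!r}): {text[:80]!r}"
        ) from exc


# Simulated successful response with the same shape as in the question.
ok = parse_json_body(200, "application/json", '{"result": {"records": [{"id": 1}]}}')
records = ok["result"]["records"]

# Simulated HTML body: this is the case that triggers "Expecting value".
try:
    parse_json_body(200, "text/html", "<html>Not found</html>")
    html_error = None
except ValueError as exc:
    html_error = str(exc)
```

With requests, the arguments would come from result.status_code, result.headers.get("Content-Type") and result.text, which makes it visible what the server actually sent back.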

line-delimited json format txt file, how to import with pandas

I have a line-delimited JSON file with a .txt extension, and I want to import it with pandas. I tried each of the following:
df = pd.read_csv('df.txt')
df = pd.read_json('df.txt')
df = pd.read_fwf('df.txt')
The first two give me errors:
ParserError: Error tokenizing data. C error: Expected 29 fields in line 1354, saw 34
ValueError: Trailing data
The third (read_fwf) returns the data, but it is organized in a weird way, with the column name on the left, next to each value.
Can anyone tell me how to solve this?

df = pd.read_json('df.txt', lines=True)
read_json accepts a boolean argument, lines, which makes it read the file as one JSON object per line.
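To illustrate why a line-delimited file cannot be parsed as a single JSON document, here is a stdlib-only sketch. It is not how pandas implements lines=True, and the sample records are made up for the example.

```python
import json

# Hypothetical sample: two records, one JSON object per line.
text = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'

# Parsing the whole text as one JSON document fails, because a second
# object follows the first one.
try:
    json.loads(text)
    whole_file_ok = True
except json.JSONDecodeError:
    whole_file_ok = False

# Parsing line by line succeeds; blank lines are skipped.
records = [json.loads(line) for line in text.splitlines() if line.strip()]
```

A list of dicts like records can be passed to pd.DataFrame, although pd.read_json with lines=True does the whole job in one call.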

Error trying to open json file [json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes]

I'm trying to open a JSON file using the json library in Python 3.8, but I have not succeeded.
This is my MWE:
import json
with open(pbit_path + file_name, 'r') as f:
    data = json.load(f)
print(data)
where pbit_path and file_name together form the absolute path of the .json file. As an example, this is a sample of the .json file that I'm trying to open:
https://github.com/pwnaoj/desktop-tutorial/blob/master/DataModelSchema.json
Error returned:
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
I have also tried using the functions loads(), dump(), dumps().
I'd appreciate any suggestions.
Thanks in advance.

I found a solution to my problem. It is an encoding problem: the file I am trying to read is encoded in UCS-2, so in Python:
import json
with open(file, mode='r', encoding='utf_16_le') as f:
    data = json.loads(f.read())  # the with block closes the file on exit
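The effect of the encoding can be reproduced with a small temporary file. The sketch below assumes a file written as UTF-16 LE without a byte order mark; the payload is invented for the example. If a real file starts with a BOM, the 'utf-16' codec, which detects and strips it, may be the better choice.

```python
import json
import os
import tempfile

payload = {"name": "Model", "tables": ["a", "b"]}

# Write a hypothetical sample file encoded as UTF-16 LE without a BOM.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "wb") as f:
    f.write(json.dumps(payload).encode("utf_16_le"))

# Decoded as UTF-8, every ASCII character is followed by a NUL character,
# so the JSON parser fails right after the opening brace.
try:
    with open(path, mode="r", encoding="utf-8") as f:
        json.load(f)
    utf8_error = None
except json.JSONDecodeError as exc:
    utf8_error = exc

# Decoded with the matching codec, the same file parses cleanly.
with open(path, mode="r", encoding="utf_16_le") as f:
    data = json.load(f)

os.remove(path)
```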

Unable to print output of JSON code into a .csv file

I'm getting the following errors when trying to decode this data, and the 2nd error after trying to compensate for the unicode error:
Error 1:
write.writerows(subjects)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 160: ordinal not in range(128)
Error 2:
with open("data.csv", encode="utf-8", "w",) as writeFile:
SyntaxError: non-keyword arg after keyword arg
Code
import requests
import json
import csv
from bs4 import BeautifulSoup
import urllib
r = urllib.urlopen('https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=10000&page=1')
data = json.loads(r.read().decode('utf-8'))
subjects = []
for post in data['posts']:
    subjects.append([post['title'], post['episodeNumber'],
                     post['audioSource'], post['image']['large'], post['excerpt']['long']])
with open("data.csv", encode="utf-8", "w",) as writeFile:
    write = csv.writer(writeFile)
    write.writerows(subjects)

Using requests, and with the correction to the open call (as below), I have no problem running this. I think your first problem is a consequence of the second error.
I am on Python 3 and can run your code with my fix to the open line and with
r = urllib.request.urlopen('https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=10000&page=1')
Personally, I would use requests:
import requests
import csv
data = requests.get('https://thisiscriminal.com/wp-json/criminal/v1/episodes?posts=10000&page=1').json()
subjects = []
for post in data['posts']:
    subjects.append([post['title'], post['episodeNumber'],
                     post['audioSource'], post['image']['large'], post['excerpt']['long']])
with open("data.csv", encoding="utf-8", mode="w") as writeFile:
    write = csv.writer(writeFile)
    write.writerows(subjects)
For your second error: according to the documentation for the open function, you need to use the right argument name (encoding, not encode) and pass mode by keyword when it follows a keyword argument.
with open("data.csv", encoding="utf-8", mode="w") as writeFile:
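As a self-contained check that keyword arguments to open handle non-ASCII text such as the curly quote from the first error, here is a small sketch that writes to a temporary file and reads it back. The rows are made up, and newline="" is the setting the csv module documentation recommends for files passed to csv.writer.

```python
import csv
import os
import tempfile

# Hypothetical rows containing curly quotes and an accented character.
rows = [["Episode \u201cOne\u201d", 1], ["Caf\u00e9", 2]]

fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)

# All arguments after the file name are passed by keyword.
with open(path, mode="w", encoding="utf-8", newline="") as write_file:
    csv.writer(write_file).writerows(rows)

# Read the file back; csv.reader returns every field as a string.
with open(path, mode="r", encoding="utf-8", newline="") as read_file:
    read_back = list(csv.reader(read_file))

os.remove(path)
```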

Python 3 Pandas Error: pandas.parser.CParserError: Error tokenizing data. C error: Expected 11 fields in line 5, saw 13

I checked out this answer as I am having a similar problem.
Python Pandas Error tokenizing data
However, for some reason ALL of my rows are being skipped.
My code is simple:
import pandas as pd
fname = "data.csv"
input_data = pd.read_csv(fname)
and the error I get is:
File "preprocessing.py", line 8, in <module>
input_data = pd.read_csv(fname) #raw data file ---> pandas.core.frame.DataFrame type
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 465, in parser_f
return _read(filepath_or_buffer, kwds)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 251, in _read
return parser.read()
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 710, in read
ret = self._engine.read(nrows)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 1154, in read
data = self._reader.read(nrows)
File "pandas/parser.pyx", line 754, in pandas.parser.TextReader.read (pandas/parser.c:7391)
File "pandas/parser.pyx", line 776, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:7631)
File "pandas/parser.pyx", line 829, in pandas.parser.TextReader._read_rows (pandas/parser.c:8253)
File "pandas/parser.pyx", line 816, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8127)
File "pandas/parser.pyx", line 1728, in pandas.parser.raise_parser_error (pandas/parser.c:20357)
pandas.parser.CParserError: Error tokenizing data. C error: Expected 11 fields in line 5, saw 13

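The solution is to use pandas' built-in delimiter "sniffing". Sniffing is only supported by the Python parsing engine, so request it explicitly:
input_data = pd.read_csv(fname, sep=None, engine='python')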
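The pandas documentation describes this detection as relying on Python's csv.Sniffer. A rough stdlib sketch of what sniffing does on a semicolon-separated sample follows. The sample text and the list of candidate delimiters are assumptions for the example.

```python
import csv

# Hypothetical sample using ';' instead of ',' as the separator.
sample = "a;b;c\n1;2;3\n4;5;6\n"

# Ask the sniffer to choose among a few candidate delimiters.
dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
detected = dialect.delimiter

# Parse the sample with the detected dialect.
rows = list(csv.reader(sample.splitlines(), dialect))
```

Sniffing is heuristic, so when the delimiter is known it is usually safer to pass it explicitly through sep.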

For those landing here: I got this error when the file was actually an .xls file, not a true .csv. Try re-saving it as a CSV in a spreadsheet app.

I had the same error. I read my CSV data using this:
d1 = pd.read_csv('my.csv')
Then I tried this:
d1 = pd.read_csv('my.csv', sep='\t')
and this time it worked.
So you could try this if your delimiter is not ','. The default separator is ',', so if you don't specify yours explicitly, parsing goes wrong.
See pandas.read_csv.

This error means that the rows do not all have the same number of columns. In your case, the rows before line 5 have 11 columns, but line 5 has 13 fields.
To investigate, you can read the file with the csv module and inspect the rows:
import csv
with open('filename.csv', 'r', newline='') as file:
    reader = csv.reader(file, delimiter=',')  # use a comma delimiter for a CSV file
    for row in reader:
        print(row)
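Rather than printing every row, the same reader can report only the rows whose field count differs from the header. This is a minimal sketch on made-up in-memory data. It assumes no quoted field spans several physical lines, so row numbers match line numbers.

```python
import csv
import io

# Hypothetical data: the third line has one field too many.
text = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"

rows = list(csv.reader(io.StringIO(text)))
expected = len(rows[0])  # field count of the header row

# Collect (line number, field count) for every row that does not match.
bad_lines = [
    (line_no, len(row))
    for line_no, row in enumerate(rows, start=1)
    if len(row) != expected
]
```

For a real file, io.StringIO(text) would be replaced by a file opened with newline="".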

This parsing error could occur for multiple reasons and solutions to the different reasons have been posted here as well as in Python Pandas Error tokenizing data.
I posted a solution to one possible reason for this error here: https://stackoverflow.com/a/43145539/6466550

I have had similar problems. With my CSV files, it occurs because they were created in R, so they have some extra commas and different spacing than a "regular" CSV file.
I found that if I read the file with read.table in R, I could then save it using write.csv with the option row.names = F.
I could not get any of the read options in pandas to help me.

The problem could be that one or more rows of the CSV file contain more delimiters (commas) than expected. It is solved when every row has the same number of delimiters as the first line of the file, where the column names are defined.

Use \t+ in the separator pattern instead of \t. A regular-expression separator is handled by the Python parsing engine:
import pandas as pd
fname = "data.csv"
input_data = pd.read_csv(fname, sep=r'\t+', header=None, engine='python')
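The difference between the two patterns can be seen on a single line with the re module. This is only a sketch of the splitting behaviour on an invented line, not of the pandas parser itself.

```python
import re

# Hypothetical line where two tabs separate the first two values.
line = "a\t\tb\tc"

# Splitting on a single tab yields an empty field between the two tabs.
single = line.split("\t")

# Splitting on one or more tabs collapses consecutive tabs into one separator.
collapsed = re.split(r"\t+", line)
```

Here single has four fields and collapsed has three. The trade-off is that \t+ also swallows fields that are genuinely empty, so it only suits files where repeated tabs are padding rather than missing values.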
