How to get the actual value of a cell with openpyxl?

I'm a beginner with Python and I need help. I'm using Python 2.7 and I'm trying to retrieve the cell values of an Excel file and store them in a csv file. My code is the following:
import os, openpyxl, csv
aggname = "deu"
wb_source = openpyxl.load_workbook(filename, data_only = True)
app_file = open(filename,'a')
dest_file = csv.writer(app_file, delimiter=',', lineterminator='\n')
calib_sheet = wb_source.get_sheet_by_name('Calibration')
data = calib_sheet['B78:C88']
data = list(data)
print(data)
for i in range(len(data)):
    dest_file.writerow(data[i])
app_file.close()
In my csv file, I get this, instead of the actual values (for example, in my case: SFCG, 99103).
<Cell Calibration.B78>,<Cell Calibration.C78>
<Cell Calibration.B79>,<Cell Calibration.C79>
<Cell Calibration.B80>,<Cell Calibration.C80>
<Cell Calibration.B81>,<Cell Calibration.C81>
<Cell Calibration.B82>,<Cell Calibration.C82>
<Cell Calibration.B83>,<Cell Calibration.C83>
<Cell Calibration.B84>,<Cell Calibration.C84>
<Cell Calibration.B85>,<Cell Calibration.C85>
<Cell Calibration.B86>,<Cell Calibration.C86>
<Cell Calibration.B87>,<Cell Calibration.C87>
<Cell Calibration.B88>,<Cell Calibration.C88>
I tried to set data_only=True when opening the Excel file, as suggested in answers to similar questions, but it doesn't solve my problem.
---------------EDIT-------------
Taking into account the first two answers I got (thank you!), I tried several things:
for i in range(len(data)):
    dest_file.writerows(data[i].values)
I get this error message:
Traceback (most recent call last):
  File "<ipython-input-78-27828c989b39>", line 2, in <module>
    dest_file.writerows(data[i].values)
AttributeError: 'tuple' object has no attribute 'values'
Then I tried this instead:
for i in range(len(data)):
    for j in range(2):
        dest_file.writerow(data[i][j].value)
and then I get the following error message:
Traceback (most recent call last):
  File "<ipython-input-80-c571abd7c3ec>", line 3, in <module>
    dest_file.writerow(data[i][j].value)
Error: sequence expected
So then, I tried this:
import os, openpyxl, csv
wb_source = openpyxl.load_workbook(filename, data_only=True)
app_file = open(filename,'a')
dest_file = csv.writer(app_file, delimiter=',', lineterminator='\n')
calib_sheet = wb_source.get_sheet_by_name('Calibration')
list(calib_sheet.iter_rows('B78:C88'))
for row in calib_sheet.iter_rows('B78:C88'):
    for cell in row:
        dest_file.writerow(cell.value)
Only to get this error message:
Traceback (most recent call last):
  File "<ipython-input-81-5bed62b45985>", line 12, in <module>
    dest_file.writerow(cell.value)
Error: sequence expected
For the "sequence expected" error, I suppose Python expects a list rather than a single cell, so I did this:
import os, openpyxl, csv
wb_source = openpyxl.load_workbook(filename, data_only=True)
app_file = open(filename,'a')
dest_file = csv.writer(app_file, delimiter=',', lineterminator='\n')
calib_sheet = wb_source.get_sheet_by_name('Calibration')
list(calib_sheet.iter_rows('B78:C88'))
for row in calib_sheet.iter_rows('B78:C88'):
    dest_file.writerow(row)
There is no error message, but I only get the reference of each cell in the csv file, and changing it to dest_file.writerow(row.value) brings me back to the tuple error.
I obviously still need your help!

You forgot to get the cell's value! See the documentation.
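The fix can be sketched like this. A stand-in Cell class is used so the snippet runs without openpyxl (real openpyxl cells expose a .value attribute the same way), and the sample values are made up:

```python
import csv
import io
from collections import namedtuple

# Stand-in for an openpyxl cell: the real Cell also exposes a .value attribute.
Cell = namedtuple('Cell', ['value'])

# Two example rows, mimicking calib_sheet['B78:C88'] (values are made up).
data = [
    (Cell('SFCG'), Cell(99103)),
    (Cell('ABCD'), Cell(12345)),
]

out = io.StringIO()  # stands in for the opened csv file
writer = csv.writer(out, delimiter=',', lineterminator='\n')
for row in data:
    # Unwrap each cell to its value: writerow() needs a sequence of
    # plain values, not a sequence of Cell objects.
    writer.writerow([cell.value for cell in row])

print(out.getvalue())
```

The key line is the list comprehension: each row is a tuple of cells, so unwrapping per cell avoids both the `<Cell ...>` output and the "sequence expected" error.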

I found a way around it using numpy, which allows me to store my values as a list of lists rather than a list of tuples.
import os, openpyxl, csv
import numpy as np
wb_source = openpyxl.load_workbook(filename, data_only=True)
app_file = open(filename,'a')
dest_file = csv.writer(app_file, delimiter=',', lineterminator='\n')
calib_sheet = wb_source.get_sheet_by_name('Calibration')
store = list(calib_sheet.iter_rows('B78:C88'))
print store
truc = np.array(store)
print truc
for i in range(11):
    for j in range(1):
        dest_file.writerow([truc[i][j].value, truc[i][j+1].value])
app_file.close()
I actually have a sequence as the argument to writerow(), and with the list object I can also use double indexing and the value attribute to retrieve the value of each cell.

Try using data.values instead of just data when you are printing it.
Hope it helps!
An example:
import openpyxl
import re
import os
wc = openpyxl.load_workbook('<path of the file>')
wcsheet = wc.get_sheet_by_name('test')
store=[]
for data in wcsheet.columns[0]:
    store = data
    print(store.value)

Related

python loop over a list's length, writing specific data to csv file

I am collecting data from a set of urls using curl requests and converting it into json. These data are represented in python as lists or dictionaries.
EDIT:
Next, I want to loop my script over a value in a dictionary that is inside a list (a list of dictionaries) until the length of the list is met. I wish to repeat my other curl requests for each 'instance' and then write that information to a .csv named 'instance_name'.csv.
Information in the .csv is populated from 2 different curl requests, excluding the one I want to loop everything over. The .csv needs to be created and named after 'instance_name', but the actual content is populated via the other curl requests.
Information of the list I want to loop over:
>>> instances = [i['instance_name'] for i in i_data]
>>> print(instances)
[u'Instance0', u'Instance1', .... u'Instance16']
>>> type(i_data)
<type 'list'>
>>> len(i_data)
17
>>> print(i_data[0])
{u'instance_name': u'Instance1', u'attribute2': {u'attribute2_1': u'yes', u'attribute2_2': u'no', u'attribute2_3': u'bye', u'attribute2_4': u'hello', u'attribute2_5': 500}, u'attribute3': u'abcd', u'attribute4': u'wxyz'}
>>>
How can I start this loop? eg:
i = 0
for i in len(i_data[i]):
    with open('{}.csv'.format(i['instance_name']), 'w') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
Trying to test:
>>> i = 0
>>> for i in len(i_data[i]):
...     print('Hello')
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>>
Secondly, how can I match certain field names in the csv with certain keys from the lists or dictionaries?
For a key called 'Id', I want to place its value into the .csv file under Id.
fieldnames = ['Id', 'domain_name', 'website', 'Usage', 'Limit']
Should my fieldnames be the same as the keys so that the values know where to go? Or how exactly can I do this?
I am getting this error right now:
Traceback (most recent call last):
  File "./usage_2.py", line 42, in <module>
    writer.writerow(t_data['domain_name'])
  File "/usr/local/lib/python2.7/csv.py", line 152, in writerow
    return self.writer.writerow(self._dict_to_list(rowdict))
  File "/usr/local/lib/python2.7/csv.py", line 148, in _dict_to_list
    + ", ".join([repr(x) for x in wrong_fields]))
ValueError: dict contains fields not in fieldnames: u'p', u'e', u'r', u'f', u'e', u'c', u't', u'u', u's', u'g', u'r', u'e', u'u', u'p', u'a', u'h', u'r', u'.', u'o', u'n', u'm', u'i', u'c', u'r', u'o', u's', u'o', u'f', u't', u'.', u'c', u'o', u'm'
I think it's because it's trying to write the JSON data with the u' prefixes, and I also don't know if the data is going where it's supposed to.
One example of to write the data:
writer.writerow(t_data['domain_name'])
The entry looks like:
>>> print(t_data['domain_name'])
abc.123.com
>>>
And this 't_data' pulled from another curl request is represented as a dictionary when I check.
>>> type(t_data)
<type 'dict'>
>>>
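One way the loop could be structured, sketched with made-up stand-in data since the real curl responses aren't shown: iterate over the list of dictionaries directly (len() returns an int, which is why the attempt above raises TypeError), and let csv.DictWriter map matching keys to columns while ignoring the rest:

```python
import csv
import io

# Made-up stand-in for i_data (a list of dictionaries, as in the question).
i_data = [
    {'instance_name': 'Instance0', 'Id': 1, 'domain_name': 'abc.123.com'},
    {'instance_name': 'Instance1', 'Id': 2, 'domain_name': 'xyz.456.com'},
]
fieldnames = ['Id', 'domain_name']

outputs = {}
# Iterate over the list itself instead of `for i in len(i_data)`.
for instance in i_data:
    out = io.StringIO()  # stands in for open('{}.csv'.format(...), 'w')
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction='ignore')
    writer.writeheader()
    # writerow() takes the whole dict; keys matching fieldnames become columns.
    # Passing a bare string like t_data['domain_name'] instead makes DictWriter
    # treat each character as a field, which is where the u'p', u'e', u'r', ...
    # error comes from.
    writer.writerow(instance)
    outputs[instance['instance_name']] = out.getvalue()

print(outputs['Instance0'])
```

So yes: the fieldnames should match the dict keys, and `extrasaction='ignore'` drops keys you don't want in the file.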

Converting JSON files to .csv

I've found some data that someone is downloading into a JSON file (I think! - I'm a newb!). The file contains data on nearly 600 football players.
Here you can find the file
In the past, I have downloaded the json file and then used this code:
import csv
import json

json_data = open("file.json")
data = json.load(json_data)
f = csv.writer(open("fix_hists.csv", "wb+"))
arr = []
for i in data:
    fh = data[i]["fixture_history"]
    array = fh["all"]
    for j in array:
        try:
            j.insert(0, str(data[i]["first_name"]))
        except:
            j.insert(0, 'error')
        try:
            j.insert(1, data[i]["web_name"])
        except:
            j.insert(1, 'error')
        try:
            f.writerow(j)
        except:
            f.writerow(['error', 'error'])
json_data.close()
Sadly, when I do this now in the command prompt, I get the following error:
Traceback (most recent call last):
  File "fix_hist.py", line 12, in <module>
    fh = data[i]["fixture_history"]
TypeError: list indices must be integers, not str
Can this be fixed, or is there another way I can grab some of the data and convert it to .csv? Specifically the 'fixture_history', and then 'first_name', 'type_name', etc.
Thanks in advance for any help :)
Try this tool: http://www.convertcsv.com/json-to-csv.htm
You will need to configure a few things, but should be easy enough.
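Alternatively, the TypeError above suggests the file's top level is now a JSON list of player objects rather than a dict keyed by player id, so the loop can iterate the players directly. A sketch with a made-up two-player sample in place of the real file:

```python
import json

# Made-up sample standing in for the downloaded file: a top-level LIST.
raw = '''
[
  {"first_name": "Alan", "web_name": "Shearer",
   "fixture_history": {"all": [[1, 90, 2], [2, 85, 0]]}},
  {"first_name": "Thierry", "web_name": "Henry",
   "fixture_history": {"all": [[1, 88, 1]]}}
]
'''
data = json.loads(raw)

rows = []
# `for i in data` over a LIST yields the player dicts themselves, so use
# the player directly: indexing the list with a dict (data[i]) is what
# raises "list indices must be integers, not str".
for player in data:
    for fixture in player["fixture_history"]["all"]:
        rows.append([player["first_name"], player["web_name"]] + fixture)

print(rows[0])
```

Each entry in `rows` can then be passed to csv.writer's writerow() as before.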

Trying to merge all multiple CSV files to one excel workbook

I am able to execute the code below in Python 2.7 and merge all the csv files into a single Excel workbook, but when I try to execute it in Python 3.4 I get an error. Let me know if anyone has faced this issue and sorted it out.
Code:-
import glob, csv, xlwt, os

wb = xlwt.Workbook()
for filename in glob.glob(r'E:\BMCSoftware\Datastore\utility\BPM_Datastore_Utility\*.csv'):
    #print (filename)
    (f_path, f_name) = os.path.split(filename)
    #print (f_name)
    (f_short_name, f_extension) = os.path.splitext(f_name)
    #print (f_short_name)
    ws = wb.add_sheet(f_short_name)
    #print (ws)
    with open(filename, 'rU') as f:
        spamReader = csv.reader(f)
        for rowx, row in enumerate(spamReader):
            for colx, value in enumerate(row):
                ws.write(rowx, colx, value)
wb.save("f:\find_acs_errors_ALL_EMEA.xls")
ERROR:-
Traceback (most recent call last):
  File "E:\BMCSoftware\Python34\Copy of DataStore.py", line 16, in <module>
    wb.save("f:\find_acs_errors_ALL_EMEA.xls")
  File "E:\BMCSoftware\Python34\lib\site-packages\xlwt-1.0.0-py3.4.egg\xlwt\Workbook.py", line 696, in save
    doc.save(filename_or_stream, self.get_biff_data())
  File "E:\BMCSoftware\Python34\lib\site-packages\xlwt-1.0.0-py3.4.egg\xlwt\CompoundDoc.py", line 262, in save
    f = open(file_name_or_filelike_obj, 'w+b')
FileNotFoundError: [Errno 2] No such file or directory: 'f:\x0cind_acs_errors_ALL_EMEA.xls'
You should either use double backslashes or single forward slashes in
wb.save("f:\find_acs_errors_ALL_EMEA.xls")
i.e. one of those:
wb.save("f:\\find_acs_errors_ALL_EMEA.xls")
wb.save("f:/find_acs_errors_ALL_EMEA.xls")
hope that helps!
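The root cause is visible in the traceback: in a normal Python string literal, \f is the form-feed escape (\x0c), so the path never contained \find. A quick demonstration, with a raw string shown as a third option:

```python
# "\f" in a regular string literal is one form-feed character, not backslash-f.
assert "\f" == "\x0c"
assert len("f:\find") == 6   # the backslash and 'f' collapsed into one '\x0c'

# Three equivalent ways to get the intended path:
p1 = "f:\\find_acs_errors_ALL_EMEA.xls"   # escaped backslash
p2 = "f:/find_acs_errors_ALL_EMEA.xls"    # forward slash (Windows accepts it)
p3 = r"f:\find_acs_errors_ALL_EMEA.xls"   # raw string literal

assert p1 == p3
print(p1)
```

This is also why the glob pattern in the question, written as an r'...' raw string, worked fine while the save path did not.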

Python 3 Pandas Error: pandas.parser.CParserError: Error tokenizing data. C error: Expected 11 fields in line 5, saw 13

I checked out this answer as I am having a similar problem.
Python Pandas Error tokenizing data
However, for some reason ALL of my rows are being skipped.
My code is simple:
import pandas as pd
fname = "data.csv"
input_data = pd.read_csv(fname)
and the error I get is:
File "preprocessing.py", line 8, in <module>
    input_data = pd.read_csv(fname) #raw data file ---> pandas.core.frame.DataFrame type
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 465, in parser_f
    return _read(filepath_or_buffer, kwds)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 251, in _read
    return parser.read()
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 710, in read
    ret = self._engine.read(nrows)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 1154, in read
    data = self._reader.read(nrows)
File "pandas/parser.pyx", line 754, in pandas.parser.TextReader.read (pandas/parser.c:7391)
File "pandas/parser.pyx", line 776, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:7631)
File "pandas/parser.pyx", line 829, in pandas.parser.TextReader._read_rows (pandas/parser.c:8253)
File "pandas/parser.pyx", line 816, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8127)
File "pandas/parser.pyx", line 1728, in pandas.parser.raise_parser_error (pandas/parser.c:20357)
pandas.parser.CParserError: Error tokenizing data. C error: Expected 11 fields in line 5, saw 13
The solution is to use pandas' built-in delimiter "sniffing" (sep=None requires the Python parser engine):
input_data = pd.read_csv(fname, sep=None, engine='python')
For those landing here, I got this error when the file was actually an .xls file not a true .csv. Try resaving as a csv in a spreadsheet app.
I had the same error. I read my csv data using this:
d1 = pd.read_csv('my.csv')
then I tried this:
d1 = pd.read_csv('my.csv', sep='\t')
and this time it was right.
So you could try this method if your delimiter is not ',': the default is ',', so if you don't specify it explicitly, it goes wrong.
pandas.read_csv
This error means you get an unequal number of columns for each row. In your case, up to line 5 you had 11 columns, but in line 5 you have 13 inputs (columns).
For this problem, you can try the following approach to open and read your file:
import csv

with open('filename.csv', 'r') as file:
    reader = csv.reader(file, delimiter=',')  # if you have a csv file, use the comma delimiter
    for row in reader:
        print(row)
This parsing error could occur for multiple reasons and solutions to the different reasons have been posted here as well as in Python Pandas Error tokenizing data.
I posted a solution to one possible reason for this error here: https://stackoverflow.com/a/43145539/6466550
I have had similar problems. With my csv files, it occurs because they were created in R, so they have some extra commas and different spacing than a "regular" csv file.
I found that if I did a read.table in R, I could then save it using write.csv and the option of row.names = F.
I could not get any of the read options in pandas to help me.
The problem could be that one or more rows of the csv file contain more delimiters (commas ,) than expected. It is solved when each row matches the number of delimiters of the first line of the csv file, where the column names are defined.
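If you want to locate the offending rows before handing the file to pandas, a small stdlib sketch (with a made-up sample) can count the fields per line and report any row that disagrees with the header:

```python
import csv
import io

# Made-up sample: the header has 3 fields, but the third data row has 4.
sample = "a,b,c\n1,2,3\n4,5,6\n7,8,9,10\n"

reader = csv.reader(io.StringIO(sample))
header = next(reader)
bad_rows = []
for lineno, row in enumerate(reader, start=2):  # line 1 is the header
    if len(row) != len(header):
        # Record the 1-based line number and its actual field count.
        bad_rows.append((lineno, len(row)))

print(bad_rows)
```

Using csv.reader here (rather than a plain split on ',') means quoted fields containing commas are counted correctly.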
Use \t+ in the separator pattern instead of \t:
import pandas as pd

fname = "data.csv"
input_data = pd.read_csv(fname, sep='\t+', header=None)

Python CSV Has No Attribute 'Writer'

There's a bit of code giving me trouble. It was working great in another script I had but I must have messed it up somehow.
The if csv: is primarily because I was relying on a -csv option in an argparser. But even if I run this with proper indents outside the if statement, it still returns the same error.
import csv

if csv:
    with open('output.csv', 'wb') as csvfile:
        csvout = csv.writer(csvfile, delimiter=',',
                            quotechar=',', quoting=csv.QUOTE_MINIMAL)
        csvout.writerow(['A', 'B', 'C'])
    csvfile.close()
Gives me:
Traceback (most recent call last):
  File "import csv.py", line 34, in <module>
    csvout = csv.writer(csvfile, delimiter=',',
AttributeError: 'str' object has no attribute 'writer'
If I remove the if statement, I get:
Traceback (most recent call last):
  File "C:\import csv.py", line 34, in <module>
    csvout = csv.writer(csvfile, delimiter=',',
AttributeError: 'NoneType' object has no attribute 'writer'
What silly thing am I doing wrong? I did try changing the file name to things like test.py, as I saw that in another SO post; it didn't work.
For me, I had named my file csv.py. So when I imported csv from that file, I was essentially trying to import the file itself.
If you've assigned something to csv (it looks like a string), then you're shadowing the module import. So the simplest thing is to change whatever is assigning to csv that isn't the module and call it something else.
In effect what's happening is:
import csv
csv = 'bob'
csvout = csv.writer(somefile)
Remove the further assignment to csv and go from there...
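The shadowing can be demonstrated in a few lines (names made up):

```python
import csv
import io

assert hasattr(csv, 'writer')   # the module has a writer factory

csv = 'bob'                     # shadowing: the name now points at a string
try:
    csv.writer(io.StringIO())
except AttributeError as exc:
    error_message = str(exc)    # same error as in the question

print(error_message)

# Fix: don't reuse the module's name; re-importing rebinds it here.
import csv
assert hasattr(csv, 'writer')
```

In the question's case the shadowing comes from an argparse option also called csv, so renaming that variable (e.g. to csv_enabled) resolves it.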
For my case, my function name happened to be csv(). Once I renamed my function, the error disappeared.