How to compute an arithmetic operation on every row in a CSV file - csv

I am starting with Python and I have a question about a CSV file whose rows hold data in the format number;number:
485;16
646;8
920;16
1102;36
My code can read the CSV, but I don't know how to perform an arithmetic operation on every row (e.g. multiplication or division) and save the result in a variable.
import csv
with open('in.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in spamreader:
        print(', '.join(row))
Thanks for help.

First, you need to specify the delimiter correctly: delimiter=';'
Then you can access the elements as row[0] and row[1], but they are strings.
So you have to convert them to integers, e.g. int(row[0]).
All together:
import csv
with open('in.csv') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=';')
    for row in spamreader:
        some_variable = int(row[0])*int(row[1])
        print(some_variable)
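If the results are needed after the loop rather than just printed, one option is to collect them in a list. The following is a minimal sketch, not part of the original answer: the helper name multiply_rows is made up for illustration, and io.StringIO with the sample rows from the question stands in for the real in.csv.

```python
import csv
import io

def multiply_rows(csvfile):
    """Return the product of the two numbers on each row (hypothetical helper)."""
    products = []
    for row in csv.reader(csvfile, delimiter=';'):
        if not row:  # skip blank lines
            continue
        products.append(int(row[0]) * int(row[1]))
    return products

# Sample rows from the question; io.StringIO stands in for open('in.csv', newline='')
sample = io.StringIO("485;16\n646;8\n920;16\n1102;36\n")
products = multiply_rows(sample)
print(products)
```

With a real file, the same function could be called as multiply_rows(open('in.csv', newline='')) inside a with block.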

Related

Issues printing a Dataframe after collecting Data from MySQL

I hope you can help me with my issue. I connected Python to my database using pyodbc, and I think I was able to save the data into a pandas DataFrame, but unfortunately I can't work with the DataFrame afterwards (for example, simply print it).
The error message says "undefined name "DataFrame"".
How do I need to change my code so that I can get the data from MySQL and then use the DataFrame normally?
As side information, in case it matters: I want to do some calculations on the DataFrame with pandas (optional), then create a plot using plotnine and add a UI later.
#This function I call
def SQLtoPandas(Connection,SQLString,DataFrame):
    DataFrame = pd.DataFrame(
        pd.read_sql(SQLString,
                    con=Connection)
    )
#If i call this function it works just fine
def SQL_ReadFunction(Connection,SQLString):
    cursor=Connection.cursor()
    cursor.execute(
        SQLString
    )
    rows = cursor.fetchall()
    for row in rows:
        print(row)
SQLString = 'select * from employees'
SQL_ReadFunction(Connection,SQLString)
Connection.close
#Doesnt work, moving it inside the connection also doesnt help.
print (DataFrame)
You don't need an additional function for this. Just use
df=pd.read_sql('select * from employees',con=con)
print(df)
and manipulate df as you wish using pandas.
I would recommend using a Jupyter notebook, as it displays DataFrames nicely.
Also note that pd.read_sql() already returns a pandas DataFrame, so there is no need to convert it again.
You have a few things to take care of:
Your function can call pd.read_sql directly, as it loads your table as a DataFrame. You do not need an extra pd.DataFrame.
You have to either print your DataFrame inside the function, or put a return df inside the function and assign the result outside, like df = SQLtoPandas(Connection,SQLString).
Avoid using DataFrame as a variable name. It is not a reserved keyword, but it is the name of the pandas class, so use df or something else instead.
Method 1:
Inside your function:
def SQLtoPandas(Connection,SQLString):
    df= pd.read_sql(SQLString, con=Connection)
    print(df)
Now call your function outside:
SQLtoPandas(Connection, SQLString)
Method 2:
Inside your function:
def SQLtoPandas(Connection,SQLString):
    df = pd.read_sql(SQLString, con=Connection)
    return df
Now outside your function do:
df = SQLtoPandas(Connection, SQLString)
print(df)
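The reason the version in the question leaves nothing usable behind is that assigning to a parameter name inside a function only rebinds the local name, so the caller never sees the new object. A minimal sketch of that behavior, using plain lists instead of pandas so it runs without a database (both function names are made up for illustration):

```python
def fill_by_assignment(target):
    # Rebinds the local name only; the caller's variable is untouched
    target = ['alice', 'bob']

def build_and_return():
    # Hands the new object back to the caller
    return ['alice', 'bob']

employees = None
fill_by_assignment(employees)
after_assignment = employees  # the caller's variable was not changed

employees = build_and_return()
print(after_assignment, employees)
```

The same applies to a DataFrame: returning it, as in Method 2, is what makes it available outside the function.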

Export Datatables from Spotfire to CSV using IronPython Script

I have an IronPython script I use to export all my data tables from a Spotfire project.
Currently it works perfectly. It loops through all data tables and exports them as ".xlsx". Now I need to export the files as ".csv", which I thought would be as simple as changing ".xlsx" to ".csv".
The script still exports the files and names them all .csv, but the content of each file is still .xlsx; I'm not sure how or why. The code only changes the file extension and does not convert the file to CSV.
Here is the code I am currently using:
I have posted the full code at the bottom, and the code I believe is relevant to my question in a separate code block at the top.
if(dialogResult == DialogResult.Yes):
    for d in tableList: #cycles through the table list elements defined above
        writer = Document.Data.CreateDataWriter(DataWriterTypeIdentifiers.ExcelXlsDataWriter)
        table = Document.Data.Tables[d[0]] #d[0] is the Data Table name in the Spotfire project (defined above)
        filtered = Document.ActiveFilteringSelectionReference.GetSelection(table).AsIndexSet() #OR pass the filter
        stream = File.OpenWrite(savePath+'\\'+ d[1] +".csv") #d[1] is the Excel alias name. You could also use d.Name to export with the Data Table name
        names = []
        for col in table.Columns:
            names.append(col.Name)
        writer.Write(stream, table, filtered, names)
        stream.Close()
I think it may have to do with the ExcelXlsDataWriter.
I tried with ExcelXlsxDataWriter as well. Is there a csv writer I could use for this? I believe csv and txt files have a different writer.
Any help is appreciated.
Full script shown below:
import System
import clr
import sys
clr.AddReference("System.Windows.Forms")
from sys import exit
from System.Windows.Forms import FolderBrowserDialog, MessageBox, MessageBoxButtons, DialogResult
from Spotfire.Dxp.Data.Export import DataWriterTypeIdentifiers
from System.IO import File, FileStream, FileMode
#This is a list of Data Tables and their Excel file names. You can see each referenced below as d[0] and d[1] respectively.
tableList = [
    ["TestTable1"],
    ["TestTable2"],
]
#imports the location of the file so that there is a default place to put the exports.
from Spotfire.Dxp.Application import DocumentMetadata
dmd = Application.DocumentMetadata #Get MetaData
path = str(dmd.LoadedFromFileName) #Get Path
savePath = '\\'.join(path.split('\\')[0:-1]) + "\\DataExports\\"
dialogResult = MessageBox.Show("The files will be save to "+savePath
    +". Do you want to change location?"
    , "Select the save location", MessageBoxButtons.YesNo)
if(dialogResult == DialogResult.Yes):
    # GETS THE FILE PATH FROM THE USER THROUGH A FILE DIALOG instead of using the file location
    SaveFile = FolderBrowserDialog()
    SaveFile.ShowDialog()
    savePath = SaveFile.SelectedPath
#message making sure that the user wants to export the files.
dialogResult = MessageBox.Show("Export Files."
    +" Export Files","Are you sure?", MessageBoxButtons.YesNo)
if(dialogResult == DialogResult.Yes):
    for d in tableList: #cycles through the table list elements defined above
        writer = Document.Data.CreateDataWriter(DataWriterTypeIdentifiers.ExcelXlsDataWriter)
        table = Document.Data.Tables[d[0]] #d[0] is the Data Table name in the Spotfire project (defined above)
        filtered = Document.ActiveFilteringSelectionReference.GetSelection(table).AsIndexSet() #OR pass the filter
        stream = File.OpenWrite(savePath+'\\'+ d[1] +".csv") #d[1] is the Excel alias name. You could also use d.Name to export with the Data Table name
        names = []
        for col in table.Columns:
            names.append(col.Name)
        writer.Write(stream, table, filtered, names)
        stream.Close()
#if the user doesn't want to export then he just gets a message
else:
    dialogResult = MessageBox.Show("ok.")
For some reason the Spotfire IronPython implementation does not support the csv package implemented in Python.
The workaround I found for your implementation is to use StdfDataWriter instead of ExcelXlsDataWriter. STDF is the Spotfire Text Data Format. The DataWriter class in Spotfire supports only Excel and STDF, and STDF comes closest to CSV.
from System.IO import File
from Spotfire.Dxp.Data.Export import DataWriterTypeIdentifiers
writer = Document.Data.CreateDataWriter(DataWriterTypeIdentifiers.StdfDataWriter)
table = Document.Data.Tables['DropDownSelectors']
filtered = Document.ActiveFilteringSelectionReference.GetSelection(table).AsIndexSet()
# raw string so the backslashes in the Windows path are never read as escapes
stream = File.OpenWrite(r"C:\Users\KLM68651\Documents\dropdownexport.stdf")
names =[]
for col in table.Columns:
    names.append(col.Name)
writer.Write(stream, table, filtered, names)
stream.Close()
Hope this helps

keyword search in string from mysql using python?

I am pulling from a MySQL database table using Python 3.4. I use the csv module to write the rows of data from the database into CSV format. Now I am trying to figure out how I can vet the rows of data by keywords that may show up in the fourth column of data (row[3]). I was thinking of using the re module as below, but I keep getting errors. Is it not possible to search for keywords in a field of string type and filter the results that contain those keywords? Please help.
import re
import csv
userdate = input('What date do you want to look at?')
query = ("SELECT *FROM sometable WHERE timestamp LIKE %s", userdate)
keywords = 'apples', 'bananas', 'cocoa'
# Execute sql Query
cursor.execute(query)
result = cursor.fetchall()
#Reads a CSV file and return it as a list of rows
def read_csv_file(filename):
    """Reads a CSV file and return it as a list of rows."""
    for row in csv.reader(open(filename)):
        data.append(row)
    return data
f = open(path_in + data_file)
read_it = read_csv_file(path_in + data_file)
with open('file.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=' ',
                            quotechar='|', quoting=csv.QUOTE_MINIMAL)
    for row in data:
        match = re.search('keywords, read_it)
        if match:
            spamwriter.writerow(row)
I gave up on the regular expressions and used
for row in data:
    found_it = row.find(keywords)
    if found_it != -1:
        spamwriter.writerow(row)
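Note that str.find() takes a single substring, and a row coming from csv.reader or cursor.fetchall() is a list or tuple rather than a string, so the keyword check has to run against one field at a time. A minimal sketch of that idea, assuming the keywords should match case-insensitively anywhere in the fourth column; the helper filter_rows and the sample rows are made up for illustration:

```python
import csv
import io

KEYWORDS = ('apples', 'bananas', 'cocoa')

def filter_rows(rows, keywords, column=3):
    """Keep rows whose text in `column` contains at least one keyword (hypothetical helper)."""
    kept = []
    for row in rows:
        if len(row) <= column:  # skip rows that are too short
            continue
        field = str(row[column]).lower()
        if any(keyword in field for keyword in keywords):
            kept.append(row)
    return kept

# Made-up rows standing in for cursor.fetchall()
sample_rows = [
    (1, '2015-01-01', 'a', 'fresh apples in stock'),
    (2, '2015-01-01', 'b', 'nothing relevant'),
    (3, '2015-01-01', 'c', 'Cocoa shipment delayed'),
]
matches = filter_rows(sample_rows, KEYWORDS)

out = io.StringIO()  # stands in for open('file.csv', 'w', newline='')
csv.writer(out).writerows(matches)
print(out.getvalue())
```

In Python 3 the output file should be opened in text mode ('w' with newline=''), not 'wb', for csv.writer to work.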

How to save a list in a CSV file in Python?

I want to transpose rows into columns and then save the words in a CSV file. The problem is that only the last value of each column after the transpose is saved in the file, and if I append the string to a list, it is saved in the file as characters, not words.
Can anyone help me sort this out? Thanks in advance.
import re
import csv
app =[]
with open('afterstem.csv') as f:
    words = [x.split() for x in f]
    for x in zip(*words):
        for y in x:
            res=y
            newstr = re.sub('"', r'', res)
            app = app + list(res)
            #print("AFTER" ,newstr)
with open(r"removequotes.csv", "w") as output:
    writer = csv.writer(output, lineterminator='\n', delimiter='\t')
    for val in app:
        writer.writerow(val)
output.close()
The output saved in the file is split into single characters, but I want "Bank" in one cell.
Simply use
for column in zip(*words):
    newrows = [[word.replace('"', '')] for word in column]
    app.extend(newrows)
to put all columns one after another into the first column.
newrows = [[word.replace('"', '')] for word in column] creates a new list for each column, with the double quotes stripped and each word wrapped into its own list, and app.extend(newrows) appends all of these lists to your result variable app.
You got your result because of your inner loop and in particular its last line:
for y in x:
    ...
    app = app + list(res)
The for-loop takes each word in each column and list(res) converts the string with the word into a list of characters. So "Bank" becomes ['B', 'a', 'n', 'k'], etc. Then app = app + list(res) creates a new list that contains every item from app and the characters from the word and assigns that to app.
In the end you got a list containing every letter from the file instead of a list with all the words in the file in the right order. The call to writer.writerow(val) then wrote each letter as its own row.
BTW: If your input also uses tabs to delimit columns, it might be easier to use list(csv.reader(f, lineterminator='\n', delimiter='\t')) instead of your simple read with split() and stripping of quotes.
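Put together, the approach above can be sketched as a small self-contained script. The input text is made up and io.StringIO stands in for afterstem.csv and removequotes.csv, so this is an illustration of the idea rather than a drop-in replacement:

```python
import csv
import io

# Made-up input standing in for afterstem.csv: whitespace-separated, quoted words
source = io.StringIO('"Bank" "Loan"\n"River" "Money"\n')
words = [line.split() for line in source]

app = []
for column in zip(*words):
    # Wrap each word in its own list so writerow() treats it as one cell
    app.extend([word.replace('"', '')] for word in column)

out = io.StringIO()  # stands in for open('removequotes.csv', 'w', newline='')
writer = csv.writer(out, lineterminator='\n', delimiter='\t')
writer.writerows(app)
print(out.getvalue())

# For comparison: list() on a string splits it into single characters
chars = list("Bank")
```

Each inner list is one row with a single cell, which is why the whole word stays together when it is written.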

Convert CSV column format from string to datetime

Python 3.4 ... I am trying to convert a specific column in a CSV from string to datetime and write it to the file. I am able to read and convert it, but need help writing the date format change to the file (the last line of code needs to be modified). I got the following error for the last line of code: "Error: sequence expected".
import csv
import datetime
with open ('/path/test.date.conversion.csv') as csvfile:
    readablefile = csv.reader(csvfile)
    writablefile = csv.writer(csvfile)
    for row in readablefile:
        date_format = datetime.datetime.strptime(row[13], '%m/%d/%Y')
        writablefile.writerow(date_format)
So date_format is an instance of the datetime class. You need to format a string to print out to file. You can go with the standard:
date_format.isoformat()
Or customize your own format:
date_format.strftime('%Y-%m-%d %H:%M:%S')
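As a quick illustration of the two options, and of the fact that writerow() expects a sequence of fields rather than a single datetime object, here is a small sketch. The sample date is made up and io.StringIO stands in for a real output file:

```python
import csv
import io
from datetime import datetime

dt = datetime.strptime('03/15/2014', '%m/%d/%Y')  # made-up sample date

iso_text = dt.isoformat()                       # standard ISO 8601 text
custom_text = dt.strftime('%Y-%m-%d %H:%M:%S')  # custom format

out = io.StringIO()  # stands in for a file opened for writing
writer = csv.writer(out, lineterminator='\n')
# writerow() takes a list of fields, so the strings are wrapped in a list
writer.writerow([iso_text, custom_text])
print(out.getvalue())
```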
Also a couple more things to note. You can't write to the same file you're reading from like that. If you want to overwrite that file, create a tmp file then replace it when done, but it's better to just create another file. The best solution imho is to make good use of stdin and stdout like this:
import csv
import sys
from datetime import datetime
FIELD = 13
FMT = '%m/%d/%Y'
readablefile = csv.reader(sys.stdin)
writablefile = csv.writer(sys.stdout)
for row in readablefile:
    dt = datetime.strptime(row[FIELD], FMT)
    row[FIELD] = dt.isoformat()
    writablefile.writerow(row)
Now you can call the script like so:
cat /path/test.date.conversion.csv | python fixdate.py > another_file.csv
At this point, you can make use of the argparse module to provide all kinds of nice parameters to your script like:
Which field to format?
What input and/or output format?
etc.
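One possible shape for such a script is sketched below. The option names (--field, --in-format, --out-format) and their defaults are assumptions chosen for illustration, not something the original script defines, and the demo at the end uses in-memory streams instead of stdin and stdout:

```python
import argparse
import csv
import io
from datetime import datetime

def build_parser():
    # Option names and defaults are illustrative assumptions
    parser = argparse.ArgumentParser(description='Reformat a date column in a CSV stream.')
    parser.add_argument('--field', type=int, default=13, help='zero-based column index')
    parser.add_argument('--in-format', default='%m/%d/%Y', help='strptime format of the input')
    parser.add_argument('--out-format', default=None, help='strftime format; ISO 8601 if omitted')
    return parser

def main(argv, infile, outfile):
    args = build_parser().parse_args(argv)
    writer = csv.writer(outfile, lineterminator='\n')
    for row in csv.reader(infile):
        dt = datetime.strptime(row[args.field], args.in_format)
        # Fall back to ISO 8601 when no output format is given
        row[args.field] = dt.strftime(args.out_format) if args.out_format else dt.isoformat()
        writer.writerow(row)

# Demo with in-memory streams; a real script would call
# main(sys.argv[1:], sys.stdin, sys.stdout) instead.
demo_in = io.StringIO('a,03/15/2014\nb,12/01/2015\n')
demo_out = io.StringIO()
main(['--field', '1', '--out-format', '%Y-%m-%d'], demo_in, demo_out)
print(demo_out.getvalue())
```

Keeping the parsing in main(argv, infile, outfile) makes the same code usable both from the shell pipeline above and from tests.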