Convert numbers from Mathematica CSV export to NumPy complex array

I have exported data from Mathematica to a CSV file. The file structure looks as follows:
"x","y","Ex","Ey"
0.,0.,0.+0.*I,-3.0434726787506006*^-12+3.4234894344189825*^-12*I
0.,0.,0.+0.*I,-5.0434726787506006*^-12+10.4234894344189825*^-13*I
...
I'm reading in the data with pandas, but I get an error:
import csv
import pandas as pd
import numpy as np

df = pd.read_csv('filename.csv')
df.columns = ['x', 'y', 'Ex', 'Ey']
df['Ey'] = df['Ey'].str.replace('*^', 'E')
df['Ey'] = df['Ey'].str.replace('I', '1j').apply(lambda x: np.complex(x))
Edit: I'm getting the following error in the second-to-last line of my code:
Traceback (most recent call last):
  File "plot.py", line 6, in <module>
    df['Ey'] = df['Ey'].str.replace('*^','E')
  File "/home/.../.local/lib/python2.7/site-packages/pandas/core/strings.py", line 1579, in replace
    flags=flags)
  File "/home/.../.local/lib/python2.7/site-packages/pandas/core/strings.py", line 424, in str_replace
    regex = re.compile(pat, flags=flags)
  File "/usr/lib/python2.7/re.py", line 194, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python2.7/re.py", line 251, in _compile
    raise error, v  # invalid expression
sre_constants.error: nothing to repeat
When I write instead
df['Ey'] = df['Ey'].str.replace('*', 'E')
or
df['Ey'] = df['Ey'].str.replace('^', 'E')
I don't get an error. It seems as if only a single character can be replaced?
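For what it's worth, the error itself comes from pandas' str.replace treating the pattern as a regular expression, and '*^' is not valid regex syntax (the '*' has nothing to repeat). A minimal sketch of a direct fix, assuming a pandas new enough to accept the regex keyword (0.23+); on older versions, passing re.escape('*^') as the pattern has the same effect:

import numpy as np
import pandas as pd

df = pd.read_csv('filename.csv')
df.columns = ['x', 'y', 'Ex', 'Ey']

# Replace the literal strings, bypassing regex interpretation entirely.
# Mathematica's *^ exponent marker becomes E, and *I becomes Python's j suffix.
df['Ey'] = df['Ey'].str.replace('*^', 'E', regex=False)
df['Ey'] = df['Ey'].str.replace('*I', 'j', regex=False)
df['Ey'] = df['Ey'].apply(complex)

After the two literal replacements each entry looks like '-3.0434726787506006E-12+3.4234894344189825E-12j', which Python's complex() constructor parses directly.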

Why beat yourself up messing with ASCII-encoded floats?
Here is how to exchange complex arrays between Python and Mathematica using raw binary files.
In Mathematica:
cdat = RandomComplex[{0, 1 + I}, 5]
{0.0142816 + 0.0835513 I, 0.434109 + 0.977644 I,
0.579678 + 0.337286 I, 0.426271 + 0.166166 I, 0.363249 + 0.0867334 I}
f = OpenWrite["test", BinaryFormat -> True]
BinaryWrite[f, cdat, "Complex64"]
Close[f]
or:
Export["test", cdat, "Binary", "DataFormat" -> "Complex64"]
In Python:
import numpy as np

x = np.fromfile('test', np.complex64)
print(x)
[ 0.01428160+0.0835513j 0.43410850+0.97764391j 0.57967812+0.3372865j
0.42627081+0.16616575j 0.36324903+0.08673338j]
Going the other way, in Python:
y = np.array([[1+2j], [3+4j]], np.complex64)
y.tofile('test')
and in Mathematica:
f = OpenRead["test", BinaryFormat -> True]
BinaryReadList[f, "Complex64"]
Close[f]
Note this will be several orders of magnitude faster than exchanging data via CSV.
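One caveat (my addition, not part of the original answer): "Complex64" is single precision on both sides, two 32-bit reals per value. If the exported field needs full double precision, the same exchange should work with Mathematica's "Complex128" and NumPy's complex128:

import numpy as np

# assumes the file was written with BinaryWrite[f, cdat, "Complex128"]
x = np.fromfile('test', np.complex128)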

Related

Error message when importing .csv files into MySQL using Python

I am a novice when it comes to Python and I am trying to import a .csv file into an already existing MySQL table. I have tried it several different ways but I cannot get anything to work. Below is my latest attempt (not the best syntax I'm sure). I originally tried using ‘%s’ instead of ‘?’, but that did not work. Then I saw an example of the question mark but that clearly isn’t working either. What am I doing wrong?
import mysql.connector
import pandas as pd

db = mysql.connector.connect(**Login Info**)
mycursor = db.cursor()

df = pd.read_csv("CSV_Test_5.csv")

insert_data = (
    "INSERT INTO company_calculations.bs_import_test(ticker, date_updated, bs_section, yr_0, yr_1, yr_2, yr_3, yr_4, yr_5, yr_6, yr_7, yr_8, yr_9, yr_10, yr_11, yr_12, yr_13, yr_14, yr_15)"
    " VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
)

for row in df.itertuples():
    data_inputs = (row.ticker, row.date_updated, row.bs_section, row.yr_0, row.yr_1, row.yr_2, row.yr_3, row.yr_4, row.yr_5, row.yr_6, row.yr_7, row.yr_8, row.yr_9, row.yr_10, row.yr_11, row.yr_12, row.yr_13, row.yr_14, row.yr_15)
    mycursor.execute(insert_data, data_inputs)

db.commit()
Error Message:
Traceback (most recent call last):
  File "C:\...\Python_Test\Excel_Test_v1.py", line 33, in <module>
    mycursor.execute(insert_data, data_inputs)
  File "C:\...\mysql\connector\cursor_cext.py", line 325, in execute
    raise ProgrammingError(
mysql.connector.errors.ProgrammingError: Not all parameters were used in the SQL statement
MySQL Connector/Python supports named parameters (which also includes printf-style parameters, i.e. format).
>>> import mysql.connector
>>> mysql.connector.paramstyle
'pyformat'
According to PEP-249 (DB API level 2.0) the definition of pyformat is:
pyformat: Python extended format codes, e.g. ...WHERE name=%(name)s
Example:
>>> cursor.execute("SELECT %s", ("foo", ))
>>> cursor.fetchall()
[('foo',)]
>>> cursor.execute("SELECT %(var)s", {"var" : "foo"})
>>> cursor.fetchall()
[('foo',)]
AFAIK the qmark paramstyle (using a question mark as a placeholder) is only supported by MariaDB Connector/Python.
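Applied to the question's code, that means format-style %s placeholders instead of question marks. A minimal sketch reusing the question's table and column names (untested against the actual schema):

insert_data = (
    "INSERT INTO company_calculations.bs_import_test"
    "(ticker, date_updated, bs_section, yr_0, yr_1, yr_2, yr_3, yr_4, yr_5,"
    " yr_6, yr_7, yr_8, yr_9, yr_10, yr_11, yr_12, yr_13, yr_14, yr_15)"
    " VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s,"
    " %s, %s, %s, %s, %s, %s, %s, %s, %s)"
)
for row in df.itertuples():
    # one %s per column, parameters passed as a tuple
    mycursor.execute(insert_data, (row.ticker, row.date_updated, row.bs_section,
                                   row.yr_0, row.yr_1, row.yr_2, row.yr_3,
                                   row.yr_4, row.yr_5, row.yr_6, row.yr_7,
                                   row.yr_8, row.yr_9, row.yr_10, row.yr_11,
                                   row.yr_12, row.yr_13, row.yr_14, row.yr_15))
db.commit()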

json errors when appending data with Python

Good day.
I have a small password generator program and I want to save the created passwords into a JSON file (appending each time) so I can add them to an SQLite3 database.
Just trying to implement the append functionality, I receive several errors that I don't understand.
Here are the errors I receive, and below that is the code itself.
I'm quite new to Python, so additional details are welcome.
Traceback (most recent call last):
  File "C:\Users\whitmech\OneDrive - Six Continents Hotels, Inc\04 - Python\02_Mosh_Python_Course\Py_Projects\PWGenerator.py", line 32, in <module>
    data = json.load(file)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.1264.0_x64__qbz5n2kfra8p0\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
import random
import string
import sqlite3
import json
from pathlib import Path

print('hello, Welcome to Password generator!')

# input the length of password
length = int(input('\nEnter the length of password: '))

# define data
lower = string.ascii_lowercase
upper = string.ascii_uppercase
num = string.digits
symbols = string.punctuation
# string.ascii_letters

# combine the data
all = lower + upper + num + symbols

# use random
temp = random.sample(all, length)

# create the password
password = "".join(temp)

filename = 'saved.json'
entry = {password}
with open(filename, "r+") as file:
    data = json.load(file)
    data.append(entry)
    file.seek(0)
    json.dump(data, file)

# print the password
print(password)
Update: I've changed the JSON code as directed and it works, but when trying to do the SQLite3 code I'm now receiving a TypeError.
Code:
with open(filename, "r+") as file:
    try:
        data = json.load(file)
        data.append(entry)
    except json.decoder.JSONDecodeError as e:
        data = entry
    file.seek(0)
    json.dump(data, file)

# print the password
print(password)

store = input('Would you like to store the password? ')
if store == "Yes":
    pwStored = json.loads(Path("saved.json").read_text())
    with sqlite3.connect("db.pws") as conn:
        command = "INSERT INTO Passwords VALUES (?)"
        for i in pwStored:
            conn.execute(command, tuple(i.values))  # Error with this code
        conn.commit()
else:
    exit()
Error:
AttributeError: 'str' object has no attribute 'values'
The error is because your JSON file is empty; you need to update the following block:
entry = [password]
with open(filename, "r+") as file:
    try:
        data = json.load(file)
        data.extend(entry)
    except json.decoder.JSONDecodeError as e:
        data = entry
    file.seek(0)
    json.dump(data, file)
Also, you are adding the password to a set (i.e., entry), which will again throw an error: TypeError: Object of type set is not JSON serializable.
So you need to convert it to either a list or a dict.
Note: here I have used entry as a list.
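As for the follow-up AttributeError (my addition, not part of the original answer): once entry is a list, pwStored is a list of plain strings, and a string has no .values attribute. Binding each string as a one-element tuple should be enough, sketched as:

store = input('Would you like to store the password? ')
if store == "Yes":
    pwStored = json.loads(Path("saved.json").read_text())
    with sqlite3.connect("db.pws") as conn:
        command = "INSERT INTO Passwords VALUES (?)"
        for i in pwStored:
            # i is already a string, so wrap it in a one-element tuple
            conn.execute(command, (i,))
        conn.commit()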

How to read a JSON file and fit an LSTM model?

I am trying to apply an LSTM to the HuffPost news dataset. The data is in JSON format (https://www.kaggle.com/rmisra/news-category-dataset). I have tried this code and got errors; I don't know what's wrong with it.
from keras.layers import LSTM, Activation, Dense, Dropout, Input, Embedding
from keras.optimizers import RMSprop
from keras.preprocessing.text import Tokenizer
from keras.preprocessing import sequence
from keras.utils import to_categorical
from keras.callbacks import EarlyStopping
import json
from sklearn.preprocessing import LabelBinarizer

with open('News_Category_Dataset_v2.json', 'r') as f:
    train = json.load(f)

Y_train = list(train.values())
lb = LabelBinarizer()
X_train = lb.fit_transform(list(train.keys()))
##
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.15)
##
max_words = 1000
max_len = 150
tok = Tokenizer(num_words=max_words)
tok.fit_on_texts(X_train)
sequences = tok.texts_to_sequences(X_train)
sequences_matrix = sequence.pad_sequences(sequences, maxlen=max_len)

def RNN():
    inputs = Input(name='inputs', shape=[max_len])
    layer = Embedding(max_words, 50, input_length=max_len)(inputs)
    layer = LSTM(64)(layer)
    layer = Dense(256, name='FC1')(layer)
    layer = Activation('relu')(layer)
    layer = Dropout(0.5)(layer)
    layer = Dense(1, name='out_layer')(layer)
    layer = Activation('softmax')(layer)
    model = Model(inputs=inputs, outputs=layer)
    return model

model = RNN()
model.summary()
model.compile(loss='binary_crossentropy', optimizer=RMSprop(), metrics=['accuracy'])
model.fit(sequences_matrix, Y_train, batch_size=128, epochs=10,
          validation_split=0.2, callbacks=[EarlyStopping(monitor='val_loss', min_delta=0.0001)])
I got these errors:
Traceback (most recent call last):
  File ".\Hpnews.py", line 30, in <module>
    train = json.load(f)
  File "C:\Users\a\Anaconda3\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Users\a\Anaconda3\lib\json\__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "C:\Users\a\Anaconda3\lib\json\decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 366)
This is my JSON file format (one record shown):
{
  "category": "CRIME",
  "headline": "There Were 2 Mass Shootings In Texas Last Week, But Only 1 On TV",
  "authors": "Melissa Jeltsen",
  "link": "huffingtonpost.com/entry/…",
  "short_description": "She left her husband. He killed their children. Just another day in America.",
  "date": "2018-05-26"
}
The JSON is not a typical JSON file but NDJSON ("newline-delimited JSON"), with one object per line, which json.load cannot open as a single document.
You should use pandas to load your data:
import pandas as pd

data = pd.read_json('News_Category_Dataset_v2.json', lines=True)
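If you'd rather stay with the standard library (my addition, not from the original answer), NDJSON can also be parsed one line at a time, since every line is a complete JSON object:

import json

with open('News_Category_Dataset_v2.json', 'r') as f:
    records = [json.loads(line) for line in f if line.strip()]

headlines = [r['headline'] for r in records]
categories = [r['category'] for r in records]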

too many values to unpack (expected 2) with LDA

I received the error "too many values to unpack (expected 2)" when running the code below. Can anyone help me? I have added more details.
import gensim
import gensim.corpora as corpora

dictionary = corpora.Dictionary(doc_clean)
doc_term_matrix = [dictionary.doc2bow(doc) for doc in doc_clean]
Lda = gensim.models.ldamodel.LdaModel
ldamodel = Lda(doc_term_matrix, num_topics=3, id2word=dictionary, passes=50, per_word_topics=True, eval_every=1)
print(ldamodel.print_topics(num_topics=3, num_words=20))
for i in range(0, 46):
    for index, score in sorted(ldamodel[doc_term_matrix[i]], key=lambda tup: -1*tup[1]):
        print("subject", i)
        print("\n")
        print("Score: {}\t \nTopic: {}".format(score, ldamodel.print_topic(index, 6)))
Focusing on the loop, since this is where the error is raised, let's take it one iteration at a time.
>>> import numpy as np  # just so we can use np.shape()
>>> i = 0  # value in first loop
>>> x = sorted(ldamodel[doc_term_matrix[i]], key=lambda tup: -1*tup[1])
>>> np.shape(x)
(3, 3, 2)
>>> for index, score in x:
...     pass
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 2)
Here is where your error is coming from: you are expecting the returned structure to contain 2 elements per item, but it is a multi-slice matrix with no simple inferable way to unpack it. I do not personally have enough experience with this subject material to infer what you mean to be doing; I can only show you where your problem comes from. Hope this helps!
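One likely specific cause worth adding (my note, not part of the original answer): the model was trained with per_word_topics=True, and in that mode indexing a gensim LdaModel returns a (topic_distribution, word_topics, phi_values) triple rather than a flat list of (topic, score) pairs. A sketch of a fix is to ask only for the document-topic distribution:

# get_document_topics returns plain (topic_id, probability) pairs
for i in range(0, 46):
    topics = ldamodel.get_document_topics(doc_term_matrix[i])
    for index, score in sorted(topics, key=lambda tup: -1 * tup[1]):
        print("subject", i)
        print("Score: {}\nTopic: {}".format(score, ldamodel.print_topic(index, 6)))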

FileNotFoundError: [Errno 2] File b'Downloads/BetterLifeIndex2015.csv' does not exist: b'Downloads/BetterLifeIndex2015.csv'

Resolved
Answer: I changed the path; it was in fact an incorrect path after all. I used the absolute path (Alt+D, then copy from File Explorer) and put an "r" before the path so it is treated as a raw string.
# load the data
BetterLifeIndex = pd.read_csv(r"C:\Users\brede\OneDrive\Dokumenter\Downloads\BetterLifeIndex2015.csv", thousands=',')
gdp_per_capita = pd.read_csv(r"C:\Users\brede\OneDrive\Dokumenter\Downloads\gdpcapita.csv", thousands=',', delimiter='\t',
                             encoding='latin1', na_values="n/a")
I'm new to Python and I'm running an example from a machine learning book. I can't get Python to read my CSV file.
Code:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sklearn.linear_model

def prepare_country_stats(oecd_bli, gdp_per_capita):
    oecd_bli = oecd_bli[oecd_bli["INEQUALITY"]=="TOT"]
    oecd_bli = oecd_bli.pivot(index="Country", columns="Indicator", values="Value")
    gdp_per_capita.rename(columns={"2015": "GDP per capita"}, inplace=True)
    gdp_per_capita.set_index("Country", inplace=True)
    full_country_stats = pd.merge(left=oecd_bli, right=gdp_per_capita,
                                  left_index=True, right_index=True)
    full_country_stats.sort_values(by="GDP per capita", inplace=True)
    remove_indices = [0, 1, 6, 8, 33, 34, 35]
    keep_indices = list(set(range(36)) - set(remove_indices))
    return full_country_stats[["GDP per capita", 'Life satisfaction']].iloc[keep_indices]

# load the data
oecd_bli = pd.read_csv("Downloads/BetterLifeIndex2015.csv", thousands=',')
gdp_per_capita = pd.read_csv("C:/Users/brede/Downloads/gdpcapita.csv", thousands=',', delimiter='\t',
                             encoding='latin1', na_values="n/a")

# prepare the data
country_stats = prepare_country_stats(oecd_bli, gdp_per_capita)
x = np.c_[country_stats["GDP per capita"]]
y = np.c_[country_stats["Life satisfaction"]]

# visualize the data
country_stats.plot(kind='scatter', x="GDP per capita", y='Life satisfaction')

# select a linear model
model = sklearn.linear_model.LinearRegression()

# train the model
model.fit(x, y)

# make a prediction for Cyprus
X_new = [[22587]]  # Cyprus GDP per capita
print(model.predict(X_new))  # outputs [[5.96242338]]
The output is:
runfile('C:/Users/brede/Downloads/practice_gdp.py', wdir='C:/Users/brede/Downloads')
Traceback (most recent call last):
  File "<ipython-input-59-2f130edd277c>", line 1, in <module>
    runfile('C:/Users/brede/Downloads/practice_gdp.py', wdir='C:/Users/brede/Downloads')
  File "C:\Users\brede\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)
  File "C:\Users\brede\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/brede/Downloads/practice_gdp.py", line 31, in <module>
    oecd_bli = pd.read_csv("Downloads/BetterLifeIndex2015.csv", thousands = ',')
  File "C:\Users\brede\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 685, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\brede\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 457, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "C:\Users\brede\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "C:\Users\brede\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1135, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Users\brede\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1917, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas\_libs\parsers.pyx", line 689, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'Downloads/BetterLifeIndex2015.csv' does not exist: b'Downloads/BetterLifeIndex2015.csv'
I have triple-checked the path to the file and I can't seem to figure this out! All help is appreciated.
This was done in Spyder; I also tried Jupyter with the same result. I've even copied the path, etc.
help...
I think you have to include the full path with '/' separators. Try 'C:/Users/brede/OneDrive/...'.
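A pattern that sidesteps this class of error (my suggestion, not from the answers above) is to build the path from a known root rather than relying on the working directory; the folder below is an assumption based on the resolved path in the accepted fix:

from pathlib import Path
import pandas as pd

# Assumed location, taken from the resolved answer; adjust as needed.
csv_path = Path.home() / "OneDrive" / "Dokumenter" / "Downloads" / "BetterLifeIndex2015.csv"

# Fail early with an explicit message if the path is still wrong.
if not csv_path.exists():
    raise FileNotFoundError(f"CSV not found at {csv_path}")

oecd_bli = pd.read_csv(csv_path, thousands=',')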