I have several CSV files in one folder, all with the same structure, and I would like to import them into a database for further analysis.
The CSV columns are the following:
Company_Code, Company_Name, Year, Account_number, Account_Description, Value
My intention is to use these CSVs to populate 3 tables in my database in MySQL Workbench. These tables are responsible for organizing:
company_data: id, code and name.
Accounts: id, account number and account description.
Values: companyID, accountID, year and value.
I always want to keep the relationship between companies, values and accounts, but in a way that makes queries and analysis easier.
I have tried Python, but I can't figure out how to attach foreign keys. I don't know if it's better to import into a temporary table first and then populate each table directly in MySQL Workbench.
Edit:
For my Python approach I was trying to retrieve the data from an XML file that has the same information, but it started getting too technical, because I would need to implement several validations that are not necessary with the CSV.
import xml.etree.ElementTree as ET
import mysql.connector

mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    passwd="########",
    database="cvmsanepar"
)
mycursor = mydb.cursor()
mycursor.execute("CREATE TABLE CVM (conta VARCHAR(255), descricao VARCHAR(255), valor1 INTEGER, valor2 INTEGER, valor3 INTEGER)")

dfpTree = ET.parse("InfoFinaDFin.xml")
dfpRoot = dfpTree.getroot()

conta1 = []     # CVM code from the chart of accounts (accounts)
descricao = []  # account description
valor1 = []     # value for the document's year (value year 1)
valor2 = []     # value for the previous year (value year 2)
valor3 = []     # value for the year before the previous one (value year 3)

for numeroDeConta in dfpRoot.iter('NumeroConta'):
    conta1.append(numeroDeConta.text)
for descricaoConta in dfpRoot.iter('DescricaoConta1'):
    descricao.append(descricaoConta.text)
for valor in dfpRoot.findall('InfoFinaDFin'):
    valor1.append(valor.find('ValorConta1').text)
    valor2.append(valor.find('ValorConta2').text)
    valor3.append(valor.find('ValorConta3').text)

def merge(conta1, descricao, valor1, valor2, valor3):
    return [(conta1[i], descricao[i], valor1[i], valor2[i], valor3[i]) for i in range(len(conta1))]

sqlFormula = "INSERT INTO CVM (conta, descricao, valor1, valor2, valor3) VALUES (%s, %s, %s, %s, %s)"
mycursor.executemany(sqlFormula, merge(conta1, descricao, valor1, valor2, valor3))
mydb.commit()
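For the CSV files themselves, one way to attach the foreign keys in Python is to insert the parent rows first and reuse the ids MySQL generates. Below is a minimal sketch, assuming the three tables already exist with AUTO_INCREMENT id columns; the exact table and column names are guesses based on the description above, so adjust them to your actual schema:

import csv
import glob
import mysql.connector

mydb = mysql.connector.connect(host="localhost", user="root", passwd="########", database="cvmsanepar")
cur = mydb.cursor()

company_ids = {}  # Company_Code -> company_data.id, so each company is inserted only once
account_ids = {}  # Account_number -> Accounts.id, so each account is inserted only once

for path in glob.glob("*.csv"):  # every CSV in the folder shares the same structure
    with open(path, newline="") as f:
        reader = csv.reader(f)
        # next(reader)  # uncomment to skip a header row, if your files have one
        for code, name, year, acc_no, acc_desc, value in reader:
            if code not in company_ids:
                cur.execute("INSERT INTO company_data (code, name) VALUES (%s, %s)", (code, name))
                company_ids[code] = cur.lastrowid  # id generated by AUTO_INCREMENT
            if acc_no not in account_ids:
                cur.execute("INSERT INTO Accounts (account_number, account_description) VALUES (%s, %s)", (acc_no, acc_desc))
                account_ids[acc_no] = cur.lastrowid
            # the two ids above become the foreign keys of the values row;
            # `Values` is backticked because VALUES is a reserved word in MySQL
            cur.execute("INSERT INTO `Values` (companyID, accountID, year, value) VALUES (%s, %s, %s, %s)",
                        (company_ids[code], account_ids[acc_no], year, value))

mydb.commit()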
Related
I'm developing an API in Python 3 using Flask and trying to insert data into MySQL, but the response contains only the values, so I can't display it properly in the JSON answer to the app.
Python Code
from flask import Flask, request, jsonify
from flask_mysqldb import MySQL  # assuming Flask-MySQLdb, which uses these MYSQL_* config keys

app = Flask(__name__)
app.config['MYSQL_HOST'] = 'localhost'
app.config['MYSQL_USER'] = 'root'
app.config['MYSQL_PASSWORD'] = ''
app.config['MYSQL_DB'] = 'didaxis'
mysql = MySQL(app)

@app.route('/insertAlumno', methods=['POST'])
def insertAlumno():
    nombre = request.json['nombre']
    apellidoPaterno = request.json['apellidoPaterno']
    apellidoMaterno = request.json['apellidoMaterno']
    fechaNacimiento = request.json['fechaNacimiento']
    direccion = request.json['direccion']
    telefono = request.json['telefono']
    correo = request.json['correo']
    matricula = request.json['matricula']
    curso = request.json['curso']
    dataUser = (nombre, apellidoPaterno, apellidoMaterno, fechaNacimiento, direccion, telefono, correo, matricula, curso)
    conect = mysql.connection.cursor()
    conect.callproc('insertarAlumno', dataUser)
    data = conect.fetchall()
    data = jsonify(data)
    return data
Stored Procedure
DELIMITER //
CREATE PROCEDURE insertarAlumno(_nombre VARCHAR(50),
_apellidoPaterno VARCHAR(30),
_apellidoMaterno VARCHAR(30),
_fechaNacimiento DATE,
_direccion VARCHAR(100),
_telefono VARCHAR(20),
_correo VARCHAR(50),
_matricula VARCHAR(50),
_curso VARCHAR(10))
BEGIN
DECLARE idPersonaTemp,idUserTemp INT DEFAULT 0;
INSERT INTO persona (nombre,apellidoPaterno,apellidoMaterno,fechaNacimiento,direccion,telefono,correo)
VALUES (_nombre,_apellidoPaterno,_apellidoMaterno,_fechaNacimiento,_direccion,_telefono,_correo);
SET idPersonaTemp = last_insert_id();
INSERT INTO usuario (usuario,pass,estatus)
VALUES (_matricula,_matricula,1);
SET idUserTemp = last_insert_id();
INSERT INTO alumno VALUES (_matricula,idPersonaTemp,idUserTemp,_curso,1);
SELECT * FROM vista_alumnos WHERE id = _matricula;
END //
DELIMITER ;
The response I get from Postman:
[
[
"Simon",
"Lopez",
"Lopez",
"Fri, 23 Oct 1998 00:00:00 GMT",
"Miguel Hidalgo 515",
"4761138167",
"simon.valt23#gmail.com",
"178724",
"178724",
"178724",
9,
9,
"1APre",
1
]
]
I hope that someone can explain this issue to me, and how I could get a JSON response with column names to identify the values.
I got the answer by adding a few lines to the Python code:
conect = mysql.connection.cursor()
conect.callproc('insertarAlumno', dataUser)
*columns = [x[0] for x in conect.description]
data = conect.fetchall()
*json_data = []
*for result in data:
*    json_data.append(dict(zip(columns, result)))
*return jsonify(json_data)
The lines that start with a * were modified from the original post or were newly added.
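This works because cursor.description holds one entry per result column, with the column name in the first position, so dict(zip(columns, result)) labels every value with its column name.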
Actually I want to copy one table's data to another table. It has no unique ID; the only relation between the two is a "fra" number and a "pra" number, but neither is unique on its own. However, fra and pra concatenated are unique for each row. Also, one table's column is sex (customer table) and the other's is gender (new_customer table); gender is a Boolean, while sex is a string of 'm' and 'f'. How can I copy from the customer table to the new_customer table?
I tried it this way:
UPDATE new_customer JOIN customer
SET registrations.name = customer.nam,
registrations.surname = customer.vornam,
registrations.ort = customer.ort,
registrations.phone = customer.telmbl,
registrations.surname = customer.vornam
WHERE registrations.fra = customer.fra
and registrations.pra = customer.pra;
Can anybody help me?
You can try something like the following:
UPDATE new_customer AS new_c, customer AS old_c
SET
new_c.name = old_c.nam,
new_c.surname = old_c.vornam,
new_c.ort = old_c.ort,
new_c.phone = old_c.telmbl
WHERE
new_c.fra = old_c.fra
AND new_c.pra = old_c.pra;
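Note that this copies the name and contact columns but not sex into gender. Since gender is a Boolean and sex is a string of 'm' and 'f', you could also add something like new_c.gender = (old_c.sex = 'f') to the SET list, assuming 'f' should map to TRUE (flip the comparison if 'm' should).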
I am writing a Python script which connects to a MySQL database. It creates databases based on the folder paths, and in each of those databases it creates tables based on the files present in the corresponding folder.
The following code does everything fine, but I want to optimize it so that it loads data into the tables only if they are empty.
The problem with the code below is that whenever I run it, it creates a table if it is not present and moves on if it already exists; but when it comes to loading data into the tables, every run of the script loads the data from the files into the tables all over again.
I want to tweak it so that it loads data into a table if and only if the data is not already present in it. If the data is present, the code should move on.
I tried something along the lines of CREATE TABLE IF NOT EXISTS, but was not successful.
from mysql.connector.constants import ClientFlag
import mysql.connector
import pathlib
import os
from termcolor import colored

hostname = 'hostname'
username = 'usrname'
password = '12345'
database = 'd1'
portname = '12345'

print(colored('\nConnecting SQL database using host = ' + hostname + ' , username = ' + username + ' , port = ' + portname + ' , database = ' + database + '.', 'cyan', attrs=['reverse', 'blink']))
print('\n')

myConnection = mysql.connector.connect(user=username, passwd=password, host=hostname, port=portname, database=database, client_flags=[ClientFlag.LOCAL_FILES])
myCursor = myConnection.cursor()
rootDir35 = '/mnt/Wdrive/pc35/SK/E13'
filenames35 = os.listdir(rootDir35)
root35 = pathlib.Path(rootDir35)
non_empty_dirs35 = {str(p35.parent) for p35 in root35.rglob('*') if p35.is_file()}
#35
try:
    print(colored('**** Starting SQL-Queries for pc35 **** \n', 'green', attrs=['reverse', 'blink']))
    for f35 in non_empty_dirs35:
        # build the database name from the folder path; the replaces must be chained
        # (the original 'replace(...) or replace(...)' never applied the second one)
        dB35 = f35.replace('/', '_').replace('-', '_')
        for dirName35, subdirList35, fileList35 in os.walk(rootDir35):
            if dirName35 == f35:
                print(colored('Current Working Directory is: %s ' % f35, 'cyan'))
                createDB35 = 'CREATE DATABASE IF NOT EXISTS %s' % dB35
                myCursor.execute(createDB35)
                print(colored('Database of pc35 Created : %s' % dB35, 'cyan'))
                useDB35 = 'USE %s' % dB35
                myCursor.execute(useDB35)
                myConnection.commit()
                print(colored('Database in use : %s' % dB35, 'cyan'))
                print(' ')
                for fname35 in fileList35:
                    completePath35 = '%s/%s' % (dirName35, fname35)
                    tblname35 = os.path.basename(fname35).split('.')[0]
                    if '-' not in tblname35 and '.' not in tblname35:
                        sql35 = 'CREATE TABLE IF NOT EXISTS %s (Datum varchar(50), Uhrzeit varchar(13), UpsACT_V varchar(6), UpsPRE_V varchar(6), IpsACT_A varchar(6), IpsPRE_A varchar(6), PpsACT_W varchar(6), PpsPRE_W varchar(10), UelACT_V varchar(6), UelPRE_V varchar(6), IelACT_A varchar(8), IelPRE_A varchar(8), PelACT_W varchar(8), PelPRE_W varchar(8), Qlad_Ah varchar(10), Qlast_Ah varchar(10))' % tblname35
                        myCursor.execute(sql35)
                        myConnection.commit()
                        print(colored('The Table %s in database %s is created' % (tblname35, dB35), 'yellow'))
                        loadData35 = "LOAD DATA LOCAL INFILE '%s' INTO TABLE %s" % (completePath35, tblname35)
                        myCursor.execute(loadData35)
                        myConnection.commit()
                        print(colored('Data loaded from file %s into table %s ' % (fname35, tblname35), 'green'))
                        print(' ')
    print(colored('**** SQL-Queries for pc35 successfully executed **** \n', 'green', attrs=['reverse', 'blink']))
except Exception:
    print(' ')
    print(colored('**** SQL queries for pc35 were not executed. Please refer to the report or user manual for more details ****', 'red', attrs=['reverse', 'blink']))
    print(' ')
What I want is something like 'LOAD DATA if not exists' for a table.
Do you think this is possible, and what should I do to achieve it?
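One way to approximate 'LOAD DATA if not exists' is to check whether the target table already contains rows before issuing LOAD DATA. A minimal sketch under that assumption (the helper name load_if_empty is mine; it reuses the cursor and connection from the script above):

def load_if_empty(cursor, connection, table, path):
    # Returns True if the file was loaded, False if the table already had rows.
    cursor.execute('SELECT EXISTS(SELECT 1 FROM %s LIMIT 1)' % table)
    if cursor.fetchone()[0]:
        return False  # data already present: move on
    cursor.execute("LOAD DATA LOCAL INFILE '%s' INTO TABLE %s" % (path, table))
    connection.commit()
    return True

In the inner loop you would then replace the unconditional LOAD DATA statement with load_if_empty(myCursor, myConnection, tblname35, completePath35).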
I am getting tweet_id values from a table in my database and storing them in a data frame in R. The problem is that the tweet_id values are not being added to the data frame correctly.
snapshot of my table:
snapshot of my dataframe in rstudio:
As you can see, there is no tweet_id = '882100387989291008' (the 3rd value in my data frame) in my database table.
My R script file:
#connecting with db
#myDB = dbConnect(MySQL(), user = "root", password = "F33mtHaDD", dbname = "dashboard", host= "127.0.0.1", port="8889")
myDB =dbConnect(MySQL(), user = "root", password ="F33mtHaDD", dbname = "dashboard")
options(scipen=10)
options()$scipen
#running a query, retrieving the data and saving it in an object
rs = dbSendQuery(myDB, "SELECT tweet_id, sentiment, text FROM dashboard.sen_tweets_twitter WHERE text <> '';")
#getting the result. The function fetch() saves the result in a dataframe
datafetd = fetch(rs, n=-1)
#removing extra whitespaces
#new = stripWhitespace(datafetd$text)
#dataafterclean =data.frame(new)
#converts into one single string
review_text = paste(datafetd$text)
review_id = paste(datafetd$tweet_id)
print(review_id)
rm(tm_tdm)
#find the number of tweets
tweets_num = length(review_text)
#Disconnect connections
dbdisconnect = lapply(dbListConnections( dbDriver( drv = "MySQL")), dbDisconnect)
#checking if all connections have been closed
dbListConnections(MySQL())
The values in my database are the correct ones. How do I solve this problem?
Database tables represent unordered sets of data. In your table snapshot, it appears that the records are sorted by ID in ascending order. I postulate that all the data did in fact make it into your data frame, but that data frame has a different order than what you showed when querying your table. To confirm this, you can try sorting the data frame ascending on the ID:
datafetd[with(datafetd, order(tweet_id)), ]
import csv
import MySQLdb
conn = MySQLdb.connect('localhost','tekno','poop','media')
cursor = conn.cursor()
txt = csv.reader(file('movies.csv'))
for row in txt:
    cursor.execute('insert into shows_and_tv(watched_on,title,score_rating)' 'values ("%s","%s","%s")', row)
conn.close()
when I run this I get
TypeError: not enough arguments for format string
but it matches up.
The CSV is formatted like
dd-mm-yyyy,string,tinyint
which matches the fields in the database.
I do not have a MySQL database to play with, so I did what you need in SQLite instead. It should be quite easy to adapt this to your needs.
import csv
import sqlite3
from collections import namedtuple

conn = sqlite3.connect('statictest.db')
c = conn.cursor()
c.execute('''CREATE TABLE IF NOT EXISTS movies (ID INTEGER PRIMARY KEY AUTOINCREMENT, watched_on, title, score_rating)''')

record = namedtuple('record', ['watched_on', 'title', 'score_rating'])

SQL = '''
INSERT INTO movies (watched_on, title, score_rating) VALUES (?, ?, ?)
'''

with open('statictest.csv', 'r') as file:
    read_data = csv.reader(file)
    for row in read_data:
        watched_on, title, score_rating = row
        data = record(watched_on, title, score_rating)
        c.execute(SQL, data)
conn.commit()
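Adapting this back to the original MySQL setup, the key points are to leave the %s placeholders unquoted (MySQLdb quotes and escapes the values itself) and to skip rows that do not have exactly three fields; a blank trailing line in the CSV is a common cause of the 'not enough arguments for format string' error. A sketch under those assumptions, reusing the connection values from the question:

import csv
import MySQLdb

conn = MySQLdb.connect('localhost', 'tekno', 'poop', 'media')
cursor = conn.cursor()

# Placeholders are not quoted; MySQLdb quotes and escapes the values itself.
sql = 'INSERT INTO shows_and_tv (watched_on, title, score_rating) VALUES (%s, %s, %s)'

with open('movies.csv') as f:
    for row in csv.reader(f):
        if len(row) == 3:  # skip blank or short lines that cannot fill all three placeholders
            cursor.execute(sql, row)

conn.commit()
conn.close()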