Fast import whole SQL table (using cache?) - mysql

I use the DBI and RMySQL packages to import a whole table from the database. The code works as expected. I would like to know whether there is a faster way to import the same table multiple times. For example, I import the table, do some calculations, close the R session, and then import the same table again tomorrow. Is there a way to cache that table somehow so that the repeated import is faster?
The code example (working as expected):
library(RMySQL)
library(DBI)
# connect to the database
connection <- function() {
  DBI::dbConnect(RMySQL::MySQL(),
                 host = "91.234.xx.xxx",
                 port = 3306L,
                 dbname = "xxxx",
                 username = "xxxx",
                 password = "xxxx",
                 Trusted_Connection = "True")
}
# import
db <- connection()
vix <- DBI::dbGetQuery(db, 'SELECT * FROM VIX')
invisible(dbDisconnect(db))
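
One option (not in the original post) is to cache the result to a local file after the first pull and read that file on later days; readRDS() is usually much faster than re-querying over the network. A minimal sketch, assuming a writable cache/ directory and the connection() helper from above:

# hypothetical cache file path -- adjust to your project layout
cache_file <- "cache/vix.rds"
if (file.exists(cache_file)) {
  # fast path: load the locally cached copy
  vix <- readRDS(cache_file)
} else {
  # slow path: query MySQL once, then cache the result for next time
  db <- connection()
  vix <- DBI::dbGetQuery(db, 'SELECT * FROM VIX')
  invisible(dbDisconnect(db))
  dir.create(dirname(cache_file), showWarnings = FALSE)
  saveRDS(vix, cache_file)
}

The trade-off is that the cached copy goes stale when the table changes, so you need some invalidation rule, e.g. delete the file each night or store a cheap check (row count, MAX of a date column) alongside the cache and re-query when it differs.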

Related

Draw a graph from a SQL request using Dash

I have been searching but have not found a simple way to draw a graph from a SQL query.
For example, I have this code and I want to make a bar chart from the result of the query:
import pymysql as sql
from dash import dcc

DB = ...
HOST = ...
USER = ...
PASSWORD = ...

connection = sql.connect(host=HOST,
                         port=x,
                         user=USER,
                         password=PASSWORD,
                         database=DB,
                         cursorclass=sql.cursors.DictCursor)

with connection.cursor() as cursor:
    # Read a single record
    query = 'SELECT COUNT(*) AS count FROM table'  # renamed so it does not shadow the pymysql alias
    cursor.execute(query)
    result = cursor.fetchone()
    print(result)
Also, I would like to update the chart regularly.
Thank you

mySQL export to GCP cloud-storage

I have MySQL running on-prem and would like to migrate it to MySQL running on Cloud SQL (GCP). I first want to export the tables to Cloud Storage as JSON files and then move them from there to MySQL (Cloud SQL) and BigQuery.
Now I wonder how I should do this: export each table as JSON, or just dump the whole database to Cloud Storage? (We might need to change the schema for some tables, which is why I'm thinking of doing it one table at a time.)
Is there any way to do it with Python pandas?
I found this --> Pandas Dataframe to Cloud Storage Bucket
but I don't understand how to connect it to my GCP Cloud Storage, or how to run mycursor.execute("SELECT * FROM table") for all of my tables.
EDIT 1:
So I came up with this, but it only works for the selected schema + table. How can I do this for all tables in the schema?
#!/usr/bin/env python3
import mysql.connector
import pandas as pd
from google.cloud import storage
from google.oauth2 import service_account
import os
import csv

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/home/python2/key.json"
#export GOOGLE_APPLICATION_CREDENTIALS="/home/python2/key.json"
#credentials = storage.Client.from_service_account_json('/home/python2/key.json')
#credentials = service_account.Credentials.from_service_account_file('key.json')

# pull the table into a DataFrame
mydb = mysql.connector.connect(
    host="localhost", user="root", passwd="pass_word", database="test")
mycursor = mydb.cursor(named_tuple=True)
mycursor.execute("SELECT * FROM test")
myresult = mycursor.fetchall()
df = pd.DataFrame(data=myresult)

# upload the table to Cloud Storage as JSON
storage_client = storage.Client()
bucket = storage_client.get_bucket("my-buckets-1234567")
blob = bucket.blob("file.json")
payload = df.to_json(orient='records')
#payload = df.to_csv(sep=";", index=False, quotechar='"', quoting=csv.QUOTE_ALL, encoding="UTF-8")
blob.upload_from_string(data=payload)

Load large data from MySQL from R

I have a large table in MySQL (about 3 million rows and 15 columns) and I'm trying to use some of the data from that table in an R Shiny app.
I've been able to get the connection and write a query in R:
library(DBI)
library(dplyr)
cn <- dbConnect(drv = RMySQL::MySQL(),
                username = "user",
                password = "my_password",
                host = "host",
                port = 3306)
query <- "SELECT * FROM dbo.locations"
However, when I run dbGetQuery(cn, query) it takes a really long time (I ended up closing RStudio after it became unresponsive).
I also tried
res <- DBI::dbSendQuery(cn, query)
repeat {
  df <- DBI::dbFetch(res, n = 1000)
  if (nrow(df) == 0) { break }
}
dbClearResult(dbListResults(cn)[[1]])
since this is similar to reading the data in by chunks, but my resulting df has 0 rows for some reason.
Any suggestions on how to get my table in R? Should I even try to read that table into R? From what I understand, R doesn't handle large data very well.
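
The chunked attempt returns 0 rows because df is overwritten on every iteration, so after the final (empty) fetch it holds only the empty chunk. A sketch that accumulates the chunks instead, reusing cn and query from above:

res <- DBI::dbSendQuery(cn, query)
chunks <- list()
repeat {
  chunk <- DBI::dbFetch(res, n = 10000)    # fetch 10k rows at a time
  if (nrow(chunk) == 0) break              # no rows left
  chunks[[length(chunks) + 1]] <- chunk    # keep every chunk instead of overwriting
}
DBI::dbClearResult(res)
df <- do.call(rbind, chunks)               # one data.frame with all rows

Whether to pull all 3 million rows at all is a separate question; if the Shiny app only needs a subset, it is usually better to push the filtering or aggregation into the SQL and collect only the rows you actually display.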

Cannot see new SQL tables in Django online interface on PythonAnywhere

As part of hosting my website I have an SQL server on pythonanywhere.com with some data collected from my website. I need to aggregate some of the information into a new table stored in the same database. If I use the code below I can create a new table as observed by the SHOW TABLES query. However, I cannot see that table in the Django online interface provided alongside the SQL server.
Why is that the case? How can I make the new table visible in the Django interface so I can browse and modify its contents?
from __future__ import print_function
from mysql.connector import connect as sql_connect
import sshtunnel
from sshtunnel import SSHTunnelForwarder
from copy import deepcopy

sshtunnel.SSH_TIMEOUT = 5.0
sshtunnel.TUNNEL_TIMEOUT = 5.0

def try_query(query):
    try:
        cursor.execute(query)
        connection.commit()
    except Exception:
        connection.rollback()
        raise

if __name__ == '__main__':
    remote_bind_address = ('{}.mysql.pythonanywhere-services.com'.format(SSH_USERNAME), 3306)
    tunnel = SSHTunnelForwarder(('ssh.pythonanywhere.com'),
                                ssh_username=SSH_USERNAME, ssh_password=SSH_PASSWORD,
                                remote_bind_address=remote_bind_address)
    tunnel.start()
    connection = sql_connect(user=SSH_USERNAME, password=DATABASE_PASSWORD,
                             host='127.0.0.1', port=tunnel.local_bind_port,
                             database=DATABASE_NAME)
    print("Connection successful!")
    cursor = connection.cursor()  # get the cursor
    cursor.execute("USE {}".format(DATABASE_NAME))  # select the database
    cursor.execute("SHOW TABLES")
    prev_tables = deepcopy(cursor.fetchall())
    try_query("CREATE TABLE IF NOT EXISTS TestTable(TestName VARCHAR(255) PRIMARY KEY, SupplInfo VARCHAR(255))")
    print("Created table.")
    cursor.execute("SHOW TABLES")
    new_tables = deepcopy(cursor.fetchall())

MySQL and R, inserting more than 1000 rows at a time

I'm using R to insert a data.frame into a MySQL database. The code below inserts 1000 rows at a time successfully. However, it's not practical if I have a data.frame with tens of thousands of rows. How would you do a bulk insert using R? Is it even possible?
## R and MySQL
library(RMySQL)
### create sql connection object
mydb <- dbConnect(MySQL(), dbname = "db", user = 'xxx', password = 'yyy',
                  host = 'localhost',
                  unix.sock = "/Applications/MAMP/mysql/mysql.sock")
# get data ready for mysql
df = data.format
# chunks
df1 <- df[1:1000,]
df2 <- df[1001:2000,]
df3 <- df[2001:nrow(df),]
## SQL insert for data.frame, limit 1000 rows
dbWriteTable(mydb, "table_name", df1, append=TRUE, row.names=FALSE)
dbWriteTable(mydb, "table_name", df2, append=TRUE, row.names=FALSE)
dbWriteTable(mydb, "table_name", df3, append=TRUE, row.names=FALSE)
For completeness, as the link suggests, write the df to a temp table and insert into the destination table as follows:
dbWriteTable(mydb, name = 'temp_table', value = df, row.names = FALSE, append = FALSE)
dbExecute(mydb, "insert into table select * from temp_table")  # dbExecute() is the DBI call for statements that return no rows
Fast bulk insert is now supported by the DBI-based ODBC package, see
this example posted by Jim Hester (https://github.com/r-dbi/odbc/issues/34):
library(DBI)
con <- dbConnect(odbc::odbc(), "MySQL")
dbWriteTable(con, "iris", head(iris), append = TRUE, row.names=FALSE)
dbDisconnect(con)
Since RMySQL is also DBI-based, you only have to "switch" the DB connection to use the odbc package (thanks to R's standardized DBI interface).
Since the RMySQL package "is being phased out in favor of the new RMariaDB package" according to its web site (https://github.com/r-dbi/RMySQL), you could also try switching the driver package to RMariaDB (it may already implement a faster bulk insert).
For details see: https://github.com/r-dbi/RMariaDB
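Switching drivers only changes the dbConnect() call; the rest of the DBI code stays the same. A minimal sketch, assuming the same placeholder credentials and target table as in the question:

library(DBI)
# only the driver changes; credentials and table name are the question's placeholders
con <- dbConnect(RMariaDB::MariaDB(), dbname = "db", user = 'xxx',
                 password = 'yyy', host = 'localhost')
# dbAppendTable() appends a whole data.frame to an existing table
dbAppendTable(con, "table_name", df)
dbDisconnect(con)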
If all else fails, you could put it in a loop (capping the last chunk so it is not padded with NA rows):
for (i in seq(1, nrow(df), by = 1000)) {
  insert_set <- df[i:min(i + 999, nrow(df)), ]
  dbWriteTable(mydb, "table_name", insert_set, append = TRUE, row.names = FALSE)
}