I want to insert a tab-delimited file that contains both Japanese and English characters, along with special characters, into MySQL. I am using RMySQL to do this. One of the solutions I tried gives the error below:
dbWriteTable(con, "japan_test2", d, append = T, row.names=FALSE);
Error in mysqlExecStatement(conn, statement, ...) : RS-DBI driver: (could not run statement: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '˜¨å¤œã®ã‚³ãƒ³_ text)' at line 3)
In addition: Warning message:
In strsplit(msg, "\n") : input string 1 is invalid in this locale
[1] FALSE
Warning message:
In mysqlWriteTable(conn, name, value, ...) :
could not create table: aborting mysqlWriteTable
Current Locale: LC_COLLATE=English_United States.1252;LC_CTYPE=Japanese_Japan.932;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
Locale Tried: US, Japanese.
Encoding Tried: UTF-8,16,ASCII.
System: Windows7
RStudio Version 0.98.977
MySQL 5.4.27 CE
You are probably not setting the encoding of the connection properly. You can try this:
con <- dbConnect(MySQL(), user=user, password=password,dbname=dbname, host=host, port=port)
# With the next line I try to get the right encoding (it works for Spanish keyboards)
encoding <- if(grepl(pattern = 'utf8|utf-8',x = Sys.getlocale(),ignore.case = T)) 'utf8' else 'latin1'
dbGetQuery(con,paste("SET names",encoding))
dbGetQuery(con,paste0("SET SESSION character_set_server=",encoding))
dbGetQuery(con,paste0("SET SESSION character_set_database=",encoding))
dbWriteTable( con, value = dfr, name = table, append = TRUE, row.names = FALSE )
dbDisconnect(con)
Remember that the connection encoding has to match your local encoding. The line that assigns encoding detects my local encoding, and the SET statements after it apply that encoding to the connection. Good luck!
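For readers working in Python rather than R, the locale check in the answer above can be sketched as follows. This only mirrors the answer's heuristic and is not a definitive rule: the names pick_mysql_charset and session_statements are hypothetical, and the latin1 fallback is the answer's own assumption, which may not suit a Japanese (CP932) locale.

```python
import re


def pick_mysql_charset(locale_string):
    """Return 'utf8' if the locale string mentions UTF-8, else 'latin1'.

    Mirrors the grepl() check in the R answer; the fallback is that
    answer's assumption, not a general recommendation.
    """
    if re.search(r"utf-?8", locale_string, flags=re.IGNORECASE):
        return "utf8"
    return "latin1"


def session_statements(charset):
    """Build the three SET statements the R answer sends after connecting."""
    return [
        f"SET NAMES {charset}",
        f"SET SESSION character_set_server={charset}",
        f"SET SESSION character_set_database={charset}",
    ]


if __name__ == "__main__":
    charset = pick_mysql_charset("en_US.UTF-8")
    for statement in session_statements(charset):
        print(statement)
```

Each statement would then be executed on the open connection before writing the table.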
This question expands on this question
Here, I'm using the custom function created by @Simon.S.A. shown in the answer to this question. I'm attempting to save a tbl_sql object in R to MySQL as a new table without first saving it locally. The database and schema in my MySQL are both named "test". The tbl_sql object in R is my_data, and I want to save it as a new table in MySQL named "car_data".
library(DBI)
library(tidyverse)
library(dbplyr)
#establish connection and import data from MySQL
con <- DBI::dbConnect(RMariaDB::MariaDB(),
dbname = "test",
host = "127.0.0.1",
user = "user",
password = "password")
my_data <- tbl(con, "mtcars")
my_data <- my_data %>% filter(mpg >= 22)
# write function to save tbl_sql as a new table in SQL
write_to_database <- function(input_tbl, db, schema, tbl_name){
  # connection
  tbl_connection <- input_tbl$src$con
  # SQL query
  sql_query <- glue::glue(
    "SELECT *\n",
    "INTO {db}.{schema}.{tbl_name}\n",
    "FROM (\n",
    dbplyr::sql_render(input_tbl),
    "\n) AS sub_query"
  )
  result <- dbExecute(tbl_connection, as.character(sql_query))
}
# execute function
write_to_database(my_data, "test", "test", "car_data")
After running the final line, I get the following error. I'm not sure how to fix it.
Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.test.car_data
FROM (
SELECT *
FROM `mtcars`
WHERE (`mpg` >= 22.0)
) AS sub_quer' at line 2 [1064]
12. stop(structure(list(message = "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.test.car_data\nFROM (\nSELECT *\nFROM `mtcars`\nWHERE (`mpg` >= 22.0)\n) AS sub_quer' at line 2 [1064]",
    call = NULL, cppstack = NULL), class = c("Rcpp::exception",
    "C++Error", "error", "condition")))
11. result_create(conn@ptr, statement, is_statement)
10. initialize(value, ...)
9. initialize(value, ...)
8. new("MariaDBResult", sql = statement, ptr = result_create(conn@ptr,
   statement, is_statement), bigint = conn@bigint, conn = conn)
7. dbSend(conn, statement, params, is_statement = TRUE)
6. .local(conn, statement, ...)
5. dbSendStatement(conn, statement, ...)
4. dbSendStatement(conn, statement, ...)
3. dbExecute(tbl_connection, as.character(sql_query))
2. dbExecute(tbl_connection, as.character(sql_query))
1. write_to_database(my_data, "test", "test", "car_data")
Creating a table with SELECT ... INTO is SQL Server (and MS Access) syntax and is not supported in MySQL. Instead, use the counterpart statement, CREATE TABLE ... SELECT. Also, the meaning of schema differs between RDBMSs: in MySQL, schema is synonymous with database, so a three-part name such as db.schema.table is not valid.
Therefore, consider this adjusted version of the SQL build:
sql_query <- glue::glue(
"CREATE TABLE {db}.{tbl_name}\n AS \n",
"SELECT * \n",
"FROM (\n",
dbplyr::sql_render(input_tbl),
"\n) AS sub_query"
)
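To make the resulting statement easier to see, here is a minimal Python sketch that assembles the same CREATE TABLE ... AS SELECT text from an already rendered subquery. The function name build_create_table_as is hypothetical, the subquery string is copied from the error message in the question, and identifiers are pasted in unescaped exactly as in the glue version, so this is only suitable for trusted names.

```python
def build_create_table_as(db, tbl_name, select_sql):
    """Assemble a MySQL CREATE TABLE ... AS SELECT statement.

    db and tbl_name are inserted without quoting (an assumption carried
    over from the glue template), so they must come from trusted input.
    """
    return (
        f"CREATE TABLE {db}.{tbl_name}\n AS \n"
        "SELECT * \n"
        "FROM (\n"
        f"{select_sql}"
        "\n) AS sub_query"
    )


# Subquery text as rendered by dbplyr in the question's error message.
rendered = "SELECT *\nFROM `mtcars`\nWHERE (`mpg` >= 22.0)"
query = build_create_table_as("test", "car_data", rendered)
print(query)
```

Note that the table name has only two parts (database and table), which is what MySQL expects.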
I am fairly new to programming and I've been trying to run this code in Python, using PyCharm. I'm running the code on a remote server, using PyCharm on my local computer. It was written by a former colleague, and it has been giving a lot of encoding issues since we updated packages such as MySQL and moved the Python interpreter to 3.8. MySQL is now at version 8.0 after the update; that was not the version installed when the code was originally written.
This is the full error that I am getting:
findBestMatch
*** WARNING: FoundCity[i] = {'Country': 'Austria', 'Page': 'Contact', 'Confidence': 10, 'Mentions': 1}
WriteToDB
Problem <class 'MySQLdb._exceptions.ProgrammingError'>
Traceback (most recent call last):
File "/remotepath/TextMining/NER/FindLocationStoreSQL.py", line 399, in <module>
WriteToDB(c.title(), cn, idProject, 10, "Contact", "v1", cursor, db, database_country)
File "/remotepath/TextMining/NER/FindLocationStoreSQL.py", line 286, in WriteToDB
cursor.execute(sql,values)
File "/home/localhost/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 206, in execute
res = self._query(query)
File "/home/localhost/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 319, in _query
db.query(q)
File "/home/localhost/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 259, in query
_mysql.connection.query(self, query)
MySQLdb._exceptions.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'City text mined, Country from datasource'',Confidence='10',FoundWhere=''Contact'' at line 1")
Process finished with exit code -1
The code sample of the function it is trying to run is below:
def findBestMatch(FoundCity,FoundCountry,database_country):
    pair_candidates = []
    for i in range(0,5):
        for j in range(0,5):
            if len(FoundCity)>i and len(FoundCountry)>j:
                city_i = FoundCity[i].get('City')
                country_j = FoundCountry[j].get('Country')
                if city_i != None and country_j != None:
                    #sql = "SELECT City,Country_CountryName,Longitude,Latitude FROM Semanticon.City where city like '{0}' and Country_CountryName like '{1}' and Population>0 order by Population desc".format(FoundCity[i]['City'].encode('utf-8'),FoundCountry[j]['Country'].encode('utf-8'))
                    sql = "SELECT City,Country_CountryName,Longitude,Latitude FROM Semanticon.City where city like '{0}' and Country_CountryName like '{1}' and Population>0 order by Population desc".format(
                        FoundCity[i]['City'], FoundCountry[j]['Country'])
                    try:
                        cursor.execute(sql)
                    except:
                        db = MySQLdb.connect(host, username, password, database, charset='utf8')
                        db.set_character_set("utf8")
                        cursor = db.cursor()
                        cursor.execute(sql)
                    resul = cursor.fetchall()
                    if len(resul)>0:
                        pair_candidates.append({"City":FoundCity[i]['City'],"Country":FoundCountry[j]['Country'],"Score":(FoundCity[i]["Confidence"]+FoundCountry[j]["Confidence"]+0.5*(FoundCity[i]["Mentions"]+FoundCountry[j]["Mentions"]))})
                        #return FoundCity[i]['City'],FoundCountry[j]['Country'],FoundCity[i]['Confidence']
                else:
                    if city_i == None:
                        print("*** WARNING: FoundCity[i] = ", FoundCity[i])
                    else:
                        print("*** WARNING: FoundCountry[j] = ", FoundCountry[j])
I had to take the encoding out, hence the commented out 'sql' line. The encoding was causing problems and adding an extra 'b' to the string to be read from the database.
The 'WriteToDB' function that it's complaining about is below:
def WriteToDB(City,Country,ProjectId,Confidence,Location,Version,cursor,db,database_country):
    sql = None
    if database_country!="":
        if database_country == Country:
            if City != "":
                #sql = "UPDATE ProjectLocation SET City='{0}',DataTrace='{1}',Confidence={2},FoundWhere='{3}' WHERE Projects_idProjects={4} and Country='{5}';".format(City," City text mined, Country from datasource",Confidence,Location,ProjectId,original_database_cntry)
                sql = "UPDATE ProjectLocation SET City='%s',DataTrace='%s',Confidence='%s',FoundWhere='%s' WHERE Projects_idProjects='%s' and Country='%s';"
                values = (City, " City text mined, Country from datasource", Confidence, Location, ProjectId,
                          original_database_cntry)
        if database_country!=Country:
            if Country.encode('utf-8') in database_country:
                #sql = "UPDATE ProjectLocation SET City='{0}',DataTrace='{1}',Confidence={2},FoundWhere='{3}' WHERE Projects_idProjects={4} and Country='{5}';".format(
                # City, " City text mined, Country from datasource", Confidence, Location, ProjectId,
                #original_database_cntry)
                sql = "UPDATE ProjectLocation SET City='%s',DataTrace='%s',Confidence='%s',FoundWhere='%s' WHERE Projects_idProjects='%s' and Country='%s';"
                values = (City, " City text mined, Country from datasource", Confidence, Location, ProjectId,
                          original_database_cntry)
            else:
                print("Country conflict in project:"+str(ProjectId))
    else:
        #sql = "Insert into ProjectLocation (Type,City,Country,Projects_idProjects,Original_idProjects,IsLocationFromDataset,Confidence,FoundWhere,Version,DataTrace)" \
        # "Values ('{0}','{1}','{2}',{3},{4},0,{5},'{6}','{7}','{8}')".format("Main",City,Country,ProjectId,ProjectId,Confidence,Location,Version,"Both minded from text v0.1")
        sql = "Insert into ProjectLocation (Type,City,Country,Projects_idProjects,Original_idProjects,IsLocationFromDataset,Confidence,FoundWhere,Version,DataTrace)" \
              "Values ('%s','%s','%s','%s',0,'%s','%s','%s','%s','%s')"
        values = ("Main", City, Country, ProjectId,ProjectId, Confidence, Location, Version, "Both minded from text v0.1")
    if sql!=None:
        cursor.execute(sql,values)
        db.commit()
I commented out the SQL queries as shown and tried to bind them instead, because it was giving a lot of encoding errors. I'm not sure how to get rid of this error and not end up with the encoding errors yet again.
Can someone help?
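The doubled quotes in the error message (FoundWhere=''Contact'') suggest that the driver quoted each bound value itself, on top of the quotes written around every %s in the statement. A minimal sketch of the UPDATE with bare placeholders follows, assuming MySQLdb's %s parameter style. The helper name build_update and the sample arguments are made up for illustration, and original_country stands in for original_database_cntry, which is not defined in the function shown above.

```python
def build_update(city, confidence, location, project_id, original_country):
    """Return (sql, values) for a parameterized UPDATE.

    The placeholders are bare %s with no surrounding quotes: the driver
    is expected to escape and quote each bound value on its own.
    """
    sql = (
        "UPDATE ProjectLocation "
        "SET City=%s, DataTrace=%s, Confidence=%s, FoundWhere=%s "
        "WHERE Projects_idProjects=%s AND Country=%s"
    )
    values = (
        city,
        " City text mined, Country from datasource",
        confidence,
        location,
        project_id,
        original_country,
    )
    return sql, values


# Made-up sample arguments, only to show the shape of the call.
sql, values = build_update("Vienna", 10, "Contact", 2542, "Austria")
print(sql)
print(values)
# With a live connection this would be passed as: cursor.execute(sql, values)
```

Because the values travel separately from the SQL text, no manual .encode() call or quoting should be needed for non-ASCII city names.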
UPDATE to the question.
I reverted all the SQL queries to the versions that use the encoding (previously commented out), and I am now getting b's in the output.
Any suggestions on how to properly encode these SQL queries so that the byte strings do not come out as b's in the output?
Here is a sample of the output:
ProjectID,ProjectName,FoundCity,FoundCountry,DatabaseCity,DatabaseCountry,Confidence,FoundWhere,Website
2542, Migrantour Country,,b'',b'',b'',10,Contact,link
3938,GeoSmartCity,,,b'',b'',10,Contact,link
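The b'' values in this output are consistent with bytes objects being formatted into text. In Python 3, .encode('utf-8') returns bytes, and putting bytes into str.format() inserts their repr, including the b prefix. The short sketch below shows the effect; the city name is a made-up sample value.

```python
city = "Zürich"  # made-up sample value

# Formatting bytes into a str inserts the repr of the bytes object.
with_encode = "city like '{0}'".format(city.encode("utf-8"))

# Formatting the str directly keeps the text as it is.
without_encode = "city like '{0}'".format(city)

# An empty string shows up as b'' once it has been encoded.
empty_field = "{0}".format("".encode("utf-8"))

print(with_encode)
print(without_encode)
print(empty_field)
```

Under this reading, the strings should stay as str all the way to the driver, with the character set declared on the connection (as the charset='utf8' argument in the code above already does).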
My data contains special characters like German umlauts.
p=structure(list(ppl_code = c(992621L, 992381L, 992136L, 991989L,
991898L, 991759L, 991681L, 991593L, 991294L, 991036L, 990934L,
990751L, 990535L, 990411L, 990182L, 989507L), proj_name = c("klo",
"Dalbygda", "Oosterhorn", "Hån", "Yatir", "Montigny la Cour",
"Valle Hermoso", "Acciona Honawad - 120 MW", "Apfeltrang", "RiaBlades",
"General Acha", "Lindau-Böhlitz", "Apfeltrang", "Alcazar Round 2",
"Peckelsheim", "Linnich 3")), .Names = c("ppl_code", "proj_name"
), row.names = 15:30, class = "data.frame")
When I try to write it to a MySQL database:
conn <- dbConnect(
drv = RMySQL::MySQL(),
dbname = "mydb",
host = "#####",
username = "#####",
password = "#####")
dbWriteTable(conn, value = p, name = "MyTable",row.names=FALSE)
I'm getting this encoding error:
could not run statement: Invalid utf8 character string: 'Lindau-B'
I have checked several posts on this issue, like here and here, but they all give general explanations without a clear solution.
Can anybody help me with a clear query that solves this issue?
You need to declare that UTF-8 is being used.
In RStudio: Tools -> Global Options -> Code -> Saving, and set the default text encoding to UTF-8. Then run this on the connection:
rs <- dbSendQuery(con, 'SET NAMES utf8')
I have been trying to export a large pandas dataframe using DataFrame.to_sql to a MySQL database, but the dataframe has unicode characters in some columns, some of which cause warnings during export and are converted to ?.
I managed to reproduce the issue with this example (database login removed):
import pandas as pd
import sqlalchemy
import pymysql
engine = sqlalchemy.create_engine('mysql+pymysql://{}:{}@{}/{}?charset=utf8'.format(*login_info), encoding='utf-8')
df_test = pd.DataFrame([[u'\u010daj',2], \
                        ['čaj',2], \
                        ['špenát',4], \
                        ['květák',7], \
                        ['kuře',1]], \
                       columns = ['a','b'])
df_test.to_sql('test', engine, if_exists = 'replace', index = False, dtype={'a': sqlalchemy.types.UnicodeText()})
The first two rows of the dataframe should be the same, just defined differently.
I get the following warning, and the problematic characters (č, ě, ř) are rendered as ?:
/usr/local/lib/python3.6/site-packages/pymysql/cursors.py:166: Warning: (1366, "Incorrect string value: '\\xC4\\x8Daj' for column 'a' at row 1")
result = self._query(query)
/usr/local/lib/python3.6/site-packages/pymysql/cursors.py:166: Warning: (1366, "Incorrect string value: '\\xC4\\x8Daj' for column 'a' at row 2")
result = self._query(query)
/usr/local/lib/python3.6/site-packages/pymysql/cursors.py:166: Warning: (1366, "Incorrect string value: '\\xC4\\x9Bt\\xC3\\xA1k' for column 'a' at row 4")
result = self._query(query)
/usr/local/lib/python3.6/site-packages/pymysql/cursors.py:166: Warning: (1366, "Incorrect string value: '\\xC5\\x99e' for column 'a' at row 5")
result = self._query(query)
with the resulting database table test looking like this:
a b
?aj 2
?aj 2
špenát 4
kv?ták 7
ku?e 1
Curiously, the ž, š and á characters (and others in my full dataset) are processed correctly, so it seems to only affect a subset of unicode characters. As you can see above, I also tried setting utf-8 wherever I could (engine, DataFrame.to_sql) with no effect.
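One possible explanation for why only some characters are lost, offered as an assumption because the question does not show the table definition: if the column ended up with MySQL's latin1 character set (which corresponds to Windows cp1252), then š, ž and á have a slot in that code page while č, ě and ř do not. The sketch below checks this with Python's cp1252 codec; the helper name fits_in is hypothetical.

```python
def fits_in(text, codec="cp1252"):
    """Return True if every character of text can be encoded in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# The words from the example data frame in the question.
for word in ["čaj", "špenát", "květák", "kuře"]:
    lost = [ch for ch in word if not fits_in(ch)]
    print(word, "-> characters outside cp1252:", lost)
```

The characters reported as outside cp1252 are the same ones that appear as ? in the table, which fits the latin1 column hypothesis.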
pymysql:
import pymysql
con = pymysql.connect(host='127.0.0.1', port=3306,
user='root', passwd='******',
charset="utf8mb4")
sqlalchemy:
db_url = sqlalchemy.engine.url.URL(drivername='mysql', host=foo.db_host,
database=db_schema,
query={ 'read_default_file' : foo.db_config, 'charset': 'utf8mb4' })
See "Best practice" in http://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored
Explanation of ?:
The bytes to be stored are not encoded as utf8/utf8mb4. Fix this.
The column in the database is CHARACTER SET utf8 (or utf8mb4). Fix this.
Also, check that the connection during reading is UTF-8.
(Note: The CHARACTER SETs utf8 and utf8mb4 are interchangeable for European languages.)
These are Czech characters?
I ran into the same problem, also with the pymysql driver.
After I switched the MySQL driver to mysql-connector, the 1366 warning disappeared.
Install the mysql-connector driver:
pip install mysql-connector
Then set up the SQLAlchemy engine like this:
create_engine('mysql+mysqlconnector://root:tj1996@localhost:3306/new?charset=utf8mb4')
I am using the RMySQL package to write (append) data to an existing table.
I am using R, version 3.3.2.
My code looks like this:
library(RMySQL)
df_final <- some_data
m<-dbDriver("MySQL")
mydb <- dbConnect(m, user='odvjet12_mislav',
password='my_pass',
host='91.234.46.219',
dbname='odvjet12_fina_pn')
dbWriteTable(mydb, value = df_final, name = "fina_pn", append = TRUE, row.names = FALSE)
This code worked fine for some time, but for the last ten days it has always returned an error:
Error in .local(conn, statement, ...) :
could not run statement: The used command is not allowed with this MySQL version
I don't understand how code that worked for some time can now return an error.
I kindly ask for feedback on this issue.
Best,
Mislav Šagovac
You could also use dbGetQuery from the RMySQL package and iterate over the rows. That was my solution when I hit a similar error for a data frame I wanted to write to a MySQL DB:
mydb = dbConnect(MySQL(), user='user', password='password', dbname='databasename', host='hostname')
for(i in 1:nrow(df)){
dbGetQuery(mydb,paste0("INSERT INTO MYTABLE (COL1,COL2) VALUES(",df$col1[i],",",df$col2[i],")"))
}
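Pasting values straight into the INSERT text, as in the loop above, breaks as soon as a value contains a quote. As a hedged illustration of the same row-by-row idea with bound parameters, here is a Python sketch that uses the standard library's sqlite3 module as a stand-in for MySQL, so it runs without a server. The table, columns and rows are made up, and MySQL drivers such as pymysql use %s rather than ? as the placeholder.

```python
import sqlite3

# Made-up sample rows; the second value contains a quote on purpose.
rows = [(1, "a"), (2, "b'c")]

con = sqlite3.connect(":memory:")  # in-memory stand-in for the real database
con.execute("CREATE TABLE mytable (col1 INTEGER, col2 TEXT)")

# One INSERT per row, with the values bound instead of pasted into the SQL.
con.executemany("INSERT INTO mytable (col1, col2) VALUES (?, ?)", rows)
con.commit()

stored = con.execute("SELECT col1, col2 FROM mytable ORDER BY col1").fetchall()
print(stored)
con.close()
```

The value with the embedded quote is stored unchanged, which string concatenation would not guarantee.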