Rails - Strange characters pass through validation and break query - mysql

I copy-pasted a string into a form field and a strange character broke my MySQL query.
I can force the error in the console like this (the weird character is between the words "Invalid" and "Character"; you can also copy-paste it):
> dog.name = "Invalid ​Character"
> dog.save # -> false
Which returns the following error:
ActiveRecord::StatementInvalid: Mysql2::Error: Incorrect string value: '\xE2\x80\x8BCha...' for column 'name' at row 1: UPDATE `dogs` SET `name` = 'Invalid ​Character' WHERE `dogs`.`id` = 2227
As the error shows, the character was rendered as the bytes '\xE2\x80\x8B' in the query.
Is there any validation I could use to remove this kind of weird character?
Note: I also saw that
> "Invalid ​Character".unpack('U*')
Returns
[73, 110, 118, 97, 108, 105, 100, 32, 8203, 67, 104, 97, 114, 97, 99, 116, 101, 114]
The weird character must be the 8203 one.
Note 2: In my application.rb, I have: config.encoding = "utf-8"
EDIT
On my console, I got:
> ActiveRecord::Base.connection.charset # -> "utf8"
> ActiveRecord::Base.connection.collation # -> "utf8_unicode_ci"
I also ran (in the rails db MySQL console):
> SELECT table_collation FROM INFORMATION_SCHEMA.TABLES where table_name = 'dogs';
and got "utf8_unicode_ci"
EDIT2
If I change the table's character set to utf8mb4, I don't get the error. But I still have to filter out those characters.

In the rails db MySQL console, I used:
SHOW CREATE TABLE dogs;
to find out that the charset for the table was latin1.
I then added a migration with this query:
ALTER TABLE dogs CONVERT TO CHARACTER SET utf8mb4;
and it started working fine.
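For reference, the bytes '\xE2\x80\x8B' in the error message are the UTF-8 encoding of U+200B (ZERO WIDTH SPACE), i.e. codepoint 8203 from the unpack output above. A quick way to confirm this (shown in Python only because the decoding itself is language-agnostic):
# Decode the bytes MySQL reported and inspect the resulting codepoint.
import unicodedata
ch = b'\xe2\x80\x8b'.decode('utf-8')
print(ord(ch))               # 8203
print(hex(ord(ch)))          # 0x200b
print(unicodedata.name(ch))  # ZERO WIDTH SPACE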

Related

Pandas DataFrame to MySQL export with Unicode characters

I have been trying to export a large pandas DataFrame to a MySQL database using DataFrame.to_sql, but the DataFrame has Unicode characters in some columns, some of which cause warnings during export and are converted to ?.
I managed to reproduce the issue with this example (database login removed):
import pandas as pd
import sqlalchemy
import pymysql
engine = sqlalchemy.create_engine('mysql+pymysql://{}:{}@{}/{}?charset=utf8'.format(*login_info), encoding='utf-8')
df_test = pd.DataFrame([[u'\u010daj', 2],
                        ['čaj', 2],
                        ['špenát', 4],
                        ['květák', 7],
                        ['kuře', 1]],
                       columns=['a', 'b'])
df_test.to_sql('test', engine, if_exists='replace', index=False,
               dtype={'a': sqlalchemy.types.UnicodeText()})
The first two rows of the dataframe should be the same, just defined differently.
I get the following warnings, and the problematic characters (č, ě, ř) are rendered as ?:
/usr/local/lib/python3.6/site-packages/pymysql/cursors.py:166: Warning: (1366, "Incorrect string value: '\\xC4\\x8Daj' for column 'a' at row 1")
result = self._query(query)
/usr/local/lib/python3.6/site-packages/pymysql/cursors.py:166: Warning: (1366, "Incorrect string value: '\\xC4\\x8Daj' for column 'a' at row 2")
result = self._query(query)
/usr/local/lib/python3.6/site-packages/pymysql/cursors.py:166: Warning: (1366, "Incorrect string value: '\\xC4\\x9Bt\\xC3\\xA1k' for column 'a' at row 4")
result = self._query(query)
/usr/local/lib/python3.6/site-packages/pymysql/cursors.py:166: Warning: (1366, "Incorrect string value: '\\xC5\\x99e' for column 'a' at row 5")
result = self._query(query)
with the resulting database table test looking like this:
a        b
?aj      2
?aj      2
špenát   4
kv?ták   7
ku?e     1
Curiously, the ž, š and á characters (and others in my full dataset) are processed correctly, so it seems to affect only a subset of Unicode characters. As you can see above, I also tried setting utf-8 wherever I could (engine, DataFrame.to_sql), with no effect.
pymysql:
import pymysql
con = pymysql.connect(host='127.0.0.1', port=3306,
                      user='root', passwd='******',
                      charset="utf8mb4")
sqlalchemy:
db_url = sqlalchemy.engine.url.URL(drivername='mysql', host=foo.db_host,
                                   database=db_schema,
                                   query={'read_default_file': foo.db_config, 'charset': 'utf8mb4'})
See "Best practice" in http://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored Explanation of ?:
The bytes to be stored are not encoded as utf8/utf8mb4. Fix this.
The column in the database is CHARACTER SET utf8 (or utf8mb4). Fix this.
Also, check that the connection during reading is UTF-8.
(Note: The CHARACTER SETs utf8 and utf8mb4 are interchangeable for European languages.)
These are Czech characters?
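Applying that advice to the original example, a minimal sketch (the connection URL is a placeholder; it assumes the table and column character set is, or can be made, utf8/utf8mb4):
import pandas as pd
import sqlalchemy

# Request a utf8mb4 connection instead of plain utf8
# (user/password/host/dbname are placeholders).
engine = sqlalchemy.create_engine(
    'mysql+pymysql://user:password@host/dbname?charset=utf8mb4')

df_test = pd.DataFrame([['čaj', 2], ['špenát', 4], ['kuře', 1]],
                       columns=['a', 'b'])

# The table/column CHARACTER SET must also be utf8 or utf8mb4; an existing
# table created as latin1 can be converted with
#   ALTER TABLE test CONVERT TO CHARACTER SET utf8mb4;
df_test.to_sql('test', engine, if_exists='replace', index=False,
               dtype={'a': sqlalchemy.types.UnicodeText()})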
I ran into the same problem, also using the pymysql driver.
I changed the MySQL driver to mysql-connector and the 1366 warning disappeared.
Install the mysql-connector driver:
pip install mysql-connector
and set up the sqlalchemy engine like this:
create_engine('mysql+mysqlconnector://root:tj1996@localhost:3306/new?charset=utf8mb4')

R Programming - Japanese Characters Insert into MySQL

I want to insert a tab-delimited file containing both Japanese and English characters, along with special characters, into MySQL. I am using RMySQL to do this. One of the solutions I tried gives the error below:
dbWriteTable(con, "japan_test2", d, append = T, row.names=FALSE);
Error in mysqlExecStatement(conn, statement, ...) : RS-DBI driver: (could not run statement: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '˜¨å¤œã®ã‚³ãƒ³_ text)' at line 3)
In addition: Warning message:
In strsplit(msg, "\n") : input string 1 is invalid in this locale
[1] FALSE
Warning message:
In mysqlWriteTable(conn, name, value, ...) :
could not create table: aborting mysqlWriteTable
Current Locale: LC_COLLATE=English_United States.1252;LC_CTYPE=Japanese_Japan.932;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
Locales tried: US, Japanese.
Encodings tried: UTF-8, UTF-16, ASCII.
System: Windows 7
RStudio Version 0.98.977
MySQL 5.4.27 CE
You probably aren't setting the encoding of the connection properly. You can try this:
con <- dbConnect(MySQL(), user=user, password=password,dbname=dbname, host=host, port=port)
# With the next line I try to get the right encoding (it works for Spanish keyboards)
encoding <- if(grepl(pattern = 'utf8|utf-8',x = Sys.getlocale(),ignore.case = T)) 'utf8' else 'latin1'
dbGetQuery(con,paste("SET names",encoding))
dbGetQuery(con,paste0("SET SESSION character_set_server=",encoding))
dbGetQuery(con,paste0("SET SESSION character_set_database=",encoding))
dbWriteTable( con, value = dfr, name = table, append = TRUE, row.names = FALSE )
dbDisconnect(con)
Remember that you have to use your local encoding as the encoding of the connection. The third line of the proposed code tries to detect my local encoding, and the following lines set the connection encoding accordingly. Good luck!

Python Syntax for Update SQL query

I want to update a table with the following query, but I am getting errors when I run it. What is the correct syntax for the query below?
cursor.execute("""UPDATE `%s` SET `content`=%s WHERE link=%s""", (feed,cnews,news_url))
The error I get when running the above is
Traceback (most recent call last):
File "digger_1.py", line 34, in <module>
cursor.execute("""UPDATE `%s` SET `content`=%s WHERE link=%s""", (feed,cnews,news_url))
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 174, in execute
self.errorhandler(self, exc, value)
File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
_mysql_exceptions.ProgrammingError: (1146, "Table 'newstracker.'NDTV'' doesn't exist")
The error says that the table newstracker.NDTV doesn't exist, which is not right since it does exist, so I believe something else is wrong with the syntax.
You cannot specify metadata such as database, table, or field names using a parametrized query. You must substitute them using normal string formatting and then use the result string as your parametrized query.
...("""UPDATE `%s` SET `content`=%%s WHERE link=%%s""" % (feed,), ...)

Change right to user with escape character

I would like to execute the following query :
DENY DELETE ON tableTest to Domain\Username
but it prints
Msg 102, Level 15, State 1, Line 1
Incorrect syntax near '\'.
I tried
SELECT @Test = 'Domain\Username'
DENY DELETE ON tableTest to @Test
but that didn't work either.
Domain\Username is the value I get from sys.database_principals.
Any idea?
Thanks
You can use square brackets ([]) to quote identifiers that contain otherwise invalid characters:
DENY DELETE ON tableTest to [Domain\Username]

Django south from MySQL to postgresql

I first started using MySQL in one of my apps and I am now thinking of moving from MySQL to PostgreSQL.
I have South installed for migrations.
When I set up a new DB in Postgres, I successfully synced my apps but came to a complete halt on one of my last migrations.
> project:0056_auto__chg_field_project_project_length
Traceback (most recent call last):
File "./manage.py", line 11, in <module>
execute_manager(settings)
File "/Users/ApPeL/.virtualenvs/fundedbyme.com/lib/python2.7/site-packages/django/core/management/__init__.py", line 438, in execute_manager
utility.execute()
File "/Users/ApPeL/.virtualenvs/fundedbyme.com/lib/python2.7/site-packages/django/core/management/__init__.py", line 379, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/Users/ApPeL/.virtualenvs/fundedbyme.com/lib/python2.7/site-packages/django/core/management/base.py", line 191, in run_from_argv
self.execute(*args, **options.__dict__)
File "/Users/ApPeL/.virtualenvs/fundedbyme.com/lib/python2.7/site-packages/django/core/management/base.py", line 220, in execute
output = self.handle(*args, **options)
File "/Library/Python/2.7/site-packages/south/management/commands/migrate.py", line 105, in handle
ignore_ghosts = ignore_ghosts,
File "/Library/Python/2.7/site-packages/south/migration/__init__.py", line 191, in migrate_app
success = migrator.migrate_many(target, workplan, database)
File "/Library/Python/2.7/site-packages/south/migration/migrators.py", line 221, in migrate_many
result = migrator.__class__.migrate_many(migrator, target, migrations, database)
File "/Library/Python/2.7/site-packages/south/migration/migrators.py", line 292, in migrate_many
result = self.migrate(migration, database)
File "/Library/Python/2.7/site-packages/south/migration/migrators.py", line 125, in migrate
result = self.run(migration)
File "/Library/Python/2.7/site-packages/south/migration/migrators.py", line 99, in run
return self.run_migration(migration)
File "/Library/Python/2.7/site-packages/south/migration/migrators.py", line 81, in run_migration
migration_function()
File "/Library/Python/2.7/site-packages/south/migration/migrators.py", line 57, in <lambda>
return (lambda: direction(orm))
File "/Users/ApPeL/Sites/Django/fundedbyme/project/migrations/0056_auto__chg_field_project_project_length.py", line 12, in forwards
db.alter_column('project_project', 'project_length', self.gf('django.db.models.fields.IntegerField')())
File "/Library/Python/2.7/site-packages/south/db/generic.py", line 382, in alter_column
flatten(values),
File "/Library/Python/2.7/site-packages/south/db/generic.py", line 150, in execute
cursor.execute(sql, params)
File "/Users/ApPeL/.virtualenvs/fundedbyme.com/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 44, in execute
return self.cursor.execute(query, args)
django.db.utils.DatabaseError: column "project_length" cannot be cast to type integer
I am wondering if there is some workaround for this?
Your current migration works like this:
Alter column "project_length" to another type.
It is broken because you are doing an ALTER that is not supported by PostgreSQL.
You must fix your migration. You can change it to the following migration (it will work, but it can probably be done more simply); a sketch is shown after the list:
1. Create another column project_length_tmp with the type you want project_length to have, and some default value.
2. Migrate the data from column project_length to project_length_tmp (see data migrations in the South docs).
3. Remove column project_length.
4. Rename column project_length_tmp to project_length.
It is a somewhat complicated migration, but it has two major strengths:
1. It will work on all databases.
2. It is compatible with your old migration, so you can just override the old migration (change the file) and it will be fine.
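A rough sketch of those four steps as a South migration (the table and column names follow the traceback; the raw SQL stands in for the data migration mentioned in step 2, and the ::integer cast is a PostgreSQL-specific assumption you would adapt to the old column type):
from south.db import db
from south.v2 import SchemaMigration


class Migration(SchemaMigration):

    # The frozen models dict that South generates is omitted here for brevity.
    models = {}

    def forwards(self, orm):
        # 1. Add a temporary integer column with a default value.
        db.add_column('project_project', 'project_length_tmp',
                      self.gf('django.db.models.fields.IntegerField')(default=0),
                      keep_default=False)
        # 2. Copy the old values across, converting them explicitly
        #    (replace the cast with whatever suits the old column type).
        db.execute("UPDATE project_project "
                   "SET project_length_tmp = project_length::integer")
        # 3. Drop the old column.
        db.delete_column('project_project', 'project_length')
        # 4. Rename the temporary column into place.
        db.rename_column('project_project', 'project_length_tmp', 'project_length')

    def backwards(self, orm):
        raise RuntimeError("This migration cannot be reversed.")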
Approach 2
Another approach to your problem would be just to remove all your migrations and start from scratch. If you have only a single deployment of your project, this will work fine for you.
You don't provide any details of the SQL being executed, but it seems unlikely that it's an ALTER TYPE failing - assuming the SQL is correct.
=> CREATE TABLE t (c_text text, c_date date, c_datearray date[]);
CREATE TABLE
=> INSERT INTO t VALUES ('abc','2011-01-02',ARRAY['2011-01-02'::date,'2011-02-03'::date]);
INSERT 0 1
=> ALTER TABLE t ALTER COLUMN c_text TYPE integer USING (length(c_text));
ALTER TABLE
=> ALTER TABLE t ALTER COLUMN c_date TYPE integer USING (c_date - '2001-01-01');
ALTER TABLE
=> ALTER TABLE t ALTER COLUMN c_datearray TYPE integer USING (array_upper(c_datearray, 1));
ALTER TABLE
=> SELECT * FROM t;
 c_text | c_date | c_datearray
--------+--------+-------------
      3 |   3653 |           2
(1 row)
There's not much you can't do. I'm guessing it's incorrect SQL being generated from this Django module you are using.