SET NAMES utf8mb4 - mysql

We are using Dropwizard, JDBI, MySQL 5.6 and MySQL Connector/J 5.1.32 with a pooled data source. To support emojis, the only way I have found is to run the query "SET NAMES utf8mb4" on the connection every time a connection is obtained.
But under load we observe that this query takes a long time (around 222 ms). Is there any alternative to this query?
Things tried so far:
1. Tried setting charSet and characterEncoding on the JDBC connection URL
2. The columns in the table use utf8mb4 encoding and utf8mb4_unicode_ci collation
3. MySQL is on RDS; we have not yet changed the character_set_server etc. variables on RDS
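One commonly documented alternative (a sketch with hypothetical host and credentials, assuming Connector/J 5.1.13 or later): if the server's character_set_server is utf8mb4 (changeable on RDS via a DB parameter group), then characterEncoding=UTF-8 on the URL makes the driver select utf8mb4 for the connection automatically, and the per-checkout SET NAMES becomes unnecessary:

```yaml
# Dropwizard config sketch -- host/db/credentials are hypothetical
database:
  driverClass: com.mysql.jdbc.Driver
  url: jdbc:mysql://mydb.example.com:3306/mydb?useUnicode=true&characterEncoding=UTF-8
  user: app
  password: secret
```

Note this only works once the server-side default is utf8mb4; with the server still on utf8, the driver has no way to know 4-byte characters are wanted.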

Related

Squeryl utf8mb4 support

I'm using Squeryl to work with a MySQL database. The tables are in utf8mb4 encoding. Now I want to insert some 4-byte UTF-8 strings into the DB through Squeryl. How do I do that?
I tried adding ?useUnicode=yes&characterEncoding=UTF-8 to my connection URL, but apparently UTF-8 here means MySQL's 3-byte utf8, so it doesn't work.
I found this StackOverflow answer, but after some digging, I don't see any way to prefix my queries with SET NAMES utf8mb4; (changing the database config and environment is not an option).
Example string: อลิซร้องเพลงตามเลยค่ะ😂😂😂
Error when trying to insert the string:
Exception in thread "main" org.squeryl.SquerylSQLException: Exception while executing statement : Incorrect string value
Be sure not to connect as root: init_connect is ignored for users with the SUPER privilege.
Have this in my.cnf (in the [mysqld] section):
init_connect = SET NAMES utf8mb4
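If editing my.cnf is not possible (e.g. managed hosting), the same effect can often be achieved at runtime from a privileged account. A sketch:

```sql
-- Needs a privileged account; affects new connections only, not existing ones.
SET GLOBAL init_connect = 'SET NAMES utf8mb4';

-- Verify from a fresh, non-SUPER connection:
SHOW VARIABLES LIKE 'character_set_c%';
```

This is also why connecting as root defeats the setting: init_connect is skipped for SUPER users so that a broken init statement cannot lock out the administrator.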

utf8mb4 setting for talend - not working

I am migrating data from SQL Server to MySQL, using the ETL tool Talend.
The problem comes when I have emojis in the source (SQL Server): they do not get inserted into the table in MySQL. So I know I must use utf8mb4 on the MySQL side.
The client's character encoding has to be set for the smileys to get inserted. The database, tables and the server are all on utf8mb4.
But the client, i.e. Talend, is not utf8mb4. So where do I set this?
I tried 'set names utf8mb4' in the additional parameters of tMysqlOutput, but this does not work.
I have been stuck on this for days; any help would be greatly appreciated.
Update :
The job looks like this now, but the smileys are still getting exported as '?'.
Thanks
Rathi
First, make sure that your server is properly configured to use utf8mb4.
Following this tutorial, you need to add the following to your my.cnf (or my.ini if you're on Windows):
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
That tells the MySQL server to use utf8mb4 and ignore any encoding set by the client.
After that, I didn't need to set any additional properties on the MySQL connection in Talend. I executed this query in Talend to check the encoding it had set:
SHOW VARIABLES
WHERE Variable_name LIKE 'character\\_set\\_%' OR Variable_name LIKE 'collation%'
And it returned:
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    |                    |
| character_set_server     | utf8mb4            |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_unicode_ci |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8mb4_unicode_ci |
+--------------------------+--------------------+
The following test to insert a pile of poop works:
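A minimal sketch of such a test (hypothetical table name; U+1F4A9 PILE OF POO encodes to four bytes, F0 9F 92 A9, in UTF-8):

```sql
CREATE TABLE emoji_test (txt VARCHAR(10)) CHARACTER SET utf8mb4;
INSERT INTO emoji_test VALUES ('💩');
-- HEX(txt) should show F09F92A9; a literal '?' (hex 3F) means the value was mangled.
SELECT txt, HEX(txt) FROM emoji_test;
```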
Update
Using the native MySQL components in Talend 6.3.1, you get mysql-connector-java-5.1.30-bin.jar, which is supposed to detect the utf8mb4 used by the server automatically, but for some reason (a bug?) it isn't doing that.
I switched to the JDBC components and downloaded the latest MySQL connector (mysql-connector-java-5.1.45-bin.jar), and got it working by setting these additional parameters on the tJDBCConnection component:
useUnicode=true&characterEncoding=utf-8
(even though I'm specifying utf-8, the doc says the driver will treat it as utf8mb4)
Here's what my job looks like now:

How to set the character-set to utf8 at session level in jhipster

When deploying my JHipster-based application to Cloud Foundry (in my case Pivotal with the ClearDB service), I don't have the option to change the DB character set, nor to update the JDBC parameters, since it is a shared DB.
The charset of the DB is latin1, and I need it to be UTF-8 to be able to support languages like Arabic and Hebrew.
So the only option I can think of to support those languages is to initialize the DB session/connection when it's created, e.g. by running the SQL below:
SET session character_set_client = charset_name;
SET session character_set_results = charset_name;
SET session character_set_connection = charset_name;
How can this be done in JHipster? I don't see a place to set DB connection/session init SQL. If you have any other recommendation, please share.
Currently what happens is that Arabic/Hebrew input data coming from the client is saved in the DB as ????
BTW, if I update the DB entries using MySQL Workbench, the Arabic/Hebrew values are saved correctly and also displayed correctly.
Thanks,
Rabiaa
The data was destroyed during the INSERT: the bytes being stored were not encoded as utf8/utf8mb4, even though the column in the database is CHARACTER SET utf8 (or utf8mb4). Fix the encoding of the bytes being sent.
Also, check that the connection during reading is UTF-8.
See "question marks" in http://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored
Is that one question mark per character? Or two (in the case of Arabic / Hebrew)?
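A quick check to tell the two cases apart (hypothetical table and column names): if the stored bytes are literally 3F, the data was destroyed at INSERT time; if they are valid UTF-8 (Hebrew letters start with a D7 lead byte, Arabic with D8/D9), only the reading connection is misconfigured:

```sql
-- Compare the raw bytes against what you expect to have stored.
SELECT some_col, HEX(some_col) FROM some_table WHERE id = 1;
```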
Workaround to solve the issue, by creating proxy service:
1. Unbind the ClearDB service (but keep it, just unbind).
2. Create a new user-provided service which will call the ClearDB service with a custom URI.
See the following commands:
cf create-user-provided-service mysql-db -p '{"uri":"mysql://<uri of the clearDB service>?useUnicode=true&characterEncoding=utf8&reconnect=true"}'
cf bind-service <app name> mysql-db
cf restart <app name>
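If the JHipster version in question runs on Spring Boot with HikariCP, an alternative to the proxy-service workaround is to run init SQL on every pooled connection via HikariCP's connectionInitSql setting. An untested sketch, with the property path assumed from Spring Boot's Hikari binding:

```yaml
# application-prod.yml sketch -- assumes Spring Boot + HikariCP
spring:
    datasource:
        hikari:
            connection-init-sql: SET NAMES utf8mb4
```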

Collation issue with linked servers SQLServer and MySql and Incorrect String Value

Setup
We have a SQL Server instance connecting to MySQL as a linked server. Periodically, a SQL Server Agent job calls a stored procedure (SP) to check a flag in a SQL Server table and pass the contents of unflagged rows (via a temp table) to an SP in the MySQL DB, flagging the original SQL Server records as it goes.
Problem
Sometimes this all appears to work fine; other times, the MySQL DB does not get the updated data, even though the flag is set in SQL Server.
Viewing the history of the server agent job I see an error like this
[SQLSTATE 01000] (Error 7412) Could not execute statement on remote server 'LINKED_MYSQL'.
[SQLSTATE 42000] (Error 7215) OLE DB provider "MSDASQL" for linked server "LINKED_MYSQL" returned message "[MySQL][ODBC 5.3(a) Driver][mysqld-5.6.21-log]Incorrect string value: '\xF8</val...' for column 'INFO' at row 1".
As there isn't a column INFO anywhere in my tables, it looks like something systemic might be the cause. I was wondering if the collations might be the problem.
SQL Server is: SQL_Latin1_General_CP1_CI_AS
Whereas Mysql is: utf8.
I am unable to change the Collation of the SQL Server.
Any suggestions as to whether this is the likely cause or how to debug other solutions. No other issues are to be found in the log files.
UPDATE: So the error message says there is a problem with '\xF8</val'; there isn't any \xF8 in my data, so something is being replaced here.
UPDATE 2: Yes, this tallies with the content that sometimes gets transferred: there is a ø, which is \xF8 in Latin-1.
Updated question:
How can I get SQL Server data into the MySQL server without changing the collation on SQL Server? I have already tried changing the collation for the MySQL DB and table to "default collation: utf8mb4_general_ci / charset: utf8mb4".
Fixing the collation in MySQL actually solved the original problem; however, one line of the MySQL SP
SET hostIP := (SELECT host from information_schema.processlist WHERE ID=connection_id());
causes the problem to reoccur. Commenting the above out fixes it (hooray), but leaving it in breaks it (boo).
Have tried
SET hostIP := (SELECT host COLLATE utf8_general_ci from information_schema.processlist WHERE ID=connection_id());
but no cigar
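One more thing worth trying (an untested sketch): COLLATE only relabels the collation, while CONVERT ... USING actually re-encodes the value, which is usually what's needed when the source and target character sets differ:

```sql
SET hostIP := (SELECT CONVERT(host USING utf8mb4)
               FROM information_schema.processlist
               WHERE ID = CONNECTION_ID());
```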

MySQL fails to update a record with a utf8mb4 string on Jetty

I'm running java over jetty, on an EC2 linux instance, using MySQL DB. The column is a VARCHAR, set to accept utf8mb4 encoding.
After playing around with stuff, I've found out that it works when I run this code through gradle jettyRunWar, or even when running the same code on a tomcat server.
It doesn't work when I place the exact same WAR that was working before in $JETTY_HOME/webapps/root.war and then run Jetty with sudo service jetty start.
The error shown is -
java.sql.BatchUpdateException: Incorrect string value: '\xF0\x9F\x99\x89' for column 'name' at row 1
Current column definition -
`name` varchar(50) CHARACTER SET utf8mb4 DEFAULT NULL
Value is set in SQL through preparedStatement.setString(...) and I made sure that mysql connector JAR is the same.
Any ideas?
The problem was resolved once I set all character_set_... MySQL DB variables to utf8mb4.
That probably means that newer versions of Jetty (8+) treat the JDBC connection differently from Tomcat or from older versions, because the exact same connection string, bean definitions and DB were used in all cases.
Until now, the character_set_... params were set to utf8 with specific columns defined as utf8mb4, and that used to be enough.
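For reference, a sketch of how to see which of those variables are still utf8, and what the persistent server-side fix looks like:

```sql
-- Show the session/server character set variables:
SHOW VARIABLES LIKE 'character\_set\_%';

-- Persistent equivalent in my.cnf under [mysqld]:
--   character-set-server = utf8mb4
--   collation-server     = utf8mb4_unicode_ci
```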
Hope that saves someone else the day I was stuck on it.