Hibernate, MySQL encoding does not work on Debian

I've built a Java EE application that uses Hibernate to communicate with MySQL. It works perfectly on my Windows development machine, but I have a problem on Debian, where the application is deployed.
When I search for a keyword containing Polish letters (such as ł, ą, ć, ó), the result is correct on Windows, but on the server, where I imported the database, the search does not work.
Hibernate query looks like this:
@NamedQuery(name = "Keyword.findByKeyword", query = "SELECT k FROM Keyword k WHERE k.keyword = :keyword")
and is called like this:
myEntityManager.createNamedQuery("Keyword.findByKeyword").setParameter("keyword", keyword).getSingleResult();
When I run the mysql client on Debian over SSH and type the SELECT query manually:
SELECT * FROM keywords WHERE keyword = 'ser żółty';
it also works and returns a single result. The encodings and collations of the tables and columns are also correct. In the datasource configuration I've added the
?UseUnicode=true&characterEncoding=utf8
parameters, but that did not help either. I thought there might be an encoding problem in the request data sent by the form, but the problem appears even if the parameter, e.g. "ser żółty", is hardcoded in my repository class.
I also use Hibernate Search for indexing, and the FullTextEntityManager returns correct results with Polish letters.
I assume there is some problem between Hibernate and MySQL, but I have no more ideas about what I could change. Any suggestions?
Server: WildFly 9.0.1, MySQL 5.6

OK, the problem was the encoding at the MySQL server level: it was set to latin1 by default. To fix this, follow the question "Change MySQL default character set to UTF-8 in my.cnf?" and edit your my.cnf file.
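To see why a latin1 default can break an exact-match lookup, here is a minimal sketch in plain Python. It does not talk to MySQL at all; it only assumes that somewhere between the application and the server the keyword goes through a lossy conversion to latin1, which cannot represent ż or ł. The variable names are made up for illustration.

```python
# Illustration only: mimics a lossy conversion to latin1 in plain Python.
keyword = "ser żółty"

# latin1 has no ż or ł, so they are replaced by '?'; ó exists in latin1 and survives.
as_latin1 = keyword.encode("latin-1", errors="replace")
round_tripped = as_latin1.decode("latin-1")

print(as_latin1)                 # b'ser ?\xf3?ty'
print(round_tripped)             # ser ?ó?ty
print(round_tripped == keyword)  # False
```

Once the keyword has been degraded like this, a WHERE k.keyword = :keyword comparison can no longer match the stored value, which would fit the symptom described in the question.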

Related

SQLAlchemy mysql cannot get the correct charset

Python 3.8.8 program with Flask 2.0.1 and Flask-SQLAlchemy 2.5.1.
MySQL database; collation of the tables: utf8_general_ci.
I'm also using two other SQL Server databases via SQLALCHEMY_BINDS. Everything runs on Windows 10.
Some characters returned by select queries on the MySQL database come out wrong: "situazione Ã¨ decisamente migliorata"
should be: "situazione è decisamente migliorata"
This would solve the problem:
mystring.encode('cp1252').decode('utf8')
but I need a solution at program level. I tried:
appending "?charset=utf8", "?charset=cp1215" and others to the SQLALCHEMY_DATABASE_URI connection string
setting app.config['MYSQL_CHARSET'] and app.config['MYSQL_DATABASE_CHARSET'] to 'utf8', 'utf8mb4', 'latin1', 'cp1252'
...
passing a parameter to SQLAlchemy like db = SQLAlchemy(use_native_unicode="utf8"), with many variations here too
No attempt worked. I would appreciate any suggestions.
Are you looking for a way to specify the encoding per database connection?
For all connections, try
app.config['SQLALCHEMY_ENGINE_OPTIONS'] = {'encoding': 'cp1252'}
For specific connections to different DBs you can also use engine_options:
engine = create_engine('mysql://user:password@hostname/dbname',
                       encoding='cp1252')
Got the solution.
The problem was not a problem.
The person who built the original database (which is quite old) encoded some characters incorrectly.
Some of my approaches, and the one suggested by olegsv, worked. I checked this by debugging deep down into the SQLAlchemy data structures: the driver accepted the character encoding, but the characters in the data were themselves wrong.
This was unexpected.
Maybe I should delete the whole question.
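If the stored data itself is damaged, one option is to repair values after reading them. The sketch below wraps the encode('cp1252').decode('utf8') trick from the question in a helper that leaves healthy strings alone. The name repair_mojibake is hypothetical, and the sample string is how the question's text would look if its UTF-8 bytes had been read as cp1252.

```python
def repair_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes read as cp1252' damage, if present."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except UnicodeError:
        # Not representable in cp1252, or the bytes are not valid UTF-8:
        # the text was probably fine already, so leave it untouched.
        return text


broken = "situazione Ã¨ decisamente migliorata"
print(repair_mojibake(broken))  # situazione è decisamente migliorata
```

This is a heuristic, not a guarantee: a correct string that happens to survive the round trip would be altered too, so fixing the rows in the database once is probably the cleaner long-term approach.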

MySQL Connector/J v5.x upgrade: query now returning byte[] instead of String

I just updated the JDBC driver for my application from
mysql-connector-java-3.1.12-bin.jar
to
mysql-connector-java-5.1.34-bin.jar.
With the v3.x driver, this kind of a query works:
select concat("<a href>", count(sakila.payment.payment_id), "</a>")
from sakila.payment;
But now with the new v5.x driver, the query only works with a cast().
select cast(concat("<a href>", count(sakila.payment.payment_id), "</a>")
as char(30)) from sakila.payment;
Is there any property in the MySQL database I can change?
I don't want to change hundreds of queries like that.
I suspect that you will have to bite the bullet and update your code. There is a bug report here that seems to match your circumstances and the status of that bug report is "Won't fix". The response from the developers ([4 Apr 2007 17:43] Reggie Burnett) was:
This is something that we can't really fix. Let me explain.
MySQL has several issues when it comes to reporting whether a result is binary or not. This was very bad on MySQL versions prior to 5.0 but it's still a problem even today. The SQL you reported is returned by MySQL as binary when it obviously is not. The connector can't know for sure. With 5.0.5 and 5.0.6, we tried to make a "best guess" but that code caused more problems than it solved, so with 5.0.7 we have rolled it out. Your SQL will return string properly with 5.0.7, but that doesn't mean it's fixed. In fact, it returns string because we are ignoring the binary flag so that means you could generate valid SQL that should return binary and 5.0.7 will return string.
Until the server is fixed, the connector just can't always do the right thing. I hope this has cleared it up somewhat.
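If changing every query is not practical, another option is to normalise the values on the client side after fetching. The following is only a sketch of that idea in Python rather than Java; as_text and normalise_row are hypothetical helper names, and the assumption is that the affected values arrive as UTF-8 encoded bytes.

```python
def as_text(value, encoding="utf-8"):
    """Decode values the driver reported as binary; pass everything else through."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding)
    return value


def normalise_row(row):
    """Apply as_text to every column of a fetched row."""
    return tuple(as_text(column) for column in row)


row = (b"<a href>16049</a>", 42, None)
print(normalise_row(row))  # ('<a href>16049</a>', 42, None)
```

The trade-off is the same one the developers describe: a helper like this cannot tell a mislabelled string from a genuinely binary column, so it should only be applied to columns known to hold text.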

How to display unicode in MySQL result?

http://www.sqlfiddle.com/#!2/82f65/1
I tried this:
create table x(y varchar(100) character set utf8);
insert into x(y) values('爱');
But the Chinese character doesn't appear:
select y from x;
Output:
Y
?
I'm the author of sqlfiddle.com. The problem was that I didn't have my connection string and default database encoding for mysql setup to properly handle UTF8. I have fixed this now, but because the fiddle you posted is still using the obsolete settings, you'll have to see it working here on my slightly-modified version of your fiddle:
http://www.sqlfiddle.com/#!2/e79e8/1
Your link might start working eventually, it just needs to clear out of the running memory and be reset. After no one hits it for a while it should be harvested and then ready to be built back up cleanly. Thanks!
FYI, the changes I had to make to get it to work were found here: http://www.compoundtheory.com/?action=displayPost&ID=421
The relevant bits were adding this to my connection string from Java:
useUnicode=true&characterEncoding=UTF-8
And adding this to my create database statement:
create database my_new_database default CHARACTER SET = utf8 default COLLATE = utf8_general_ci;
It works fine in MySQL on my localhost. It may be due to the MySQL charset or some other setting; please check it.
If you have to run this query from a program, e.g. in PHP, then
run this query before the select query:
"SET NAMES utf8"
It will then return the result properly.
Thanks.
The Chinese character is not displayed in the fiddle, but in an actual MySQL database it works fine. Please check your MySQL version.

Renaming columns in a MySQL select statement with R package RJDBC

I am using the RJDBC package to connect to a MySQL (Maria DB) database in R on a Windows 7 machine and I am trying a statement like
select a as b
from table
but the column will always continue to be named "a" in the data frame.
This works normally with RODBC and RMySQL but doesn't work with RJDBC. Unfortunately, I have to use RJDBC, as it is the only package that has no problem with the encoding of Chinese, Hebrew and other non-Latin letters (set names and similar settings don't seem to work with RODBC and RMySQL).
Has anybody experienced this problem?
I have run into the same frustrating issue. Sometimes the AS keyword would have its intended effect, but other times it wouldn't. I was unable to identify the conditions to make it work correctly.
Short Answer: (Thanks to Simon Urbanek (package maintainer for RJDBC), Yev, and Sebastien! See the Long Answer.) One thing that you may try is to open your JDBC connection using ?useOldAliasMetadataBehavior=true in your connection string. Example:
drv <- JDBC("com.mysql.jdbc.Driver", "C:/JDBC/mysql-connector-java-5.1.18-bin.jar", identifier.quote="`")
conn <- dbConnect(drv, "jdbc:mysql://server/schema?useOldAliasMetadataBehavior=true", "username", "password")
query <- "SELECT `a` AS `b` FROM table"
result <- dbGetQuery(conn, query)
dbDisconnect(conn)
This ended up working for me! See more details, including caveats, in the Long Answer.
Long Answer: I tried all sorts of stuff, including making views, changing queries, using JOIN statements, NOT using JOIN statements, using ORDER BY and GROUP BY statements, etc. I was never able to figure out why some of my queries were able to rename columns and others weren't.
I contacted the package maintainer (Simon Urbanek.) Here is what he said:
In the vast majority of cases this is an issue in the JDBC driver, because there is really not much RJDBC can do other than to call the driver.
He then recommended that I make sure I had the most recent JDBC driver for MySQL. I did have the most recent version. However, it got me thinking "maybe it IS a bug with the JDBC driver." So, I searched Google for: mysql jdbc driver bug alias.
The top result for this query was an entry at bugs.mysql.com. Yev, using MySQL 5.1.22, says that when he upgraded from driver version 5.0.4 to 5.1.5, his column aliases stopped working. He asked whether it was a bug.
Sebastien replied, "No, it's not a bug! It's a documented change of behavior in all subsequent versions of the driver." and suggested using ?useOldAliasMetadataBehavior=true, citing documentation for the JDBC driver.
Caveat Lector: The documentation for the JDBC driver states that
useColumnNamesInFindColumn is preferred over useOldAliasMetadataBehavior unless you need the specific behavior that it provides with respect to ResultSetMetadata.
I haven't had the time to fully research what this means. In other words, I don't know what all of the ramifications of using useOldAliasMetadataBehavior=true are. Use at your own risk. Does someone else have more information?
I don't know RJDBC, but in some cases when it is necessary to give permanent aliases to columns without renaming them, you can use VIEWs
CREATE OR REPLACE VIEW v_table AS
SELECT a AS b
FROM table
... and then ...
SELECT b FROM v_table
There is a separate function in the ResultSetMetaData interface for retrieving the column label vs the column name:
String getColumnLabel(int column) throws SQLException;
Gets the designated column's suggested title for use in printouts and
displays. The suggested title is usually specified by the SQL AS
clause. If a SQL AS is not specified, the value returned
from getColumnLabel will be the same as the value returned by the
getColumnName method.
Using getColumnLabel should resolve this issue (if not, check that your JDBC driver is following this spec).
e.g.
ResultSetMetaData rsmd = rs.getMetaData();
int columnCount = rsmd.getColumnCount();
while (rs.next()) {
    // JDBC column indexes are 1-based
    for (int i = 1; i <= columnCount; i++) {
        // getColumnLabel returns the alias from the AS clause, if any
        String label = rsmd.getColumnLabel(i);
        System.out.println(rs.getString(label));
    }
}
This is the workaround we use for R and SAP HANA via RJDBC:
names(result)[1]<-"b"
It's not the nicest workaround, but since Aaron's solution does not work for us, we went with this "solution".

UTF8 - Hibernate/MySQL weirdness

I have a db in production where all of my tables are using utf8 / utf8_general_ci encoding. This is basically working fine except in one scenario.
What happens is that ??? is returned for some characters (Chinese, etc.); however, the same characters from the same table are returned correctly when fetched via a different criteria query.
I've double-checked the connection parameters from Hibernate to MySQL and they have the correct charset set.
I cannot understand how this can be happening. The criteria that returns the bad characters is just a simple findById:
Criteria criteria = getHibernateSession().createCriteria(CalendarEvent.class);
criteria.add(Restrictions.eq("id", id));
return (CalendarEvent) criteria.uniqueResult();
This is only happening in production on Solaris - I cannot reproduce it locally.
In your connection string, have you tried
jdbc:mysql://localhost/dbname?characterEncoding=utf8
or adding the JVM parameter -Dfile.encoding=utf-8 when starting your application / server?
Try setting the following properties in your hibernate configuration file:
<property name="hibernate.connection.useUnicode">true</property>
<property name="hibernate.connection.characterEncoding">UTF-8</property>
<property name="hibernate.connection.charSet">UTF-8</property>