MSSQL and MySQL as Linked Server - mysql

I have an MS SQL Server 2005 instance with a MySQL server configured as a linked server.
I want to save particular data from MSSQL to MySQL,
and I have run into a serious problem with encoding.
MS SQL
select SERVERPROPERTY ('collation')
Result: Cyrillic_General_CI_AS
MySQL
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | utf8   |
| character_set_connection | utf8   |
| character_set_database   | utf8   |
| character_set_filesystem | binary |
| character_set_results    | utf8   |
| character_set_server     | utf8   |
| character_set_system     | utf8   |
+--------------------------+--------+
When I try to retrieve data from MySQL, or insert data into MySQL,
I get the wrong character set in text fields,
something like "???????????????".
How can I convert text data to UTF-8 before inserting it into the linked server?
Or should I change some settings?
I don't want to change the MySQL server's encoding to CP-1251; that is not convenient for me.

What is the Collation Compatible property set to for your linked server? This might help.
Have you tried COLLATE?
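The "???????" symptom is usually the server substituting ? for characters it cannot represent during a charset conversion. This is an illustrative Python sketch of the effect, not the linked-server code itself:

```python
# Cyrillic text that fits cp1251 but not latin1 turns into "?" when forced
# through the wrong charset, which is the classic source of "??????"
# in linked-server round trips.
text = "Привет, мир"

# Round-tripping through cp1251 preserves the text...
assert text.encode("cp1251").decode("cp1251") == text

# ...but converting to a charset that lacks Cyrillic replaces every
# non-representable character with "?", exactly the symptom described above.
mangled = text.encode("latin1", errors="replace").decode("latin1")
print(mangled)  # ??????, ???
```

The fix is therefore to make sure no conversion step on the way to MySQL passes through a charset that cannot hold Cyrillic.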

Related

How to insert unknown charset string to MariaDB table in C

My sample code (in C) is:
insert into test set fir='aaa',sec='¿¬ºA¦ µµT °¥µ젰ވ­.alz'
If I copy and paste this string into the MariaDB prompt, the insert succeeds.
But it does not work from my C code: the sec column ends up NULL. Why?
How can I modify my C code so that the insert works? (It is okay if the string ends up broken.)
MariaDB status...
MariaDB [oops]> status
....
Server characterset: utf8
Db characterset: utf8
Client characterset: utf8
Conn. characterset: utf8
...
MariaDB [oops]> show full columns from test
...
| fir | varchar(255) | utf8_general_ci |
| sec | varchar(255) | utf8_general_ci |
...
Thanks.
P.S.
The C code parses this string (¿¬ºA¦ µµT °¥µ젰ވ­.alz) out of an e-mail.
Update:
MariaDB [oops]> show variables like 'char%';
+--------------------------+-----------------------------+
| Variable_name            | Value                       |
+--------------------------+-----------------------------+
| character_set_client     | utf8                        |
| character_set_connection | utf8                        |
| character_set_database   | utf8                        |
| character_set_filesystem | binary                      |
| character_set_results    | utf8                        |
| character_set_server     | utf8                        |
| character_set_system     | utf8                        |
| character_sets_dir       | /xxxx/mysql/share/charsets/ |
+--------------------------+-----------------------------+
8 rows in set (0.01 sec)
MariaDB [oops]>
The mysql C client often defaults to a latin1 client connection while the mysql command line client adopts the character set of the user's environment.
So my guess would be that if you execute "SHOW VARIABLES LIKE 'char%'" from within your C program, you'll get a row back stating that the client connection is in fact latin1.
Thus, although the charset of your original string is "unknown" to you, it probably contains some multibyte character sequences which are being wrongly encoded as a result of inserting data from a latin1 client connection into a utf8 database/column.
Generally speaking, if you want your data to be unadulterated upon insertion then your client connection character encoding should match that of the database/column.
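The mismatch described above can be sketched outside of MySQL entirely. This is an illustration of the byte-level effect, not the C API itself; in the MariaDB/MySQL C client, the usual fix is to call mysql_set_character_set(conn, "utf8") right after connecting, so the client connection matches the column:

```python
# Sketch of the double-encoding effect: UTF-8 bytes arriving over a
# connection the server believes is latin1 get re-encoded byte by byte.
original_bytes = "é".encode("utf-8")         # b'\xc3\xa9'

# The server decodes each incoming byte as a latin1 character...
as_latin1 = original_bytes.decode("latin1")  # 'Ã©'
assert as_latin1 == "Ã©"

# ...then stores the UTF-8 encoding of *that*, so the column now holds
# four bytes where the client sent two.
stored = as_latin1.encode("utf-8")
assert stored == b"\xc3\x83\xc2\xa9"
```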

MySQL UTF8 Issue

Okay, I have been trying to import a CSV file into MySQL for the past 24 hours but have failed miserably.
I have run SET NAMES, set the character-set variables, and there is nothing left that I have not set to UTF-8: not just the DB and tables, but the server as well. Still no use.
I am importing directly in MySQL, so it is not a PHP issue. I will be grateful if anyone can point out where I am going wrong.
mysql> SHOW CREATE DATABASE `dict_2`;
+----------+-----------------------------------------------------------------------------------------+
| Database | Create Database                                                                         |
+----------+-----------------------------------------------------------------------------------------+
| dict_2   | CREATE DATABASE `dict_2` /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci */ |
+----------+-----------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> show variables like "%character%"; show variables like "%collation%";
+--------------------------+--------------------------------+
| Variable_name            | Value                          |
+--------------------------+--------------------------------+
| character_set_client     | utf8                           |
| character_set_connection | utf8                           |
| character_set_database   | utf8                           |
| character_set_filesystem | utf8                           |
| character_set_results    | utf8                           |
| character_set_server     | utf8                           |
| character_set_system     | utf8                           |
| character_sets_dir       | C:\xampp\mysql\share\charsets\ |
+--------------------------+--------------------------------+
8 rows in set (0.00 sec)
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_general_ci |
| collation_server     | utf8_unicode_ci |
+----------------------+-----------------+
3 rows in set (0.00 sec)
In its current form, this question is impossible to answer.
We're left guessing...
That you're using a MySQL LOAD DATA statement.
You've verified that the characterset encoding of the .csv file is not ucs2.
You've verified that the characterset encoding of the .csv file is utf8 (i.e. it matches the character_set_database system variable), or that you've specified the appropriate characterset in the CHARACTER SET clause of the LOAD DATA statement.
Beyond that, there's a whole slew of other things that might be wrong, but we're still just guessing.
Very frequently, when something in MySQL "fails miserably", there's some sort of indication: an error message, or some other behavior that we can observe and describe.
In the question, the description of the failure mode is beyond vague, it's entirely non-existent.
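One concrete check from the list above: verify the file's encoding before running LOAD DATA (and, if needed, state it explicitly with LOAD DATA ... CHARACTER SET utf8). A rough Python sketch of such a check; sniff_encoding is a hypothetical helper, not part of MySQL:

```python
# Look at the raw bytes of the file: UTF-16 ("ucs2"-family) files usually
# start with a byte-order mark, and genuine UTF-8 decodes cleanly.
def sniff_encoding(raw):
    if raw.startswith(b"\xff\xfe") or raw.startswith(b"\xfe\xff"):
        return "utf-16 (BOM present)"
    try:
        raw.decode("utf-8")
        return "valid utf-8"
    except UnicodeDecodeError:
        return "not utf-8 (perhaps latin1/cp1252)"

# In practice: sniff_encoding(open("dict.csv", "rb").read())
print(sniff_encoding("wörterbuch".encode("utf-8")))   # valid utf-8
print(sniff_encoding("wörterbuch".encode("utf-16")))  # utf-16 (BOM present)
```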

How do I convert the OpenShift MySQL 5.1 cartridge to UTF-8

The default MySQL 5.1 cartridge apparently creates all its tables with the latin1 character set. I have an application (Review Board, a python/Django application) that has some issues unless the DB is running as UTF-8. How do I change that? I can't just edit my.cnf because it will be wiped at the next cartridge restart.
mysql> show variables like 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
I cannot change this setting in my.cnf, because to the best of my knowledge, there exists no OpenShift environment variable to set the character encoding. How do I persistently change this (ideally in my OpenShift hooks so this will persist into future deployments) and update my existing tables to UTF-8?
I found a solution, though not a perfect one:
install phpMyAdmin in OpenShift,
then find the server settings and change the relevant character-set variables from latin1 to utf8.
Hope that helps.
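A more direct route, assuming you can reach the database with any client: convert the existing tables with ALTER TABLE ... CONVERT TO CHARACTER SET, which recodes stored data as well as the default. A sketch that only builds the statements (the database and table names here are placeholders; in practice the table list would come from information_schema):

```python
# Build the ALTER statements needed to move a database and its tables
# to utf8. Nothing is executed here; the strings would be fed to a client.
def convert_statements(database, tables):
    stmts = [
        f"ALTER DATABASE `{database}` CHARACTER SET utf8 COLLATE utf8_general_ci;"
    ]
    for table in tables:
        stmts.append(
            f"ALTER TABLE `{database}`.`{table}` "
            f"CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;"
        )
    return stmts

for stmt in convert_statements("reviewboard", ["accounts_profile", "reviews_review"]):
    print(stmt)
```

Note that CONVERT TO assumes the latin1 columns really contain latin1 data; if utf8 bytes were stored in them, a dump-and-reload with explicit charsets is safer.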

Unicode characters from database not recognized

This stumps me. I'm upgrading a fairly large app (for me) from Rails 2.3 to Rails 3.0. I'm also running this app in Ruby 1.9.2 as opposed to 1.8.7 before. On top of that I've also switched to HTML5. There are therefore many variables in play.
In several pages, the text coming from the MySQL database just does not display right anymore. This can be as simple as the euro symbol (€) or as esoteric as some Sanskrit text: सर्वम् मंगलम्
While everything looked great on the old site now I get some garbage characters such as € instead of the euro sign or the following:
सर्वम् मंगलम्
... instead of the sanskrit text.
The data in the database is unchanged. As far as I know everything is set up for utf-8 everywhere.
What gives?
Edit 1 following up Roland's help:
Here is what I get on my ubuntu server's MySQL databases:
mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
but here is what I get from running the command on my local mac:
mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+------------------------------------------------------+
| Variable_name            | Value                                                |
+--------------------------+------------------------------------------------------+
| character_set_client     | utf8                                                 |
| character_set_connection | utf8                                                 |
| character_set_database   | utf8                                                 |
| character_set_filesystem | binary                                               |
| character_set_results    | utf8                                                 |
| character_set_server     | utf8                                                 |
| character_set_system     | utf8                                                 |
| character_sets_dir       | /usr/local/Cellar/mysql/5.5.14/share/mysql/charsets/ |
+--------------------------+------------------------------------------------------+
The second listing looks better to me (who doesn't understand encoding very much).
Should I modify my server databases' settings? Won't that mess up their existing data? If so how do I go about changing the char. set variables?
When you interpret the given string as Unicode, encode it as UTF-8 into a byte stream, and then convert that byte stream to MacRoman, you get the right bytes: the UTF-8 encoding of the original string.
I did this (in a UTF-8 terminal):
$ echo 'सर्वम् मंगलम्' > in
$ iconv -f UTF-8 -t MacRoman < in
सर्वम् मंगलम्
So somewhere, the opposite conversion is done to the data. The byte stream is interpreted as being in MacRoman, and it is then converted to UTF-8 again.
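The same class of mojibake can be reproduced in Python with any one-byte-per-character charset that, like MacRoman, assigns a character to every byte; latin1 is the simplest example. This assumes the garbling is a decode-bytes-with-the-wrong-charset step, as described above:

```python
# The UTF-8 bytes of the Sanskrit text, misread one byte at a time as
# latin1, produce the kind of garbage shown in the question. Because no
# information is lost, reversing the misread recovers the text exactly.
text = "सर्वम् मंगलम्"
garbled = text.encode("utf-8").decode("latin1")
assert garbled != text
assert garbled.encode("latin1").decode("utf-8") == text
```

This reversibility is why the data in the database can still be intact even though the page displays garbage: only the final interpretation step is wrong.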

Retrieving latin1 encoded results with JDBC

I am trying to retrieve result sets from a MySQL database using JDBC, which are then used to generate reports in BiRT. The connection string is set up in BiRT.
The database is latin1:
SHOW VARIABLES LIKE 'c%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
| collation_connection     | latin1_swedish_ci          |
| collation_database       | latin1_swedish_ci          |
| collation_server         | latin1_swedish_ci          |
| completion_type          | 0                          |
| concurrent_insert        | 1                          |
| connect_timeout          | 5                          |
+--------------------------+----------------------------+
So I have been trying to correct the strange-looking encoding results that are returned (German characters). I thought it would make sense to use the "characterSetResults" property to retrieve the result set as "latin1", like this:
jdbc:mysql://localhost:3306/statistics?useUnicode=true&characterEncoding=latin1&characterSetResults=latin1
This connection string fails, and by deduction I have discovered that the property
characterSetResults=latin1
is causing the connection to fail. The error is a long Java stack trace that means little to me. It starts with:
org.eclipse.birt.report.data.oda.jdbc.JDBCException: There is an error in get connection, Communications link failure
Last packet sent to the server was 38 ms ago..
at org.eclipse.birt.report.data.oda.jdbc.JDBCDriverManager.doConnect(JDBCDriverManager.java:262)
at org.eclipse.birt.report.data.oda.jdbc.JDBCDriverManager.getConnection(JDBCDriverManager.java:186)
at org.eclipse.birt.report.data.oda.jdbc.JDBCDriverManager.tryCreateConnection(JDBCDriverManager.java:706)
at org.eclipse.birt.report.data.oda.jdbc.JDBCDriverManager.testConnection(JDBCDriverManager.java:634)
at org.eclipse.birt.report.data.oda.jdbc.ui.util.DriverLoader.testConnection(DriverLoader.java:120)
at org.eclipse.birt.report.data.oda.jdbc.ui.util.DriverLoader.testConnection(DriverLoader.java:133)
at org.eclipse.birt.report.data.oda.jdbc.ui.profile.JDBCSelectionPageHelper.testConnection(JDBCSelectionPageHelper.java:687)
at org.eclipse.birt.report.data.oda.jdbc.ui.profile.JDBCSelectionPageHelper.access$7(JDBCSelectionPageHelper.java:655)
at org.eclipse.birt.report.data.oda.jdbc.ui.profile.JDBCSelectionPageHelper$7.widgetSelected(JDBCSelectionPageHelper.java:578)
at org.eclipse.swt.widgets.TypedListener.handleEvent(TypedListener.java:234)
If I change this to:
characterSetResults=utf8
the connection string connects without errors, but the encoding issue remains.
Does anyone know the correct way to retrieve latin1? And yes, I know UTF8 is the thing to use, but this is not my database....
Thank you for reading this,
Stephen
After some digging: have you tried characterSetResults=ISO8859_1? This is equivalent to latin1, and there is evidence that MySQL handles it much better.
I do not have a DB to test this on, but from what I read it looks spot-on for what you need.
When specifying character encodings on the client side, use Java-style names (see the MySQL Connector/J reference on character sets). So it is supposed to work with jdbc:mysql://localhost:3306/statistics?useUnicode=true&characterEncoding=utf-8&characterSetResults=Cp1252
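Separately from the connection-string question: if the German characters come back double-encoded (utf8 bytes that were stored as if they were latin1), the visible symptom and its reversal look like this. This is an illustrative sketch of the data problem, not the JDBC API, and it assumes the misread was through a lossless single-byte charset:

```python
# 'ü' written through a mismatched connection typically reads back as 'Ã¼'.
# Undoing the misread once (re-encode as latin1, decode as utf-8)
# recovers the original character.
garbled = "Ã¼"
fixed = garbled.encode("latin1").decode("utf-8")
assert fixed == "ü"
```

If this round trip fixes the text, the problem is stored double-encoded data rather than the result-set charset, and no characterSetResults value alone will cure it.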