SQL Server collation to MySQL collation+character set mapping and vice versa - mysql

I'm trying to automatically translate a SQL Server collation string into a MySQL collation string, but I can't find any rules describing how these collation names are constructed.
For example, I want to translate the MySQL collation 'big5_chinese_ci' into its SQL Server equivalent.
There are some useful links:
http://technet.microsoft.com/en-us/library/aa258233(v=sql.80).aspx
https://dev.mysql.com/doc/refman/5.5/en/charset-charsets.html
Any ideas? (By the way, please excuse my bad English.)
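As a starting point for such a translation, the names on both sides can at least be split into their parts. The sketch below is only an illustration under stated assumptions: it relies on MySQL collation names starting with the character set name and usually ending in _ci, _cs or _bin, and on SQL Server names carrying flags such as CI/CS, AI/AS, KS, WS, BIN and BIN2. The function names and the flag list are hypothetical and incomplete, and the sketch does not claim that any particular pair of collations is equivalent.

```python
# Hypothetical helpers: they only split collation names into parts,
# they do not decide which collations are equivalent.

# Common MySQL sensitivity suffixes (assumed subset, not exhaustive).
MYSQL_SUFFIXES = {"ci": "case-insensitive", "cs": "case-sensitive", "bin": "binary"}

# Common SQL Server option flags (assumed subset, not exhaustive).
SQLSERVER_FLAGS = {"CI", "CS", "AI", "AS", "KS", "WS", "BIN", "BIN2"}


def split_mysql_collation(name):
    """Split e.g. 'big5_chinese_ci' into (charset, middle, suffix)."""
    parts = name.lower().split("_")
    charset = parts[0]
    has_suffix = len(parts) > 1 and parts[-1] in MYSQL_SUFFIXES
    suffix = parts[-1] if has_suffix else None
    middle = "_".join(parts[1:-1] if has_suffix else parts[1:])
    return charset, middle, suffix


def split_sqlserver_collation(name):
    """Split e.g. 'SQL_Latin1_General_CP1_CI_AS' into (designator, flags)."""
    tokens = name.split("_")
    flags = [t for t in tokens if t.upper() in SQLSERVER_FLAGS]
    designator = "_".join(t for t in tokens if t.upper() not in SQLSERVER_FLAGS)
    return designator, flags


if __name__ == "__main__":
    print(split_mysql_collation("big5_chinese_ci"))
    print(split_sqlserver_collation("SQL_Latin1_General_CP1_CI_AS"))
```

The parts can then be compared by hand (for example, ci against CI), but which SQL Server designator corresponds to a given MySQL character set still has to come from a lookup table you maintain yourself.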

Related

Connecting with wrong collation to a mysql server?

I have a Golang program that may connect to databases with different character sets or collations.
For example, the default collation of the Golang MySQL driver at the time of writing is utf8mb4_general_ci: https://github.com/go-sql-driver/mysql#collation
However if I connect to a database configured like so:
CREATE DATABASE example character set utf8mb4 collate utf8mb4_unicode_ci;
Can I expect "bad things to happen"? Indexes not to work?
In most cases there is no problem. For example, when you use WHERE column = ?, the column's collation is used regardless of the connection collation.
See also: https://dev.mysql.com/doc/refman/5.6/en/charset-collation-coercibility.html
But I can't say it's 100% safe. The safe choice is to use one collation everywhere.
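One way to follow the "one collation everywhere" advice is to send a SET NAMES statement right after connecting. The sketch below only builds that statement as a string; it is written in Python for illustration, the function name is hypothetical, and it assumes that the character set is the first underscore-separated part of the collation name (true for names like utf8mb4_unicode_ci, not for the special collation named binary).

```python
import re

# Accept only simple lowercase names such as 'utf8mb4_unicode_ci' so the
# value can be placed in a statement without risking SQL injection.
_COLLATION_RE = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)+$")


def set_names_statement(collation):
    """Build a SET NAMES statement for the given MySQL collation name."""
    if not _COLLATION_RE.match(collation):
        raise ValueError(f"unexpected collation name: {collation!r}")
    # Assumption: the charset is the prefix before the first underscore.
    charset = collation.split("_", 1)[0]
    return f"SET NAMES '{charset}' COLLATE '{collation}'"


if __name__ == "__main__":
    print(set_names_statement("utf8mb4_unicode_ci"))
```

Executing the resulting statement on a fresh connection makes the connection collation match the database from the question, whatever the driver default is.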

MySQL utf8_bin collation equivalent for Azure SQL database

I am trying to migrate a MySQL application to Azure.
The pricing for Azure's MySQL database seems to be considerably higher than the "SQL Databases" option, so I decided to go with the "SQL Database" option.
The last step of the resource set-up is to choose a collation.
In MySQL I use utf8_bin, but that collation does not seem to be valid for "SQL Database".
Is there an equivalent collation?
I need to store UTF characters with case-sensitive and accent-sensitive comparison, and I almost never sort strings.
I did some research on the internet but couldn't find any information about Azure's collations.
Edit:
After additional research I came across 'Latin1_General_BIN2', which should do the job. I'm not sure that 'Latin' can handle all UTF-8 characters (e.g. ʖ, ޖ, etc.), and I have not yet fully grasped the difference between BIN and BIN2 collations.
That collation is not UTF-8 capable. Up to this moment, existing collations in SQL Server and Azure SQL DB are non-Unicode, with Unicode (UTF-16) being enabled through the NCHAR and NVARCHAR (and sql_variant) data types.
That being said, we are now running a private preview of UTF8 support in SQL Server and Azure SQL DB, so I'd like to further discuss with you.
Will you be at Ignite? If so please look for me in the SQL Server booth. If not, can you please send me an email to utf8team#microsoft.com?
Thank you!
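The difference between a Latin1 code page and UTF-16 storage, as described in the answer above, can be illustrated outside the database. This is a rough sketch only: it assumes that Python's cp1252 codec is a fair stand-in for the code page behind Latin1_General collations and that utf-16-le approximates how NVARCHAR stores text. The function name is hypothetical.

```python
def fits_code_page(text, code_page="cp1252"):
    """Return True if every character of text exists in the code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for sample in ["abc", "ʖ", "ޖ"]:
        # UTF-16 can represent the character; the Latin1 code page may not.
        utf16 = sample.encode("utf-16-le")
        round_trip = utf16.decode("utf-16-le") == sample
        print(repr(sample), "cp1252:", fits_code_page(sample), "utf-16 round trip:", round_trip)

    # A code-point comparison treats case and accents as differences,
    # which is the behaviour asked for in the question.
    print("a" == "A", "e" == "é")
```

Under these assumptions the two sample characters from the question survive only in the UTF-16 form, which matches the advice to use NCHAR or NVARCHAR columns.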

Illegal mix of collations for operation 'concat'

I'm trying to execute this CONCAT query in MySQL:
SELECT CONCAT(if(fName,fName,''),Name)
From Student
Error:
#1271 - Illegal mix of collations for operation 'concat'
This is due to a difference in collations. You can solve it by converting the two strings or columns
to one character set, say utf8:
CONCAT(CAST(fName AS CHAR CHARACTER SET utf8),CAST('' AS CHAR CHARACTER SET utf8))
This will solve it :)
You can read more about casting in the MySQL documentation (MySQL Casting).
The charsets and/or collations you use in your connection do not match the charset/collation in your table.
There are 4 solutions:
1- Change the charset in your connection:
-- find out the collation (and thus the charset) used in your table
SHOW TABLE STATUS LIKE 'student'
-- set the connection charset to match
SET NAMES 'charset_name' [COLLATE 'collation_name']
2- Change the charset used in your table to match the server charset:
-- find out the charset used in the server
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
-- change the charset used in the table
ALTER TABLE student ......
3- Change the default charset settings and restart MySQL
Edit my.ini (my.cnf on Unix-like systems) and replace the character_set_* options so that they match your tables.
4- Change the charset settings for your connection
Your client can override the charset and collation settings.
If it does not, option 1 or 3 should fix your issue. If the connection does override these settings, you need to check the connection string and edit the charset/collation settings to match your database.
Some advice:
Pick a charset (I recommend utf8) and a collation (I recommend utf8_general_ci), and use them consistently everywhere.
A concatenation can only work if the collation of all used values matches OR you use a collation that all collations are a subset of (from a logical standpoint).
If you want to concatenate text, every piece of text should have the same collation. Take a look at the collation the database uses, then take a look at the collation that your connection uses:
show variables like '%coll%'
The collation_connection should match the collation of the table columns you are trying to concatenate. If it doesn't, the error occurs.
You can then change the connection collation to match that of the table.
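The comparison described above can be automated once the output of show variables like '%coll%' has been read into a dictionary. The sketch below is only an illustration: the function name is hypothetical, the variable values in the example are made up, and fetching the rows from MySQL is left to whatever client library you use.

```python
def collation_mismatches(variables, table_collation):
    """Return the collation variables that differ from the table collation.

    variables: dict built from the rows of SHOW VARIABLES LIKE '%coll%',
    e.g. {'collation_connection': 'utf8_general_ci', ...}.
    """
    return {
        name: value
        for name, value in variables.items()
        if value != table_collation
    }


if __name__ == "__main__":
    # Made-up example values, not taken from a real server.
    example = {
        "collation_connection": "latin1_swedish_ci",
        "collation_database": "utf8_general_ci",
        "collation_server": "utf8_general_ci",
    }
    print(collation_mismatches(example, "utf8_general_ci"))
```

An empty result means the connection, database and server all agree with the table; anything else points at the setting to change.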
It looks like you are misusing the IF function there: it results in an undefined data type, so the CONCAT operation fails because the data types differ. Try changing the query to use IFNULL instead.
Try this query:
SELECT concat(ifnull(fName,''),Name) From Student
See the demo here: http://www.sqlize.com/kfy85j8f1e
For another reference, see also http://forums.mysql.com/read.php?10,225982,225982#msg-225982
It can also be caused by your client library being too old for the MySQL server.
We had a similar problem with LIKE and the character "ő" when using PHP MySQL library version 5.1.52 with MySQL server version 5.5.22.
The problem went away after upgrading the client library.

SQL Server localization

Does SQL Server take over the localization from the server it's installed on? Or can you define the locale for each instance/database?
Which setting is responsible for having comma or period when a double is saved to the database?
SQL Server has a server collation and each database can either use the server collation or can be set to a different collation.
The format of the data type is taken from the database collation, provided that a collation has not been explicitly set for the column.
SQL Server Collations
Remember that if you use different collations for columns that you are trying to compare, you will need to use COLLATE, and that will turn the argument into a "non-searchable argument"; that is, indexes will not be used to satisfy that statement.

SSIS - Data from MySql to MsSql some characters are?

I just transferred some data from MySql to MsSql (2K5). In a text field, some of my characters, such as apostrophes, are now ? (question marks). To me this indicates some sort of collation or character set error, right?
To be honest, I don't know which one I should be using.
The MySql db's current collation is utf8_general_ci, and in MS SQL it is SQL_Latin1_General_CP1_CI_AS.
I have tried changing the collation of the MySQL table to latin1_swedish_ci; however, this doesn't help.
Thanks for the input
Have you tried changing the target (SQL Server) column data type to NVARCHAR?
The utf8_general_ci collation on the MySQL column indicates a Unicode data type. If the source is Unicode, so should be the target - for the easiest transition.
Collations themselves play a minor role here. They just affect comparison and sorting.
You might also need to check the SSIS type of the columns in your dataflow. Remember, the data type and character set is set at the connection manager on the source (and that may involve a conversion from the original native character set). Also, any operations like derived columns or conversions will have a character set which can be altered and will persist down that column's lineage in the data flow. At the end when it gets to the destination, there could be additional character set coercion/conversion.
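The question marks themselves are what a lossy conversion to a single-byte code page typically produces. The sketch below reproduces the effect outside SSIS; it assumes that Python's cp1252 codec approximates the code page of SQL_Latin1_General_CP1_CI_AS and that the conversion substitutes '?' for unmappable characters. The function names and sample strings are hypothetical.

```python
def simulate_narrowing(text, code_page="cp1252"):
    """Convert text to a single-byte code page and back, replacing
    characters the code page cannot represent with '?'."""
    return text.encode(code_page, errors="replace").decode(code_page)


def lost_characters(text, code_page="cp1252"):
    """List the characters that would be turned into '?'."""
    lost = []
    for ch in text:
        try:
            ch.encode(code_page)
        except UnicodeEncodeError:
            lost.append(ch)
    return lost


if __name__ == "__main__":
    for sample in ["café", "erdő"]:
        print(sample, "->", simulate_narrowing(sample), lost_characters(sample))
```

Running a check like lost_characters over a sample of the source data before the transfer shows whether a VARCHAR target is sufficient or whether the column needs to be NVARCHAR, as suggested above.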