SQL Server 2008 collation - mysql

I am moving a table from MySQL to SQL Server 2008 that holds a mixture of languages in one column, e.g. English, Français, Ελλάδα.
When I do this, I either get the Greek characters represented by ????? or I lose the French/Spanish accents.
I have set my columns up as nvarchar for Unicode and played around with the collations, but I cannot seem to figure this one out.

It turns out my problem was with the actual insert script I was running. When using an NVARCHAR field, you need to add an "N" prefix to string literals in the insert, i.e. myColumn = N'Ελλάδα'.
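To see why the missing "N" produces question marks, here is a rough sketch in Python rather than T-SQL. It assumes that a literal without the prefix gets converted to a single-byte code page before it is stored, and it assumes that code page is Windows-1252 (your database's code page may differ). The helper name through_code_page is made up for this illustration.

```python
# Hypothetical illustration: a string literal without the N prefix is treated
# as varchar and converted to the database code page. cp1252 is an assumption.
def through_code_page(text: str, code_page: str = "cp1252") -> str:
    """Round-trip text through a single-byte code page, replacing what does not fit."""
    return text.encode(code_page, errors="replace").decode(code_page)

greek = through_code_page("Ελλάδα")     # Greek letters are not in cp1252
french = through_code_page("Français")  # "ç" is in cp1252, so it survives

print(greek)
print(french)
```

With a Greek code page the outcome would flip: the Greek text would survive and the accented Latin letters would be the ones at risk, which matches the "either/or" symptom described in the question.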

What SQL commands are you using?
When inserting into MS SQL you need to use the COLLATE keyword for each column that has a special collation.

You should ensure that the MySQL characters are correctly converted to UCS-2 (the Unicode encoding used by SQL Server) from whatever encoding you have in MySQL (probably UTF-8?).
See, for example, "Insert UTF8 data into a SQL Server 2008".
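As a minimal sketch of that conversion step, assuming the MySQL side hands you UTF-8 bytes and the SQL Server side wants two-byte little-endian code units (UTF-16LE, which coincides with UCS-2 for characters like these), the transcoding itself is a decode followed by an encode. The function name utf8_to_utf16le is hypothetical; in practice a database driver usually does this for you when you pass it proper Unicode strings.

```python
# Sketch only: transcode raw UTF-8 bytes into UTF-16LE bytes.
def utf8_to_utf16le(raw: bytes) -> bytes:
    """Decode UTF-8 bytes to text, then encode the text as UTF-16 little-endian."""
    return raw.decode("utf-8").encode("utf-16-le")

mysql_bytes = "Ελλάδα".encode("utf-8")        # what a UTF-8 MySQL column would hold
sqlserver_bytes = utf8_to_utf16le(mysql_bytes)

# Decoding the result as UTF-16LE gives the original text back.
print(sqlserver_bytes.decode("utf-16-le"))
```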

Related

How to display special characters in SQL server 2008?

I am using SQL Server 2008 and have the column in my table set to nvarchar. Data with special characters is getting stored wrongly in this table. E.g. this is one entry:
Need to check if doesn’t comes as doesn’t itself and don’t comes as don’t itself and ensure closure of issues.
The garbage ’ should actually be an apostrophe ('). I have checked my collation string. At database level it is SQL_Latin1_General_CP850_BIN2 and at server level it is SQL_Latin1_General_CP1_CI_AS.
I know for sure the encoding set everywhere else in my application is UTF-8.
How do I store the data correctly in my table? Do I need to change my SQL queries or any settings in the database?
Please advise.
You need to make sure that you're observing two things:
Always use NVARCHAR as datatype for your columns
Always make sure to use the N'....' prefix when dealing with string literals (for example in your INSERT or UPDATE statements)
With those two things in place, SQL Server has no trouble at all storing all Unicode characters you might throw at it...
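For what it's worth, the "’" garbage in the question is what you get when the UTF-8 bytes of a curly apostrophe are read as Windows-1252. The Python sketch below reproduces it and reverses it. That the misreading happened as cp1252 is an assumption based on the characters shown, and both function names are made up for the example.

```python
# Assumption: UTF-8 bytes were interpreted as cp1252 somewhere on the way in.
def misread_utf8_as_cp1252(text: str) -> str:
    """Simulate the damage: encode as UTF-8, then decode the bytes as cp1252."""
    return text.encode("utf-8").decode("cp1252")

def repair(garbled: str) -> str:
    """Undo the damage, valid only if the text was garbled exactly once this way."""
    return garbled.encode("cp1252").decode("utf-8")

garbled = misread_utf8_as_cp1252("doesn\u2019t")  # \u2019 is the curly apostrophe
restored = repair(garbled)

print(garbled)
print(restored)
```

Repairing stored rows like this is a one-off cleanup; the NVARCHAR column plus the N'...' prefix (or parameterized queries) is what keeps new rows from being damaged.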

Unicode Comparing in PHP/MySQL

The name Accîdent seems to be different than AccÎdent when I do a database query to update the column. Yet Accîdent and AccÎdent point to the same place...
In MySQL Accîdent = Accîdent when inserted.
Also, AccÎdent = AccÃŽdent.
Do you know why this is?
By default, MySQL assumes the client uses the latin1 character set. If you're using UTF-8 in your PHP scripts, then this assumption is false. You need to specify to MySQL that you're using UTF-8 by issuing this SQL statement just after the database connection is opened:
SET NAMES utf8
Then the data inserted by the following SQL statements will use the correct character set. This means that you need to re-insert your data or follow the MySQL conversion procedure (see the last paragraphs).
It is recommended that your tables are configured to store data in UTF-8, too, to avoid unnecessary read/write character set conversions. That's not required, though.
More information is available in the MySQL documentation. Specifically, Connection Character Sets and Collations.
First, you seem to be storing UTF-8 data in a table of different encoding. MySQL will try and cope, but the side effect is as you see - data in the database will look "weird". When creating a table, you need to specify the character encoding - preferably UTF-8. For existing tables, you'll need to convert the data.
Second, the tables have a "collation" besides an encoding. The encoding determines how the characters map to bytes; the collation determines sorting and comparison. There are language-specific collations, but utf8_general_ci should be the one you're looking for (ci stands for "case insensitive"), and then your two strings would match.
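Both effects can be mimicked outside the database. The Python sketch below assumes the client's UTF-8 bytes were taken as MySQL's latin1, which is effectively Windows-1252, hence "cp1252" in the code. It uses casefold() as a rough stand-in for a case-insensitive collation, not as an exact model of utf8_general_ci. The helper name as_latin1_client is hypothetical.

```python
# Assumption: MySQL treated the client's UTF-8 bytes as latin1 (cp1252).
def as_latin1_client(text: str) -> str:
    """Show how UTF-8 text looks after being misread as cp1252."""
    return text.encode("utf-8").decode("cp1252")

lower_garbled = as_latin1_client("Accîdent")
upper_garbled = as_latin1_client("AccÎdent")

# A case-insensitive comparison of the intact strings treats them as equal.
same_ignoring_case = "Accîdent".casefold() == "AccÎdent".casefold()

print(lower_garbled, upper_garbled, same_ignoring_case)
```

Once the text is garbled, the two names no longer differ only by case, which is consistent with the update query treating them as different values.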

Convert database font from MS SQL to mysql utf8?

I have an old database on a Windows dedicated server, and I have now bought a new Linux dedicated server with PHP and MySQL.
I plan to use PHP to pull the data out of the MS SQL Server row by row and put it into the MySQL database.
The problem is that MySQL is using utf8_unicode_ci and I don't know which charset MS SQL Server uses.
Thanks for the help.
Have you tried just running your code? Odds are it'll "Just work".
Caveats below:
You may run into issues in your data (although this is highly unlikely) because the character set you're referring to is actually a collation. That is, it defines the string "ABCDEFGH" to be equal to "abcdefgh". The "_ci" part of utf8_unicode_ci means it's case insensitive.
Some quick googling suggests that MySQL defaults to a case-insensitive collation, which is good, because SQL Server's common default does the same. You should check the collation of the SQL Server database; if it's "SQL_Latin1_General_CP1_CI_AS" (CI = case-insensitive, AS = accent-sensitive), you should be good.
SQL Server stores character data in extended ASCII (i.e. a code page that depends on Windows and on the code pages/encodings installed and used on the server machine) for the non-Unicode types (char, varchar, text, etc.) and in Unicode for the Unicode types (nchar, nvarchar, ntext, etc.). I believe the internet has plenty of material on this FAQ topic.

SQL Server localization

Does SQL Server take over the localization from the server it's installed on? Or can you define the locale for each instance/database?
Which setting is responsible for having comma or period when a double is saved to the database?
SQL Server has a server collation and each database can either use the server collation or can be set to a different collation.
The format of the datatype will be taken from the database collation, provided that a collation has not been explicitly set for the column.
SQL Server Collations
Remember that if you use different collations for columns that you are trying to compare, you will need to use COLLATE, and that will make the argument "non-searchable"; that is, indexes will not be used to satisfy that statement.

SSIS - Data from MySql to MsSql some characters are?

I just transferred some data from MySQL to MS SQL (2K5). In a text field, some of my characters, such as apostrophes, are now ? (question mark). To me this indicates some sort of collation or character set error, right?
To be honest, I don't know which one I should be using.
The MySQL db's current charset is utf8_general_ci and in MS SQL it is SQL_Latin1_General_CP1_CI_AS.
I have tried changing the charset of the MySQL table to latin1_swedish_ci, but this doesn't help.
Thanks for the input.
Have you tried changing the target (SQL Server) column data type to NVARCHAR?
The utf8_general_ci collation on the MySQL column indicates a Unicode data type. If the source is Unicode, so should be the target - for the easiest transition.
Collations themselves play a minor role here. They just affect comparison and sorting.
You might also need to check the SSIS type of the columns in your dataflow. Remember, the data type and character set is set at the connection manager on the source (and that may involve a conversion from the original native character set). Also, any operations like derived columns or conversions will have a character set which can be altered and will persist down that column's lineage in the data flow. At the end when it gets to the destination, there could be additional character set coercion/conversion.