Change the encoding of a MySQL database - mysql

I am having an issue with a MySQL database. I have two copies of it: development and production. Today I realised that:
The production database has the encoding format utf8_general_ci
The development database has the encoding format latin1_swedish_ci.
My question:
Is it possible to change the encoding of my development database to look like the production one? How can I perform that in a easy way?
And, for the noobs like me, what are the main differences between the two formats. Which one would you recommend? Which one is the most standard on the industry?
Thanks in advance.

question 1: Yes, it's possible
ALTER DATABASE `test` CHARACTER SET utf8 COLLATE utf8_general_ci;
but You have to be carefull if nothing will change after switch. Only english, will change easyly... special characters could be a problem.
question 2: You have to read more: http://dev.mysql.com/doc/refman/5.5/en/charset-general.html because there are issues like sorting, charset range...

Related

Should I alter my non utf-8 database (mysql) (default?) to utf-8 on a running Drupal 7 site?

I've got running Drupal 7 site on MySQL. Just recognised that I have the Database Encoding (and collation) other than UTF-8 while Drupal 7 docs say it needs UTF-8 (with utf8_general_ci collation). At the same time all my tables are the necessary utf-8 and utf8_general_ci. (I guess drupal setup did it this way).
My questions are:
should I leave the whole system as it is or should I just alter my database to required encoding or is it necessary to convert anything after I altered the database to utf-8?
Would leaving the whole system as it is cause me any trouble in the future?
Is this setting for the database just a default and it doesn't matter at all for me as all my tables are set to the proper utf-8?
Thanks
You may or may not already be in trouble.
When you connect, do you establish the connection as being utf8? (SET NAMES utf8, or whatever Drupal does to achieve that.)
Is the data in your client encoded utf8? (For English, this does not matter.)
The CHARACTER SET of every column that currently exists is already set in stone. Even if some columns are latin1 and some are utf8, the client will see them as whatever SET NAMES has established. The inconsistency does not 'hurt'. At least not until you try to store Arabic in a latin1 column.
I would not blindly try to change anything without understanding how to correctly change it. Some attempts could make things worse.

Arabic characters doesn't show in phpMyAdmin

So I am currently working on a certain project where I need to create a database in which its records will hold both English and Arabic names.
I am creating this using PhpMyAdmin where it works perfectly fine for English names, however all the Arabic names appear as "?????".
To solve this issue I tried to use "set name 'utf8' ", however it didn't work. Googling this problem I realized that PhpMyAdmin does not support either Arabic or Special characters.
I am not sure if there is any workaround for this issue. Do you have any suggestion to solve it ?
Thanks in advance
First, is your database capable of storing Unicode? SHOW CREATE TABLE table_name; will hopefully show your character set as utf8. If not this should fix it:
ALTER TABLE table_name DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
Also make sure your PHPMyAdmin settings contain this:
$cfg['DefaultCharset'] = 'utf_8';
$cfg['DefaultConnectionCollation'] = 'utf8_general_ci';
phpMyAdmin has no problem with UTF8, which as far as I can tell means Arabic has been fully supported for some time. phpMyAdmin just shows (accurately) what is stored in your database; if you're not seeing what you expect it's almost always because your application is misbehaving. Perhaps your Google search turned up quite old information; I'm not sure how long phpMyAdmin has supported these special characthers but looking at the development log file it seems that it's been at least since 2008, and almost certainly even prior to that. Anyway:
The phpMyAdmin wiki has considerable detail on the matter and is a good place to start. There's also a quite comprehensive guide here at Stackoverflow, or this link to another very similar question. You can read more about properly setting the application charset here, and I'll refer you once again to the phpMyAdmin wiki for information on recovering/repairing the situation.
To summarize: the problem is almost certainly in how your application stores data, not how phpMyAdmin displays it. Make sure your database and tables are using a utf character set. In your application code, make sure you set your connection charset properly. Recovery is rather painless and can be achieved by switching the column charset first to binary then whatever utf8-variant makes the most sense for you.
Add these 2 line at bottom of the my.ini file.
then restart the wamp server.
character-set-server=utf8
collation-server=utf8_general_ci
Fisrt of all navigate to the following link: http://localhost/phpmyadmin/index.php
enter image description here
make the Server connection collation: utf8_unicode_ci
And all the Arabic data fields will be displayed in the phpMyAdmin Databases.

mysql utf8mb4 turkish character confusion

I am inserting some datas including turkish characters into a table and when I insert a data appearance of turkish characters confused me. For example for the character 'ğ', appearance on database is 'ð'. But when I get data and print it on the browser for testing this character shows up to be correct 'ğ' character. Do I have to worry about the situation ? (Btw I am using utf8mb4 as described in title)
If you are using phpmyadmin to see database entries, it is a common misconception.
You shouldn't worry and you can configure it from php.ini and phpmyadmin's settings. But as you said it is not much of a problem. I reccomend you to use toad or navicat for mysql for better results.

Accent insensitive search on a problematic database

I have a database that contains data in different languages. Some languages use accents (like áéíóú) and I need to search in this data as the accents doesn't exist (search for 'campeon' should return 'campeón' as a valir result).
The problem is that the tables in my database (utf8_unicode_ci) are not storing utf8 characteres. If you see the data through phpmyadmin the words with accents looks like this: campeón
After some researching, I've found (in a StackOverflow question) that the problem is related to the inexistence of a SET NAMES [charset]. In fact, I've made some testings and if I set names to utf8, everything works as expected.
Well, I have the solution, what's the problem? The problem is that the database is in production, so there are thousands of strings in the database. If I change the character set the client will use, all already existing string will become invalid. The question is: is there any way to:
perform accent-insensitive searches in a database that uses a wrong charset like mine?
transform safely the data in the tables to the appropriate charset?
continue working with mixed charsets (latin1 and utf8) in the database, assuming that latin1 data will not be accent-insensitive?
If anybody has experience in any of the solutions I propose or has a new one, I'll be very thankful if share.
The problem being that the data was inserted using the wrong connection encoding, you can fix it by
Exporting the data using the wrong connection encoding, just like you have used it thus far, followed by
Importing the data using the correct utf8 connection encoding.
That will fix the encoding problem, after which search will work as expected.
What if you create a copy of the table at the beginning of your session, alter the copy's charset, perform all your queries from that, and then drop the table at the end of your session? I don't know how practical this would be - depends on how often you need to perform these queries and how big the table is.

Should I migrate a MySQL database with a latin1_swedish_ci collation to utf-8 and, if so, how?

The MySQL database used by my Rails application currently has the default collation of latin1_swedish_ci. Since the default charset of Rails applications (including mine) is UTF-8, it seems sensible to me to use the utf8_general_ci collation in the database.
Is my thinking correct?
Assuming it is, what would be the best approach to migrate the collation and all the data in the database to the new encoding?
UTF-8, as well as any other Unicode encoding scheme, can store characters in any language, so it is an excellent choice of codepage for your database.
The collation setting, on the other hand, is a completely separate issue from the encoding scheme. It involves sort orders, upper/lowercase conversions, string equality comparisons, and things like that which are language-specific. The collation setting should match the language that is used in the database.
The UTF-8 general collation is (I am assuming here—I'm not familiar with MySQL in particular) used for situations where the language is unknown and some simple default ordering is needed. It probably corresponds to the Unicode code point ordering, which is almost certainly not what you want if you're storing Swedish.
Convert to UTF-8 as the charset.
Collation settings are only used for sorting and stuff like that. Choose the collation that most of your users would expect.
Providing your existing data in the database is CORRECTLY encoded in latin1, converting the tables to utf8 (using ALTER TABLE, as described in the docs) should just work.
Then all your application needs to do is continue doing whatever it did before. If your application wants to use unicode characters, it should set its connection encoding to utf8 and use utf8, but that's its own problem.
The problem is that a large number of crap web apps have historically sent utf8 data to mysql and told it to treat it as latin1. MySQL will honour this perfectly and save junk into the tables, as instructed.
Converting the tables from latin1 to utf8 will NOT repair this mistake, as you genuinely do have total rubbish in there. Repairing them is nontrivial, particularly if during the lifetime of the app it's been talking different types of rubbish to the database.
Use below mysql query to convert your column :
ALTER TABLE users MODIFY description VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci;
To see full details about your table :
SHOW FULL COLUMNS FROM users;