MySQL database with wrong character set and LONGTEXT columns with binary data - mysql

I have a Percona XtraDB 5.6 server with a very old database whose charset is set to utf8 but whose data is encoded in a different character set (probably latin1).
I tried to migrate the database to a new Percona 8.0 server, but after importing the SQL file, all diacritic marks were broken on the 8.0 server. I resolved the issue by executing this query on every column in every table:
UPDATE table SET col = convert(cast(convert(col using latin1) as binary) using UTF8) WHERE 1;
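Running that statement by hand on every column does not scale. As a sketch (the table and column names below are illustrative placeholders, not from the original database), the statements can be generated with a small shell loop and reviewed before being fed to mysql:

```shell
# Generate the per-column conversion statements for a list of
# table.column pairs (names are illustrative placeholders).
COLUMNS="articles.title articles.body users.name"
for tc in $COLUMNS; do
  table=${tc%%.*}
  col=${tc##*.}
  printf 'UPDATE `%s` SET `%s` = CONVERT(CAST(CONVERT(`%s` USING latin1) AS BINARY) USING utf8);\n' \
    "$table" "$col" "$col"
done > fix_charset.sql
# Review fix_charset.sql, then run e.g.: mysql mydb < fix_charset.sql
```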
BUT there is one table with binary data (specifically GZIP-compressed data) stored in LONGTEXT columns. Data from these columns always ends up damaged after import on the new server.
This is what I have tried so far:
changing the column type to LONGBLOB before the dump.
using the above query to convert the data before/after the column type change.
This is the command I'm using to export DB:
mysqldump --events --routines --triggers --add-drop-database --hex-blob --opt --skip-comments --single-transaction --skip-set-charset --default-character-set=utf8 --databases "%s" > db.sql
Please note the --hex-blob option, which still results in the binary data being exported as strings instead of hex.
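One likely reason for that: --hex-blob only applies to binary column types (BINARY, VARBINARY, the BLOB types), so LONGTEXT data still comes out as strings; the column must actually be LONGBLOB at dump time. The value of hex encoding itself is easy to demonstrate: hex text survives any amount of charset handling, as in this sketch where xxd stands in for what --hex-blob does to a LONGBLOB column:

```shell
# Round-trip gzip data through a hex representation, the way
# --hex-blob would for a LONGBLOB column.
printf 'zażółć gęślą jaźń' | gzip -c > blob.gz
HEX=$(xxd -p blob.gz | tr -d '\n')        # pure [0-9a-f], charset-proof
printf '%s' "$HEX" | xxd -r -p | gzip -dc > restored.txt
cat restored.txt
```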

It would not have been damaged by gzip/gunzip, but it could have been damaged in any number of other ways.
--hex-blob turns binary data into hex strings so that it will not be mangled until you reload it.
Dumping, loading, INSERTing, and SELECTing all need to be told which character set to use.
The particular UPDATE you ran may or may not have made things worse. Here is a list of the cases I have identified, each with its minimal fix:
http://mysql.rjweb.org/doc.php/charcoll#fixes_for_various_cases

Related

Exporting latin1 database and import as utf8 / convert to utf8

I am working on a website that uses an old database running on an external MySQL 4.1 server (Server A). The database uses the latin1_swedish_ci collation, as do the tables and columns. There is a new server B running MySQL 5 that is to replace server A. The encoding should be utf8_unicode_ci.
I export the DB on Server A:
mysqldump -u root -p --opt --quote-names --skip-set-charset --default-character-set=latin1 db_a -r db_a.sql
Transfer db_a.sql via scp from server A to server B
Replace latin1 with utf-8
sed -e 's/CHARSET\=latin1/CHARSET\=utf8\ COLLATE\=utf8_general_ci/g' db_a.sql > db_a2.sql
Convert file to utf-8
iconv -f latin1 -t utf8 db_a2.sql > db_a3.sql
Import db_a3.sql
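The sed and iconv steps can be rehearsed on a one-line sample before touching the real dump (the sample content here is made up):

```shell
# Build a small latin1-encoded sample standing in for db_a.sql.
printf 'CREATE TABLE t (x TEXT) DEFAULT CHARSET=latin1; -- motörhead\n' \
  | iconv -f utf-8 -t latin1 > db_a.sql
# Step: rewrite the table charset in the DDL.
sed -e 's/CHARSET=latin1/CHARSET=utf8 COLLATE=utf8_general_ci/g' db_a.sql > db_a2.sql
# Step: re-encode the file contents to UTF-8.
iconv -f latin1 -t utf-8 db_a2.sql > db_a3.sql
cat db_a3.sql
```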
In phpMyAdmin everything is displayed correctly, but the new client application shows artifacts in the text columns.
I have tried different variations of the steps above without success, including importing as latin1 and using MySQL's CONVERT() function. Does anyone know a solution to my problem?
It would be better to load it as latin1, then fix things afterward.
However, this is not straightforward, because there are multiple scenarios to consider. See this: http://mysql.rjweb.org/doc.php/charcoll#fixes_for_various_cases
Note in particular that there are at least two different ways to do the ALTER. If you pick the wrong one, the data will become garbled even worse.
To see what you have, use this for a sample of the data:
SELECT col, HEX(col) FROM ... WHERE ...
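For reference, here is a sketch of what the HEX() output looks like for 'é' when it is stored correctly versus double-encoded; the byte patterns can be reproduced locally without a server (od and iconv stand in for what MySQL does):

```shell
# The UTF-8 bytes of 'é', as HEX() would show them for correct data.
GOOD=$(printf 'é' | od -An -tx1 | tr -d ' \n')
# Double encoding: the UTF-8 bytes are misread as latin1 and
# re-encoded to UTF-8, so C3A9 becomes C383C2A9.
DOUBLE=$(printf 'é' | iconv -f latin1 -t utf-8 | od -An -tx1 | tr -d ' \n')
echo "correct: $GOOD  double-encoded: $DOUBLE"
```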

Selectively dumping data with mysqldump

I am trying to export my database using mysqldump, and the SQL file should satisfy the following conditions:
The file should not contain data from table_x (keep the structure).
Skip data older than 10 days from table_y (keep the structure).
The conditions may grow in the future for other tables.
This dump file will be used in a local environment and will serve as a replacement for the production database.
Also, is there a way to write all these conditions in a file?
mysqldump is what you want:
http://dev.mysql.com/doc/refman/5.7/en/mysqldump.html
mysqldump --single-transaction --host=localhost --user=MyUser --password=MyPassword MyDatabase MyTable --where="mydatefield > 'insert-10 day old date here'"
mysqldump --single-transaction --host=localhost --user=MyUser --password=MyPassword --ignore-table=MyDatabase.MyTable --ignore-table=MyDatabase.MySecondIgnoredTable
This answer could be improved by replacing the 'insert-10 day old date here' placeholder with a $(...) shell substitution that computes the date 10 days before today.
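A sketch of that substitution, assuming GNU date (the database, table, and field names are the placeholders from the answer above):

```shell
# Compute the cutoff date 10 days ago.
# GNU date shown here; on BSD/macOS use: date -v-10d +%F
CUTOFF=$(date -d '-10 days' +%F)
# The dump command would then be (shown, not executed here):
CMD="mysqldump --single-transaction MyDatabase MyTable --where=\"mydatefield > '$CUTOFF'\""
echo "$CMD"
```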
Specifying the password on the command line can be a security risk and is discouraged, especially in scripts or in a shared-hosting environment. Check the manual linked above for the slightly more involved process of creating a credentials file to use with the command.
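A minimal sketch of such a credentials file (the file name and account details are illustrative); note that --defaults-extra-file must come before any other option on the command line:

```shell
# Write a client option file so the password never appears in
# `ps` output or in shell history.
cat > ./dump.cnf <<'EOF'
[client]
user=MyUser
password=MyPassword
EOF
chmod 600 ./dump.cnf
# Then, e.g.: mysqldump --defaults-extra-file=./dump.cnf MyDatabase > dump.sql
```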
If you have MyISAM tables instead of InnoDB, you'll want to swap --single-transaction for --skip-add-locks.
I realize this question is from a year ago, but someone might still find this useful.

How can I move a database from one server to another? Completely

I'm transferring a complete database from an online server to a localhost server, but not all records are being transferred. Is there any way to transfer the complete data with the same number of rows?
I tried Navicat, exporting and importing single tables, and importing/exporting .sql and gzip files, but all the results differ.
My hosting is shared.
The software on localhost is XAMPP.
You can try mysqldump.
mysqldump -h hostname -u user -pPassWord --skip-triggers --single-transaction --complete-insert --extended-insert --quote-names --disable-keys dataBaseName > DUMP_dataBaseName.sql
then move your file DUMP_dataBaseName.sql to your localhost, and:
mysql -hHost -uUser -pPass -DBase < DUMP_dataBaseName.sql
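Since gzip came up in the question: compression itself cannot be the cause of missing rows, because the round trip is lossless, as this small sketch shows (the dump content is made up):

```shell
# Simulate compressing a dump for transfer and restoring it.
printf 'INSERT INTO t VALUES (1),(2),(3);\n' > dump.sql
gzip -c dump.sql > dump.sql.gz
# (scp dump.sql.gz to the other machine would happen here)
gzip -dc dump.sql.gz > restored.sql
cmp -s dump.sql restored.sql && echo "identical"
```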
The rows are probably not missing. If your tables use InnoDB, the row counts shown in table overviews are only estimates; open each table (or run SELECT COUNT(*)) to see the exact number of rows.
One issue I have run into countless times when moving WordPress sites is special characters (specifically single and double quotes). The database exports fine, but upon import it breaks at an "illegal" quote. My workflow now consists of exporting the database, running a find-and-replace on the SQL file to filter out the offending characters, and then importing.
Without knowing more about your specific situation, that's just something I would look into.

Corrupted international characters because of no "SET NAMES utf8;" in TYPO3

I have a character-encoding problem on one of my Polish TYPO3 sites. The setDBinit = SET NAMES utf8; parameter was not set in the configuration.
Everything works fine (frontend & backend) except exporting from the database. All international characters are corrupted when I browse the database via phpMyAdmin or try to export it with data.
The official page http://wiki.typo3.org/UTF-8_support#SET_NAMES_utf8.3B says:
Without SET NAMES utf8; your TYPO3 UTF-8 setup might work, but chances are that database content entered after the conversion to UTF-8 has each international character stored as two separate, garbled latin1 chars.
If you check your database using phpMyAdmin and find umlauts in new content being shown as two garbled characters, this is the case. If this happens to you, you cannot just add the above statement any more. Your output for the new content will be broken. Instead you have to correct the newly added special chars first. This is done most easily by just deleting the content, setting the option as described above and re-entering it.
Is there any other way to repair the corrupted characters? There is a lot of content to edit now...
I have tried almost every combination of export encoding, converting to other encodings, and so on, and so far I have failed.
You can try mysqldump to convert from ISO-8859-1 to UTF-8:
mysqldump --user=username --password=password --default-character-set=latin1 --skip-set-charset dbname > dump.sql
chgrep latin1 utf8 dump.sql (or, if you prefer sed: sed -i 's/latin1/utf8/g' dump.sql — on macOS/BSD sed use sed -i '' instead)
mysql --user=username --password=password --execute="DROP DATABASE dbname; CREATE DATABASE dbname CHARACTER SET utf8 COLLATE utf8_general_ci;"
mysql --user=username --password=password --default-character-set=utf8 dbname < dump.sql
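One caveat with step 2: a blanket s/latin1/utf8/ also rewrites row data that merely contains the word latin1. Anchoring the pattern on CHARSET= limits the replacement to table definitions, as this sketch shows (the sample dump content is made up):

```shell
# A line of DDL plus a data row that happens to mention "latin1".
printf 'CREATE TABLE t (x TEXT) DEFAULT CHARSET=latin1;\nINSERT INTO t VALUES ("see the latin1 manual");\n' > dump.sql
# Replace only the charset declaration, not the row data.
sed -e 's/CHARSET=latin1/CHARSET=utf8/g' dump.sql > dump_utf8.sql
cat dump_utf8.sql
```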

mysql dump - character encoding

I have a MySQL dump that contains non-ASCII characters like überprüft. My problem is that I cannot make another dump, and every suggestion I have found involves setting up another dump as UTF-8. Is there a way to convert an existing dump file?
Is the entire dump encoded like that, that is, in UTF-8? If so, you can simply set the encoding when you import the dump.
If you use the mysql command-line client to import the dump, use the --default-character-set switch, for example:
> mysql -u user --default-character-set=utf8 < dump.sql
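Before importing, it is worth checking whether the file really is valid UTF-8; iconv can serve as a validator, as in this sketch (the sample file stands in for the real dump):

```shell
# A stand-in for the real dump file.
printf 'INSERT INTO t VALUES ("überprüft");\n' > dump.sql
# iconv exits non-zero at the first invalid byte sequence.
if iconv -f utf-8 -t utf-8 dump.sql > /dev/null 2>&1; then
  echo "valid UTF-8"
else
  echo "not UTF-8; convert first, e.g.: iconv -f latin1 -t utf-8 dump.sql > dump_utf8.sql"
fi
```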