MySQL 1300 Invalid big5 character string: ' \xC3\x97 ' - mysql

I have no clue what's going on here. One of my DB servers is giving me that error when trying to create a function (that works on all the other servers):
My function is
delimiter $$
CREATE DEFINER=`root`@`%` FUNCTION `getreadablesize`(`Width` DECIMAL(13,4),`Height` DECIMAL(13,4),`Type` VARCHAR(64)) RETURNS varchar(64) CHARSET utf8mb4 COLLATE utf8mb4_unicode_ci
BEGIN
RETURN concat(trim(trailing'.'
from trim(trailing'0'
from`Width`)),'\"',
if(`Height`>0,concat(' × ',trim(trailing'.'
from trim(trailing'0'
from`Height`)),'\"'),''),
if(`Type`>'',concat(' ',`Type`),''));
END$$
And the exact error message is
0 row(s) affected, 2 warning(s):
1300 Invalid big5 character string: ' \xC3\x97 '
1300 Invalid big5 character string: 'C39720'
None of my databases are in Chinese, and nothing ever uses the Big5 character set.
If I copy my schema create code, I get this:
CREATE DATABASE `sterling` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci */;
EDIT: It works if I change the times symbol to something else, but that still doesn't explain why it's being treated as Big5 and not utf8.
EDIT: Something is apparently quite wrong. If I run the following query, this is what I get:
SHOW VARIABLES LIKE 'character_set%';
+-----------------------------+------------------------------+
| 'character_set_client', | 'big5' |
+-----------------------------+------------------------------+
| 'character_set_connection', | 'big5' |
+-----------------------------+------------------------------+
| 'character_set_database', | 'utf8mb4' |
+-----------------------------+------------------------------+
| 'character_set_filesystem', | 'binary' |
+-----------------------------+------------------------------+
| 'character_set_results', | 'big5' |
+-----------------------------+------------------------------+
| 'character_set_server', | 'utf8mb4' |
+-----------------------------+------------------------------+
| 'character_set_system', | 'utf8' |
+-----------------------------+------------------------------+
| 'character_sets_dir', | '/usr/share/mysql/charsets/' |
+-----------------------------+------------------------------+
But my my.cnf clearly has default-character-set = utf8mb4 and all the variants of it under each applicable section... I will restart my MySQL server, because something is most definitely afoot.
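The variables above would explain the warning: with character_set_client set to big5, the server tries to read the bytes of the CREATE FUNCTION statement as Big5. Here is a minimal Python sketch of that idea. Python's built-in codecs stand in for MySQL's own conversion, which is an assumption; the names TIMES and decodes_as are made up for illustration. It shows that the ' × ' literal is the UTF-8 bytes 20 C3 97 20, which matches the 'C39720' in the second warning, and lets you check whether those bytes are valid in a given character set.

```python
# Python's codecs stand in for MySQL's character set handling (an assumption).
TIMES = " \u00d7 "  # the ' × ' literal from the function body

utf8_bytes = TIMES.encode("utf-8")
print(utf8_bytes.hex().upper())  # hex dump of the bytes the client sends


def decodes_as(data: bytes, charset: str) -> bool:
    """Return True if the bytes form valid text in the given character set."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


print("valid as utf-8:", decodes_as(utf8_bytes, "utf-8"))
print("valid as big5: ", decodes_as(utf8_bytes, "big5"))
```

If the second check reports False on your machine, that is the same complaint the server is making: the bytes are fine, the declared connection character set is not.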

Welp. I restarted the server and it never came back. I tried just about every MySQL server repair step that exists and nothing worked. Eventually I got it to start in safe mode with the innodb force flag set to 6, and I was then able to use HeidiSQL to pull all the data out (mysqldump just hung and never started, but HeidiSQL implements its own export). Even then, I still had 3 of my 165 tables that I couldn't read at all.

Related

Importing to Couch-DB changes encoding

I have a database I am using that has support for different languages, the issue I am running into is, in the source SQL data, the format is correct.
MariaDB [stmtransit]> SELECT * FROM routes WHERE route_id = 181;
+----------+-----------+------------------+------------------+------------+------------+------------------------------------------+-------------+------------------+
| route_id | agency_id | route_short_name | route_long_name | route_desc | route_type | route_url | route_color | route_text_color |
+----------+-----------+------------------+------------------+------------+------------+------------------------------------------+-------------+------------------+
| 181 | 1 | 369 | Côte-des-Neiges | NULL | 3 | http://www.stm.info/fr/infos/reseaux/bus | 009EE0 | NULL |
+----------+-----------+------------------+------------------+------------+------------+------------------------------------------+-------------+------------------+
1 row in set (0.00 sec)
When I run the query and move the result into CouchDB, it changes accents and anything other than plain characters to
Côte-des-Neiges
My request is
function queryRouteTable(db, route_id) {
return db.query({
sql: "SELECT * FROM routes WHERE route_id = ?;",
values: [route_id],
})
.take(1);
}
Then my upload to Couch uses rx and rx-couch, and no matter where I view document.route_long_name after the initial grab, it's always formatted wrong.
What am I missing? Why does it change after the initial grab?
To display the current character encoding set for a particular database, type the following command at the mysql> prompt. Replace DBNAME with the database name:
SELECT default_character_set_name FROM information_schema.SCHEMATA S WHERE schema_name = "DBNAME";
If you have your encoding set per table use the following command. Replace DBNAME with the database name, and TABLENAME with the name of the table:
SELECT CCSA.character_set_name FROM information_schema.`TABLES` T,information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA WHERE CCSA.collation_name = T.table_collation AND T.table_schema = "DBNAME" AND T.table_name = "TABLENAME";
IMPORTANT: BACKUP YOUR DATABASE
If you have a working backup of your database you can convert it from your current encoding to UTF-8 by issuing the following commands:
mysql --database=DBNAME -B -N -e "SHOW TABLES" | awk '{print "SET foreign_key_checks = 0; ALTER TABLE", $1, "CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci; SET foreign_key_checks = 1; "}' | mysql --database=DBNAME
And in prompt:
ALTER DATABASE DBNAME CHARACTER SET utf8 COLLATE utf8_general_ci;
Now you should be able to export using UTF-8 and import into couch using UTF-8 encoding...
Hope that helps...
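For anyone who would rather review the statements before piping them into mysql, here is a rough Python sketch of what the awk one-liner above generates. The function name conversion_statements is made up, the table list is hard-coded (in practice it would come from SHOW TABLES), and the backtick quoting around the table name is my addition, not something the one-liner does.

```python
def conversion_statements(tables, charset="utf8", collation="utf8_general_ci"):
    """Build one conversion statement per table, like the awk one-liner prints."""
    statements = []
    for table in tables:
        statements.append(
            "SET foreign_key_checks = 0; "
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation}; "
            "SET foreign_key_checks = 1;"
        )
    return statements


# "routes" is the table from the question; "agency" is a hypothetical second table.
for line in conversion_statements(["routes", "agency"]):
    print(line)
```

Printing first and executing second makes it easier to spot a table you did not mean to convert.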
Turns out MariaDB has a bug which sets the database encoding to latin1 instead of utf8.
To correct this, edit /etc/my.cnf and
remove all instances of
default-character-set=utf8
Then find the [mysqld] section in my.cnf and put under it
init_connect='SET collation_connection = utf8_unicode_ci'
init_connect='SET NAMES utf8'
character-set-server=utf8
collation-server=utf8_unicode_ci
skip-character-set-client-handshake
Save the file, then restart MariaDB.
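The garbled value in the question is what you get when UTF-8 bytes are read back as latin1 somewhere along the way. A small Python sketch reproduces it. Python's latin-1 codec stands in for the connection's latin1 here, which is an assumption, but for these particular bytes the result is the same.

```python
original = "Côte-des-Neiges"

# Encode as UTF-8, then (wrongly) decode the same bytes as latin1.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)

# If nothing else touched the bytes, the mistake can be reversed.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)
```

Because the round trip is reversible, seeing this pattern usually means the stored data is intact and only the connection encoding is wrong.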

mysql won't import table as unicode even tho all variables are set to unicode

I have just updated my cnf properties to add the following:
init_connect = 'SET collation_connection = utf8_unicode_ci; SET NAMES utf8;'
character-set-client = utf8
character-set-server = utf8
collation-server = utf8_unicode_ci
skip-character-set-client-handshake
My system variables after restarting mysql:
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_unicode_ci |
| collation_database | utf8_unicode_ci |
| collation_server | utf8_unicode_ci |
+----------------------+-----------------+
So then I ran the following query to find a table that I knew had been built in utf8_general_ci:
select t.table_name, c.column_name,round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB',count(c.column_name), c.character_set_name,c.collation_name
from columns c
inner join tables t on t.table_schema=c.table_schema and t.table_name=c.table_name
where t.table_schema='db' and
(c.collation_name like '%general%' or c.character_set_name like '%general%') and
(c.column_type like 'varchar%' or c.column_type like 'text') and
t.table_collation not like '%latin%' and t.table_name in ('table_name') group by t.table_name, c.column_name;
So I took a dump of the table and reimported it into my database, but it stays in utf8_general_ci!!?!?!??
Why is this? I know if I run an alter it will change it, but why didn't the dump and load resolve the problem?
Additionally, when I run an alter to convert to utf8_unicode_ci, all the columns in the table have "COLLATE utf8_unicode_ci" listed in them.

MySQL utf8mb4, Errors when saving Emojis

I try to save names from users from a service in my MySQL database. Those names can contain emojis like 🙈😂😱🍰 (just for examples)
After searching a little bit I found this stackoverflow linking to this tutorial. I followed the steps and it looks like everything is configured properly.
I have a Database (charset and collation set to utf8mb4 (_unicode_ci)), a Table called TestTable, also configured this way, as well as a "Text" column, configured this way (VARCHAR(191) utf8mb4_unicode_ci).
When I try to save emojis I get an error:
Example of error for shortcake (🍰):
Warning: #1300 Invalid utf8 character string: 'F09F8D'
Warning: #1366 Incorrect string value: '\xF0\x9F\x8D\xB0' for column 'Text' at row 1
The only Emoji that I was able to save properly was the sun ☀️
Though I didn't try all of them to be honest.
Is there something I'm missing in the configuration?
Please note: All tests of saving didn't involve a client side. I use phpmyadmin to manually change the values and save the data. So the proper configuration of the client side is something that I will take care of after the server properly saves emojis.
Another Sidenote: Currently, when saving emojis I either get the error like above, or get no error and the data of Username 🍰 will be stored as Username ????. Error or no error depends on the way I save. When creating/saving via SQL Statement I save with question marks, when editing inline I save with question marks, when editing using the edit button I get the error.
thank you
EDIT 1:
Alright so I think I found out the problem, but not the solution.
It looks like the Database specific variables didn't change properly.
When I'm logged in as root on my server and read out the variables (global):
Query used: SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
10 rows in set (0.00 sec)
For my Database (in phpmyadmin, the same query) it looks like the following:
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
How can I adjust these settings on the specific database?
Also even though I have the first shown settings as default, when creating a new database I get the second one as settings.
Edit 2:
Here is my my.cnf file:
[client]
port=3306
socket=/var/run/mysqld/mysqld.sock
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld_safe]
socket=/var/run/mysqld/mysqld.sock
[mysqld]
user=mysql
pid-file=/var/run/mysqld/mysqld.pid
socket=/var/run/mysqld/mysqld.sock
port=3306
basedir=/usr
datadir=/var/lib/mysql
tmpdir=/tmp
lc-messages-dir=/usr/share/mysql
log_error=/var/log/mysql/error.log
max_connections=200
max_user_connections=30
wait_timeout=30
interactive_timeout=50
long_query_time=5
innodb_file_per_table
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
!includedir /etc/mysql/conf.d/
character_set_client, _connection, and _results must all be utf8mb4 for that shortcake to be eatable.
Something, somewhere, is setting a subset of those individually. Rummage through my.cnf and phpmyadmin's settings -- something is not setting all three.
If SET NAMES utf8mb4 is executed, all three set correctly.
The sun shone because it is only 3-bytes - E2 98 80; utf8 is sufficient for 3-byte utf8 encodings of Unicode characters.
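To spot which of the three is off, you can paste the output of SHOW VARIABLES into a quick check like the one below. This is only a sketch: the helper name mismatched is made up, and the dictionary is copied by hand from the phpMyAdmin output in the question.

```python
CONNECTION_VARS = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
)


def mismatched(variables, wanted="utf8mb4"):
    """Return the connection variables that are not set to the wanted charset."""
    return [name for name in CONNECTION_VARS if variables.get(name) != wanted]


# Values copied from the phpMyAdmin output in the question.
phpmyadmin_session = {
    "character_set_client": "utf8",
    "character_set_connection": "utf8mb4",
    "character_set_database": "utf8mb4",
    "character_set_results": "utf8",
    "character_set_server": "utf8",
}

print(mismatched(phpmyadmin_session))
```

For the phpMyAdmin session this flags the client and results settings, which fits the symptoms: the emoji is mangled on the way in and would be mangled again on the way out.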
For me, it turned out that the problem lay in the mysql client.
The client's own character set request overrode the character set configured in my.cnf on the server, which resulted in an unintended character set.
So all I needed to do was add character-set-client-handshake = FALSE.
It stops the client's setting from overriding the server's character set.
my.cnf would look like this:
[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
...
Hope it helps.
It is likely that your service/application is connecting with "utf8" instead of "utf8mb4" for the client character set. That's up to the client application.
For a PHP application see http://php.net/manual/en/function.mysql-set-charset.php or http://php.net/manual/en/mysqli.set-charset.php
For a Python application see https://github.com/PyMySQL/PyMySQL#example or http://docs.sqlalchemy.org/en/latest/dialects/mysql.html#mysql-unicode
Also, check that your columns really are utf8mb4. One direct way is like this:
mysql> SELECT character_set_name FROM information_schema.`COLUMNS` WHERE table_name = "user" AND column_name = "displayname";
+--------------------+
| character_set_name |
+--------------------+
| utf8mb4 |
+--------------------+
1 row in set (0.00 sec)
Symfony 5 answer
Although this is not what was asked, people can land up here after searching the web for the same problem in Symfony.
1. Configure MySQL properly
☝️ See (and upvote if helpful) top answers here.
2. Change your Doctrine configuration
/config/packages/doctrine.yaml
doctrine:
dbal:
...
charset: utf8mb4
I'm not proud of this answer, because it uses brute-force to clean the input. It's brutal, but it works
function cleanWord($string, $debug = false) {
    $new_string = "";
    // walk the input one byte at a time
    for ($i = 0; $i < strlen($string); $i++) {
        $letter = substr($string, $i, 1);
        if ($debug) {
            echo "Letter: " . $letter . "<BR>";
            echo "Code: " . ord($letter) . "<BR><BR>";
        }
        $blnSkip = false;
        if (ord($letter) == 146) {
            $letter = "´";
            $blnSkip = true;
        }
        if (ord($letter) == 233) {
            $letter = "é";
            $blnSkip = true;
        }
        if (ord($letter) == 147 || ord($letter) == 148) {
            // the posted line read """ (a syntax error); an HTML entity is assumed here
            $letter = "&quot;";
            $blnSkip = true;
        }
        if (ord($letter) == 151) {
            $letter = "–";
            $blnSkip = true;
        }
        if ($blnSkip) {
            $new_string .= $letter;
            // continue, not break: break would drop everything after the first replacement
            continue;
        }
        // any other non-ASCII byte becomes a numeric HTML entity
        if (ord($letter) > 127) {
            $letter = "&#0" . ord($letter) . ";";
        }
        $new_string .= $letter;
    }
    if ($new_string != "") {
        $string = $new_string;
    }
    //optional
    $string = str_replace("\r\n", "<BR>", $string);
    return $string;
}
//clean up the input
$message = cleanWord($message);
//now you can insert it as part of SQL statement
$sql = "INSERT INTO tbl_message (`message`)
VALUES ('" . addslashes($message) . "')";
ALTER TABLE table_name CHANGE column_name column_name
VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NULL
DEFAULT NULL;
example query :
ALTER TABLE `reactions` CHANGE `emoji` `emoji` VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NULL DEFAULT NULL;
After that, I was able to store emoji in the table.
Consider adding
init_connect = 'SET NAMES utf8mb4'
to the my.cnf of each of your database servers.
(still, clients can (so will) overrule it)
I was importing data via command:
LOAD DATA LOCAL INFILE 'abc.csv' INTO TABLE abc
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(col1, col2, col3, col4, col5...);
This didn't work for me:
SET NAMES utf8mb4;
I had to add the CHARACTER SET clause to make it work:
LOAD DATA LOCAL INFILE
'E:\\wamp\\tmp\\customer.csv' INTO TABLE `customer`
CHARACTER SET 'utf8mb4'
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES;
Note that the target column must also be utf8mb4, not utf8, or the import will store question marks like "?????" (without any error, though).
For CodeIgniter users: make sure the character set and collation settings in database.php are set properly. This worked for me:
$db['default']['char_set'] = 'utf8mb4';
$db['default']['dbcollat'] = 'utf8mb4_unicode_ci';

Mysql2::Error: Incorrect string value

I have a rails application running on production mode, but all of the sudden this error came up today when a user tried to save a record.
Mysql2::Error: Incorrect string value
More details (from production log):
Parameters: {"utf8"=>"â<9c><93>" ...
Mysql2::Error: Incorrect string value: '\xC5\x99\xC3\xA1k
Mysql2::Error: Incorrect string value: '\xC5\x99\xC3\xA1k
Now I saw some solutions that required dropping the databases and recreating it, but I cannot do that.
Now mysql shows this:
mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.04 sec)
What is wrong and how can I change it so I do not have any problems with any characters?
Also: Is this problem solvable with javascript? Convert it before sending it ?
Thanks
The problem is caused by the character set on the MySQL server side. You can convert a table manually like this:
ALTER TABLE your_database_name.your_table CONVERT TO CHARACTER SET utf8
or drop the table and recreate it like:
rake db:drop
rake db:create
rake db:migrate
references:
https://stackoverflow.com/a/18498210/2034097
https://stackoverflow.com/a/16934647/2034097
UPDATE
The first command only affects the specified table. To change the default for the whole database (which applies to tables created afterwards; existing tables still need CONVERT TO), you can do
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_general_ci;
reference:
https://stackoverflow.com/a/6115705/2034097
I managed to store emojis (which take up 4 bytes) by following this blog post:
Rails 4, MySQL, and Emoji (Mysql2::Error: Incorrect string value error.)
You might think that you’re safe inserting most utf8 data into
mysql when you’ve specified that the charset is utf-8. Sadly,
however, you’d be wrong. The problem is that the utf8 character set
takes up 3 bytes when stored in a VARCHAR column. Emoji characters, on
the other hand, take up 4 bytes.
The solution is in 2 parts:
Change the encoding of your table and fields:
ALTER TABLE `[table]`
CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin,
MODIFY [column] VARCHAR(250)
CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
Tell the mysql2 adapter about it:
development:
adapter: mysql2
database: db
username:
password:
encoding: utf8mb4
collation: utf8mb4_unicode_ci
Hope this helps someone!
Then I had to restart my app and it worked.
Please note that some emojis will work without this fix, while some won't:
➡️ Did work
🔵 Did not work until I applied the fix described above.
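The difference between those two is just the encoded length. Here is a short Python sketch (the helper names are made up, and I'm assuming the arrow was posted as U+27A1 followed by the variation selector U+FE0F) that shows which strings fit in MySQL's 3-byte utf8 and which need utf8mb4:

```python
def utf8_lengths(text):
    """Byte length of each code point when encoded as UTF-8."""
    return [len(ch.encode("utf-8")) for ch in text]


def fits_in_3_byte_utf8(text):
    """True if no code point needs more than 3 bytes (so plain utf8 is enough)."""
    return max(utf8_lengths(text), default=0) <= 3


# U+27A1 plus U+FE0F (assumed form of the arrow), then U+1F535 (blue circle).
for sample in ("\u27a1\ufe0f", "\U0001F535"):
    print(sample, utf8_lengths(sample), fits_in_3_byte_utf8(sample))
```

Anything outside the Basic Multilingual Plane, which includes most emoji, encodes to 4 bytes and therefore needs utf8mb4 on the column and on the connection.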
You can use a migration like this to convert your tables to utf8:
class ConvertTablesToUtf8 < ActiveRecord::Migration
def change_encoding(encoding,collation)
connection = ActiveRecord::Base.connection
tables = connection.tables
dbname =connection.current_database
execute <<-SQL
ALTER DATABASE #{dbname} CHARACTER SET #{encoding} COLLATE #{collation};
SQL
tables.each do |tablename|
execute <<-SQL
ALTER TABLE #{dbname}.#{tablename} CONVERT TO CHARACTER SET #{encoding} COLLATE #{collation};
SQL
end
end
def change
reversible do |dir|
dir.up do
change_encoding('utf8','utf8_general_ci')
end
dir.down do
change_encoding('latin1','latin1_swedish_ci')
end
end
end
end
If you want to store emoji, you need to do the following:
Create a migration (thanks @mfazekas)
class ConvertTablesToUtf8 < ActiveRecord::Migration
def change_encoding(encoding,collation)
connection = ActiveRecord::Base.connection
tables = connection.tables
dbname =connection.current_database
execute <<-SQL
ALTER DATABASE #{dbname} CHARACTER SET #{encoding} COLLATE #{collation};
SQL
tables.each do |tablename|
execute <<-SQL
ALTER TABLE #{dbname}.#{tablename} CONVERT TO CHARACTER SET #{encoding} COLLATE #{collation};
SQL
end
end
def change
reversible do |dir|
dir.up do
change_encoding('utf8mb4','utf8mb4_bin')
end
dir.down do
change_encoding('latin1','latin1_swedish_ci')
end
end
end
end
Change the Rails charset to utf8mb4 (thanks @selvamani-p)
production:
encoding: utf8mb4
References:
https://stackoverflow.com/a/39465494/1058096
https://stackoverflow.com/a/26273185/1058096
You need to change the CHARACTER SET and COLLATE of the already-created database:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Alternatively, the database should have been created with these parameters set up front:
CREATE DATABASE databasename CHARACTER SET utf8 COLLATE utf8_general_ci;
It seems like an encoding problem while getting data from database. Try adding the below to your database.yml file
encoding: utf8
Hope this solves your issue
Also, if you don't want to make changes to your database structure, you could opt to serialize the field in question.
class MyModel < ActiveRecord::Base
serialize :content
attr_accessible :content, :title
end

Unable to display polish characters in mysql

This may look like a duplicate but I've been searching for hours and none of the suggested fixes for similar problems are working:
I have text in xls file that was converted to CSV. It contains polish characters. I've confirmed I did save as UTF8 encoded. I don't have access to PHPMyAdmin on this server, so I uploaded this UTF8 encoded CSV file to the server.
I then use a UTF8 encoded PHP file to load the database up:
mb_language('uni');
mb_internal_encoding('UTF-8');
setlocale(LC_ALL, "pl_PL.UTF-8");
require_once('config.php');
mysql_set_charset('utf8');
$f=fopen('questions-final2.csv','r');
$questions=array();
while (($data = fgetcsv($f, 1000, ",")) !== FALSE) {
//$num = count($data);
//echo "<p> $num fields in line $row: <br /></p>\n";
print_r($data);
$questions[]=$data;
//mysql_query('INSERT INTO questions(question_id,text,answer_time,difficulty,mode) VALUES '.implode(',',$inserts));
//echo $data;
}
//exit();
// import of questions
$prev_index=0;
foreach($questions as $index=>$question){
if($index>0)
if($question[0]==$questions[$prev_index][0])
unset($questions[$index]);
else
$prev_index=$index;
}
mysql_query('SET CHARACTER SET utf8');
mysql_query('SET NAME utf8');
$res=mysql_query('SELECT * FROM questions');
$inserts=array();
foreach($questions as $question)
$inserts[]='("'.$question[5].'","'.addslashes($question[1]).'","'.$question[7].'","'.$question[0].'","'.$question[4].'")';
mysql_query('INSERT IGNORE INTO questions(question_id,text,answer_time,difficulty,mode) VALUES '.implode(',',$inserts));
var_dump(mysql_error());
fclose($f);
Now, here is what the database says:
mysql> show variables like 'character%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
I can't get that latin1 part to go away. My my.conf looks like this:
[client]
default-character-set=utf8
[mysql]
default-character-set=utf8
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
collation-server = utf8_general_ci
init-connect='SET NAMES utf8'
character-set-server = utf8
default-character-set = utf8
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
I'm using putty and have confirmed I have it set to utf8 encoding as well, this is the output:
mysql> select text from questions limit 1;
+-------------------------------------------+
| text |
+-------------------------------------------+
| ?wi?to Unii Europejskiej obchodzone jest: |
+-------------------------------------------+
1 row in set (0.00 sec)
This is the original text as it should appear:
Święto Unii Europejskiej obchodzone jest:
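The pattern of question marks matches what happens when the text is forced into latin1 at some point, which would fit the latin1 shown for character_set_database. This is a guess at the cause, not a diagnosis, and Python's latin-1 codec stands in for MySQL's conversion here. A short sketch:

```python
original = "Święto Unii Europejskiej obchodzone jest:"

# Characters that do not exist in latin1 are replaced with '?'.
lossy = original.encode("latin-1", errors="replace").decode("latin-1")
print(lossy)
```

Unlike the "Ã´" kind of garbling, this is lossy: once a character has been stored as a literal '?', converting the column afterwards cannot bring it back, so the data has to be imported again after the encoding is fixed.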
Also I have tried :
alter table questions modify column text TEXT character set utf8 collate utf8_unicode_ci;
and
alter table questions convert to character set utf8 collate utf8_unicode_ci;
Both before and after importing data, to no avail. What am I missing here?
mysql_query('SET NAME utf8');
This query should trigger an error:
SQL Error (1193): Unknown system variable 'NAME'
... but you don't see it because you don't test whether mysql_query() succeeds. The correct statement is SET NAMES utf8.