UTF-8 output with CakePHP - mysql

I'm trying to move some Excel data to MySQL, but I'm having trouble with the encoding.
What I did:
Data export from OpenOffice 3.1 as csv (utf-8 encoded)
Import to phpMyAdmin via file upload (Table encoding: 'utf8_unicode_ci')
In phpMyAdmin's view mode, the data is displayed correctly (it is using utf-8 as charset):
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
When I try to display the Data on my webpage, I get a hash with a question mark in it.
System-Info
The language I try to get on my page:
German
MySQL client version: 5.0.32
My OS: Mac OS X 10.5.7
Server-Script: CakePHP v1.2.3.8166
Regards,
Benedikt

I had a similar symptom; my solution was to add
'encoding' => 'UTF8'
to config/database.php

I've had a similar problem to this. Most things are set correctly "out of the box", but I'd just like to point out the following things I found useful... I hit this problem while moving from dev to live:
You need to have matching Database encoding (2 items to check) as well as view encoding.
Your DB encoding is set up in the Schema.
CakePHP can be manually forced to expect a particular encoding from the DB.
CakePHP then needs to declare the same encoding for the view as well.
The DB settings are, of course, set up in your database. In my case, using MySQL workbench, this was simply a case of right clicking the schema and selecting "Alter Schema...". From there I could select the encoding/collation I wanted.
In CakePHP the database encoding is set in app\Config\database.php. You should ensure that the DATABASE_CONFIG array (approx line 60) has the correct encoding enabled/selected. For example:
'encoding' => 'utf8'
Finally, the view needs to declare the correct charset for the HTML it outputs. This is written into your templates from the setting in app\Config\core.php (approx line 82).
Configure::write('App.encoding', 'UTF-8');
Once all three parts are changed, you should have a consistent charset and hence display.
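One thing that can confuse the comparison of the three settings: MySQL and the CakePHP database config spell the charset utf8, while App.encoding and HTTP spell it UTF-8. Below is a minimal Python sketch (not CakePHP code; the mapping table and the function name same_encoding are made up for illustration) that checks whether a MySQL charset name and an HTTP charset label refer to the same encoding.

```python
import codecs

# Assumed, incomplete mapping from MySQL charset names to Python codec names.
# MySQL's "latin1" closely matches Windows-1252, hence cp1252.
MYSQL_TO_CODEC = {
    "utf8": "utf-8",
    "utf8mb3": "utf-8",
    "utf8mb4": "utf-8",
    "latin1": "cp1252",
}


def same_encoding(mysql_charset, http_charset):
    """Return True if both names resolve to the same codec."""
    mysql_codec = codecs.lookup(MYSQL_TO_CODEC[mysql_charset.lower()]).name
    http_codec = codecs.lookup(http_charset).name
    return mysql_codec == http_codec


print(same_encoding("utf8", "UTF-8"))    # database.php vs. core.php values
print(same_encoding("latin1", "UTF-8"))  # a mismatching pair
```

If the two names do not resolve to the same codec, at least one of the three places (schema, connection, view) is probably out of step with the others.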
Hope that helps any people still searching for this.

Are you setting the connection to UTF-8 with MySQL?
SET NAMES 'utf8'

$conn1 = mysqli_connect($this->db_host, $this->db_username, $this->db_password, $this->db_name);
Add this right after connecting:
mysqli_set_charset($conn1,"utf8mb4");

Related

How to fix garbled characters in PHPMyAdmin

My MySQL database contains some Chinese symbols and such (non-ASCII symbols). When I view them in PHPMyAdmin, they look garbled. However, if I display them on my website with PHP using the regular mysqli API, it looks fine so I assume the data is uploaded/stored properly in the database, so maybe the server connection collation is incorrect.
My PHP code for opening the database connection is:
function openConnection(): mysqli
{
$databaseHost = "localhost";
$databaseUser = "root";
$databasePassword = '';
$databaseName = "my-database-name";
$connection = new mysqli($databaseHost, $databaseUser,
$databasePassword, $databaseName);
if ($connection->connect_error) {
die("Connection failed: " . $connection->connect_error);
}
return $connection;
}
My PHPMyAdmin server connection collation is the default utf8mb4_unicode_ci which seems to be reasonable as well. My tables are also created with the default utf8mb4_general_ci. Shouldn't that work fine for any input users might make?
Calling $connection->get_charset() in PHP also returns the correct charset:
If I export the database data in phpMyAdmin, the export is also garbled in Notepad++, even though I made sure to view it as UTF-8. If I import the garbled export again, the database shows the data as garbled once more, and the website now shows it garbled as well. In this case the export really was corrupted.
How can I solve this encoding problem? Clearly PHP can handle UTF-8 properly, my Apache web server is also serving UTF-8 and my database is configured seemingly correctly as well but there is an issue with PHPMyAdmin or the database/database table collation.
It looks like the issue was entirely elsewhere, since I'm supplying data to PHP from C++ code. The C++ code uses the nlohmann JSON library to build the data submitted to the PHP script. The issue was that I did not explicitly encode my std::strings as UTF-8, like described here, when putting data into a C++ JSON object. With that fixed, everything is now working as expected.
⚈ If using mysqli, do $mysqli_obj->set_charset('utf8mb4');
⚈ If using PDO, do something like $db = new PDO('mysql:host=host;dbname=db;charset=utf8mb4', $user, $pwd);
⚈ Alternatively, execute SET NAMES utf8mb4
Any of these will say that the bytes in the client are UTF-8 encoded. Conversion, if necessary, will occur between the client and the database if the column definition is something other than utf8mb4.
More notes on PHP: http://mysql.rjweb.org/doc.php/charcoll#php
If you have specific garbling, see Trouble with UTF-8 characters; what I see is not what I stored
If you suspect the data being fed from PHP to Notepad, dump a few Chinese characters in hex and show them to us. I would expect every 4th byte to be hex F0 or every 3rd to be between E3 and EA. (These are the first bytes of the 4-byte and 3-byte UTF-8 encodings of Chinese characters.)
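As a rough illustration of that hex check, here is a small Python sketch (the helper name lead_bytes and the sample text are assumptions, not taken from the question) that prints the UTF-8 bytes of two Chinese characters and the first byte of each character.

```python
def lead_bytes(text):
    """Return the first UTF-8 byte of each character in text."""
    return [ch.encode("utf-8")[0] for ch in text]


sample = "中文"  # hypothetical sample; use a few characters from your own data

# Full hex dump of the UTF-8 bytes, three bytes per character here.
print(" ".join(f"{b:02x}" for b in sample.encode("utf-8")))

# First byte of each character; E3..EA is expected for 3-byte Chinese text.
print([hex(b) for b in lead_bytes(sample)])

# A character outside the BMP takes four bytes and starts with F0.
print([hex(b) for b in lead_bytes("\U00020000")])
```

If the dump of your stored data shows pairs such as C3 A4 or C3 A5 where one Chinese character should be, the text has most likely been encoded twice.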
Does Notepad properly handle UTF-8, or does it need a setting?
If you are in the "cmd" in Windows, you may need chcp 65001; see http://mysql.rjweb.org/doc.php/charcoll#entering_accents_in_cmd That way, more non-English characters will display correctly.

phpMyAdmin won't display or insert Unicode characters properly into database

I'm using phpMyAdmin version 4.4.4 with MySQL 5.6 (charset is set to UTF-8 Unicode). The table in question has its collation set to utf8_general_ci, and all fields are set to utf8_general_ci as well. My php.ini file has default_charset = "UTF-8".
Despite all the UTF-8 settings for all three applications, unicode characters appear garbled when viewing a table within phpMyAdmin. So, instead of seeing ...
Søren
... in phpMyAdmin I see ...
Søren
Even though it displays garbled in phpMyAdmin, it displays correctly on the website. The only problem is with phpMyAdmin.
If I attempt to insert a new record using phpMyAdmin and enter Søren in a text field, it displays like this within phpMyAdmin...
Søren
Which looks correct there, but, on the web page, it displays like this...
S�ren
The ø character is replaced with a question mark inside a black diamond instead of displaying the proper unicode character on the website.
What the heck is going on? How do I make phpMyAdmin display and insert the unicode characters properly into the table without mangling them? Thanks!
My php.ini file has default_charset = "UTF-8".
That only affects the charset used for some PHP built-in functions like htmlentities.
MySQL uses its own charset to decode stuff you send it. This can be set using $mysqli->set_charset('utf8') for mysqli, or mysql_set_charset('utf8') for the deprecated mysql module, or using charset=utf8 in the connection string in PDO.
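Both symptoms from the question can be reproduced without a database. The Python sketch below is only an illustration and assumes the website's connection was using MySQL's old default latin1 (treated here as cp1252 and latin-1), which the question does not confirm.

```python
name = "Søren"

# UTF-8 bytes misread as cp1252: the garbled form phpMyAdmin showed for
# rows written by the website.
double_decoded = name.encode("utf-8").decode("cp1252")

# Single-byte latin1 data sent to a page declared as UTF-8: the lone byte
# 0xF8 is not valid UTF-8, so it becomes the replacement character U+FFFD.
replaced = name.encode("latin-1").decode("utf-8", errors="replace")

print(double_decoded)
print(replaced)
```

Under that assumption, setting the connection charset as described above makes both sides agree, although rows written before the change may still need repairing.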

Why does PhpMyAdmin seem to break my Character Encoding?

I wrote a short script which simply inserts Unicode characters into a MySQL database. It looks like this:
mysql_connect('localhost', 'root', '*') or die(mysql_error());
mysql_select_db('test') or die(mysql_error());
mysql_query("INSERT INTO thetable (thefield) VALUES ('äöüß')") or die(mysql_error());
I generated the script using Notepad++ and it's UTF-8 encoded without BOM.
The database and the table have a utf8_general_ci collation. When I look at the data using PhpMyAdmin then the charset seems to be broken. The characters are not displayed correctly:
äöüß
When I receive the data back in my script then the charset seems to be okay. I dumped it with the right header (header('Content-Type: text/html; charset=utf-8')) and everything looks right.
When I insert data into the table using PhpMyAdmin again, then it is displayed correctly inside PhpMyAdmin, but as soon as I dump it from my Demo script, then the charset is broken again.
I have no idea what the reason could be. The database's charset, the HTTP header and the encoding of the script are consistent and I don't doubt that PhpMyAdmin is working correctly. So where else could I look for the problem?
You're getting 2 characters for every one in the original, so it's being read as a single-byte encoding (most likely latin1, MySQL's old default) instead of UTF-8. You probably need to specify the character set the MySQL connection is using when you connect.
I'm on my cell, but if you post your DB connect code I can show you how when I get to a computer.
Edit - see PHP PDO: charset, set names?
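The "2 characters for every one" effect is easy to see in isolation. This is a minimal Python sketch, assuming the bytes were misread as cp1252 (MySQL's latin1 is close to that); it is not the asker's PHP code.

```python
original = "äöüß"

# Each character is two bytes in UTF-8; reading those bytes with a
# single-byte charset turns every character into two.
garbled = original.encode("utf-8").decode("cp1252")
print(len(original), len(garbled))
print(garbled)

# The damage is reversible as long as no byte was lost along the way.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == original)
```

The reverse step only works when the misread was clean; once a byte has been replaced by ? or dropped, the original text cannot be recovered this way.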

UTF-8: showing correctly in database, however not in HTML despite utf-8 charset

I use MySQL 5.1 and loaded about 2.7 million lines from a UTF-8 encoded txt file, using LOAD DATA INFILE..., into a table that is declared as utf8_unicode_ci; all char fields are declared as utf8_unicode_ci as well.
In the database itself the characters all seem to be correct, everything looks nice. However, when I print them using php, the characters show up as ???, although I use utf-8 declaration in the HTML head:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
...
In another table (using utf-8), where I inserted text from a submitted form, the characters appear strangely in the database, but are shown correctly again, when I print them using SELECT....
So, I was wondering: what is wrong? Are UTF-8 chars shown correctly in the database or strangely but when you SELECT them again they are OK? Or where is the problem (when loading the file into the db, in the HTML or somewhere in between)??
Thank you very much for any hint or suggestion! :)
Note: MySQL's utf8 charset is limited, it only supports Unicode characters in the BMP that take up no more than three bytes. You should be using utf8mb4 instead.
Make sure you send the SET NAMES utf8mb4 command to MySQL after connecting, before running any MySQL queries.
Make sure your page is actually served as utf-8 (if there's an HTTP header Content-Type: text/html;charset=iso-8859-1, it takes precedence over the meta tag in current browsers).
Read this article: Handling Unicode Front To Back In A Web App (but remember to replace utf8 with utf8mb4 where MySQL is concerned).
If phpMyAdmin displays your entered data as correct Unicode text, then my bet is that you are not doing SET NAMES utf8 after connecting.
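To make the utf8 versus utf8mb4 note concrete: MySQL's utf8 stores at most three bytes per character, so anything outside the BMP needs utf8mb4. Here is a small Python sketch (the function name needs_utf8mb4 is made up for illustration) that tests a string for such characters.

```python
def needs_utf8mb4(text):
    """Return True if text contains a character outside the BMP."""
    return any(ord(ch) > 0xFFFF for ch in text)


for sample in ["Søren", "中文", "😀"]:
    longest = max(len(ch.encode("utf-8")) for ch in sample)
    print(sample, longest, needs_utf8mb4(sample))
```

Text that never leaves the BMP behaves the same in both charsets, which is why the limitation often goes unnoticed until an emoji or a rare CJK character shows up.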
Try using code like this after connecting to the database, but before you receive data:
$db->query('set character_set_client=utf8');
$db->query('set character_set_connection=utf8');
$db->query('set character_set_results=utf8');
$db->query('set character_set_server=utf8');

Data in db is in wrong encoding (using CKeditor) and greek

I am using ckeditor 3.4 to insert data (text) to database and then display it on a page.
Problem: when I write Greek in the ckeditor, everything is fine. When I press the HTML button of the ckeditor, everything is again fine (i.e. I see the actual text I typed, not HTML entities). However, when I save the data (and hence store it in the db), the stored data in the db looks like this
"<p style="text-align: center;">
... σÏντομα πεÏισσότεÏες πληÏοφοÏίες...</p>
<p>
</p>"
Note: when I read the data back, it is displayed correctly on the web page.
Actions taken so far:
1- the connection file to the db has the following: $conn->query("SET NAMES 'utf8'");
2- In the config.js of the ckeditor I have added the following lines
CKEDITOR.editorConfig = function( config ) {
config.entities = false;
config.entities_greek = false;
config.entities_latin = false;
config.entities_processNumerical = false;
// Define changes to default configuration here. For example:
config.language = 'el';
// config.uiColor = '#AADC6E';
};
3- my webpages are set to: content="text/html;charset=utf-8"
4- db collation: utf8_unicode_ci / type MyISAM
I've been searching around but no luck.
I'd appreciate any help
Thank you all for your answers.
Solution was much simpler.
What worked for me was writing SET NAMES UTF8 instead of SET NAMES 'utf8'
If you are using PHP or any other language that doesn't do this automatically, you need to invoke
SET NAMES 'UTF8'
on the connection before calling any statements, in order to use UTF-8 in your database.
Also make sure you are serving all pages as UTF-8 so that posted data is in UTF-8.
There are also some configuration parameters that control how the data is sent and processed by the server, but I have never managed to get it to work without this statement.
See more here: http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html
EDIT: Ah, sorry, didn't see that you actually did this. If it is displayed correctly when you output it and your charset is set to UTF-8 on the page, then I'm assuming that you only view it in the DB with a tool that doesn't support UTF-8, or isn't configured for it? So what exactly is the problem right now?