ZF2 Doctrine2 MySQL charset error

I've set up a MySQL DB with utf8_unicode_ci collation, and all the tables and columns in it have the same collation.
My Doctrine config has SET NAMES utf8 as a connection option, and my HTML files use the utf8 charset.
The text saved on those tables contain accented characters (á,è,etc).
The problem is that when I save the content to the DB, it is stored with strange characters, as if I were saving ISO text in a UTF8 table (e.g.: NotÃ­cias).
The only workaround that I've found is to utf8_decode before saving and utf8_encode before printing.
That means that, for some reason, something in between is mixing up utf8 with iso.
What might it be?
Thanks.
EDIT:
I've set it up to encode before saving and decode before printing, and it prints correctly, but in the DB my chars change to:
XPTÓ -> XPTÃ“
This makes searching in DB for "XPTÓ" impossible...

I would print bin2hex($string); at each step of the original workflow (i.e. without encode/decode steps).
Go through each of:
the raw $_POST data
the values you get after form validation
the values that get put in your bound Entity
the values you'd get from the db if you query it directly using PDO (get this from $em->getConnection())
the values that get populated into your Entity on reload (can do this via $em->detach($entity); $entity = $em->find('Entity', $id);)
You'd be looking at the point at which the output changes, and focus your search there.
I would also double check:
On the db: SHOW CREATE TABLE `table` shows CHARSET=utf8 for the whole table (and nothing different for the individual columns)
That the tool you use to see your database values (Navicat, phpMyAdmin) has got the correct encoding set.
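As an illustration of the hex-dump approach, here is a minimal Python sketch (Python rather than PHP, purely so it is self-contained; bytes.hex() plays the role of bin2hex). The stage names and the point where the bytes change are made up for the example and are not taken from the asker's application.

```python
def first_changed_stage(stages):
    """Print a hex dump per stage; return the first stage whose bytes differ."""
    previous = None
    for name, raw in stages:
        print(f"{name:<10} {raw.hex()}")
        if previous is not None and raw != previous:
            return name
        previous = raw
    return None


word = "Notícias"
good = word.encode("utf-8")                      # í is stored as c3 ad
double = good.decode("latin-1").encode("utf-8")  # í becomes c3 83 c2 ad

# Hypothetical capture of the same value at each step of the workflow.
stages = [
    ("post", good),
    ("validated", good),
    ("entity", good),
    ("pdo", double),       # assumed: the bytes change at the DB layer
    ("reloaded", double),
]
culprit = first_changed_stage(stages)
```

In a dump like this, c3 ad for í means clean UTF-8, a lone ed means ISO-8859-1, and c3 83 c2 ad means the text was UTF-8 encoded twice.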

Related

Issue retrieving special characters of SQL-inserted entities from MySQL in a Spring Boot application

I created a MySQL database from a script, along with inserting rows of data that include special characters such as č, š and ž.
For that reason I set default character set on schema in the script:
CREATE SCHEMA IF NOT EXISTS schemaname DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci ;
Now if I go through MySQL console and try retrieving data from a table with these special characters:
SELECT name FROM tablename;
I'll get a response with these special characters:
+-------------------------------------------------+
| name |
+-------------------------------------------------+
| Test of š, and č, and ž |
+-------------------------------------------------+
I've also created a Spring Boot application (hibernate with Spring Boot Starter Data JPA) in which I implemented rest API for entities that I have in my database.
The problem arises when I try to retrieve (previously inserted) entities using JPARepository (EntityRepository.findAll()). Instead of correctly written special characters which are normal in MySQL console, I get:
Å¡, Å¾ and such characters (should be š, ž)
However, if I store a new entity using the Rest API with special characters and retrieve that one, the special characters are fine (š, ž).
Any idea as to why there is a difference between these and how I might solve the issue?
Without having to insert every entity through Rest API instead of MySQL inserts.
My application.properties file contains (I don't think the issue is here, since entities created through rest API are fine):
spring.datasource.url=jdbc:mysql://localhost:3306/schemaname?useSSL=false&allowPublicKeyRetrieval=true&useUnicode=true&characterEncoding=UTF-8
This could be an issue: the .sql files were created on Windows, but I am running the MySQL server (in Docker) and my application on Linux (all .sql files are UTF-8 encoded).
The solution I found was actually really simple, can't believe I missed this.
I just had to put N in front of every string with special character in the .sql file. Example:
INSERT INTO X(FIELD1) VALUE (N'ščž');
This simply means I'm passing an NCHAR/NVARCHAR/NTEXT value instead of a normal CHAR/VARCHAR/TEXT one.
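If the .sql file has many such literals, adding the N prefix by hand gets tedious. Below is a rough Python sketch of a helper that does it; add_national_prefix is a hypothetical name, and the regex is a simplification that assumes plain single-quoted literals (it does not understand SQL comments or double-quoted strings).

```python
import re

# A single-quoted SQL literal, optionally already prefixed with N.
# Handles doubled quotes ('') and backslash escapes inside the literal.
_LITERAL = re.compile(r"(?P<prefix>\b[Nn])?'(?:[^'\\]|''|\\.)*'")


def add_national_prefix(sql):
    """Prefix N to every string literal that contains non-ASCII characters."""
    def fix(match):
        literal = match.group(0)
        if match.group("prefix") or literal.isascii():
            return literal          # already prefixed, or plain ASCII
        return "N" + literal
    return _LITERAL.sub(fix, sql)


fixed = add_national_prefix("INSERT INTO X(FIELD1) VALUE ('ščž');")
```

Literals that are pure ASCII or already carry the prefix are left untouched, so running the helper twice should not change the file again.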

How to save UTF-8 data in MySQL database in Livecode app

I'm trying to save some data gathered from fields in MySQL db. Text contains some Polish characters, but Livecode sends all Polish chars as '?'. Here's part of my code:
Declare variable
put the unicodeText of field "Title" into tTitle
put uniEncode(tTitle, "UTF8") into tTitle
Send this to db:
put "UPDATE magazyn SET NAZWA='" & tTitle & "'" into tSQLStatement
revExecuteSQL gConnectionID,tSQLStatement, "SET NAMES 'utf8'"
For example, the word "łąka" is saved as "??ka". I've tried uniEncode, uniDecode, and everything goes wrong.
Don't use any encoders/decoders. They will only add to the confusion.
If you see question marks (regular ones, not black diamonds) when trying to use utf8/utf8mb4, check the following:
The bytes to be stored are not encoded as utf8. Fix this. (Getting rid of the encoders may fix it.)
The column in the database must be CHARACTER SET utf8 (or utf8mb4). Fix this if it is not.
Also, check that the connection during reading is utf8. I don't know the details of "Livecode"; look in its documentation. If you can't find anything, execute this SQL after connecting: SET NAMES utf8.
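To see where plain question marks come from, here is a small Python sketch (not LiveCode; it only mimics the conversion). It assumes that somewhere along the path the text is converted to a single-byte charset such as latin1, which has no ł or ą, so those characters are replaced.

```python
word = "łąka"

# Proper UTF-8: every character survives (ł is c5 82, ą is c4 85).
as_utf8 = word.encode("utf-8")

# Forced into latin1: characters outside the charset degrade to '?'.
as_latin1 = word.encode("latin-1", errors="replace")

print(as_utf8.hex())   # hex dump of the UTF-8 bytes
print(as_latin1)       # the lossy single-byte version
```

Once the characters have been replaced this way the original text is gone, so the fix has to happen before the data is stored.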
Problem solved! Here's code:
get the unicodeText of field "Title"
put unidecode(it,"polish") into tTitle
It will save Polish characters in a strange form, but for reading them back I'm using this code:
set the unicodetext of fld "List" to uniencode(tList,"polish")
The tList variable contains all the data gathered from MySQL.
Ensure the column in the database is set to utf8 encoding.
Starting with LiveCode 7, all text in fields is Unicode, specifically UTF-16. Before you send the text out to any external file or datastore, you need to encode it as UTF-8 (or whatever format you want to store it in). Use the LiveCode textEncode() function for this:
put textEncode(field "Title","utf-8") into tTitle
put "UPDATE magazyn SET NAZWA = :1" into tSQLStatement
revExecuteSQL gConnectionID, tSQLStatement, "tTitle"
Note: It's also a good idea to use the :N variable substitution method to reduce the risk of SQL code injection attacks.
When you read the data from the database use textDecode to convert back to UTF-16:
put textDecode(tRawDataFromDB,"UTF-8") into tTitle
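The same placeholder idea, sketched in Python with the standard sqlite3 module as a stand-in (an assumption made only so the example runs without a MySQL server; the table and column names are borrowed from the question). The value is passed separately from the SQL text, so quotes inside it cannot alter the statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE magazyn (NAZWA TEXT)")
conn.execute("INSERT INTO magazyn (NAZWA) VALUES ('')")

# Hostile-looking input: with string concatenation this would break the query.
title = "łąka' OR '1'='1"

# The named placeholder keeps the value as data, never as SQL.
conn.execute("UPDATE magazyn SET NAZWA = :title", {"title": title})

stored = conn.execute("SELECT NAZWA FROM magazyn").fetchone()[0]
```

The Polish characters and the embedded quotes both come back exactly as they were sent.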

MySQL Exporting Arabic/Persian Characters

I'm new to MySQL and I'm working with it through phpMyAdmin.
My problem is that I have imported some tables from .sql files into a database with the utf8_general_ci collation, and it contains some Arabic and Persian characters. However, when I export the data to an Excel file, it appears as follows:
The original value: أحمد الكمالي
The exported value: أحمد  الكمالي
I have searched for this issue and tried to solve it by setting the output and the server connection to the same utf8_general_ci format. But, for some reason I don't know, phpMyAdmin doesn't allow me to choose that format; it forces me to choose utf8mb4_general_ci.
Anyway, when I export the data, I make sure that the format is UTF8, but it still appears like that.
How can I fix it?
Note: Here are some screenshots if you want to check organized by numbers.
http://www.megafileupload.com/rbt5/Screenshots.rar
I found an easier way: you can rebuild the Excel file with the correct characters.
Export your data from MySQL normally in CSV format.
Open a new Excel workbook and go to the Data tab.
Select "From Text". If you cannot find it, it is under "Get External Data".
Select your file.
Change the file origin to Unicode (UTF-8) and select Next ("Delimited" is checked by default).
Select the Comma delimiter and press Finish.
You will see your language's characters correctly.
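A different route from the import wizard above, offered only as a sketch: Excel generally recognises UTF-8 when a CSV file starts with a byte order mark, so re-saving the export with a BOM often lets it open correctly on double-click. The helper name csv_bytes_for_excel is hypothetical, and the rows are sample data.

```python
import csv
import io


def csv_bytes_for_excel(rows):
    """Serialise rows as CSV and encode as UTF-8 with a leading BOM."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    # "utf-8-sig" prepends the bytes EF BB BF to the encoded output.
    return buffer.getvalue().encode("utf-8-sig")


data = csv_bytes_for_excel([["name"], ["أحمد الكمالي"]])

# To produce a file, write the bytes in binary mode, e.g.:
# with open("export.csv", "wb") as handle:
#     handle.write(data)
```

Whether the BOM is honoured depends on the Excel version, so the wizard steps remain the fallback.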
Mojibake. Probably...
The bytes you have in the client are correctly encoded in utf8mb4 (good).
You connected with SET NAMES latin1 (or set_charset('latin1') or ...), probably by default. (It should have been utf8mb4.)
The column in the tables may or may not have been CHARACTER SET utf8mb4, but it should have been that.
(utf8 and utf8mb4 work equally well for Arabic/Persian.)
Please provide more details if this explanation does not suffice.
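If that diagnosis is right, the data is garbled but not lost, and the mix-up can be reversed. A minimal Python sketch, assuming the wrong charset was exactly ISO-8859-1 (MySQL's latin1 is closer to cp1252, so real data may need that codec instead); undo_latin1_mojibake is a hypothetical helper name.

```python
def undo_latin1_mojibake(garbled):
    """Reverse 'UTF-8 bytes that were read as latin1'."""
    return garbled.encode("latin-1").decode("utf-8")


original = "أحمد الكمالي"

# What a latin1 connection makes of correctly encoded UTF-8 bytes.
garbled = original.encode("utf-8").decode("latin-1")

repaired = undo_latin1_mojibake(garbled)
```

This only works on text that was garbled exactly once; correct text or doubly garbled text will raise an error or stay wrong.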

phpMyAdmin won't display or insert Unicode characters properly into database

I'm using phpMyAdmin version 4.4.4 with MySQL 5.6 (charset is set to UTF-8 Unicode). The table in question has its collation set to utf8_general_ci, and all fields are set to utf8_general_ci as well. My php.ini file has default_charset = "UTF-8".
Despite all the UTF-8 settings for all three applications, unicode characters appear garbled when viewing a table within phpMyAdmin. So, instead of seeing ...
Søren
... in phpMyAdmin I see ...
SÃ¸ren
Even though it displays garbled in phpMyAdmin, it displays correctly on the website. The only problem is with phpMyAdmin.
If I attempt to Insert a new record using phpMyAdmin and enter Søren in a text field, it displays like this within phpMyAdmin...
Søren
Which looks correct there, but, on the web page, it displays like this...
S�ren
The ø character is replaced with a question mark inside a black diamond instead of displaying the proper unicode character on the website.
What the heck is going on? How do I make phpMyAdmin display and insert the unicode characters properly into the table without mangling them? Thanks!
My php.ini file has default_charset = "UTF-8".
That only affects the charset used for some PHP built-in functions like htmlentities.
MySQL uses its own charset to decode stuff you send it. This can be set using $mysqli->set_charset('utf8') for mysqli, or mysql_set_charset('utf8') for the deprecated mysql module, or using charset=utf8 in the connection string in PDO.
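Both symptoms in the question fit a website connection that is still on latin1 while phpMyAdmin talks utf8. The Python sketch below only simulates the byte handling; it is a plausible model of what happens, not a trace of the actual server.

```python
name = "Søren"

# Site writes UTF-8 bytes over a latin1 connection; a utf8 client
# (phpMyAdmin) later sees each byte as its own character.
seen_in_phpmyadmin = name.encode("utf-8").decode("latin-1")

# phpMyAdmin stores the name correctly; the site reads it over the latin1
# connection (ø becomes the single byte f8) and the page, declared as
# UTF-8, cannot decode that byte.
seen_on_site = name.encode("latin-1").decode("utf-8", errors="replace")

print(seen_in_phpmyadmin)
print(seen_on_site)
```

Setting the connection charset in the site's own database code, as described above, should make both views agree.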

Data in db is in wrong encoding (using CKeditor) and greek

I am using ckeditor 3.4 to insert data (text) to database and then display it on a page.
Problem: when I write Greek text in CKEditor, everything is fine. When I press the HTML button of CKEditor, again everything is fine (i.e. I see the actual text typed, not HTML entities). However, when I save the data (and hence store it in the db), the stored data looks like this:
"<p style="text-align: center;">
... σÏντομα πεÏισσότεÏες πληÏοφοÏίες...</p>
<p>
</p>"
Note: when I read the data back, it is displayed correctly on the web page.
Actions taken so far:
1- the connection file to the db has the following: $conn->query("SET NAMES 'utf8'");
2- In the config.js of the ckeditor I have added the following lines
CKEDITOR.editorConfig = function( config )
{
    config.entities = false;
    config.entities_greek = false;
    config.entities_latin = false;
    config.entities_processNumerical = false;
    // Define changes to default configuration here. For example:
    config.language = 'el';
    // config.uiColor = '#AADC6E';
};
3- my webpages are set to: content="text/html;charset=utf-8"
4- db collation: utf8_unicode_ci / type MyISAM
I've been searching around but no luck.
I'd appreciate any help
Thank you all for your answers.
Solution was much simpler.
The correct form is SET NAMES UTF8 rather than SET NAMES 'utf8'.
If you are using PHP or any other language that doesn't do this automatically, you need to invoke
SET NAMES 'UTF8'
on the connection before calling any statements, in order to use UTF-8 in your database.
Also make sure you are serving all pages as UTF-8 so that posted data is in UTF-8.
There are also some configuration parameters that control how the data is sent and processed by the server, but I have never managed to get it to work without this statement.
See more here: http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html
EDIT: Ah, sorry, didn't see that you actually did this. If it is displayed correctly when you output it and your charset is set to UTF-8 on the page, then I'm assuming that you only view it in the DB with a tool that doesn't support UTF-8, or isn't configured for it? So what exactly is the problem right now?