Encoding problem (Hebrew UTF8) in WordPress - mysql

I am trying, and failing, to fix a friend's blog:
http://www.nivcalderon.com/
The language of the website is Hebrew, but the encoding scrambles the output, and I can't find how to fix it.
I tried changing the DB collation to utf8_general_ci.
I added this to wp-config.php:
define('DB_COLLATE', 'utf8_general_ci');
(I also added define('DB_CHARSET', 'utf8'); but removed it later, since it didn't seem to fix the problem.)
Any ideas what else to try?
Thanks

The issue is caused by a bad import, which left the database containing double-encoded UTF-8 strings. It can be fixed by exporting the tables as latin1 and then importing them as UTF-8. This is not WordPress's fault.
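To make "double-encoded" concrete, here is a minimal Python sketch of how the damage arises and why re-reading the data as latin1 reverses it. The sample word and the function names are made up for illustration, and Python's latin-1 codec only stands in for MySQL's latin1, which is close to it but not identical.

```python
# Hypothetical sample text; any Hebrew string behaves the same way.
original = "שלום"


def double_encode(text: str) -> str:
    """Simulate the bad import: UTF-8 bytes are misread as latin-1 characters."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Reverse the damage: recover the raw bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


garbled = double_encode(original)
restored = repair(garbled)

# Each Hebrew letter is two bytes in UTF-8, so the garbled text is twice as long.
print(len(original), len(garbled))
print(restored == original)
```

The export-as-latin1 step plays the role of repair() here: it turns the wrongly stored characters back into the original bytes, which are then imported as UTF-8.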

Related

MySQL Database Shows Text Like '×¨×•×œ×¨ ×—×•×'

I have imported a database which includes Hebrew. The Hebrew in the database looks like this: קירורית ל 4 פחיות מבית ×מגזית
I tried to change the encoding to UTF-8 but it still looks the same.
How can I fix this?
If any more information is needed, tell me and I'll provide it.
Thanks.
That's "Mojibake" for something close to קירורית ל 4 פחיות מבית �מגזי, correct?
See Trouble with UTF-8 characters; what I see is not what I stored for the cause, to wit:
The bytes to be stored need to be UTF-8-encoded. Fix this.
The connection when INSERTing and SELECTing text needs to specify utf8 or utf8mb4. Fix this.
The column needs to be declared CHARACTER SET utf8 (or utf8mb4). Fix this.
HTML should start with <meta charset=UTF-8>.
The appropriate fix for the data is here: http://mysql.rjweb.org/doc.php/charcoll#fixes_for_various_cases

jsp insert into database no utf-8

I have a web app (JSP) that uses a MySQL database. I put some data into the database to test whether the JSP shows what is in the database. There was an issue with UTF-8 characters (Polish letters), but I fixed it by adding <parameter-encoding default-charset="UTF-8" /> to glassfish-web.xml. But I still have a problem with putting data into the database from a form: instead of Polish characters I get "?????". I've tried many things and nothing worked. I don't really know where to look to fix it.
OK, problem solved. I'll post the answer for other people having the same problem.
In the JSP where I open my database connection, I changed url="jdbc:mysql://localhost/databasename" to
url="jdbc:mysql://localhost/databasename?useUnicode=true&characterEncoding=UTF-8"
and now everything in the database looks like it should.
The problem you faced is that Java has escaped the UTF-8 character sequence.
You can use StringEscapeUtils, provided by Apache Commons Lang, to unescape any characters that show up as ??? or anything else.
Try this:
str = org.apache.commons.lang.StringEscapeUtils.unescapeJava(str);

mysql - How to save ñ

Whenever I try to save ñ it becomes ? in the MySQL database. After some reading, I found suggestions that I have to change my JSP charset to UTF-8. For some reason I have to stick to ISO-8859-1. My database table encoding is latin1. How can I fix this? Please help.
Go to your database administration tool (MySQL Workbench, for example), set the engine to InnoDB and the collation to utf8 - utf8_general_ci.
You state in your question that you require an ISO-8859-1 backend (latin1) and a Unicode (UTF-8) frontend. This setup is crazy, because the character set on the frontend is much larger than the one allowed in the database. The sanest thing would be to use the same encoding throughout the software stack, but using Unicode only for storage would also make sense.
As you should know, a String is a human concept for a sequence of characters. In computer programs, a String is not that: it can be viewed as a sequence of characters, but it's really a pair data structure: a stream of bytes and an encoding.
Once you understand that passing a String is really passing bytes and a scheme, let's see who sends what:
Browser to HTTP server (usually same encoding as the form page, so UTF-8. The scheme is specified via Content-Type. If missing, the server will pick one based on its own strategy, for example default to ISO-8859-1 or a configuration parameter)
HTTP Server to Java program (it's Java to Java, so the encoding doesn't matter since we pass String objects)
Java client to MySQL server (the Connector/J documentation is quite convoluted - it uses the character_set_server system variable, possibly overridden by the characterEncoding connection parameter)
To understand where the problem lies, first make sure that the column is really stored as latin1:
SELECT character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = :DATABASE
AND table_name = :TABLE
AND column_name = :COLUMN;
Then write the Java string you get from the request to a log file:
logger.info(request.getParameter("word"));
And finally see what actually is in the column:
SELECT HEX(:column) FROM :table
At this point you'll have enough information to understand the problem. If it's really a question mark (and not a replacement character) likely it's MySQL trying to transcode a character from a larger set (let's say Unicode) to a narrower one which doesn't contain it. The strange thing here is that ñ belongs to both ISO-8859-1 (0xF1, decimal 241) and Unicode (U+00F1), so it'd seem like there's a third charset (maybe a codepage?) involved in the round trip.
More information may help (operating system, HTTP server, MySQL version)
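As a rough aid for reading the HEX() output, the following Python sketch shows which bytes to expect in each case. It only illustrates the encodings involved; the variable names are invented here, and Python's codecs stand in for whatever conversion MySQL actually performs.

```python
# ñ exists in both character sets, so a latin1 column can hold it.
latin1_bytes = "ñ".encode("latin-1")   # one byte: F1
utf8_bytes = "ñ".encode("utf-8")       # two bytes: C3 B1

# A character outside latin-1 (here the Romanian ș, U+0219) cannot be stored;
# a lossy conversion substitutes a literal question mark, byte 3F.
lossy = "ș".encode("latin-1", errors="replace")

# A replacement character (U+FFFD) is something else: it appears when bytes
# that are not valid UTF-8 (here a lone latin-1 F1) are decoded as UTF-8.
decoded = b"MU\xf1OZ".decode("utf-8", errors="replace")

print(latin1_bytes.hex(), utf8_bytes.hex(), lossy.hex())
print("\ufffd" in decoded)
```

So a stored 3F means the character was already lost on the way in, F1 means the column holds a proper latin1 ñ, and C3B1 in a latin1 column means UTF-8 bytes were stored without conversion.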
Change your table's character set to UTF-8.
Here's the command to change the default for the whole database (it applies to tables created afterwards; existing tables are not converted):
ALTER DATABASE db_name DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
And this converts a single existing table:
ALTER TABLE db_table CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
Change your table's collation to utf8_spanish_ci, where ñ is not equal to n. If you want both characters to be treated as equal, use utf8_general_ci instead.
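The difference between the two collations can be imitated in Python. This is only an approximation for illustration, not MySQL's actual collation algorithm, and both function names are made up: one key strips all accents (roughly what utf8_general_ci does for comparisons), the other keeps ñ as a letter of its own (roughly utf8_spanish_ci).

```python
import unicodedata


def general_ci_key(text: str) -> str:
    """Approximate utf8_general_ci: ignore case and strip every accent."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def spanish_ci_key(text: str) -> str:
    """Approximate utf8_spanish_ci: like the above, but ñ stays distinct from n."""
    folded = unicodedata.normalize("NFC", text).casefold()
    return "ñ".join(general_ci_key(part) for part in folded.split("ñ"))


print(general_ci_key("MUÑOZ") == general_ci_key("munoz"))   # compared as equal
print(spanish_ci_key("MUÑOZ") == spanish_ci_key("munoz"))   # compared as different
```

With the first key a search for "munoz" also matches "MUÑOZ"; with the second it does not.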
I tried several combinations, and this is what worked for me:
VARCHAR(255) BINARY CHARACTER SET utf8 COLLATE utf8_bin
When the data is retrieved in dbForge Express, it shows like:
NIÑA
but in the application it shows like:
NIÑA
I had the same problem and found out that it is not an issue of UTF-8 encoding or any other charset. I imported my data from Windows ANSI, and all my Ñ and ñ were put into the database exactly as they should be. For example, last names showed in the database as last_name = "MUÑOZ". I was able to select normally from the database with the query Select * from database where last_name LIKE "%muñoz%", and phpMyAdmin showed me the results fine: it selected all "MUÑOZ" and "MUNOZ" rows without a problem. So phpMyAdmin does show all my Ñ and ñ without any problems.
The problem was the program itself. All the characters I mentioned showed up as you describe, with the funky question mark: "MU�OZ". I had followed all the advice I could find, set my headers correctly and tried every charset available. I even tried Google Fonts and every other font available to display those last names correctly, but without success.
Then I remembered an old program that was able to do the trick back and forth transparently, and I peeked into its code to figure it out: the database itself, which showed all my special characters, was the problem. Remember, I uploaded using Windows ANSI encoding. phpMyAdmin did as expected and uploaded everything as instructed.
The old program fixed this problem by translating the Ñ to its Unicode HTML entity, &Ntilde; (see the chart at https://www.compart.com/en/unicode/U+00D1 ), a process done back and forth between MySQL and the app.
So you just need to change the database strings containing the letters Ñ and ñ to their corresponding HTML entities so that they display correctly in a browser using a UTF charset.
In my case, I solved my issue by replacing every Ñ and ñ with its corresponding entity in all the last names in my database:
UPDATE database_name
SET
last_name = REPLACE(last_name,
'MUÑOZ',
'MU&Ntilde;OZ');
Now I'm able to display, browse, and even search all my last names correctly, with the accents and tildes proper to the Spanish language. I hope this helps. It was a pain to figure out, but an old program solved the problem. Best regards and happy coding!
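For anyone who wants to try the entity translation outside the database, here is a small Python sketch of the back-and-forth conversion. The helper names are hypothetical, and the numeric form &#209; is used for output simply because that is what Python's standard codec produces; a browser treats it the same as the named entity.

```python
import html


def to_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric HTML entity."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def from_entities(text: str) -> str:
    """Turn named or numeric HTML entities back into real characters."""
    return html.unescape(text)


stored = to_entities("MUÑOZ")
print(stored)
print(from_entities(stored) == "MUÑOZ")
```

Keep in mind that once entities are stored, a search for the plain letter ñ no longer matches unless the search term is converted the same way.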

MySQL UTF8 Problem

I have a strange UTF8 encoding problem, which I don't understand.
If a friend of mine fills out a form on my webpage, all German umlauts (ä, ü, ö) are displayed as strange characters in my database. When I do the same, they are displayed normally, as they should be. Everything is set to utf8_general_ci, so it should work. But it doesn't when my friend fills out the form.
Does anyone have a suggestion for me?
Thanks!
Even though all tables are UTF-8, the database connection might be using latin-1. What output do you get with SHOW VARIABLES LIKE '%character%'; in MySQL? Any signs of latin-1 there? If so, adjust your charset settings in the MySQL configuration file.
You haven't specified the language you write your app in, and this seems to be a connection-based problem. You must set the connection encoding manually, e.g. in JDBC by appending "?characterEncoding=utf8" to the end of the connection string.
Run SET NAMES utf8 on MySQL right after connecting.

Charset Problem while migrating database

I have a custom-made CMS that I must migrate to WordPress. Everything worked fine except the charset module.
Since this is Romanian blog content, some special characters are used (such as ă, î, ș, â, Ț). When I insert this content into WordPress's wp_posts table, WordPress displays them as "?".
I've tried all kinds of things, like changing the charset from utf8 to latin1, latin2, and so on, but with no result.
Moreover, when I try to replace the special characters with plain ones (e.g. ă to a, î to i), nothing happens and the content remains the same (some characters are actually changed, but not all).
What am I doing wrong, and what must I do to get it right?
Thanks!
Character sets are a complete nightmare. What I'd do is use mysqldump to dump your database to an SQL file. Check whether the special characters still look right.
Then, using find and replace in a text editor, replace all special characters with the correct HTML entity, e.g. Ă becomes &#258;.
http://meta.wikimedia.org/wiki/Help:Romanian_characters
Then delete your database, set all conceivable settings to UTF-8, and import your dump.
WordPress also has an extensive article about character encodings.
Good luck!