I have an old site where visitors can add their comments. Until now it has always worked well; it doesn't have many visitors (it's for a niche audience). It was built in classic ASP and uses MySQL (now 5.6). It runs on IIS 8.5 and connects to the DB without a DSN.
Whenever someone adds emoji characters to their posts, the IIS service goes into some kind of loop that uses more than 60% of the CPU and never stops.
I do not want to filter these characters out, I think they fit in well with the site's premise, however I did not foresee this issue. When I first set up MySQL I used UTF-16 to make sure my users could write in any language, and I never had issues until now. There are messages in what looks like Japanese and Korean, and I only figured out it was an issue with Emojis when a user told me what he was doing when the site crashed on him.
All the site's pages/files are saved in Unicode and for all of them the charset is set as "utf-8".
The database's collation is utf16_unicode_ci and so are the tables'.
I can insert emojis into the tables directly from the command line or via HeidiSQL; however, the server sends the emojis as question marks (?).
Here's my connection string:
Driver={MySQL ODBC 5.3 Unicode Driver};Server=...;User=******;Password=******;Option=3;charset=utf16;
Use CHARACTER SET utf8mb4 end-to-end in MySQL.
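A small sketch of why emoji are the characters that break here, while Japanese and Korean text works: emoji sit above U+FFFF, so they need four bytes in UTF-8 (or a surrogate pair in UTF-16). The helper name `fits_in_3_byte_utf8` is made up for illustration; it only mimics the limit of MySQL's older 3-byte utf8 charset and is not part of any driver.

```python
# Illustrative only: emoji need 4 bytes in UTF-8, which is what utf8mb4 allows.
emoji = "\U0001F600"  # GRINNING FACE, a code point above U+FFFF

utf8_bytes = emoji.encode("utf-8")      # 4 bytes
utf16_bytes = emoji.encode("utf-16-be")  # a surrogate pair (2 x 2 bytes)

print(hex(ord(emoji)))     # the code point
print(utf8_bytes.hex())    # UTF-8 bytes as hex
print(utf16_bytes.hex())   # UTF-16 code units as hex


def fits_in_3_byte_utf8(text: str) -> bool:
    """Hypothetical check: True if no character needs more than 3 UTF-8 bytes."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


print(fits_in_3_byte_utf8("日本語"))  # CJK text stays within 3 bytes per character
print(fits_in_3_byte_utf8(emoji))    # emoji do not
```

A check like this could be used to log which posts contain 4-byte characters while the connection and column settings are being sorted out.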
I feel I'm a bit in over my head on this one. I have developed an ASP.Net MVC website for a friend that allows them to paste in Hebrew words and it does some conversion/translation. I am using MySQL as a data backend with ASP.Net MVC 5.
The website is fairly simple. The database consists of two tables which store letters, and translations. I am using MySQL EF6 for data access layer. There are basically three screens on the website, one for managing each table, and one for doing the translations.
When I run it in my development environment (VS 2017/Windows 10), everything works as expected. I can edit data using the Hebrew Unicode characters and they save properly to the database. Here is an example:
When I click Save, I expect those values to be saved to the database, and they are. However, I have recently converted the website to run on a Mono/Ubuntu environment for hosting. I got the environment set up using mod_mono and Apache2. Everything is working perfectly, except when I save a page like this, the Hebrew character א gets converted into a question mark (?):
Here's what I've determined so far.
I know Apache/MySQL is set up properly to handle these values, because the data displays fine. It only gets messed up when I save it.
I am also running PhpMyAdmin on the same server, and when I modify that same row through the table editor, it does not mess up the encoding.
I've tried adding the Default Encoding utf-8 to the Apache configuration with no luck.
I've tried adding globalization with default encodings of utf-8 to web.config and it didn't help.
How do I troubleshoot where the value is getting messed up? Is there a simple solution I need to apply to fix this?
Thanks!
Question marks usually mean one of two things. Either the bytes to be stored are not encoded as utf8/utf8mb4 (fix this), or the column in the database is not CHARACTER SET utf8 (or utf8mb4) (fix this).
Also, check that the connection during reading is UTF-8.
HTML forms should start like <form accept-charset="UTF-8">.
For more discussion, see Trouble with utf8 characters; what I see is not what I stored
If that is not enough to solve your problem, find the HEX, as discussed in "Test the data" in that link; then ask for more help.
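As a rough illustration of what to look for when checking the hex, assuming the text really is UTF-8: the Hebrew letter א should appear as the two bytes D7 90, whereas a value that was already lost on the way in shows up as 3F, a literal question mark. The Latin-1 step below is only a stand-in for "some non-Unicode charset in the chain"; which charset is actually involved on the Mono server is not known from the question.

```python
# Illustrative only: what a healthy value and a lost value look like in hex.
aleph = "א"  # U+05D0 HEBREW LETTER ALEF

# Correctly encoded UTF-8: this is what HEX() on the column should show.
utf8_hex = aleph.encode("utf-8").hex().upper()

# Passing through a charset with no Hebrew letters (Latin-1 is an assumption
# here) replaces the character with "?", and the original is unrecoverable.
lossy = aleph.encode("latin-1", errors="replace")

print(utf8_hex)     # hex of the UTF-8 bytes
print(lossy)        # the substituted byte string
print(lossy.hex())  # hex of the substituted byte
```

If the stored hex is already 3F, the damage happened before or during the INSERT, so the place to look is the client connection rather than the page output.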
I moved a self-developed web application based on Perl/CGI and MySQL from one server to another, and since then special characters in database content, such as German umlauts, are shown as a black diamond with a question mark. Everything else, even text with umlauts that comes from the script itself, is OK. After a lot of research I still have no clue what might be causing the issue.
The original server is openSuSE 13.1 with Perl 5.18.1 and MySQL 5.6.25, while the new server is Debian 8 with Perl 5.20.2 and MySQL 5.5.44. I transferred all files as zipped tar files and configured MySQL and Apache identically, i.e. setting utf-8 and so on. After dumping and restoring the SQL data I was able to verify that the data in the new MySQL is fine.
I have tried many things so far, playing around with use Encode and use utf8 in Perl and setting DBI to utf8, but nothing helps. I feel quite lost now, so any hint is appreciated. Let me know if more info is necessary.
Kind regards,
Uwe
The post suggested by bytepusher helped me. When I use both use utf8; and use open ':std', ':encoding(UTF-8)'; everything works fine. However, I am still wondering why it worked without these statements on the old machine...
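A sketch of where the black diamond comes from, written in Python rather than Perl and assuming the page is declared as UTF-8 while the database text leaves the script as single-byte Latin-1: the byte for ü is not valid UTF-8 on its own, so the browser substitutes U+FFFD. Adding an output encoding layer, as the use open line does, corresponds to the second path below.

```python
# Illustrative only: an umlaut sent as one Latin-1 byte, then read as UTF-8.
text = "für"

# Without an output encoding layer the umlaut goes out as a single byte (0xFC).
latin1_bytes = text.encode("latin-1")

# A browser that was told the page is UTF-8 cannot decode that byte and shows
# the replacement character U+FFFD (the black diamond with a question mark).
seen_by_browser = latin1_bytes.decode("utf-8", errors="replace")

# With a UTF-8 output layer the same text round-trips cleanly.
correct = text.encode("utf-8").decode("utf-8")

print(latin1_bytes.hex())
print(repr(seen_by_browser))
print(correct)
```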
We are hosting a contest on our site that is open to the international community. A small percentage of our users are Japanese and have asked to be able to post comments on our site using Hiragana.
Currently, their comments show up as strings of ?????? question marks.
We are using a Win 2008 server running IIS 7 and Coldfusion 10. The DB where the comments are stored (and also appear as ?????? question marks) is SQL Server 2012.
The site is currently using the UTF-8 charset:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
Not sure where I need to make changes. DB? CF? Windows? IIS? Website code? Any ideas?
I've found other similar questions, but they usually have to do specifically with WordPress, Joomla, or sites that are entirely Japanese.
Thanks!
You might claim you are using UTF-8, but are you really? If your database, strings (the programming language might need to be told to handle strings as UTF-8), and actual output encoding aren't UTF-8, then you won't get proper results.
Then there is the font issue; many characters are not included in every font and thus don't work on a lot of computers.
Also try setting headers like this as actual headers.
Collation has nothing to do with this, and neither do fonts (your Unicode data is getting garbled going into the DB). So...
You should be talking to your DB via one of the JDBC drivers, not ODBC.
Your DB should be Unicode capable, and you must use Unicode-capable datatypes to hold the data (e.g., for SQL Server use the "N" datatypes like Nvarchar, etc.).
I assume you're using cfqueryparam (it's a user-facing form after all), so you need to enable the "Enable High ASCII characters"... option for that datasource in cfadmin (under the advanced menu).
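To illustrate why the datatype matters, here is a small sketch that assumes a non-Unicode column backed by a Western European code page (cp1252 is a guess; the actual code page depends on the column's collation). Hiragana has no representation there, so every character turns into a question mark on the way in, while a UTF-16 representation, as used by the "N" types, round-trips.

```python
# Illustrative only: Hiragana in a single-byte code page vs. UTF-16.
comment = "こんにちは"

# A single-byte Western code page (assumed here) has no Hiragana at all,
# so each character is replaced with "?" and the original text is gone.
as_varchar = comment.encode("cp1252", errors="replace")

# UTF-16 little-endian can represent the text and decode it back unchanged.
as_nvarchar = comment.encode("utf-16-le")
round_trip = as_nvarchar.decode("utf-16-le")

print(as_varchar)
print(len(as_nvarchar))
print(round_trip == comment)
```

The run of question marks matches what the question describes in both the page and the table, which points at the write path rather than at display.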
I'm having issues with the encoding of the '£' character. (Issue: when POSTing '£' from a form field and doing an insert, nothing is inserted in the MySQL table.) I've checked everything with regard to UTF-8 support: my PHP code headers, the server, the collation/character set on MySQL, etc.
I'm using MAMP as my dev environment (PHP 5.3.5).
Everything works fine on my production server (commercial host) (PHP 5.2.6) so I've ruled out any issues with my code
However, I think I have tracked down the culprit: When comparing both environments, this line is missing from my dev server:
_ENV["HTTP_ACCEPT_CHARSET"] ISO-8859-1,utf-8;q=0.7,*;q=0.3
However, there is nothing in php.ini I can see to change it. Any ideas, or am I barking up the wrong tree?
Cheers
Roland
I'd write a simple test to check out where things are going wrong.
echo() out the value from $_POST in PHP and verify whether the browser is sending the data correctly and that it's being parsed into PHP correctly. When you do this test, make sure that the browser has correctly detected the character set.
If that works then you're likely to have mis-configured something with the database. If you have both the table collation and the connection encoding as "UTF-8" then you should have no problems saving the data into MySQL (if both SET NAMES and the table collation are the same no translation will go on so it'll be stored correctly in the table).
You didn't mention the MySQL connection anywhere in your question (just "etc"). Just in case there's anything you've missed have a good look over this article How to Avoid Character Encoding Problems in PHP
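The kind of check described above can be sketched as follows, in Python rather than PHP. It rests on one assumption: that the form data might be arriving as ISO-8859-1 instead of UTF-8. The '£' sign is a single byte (A3) in ISO-8859-1 but two bytes (C2 A3) in UTF-8, and a lone A3 is not valid UTF-8, which is one plausible reason a UTF-8 connection would reject or truncate the value. The helper `is_valid_utf8` is a made-up name for this example.

```python
# Illustrative only: the same character in two encodings, plus a validity check.
pound = "£"

latin1 = pound.encode("latin-1")  # one byte
utf8 = pound.encode("utf-8")      # two bytes


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(latin1.hex(), is_valid_utf8(latin1))
print(utf8.hex(), is_valid_utf8(utf8))
```

Dumping the hex of the posted value and comparing it with these two forms shows which encoding the browser actually used.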
We had Redmine working with MySQL (and MySQL works fine with utf8). Now we needed to migrate the database to SQL Server (latin1 is the default for us). The accented characters are OK in SQL Server after the migration, but in the browser, data coming from the database shows ? in place of the accents. What could be the solution to show the characters correctly in the browser?
HTML encode it. Browsers support HTML only, and HTML says all non-English characters are to be specially coded.
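A sketch of what HTML-encoding the text could look like, in Python rather than Ruby and assuming numeric character references are acceptable in the output: each non-ASCII character becomes an ASCII-only reference, so the bytes sent to the browser no longer depend on the page charset. This only helps if the accented characters are still intact when they reach the application.

```python
import html

# Illustrative only: turning accented characters into numeric character references.
text = "acentuação"

# Every non-ASCII character is replaced with &#NNN; leaving plain ASCII output.
encoded = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

# A browser (or html.unescape) turns the references back into the characters.
decoded = html.unescape(encoded)

print(encoded)
print(decoded)
```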