dropdown doesn't display UTF-8 correctly - html

I have a <select> element that serves as a dropdown of products. Some of these products have names containing special characters such as é. On the front end, instead of showing é, the dropdown shows garbled characters such as ä.
As a workaround I tried using HTML entities such as &Eacute; in place of é inside the text field. But when I replace é with &Eacute; inside the text field, the front end shows the literal text &Eacute;. My Magento store's charset is UTF-8.
I want to be able to use é, $, ä, etc. in my Magento store. Is there a way to solve this problem that doesn't affect the rest of the website?

You will have to save the file in UTF-8 as well: both the file presenting the text and the file that outputs the data that populates the select box.
A common mistake, at least for me, when working with UTF-8 is forgetting that everything has to be saved in it: scripts, code-behind, HTML, everything.

David Johansson is correct.
I had the same problem with a select box containing a list of names.
I populated it via a function that looked up the people and created a line for each person found. However, names containing accents didn't display correctly.
I resolved it by running my result through iconv before returning the value:
return iconv('ISO-8859-1','UTF-8', $retval);
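
To make the failure mode in these answers concrete, here is a minimal sketch in Python (chosen only for illustration; the thread itself uses PHP). It shows how the UTF-8 bytes for é turn into two garbage characters when read as ISO-8859-1, and roughly what a conversion like the iconv call above amounts to. The helper name latin1_bytes_to_utf8 and the sample name are assumptions, not part of the original code.

```python
# UTF-8 stores é as two bytes; reading them as ISO-8859-1 yields two characters.
utf8_bytes = "é".encode("utf-8")             # two bytes: 0xC3 0xA9
garbled = utf8_bytes.decode("iso-8859-1")    # each byte becomes its own character


def latin1_bytes_to_utf8(raw: bytes) -> bytes:
    """Hypothetical helper: rough equivalent of iconv('ISO-8859-1', 'UTF-8', $raw)."""
    return raw.decode("iso-8859-1").encode("utf-8")


# A name as it might come back from an ISO-8859-1 data source (made-up sample).
latin1_name = "René".encode("iso-8859-1")
converted = latin1_bytes_to_utf8(latin1_name)

print(garbled)                     # the mojibake form of é
print(converted.decode("utf-8"))   # the name, now valid UTF-8
```

The conversion only helps when the input really is ISO-8859-1. Running it on data that is already UTF-8 would double-encode it and produce the same kind of garbage.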

Related

How to truncate (ö, é etc.) efficiently from R to CSV to HTML?

I create some data in R (using RStudio) that I export as a CSV. This CSV is then loaded into an HTML page.
However, symbols like é, ö and ä always come out broken.
Is there a way to encode them in my R file so that they look right in the HTML, i.e. readable as é/ä/ö/ü?
Thank you!
You need to encode special characters, e.g. instead of ä, you need to put &auml;.
You can find the full list here: https://dev.w3.org/html5/html-author/charref
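
As a rough illustration of that substitution (written in Python rather than R, purely as a sketch), the function below replaces each non-ASCII character with its named entity where the standard library's HTML 4 table has one, and with a numeric reference otherwise. The function name to_html_entities is an assumption for this example.

```python
from html.entities import codepoint2name


def to_html_entities(text: str) -> str:
    """Replace non-ASCII characters with HTML character references."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)                           # plain ASCII stays as-is
        elif code in codepoint2name:
            out.append(f"&{codepoint2name[code]};")  # named entity, e.g. &auml;
        else:
            out.append(f"&#{code};")                 # numeric fallback
    return "".join(out)


print(to_html_entities("é/ä/ö/ü"))
```

The output is pure ASCII, so it survives whatever encoding the CSV or the HTML page ends up being saved in.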

TinyMCE adds à characters by replacing symbols and consecutive two spaces

I have a text area attached to a TinyMCE toolbar (version 4.6.6). When a symbol from the TinyMCE toolbar or two consecutive spaces are added to the text area and the page is then refreshed, the symbols or the double spaces are replaced by Â.
I've tried the suggestions from the TinyMCE forums and tried setting entity_encoding to named, raw and numeric, but none of these options work. Can someone please help?
You need to set the encoding in tinymce. The encoding should happen when you are saving the text as well as rendering text inside tinymce.
In your case, the text you saved and re-rendered has different encodings.
Update your code with the following. It worked for me.
tinymce.init({
....
encoding: 'xml',
entity_encoding: 'named+numeric+raw',
entities: '160,nbsp'
});
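
A likely reason the stray character is specifically Â, sketched in Python for illustration: a non-breaking space (what &nbsp; decodes to, and what editors commonly insert for the second of two consecutive spaces) is encoded in UTF-8 as the bytes C2 A0. If those bytes are later read as ISO-8859-1, the C2 byte shows up as Â. This assumes the save and render steps disagree about the encoding, as the answer above suggests; it does not reproduce TinyMCE itself.

```python
NBSP = "\u00a0"  # the character that &nbsp; stands for

# Stored as UTF-8, the non-breaking space becomes the two bytes 0xC2 0xA0.
stored = f"a{NBSP}b".encode("utf-8")

# Read back with the wrong encoding, the 0xC2 byte turns into a visible Â.
misread = stored.decode("iso-8859-1")
print(repr(misread))

# Kept as an entity, the text is plain ASCII and reads the same either way.
as_entity = "a&nbsp;b".encode("ascii")
print(as_entity.decode("iso-8859-1") == as_entity.decode("utf-8"))
```

This is why emitting the non-breaking space as an entity (the entities: '160,nbsp' line above) sidesteps the problem even when the encodings do not match.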

How can I show special characters like "e" with accent acute over it in HTML page?

I need to put the names of some universities on my web page. I have typed them as they are, but in some browsers, or maybe on some computers, they appear differently. For example, "Universite de Moncton" should have the second "e" in Universite with an acute accent over it. Could you please help me with this?
If you’re using a character set that contains that character, you can use an appropriate character encoding and use it literally:
Universit‌é de Moncton
Don’t forget to specify the character set/encoding properly.
If not, you can use an HTML character reference, either a numeric character reference that denotes the code point of the character in the Universal Character Set (UCS):
Universit&#233; de Moncton
Universit&#xE9; de Moncton
Or using an entity reference:
Universit&eacute; de Moncton
But this entity is just a named representation of the numeric character reference (see the list of entity references that are defined in HTML 4):
<!ENTITY eacute CDATA "&#233;" -- latin small letter e with acute,
U+00E9 ISOlat1 -->
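
As a quick check that the decimal, hexadecimal and named forms all denote the same character, here is a small Python sketch using the standard library's html.unescape. Python is used only for illustration; a browser does the equivalent decoding when it parses the page.

```python
from html import unescape

forms = [
    "Universit&#233; de Moncton",    # decimal numeric character reference
    "Universit&#xE9; de Moncton",    # hexadecimal numeric character reference
    "Universit&eacute; de Moncton",  # named entity reference
]

decoded = [unescape(form) for form in forms]
for text in decoded:
    print(text)

# All three references resolve to the same single code point, U+00E9.
print(len(set(decoded)) == 1)
```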
You can use UTF-8 HTML Entities:
&egrave; è
&eacute; é
&ecirc; ê
&euml; ë
Here's a handy search page for the UTF-8 Character Map
I think from the mention that 'in some computers or browsers they appear differently' that the problem you have is with the page or server encoding. You must
encode the file correctly (how to do this depends on your text editor)
assign the correct encoding in your webpage, done with a meta tag
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
force the server encoding with, for example, PHP's header() function:
header('Content-Type: text/plain; charset=ISO-8859-1');
Or, yes, as everyone has pointed out, use the html entities for those characters, which is always safe, but might make a mess when you try to find-replace in code.
There are two methods. One is by using "HTML entities." You need to enter them as, for example, &eacute;. Here is a comprehensive reference of named entities; you can also reference the Unicode code point of a given character, using its decimal form as &#1234; or its hex form as &#x4D2;.
Perhaps more common now (ten years after this answer was originally entered) is simply using Unicode characters directly. Rất dễ dàng, phải không? This is more acceptable and universal because most pages now use UTF-8 as their character encoding.
运气!
By typing it in to your HTML code. é <--You can copy and paste this one if you want.
Microsoft Windows has a tool called Character Map for accessing characters that are not on your keyboard.
http://www.starr.net/is/type/htmlcodes.html
This site shows you the HTML markup for all of those characters that you will need :)

IE munging pound (£) symbol

I have an HTML form which goes off to do all sorts of strange back-end things. This works fine in Firefox, and in most cases it works fine in IE.
However, the pound sterling sign (£) causes problems and seems to get munged in the submit.
The form is something like this:
<form action="*MyFormAction*" accept-charset="UTF-8" method="post">
I think I have seen this problem before but can't remember the solution.
edit, the euro symbol € works fine
edit 2,
In fact, if I put the € symbol together with a £ symbol, it also works fine. Looking at the problem: if I use characters that are not in the extended part of ISO-8859-1, it works OK. If I use extended characters from ISO-8859-1, they get munged. So how do I make IE use the character set that accept-charset says it should?
accept-charset="UTF-8"
Does not do what you think it does (or what the standard says it does) in IE. Instead, IE uses the value (‘UTF-8’) as a list of fallback encodings for when a field can't be encoded using the usual default encoding (which is the same as the page's own encoding).
So if you add this attribute and your page isn't already in UTF-8, you can be getting characters submitted as either the page encoding or UTF-8, and there is no way for your form-submission-reading script to know!
For this reason you should never use accept-charset; instead you should always ensure that the page containing the form is correctly served as “Content-Type: text/html;charset=utf-8” (by HTTP header and/or <meta>).
In fact if I put the € symbol with a £ symbol it also works fine.
Yes, that's because ‘€’ cannot be encoded in the page's default encoding (presumably ISO-8859-1). So IE resorts to sending the field encoded as UTF-8, which is what you wanted all along.
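
The behaviour described in this answer can be modelled in a few lines of Python. This is a hypothetical model of the description above, not IE's actual implementation: the function name ie_like_submit and its parameters are made up for illustration, and Python's strict ISO-8859-1 codec stands in for the page encoding.

```python
def ie_like_submit(value: str, page_encoding: str, accept_charset: str = "utf-8"):
    """Hypothetical model: try the page encoding first, fall back to accept-charset."""
    try:
        return page_encoding, value.encode(page_encoding)
    except UnicodeEncodeError:
        # Only reached when some character does not exist in the page encoding.
        return accept_charset, value.encode(accept_charset)


# £ exists in ISO-8859-1, so the field goes out in the page encoding (byte A3).
print(ie_like_submit("£5", "iso-8859-1"))

# € does not exist in ISO-8859-1, so the whole field falls back to UTF-8.
print(ie_like_submit("£5 or €6", "iso-8859-1"))
```

The receiving script sees only the bytes, not which branch was taken, which is the ambiguity the answer warns about.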
I think bobince has the ideal answer, which is "serve the page in UTF-8". However, as I can't do this, I am posting my workaround for posterity.
Adding a hidden field named unmunge that contains a character outside ISO-8859-1 (the encoding our pages are served in) forces the submission into UTF-8.
So
<input type="hidden" name="unmunge" value="&#8364;" />
fixes the encoding (the entity is the euro symbol).
How is the £ submitted? If it's in an input box for a price don't submit it, only allow numbers to be submitted and add the £ when you display the price again. Or add the currency symbol in the backend script.
I am not sure if this will help (read the entire article at http://fyneworks.blogspot.com/2008/06/british-pound-sign-encoding-revisited.html)
Excerpt:
THE PROBLEM: If you look at the UTF-8/Latin-1 (AKA ISO-8859-1) Character Table you will find that the decimal code for the British pound sterling sign is 163 - and the hexadecimal code is A3.
£ = %A3
However, this is not the case in (all) encoding/decoding functions in Javascript...
encodeURI/encodeURIComponent: Encodes a Uniform Resource Identifier (URI) component by replacing each instance of certain characters by one, two, or three escape sequences representing the UTF-8 encoding of the character
Which means, in order to encode our beloved pound sign, Javascript uses 2 characters. This is where the annoying "Â" comes in...
£ = %C2%A3
Hope it helps.
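
The two percent-encoded forms quoted in the excerpt can be reproduced with Python's urllib.parse (used here only for illustration; the excerpt itself is about JavaScript). The sketch also shows where the stray "Â" comes from: UTF-8 escapes decoded as if they were ISO-8859-1.

```python
from urllib.parse import quote, unquote

utf8_form = quote("£")                            # percent-encodes the UTF-8 bytes
latin1_form = quote("£", encoding="iso-8859-1")   # percent-encodes the Latin-1 byte

print(utf8_form)
print(latin1_form)

# Decoding the UTF-8 escapes with the wrong charset produces the extra character.
misread = unquote(utf8_form, encoding="iso-8859-1")
print(misread)

# Decoding with the matching charset round-trips cleanly.
print(unquote(utf8_form, encoding="utf-8"))
```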

File upload mojibake

How do you do a file upload in an HTML form without running into mojibake?
I have a form that has three fields:
a file field
a required text field
a text field which accepts Japanese characters
I've set up my HTML form with the attribute enctype='multipart/form-data'. But when the form submission fails due to the missing required field, I get redirected to the same page and my second text field (the one that accepts Japanese characters) is already mojibaked.
However, if I remove the enctype or change it to anything else, then when the form submission fails I see the Japanese characters as they are (no mojibake). The problem is that when the submission succeeds, I am unable to read the uploaded files.
Any ideas how to fix this??
Mojibake (mangled display of Japanese characters) can have two causes:
The data on the page is in the right character encoding, but the browser does not recognize it.
Some characters on the page use the wrong encoding (the server wrote them in an incorrect encoding).
If the other characters on the page (outside of your form) show correctly, you produced broken output on your server.
If everything is clobbered, and you can fix it by manually setting a different encoding from the browser's menu, then the page encoding is not properly specified.
What kind of content-type headers and HTML meta tags do you use?
I've figured it out (by reverse-engineering AppFuse (appfuse.org), which does not seem to be affected by mojibake in its file upload form).
I solved it by setting the character encoding to UTF-8 on the server side (with Spring's org.springframework.web.filter.CharacterEncodingFilter). So I guess multipart/form-data really does interfere with the character encoding (or at least it does in Java).