Displaying HTML codes - ¼ and ” - html

I am trying to display the real values of ¼ and ”, but the browser displays them as ? (question mark) in an area where AJAX is enabled, while elsewhere on the page it displays ¼” correctly.
I want to display 1" but it is displayed as 1â€.
Please advise.

I would guess you're doing something like fetching HTML from a second page and writing it to innerHTML.
Make sure that all your HTML is saved in the UTF-8 encoding. XMLHttpRequest.responseText will decode content from UTF-8, unless the response contains a Content-Type: ...;charset=something-else header. (A <meta> element is not good enough as XMLHttpRequest doesn't parse HTML to get the meta out; it has to be a proper HTTP header.)
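Both symptoms in the question are what a charset mismatch tends to produce. A minimal Python sketch, assuming the AJAX fragment was saved in one encoding and decoded as another (the helper name decode_as is made up for illustration):

```python
def decode_as(raw: bytes, charset: str) -> str:
    """Decode bytes the way a client would if it assumed `charset`."""
    # errors="replace" substitutes U+FFFD for undecodable bytes,
    # which browsers typically draw as a question-mark glyph.
    return raw.decode(charset, errors="replace")


# A Latin-1 file stores ¼ as the single byte 0xBC, which is not valid UTF-8.
latin1_as_utf8 = decode_as("¼".encode("latin-1"), "utf-8")

# A UTF-8 file stores ” as E2 80 9D; read as Windows-1252 the first two
# bytes become "â€" and the third has no mapping.
utf8_as_cp1252 = decode_as("”".encode("utf-8"), "cp1252")

print(repr(latin1_as_utf8))
print(repr(utf8_as_cp1252))
```

If the saved encoding and the declared encoding agree, the text round-trips unchanged.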

Related

How does the browser practically detect the HTML page encoding?

Let's suppose that a character encoding called X exists (for example UTF-8). If I insert the tag <meta charset="X"> in the HTML file and then save the file, obviously with the same encoding, how can the browser read the file later?
I mean, how can the browser know the encoding of an HTML page if, to get the encoding, it must read the file? It seems like a kind of loop.
According to https://www.w3.org/TR/html4/charset.html#h-5.2.2, a browser gets the correct encoding from the Content-Type header field of the HTTP response. If this field is not present, the browser reads the HTML page until the META tag, assuming all bytes were ASCII characters. So this only works if ASCII is a subset of the actual encoding.
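The order described above (HTTP header first, then an ASCII-compatible scan for the META tag) can be sketched in Python. This is a simplified illustration, not the algorithm any particular browser implements; the function name sniff_encoding, the 1024-byte scan window, and the windows-1252 fallback are all assumptions made for the example.

```python
import re
from typing import Optional

# Matches both <meta charset="..."> and the older http-equiv form.
META_RE = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def sniff_encoding(content_type: Optional[str], body: bytes,
                   default: str = "windows-1252") -> str:
    """Pick an encoding: HTTP header first, then a <meta> scan, then a default."""
    if content_type:
        m = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type, re.IGNORECASE)
        if m:
            return m.group(1).lower()
    # The scan works on raw bytes, so it only finds the tag when the real
    # encoding represents ASCII characters as the same single bytes.
    m = META_RE.search(body[:1024])
    if m:
        return m.group(1).decode("ascii").lower()
    return default


print(sniff_encoding("text/html; charset=ISO-8859-1", b'<meta charset="utf-8">'))
print(sniff_encoding(None, b'<html><head><meta charset="utf-8">'))
```

The first call shows the header winning over the tag; the second shows the tag being used when no header charset is present.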

How to Prevent Browser Control from URLDecoding an Embedded URL Link

Please forgive my lack of knowledge surrounding HTML
I am trying to generate a static HTML page which is rendered in an embedded HTML Browser component in a 3rd party application.
In the HTML Body I have a URL Link embedded within the page.
Access Application
Note that the above URL is "URLEncoded". Specifically, the query string after "encrypt=" is encrypted, and then URLEncoded.
Problem
The HTML browser component embedded in the 3rd party application renders the HTML and all appears fine, EXCEPT that it Decodes the URL String.
This results in a hyperlink with the following URI;
https://mydomain.com.au/Web/Default.ashx?encrypt=x+NWTAVMqprD+ZyFtf1tfBVfIfhqKJ3JCjMmiXiSJSUl6n4FzCuW8mwQfpNskdQEvqU7QiWMdR+bu9y6+iO8eh41XwGJX9l5iCYZunTamhGdkkiR9CqVCrkStu+zAlhqcJYG6M0zztcActpm6iSn99gXDlw8z+Hs8Q88N9fZyXdYpxspgl+AoGZe7hR3zOulJb1YhabyBbf+kfI0dq1YQpHn3SWig8HuWvBANXPrPHDqAOsnT1DtJQ==
Note the presence of characters such as "+" and "=", which cause the target application of the URL to fail to load.
Is there any way to prevent a browser (browser control?) from decoding this URL string and maintaining its integrity?
I am thinking off the top of my head, and I don't really understand the purpose of this suggestion but will defining a "type" attribute on the Link such as
<a type="application/x-www-form-urlencoded" href="xxx.com/ddddd" class="Action">Access Application</a>
have any effect?
How else can I prevent the browser control from decoding this URL?
Meta Tags in the < head > ???
Thanks in Advance!
Kind Regards
Aaron
We had to escape the % symbol.
For example: changing %2B in the link to %252B (%25 being the escape code for a % symbol). Likewise, changing %3D to %253D had the same effect and prevented the client application from rendering %3D as an = sign.
We couldn't stop the client application from 'decoding' the URL entirely, but at least it now decoded to the correct URL value.
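The double-escaping workaround can be illustrated in Python. This is a sketch under the assumption that the embedded browser component decodes the URL exactly once; the token value is a shortened, hypothetical stand-in for the encrypted query string.

```python
from urllib.parse import quote, unquote

token = "x+NWTAVMqprD+Zy=="                        # hypothetical encrypted value
encoded_once = quote(token, safe="")               # '+' -> %2B, '=' -> %3D
encoded_twice = encoded_once.replace("%", "%25")   # %2B -> %252B, %3D -> %253D

# What the target application receives after the component decodes once:
after_component = unquote(encoded_twice)

print(encoded_once)
print(encoded_twice)
print(after_component == encoded_once)
```

After the single unwanted decode, the link is back to its singly encoded form, which is what the target application expects.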

Unwanted characters being added to url in HTML

I'm trying to include a simple hyperlink in a website:
...Engineers (IEEE) projects:
So that it ends up looking like "...Engineers (IEEE) projects:" with "IEEE" being the hyperlink.
When I click on copy link address and paste the address, instead of getting
http://www.ieee.ucla.edu/
I get
http://www.ieee.ucla.edu/%C3%A2%E2%82%AC%C5%BD
and when I click on the link, it takes me to a 404 page.
Check the link. These special characters are added automatically by the browser (URL encoding).
Url Encoding
Use this code and it will work:
IEEE
The proper format for adding a hyperlink to an HTML page is as follows:
(texts to be hyperlink)
and for better understanding go through this link http://www.w3schools.com/html/html_links.asp
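%C3%A2%E2%82%AC%C5%BD represents â€Ž, which is what you get when a non-ASCII Unicode character at the end of the URL is stored as UTF-8 and then parsed as Windows-1252 data. Decoding those bytes gives U+200E (LEFT-TO-RIGHT MARK), an invisible character that was most likely pasted in along with the URL.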
Use straight quotes to delimit attribute values in your real code. You are doing this in the code you have included in the question, but that won't have the effect you are seeing. Presumably your codes are being transformed at some point in your real code.
Add appropriate HTTP headers and <meta> data to tell the browser what encoding your file is really using
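The stray suffix from the question can be decoded step by step to see which character is behind it. A Python sketch, assuming the usual failure mode of UTF-8 bytes being misread as Windows-1252 and then percent-encoded as UTF-8 again:

```python
import unicodedata
from urllib.parse import unquote

suffix = "%C3%A2%E2%82%AC%C5%BD"

# Step 1: undo the percent-encoding (unquote assumes UTF-8 by default).
mojibake = unquote(suffix)

# Step 2: undo the wrong decode by turning the Windows-1252 characters back
# into bytes, then reading those bytes as the UTF-8 they originally were.
original = mojibake.encode("cp1252").decode("utf-8")

print(repr(mojibake))
print(f"U+{ord(original):04X}", unicodedata.name(original))
```

Running it suggests the stray character is U+200E, an invisible directional mark rather than a quotation mark, which would explain why nothing odd is visible in the source.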

Preventing PHP from auto parsing XML

I'm making an api call to a site using the code as shown below :
$xmlData = file_get_contents("http://isbndb.com/api/books.xml?access_key=XXXXXX&index1=isbn&value1=0596002068");
echo $xmlData;
However, when $xmlData is displayed in the browser it is automatically parsed as HTML. For example, the <title> element of the returned XML, which is actually a book title, is treated as HTML and becomes the page title, and the other XML elements are displayed as plain text without their tags. I want the client-side XMLHttpRequest object to get the raw XML data from the server.
Why does this happen and how do I ensure that XML is not auto parsed?
PHP just sees it as text. For instance, do echo "<b>Bold</b>"; and it will "automatically" be in bold. It is the browser that processes the HTML and renders it.
This is what htmlspecialchars is for.
This has nothing to do with PHP. You are outputting elements which the browser interprets as HTML (that's why it sets the title). Build your HTML page properly and use <pre> tags around your escaped content, or, when needed, send your content with the correct Content-Type header (such as text/plain to display your XML for viewing, or text/xml for other purposes) so the browser will not parse your data as HTML.
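The two options from the answers (escape the markup for viewing, or label the response as XML) can be sketched as follows. The example is in Python rather than PHP, with html.escape playing the role of htmlspecialchars; the respond helper and the sample payload are hypothetical.

```python
import html
from typing import Tuple


def respond(xml_data: str, as_raw_xml: bool) -> Tuple[str, str]:
    """Return (Content-Type header value, response body)."""
    if as_raw_xml:
        # For XMLHttpRequest clients: send the XML untouched, labelled as XML.
        return "text/xml; charset=utf-8", xml_data
    # For humans: escape <, > and & so the browser shows the tags as text.
    return "text/html; charset=utf-8", "<pre>" + html.escape(xml_data) + "</pre>"


xml_data = "<book><title>Example Title</title></book>"  # hypothetical payload
print(respond(xml_data, as_raw_xml=False)[1])
```

In the PHP from the question, the equivalent would be a header() call before the echo for the raw case, or htmlspecialchars around the output for the viewing case.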

Displaying unicode symbols in HTML

I want to simply display the tick (✔) and cross (✘) symbols in a HTML page but it shows up as either a box or goop (âœ”), obviously something to do with the encoding.
I have set the meta tag to show utf-8 but obviously I'm missing something.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Edit/Solution: From comments made, using FireBug I found the headers being passed by my page were in fact "Content-Type: text/html" and not UTF-8. Looking at the file format using Notepad++ showed my file was formatted as "UTF-8 without BOM". Changing this to just UTF-8 the symbols now show correctly... but firebug still seems to indicate the same content-type.
You should ensure the HTTP server headers are correct.
In particular, the header:
Content-Type: text/html; charset=utf-8
should be present.
The meta tag is ignored by browsers if the HTTP header is present.
Also ensure that your file is actually encoded as UTF-8 before serving it, check/try the following:
Ensure your editor saves it as UTF-8.
Ensure your FTP or any file transfer program does not mess with the file.
Try with HTML encoded entities, like &#uuu;.
To be really sure, hexdump the file and look at the character; for ✔ it should be E2 9C 94.
Note: If you use a Unicode character for which your system can't find a glyph (no font with that character), your browser should display a question mark or some block-like symbol. But if you see multiple Roman characters like you do, this denotes an encoding problem.
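The byte-level check can also be done without a hexdump tool. A small Python sketch (the helper names are made up for illustration) that prints the expected bytes for ✔, shows what they look like when misread as Windows-1252, and tests for a UTF-8 byte order mark:

```python
import codecs


def hex_bytes(raw: bytes) -> str:
    """Format bytes like a hexdump line, e.g. 'E2 9C 94'."""
    return " ".join(f"{b:02X}" for b in raw)


def starts_with_bom(raw: bytes) -> bool:
    """True if the data begins with the UTF-8 byte order mark (EF BB BF)."""
    return raw.startswith(codecs.BOM_UTF8)


tick_utf8 = "✔".encode("utf-8")
print(hex_bytes(tick_utf8))        # bytes a UTF-8 file should contain
print(tick_utf8.decode("cp1252"))  # same bytes misread as Windows-1252
print(starts_with_bom(codecs.BOM_UTF8 + tick_utf8))
```

To check a real file, read it in binary mode (open(path, "rb").read()) and pass the result to the same helpers.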
I know an answer has already been accepted, but wanted to point a few things out.
Setting the content-type and charset is obviously a good practice, doing it on the server is much better, because it ensures consistency across your application.
However, I would use UTF-8 only when the language of my application uses a lot of characters that are available only in the UTF-8 charset. If you want to show a Unicode character or symbol in one-off cases, you can do so without changing the charset of your page.
HTML renderers have always been able to display symbols which are not part of the page's character encoding, as long as you write the symbol as a numeric character reference (NCR). Sounds weird, but it's true.
So, even if your HTML has a header that states an encoding of ANSI or any of the ISO charsets, you can display a check mark by using its HTML character reference, in decimal (&#10003;) or in hex (&#x2713;).
So it's a little difficult to understand why you are facing this issue on your pages. Can you check whether the NCR value is correct? This is a good reference: http://www.fileformat.info/info/unicode/char/2713/index.htm
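Numeric character references can be generated rather than looked up. A short Python sketch (to_ncr is a hypothetical helper) that derives both forms for the check mark and confirms the resulting markup is pure ASCII:

```python
import html


def to_ncr(ch: str, hexadecimal: bool = False) -> str:
    """Return the numeric character reference for a single character."""
    code_point = ord(ch)
    return f"&#x{code_point:X};" if hexadecimal else f"&#{code_point};"


check = "\u2713"  # CHECK MARK
print(to_ncr(check))
print(to_ncr(check, hexadecimal=True))

# The markup contains only ASCII, so it survives any ASCII-compatible charset.
page = f"<p>Done {to_ncr(check)}</p>"
page.encode("ascii")

# A browser (or html.unescape) turns the reference back into the character.
print(html.unescape(to_ncr(check)) == check)
```

Python's built-in "xmlcharrefreplace" error handler does the same conversion for a whole string: "✓".encode("ascii", "xmlcharrefreplace").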
Make sure that you actually save the file as UTF-8, alternatively use HTML entities (&#nnn;) for the special characters.
Unlike proposed by Nicolas, the meta tag isn’t actually ignored by the browsers. However, the Content-Type HTTP header always has precedence over the presence of a meta tag in the document.
So make sure that you either send the correct encoding via the HTTP header, or don’t send this HTTP header at all (not recommended). The meta tag is mainly a fallback option for local documents which aren’t sent via HTTP traffic.
Using HTML entities should also be considered a workaround – that’s tiptoeing around the real problem. Configuring the web server properly prevents a lot of nuisance.
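What "sending the correct encoding via the HTTP header" looks like depends on the server. As one illustration, here is a minimal Python WSGI sketch; the app and serve names, the port, and the page content are assumptions for the example, not taken from the question.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Serve a small page whose bytes and declared charset agree."""
    body = "<!DOCTYPE html><p>\u2714 \u2718</p>".encode("utf-8")
    headers = [
        # The charset parameter is what the browser checks first.
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(port: int = 8000) -> None:
    """Run a local test server; call serve() manually to try it."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

The important part is that the bytes are produced with the same encoding the header names. On Apache the comparable setting is usually an AddDefaultCharset directive, and in PHP a header() call made before any output.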
I think this is a file problem: you simply saved your file in a 1-byte encoding like Latin-1. Look up how to set your editor to save files as UTF-8.
I wonder why there are editors that don't default to utf-8.