I have an HTML file encoded as UTF-16 LE, and the problem is that its CSS file is not being applied to the page.
If I copy all of the content into a new HTML file, it works fine.
After several attempts to understand the issue, I found out that the original file is encoded as UTF-16 LE. But how can the encoding change the behavior when the file looks fine and renders in Chrome with all characters intact?
My question is: how does encoding change the functionality?
Related
As you know, .txt files and HTML files are both text-based documents, so why do we use the .html extension?
I mean, both of them are text-based, but when I open a .txt file, my browser doesn't render it as a web page.
Please help.
The .txt extension indicates a plain text file. HTML is text-based, but it isn't plain text: it is text marked up with HTML tags.
Software uses file extensions (or Content-Type headers if we're using HTTP) to determine how to treat a file.
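As a rough illustration of that extension-to-type mapping, here is a minimal Python sketch using the standard library's mimetypes module. The file names are made up for the example, and browsers and web servers use their own lookup tables, so treat this as an analogy rather than what any particular browser does.

```python
import mimetypes


def guess_content_type(filename: str) -> str:
    """Return the MIME type suggested by the file extension, or a generic fallback."""
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or "application/octet-stream"


if __name__ == "__main__":
    # Hypothetical file names, used only to show the lookup.
    for name in ("index.html", "notes.txt"):
        print(name, "->", guess_content_type(name))
```

The same bytes can therefore be treated as a web page or as plain text depending only on the name (or on the Content-Type header when HTTP is involved).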
I'm using Microsoft Windows 10 Home Single Language (64-bit) on my laptop, and I'm learning HTML from the W3Schools HTML Tutorial (the best-in-class tutorial available on the internet).
I wrote the following HTML code in Notepad:
<!DOCTYPE html>
<html>
<body>
<h1>My First Heading</h1>
<p>My first paragraph.</p>
</body>
</html>
Then I named the file index.html and saved it to my hard disk. While doing so, I changed the Encoding drop-down box from ANSI to UTF-8.
So my question is: without using the proper syntax for declaring the character encoding, i.e.
<meta charset="UTF-8">
inside the <head> tag of the HTML page, will the character encoding still be applied to the index.html file that I saved to my hard disk?
If yes, how, without adding any code for it? If no, why not, even though I set the encoding type before saving the file?
The character encoding of the file itself only determines how the content of the file (special characters and so on) is stored as bytes. How the file is interpreted when it is viewed is a separate matter. When the file is served over HTTP, the encoding the browser uses normally comes from the server-side setup (httpd.conf, for instance) or, if the page is served via PHP, from the settings in php.ini. If the server does not declare an encoding, the <meta> tag in the HTML may be used, and it should match the encoding the file was saved with; otherwise characters may show up broken, wrong, or as gibberish.
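To make the difference between "how it is saved" and "how it is read" concrete, here is a small Python sketch. It assumes that Notepad's "ANSI" option corresponds to Windows-1252 (cp1252), which is typical on Western European systems but not guaranteed; the sample word is arbitrary.

```python
text = "café"

# The same four characters stored under three different encodings.
as_cp1252 = text.encode("cp1252")      # assumed equivalent of Notepad's "ANSI"
as_utf8 = text.encode("utf-8")
as_utf16le = text.encode("utf-16-le")

# Reading the bytes back with the encoding they were saved in works.
matched = as_utf8.decode("utf-8")

# Reading UTF-8 bytes as cp1252 produces mojibake instead of an error.
mismatched = as_utf8.decode("cp1252")

if __name__ == "__main__":
    print(as_cp1252, as_utf8, as_utf16le)
    print(matched, mismatched)
```

Saving as UTF-8 only changes the bytes on disk; whatever reads the file still has to be told, or has to guess, that those bytes are UTF-8.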
I am developing a Windows application for my project. In one case I need to convert an RTF file to HTML and print it. The RTF file also contains images. I can convert the RTF text to HTML, but the images are not converted; they are missing from the HTML file.
So can anyone give me an idea of how to convert RTF file data, including images, to an HTML file?
Thanks in Advance.
In HTML you can embed an image as Base64-encoded data using a data: URL. How to easily convert the binary bytes to Base64 ASCII, I do not know.
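For what it's worth, the bytes-to-Base64 step is short in most languages. Below is a minimal sketch in Python (the question does not say which language the Windows application uses, so this only shows the idea). The function names and the default MIME type are made up for the example, and extracting the image bytes from the RTF is not covered.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data: URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Build an <img> element whose src embeds the image itself."""
    return f'<img src="{to_data_uri(image_bytes, mime_type)}">'


if __name__ == "__main__":
    # Placeholder bytes; a real call would pass the bytes of an actual image.
    print(img_tag(b"hi"))
```

The resulting <img> tag carries the image inside the HTML, so no separate image file is needed when printing.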
Okay, something just went crazy. Unless China is taking over starting with my test style.css file on my iepage - well I guess they are starting off on the right foot hating on IE, but anyways. It loads with no stylesheet - sad :( I go into the Web inspector and see that all my linked files are filled with [possibly] Chinese characters (瑨汭笠ऊ楷瑤...). I have tried deleting the files on the server and re-uploading them. The local files look fine, and when I load the files directly they look fine. I didn't do anything that should have changed the rendering, either.
So I think I figured it out. This is weird. But anyway.
I copied and pasted your HTML to a local file to experiment with, and it loaded just fine. It was saved as UTF-8. Then I changed it to UTF-16, and I got exactly what you're seeing! As far as I can tell, the browser (Firefox for Linux for me) is assuming the linked files are all in the same encoding as the HTML...
So - I assume the file on the server is in UTF-16, and if you change it to UTF-8 you should be good. Hope that fixes it!
PS: According to Firebug, your HTML is compressed by your server, even if you never explicitly told it to. But that doesn't seem to be causing any problems, thankfully.
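If you want to see the effect outside the browser, here is a minimal Python sketch that reads ASCII/UTF-8 stylesheet bytes as if they were UTF-16 LE. The CSS snippet is a guess at how the stylesheet might begin, and the helper name is made up; the output appears to match the kind of characters quoted in the question.

```python
def misread_as_utf16le(data: bytes) -> str:
    """Decode bytes as UTF-16 LE, the way a browser would if it assumed that encoding."""
    # UTF-16 works on 2-byte units, so drop a trailing odd byte.
    even_length = len(data) - len(data) % 2
    return data[:even_length].decode("utf-16-le", errors="replace")


if __name__ == "__main__":
    css_bytes = "html {\n\twidth".encode("utf-8")  # assumed start of style.css
    print(misread_as_utf16le(css_bytes))
```

Each pair of ASCII bytes gets fused into a single 16-bit code unit, and many of those land in the CJK range, which is why the stylesheet looks "Chinese" and none of its rules parse.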
I encountered this same problem with XML files exported from PowerShell that were embedded in iFrames.
There was no issue in IE10/11 or Edge, but Firefox and Chrome wouldn't load the stylesheet.
The original page loading the iFrames was UTF8 encoded, same with the stylesheet. However, the XML file was exported to UTF16LE ("Unicode" in PowerShell). When the XML file was loaded from the iFrame, it loaded the stylesheet as Chinese characters.
I converted the encoding in PowerShell...
(Get-Content C:\foldername\file.html -Encoding Unicode) | Set-Content -Encoding UTF8 C:\foldername\file.html
...and it worked! My guess is that Firefox and Chrome decode a linked stylesheet using the encoding of the document that links to it, which meant that the UTF8 stylesheet was read as UTF16LE. IE and Edge apparently don't do that.
Thanks Xavier Holt for setting me on the right path!
Another quick solution is to change the file encoding using Notepad.
Open the file in Notepad and use Save As with the UTF-8 option selected from the Encoding drop-down.
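If you'd rather script the conversion, the same idea can be sketched in Python. This assumes the source file starts with a byte order mark (which files saved as "Unicode" on Windows normally do); the function name and paths are placeholders.

```python
from pathlib import Path


def convert_utf16_to_utf8(source: Path, target: Path) -> None:
    """Re-save a UTF-16 file as UTF-8 without a BOM (hypothetical helper)."""
    # The "utf-16" codec uses the BOM to pick the byte order and drops it from the text.
    text = source.read_bytes().decode("utf-16")
    # The "utf-8" codec does not write a BOM.
    target.write_bytes(text.encode("utf-8"))


if __name__ == "__main__":
    # Placeholder file names; adjust to the real files.
    convert_utf16_to_utf8(Path("file-utf16.html"), Path("file-utf8.html"))
```

Working on bytes rather than text-mode file objects keeps the line endings exactly as they were in the original.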
It may be the .html file itself. I solved a similar problem by copying the contents of the original .html file and pasting them into a new file with the same name in the same directory (rename the original first, then delete it once the new file works).
I created a static website in which each page has the following structure:
Common stuff like header, menu, etc.
Page specific stuff in main content div
Footer
In this website, all the common content is duplicated in each page. In order to improve the maintainability I refactored the pages to use server-side includes (SSI) so that the common content is not duplicated. The structure of each page is now
SSI for Common stuff like header, menu, etc.
Page specific stuff in main content div
SSI for footer
In the refactored site, for some reason the French characters no longer display properly in the page-specific content area, though they display fine in the content included via SSIs.
The included header specifies the character set as:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
If I open one of the main content pages in a browser it tells me that the character encoding is ISO-8859-1. I've tried adding a .htaccess file to the folder with the lines
AddDefaultCharset UTF-8
AddCharset UTF-8 .shtml
AddCharset UTF-8 .html
But still those pesky French accents aren't displaying properly on the version of the site that uses SSIs.
You are serving your pages as UTF-8, which is good, but at least some of the page is being dragged in from files which are not actually saved as UTF-8. SSI just throws the raw bytes in; it doesn't attempt to recode the includes so that their charsets match the file they're being included into.
You need to go through all your HTML and include files in a text editor and make sure each one is saved as UTF-8.
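With many files, checking each one by hand gets tedious. Here is a minimal Python sketch that lists files whose bytes are not valid UTF-8. The helper names and the suffix list are assumptions for the example. Note that a pure-ASCII file always passes, since ASCII is a subset of UTF-8, while an ISO-8859-1 file containing accented letters will normally fail.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8_files(root: Path, suffixes=(".html", ".shtml")) -> list:
    """List files under root (with the given suffixes) that are not valid UTF-8."""
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file()
        and path.suffix in suffixes
        and not is_valid_utf8(path.read_bytes())
    )


if __name__ == "__main__":
    # "." is a placeholder for the site's root folder.
    for path in find_non_utf8_files(Path(".")):
        print(path)
```

Any file the script reports is a candidate for re-saving as UTF-8 in your editor.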
As John mentioned, you can avoid encoding issues by using character references for all non-ASCII characters, but it's a tremendous pain.
Your HTML document is using UTF-8 encoding. Try these character codes for your accented letters: http://www.tony-franks.co.uk/UTF-8.htm
I had the same problem as you and finally found a solution that fixed it.
UTF8 makes an extra line on my site
Save all your files as UTF-8 without BOM (http://en.wikipedia.org/wiki/Byte_order_mark).
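If your editor has no "without BOM" option, the three BOM bytes can also be stripped with a few lines of code. A minimal Python sketch follows; the function name is made up for the example.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + b"<p>included fragment</p>"
    print(strip_utf8_bom(sample))
```

This matters most for included files: a BOM at the start of an include ends up in the middle of the assembled page, where it can show up as stray characters or an extra blank line.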