System settings on Mac give an error for HTML

I just started learning HTML. I'm on a Mac and using Sublime as my text editor. I have written 6 lines of HTML code, but unfortunately it gives this strange symbol output in my browser. What could be the problem? I think it has to do with either my system or browser settings on my computer.
My output on my Chrome/Safari Browser
My basic HTML code
Any advice would be appreciated, thanks!

It's an encoding problem. You have to set the correct encoding in the HTML head via a meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

You need to make sure your default encoding is set to UTF-8.
Below is the snippet from the default settings.
You need to add this to your user settings.
Go to your User Settings: Preferences > Settings - User and paste the snippet.
// Encoding used when saving new files, and files opened with an undefined
// encoding (e.g., plain ascii files). If a file is opened with a specific
// encoding (either detected or given explicitly), this setting will be
// ignored, and the file will be saved with the encoding it was opened
// with.
"default_encoding": "UTF-8",

Try deleting the whitespace first and see if there are any invisible characters being added. If that doesn't work, check the encoding of your file. It's possible you have a charset issue. You could also try another editor and/or another browser to see if the errors persist there as well. That will help you narrow down the source of the problem if charset or invisible characters aren't the issue.
Check here.
http://www.w3schools.com/html/html_charset.asp

Related

HTML file displaying weird characters when copied from Windows to Mac

OK, I think the title pretty much sums the question up nicely. Basically, I've written a help file on my Windows machine in HTML, so it includes characters like the following:
®, ', ", ...
Obviously it displays fine on Windows, but when I copy the file to my Mac and try to view it, the characters above turn into gibberish and look foreign. I could type them on my Mac and save it, but I'm worried that I need to do something to prevent the same thing from happening on other computers/environments.
If anybody knows how I can stop this from happening, as easily as possible, I'd be grateful to know. Thanks in advance...
Make sure your HTML file is saved as UTF8 and use the UTF8 meta tag:
To save a file as UTF-8, open it in Notepad and choose "Save As", then make sure the encoding is set to UTF-8.
To add the UTF-8 meta tag to your HTML file, just add the following line in the "head" section: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
UTF8 is designed for backward compatibility with ASCII and to avoid the complications of endianness and byte order marks in UTF-16 and UTF-32. See: Wikipedia
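If you'd rather script the re-save than go through Notepad, here is a minimal Python sketch. It assumes the file was originally saved as windows-1252 (the usual "ANSI" default on Western-locale Windows); the function name and the file names in the usage note are made up for illustration.

```python
from pathlib import Path


def resave_as_utf8(src, dst, source_encoding="cp1252"):
    """Decode src with source_encoding and write the same text to dst as UTF-8.

    source_encoding is an assumption: pass whatever encoding the file
    was really saved in, otherwise the text will be decoded wrongly.
    """
    raw = Path(src).read_bytes()
    text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return text
```

You would call it as resave_as_utf8("help.html", "help-utf8.html") and then add the UTF-8 meta tag to the new file. Working on raw bytes keeps the line endings exactly as they were.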
My assumption is either due to file encoding (maybe one uses UTF-8 and the other iso-8859-1) or due to differences between editors. Try on the Windows machine pasting the code into Notepad or Wordpad, then sending that code to the Mac.
You can save it as unicode and add the meta like John Riche said or replace it by its HTML entities:
® = &reg;
http://www.w3schools.com/tags/ref_entities.asp
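If there are many such characters, replacing them by hand gets tedious. As a rough sketch (not the only way to do it), Python's built-in "xmlcharrefreplace" error handler turns every non-ASCII character into a numeric character reference, which browsers treat the same as the named entity. The function name is just for illustration.

```python
def to_ascii_html(text):
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_html("Acme\u00ae help"))  # Acme&#174; help
```

The result is pure ASCII, so it displays the same no matter which encoding the browser guesses.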

Characters not displaying correctly in different browsers

I used certain characters in website such as • — “ ” ‘ ’ º ©.
I found that when testing to see what my website looked like under different browsers (BrowserLab)
the afore-mentioned characters are replaced with �.
I then changed the charset in the webpage header from:
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
to
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Suddenly all the pages have the above mentioned characters replaced with a ?.
Even more puzzling is this is not always consistent across and even within the same page, as some sections display the character • and © correctly.
In particular, I need to replace the character • with one that will display across browsers, can anyone help me with the answer? Thanks.
You should save your HTML source as UTF8.
Alternatively, you can use HTML entities instead.
The source code needs to be saved in the same encoding as you're instructing the browser to parse it in. If you're saving your files in UTF-8, instruct the browser to parse it as UTF-8 by setting an appropriate HTTP header or HTML meta tag (headers preferable, your web server may be setting one without you knowing). Use a decent editor that clearly tells you what encoding you're saving the file as. If it doesn't display correctly, there's a discrepancy between what you're telling your browser the file is encoded in and what it's really encoded in.
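To make the discrepancy concrete, here is a small Python sketch that encodes a string one way and decodes it the other way, which is in effect what the browser does when the declared and actual encodings differ. The sample string is made up.

```python
text = "It\u2019s \u2022 \u00a9"  # It’s • ©

# Saved as UTF-8 but parsed as windows-1252:
# every special character turns into two or three wrong ones.
utf8_read_as_cp1252 = text.encode("utf-8").decode("cp1252")

# Saved as windows-1252 but parsed as UTF-8:
# the bytes are not valid UTF-8, so each one becomes U+FFFD,
# the replacement character browsers draw as a diamond or "?".
cp1252_read_as_utf8 = text.encode("cp1252").decode("utf-8", errors="replace")

print(utf8_read_as_cp1252)
print(cp1252_read_as_utf8)
```

The second case matches the symptom in the question: the file was saved as windows-1252, and changing only the meta tag to utf-8 made the browser misread those bytes.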
Check to see if Apache is set up to send the charset. Look for the directive "AddDefaultCharset" and set it to Off in .htaccess or your config file.
Most/all browsers will take what is sent in the HTTP headers over what is in the document.
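To see which charset the server actually sends, you can inspect the Content-Type response header. Below is a minimal Python sketch using only the standard library; both function names are hypothetical, and fetch_declared_charset needs network access and a URL of your own.

```python
from email.message import Message
from urllib.request import urlopen


def charset_from_content_type(header_value):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()  # lower-cased, or None if absent


def fetch_declared_charset(url):
    """Ask a live server which charset it declares (needs network access)."""
    with urlopen(url) as response:
        return response.headers.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))  # utf-8
print(charset_from_content_type("text/html"))                 # None
```

If the header reports a charset different from the one in your meta tag, the header is the one to fix.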
If you're using Notepad++, I suggest using the EditPlus editor to copy the text (which has the special characters) and paste it into your file. This should work.
Yes, I had this problem too in Notepad++: copying and pasting wasn't working with some symbols.
I think SLaks is right.
The HTML entity for the copyright symbol is &#169;

Why do symbols like apostrophes and hyphens get replaced with black diamonds on my website?

A website I've made has a few problems... On one of the pages, wherever there's an apostrophe (') or a dash (-), the symbol gets replaced with a weird black diamond with a question mark in the center of it
Here's what I mean
It seems this is happening all over the site wherever these symbols appear. I've never seen this before, can anyone explain it to me?
Suggestions on how to fix it would also be greatly appreciated.
See http://test.rfinvestments.co.za/index.php?c=team for a clear look at the problem.
It's an encoding problem. You have to set the correct encoding in the HTML head via meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
Replace "ISO-8859-1" with whatever your encoding is (e.g. 'UTF-8'). You must find out what encoding your HTML files use. If you're on a Unix system, just type file file.html and it should show you the encoding. If this is not possible, you should be able to find out somewhere what encoding your editor produces.
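If the file command isn't available, a rough check is easy to script. The Python sketch below only distinguishes ASCII, UTF-8 and "something else"; it cannot tell ISO-8859-1 from windows-1252, since almost any byte sequence is valid in both. The function name and the returned labels are made up for illustration.

```python
def describe_encoding(raw):
    """Give a rough description of how a byte string is encoded."""
    if raw.startswith(b"\xef\xbb\xbf"):
        return "UTF-8 with BOM"
    if all(byte < 0x80 for byte in raw):
        return "ASCII (valid as UTF-8, ISO-8859-1 and windows-1252 alike)"
    try:
        raw.decode("utf-8")  # strict decoding fails on single-byte encodings
    except UnicodeDecodeError:
        return "not UTF-8 (probably ISO-8859-1 or windows-1252)"
    return "UTF-8"


print(describe_encoding("It\u2019s".encode("utf-8")))
print(describe_encoding("It\u2019s".encode("cp1252")))
```

For a real file, pass in the bytes from open("file.html", "rb").read().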
You need to change your text to 'Plain text' before pasting into the HTML document. This looks like an error I've had before by pasting straight from MS word.
MS Word and other rich text editors often place hidden or invalid characters into your code. Try using &mdash; for your dashes, or &rsquo; for apostrophes (etc.), so that you don't have to rely on your character encoding.
I had the same issue in my ASP.NET web application, and solved it with the help of this link.
I just replaced ' with &rsquo; as shown below, and my site now shows the apostrophe in the browser without the rectangle described in the question.
Original text in the HTML page:
Click the Edit button to change a field's label, width and type-ahead options
Replacement text in the HTML page:
Click the Edit button to change a field&rsquo;s label, width and type-ahead options
Look at your actual HTML code and check that the weird symbols are not originating there. This issue came up when I switched to Notepad++ halfway through a project I had started in Notepad. It seems the older version of Notepad I was using used a different encoding from Notepad++'s UTF-8 encoding. After I transferred my code from Notepad to Notepad++, the apostrophes got replaced with weird symbols, so I simply had to remove the symbols from my Notepad++ code.
If you are editing HTML in Notepad, you should use "Save As" and change the default "Encoding:" selection at the bottom of the dialog to UTF-8.
You should also include:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
This unambiguously sets the correct character set and informs the browser.
I experienced the same problem when I copied text containing an apostrophe from a Word document into my HTML code.
To resolve the issue, all I did was delete the particular word in my HTML and type it directly, including the apostrophe. This undid the original copy-and-paste action and displayed the newly typed apostrophe correctly.
What I really don't understand with this kind of problem is that the html page I ran as a local file displayed perfectly in Chromium browser, but as soon as I uploaded it to my website, it produced this error.
Even stranger, it displayed perfectly in the Vivaldi browser whether displayed from the local or remote file.
Is this something to do with the way Chromium reads the character set? But why only with a remote file?
I fixed the problem by retyping the text in a simple text editor and making sure the single quote mark was the one I used.

Eclipse HTML editor for HTML template files

I'm trying to edit a phpBB HTML template file with Eclipse Ganymede (version 3.4.1) with the Web Developer Tools installed.
These template files contain HTML markup with template variable marks in the form {variable_name}. When I open such a file, Eclipse tries to validate these template variable marks as well.
For example template contains
<meta http-equiv="content-type" content="text/html; charset={S_CONTENT_ENCODING}" />
After opening Eclipse shows on editor body:
Unsupported Character Body
Character encoding "{S_CONTENT_ENCODING}" is not supported by this platform.
<button>Set encoding...</button>
How can I solve this using WTP, or is there a better editor for editing templates?
Eclipse is trying to determine the text encoding from your meta tags and fails.
To override this behavior, open the file in Eclipse so you can see the error. Open the File menu and choose Properties (Alt-Enter), and Eclipse will show you the properties dialog for the file, where you can change the text file encoding.
I don't know if this can be disabled for all the files.
I've never used Eclipse on Linux, but it looks like the problem isn't really about Eclipse supporting variables -- it's about Eclipse trying to apply a character set that it thinks is called "{S_CONTENT_ENCODING}".
You can probably get around the problem by changing {S_CONTENT_ENCODING} to utf-8 (or latin-1 or whatever) in all of your templates. (This assumes that you aren't changing encoding from one template to the next, but I really doubt you are.)
Copy-paste utf-8 where you see {S_CONTENT_ENCODING} in one of the templates, and Eclipse should handle the other {foo} instances from there.

Question mark characters display within text. Why is this?

I have a backup server that automatically backs up my live site, both files and database.
On the live site, the text looks fine, but when you view the mirrored version of it, it displays '?' within some of the text. This text is stored within the news database table.
Here is a screenshot of it being on the live server and of it on the mirrored server.
What could happen within the process of backing it up to the mirrored server?
The live server is Solaris, and the mirrored server is Red Hat Linux 5.
The following articles will be useful:
10.3 Specifying Character Sets and Collations
10.4 Connection Character Sets and Collations
After you connect to the database, issue the following command:
SET NAMES 'utf8';
Ensure that your web page also uses the UTF-8 encoding:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
PHP also offers several functions that will be useful for conversions:
iconv
mb_convert_encoding
Edit your Apache configuration file on the "mirror" server (the server with the problem), and comment-out the following line:
AddDefaultCharset UTF-8
Then restart Apache:
service httpd restart
The problem is that the "AddDefaultCharset UTF-8" line overrides the Content-Type specified in the .html files; e.g.:
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
The most common symptom is that character codes above 127 display as black diamonds with question marks on them (in Chrome, Safari or Firefox), or as little boxes (in Internet Explorer and Opera).
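To find out whether a file contains any such characters, and where, you can scan its raw bytes. This is only a minimal Python sketch; the function name and the sample bytes are made up.

```python
def find_high_bytes(raw):
    """Return (offset, byte value) pairs for every byte above 127."""
    return [(offset, byte) for offset, byte in enumerate(raw) if byte > 127]


# windows-1252 bytes: 0x92 is a curly apostrophe, 0xB0 a degree sign
sample = b"It\x92s 5\xb0"
for offset, byte in find_high_bytes(sample):
    print(f"offset {offset}: 0x{byte:02X}")
```

An empty result means the file is plain ASCII and will display the same under either charset. Note that a correctly saved UTF-8 file with special characters also contains bytes above 127, so hits only show where to look, not that something is wrong.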
HTML files generated by Microsoft Word usually have many such characters, the most common one being character code 160 = 0xA0, which is equivalent to "&nbsp;" in the Windows-1252 encoding, and is often found between span tags, like this:
<span style="mso-spacerun: yes">ááá </span>
I got here looking for a solution for JavaScript displayed in the browser and although not directly related with a database...
In my case I copied and pasted some text I found on the Internet into a JavaScript file and saved it with Windows Notepad.
When the page that uses that JavaScript file output the strings, there were question marks (like the ones shown in the question) instead of the special characters like accented letters, etc.
I opened the file using Notepad++. Right after opening the file I saw that the character encoding was set as ANSI as you can see (mouse cursor on footer) in the following screenshot:
To solve the issue, click the Encoding menu in Notepad++ and select Encode in UTF-8. You should be good to go. :)
This is going to be something to do with character encodings.
Are you sure the mirrored site has the same properties with regards to character encodings as your main server?
Depending on what sort of server you have, this may be a property of the server process itself, or it could be an environment variable.
For example, if this is a UNIX environment, perhaps try comparing LANG or LC_ALL?
See also here
Unicode or other character set characters falling through?
I have often seen similar "strange" characters show up on sites I have worked on when the text is copied from an email or some other document format (e.g. Word) into a text editor. The editor can display the non-ASCII characters but the browser can't. For the website, I would suggest looking up the HTML entity code for the character and inserting that instead ... or switching to more standard characters.
Your browser hasn't interpreted the encoding of the page correctly (either because you've forced it to a particular setting, or the page is set incorrectly), and thus cannot display some of the characters.
Check the character set being emitted by your mirrored server. There appears to be a difference from that to the main server -- the live site appears to be outputting Unicode, where the mirror is not. Also, it's usually a good idea to scrub Unicode characters in your incoming content and replace them with their appropriate HTML entities.
Your specific issue regards "smart quotes," "em dashes" and "en dashes." I know you can replace em dashes with &mdash; and en dashes with &ndash; (which should be done on the input side of your database); I don't know what the correct replacement for the smart quotes would be. (I usually just replace all curly single quotes with ' and all curly double quotes with " ... Typography geeks may feel free to shoot me on sight.)
I should note that some browsers are more forgiving than others with this issue -- Internet Explorer on Windows tends to auto-magically detect and "fix" this; Firefox and most other browsers display the question marks.
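For what it's worth, the replacements described above can be sketched in Python with a translation table. The mapping mirrors the choices in this answer (straight quotes for the curly ones, entities for the dashes); the names are illustrative and you would extend the table to suit your own content.

```python
# Map each "smart" character to its replacement.
SMART_PUNCTUATION = {
    "\u2018": "'",        # left single quote
    "\u2019": "'",        # right single quote / apostrophe
    "\u201c": '"',        # left double quote
    "\u201d": '"',        # right double quote
    "\u2013": "&ndash;",  # en dash
    "\u2014": "&mdash;",  # em dash
}
TABLE = str.maketrans(SMART_PUNCTUATION)


def normalize_punctuation(text):
    """Replace smart quotes and dashes before the text is stored."""
    return text.translate(TABLE)


print(normalize_punctuation("\u201cIt\u2019s\u201d \u2013 ok"))
```

Running this on incoming content before it reaches the database keeps the stored text plain ASCII for these characters.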
I had this issue so I just took all my content, copy/pasted it into Notepad, made a new PHP file, pasted back in, re-saved and overwrote, and.. that worked!
It really was some relic of Microsoft Word editing...
I usually curse MS Word and then run the following Windows Script Host script.
// Replace with path to a file that needs cleaning
var PATH = "test.html";
var go = WScript.CreateObject("Scripting.FileSystemObject");
var content = go.GetFile(PATH).OpenAsTextStream().ReadAll();
var out = go.CreateTextFile("clean-"+PATH, true);
// Symbols: straighten quotes and dashes, turn the rest into HTML entities
content = content.replace(/“/g, '"');
content = content.replace(/”/g, '"');
content = content.replace(/’/g, "'");
content = content.replace(/–/g, "-");
content = content.replace(/©/g, "&copy;");
content = content.replace(/®/g, "&reg;");
content = content.replace(/°/g, "&deg;");
content = content.replace(/¶/g, "<p>");
content = content.replace(/¿/g, "&iquest;");
content = content.replace(/¡/g, "&iexcl;");
content = content.replace(/¢/g, "&cent;");
content = content.replace(/£/g, "&pound;");
content = content.replace(/¥/g, "&yen;");
out.Write(content);
// Close the output file so the contents are flushed to disk
out.Close();