AEM/sightly : How to remove unicode in a page source code - html

When I check the source code of my page, I can see that some special characters, such as " ' " or " & ", have been replaced by their Unicode values. This causes some SEO problems, and I would like to make sure that these symbols are rendered properly. Where do I start?
The page is rendered via AEM, using Sightly as the templating engine.

You could use a different display context for your title, for example <title>${page.title @ context="html"}</title>, if that works for your application/site.

Related

When are HTML entities left unparsed?

I've run into something that I find a bit strange... There is a website with a <table> of audio file links, author, etc.
The link looks like:
This & That
I found it strange that the & character is not written as &amp; in the source file. If I use the Chrome Dev Tools (or Firefox) and change the content to This &amp; That, the string &amp; is visible, instead of just the & character. Why is that?
The table is using TableSorter, which I thought could be doing this under the hood... but even if I open the page with JavaScript disabled, the raw & character is still displayed, so that doesn't seem to be the culprit.
When are HTML entities left unparsed?
They aren't. The HTML is parsed as it's being read. This is completely independent of JavaScript.
I found it strange that the & character is not written as &amp; in the source file.
Indeed, it should be, but it works the way it is because browsers handle invalid markup gracefully, and in fact that particular aspect is now encoded in the specification: An ampersand that isn't ambiguous is taken as a literal ampersand.
If I use the Chrome Dev Tools (or Firefox) and change the content to This &amp; That, the string &amp; is visible, instead of just the & character. Why is that?
When you view the DOM through Chrome's devtools and similar, you're seeing the live DOM represented in valid form, not the actual source file, so naturally the browser shows you that & as &amp; (the entity it should have been in the source file).
This is independent of JavaScript.
& and &amp; both correspond to the same HTML number, &#38;, which is also the ASCII decimal code (38) for that symbol. You can take a look here: http://www.ascii.cl/htmlcodes.htm
So I guess there is a problem with your code; they should be shown as the same symbol (&) on your screen.
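As a rough illustration of the point made in these answers, the sketch below uses Python's standard `html` module (not a browser, so it only approximates what an HTML parser does) to decode three spellings of the same text. The sample strings are made up for the example.

```python
import html

# Three ways the same text might appear in an HTML source file:
# a bare ampersand, the named entity, and the numeric entity.
spellings = ["This & That", "This &amp; That", "This &#38; That"]

# html.unescape resolves character references; an unambiguous bare "&"
# (here followed by a space) is left alone as a literal ampersand.
decoded = [html.unescape(s) for s in spellings]

for source, text in zip(spellings, decoded):
    print(f"{source!r} -> {text!r}")
```

All three sources decode to the same text, which is why they look identical once rendered.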

Telling the HTML editor the "&" mark is not part of the code

So I have set up rules in my signature application, which uses an HTML editor on the back end. Now I am trying to write a normal phrase (not code) with an ampersand (&), but the HTML editor keeps reading this as HTML and keeps converting it to & a m p ;
Now how do I tell the HTML that the "&" mark is not part of the HTML code?
Thanks. If you need more explanation, I might be able to post some screenshots.
Now how do I tell the HTML that the "&" mark is not part of the HTML code?
You don't. & is a specific symbol with a specific meaning in HTML. It denotes the start of an entity in the code. In particular, it's one of four which according to the spec must always be specified as entities because the symbols carry specific meaning in HTML code.
To represent an & as text in HTML, you have to use the entity definition:
&amp;
For example:
This &amp; That
renders as:
This & That
If you want an HTML editor to conform to a custom HTML spec that you define, you're going to have to build your own HTML editor.
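If the phrase is being assembled by a program rather than typed into the editor, the escaping step can be done explicitly before the text is placed into the markup. This is a minimal sketch using Python's standard `html.escape`; the function name `to_html_text` and the sample phrase are hypothetical and not part of any signature application.

```python
import html

def to_html_text(phrase: str) -> str:
    """Escape plain text so it can be placed inside HTML element content.

    quote=False escapes only &, < and >, which is enough for text that
    is not going inside an attribute value.
    """
    return html.escape(phrase, quote=False)

# Hypothetical signature line containing characters that are special in HTML.
signature = "Smith & Sons <sales>"
markup = to_html_text(signature)
print(markup)
```

The escaped form is what belongs in the source; a browser or mail client then displays the original characters.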

Character Encoding - Incorrect Display?

So I have a pretty fresh WordPress installation and am working on a custom theme. I have noticed that special characters aren't displaying properly.
At the moment if I try and use a ' or a "" it is displaying as an í.
I have the character encoding set to UTF-8 in my header and also the browser. I have tried in two browsers and it is the same. If I view the page source it is displaying the correct code for an apostrophe.
If I use the FireFox developer tools and manually edit the HTML and add a " or ' in to the text then it displays this correctly.
I have checked the database and the ' or " is showing correctly in there and the database is set as utf-general.
The page in question so you can see what I mean:
http://dev.evaske.com/Liverpool/about-us-2/
Any ideas what else I can try?
EDIT:
So I've managed to narrow it down: because the font is loaded via font-face, it doesn't like the fact that WP is outputting the ’. Is there a way to stop this from happening so that it just outputs the "" instead of ’?

Some Apostrophes showing as â?T in HTML Emails

I'm using ASP Classic/VBScript to send emails using the CDO.Message object. It appears that the single quote or apostrophe character ’ (as opposed to the standard character ') shows up in the recipient's email as: â?T
Where is the problem and what is the best way to resolve this? I actually tried running a replace to change all ’ to ' but it appears that didn't work.
I guess I'm really not even sure what the difference is between these two different characters, and why some sites, like Microsoft for example, use ’.
http://www.hanselman.com/blog/WhyTheAskObamaTweetWasGarbledOnScreenKnowYourUTF8UnicodeASCIIAndANSIDecodingMrPresident.aspx
all the info you could need on character encoding.
You need to set the correct character encoding on .BodyPart.Charset of your CDO.Message object.
Most likely you need to set it to "utf-8" as the default appears to be "us-ascii".
This indeed was a problem of character encoding. The solution was to put two lines of code in the web page that contains my form. I actually opted to add these lines of code to the top of my Global include file which I named inc_globals.asp. This file appears at the top of every page. Here's the code that fixed the problem:
Response.CodePage = 65001
Response.CharSet = "utf-8"
As a matter of documentation, here's a post that was helpful in solving this case:
http://groups.google.com/group/microsoft.public.inetserver.asp.general/browse_thread/thread/b79e6b95e24ef0fe/a25c643aaf12770d
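To see where the garbled characters come from, and how ’ differs from ', here is a small Python sketch of the mismatch. It assumes the reader decodes the bytes as Windows-1252, which is a common fallback but only an assumption about this particular mail client; the sample word is made up.

```python
# U+2019 RIGHT SINGLE QUOTATION MARK is the "curly" apostrophe.
original = "It\u2019s"

# In UTF-8 the curly apostrophe takes three bytes: E2 80 99.
utf8_bytes = original.encode("utf-8")

# A reader that assumes Windows-1252 turns each byte into its own character.
misread = utf8_bytes.decode("cp1252")

# Decoding with the charset that was actually used restores the text.
restored = utf8_bytes.decode("utf-8")

# The plain apostrophe (U+0027) is a single ASCII byte, so it survives
# this kind of mismatch unchanged.
plain_bytes = "It's".encode("utf-8")

print(utf8_bytes, repr(misread), repr(restored), plain_bytes)
```

The three-character sequence beginning with â is the same kind of corruption as the â?T in the question, with the second and third characters further mangled along the way. Declaring the charset, as the answers above do, keeps the sender and the reader in agreement.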
Mails are written in HTML format. Have you tried using HTML entities? For your apostrophe, it should be &#39;.
In VB:
mailBody = Replace(mailBody, "'", "&#39;")

How to stop Dreamweaver from converting " double quotes to &quot;?

I need to use double quotes inside a tag
How to stop Dreamweaver from converting " double quotes to &quot;?
I need the original " and not &quot;, but as soon as I add the " quote via Design view, it shows " in Design view, but in Code view it's &quot;.
I need the " double quote to remain the same in both Design and Code view.
The reason is that I need the double quote "" in a tag such as {mytag category="news"}
I need the exact tag {mytag category="news"}, but Dreamweaver changes the double quotes in Code view to &quot;, so this is what I get in Code view:
{mytag category=&quot;news&quot;}
ISSUE :: SOLVED
There is no way to achieve this.
In HTML
foo='"'
and
foo='&quot;'
are equivalent. If you need one of those two syntaxes over the other, then you are not dealing with HTML and shouldn't be using an HTML editor to produce your content.
foo="""
… on the other hand, is an error and you should have even less reason to want that.
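The equivalence claimed in this answer can be checked with any HTML parser. The sketch below uses Python's standard `html.parser` as a stand-in for a browser; the `AttrCollector` class, the `attribute_values` helper, and the `<p>` element are hypothetical names chosen for the example.

```python
from html.parser import HTMLParser

class AttrCollector(HTMLParser):
    """Collects the decoded attribute values of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with entities already decoded.
        self.values.extend(value for _name, value in attrs)

def attribute_values(markup):
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.values

# A literal double quote inside a single-quoted attribute...
literal = attribute_values("<p foo='\"'>")
# ...and the same attribute written with the entity.
entity = attribute_values("<p foo='&quot;'>")

print(literal, entity)
```

Both spellings produce the same attribute value after parsing, so an HTML editor is free to choose either when it writes the file back out.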
I was able to disable this quote-pairing as follows :
Quit Dreamweaver.
Edit the text file: ~/Library/Application Support/Adobe/Dreamweaver CC 2017/en_US/Configuration/Brackets/brackets.json
Add the following line within the body: "closeBrackets": false,
Save the text file and relaunch Dreamweaver.
It should no longer auto-complete quote pairs.
Some people have reported they needed to make this change to the same-named file within the application folder as shown below, but I had no such file on my UK installation:
Applications\Adobe\Dreamweaver CC 2017\en_US\Configuration\Brackets\brackets.json
Found the answer
I just changed the DOCTYPE from XHTML to HTML5
And it worked fine, it did not convert the double quotes to its entity.
This solves the issue for me, for now.