Outputting JSON with angle brackets in browser - json

Currently I'm working on a small webservice which outputs JSON to the client. For testing purposes I display the JSON output in my browser (Firefox 20). Within the JSON I use tags to mark text in different languages, but this seems to cause trouble, as my browser filters out the start tags.
I guess the browsers (I also tried Chrome and Opera) think that the tags are HTML and try to handle them. So I put the JSON code in CODE tags and PRE tags as well, but the result is always the same.
In other words, what I get:
"description":"Bild 1<\/de>Image 1<\/en>\u5199\u771f\u7b2c\u4e00<\/jp>"
What I want:
"description":"<de>Bild 1<\/de><en>Image 1<\/en><jp>\u5199\u771f\u7b2c\u4e00<\/jp>"
Important: the output is what it has to be (according to my debugger); the problem is only how the browser displays it. Is there a way to make browsers ignore the tags, or do I have to use "&lt;" and "&gt;"? Thank you!

Escaping XML might seem like the immediate solution, but it is not. It will break your original webservice.
Please recheck if you are sending the below header in the response:
Content-Type: application/json
The above header makes browsers interpret the response as JSON (and not HTML).
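As a minimal sketch of the header advice above, assuming a Node.js server (the question does not say which stack the webservice uses), a hypothetical sendJson helper could set the header before writing the body. The helper name, the port and the sample payload are illustrative only.

```javascript
// Hypothetical helper, not from the original post: writes `payload` as JSON
// to a Node.js http.ServerResponse (or any object with setHeader/end).
function sendJson(res, payload) {
  // Declare the body as JSON so the browser does not parse <de>...</de> as HTML.
  res.setHeader('Content-Type', 'application/json; charset=utf-8');
  res.end(JSON.stringify(payload));
}

// Possible usage with the built-in http module (port 8080 is arbitrary):
// require('http').createServer((req, res) =>
//   sendJson(res, { description: '<de>Bild 1</de><en>Image 1</en>' })
// ).listen(8080);
```

With this header, the angle brackets stay untouched in the payload, so clients that parse the JSON are not affected.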

Yes, try using XML escape characters (&lt; for < and &gt; for >).

Related

Multilingual URLs showing as unicode in breadcrumb menu

I have a Norwegian URL path which looks like this /om-os/bæredygtighed/socialt-ansvar
In my breadcrumb menu, I expect to see something like this:
Om os > Bæredygtighed > Socialt-ansvar
However, the æ is appearing as %c3%a6. So my breadcrumb looks like this:
Om os > B%c3%a6redygtighed > Socialt-ansvar
I have <meta charset="utf-8"> in the head, so I'm unsure why these characters are still appearing?
I don't know how you are building the URLs, but, except for the domains, that have a different encoding, all non-ASCII parts of a URL must be URL-encoded, AKA percent-encoded. The browser does it for you if you don't do it yourself. OTOH, the browser will in most cases show you the unencoded version of your characters. You might not be aware that what is sent over the wire is URL-encoded.
E.g., your path is sent over the wire as /om-os/b%c3%a6redygtighed/socialt-ansvar, even if you see /om-os/bæredygtighed/socialt-ansvar in the address bar. Check it with the developer tools. If you use Firefox, you will have to look at the Headers tab of the HTTP call's details in the Network tab. Chrome, instead, will also show you the HTTP call's summary row URL-encoded. That %c3%a6 in the path is the hex value of the two bytes, C3 and A6, that make up the UTF-8 encoding of the character æ.
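The claim about the two bytes can be checked with a short sketch. It assumes an environment where TextEncoder is available as a global (browsers and recent Node.js versions); the variable names are illustrative.

```javascript
// TextEncoder always encodes to UTF-8.
const bytes = Array.from(new TextEncoder().encode('æ'));

// Show each byte as two upper-case hex digits.
const hex = bytes.map((b) => b.toString(16).toUpperCase().padStart(2, '0'));
console.log(hex.join(' '));            // C3 A6

// Percent-encoding is those same bytes, each prefixed with '%'.
console.log(encodeURIComponent('æ'));  // %C3%A6
```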
You can even set your window.location.pathname programmatically to /om-os/bæredygtighed/socialt-ansvar, but when you read window.location.pathname afterwards, you will get it URL-encoded:
window.location.pathname = '/om-os/bæredygtighed/socialt-ansvar'
[...]
console.log(window.location.pathname)
/om-os/b%C3%A6redygtighed/socialt-ansvar
I don't know how your path flows into your breadcrumbs, but you clearly can reverse the URL-encoding before using your strings.
In JavaScript you normally do that with decodeURIComponent():
console.log(decodeURIComponent('b%c3%a6redygtighed'))
bæredygtighed
console.log(decodeURIComponent('/om-os/b%c3%a6redygtighed/socialt-ansvar'))
/om-os/bæredygtighed/socialt-ansvar
In PHP you normally do that with urldecode:
$decoded = urldecode('b%c3%a6redygtighed'); // will contain 'bæredygtighed'
But it would be better if you could make your data flow in a way that avoids the encoding and decoding steps before reaching your breadcrumbs.
If you have not yet figured out the fix -
just to add on top of whatever walter-tross has already mentioned in above answer -
For the given input - (/om-os/bæredygtighed/socialt-ansvar)
the encodeURI js-method output is as follows -
/om-os/b%C3%A6redygtighed/socialt-ansvar
and the encodeURIComponent js-method output is as follows -
%2Fom-os%2Fb%C3%A6redygtighed%2Fsocialt-ansvar.
Given the above, it appears that you are fetching the bread-crumb input from the URL. And the behaviour is equivalent to encodeURI method, thus enabling you to split on the '/' character.
The fix, as already noted, would be to perform url-decode using decodeURI or decodeURIComponent on the individual components prior to using it as content.
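A minimal sketch of that fix, assuming the breadcrumb labels are derived from the path string (the function name breadcrumbLabels is hypothetical, and how the labels are later capitalised or rendered is left out):

```javascript
// Split the percent-encoded path on '/', drop empty segments,
// and decode each segment separately.
function breadcrumbLabels(pathname) {
  return pathname
    .split('/')
    .filter((segment) => segment.length > 0)
    .map((segment) => decodeURIComponent(segment));
}

console.log(breadcrumbLabels('/om-os/b%c3%a6redygtighed/socialt-ansvar'));
// [ 'om-os', 'bæredygtighed', 'socialt-ansvar' ]
```

Decoding after the split matters: a segment that itself contains an encoded slash (%2F) would otherwise be split in the wrong place.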

HTML Hrefs in Tomcat

I'm building an HTML string in Tomcat and I notice that in my JSON object, my clickable href link is something like:
http://localhost/%22/https://myLinkHere.com/%22
This is a 2 part question. First, should it contain the http://localhost in front? And secondly, why is the %22 there?
Here is what my JSON href looks like in text:
linkDisplayName
This looks right to me, but I can't tell why the last %22 is there.
I think you won't need the localhost as long as you are supplying a relative path.
%22 is the URL-encoded form of the double-quote character ("); 22 is its ASCII code in hexadecimal. It corresponds to the quotes in your link.
HTML parsers are very lenient, which often leads to confusing behavior. Without the exact JSON it's hard to say for sure but there are a couple of obvious issues. Ultimately the issue is your HTML is malformed and/or mis-escaped.
%22 is " URL-encoded, which means that the quotes you've \-escaped are being included in the URL rather than surrounding them. That likely means that in the JSON they're double-escaped. That might mean it's \\" or something similar; try just a single backslash (\") or no backslash at all (").
Notice that the protocol (https:/) in your URL is also wrong; a URL starts with a protocol (like https) followed by a :, and generally followed by two slashes (//). Your URL follows the protocol with just a single slash, which makes it look like a relative URL rather than an absolute one. Browsers will prefix relative URLs with whatever they infer the current host to be, which in your context appears to be localhost.
The HTML should look like this:
<a href="https://myLinkHere.com/">linkDisplayName</a>
So in summary no, the URL should probably not contain http://localhost, and it should not contain those %22s either. They're showing up because your JSON is malformed.
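The effect described above can be reproduced with the standard URL class. This is only a sketch: the exact href value from the question is not shown, so the "broken" input below is a guess at a value that still contains literal quote characters.

```javascript
const base = 'http://localhost/';

// Assumed href value with literal quotes left inside it. It has no valid
// scheme at the start, so it is resolved relative to the current host.
const broken = new URL('"/https://myLinkHere.com/"', base).href;
console.log(broken); // http://localhost/%22/https://myLinkHere.com/%22

// A clean absolute URL ignores the base entirely
// (the host name is lower-cased during parsing).
const fixed = new URL('https://myLinkHere.com/', base).href;
console.log(fixed);  // https://mylinkhere.com/
```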

How do I display JSON in a text view with correct indentation in cocos2d/cocos2dx?

When I try to show the JSON string, the indentation is gone and very ugly.
Is there a way to make it look better?
As you stated in a comment, the JSON string looks good in a browser. That is probably (I'm 99% sure) because most browsers recognize the content as JSON and display it in an easier-to-read form. But at its core, the JSON string is just a sequence of characters, without any whitespace characters like \n or \t to indicate newlines or tabs.
So to answer your question: in order to display your JSON string similarly to what you see in your browser, you'd have to parse and format it yourself.
If you are using a CCLabel to display it, you could subclass it creating a JSONLabel which would format a string given in its constructor the way that you like it.
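As an illustration of "parse and format it yourself", here is a minimal sketch in JavaScript (for example for the JavaScript bindings of cocos2d; a C++ or Objective-C project would need an equivalent from its own JSON library). The function name prettyPrintJson is hypothetical.

```javascript
// Re-serialise a compact JSON string with newlines and two-space indentation.
function prettyPrintJson(compactJson) {
  const value = JSON.parse(compactJson); // throws on invalid JSON
  return JSON.stringify(value, null, 2);
}

console.log(prettyPrintJson('{"a":1,"b":[1,2]}'));
```

The resulting string contains real newline characters, so it can be handed to a label as-is, provided the label supports multi-line text.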

Label text ignoring html tags

<label for="abc" id="xyz">http://abc.com/player.js</xref>?xyz="foo" </label>
is ignoring
</xref> tag
value in the browser. So, the displayed output is
http://abc.com/player.js?xyz="foo"
but I want the browser to display
http://abc.com/player.js</xref>?xyz="foo"
Please help me how to achieve this.
It isn't being ignored. It is being treated as an end tag (for a non-HTML element that has no start tag). Use &lt; if you want a < character to appear as data instead of as "start of tag".
That said, this is a URL and raw <, > and " characters shouldn't appear in URIs anyway. So encode it as http://abc.com/player.js%3C/xref%3E?xyz=%22foo%22
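If the label text is generated by a script rather than typed by hand, the escaping can be done in code. A minimal sketch, assuming the text is inserted into HTML as a string (escapeHtml is a hypothetical helper name; template engines usually provide an equivalent):

```javascript
// Replace the characters that HTML treats as markup with character references.
// The ampersand must be handled first so that the other references are not
// escaped a second time.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

console.log(escapeHtml('http://abc.com/player.js</xref>?xyz="foo"'));
// http://abc.com/player.js&lt;/xref&gt;?xyz=&quot;foo&quot;
```

The browser then displays the original characters, including </xref>, as plain text.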
You should do it like this
"http://abc.com/player.js%3C/xref%3E?xyz=foo"
The URL should be properly encoded to be valid.
Use encodeURI to encode it:
var ValidURL = encodeURI("http://abc.com/player.js</xref>?xyz=foo");
See this answer on encodeURI for better knowledge.
I misunderstood the question, I thought the URI was to be used elsewhere within JavaScript. But the question pretty clearly states that the URI is to just be rendered as text.
If the text being displayed is being passed in from a server, then your best bet is to encode it before printing it on the page (or if you're using a template engine, then you can most likely just encode it on the template). Pretty much any web framework/templating engine should have this functionality.
However, if it is just static HTML, just manually encode the characters. If you don't know the codes off the top of your head, you can use an online converter to help, such as:
HTML Encode/Decode:
http://htmlentities.net/
Old Answer:
Try encoding the URI using the JavaScript function encodeURI before using it:
encodeURI('http://abc.com/player.js</xref>?xyz="foo"');
You can also decode it using decodeURI if need be:
decodeURI(yourEncodedURI);
So ultimately I don't think you'll be able to get the browser to display the </xref> tag as is, but you will be able to preserve it (using encodeURI/decodeURI) and use it in your code, if this is what you need.
Fiddle:
http://jsfiddle.net/rk8nR/3/
More info:
When are you supposed to use escape instead of encodeURI / encodeURIComponent?

What other characters beside ampersand (&) should be encoded in HTML href/src attributes?

Is the ampersand the only character that should be encoded in an HTML attribute?
It's well known that this won't pass validation:
href="http://domain.com/search?q=whatever&lang=en"
Because the ampersand should be &amp;. Here's a direct link to the validation fail.
This guy lists a bunch of characters that should be encoded, but he's wrong. If you encode the first "/" in http:// the href won't work.
In ASP.NET, is there a helper method already built to handle this? Stuff like Server.UrlEncode and HtmlEncode obviously don't work - those are for different purposes.
I can build my own simple extension method (like .ToAttributeView()) which does a simple string replace.
Other than standard URI encoding of the values, & is the only character related to HTML entities that you have to worry about simply because this is the character that begins every HTML entity. Take for example the following URL:
http://query.com/?q=foo&lt=bar&gt=baz
Even though there aren't trailing semicolons, since &lt is the entity for < and &gt is the entity for >, some old browsers would translate this URL to:
http://query.com/?q=foo<=bar>=baz
So you need to write & as &amp; to prevent this from occurring for links within an HTML-parsed document.
The purpose of escaping characters is so that they won't be processed as arguments. So you actually don't want to encode the entire url, just the values you are passing via the querystring. For example:
http://example.com/?parameter1=<ENCODED VALUE>&parameter2=<ENCODED VALUE>
The url you showed is actually a perfectly valid url that will pass validation. However, the browser will interpret the & symbols as a break between parameters in the querystring. So your querystring:
?q=whatever&lang=en
Will actually be translated by the recipient as two parameters:
q = "whatever"
lang = "en"
For your url to work you just need to ensure that your values are being encoded:
?q=<ENCODED VALUE>&lang=<ENCODED VALUE>
Edit: The common problems page from the W3C you linked to is talking about edge cases when urls are rendered in html and the & is followed by text that could be interpreted as an entity reference (&copy for example). Here is a test in jsfiddle showing the url:
http://jsfiddle.net/YjPHA/1/
In Chrome and Firefox the link works correctly, but IE renders &copy as ©, breaking the link. I have to admit I've never had a problem with this in the wild (it would only affect those entity references which don't require a semicolon, which is a pretty small subset).
To ensure you're safe from this bug you can HTML encode any of your URLS you render to the page and you should be fine. If you're using ASP.NET the HttpUtility.HtmlEncode method should work just fine.
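The two steps discussed in this answer (URL-encode the query values, then HTML-encode the whole URL when writing it into the page) can be sketched in JavaScript as follows. The helper names escapeAttribute and searchLink, and the link text, are hypothetical; in ASP.NET the HtmlEncode call mentioned above would cover the second step.

```javascript
// Step 2 helper: make a string safe inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function searchLink(baseUrl, params) {
  // Step 1: URLSearchParams percent-encodes each name and value.
  const query = new URLSearchParams(params).toString();
  return '<a href="' + escapeAttribute(baseUrl + '?' + query) + '">search</a>';
}

console.log(searchLink('http://domain.com/search', { q: 'whatever', lang: 'en' }));
// <a href="http://domain.com/search?q=whatever&amp;lang=en">search</a>
```

The browser turns &amp; back into & when it reads the attribute, so the server still receives q=whatever&lang=en.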
You do not need HTML escaping here:
According to the HTML5 spec:
http://www.w3.org/TR/html5/tokenization.html#character-reference-in-attribute-value-state
&lang= should be parsed as non-recognized character reference and value of the attribute should be used as it is: http://domain.com/search?q=whatever&lang=en
For the reference: added question to HTML5 WG: http://lists.w3.org/Archives/Public/public-html/2011Sep/0163.html
In HTML attribute values, if you want ", & and a non-breaking space as a result, you should (as an author who is clear about intent) have &quot;, &amp; and &nbsp; in the markup.
For " though, you don't have to use &quot; if you use single quotes to enclose your attribute values.
For HTML text nodes, in addition to the above, if you want < and > as a result, you should use &lt; and &gt;. (I'd even use these in attribute values too.)
For hfnames and hfvalues (and directory names in the path) for URIs, I'd use JavaScript's encodeURIComponent() (on a UTF-8 page when encoding for use on a UTF-8 page).
If I understand the question correctly, I believe this is what you want.