I'm using the network tools in Chrome's (and IE's) debugger to view
the form data I'm sending in Ajax calls.
This is the parsed data:
This is the source data:
The lines marked in yellow are what my question is about.
The first picture shows the correct string that I'm sending: description +'---'.
The second picture shows: description%2B'+---', where %2B is code for a plus sign.
I'm wondering how there can be two plus signs in the second picture (the actual plus and the %2B). Furthermore, what is this second plus doing inside the quotes?
That's not the data that I'm sending. The server receives it correctly, but I'm wondering: is this a bug in the IE and Chrome debuggers, or am I missing something?
Thanks
You are missing something, but it's very subtle: in application/x-www-form-urlencoded encoding, the space character is changed to a +. So the second plus is not a plus, but rather an encoded space.
For more information, see the answer to this question.
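A minimal sketch of the difference, using the standard URLSearchParams and encodeURIComponent APIs. The field name q and the value 'a + b' are made up for illustration and are not the data from the question.

```javascript
// Form encoding (application/x-www-form-urlencoded):
// a space becomes "+", and a literal "+" becomes "%2B".
const formEncoded = new URLSearchParams({ q: 'a + b' }).toString();
console.log(formEncoded); // q=a+%2B+b

// Decoding with the same rules restores the original value.
const roundTrip = new URLSearchParams(formEncoded).get('q');
console.log(roundTrip); // a + b

// encodeURIComponent follows URI percent-encoding instead: a space becomes "%20".
const uriEncoded = encodeURIComponent('a + b');
console.log(uriEncoded); // a%20%2B%20b
```

So a "+" in the raw request body stands for a space, and only %2B stands for a real plus sign.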
I have a Norwegian URL path which looks like this /om-os/bæredygtighed/socialt-ansvar
In my breadcrumb menu, I expect to see something like this:
Om os > Bæredygtighed > Socialt-ansvar
However, the æ is appearing as %c3%a6. So my breadcrumb looks like this:
Om os > B%c3%a6redygtighed > Socialt-ansvar
I have <meta charset="utf-8"> in the head, so I'm unsure why these characters are still appearing?
I don't know how you are building the URLs, but all non-ASCII parts of a URL must be URL-encoded, AKA percent-encoded (except for the domain name, which uses a different encoding, Punycode). The browser does it for you if you don't do it yourself. OTOH, the browser will in most cases show you the unencoded version of your characters, so you might not be aware that what is sent over the wire is URL-encoded.
E.g., your path is sent over the wire as /om-os/b%c3%a6redygtighed/socialt-ansvar, even if you see /om-os/bæredygtighed/socialt-ansvar in the address bar. Check it with the developer tools. If you use Firefox, you will have to look at the Headers tab of the HTTP call's details in the Network tab. Chrome, instead, will also show you the HTTP call's summary row URL-encoded. That %c3%a6 in the path is the hex value of the two bytes, C3 and A6, that make up the UTF-8 encoding of the character æ.
You can even set your window.location.pathname programmatically to /om-os/bæredygtighed/socialt-ansvar, but when you read window.location.pathname afterwards, you will get it URL-encoded:
window.location.pathname = '/om-os/bæredygtighed/socialt-ansvar'
[...]
console.log(window.location.pathname)
/om-os/b%C3%A6redygtighed/socialt-ansvar
I don't know how your path flows into your breadcrumbs, but you clearly can reverse the URL-encoding before using your strings.
In JavaScript you normally do that with decodeURIComponent():
console.log(decodeURIComponent('b%c3%a6redygtighed'))
bæredygtighed
console.log(decodeURIComponent('/om-os/b%c3%a6redygtighed/socialt-ansvar'))
/om-os/bæredygtighed/socialt-ansvar
In PHP you normally do that with urldecode:
$decoded = urldecode('b%c3%a6redygtighed'); // will contain 'bæredygtighed'
But it would be better if you could make your data flow in a way that avoids the encoding and decoding steps before reaching your breadcrumbs.
If you have not yet figured out the fix, here is an addition to what walter-tross has already mentioned in the answer above.
For the given input, /om-os/bæredygtighed/socialt-ansvar, the output of the JavaScript encodeURI method is:
/om-os/b%C3%A6redygtighed/socialt-ansvar
and the output of the encodeURIComponent method is:
%2Fom-os%2Fb%C3%A6redygtighed%2Fsocialt-ansvar
Given the above, it appears that you are fetching the breadcrumb input from the URL, and that the behaviour is equivalent to encodeURI, which leaves the '/' characters intact so that you can split on them.
The fix, as already noted, would be to perform url-decode using decodeURI or decodeURIComponent on the individual components prior to using it as content.
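As a rough sketch of that fix, assuming the breadcrumb labels are derived from the path string: the helper name breadcrumbLabels is hypothetical, and how the labels are rendered (capitalisation, replacing dashes) is left out.

```javascript
// Hypothetical helper: turn a URL-encoded path into readable breadcrumb labels.
function breadcrumbLabels(pathname) {
  return pathname
    .split('/')                                      // split while "/" is still unambiguous
    .filter((segment) => segment !== '')             // drop the empty piece before the leading "/"
    .map((segment) => decodeURIComponent(segment));  // then decode each segment on its own
}

const labels = breadcrumbLabels('/om-os/b%C3%A6redygtighed/socialt-ansvar');
console.log(labels.join(' > ')); // om-os > bæredygtighed > socialt-ansvar
```

Splitting before decoding matters: if a segment ever contained an encoded slash (%2F), decoding the whole path first would split that segment in two.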
I work with a legacy ASP.NET web application that has URLs that use query string values to pass information between pages. I ran into an issue with a couple of URLs that contain spaces, numbers, and dashes that I'm trying to understand.
Here's an example of the URL:
http://myserver.com/SelectReport.aspx?Name=My Report&ReportFile=my_financial_report&ReportTitle=My Financial Planning Across A 1-Year Or 2-Year Outlook
The problematic part of the URL is the ReportTitle query string value.
When I click the link in Internet Explorer 11 or Microsoft Edge, I get a "Can't reach this page. It took too long to connect to this website. Error Code: INET_E_CONNECTION_TIMEOUT" error. It should be noted that the link works fine if I turn ON compatibility view settings in Internet Explorer 11.
When I click the link in Google Chrome, I get a "This site can't be reached. The connection was reset. ERR_CONNECTION_RESET" error.
If I delete the 2 in 2-Year, the link works. However, if I delete the 1 in 1-Year and leave 2-Year alone, the link does not work. I'd like to know why removing the 2 in 2-Year allows the link to work, but removing the 1 in 1-Year does not. This is true whether I replace spaces with %20 or not. Does anyone know the answer?
I know that I can replace the spaces in the ReportTitle query string value with plus signs (+) and it will work. This is likely the route I will take to fix the issue, but I was hoping to understand the issue better.
Thanks!
This is the continuation of my comments in the original post. I am writing this answer to share the demo example. It may not be a full answer.
There is absolutely no difference when you have spaces, or spaces are replaced by %20 or spaces are replaced by +. Also, I mentioned earlier your URL has valid characters including -.
See the three links below. I suspect it is your application that is dealing with URL encoding, decoding and having issues. It is not a general problem.
With Spaces
With %20
With +
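The same point can be checked with the standard URL API. This is only a sketch of how a standards-compliant parser treats the three spellings; it says nothing about what the ASP.NET application or anything in front of it does with the request. The URL is shortened from the one in the question.

```javascript
// Three spellings of the same query string value.
const base = 'http://myserver.com/SelectReport.aspx?ReportTitle=';
const variants = [
  base + 'A 1-Year Or 2-Year Outlook',         // raw spaces
  base + 'A%201-Year%20Or%202-Year%20Outlook', // spaces as %20
  base + 'A+1-Year+Or+2-Year+Outlook',         // spaces as +
];

// URL.searchParams applies form-urlencoded decoding to the query string.
const titles = variants.map((u) => new URL(u).searchParams.get('ReportTitle'));
console.log(titles.every((t) => t === 'A 1-Year Or 2-Year Outlook')); // true
```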
I have a test site and test DB both set to windows-1252. When I type Alt+234 into Chrome it puts this symbol in the field: Ω. And when I submit the form it posts and stores it as &#937;. I'm assuming this is the browser saying "hey, this isn't in the specified charset but I do know of an HTML equivalent, so I'll post that instead". Fine. The symbol appears properly after saving; I can save, save, save, and it always appears fine. But if I try the same thing with Alt+230, the browser does not submit the HTML entity value of µ. Instead I see "(unable to decode value)" when viewing the POST in the Chrome DevTools window, and it ends up being stored in the database as a question mark.
Why does it treat Alt+234 (Ω) differently than Alt+230 (µ)?
I know I should switch to UTF8 but I still would like to know why it is functioning this way. Thanks!
Using encodeURIComponent to wrap the value fixed the problem.
Broken:
`?value=${myValue}`
Working:
`?value=${encodeURIComponent(myValue)}`
U+03A9 Ω Greek capital letter omega is not part of Windows code page 1252.
U+00B5 µ Micro sign (which is not the exact same character as Greek mu) is part of 1252 (byte 181).
The Alt+keypad shortcut numbers don't align with code page 1252, or the current ANSI code page in general, so being able to type a character from that shortcut doesn't imply membership of those code pages. Instead they are from DOS code page 437.
And when I submit the form it posts and stores it as &#937;. I'm assuming this is the browser saying "hey, this isn't in the specified charset but I do know of an html equivalent, so I'll post that instead"
Yes, this is a long-standing weird unrecoverable mangling that HTML5 finally standardised, for when a character is not encodable in the encoding the page has requested.
Instead I see "(unable to decode value)" when viewing the POST in the Chrome DevTool window. And it ends up being stored in the database as a question mark.
The browser will be sending that character as code page 1252 byte 181. The devtools and whatever your application is aren't expecting to be dealing with code page 1252 bytes... probably they are expecting UTF-8. Because byte 181 on its own is not a valid UTF-8 sequence they can't keep it.
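A small sketch of why the lone byte cannot survive, using the standard TextEncoder and TextDecoder APIs. It only illustrates the byte-level argument above; what the devtools or the application actually do internally is an assumption.

```javascript
// Byte 181 (0xB5) is "µ" in Windows-1252; the same value is U+00B5 in Unicode.
const cp1252Byte = 0xB5;
console.log(String.fromCharCode(cp1252Byte)); // µ

// UTF-8 needs two bytes for the same character: 0xC2 0xB5.
const utf8Bytes = Array.from(new TextEncoder().encode('\u00B5'));

// A lone 0xB5 is a continuation byte with no lead byte, so strict decoding fails.
let strictDecodeFailed = false;
try {
  new TextDecoder('utf-8', { fatal: true }).decode(new Uint8Array([cp1252Byte]));
} catch (err) {
  strictDecodeFailed = true;
}
console.log(strictDecodeFailed); // true

// Lenient decoding substitutes U+FFFD, the replacement character.
const lenient = new TextDecoder('utf-8').decode(new Uint8Array([cp1252Byte]));
console.log(lenient === '\uFFFD'); // true
```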
I'm trying to include a simple hyperlink in a website:
...Engineers (IEEE) projects:
So that it ends up looking like "...Engineers (IEEE) projects:" with "IEEE" being the hyperlink.
When I click on copy link address and paste the address, instead of getting
http://www.ieee.ucla.edu/
I get
http://www.ieee.ucla.edu/%C3%A2%E2%82%AC%C5%BD
and when I click on the link, it takes me to a 404 page.
Check the link. These special characters are added automatically by the browser (URL encoding).
Use this code and it will work:
<a href="http://www.ieee.ucla.edu/">IEEE</a>
The proper format for adding a hyperlink to HTML is as follows:
<a href="URL">(text to be hyperlinked)</a>
For a better understanding, go through this link: http://www.w3schools.com/html/html_links.asp
%C3%A2%E2%82%AC%C5%BD decodes to â€Ž, which is what you get when the UTF-8 bytes of a non-ASCII Unicode character (here E2 80 8E, the invisible U+200E LEFT-TO-RIGHT MARK) are parsed as Windows-1252 data.
Use straight quotes to delimit attribute values in your real code. You are doing this in the code you have included in the question, but that won't have the effect you are seeing. Presumably your codes are being transformed at some point in your real code.
Add appropriate HTTP headers and <meta> data to tell the browser what encoding your file is really using.
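One way to see which character was originally in the href is to reverse the mis-decoding step by step. This is a sketch: the cp1252 lookup table below is a partial, hand-written mapping that covers only the three characters occurring in this particular URL.

```javascript
// Step 1: undo the percent-encoding seen in the broken link.
const mojibake = decodeURIComponent('%C3%A2%E2%82%AC%C5%BD');

// Step 2: map each character back to its Windows-1252 byte.
// Partial table, covering only the three characters that occur here.
const cp1252 = { '\u00E2': 0xE2, '\u20AC': 0x80, '\u017D': 0x8E };
const bytes = Uint8Array.from(Array.from(mojibake, (ch) => cp1252[ch]));

// Step 3: read those bytes as UTF-8, which is what the file really contained.
const original = new TextDecoder('utf-8').decode(bytes);
console.log(original.codePointAt(0).toString(16)); // 200e
```

The result is a single invisible character sitting at the end of the URL, the kind that is easy to pick up when copying a link from a web page or word processor.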
I've been trying to get a particular design to work for an email signature without any luck, I have a feeling it's not possible, but perhaps someone has a solution.
The issue is that there is a small image that needs to be above the first line one types on. i.e.
[image]
[type here]
[signature details - e.g. phone no.]
If I make it normally, Outlook always inserts a line break before the image and places the cursor there. I can't get the cursor to start after the image without clicking there (e.g. by pressing Tab after typing the subject).
I've tried making the image a background image of a div/span/table, and I've tried using CSS to set the margin-top to a negative number, but the problem seems to stem from the fact that Outlook inserts the signature after the div it creates for typing in.
Does anyone have a suggestion or is my task futile?
Which version of Outlook?
Try adding this to the image:
style="display:block;"
P.S. Backgrounds don't work in some versions of Outlook unless you use some sort of conditional statements. You need to test how your e-mail renders in Microsoft Word (the MSO rendering engine).
http://www.campaignmonitor.com/css/
You need to set a background-image on the body, with padding-top equal to the height of the image.
I did something similar a while ago to fix the "cursor problem" you mentioned. I don't remember which version of Outlook it was for, but nobody ever reported any problems with it.
As Kyle R mentioned, you should definitely test how your email renders with different clients.
An email can be a multipart message. This means the body can have multiple encodings. Each encoding comes with its own header:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=frontier
This is a message with multiple parts in MIME format.
--frontier
Content-Type: text/plain
This is the body of the message.
--frontier
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64
PGh0bWw+CiAgPGhlYWQ+CiAgPC9oZWFkPgogIDxib2R5PgogICAgPHA+VGhpcyBpcyB0aGUg
Ym9keSBvZiB0aGUgbWVzc2FnZS48L3A+CiAgPC9ib2R5Pgo8L2h0bWw+Cg==
--frontier--
(example from here)
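To see what the second part of the example carries, its base64 payload can be decoded. This is a small sketch using the standard atob() function; it shows that the part holds an HTML version of the same message body.

```javascript
// The base64 payload from the second MIME part above, with the line break removed.
const payload =
  'PGh0bWw+CiAgPGhlYWQ+CiAgPC9oZWFkPgogIDxib2R5PgogICAgPHA+VGhpcyBpcyB0aGUg' +
  'Ym9keSBvZiB0aGUgbWVzc2FnZS48L3A+CiAgPC9ib2R5Pgo8L2h0bWw+Cg==';

// atob() is available in browsers and in Node.js 16 or later.
const decoded = atob(payload);
console.log(decoded); // an HTML document whose <p> holds the message body
```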
In most clients the default encoding is Content-Type: text/plain. Adding an image, however, switches the encoding to e.g. base64.
Each new encoding starts on a new line. I assume this is what is causing you trouble, since it automatically puts your text/cursor below the image.
One way to get around this would be to encode the entire message as html - the image as well as the text, using an img tag, inline.
Clicking next to the image lets you do this unknowingly, I assume.