userinfo in HTML href

I have noticed that when I use the userinfo part (user:password) in an HTML attribute such as href or src, the Chrome dev tools seem to go crazy.
For example, if I navigate to http://localhost:8080 and the displayed document's HTML contains this:
<a href="http://kaczka:dziwaczka@google.pl">yo mama</a>
then the dev tools, upon hovering or copying the attribute value, yield http://localhost:8080/http://kaczka:dziwaczka#google.pl.
The link itself, however, works correctly when simply clicked.
Does anybody know whether this is an issue with Chrome dev tools or some intentional feature? If it is intentional, why does this happen?

No, this is not by design. Rather, URLs of this form have never been handled consistently. Generally speaking, specifying credentials in the URL has been deprecated - use the Authorization HTTP header instead (see serverfault.com and https://code.google.com/p/chromium/issues/detail?id=123150).
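For illustration, here's a minimal sketch of the header-based approach (the /api path is just a placeholder):

var user = 'kaczka', password = 'dziwaczka';
// Send the credentials in the Authorization header instead of the URL.
fetch('http://localhost:8080/api', {
    headers: { 'Authorization': 'Basic ' + btoa(user + ':' + password) }
}).then(function (response) {
    console.log(response.status);
});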

What is the difference between these URL syntaxes?

I was sent a hyperlink to a Tableau Public link by a client. When I tried opening it, I got a 404 error. I wrote back to the client, but was told the link was working fine. I visited his profile page and was able to open the presentation there, but the URL that ended up working was slightly different from the one behind the original, non-functioning link.
Here's the anonymized URL behind the original link:
https://public.tableau.com/profile/[client_name]%23!/vizhome/Project-AirportDelay/FlightPerformancesinUSA?publish=yes
And here's the URL via the profile page:
https://public.tableau.com/profile/[client_name]#!/vizhome/Project-AirportDelay/FlightPerformancesinUSA
The only differences I see are ?publish=yes and %23!. I tried appending the former, ?publish=yes, to the working URL, and it was still functional. So I suspect it has to do with the other difference: %23! vs. #!. Could the first work because he is opening it from his computer, where he is likely logged into Tableau Public? What's the difference between these syntaxes? Any ideas why the original hyperlink might not be functional?
For obvious privacy reasons, I can't provide the whole URL.
It looks like ?publish=yes is the basic URL pattern for passing filters, and %23 is the URL-encoded representation of #.
The first # after the authority component starts the fragment component. If the # should be part of the path component or the query component, it has to be percent-encoded as %23.
As # is a reserved character, these URIs aren’t equivalent:
http://example.com/foo#bar
http://example.com/foo%23bar
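You can see the split with the browser's URL API:

// The raw # starts the fragment; the percent-encoded %23 stays in the path.
var plain = new URL('http://example.com/foo#bar');
console.log(plain.pathname, plain.hash); // "/foo" "#bar"
var encoded = new URL('http://example.com/foo%23bar');
console.log(encoded.pathname, encoded.hash); // "/foo%23bar" ""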
There are countless ways a URI reference can become erroneous. The culprit is often software, like a word processor, where someone pastes the correct URI and the software incorrectly percent-encodes it (perhaps assuming that the user didn't paste the real/correct URI).
Copy-pasting the URI from the browser address bar into a plain text document should always work correctly.

What happens to an img HTML tag with a JSON src?

What happens to an img HTML tag (<img src="address-to-server-action">) if the response from the server is not a valid image, say JSON? Would the browser throw a parse error? Do all browsers have the same behaviour? Does it matter if the img is hidden?
I'm actually just trying to bring a cookie from a cross domain server that serves only JSON.
If you try something like <img src="http://example.com/some.json"> and want JSON back from it, browsers won't let you. If the src value isn't a valid image the browser supports, the browser just won't show it and won't let you do anything else with the response.
That’s how browsers work and that behavior is documented in the HTML spec, which references the request algorithm in the Fetch spec, one property of which is the request mode.
Unless a fetch is performed with a specific mode set, the default mode no-cors is used. And the HTML spec's algorithm for image requests doesn't set any other mode, so they're no-cors.
And responses to no-cors requests are handled by browsers as opaque filtered responses, which means that from any client-side JavaScript code you might try in your web app, you won’t be able to access any information/properties from the response.
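You can observe this yourself with an explicit no-cors fetch (a minimal sketch; the URL is a placeholder):

fetch('https://example.com/some.json', { mode: 'no-cors' })
    .then(function (response) {
        console.log(response.type);   // "opaque"
        console.log(response.status); // 0 - the real status is masked
        return response.text();
    })
    .then(function (body) {
        console.log(body.length); // 0 - the body is not exposed either
    });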
The browser will display the alternative text, and it may inform the user that the image did not load properly.
You can hide the image (display: none;) and it will not show up.
You can set cross domain cookies as described in http://www.ainixon.me/set-cookie-on-cross-domains/.
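The approach described there boils down to an invisible image beacon; here's a rough sketch, assuming a hypothetical endpoint whose response carries a Set-Cookie header (note that modern third-party-cookie restrictions may block this):

// Hypothetical endpoint; the server's response sets the cookie.
var beacon = document.createElement('img');
beacon.style.display = 'none'; // hidden, as noted above
beacon.src = 'https://other-domain.example/set-cookie';
document.body.appendChild(beacon);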

Why is IE10 removing URL hash marks on external redirect links

I have a basic link:
<a href="http://pieworld.com/apple/#1/" target="_blank">Free Pie Here</a>
but when I click on it, I'm redirected to https://pieworld.com/apple
Everything after the hash mark, including the hash itself, is dropped. This only happens in IE10. I've tested without the target="_blank" as well, but the link still breaks at the hash.
Can't seem to find any documentation on this. The closest I've come is this SO question, but it doesn't help.
Some background info that might help:
This is a .Net site
I'm redirecting from an http: site to an https: site.
According to RFC 3986 (https://www.rfc-editor.org/rfc/rfc3986), it is not OK to use this format. You should remove the trailing slash. With a trailing slash, you point to a directory within the server; without it, you point to a document, and with the hash mark you can point to a segment of the document.
A hash character is used for bookmarks in a URL. To use a hash character as part of the URL itself, you need to URL-encode it as %23:
<a href="http://pieworld.com/apple/%231/" target="_blank">Free Pie Here</a>
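For reference, encodeURIComponent performs this percent-encoding for you:

console.log(encodeURIComponent('#1')); // "%231"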
Why do you have a trailing slash after the hash?
Try https://pieworld.com/apple/#1
That would be more standard. I've never heard of anyone putting trailing slashes after hash links.
I think, as the other folks suggested, that the website you are trying to navigate to may interpret the /#1 as a folder/page inside the parent page/document. Try removing the forward slash before the #1, or look inside the HTML for the heading's id/name attribute so you can link to it directly.
May also be a bug in IE10.
-Phantom
Any URL that contains a # character is a fragment URL. The portion of the URL to the left of the # identifies a resource that can be downloaded by a browser and the portion on the right, known as the fragment identifier, specifies a location within the resource.
http://www.httpwatch.com/features.htm#print
In HTML documents, the browser looks for an element with id attribute matching the fragment. For example, in the URL shown above the browser finds a matching tag in the Printing Support heading:
<h3 id="print">Printing Support</h3>
and scrolls the page to display that section.
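Roughly speaking, the browser does the equivalent of this sketch when the fragment changes:

var target = document.getElementById('print'); // the fragment identifier
if (target) {
    target.scrollIntoView(); // bring the matching element into view
}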
I am not sure the slash after the hash is supported. If you didn't mean to use it as a fragment URL, you should remove the hash or replace it.
The syntax of the Location header field has been changed to allow all URI references, including relative references and fragments,
along with some clarifications as to when use of fragments would not be appropriate. (Section 7.1.2)
For more information, check this thorough post.
Hash removed from URL when back button clicked in IE9, IE10, IE11
In IE10, clicking the href link the first time goes to the correct URL: http://www.example.com/yy/zz/ff/paul.html#20007_14
If the back button is clicked in IE10 and the href link is clicked again, it goes to this URL instead: http://www.example.com/yy/zz/ff/paul.html
Solution:
Change your URL to https.
It worked for me.

Spoofing HTTP-request Referrer from HTML?

Is there some secret and mystical way to change the value of my HTTP request's Referer, or at the very least, keep it from being sent? Also, using a MitM page from another domain would not solve my issue, as you would then just be submitting that other page's value.
This is not browser specific, I would need to do this on the HTML level.
The problem I am facing is a silent-login page that sends an HTTP redirect to the HTTP Referer, unless it is the same domain or empty.
You cannot control this at the HTML level. Your only option is to modify the login code to not issue the redirect, or to direct it to the desired page.
It's an old question, but I know how you can do this. The first way is not guaranteed across all browsers, but you can use rel=noreferrer. AFAIK Chrome is the only UA to currently support it, but it is in the standard. Firefox may also; I don't know.
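For example (markup sketch):

<a href="https://example.com/" rel="noreferrer">Link sent without a Referer</a>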
The second way is far more reliable, and it involves a cool little hack someone shared with me on IRC:
Basically, construct an iframe from a base64-encoded data: URI. The framed document has a script that listens for window.postMessage(), and when it is fed a command with a URL to visit, it executes window.top.location = msg.data.URI (or however one reads the message). Sorry I can't recall exactly; I haven't slept for a few days.
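Here's a rough, untested sketch of that hack as described (names are hypothetical; modern browsers may refuse top navigation from a data: iframe):

// Framed document: listens for a message and navigates the top window.
// The data: URI gives the frame an opaque origin, so no Referer is leaked.
var framed = '<script>' +
    'window.addEventListener("message", function (e) {' +
    '  window.top.location = e.data.URI;' +
    '}, false);' +
    '<\/script>'; // escaped so the string can also live inside inline HTML
var iframe = document.createElement('iframe');
iframe.onload = function () {
    // Feed it the URL to visit.
    iframe.contentWindow.postMessage({ URI: 'https://example.com/' }, '*');
};
iframe.src = 'data:text/html;base64,' + btoa(framed);
document.body.appendChild(iframe);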
Enjoy if you still care.. :)

Best Way to View Generated Source of Webpage?

I'm looking for a tool that will give me the proper generated source including DOM changes made by AJAX requests for input into W3's validator. I've tried the following methods:
Web Developer Toolbar - Generates invalid source according to the doc-type (e.g. it removes the self closing portion of tags). Loses the doctype portion of the page.
Firebug - Fixes potential flaws in the source (e.g. unclosed tags). Also loses the doctype portion of the page and injects the console, which itself is invalid HTML.
IE Developer Toolbar - Generates invalid source according to the doc-type (e.g. it makes all tags uppercase, against the XHTML spec).
Highlight + View Selection Source - Frequently difficult to get the entire page; also excludes the doc-type.
Is there any program or add-on out there that will give me the exact current version of the source, without fixing or changing it in some way? So far, Firebug seems the best, but I worry it may fix some of my mistakes.
Solution
It turns out there is no exact solution to what I wanted as Justin explained. The best solution seems to be to validate the source inside of Firebug's console, even though it will contain some errors caused by Firebug. I'd also like to thank Forgotten Semicolon for explaining why "View Generated Source" doesn't match the actual source. If I could mark 2 best answers, I would.
Justin is dead on. The key point here is that HTML is just a language for describing a document. Once the browser reads it, it's gone. Open tags, close tags, and formatting are all taken care of by the parser and then go away. Any tool that shows you HTML is generating it based on the contents of the document, so it will always be valid.
I had to explain this to another web developer once, and it took a little while for him to accept it.
You can try it for yourself in any JavaScript console:
el = document.createElement('div');
el.innerHTML = "<p>Some text<P>More text";
el.innerHTML; // <p>Some text</p><p>More text</p>
The unclosed tag and uppercase tag name are gone, because that HTML was parsed and discarded after the second line.
The right way to modify the document from JavaScript is with document methods (createElement, appendChild, setAttribute, etc.) and you'll observe that there's no reference to tags or HTML syntax in any of those functions. If you're using document.write, innerHTML, or other HTML-speaking calls to modify your pages, the only way to validate it is to catch what you're putting into them and validate that HTML separately.
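For instance, a small sketch of the document-method style:

// Build content with DOM methods; no HTML strings are parsed here.
var p = document.createElement('p');
p.setAttribute('class', 'note');
p.appendChild(document.createTextNode('Some text'));
document.body.appendChild(p);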
That said, the simplest way to get at the HTML representation of the document is this:
document.documentElement.innerHTML
[updating in response to more details in the edited question]
The problem you're running into is that, once a page is modified by Ajax requests, the current HTML exists only inside the browser's DOM; there's no longer any independent source HTML that you can validate other than what you can pull out of the DOM.
As you've observed, IE's DOM stores tags in upper case, fixes up unclosed tags, and makes lots of other alterations to the HTML it got originally. This is because browsers are generally very good at taking HTML with problems (e.g. unclosed tags) and fixing up those problems to display something useful to the user. Once the HTML has been canonicalized by IE, the original source HTML is essentially lost from the DOM's perspective, as far as I know.
Firefox most likely makes fewer of these changes, so Firebug is probably your better bet.
A final (and more labor-intensive) option may work for pages with simple Ajax alterations, e.g. fetching some HTML from the server and importing it into the page inside a particular element. In that case, you can use Fiddler or a similar tool to manually stitch together the original HTML with the Ajax HTML. This is probably more trouble than it's worth, and is error prone, but it's one more possibility.
[Original response here to the original question]
Fiddler (http://www.fiddlertool.com/) is a free, browser-independent tool which works very well to fetch the exact HTML received by a browser. It shows you exact bytes on the wire as well as decoded/unzipped/etc content which you can feed into any HTML analysis tool. It also shows headers, timings, HTTP status, and lots of other good stuff.
You can also use fiddler to copy and rebuild requests if you want to test how a server responds to slightly different headers.
Fiddler works as a proxy server, sitting in between your browser and the website, and logs traffic going both ways.
I know this is an old post, but I just found this piece of gold. This is old (2006), but it still works with IE9. I personally added a bookmark with this.
Just copy paste this in your browser's address bar:
javascript:void(window.open("javascript:document.open(\"text/plain\");document.write(opener.document.body.parentNode.outerHTML)"))
As for Firefox, the Web Developer Toolbar does the job. I usually use this, but sometimes some dirty 3rd-party ASP.NET controls generate different markup based on the user agent...
EDIT
As Bryan pointed out in the comments, some browsers remove the javascript: part when copy/pasting into the URL bar. I just tested, and that's the case with IE10.
If you load the document in Chrome, the Developer|Elements view will show you the HTML as fiddled by your JS code. It's not directly HTML text and you have to open (unfold) any elements of interest, but you effectively get to inspect the generated HTML.
In the Web Developer Toolbar, have you tried the Tools -> Validate HTML or Tools -> Validate Local HTML options?
The Validate HTML option sends the url to the validator, which works well with publicly facing sites. The Validate Local HTML option sends the current page's HTML to the validator, which works well with pages behind a login, or those that aren't publicly accessible.
You may also want to try View Source Chart (also available as a Firefox add-on). An interesting note there:
Q. Why does View Source Chart change my XHTML tags to HTML tags?
A. It doesn't. The browser is making these changes; VSC merely displays what the browser has done with your code. Most common: self-closing tags lose their closing slash (/). See this article on Rendered Source for more information (archive.org).
Using the Firefox Web Developer Toolbar (https://addons.mozilla.org/en-US/firefox/addon/60)
Just go to View Source -> View Generated Source
I use it all the time for the exact same thing.
I had the same problem, and I found a solution here:
http://ubuntuincident.wordpress.com/2011/04/15/scraping-ajax-web-pages/
So, to use Crowbar, the tool from here:
http://simile.mit.edu/wiki/Crowbar (now (2015-12) 404s)
wayback machine link:
http://web.archive.org/web/20140421160451/http://simile.mit.edu/wiki/Crowbar
It gave me the faulty, invalid HTML.
This is an old question, and here's an old answer that once worked flawlessly for me for many years, but doesn't any more, at least not as of January 2016:
The "Generated Source" bookmarklet from SquareFree does exactly what you want -- and, unlike the otherwise fine "old gold" from #Johnny5, displays as source code (rather than being rendered normally by the browser, at least in the case of Google Chrome on Mac):
https://www.squarefree.com/bookmarklets/webdevel.html#generated_source
Unfortunately, it now behaves just like the "old gold" from @Johnny5: it does not show up as source code any more. Sorry.
In Firefox, just press Ctrl+A (select everything on the screen), then right-click and choose "View Selection Source". This captures any changes made by JavaScript to the DOM.
alert(document.documentElement.outerHTML);
Check out "View Rendered Source" chrome extension:
https://chrome.google.com/webstore/detail/view-rendered-source/ejgngohbdedoabanmclafpkoogegdpob/
Why not type this in the URL bar?
javascript:alert(document.body.innerHTML)
In the elements tab, right click the html node > copy > copy element - then paste into an editor.
As has been mentioned above, once the source has been converted into a DOM tree, the original source no longer exists in the browser. Any changes you make will be to the DOM, not the source.
However, you can parse the modified DOM back into HTML, letting you see the "generated source".
In Chrome, open the developer tools and click the elements tab.
Right click the HTML element.
Choose copy > copy element.
Paste into an editor.
You can now see the current DOM as an HTML page.
This is not the full DOM
Note that the DOM cannot be fully represented by an HTML document. This is because the DOM has many more properties than HTML has attributes. However, this will do a reasonable job.
I think IE dev tools (F12) has View > Source > DOM (Page).
You would need to copy and paste the DOM and save it to send to the validator.
The only thing I found is the BetterSource extension for Safari; it will show you the manipulated source of the document. The only downside is that there's nothing remotely like it for Firefox.
The JavaScript snippet below will get you the complete Ajax-rendered generated HTML source. It's browser independent. Enjoy :)
function outerHTML(node) {
    // IE and Chrome have the built-in property; lower versions of Firefox
    // do not support element.outerHTML, so fall back to serializing a
    // clone of the node inside a temporary <div>
    return node.outerHTML || (
        function (n) {
            var div = document.createElement('div'), h;
            div.appendChild(n.cloneNode(true));
            h = div.innerHTML;
            div = null;
            return h;
        })(node);
}

var outerhtml = outerHTML(document.getElementsByTagName('html')[0]);
var node = document.doctype;
var doctypestring = "";
if (node) {
    // document.doctype is null on IE8 and below
    doctypestring = "<!DOCTYPE "
        + node.name
        + (node.publicId ? ' PUBLIC "' + node.publicId + '"' : '')
        + (!node.publicId && node.systemId ? ' SYSTEM' : '')
        + (node.systemId ? ' "' + node.systemId + '"' : '')
        + '>';
} else {
    // for IE8 and below, the doctype can be read like this
    doctypestring = document.all[0].text;
}
// the complete generated source: doctype plus the rendered markup
var generatedsource = doctypestring + outerhtml;
I was able to solve a similar issue by logging the result of the Ajax call to the console. This was the HTML returned, and I could easily see any issues it had.
In the .done() function of my Ajax call, I added console.log(result) so I could see the HTML in the debugger console.
function GetReversals() {
    $("#getReversalsLoadingButton").removeClass("d-none");
    $("#getReversalsButton").addClass("d-none");
    $.ajax({
        url: '/Home/LookupReversals',
        data: $("#LookupReversals").serialize(),
        type: 'Post',
        cache: false
    }).done(function (result) {
        $('#reversalResults').html(result);
        console.log(result);
    }).fail(function (jqXHR, textStatus, errorThrown) {
        //alert("There was a problem getting results. Please try again. " + jqXHR.responseText + " | " + jqXHR.statusText);
        $("#reversalResults").html("<div class='text-danger'>" + jqXHR.responseText + "</div>");
    }).always(function () {
        $("#getReversalsLoadingButton").addClass("d-none");
        $("#getReversalsButton").removeClass("d-none");
    });
}