Best Way to View Generated Source of Webpage? - html

I'm looking for a tool that will give me the proper generated source including DOM changes made by AJAX requests for input into W3's validator. I've tried the following methods:
Web Developer Toolbar - Generates invalid source according to the doc-type (e.g. it removes the self closing portion of tags). Loses the doctype portion of the page.
Firebug - Fixes potential flaws in the source (e.g. unclosed tags). Also loses doctype portion of tags and injects the console which itself is invalid HTML.
IE Developer Toolbar - Generates invalid source according to the doc-type (e.g. it makes all tags uppercase, against XHTML spec).
Highlight + View Selection Source - Frequently difficult to get the entire page, also excludes doc-type.
Is there any program or add-on out there that will give me the exact current version of the source, without fixing or changing it in some way? So far, Firebug seems the best, but I worry it may fix some of my mistakes.
Solution
It turns out there is no exact solution to what I wanted as Justin explained. The best solution seems to be to validate the source inside of Firebug's console, even though it will contain some errors caused by Firebug. I'd also like to thank Forgotten Semicolon for explaining why "View Generated Source" doesn't match the actual source. If I could mark 2 best answers, I would.

Justin is dead on. The key point here is that HTML is just a language for describing a document. Once the browser reads it, it's gone. Open tags, close tags, and formatting are all taken care of by the parser and then go away. Any tool that shows you HTML is generating it based on the contents of the document, so it will always be valid.
I had to explain this to another web developer once, and it took a little while for him to accept it.
You can try it for yourself in any JavaScript console:
el = document.createElement('div');
el.innerHTML = "<p>Some text<P>More text";
el.innerHTML; // <p>Some text</p><p>More text</p>
The un-closed tags and uppercase tag names are gone, because that HTML was parsed and discarded after the second line.
The right way to modify the document from JavaScript is with document methods (createElement, appendChild, setAttribute, etc.) and you'll observe that there's no reference to tags or HTML syntax in any of those functions. If you're using document.write, innerHTML, or other HTML-speaking calls to modify your pages, the only way to validate it is to catch what you're putting into them and validate that HTML separately.
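As a rough sketch of "catching what you're putting into them": the helper below is hypothetical (the name createHtmlRecorder and its shape are assumptions, not an existing API). It records each HTML string before assigning it to innerHTML, so the raw strings can be pasted into a validator later.

```javascript
// Hypothetical helper: keeps a copy of every HTML string before it is
// handed to the browser's parser, so the unparsed markup can be validated.
function createHtmlRecorder() {
  const captured = [];
  return {
    captured,
    // Use this instead of assigning element.innerHTML directly.
    setInnerHTML(element, html) {
      captured.push(String(html));
      element.innerHTML = html;
    }
  };
}
```

After the page has run, recorder.captured holds the strings exactly as the script wrote them, before the parser fixed anything.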
That said, the simplest way to get at the HTML representation of the document is this:
document.documentElement.innerHTML

[updating in response to more details in the edited question]
The problem you're running into is that, once a page is modified by ajax requests, the current HTML exists only inside the browser's DOM-- there's no longer any independent source HTML that you can validate other than what you can pull out of the DOM.
As you've observed, IE's DOM stores tags in upper case, fixes up unclosed tags, and makes lots of other alterations to the HTML it got originally. This is because browsers are generally very good at taking HTML with problems (e.g. unclosed tags) and fixing up those problems to display something useful to the user. Once the HTML has been canonicalized by IE, the original source HTML is essentially lost from the DOM's perspective, as far as I know.
Firefox most likely makes fewer of these changes, so Firebug is probably your better bet.
A final (and more labor-intensive) option may work for pages with simple ajax alterations, e.g. fetching some HTML from the server and importing this into the page inside a particular element. In that case, you can use fiddler or similar tool to manually stitch together the original HTML with the Ajax HTML. This is probably more trouble than it's worth, and is error prone, but it's one more possibility.
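For the simplest case, the stitching could look roughly like the sketch below. It assumes the Ajax response is inserted into a container that is empty in the original page and whose opening tag appears exactly once; stitchFragment is a made-up name, and anything more involved needs a real HTML parser.

```javascript
// Sketch only: splice an Ajax HTML fragment into the original page source,
// right after the opening tag of the (assumed empty) target container.
function stitchFragment(pageHtml, openingTag, fragmentHtml) {
  const at = pageHtml.indexOf(openingTag);
  if (at === -1) {
    throw new Error('Opening tag not found: ' + openingTag);
  }
  const insertAt = at + openingTag.length;
  return pageHtml.slice(0, insertAt) + fragmentHtml + pageHtml.slice(insertAt);
}
```

Both inputs would come from Fiddler: the original response body and the body of the Ajax response.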
[Original response here to the original question]
Fiddler (http://www.fiddlertool.com/) is a free, browser-independent tool which works very well to fetch the exact HTML received by a browser. It shows you exact bytes on the wire as well as decoded/unzipped/etc content which you can feed into any HTML analysis tool. It also shows headers, timings, HTTP status, and lots of other good stuff.
You can also use fiddler to copy and rebuild requests if you want to test how a server responds to slightly different headers.
Fiddler works as a proxy server, sitting in between your browser and the website, and logs traffic going both ways.

I know this is an old post, but I just found this piece of gold. It is old (2006), but still works with IE9. I personally added a bookmark with this.
Just copy paste this in your browser's address bar:
javascript:void(window.open("javascript:document.open(\"text/plain\");document.write(opener.document.body.parentNode.outerHTML)"))
As for Firefox, the Web Developer Toolbar does the job. I usually use this, but sometimes some dirty third-party ASP.NET controls generate different markup based on the user agent...
EDIT
As Bryan pointed out in the comments, some browsers remove the javascript: part when you copy and paste into the URL bar. I just tested, and that's the case with IE10.

If you load the document in Chrome, the Developer|Elements view will show you the HTML as fiddled by your JS code. It's not directly HTML text and you have to open (unfold) any elements of interest, but you effectively get to inspect the generated HTML.

In the Web Developer Toolbar, have you tried the Tools -> Validate HTML or Tools -> Validate Local HTML options?
The Validate HTML option sends the url to the validator, which works well with publicly facing sites. The Validate Local HTML option sends the current page's HTML to the validator, which works well with pages behind a login, or those that aren't publicly accessible.
You may also want to try View Source Chart (also as FireFox add-on). An interesting note there:
Q. Why does View Source Chart change my XHTML tags to HTML tags?
A. It doesn't. The browser is making these changes, VSC merely displays what the browser has done with your code. Most common: self closing tags lose their closing slash (/). See this article on Rendered Source for more information (archive.org).

Using the Firefox Web Developer Toolbar (https://addons.mozilla.org/en-US/firefox/addon/60)
Just go to View Source -> View Generated Source
I use it all the time for the exact same thing.

I had the same problem, and I've found here a solution:
http://ubuntuincident.wordpress.com/2011/04/15/scraping-ajax-web-pages/
So, to use Crowbar, the tool from here:
http://simile.mit.edu/wiki/Crowbar (now (2015-12) 404s)
wayback machine link:
http://web.archive.org/web/20140421160451/http://simile.mit.edu/wiki/Crowbar
It gave me the faulty, invalid HTML.

This is an old question, and here's an old answer that worked flawlessly for me for many years, but doesn't any more, at least not as of January 2016:
The "Generated Source" bookmarklet from SquareFree does exactly what you want -- and, unlike the otherwise fine "old gold" from #Johnny5, displays as source code (rather than being rendered normally by the browser, at least in the case of Google Chrome on Mac):
https://www.squarefree.com/bookmarklets/webdevel.html#generated_source
Unfortunately, it behaves just like the "old gold" from #Johnny5: it does not show up as source code any more. Sorry.

In Firefox, just ctrl-a (select everything on the screen) then right click "View Selection Source". This captures any changes made by JavaScript to the DOM.

alert(document.documentElement.outerHTML);

Check out "View Rendered Source" chrome extension:
https://chrome.google.com/webstore/detail/view-rendered-source/ejgngohbdedoabanmclafpkoogegdpob/

Why not type this in the URL bar?
javascript:alert(document.body.innerHTML)

In the elements tab, right click the html node > copy > copy element - then paste into an editor.
As has been mentioned above, once the source has been converted into a DOM tree, the original source no longer exists in the browser. Any changes you make will be to the DOM, not the source.
However, you can parse the modified DOM back into HTML, letting you see the "generated source".
In Chrome, open the developer tools and click the elements tab.
Right click the HTML element.
Choose copy > copy element.
Paste into an editor.
You can now see the current DOM as an HTML page.
This is not the full DOM
Note that the DOM cannot be fully represented by an HTML document. This is because the DOM has many more properties than the HTML has attributes. However this will do a reasonable job.

I think the IE dev tools (F12) have View > Source > DOM (Page).
You would need to copy and paste the DOM and save it to send to the validator.

The only thing I found is the BetterSource extension for Safari, which shows you the manipulated source of the document. The only downside is that there is nothing remotely like it for Firefox.

The JavaScript snippet below will get you the complete generated HTML source, including Ajax-rendered content. It is browser independent. Enjoy :)
function outerHTML(node) {
    // IE and Chrome have a built-in outerHTML; otherwise build one,
    // since older versions of Firefox do not support element.outerHTML.
    return node.outerHTML || (
        function (n) {
            var div = document.createElement('div'), h;
            div.appendChild(n.cloneNode(true));
            h = div.innerHTML;
            div = null;
            return h;
        })(node);
}
var outerhtml = outerHTML(document.getElementsByTagName('html')[0]);
var node = document.doctype;
var doctypestring = "";
if (node) {
    // IE8 and below do not have document.doctype; accessing it returns null.
    doctypestring = "<!DOCTYPE "
        + node.name
        + (node.publicId ? ' PUBLIC "' + node.publicId + '"' : '')
        + (!node.publicId && node.systemId ? ' SYSTEM' : '')
        + (node.systemId ? ' "' + node.systemId + '"' : '')
        + '>';
} else {
    // For IE8 and below you can access the doctype like this.
    doctypestring = document.all[0].text;
}
doctypestring + outerhtml;

I was able to solve a similar issue by logging the result of the ajax call to the console. This was the HTML returned, and I could easily see any issues that it had.
In the .done() callback of my ajax call I added console.log(result) so I could see the HTML in the debugger console.
function GetReversals() {
$("#getReversalsLoadingButton").removeClass("d-none");
$("#getReversalsButton").addClass("d-none");
$.ajax({
url: '/Home/LookupReversals',
data: $("#LookupReversals").serialize(),
type: 'Post',
cache: false
}).done(function (result) {
$('#reversalResults').html(result);
console.log(result);
}).fail(function (jqXHR, textStatus, errorThrown) {
//alert("There was a problem getting results. Please try again. " + jqXHR.responseText + " | " + jqXHR.statusText);
$("#reversalResults").html("<div class='text-danger'>" + jqXHR.responseText + "</div>");
}).always(function () {
$("#getReversalsLoadingButton").addClass("d-none");
$("#getReversalsButton").removeClass("d-none");
});
}

Related

Raising errors for typos in HTML

I just spent an hour trying to figure out why a CSS script wasn't loading. It turns out it was because I typed:
<link rel="styelsheet" href="path/to/script">
instead of
<link rel="stylesheet" href="path/to/script">
It would have been nice had Firefox said something to me, like styelsheet is unknown, or whatever. I would have realized the typo immediately. Any way to get that in the console?
Nothing prevents you from extending the link element with your own custom rel attribute. Due to this, the browser probably chooses to remain silent, assuming you know what you're doing :)
I would encourage you to get an editor that helps here, and leverage the auto-complete/suggestion features of said editor. For example, Visual Studio Code (and other editors) will tab-complete the entire element for you: see below.
If you are using HTML5:
Any HTML5 validator (†) should be able to report this, because styelsheet is not a valid link type.
You may only use link types that are
defined in the HTML5 specification, or
registered in the Microformats wiki existing-rel-values page.
† There are various validators available for Firefox, and I can’t recommend one (and it would be off-topic here anyway), but our sister site Software Recommendations might help.
If you want to find misspellings within links specifically, this will catch most:
var links= document.querySelectorAll('link');
for(var i = 0 ; i < links.length ; i++) {
if(!links[i].rel || links[i].rel !== 'stylesheet' || !links[i].href) {
console.log('ERROR: '+links[i].outerHTML);
}
}
Fiddle
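A variation that reports only unrecognised rel values, rather than every link that is not a stylesheet, might look like this. The list of known tokens is an assumption and deliberately incomplete; extend it from the HTML specification and the Microformats registry as needed.

```javascript
// Assumed, incomplete list of link types. Not authoritative.
const KNOWN_REL_TOKENS = [
  'alternate', 'author', 'canonical', 'help', 'icon', 'license',
  'manifest', 'next', 'prev', 'preload', 'search', 'stylesheet'
];

// Returns the rel tokens that are not in the known list.
function findUnknownRelTokens(relValue, known = KNOWN_REL_TOKENS) {
  return String(relValue)
    .toLowerCase()
    .split(/\s+/)
    .filter(function (token) {
      return token !== '' && !known.includes(token);
    });
}

// In a browser console this could be applied to every link element:
// document.querySelectorAll('link').forEach(function (link) {
//   var unknown = findUnknownRelTokens(link.rel);
//   if (unknown.length) console.log('Unknown rel:', unknown, link.outerHTML);
// });
```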
Firefox web developer tools will show that your stylesheet was not loaded, sometimes as an error if it can detect that. Right click on your page and select "Inspect Element" or ctrl-Shift-Q. You can look in the network panel to see what was downloaded. Perhaps other areas there will also highlight your link as a problem.
So, yes, Firefox, and other browsers, will flag that line as a problem in some way.

Chrome parses the url incorrectly

This is how I am adding image tag under div
<img src='/files/images/remove.gif' border='0' " +
" onClick='function(\"" + url + "\");' />";
When I open my page in Firefox, everything works fine, and this is the HTML that gets generated in FF:
<div class="myclass"><img border="0" onclick="myfunction("http://127.0.0.1:8080/abc/attachments/1d28bc6b-f637-426f-8bca-e27f1c6f2ed9/debug.txt");" src="/files/images/remove.gif">
</div>
But in Chrome HTML gets generated like this
<div class="myclass"><img src="/files/images/remove.gif" border="0" onclick="myfunction(" http:="" 127.0.0.1:8080="" abc="" attachments="" 1d28bc6b-f637-426f-8bca-e27f1c6f2ed9="" debug.txt");"="">
</div>
Notice that extra quotes and = signs are added to the URL.
Can anyone please help me to understand what's wrong here, any workaround/solution for this problem?
Thanks.
Well, at first, it looks like a bug in your script that emits the contents of the onclick. But then, it'd show up also in other browsers..
However, let's try from this view and assume it is a bug in your code, not Chrome.
Is the onclick generated by JS code, or on the server? If on JS, then please show us the code that generated and set that onclick handler, just to be sure it is correct.
OTOH, if it is generated on the server (ie. PHP, ASP, Ruby, ..), then have you tried peeking what exactly is sent over the wire? Open Chrome's DeveloperTools, go to "network" and refresh the page. Then watch the 'responses' and check if the HTML code embedded in them is correct. If it is incorrect, then look for a bug in the serverside scripts. If HTML seen in responses is correct, then indeed it is the browser.
If it is the latter case: the fact that the HTML Inspector shows you the tag/link as damaged does not mean it has to malfunction. The Inspector is only software too, so it may have some bugs in parsing/displaying, while the HTML tag may be correct and work properly. I've seen similar mis-displays a couple of times with various URLs before, but the links always worked properly. Try opening the Chrome console and using jQuery to pull the URL out of that onclick. You might get the correct result; in that case it's just a presentation bug and the click/link should work properly, so ignore it or report it to Google as a bug. Were the obtained URL incorrect, I'd double- or triple-check the HTML in Network/Response to ensure that it really is properly formatted, and then I'd call it a bug and try to work around it, e.g. by setting it via JS in onload, etc.
You are using some server infrastructure in between that is not pure HTML or JS, so we can't help properly. You have onClick (looks like .net), +" " + in the middle, which is not pure HTML.
I created a replication here: http://jsbin.com/ubacak/latest Note, I changed your code and just had it as pure js + html and all is fine.
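If the markup really is assembled by string concatenation, one possible workaround is to escape the handler text before putting it inside the attribute. This is only a sketch under that assumption: escapeAttr and buildRemoveImg are made-up names, and myfunction is taken from the generated HTML in the question.

```javascript
// Escape a string so it can sit inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Build the img tag; JSON.stringify turns the URL into a quoted JS string
// literal, and escapeAttr then protects the quotes inside the attribute.
function buildRemoveImg(url) {
  const handler = 'myfunction(' + JSON.stringify(url) + ');';
  return '<img src="/files/images/remove.gif" border="0" onclick="' +
    escapeAttr(handler) + '" />';
}
```

The browser decodes &quot; back to a double quote when it reads the attribute, so the handler still receives the URL as a normal string.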

Nice looking xhtml/html when I "View Source"?

I'm just curious if anyone has any tricks on how to keep source code looking good when you "View Source." I'm militant about keeping my code well formatted and spaced while I'm developing and I tend to "View Source" a lot to double check the output (when firebug is overkill). When I start using RenderPartials and RenderActions and anything in the tag it gets pretty messy.
I don't want to send too many extra characters to the browser to keep file size efficient but is there a way to force the xhtml/html to do a newline or tab? I tried a couple of things that didn't work. Thanks!
Get over it.
Don't worry about how it looks in 'view source'; worry about how it looks in csharp :) If you get worried about the efficiency of the HTML you can gzip it, and other such things.
I use firefox's ViewSourceWith extension to view the source in a code editor (in my case SciTe) in which I have a macro programmed so that when I press Ctrl-1 it reformats the HTML using a script I've written.
If validation is the goal then consider using a HTML validator rather than your eyeballs. Total Validator looks good.
Just send a \n and it should come out as a newline in the "view-source" section of the browser.
Example:
public static String Etc(...)
{
TagBuilder myTag = new TagBuilder("span");
myTag.SetInnerText("I'm mr. tag-content!");
return myTag.ToString(TagRenderMode.Normal) + Environment.NewLine;
}

HTML link to a certain point in a webpage - slight twist

Here's the use case: A user clicks on a link that opens a window displaying the contents of a text log. Easy enough. But is there a way of using POST, to open that text log to a certain location (i.e., search for a particular timestamp given in the post, and show the file at that specific location)?
(Assume I can't put html tags inside the text log -- it's a raw file).
Template of log:
+++ 2009/06/19 10:47:12.264 ACTION +++
+++ 2009/06/19 10:49:12.111 ACTION +++
So I want the page to load a specific timestamp.
Thanks,
Michael
Why can't you just have a PHP, Perl, or similar script that processes the log file on the spot, sticks in HTML anchors, and calls it a day?
Doing on-the-spot processing would also allow you to display a trimmed-down copy of the log that's only relevant to the timespan around the event in question.
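The answer suggests PHP or Perl; the same idea is sketched here in JavaScript instead (for example in a Node server). It assumes every timestamp line matches the template in the question, and the anchor id format is an invention for illustration.

```javascript
// Escape the characters that matter inside HTML text.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Wrap the raw log in <pre> (so newlines survive) and put an anchor in
// front of every line that starts with "+++ YYYY/MM/DD HH:MM:SS.mmm ".
function logToHtml(logText) {
  const stamp = /^\+\+\+ (\d{4})\/(\d{2})\/(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d{3}) /;
  const lines = logText.split(/\r?\n/).map(function (line) {
    const match = stamp.exec(line);
    const safe = escapeHtml(line);
    if (!match) return safe;
    // Hypothetical id scheme: "t" followed by the digits of the timestamp.
    const id = 't' + match.slice(1).join('');
    return '<a id="' + id + '"></a>' + safe;
  });
  return '<pre>' + lines.join('\n') + '</pre>';
}
```

A link ending in #t20090619104712264 would then jump to the 10:47:12.264 entry.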
Since you can't modify the file, the only way would be to wrap it in a <frame> or an <iframe> and drive the searching and scrolling from JavaScript in the neighbouring/containing page.
Here's an example, which you can try out online at http://entrian.com/so-container.html
<html><head><script>
function go() {
// "line" is the <input> for which line to jump to; see the HTML.
var line = document.getElementById('line').value;
if (document.body.createTextRange) { // This is IE
var range = frames['log'].document.body.createTextRange();
if (range.findText(line)) {
range.select(); // Scroll the match into view and highlight it.
}
} else { // Non-IE. Tested in Firefox; YMMV on other browsers.
frames['log'].find(line); // Scroll the match into view and highlight it.
}
}
</script></head><body>
<input type='text' size='5' name='line' id='line' value='10'>
<button onclick='go()'>Go</button><br>
<iframe name='log' width='100' height='50' src='so-data.txt'>
<!-- so-data.txt contains the numbers 01-20 on separate lines -->
</iframe>
</body></html>
I've tested that in IE7 and FF3; I'd be surprised if it worked elsewhere without edits, but you never know your luck!
Obviously in your case you'd be driving the scrolling programmatically rather than via an <input> box, but you can see how it would work for you.
If you could put some tags around the file's text, then you could probably insert some javascript that would scroll the window after loading it.
Yes, but passing your parameters via a querystring would be a whole lot simpler.
To scroll to a certain position in the text file you will need to use JavaScript (overly complicated in my opinion) or add an HTML anchor tag.
If you were planning to post the raw text log in a window, you will also run into some difficulty as HTML will not recognize the newlines and run the log together into one blob.
have you tried
window.open ('log.txt');
window.scrollTo (0, window.scrollMaxY);
? From mozilla reference : https://developer.mozilla.org/en/DOM/window.scrollMaxY
Keep a 'living copy' of the log file that has been translated to HTML - every time the original file is modified (or simply every X seconds), check for and append the newest lines with HTML anchors applied to the HTML version.
A new feature was added to Chromium waaaaay back in 2020 that allows you to link to ANY location on any webpage.
At the time of this writing, it works for sure in Chrome and Opera but not yet in Firefox, Safari or Brave browser.
The trick is to add:
/#:~:text=
and follow the equal sign with the desired search text, replacing any spaces with %20. Example:
There is no ID near this location on the page
<div>IMPORTANT: Use Opera or Chrome to open above link</div>
For more information:
Linking to a specific part of a web page
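Building such a link by hand is error prone, so a small helper may be useful. This is a sketch; buildTextFragmentUrl is a made-up name. It percent-encodes the search text and also encodes dashes, which act as separators in the text fragment syntax.

```javascript
// Build a "scroll to text" link from a page URL and the text to find.
function buildTextFragmentUrl(pageUrl, text) {
  // Drop any existing fragment so there is only one "#" in the result.
  const base = String(pageUrl).split('#')[0];
  // encodeURIComponent turns spaces into %20; dashes need manual encoding.
  const encoded = encodeURIComponent(text).replace(/-/g, '%2D');
  return base + '#:~:text=' + encoded;
}
```

Whether the browser actually scrolls to the text still depends on its support for the feature, as noted above.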

How do you validate a page with AJAX content with a W3C service or similar

I have a web page that is the parent to a bunch of pages that are loaded within it using the following code.
function loadContent(elementSelector, sourceURL) {
$(""+elementSelector+"").load("http://url.com/"+sourceURL+"");
}
To call this I would have a href like
href="javascript:loadContent('#content','page.php')"
How can you validate this using a service like the W3C markup validation service? Or, for that matter, grab the content of a page in your browser? When I view source, all I get is the parent, regardless of what information is on the screen.
Thx
This firefox plugin uses the same algorithms to validate and has a "Validate now (HTML body after JS execution)" option:
http://users.skynet.be/mgueury/mozilla/
You can use W3C's Markup Validator Web Service API.
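As a hedged sketch, here is how the generated HTML might be posted to a validator. Note that this targets the W3C's newer Nu HTML Checker endpoint rather than the legacy Markup Validator API named above, and it assumes an environment where fetch is available; the function names are illustrative.

```javascript
// Describe the HTTP request for the Nu HTML Checker (JSON output).
function buildNuValidatorRequest(html) {
  return {
    url: 'https://validator.w3.org/nu/?out=json',
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'text/html; charset=utf-8' },
      body: html
    }
  };
}

// Send the document and return the checker's list of messages.
async function validateHtml(html) {
  const request = buildNuValidatorRequest(html);
  const response = await fetch(request.url, request.options);
  const report = await response.json();
  return report.messages;
}
```

The html argument would be the generated source, for example the doctype plus document.documentElement.outerHTML.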
It seems like you can do this in Chrome Developer Tools in the Elements tab by right-clicking on the <html> tag and selecting Edit as HTML, which gives you copy-pasteable text of the current page state.
Keep in mind this doesn't include the DOCTYPE declaration, so that might need to be copied separately.
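The DOCTYPE can also be rebuilt from script instead of copied by hand. The sketch below takes any object with name, publicId and systemId properties, which is what document.doctype provides in current browsers; doctypeToString is an illustrative name.

```javascript
// Rebuild the DOCTYPE declaration from a doctype-like object.
// In a browser, pass document.doctype; null or undefined yields ''.
function doctypeToString(doctype) {
  if (!doctype) return '';
  return '<!DOCTYPE ' + doctype.name +
    (doctype.publicId ? ' PUBLIC "' + doctype.publicId + '"' : '') +
    (!doctype.publicId && doctype.systemId ? ' SYSTEM' : '') +
    (doctype.systemId ? ' "' + doctype.systemId + '"' : '') +
    '>';
}
```

In the console, doctypeToString(document.doctype) + document.documentElement.outerHTML then gives one string to paste into the validator.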