HTML link to a certain point in a webpage - slight twist - html

Here's the use case: A user clicks on a link that opens a window displaying the contents of a text log. Easy enough. But is there a way of using POST, to open that text log to a certain location (i.e., search for a particular timestamp given in the post, and show the file at that specific location)?
(Assume I can't put html tags inside the text log -- it's a raw file).
Template of log:
+++ 2009/06/19 10:47:12.264 ACTION +++
+++ 2009/06/19 10:49:12.111 ACTION +++
So I want the page to load a specific timestamp.
Thanks,
Michael

Why can't you just have a PHP or Perl or similar script that processes the log file on the spot, sticks in HTML anchors, and calls it a day?
Doing on-the-spot processing would also allow you to display a trimmed-down copy of the log that's only relevant to the timespan around the event in question.
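A minimal sketch of that on-the-spot processing, in JavaScript (Node) rather than PHP or Perl. The names logToHtml and timestampToId, the "t..." id scheme, and the assumption that every entry starts with "+++ YYYY/MM/DD HH:MM:SS.mmm" are illustrative and come from the log template in the question, not from any real tool.

```javascript
// Hypothetical helper: lines that start with a "+++ YYYY/MM/DD HH:MM:SS.mmm"
// timestamp are wrapped in a <span> with an id that a URL fragment can target.
const TIMESTAMP = /^\+\+\+ (\d{4})\/(\d{2})\/(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d{3})/;

// Escape the characters that would otherwise be read as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// "2009/06/19 10:47:12.264" becomes "t20090619-104712264" (assumed id scheme).
function timestampToId(match) {
  return 't' + match.slice(1, 4).join('') + '-' + match.slice(4, 8).join('');
}

function logToHtml(logText) {
  const lines = logText.split(/\r?\n/).map((line) => {
    const match = TIMESTAMP.exec(line);
    const safe = escapeHtml(line);
    return match ? `<span id="${timestampToId(match)}">${safe}</span>` : safe;
  });
  // <pre> keeps the original line breaks when the browser renders the page.
  return '<pre>\n' + lines.join('\n') + '\n</pre>';
}
```

With output like this, a link ending in #t20090619-104712264 would open the log at that entry. Two entries with the same timestamp would get the same id, so a real script would need to handle that case.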

Since you can't modify the file, the only way would be to wrap it in a <frame> or an <iframe> and drive the searching and scrolling from JavaScript in the neighbouring/containing page.
Here's an example, which you can try out online at http://entrian.com/so-container.html
<html><head><script>
function go() {
  // "line" is the <input> for which line to jump to; see the HTML.
  var line = document.getElementById('line').value;
  if (document.body.createTextRange) { // This is IE
    var range = frames['log'].document.body.createTextRange();
    if (range.findText(line)) {
      range.select(); // Scroll the match into view and highlight it.
    }
  } else { // Non-IE. Tested in Firefox; YMMV on other browsers.
    frames['log'].find(line); // Scroll the match into view and highlight it.
  }
}
</script></head><body>
<input type='text' size='5' name='line' id='line' value='10'>
<button onclick='go()'>Go</button><br>
<!-- so-data.txt contains the numbers 01-20 on separate lines -->
<iframe name='log' width='100' height='50' src='so-data.txt'></iframe>
</body></html>
I've tested that in IE7 and FF3; I'd be surprised if it worked elsewhere without edits, but you never know your luck!
Obviously in your case you'd be driving the scrolling programmatically rather than via an <input> box, but you can see how it would work for you.

If you could put some tags around the file's text, then you could probably insert some javascript that would scroll the window after loading it.

Yes, but passing your parameters via a querystring would be a whole lot simpler.
To scroll to a certain position in the text file you will need to use JavaScript (overly complicated in my opinion) or add an HTML anchor tag.
If you were planning to post the raw text log in a window, you will also run into some difficulty as HTML will not recognize the newlines and run the log together into one blob.
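A rough sketch of the query-string approach, assuming the timestamp arrives in a parameter called ts (that name, and both function names, are made up for illustration). It only works out which line of the log holds the timestamp; how you then scroll there depends on how the log is displayed.

```javascript
// Hypothetical: read the "ts" query parameter from a page URL.
function timestampFromUrl(pageUrl) {
  return new URL(pageUrl).searchParams.get('ts'); // null when the parameter is absent
}

// Zero-based index of the first log line containing the timestamp, or -1.
function findLineIndex(logText, timestamp) {
  if (!timestamp) return -1;
  return logText.split(/\r?\n/).findIndex((line) => line.includes(timestamp));
}
```

In a page that shows the log in a <pre> with a fixed line height, the returned index times the line height would give a scroll offset.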

Have you tried
var w = window.open('log.txt');
w.onload = function () { w.scrollTo(0, w.scrollMaxY); };
? Note that scrollMaxY is a non-standard, Firefox-only property. From the Mozilla reference: https://developer.mozilla.org/en/DOM/window.scrollMaxY

Keep a 'living copy' of the log file that has been translated to HTML - every time the original file is modified (or simply every X seconds), check for and append the newest lines with HTML anchors applied to the HTML version.

A new feature was added to Chromium waaaaay back in 2020 that allows you to link to ANY location on any webpage.
At the time of this writing, it works for sure in Chrome and Opera but not yet in Firefox, Safari or Brave browser.
The trick is to add:
/#:~:text=
and follow the equal sign with the desired search text, replacing any spaces with %20. Example:
There is no ID near this location on the page
IMPORTANT: Use Opera or Chrome to open the above link.
For more information:
Linking to a specific part of a web page

Related

Link to a specific spot on a page I can't edit [duplicate]

How do I create a link to a part of long webpage on another website that I don't control?
I thought you could use a variant of the #partofpage at the end of my link. Any suggestions?
Just append a # followed by the ID of the <a> tag (or other HTML tag, like a <section>) that you're trying to get to. For example, if you are trying to link to the header in this HTML:
<p>This is some content.</p>
<h2><a id="target">Some Header</a></h2>
<p>This is some more content.</p>
You could use the link <a href="http://url.to.site.example/index.html#target">Link</a>.
Create a "jump link" using the following format:
http://www.example.com/somepage#anchor
Where anchor is the id of the element you wish to link to on that page. Use browser development tools / view source to find the id of the element you wish to link to.
If the element doesn't have an id and you don't control that site then you can't do it.
That is only possible if that site has declared anchors in the page.
It is done by giving a tag a name or id attribute, so look for any of those close to where you want to link to.
And then the syntax would be
<a href="http://www.example.com/somepage#anchor">text</a>
In case the target page is on the same domain (i.e. shares the same origin with your page) and you don't mind creation of new tabs (1), you can (ab)use some JavaScript:
see tenth paragraph on another page
Trivia:
var w = window.open('some URL of the same origin');
w.onload = function(){
  // do whatever you want with `this.document`, like
  this.document.querySelector('footer').scrollIntoView()
}
Working example of such 'exploit' you can try right now could be:
javascript:(function(url,sel,w,el){w=window.open(url);w.addEventListener('load',function(){w.setTimeout(function(){el=w.document.querySelector(sel);el.scrollIntoView();el.style.backgroundColor='red'},1000)})})('https://stackoverflow.com/questions/45014240/link-to-a-specific-spot-on-a-page-i-cant-edit','footer')
If you enter this into location bar (mind that Chrome removes javascript: prefix when pasted from clipboard) or make it a href value of any link on this page (using Developer Tools) and click it, you will get another (duplicate) SO question page scrolled to the footer and footer painted red. (Delay added as a workaround for ajax-loaded content pushing footer down after load.)
Notes
Tested in current Chrome and Firefox, generally should work since it is based on defined standard behaviour.
Cannot be illustrated in interactive snippet here at SO, because they are isolated from the page origin-wise.
MDN: Window.open()
(1) window.open(url,'_self') seems to be breaking the load event; basically makes the window.open behave like a normal a href="" click navigation; haven't researched more yet.
The upcoming Chrome "Scroll to text" feature is exactly what you are looking for....
https://github.com/bokand/ScrollToTextFragment
You basically add #targetText= at the end of the URL and the browser will scroll to the target text and highlight it after the page is loaded.
It is in the version of Chrome that is running on my desk, but currently it must be manually enabled. Presumably it will soon be enabled by default in the production Chrome builds and other browsers will follow, so OK to start adding to your links now and it will start working then.
Edit: It's been implemented in Chrome. See https://chromestatus.com/feature/4733392803332096
You can NOW...
As of Chrome release 80 (Feb 2020), there is a new feature called Text Fragments. It allows you to provide a link that opens at the precise text specified (with that text highlighted).
At the moment, it works in Edge, Chrome and Opera but not in Firefox, Safari or Brave. (See note 6 at bottom for more)
For security reasons, the feature requires links to be opened in a noopener context. Therefore, make sure to include rel="noopener" in your anchor markup or add noopener to your Window.open() list of window functionality features.
You create the link to your desired text by appending this string to the end of the URL:
/#:~:text=
and providing the percent-encoded search string thus:
/#:~:text=String%20to%20focus%20on
Here is a working example:
https://newz.icu/#:~:text=Google%20surveillance%20increases
Notes:
1. Test the above link in Chrome or Opera only.
2. In the above example, note that the text string is in a div that is normally hidden on page load - so in this example it is being displayed despite what would normally happen. Useful.
3. Recent versions of Chrome also include a new option when you Right-Click on selected text: Copy link to highlight. This will auto-create the direct-to-text link for you (i.e. it automatically appends the /#:~:text= to the text you highlighted) and place it in the clipboard - just paste it where desired.
4. Suppose you want to highlight an entire block of text? The Text Fragments feature allows specifying a starting%20phrase and an ending%20phrase (separated by a comma), and it will highlight all text in between:
   https://newz.icu/#:~:text=Dr.%20Mullis,before%20now
   Note the comma between Mullis and before.
5. web.dev article about Text Fragments
6. CanIUse status of Text Fragments
PS - Please forgive choice of example website. It simply had the desired elements required for the demonstration. Hoping we can focus on function rather than content.
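If you want to build such links in code rather than by hand, something like the following could work. It is only a sketch: textFragmentUrl and encodeFragmentText are made-up names, and it covers just the plain text=start and text=start,end forms shown above.

```javascript
// encodeURIComponent leaves "-" untouched, but "-" has a special meaning
// inside a text directive, so it is percent-encoded here as well.
function encodeFragmentText(text) {
  return encodeURIComponent(text).replace(/-/g, '%2D');
}

// Hypothetical helper: append a text directive to a page URL.
function textFragmentUrl(pageUrl, startText, endText) {
  const base = pageUrl.split('#')[0]; // drop any fragment already on the URL
  let directive = 'text=' + encodeFragmentText(startText);
  if (endText) directive += ',' + encodeFragmentText(endText);
  return base + '#:~:' + directive;
}
```

For instance, textFragmentUrl('https://newz.icu/', 'Dr. Mullis', 'before now') rebuilds the range link from the notes above. Commas and ampersands inside the quoted text are escaped by encodeURIComponent, so they cannot be mistaken for the separators.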
First off, target refers to the BlockID, found in either the HTML code or Chrome's developer tools, that you are trying to link to. Each code is different and you will need to do some digging to find the ID you are trying to reference. It should look something like <div class="page-container drawer-page-content" id="PageContainer">. Note that this is the format for the whole referenced section, not an individual text or image. To do that you would need to find the same piece of code but relating to your target block, for example <div id="your-block-id">.
Anyway, I was just reading over this thread and an idea came to mind: if you are a Shopify user and want to do this, it is pretty much the same thing as stated.
But instead of
> http://url.to.site.example/index.html#target
You would put
> http://example.com/target
For example, I am setting up a disclaimer page with links leading to a newsletter signup and shopping blocks on my home page so I insert https://mystore-classifier.com/#shopify-section-1528945200235 for my hyperlink.
Please note that the -classifier is for my internal use and doesn't apply to you. This is just so I can keep track of my stores.
If you want to link to something other than your homepage you would put
> http://mystore-classifier.example/pagename/#BlockID
I hope someone found this useful. If there is something wrong with my explanation please let me know, as I am not an HTML programmer; my language is C#!
It's now possible to create an "anchor" link that goes to a specific part of any webpage in most browsers in a few different ways.
All of them will create a link with an #anchor at the end, where "anchor" is the thing that you want to navigate to. The browser will interpret the part of the URL after the # to scroll to a specific part of the page.
Here are 3 ways to create a URL like this:
1. Using an existing anchor. Perhaps there will be one in the URL as you scroll down the page. If not, look around the page for a header that has a little link icon to the left of it and click it to update the browser's navigation URL.
2. Using any HTML element's id property, or the name or id on an <a> ("anchor") element. The other answers explain this quite well. You will have to open the developer console and inspect the part of the page to find an id (and you may not find one). It's a little different on each browser, but here's how to inspect an element in Chrome.
3. Using a text snippet to highlight part of the page.
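For the second option, hunting for ids by hand in the inspector can be tedious. A small sketch like this, pasted into the developer console, lists every fragment the page already offers. listAnchorTargets is a made-up name; it takes the document as a parameter only so it can be tried outside a browser too.

```javascript
// Hypothetical helper: collect "#fragment" strings for every element that has
// an id, plus every <a> that has a name attribute.
function listAnchorTargets(doc) {
  const targets = [];
  for (const el of doc.querySelectorAll('[id], a[name]')) {
    const fragment = el.id || el.getAttribute('name');
    if (fragment) targets.push('#' + encodeURIComponent(fragment));
  }
  return targets;
}
```

In a console you would call listAnchorTargets(document) and append one of the results to the page URL.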
Basically, any HTML tag can have id="abc", as shown below:
<div id="abc">test</div>
<p id="abc">test</p>
<span id="abc">test</span>
<a id="abc">test</a>
The <a> tag can also have name="abc", as shown below:
<a name="abc">test</a>
You can then append "#" and the id or name value "abc" to a URL, as shown below, to go to that specific part of the page:
https://www.example.com/#abc
https://www.example.com/index.html#abc
You can put the URLs above in an <a> tag to create links to id="abc" and name="abc", as shown below:
<a href="https://www.example.com/#abc">test</a>
<a href="https://www.example.com/index.html#abc">test</a>
And if you want to go to a specific part of the same page, you only need to put "#" and the id or name value "abc" in the <a> tag, as shown below:
<!-- Go to the specific part of the same page -->
<a href="#abc">test</a>
<div id="abc">test</div>
<!-- Go to the specific part of the same page -->
<a href="#abc">test</a>
<a name="abc">test</a>

Does clicking a link always cause a full-page re-rendering?

Here's a scenario: Page1.html and page2.html are identical, except a change of a single word.
page1.html
<html><head></head><body>
Lorem ipsum *one*
...Stuff here...
<a href="page2.html">click me</a>
...Stuff here...
</body></html>
page2.html
<html><head></head><body>
Lorem ipsum *two*
...Stuff here...
<a href="page2.html">click me</a>
...Stuff here...
</body></html>
As you can see, everything is identical in both pages except that in page2.html, *one* is swapped with *two*
Now, assume a visitor visits page1.html, and clicks on the a href link, will the browser re-render the entire DOM or simply detect the changed word and modify it?
This is obviously implementation specific, my interest is an answer for the major browsers (Chrome, Firefox, Opera, Safari, IE...)
Yes, it will, unless you change the link click to JavaScript that loads the inner HTML dynamically. This is called Ajax.
It will surely load the entire page. A href requests basically reload the complete page. You can still load part of the page using Asynchronous JavaScript and XML (AJAX) :)
It will attempt to reload the page.
If you want to stop the default action of reloading the page you could use event.preventDefault();.
More info here: http://api.jquery.com/event.preventdefault/
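Roughly, the idea looks like this without jQuery. It is a sketch with made-up names: makeInPlaceHandler is not a library function, and target can be any element (here, anything with a textContent property).

```javascript
// Returns a click handler that cancels the navigation and updates one
// element in place instead of loading a new page.
function makeInPlaceHandler(target, newText) {
  return function onClick(event) {
    event.preventDefault(); // the browser does not follow the link
    target.textContent = newText;
  };
}
```

In a page you might wire it up with link.addEventListener('click', makeInPlaceHandler(wordElement, '*two*')), where link and wordElement are whatever elements you looked up yourself. Only that one element changes; nothing else is reloaded.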
@Rodrigogq pretty much sums it up.
I would recommend using jQuery for replacing text.
$('body').text(function () {
return $(this).text().replace("one", "two");
});
This wouldn't use ajax but simply replace.
Can be modified to a button press.
If the URL changes, the browser loads the data from the new address and has to render it again, regardless of the browser's implementation. The difference between the HTML documents is not visible until the new document has been loaded and parsed into a DOM, which means that it gets rendered rather than compared for differences (which would be a non-trivial task).
If you want to change only the content of particular element you need to use Javascript and set either static value or load it via AJAX.
To answer:
"As long as the bytes in the html page being downloaded match the bytes in the html page the user is currently seeing, do not re-render that part"
Here is what it looks like in most browsers:
Step 1: Click on a link
Step 2: Get head response from the destination server (good if there is content-length included)
Step 3: If response is render-able (usually 200), Start unloading the already loaded DOM.
Step 4: Load the new content and render.
This prohibits any byte level comparison with the current and new request to identify the difference.


Is there a way to bookmark or link to a section of a page without an anchor?

Is there a way to bookmark or link to an HTML page (which I am not author of) without having an anchor in the HTML code?
I want the page to get scrolled down to a particular section when accessed from a bookmark or hyperlink even if there is no anchor tag in the destination page.
Note: if the destination page has an anchor tag named "foo", then a bookmark like http:/...hello.html#foo will not only take the user to hello.html, but also automatically scroll down so that the anchor tag "foo" is at the top of the screen.
It's the year 2020, there is a draft by WICG for Text Fragments, and now you can link to text on a page as if you were searching for it by adding the following to the hash
#:~:text=<Text To Link to>
Working example on Chrome Version 81.0.4044.138:
Click on this link. It should take you to another answer page and highlight the link there.
You only need to have the appropriate id attribute on an element to use it like a bookmark...
<a href="#test">Test</a>
...
<p id="test">Hello world</p>
See the W3C specification: Anchors with the id attribute
Older specifications also allowed navigation based on the name attribute, but this attribute has been removed from the latest HTML specifications (but if there is a name attribute it may be used in the same way as an id attribute).
If there is no id or name attribute where you wish to navigate to, there is no way of navigating to the specific point within the page, only to the page itself. In this case you may want to quote the pertinent information and supply a citation with a link or perhaps ask the author if they would add an id.
This is a copy of @AbderrahmaneTAHRIJOUTI's answer but updated with some extra info.
It's the year 2020, and there is now a draft by WICG for Text Fragments, so you can link to text on a page as if you were searching for it by adding the percent-encoded quote to the URL like this:
#:~:text=<percent-encoded-text-quoted-from-site>
For example, this link highlights the syntax from the spec.
One can also highlight multiple sections as well by chaining query parameters with ampersand (&):
#:~:text=<quote-1>&text=<quote-2>
For example, see these highlights to the spec.
Even ranges can be set in case of a longer quote (at least in Chrome):
#:~:text=<begin-text>,<end-text>
For example, highlighting an entire paragraph in the spec.
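To make the three forms above concrete, here is a rough sketch that takes such a URL apart again. parseTextDirectives is a made-up name, and the sketch deliberately ignores the optional prefix and suffix terms that the draft also defines, so treat it as an illustration of the syntax rather than a complete parser.

```javascript
// Hypothetical helper: list the text directives found in a URL string.
function parseTextDirectives(url) {
  const marker = ':~:';
  const hash = new URL(url).hash; // includes the leading "#"
  const at = hash.indexOf(marker);
  if (at === -1) return [];
  return hash
    .slice(at + marker.length)
    .split('&')                              // several directives are chained with "&"
    .filter((directive) => directive.startsWith('text='))
    .map((directive) => {
      // "start" alone, or "start,end" for a range.
      const [start, end] = directive.slice(5).split(',').map((part) => decodeURIComponent(part));
      return end === undefined ? { start } : { start, end };
    });
}
```

Because the quoted text is percent-encoded, a literal comma or ampersand inside a quote shows up as %2C or %26 and is not confused with the separators.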
For some reason, in Chrome 89.0.4389.90 the above links may only work if one (1) clicks on them, (2) goes to the address bar by clicking in it or by F6, and (3) hits Enter. Not sure why this is when Google search constantly offers links like this in the results which work out of the box (e.g., a link to Azure Vault)
Support
It's still spotty, but most major browsers support it (except for Firefox...). To check the current status of adoption, check out https://caniuse.com/?search=%3A~%3A
There is a relatively recent W3C Working Group Note on Selectors and States which would allow linking to selected text.
Here is a Firefox webextension partially implementing the link format (allowing you to "create" a link, based on the selection, as well as, obviously, open such a link, highlighting the correct selection):
https://addons.mozilla.org/en-US/firefox/addon/precise-links/
As of 2019 it seems to work fine.
Its source code is available here.
The Firefox extension "Web Marker" does exactly what you want.
https://addons.mozilla.org/en-US/firefox/addon/web-marker/
You can find its source code and documentation here:
http://liveurls.mozdev.org/tech.html
If the page supports being embedded as an iframe, you can link to a document that embeds it and then autoscrolls the document. The issue is that we can't get the height of the page, so instead we just hijack the scrolling event to make the page taller once we approach the bottom:
data:text/html,<html><body style="margin:0; padding:0;"><iframe id='i' src='http://forecast.weather.gov/MapClick.php?CityName=Las+Vegas&state=NV&site=VEF&textField1=36.175&textField2=-115.136&e=0' width=100% frameborder=0 margin=0 scrolling=no style="height: calc(100vh + 170px + 200px);"></iframe></body><script>window.scrollTo(0, 170);window.onscroll = function(e) {if((window.innerHeight + window.scrollY) >= document.body.offsetHeight - 200) {document.getElementById('i').style.height = window.innerHeight + window.scrollY + 200;}};</script></html>
Modern browsers will try to scroll to an element with an ID that matches the hash part of the URL (i.e. if you have <h1 id="foo">, then #foo would get you there).
If everything else fails, you can use jQuery. Get the hash part from the document URL with window.location.hash. You can then interpret that in JavaScript to determine an element in the page.
Use scrollTop to move there (see Scroll to an element with jQuery).
See also: https://api.jquery.com/scrolltop/
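The same idea can be sketched without jQuery, using the standard scrollIntoView method instead of scrollTop. scrollToHash is a made-up name, and this version only handles the simplest interpretation of the hash, namely treating it as an element id.

```javascript
// Hypothetical helper: scroll to the element whose id matches the URL hash.
// Returns true if an element was found and scrolled to.
function scrollToHash(win) {
  const hash = win.location.hash; // e.g. "#foo", or "" when there is none
  if (hash.length < 2) return false;
  const id = decodeURIComponent(hash.slice(1));
  const el = win.document.getElementById(id);
  if (!el) return false;
  el.scrollIntoView(); // aligns the element with the top of the viewport
  return true;
}
```

You would call scrollToHash(window) once the page has loaded. Any other interpretation of the hash (a line number, a search phrase) would replace the getElementById lookup.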
I must be missing something, but sadly your solution is not working for me... The attached document's jargon confuses me a bit, dilettante that I am. :-(
Though, it gave a nice clue... Hence, I found this link with a simpler way to do this (in my case, link to a specific part of a text in some other author's blog post without ID tags):
Share or link to quotes and text in Chrome
https://support.google.com/chrome/answer/10256233?hl=en-GB&co=GENIE.Platform%3DDesktop
To create a link that opens directly to highlighted text:
On your computer, open Chrome.
Go to a page with text that you want to share.
To highlight the text that you want to share, click and hold, then drag your mouse.
To open the context menu, right-click on the highlighted text.
Select Copy link to highlight.
If you can’t select this option, this feature may not work for the selected content.
Paste the link anywhere; for example, into an email or message thread.
Tip: To remove the highlight from the text in the linked content, right-click the highlighted text and select Remove highlight.
If you want to link to a specific part of a PDF file online, this solution also worked for me:
https://helpx.adobe.com/acrobat/kb/link-html-pdf-page-acrobat.html#:~:text=Open%20a%20PDF%20file%20to,end%20of%20the%20link's%20URL.
Just posting this in case someone is still lost as I was.
Cheers!
The AnchorMe addon from Firefox just solved this for me. Ctrl + double click on your desired destination and voilà.

Best Way to View Generated Source of Webpage?

I'm looking for a tool that will give me the proper generated source including DOM changes made by AJAX requests for input into W3's validator. I've tried the following methods:
Web Developer Toolbar - Generates invalid source according to the doc-type (e.g. it removes the self closing portion of tags). Loses the doctype portion of the page.
Firebug - Fixes potential flaws in the source (e.g. unclosed tags). Also loses doctype portion of tags and injects the console which itself is invalid HTML.
IE Developer Toolbar - Generates invalid source according to the doc-type (e.g. it makes all tags uppercase, against XHTML spec).
Highlight + View Selection Source - Frequently difficult to get the entire page, also excludes doc-type.
Is there any program or add-on out there that will give me the exact current version of the source, without fixing or changing it in some way? So far, Firebug seems the best, but I worry it may fix some of my mistakes.
Solution
It turns out there is no exact solution to what I wanted as Justin explained. The best solution seems to be to validate the source inside of Firebug's console, even though it will contain some errors caused by Firebug. I'd also like to thank Forgotten Semicolon for explaining why "View Generated Source" doesn't match the actual source. If I could mark 2 best answers, I would.
Justin is dead on. The key point here is that HTML is just a language for describing a document. Once the browser reads it, it's gone. Open tags, close tags, and formatting are all taken care of by the parser and then go away. Any tool that shows you HTML is generating it based on the contents of the document, so it will always be valid.
I had to explain this to another web developer once, and it took a little while for him to accept it.
You can try it for yourself in any JavaScript console:
el = document.createElement('div');
el.innerHTML = "<p>Some text<P>More text";
el.innerHTML; // <p>Some text</p><p>More text</p>
The un-closed tags and uppercase tag names are gone, because that HTML was parsed and discarded after the second line.
The right way to modify the document from JavaScript is with document methods (createElement, appendChild, setAttribute, etc.) and you'll observe that there's no reference to tags or HTML syntax in any of those functions. If you're using document.write, innerHTML, or other HTML-speaking calls to modify your pages, the only way to validate it is to catch what you're putting into them and validate that HTML separately.
That said, the simplest way to get at the HTML representation of the document is this:
document.documentElement.innerHTML
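To illustrate the earlier point about catching what you feed into innerHTML, here is a minimal sketch. The helper name setHtmlAndRecord and the recorded array are made up for this example; the idea is only to keep a copy of each HTML string before the parser consumes it, so that those strings can be validated separately.

```javascript
// Hypothetical helper (not part of any library): keeps a copy of every HTML
// string before handing it to the parser via innerHTML.
var recorded = [];

function setHtmlAndRecord(el, html) {
  recorded.push(html); // the exact text you wrote, unclosed tags and all
  el.innerHTML = html; // the browser parses it; the DOM does not keep the original text
  return el;
}

// Usage in a browser console:
//   setHtmlAndRecord(document.createElement('div'), "<p>Some text<P>More text");
//   recorded[0]; // the original string, ready to paste into a validator
```

Reading el.innerHTML back afterwards gives the re-serialized markup, not the string you passed in, which is why the copy has to be taken before the assignment.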
[updating in response to more details in the edited question]
The problem you're running into is that, once a page is modified by ajax requests, the current HTML exists only inside the browser's DOM. There's no longer any independent source HTML that you can validate other than what you can pull out of the DOM.
As you've observed, IE's DOM stores tags in upper case, fixes up unclosed tags, and makes lots of other alterations to the HTML it got originally. This is because browsers are generally very good at taking HTML with problems (e.g. unclosed tags) and fixing up those problems to display something useful to the user. Once the HTML has been canonicalized by IE, the original source HTML is essentially lost from the DOM's perspective, as far as I know.
Firefox most likely makes fewer of these changes, so Firebug is probably your better bet.
A final (and more labor-intensive) option may work for pages with simple ajax alterations, e.g. fetching some HTML from the server and importing it into the page inside a particular element. In that case, you can use Fiddler or a similar tool to manually stitch together the original HTML with the ajax HTML. This is probably more trouble than it's worth, and is error-prone, but it's one more possibility.
[Original response here to the original question]
Fiddler (http://www.fiddlertool.com/) is a free, browser-independent tool which works very well to fetch the exact HTML received by a browser. It shows you exact bytes on the wire as well as decoded/unzipped/etc content which you can feed into any HTML analysis tool. It also shows headers, timings, HTTP status, and lots of other good stuff.
You can also use Fiddler to copy and rebuild requests if you want to test how a server responds to slightly different headers.
Fiddler works as a proxy server, sitting in between your browser and the website, and logs traffic going both ways.
I know this is an old post, but I just found this piece of gold. It is old (2006), but still works with IE9. I personally saved it as a bookmark.
Just copy paste this in your browser's address bar:
javascript:void(window.open("javascript:document.open(\"text/plain\");document.write(opener.document.body.parentNode.outerHTML)"))
As for Firefox, the Web Developer toolbar does the job. I usually use that, but sometimes some dirty 3rd party asp.net controls generate different markup based on the user agent...
EDIT
As Bryan pointed out in the comments, some browsers remove the javascript: part when you copy/paste into the URL bar. I just tested, and that's the case with IE10.
If you load the document in Chrome, the Developer|Elements view will show you the HTML as fiddled by your JS code. It's not directly HTML text and you have to open (unfold) any elements of interest, but you effectively get to inspect the generated HTML.
In the Web Developer Toolbar, have you tried the Tools -> Validate HTML or Tools -> Validate Local HTML options?
The Validate HTML option sends the url to the validator, which works well with publicly facing sites. The Validate Local HTML option sends the current page's HTML to the validator, which works well with pages behind a login, or those that aren't publicly accessible.
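If you would rather script that second option than click through the toolbar, something along these lines might work. This is only a sketch: it assumes the W3C Nu HTML Checker endpoint at https://validator.w3.org/nu/ (markup POSTed as the request body, with out=json for a JSON reply) and an available fetch implementation; the function names are invented for the example, and a page's own security policy may block the request.

```javascript
// Builds the request for the W3C Nu HTML Checker.
// Assumption: the public endpoint below; function names are illustrative only.
function buildValidatorRequest(html) {
  return {
    url: 'https://validator.w3.org/nu/?out=json',
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'text/html; charset=utf-8' },
      body: html
    }
  };
}

// Sends the markup and resolves with the checker's array of messages.
// fetchImpl is optional; it defaults to the global fetch when omitted.
function validateHtml(html, fetchImpl) {
  var doFetch = fetchImpl || fetch;
  var req = buildValidatorRequest(html);
  return doFetch(req.url, req.options)
    .then(function (res) { return res.json(); })
    .then(function (data) { return data.messages || []; });
}

// Usage in a browser console:
//   validateHtml(document.documentElement.outerHTML).then(function (messages) {
//     console.log(messages);
//   });
```

Note that outerHTML does not include the doctype, so the checker will likely report it as missing unless you prepend it yourself.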
You may also want to try View Source Chart (also available as a Firefox add-on). An interesting note there:
Q. Why does View Source Chart change my XHTML tags to HTML tags?
A. It doesn't. The browser is making these changes, VSC merely displays what the browser has done with your code. Most common: self closing tags lose their closing slash (/). See this article on Rendered Source for more information (archive.org).
Using the Firefox Web Developer Toolbar (https://addons.mozilla.org/en-US/firefox/addon/60)
Just go to View Source -> View Generated Source
I use it all the time for the exact same thing.
I had the same problem, and I found a solution here:
http://ubuntuincident.wordpress.com/2011/04/15/scraping-ajax-web-pages/
The solution is to use Crowbar, the tool from here:
http://simile.mit.edu/wiki/Crowbar (now (2015-12) 404s)
wayback machine link:
http://web.archive.org/web/20140421160451/http://simile.mit.edu/wiki/Crowbar
It gave me the faulty, invalid HTML.
This is an old question, and here's an old answer that worked flawlessly for me for many years, but doesn't any more, at least not as of January 2016:
The "Generated Source" bookmarklet from SquareFree does exactly what you want and, unlike the otherwise fine "old gold" from @Johnny5, displays the result as source code (rather than being rendered normally by the browser, at least in the case of Google Chrome on Mac):
https://www.squarefree.com/bookmarklets/webdevel.html#generated_source
Unfortunately, it now behaves just like the "old gold" from @Johnny5: the result no longer shows up as source code. Sorry.
In Firefox, just press Ctrl+A (select everything on the screen), then right-click and choose "View Selection Source". This captures any changes made by JavaScript to the DOM.
alert(document.documentElement.outerHTML);
Check out "View Rendered Source" chrome extension:
https://chrome.google.com/webstore/detail/view-rendered-source/ejgngohbdedoabanmclafpkoogegdpob/
Why not type this in the URL bar?
javascript:alert(document.body.innerHTML)
In the elements tab, right click the html node > copy > copy element - then paste into an editor.
As has been mentioned above, once the source has been converted into a DOM tree, the original source no longer exists in the browser. Any changes you make will be to the DOM, not the source.
However, you can parse the modified DOM back into HTML, letting you see the "generated source".
In Chrome, open the developer tools and click the elements tab.
Right click the HTML element.
Choose copy > copy element.
Paste into an editor.
You can now see the current DOM as an HTML page.
This is not the full DOM
Note that the DOM cannot be fully represented by an HTML document. This is because the DOM has many more properties than the HTML has attributes. However this will do a reasonable job.
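A small illustration of that gap, as a sketch: what a user types into a text field lives in the element's value property, while the serialized HTML only carries the value attribute. The function name below is invented for the example.

```javascript
// Illustrative helper: compares an <input>'s live state with what
// "copy element" / outerHTML would serialize for it.
function describeInputState(input) {
  var attribute = input.getAttribute('value'); // what ends up in the HTML text
  var property = input.value;                  // what the user currently sees
  return {
    attribute: attribute,
    property: property,
    lostInHtml: attribute !== property
  };
}

// In a browser console, on a page containing <input id="q" value="initial">:
//   type "hello" into the field, then run
//   describeInputState(document.getElementById('q'));
//   The attribute is still "initial" while the property is "hello",
//   and the copied HTML still says value="initial".
```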
I think the IE dev tools (F12) have this under View > Source > DOM (Page).
You would need to copy and paste the DOM and save it to send to the validator.
The only thing I found is the BetterSource extension for Safari, which shows you the manipulated source of the document. The only downside is that there is nothing remotely like it for Firefox.
The JavaScript snippet below will get you the complete generated source of the ajax-rendered HTML, doctype included. It is meant to be browser independent. Enjoy :)
function outerHTML(node) {
  // Use the native outerHTML where it exists (IE, Chrome). Older versions of
  // Firefox do not support element.outerHTML, so fall back to serializing a
  // clone of the node through a temporary <div>.
  return node.outerHTML || (function (n) {
    var div = document.createElement('div'), h;
    div.appendChild(n.cloneNode(true));
    h = div.innerHTML;
    div = null;
    return h;
  })(node);
}

var outerhtml = outerHTML(document.getElementsByTagName('html')[0]);
var node = document.doctype;
var doctypestring = "";
if (node) {
  // Rebuild the doctype declaration from the DocumentType node.
  doctypestring = "<!DOCTYPE "
    + node.name
    + (node.publicId ? ' PUBLIC "' + node.publicId + '"' : '')
    + (!node.publicId && node.systemId ? ' SYSTEM' : '')
    + (node.systemId ? ' "' + node.systemId + '"' : '')
    + '>';
} else {
  // IE8 and below return null for document.doctype;
  // there you can read the doctype like this instead.
  doctypestring = document.all[0].text;
}
// The complete generated source: doctype followed by the serialized <html> element.
var generatedSource = doctypestring + outerhtml;
I was able to solve a similar issue by logging the result of the ajax call to the console. This was the HTML returned, and I could easily see any issues it had.
In the .done() callback of my ajax call I added console.log(result) so I could see the HTML in the debugger console.
function GetReversals() {
    $("#getReversalsLoadingButton").removeClass("d-none");
    $("#getReversalsButton").addClass("d-none");
    $.ajax({
        url: '/Home/LookupReversals',
        data: $("#LookupReversals").serialize(),
        type: 'Post',
        cache: false
    }).done(function (result) {
        $('#reversalResults').html(result);
        console.log(result);
    }).fail(function (jqXHR, textStatus, errorThrown) {
        //alert("There was a problem getting results. Please try again. " + jqXHR.responseText + " | " + jqXHR.statusText);
        $("#reversalResults").html("<div class='text-danger'>" + jqXHR.responseText + "</div>");
    }).always(function () {
        $("#getReversalsLoadingButton").addClass("d-none");
        $("#getReversalsButton").removeClass("d-none");
    });
}