Dump HTML of page including iframes - html

I'd like to dump the HTML contents of a web page, including the HTML of the documents loaded inside its <iframe> elements. The Chrome Developer Tools "Elements" tab is capable of showing iframe contents embedded in this way.
When I say "dump the HTML contents" I'm interested in browser automation tools like Selenium or PhantomJS. Do any of these tools have this capacity built in?
For example, the HTML dump I'd like of this page should include the HTML source of this embedded page.

You can use PhantomJS to achieve this.
Here is a code snippet from the PhantomJS server code.
var system = require('system');
var url = system.args[1] || '';
if (url.length > 0) {
  var page = require('webpage').create();
  page.open(url, function (status) {
    if (status === 'success') {
      var delay;
      // Poll the page until it marks itself as ready, then print its HTML.
      var checker = function () {
        var html = page.evaluate(function () {
          var body = document.getElementsByTagName('body')[0];
          if (body.getAttribute('data-status') == 'ready') {
            return document.getElementsByTagName('html')[0].outerHTML;
          }
        });
        if (html) {
          clearInterval(delay); // the timer was created with setInterval
          console.log(html);
          phantom.exit();
        }
      };
      delay = setInterval(checker, 100);
    } else {
      // Exit instead of hanging when the page fails to load.
      phantom.exit(1);
    }
  });
} else {
  // Exit instead of hanging when no URL was given.
  phantom.exit(1);
}
If the page is yours, set the "data-status" attribute on its <body> to "ready" to let PhantomJS know when the page has finished loading. If the page is not yours, the other option is to wait for a generous fixed timeout.
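Note that outerHTML on the top-level document does not include the documents loaded inside <iframe> elements, which is what the question asks for. A minimal sketch of one way to collect them, assuming the frames are same-origin (for cross-origin frames the browser leaves contentDocument empty). The function name dumpWithFrames and the shape of the returned array are illustrative, not part of PhantomJS; it relies only on standard DOM properties, and to use it with the snippet above it would have to be defined inside the page.evaluate callback.

```javascript
// Collect the serialized HTML of a document and of every reachable
// (same-origin) iframe nested inside it, depth-first.
// Hypothetical helper: the name and the return shape are assumptions.
function dumpWithFrames(doc, url) {
  var dumps = [{ url: url, html: doc.documentElement.outerHTML }];
  var frames = doc.querySelectorAll('iframe');
  for (var i = 0; i < frames.length; i++) {
    var inner = null;
    try {
      // null (or an exception in some engines) for cross-origin frames
      inner = frames[i].contentDocument;
    } catch (e) {
      inner = null;
    }
    if (inner) {
      dumps = dumps.concat(dumpWithFrames(inner, frames[i].src));
    }
  }
  return dumps;
}
```

Inside the page this would be called as dumpWithFrames(document, location.href); each entry pairs a frame URL with that frame's HTML.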

Related

Insert a Separate txt file into a <p> tag without js ONLY html [duplicate]

I have an HTML paragraph (inside a div) in which I want to display a simple fixed text. The text is a bit long, so I'd rather keep it in a separate txt file.
Something like:
<div><p txt=file.txt></p></div>
Can I do something like that?
You can do something like that in pure html using an <object> tag:
<div><object data="file.txt"></object></div>
This method has some limitations, though: the block won't size itself to the content, so you have to specify width and height manually, and the page's styles won't be applied to the text.
You can use a simple HTML element, <embed src="file.txt">. It loads the external resource and displays it on the screen, no JS needed.
It can be done with HTML <embed> or <object> tags, JavaScript, or PHP/ASP/other back-end languages.
PHP (as an example of a server-side language) is the way I've always done it:
<div><p><?php include('myFile.txt'); ?></p></div>
To use this (if you're unfamiliar with PHP), you can:
1) Check whether you have PHP on your server.
2) Change the file extension of your .html file to .php.
3) Paste the code from my PHP example somewhere in the body of your newly renamed PHP file.
JavaScript will do the trick here.
function load() {
  var file = new XMLHttpRequest();
  file.open("GET", "http://remote.tld/random.txt", true);
  file.onreadystatechange = function () {
    if (file.readyState === 4) { // The request has completed
      if (file.status === 200) { // The file was found
        var text = file.responseText;
        document.getElementById("div1").innerHTML = text;
      }
    }
  };
  file.send(null); // Without this call the request is never sent
}
window.onload = load; // Assign the function itself; do not call it here
I would use JavaScript for this.
var txtFile = new XMLHttpRequest();
txtFile.open("GET", "http://my.remote.url/myremotefile.txt", true);
txtFile.onreadystatechange = function () {
  if (txtFile.readyState === 4 && txtFile.status == 200) {
    // Write the text into the page only once the response has arrived
    var allText = txtFile.responseText;
    document.getElementById('your div id').innerHTML = allText;
  }
};
txtFile.send(null);
This is just a code sample; it would need tweaking for all browsers, etc.
Here is some JavaScript code that I have tested successfully:
var txtFile = new XMLHttpRequest();
var allText = "file not found";
txtFile.onreadystatechange = function () {
  if (txtFile.readyState === XMLHttpRequest.DONE && txtFile.status == 200) {
    allText = txtFile.responseText;
    allText = allText.split("\n").join("<br>");
  }
  document.getElementById('txt').innerHTML = allText;
};
txtFile.open("GET", '/result/client.txt', true);
txtFile.send(null);
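One caveat with the snippets above: assigning raw file text to innerHTML means any "<" or "&" in the text file is parsed as markup. A minimal sketch of a helper that escapes the text first and then turns line breaks into <br>, as the last answer does. The names escapeHtml and textToHtml are made up for this example.

```javascript
// Escape the characters that have special meaning in HTML.
// "&" must be replaced first so later entities are not double-escaped.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Convert plain text to HTML: escape it, then render each line break
// (Unix or Windows style) as a <br> element.
function textToHtml(text) {
  return escapeHtml(text).split(/\r?\n/).join('<br>');
}

// In the browser this would be used as, for example:
//   document.getElementById('txt').innerHTML = textToHtml(txtFile.responseText);
```

If no line-break handling is needed, assigning the text to textContent instead of innerHTML avoids the escaping step altogether.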

how to add html file to html file without jquery

When a div is clicked on my site, I want the contents of another html file to be added to the existing html. I've tried many methods and cannot find a solution. I don't want to use iframe or object or jquery or php.
function loadhtmlfile(filename, filetype, location){
  var fileref=document.createElement('link');
  fileref.setAttribute("rel", "html");
  fileref.setAttribute("type","text/html");
  fileref.setAttribute("href", filename);
  document.getElementById("parentDiv").appendChild(fileref);
}
loadhtmlfile("my.html", "html", "parentDiv");
This adds a link for the html file. It doesn't add the actual content of the html file.
Also from what I've read, it sounds like it may be best to do this using a server application. I'm using node.js. If it's best doing this server side, how do I do this using node.js?
Also I will be using websockets so I suspect this will change answers.
You could just use XMLHttpRequest in JavaScript to load the HTML content:
function loadFile(file) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', file);
  xhr.addEventListener('readystatechange', function() { // the page is loaded asynchronously
    if (xhr.readyState === XMLHttpRequest.DONE && xhr.status === 200) { // the file was loaded correctly
      document.getElementById('yourelement').innerHTML = xhr.responseText;
    }
  });
  xhr.send(null);
}
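The question also asks how this could be done server side with node.js. A rough sketch of the file-reading part, assuming the HTML fragments live in a single directory on the server. The function name readFragment and the directory layout are assumptions for illustration; how the string then reaches the browser (an HTTP response or a websocket message) is left open.

```javascript
var fs = require('fs');
var path = require('path');

// Read an HTML fragment from baseDir and return it as a string.
// Names that resolve outside baseDir (for example "../secret.html")
// are rejected so a client cannot request arbitrary files.
function readFragment(baseDir, name) {
  var root = path.resolve(baseDir);
  var full = path.resolve(root, name);
  if (full.indexOf(root + path.sep) !== 0) {
    throw new Error('Refusing to read outside ' + root);
  }
  return fs.readFileSync(full, 'utf8');
}
```

On the client, the received string can be assigned to innerHTML of the target element, exactly as in the XMLHttpRequest answer above.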

Repeatedly Grab DOM in Chrome Extension

I'm trying to teach myself how to write Chrome extensions and ran into a snag when I realized that my jQuery was breaking because it was getting information from the extension page itself and not the tab's current page like I had expected.
Quick summary, my sample extension will refresh the page every x seconds, look at the contents/DOM, and then do some stuff with it. The first and last parts are fine, but getting the DOM from the page that I'm on has proven very difficult, and the documentation hasn't been terribly helpful for me.
You can see the code that I have so far at these links:
Current manifest
Current js script
Current popup.html
If I want to have the ability to grab the DOM on each cycle of my setInterval call, what more needs to be done? I know that, for example, I'll need to have a content script. But do I also need to specify a background page in my manifest? Where do I need to call the content script within my extension? What's the easiest/best way to have it communicate with my current js file on each reload? Will my content script also be expecting me to use jQuery?
I know that these questions are basic and will seem trivial to me in retrospect, but they've really been a headache trying to explore completely on my own. Thanks in advance.
In order to access the web page's DOM you'll need to programmatically inject some code into it (using chrome.tabs.executeScript()).
That said, although it is possible to grab the DOM as a string, pass it back to your popup, load it into a new element and look for whatever you want, this is a really bad approach (for various reasons).
The best option (in terms of efficiency and accuracy) is to do the processing in the web page itself and then pass just the results back to the popup. Note that in order to be able to inject code into a web page, you have to include the corresponding host match pattern in the permissions property of your manifest.
What I describe above can be achieved like this:
editorMarket.js:
var refresherID = 0;
var currentID = 0;
$(document).ready(function(){
  $('.start-button').click(function(){
    oldGroupedHTML = null;
    oldIndividualHTML = null;
    chrome.tabs.query({ active: true }, function(tabs) {
      if (tabs.length === 0) {
        return;
      }
      currentID = tabs[0].id;
      refresherID = setInterval(function() {
        chrome.tabs.reload(currentID, { bypassCache: true }, function() {
          chrome.tabs.executeScript(currentID, {
            file: 'content.js',
            runAt: 'document_idle',
            allFrames: false
          }, function(results) {
            if (chrome.runtime.lastError) {
              alert('ERROR:\n' + chrome.runtime.lastError.message);
              return;
            } else if (results.length === 0) {
              alert('ERROR: No results !');
              return;
            }
            var nIndyJobs = results[0].nIndyJobs;
            var nGroupJobs = results[0].nGroupJobs;
            $('.lt').text('Indy: ' + nIndyJobs + '; '
                + 'Grouped: ' + nGroupJobs);
          });
        });
      }, 5000);
    });
  });
  $('.stop-button').click(function(){
    clearInterval(refresherID);
  });
});
content.js:
(function() {
  function getNumberOfIndividualJobs() {...}
  function getNumberOfGroupedJobs() {...}
  function comparator(grouped, individual) {
    var IndyJobs = getNumberOfIndividualJobs();
    var GroupJobs = getNumberOfGroupedJobs();
    // Declared with var so the values do not leak into the page as globals
    var nIndyJobs = IndyJobs[1];
    var nGroupJobs = GroupJobs[1];
    console.log(GroupJobs);
    return {
      nIndyJobs: nIndyJobs,
      nGroupJobs: nGroupJobs
    };
  }
  var currentGroupedHTML = $(".grouped_jobs").html();
  var currentIndividualHTML = $(".individual_jobs").html();
  var result = comparator(currentGroupedHTML, currentIndividualHTML);
  // The value of this expression is what executeScript passes to its callback
  return result;
})();

How can I get an HTML page to read the contents of a text document?

How can I get an HTML page (.html) to read the contents of a text document that can be found in the same folder as the .html file? The server is IIS.
Thanks
Google for server-side includes.
It seems like you can use #include directives in IIS.
http://msdn.microsoft.com/en-us/library/ms525185.aspx
But to be honest I strongly suggest using a scripting language, either PHP or something in the ASP family.
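To make concrete what a server-side include does: the server replaces a directive such as <!-- #include file="notes.txt" --> with the contents of the named file before sending the page. The sketch below only mimics that substitution in JavaScript for illustration. It is not how IIS implements it, and the names expandIncludes and readFile are assumptions.

```javascript
// Matches directives such as <!-- #include file="notes.txt" -->
// or <!-- #include virtual="/shared/notes.txt" -->.
var INCLUDE_RE = /<!--\s*#include\s+(?:file|virtual)\s*=\s*"([^"]+)"\s*-->/g;

// Replace every include directive in html with whatever readFile
// returns for the referenced name. readFile is supplied by the caller
// (hypothetical: it could read from disk, a cache, and so on).
function expandIncludes(html, readFile) {
  return html.replace(INCLUDE_RE, function (match, name) {
    return readFile(name);
  });
}
```

With a real server you would only write the directive in the page and let the server do the substitution; the function above just shows the idea.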
One hesitates to suggest iframes, but for completeness...
(You probably need server side includes, but you probably have bigger issues in general)
You can do it by adding the following JavaScript code to the <head> element of the web page:
<script>
function clientSideInclude(id, url)
{
  var req = false;
  // For Safari, Firefox, and other non-MS browsers
  if (window.XMLHttpRequest)
  {
    try {
      req = new XMLHttpRequest();
    } catch (e) {
      req = false;
    }
  } else if (window.ActiveXObject) {
    // For Internet Explorer on Windows
    try {
      req = new ActiveXObject("Msxml2.XMLHTTP");
    } catch (e) {
      try {
        req = new ActiveXObject("Microsoft.XMLHTTP");
      } catch (e) {
        req = false;
      }
    }
  }
  var element = document.getElementById(id);
  if (!element) {
    // Spaces added so the concatenated message reads correctly
    alert("Bad id " + id +
      " passed to clientSideInclude. " +
      "You need a div or span element " +
      "with this id in your page.");
    return;
  }
  if (req) {
    // Synchronous request, wait till we have it all
    req.open('GET', url, false);
    req.send(null);
    element.innerHTML = req.responseText;
  } else {
    element.innerHTML =
      "Sorry, your browser does not support " +
      "XMLHTTPRequest objects. This page requires " +
      "Internet Explorer 5 or better for Windows, " +
      "or Firefox for any system, or Safari. Other " +
      "compatible browsers may also exist.";
  }
}
</script>
IIS can do server-side includes. But if you can't do this, and want to include the text file in the HTML, you could grab the file with an XMLHttpRequest object and insert it into the DOM with JavaScript.
Naturally, a JS library will make this easier. For example, in Prototype:
new Ajax.Updater($('id_of_div'), 'http://yourdomain/path/to/file.txt');
That would grab the file and drop the contents into <div id="id_of_div"></div>.
Or in jQuery:
$("#id_of_div").load("http://yourdomain/path/to/file.txt");
You can link to the text file directly with a URL; the user will download it by clicking. If this is not what you want, you can probably use Server Side Includes. If that does not work, you must write a script (ASP?) to read the file.