How to Modify AJAX-Based Site Content on the DocumentComplete Event

I have an IE toolbar application. In the DocumentComplete event of the page I disable the HTML links using a BHO (Browser Helper Object). That works well for some sites, but when I browse AJAX/jQuery based sites, where the data is populated dynamically and the dynamic data is not even visible in the page source, this link-disabling feature does not work.
How can I disable or modify the content of dynamic data when it is loaded into the browser?
HTMLDocument document = (HTMLDocument)webBrowser.Document;
// Collect every anchor on the page and neutralise it.
IHTMLElementCollection hh = (IHTMLElementCollection)document.getElementsByTagName("a");
foreach (IHTMLElement ht in hh)
{
    ((HTMLAnchorElement)ht).removeAttribute("href", 1);
    ((HTMLAnchorElement)ht).style.color = "#b9b0b0";
}
Any help?

It depends on your IE version. Up to IE 8 there are no DOM mutation events, but there is an alternative approach as mentioned here. IE 9 supports some basic DOM mutation events as mentioned in this IE blog; also see this MSDN entry for the supported interface in IE9.
A very crude approach is to use a timer to poll for changes in the DOM content. You have to take care of threading issues and keep per-window text information (by window I mean tabs and also IFrame content).
Another approach is to use Asynchronous Pluggable Protocols to monitor any AJAX request and then notify the appropriate tab/window accordingly. This is harder than the previous crude approach.
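To make the crude timer approach concrete, here is a minimal JavaScript sketch of the polling idea, assuming the BHO can inject a script into the page (the injection mechanism itself is not shown); it repeatedly scans the DOM and neutralises any anchors it has not seen yet:
// Hypothetical injected script: poll the DOM and disable any newly added links.
(function () {
    function disableNewLinks() {
        var anchors = document.getElementsByTagName("a");
        for (var i = 0; i < anchors.length; i++) {
            var a = anchors[i];
            if (!a.getAttribute("data-disabled")) {   // skip links already handled
                a.removeAttribute("href");            // same effect as the BHO code above
                a.style.color = "#b9b0b0";
                a.setAttribute("data-disabled", "1");
            }
        }
    }
    disableNewLinks();                  // links present at DocumentComplete
    setInterval(disableNewLinks, 500);  // crude polling for AJAX-inserted links
})();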

I found an answer to the problem mentioned above. These are the steps I followed to resolve the issue.
I read the whole page content into an MSHTML object, as mentioned in the question:
HTMLDocument document = (HTMLDocument)webBrowser.Document;
Then I loaded that content into a WebBrowser control again.
From there I could read the data easily, and I simply removed the "href" property.

Related

How do I generate SEO-friendly markup for a single-page web app? [duplicate]

There are a lot of cool tools for making powerful "single-page" JavaScript websites nowadays. In my opinion, this is done right by letting the server act as an API (and nothing more) and letting the client handle all of the HTML generation stuff. The problem with this "pattern" is the lack of search engine support. I can think of two solutions:
When the user enters the website, let the server render the page exactly as the client would upon navigation. So if I go to http://example.com/my_path directly the server would render the same thing as the client would if I go to /my_path through pushState.
Let the server provide a special website only for the search engine bots. If a normal user visits http://example.com/my_path the server should give him a JavaScript heavy version of the website. But if the Google bot visits, the server should give it some minimal HTML with the content I want Google to index.
The first solution is discussed further here. I have been working on a website doing this and it's not a very nice experience. It's not DRY and in my case I had to use two different template engines for the client and the server.
I think I have seen the second solution for some good ol' Flash websites. I like this approach much more than the first one and with the right tool on the server it could be done quite painlessly.
So what I'm really wondering is the following:
Can you think of any better solution?
What are the disadvantages with the second solution? If Google in some way finds out that I'm not serving the exact same content for the Google bot as a regular user, would I then be punished in the search results?
While #2 might be "easier" for you as a developer, it only provides search engine crawling. And yes, if Google finds out you're serving different content, you might be penalized (I'm not an expert on that, but I have heard of it happening).
SEO and accessibility (not just for disabled persons, but accessibility via mobile devices, touch screen devices, and other non-standard computing / internet enabled platforms) have a similar underlying philosophy: semantically rich markup that is "accessible" (i.e. can be accessed, viewed, read, processed, or otherwise used) by all of these different browsers. A screen reader, a search engine crawler or a user with JavaScript enabled should all be able to use/index/understand your site's core functionality without issue.
pushState does not add to this burden, in my experience. It only brings what used to be an afterthought and "if we have time" to the forefront of web development.
What you describe in option #1 is usually the best way to go - but, like other accessibility and SEO issues, doing this with pushState in a JavaScript-heavy app requires up-front planning or it will become a significant burden. It should be baked into the page and application architecture from the start - retrofitting is painful and will cause more duplication than is necessary.
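For readers less familiar with pushState itself, here is a minimal, framework-agnostic sketch of what pushState navigation looks like on the client; loadContent is a hypothetical placeholder for your app's own fetch-and-render logic:
// Minimal pushState navigation sketch; loadContent() is a hypothetical
// stand-in for whatever fetches and renders the content for a given path.
document.addEventListener("click", function (e) {
    var link = e.target;
    if (link.tagName !== "A" || !link.getAttribute("data-pushstate")) return;
    e.preventDefault();
    history.pushState({}, "", link.getAttribute("href")); // update the address bar
    loadContent(link.getAttribute("href"));               // render on the client
});

window.addEventListener("popstate", function () {
    loadContent(location.pathname); // back/forward buttons re-render on the client
});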
I've been working with pushState and SEO recently for a couple of different applications, and I found what I think is a good approach. It basically follows your item #1, but accounts for not duplicating html / templates.
Most of the info can be found in these two blog posts:
http://lostechies.com/derickbailey/2011/09/06/test-driving-backbone-views-with-jquery-templates-the-jasmine-gem-and-jasmine-jquery/
and
http://lostechies.com/derickbailey/2011/06/22/rendering-a-rails-partial-as-a-jquery-template/
The gist of it is that I use ERB or HAML templates (running Ruby on Rails, Sinatra, etc) for my server side render and to create the client side templates that Backbone can use, as well as for my Jasmine JavaScript specs. This cuts out the duplication of markup between the server side and the client side.
From there, you need to take a few additional steps to have your JavaScript work with the HTML that is rendered by the server - true progressive enhancement; taking the semantic markup that got delivered and enhancing it with JavaScript.
For example, I'm building an image gallery application with pushState. If you request /images/1 from the server, it will render the entire image gallery on the server and send all of the HTML, CSS and JavaScript down to your browser. If you have JavaScript disabled, it will work perfectly fine. Every action you take will request a different URL from the server and the server will render all of the markup for your browser. If you have JavaScript enabled, though, the JavaScript will pick up the already rendered HTML along with a few variables generated by the server and take over from there.
Here's an example:
<form id="foo">
Name: <input id="name"><button id="say">Say My Name!</button>
</form>
After the server renders this, the JavaScript would pick it up (using a Backbone.js view in this example)
FooView = Backbone.View.extend({
    events: {
        "change #name": "setName",
        "click #say": "sayName"
    },
    setName: function(e){
        var name = $(e.currentTarget).val();
        this.model.set({name: name});
    },
    sayName: function(e){
        e.preventDefault();
        var name = this.model.get("name");
        alert("Hello " + name);
    },
    render: function(){
        // do some rendering here, for when this is just running JavaScript
    }
});

$(function(){
    var model = new MyModel();
    var view = new FooView({
        model: model,
        el: $("#foo")
    });
});
This is a very simple example, but I think it gets the point across.
When I instantiate the view after the page loads, I'm providing the existing content of the form that was rendered by the server to the view instance, as the el for the view. I am not calling render or having the view generate an el for me when the first view is loaded. I have a render method available for after the view is up and running and the page is all JavaScript. This lets me re-render the view later if I need to.
Clicking the "Say My Name" button with JavaScript enabled will cause an alert box. Without JavaScript, it would post back to the server and the server could render the name to an html element somewhere.
Edit
Consider a more complex example, where you have a list that needs to be attached (from the comments below this)
Say you have a list of users in a <ul> tag. This list was rendered by the server when the browser made a request, and the result looks something like:
<ul id="user-list">
<li data-id="1">Bob
<li data-id="2">Mary
<li data-id="3">Frank
<li data-id="4">Jane
</ul>
Now you need to loop through this list and attach a Backbone view and model to each of the <li> items. With the use of the data-id attribute, you can find the model that each tag comes from easily. You'll then need a collection view and item view that is smart enough to attach itself to this html.
UserListView = Backbone.View.extend({
    attach: function(){
        // capture the view, because inside the each() callback
        // `this` refers to the current DOM element, not the view
        var self = this;
        this.el = $("#user-list");
        this.$("li").each(function(index){
            var userEl = $(this);
            var id = userEl.attr("data-id");
            var user = self.collection.get(id);
            new UserView({
                model: user,
                el: userEl
            });
        });
    }
});

UserView = Backbone.View.extend({
    initialize: function(){
        this.model.bind("change:name", this.updateName, this);
    },
    updateName: function(model, val){
        this.el.text(val);
    }
});

var userData = {...};
var userList = new UserCollection(userData);
var userListView = new UserListView({collection: userList});
userListView.attach();
In this example, the UserListView will loop through all of the <li> tags and attach a view object with the correct model for each one. It sets up an event handler for the model's name change event and updates the displayed text of the element when a change occurs.
This kind of process, to take the html that the server rendered and have my JavaScript take over and run it, is a great way to get things rolling for SEO, Accessibility, and pushState support.
Hope that helps.
I think you need this: http://code.google.com/web/ajaxcrawling/
You can also install a special backend that "renders" your page by running javascript on the server, and then serves that to google.
Combine both things and you have a solution without programming things twice. (As long as your app is fully controllable via anchor fragments.)
So, it seems that the main concern is being DRY.
If you're using pushState, have your server send the same exact code for all URLs (that don't contain a file extension to serve images, etc.). "/mydir/myfile", "/myotherdir/myotherfile" or root "/" -- all requests receive the same exact code. You need some kind of URL rewrite engine. You can also serve a tiny bit of html and the rest can come from your CDN (using require.js to manage dependencies -- see https://stackoverflow.com/a/13813102/1595913).
(Test a link's validity by converting the link to your URL scheme and checking for the existence of content by querying a static or a dynamic source. If it's not valid, send a 404 response.)
When the request is not from a google bot, you just process normally.
If the request is from a Google bot, you use phantom.js -- a headless webkit browser ("A headless browser is simply a full-featured web browser with no visual interface.") to render the html and javascript on the server and send the Google bot the resulting html. As the bot parses the html it can hit your other "pushState" links, e.g. /somepage; the server rewrites the URL to your application file, loads it in phantom.js, the resulting html is sent to the bot, and so on...
For your html I'm assuming you're using normal links with some kind of hijacking (e.g. with backbone.js, https://stackoverflow.com/a/9331734/1595913).
To avoid confusion with any links separate your api code that serves json into a separate subdomain, e.g. api.mysite.com
To improve performance you can pre-process your site pages for search engines ahead of time during off hours by creating static versions of the pages using the same mechanism with phantom.js and consequently serve the static pages to google bots. Preprocessing can be done with some simple app that can parse <a> tags. In this case handling 404 is easier since you can simply check for the existence of the static file with a name that contains url path.
If you use #! hash bang syntax for your site links a similar scenario applies, except that the rewrite url server engine would look out for _escaped_fragment_ in the url and would format the url to your url scheme.
There are a couple of integrations of node.js with phantom.js on github and you can use node.js as the web server to produce html output.
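As a rough illustration of that flow, here is a hedged Node/Express sketch; renderWithPhantom is a hypothetical helper standing in for whichever phantom.js integration you pick, and the user-agent check is deliberately simplistic:
var express = require('express');
var app = express();

// Hypothetical helper: load the URL in phantom.js, run its JavaScript,
// and call back with the resulting HTML. The implementation depends on
// the phantom.js binding you choose.
function renderWithPhantom(url, callback) { /* ... */ }

app.get('*', function (req, res) {
    var ua = req.headers['user-agent'] || '';
    var isBot = /googlebot|bingbot|baiduspider/i.test(ua); // simplistic check

    if (isBot) {
        // Render the pushState URL server-side and return static HTML to the bot.
        renderWithPhantom('http://localhost:3000' + req.url, function (html) {
            res.send(html);
        });
    } else {
        // Normal users get the single-page application shell.
        res.sendfile(__dirname + '/index.html'); // res.sendFile in newer Express versions
    }
});

app.listen(3000);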
Here are a couple of examples using phantom.js for seo:
http://backbonetutorials.com/seo-for-single-page-apps/
http://thedigitalself.com/blog/seo-and-javascript-with-phantomjs-server-side-rendering
If you're using Rails, try poirot. It's a gem that makes it dead simple to reuse mustache or handlebars templates client and server side.
Create a file in your views like _some_thingy.html.mustache.
Render server side:
<%= render :partial => 'some_thingy', object: my_model %>
Put the template in your head for client-side use:
<%= template_include_tag 'some_thingy' %>
Render client side:
html = poirot.someThingy(my_model)
To take a slightly different angle, your second solution would be the correct one in terms of accessibility...you would be providing alternative content to users who cannot use javascript (those with screen readers, etc.).
This would automatically add the benefits of SEO and, in my opinion, would not be seen as a 'naughty' technique by Google.
Interesting. I have been searching around for viable solutions but it seems to be quite problematic.
I was actually leaning more towards your 2nd approach:
Let the server provide a special website only for the search engine
bots. If a normal user visits http://example.com/my_path the server
should give him a JavaScript heavy version of the website. But if the
Google bot visits, the server should give it some minimal HTML with
the content I want Google to index.
Here's my take on solving the problem. Although it is not confirmed to work, it might provide some insight or ideas for other developers.
Assume you're using a JS framework that supports "push state" functionality, and your backend framework is Ruby on Rails. You have a simple blog site and you would like search engines to index all your article index and show pages.
Let's say you have your routes set up like this:
resources :articles
match "*path", "main#index"
Ensure that every server-side controller renders the same template that your client-side framework requires to run (html/css/javascript/etc). If none of the controllers are matched in the request (in this example we only have a RESTful set of actions for the ArticlesController), then just match anything else and just render the template and let the client-side framework handle the routing. The only difference between hitting a controller and hitting the wildcard matcher would be the ability to render content based on the URL that was requested to JavaScript-disabled devices.
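On the client side, the routing that the wildcard Rails route defers to might look roughly like this (a Backbone.Router sketch; ArticleIndexView and ArticleShowView are hypothetical view names for your own implementations):
// Hypothetical client-side router matching the Rails "resources :articles" routes.
var AppRouter = Backbone.Router.extend({
    routes: {
        "":             "index",
        "articles":     "index",
        "articles/:id": "show"
    },
    index: function () {
        new ArticleIndexView({ el: $("#content") }).render();
    },
    show: function (id) {
        new ArticleShowView({ el: $("#content"), id: id }).render();
    }
});

$(function () {
    new AppRouter();
    // pushState: true makes the router use real URLs instead of #fragments
    Backbone.history.start({ pushState: true });
});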
From what I understand, it is a bad idea to render content that isn't visible to browsers: if Google indexes it, but people who come through Google to a given page find no content there, you're probably going to be penalised. What comes to mind is rendering the content in a div node that is hidden with display: none in CSS.
However, I'm pretty sure it doesn't matter if you simply do this:
<div id="no-js">
<h1><%= #article.title %></h1>
<p><%= #article.description %></p>
<p><%= #article.content %></p>
</div>
And then using JavaScript, which doesn't get run when a JavaScript-disabled device opens the page:
$("#no-js").remove() # jQuery
This way, for Google, and for anyone with JavaScript-disabled devices, they would see the raw/static content. So the content is physically there and is visible to anyone with JavaScript-disabled devices.
But, when a user visits the same page and actually has JavaScript enabled, the #no-js node will be removed so it doesn't clutter up your application. Then your client-side framework will handle the request through its router and display what a user should see when JavaScript is enabled.
I think this might be a valid and fairly easy technique to use. Although that might depend on the complexity of your website/application.
Though, please correct me if it isn't. Just thought I'd share my thoughts.
Use NodeJS on the server side, browserify your client-side code and route each http request's URI (except for static http resources) through a server-side client to provide the first "bootsnap" (a snapshot of the page's state). Use something like jsdom to handle jQuery DOM-ops on the server. After the bootsnap is returned, set up the websocket connection. It's probably best to differentiate between a websocket client and a server-side client by making some kind of wrapper connection on the client side (the server-side client can communicate directly with the server). I've been working on something like this: https://github.com/jvanveen/rnet/
Use Google Closure Templates to render pages. They compile to JavaScript or Java, so it is easy to render the page either on the client or on the server side. On the first encounter with every client, render the html and add the javascript as a link in the header. The crawler will read only the html, but the browser will execute your script. All subsequent requests from the browser can be made against the API to minimize the traffic.
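As a rough illustration of that flow (not the Closure Templates API itself), here is a hedged sketch; myapp.templates.article stands in for a Soy template compiled to JavaScript, and the /api/articles URL is made up:
// Hedged sketch: myapp.templates.article is a placeholder name for a Soy
// template compiled to JavaScript; it takes a data object and returns markup.
function showArticle(id) {
    // Subsequent requests only fetch JSON from the API, not full pages.
    $.getJSON('/api/articles/' + id, function (data) {
        document.getElementById('content').innerHTML =
            myapp.templates.article({ title: data.title, body: data.body });
    });
}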
This might help you : https://github.com/sharjeel619/SPA-SEO
Logic
A browser requests your single page application from the server, which is going to be loaded from a single index.html file.
You program some intermediary server code which intercepts the client request and differentiates whether the request came from a browser or some social crawler bot.
If the request came from some crawler bot, make an API call to your back-end server, gather the data you need, fill in that data to html meta tags and return those tags in string format back to the client.
If the request didn't come from some crawler bot, then simply return the index.html file from the build or dist folder of your single page application.
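A hedged sketch of that intermediary step, assuming an Express server; fetchPageData is a hypothetical helper that asks your API for the values to put in the meta tags:
// Hypothetical middleware: serve pre-filled meta tags to crawler bots,
// and the normal SPA shell to everyone else. fetchPageData is a placeholder.
app.get('*', function (req, res) {
    var ua = req.headers['user-agent'] || '';
    if (!/facebookexternalhit|twitterbot|googlebot/i.test(ua)) {
        return res.sendFile(__dirname + '/dist/index.html'); // normal users
    }
    fetchPageData(req.url, function (page) {
        res.send(
            '<html><head>' +
            '<title>' + page.title + '</title>' +
            '<meta property="og:title" content="' + page.title + '">' +
            '<meta property="og:description" content="' + page.description + '">' +
            '</head><body></body></html>'
        );
    });
});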

Get innerHTML via Jsoup

I'm trying to scrape data from this website: http://www.bundesliga.de/de/liga/tabelle/
In the source code I can see the tables, but there's no content, just things like:
<td>[no content]</td>
<td>[no content]</td>
<td>[no content]</td>
<td>[no content]</td>
....
With Firebug (F12 in Firefox) I don't see any content either, but I can select the table and copy its innerHTML via the Firebug options. In that case I get all the information about the teams, but I don't know how to get the table with its content in Jsoup.
To get the value of an attribute, use the Node.attr(String key) method
For the text on an element (and its combined children), use Element.text()
For HTML, use Element.html(), or Node.outerHtml() as appropriate
For example:
String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
Document doc = Jsoup.parse(html);
Element link = doc.select("a").first();
String text = doc.body().text(); // "An example link"
String linkHref = link.attr("href"); // "http://example.com/"
String linkText = link.text(); // "example"
String linkOuterH = link.outerHtml();
// "<a href="http://example.com/"><b>example</b></a>"
String linkInnerH = link.html(); // "<b>example</b>"
reference:
http://jsoup.org/cookbook/extracting-data/attributes-text-html
The table is not rendered on the server directly, but built by the client-side JavaScript of the page and constructed with data that gets to the client via AJAX. So what you get with the naive Jsoup approach is expected.
I see two possible solutions:
1. You analyze the network traffic and identify the AJAX calls that the site is making. Then you try to reconstruct the format and fire the same requests as the JavaScript would. Then you can reconstruct the table.
2. You don't use Jsoup but a real browser that loads the page and runs the JavaScript, including all AJAX calls. You could use Selenium WebDriver for that. There is a headless browser called PhantomJS, which has a relatively small footprint, that you can use in combination with Selenium WebDriver.
Both options have their (dis)advantages:
1. This takes more time, since you need to understand the network traffic pretty well. The reward will be a very fast and memory-efficient scraper.
2. Programming Selenium is very easy and you should not have any difficulties achieving your goal. You don't need to understand the inner workings of the site you want to scrape. However, the price is a further dependency in your project. Memory consumption is high. Another process runs. The scraping will be slow.
Maybe you can find another source for the soccer table that holds the info you want? That might be the easiest, for example http://www.fussballdaten.de/bundesliga/

How to handle HTML content in Windows 8 Metro App

I'm designing a Windows 8 reader app, and I have to use a control to show HTML content that is fetched from some website feeds. Because that HTML content may contain images or other formatted text, I'm currently using a RichTextBlock to show it, but parsing the HTML content costs a lot of time.
So I'm wondering if there is any control that can handle HTML content other than the WebView.
Thanks.
Updated:
The reason I can't use WebView is that I need to implement pagination, like the image below:
As JP Alioto mentioned you should use the WebView control.
You can use the NavigateToString method to load the HTML. Or use Navigate to request a URI.
There are issues, however, with using the WebView control. Specifically, it is rendered differently and is not a standard control, which means things like your app bar or settings pane will not render on top of the WebView. There is a workaround: use the WebViewBrush to "paint" the WebView onto a standard control such as a rectangle when needed.
You can also make a screenshot of the webpage you want to display. Taking a screenshot of a webpage yourself is not easy either, but there are special sites created to take screenshots of other websites. You can download the image such a site returns, then open and display it in your Windows 8 app. Here is an example of how I did that:
StorageFolder screens = await Windows.ApplicationModel.Package.Current.InstalledLocation.CreateFolderAsync(@"Screens\" + folderName, CreationCollisionOption.GenerateUniqueName);
var downloader = new BackgroundDownloader();
IStorageFile file = await screens.CreateFileAsync(fname, CreationCollisionOption.GenerateUniqueName);
string my_uri = "http://api.snapito.com/web/e3c351d5994134eb1aea855ce78e296c3292d48a/lc/" + url + "?type=jpeg";
DownloadOperation download = downloader.CreateDownload(new System.Uri(my_uri), file);
await download.StartAsync();
I think there are only two options but none of them are really good:
Use WebView and transform your HTML with CSS and other techniques to look native. Use the ScriptNotify and NavigationStarting and other events to navigate to another page. In W8.1 the WebView is much better (eg. treated as regular control not floating over all other controls,...)
Parse your HTML and generate native elements. I started such an implementation and created a XAML control to display HTML with native controls (see https://mytoolkit.codeplex.com/wikipage?title=HtmlTextBlock). However if you have complex HTML (eg iframes, etc.) this may not work and you have no other choice than to use the WebView control.

Alternative to iFrames with HTML5

I would like to know if there is an alternative to iFrames with HTML5.
I mean by that, be able to inject cross-domains HTML inside of a webpage without using an iFrame.
Basically there are 4 ways to embed HTML into a web page:
<iframe> An iframe's content lives entirely in a separate context than your page. While that's mostly a great feature and it's the most compatible among browser versions, it creates additional challenges (shrink wrapping the size of the frame to its content is tough, insanely frustrating to script into/out of, nearly impossible to style).
AJAX. As the solutions shown here prove, you can use the XMLHttpRequest object to retrieve data and inject it into your page. It is not ideal because it depends on scripting techniques, thus making the execution slower and more complex, among other drawbacks.
Hacks. A few are mentioned in this question, and they are not very reliable.
HTML5 Web Components. HTML Imports, part of Web Components, allow you to bundle HTML documents in other HTML documents. That includes HTML, CSS, JavaScript or anything else an .html file can contain. This makes it a great solution with many interesting use cases: split an app into bundled components that you can distribute as building blocks, better manage dependencies to avoid redundancy, organize code, etc. Here is a trivial example:
<!-- Resources on other origins must be CORS-enabled. -->
<link rel="import" href="http://example.com/elements.html">
Native compatibility is still an issue, but you can use a polyfill to make it work in evergreen browsers today.
You can learn more here and here.
You can use object and embed, like so:
<object data="http://www.web-source.net" width="600" height="400">
<embed src="http://www.web-source.net" width="600" height="400"> </embed>
Error: Embedded data could not be displayed.
</object>
Which isn't new, but still works. I'm not sure if it has the same functionality though.
object is an easy alternative in HTML5:
<object data="https://github.com/AbrarJahin/Asp.NetCore_3.1-PostGRE_Role-Claim_Management/"
width="400"
height="300"
type="text/html">
Alternative Content
</object>
You can also try embed:
<embed src="https://github.com/AbrarJahin/Asp.NetCore_3.1-PostGRE_Role-Claim_Management/"
width=200
height=200
onerror="alert('URL invalid !!');" />
Note: because Stack Overflow has currently turned off support for showing external URL content, the "Run code snippet" button will not show anything, but this will work perfectly on your own site.
No, there isn't an equivalent. The <iframe> element is still valid in HTML5. Depending on what exact interaction you need, there might be different APIs. For example, there's the postMessage method, which allows you to achieve cross-domain javascript interaction. But if you want to display cross-domain HTML content (styled with CSS and made interactive with javascript), an iframe remains a good way to do it.
If you want to do this and control the server from which the base page or content is being served, you can use Cross Origin Resource Sharing (http://www.w3.org/TR/access-control/) to allow client-side JavaScript to load data into a <div> via XMLHttpRequest():
// I safely ignore IE 6 and 5 (!) users
// because I do not wish to proliferate
// broken software that will hurt other
// users of the internet, which is what
// you're doing when you write anything
// for old version of IE (5/6)
xhr = new XMLHttpRequest();
xhr.onreadystatechange = function() {
    if(xhr.readyState == 4 && xhr.status == 200) {
        document.getElementById('displayDiv').innerHTML = xhr.responseText;
    }
};
xhr.open('GET', 'http://api.google.com/thing?request=data', true);
xhr.send();
Now for the lynchpin of this whole operation: the server you are requesting the data from must send the Access-Control-Allow-Origin header, naming the origin (your page) that is allowed to read its responses via XMLHttpRequest(). The following is an example of PHP code you can include at the top of the page that serves the data in order to send this header to clients (only one value, or *, is allowed per response):
<?php
header('Access-Control-Allow-Origin: http://some.example.com');
?>
This also does seem to work, although W3C specifies it is not intended "for an external (typically non-HTML) application or interactive content"
<embed src="http://www.somesite.com" width=200 height=200 />
More info:
http://www.w3.org/wiki/HTML/Elements/embed
http://www.w3schools.com/tags/tag_embed.asp
An iframe is still the best way to download cross-domain visual content. With AJAX you can certainly download the HTML from a web page and stick it in a div (as others have mentioned), however the bigger problem is security. With iframes you'll be able to load the cross-domain content but won't be able to manipulate it, since the content doesn't actually belong to you. On the other hand, with AJAX you can certainly manipulate any content you are able to download, but the other domain's server needs to be set up in such a way that it allows you to download it to begin with. A lot of times you won't have access to the other domain's configuration, and even if you do, unless you do that kind of configuration all the time, it can be a headache. In which case the iframe can be the MUCH easier alternative.
As others have mentioned you can also use the embed tag and the object tag but that's not necessarily more advanced or newer than the iframe.
HTML5 has gone more in the direction of adopting web APIs to get information from cross domains. Usually web APIs just return data though and not HTML.
I created a node module to solve this problem: node-iframe-replacement. You provide the source URL of the parent site and a CSS selector to inject your content into, and it merges the two together.
Changes to the parent site are picked up every 5 minutes.
var express = require('express');
var app = express();
var iframeReplacement = require('node-iframe-replacement');

// add iframe replacement to express as middleware (adds res.merge method)
app.use(iframeReplacement);

// create a regular express route
app.get('/', function(req, res){
    // respond to this request with our fake-news content embedded within the BBC News home page
    res.merge('fake-news', {
        // external url to fetch
        sourceUrl: 'http://www.bbc.co.uk/news',
        // css selector to inject our content into
        sourcePlaceholder: 'div[data-entityid="container-top-stories#1"]',
        // pass a function here to intercept the source html prior to merging
        transform: null
    });
});
The source contains a working example of injecting content into the BBC News home page.
You can use an XMLHttpRequest to load a page into a div (or any other element of your page, really). An example function would be:
function loadPage(){
    if (window.XMLHttpRequest){
        // code for IE7+, Firefox, Chrome, Opera, Safari
        xmlhttp = new XMLHttpRequest();
    } else {
        // code for IE6, IE5
        xmlhttp = new ActiveXObject("Microsoft.XMLHTTP");
    }
    xmlhttp.onreadystatechange = function(){
        if (xmlhttp.readyState == 4 && xmlhttp.status == 200){
            document.getElementById("ID OF ELEMENT YOU WANT TO LOAD PAGE IN").innerHTML = xmlhttp.responseText;
        }
    }
    xmlhttp.open("POST", "WEBPAGE YOU WANT TO LOAD", true);
    xmlhttp.send();
}
If your server is capable, you could also use PHP to do this, but since you're asking for an HTML5 method, this should be all you need.
The key issue with loading another site's page in your own site's page is security. There is a cross-site security policy defined; you would have no chance to load it directly in your iframe if the other site has set a strict same-origin policy. Hence any alternative to iframes must be able to bypass this security policy restriction, and even in the future, if any new component is introduced by the W3C, it would have similar security restrictions.
For now, the best way to bypass this is to simulate a normal page access in your logic, which means AJAX plus server-side access may be a good option. You can set up a proxy on your server side to fetch the page content and load it into the iframe. This works, but not perfectly: if there are links or images in the content, they may not be accessible.
Normally if you try to load a page in your own iframe, you would need to check whether the page can be loaded in iframe or not. This post gives some guidelines on how to do the check.
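A minimal sketch of such a server-side proxy, assuming Node with Express; the target URL is hard-coded for illustration and should be whitelisted in practice, and relative links and images inside the fetched page will still point at the original host:
var express = require('express');
var https = require('https');
var app = express();

// Simple proxy: fetch a remote page server-side and relay it,
// so the browser sees it as same-origin content.
app.get('/proxy', function (req, res) {
    var target = 'https://example.com/';   // whitelist real targets in practice
    https.get(target, function (remote) {
        var body = '';
        remote.on('data', function (chunk) { body += chunk; });
        remote.on('end', function () { res.send(body); });
    }).on('error', function () {
        res.status(502).send('Upstream fetch failed');
    });
});

app.listen(3000);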
You should have a look into JSON-P - that was a perfect solution for me when I had that problem:
https://en.wikipedia.org/wiki/JSONP
You basically define a javascript file that loads all your data and another javascript file that processes and displays it. That gets rid of the ugly scrollbar of iframes.
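To make the mechanism concrete, here is a bare-bones JSON-P sketch; the endpoint URL and the shape of the data are made up for the example:
// The cross-domain server wraps its JSON in a call to the function you name:
//   showData({"headline": "Hello", "body": "..."});
function showData(data) {
    document.getElementById('content').innerHTML =
        '<h2>' + data.headline + '</h2><p>' + data.body + '</p>';
}

// Loading the data is just adding a <script> tag; no XHR, no iframe.
var script = document.createElement('script');
script.src = 'https://other-domain.example/data?callback=showData';
document.body.appendChild(script);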

Winforms web browser control not firing document complete with AJAX web site

The VB.Net desktop app uses the IE browser control to navigate the web. When a normal page loads, the DocumentComplete event fires and I can read the resulting page and go from there. The issue I am having is that the page I am driving is written with AJAX, so the DocumentComplete event never fires. Furthermore, when you view the source of the page after it has loaded a new portion via AJAX, it hasn't changed. How are people handling this? What are my options?
This solution might solve your problem.
Prerequisites:
AxWebBrowser control,
reference to mshtml.dll
Dim axmshtml As mshtml.HTMLDocument = YourAxWebBrowserControl.Document
Dim HTMLSource As String = axmshtml.body.innerHTML 'html source, including DOM changes
If you know what you are looking for you can put the above code in a timer/loop
and simply monitor the page source for changes.
If wb is your webbrowser control, then instead of getting the HTML by using:
wb.DocumentText
use:
wb.Document.Body.InnerHtml
This will give you the updated html, reflecting the AJAX update.
As to detecting when the AJAX completes, for me it seems to be triggering a DocumentCompleted event. Not sure why it's different for you.
You need to interact with the Javascript code in the website using the methods on HtmlDocument.
I have seen this kind of behavior with C# when some AJAX scripts created a race condition. Adding the defer attribute to the script tag helped in that case. YMMV.
Not sure if this will work.
When the Ajax call completes, add a random anchor hash to the URL like so: foo.html#23234
then add your code to the NavigateComplete2 event.
http://msdn.microsoft.com/en-us/library/aa768334%28VS.85%29.aspx
I'm guessing that the page you load in your Windows app does an AJAX call, which appears to refresh the page. In that case, the document_complete event isn't fired, because the webpage itself isn't refreshed, only a portion of the page.
I found a similar question about this problem, with an accepted answer in VB.Net.
You can use the ProgressChanged event; it seems to fire during AJAX calls.