How can I get the full HTML code via WebKit.NET?

The website uses AJAX to load some data. When DocumentCompleted fires, I only get the HTML without the AJAX-loaded data.
How can I get the AJAX data through WebKit.NET?
Thanks.

I've just recently fought with this myself and have what should be a working solution. I've not tried it with ajax, but I have used it after creating and appending DOM elements from C# and it produces the full code where DocumentText only produces the original unmodified HTML.
var fullHTML = webKitBrowser1.StringByEvaluatingJavaScriptFromString("document.getElementsByTagName('html')[0].outerHTML");
The only limitation to this method that I've seen is that it does not include the doctype tag if there is one, but everything else is there.
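If the missing doctype matters, one possible workaround is to rebuild it from document.doctype and prepend it to the outerHTML. The sketch below is untested against WebKit.NET itself; the function names doctypeToString and serializeDocument are made up for illustration, and it assumes the embedded engine exposes the standard DOM doctype properties (name, publicId, systemId).

```javascript
// Rebuild a doctype declaration from a DOM DocumentType node.
// Returns an empty string when the document has no doctype.
function doctypeToString(doctype) {
  if (!doctype) {
    return "";
  }
  var s = "<!DOCTYPE " + doctype.name;
  if (doctype.publicId) {
    s += ' PUBLIC "' + doctype.publicId + '"';
  } else if (doctype.systemId) {
    s += " SYSTEM";
  }
  if (doctype.systemId) {
    s += ' "' + doctype.systemId + '"';
  }
  return s + ">";
}

// Combine the rebuilt doctype with the live DOM markup.
// `doc` is expected to look like a browser `document` object.
function serializeDocument(doc) {
  var dt = doctypeToString(doc.doctype);
  return (dt ? dt + "\n" : "") + doc.documentElement.outerHTML;
}
```

In the browser control you would evaluate something equivalent to serializeDocument(document) through StringByEvaluatingJavaScriptFromString, in the same way as the one-liner above.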

Related

Using node and node-phantom to scrape AngularJS Application

I have a Node script set up to scrape pages from an AngularJS application and then generate code needed for testing purposes. It works great except for one thing: ng-if. Since ng-if removes elements from the DOM, the script never sees these blocks of code. I can't remove the ng-ifs. So I'm wondering if there is some way to intercept the HTML between when node-phantom requests the page and when it actually loads everything into Phantom's DOM. What I'm hoping to do is simply set all the ng-ifs to true so that all content is available. Does anyone have any ideas for this?
EDIT: I'm using phantomjs-node, not node-phantom.
My final solution was to scrape the page for all of the comment tags, then filter them to find the ones that contained ng-ifs and parse the variable names out of those comments. Then I tapped into Angular's $scope and set all of those variables to true, forcing everything that is hidden on the page to be visible.
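For anyone trying the same thing, the comment-scraping step might look roughly like the sketch below. It assumes AngularJS leaves placeholder comments of the form "<!-- ngIf: expression -->" (and "<!-- end ngIf: expression -->") where an ng-if block was removed; check the markup your Angular version actually produces. The function name extractNgIfExpressions is made up, and the step that sets the values on $scope is not shown.

```javascript
// Collect the distinct expressions found in AngularJS ng-if placeholder comments.
// Assumed comment shape: <!-- ngIf: expr --> or <!-- end ngIf: expr -->
function extractNgIfExpressions(html) {
  var commentPattern = /<!--\s*(?:end\s+)?ngIf:\s*([\s\S]*?)\s*-->/g;
  var seen = {};
  var expressions = [];
  var match;
  while ((match = commentPattern.exec(html)) !== null) {
    var expr = match[1];
    // Skip empty captures and expressions already recorded.
    if (expr && !Object.prototype.hasOwnProperty.call(seen, expr)) {
      seen[expr] = true;
      expressions.push(expr);
    }
  }
  return expressions;
}
```

Note that the captured text is a full Angular expression, not necessarily a bare variable name. Something like "a && !b" would need extra parsing before you could decide which scope variables to set to true.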

Get HTML element using JSP

I am new to Java EE. In my application, I have an HTML page containing a textarea filled with some information.
I also have a form for the user. Once he submits the form, I use a servlet to process the information.
Finally, I need to forward the result from the servlet back to the same HTML page and show it in the existing textarea.
I have created the HTML page and submitted the form to the servlet.
I am now stuck on how to access the textarea in the HTML page to update it with my result.
I googled a lot, but everywhere people forward the servlet's result to a JSP page and create a fresh HTML page using the "out" object.
I need to use the same textarea on my HTML page to show the result. Please help me achieve this.
Thanks in advance.
There are two ways to do this.
Server-side rendering. You can re-draw the whole page, with a normal form submit. In this case you would put something in request scope in the servlet and then access that in the JSP when you draw the textarea. Note that you are not accessing the HTML textarea in the JSP - the JSP runs server-side code to generate HTML markup, but you are not accessing the browser DOM directly.
This might look like:
servlet: request.setAttribute("textareaContent", varWithTextareaContent);
JSP: <textarea>${textareaContent}</textarea>
Client-side rendering. Instead, you could make an AJAX POST request with jQuery (there are some plugins to help with this). In this case, your servlet would not forward to a JSP for HTML rendering. It would directly return a JSON object like {"textareaContent": ...}, and you would handle that client-side in your AJAX callback. In this case you would be accessing the textarea in the browser DOM directly, in JavaScript (not JSP).
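As a rough sketch of the client-side callback, assuming the servlet returns JSON text shaped like {"textareaContent": "..."} as described above. The function name applyServletResponse is made up for illustration; the textarea argument would be the real DOM element in the browser.

```javascript
// Parse the servlet's JSON response and copy the result into the textarea.
// `textarea` is any object with a `value` property, such as a DOM textarea.
function applyServletResponse(responseText, textarea) {
  var payload = JSON.parse(responseText);
  if (typeof payload.textareaContent !== "string") {
    throw new Error("Response has no textareaContent string");
  }
  textarea.value = payload.textareaContent;
  return textarea;
}
```

In the page, this would be called from the success callback of the AJAX request, passing the raw response text and the element returned by document.getElementById for your textarea.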

Jquery or json load data without refreshing page and add into html tags

I load some data from a database into an HTML/PHP page, and then I edit it. I want the displayed data to change automatically, without refreshing the page, when I edit it and click the submit button. I have read that I must use JSON, but I can't work out how to put JSON values into HTML tags.
How can I do it?
If you could not understand me, see this video.
You can use JavaScript to fetch the JSON and manipulate the DOM accordingly. The example you provided uses jQuery, which provides a couple of ways to retrieve JSON data with AJAX calls. This is all documented very well. See https://api.jquery.com/jquery.get/
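A minimal sketch of the "manipulate the DOM" part, assuming the JSON object's keys match the ids of the elements you want to update. The function name fillElements, the data.php URL, and the key-equals-id convention are all assumptions made for illustration.

```javascript
// Copy each value of a JSON object into the element whose id equals the key.
// `findById` is a lookup function; in a browser it would wrap getElementById.
// Returns the list of keys that were actually written.
function fillElements(data, findById) {
  var updated = [];
  Object.keys(data).forEach(function (key) {
    var el = findById(key);
    if (el) {
      // textContent avoids interpreting the value as HTML.
      el.textContent = String(data[key]);
      updated.push(key);
    }
  });
  return updated;
}

// Browser usage with jQuery (hypothetical URL):
// $.getJSON("data.php", function (data) {
//   fillElements(data, function (id) { return document.getElementById(id); });
// });
```

Keys with no matching element are skipped, so the server can return extra fields without breaking the page.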

parse html in adobe air

I am trying to load and parse HTML in Adobe AIR. The main purpose is to extract the title, meta tags, and links. I have been trying HTMLLoader, but I get all sorts of errors, mainly uncaught JavaScript exceptions.
I also tried to load the HTML content directly (using URLLoader) and push the text into HTMLLoader (using loadString(...)) but got the same errors. My last resort was to load the text as XML and then use E4X queries or XPath, but no luck there because the HTML is not well formed.
My questions are:
1. Is there a simple and reliable (AIR/ActionScript) DOM component out there (I do not need to display the page, and headless mode will do)?
2. Is there any library to convert (crappy) HTML into well-formed XML so I can use XPath/E4X?
3. Any other suggestions on how to do this?
thx
ActionScript and JavaScript are both ECMAScript dialects, so JavaScript code often ports with few changes, and thankfully, there's...
Pure JavaScript/ActionScript HTML Parser
created by Javascript guru and jQuery creator John Resig :-)
One approach is to run the HTML through HTMLtoXML() then use E4X as you please :)
Afaik:
1. No :-(
2. No :-(
3. I think the easiest way to grab title and meta tags is writing some regular expressions. You can load the page's HTML code into a string and then read out whatever you need like this:
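var str:String = ""; // put HTML code in here
// Lazy match, so attributes on <title> and line breaks inside it are tolerated
var pattern:RegExp = /<title[^>]*>([\s\S]*?)<\/title>/i;
var match:Object = pattern.exec(str); // null when there is no <title>
trace(match ? match[1] : "no title found");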
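The same regular-expression idea can be stretched to meta tags and links. The sketch below is plain JavaScript rather than ActionScript, on the assumption that the patterns port over with little change; parseAttributes and extractHeadInfo are made-up names. Regex extraction like this is fragile (it will also pick up tags inside comments or scripts), so treat it as a quick-and-dirty approach, not a parser.

```javascript
// Turn the text of one opening tag into an object of lower-cased attribute names.
function parseAttributes(tag) {
  var attrs = {};
  var attrPattern = /([a-zA-Z_:][-\w:.]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))/g;
  var m;
  while ((m = attrPattern.exec(tag)) !== null) {
    // Exactly one of the three value groups is set: double-quoted, single-quoted, or bare.
    var value = m[2] !== undefined ? m[2] : (m[3] !== undefined ? m[3] : m[4]);
    attrs[m[1].toLowerCase()] = value;
  }
  return attrs;
}

// Pull the title, the attributes of every <meta> tag, and every <a href> value.
function extractHeadInfo(html) {
  var titleMatch = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  var metas = (html.match(/<meta\b[^>]*>/gi) || []).map(parseAttributes);
  var links = (html.match(/<a\b[^>]*>/gi) || [])
    .map(parseAttributes)
    .filter(function (attrs) {
      return Object.prototype.hasOwnProperty.call(attrs, "href");
    })
    .map(function (attrs) {
      return attrs.href;
    });
  return {
    title: titleMatch ? titleMatch[1].trim() : null,
    metas: metas,
    links: links
  };
}
```

Anchors without an href (named anchors, for example) are dropped from the links list.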

How can I post data (form) to an HTML page and hijack the data in the middle?

The site address: http://www.ynet.co.il/YediothPortal/Ext/TalkBack/CdaTalkBack/1,2497,L-3650194-0-68-544-0--,00.html
Fill the form with rubbish.
Hit 'Send'.
The form posts the data to another HTML page, without any parsing of the data I've just added.
How do they do it?
A likely option is that they are using a content management system where "html" on the URL doesn't actually mean it's a static html file.
This may be out of left field, but I've certainly used the occasional JS function to grab everything in the header and either parse it or pass it to another script using AJAX.
I'll sometimes use this method in a 404.html page to grab the headers of the previous page, parse them out to see where someone was trying to go and redirect them.
That is, as annakata said, one of the numerous options available.
Edit based on clarified question:
Numerous frameworks can be configured to intercept an HTML request. For instance, ASP.NET can be set to handle any given extension, and an HttpModule could do anything with that request. It's really up to the web server configuration what it decides to do with any request.
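To illustrate the general idea (not ASP.NET specifically), here is a small sketch in Node-flavoured JavaScript of a server-side handler that treats a POST to a URL ending in ".html" as dynamic. The function name handleRequest and the response shape are made up; nothing here reflects how the site in the question is actually implemented.

```javascript
// Decide how to answer a request. A URL ending in ".html" is just a name:
// the server is free to run code for it instead of reading a static file.
function handleRequest(method, url, body) {
  var path = url.split("?")[0];
  if (method === "POST" && /\.html$/.test(path)) {
    // Parse the form-encoded body the browser sent with the POST.
    var fields = new URLSearchParams(body);
    var count = Array.from(fields.keys()).length;
    return {
      status: 200,
      contentType: "text/html",
      body: "<p>Received " + count + " field(s)</p>"
    };
  }
  return { status: 404, contentType: "text/plain", body: "Not found" };
}
```

Wired into a real server (for example through Node's http.createServer), the form data is processed even though the address bar shows what looks like a static page.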
Also: you don't really want to be saying "hijack".