Jsoup - Trying to extract Comments number from web page

I'm trying to extract the overall comments number from a web page using Jsoup.
For example, here is a page (CNN): http://edition.cnn.com/2011/POLITICS/07/31/debt.talks/index.html?hpt=T1
I see that the class ID is cnn_strycmtsndff, but I can't find the right command to extract it.
Can someone help?
Thanks

Unfortunately, I don't think Jsoup is going to cut it. If you use the Chrome developer tools you can clearly pick out the HTML used for presenting the "(##### Comments)" section, but if you just view the source, none of that information is there. It seems they are using JavaScript to embed the information in the page dynamically.
This is what you see in "View Source":
<div id="disqus_thread"></div><script type="text/javascript" src="http://cnn.disqus.com/embed.js"></script>
Jsoup only parses the HTML it downloads and does not execute JavaScript, so it will never be able to see the elements with the comment information.

How to Make Text link to Hyperlinks in Facebook Comment? Comment plugin

This is a tricky one. I have recently seen that many blogs have Facebook comment plugins where anyone can comment and post relevant links.
The problem is that those links show up as plain text, not as clickable hyperlinks.
At the bottom of the article below there is a Facebook comment plugin where a text link does appear as a hyperlink:
http://www.huffingtonpost.com/2014/11/25/black-friday-apple-deals-2014_n_6211754.html
My question is: how did that user do that, i.e. turn text into a hyperlink? Normally it doesn't happen.
I have searched Google a lot but haven't been able to find the correct method.
The user doesn't make something that looks like a hyper-link display as a hyper-link, the webpage does!
What happens is that the script behind the page uses a regular expression to pattern-match URLs in the comments. When a match is found, it is displayed as a hyper-link; text that doesn't match the regex is displayed as plain text. This is built either into the Facebook comment plug-in or into the website itself.
When text is passed to HTML, there is no way of telling what is and is not a link. However, if you run it through a script that identifies links, you can tell the page to display them as hyper-links rather than plain text.
A great example/explanation of this is over at http://regexr.com/39i0i
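As a rough illustration of the idea, here is a minimal sketch of how a page script might turn URLs in comment text into links. The function names `escapeHtml` and `linkify` and the URL pattern are my own assumptions for illustration; this is not the code the Facebook plug-in actually uses, and the pattern is deliberately simplistic (it will, for instance, swallow trailing punctuation).

```javascript
// Hypothetical helper: escape HTML special characters so comment text is safe to embed.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: wrap anything that looks like an http(s) URL in an anchor tag
// and leave everything else as escaped plain text.
function linkify(text) {
  // Assumed, simplified pattern: "http://" or "https://" followed by
  // characters that are not whitespace, angle brackets or double quotes.
  const urlPattern = /https?:\/\/[^\s<>"]+/g;
  let result = "";
  let lastIndex = 0;
  for (const match of text.matchAll(urlPattern)) {
    // Plain text before the URL stays plain text.
    result += escapeHtml(text.slice(lastIndex, match.index));
    const url = escapeHtml(match[0]);
    result += `<a href="${url}">${url}</a>`;
    lastIndex = match.index + match[0].length;
  }
  // Plain text after the last URL.
  return result + escapeHtml(text.slice(lastIndex));
}
```

The comment author types nothing special: the page passes the stored comment through something like `linkify` before displaying it.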
Tl;Dr
Users don't make it happen; the plug-in or webpage makes it happen.
The exception is plug-ins that require you to use link tags. On the page you linked, though, it is all done by that version of the Facebook plugin.

How can I format an html page so that when a link is added to G+, it looks nice?

I'm building a new website, and I was testing the ability to add a "link" to Google+ and have the article and its content (summary) get pulled into Google+.
When I do this today, all I see is an image and no text. I'm sure there is some formatting change I can make to the page to make the text come in, but I don't know what to do.
If there is a guide somewhere I would like to read it, but I cannot find one. Thanks for the help.
An example page:
http://alexedison.com/index.php?indexNum=8
The easiest way is to create snippet data for your page. Google provides a tool for doing this as well as additional information at:
https://developers.google.com/+/web/snippet
You can validate the data on your page using the Google structured data testing tool:
https://www.google.com/webmasters/tools/richsnippets
There is no problem with your webpage.
Google Plus has changed the look of each post.
It now shows only the image and title along with the given URL.
If you visit Google Plus, you can see that posts appear without the description meta information.

Source code viewer through Html page

Hi, I'm demonstrating the HTML tags that are new in CSS3, and I'm writing documentation that makes it easy to compare the source code with the output. It's really tedious to go to the source code, select the file, and then open it in the browser.
It would be great if I could view both the source code and the output on the same HTML page.
For example (I'm talking about the page I've attached below), if I select "Source code", the source code should be displayed on the screen or in a text editor.
I don't know whether this is possible. If it is, it would be great if any of you could guide me.
To get the source of just one element, do this:
HTML: <div id="one"><span id="two"></span></div>
JS:
document.getElementById('one').innerHTML // returns <span id="two"></span>
document.getElementById('one').outerHTML // returns <div id="one"><span id="two"></span></div>
To get the source of the entire page, do this:
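new XMLSerializer().serializeToString(document.doctype) + document.documentElement.outerHTML
document.doctype is a DocumentType node, so it has to be serialized to get the doctype as a string (concatenating it directly yields "[object DocumentType]"). document.documentElement.outerHTML returns the code for the <html> tag and everything inside it.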
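To show that source next to the rendered output on the same page, the markup has to be escaped first, otherwise the browser renders it instead of displaying it. Here is a minimal sketch; the helper names `escapeHtml` and `sourceAndOutput` and the class names `source` and `output` are made up for illustration.

```javascript
// Hypothetical helper: escape markup so the browser shows it as text instead of rendering it.
function escapeHtml(source) {
  return source
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: build a block that shows a snippet's source and its rendered output together.
function sourceAndOutput(snippet) {
  return (
    '<pre class="source">' + escapeHtml(snippet) + "</pre>\n" +
    '<div class="output">' + snippet + "</div>"
  );
}
```

In a browser you could then insert the result with something like `container.innerHTML = sourceAndOutput(document.getElementById('one').outerHTML)`, where `container` is whatever element you want to hold the demo. Alternatively, assigning the source string to a `<pre>` element's `textContent` does the escaping for you.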
Use the developer tools you have in all modern browsers (the most advanced ones being Chrome and Firefox).
You typically open such a toolset using the Ctrl+Shift+I shortcut.
Then you have a lot of useful tools, as described here for Chrome or here for Firefox.
One of them seems to be exactly what you need. For example in Chrome, the first tab, called "Elements", shows you the source code with a lot of goodies (interpreted css with reasons, element displayed when you hover the mouse, search, etc.).
I'd suggest you take the time to read the linked documentation, as this is an essential tool for any web developer. You won't be able to stop using those tools once you go deeper into JavaScript or CSS.

How to locate/edit the original source html of an aspx page on a sharepoint site

I am working with a MOSS 2007 site. The page I am currently trying to make additions/changes to is a popup.
The page is an .aspx file and I know where it is located. Simply put, I am trying to find and ultimately modify the original source HTML of this page (the HTML that shows up when you hit "view source" in the browser).
When I try to edit the page in SharePoint Designer, it only brings up a formatted view of the HTML code. I want to see a much more complete version where I can edit the attributes of the tags and such.
Is there any way to locate this original source HTML?
Try using CTRL + F to do a quick find, then search for your HTML tag id or any descriptive text you can find from viewing the source or inspecting the element. Then select Search Entire Solution or Current Project.
If the project is set up with an MVC structure, try finding
MainFolder/ProjectName_Web/Views
Alternately look in
MainFolder/ProjectName_Web/UserControls
If you are using Visual Studio, check out Kris Hollenbeck's solution.

Embed a webclip

I am wondering if there's a way to embed a webclip into a webpage, i.e. to have a portion of one webpage embedded as a widget in another page. I was thinking it might be possible somehow through Mac OS X's Dashboard widgets: one can take a webclip and make a Dashboard widget, and since I hear they are HTML based, one could reverse-engineer one into simple HTML code. Kind of the reverse of what Google does for gadgets. Any ideas? I'm open to any solutions.
Thanks.
The easy, HTML-based way is with an iframe. This puts an entire webpage within a box on your page. You don't have much flexibility with it.
You can also do it with JavaScript. jQuery makes it easy with its .load() method. Going this route, you can load a webpage with JavaScript, load specific tags within that page, or even modify the incoming code before displaying it. Note that .load() is subject to the browser's same-origin policy, so it only works for pages on your own domain unless the other server allows cross-origin requests.
Most basically:
$("#xxxx").load("url.html");
Where xxxx is the id of the html tag where you want the content to be loaded on your page (e.g. if you have <div id="xxxx">content will go here</div> in your HTML). See more details at: http://docs.jquery.com/Ajax/load.
If these don't suffice, the next step would be PHP (I doubt you'd need it, but if you'd like to, you can search for file_get_contents on php.net).