Scraping a website by scrolling and copy-pasting content - google-chrome

I'm scraping a website whose content I can only copy and paste into Excel. To my knowledge, Python, R, and other languages do not work on it.
My current method is to copy and paste the content of the website. I copy as I scroll down and the website loads more content. It works, but only for small quantities. If I keep scrolling for several minutes and then try to copy and paste the content into Excel, not all of the content shows up in Excel.
I'm using Google Chrome as my browser.
Does anyone have experience with this?

If you are trying to scrape a website's content and the page does not require JavaScript to render it,
you may want to use the jsoup library for Java - here's a quick link for your reference.
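To give an idea of what the static-page approach looks like, here is a minimal sketch. It is written in Python with only the standard library rather than jsoup, and it assumes the data sits in a plain HTML table, which may not match the site in question. The `SAMPLE` markup and the names `TableRows` and `html_table_to_csv` are made up for illustration; in practice the HTML would first be downloaded, for example with `urllib.request.urlopen`. It will not help if the rows are only loaded by JavaScript while scrolling.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table cells found in `html` as CSV text that Excel can open."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical markup standing in for the downloaded page.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget, large</td><td>19.50</td></tr>
</table>
"""

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE))
```

Saving the returned text to a `.csv` file gives something Excel can open directly, which avoids the copy-and-paste step for pages of this kind.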

Related

Embedding an Excel workbook in a web page

I'm using Microsoft's OneDrive to share an Excel workbook by embedding it into a web page using the iframe tag.
I've got the code that OneDrive provides and it displays fine on the page. However, it's possible for a user to click the icon in the black bar at the bottom and view the workbook full screen.
I don't have a problem with that, but it then gives the option to download, copy and share the entire file and that is a problem.
I've found parameters that can be used with the workbook link such as wdHideGridlines, but is there anything that will get rid of that black bar? Or anything that will stop someone downloading the file?
It seems that you can embed a file with OneDrive and it's open for all, or you can use the 'share' option and get a view-only link, but I can't seem to embed that link - it displays an Excel icon for the workbook rather than a view of the data.
I hope this makes sense, if anyone can help I'd be grateful.
At the moment, the only way to prevent the download is to hide the download command on the web page with CSS.
Cast your vote for this UserVoice idea: https://excel.uservoice.com/forums/274580-excel-online/suggestions/19274656-remove-the-download-option
If it gets sufficient votes, Microsoft will consider implementing it.

jQuery mobile single page template loading content via ajax

I'm creating a mobile app with PhoneGap, Google Maps and jQuery Mobile. I would like to have more than one HTML template file (navigation for pages) and load just the content of each page via Ajax into a JavaScript overlay.
I also want to keep the history and hashes so that the user can go back to the previous page, etc. I tried using $.mobile.loadPage and changePage, but these change the whole page and I would like to change just the content.
Is there any solution?
You will find this framework the exact one you are looking for.

Html Page to Print, PDF and Copy to Clipboard in .net MVC 3.0

I am developing a .NET MVC 3.0 web application. One part of it has to display invoice details like this.
In the top right corner I need to provide Print, PDF and Copy to Clipboard functions.
How do I add those three buttons with the above functions? Please help.
Thanks in advance!
Note: (I added this part later). This is a whole page, but the invoice preview is just one part of the page, wrapped in a div tag. I don't want to print the whole page. I just want to give the three buttons to that invoice div tag only. Thanks :)
pdf = simply link to the URL of the PDF (Or the URL that generates the PDF)
print = what are you printing? The PDF? If so, this is redundant, as a person just needs to click the PDF and can print from there. Is it to print the web page? Again, redundant, as the web browser already has a print feature.
copy to clipboard = what are you copying to clipboard? For now, this can only be accomplished via Flash and, likely, in the future will not be something you can do at all. This question's answers list some options: How do I copy to the clipboard in JavaScript?
Bottom line is that the image you provided appears to be a wireframe created by an IA or UX person who doesn't understand how web pages work. Not an uncommon thing, but really this is a UX issue, not a development issue. The problem needs to be addressed by the UX team first.
UPDATE based on question's UPDATE:
Regarding only wanting to deal with the contents of a particular div, the PDF and COPY TO CLIPBOARD comments above still stand.
What changes is the PRINT option. This can get messy. You could try swapping out the CSS via JS to only display the part you want to print, then triggering a print dialogue via JS, but that's going to be somewhat messy. I'd maybe consider instead making the PRINT icon a 'PRINT FRIENDLY' icon. Upon clicking that update the CSS to only display the content you want to print. Then let users use the PRINT feature in the browser, or you could try triggering the print dialogue with another bit of JS (though I'm not sure where browsers are at in supporting you printing from JS directly).

live content from html to html

I'm using UIWebView to display my organization's data (publicized and legal). However, I only want to pull specific data from the HTML file rather than loading the whole URL. For example, I want to pull the "News" section of the HTML and have the user stay on that page only, without being able to go to other parts of the website (e.g. home page, contact us), while still allowing them to view the PDF articles linked from the HTML file.
I've asked around and read up on the DOM and screen scraping, but it seems that the pulled data is stored in a database instead.
Is there any way that I can pull just the HTML "News" section with the PDF URLs into my customized HTML file and have it updated live? For example, it could refresh every 30 seconds and pull information from the website so that the content and the list of PDFs are up to date: if 3 new articles are added to the main website, my customized HTML file would also refresh, pull the information from the website and update my article list.
If anyone can point me to a specific method that allows live HTML-to-HTML data passing, that would be great and I can go and do more research on it. I'm currently very lost and confused, as it is my first time doing this. Any help/feedback will be very much appreciated :)
EDIT: For example, Google Maps or Google Search. I don't want to use the whole Google web page, just the important part that I want, like the search results or the map display.
This will involve quite a lot of learning on your part - you'll have to learn HTML / the DOM / JavaScript and iOS/UIWebView.
Let's leave the live refresh part for now; I'll post another answer or an edit on that later.
That's not going to be easy either (check out my earlier posting today on background execution issues that will affect you, unless the update is only to take place in the foreground:
iOS Run Code Once a Day).
You will have to do something like this. Note that I've never tried this, nor seen postings from people on here who have, but in theory it should work. There will be a lot of learning, as I've said, and lots of trial and error. It's a big task when you're not familiar with these things.
1) Download the HTML page and load it in a UIWebView, but keep that UIWebView hidden so the users can't see it.
2) When the page has loaded, its DOM will be accessible.
3) You can use JavaScript to access the DOM and look for the parts you want.
How you inject and run the JavaScript in UIWebView can be answered in a separate question (this answer will get too long if all the exact details are included).
4) Remove the parts of the DOM you are not interested in. Or use events to make only those parts you are interested in appear; jQuery can probably help here.
5) Display the UIWebView.
Alternatively, the HTML could be saved to a file and string parsing could be used to search for the bits you are looking for and create a new HTML text file from them. I think this would get very messy; it is better to take advantage of the fact that UIWebView will parse the HTML page and create the DOM for you.
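For the file-based alternative, here is a rough sketch of pulling one section out of a page and wrapping it in a new HTML file. It is not UIWebView code: it is plain Python using the standard library parser instead of raw string searching, and it could run on a server or a build machine. It assumes the section you want carries an `id` attribute (here the made-up `id="news"`); `SectionExtractor`, `extract_section` and the `PAGE` markup are hypothetical names for illustration only.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SectionExtractor(HTMLParser):
    """Copy the markup of the first element whose id equals `wanted_id`."""

    def __init__(self, wanted_id):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.wanted_id = wanted_id
        self.parts = []     # raw markup collected so far
        self.depth = 0      # open-element depth inside the wanted section
        self.done = False   # True once the wanted element has been closed

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0 and dict(attrs).get("id") != self.wanted_id:
            return  # still outside the section we are looking for
        self.parts.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied without touching depth.
        if self.depth > 0:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID_TAGS:
            return
        self.parts.append(f"</{tag}>")
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth > 0:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth > 0:
            self.parts.append(f"&#{name};")


def extract_section(html, wanted_id):
    """Return the markup of the element with the given id, or '' if absent."""
    parser = SectionExtractor(wanted_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


# Hypothetical page standing in for the organization's website.
PAGE = """<html><body>
<div id="nav"><a href="/">Home</a></div>
<div id="news"><h2>News</h2><ul><li><a href="a.pdf">Article A</a></li></ul></div>
<div id="footer">Contact us</div>
</body></html>"""

if __name__ == "__main__":
    section = extract_section(PAGE, "news")
    print("<html><body>" + section + "</body></html>")
```

The resulting file contains only the "News" block, so the navigation links to the rest of the site are simply not there. Relative PDF links such as `a.pdf` would still need to be rewritten to absolute URLs, which this sketch does not do.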

how to write app that will make a NEW html page and fill it by copying the information from other website?

For my website www.livegol.co.il I need an app that will create pages automatically and fill in some information on each page by copying it from another website.
For example:
the app will automatically build the homepage at 10:00 AM every day, copying the names of the teams that are playing from here: www.livescore.com
Thanks
You'll probably want some PHP scraper-type tool that is triggered by a cron job at the specified time.
There are plenty of website scraper scripts out there. I recently used one for some IMDB content.
Hope this helps.
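As a rough illustration of the "generate the page on a schedule" half, here is a sketch in Python rather than PHP. The scraping step is left out: `matches` is placeholder data, and `build_homepage`, the output file name and the script path in the cron line are all hypothetical.

```python
import html
from datetime import date


def build_homepage(matches, day):
    """Return a complete HTML page listing (home, away) team pairs."""
    items = "\n".join(
        # Escape team names so characters like & cannot break the markup.
        f"    <li>{html.escape(home)} vs {html.escape(away)}</li>"
        for home, away in matches
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        '  <meta charset="utf-8">\n'
        f"  <title>Matches for {day.isoformat()}</title>\n"
        "</head>\n<body>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    # Placeholder data; a real script would fill this from the scraper.
    matches = [("Team A", "Team B"), ("Team C", "Team D")]
    with open("index.html", "w", encoding="utf-8") as f:
        f.write(build_homepage(matches, date.today()))

# Example crontab entry to run the script every day at 10:00
# (the paths are assumptions, adjust them to your server):
# 0 10 * * * /usr/bin/python3 /path/to/build_homepage.py
```

The five cron fields are minute, hour, day of month, month and day of week, so `0 10 * * *` means 10:00 every day in the server's time zone.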