So there is a contest going on where you can win tickets for a local festival. They display the number of tickets available, but you can only access that page after a login. I was writing a Java program to read the HTML content but can't find a way to log in first and then read the content.
So here's where I need help: what is the best way to have a script/program request a refresh of that page every once in a while, so that I can read the content and do some kind of statistics on the way the tickets are sorted?
Thanks
Yes, there is a way to achieve this. It actually comes from the testing side of Java: look into Selenium WebDriver. It will take a while to set up, but once that is done you just need to change parameters like the website URL, username, and password, and you will be able to log in to any website and render the HTML content. I learned from here
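To make that concrete, here is a minimal Java sketch with Selenium WebDriver; the URL, field names, and submit-button selector are assumptions and have to be adapted to the real login form:

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class TicketPageReader {
    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        try {
            // Hypothetical login page and field names - inspect the real form to find them.
            driver.get("https://example.com/login");
            driver.findElement(By.name("username")).sendKeys("myUser");
            driver.findElement(By.name("password")).sendKeys("myPass");
            driver.findElement(By.cssSelector("button[type='submit']")).click();

            // Once logged in, navigate to the page behind the login and grab its HTML.
            driver.get("https://example.com/tickets");
            String html = driver.getPageSource();
            System.out.println(html); // parse this however you like
        } finally {
            driver.quit();
        }
    }
}

For the periodic refresh asked about in the question, the fetching part can simply be wrapped in a loop that sleeps between iterations (for example with Thread.sleep).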
Found the best/simplest way: if you just want to store simple data, create a script with Greasemonkey and store the data using cookies.
The question is pretty clear I think, but I will elaborate on why I'm asking it.
I created a little blog engine based on OneNote. Basically, the blog configuration asks for access to OneNote. Then the user chooses a section under which the blog posts are stored.
There is a cron script that uses all this information to automatically get new pages, fetch the media, cache everything, and finally display the posts.
I chose OneNote because I own three Windows 8 computers and a Windows Phone, so OneNote was an easy choice, as I didn't want yet another application to manage my blog.
There is still a lot to do (as always with software...), but I want to make this more or less an open source project, so that other people can install it on their websites and link it directly to OneNote.
The only "big" obstacle for this now is that authentication against the OneNote API requires registering the application on Live Connect and specifying a redirect domain. So every user wishing to use this blog engine on their server will have to register their own application... That will look complicated for just a blog, especially if you're not tech-savvy.
Is there a way to "skip" or work around this requirement, even if it requires the user to make the section public (as it is for a blog, this doesn't seem too much to ask)?
Thank you in advance,
Cheers
Sounds like an awesome project! When you get it released be sure to let us know at #OneNoteDev.
Unfortunately, at this time there's no way to circumvent the requirement for Live Connect OAuth configuration. You could offer a hosted variant so only you need to worry about the LiveID configuration.
I am relatively new to web development, and I was hoping I could get some pointers about the feasibility of a feature I would like to implement. Is it possible to have a URL link that you can click on that contains login credentials for the website it is linking to, so as to bypass that website's login screen?
In other words, can I make a link from my website to Facebook that would let me log right in to my Facebook account from any computer? Meaning, if I don't have cookies to store my login info in, is it still possible to log in?
This is just a conceptual question, so any help would be appreciated! Thanks!
One reason why this is generally avoided, is because web servers often store the query string parameters in the access logs. And normally, you wouldn't want files on your server with a long list of usernames and passwords in clear text.
In addition, a query string containing a username and password could be used with a dictionary attack to guess valid login credentials.
Apart from those issues, as long as the request is made via HTTPS, it would be safe in transit.
It is possible to pass parameters in the URL through a GET request on the server, but one has to understand that the request would likely be made in clear text and thus isn't likely to be secure. There was a time where I did have to program a "silent" log-in using tokens, so it can be done in enterprise applications.
You used to be able to do this, but most browsers don't allow it anymore. You would never be able to do this with Facebook, only with something that uses browser (HTTP basic) auth, where the browser pops up a username/password dialog.
It was like this:
http://username:pass@myprotectedresource.com
What you might be able to do is whip up some JavaScript in a link that posts your username and password to the login page of Facebook. Not sure if it will work, because you might need to scrape the cookie/hidden fields from the login page itself.
It is possible for the site to block you because of missing cookies, an invalid nonce, or a wrong HTTP referrer, but it may work if their security is lax.
While it is possible, it is up to the site (in this case Facebook) to accept these values in the query string. There are certainly security issues to consider, and it isn't generally done.
That said, there are different options out there for single sign-on. This website uses OpenID for that.
I am working on an application that will have online and offline components and would like to get some opinions on how I am planning to do this. Feel free to give me some tough love if this is a ridiculous idea as I would like to learn as much as possible with this :o)
Here is an outline of what I am trying to accomplish...
Client portion does basic CRUD, but forms may change depending on what is in the online database.
Client portion can be used online or offline.
Client portion should be fairly easy to move from one machine to the next (ie I'd rather not have to set up a database on each client).
Server portion does not need to be synched in real time.
I'm considering making the client portion a series of HTML forms that read from and write to cookies. The forms will be generated using JS based on what is in the cookies. For example, a cookie may store things like a list of values that will be used in a select box on the form. When the forms are submitted they will write to cookies. The forms could be used to submit data that will likely change how the form is generated next time it is loaded... for example, I may have a form that will allow me to enter options that will be included in another form's select box.
The server portion will read these cookies, update the database and then update the cookies so that the forms are appropriately generated next time.
Does this sound nuts? Would I be better off looking into something like Google Gears? Any tips, advice or ideas would be greatly appreciated!
Thanks in advance :o)
Unless the online/offline thing is what will distinguish your application, I'd let Gears handle this. The general advice is to focus your effort on the parts of the functionality that distinguishes you, and let libraries handle the rest—assuming they handle it in a way that is acceptable for your app.
How can I update a web page remotely? Is there a web service, or can I do it via email? I have no direct access to the server.
We simply need to add an alert facility for emergencies. For example, a simple text message across the top of the home page saying "We are shut today due to bad weather".
Thanks
I can't say that I've caught exactly what you mean, but I will answer in a general manner:
1- If you are building the whole site from scratch: you can create your site with any CMS, like DotNetNuke or Joomla, which will allow you to log in and edit what you want.
2- If you are building just this page from scratch: you can build your page with online editing in mind. In this case I recommend building two pages, one for viewing the content and the other for online editing; you can use any HTML editor control like FCKeditor.
3- If you are dealing with an already-built page: it will be easier to build an administration page to which you can upload the new version of the content page, and the administration page will take care of replacing the content page.
Hope this helps; if not, please feel free to clarify your needs so we can help more.
Contact the hosting company that handles your DNS/name resolution and ask them to redirect DNS to another server of your choice serving the notice you wish people to see when they try to access your page.
On a general basis, yes, it's possible; that's what most blog engines and CMSes are for. It's also fairly easy to develop an ad-hoc program if all you need is to be able to put up an offline page.
If, however, you mean that you need to do this today without any access to the server, then contacting the person hosting your site or your DNS provider is indeed your best chance.
I'd suggest getting someone to put a Twitter widget on the page; then you can SMS/email or use a web browser to send your updates and they will automatically appear on the site.
Is it at all possible for you to get someone to do that for you? Twittercard can be used to generate the code to drop in.
It looks like this thread is a bit dated, but for anyone still looking for a way to update your site using email, you might want to check out https://www.sitemailcms.com/. It's a service I've developed to do just that.
I need to write a script that goes to a web site, logs in, navigates to a page, and downloads (and afterwards parses) the HTML of that page.
What I want is a standalone script, not a script that controls Firefox. I don't need any JavaScript support, just simple HTML navigation.
If nothing easy exists to do this... well, then something that acts through a web browser (Firefox or Safari; I'm on a Mac).
thanks
I've no knowledge of pre-built general purpose scrapers, but you may be able to find one via Google.
Writing a web scraper is definitely doable. In my very limited experience (I've written only a couple), I did not need to deal with login/security issues, but in Googling around I saw some examples that dealt with them - afraid I don't remember URLs for those pages. I did need to know some specifics about the pages I was scraping; having that made it easier to write the scraper, but, of course, the scrapers were limited to use on those pages. However, if you're just grabbing the entire page, you may only need the URL(s) of the page(s) in question.
Without knowing what language(s) would be acceptable to you, it is difficult to help much more. FWIW, I've done scrapers in PHP and Python. As Ben G. said, PHP has cURL to help with this; maybe there are more, but I don't know PHP very well. Python has several modules you might choose from, including lxml, BeautifulSoup, and HTMLParser.
Edit: If you're on Unix/Linux (or, I presume, Cygwin) you may be able to achieve what you want with wget.
If you wanted to use PHP, you could use the cURL functions to build your own simple web page scraper.
For an idea of how to get started, see: http://us2.php.net/manual/en/curl.examples-basic.php
This is PROBABLY a dumb question, since I have no knowledge of Macs, but what language are we talking about here? Also, is this a website that you have control over, or something like a spider bot that Google might use when checking page content? I know that in C# you can load in objects from other sites using an HttpWebRequest and a stream reader... In JavaScript (this would only really work if you know what is SUPPOSED to be there) you could open the web page as the source of an iframe and, using JavaScript, traverse the contents of all the elements on the page... or better yet, use jQuery.
"I need to write a script that goes to a web site, logs in, navigates to a page, and downloads (and afterwards parses) the HTML of that page."
To me this just sounds like a POST or GET request to the URL of the login page could do the job. With the proper username and password parameters (depending on the form input names used on the page) set in the request, the result will be the HTML of the page, which you can then parse as you please.
This can be done with virtually any language. What language do you want to use?
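For instance, here is a rough Java sketch using the built-in java.net.http client (Java 11+); the login URL and the "username"/"password" field names are placeholders that must match the actual form:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LoginAndFetch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Placeholder field names and URL - check the login form's input names and action.
        // Values with special characters should be URL-encoded (java.net.URLEncoder).
        String form = "username=myUser&password=myPass";
        HttpRequest login = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/login"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();

        HttpResponse<String> response = client.send(login, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // the HTML you can then parse
    }
}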
I recently did exactly what you're asking for in a C# project. If login is required, your first request is likely to be a POST that includes credentials. The response will usually include cookies, which persist the identity across subsequent requests. Use Fiddler to look at what form data (field names and values) is being posted to the server when you log on normally with your browser. Once you have this, you can construct an HttpWebRequest with the form data and store the cookies from the response in a CookieContainer.
The next step is to make the request for the content you actually want. This will be another HttpWebRequest with the CookieContainer attached. The response can be read by a StreamReader, which you can then read and convert to a string.
Each time I've done this it has usually been a pretty laborious process to identify all the relevant form data and recreate the requests manually. Use Fiddler extensively and compare the requests your browser is making when using the site normally with the requests coming from your script. You may also need to manipulate the request headers; again, use Fiddler to construct these by hand, get them submitting correctly and the response coming back as you expect, then code it. Good luck!
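The same request-then-cookie flow can also be sketched in Java rather than C#: a CookieManager plays roughly the role the CookieContainer plays in the C# description, carrying the session cookie from the login response over to the follow-up request. The URLs and field names below are placeholders to be replaced with whatever Fiddler shows the browser actually sending:

import java.net.CookieManager;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class AuthenticatedScraper {
    public static void main(String[] args) throws Exception {
        // The CookieManager stores cookies from the login response and replays them
        // automatically on later requests made through the same client.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .build();

        // Step 1: POST the login form (placeholder URL and field names).
        String form = "username=myUser&password=myPass";
        HttpRequest login = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/login"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
        client.send(login, HttpResponse.BodyHandlers.ofString());

        // Step 2: request the protected page; the session cookie goes along automatically.
        HttpRequest page = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/members/tickets"))
                .build();
        HttpResponse<String> response = client.send(page, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}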