I am currently working on a small PHP library that lets users pull data from the Google Play Developer Console and insert it into a database for later use.
To achieve this, I authenticate against the corresponding Google service with GET and POST requests (this part still works) and then make various POST requests to get all the data I need.
Everything was working fine: I have had the script itself functioning (fetching JSON responses) for two or three weeks, and I have been running it daily since then with no problems.
Today I tried to run it again, and as the JSON response to any POST request I'm getting this:
I swear this is no fake, yet it's quite scary. Is Google trolling me right now?
Plus, the web version of the console still works; it just seems that my requests from outside it don't.
I'm working on localhost, so the chance of my having been hacked is near (if not) zero, and I'm really worried that all this work was for nothing.
If Google is actually trying to stop me from fetching data, and make me rage while they're at it, they are doing it right.
I've been googling this with all sorts of keywords, still no luck.
Actually, I have solved this problem. Since Google seems to change the format of the console's JSON responses regularly, my parsing code and regular expressions were no longer extracting the correct identifiers for authentication. I still think Google is trolling users and developers of unofficial APIs, but I have found a solution: manually re-check the parsing functions and adapt them until every value is extracted correctly again. It's working once more.
It's a pain, but if someone ever hits this problem and ends up here, you know what to do!
(I can't be more precise, since these changes appear to be completely random, but those are the guidelines.)
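For anyone who lands here with the same problem, the shape of the fix can be sketched in a few lines. The sketch below is Python rather than PHP, and the URL, the "xsrfToken" field name, and the regex are purely illustrative assumptions (not the console's real internals); the point is to extract the identifier with a pattern and fail loudly the moment the pattern stops matching:

```python
import re
import sys

import requests  # third-party: pip install requests

# Hypothetical values for illustration -- not the real Play Console internals.
PAGE_URL = "https://play.google.com/apps/publish/"
TOKEN_RE = re.compile(r'"xsrfToken"\s*:\s*"([^"]+)"')

def extract_token(html: str) -> str:
    match = TOKEN_RE.search(html)
    if match is None:
        # The format changed again: stop immediately instead of firing
        # broken POST requests and getting cryptic responses back.
        sys.exit("Auth token not found; the response format has changed, "
                 "so update TOKEN_RE.")
    return match.group(1)

if __name__ == "__main__":
    html = requests.get(PAGE_URL).text
    print(extract_token(html))
```

Failing fast like this at least tells you immediately that the format changed, instead of leaving you to puzzle over strange responses from later requests.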
I have no previous experience with writing Google Chrome plugins, which is why I am starting here to see whether what I want to accomplish is possible/reasonable. I do, however, have fairly broad programming experience in general.
What I want:
I want some kind of "trigger" to go off when a new Chrome notification (you know, those little pop-ups above the system tray) appears. I want to execute some script/code depending on what information the notification contains, so that, for example, an alarm could go off if I receive an email from a certain user with a certain keyword in the subject and my Gmail Notifier extension shows a pop-up.
This is, however, just an example; I have a bunch of ideas for different notifications from different extensions and websites, so don't get caught up on that particular one.
When I look at the Chrome notifications API, I see that there is a getAll method that supposedly gets all the "notifications in the system", but I cannot find any event for new notifications.
I suppose one possibility would be to poll with getAll a couple of times per second (it needs to be really fast for some implementations I have in mind), but that feels very hacky.
Is there any way to easily access new Notifications programmatically in Chrome?
(I'm open to all solutions, programming languages and such...)
Well, I searched long and hard and got involved with the Chromium dev group and asked around there. As far as I could figure out, there was no reasonable way of accessing all notifications programmatically.
So what I ended up doing was simply downloading the Chromium source code and building my own custom version with a very crude API added. It worked like a charm and was not as complicated as one might think.
Cheers!
I am trying to retrieve data from a website for which the parameters you need to set do not show up in the URL, i.e. http://www.vmm.be/webrap/ibmcognos/cgi-bin/cognosisapi.dll?b_action=cognosViewer&ui.action=run&ui.object=%2fcontent%2ffolder[%40name%3d%27Water%27]%2ffolder[%40name%3d%27Afvalwater%27]%2freport[%40name%3d%27Individuele%20analyseresultaten%20per%20RWZI%27]&ui.name=Individuele%20analyseresultaten%20per%20RWZI&run.outputFormat=HTML&run.prompt=false&ui.backURL=%2fwebrap%2fibmcognos%2fcgi-bin%2fcognosisapi.dll%3fb_action%3dxts.run%26m%3dportal%2fcc.xts%26m_folder%3di5DDA04E5A00C4B6AB6DF44BB4FAD7CEC&p_RwziNr=51&run.prompt=false
How can I extract the data for different years and parameters in a programmatic way?
I am using MATLAB's urlread, but since the data I want to import does not show up in the HTML code (I checked this with the Web Developer Toolbar in Firefox), nothing is read in. I have no experience with websites, only MATLAB and C programming, so I have no idea how the data can be shown in the browser without appearing in the HTML source. Could someone point me in the right direction on how to get this job done? Is it possible at all? I hope so, because I will have to repeat this for around 500 measuring stations, each covering 10 years, so I am not planning to copy the required data manually as I did before, when I only needed one station.
It turns out it is not possible to do what I want in MATLAB alone. I did, however, manage to get the required data programmatically using a combination of Selenium, C#, and ChromeDriver. It's slow, but it works, and I can do other things in the meantime, so I can recommend it to anyone who is downloading data from servers in a tedious way.
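For anyone who prefers Python, the same approach carries over directly to Selenium's Python bindings. A minimal sketch, assuming ChromeDriver is installed; the element id "reportTable" is made up, so inspect the real report page to find the element that actually holds the data:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

REPORT_URL = "http://www.vmm.be/webrap/ibmcognos/cgi-bin/cognosisapi.dll?..."  # full URL as in the question

driver = webdriver.Chrome()  # needs chromedriver on the PATH
try:
    driver.get(REPORT_URL)
    # Wait until the page's JavaScript has rendered the report table;
    # "reportTable" is a hypothetical id.
    table = WebDriverWait(driver, 30).until(
        EC.presence_of_element_located((By.ID, "reportTable"))
    )
    for row in table.find_elements(By.TAG_NAME, "tr"):
        print([td.text for td in row.find_elements(By.TAG_NAME, "td")])
finally:
    driver.quit()
```

Looping this over the 500 stations and 10 years would then be a matter of driving the page's form controls the same way.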
A project I am working on makes multiple calls to the YouTube gdata API. Usually, when I am using the site heavily and going through several pages that use the API, I simply stop getting any returns from it. Those parts of the site load fine, but the things usually gathered through the API suddenly cease to load for a while. Is this because the API can't handle successive calls from me like that, or because my code is bad?
I know this probably isn't the most constructive answer for Stack Overflow, so please let me know if I need to remove it.
I think you hit some kind of request throttling. As noted in the FAQ (https://developers.google.com/youtube/faq#operation_limits), YouTube limits various operations to prevent too many requests and too much data being pulled.
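If throttling is indeed the cause, waiting and retrying with an increasing delay usually gets things flowing again. A minimal Python sketch of the idea; the status codes and the commented-out feed URL are assumptions, so check the FAQ above for the exact limits:

```python
import time

import requests  # third-party: pip install requests

def fetch_with_backoff(url, params=None, max_retries=5):
    """GET a URL, sleeping exponentially longer after each throttled reply."""
    delay = 1.0
    for attempt in range(max_retries):
        response = requests.get(url, params=params)
        # 403/429 are the usual "slow down" signals (assumption: the gdata
        # API signals quota problems with one of these).
        if response.status_code not in (403, 429):
            response.raise_for_status()
            return response
        time.sleep(delay)
        delay *= 2  # back off: 1 s, 2 s, 4 s, ...
    raise RuntimeError("still throttled after %d attempts" % max_retries)

# Hypothetical usage against a gdata-style feed URL:
# feed = fetch_with_backoff("https://gdata.youtube.com/feeds/api/videos",
#                           params={"q": "example", "alt": "json"})
```

Spacing your calls out (or caching responses you have already seen) also reduces how often you hit the limit in the first place.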
This all goes back to some of my original questions about trying to "index" a webpage. I was originally trying to do it specifically in Java, but now I'm opening it up to any language.
Earlier I tried using HtmlUnit and other approaches in Java to get the information I needed, but wasn't successful.
The information I need to get from the webpage is very easy to find with Firebug, and I was wondering whether there is any way to duplicate what Firebug does, tailored to my needs. When I open Firebug, I go to the Net tab and then the XHR tab, and it shows a constantly updating list of requests carrying the information the server is pushing out. When I click on a request and look at the response, it has the information I need, and all of this happens without the webpage ever refreshing, which is what I am trying to reproduce (not to mention that the values being output do not show up in the HTML of the webpage).
So, can anyone point me in the right direction on how they would go about this?
(I will be putting this information into a MySQL database, which is why I added it as a tag; I still don't know which language would be best to use, though.)
Edit: These requests to the server are somewhat random, and although Firebug shows the URL they come from, when I try to visit that URL in Firefox it tries to open something called application/json.
Jon, I am fairly certain that you are confusing several technologies here, and the simple answer is that it doesn't work like that. Firebug works specifically because it runs as part of the browser, and (as far as I am aware) runs with a more permissive set of privileges than a JavaScript script embedded in a page.
JavaScript is, for the record, different from Java.
If you are trying to log AJAX calls, your best bet is for the server-side application to log the invoking IP, user agent, cookies, and complete URI to your database on receipt. That will be far better than any client-side solution.
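To make that concrete, here is a rough Python sketch of such server-side logging. Flask, the route name, and the table layout are all assumptions for illustration, and SQLite stands in for the MySQL database mentioned in the question:

```python
import sqlite3
from flask import Flask, request  # third-party: pip install flask

app = Flask(__name__)
# SQLite keeps the sketch self-contained; swap in MySQL for real use.
db = sqlite3.connect("ajax_log.db", check_same_thread=False)
db.execute("""CREATE TABLE IF NOT EXISTS ajax_log
              (ip TEXT, user_agent TEXT, cookies TEXT, uri TEXT)""")

@app.route("/api/data")
def data():
    # Log the invoking IP, user agent, cookies, and complete URI on receipt.
    db.execute("INSERT INTO ajax_log VALUES (?, ?, ?, ?)",
               (request.remote_addr,
                request.headers.get("User-Agent", ""),
                str(dict(request.cookies)),
                request.url))
    db.commit()
    return {"status": "logged"}  # a real handler would return its data here

if __name__ == "__main__":
    app.run()
```

Every AJAX call to /api/data then leaves a row behind, with no cooperation needed from the client at all.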
On a side note: it is not good practice to assume that everyone has read the other questions you have posted. Generally speaking, "we" have not. ("We" is in quotes because, well, you know.) It also wouldn't hurt for you to go back and accept a few answers to questions you've asked.
So, the problem is:
With someone else's web-page, hosted on someone else's server, you want to extract select information?
Using cURL, Python, Java, etc. is too painful because the data is continually updating via AJAX (requires a JS interpreter)?
Plain jQuery or iframe intercepts will not work because of cross-site security restrictions.
Ditto, a bookmarklet -- which has the added disadvantage of needing to be manually triggered every time.
If that's all correct, then there are 3 other approaches:
Develop a browser plugin... More difficult, but has the power to do everything in one package.
Develop a userscript. This is much easier to do, and technologies such as Greasemonkey deal with the cross-site security problem.
Use a browser macro technology such as Chickenfoot. These all have plusses and minuses -- which I won't get into.
Using Greasemonkey:
Depending on the site, this can be quite easy. The big drawback, if you want to record data, is that you need your own web-server and web-application. But this server can be locally hosted on an XAMPP stack, or whatever web-application technology you're comfortable with.
Sample code that intercepts a page's AJAX data is at: Using Greasemonkey and jQuery to intercept JSON/AJAX data from a page, and process it.
Note that if the target page does NOT use jQuery, the library in use (if any) usually has similar intercept capabilities. Or, listening for DOMSubtreeModified always works, too.
If you're using a library such as jQuery, you may have an option such as the jQuery ajaxSend and ajaxComplete callbacks. These could post requests to your server to log these events (being careful not to end up in an infinite loop).
I need to write a script that goes to a website, logs in, navigates to a page, and downloads (and afterwards parses) the HTML of that page.
What I want is a standalone script, not a script that controls Firefox. I don't need any JavaScript support for this, just simple HTML navigation.
If nothing exists that makes this easy... well, then something that acts through a web browser would do (Firefox or Safari; I'm on a Mac).
thanks
I've no knowledge of pre-built general purpose scrapers, but you may be able to find one via Google.
Writing a web scraper is definitely doable. In my very limited experience (I've written only a couple), I did not need to deal with login/security issues, but while Googling around I saw some examples that did; I'm afraid I don't remember the URLs of those pages. I did need to know some specifics about the pages I was scraping; knowing those made the scrapers easier to write, but, of course, limited them to those pages. However, if you're just grabbing the entire page, you may only need the URL(s) of the page(s) in question.
Without knowing what language(s) would be acceptable to you, it is difficult to help much more. FWIW, I've written scrapers in PHP and Python. As Ben G. said, PHP has cURL to help with this; there may be more options, but I don't know PHP very well. Python has several modules you might choose from, including lxml, BeautifulSoup, and HTMLParser.
Edit: If you're on Unix/Linux (or, I presume, Cygwin), you may be able to achieve what you want with wget.
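To give a feel for the Python route, here is a minimal parsing sketch using BeautifulSoup, one of the modules mentioned above. The inline HTML and the "results" class name are made up; in practice you would feed it the page you fetched and adapt the selector:

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

# Stand-in for the HTML you fetched from the site.
html = """
<html><body>
  <table class="results">
    <tr><td>Station A</td><td>42.0</td></tr>
    <tr><td>Station B</td><td>13.7</td></tr>
  </table>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
# "table.results" is a hypothetical selector -- inspect the real page to
# find the element that wraps the data you care about.
for row in soup.select("table.results tr"):
    name, value = (td.get_text(strip=True) for td in row.find_all("td"))
    print(name, value)
```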
If you wanted to use PHP, you could use the cURL functions to build your own simple web page scraper.
For an idea of how to get started, see: http://us2.php.net/manual/en/curl.examples-basic.php
This is PROBABLY a dumb question, since I have no knowledge of Macs, but what language are we talking about here? Also, is this a website that you have control over, or is it more like the spider bot Google might use when checking page content? I know that in C# you can load objects from other sites using an HttpWebRequest and a stream reader... In JavaScript (this would only really work if you know what is SUPPOSED to be there), you could open the web page as the source of an iframe and traverse the contents of all the elements on the page with JavaScript... or, better yet, use jQuery.
I need to write a script that goes to a website, logs in, navigates to a page, and downloads (and afterwards parses) the HTML of that page.
To me this just sounds like a POST or GET request to the URL of the login page could do the job. With the proper username and password parameters (depending on the form input names used on the page) set in the request, the result will be the HTML of the page, which you can then parse as you please.
This can be done with virtually any language. What language do you want to use?
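In Python, for instance, the whole log-in-then-fetch flow is only a few lines with the requests library. A hedged sketch; the URLs and the username/password field names are placeholders, so check the login form's actual input names in your browser:

```python
import requests  # third-party: pip install requests

LOGIN_URL = "https://example.com/login"   # placeholder
PAGE_URL = "https://example.com/private"  # placeholder

# A Session keeps the cookies set by the login response and sends them
# along with every later request, which is what keeps you logged in.
with requests.Session() as session:
    session.post(LOGIN_URL, data={
        "username": "me",      # the keys must match the login form's
        "password": "secret",  # <input name="..."> attributes
    })
    html = session.get(PAGE_URL).text

print(html[:500])  # hand this off to your HTML parser of choice
```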
I recently did exactly what you're asking for in a C# project. If login is required, your first request is likely to be a POST that includes credentials. The response will usually include cookies, which persist the identity across subsequent requests. Use Fiddler to look at what form data (field names and values) is being posted to the server when you log on normally with your browser. Once you have this, you can construct an HttpWebRequest with the form data and store the cookies from the response in a CookieContainer.
The next step is to make the request for the content you actually want. This will be another HttpWebRequest with the CookieContainer attached. The response can be read by a StreamReader, which you can then use to convert it to a string.
Each time I've done this, it has been a pretty laborious process to identify all the relevant form data and recreate the requests manually. Use Fiddler extensively and compare the requests your browser makes when using the site normally with the requests coming from your script. You may also need to manipulate the request headers; again, use Fiddler to construct these by hand, get them submitting correctly and the response coming back as you expect, then code it up. Good luck!