A project I am working on makes multiple calls to the YouTube gdata API. Usually when I am using the site heavily and going through several pages that use the API, I just stop getting any returns from the API. The rest of the site loads fine, but anything normally fetched through the API suddenly stops loading for a while. Is this because the API can't handle successive calls from me like that, or because my code is bad?
I know this probably isn't the most constructive answer for Stack Overflow, so please let me know if I need to remove it.
I think you hit some kind of request throttling. As noted in the FAQ (https://developers.google.com/youtube/faq#operation_limits), YouTube limits various operations to prevent too many requests and too much data being pulled.
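If that is what is happening, the usual remedy is to back off and retry instead of hammering the API. A minimal sketch of the idea, not anything YouTube-specific: the retry limit and delays here are arbitrary, and the URL would be whichever gdata feed you are calling.

    // Retry a throttled gdata call with exponential backoff.
    function fetchWithBackoff(url, attempt, callback) {
        attempt = attempt || 0;
        var xhr = new XMLHttpRequest();
        xhr.open('GET', url, true);
        xhr.onload = function () {
            // On an error status, assume throttling: wait, then try again.
            if (xhr.status >= 400 && attempt < 5) {
                var delay = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s, 8s, 16s
                setTimeout(function () {
                    fetchWithBackoff(url, attempt + 1, callback);
                }, delay);
            } else {
                callback(xhr.responseText);
            }
        };
        xhr.send();
    }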
I have found that it is easy to find the API when a website uses client-side rendering. Unfortunately, the website I am looking at uses server-side rendering, and I am trying to find the API. Articles on finding an API only ever explain how to do it for client-side rendered websites and then dismiss server-side rendering "because it is a lot more complicated".
My question is whether it is even possible to find the API behind a server-side rendered website, and whether anybody has a starting point for me, since I am unable to find anything.
You can't.
You don't have access to the server-side code so you have no way of telling where it is getting the data from.
Maybe it is a public web service. Maybe it is a private web service. Maybe it is direct access to a database. Maybe it is reading data from static files on the server's local file system. You have no way of telling.
I am currently working on a small PHP library that lets users pull data from the Google Play Developer Console and insert it into a database for future use.
To achieve this, I authenticate against the corresponding Google service with GET and POST requests (this part still works) and then issue various POST requests to get all the data I need.
Everything was working fine: the script itself (fetching JSON responses) has been functioning for two or three weeks, and I have been running it daily since then with no problems.
Today, I tried to launch it again, and as the JSON response to any POST request, I'm getting this:
I swear this is no fake, yet it's quite scary. Is Google trolling me right now?
Plus, the web version of the console still works; it just seems that my requests from outside don't.
I'm working on localhost, so the chance that this is a hack is near (if not exactly) zero, and I'm really worried that I did all this work for nothing.
If Google is actually trying to stop me from fetching data (and make me rage in the process), they are doing it right.
I've been googling this with all sort of keywords, still no luck.
Actually, I have solved this problem. Google seems to regularly change the format of the console's JSON responses, so my parsing and regular expressions were no longer picking out the correct identifiers for authentication. I still think Google is trolling users and developers of unofficial APIs, but I have found a solution: manually check my parsing functions and adapt them to extract each value correctly, and it's working again.
It's a pain, but if someone ever gets this problem and ends here, you know what to do!
(I can't be more precise, since these changes are totally random, but those are guidelines.)
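For what it's worth, one way to make this kind of breakage loud instead of silent is to validate the parsed response before using it. The library in question is PHP, but the idea sketched in JavaScript looks like this; the key names are placeholders, not Google's actual fields.

    // Fail loudly the moment the response format drifts, instead of
    // silently extracting garbage with stale regular expressions.
    function parseConsoleResponse(raw) {
        var data = JSON.parse(raw);
        // 'result' and 'xsrf' are hypothetical keys -- list whichever
        // identifiers your own parsing currently depends on.
        ['result', 'xsrf'].forEach(function (key) {
            if (!(key in data)) {
                throw new Error('Response format changed: missing "' + key + '"');
            }
        });
        return data;
    }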
I've built a simple RESTful API for game developers to do things like authenticate against our user store, post scores, and get leaderboards. It's not perfect, but it's very well documented and is working pretty well for 50% of game developers.
The other 50% seem to be Flash developers who just don't get the idea of a RESTful API. I really don't want to build a wrapper for these developers—I'm not a Flash developer, have no interest in becoming one, and really want to keep everything about our API technology-agnostic.
Can anyone recommend a good tutorial on how to consume a RESTful API for Flash developers?
Given my experience with this problem, I suspect it may have less to do with the developers not understanding the API and more to do with the limitations of the Flash player: it simply doesn't support (at least, not without a proxy) the full range of HTTP methods you've probably used in your RESTful API, such as PUT and DELETE.
Flash only supports GET and POST.
If this is indeed the issue, you'll need to offer an alternative to the missing HTTP verbs. I've worked around this in the past by using POST and adding a 'method=http_method' parameter to the query string.
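For example, a DELETE could be tunnelled through POST like this. The endpoint is hypothetical, and the example is in JavaScript for brevity; an ActionScript client would do the same thing with URLRequest. The server then has to route on the parameter.

    // Flash can only send GET and POST, so the real verb rides along
    // in the query string for the server to dispatch on.
    var xhr = new XMLHttpRequest();
    xhr.open('POST', '/api/scores/42?method=DELETE', true);
    xhr.onload = function () {
        // The server should treat this POST as DELETE /api/scores/42.
        console.log(xhr.status, xhr.responseText);
    };
    xhr.send();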
Another gotcha could be your lack of a crossdomain.xml file.
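Flash refuses to make cross-domain requests unless the API host serves a policy file from its document root. The format below is Adobe's standard one; the wide-open domain="*" is fine for testing, but you would normally restrict it to your game developers' domains.

    <?xml version="1.0"?>
    <!DOCTYPE cross-domain-policy SYSTEM
      "http://www.adobe.com/xml/dtds/cross-domain-policy.dtd">
    <cross-domain-policy>
      <!-- Wide open: any Flash client may call this API. Lock down in production. -->
      <allow-access-from domain="*" />
    </cross-domain-policy>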
This all goes back to some of my original questions of trying to "index" a webpage. I was originally trying to do it specifically in java but now I'm opening it up to any language.
Before, I tried using HtmlUnit and other methods in Java to get the information I needed, but wasn't successful.
The information I need from the webpage is very easy to find with Firebug, and I was wondering whether there is any way to duplicate what Firebug is doing, specifically for my needs. When I open Firebug, I go to the Net tab and then the XHR sub-tab, which shows a constantly updating list of requests carrying the information the server is sending. When I click on a request and look at the response, it has the data I need, and all of this happens without the webpage ever refreshing, which is what I am trying to replicate (not to mention the values it outputs do not show up in the HTML of the page).
So can anyone point me in the right direction of how they would go about this?
(I will be putting this information into a MySQL database, which is why I added it as a tag; I still don't know which language would be best to use, though.)
Edit: these requests to the server are somewhat random, and although Firebug shows the URL they come from, when I try to visit that URL in Firefox it prompts me to open something of type application/json.
Jon, I am fairly certain that you are confusing several technologies here, and the simple answer is that it doesn't work like that. Firebug works specifically because it runs as part of the browser, and (as far as I am aware) runs under a more permissive set of instructions than a JavaScript script embedded in a page.
JavaScript is, for the record, different from Java.
If you are trying to log AJAX calls, your best bet is for the server-side application to log the invoking IP, user agent, cookies, and complete URI to your database on receipt. That will be far better than any client-side solution.
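As a rough sketch of what that logging could look like, assuming a Node/Express backend (the framework choice and field names are mine, not a given):

    var express = require('express');
    var cookieParser = require('cookie-parser');

    var app = express();
    app.use(cookieParser());

    // Record every incoming AJAX call before it reaches the route handlers.
    app.use(function (req, res, next) {
        var entry = {
            ip: req.ip,
            userAgent: req.get('User-Agent'),
            cookies: req.cookies,
            uri: req.originalUrl,
            at: new Date().toISOString()
        };
        console.log(entry); // replace with an INSERT into your database
        next();
    });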
On a note more related to your question, it is not good practice to assume that everyone has read other questions you have posted. Generally speaking, "we" have not. "We" is in quotes because, well, you know. :) It also wouldn't hurt for you to go back and accept a few answers to questions you've asked.
So, the problem is this:
With someone else's web-page, hosted on someone else's server, you want to extract select information?
Using cURL, Python, Java, etc. is too painful because the data is continually updating via AJAX (requires a JS interpreter)?
Plain jQuery or iframe intercepts will not work because of cross-site (same-origin) security restrictions.
Ditto, a bookmarklet -- which has the added disadvantage of needing to be manually triggered every time.
If that's all correct, then there are 3 other approaches:
Develop a browser plugin... More difficult, but has the power to do everything in one package.
Develop a userscript. This is much easier to do, and technologies such as Greasemonkey get around the cross-site (same-origin) restrictions.
Use a browser macro technology such as Chickenfoot. These all have pluses and minuses -- which I won't get into.
Using Greasemonkey:
Depending on the site, this can be quite easy. The big drawback, if you want to record data, is that you need your own web server and web application. But this server can be locally hosted on an XAMPP stack, or whatever web-application technology you're comfortable with.
Sample code that intercepts a page's AJAX data is at: Using Greasemonkey and jQuery to intercept JSON/AJAX data from a page, and process it.
Note that if the target page does NOT use jQuery, the library in use (if any) usually has similar intercept capabilities. Or, listening for DOMSubtreeModified always works, too.
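A bare-bones version of that last approach, as it might appear in a userscript -- the element ID is made up, and note that DOMSubtreeModified is a legacy mutation event (MutationObserver is its modern replacement):

    // Watch a container that the page's AJAX code keeps repopulating.
    // '#results' is a hypothetical ID -- substitute the real container.
    var container = document.querySelector('#results');

    container.addEventListener('DOMSubtreeModified', function () {
        // Re-read the freshly injected data on every change.
        console.log('Updated markup:', container.innerHTML.length, 'chars');
    });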
If you're using a library such as jQuery, you have options such as the jQuery ajaxSend and ajaxComplete callbacks. These can post the intercepted requests to your server for logging (being careful not to end up in an infinite loop).
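For instance, something along these lines, where the /log endpoint is a stand-in for a server you control:

    // Log every AJAX call the target page makes to our own server.
    $(document).ajaxComplete(function (event, xhr, settings) {
        // Guard: never log the logging request itself, or we loop forever.
        if (settings.url.indexOf('/log') !== -1) {
            return;
        }
        $.post('/log', {
            url: settings.url,
            response: xhr.responseText
        });
    });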
I have access to a web interface for a large amount of data. This data is usually accessed by people who only want a handful of items. The company that I work for wants me to download the whole set. Unfortunately, the interface only allows you to see fifty elements (of tens of thousands) at a time, and segregates the data into different folders.
To complicate matters, all of the data lives at the same URL, which updates itself dynamically through AJAX calls to an .aspx backend. Writing a simple cURL script to grab the data is difficult because of this and because of the authentication required.
How can I write a script that navigates around a page, triggers ajax requests, waits for the page to update, and then scrapes the data? Has this problem been solved before? Can anyone point me towards a toolkit?
Any language is fine, I have a good working knowledge of most web and scripting languages.
Thanks!
I usually just use a program like Fiddler or Live HTTP Headers and watch what's happening behind the scenes. 99.9% of the time you'll see that there's a query string or REST call with a very simple pattern that you can emulate.
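Once the pattern is visible in the capture, replaying it is straightforward. A sketch, with a made-up URL and parameters standing in for whatever Fiddler actually shows (the same replay works from any HTTP client, not just a browser script):

    // Replay the request exactly as observed in Fiddler / Live HTTP Headers.
    var xhr = new XMLHttpRequest();
    xhr.open('GET', '/data.aspx?folder=12&page=3&pageSize=50', true);
    xhr.onload = function () {
        // Same payload the page's own AJAX receives, ready to parse and store.
        console.log(xhr.responseText);
    };
    xhr.send();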
If you need to directly control a browser
Have you thought of using a tool like WatiN? It is actually meant for UI testing, but I suppose you could use it to programmatically make requests anywhere and act upon the responses.
If you just need to get the data
But since you can do whatever you please, you can simply make ordinary web requests from a desktop application and parse the results, customizing everything to your own needs and simulating AJAX requests at will by setting certain request headers.
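Many AJAX backends only answer requests that look like they came from the page's own script, so mimic those headers. A sketch with a hypothetical endpoint and form fields, shown as an in-browser request for brevity; a desktop app would set the same headers with its HTTP client of choice:

    // Make the request indistinguishable from the page's own AJAX call.
    var xhr = new XMLHttpRequest();
    xhr.open('POST', '/GetData.aspx', true);
    xhr.setRequestHeader('X-Requested-With', 'XMLHttpRequest');
    xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
    xhr.onload = function () {
        console.log(xhr.responseText);
    };
    xhr.send('folder=12&page=1');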
Maybe this?
Website scraping using jquery and ajax
http://www.kelvinluck.com/2009/02/data-scraping-with-yql-and-jquery/