How to extend AFNetworking 2.0 to perform request combining - afnetworking-2

I have a UI where the same image URL could be requested by several UIImageViews at varying times. Obviously if a request from one of them has finished then returning the cached version works as expected. However, especially with slower networks, I'd like to be able to piggy-back requests for an image URL onto any currently running/waiting HTTP request for the same URL.
On an HTTP server this is called request combining, and I'd love to do the same on the client: combine the different requests for the same URL into a single request and then call back separately to each of the callers. The requests for that URL don't happen to start at the same time.
What's the best way to accomplish this?
I think re-writing UIImageView+AFNetworking might be the easiest way:
check the af_sharedImageRequestOperationQueue to see if it already has an operation with the same request
if there is already such an operation queued or running, add myself to some list of callbacks/blocks to be called on success/failure
if there is no such operation, create it as normal
in setCompletionBlockWithSuccess, call each of the registered blocks in turn.
Any simpler alternatives?

I encountered a similar problem and decided that your way was the most straightforward. One added bit of complexity is that these downloads require special credentials and so must go through their own operation queue. Here's the code from my UIImageView category to check whether a particular URL is inflight:
NSUInteger foundOperation = [[ConnectionManager sharedConnectionManager].operationQueue.operations indexOfObjectPassingTest:^BOOL(AFHTTPRequestOperation *obj, NSUInteger idx, BOOL *stop) {
    BOOL URLAlreadyInFlight = [obj.request.URL.absoluteString isEqualToString:URL.absoluteString];
    if (URLAlreadyInFlight) {
        NSBlockOperation *updateUIOperation = [NSBlockOperation blockOperationWithBlock:^{
            [[NSOperationQueue mainQueue] addOperationWithBlock:^{
                self.image = [[ImageCache sharedImageCache] cachedImageForURL:URL];
            }];
        }];
        // Makes updating the UI dependent on the completion of the matching operation.
        [updateUIOperation addDependency:obj];
    }
    return URLAlreadyInFlight;
}];
Were you able to come up with a better solution?
EDIT: Well, it looks like my method of updating the UI just can't work, as the operation's completion blocks are run asynchronously, so the operation finishes before the blocks are run. However, I was able to modify the image cache to be able to add callbacks for when certain URLs are cached, which seems to work correctly. So this method will properly detect when certain URLs are in flight and be able to take action with that knowledge.

Related

Data Studio connector making multiple calls to API when it should only be making 1

I'm finalizing a Data Studio connector and noticing some odd behavior with the number of API calls.
Where I'm expecting to see a single API call, I'm seeing multiple calls.
In my Apps Script I'm keeping a simple tally which increments by 1 on every URL fetch, and that gives me the correct number I expect to see for getData().
However, in my API monitoring logs (using Runscope) I'm seeing multiple API requests for the same endpoint, and varying numbers of requests for different endpoints within a single getData() call (they should all be the same).
I can't post the code here (client project) but it's substantially the same framework as the Data Connector code on Google's docs. I have caching and backoff implemented.
Looking for any ideas or if anyone has experienced something similar?
Thanks
Per this reference, GDS will also perform semantic type detection if you aren't explicitly defining this property for your fields. When the request is for semantic type detection, it will include sampleExtraction: true.
When Data Studio executes the getData function of a community connector for the purpose of semantic detection, the incoming request will contain a sampleExtraction property which will be set to true.
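Per the answer above, explicitly defining each field's type is what avoids those extra detection requests. If you're on the newer DataStudioApp Fields service, that is done via setType(); a minimal sketch (the field ids and names are made-up examples):
var cc = DataStudioApp.createCommunityConnector();

function getFields() {
  var fields = cc.getFields();
  var types = cc.FieldType;
  // Explicit types mean GDS does not have to sample your data to guess them.
  fields.newDimension().setId('country').setName('Country').setType(types.COUNTRY);
  fields.newMetric().setId('sessions').setName('Sessions').setType(types.NUMBER);
  return fields;
}

function getSchema(request) {
  return { schema: getFields().build() };
}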
If the GDS report includes multiple widgets with different dimensions/metrics configuration then GDS might fire multiple getData calls for each of them.
Kind of a late answer but this might help others who are facing the same problem.
The widgets / search filters attached to a graph issue getData calls of their own. If your custom adapter retrieves data via API calls to third-party services, and that data is agnostic to the request.fields property sent by GDS, then these API calls are multiplied by N+1 (where N = the number of widgets / search filters your report implements).
I could not find an official solution for this either, so I invented a workaround using cache.
The graph's request for getData (typically requesting more fields than the search filters) will be the only one allowed to query the API endpoint. Before starting to do so, it will store a key in the cache: "cache_{hashOfReportParameters}_building" => true.
if (enableCache) {
  cache.putString("cache_{hashOfReportParameters}_building", 'true');
  Logger.log("Cache is being built...");
}
It will retrieve the API responses, paginating in a loop, and buffer the results.
Once it has finished, it will delete the cache key "cache_{hashOfReportParameters}_building" and cache the final merged results it buffered so far inside "cache_{hashOfReportParameters}_final".
When it comes to filters, they also invoke getData, but typically with only up to 3 requested fields. The first thing we want to do is make sure they cannot start executing prior to the primary getData call, so we add a little delay for calls that look like search filters / widgets going after the same data set:
if (enableCache) {
  var countRequestedFields = requestedFields.asArray().length;
  Logger.log("Total requested fields: " + countRequestedFields);
  if (countRequestedFields <= 3) {
    Logger.log('This seems to be a search filter.');
    Utilities.sleep(1000);
  }
}
After that we compute a hash over all of the moving parts of the report (the date range, plus all of the other parameters you have set up that could influence the data retrieved from your API endpoints):
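The original post doesn't show the hashing code; as an illustration, one way to do it in Apps Script could be the following (it assumes your connector requests a date range, with configParams standing in for whatever other report parameters you use):
function computeReportHash(request) {
  // Concatenate everything that can change the data set...
  var parts = [
    request.dateRange.startDate,
    request.dateRange.endDate,
    JSON.stringify(request.configParams || {})
  ].join('|');
  // ...and digest it into a short, stable cache-key namespace.
  var digest = Utilities.computeDigest(Utilities.DigestAlgorithm.MD5, parts);
  var hash = '';
  for (var i = 0; i < digest.length; i++) {
    hash += ((digest[i] & 0xFF) + 0x100).toString(16).slice(1); // two hex chars per byte
  }
  return hash;
}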
Now the best part: as long as the main graph is still building the cache, we make these getData calls wait:
while (cache.getString('cache_{hashOfReportParameters}_building') === 'true') {
  Logger.log('A similar request is already executing, please wait...');
  Utilities.sleep(2000);
}
After this loop we attempt to retrieve the contents of "cache_{hashOfReportParameters}_final" -- and in case we fail, it's always a good idea to have a backup plan, which would be to allow it to traverse the API again. We have encountered roughly a 2% error rate retrieving data we cached...
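A small sketch of that retrieve-or-fall-back step, using the same cache wrapper and key placeholder as above; getDataFromApi here is a stand-in for your full API traversal:
var finalJson = cache.getString('cache_{hashOfReportParameters}_final');
var rows;
if (finalJson) {
  rows = JSON.parse(finalJson);      // normal path: reuse the merged, cached result
} else {
  rows = getDataFromApi(request);    // backup plan: traverse the API again
}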
With the cached result (or buffered API responses), you just transform your response as per the schema GDS needs (which differs between graphs and filters).
As you start implementing this, you'll notice yet another problem... Google's cache is limited to a maximum of 100KB per key. There is, however, no limit on the number of keys you can cache... and fortunately others have encountered similar needs in the past and have come up with a smart solution: splitting the one big chunk you need cached into multiple cache keys, and gluing them back together into one object when retrieving it.
See: https://github.com/lwbuck01/GASs/blob/b5885e34335d531e00f8d45be4205980d91d976a/EnhancedCacheService/EnhancedCache.gs
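For reference, a simplified illustration of that splitting idea using the plain CacheService API (the linked EnhancedCacheService gist is a more complete implementation):
var MAX_CHUNK = 90 * 1024;   // stay safely under the 100KB per-key limit

function putLargeString(cache, key, value, ttlSeconds) {
  var chunks = {};
  var count = Math.ceil(value.length / MAX_CHUNK);
  for (var i = 0; i < count; i++) {
    chunks[key + '_' + i] = value.substr(i * MAX_CHUNK, MAX_CHUNK);
  }
  chunks[key + '_count'] = String(count);
  cache.putAll(chunks, ttlSeconds);
}

function getLargeString(cache, key) {
  var count = Number(cache.get(key + '_count'));
  if (!count) return null;
  var value = '';
  for (var i = 0; i < count; i++) {
    var part = cache.get(key + '_' + i);
    if (part === null) return null;   // a chunk expired; treat the whole entry as a miss
    value += part;
  }
  return value;
}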
I cannot share the final solution we have implemented with you as it is too specific to a client - but I hope that this will at least give you a good idea on how to approach the problem.
Caching the full API result is a good idea in general to avoid round trips and server load for no good reason if near-realtime is good enough for your needs.

How can I configure Polymer's platinum-sw-* to NOT cache one URL path?

How can I configure Polymer's platinum-sw-cache or platinum-sw-fetch to cache all URL paths except for /_api, which is the URL for Hoodie's API? I've configured a platinum-sw-fetch element to handle the /_api path, then platinum-sw-cache to handle the rest of the paths, as follows:
<platinum-sw-register auto-register
                      clients-claim
                      skip-waiting
                      on-service-worker-installed="displayInstalledToast">
  <platinum-sw-import-script href="custom-fetch-handler.js"></platinum-sw-import-script>
  <platinum-sw-fetch handler="HoodieAPIFetchHandler"
                     path="/_api(.*)"></platinum-sw-fetch>
  <platinum-sw-cache default-cache-strategy="networkFirst"
                     precache-file="precache.json">
  </platinum-sw-cache>
</platinum-sw-register>
custom-fetch-handler.js contains the following. Its intent is simply to return the result of the request the way the browser would if the service worker were not handling the request.
var HoodieAPIFetchHandler = function(request, values, options) {
  return fetch(request);
};
What doesn't seem to be working correctly is that after user 1 has signed in, then signed out, then user 2 signs in, then in Chrome Dev Tools' Network tab I can see that Hoodie regularly continues to make requests to BOTH users' API endpoints like the following:
http://localhost:3000/_api/?hoodieId=uw9rl3p
http://localhost:3000/_api/?hoodieId=noaothq
Instead, it should be making requests to only ONE of these API endpoints. In the Network tab, each of these URLs appears twice in a row, and in the "Size" column the first request says "(from ServiceWorker)," and the second request states the response size in bytes, in case that's relevant.
The other problem which seems related is that when I sign in as user 2 and submit a form, the app writes to user 1's database on the server side. This makes me think the problem is due to the app not being able to bypass the cache for the /_api route.
Should I not have used both platinum-sw-cache and platinum-sw-fetch within one platinum-sw-register element, since the docs state they are alternatives to each other?
In general, what you're doing should work, and it's a legitimate approach to take.
If there's an HTTP request made that matches a path defined in <platinum-sw-fetch>, then that custom handler will be used, and the default handler (in this case, the networkFirst implementation) won't run. The HTTP request can only be responded to once, so there's no chance of multiple handlers taking effect.
I ran some local samples and confirmed that my <platinum-sw-fetch> handler was properly intercepting requests. When debugging this locally, it's useful to either add in a console.log() within your custom handler and check for those logs via the chrome://serviceworker-internals Inspect interface, or to use the same interface to set some breakpoints within your handler.
What you're seeing in the Network tab of the controlled page is expected—the service worker's network interactions are logged there, whether they come from your custom HoodieAPIFetchHandler or the default networkFirst handler. The network interactions from the perspective of the controlled page are also logged—they don't always correspond one-to-one with the service worker's activity, so logging both does come in handy at times.
So I would recommend looking deeper into the reason why your application is making multiple requests. It's always tricky thinking about caching personalized resources, and there are several ways that you can get into trouble if you end up caching resources that are personalized for a different user. Take a look at the line of code that's firing off the second /_api/ request and see if it's coming from a cached resource that needs to be cleared when your users log out. <platinum-sw> uses the sw-toolbox library under the hood, and you can make use of its uncache() method directly within your custom handler scripts to perform cache maintenance.
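For example, a rough sketch of what cache maintenance inside the custom handler could look like, assuming the sw-toolbox toolbox global that platinum-sw imports is available to your imported script:
// custom-fetch-handler.js
var HoodieAPIFetchHandler = function(request, values, options) {
  // Evict any previously cached copy of this /_api URL, then go to the network,
  // so one user's response can never be served to another from the cache.
  return toolbox.uncache(request.url).then(function() {
    return fetch(request);
  });
};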

Drive API files.list returning nextPageToken with empty item results

In the last week or so we got a report of a user missing files in the file list in our app. We were a bit confused at first because they said they only had a couple of files that matched our query string, but with a bit of work we were able to reproduce their issue by adding a large number of files to our Google Drive. Previously we had been assuming people would have fewer than 100 files and hadn't been doing paging, to avoid multiple files.list requests.
After switching to use paging, we noticed that one of our test accounts was sending hundreds and hundreds of files.list requests, and most of the responses did not contain any files but did contain a nextPageToken. I'll update as soon as I can get a screenshot - but the client was sending enough requests to heat the computer up and drain the battery fairly quickly.
We also found that the query itself, even though it matches the same files, can have a drastic effect on the number of requests needed to retrieve our full file list. For example, switching '=' to 'contains' in the query param significantly reduces the number of requests made, but we don't see any guarantee that this is a reasonable and generalizable solution.
Is this the intended behavior? Is there anything we can do to reduce the number of requests that we are sending?
We're using the following code to retrieve files created by our app that is causing the issue.
runLoad: function(pageToken)
{
    gapi.client.drive.files.list(
    {
        'maxResults': 999,
        'pageToken': pageToken,
        'q': "trashed=false and mimeType='" + mime + "'"
    }).execute(function (results)
    {
        this.filePageRequests++;
        if (results.error || !results.nextPageToken || this.filePageRequests >= MAX_FILE_PAGE_REQUESTS)
        {
            this.isLoading(false);
        }
        else
        {
            this.runLoad(results.nextPageToken);
        }
    }.bind(this));
}
It is, but probably shouldn't be, the correct behaviour.
It generally occurs when using the drive.file scope. What (I think) is happening is that the API layer is fetching all files, and then removing those that are outside of the current scope/query, and returning the remainder to your client app. In theory, a particular page of files could have no files in-scope, and so the returned array is empty.
As you've seen, it's a horribly inefficient way of doing it, but that seems to be the way it is. You simply have to keep following the next page link until it's null.
As to "Is there anything we can do to reduce the number of requests that we are sending?"
You're already setting max results to 999 which is the obvious step. Just be aware that I have seen this value trigger internal errors (timeouts?) which manifest themselves as 500 errors. You might want to sacrifice efficiency for reliability and stick to the default of 100 which seems to be better tested.
I don't know if the code you posted is your actual code or just a simplified illustration, but you need to make sure you are dealing with 401 errors (auth expiry) and 500 errors (sometimes recoverable with a retry).
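For illustration, a rough sketch of how that error handling could be layered onto the paging loop from the question (MAX_RETRIES and reauthorize() are placeholder names, not part of the Drive API client):
function listPage(pageToken, attempt)
{
    gapi.client.drive.files.list(
    {
        'maxResults': 100,   // the better-tested default page size
        'pageToken': pageToken,
        'q': "trashed=false and mimeType='" + mime + "'"
    }).execute(function (results)
    {
        if (results.error)
        {
            if (results.error.code === 500 && attempt < MAX_RETRIES)
            {
                // transient server error: back off, then retry the same page
                setTimeout(function () { listPage(pageToken, attempt + 1); },
                           Math.pow(2, attempt) * 1000);
            }
            else if (results.error.code === 401)
            {
                reauthorize();   // auth expired: refresh the token, then resume
            }
            return;
        }
        // ...process results.items here, then follow results.nextPageToken as before...
    });
}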

dojo jsonrest store with periods of no internet connectivity

At the moment I don't know if this is a Dojo issue, a browser issue, or both.
I have a dojo.store.JsonRest data store of items:
//Create stores
var json = new JsonRest(options);
//Memory store
var memory = Observable(new Memory({}));
//Observable cache
var cache = new Cache(json, memory);
The store of items may be shared with different users simultaneously, so the store is periodically updated by issuing something like:
store.query({..})
When I want to add a new item to it, I use
dojo.xhr('POST', {
    url: ...,
    postData: ...,
    handleAs: 'json',
    headers: {...},
    failOk: true,
    timeout: 15*1000
});
This works fine. However, I would like to gracefully handle the case when the post occurs during the loss of an internet connection. In particular, I do not want the store to automatically try to post again when a connection is established again; I want the user to retry manually.
In Chrome, it appears that the POST is aborted, and regardless of an internet connection being subsequently established again, the deferred object from the POST appears to be discarded, and the new item is never added to the datastore.
In Firefox, it appears that the POST is aborted. But when the datastore is refreshed, e.g. by invoking:
store.query({...})
the new item, whose POST was aborted, is then added to the store. It's as if the query() call is quietly adding the new item to the datastore when an internet connection is established again.
I'm not observing this behavior in Chrome. And in order to get uniform behavior across different browsers, I would like to know if there is a way to ensure that once a POST is aborted, its existence and memory are completely obliterated in Firefox.
I'm not sure why you have a separate method of adding an item to the store?
When you do a get on the Cache, if the item does not exist in the memory store, then a request is made by the JsonRest store and the returned item is stored in the memory store. There should be no need to do a separate xhr request; in fact, query() calls on the cache always go through the JsonRest store, so you'd need to override the cache's query function to make it check the memory store first. If you just want to add to the store, then a put or add would suit your needs and do what your xhr post is doing.
Also if you are not happy with the way failed requests are handled by the JsonRest store, just extend it and override its get, query, add, remove or put function(s), handling request responses in your own way.
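A minimal sketch of what that could look like, assuming the cache store from the question is in scope and that you want failures surfaced to the user rather than retried automatically:
require(["dojo/when"], function (when) {
    function saveItem(newItem) {
        // add() delegates to the JsonRest store, which issues the POST for you
        when(cache.add(newItem), function (saved) {
            // success: the Observable memory store notifies any watchers
        }, function (err) {
            // failure (e.g. no connectivity): show an error and let the user retry manually
        });
    }
});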

Asynchronous Ajax call in SCORM API

I am creating a javascript API for SCORM 2004 4th Edition. For those who don't know about SCORM, basically it is an API standard that eLearning courses can use to communicate with an LMS (Learning Management System). Now the API has to have the following method:
Initialize(args)
GetValue(key)
SetValue(key, value)
Terminate(args)
Commit(args)
GetDiagnostic(args)
GetErrorString(args)
GetLastError()
Now Initialize has to be called before anything else, and Terminate must the last. GetValue/SetValue can be called anywhere in between there. What I am doing is in the Initialize method I am getting some JSON from a web service and storing that in the API (to be used when using the GetValue/SetValue methods later). The problem I am coming across is that the AJAX call via jQuery is asynchronous, so the Initialize method call could be done before the JSON is loaded. With that being the way it is, a call to GetValue after calling Initialize could cause unexpected issues b/c the JSON that GetValue uses isn't there yet. My question is this: What can I do to ensure that the JSON is loaded before the GetValue/SetValue methods are called? I know the simple answer is to make it synchronous, but that is not advised mostly, and it doesn't seem to want to do that for me. Here is my code regarding that:
function GetJSON(){
    var success = false;
    $.ajaxSetup({async:false}); // should make it synchronous
    // Note: this is a cross-domain JSONP request (jsoncallback=?), which jQuery
    // performs via script injection, so the async:false setting has no effect on it.
    $.getJSON("http://www.mydomain.com/webservices/scorm.asmx/SCORMInitialize?learnerID=34&jsoncallback=?",
        function(data){
            bind(data);
            success = true;
        }
    );
    return success;
}

function bind(data){
    this.cmi = eval("(" + data.d + ")");
    $.ajaxSetup({async:true}); // should make it asynchronous again
}
Does anyone have any ideas? I would really appreciate it!
You've articulated the problem well. After the SCO calls Initialize, the CMI data needs to be immediately available for the SCO to make subsequent GetValue calls. However, making synchronous AJAX calls isn't advised: if there is a hangup in the request, it can lock up the entire browser until the request returns or times out. The solution is to pre-load all of the required data before the SCO is loaded. In our SCORM Engine implementation, we preload all of the data (CMI and sequencing) when the player is launched and then use a background process to periodically commit dirty data as the learner progresses through the course. It can get a bit tricky to ensure that all data is properly persisted when dealing with the combinations of possible window launching and exit scenarios, but it's certainly possible. You will want to avoid any requests to the server from within a SCORM API call, as SCOs will often flood the LMS with big batches of calls. Making server requests within those calls can seriously degrade the learner's experience and place a performance burden on the server.
Mike
The way we approached this problem was to queue the CMI data in the API when the SCO is launched. We first navigate to a launch page that loads the CMI data into the API's queue, and then the launch page actually launches the SCO. When the SCO calls Initialize, we just move the data into the CMI.
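For illustration, a rough sketch of that pre-load / queue idea, combined with the lazy-commit approach described in the previous answer (the web-service URL, learnerID and the _queued property are placeholders, not anything SCORM defines):
var API_1484_11 = {
    _cmi: null,
    _queued: null,

    // Called by the launch page *before* the SCO window is opened.
    preloadCMI: function (done) {
        $.getJSON("/webservices/scorm.asmx/SCORMInitialize?learnerID=34&jsoncallback=?",
            function (data) {
                API_1484_11._queued = JSON.parse(data.d);
                done();   // only now does the launch page open the SCO
            });
    },

    Initialize: function (arg) {
        this._cmi = this._queued;   // data is already local, so this is synchronous
        return "true";
    },

    GetValue: function (key) {
        return this._cmi ? this._cmi[key] : "";
    }
    // SetValue, Commit, Terminate, etc. would update this._cmi and flush dirty
    // values to the server from a background timer, not from inside the API calls.
};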