files.list() reproducibly returns incomplete list in "drive.files" scope - google-drive-api

Our application needs a full list of the user's files and folders. We use files.list() via the JavaScript library (essentially the same code as the example in the official API reference).
We use the "drive.file" scope.
Examining the list response, we find that some files are always missing. I ran various tests to understand the problem:
The files clearly exist. They show up in the Google Drive Webapp and, if I explicitly request them via ID, I can get them via the API without problems.
It's reproducible: the same files are always missing.
It is not transient. I tried again a day later and the same files were still missing. I know of a few strange effects in the API that go away after some time, but this is not one of them.
It is not a one-time thing (e.g. something going wrong during upload). If I repeat the test with a completely different Google Account, files are missing again. Of a small set of 147 uploaded files, 4 are missed by the files.list call in one test; in another test with the same 147 files on another account, 23 files are missing.
It only occurs when I use the drive.file scope. If I relax the scope to drive, all files are returned. If I look at "Details" in the Google Drive Webapp, the missing files are also shown as created by our application, so they do not seem to have lost their origin somehow.
It also occurs when I specify a search query. If I call files.list with the search term "q: modifiedDate > '2012-06-04T12:00:00'", which should also return all files, the same files are missing.
I re-implemented the same thing as a pure REST call to the API to rule out an issue with the JavaScript library. The error remains.
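For anyone who wants to reproduce the pure REST variant, here is a minimal sketch of how such a request could be built. It assumes the Drive API v2 endpoint and its `q`, `maxResults` and `pageToken` parameters; `buildFilesListUrl` and `listPage` are hypothetical helper names, and `accessToken` is assumed to come from an OAuth flow that requested the drive.file scope.

```javascript
// Base URL of the Drive API v2 files collection.
const FILES_ENDPOINT = 'https://www.googleapis.com/drive/v2/files';

// Build the files.list request URL (hypothetical helper, not part of any library).
function buildFilesListUrl({ q, maxResults, pageToken } = {}) {
  const params = new URLSearchParams();
  if (q) params.set('q', q);
  if (maxResults) params.set('maxResults', String(maxResults));
  if (pageToken) params.set('pageToken', pageToken);
  const query = params.toString();
  return query ? `${FILES_ENDPOINT}?${query}` : FILES_ENDPOINT;
}

// Fetch a single page of results.
// `accessToken` is an assumption: a valid OAuth 2 token for the drive.file scope.
async function listPage(accessToken, options) {
  const response = await fetch(buildFilesListUrl(options), {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  if (!response.ok) {
    throw new Error(`files.list failed: HTTP ${response.status}`);
  }
  // The parsed body is expected to carry `items` and, when more pages exist, `nextPageToken`.
  return response.json();
}
```

Keeping the URL construction separate from the network call makes it easy to log the exact request when comparing results between scopes.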
Update: I could track it down to an issue with paging and the maxResults parameter. With different values, the API returns a different number of items:
With maxResults=100 I get 100+100+7=207.
With maxResults=99 I get 99+99+28=226.
With maxResults=101 I get 101+101+0=202.
The last result is interesting: the response gave me a nextLink indicating there are more results, but the items array in that last response was actually empty. This alone might indicate a bug.
Still, this only occurs in drive.file scope, the counts are consistent in the full drive scope.
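To make the counts comparable across maxResults values, a paging loop along the following lines may help. This is only a sketch and does not fix any server-side undercount: it keeps following the page token even when a page comes back with an empty `items` array, and it de-duplicates by file ID so that the total is a count of distinct files. `fetchPage` is a hypothetical caller-supplied function that performs the actual request, and the `maxPages` guard is my own addition.

```javascript
// Fold one files.list response page into the running state.
// `state.byId` is a Map keyed by file ID, so a file repeated on two pages is stored once.
function mergePage(state, page) {
  for (const item of page.items || []) {
    state.byId.set(item.id, item);
  }
  state.pages += 1;
  // Continue only while the server hands back a token; an empty `items`
  // array on its own is not treated as the end of the listing.
  state.nextPageToken = page.nextPageToken || null;
  return state;
}

// Follow page tokens until none is returned.
// `fetchPage(pageToken)` is hypothetical: it should resolve to one parsed response page.
// `maxPages` is a safety limit so a misbehaving token chain cannot loop forever.
async function collectAllFiles(fetchPage, maxPages = 1000) {
  const state = { byId: new Map(), pages: 0, nextPageToken: null };
  do {
    mergePage(state, await fetchPage(state.nextPageToken));
  } while (state.nextPageToken && state.pages < maxPages);
  return [...state.byId.values()];
}
```

Running this once per maxResults value and comparing the number of distinct IDs should show whether the differing totals come from duplicates or from files that are truly absent.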
I'd be glad to hear ideas for a workaround. I'm aware of other ways to keep track of the user's files, e.g. using the changes feed. I'm using that already, but for a specific part of our application I simply need a reliable and complete list of all our application's items in a user's account.
One more note: We had other issues with the "drive.files" scope before (see Listing files with search query returns out-of-scope results (drive.files.list call, using drive.files scope)). This turned out to be an easy fix. Perhaps this issue is related.

Is there any difference between the files under "shared to me" and your own files/folders? That was the issue for me.
What Google Drive showed was not the same as the result I got when searching without the correct flags.
When I listed the files across all folders, I found that I had to specify which sets of files the search should cover:
- Include deleted files
- Include shared to me files
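One way to check which of these categories the missing files fall into is to run the listing with different `q` filters and compare the counts. The sketch below is an assumption on my part: it reads "deleted" as "trashed" and relies on the `trashed` and `owners` terms of the Drive API v2 search syntax. `buildCoverageQuery` is a hypothetical helper name.

```javascript
// Build a `q` string that narrows the listing.
// With both options left at true, no filter is applied and an empty string is returned.
function buildCoverageQuery({ includeTrashed = true, includeShared = true } = {}) {
  const clauses = [];
  if (!includeTrashed) {
    // Assumption: "deleted" in the list above means files in the trash.
    clauses.push('trashed = false');
  }
  if (!includeShared) {
    // Restrict to files owned by the authenticated user.
    clauses.push("'me' in owners");
  }
  return clauses.join(' and ');
}
```

For example, `buildCoverageQuery({ includeShared: false })` yields a query for owned files only; the difference from the unfiltered count is then the shared portion.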

Related

How do I avoid Google Drive API audit? -- Only Read access is needed to list files from folder and to download them

The product I'm working on currently uses the scope "https://www.googleapis.com/auth/drive" (which is now "restricted" by Google), which gives full read and write access to a user's Drive account, including app metadata. But we only need read access to list all files and folders inside a specific folder, and we need to be able to download those files, that's all.
Google Drive API will soon apply the new "restricted" scope policy (https://support.google.com/cloud/answer/9110914#restricted-scopes), which will require us to go through a very expensive audit (tens of thousands of dollars...). Is there a possible workaround to get 'read-only' access on a specific folder, and avoid the audit (note that https://www.googleapis.com/auth/drive.readonly is also a restricted scope)?
I'm aware of the "https://www.googleapis.com/auth/drive.file" scope (which is "recommended" by Google, so no audit required), which almost solves this problem. But we have thousands of users bringing in data from multiple Drive Folders, and pushing new files daily. This scope would introduce a manual step for a client each morning to have to "approve" every new file, and this would be a big scalability/usability problem.
Ideally, I would like Google to add a new scope, like read-only access to anything inside a folder, before they go forward with their audit... but I doubt that this will happen soon.
Does anyone know of a better option?
[EDIT] For reference, here is the list of scopes and we can see which ones are "restricted", "sensitive" and "recommended" : https://developers.google.com/drive/api/v2/about-auth
Solution
Hi! After taking a closer look at this, it seems that restricted scopes do NOT require any paid audit. The main difference is that they give wider access to the user's data, so you are required to go through a restricted scope verification process.
You can use these restricted scopes (whichever best fits your application) without paying for an audit. See more information about how to implement restricted scopes here.

Google Drive list method doesn't list files created recently

I've developed an extension for Google Chrome that HEAVILY relies on the Google Drive API (the extension is LBTimer, available in the Google Chrome web store) to store data in the appfolder, using XMLHttpRequests.
Since May 13th, 2015 I have been running into a problem when using the list method.
If I programmatically create a file in the appfolder, I receive a 200 OK response and the file is created. If I then use the list method to list the files in the appfolder, the file just created is not listed. It happened with several files yesterday. This morning, those files were listed normally, but the same thing happens with any file I create today (correctly created but not listed).
Three screens follow. The first one shows a test file being created in the appfolder using the extension's code, with the server response (200 OK, file created). The second shows the list request (list all files whose title contains 'test', which should include the file just created). The third shows the response from the server (an empty items list).
There is a way to get them listed: If I create a file, it returns (among other data) the file Id. If I make a simple GET request for that Id, then it is listed from then on.
All other methods are working as expected (as usual), but the list method is giving me this problem since yesterday. Since there was no change in the extension's code, I assume there must have been a change in the API code.
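The workaround mentioned above (a plain GET for the new file's ID right after creating it) could be sketched like this. It is a rough sketch, not the extension's code: it uses `fetch` instead of the XMLHttpRequests the extension relies on, assumes the Drive API v2 files endpoint, and `fileGetUrl` and `touchCreatedFile` are hypothetical names. `accessToken` is assumed to be a valid OAuth 2 token with access to the appfolder.

```javascript
// Base URL of the Drive API v2 files collection.
const DRIVE_V2_FILES = 'https://www.googleapis.com/drive/v2/files';

// URL of the files.get request for a single file ID.
function fileGetUrl(fileId) {
  return `${DRIVE_V2_FILES}/${encodeURIComponent(fileId)}`;
}

// After a successful create, read the file back once by ID.
// `createdFile` is the parsed response of the create call (it carries `id`).
async function touchCreatedFile(createdFile, accessToken) {
  const response = await fetch(fileGetUrl(createdFile.id), {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  if (!response.ok) {
    throw new Error(`files.get failed: HTTP ${response.status}`);
  }
  return response.json();
}
```

This costs one extra request per created file, so it is only worth keeping while the list method misbehaves.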
Apparently, it was a caching issue in Google Drive, which has been resolved as of May 15th 2015: Google Drive developers community post
Yes, I have noticed the same issue since today. I'm using the Google Drive Java API to create files in the hidden Application data folder (appfolder). Files are created correctly, but "Hidden data size" is 0. I noticed that my files appear after a few hours! I reported this issue to Google several hours ago via the Google Developers Console, but the issue still occurs. I think more users should do this to draw their attention to this critical issue.

What is the expected behaviour of the changes feed with drive.file scope?

My expectation is that if I query the Changes Feed with a scope of drive.file, I will only receive changes to files owned by my application.
However, in testing that I have done, I am seeing files in the feed that have nothing to do with my app. At least some of them are files that have been shared with me.
Anybody know exactly how this is supposed to work?
Edit 0
Similar or duplicate StackOverflow questions
Listing files with search query returns out-of-scope results (drive.files.list call, using drive.files scope)
List ignores drive.file scope and shows shared files not created by the calling app
The files returned will not be specific to your app. Files that are "public on the web" are also reported back, regardless of whether or not your app created them or they were ever opened by the user in your app.
There is a parameter (includeSubscribed) that will filter out shared docs but this is also a bit limited (see below).
From Detect Changes:
For Google Drive apps that need to keep track of changes to files,
polling repeatedly can be both inefficient and resource-intensive. The
Changes feed provides a more efficient way to detect changes to all
files, including those that have been shared with a user. The feed
works by providing the current state of each file, if and only if the
file has changed since the given changestamp.
Here is a relevant parameter from Changes:list.
includeSubscribed boolean
Whether to include shared files and public
files the user has opened. When set to false, the list will include
owned files plus any shared or public files the user has explicitly
added to a folder in Drive. (Default: true)
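A minimal sketch of how this parameter could be combined with a client-side filter follows. It assumes the Drive API v2 changes endpoint with its `includeSubscribed`, `startChangeId` and `pageToken` parameters, and that each change entry carries a `fileId`. `buildChangesUrl` and `filterToAppFiles` are hypothetical helpers, and `knownIds` is assumed to be a set of file IDs the app has recorded for files it created.

```javascript
// Base URL of the Drive API v2 changes collection.
const CHANGES_ENDPOINT = 'https://www.googleapis.com/drive/v2/changes';

// Build a changes.list URL that asks the server to leave out subscribed files.
function buildChangesUrl({ startChangeId, pageToken } = {}) {
  const params = new URLSearchParams({ includeSubscribed: 'false' });
  if (startChangeId) params.set('startChangeId', String(startChangeId));
  if (pageToken) params.set('pageToken', pageToken);
  return `${CHANGES_ENDPOINT}?${params}`;
}

// Keep only the changes that refer to files the app itself knows about.
// `knownIds` is a Set of file IDs tracked by the app (an assumption of this sketch).
function filterToAppFiles(changes, knownIds) {
  return changes.filter((change) => knownIds.has(change.fileId));
}
```

Since the parameter alone is described as limited, the client-side filter is what actually restricts the feed to the app's own files in this sketch.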
Scope: https://www.googleapis.com/auth/drive.file
Meaning: Per-file access to files created or opened by the app
The scope https://www.googleapis.com/auth/drive.file strikes this balance in a practical way. Presumably, users only open or create a file with an app that they trust, for reasons they understand.
That said, to your point, please refer to this Q&A.

How can we add a file to a user's files.list via the sdk?

We are having issues where sometimes a file that a user can access is not returned when the user issues a files.list. This can happen in many ways. For example, new members of a Google group will not see previously shared files, as described in this question. Moreover, according to Google documentation there are other limits on sharing which can prevent shared files from appearing in the "Shared with me" view. Finally, a user can issue a files.delete on a file she doesn't own, and the file will disappear from files.list but will still exist.
What can a user do via the SDK alone to cause a file which she can access via files.get to appear in the list of files retrieved via files.list? We are using a service account which impersonates users; the user never authenticates to Google via a browser. A link in an email that the user needs to click won't work for us, unfortunately. Accessing the file via the Google Drive UI has the desired effect, but the analogous files.get call does not.
The Google Calendar API explicitly exposes a CalendarList interface where a user can issue an insert to add an existing calendar to her list. The Google Drive SDK seems like a hybrid Files/FilesList interface with some of the functionality missing (nothing like FilesList.insert) and some of the functionality mixed together (issuing a delete as a non-owner acts like FilesList.delete but issuing it as the owner acts like Files.delete).
If we can't manage the user's files list programmatically then it is not useful for our service. We could ignore the files.list call entirely and just start recursively performing children.list queries on all shared folders, but this is incredibly expensive (unless someone knows how to issue a single query which returns all the Files resources in a folder and not just the IDs of those resources).
Any help would be appreciated. We've been trying this many different ways and have been frustrated at every turn. Thanks!
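On the parenthetical about getting full Files resources for a folder in one query: one candidate is a files.list call filtered by parent rather than children.list. The sketch below assumes the Drive API v2 search syntax accepts `'<folderId>' in parents` as a `q` value and that files.list returns complete file entries for the matches. `childrenQuery` and `buildFolderListingUrl` are hypothetical helpers. It does not address files that are missing from the user's list in the first place.

```javascript
// Build a search query selecting the direct children of one folder.
function childrenQuery(folderId) {
  // Escape backslashes and single quotes so the ID cannot break out of the quoted literal.
  const escaped = String(folderId).replace(/\\/g, '\\\\').replace(/'/g, "\\'");
  return `'${escaped}' in parents`;
}

// Build the files.list URL for one page of a folder's children.
function buildFolderListingUrl(folderId, pageToken) {
  const params = new URLSearchParams({ q: childrenQuery(folderId) });
  if (pageToken) params.set('pageToken', pageToken);
  return `https://www.googleapis.com/drive/v2/files?${params}`;
}
```

This still needs one paged listing per folder, so walking a deep tree remains a number of requests proportional to the number of folders.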

Calling a Google Drive SDK from Google App Script application

I have been going around in circles here and have totally confused myself. I need some help.
I am (trying to) write an application for a client that is simple in concept. He wants a Google write document with a button. The Google Drive account has several folders, each shared with several people. When he drops a new file in one of the folders, he wants to be able to open this write file, which is the template for his email. He clicks the button, the system calls the changes service in the Google Drive SDK https://developers.google.com/drive/manage-changes, gets the list of files that have been added since the last time it was checked, then pulls the list of people that the file has been shared with, and uses the write file as a template to send that list of people an email saying their file is ready.
So, easy enough, right?
I started by looking at the built-in functions in the Google Apps Script API. I found this method, https://developers.google.com/apps-script/class_docslist#find, in the DocsList class. The problem is that the description for the query simply says "the query string". So at first I tried the Drive SDK query parameters, like this:
var files = DocsList.find("modifiedDate > 2012-12-20T12:00:00-08:00.");
It didn't work. That leads me to believe it is a simple full-text search on the content. That's not good enough.
That led me to try calling a Drive SDK method from within an Apps Script application. Great, we need OAuth 2 authentication. Easy enough. I found the objects in the script reference and hit my wall.
Client ID and Client Secret.
You see, when I create what this really is, a service account, the OAuth control in Apps Script doesn't know how to handle the encrypted JSON and pass it back and forth. Then when I tried to create and use an installed application key, I got authentication errors because the controls, again, don't know what to do with the workflow. And finally, when I try to create a web app key, I can't, because I don't have the site host name or redirect URI. And I can't use the application key option because, since I'm working with files, OAuth 2 is required.
I used the anonymous OAuth for a while, but hit the limit of anonymous calls per day while trying to figure out the code a bit. That's not going to work, because the guy is going to be pushing this button constantly through the day.
I have been pounding my head on the desk over this for 5 hours now. I need some help here. Can anyone give me a direction to go?
PS: yes, I know I can use the database controls, load the entire list of files into memory, and compare it to the list of files in the database. The problem is that we are talking about tens of thousands of files. Bad idea.
I wouldn't use DocsList anymore - DriveApp is supposed to be a more reliable replacement. Some of the commands have changed, so instead of find, use searchFiles. This should work more effectively (they even use a query like yours as an example).
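A rough sketch of what that could look like in Apps Script follows. `DriveApp.searchFiles` and the file iterator's `hasNext()`/`next()` are part of the Apps Script Drive service; `modifiedSinceQuery` and `listFilesModifiedSince` are hypothetical helper names, and I am assuming the query accepts a `modifiedDate` comparison with an offset-less timestamp interpreted as UTC.

```javascript
// Build a query for files modified after `since` (a Date object).
function modifiedSinceQuery(since) {
  // toISOString() is always UTC; dropping the milliseconds and the trailing "Z"
  // leaves a timestamp such as 2012-12-20T20:00:00 (assumed to be read as UTC).
  const stamp = since.toISOString().slice(0, 19);
  return `modifiedDate > '${stamp}'`;
}

// Apps Script only: DriveApp does not exist outside the Apps Script runtime.
// Returns the names of all files matching the query.
function listFilesModifiedSince(since) {
  const names = [];
  const files = DriveApp.searchFiles(modifiedSinceQuery(since));
  while (files.hasNext()) {
    names.push(files.next().getName());
  }
  return names;
}
```

Storing the time of the last button press (for instance in script properties) and passing it as `since` would give the "files added since the last check" behaviour described in the question, without loading the full file list.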