How to get an accurate file list through the Google Drive API - google-drive-api

I'm developing an application using the Google Drive API (Java version). The application saves files on Google Drive, mimicking a file system (i.e. it has a folder tree). I started by using the files.list() method to retrieve all the existing files on the drive, but the response got slower as the number of files increased (beyond a couple of hundred).
The Java Google API client hardcodes the response timeout to 20 seconds. I changed the code to load one folder at a time recursively instead (using files.list().setQ("'folderId' in parents")). This method avoids the timeout problem, but it consistently misses about 2% of the files in my folders (the same files are missing each time). I can see those files through the Google Drive web interface, and even through the Google Drive API if I search for the file name directly with files.list().setQ("title='filename'").
I'm assuming that the "in parents" search uses some inexact indexing which may only be updated periodically. I need a file listing that's more robust and accurate.
Any ideas?

Could you use the paging mechanism to issue multiple queries, with each query requesting only a small number of results?
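A minimal sketch of that paging idea, written in Python for illustration only. It assumes a hypothetical `fetch_page(query, page_token)` callable that wraps a single `files.list` request and returns the parsed response; the `items` and `nextPageToken` keys and the folder MIME type follow the Drive API v2 response format. Nothing here talks to Drive directly, and paging by itself may not explain the missing 2% of files.

```python
from typing import Callable, Dict, Iterator, List, Optional

FOLDER_MIME = "application/vnd.google-apps.folder"

# Hypothetical signature: one files.list call -> parsed JSON response.
FetchPage = Callable[[str, Optional[str]], Dict]


def list_children(folder_id: str, fetch_page: FetchPage) -> Iterator[Dict]:
    """Yield every child of one folder, following nextPageToken to the end."""
    query = "'{}' in parents".format(folder_id)
    page_token = None
    while True:
        response = fetch_page(query, page_token)
        for item in response.get("items", []):
            yield item
        page_token = response.get("nextPageToken")
        if not page_token:
            break


def walk(root_id: str, fetch_page: FetchPage) -> List[Dict]:
    """Collect all non-folder files under root_id, one folder at a time."""
    files, pending = [], [root_id]
    while pending:
        folder_id = pending.pop()
        for item in list_children(folder_id, fetch_page):
            if item.get("mimeType") == FOLDER_MIME:
                pending.append(item["id"])  # visit this subfolder later
            else:
                files.append(item)
    return files
```

The loop only stops when a response carries no `nextPageToken`, so a folder whose listing spans several pages is read completely rather than truncated at the first page.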

Related

Mirror synchronization of local folders with google drive

I want to synchronize some local folders from my desktop to my Google Drive account. I should mention that I have more than 2 million files totaling 1 TB, with file sizes ranging from 1 byte to 100 GB (a zip archive).
Using Drive for desktop, Google's synchronization application, takes years, because each time the app is opened it checks all the files. Given the number of files I have, you can understand that this takes quite a long time. Additionally, I have the feeling that only 3 files can be uploaded simultaneously with this "Google Drive for desktop" app.
I am looking for an alternative solution that would let me save my local folder as a "mirror". By that I mean that when modifications are made in the local folder on my computer, they are pushed to my Google Drive in real time.
Do you know of any free software I could use that would not spend years in an endless checking loop before synchronizing my files?
I use Viper FTP for Mac for a similar task. There you can define what they call an "observed folder": a folder that is watched by Viper FTP, and if any modifications are detected, new or modified files are uploaded to the defined server(s). The app also supports Google Drive.

requests: Get last modified time of Google Doc or Sheet?

I want to download a Google Sheet (and/or Doc, or Colab Notebook) from an "Anyone can View" sharing URL, if the file is newer than my local copy. To do that, I need to find out when the remote file was last modified. Which I thought shouldn't be hard.
There are threads explaining how to do this for regular files on websites that send the HTTP Last-Modified header, but Google doesn't provide this field in its headers. It provides a Date: header, but that's just the download date/time, which updates every moment.
I see threads about doing this from within the Doc or Sheet itself. My question is not about that. I'm talking about getting the info remotely by running a Python script on my local machine.
I see a thread about using the Google Drive API v3, but....is it really necessary to go through all that (e.g. install oauth, register an API key, etc. effectively create an entire Google app *) just to find out when a publicly-available file was last modified? Is there an easier way?
Thanks!
EDIT: * I started down the road of Google Drive API but I find it confusing and overwhelming. It's like they think I'm trying to create an app for general users for the Android Store, instead of just myself. (??)
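Whatever route is used to obtain the remote timestamp, the comparison step itself is small. The sketch below assumes the timestamp arrives as an RFC 3339 string, such as the `modifiedTime` field of a Drive API v3 file resource. The names `parse_rfc3339` and `is_remote_newer` are hypothetical, and how the string is fetched is deliberately left out.

```python
import os
from datetime import datetime, timezone


def parse_rfc3339(stamp: str) -> datetime:
    """Parse a UTC timestamp like '2015-05-13T10:00:00.000Z'."""
    # Older Python versions reject a trailing 'Z' in fromisoformat(),
    # so rewrite it as an explicit UTC offset first.
    if stamp.endswith("Z"):
        stamp = stamp[:-1] + "+00:00"
    return datetime.fromisoformat(stamp)


def is_remote_newer(remote_modified: str, local_path: str) -> bool:
    """True if the local copy is missing or older than the remote timestamp."""
    if not os.path.exists(local_path):
        return True
    local = datetime.fromtimestamp(os.path.getmtime(local_path), tz=timezone.utc)
    return parse_rfc3339(remote_modified) > local
```

Both sides are compared as timezone-aware UTC values, which avoids off-by-hours errors when the local machine is not set to UTC.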

Google Drive list method doesn't list files created recently

I've developed an extension for Google Chrome that HEAVILY relies on the Google Drive API (the extension is LBTimer, available in the Chrome Web Store) to store data in the appfolder, using XMLHttpRequests.
Since May 13th, 2015, I've been running into a problem when using the list method.
If I programmatically create a file in the appfolder, I receive the response 200 OK and the file is created. If I then use the list method to list the files in the appfolder, the file just created is not listed. It happened with several files yesterday. This morning, those files were listed normally, but the same thing happens with any file I create today (correctly created but not listed).
Three screens follow. The first one shows a test file being created in the appfolder using the extension's code, along with the server response (200 OK, file created). The second screen shows the list request (list all files whose title contains 'test'; it should include the file just created). The third screen shows the response from the server (an empty items list).
There is a way to get them listed: If I create a file, it returns (among other data) the file Id. If I make a simple GET request for that Id, then it is listed from then on.
All other methods are working as expected (as usual), but the list method is giving me this problem since yesterday. Since there was no change in the extension's code, I assume there must have been a change in the API code.
Apparently, it was a caching issue in Google Drive, which has been resolved as of May 15th 2015: Google Drive developers community post
Yes, I noticed the same issue starting today. I'm using the Google Drive Java API to create files in the hidden application data folder (appfolder). Files are created correctly, but "Hidden data size" is 0. I noticed that my files appear after a few hours! I reported this issue to Google several hours ago through the Google Developers Console, but the issue still occurs. I think more users should do the same to draw their attention to this critical issue.

What is the expected behaviour of the changes feed with drive.file scope?

My expectation is that if I query the Changes Feed with a scope of drive.file, I will only receive changes to files owned by my application.
However, in testing that I have done, I am seeing files in the feed that have nothing to do with my app. At least some of them are files that have been shared with me.
Anybody know exactly how this is supposed to work?
Edit 0
Similar or duplicate StackOverflow questions
Listing files with search query returns out-of-scope results (drive.files.list call, using drive.files scope)
List ignores drive.file scope and shows shared files not created by the calling app
The files returned will not be specific to your app. Files that are "public on the web" are also reported back, regardless of whether or not your app created them or they were ever opened by the user in your app.
There is a parameter (includeSubscribed) that will filter out shared docs but this is also a bit limited (see below).
From Detect Changes:
For Google Drive apps that need to keep track of changes to files,
polling repeatedly can be both inefficient and resource-intensive. The
Changes feed provides a more efficient way to detect changes to all
files, including those that have been shared with a user. The feed
works by providing the current state of each file, if and only if the
file has changed since the given changestamp.
Here is a relevant parameter from Changes:list.
includeSubscribed boolean
Whether to include shared files and public
files the user has opened. When set to false, the list will include
owned files plus any shared or public files the user has explicitly
added to a folder in Drive. (Default: true)
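As a rough illustration of where that parameter goes, the sketch below builds a Drive API v2 `changes.list` request URL with `includeSubscribed` turned off. The helper name `changes_list_url` is hypothetical and authentication is omitted. This filter narrows the feed, but it does not restrict it to files the app created.

```python
from typing import Optional
from urllib.parse import urlencode

CHANGES_ENDPOINT = "https://www.googleapis.com/drive/v2/changes"


def changes_list_url(start_change_id: Optional[int] = None,
                     include_subscribed: bool = False,
                     page_token: Optional[str] = None) -> str:
    """Build the query string for one changes.list request."""
    # The API expects lowercase 'true'/'false' rather than Python's True/False.
    params = {"includeSubscribed": "true" if include_subscribed else "false"}
    if start_change_id is not None:
        params["startChangeId"] = str(start_change_id)
    if page_token:
        params["pageToken"] = page_token
    return CHANGES_ENDPOINT + "?" + urlencode(params)
```

For example, `changes_list_url(42)` asks for changes starting at change ID 42 while leaving out shared and public files the user has merely opened.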
Scope: https://www.googleapis.com/auth/drive.file
Meaning: Per-file access to files created or opened by the app
The scope https://www.googleapis.com/auth/drive.file strikes this balance in a practical way. Presumably, users only open or create a file with an app that they trust, for reasons they understand.
To your point, though, please refer to this Q&A.

How can I get multiple files from Google Drive through the Google Drive API?

I would like to know how I could obtain multiple files from Google Drive. I searched the reference but did not find this information. I'm building a web application that will talk to Drive and retrieve a link to a zip file of the files to download.
I'm using PHP with API v2.
That is currently not possible with the Drive API, you have to send multiple requests to retrieve multiple files.
I've been faced with a similar problem and while there's currently no way of doing this through Drive (to my knowledge), this is the solution I came up with.
I'm serving up hundreds of thousands of files to clients using a Drive folder as the storage with a custom built front-end built with the Drive API. With that many files, it's ridiculously tedious for users to download files one at a time. So the idea was to let the users select which files they'd like to download and then present them with a zip file containing the files.
To do that, first you'll need to get an array of the Drive files that you want to download, whether you generate it programmatically or collect it through check-boxes on the front-end. Then you'll loop through that array, grab the 'downloadUrl' value for each file, and perform a cURL request to the URL provided. Depending on how many files you're planning on handling per request, you can either keep them all in memory or temporarily store them on disk or in a database. Regardless, once you have all of the files, you can zip them up using any of the zip libraries that are out there. Then just send the resulting zip file to the user.
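The loop described above can be sketched as follows. This is an illustration in Python rather than the PHP the question uses, and it keeps everything in memory. The `download` argument is a hypothetical callable standing in for the authenticated HTTP request; the `title` and `downloadUrl` keys follow the Drive API v2 file resource.

```python
import io
import zipfile
from typing import Callable, Dict, Iterable


def build_zip(files: Iterable[Dict], download: Callable[[str], bytes]) -> bytes:
    """Fetch each file's content and return one in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for meta in files:
            url = meta.get("downloadUrl")
            if not url:
                # Some entries (for example native Google Docs) carry no
                # direct download URL; this sketch simply skips them.
                continue
            archive.writestr(meta["title"], download(url))
    return buffer.getvalue()
```

Two files with the same title would collide inside the archive, so a real implementation would need to make the names unique, and for large selections it would write to disk instead of a `BytesIO` buffer.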
In our case we ended up sticking with individual file downloads because of the potentially massive amount of resources and bandwidth this can eat but it's a potential solution if you're not serving large numbers of files.
Assuming I have understood the question correctly: if you place the files in a folder on Google Drive, then as far as I know it is possible to download them as a complete group of files.