Google Drive API 'Revision not found' - google-drive-api

I am using the Google Drive REST API to get a list of revisions (https://developers.google.com/drive/v2/reference/revisions/list).
I have two problems with this:
The number of revisions returned is inconsistent. I always get the same last 30 revisions, but the total count varies between 70 and 98 from one call to the next.
I keep the revision IDs for later use (for backup purposes), but when I come back and call GET on the Revisions API (https://developers.google.com/drive/v2/reference/revisions/get), the first few revisions return a 'Revision not found' error, and some seemingly random ones return the same error as well.
I can reproduce this behavior in my app but also directly on the API's documentation page.
Is there a way to know which revisions are 'permanent', so that I don't keep ones that will disappear?
Thanks!

Based on the Manage Revisions page, you're getting "Revision not found" due to the automatic purges made by the service for disk space optimization.
Google Drive automatically purges (or "prunes") revisions in order to optimize disk usage. To prevent this from happening, you can set the boolean flag keepRevisionForever to true to mark revisions that you don't want Drive to purge.
As indicated, you can set the keepRevisionForever flag on a specific revision so that it is not purged.
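If it helps, here is a minimal sketch of pinning a revision with the v3 Python client (in v3 the per-revision field is called keepForever, while keepRevisionForever is a query parameter on files.update when uploading new content). The service object, FILE_ID and REVISION_ID are placeholders, not values from the question:

# Minimal sketch (google-api-python-client, Drive API v3).
# `service` is assumed to be an authorized Drive service;
# FILE_ID and REVISION_ID are placeholders.
FILE_ID = "your-file-id"
REVISION_ID = "your-revision-id"

# Pin an existing revision so Drive does not purge it.
service.revisions().update(
    fileId=FILE_ID,
    revisionId=REVISION_ID,
    body={"keepForever": True},
).execute()

# Alternatively, when uploading new content, ask Drive to keep the revision
# it creates by passing keepRevisionForever on files.update:
# service.files().update(
#     fileId=FILE_ID,
#     keepRevisionForever=True,
#     media_body=media,  # a MediaFileUpload with the new content
# ).execute()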


Drive API update

We received the Drive API security update changes last week. We would appreciate it if you could help confirm the following two questions:
1. We have tried the resourceKey, but it didn't work. Is there a specific time for it to take effect?
2. The email content says that URL-type fields such as exportLinks, webContentLink, and webViewLink will include the resourceKey. Currently we are only using the webViewLink; do we still need to update our code for accessing files to include the appropriate resource keys?
Beginning Monday, September 13, 2021, Google will begin enforcing a security update to Google Drive. Users like you who own or manage impacted files will be notified about the affected files beginning Monday, July 26, 2021.
This security update adds a resource key that makes sharing links for Google Drive files more secure. When file links are updated, users may receive new file-access requests. Those who have not accessed the files before the update will have to use URLs containing resource keys to access them.
To avoid broken links, users should update Google Drive links on their websites and shared resources before September 13.
Update Impacts Developers
Affected items are those that have a Drive File API permission with type=domain or type=anyone where withLink=true (v2) or allowFileDiscovery=false (v3).
In addition to the item ID, applications may need a resource key to access files. Use Google’s Developer resource to learn more about how this update will impact your projects.
You can also check whether the resourceKey security update is already activated in your domain by going to the Google Admin console.
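As a minimal sketch (google-api-python-client, Drive API v3; the service object and FILE_ID are placeholders, not values from the question), you can request the resourceKey field and then attach it to follow-up requests via the X-Goog-Drive-Resource-Keys header:

# Minimal sketch (google-api-python-client, Drive API v3).
# `service` is assumed to be an authorized Drive v3 service; FILE_ID is a placeholder.
FILE_ID = "your-file-id"

# 1. Ask for the resourceKey field when getting (or listing) files.
meta = service.files().get(
    fileId=FILE_ID,
    fields="id, name, resourceKey, webViewLink",
).execute()
resource_key = meta.get("resourceKey")  # absent for unaffected files

# 2. For affected items, pass "fileId/resourceKey" pairs (comma-separated)
#    in the X-Goog-Drive-Resource-Keys header on subsequent requests.
if resource_key:
    request = service.files().get_media(fileId=FILE_ID)
    request.headers["X-Goog-Drive-Resource-Keys"] = f"{FILE_ID}/{resource_key}"
    content = request.execute()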
The effects should have kicked in today, the 13th. So far I'm NOT getting the resourceKey when I call for a file list, even though resourceKey is part of the field list I send. Testing this at https://developers.google.com/drive/api/v3/reference/files/get with * as the field list and a file ID that SHOULD be returning a resourceKey, nothing is returned for resourceKey. We are getting 403 errors on certain operations in our code, the only place where a direct URL to Google Drive is used.
Essentially we do NOT have the resourceKey supplied to us, so there is no way to add it to the URL!
When I checked my test Drive before the weekend in order to turn this security update ON, the console indicated that it was already turned on, which I had certainly not done!

Google Drive 600 member limit - how close are we to reaching it?

According to the following link, shared Google Drives have a membership limit of 600 combined users and groups.
https://support.google.com/a/answer/7338880?hl=en
At my organization, we have a shared drive. We are unfortunately forced to share with users individually rather than create groups due to a conflict between organizational security requirements and our particular use case.
We are automating sharing using the Google Drive API, and we have a growing number of drive members that will eventually approach 600. Is there a way to find out how many members the drive has, so we know whether we're in danger of hitting the limit? A user counts as a member of the drive if at least one folder or piece of content anywhere on the drive has been shared with them. There doesn't seem to be a straightforward way in the Google Drive API to query the number of users that are members of a drive.
Thanks!
I haven't found a way to get a drive's number of members directly, but you might be able to make use of drives.list's q parameter and search for drives whose memberCount is greater than, say, 550.
If that query returns your target drive, you know you are already close to the limit.
You can also run repeated queries to narrow down the exact number of members.
For example:
name contains 'name of your shared drive' and memberCount > 200
If this returns your target drive, then its member count is greater than 200; repeat with a larger memberCount threshold to narrow down the actual count.
If this doesn't return your target drive, lower the memberCount threshold and check whether it then matches.
Repeat until you converge on a specific memberCount.
With this in mind, you can write a loop that narrows the threshold until it matches the actual memberCount (see the sketch below). If that is more effort than it's worth, simply querying whether your target drive has more than 550 members should be enough to warn you that it is approaching the limit.
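A minimal sketch of such a loop (google-api-python-client, Drive API v3; `service` is assumed to be an authorized Drive service for a domain administrator, and DRIVE_NAME is a placeholder):

# Minimal sketch: binary-search a shared drive's memberCount via drives.list.
DRIVE_NAME = "name of your shared drive"

def has_more_members_than(threshold):
    # memberCount queries require useDomainAdminAccess=True and admin privileges.
    response = service.drives().list(
        q=f"name contains '{DRIVE_NAME}' and memberCount > {threshold}",
        useDomainAdminAccess=True,
    ).execute()
    return bool(response.get("drives"))

def estimate_member_count(upper=600):
    low, high = 0, upper
    while low < high:
        mid = (low + high) // 2
        if has_more_members_than(mid):
            low = mid + 1   # memberCount is at least mid + 1
        else:
            high = mid      # memberCount is at most mid
    return low

print("Estimated members:", estimate_member_count())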
Note:
You need to set useDomainAdminAccess to true and have administrator privileges for this to work.
Reference:
Search for shared drives
Drives: list

Google Drive multiple files download

We have a client-server architecture that uses Google Drive for sharing files between the client and the server, without having to actually send them.
The client uses the Google Drive API to get a list of file IDs of all files it wants to share with the server.
The server then downloads the files with the appropriate authorization token.
Server response time is crucial for user experience.
We tried a few approaches:
First, we used the webContentLink. This worked until we started receiving large files from the client. Instead of getting the files' content, we got an HTML page warning that the file "exceeds the maximum size that Google can scan". We could not find a request header we could use to skip this check.
Second, we switched to the Google API resource URL with the alt=media query param. This works, but we then hit API quota errors (User Rate Limit Exceeded). Since this is server code, it was identified as a single user for all requests.
Then we added the quotaUser param to indicate on whose behalf each request is made. We still got many 403 responses.
In addition, we implemented exponential backoff for the failed requests.
We also added a cache for the successful requests.
Our current solution is a combination of the two: using the webContentLink whenever possible (which appears not to count against the Google API quota), and if the response is not as expected (e.g. HTML content, wrong size), falling back to the Google API resource URL (with exponential backoff).
(Most of the files are small enough to not exceed the scan size limit)
Both client and server use the same OAuth 2.0 client ID.
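For reference, here is a simplified sketch of that fallback flow (plain Python requests; TOKEN, FILE_ID, USER_ID and the helper name are placeholders, not our real code):

# Simplified sketch of the fallback flow described above.
import requests

def download(file_id, web_content_link, token, user_id):
    headers = {"Authorization": f"Bearer {token}"}

    # 1. Try the webContentLink first (it does not appear to count against the API quota).
    resp = requests.get(web_content_link, headers=headers)
    if resp.ok and "text/html" not in resp.headers.get("Content-Type", ""):
        return resp.content  # real file content, not the virus-scan warning page

    # 2. Fall back to the API resource URL with alt=media, tagging the end user
    #    via quotaUser so the requests are not all attributed to the server.
    resp = requests.get(
        f"https://www.googleapis.com/drive/v3/files/{file_id}",
        headers=headers,
        params={"alt": "media", "quotaUser": user_id},
    )
    resp.raise_for_status()
    return resp.content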
Here are my questions:
1. Is it possible to skip the virus scan, so that all files can be downloaded using the webContentLink?
2. Is the size threshold for the virus scan documented? Assuming we know the file size we can then save the round-trip of the first request (using the webContentLink)
3. Is there anything else we can do other than applying for a higher quota?
Is it possible to skip the virus scan, so that all files can be downloaded using the webContentLink?
If the file is larger than 25 MB this is not possible with the webContentLink, but since you are making authorized requests you can use files.get with alt=media instead. Apply appropriate error-handling options (which you have done using exponential backoff). The next step would be to check whether your code is optimized; if, after applying the recommended optimizations, you still receive Error 403 Rate Limit Exceeded, it is time to apply for a higher quota.
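A minimal sketch of that approach (google-api-python-client, Drive API v3; `service` is assumed to be an authorized Drive service and FILE_ID is a placeholder):

import io
import random
import time
from googleapiclient.errors import HttpError
from googleapiclient.http import MediaIoBaseDownload

FILE_ID = "your-file-id"

def download_file(file_id, max_retries=5):
    for attempt in range(max_retries):
        try:
            request = service.files().get_media(fileId=file_id)  # uses alt=media
            buffer = io.BytesIO()
            downloader = MediaIoBaseDownload(buffer, request)
            done = False
            while not done:
                _status, done = downloader.next_chunk()
            return buffer.getvalue()
        except HttpError as err:
            if err.resp.status in (403, 429):
                # Rate-limit error: back off exponentially and retry.
                time.sleep((2 ** attempt) + random.random())
            else:
                raise
    raise RuntimeError("Download failed after retries")

data = download_file(FILE_ID)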
Is the size threshold for the virus scan documented? Assuming we know the file size we can then save the round-trip of the first request (using the webContentLink)
To answer this, you can refer to the Google Drive Help Forum thread "How can I successfully download large files from google drive without network errors at the most end of the download":
Only files smaller than 25 MB can be scanned for viruses.
Is there anything else we can do other than applying for a higher quota?
You can do the following before applying for a higher quota:
Performance Tips
Drive Platform Best Practices
Handling API Errors
After all optimization is done, the only option is to apply for higher quota limit.
Hope this helps!

What are the consistency guarantees of the Google Drive API?

I've written a test suite for my google drive api library and am witnessing some non-deterministic behavior. In the simplest case, I can insert a permission on a file, then immediately get a list of permissions on the file and I don't see the newly inserted permission.
I'm chalking this up to eventual consistency being eventual, but it would be nice to know if this is actually the case; the documentation makes no mention of consistency delays.
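A minimal sketch of the repro (shown here with the v3 Python client for illustration; `service`, the file ID and the email address are placeholders):

FILE_ID = "your-file-id"

new_perm = service.permissions().create(
    fileId=FILE_ID,
    body={"type": "user", "role": "reader", "emailAddress": "someone@example.com"},
).execute()

# Immediately list permissions: the newly created permission is sometimes missing.
perms = service.permissions().list(fileId=FILE_ID).execute()
perm_ids = [p["id"] for p in perms.get("permissions", [])]
print(new_perm["id"] in perm_ids)  # occasionally False right after the create call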
I can't see this documented anywhere but there is a simple experiment you can do.
In my opinion, adding and removing permissions is an asynchronous, queued task, and eventual consistency matches my observation too. You can confirm this with a GSuite for Business account by conducting a test as follows:
In the Drive UI, upload a folder tree structure with a root folder, 3 or 4 levels of sub-folders and 300 to 500 files. You may get away with fewer but this is how many I used.
In the Drive UI, share the root of those folders with another user on your domain.
In Admin Console > Reports > Audit > Drive, add Filters as follows:
Event name: User Sharing Permissions Change
User name: the email of the user you added in step 2.
Owner: your email
Date and time range:
From: add yesterday's date
To: add today's date and 23:59 as time
Press search. You should see hundreds of events - one for every file and folder you added in step 1. Each event shows the exact time stamp of the permission being added.
As you should see, the permissions are not added instantly. It can take many minutes/hours depending on the numbers involved and (I assume) indeterminate work going on in the Google cloud.
It is indeed. If you think about Google's infrastructure, it's all about read performance and data integrity through massive distribution. The inevitable consequence of that is that write performance is relatively poor and asynchronous.

Issue with Google Drive API and group sharing

I'm facing an issue with an application I'm developing using Google Drive.
I have a Google group with some users inside, and I share a collection with this group.
When I try to find this collection using the Google Drive API (files().list()) as one of the users of this group, the collection shows up properly.
However, if I add another user to the group (either using the API or Google CPanel) and try to find the collection using the Google Drive API as this user, the collection doesn't show up, as if the user is not able to see it even though they are in a group that is allowed to see the collection.
If I manually open the collection once in my browser, it then shows up through the Drive API.
Is this normal behaviour? In my use case I cannot expect users to open each and every collection shared with them in their browser just for the application to work.
Any insight?
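For reference, a minimal sketch of the check I am doing (google-api-python-client, Drive API v2; `user_service` is assumed to be a Drive service authorized as the newly added group member, and the collection title is a placeholder):

results = user_service.files().list(
    q="title = 'Shared collection' and mimeType = 'application/vnd.google-apps.folder'",
).execute()

# For a freshly added group member this often comes back empty, even though the
# collection is shared with a group the user belongs to.
print(results.get("items", []))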
I opened a Google Support case about this and apparently this behavior is "expected". Here are some excerpts from my exchange with "Angel" from
Google Enterprise Support (typos corrected and emphasis mine):
After reviewing the stackoverflow question, we need to clarify to you that the behavior shown is expected. When adding a user to a group, this group must be added again for any files that it has been shared with.
and
All information previously provided is from internal documentation for Drive UI; however the functionality is the same for SDK, therefore, group must be deleted and added back to the list of users that have access to files/folders after adding a new member.
So, there you have it. Not sure if @Burcu will ever come back and confirm.
<EDIT> It gets worse. According to this Google document, groups with more than 200 members will never see files shared with them, even if you delete and add the group back. </EDIT>
Useless post-answer rant follows:
This behavior, even if it is "expected" by Google, does not seem to be properly documented, and it is neither expected nor usable by clients of the service. How are we supposed to know when a user has been added to a group that has items shared with it? Are we supposed to constantly monitor group memberships as well as maintain a list of all things shared with the group and then *re*share them with the group when the membership changes, just to get consistent behavior? It makes me wonder why Google doesn't already do this on the back end; it can't be that expensive to register a list of callbacks with a group that are triggered upon membership changes. And the requirement that we actually unshare is even more bizarre, since it necessitates a short period of time during which nobody in the group can access the resource.