Google Drive REST API - How to check if a file has changed

Is there a reliable way, short of comparing full contents, of checking whether a file was updated/changed in Drive?
I have been struggling with this for a bit. Here are the two things I have tried:
1. File version number
I upload a plain text file to Google Drive (simple upload, update endpoint), and save the version from the file metadata returned after a successful upload.
Then I poll the Drive API (get endpoint) occasionally to check if the version has changed.
The trouble is that within a second or two of uploading the file, the version gets bumped up again.
There are no changes to the file content. The file has not been opened, viewed, or even downloaded anywhere else. Still, the version number increases from what it was after the upload.
To my code this version number change indicates that the remote file has been changed in Drive, so it downloads the new version. Every time!
2. The Changes endpoints
As an alternative I tried using the Changes api.
After I upload the file, I get a page token using changes.getStartPageToken or changes.list.
Later I use this page token to poll the Changes API for changes, and filter the changes for the fileId of uploaded file. I use these options when polling for changes:
{
"includeRemoved": false,
"restrictToMyDrive": true,
"spaces": "drive"
}
Here again, there is the same problem as with the version number. The page token returned immediately after uploading the file changes again within a second or two. The new page token shows the uploaded file having been changed.
Again, there is no change to the content of the file. It hasn't been opened, updated, downloaded anywhere else. It isn't shared with anyone else.
Yet, a few seconds after uploading, the file reappears in the changes list.
As a result, the local code redownloads the file from Drive, assuming remote changes.
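For reference, the filtering step described above might look roughly like the sketch below. It assumes the response of changes.list has already been decoded into a dict with the "changes", "nextPageToken" and "newStartPageToken" fields; the function names are hypothetical and no HTTP call is made.

```python
from typing import Any, Dict, List


def changes_for_file(changes_page: Dict[str, Any], file_id: str) -> List[Dict[str, Any]]:
    """Return the entries of one changes.list page that refer to file_id."""
    return [
        change
        for change in changes_page.get("changes", [])
        if change.get("fileId") == file_id
    ]


def next_token(changes_page: Dict[str, Any], current_token: str) -> str:
    """Pick the token to use for the next request.

    nextPageToken means more pages are pending; newStartPageToken marks the
    end of the list and is the token to store for the next poll.
    """
    return (
        changes_page.get("nextPageToken")
        or changes_page.get("newStartPageToken")
        or current_token
    )
```

Note that this only tells you the file appeared in the change list, which is exactly the signal that fires spuriously in the scenario above.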
Possible workaround
As a hacky hook, I could wait a few seconds after the file upload before getting the new file-version/changes-page-token. This may take care of the delayed version increment issue.
However, there is no documentation of what is causing this phantom change in version number (or changes.list). So, I have no sure way of knowing:
How long a wait is safe enough to get a 'settled' version number without losing possible changes by other users/apps?
Whether the new (delayed) version number will be stable, or may change again at any time for no reason?
Is there a reliable way, short of comparing full contents, of checking whether a file was updated/changed in Drive?

You can try using the md5Checksum property of the File resource object, if your file is not a Google Docs file (i.e. it is a binary file). You should be able to use that to track changes to the contents of your binary files.
You might also be able to use the Revisions API.
The Revisions resource object also has an md5Checksum property.
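A minimal sketch of the checksum comparison, assuming you store the MD5 of the content you last uploaded and later fetch the file metadata with md5Checksum included in the requested fields. The helper names are hypothetical; only the md5Checksum field comes from the API.

```python
import hashlib
from typing import Any, Dict


def local_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex MD5 of a local file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def content_changed(last_synced_md5: str, remote_metadata: Dict[str, Any]) -> bool:
    """True if the remote content differs from what was last synced."""
    remote_md5 = remote_metadata.get("md5Checksum")
    if remote_md5 is None:
        # Google-native formats (Docs, Sheets, ...) carry no md5Checksum,
        # so this check cannot decide for them.
        raise ValueError("metadata has no md5Checksum; not a binary file?")
    return remote_md5 != last_synced_md5
```

Because the checksum depends only on the bytes, a version bump with unchanged content would not trigger a download.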

As a workaround, how about using Drive Activity API? I think that there are several answers for your situation. So please think of this as just one of them.
When Drive Activity API is used, the activity information about the target file can be retrieved. For example, from ActionDetail, you can see whether the target file was edited, renamed, deleted and so on.
The sample endpoint and request body are as follows.
Endpoint:
POST https://driveactivity.googleapis.com/v2/activity:query?fields=activities%2CnextPageToken
Request body:
{"itemName": "items/### fileId of target file ###"}
Response:
Sample response is as follows. You can see the information from this. The file with the fileId and filename was edited at the timestamp.
{
"activities": [
{
"primaryActionDetail": {
"edit": {} <--- If the target file was edited, this property is added.
},
"actors": [
{
"user": {
"knownUser": {
"personName": "people/### userId who edited the target file ###",
"isCurrentUser": true
}
}
}
],
"actions": [
{
"detail": {
"edit": {}
}
}
],
"targets": [
{
"driveItem": {
"name": "items/### fileId of target file ###",
"title": "### filename of target file ###",
"file": {},
"mimeType": "### mimeType of target file ###",
"owner": {
"user": {
"knownUser": {
"personName": "people/### owner's userId ###",
"isCurrentUser": true
}
}
}
}
}
],
"timestamp": "2000-01-01T00:00:00.000Z"
}
],
"nextPageToken": "###"
}
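As a rough illustration, a response shaped like the sample above could be scanned for edit activities as follows. This is only a sketch: it assumes the JSON has already been decoded into a dict, and the function name is hypothetical.

```python
from typing import Any, Dict, List


def edit_timestamps(response: Dict[str, Any]) -> List[str]:
    """Collect timestamps of activities whose primary action is an edit."""
    stamps = []
    for activity in response.get("activities", []):
        is_edit = "edit" in activity.get("primaryActionDetail", {})
        # Some activities report a timeRange instead of a single timestamp;
        # those are skipped in this sketch.
        if is_edit and "timestamp" in activity:
            stamps.append(activity["timestamp"])
    return stamps
```

Comparing the newest returned timestamp with the time of your own upload is one way to decide whether somebody else touched the file.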
Note:
When you use this API in your environment, please enable the Drive Activity API in the API console and include https://www.googleapis.com/auth/drive.activity.readonly in the scopes.
When I used this API, the response felt fast. If it is slow when you use it, I apologize.
References:
Google Drive Activity API
ActionDetail
If this is not what you wanted, I apologize.

What you are seeing is the eventual consistency feature of the Google Drive filesystem. If you think about search, it doesn't matter how quickly a search index is updated, only that it is eventually updated and is very efficient for reading. Google Drive works on the same premise.
Drive acknowledges your updates as quickly as possible, long before those updates have propagated to all worldwide copies of your file. Derived data (e.g. timestamps and, as far as I recall, md5sums) are also calculated after the update has "completed".
The solution largely depends on how problematic the redundant syncs are to your app.
The delay of a few seconds is enough to deal with the vast majority of phantom updates.
You could switch to the v2 API and use etags.
You could implement your own version number using custom properties. So every time you sync up, you increment your own version number. You only sync down if the application version number has changed.
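The last option could be sketched as below. It assumes the counter is stored in the file's appProperties (custom properties private to your app in v3) under a hypothetical key named syncVersion; the helpers only build and read metadata dicts and do not call the API.

```python
from typing import Any, Dict

VERSION_KEY = "syncVersion"  # hypothetical property name, pick your own


def remote_sync_version(metadata: Dict[str, Any]) -> int:
    """Read the app-maintained version from file metadata (0 if unset)."""
    return int(metadata.get("appProperties", {}).get(VERSION_KEY, 0))


def metadata_for_upload(last_known_version: int) -> Dict[str, Any]:
    """Metadata to send along with an upload: bump the counter by one.

    Property values are strings, hence the str() conversion.
    """
    return {"appProperties": {VERSION_KEY: str(last_known_version + 1)}}


def should_sync_down(local_version: int, metadata: Dict[str, Any]) -> bool:
    """Sync down only when the app-level counter differs from ours."""
    return remote_sync_version(metadata) != local_version
```

The trade-off is that only writers that follow this convention bump the counter, so edits made through the Drive web UI or other apps would go unnoticed.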

Related

Is it possible to initiate Google Drive File Approval via API?

Google Drive Approvals are now out of beta and allow for a user to request one or more other parties to approve/reject a document, then lock it as non-editable. Is there a way to do this via the Google Drive/Google Docs/Google Slides API, or do you have to use the Web GUI?
I have found a workaround to know if a document is approved or not (unfortunately not to start an approval flow):
Open the document you want to start approval;
Click File > Approval > Start new Approval flow, select option "Lock file before sending approval request"
Call API /drive/v3/files/:fileId?fields=contentRestrictions, the response will look like this:
{
"contentRestrictions":[
{
"readOnly":true,
"restrictingUser":{
"kind":"drive#user",
"displayName":"........",
"me":true,
"permissionId":"........",
"emailAddress":".........."
},
"restrictionTime":"2022-04-27T11:03:42.430Z",
"type":"globalContentRestriction"
}
]
}
Go back to the document and complete the approval; the file will then be locked.
Call the API /drive/v3/files/:fileId?fields=contentRestrictions again, and now the response will look like this:
{
"contentRestrictions":[
{
"readOnly":true,
"reason":"Locked for File Approval",
"restrictingUser":{
"kind":"drive#user",
"displayName":"......",
"me":true,
"permissionId":".......",
"emailAddress":"........"
},
"restrictionTime":"2022-04-27T10:54:24.594Z",
"type":"globalContentRestriction"
}
]
}
The difference is in the "reason" field: "Locked for File Approval" means that the file was approved.
P.S.: I don't know how reliable this workaround is; I ran several tests and it seems consistent.
P.S.: I don't know if there is a better way to do it; this is the only one I could find at the moment.
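The check on the "reason" field can be expressed as a small helper. This is a sketch based only on the two sample responses above; the function name is hypothetical and the reason string is not a documented contract.

```python
from typing import Any, Dict

APPROVAL_REASON = "Locked for File Approval"  # string observed in the sample response


def is_locked_for_approval(metadata: Dict[str, Any]) -> bool:
    """True if any read-only content restriction carries the approval reason."""
    return any(
        restriction.get("readOnly") and restriction.get("reason") == APPROVAL_REASON
        for restriction in metadata.get("contentRestrictions", [])
    )
```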

Google Drive API: Where or How to find Shared Drive ID?

I'm trying to use Google's APIs Explorer to run a method (drive.permissions.create) in which I grant a user access to an entire Google Shared Drive (not specifically a file or folder).
After executing the command, I get the following error:
An error occurred. See the response for details.
Request
POST https://www.googleapis.com/drive/v3/files/0AKppN1yZFzBbUk9PVA/permissions?emailMessage=Test&key={YOUR_API_KEY}
{
"role": "organizer",
"type": "user",
"emailAddress": "test@test.com"
}
Response
{
"error":{
"errors":[
{
"domain":"global",
"reason":"notFound",
"message":"File not found: 0AKppN1yZFzBbUk9PVA.",
"locationType":"parameter",
"location":"fileId"
}
],
"code":404,
"message":"File not found: 0AKppN1yZFzBbUk9PVA."
}
}
I acquired the value 0AKppN1yZFzBbUk9PVA from another command that I thought would give me the ID of the Shared Drive I'm trying to share (drive.drives.list). However, this value seems to be incorrect. Where or how can I find the ID of the Shared Drive I'm trying to give access to?
Also, if I'm missing another value to input other than a correct Id for the command to be successful, please let me know. There is still so much I don't know about how Google Drive's API works (or APIs for that matter).
Thanks in advance!
For Shared Drives you need to set supportsAllDrives to true.
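A minimal sketch of the request URL with that parameter added, assuming the ID returned by drive.drives.list is passed as the fileId. The helper name is hypothetical; it only builds the URL and sends nothing.

```python
from urllib.parse import quote, urlencode

BASE = "https://www.googleapis.com/drive/v3/files"


def permissions_create_url(item_id: str, **query: str) -> str:
    """Build the permissions.create URL with supportsAllDrives=true set."""
    params = {"supportsAllDrives": "true", **query}
    return f"{BASE}/{quote(item_id, safe='')}/permissions?{urlencode(params)}"


# Example: same request as in the question, plus the extra parameter.
url = permissions_create_url("0AKppN1yZFzBbUk9PVA", emailMessage="Test")
```

The request body (role, type, emailAddress) stays the same as in the question.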

Forge conversion to OBJ only returning SVF

I'm following the step-by-step Extract Geometry tutorial, and everything seems to work fine, except that when I check the manifest after posting the job, it always returns the manifest for the initial conversion to SVF.
The tutorial specifically states that you must convert to SVF first. This takes a few seconds to a few minutes, starting at 0% and going until 100%. I await completion, and when I post the second job with the following payload (verifying that the payload is as requested)
let objPayload = {
  "input": {
    "urn": job.urn // urn retrieved from the file upload / svf conversion
  },
  "output": {
    "formats": [
      {
        "type": "obj",
        "advanced": {
          "modelGuid": metaData[0].guid,
          "objectIds": [-1]
        }
      }
    ]
  }
};
(where metaData[0].guid is the guid provided by Step 1's call to /modelderivative/v2/designdata/${urn}/metadata)
, then the job actually starts at about 99%. It sometimes takes a few moments to complete, but when it does, the call to retrieve the manifest returns the previous manifest, where the output format is marked as "svf".
The POST Job page states that
Derivatives are stored in a manifest that is updated each time this endpoint is used on a source file.
So I would expect the returned manifest to be updated to include the requested 'obj'. But it is not.
What am I missing here?
As Cyrille pointed out, the translate job only works consistently when translating to SVF. If translating to OBJ, you can only do so from specific formats, listed in this table.
At the time of this writing, if you request a job outside that table (e.g. IFC->OBJ), the service will still accept your job and simply not perform it. So if you're following the "Extract Geometry" tutorial, when you request the manifest, it is still pointing to the original SVF translation.
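Since an unsupported job is accepted silently, it may be worth checking the manifest explicitly for the requested output type. The sketch below assumes the manifest is a decoded dict with a "derivatives" list whose entries carry "outputType" and "status"; those field names are my assumption about the manifest layout and the function name is hypothetical.

```python
from typing import Any, Dict, Optional


def derivative_status(manifest: Dict[str, Any], output_type: str) -> Optional[str]:
    """Return the status of the derivative with the given output type, if any."""
    for derivative in manifest.get("derivatives", []):
        if derivative.get("outputType") == output_type:
            return derivative.get("status")
    # No such derivative: the job was never performed for this format.
    return None
```

A result of None for "obj" after the job has finished would correspond to the situation described above, where only the SVF derivative exists.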

Q: Google Drive API - Getting modified files

When requesting a list of files modified since a certain time, does anyone know how long it takes for the response to show files after they have been modified?
I modified a file a little after 12:30am on 6/24 (a few minutes ago). If I request a list of files that have been modified since 8:35pm on the previous day, the file shows up:
REQUEST (modifiedTime > "2017-06-23T21:30:00.000Z")
GET https://www.googleapis.com/drive/v3/files?corpora=teamDrive&includeTeamDriveItems=true&orderBy=modifiedTime+desc&q=(trashed+!%3D+true)+AND+(NOT+(mimeType+contains+%22.folder%22))+AND+(modifiedTime+%3E+%222017-06-23T21%3A30%3A00.000Z%22)&supportsTeamDrives=true&teamDriveId=0AF36YeSWsu3dUk9PVA&fields=files(name%2Cid%2CfileExtension%2CmimeType%2CcreatedTime%2CmodifiedTime%2Csize%2CimageMediaMetadata(height%2Cwidth)%2Cparents%2CwebContentLink%2CheadRevisionId)&key={YOUR_API_KEY}
RESPONSE
{
"files": [
{
"id": "1gc9ooedN1YNQkMHqFuI-keekHvuN9h57ssz8Dn8cpU0",
"name": "2017 Men's NCAA Wrap-Up",
"mimeType": "application/vnd.google-apps.spreadsheet",
"parents": [
"0B4jAnSzS-VxlLVpBQ21KMjVMSE0"
],
"createdTime": "2017-06-16T12:38:55.364Z",
"modifiedTime": "2017-06-24T00:31:46.251Z"
}
]
}
If I request a list of files that have been updated since 11:30pm on the previous day, it does not:
REQUEST (modifiedTime > "2017-06-23T23:30:00.000Z")
GET https://www.googleapis.com/drive/v3/files?corpora=teamDrive&includeTeamDriveItems=true&orderBy=modifiedTime+desc&q=(trashed+!%3D+true)+AND+(NOT+(mimeType+contains+%22.folder%22))+AND+(modifiedTime+%3E+%222017-06-23T23%3A30%3A00.000Z%22)&supportsTeamDrives=true&teamDriveId=0AF36YeSWsu3dUk9PVA&fields=files(name%2Cid%2CfileExtension%2CmimeType%2CcreatedTime%2CmodifiedTime%2Csize%2CimageMediaMetadata(height%2Cwidth)%2Cparents%2CwebContentLink%2CheadRevisionId)&key={YOUR_API_KEY}
RESPONSE
{
"files": [
]
}
Eventually the file will show up in the list, but it does not seem to be a matter of minutes (I stopped clicking refresh after 5 minutes). If I walk away for an hour or two, it shows up in the list. Interestingly enough, the modifiedTime on the file is immediately correct if the file is returned in the response (see the first response above). Is this a bug or should I expect to have to wait a certain period of time (and if so, how long) before the query returns the right results?
The answer I've found is that the time seems to vary. I have switched to using the drive.changes.list method instead of the drive.files.list method with the "q" parameter. Not only do changes appear sooner in the changes list, but you can actually see how long it was between the file "modifiedTime" and the change "time". I have seen it range from seconds up to 10-15 minutes.
The other observation I had was that if I close the file in the browser, the change immediately appears in the changes list. I guess Google auto saves the document at particular times. I can't find a way to force a save while the document is open. An explicit File | Save might be nice to have, but closing the window seems to do the trick.
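The lag mentioned above can be measured per change entry. This sketch assumes a v3 change entry decoded into a dict, with a "time" field and a nested "file" that includes "modifiedTime" (which has to be requested via the fields parameter); the function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Any, Dict


def parse_rfc3339(value: str) -> datetime:
    """Parse timestamps such as 2017-06-24T00:31:46.251Z into aware datetimes."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def change_lag(change: Dict[str, Any]) -> timedelta:
    """Time between the file's modifiedTime and the change being recorded."""
    return parse_rfc3339(change["time"]) - parse_rfc3339(change["file"]["modifiedTime"])
```

Logging this value over a few days would give a picture of the delay you need to tolerate for your own files.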

Can using the Documents List API cause files to appear on the change list?

My application is currently using the Documents List API to track file and metadata changes using the Changelist. When we find a file has changed, we grab the metadata, the ACL information, and the actual file. Lately we've found that a percentage of files continually show up in the changelist every time we check.
After a bit of investigating, there is very little metadata that is changing in the file.
Here are examples from two different files that continually show up in the changelists.
Is there any way I can avoid seeing these files over and over again? I have partially optimized to not download the files again, but it still takes quite a bit of extra overhead to weed out false positives from the changelist. Does anyone know if updating my app to use the Drive API will fix this issue?
Here is an example of what I'm seeing:
File 1 - Through the Documents List API
Initial Info
entry:etag=\""CkcaSU1LASt7ImBk"\"
id:...feeds/id/spreadsheet%3A0AgVqS9FfzZOCdGhZSVZ4UEtyT2tmRnZsR3lGNFBrVWc
published:2010-12-13T01:58:22.467Z
updated:2010-12-13T02:03:22.269Z
...
link:rel=\"thumbnail\" type=\"image/jpeg\" href=...?id=0AgVqS9FfzZOCdGhZSVZ4UEtyT2tmRnZsR3lGNFBrVWc&v=1&s=AMedNnoAAAAAUQHGlnP_b5jppjlFLN9OHRY5VSP2KZNR&sz=s220\"
...
/entry
Next Time I looked at the changelist
entry etag=\""CkUFR0sIQyt7ImBk"\"
id:...feeds/id/spreadsheet%3A0AgVqS9FfzZOCdGhZSVZ4UEtyT2tmRnZsR3lGNFBrVWc
published:2010-12-13T01:58:22.467Z
updated:2010-12-13T02:03:22.269Z
...
link:rel=\"thumbnail\" type=\"image/jpeg\" href=\"...?id=0AgVqS9FfzZOCdGhZSVZ4UEtyT2tmRnZsR3lGNFBrVWc&v=1&s=AMedNnoAAAAAUQMH4STQC7QSN1CJivPIl0U5KvMD8eKe&sz=s220\"
...
/entry
The only differences are the etag, updated time, and thumbnail image. The file itself did not change at all.
File 2 - This info I grabbed using the APIs Explorer (using the Drive API v2 changes.get)
{
"kind": "drive#change",
"id": "21012",
"fileId": "0AgVqS9FfzZOCdGQyQUNjWkF0alVpNGd0WXNLMnpNU2c",
...
"thumbnailLink": ".../feeds/vt?gd=true&id=0AgVqS9FfzZOCdGQyQUNjWkF0alVpNGd0WXNLMnpNU2c&v=1&s=AMedNnoAAAAAUQlhSo3rF73K5WnN7E0qSR0uMhWEqM-t&sz=s220",
...
}
Ran through grabbing changes from the Documents List API, then checked the changelist again.
{
"kind": "drive#change",
"id": "21013",
"fileId": "0AgVqS9FfzZOCdGQyQUNjWkF0alVpNGd0WXNLMnpNU2c",
...
"thumbnailLink": ".../feeds/vt?gd=true&id=0AgVqS9FfzZOCdGQyQUNjWkF0alVpNGd0WXNLMnpNU2c&v=1&s=AMedNnoAAAAAUQlh69m8ZG_MzNujmmu80HN9XJ2jpG61&sz=s220",
...
}
In this case, the thumbnail link had again changed, and there was no longer a change with id 21012.