Read file from drive in google colab - google-drive-api

I have read the notebook about how to open Drive, and I did as instructed using:
from google.colab import drive
drive.mount('/content/drive')
After this, I can use !ls to list the contents of my drive but I cannot read or open any file. I already tried:
with open("/content/drive/My Drive/filename.ext", "r") as file:
file = open("/content/drive/My Drive/filename.ext", "r")
!cp "/content/drive/My Drive/filename.ext" "filename.ext"
and also
import pandas as pd
file = pd.read_csv("/content/drive/My Drive/filename.ext")
But none of the above worked. I always get "operation not supported" or "cannot open file for reading".
I have seen some suggestions to use PyDrive, but that works by copying files from Google Drive to Google Drive. I don't see why I would have to copy files back and forth, since I need to iterate over all the files in the folder.
Why can't Google Colab just read the files stored on Drive? Or am I doing something wrong? Another thing: I uploaded a bunch of CSV files, but Google Drive lists them as ".csv.gsheet" (using glob). Could that be the problem? I have no other ideas.

It is straightforward.
from google.colab import drive
drive.mount('/content/drive')
This will ask you to open a URL, which authorizes the mount after you copy and paste the token.
If you still cannot read files, prefix your file path with 'drive/My Drive' and you are good to go.
For example: file = 'drive/My Drive/data/file.txt'
Where data is a directory in my Google Drive containing file.txt file.

I ran into a similar issue last night. As some of the previous responders posted, two concerns influence your ability to read the file: first, making certain that your file is accessible via Google Drive from your Colab notebook and, second, making certain that your file is in the correct format.
I will explain the steps and include a screenshot.
Open Google Colab. Open the File Browser.
Click the icon that says Mount Drive when hovered. This inserts a new cell in your notebook with the code:
from google.colab import drive
drive.mount('/content/drive')
Run the cell. You are prompted to accept permissions and get a token to use to mount the drive. Grant the permissions and copy and paste the code into the text input. Hit enter.
The drive now appears in the file browser. Right click the folder /drive/My Drive or click the three dots action menu and select Upload.
Locate your file on disk and Upload.
The file appears in the File Browser. Right click the File (or use the three dots action menu) and select Copy Path.
Paste that file path into your pd.read_csv() call.
Run the cell with the pd.read_csv function call.
You should now have the file uploaded to your Google Drive, accessible to Google Colab, with its formatting preserved because no other program has accessed it and munged the format.
Below is the example sans Permission tab because I previously granted permissions.

I just tried mounting and creating a Drive file as you described and couldn't reproduce the error you describe.
https://colab.research.google.com/drive/17iiKJPQOPv1eW5-Ctf707mPHXDtipE5G
Perhaps try resetting your backend using the Runtime -> Reset all runtimes menu. Or, can you share a notebook illustrating the problem?

I (partially) found out what was going on based on Bob Smith and Ami F's answers.
I believe Google Drive blocks read access to files converted to Drive formats (gsheet, gdoc, etc.). And so, whenever I tried to use !cat or open, I got an "operation unsupported" error. When I tried Bob's example, creating a file and then reading it, it worked on my notebook.
So I managed to prevent Google from converting files, deleted the old files and uploaded everything to Drive again. Now all my CSVs are kept unchanged (no .gsheet extension) and I am able to access them using open.
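A minimal sketch of iterating over a mounted folder while skipping the converted placeholder files described above. It uses only the standard library; the helper names are hypothetical, and the ".gslides" suffix is an assumption added alongside the ".gsheet" and ".gdoc" formats mentioned in this answer.

```python
import csv
from pathlib import Path

# Suffixes of Drive-native placeholder files. ".gsheet" and ".gdoc" come from
# the answer above; ".gslides" is an assumption added for illustration.
DRIVE_NATIVE_SUFFIXES = {".gsheet", ".gdoc", ".gslides"}


def readable_csvs(folder):
    """Return sorted paths of plain .csv files in folder, skipping placeholders."""
    paths = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        if path.suffix.lower() in DRIVE_NATIVE_SUFFIXES:
            continue  # e.g. "data.csv.gsheet": converted, so open() is expected to fail
        if path.suffix.lower() == ".csv":
            paths.append(path)
    return paths


def read_rows(path):
    """Read one CSV file into a list of rows (each row is a list of strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


# Usage in Colab after drive.mount('/content/drive'); "data" is a hypothetical folder:
# for csv_path in readable_csvs("/content/drive/My Drive/data"):
#     rows = read_rows(csv_path)
```

Filtering on the last suffix means a file listed as "name.csv.gsheet" is skipped rather than opened, which avoids the "operation not supported" error while looping.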

The fact that you see ".csv.gsheet" filenames even though you upload ".csv" filenames makes me think that you're uploading your CSVs to sheets.google.com instead of drive.google.com. Can you confirm that uploading to drive.google.com makes things work?

I do suspect RenatoSz's answer is correct: I can open XLSX files fine, but even just file = open('name_of_file.gsheet') fails for me with Operation not supported error. Annoying that you cannot do the simple action of opening a Google Sheet in Google Colab - this seems like basic functionality.
A workaround for me was:
from google.colab import auth
auth.authenticate_user()
import gspread
import pandas as pd
from oauth2client.client import GoogleCredentials
# authorise
gc = gspread.authorize(GoogleCredentials.get_application_default())
# open
gsheets = gc.open_by_url('some_fun_URL')
# read
sheets = gsheets.worksheet('List of all experts').get_all_values()
# parse
df = pd.DataFrame(sheets[1:], columns=sheets[0])
Note that gc.open(...) did not work for me.
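To make the parse step concrete: get_all_values() hands back a list of rows, with the header in the first row. The sketch below shows the same header/body split without pandas; the function name and the sample data are made up for illustration.

```python
def rows_to_records(rows):
    """Turn [[header...], [values...], ...] into a list of dicts keyed by header."""
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample shaped like a worksheet's values: header row first.
sample = [["name", "field"], ["Ada", "math"], ["Linus", "kernels"]]
records = rows_to_records(sample)
```

This mirrors pd.DataFrame(sheets[1:], columns=sheets[0]): the first row names the columns and the remaining rows supply the values.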

You can avoid this problem with the following steps.
Upload the dataset (.csv) to Google Drive.
Now select the uploaded dataset in Google Drive and choose the Share option from the drop-down menu obtained by left-clicking the selected file.
A pop-up window now appears on the screen; change the sharing setting to Editor.
In the bottom left of the pop-up window, change "Restricted" (only people added can edit) to "Anyone with the link can edit".
After these settings are saved, copy the generated shareable link.
Now go to the website below, paste the link you just copied, and generate the Google Drive downloadable link.
https://sites.google.com/site/gdocs2direct/
Copy the generated google drive downloadable link.
We are ready with the perfect path address now for the dataset.
import urllib.request
file = urllib.request.urlopen("Paste Here the Generated link which we copied")
This should sort out the issue.
The same works for a .txt file as well.
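The link conversion done by that website can also be sketched locally. The two share-link shapes and the uc?export=download form below are assumptions based on commonly seen Drive links, not something stated in this answer, and the function name is hypothetical.

```python
import re

# Assumed share-link shapes (not stated in the answer above):
#   https://drive.google.com/file/d/<FILE_ID>/view?usp=sharing
#   https://drive.google.com/open?id=<FILE_ID>
_FILE_ID_PATTERNS = [
    re.compile(r"/file/d/([A-Za-z0-9_-]+)"),
    re.compile(r"[?&]id=([A-Za-z0-9_-]+)"),
]


def to_direct_download(share_url):
    """Extract the file ID from a share link and build a direct-download URL."""
    for pattern in _FILE_ID_PATTERNS:
        match = pattern.search(share_url)
        if match:
            return "https://drive.google.com/uc?export=download&id=" + match.group(1)
    raise ValueError(f"no Drive file ID found in {share_url!r}")
```

The resulting URL is a web address rather than a local path, so it is meant for something that fetches URLs, such as urllib.request.urlopen or pd.read_csv.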

Related

How to direct Puppeteer downloads directly into Google AppScript

I would appreciate it if anyone can suggest ideas.
I currently have a Puppeteer script which automates CSV downloads into google drive which then allows me to programmatically access the data and move it into google sheets.
Currently, I run this locally on my computer in VS Code. I don't yet understand very well how to handle file downloads through Puppeteer, but I have a workaround: I set the default Chrome download path to a folder on my computer, and that folder is actually Drive for desktop, so it is an easy way to get files directly into Drive.
await page._client.send('Page.setDownloadBehavior', {behavior: 'allow', downloadPath: path.resolve('G:/My Drive/DriveData')});
The thing is, I'd like this to be some type of scheduled cloud function that doesn't have to be run manually on my computer. If I were to do that, my Drive for desktop workaround would not be viable. So is there a way I could pass the file download in Puppeteer directly into Google Apps Script somehow? Or use the Drive API to receive the file directly?
My main issue is that I don't understand how to actually handle file downloads in Puppeteer; I'm currently just editing the default download location for the Chrome instance Puppeteer creates. I ultimately just need to get the data somewhere accessible by Google Apps Script.
Any suggestions would be uber-appreciated!
I was facing the same situation. Unfortunately, we can't install or use Puppeteer in the Google Apps Script environment.
The alternative is to download the file using UrlFetchApp, handling the session and cookies yourself.
The following code will help you save the file into Google Drive:
var file = folder.createFile(UrlFetchApp.fetch("url of the downloaded file fetched dynamically"));
Check the network log in Chrome to find out which URL request delivers the downloaded file; that is the URL you pass to the function.
Another alternative that will help you to download the file to drive is to use https://developers.google.com/drive
I have used Python to save files in Drive: https://www.thepythoncode.com/article/using-google-drive--api-in-python

Power Automate Error moving CSV from SharePoint to Google Drive

I'm running into an issue with my Power Automate Flow. I'm trying to push a CSV file from a SharePoint folder to a folder in Google Drive. Creating the CSV is fine, but when I try to push the CSV there seems to be a (unreasonable) limit on the size of the CSV I can push.
As the file already exists and I am simply PUSHING the file, not recreating it, surely it shouldn't hit any limits? This file is only 52MB!
The Flow I have - previous step is just a scheduler
error message I get
Take a look at this link and try copying to Google Drive:
https://sharepains.com/2019/05/31/copy-large-files-with-microsoft-flow/

Browse Files and folders uploaded on Google Drive

I'm writing from Italy (sorry for any mistakes). My question is about the Google Drive API, and I'm writing here because G Suite support told me to contact you. I'm not a programmer.
The big question is: how can I browse the folders and files uploaded to my Google Drive via the API?
Here is what happened.
I wanted to use a backup program to save files and folders to Google Drive on a schedule.
I downloaded the program (Iperius Backup) and followed the instructions.
To explain what I did, I paste the links to the instructions below:
http://www.iperiusbackup.net/en/backup-to-google-drive/
http://www.iperiusbackup.net/en/how-to-enable-google-drive-api-and-get-client-credentials/
I tried the software and everything worked.
So I opened Google Drive, but the backed-up files weren't there.
Since something was not clear, I erased all the files on my Google Drive. I wanted to make sure that it was empty, hoping to erase also “hidden files” (...even if I thought it wouldn't be so simple...)
To make sure that everything was erased (also “hidden files”), I tried to restore files from the backup software, but unfortunately the restore worked... the files were still there, somewhere...
This means that, through the backup software, I have saved files and folders on Google Drive “Api”, but I cannot manage them.
How can I have access to my space in Google Drive Api and erase all my files and folders that are still there?
Thank you in advance for the help,
best regards,
Stefano
To list all files in your drive, you can use Files: list.
This method accepts the q parameter, which is a search query
combining one or more search terms. For more information, see the
Search for Files and Team Drives guide.
You can use the Try it now feature or the sample code given for the language you are using.
To delete/erase a file, use Files: delete.
Permanently deletes a file by ID. Skips the trash. The currently
authenticated user must own the file or be an organizer on the parent
for Team Drive files. Try it now or see an example.
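As a rough illustration of the q parameter mentioned above, the sketch below only builds a Files: list request URL; it does not send it. It assumes Drive API v3 (where the file name field is called name) and the endpoint https://www.googleapis.com/drive/v3/files. The helper names are hypothetical, and a real request also needs an OAuth access token, which is not shown.

```python
from urllib.parse import urlencode

# Assumed Drive API v3 endpoint for Files: list.
FILES_LIST_ENDPOINT = "https://www.googleapis.com/drive/v3/files"


def quote_term(value):
    """Escape backslashes and single quotes for use inside a q string literal."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_list_url(name_contains=None, include_trashed=False, page_size=100):
    """Combine search terms with 'and' and return the full request URL."""
    terms = []
    if name_contains is not None:
        terms.append(f"name contains '{quote_term(name_contains)}'")
    if not include_trashed:
        terms.append("trashed = false")
    params = {"pageSize": page_size, "fields": "nextPageToken, files(id, name)"}
    if terms:
        params["q"] = " and ".join(terms)
    return FILES_LIST_ENDPOINT + "?" + urlencode(params)


# Hypothetical search: non-trashed files whose name contains "backup".
url = build_list_url(name_contains="backup")
```

Each file in the response carries an id, which is the value Files: delete expects.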

Cannot export previous Sheet revisions to xlsx

I am using the Google Drive REST Api V2 in order to backup certain files to a local drive. Using the revisions list method (https://developers.google.com/drive/v2/reference/revisions/list), I obtain an export link to xlsx or docx depending on my file type.
For example, the link looks like : https://docs.google.com/spreadsheets/export?id=[fileId]&revision=[revisionId]&exportFormat=xlsx
However, I am now unable to use the export link to xlsx on any revision except the latest on my Sheet. When following the link, I obtain a page saying : "Google Docs encountered an error. Please try reloading this page, or coming back to it in a few minutes."
I have tried using the export links to other types and these seem to work fine. Please note that I do not have the same issue with Google Docs when I use the docx format.
I have also noticed that the REST Api V3 does not offer a way to export previous revisions of Google Drive documents. Therefore, I am wondering if the reason it is not working anymore on V2 is because it simply cannot be done anymore or it is just a temporary failure.
Found a workaround to your problem.
Use exportFormat=csv instead of exportFormat=xlsx.
This will download the file as .csv file. Open the CSV file and save it as .xlsx.
This should work for you.
EDIT:
Just to close this thread, the Google Drive team has recently fixed the said issue. &exportFormat=xlsx is now working as intended. I gave it a try and it works!
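For reference, a small sketch that builds the export link in the shape quoted in the question, with the format as a parameter so that switching between xlsx and csv is a one-word change. The function name is hypothetical and the IDs in the usage line are placeholders; the request itself still needs an authenticated session.

```python
from urllib.parse import urlencode

# Base URL taken from the export link quoted in the question.
EXPORT_BASE = "https://docs.google.com/spreadsheets/export"


def revision_export_url(file_id, revision_id, export_format="xlsx"):
    """Build an export link for one revision of a Sheet in the given format."""
    query = urlencode(
        {"id": file_id, "revision": revision_id, "exportFormat": export_format}
    )
    return f"{EXPORT_BASE}?{query}"


# Placeholder IDs; substitute values returned by the revisions list method.
csv_link = revision_export_url("FILE_ID", "REVISION_ID", export_format="csv")
```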

Move file from appDataFolder to user's root folder with Google drive API

Is it possible to move a file from appDataFolder to the user's root folder on Google Drive using Drive API v2 or v3? I can't find any example of how to do that. I tried using gapi.client.drive.files.update from the JavaScript Drive v3 API with the addParents parameter to change the folder. It works fine with files in the user's root folder, but doesn't work with files in appDataFolder.
I know that it is possible to copy a file from appDataFolder to the user's Drive root, but I need to keep the fileId, and copying generates a new fileId for the copied file.
I found there is a file property called "spaces": files from appDataFolder are in spaces=appDataFolder, whereas files from the user's root folder are in spaces=drive. Is it possible to move a file between these spaces while keeping the same fileId?
I found some similar posts:
Copy an exising Drive file into the appdata folder
Is it possible to share the application data on google drive
and it looks like it is not possible to do it this way. When I check my console I also get "Method not supported for appdata contents" or "Method not supported for files within the Application Data folder." message.
So is there any method to move file from appDataFolder?
Thanks for the help.
No, it seems to be impossible. The current documentation doesn't mention it, but the error message clearly says so. Thumbs down for the Drive API.