From Colaboratory, is it possible to directly manipulate SQLite 3 data stored in Google Drive?
It is possible if I upload the file, but it would be more convenient to use it straight from Google Drive.
You can load files directly from Drive by mounting your Google Drive as a FUSE filesystem.
Here's an example:
https://colab.research.google.com/notebook#fileId=1srw_HFWQ2SMgmWIawucXfusGzrj1_U0q
There's no official Google Drive FUSE filesystem. But, several open-source FUSE + Drive libraries have been written by third parties. The example notebook above uses google-drive-ocamlfuse. The notebook shows three things:
Installing the Drive FUSE wrapper in the Colab VM.
Authenticating to Drive and mounting your Drive using the FUSE wrapper.
Listing and creating files in the newly mounted Drive-backed filesystem.
First, upload database.sqlite and the desired CSV file (Reviews.csv in my case) to Google Drive.
Then mount your Drive in Google Colab using the following command:
from google.colab import drive
drive.mount('/content/gdrive')
This will result in the following output:
Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly
Enter your authorization code:
Click on the URL. It opens a new tab where you choose your Google account and grant certain privileges to Google Drive File Stream:
Choose an account
to continue to Google Drive File Stream
You must select the same Google account that you used to log in to Google Colab. Then click the 'Allow' button. This leads to another page that shows an alphanumeric code. Copy the code and paste it into the text field (Enter your authorization code:).
Consequently, the message about the drive being mounted will be displayed:
··········
Mounted at /content/gdrive
Now click on the folder icon (below the 'show code snippet pane' icon, on the left side of the code cell) to see the gdrive folder. Your Google Drive's My Drive is located inside gdrive. Navigate to the folder in which you stored database.sqlite, right-click the file, and select 'Copy path'.
Paste the path you copied into the following command:
import sqlite3
con = sqlite3.connect('paste the path you copied')
For example, if the database.sqlite resides in '/content/gdrive/My Drive/Colab Notebooks/database.sqlite', then the command will be as follows:
con = sqlite3.connect('/content/gdrive/My Drive/Colab Notebooks/database.sqlite')
Now you may run some SQL queries to check whether all is well:
import pandas as pd
filtered_data = pd.read_sql_query("""
SELECT *
FROM Reviews
WHERE Score != 3
""", con)
print(filtered_data.head())
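If queries against the mounted file feel slow, one option is to copy the database to the Colab VM's local disk first and open the copy. This is only a minimal sketch, not something the steps above require: open_local_copy and list_tables are hypothetical helper names, and /content is assumed to be the VM's working directory. Changes made to the copy are not written back to Drive unless you copy the file back yourself.

```python
import shutil
import sqlite3
from pathlib import Path


def open_local_copy(drive_db_path, local_dir="/content"):
    """Copy a SQLite file off the Drive mount and connect to the copy."""
    local_path = Path(local_dir) / Path(drive_db_path).name
    shutil.copy(drive_db_path, local_path)
    return sqlite3.connect(str(local_path))


def list_tables(con):
    """Return the table names in the database, sorted alphabetically."""
    rows = con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Example, using the path from the answer above:
# con = open_local_copy('/content/gdrive/My Drive/Colab Notebooks/database.sqlite')
# print(list_tables(con))
```

Listing the tables first is a quick way to confirm that the path points at the database you expect before running larger queries.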
I need to upload a large file (Sample File) to Google Colab. This file is located on a Google Drive account.
Consider this situation:
My Google Drive is almost full, so I cannot upload the file to my Drive.
My connection is slow, so downloading this file and re-uploading it to Google Drive is a real challenge for me.
I also read some Stack Overflow pages such as Import data into Google Colaboratory, as well as these:
Download File from URL to Google Drive using Google Colab in Python, Get Started: 3 Ways to Load CSV files into Colab, and 7 ways to load external data into Google Colab. But none of them was useful for my case. I also tried the !wget command, but it could not download a Google Drive link.
Assume you have a shared google drive link like this one:
Then share it with another person:
Now, go to your Google Drive and check 'Shared with me':
Choose 'Add shortcut to Drive'. The file is now part of your Google Drive and is accessible in Colab.
Finally, go to Colab and run these commands:
from google.colab import drive
drive.mount('/content/drive')
import os
os.chdir("/content/drive/My Drive/Weights/")
And the file is there!
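If you are not sure which folder the shortcut ended up in, a small search over the mounted Drive can locate it. This is a rough sketch: find_in_drive is a hypothetical helper, and the example assumes the mount point /content/drive used above. Walking a large Drive this way can take a while.

```python
from pathlib import Path


def find_in_drive(root, filename):
    """Return every file under root whose name matches filename, sorted."""
    return sorted(p for p in Path(root).rglob(filename) if p.is_file())


# Example (the file name is a placeholder):
# for path in find_in_drive('/content/drive/My Drive', 'weights.h5'):
#     print(path)
```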
I am working in Google Colaboratory and have mounted my Google Drive so that I can access the CSV data I am processing with pandas, for example:
# Mount the path to the location of your data
from google.colab import drive
drive.mount('/gdrive')
Everything works great.
But now when I share my Colab notebook with colleagues, they of course have no access to the CSV files.
So how can I share the CSV files residing in my Drive as well? Or are there alternatives to hosting the data in Google Drive, so that my colleagues and I can access them through pandas read_csv?
There are 2 easy ways.
1. gdown
If you have just one or two CSV files that can be shared publicly, you can get their file IDs and then use gdown to download them (no authentication needed).
Get the ID from the shared link:
https://drive.google.com/file/d/1KgSEm4A7hCov2Ta6-CZldqT-9JgtiVRI/view?usp=sharing
!gdown --id 1KgSEm4A7hCov2Ta6-CZldqT-9JgtiVRI
Your friend will get dataset.zip in the current directory.
2. kora.drive
If you have many CSV files or folders, put them all in a single folder and share that folder with your friend. I made a library to help download the whole folder. It checks that the user has permission to read the folder by asking them to authenticate.
Get the folder ID from the folder link:
https://drive.google.com/drive/folders/1HvIeNhqtFVllXFWzH5NawDdnIfgGDwCK
!pip install kora -q
from kora import drive
drive.download_folder("1HvIeNhqtFVllXFWzH5NawDdnIfgGDwCK")
Your friend will get a new directory Dataset inside current directory (e.g. /content/) with all files/subfolders copied from your google drive.
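Both approaches start from an ID copied out of a share link. As a convenience, the ID can be pulled out programmatically. The following is a sketch that only covers the link shapes shown in this thread (/file/d/ID/..., /folders/ID, and ?id=ID); extract_drive_id is a hypothetical helper, not part of gdown or kora.

```python
import re
from urllib.parse import parse_qs, urlparse

# Matches the ID segment in .../file/d/<ID>/... and .../folders/<ID> links.
_PATH_ID = re.compile(r"/(?:file/d|folders)/([A-Za-z0-9_-]+)")


def extract_drive_id(url):
    """Return the file or folder ID embedded in a Drive share link, or None."""
    parsed = urlparse(url)
    match = _PATH_ID.search(parsed.path)
    if match:
        return match.group(1)
    # Links of the form .../uc?id=<ID> carry the ID in the query string.
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


# Example:
# extract_drive_id("https://drive.google.com/file/d/1KgSEm4A7hCov2Ta6-CZldqT-9JgtiVRI/view?usp=sharing")
```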
There seem to be lots of ways to access a file on Google Drive from Colab but no simple way to save a file from Google Colab back to Google Drive.
For example, to access a Google Drive file from Colab, you can mount the Google Drive using
from google.colab import drive
drive.mount('/content/drive')
However, to save an output file you've generated in Colab on Google Drive the methods seem very complicated as in:
Upload File From Colab to Google Drive Folder
Once Google Drive is mounted, you can even view the drive files in the Table of Contents from Colab. Is there no simple way to save or copy a file created in Colab and visible in the Colab directory back to Google Drive?
Note: I don't want to save it to a local machine using something like
from google.colab import files
files.download('example.txt')
as the file is very large
After you have mounted the drive, you can just copy it there.
# mount it
from google.colab import drive
drive.mount('/content/drive')
# copy it there
!cp example.txt /content/drive/MyDrive
Other answers show how to copy a specific file. I would like to mention that you can also copy an entire directory, which is useful when copying logs from callbacks from Colab to Drive:
from google.colab import drive
drive.mount('/content/drive')
In my case, the folder names were:
%cp -av "/content/logs/scalars/20201228-215414" "/content/drive/MyDrive/Colab Notebooks/logs/scalars/manual_add"
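The same directory copy can be done from Python with shutil.copytree. This is a minimal sketch, assuming Python 3.8 or newer for dirs_exist_ok; copy_directory is a hypothetical wrapper name. Unlike cp, copytree always places the contents of the source directory directly inside the destination, whether or not the destination already exists.

```python
import shutil
from pathlib import Path


def copy_directory(src, dst):
    """Recursively copy the contents of src into dst, creating dst if needed."""
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise NotADirectoryError(f"source directory not found: {src}")
    # dirs_exist_ok (Python 3.8+) lets repeated runs refresh an existing copy.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Example, using the folder names from the answer above:
# copy_directory("/content/logs/scalars/20201228-215414",
#                "/content/drive/MyDrive/Colab Notebooks/logs/scalars/manual_add")
```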
You can use shutil to copy or move files between Colab and Google Drive:
import shutil
# with the Drive mounted at /content/gdrive, your own files live under MyDrive
shutil.copy("/content/file.doc", "/content/gdrive/MyDrive/file.doc")
When you are saving files, simply specify the Google Drive path for saving the file.
When using large files, Colab sometimes syncs the VM and Drive asynchronously. To force the sync, simply run:
from google.colab import drive
drive.flush_and_unmount()
In my case, I use the common approach with the !cp command.
But sometimes it doesn't work in Colab because the file path was entered incorrectly.
Basic form: !cp source_filepath destination_filepath
Example:
!cp /content/myfolder/myitem.txt /content/gdrive/MyDrive/mydrivefolder/
In addition, to enter the path correctly, you can copy it from the table of contents on the left side by clicking the dot menu -> Copy path.
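Since a wrong path is the usual cause of a failed copy, it can help to check both paths before copying so that the error names the one that is wrong. This is a sketch only; checked_copy is a hypothetical helper built on shutil.copy2.

```python
import shutil
from pathlib import Path


def checked_copy(src, dst_dir):
    """Copy src into dst_dir, failing early with a message naming the bad path."""
    src, dst_dir = Path(src), Path(dst_dir)
    if not src.is_file():
        raise FileNotFoundError(f"source file not found: {src}")
    if not dst_dir.is_dir():
        raise NotADirectoryError(f"destination folder not found: {dst_dir}")
    # copy2 returns the path of the newly created file.
    return Path(shutil.copy2(src, dst_dir))


# Example, using the paths from the answer above:
# checked_copy("/content/myfolder/myitem.txt",
#              "/content/gdrive/MyDrive/mydrivefolder/")
```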
Once you see the file in the Table of Contents of Colab on the left, simply drag that file into the "/content/drive/My Drive/" directory located on the same panel. Once the file is inside your "My Drive", you will be able to see it inside your Google Drive.
After you mount your drive...
from google.colab import drive
drive.mount('/content/drive')
...just prepend the full path, including the mounted path (/content/drive) to the file you want to write.
someList = []
with open('/content/drive/My Drive/data/file.txt', 'w', encoding='utf8') as output:
    for line in someList:
        output.write(line + '\n')
In this case we save it in a folder called data located in the root of your Google Drive.
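Note that open() fails if the target folder does not exist yet. A small variation, sketched here with a hypothetical write_lines helper, creates the folder first and then writes the lines the same way.

```python
from pathlib import Path


def write_lines(path, lines):
    """Write each item of lines on its own line, creating parent folders first."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf8") as output:
        for line in lines:
            output.write(line + "\n")
    return path


# Example, using the path from the answer above:
# write_lines('/content/drive/My Drive/data/file.txt', ['first', 'second'])
```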
You may often run into quota limits using the gdown library.
Access denied with the following error:
Too many users have viewed or downloaded this file recently. Please
try accessing the file again later. If the file you are trying to
access is particularly large or is shared with many people, it may
take up to 24 hours to be able to view or download the file. If you
still can't access a file after 24 hours, contact your domain
administrator.
You may still be able to access the file from the browser:
https://drive.google.com/uc?id=FILE_ID
No doubt gdown is faster, but I copy my files from the mounted Drive using the command below and avoid quota limits:
!cp /content/drive/MyDrive/Dataset/test1.zip /content/dataset
I was trying to upload a big image folder to Google Drive and GitHub, but GitHub does not allow it and Google Drive is taking too long. How can I upload the local folder to Colab?
Sorry, I don't think there's a solution to your issue. If your fundamental problem is limited upload capacity from the machine with the images, you'll just need to wait.
A nice property of uploading to Drive is that you can use programs like Backup and Sync to retry the transfer until it's successful. And, once the images have been uploaded to Drive, you'll be able to access them quickly in Colab thereafter without uploading again. (See this example notebook showing how to connect your Google Drive files to Colab as a filesystem.)
Convert the folder to a zip file and then upload it to Colab.
You can then unzip it with the following command:
!unzip "your path"
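If you prefer to stay in Python, the standard zipfile module can do the same extraction. This is a minimal sketch; extract_zip is a hypothetical helper name and the paths in the example are placeholders.

```python
import zipfile
from pathlib import Path


def extract_zip(zip_path, target_dir):
    """Extract every member of zip_path into target_dir and return their names."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Example (placeholder paths):
# names = extract_zip('/content/images.zip', '/content/images')
# print(len(names), 'entries extracted')
```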
The unzip method only works for csv files.
If you use a Kaggle dataset, use:
import os
os.environ['KAGGLE_USERNAME'] = 'enter_username_here' # username
os.environ['KAGGLE_KEY'] = 'enter_key_here' # key
!kaggle datasets download -d dataset_api_command_here
If you have the image in google drive, use
from google.colab import drive
drive.mount('/content/drive')
Is it possible to copy a file from Google Drive to Google Cloud Storage? I imagine it would be very fast since both are on a similar storage system.
I haven't seen any information on ways to do this seamlessly, without having to download the file (probably to a system outside Google) and then re-upload it. About one year ago, a similar question was asked: Transfer files from dropbox/drive to Google cloud storage.
Is there a way to do this directly, with the file staying within-Google the entire time?
@frunkad commented a link to a nice workaround, but for completeness I will repeat it here, as this is currently the top result in search.
You can open a Colab notebook (a Jupyter notebook on Google servers), mount your Google Drive, and use the gcloud CLI to copy files.
Open https://colab.research.google.com/ and connect to a machine.
Mount your Drive by executing the code below, clicking on the link, and pasting the authentication key:
from google.colab import drive
drive.mount('/content/drive')
Connect to your Google Cloud Storage. (Project id can be found here)
from google.colab import auth
auth.authenticate_user()
project_id = 'your-project-id'
!gcloud config set project {project_id}
!gsutil ls
Copy files using gsutil. Use the -m flag for multi-threading to increase speed. (There is a subfolder called "My Drive" that you have to include in the path to your mounted Drive.)
bucket_name = 'your_bucket_name'
!gsutil -m cp -r /content/drive/My\ Drive/Your-Data/* gs://{bucket_name}/
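The space in "My Drive" is easy to get wrong in a shell command. One alternative, sketched below, is to call gsutil from Python with the arguments passed as a list, so no escaping is needed. build_gsutil_copy and copy_to_bucket are hypothetical helper names, and the sketch assumes gsutil is on the PATH and the session is already authenticated as in the steps above. Because it copies the directory itself rather than its contents via /*, the objects should land under gs://your_bucket_name/Your-Data/ instead of at the bucket root.

```python
import subprocess


def build_gsutil_copy(src_dir, bucket_name):
    """Build the argument list for a recursive, multi-threaded gsutil copy."""
    return ["gsutil", "-m", "cp", "-r", src_dir, f"gs://{bucket_name}/"]


def copy_to_bucket(src_dir, bucket_name):
    """Run the copy; raises CalledProcessError if gsutil exits with an error."""
    subprocess.run(build_gsutil_copy(src_dir, bucket_name), check=True)


# Example, using the placeholder names from the steps above:
# copy_to_bucket("/content/drive/My Drive/Your-Data", "your_bucket_name")
```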
Original Author Philip Lies
Link to his colab: https://colab.research.google.com/drive/1Xc8E8mKC4MBvQ6Sw6akd_X5Z1cmHSNca
There's no direct way to do this. You could write an App Engine or Compute Engine app that reads from Drive using its API and writes to GCS using its API.