How to download Google Drive Folder using wget? - google-drive-api

I need to download a whole folder from Google Drive using wget.
How can I do that?

"You can't", more or less... but gdown can do it:
pip install gdown
gdown --no-check-certificate --folder <folder URL>
I've been using it for a while on macOS and Ubuntu.

You can download zipped files like this.
Suppose this is the shared link from your Google Drive:
https://drive.google.com/file/d/345efwerfer23rffedser/view?usp=sharing
Copy the file ID part of the shared link (345efwerfer23rffedser) into the command below. The leading ! is only needed in a notebook cell; drop it in a regular shell.
!wget --no-check-certificate 'https://drive.google.com/uc?export=download&id=345efwerfer23rffedser' -O 'files.zip'
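If you do this often, the copy-and-paste step can be automated. Here is a minimal Python sketch, assuming the shared link looks like the one above (or already carries an id= query parameter). The function names extract_file_id and direct_download_url are made up for illustration, not part of any library.

```python
import re
from urllib.parse import urlencode

# Patterns for the two link shapes assumed here:
#   .../file/d/<ID>/view?usp=sharing   and   ...?id=<ID>
_ID_PATTERNS = (
    re.compile(r"/file/d/([0-9A-Za-z_-]+)"),
    re.compile(r"[?&]id=([0-9A-Za-z_-]+)"),
)


def extract_file_id(shared_link: str) -> str:
    """Return the file ID embedded in a Google Drive shared link."""
    for pattern in _ID_PATTERNS:
        match = pattern.search(shared_link)
        if match:
            return match.group(1)
    raise ValueError(f"no file ID found in {shared_link!r}")


def direct_download_url(shared_link: str) -> str:
    """Build the uc?export=download URL used in the wget command."""
    query = urlencode({"export": "download", "id": extract_file_id(shared_link)})
    return f"https://drive.google.com/uc?{query}"


print(direct_download_url(
    "https://drive.google.com/file/d/345efwerfer23rffedser/view?usp=sharing"
))
```

The printed URL is the one to pass to wget together with -O 'files.zip'.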

Related

Trying to connect google drive to paperspace gradient notebook

I'm trying to mount Google Drive in a Paperspace notebook using google-drive-ocamlfuse, which I installed with the following commands:
sudo add-apt-repository ppa:alessandro-strada/ppa
sudo apt update && sudo apt install google-drive-ocamlfuse
but when launching with
google-drive-ocamlfuse
there's an error:
/bin/sh: 1: firefox: not found
/bin/sh: 1: google-chrome: not found
/bin/sh: 1: chromium-browser: not found
/bin/sh: 1: open: not found
Cannot retrieve auth tokens.
Failure("Error opening URL:https://accounts.google.com/o/oauth2/auth?client_id=..........
The google-drive-ocamlfuse GitHub page has instructions on "Headless Usage & Authorization", but they are for a local machine, not for something like Paperspace.
Is there any way I can use google-drive-ocamlfuse to mount the drive?
Is there any better/simpler method to mount Google Drive on Paperspace Gradient?
Short answer:
There is no way to mount Google Drive as a filesystem on Paperspace Gradient.
Long answer:
Your error message says that no browser could be opened. You are correct that you should use headless mode [https://github.com/astrada/google-drive-ocamlfuse/wiki/Headless-Usage-&-Authorization]. Basically, create an OAuth app, note down the client ID and client secret, then authenticate using google-drive-ocamlfuse -headless -id client-id -secret client-secret.
But even if the authentication step succeeds, you will still hit an error like fuse: device not found, try 'modprobe fuse' first. This is because a Paperspace Gradient notebook runs as a container, and a container cannot perform FUSE operations unless it has the SYS_ADMIN capability (see FUSE inside Docker). We have no control over how Paperspace runs its containers, so we are unable to mount a filesystem on Paperspace Gradient.
However, you can use something like https://github.com/iterative/PyDrive2 to access Google Drive files.
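For the PyDrive2 route, a rough sketch of pulling the files of one Drive folder into the notebook's local disk could look like this. It assumes you already have an authenticated pydrive2.drive.GoogleDrive object (how you authenticate in a headless container is left out here), and the helper name download_folder_files is hypothetical. It only relies on PyDrive2's ListFile(...).GetList() and GetContentFile(...) calls, and on the 'title' and 'mimeType' metadata keys PyDrive2 exposes.

```python
import os

# Assumed setup, not executed here:
#   from pydrive2.auth import GoogleAuth
#   from pydrive2.drive import GoogleDrive
#   gauth = GoogleAuth()        # then authenticate in whatever way works for you
#   drive = GoogleDrive(gauth)


def download_folder_files(drive, folder_id, dest_dir):
    """Download the regular files directly inside one Drive folder.

    `drive` is expected to behave like pydrive2.drive.GoogleDrive.
    Subfolders and native Google Docs/Sheets are skipped, because they
    have no binary content to fetch with GetContentFile.
    """
    os.makedirs(dest_dir, exist_ok=True)
    query = f"'{folder_id}' in parents and trashed=false"
    saved = []
    for item in drive.ListFile({"q": query}).GetList():
        if item["mimeType"].startswith("application/vnd.google-apps."):
            continue
        target = os.path.join(dest_dir, item["title"])
        item.GetContentFile(target)  # writes the file content to `target`
        saved.append(target)
    return saved
```

This copies files rather than mounting anything, so it sidesteps the FUSE restriction, at the cost of having to re-run it when the Drive contents change.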

Is there a way to mount Google Drive on my local machine like what could be done in Colab?

In Colab, the following code snippet is used for mounting Google Drive.
from google.colab import drive
drive.mount('/test', force_remount=True)
And I'm wondering if it could work on my local machine. When I run this locally, it says "no module named google", even after running pip install google.
Is there another package that should be installed, or can this simply not be done? I've searched for a while, but it seems that the only solution is to install Google Drive Desktop to give access to remote files.
Although the google.colab Python library can be found here, it is a collection of tools meant to work in conjunction with the Google Colab product.
Indeed, Google Drive Desktop is your best option to "mount" your Google Drive to your local machine.
Alternatively, there are several 3rd party Google Drive clients available.
Use ocamlfuse.
Here are the step-by-step details: https://medium.com/#enthu.cutlet/how-to-mount-google-drive-on-linux-windows-systems-5ef4bff24288
Instead of mounting it in a home folder (named googledrive in the tutorial), I suggest mounting it so that the folder structure is the same on both Colab and the local machine. To do that:
Create your mount folder at the root (it's not a recommended practice, but there is no harm). You need to use sudo, i.e. at /, run sudo mkdir test
Then create MyDrive inside test.
Change the owner of test or MyDrive to yourself: sudo chown <your username> MyDrive/
Mount to MyDrive with: google-drive-ocamlfuse MyDrive/
Enjoy!
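With the folder layout matched like that, the same notebook code can run in both places. A minimal sketch, assuming the local ocamlfuse mount sits at /test/MyDrive as in the steps above (the function name drive_data_dir is made up for illustration):

```python
import os


def drive_data_dir(mount_point="/test"):
    """Return the Drive root, mounting it first when running inside Colab."""
    try:
        # This import only succeeds inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        # Local machine: assume google-drive-ocamlfuse is already
        # mounted at <mount_point>/MyDrive.
        pass
    else:
        drive.mount(mount_point, force_remount=True)
    return os.path.join(mount_point, "MyDrive")


print(drive_data_dir())
```

Everything downstream can then build paths from the returned directory without caring which environment it is in.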

Read Files from Google Drive in JupyterLab

I have installed the jupyter lab extension to connect to google drive.
I can create and open files in the drive from the JupyterLab UI.
But I can't find a way to read files located in the drive in the notebook.
For example I would want to be able to run in my notebook the following:
df = pd.read_csv("Gdrive/MyDrive/somefileinthedrive.csv")
Any suggestions?
You could open JupyterLab directly in the Google Drive directory, as in this answer: https://stackoverflow.com/a/69077943/13129641
I usually open JupyterLab from the Anaconda Prompt with:
python -m jupyterlab --notebook-dir=G:/
Note: you can change G:/ to another directory if you like.
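Once the drive shows up as a normal directory, a file on it is just a local path. A small sketch using only the standard library, assuming the drive is available at G:/ (the helper name read_drive_csv and the default root are illustrative, adjust them to your setup):

```python
import csv
from pathlib import Path


def read_drive_csv(relative_path, drive_root="G:/"):
    """Read a CSV file that lives under the locally available Drive folder."""
    path = Path(drive_root) / relative_path
    with path.open(newline="", encoding="utf-8") as fh:
        # Each row becomes a dict keyed by the header line.
        return list(csv.DictReader(fh))
```

pd.read_csv should accept the same full path if you prefer a DataFrame.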

Wget get past "infected with a virus" screen on Google Drive

So I've been trying to get wget to download a Google Drive file that I uploaded. Unfortunately, Google Drive incorrectly flags the file as a virus, so wget can't get the direct download link.
Things I've tried:
using the gdrive.pl file that someone made, but I'm on Windows, and /tmp/cookies.txt does not exist.
doing wget --no-check-certificate https://docs.google.com/uc?export=download&id=FILEID -O FILENAME, but it says 400 Bad Request
using https://docs.google.com/uc?export=download&id=ID, but it fails because of the download infected file warning.
Does anyone have any suggestions to solve this?
Here is what I was able to do, based on a starting point I found at https://medium.com/#acpanjan/download-google-drive-files-using-wget-3c2c025a8b99 :
Edit: I noticed you said Windows, so this command with sed won't work natively on Windows. I'll put steps without sed for Windows below.
You of course start by sharing the file and getting the file ID from the share link on google drive. Then:
wget --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=SHARE_LINK_ID" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p' > /tmp/confirm && wget --load-cookies /tmp/cookies.txt --no-check-certificate "https://docs.google.com/uc?export=download&confirm="$(cat /tmp/confirm)"&id=SHARE_LINK_ID" -O YOUR_FILENAME && rm /tmp/cookies.txt /tmp/confirm
Replace SHARE_LINK_ID with your ID from your shared file link. Replace YOUR_FILENAME with your desired output file name.
This attempts to download the file and gets the html of the warning message about potential viruses in the file. It uses cookies as you need to use the same session ID for the subsequent download with the confirmation code.
It then gets the generated confirm code from that response and writes it to a temporary file.
It then runs another wget, adding the confirmation code to the query string to download the file and using the saved cookie so that the confirmation code works for the saved session.
Most likely this could be worked into a script, passing an argument of the share link ID to make it more useful.
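A rough Python sketch of that script idea, which also avoids the sed dependency. It mirrors the two requests above using only the standard library. It assumes the warning page still contains a confirm=... string as described in this answer (Google may change that page), and the function names are made up for illustration.

```python
import re
import shutil
import urllib.request
from http.cookiejar import CookieJar
from urllib.parse import urlencode

BASE_URL = "https://docs.google.com/uc"
# Same character class as the sed expression; takes the first match.
CONFIRM_RE = re.compile(r"confirm=([0-9A-Za-z_]+)")


def extract_confirm_code(html):
    """Return the confirm code found in the warning page, or None."""
    match = CONFIRM_RE.search(html)
    return match.group(1) if match else None


def download(file_id, filename):
    """Two-step download: fetch the warning page, then confirm."""
    # One opener with one cookie jar, so both requests share a session.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar())
    )
    first = BASE_URL + "?" + urlencode({"export": "download", "id": file_id})
    with opener.open(first) as resp:
        # Note: this reads the whole first response into memory.
        html = resp.read().decode("utf-8", errors="replace")

    code = extract_confirm_code(html)
    if code is None:
        raise RuntimeError("no confirm code found in the response")

    second = BASE_URL + "?" + urlencode(
        {"export": "download", "confirm": code, "id": file_id}
    )
    with opener.open(second) as resp, open(filename, "wb") as out:
        shutil.copyfileobj(resp, out)
```

Call it as download("SHARE_LINK_ID", "YOUR_FILENAME"). No temp files are left behind because the cookies stay in memory.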
For Windows (without sed)
wget --save-cookies %TMP%/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=SHARE_LINK_ID" -O %TMP%/confirm.txt
This downloads the confirmation HTML.
notepad %TMP%/confirm.txt
This opens %TMP%/confirm.txt in Notepad so you can get the confirm code string (CTRL+F to look for "confirm=" and take the code right after it). Substitute it into the command below, along with the filename you want and the share link ID from Google Drive:
wget --load-cookies %TMP%/cookies.txt --no-check-certificate "https://docs.google.com/uc?export=download&confirm=CONFIRM_CODE&id=SHARE_LINK_ID" -O YOUR_FILENAME
Delete the temp files:
del "%TMP%\cookies.txt" "%TMP%\confirm.txt"
Try this. Don't forget to replace the two FILEID fields and the one FILENAME field with your file's ID and the output file's name, respectively.
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=FILEID' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=FILEID" -O FILENAME && rm -rf /tmp/cookies.txt
source: https://medium.com/geekculture/wget-large-files-from-google-drive-336ba2e1c991

Copy PDF file from Google drive to remote server

I've built a nice browsing window which shows all of the PDF files on my (or any user's) Google Drive for management purposes.
What I'm looking to do is simple: I want to take a PDF file from my Google Drive (I have all the info related to this file: "downloadUrl", "webContentLink", etc.) and just copy it to my (remote) server.
Any thoughts?
I guess I'm pretty late here, but this may help other people too.
You could try using Grive. Here's a straightforward tutorial: http://xmodulo.com/2013/05/how-to-sync-google-drive-from-the-command-line-on-linux.html
Even if you don't have root access on the server, you can simply build from source, and:
$ mkdir ~/google_drive
$ cd ~/google_drive
$ grive -a
You'll receive an auth URL, which you need to paste into your browser; click "Allow Access" and you're done. Then go to the google_drive dir on your server and run grive to sync between your local dir and your Google Drive.