How to import local files to colab - pickle

I have an IPython notebook that I want to run on Colab. When I first ran it, the local files were imported, but now it gives me an error. Following are the code snippet and the error:
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-1-9132fbd19d75> in <module>()
      1 import pickle
----> 2 pickle_in = open(r"C:/Users/manas/PycharmProjects/allProjects/X.pickle","rb")
      3 X = pickle.load(pickle_in)
      4
      5 pickle_in = open(r"C:/Users/manas/PycharmProjects/allProjects/y.pickle","rb")

FileNotFoundError: [Errno 2] No such file or directory: 'C:/Users/manas/PycharmProjects/allProjects/X.pickle'
import pickle

pickle_in = open(r"C:/Users/manas/PycharmProjects/allProjects/X.pickle", "rb")
X = pickle.load(pickle_in)

pickle_in = open(r"C:/Users/manas/PycharmProjects/allProjects/y.pickle", "rb")
Y = pickle.load(pickle_in)

You can upload your files to Colab if you're not going to be doing this often; otherwise it can get quite annoying.
from google.colab import files
files.upload()
Using the above snippet you can upload and use whatever you'd like.
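For example, files.upload() returns a dict mapping each uploaded filename to its raw bytes, so a pickle can be loaded straight from memory (a minimal sketch; the filename X.pickle is just an assumption):
import io
import pickle
from google.colab import files

uploaded = files.upload()  # pick X.pickle in the file chooser
X = pickle.load(io.BytesIO(uploaded["X.pickle"]))  # load directly from the uploaded bytes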
However, if your pickles are larger, I'd advise you to just upload them to your Drive. Accessing them from your Drive is far easier and less troublesome. To access files on your Drive, all you have to do is mount it in Colab's file directory.
from google.colab import drive
drive.mount("/content/drive")
This will generate a link; click on it, sign in with Google OAuth, paste the key into the Colab cell, and you're connected.
Check the list of available files in the side panel on the left, copy the path of the file you want to access, and read it as you would any other file.
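For instance, once the mount succeeds you can open a pickle stored in your Drive by its mounted path (a minimal sketch; the folder allProjects below is a placeholder for wherever your files actually live):
import pickle
from google.colab import drive

drive.mount("/content/drive")

# Files in your Drive appear under /content/drive/My Drive/
with open("/content/drive/My Drive/allProjects/X.pickle", "rb") as f:  # hypothetical path
    X = pickle.load(f)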

Related

Python os.walk misses a few files to process in the directory

Out of 10 files in the directory, only 8 files are processed and 2 files are not. But if I delete the 8 processed files and rerun with just the 2 missed files, it works. Why is os.walk missing files? Also, is there a way to process all the files in the directory one after another without missing any?
Note: The solution will be used for the folder that contains 100K JSON files.
import os

for root, dirs, files in os.walk('D:/M'):
    for file in files:
        if file.endswith(".json"):
            Strfil = os.path.join(root, file)
            with open(Strfil, 'r') as json_file:
                ...  # processing of each JSON file (truncated in the question)
For file-system related things it is better to use the pathlib module. With pathlib you can do something like this:
from pathlib import Path

json_files = list(Path("D:/M").glob("**/*.json"))
for f in json_files:
    with open(f, 'r') as json_file:
        ...  # process each JSON file here
I think any file whose full path exceeds the Windows MAX_PATH limit (about 260 characters) will be skipped as 'too long'. What I suggest is to map the network drive to make the path much shorter.
e.g. z:\myfile.xlsx instead of c:\a\b\c\d\e\f\g\myfile.xlsx

How to upload json file to Google Drive in Google Colab?

I am training a model in Google Colab and storing it in JSON format. I want to upload this trained model to my Drive from within Colab itself.
I am currently doing:
model_json = model.to_json()
with open("trainedModel.json", "w") as json_file:
    json_file.write(model_json)
model.save_weights("trainedModel.h5")
print("Saved model to disk")
print("This file ran till end.\nNow uploading to drive:")

uploaded = drive.CreateFile({'parents': [{u'id': '#id_no'}], 'title': 'trainedModel.json'})
uploaded.SetContentFile('trainedModel.json')
uploaded.Upload()

uploaded = drive.CreateFile({'parents': [{u'id': '#id_no'}], 'title': 'trainedModel.h5'})
uploaded.SetContentFile('trainedModel.h5')
uploaded.Upload()
But this gives me:
FileNotFoundError: [Errno 2] No such file or directory: 'client_secrets.json'
I'd recommend using the file browser or Drive FUSE instead. Both are radically simpler than using the Drive API directly.
File browser upload: use the upload button in the Files panel on the left.
Drive FUSE:
from google.colab import drive
drive.mount('/content/gdrive')
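With the FUSE mount in place, the model files from the question can be written straight into Drive with ordinary file I/O instead of the Drive API (a minimal sketch; the target folder and filenames are assumptions):
# model is the Keras model from the question; anything written under
# /content/gdrive/My Drive/ ends up in your Drive.
model_json = model.to_json()
with open('/content/gdrive/My Drive/trainedModel.json', 'w') as json_file:
    json_file.write(model_json)
model.save_weights('/content/gdrive/My Drive/trainedModel.h5')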
This was happening because the authorization code given to the notebook, which grants it permission, expires after a few minutes/hours.
The problem got resolved by requesting the authorization code again, that is, by inserting
from google.colab import auth
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from oauth2client.client import GoogleCredentials
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
after saving the model file and before uploading it to the drive.

Google Colab file upload not uploading the csv file

Ran the following code in Colab:
uploaded = files.upload()
Clicked "Choose Files" and selected the csv I want to upload.
But the file is not being uploaded. It worked once before, but not again. In all the videos I've watched about uploading CSVs locally, Colab uploads the file immediately once it is selected.
Colab is not doing that in my case. It's just sitting stagnant like this:
(screenshot: the upload sitting stagnant in Colab)
This is a Colab bug --
https://github.com/googlecolab/colabtools/issues/437
The team reports they are working on a service change to correct the problem today. (21 Feb, 2019)
Well, even I had the same problem. You can also go for another option:
1. In the left pane of Google Colab there is a tab called Files; click on that.
2. Click on "Upload" and choose your files.
That's it! Pretty simple!
For more details: https://www.youtube.com/watch?v=0rygVrmHidg
Upload the file from the local machine using the following command:
from google.colab import files
import pandas as pd
uploaded = files.upload()
for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))
After that, read the CSV/text file using pandas:
dataFrame = pd.read_csv('filename.csv')
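Alternatively, since uploaded already holds each file's contents as bytes, the CSV can be read without touching the file on disk (a minimal sketch, assuming the uploaded file is named filename.csv):
import io

dataFrame = pd.read_csv(io.BytesIO(uploaded['filename.csv']))  # read straight from the uploaded bytes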
I had the same problem, so I use this method instead: read the file from Google Drive using the following commands:
import pandas as pd
from google.colab import drive

drive.mount('/content/drive')
data = pd.read_csv('/content/drive/My Drive/datafiles/mydata.csv', delimiter=",")

Colab how to get file id for existing file

I am starting with Colab for ML and I have a problem importing files from my Google Drive into the notebook. Say I have a file pretrained_vgg19.mat in my Drive at drive/jupyter/pretrained_vgg19.mat. The code snippet for importing files from Drive says that I need to use a file_ID that looks like laggVyWshwcyP6kEI-y_W3P8D26sz. How do I get this file_ID?
See PyDrive documentation for the ListFile command:
from pydrive.drive import GoogleDrive

drive = GoogleDrive(gauth)  # Create GoogleDrive instance with authenticated GoogleAuth instance

# Auto-iterate through all files in the root folder.
file_list = drive.ListFile({'q': "'root' in parents and trashed=false"}).GetList()
for file1 in file_list:
    print('title: %s, id: %s' % (file1['title'], file1['id']))
Now all you need to do is tweak the search parameters, since you already know the title of the file. See the docs.
# PyDrive queries use the Drive API v2 syntax, so search by 'title' rather than 'name'.
file_list = drive.ListFile({'q': "title='pretrained_vgg19.mat' and trashed=false"}).GetList()
for file in file_list:
    print('%s' % (file['id']))
Note that it is possible to have files with the same folder name and file name, because you can create multiple folders with identical paths in Google Drive. If there is even a chance of this, you will get multiple files returned in your list operation and will need to use some other criteria in order to select the correct one.
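Once you have the ID, you can pull the file into the Colab runtime, e.g. with PyDrive's GetContentFile (a minimal sketch; the local filename is just an assumption):
# Download the file identified by its Drive ID into the notebook's working directory.
f = drive.CreateFile({'id': file_list[0]['id']})
f.GetContentFile('pretrained_vgg19.mat')  # hypothetical local filename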
user244343's answer didn't work for me since the gauth object doesn't exist. I did this instead (test.zip needs to point to the right folder and file in your Drive!):
!apt-get install -qq xattr
filename = "/content/drive/My\ Drive/test.zip"
# Retrieving the file ID for a file in `"/content/drive/My Drive/"`:
id = !xattr -p 'user.drive.id' {filename}
print(id)

How to read pickled file uploaded on Google Colab

I have uploaded pickled file on google colab using
from google.colab import files
uploaded = files.upload()
Say my pickled file's name is Train.p. How do I use it with the typical functions? I have tried the code below, but it does not work.
with open(io.StringIO(uploaded['train.p']), 'rb') as file:
    train = pickle.load(file)
Try this:
import io
import pickle

train = pickle.load(io.BytesIO(uploaded['train.p']))
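As a side note, files.upload() also writes each uploaded file into the notebook's current working directory, so (assuming the upload finished under that name) the pickle can be opened from disk as well:
import pickle

# The uploaded file also exists on disk in the current working directory.
with open('train.p', 'rb') as f:
    train = pickle.load(f)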