Using PIL, How do I point the “fonts” to its referenced file on GCF - google-cloud-functions

I am using PIL in a Cloud Function. How do I point ImageFont to the font file it needs?
I wonder whether I should save the font file to Cloud Storage and point the path there? Any ideas or comments will be appreciated. Thank you!
from PIL import Image, ImageDraw
from PIL import ImageFont
from google.cloud import storage

storage_client = storage.Client()
# ...
with blob.open("rb") as file:
    img = Image.open(file)
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype("simsun.ttc", 18)  # <- the font here
    # ...

PIL does not have built-in support for opening files from GCS. You will need to either:
Download the file to local storage and point PIL to that file, or
Give PIL a BlobReader (a file-like object) which it can use to read the data.
Sketches of both approaches follow below.
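A minimal sketch of the first option, assuming the google-cloud-storage client library and hypothetical bucket/object names (on Cloud Functions, /tmp is the only writable directory):

from google.cloud import storage
from PIL import ImageFont

client = storage.Client()
# Bucket and object names below are assumptions for illustration.
blob = client.bucket("my-fonts-bucket").blob("fonts/simsun.ttc")
blob.download_to_filename("/tmp/simsun.ttc")  # download once per instance

font = ImageFont.truetype("/tmp/simsun.ttc", 18)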
Otherwise, use the reference path to the font file and load it in code; for example:
font_file = load_font_from_gcs(gs_path)
font = ImageFont.truetype(BytesIO(font_file), 11)
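load_font_from_gcs is not a library function, so here is one possible implementation: a sketch that assumes a gs://bucket/object style path and the google-cloud-storage client.

from io import BytesIO
from google.cloud import storage
from PIL import ImageFont

def load_font_from_gcs(gs_path):
    # Split "gs://bucket/path/to/font.ttc" into bucket and object names.
    bucket_name, blob_name = gs_path[len("gs://"):].split("/", 1)
    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    return blob.download_as_bytes()  # raw font bytes

font = ImageFont.truetype(BytesIO(load_font_from_gcs("gs://my-bucket/simsun.ttc")), 18)

For the second option, ImageFont.truetype also accepts a file-like object directly, so font = ImageFont.truetype(blob.open("rb"), 18) should work as well.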
Check the following helpful links with similar implementations:
GCS Python way to list certain folder
Read and write file from gcs via python
Loading fonts and images from gcs
Loading font from url to pillow

Related

How to allow user to download entire folder, directory, or at least multiple images? (Flask)

I am trying to figure out how to allow a user to download a folder that contains multiple images when the user presses a download button. For now, I can only use send_file to let the user download a single image. Is there any possible way to do this?
@app.route('/download')
def download():
    path = "where my image is"
    return send_file(path, as_attachment=True)
This is my current code and I want to change it.
I would advise you to archive the directory and then send it:
1. A list of all files in the folder is created.
2. These files are then compressed into a zip archive that is written to an in-memory stream.
3. The stream is then transmitted via send_file.
from flask import send_file
from zipfile import ZipFile
from io import BytesIO
import os, glob

@app.route('/download', methods=['GET'])
def download():
    path = "..."  # target directory
    root = os.path.dirname(path)
    files = glob.glob(os.path.join(path, '*'))

    # Write every file into an in-memory zip archive.
    stream = BytesIO()
    with ZipFile(stream, 'w') as zf:
        for f in files:
            zf.write(f, os.path.relpath(f, root))
    stream.seek(0)

    # Note: Flask >= 2.0 renamed attachment_filename to download_name.
    return send_file(
        stream,
        as_attachment=True,
        attachment_filename='archive.zip',
        mimetype='application/zip'
    )

Import pre-trained Deep Learning Models into Foundry Code Workbooks

How do you import an h5 model locally from Foundry into a Code Workbook?
I want to use the Hugging Face library as shown below, and in its documentation the from_pretrained method expects a URL path to where the pretrained model lives.
I would ideally like to download the model onto my local machine, upload it into Foundry, and have Foundry read in said model.
For reference, I'm trying to do this in Code Workbook or Code Authoring. It looks like you can work directly with files from there, but I've read the documentation and the given example was for a CSV file, whereas this model contains a variety of files in h5 and json format. I'm wondering how I can access these files and have them passed into the from_pretrained method from the transformers package.
Relevant links:
https://huggingface.co/transformers/quicktour.html
Pre-trained Model:
https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english/tree/main
Thank you!
I've gone ahead and added the transformers (Hugging Face) package onto the platform.
As for uploading the model, you can follow these steps:
1. Use your dataset with the model-related files as an input to your Code Workbook transform.
2. Use Python's raw file access to read the contents of the dataset: https://docs.python.org/3/library/filesys.html
3. Use Python's built-in tempfile to create a folder and write the files from step 2 into it: https://docs.python.org/3/library/tempfile.html#tempfile.mkdtemp , https://www.kite.com/python/answers/how-to-write-a-file-to-a-specific-directory-in-python
4. Pass the temporary folder (tempfile.mkdtemp() returns the absolute path) to the from_pretrained method.
import os
import tempfile
from transformers import TFAutoModelForSequenceClassification, AutoTokenizer

def sample(dataset_with_model_folder_uploaded):
    # Copy the model files out of the dataset into a local temp folder.
    full_folder_path = tempfile.mkdtemp()
    all_file_names = ['config.json', 'tf_model.h5', 'ETC.ot', ...]  # every file the model needs
    for file_name in all_file_names:
        # Binary mode: tf_model.h5 is not a text file.
        with dataset_with_model_folder_uploaded.filesystem().open(file_name, 'rb') as f:
            path_of_file = os.path.join(full_folder_path, file_name)
            with open(path_of_file, 'wb') as new_file:
                new_file.write(f.read())

    # Point transformers at the local folder.
    model = TFAutoModelForSequenceClassification.from_pretrained(full_folder_path)
    tokenizer = AutoTokenizer.from_pretrained(full_folder_path)
Thanks,

Fetch Folder from drive for Google Colab

I'm trying to run a deep learning model in a Jupyter notebook and it's taking forever; the kernel also dies during training. So I'm trying to run it on Google Colab. I've learned some basics that are available on the internet, but it's not helping me at all. The model gets its dataset from a module;
this link https://github.com/awslabs/handwritten-text-recognition-for-apache-mxnet/blob/master/ocr/utils/iam_dataset.py has the module that extracts and preprocesses the dataset for training from the local computer. I've uploaded the dataset to GDrive, and now I want to change the path so that this module finds the 'dataset' folder. I've been stuck on it for 5 days and now I'm clueless.
I suggest you not load the dataset from GDrive into Colab directly; it increases the dataset loading time.
Google Colab provides some local storage for your work (around 70 GB), shown under the RAM bar in the upper-right corner. Bring your dataset into that storage. This is how you can do it:
import zipfile
from google.colab import drive

# Mount Google Drive so the zip is reachable under /content/drive.
drive.mount('/content/drive')

zip_ref = zipfile.ZipFile("/content/drive/My Drive/dataset.zip", 'r')
zip_ref.extractall("/content/")
zip_ref.close()
Please note that your entire dataset should be zipped. This will be more than 20 times faster than the method you are trying.
Format of the zipfile.ZipFile() call above:
zip_ref = zipfile.ZipFile("/content/drive/<zip file location in GDrive>", 'r')
If you click the folder icon on the left side of the Colab interface, you should see your dataset there.
You can then access your dataset using the file path '/content/dataset'.
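A quick sanity check that the extraction worked, assuming the zip unpacked into /content/dataset:

import os
print(os.listdir('/content/dataset')[:5])  # peek at a few extracted files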

Google Colab file upload not uploading the csv file

Ran the following code in Colab:
from google.colab import files
uploaded = files.upload()
Clicked "Choose Files" and selected the csv I want to upload.
But the file is not being uploaded. It worked once before, but not again. All the videos I watch about uploading csvs locally show Colab uploading the file immediately once it is selected.
Colab is not doing that in my case; it just sits there, stagnant (screenshot of the stalled upload omitted).
This is a Colab bug --
https://github.com/googlecolab/colabtools/issues/437
The team reports they are working on a service change to correct the problem today. (21 Feb, 2019)
Well, even I had the same problem. You can also go for another option:
1. In the left pane of Google Colab there is a tab called Files; click on that.
2. Click on "upload files".
That's it! Pretty simple!
For more details: https://www.youtube.com/watch?v=0rygVrmHidg
Upload the file from the local machine using the following command:
from google.colab import files
import pandas as pd

uploaded = files.upload()

for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))
After that, read the csv/text file using pandas:
dataFrame = pd.read_csv('filename.csv')
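Alternatively, since files.upload() returns a dict mapping each uploaded file name to its raw bytes, you can read the csv straight from memory without writing it to disk first; a small sketch (the file name is an assumption):

import io
dataFrame = pd.read_csv(io.BytesIO(uploaded['filename.csv']))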
I had the same problem. I use this method: upload the file from Google Drive using the following commands:
from google.colab import drive
import pandas as pd

drive.mount('/content/drive')

# Drive is mounted under /content/drive/My Drive/ by default.
data = pd.read_csv('/content/drive/My Drive/Myfiles/datafiles/mydata.csv', delimiter=",")

How to use trained data with pytesseract?

Using this tool http://trainyourtesseract.com/ I would like to be able to use new fonts with pytesseract. The tool gives me a file called *.traineddata.
Right now I'm using this simple script:
try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract as tes

results = tes.image_to_string(Image.open('./test.jpg'), boxes=True)
file = open('parsing.text', 'a')
file.write(results)
print(results)
How do I use my traineddata file so I'm able to read new fonts with the Python script?
Thanks!
Edit #1: I understand that *.traineddata can be used with Tesseract as a command-line program, so my question stays the same: how do I use the traineddata with Python?
Edit #2: the answer to my question is here: How to access the command line for Tesseract from Python?
Below is a sample of pytesseract.image_to_string() with options.
pytesseract.image_to_string(
    Image.open("./imagesStackoverflow/xyz-small-gray.png"),
    lang="eng", boxes=False,
    config="--psm 4 --oem 3 -c tessedit_char_whitelist=-01234567890XYZ:")
To use your own trained language data, just replace "eng" in lang="eng" with your language name (the base name of your .traineddata file), and make sure Tesseract can find that file.
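For example, if the tool gave you myfont.traineddata (a hypothetical name), one way is to put it in a local tessdata folder and point Tesseract at that folder with the --tessdata-dir option; a sketch:

import pytesseract
from PIL import Image

# Assumes ./tessdata/myfont.traineddata exists.
text = pytesseract.image_to_string(
    Image.open('./test.jpg'),
    lang='myfont',  # must match the .traineddata base name
    config='--tessdata-dir ./tessdata')
print(text)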