Is there a way to reduce the size of the pickle file for PyCaret models?

I tried to compress a pickle file representing a PyCaret model.
import joblib
joblib.dump('my_file.pkl', 'new_file.pkl.z', compress=3)
The above code didn't work.
How can I reduce the size of a pickle file for PyCaret models?

There is no need to call joblib.dump yourself; call PyCaret's save function directly on the loaded model, as follows:
from pycaret.regression import load_model, save_model

# PyCaret appends the .pkl extension itself, so pass the name without it
your_model = load_model('my_file')
save_model(your_model, 'my_file', **{"compress": 3})
The example above is for a regression model.
For reference:
PyCaret calls the joblib.dump function here, passing along any given kwargs (so compress is forwarded to joblib).
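If you would rather compress an already-saved pickle with joblib directly, note that joblib.dump takes the object first and the target path second; the snippet in the question passed two file names. A minimal sketch (file names are placeholders):
import joblib

# Load the pickled model object first, then dump it again with compression
model = joblib.load('my_file.pkl')
joblib.dump(model, 'new_file.pkl.z', compress=3)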


Transforming shapefiles to dataframes with the shapefile_to_dataframe() helper function - fiona-related error

I am trying to use the Palantir Foundry helper function shapefile_to_dataframe() in order to ingest shapefiles for later use in geolocation features.
I have manually imported the shapefiles (.shp, .shx & .dbf) into a single dataset (no access issues through the filesystem API).
As per the documentation, I have imported the geospatial-tools and GEOSPARK profiles and included the dependencies in the transforms-python build.gradle.
Here is my transform code, which is mostly extracted from the documentation:
from transforms.api import transform, Input, Output, configure
from geospatial_tools import geospatial
from geospatial_tools.parsers import shapefile_to_dataframe

@geospatial()
@transform(
    raw=Input("ri.foundry.main.dataset.0d984138-23da-4bcf-ad86-39686a14ef21"),
    output=Output("/Indhu/InDhu/Vincent/geo_energy/datasets/extract_coord/raw_df")
)
def compute(raw, output):
    return output.write_dataframe(shapefile_to_dataframe(raw))
Code Assist then becomes extremely slow to load, and I finally get the following error:
AttributeError: partially initialized module 'fiona' has no attribute '_loading' (most likely due to a circular import)
Traceback (most recent call last):
  File "/myproject/datasets/shp_to_df.py", line 3, in <module>
    from geospatial_tools.parsers import shapefile_to_dataframe
  File "/scratch/standalone/3a553998-623b-48f5-9c3f-03de7e64f328/code-assist/contents/transforms-python/build/conda/env/lib/python3.8/site-packages/geospatial_tools/parsers.py", line 11, in <module>
    from fiona.drvsupport import supported_drivers
  File "/scratch/standalone/3a553998-623b-48f5-9c3f-03de7e64f328/code-assist/contents/transforms-python/build/conda/env/lib/python3.8/site-packages/fiona/__init__.py", line 85, in <module>
    with fiona._loading.add_gdal_dll_directories():
AttributeError: partially initialized module 'fiona' has no attribute '_loading' (most likely due to a circular import)
Thanks a lot for your help,
Vincent
I was able to reproduce this error, and it seems to happen only in Preview - running the full build works fine. The simplest way to get around it is to move the import inside the function:
from transforms.api import transform, Input, Output, configure
from geospatial_tools import geospatial

@geospatial()
@transform(
    raw=Input("ri.foundry.main.dataset.0d984138-23da-4bcf-ad86-39686a14ef21"),
    output=Output("/Indhu/InDhu/Vincent/geo_energy/datasets/extract_coord/raw_df")
)
def compute(raw, output):
    # Deferring the import avoids the circular import during Preview
    from geospatial_tools.parsers import shapefile_to_dataframe
    return output.write_dataframe(shapefile_to_dataframe(raw))
However, at the moment, shapefile_to_dataframe isn't going to work in Preview anyway, because the full transforms.api.FileSystem API isn't implemented there - specifically, the ls function doesn't implement the glob parameter that the full transforms API supports.
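If you need glob-style filtering while iterating in Preview, here is a minimal client-side sketch, assuming a plain ls() call still works there and returns entries with a path attribute (ls_glob is a hypothetical helper, not part of the API):
import fnmatch

def ls_glob(filesystem, pattern):
    # Emulate ls(glob=...) by filtering a plain listing on the client side
    return [f for f in filesystem.ls() if fnmatch.fnmatch(f.path, pattern)]

# e.g. shp_files = ls_glob(raw.filesystem(), '*.shp')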

How can I save my session or my GAN model into a JS file?

I want to deploy my GAN model on a web-based UI. For this I need to convert the model's checkpoints into JS-loadable files that web code can call. There are functions for saved_model and Keras to convert into .pb files, but none for JS.
My main concern is how to dump a session or variable weights into JS files.
You can save a Keras model from Python. There is a full tutorial here, but basically it amounts to calling this after training:
import tensorflowjs as tfjs
tfjs.converters.save_keras_model(model, tfjs_target_dir)
then hosting the result somewhere publicly accessible (or on the same server as your web UI). You can then load your model into TensorFlow.js as follows:
import * as tf from '@tensorflow/tfjs';
const model = await tf.loadLayersModel('https://foo.bar/tfjs_artifacts/model.json');
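For completeness, a minimal Python-side sketch of the whole export step, assuming your trained generator is available as a Keras model in HDF5 format (the file and directory names are placeholders):
import tensorflowjs as tfjs
from tensorflow import keras

# Load the trained generator and convert it to the TensorFlow.js Layers format
model = keras.models.load_model('gan_generator.h5')
tfjs.converters.save_keras_model(model, './tfjs_artifacts')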

Creating a serving graph separately from training in TensorFlow for Google CloudML deployment?

I am trying to deploy a tf.keras image classification model to Google CloudML Engine. Do I have to include code to create a serving graph separately from training to get it to serve my model in a web app? I already have my model in SavedModel format (saved_model.pb & variable files), so I'm not sure if I need this extra step to get it to work.
e.g. this is code directly from the GCP documentation on deploying TensorFlow models:
def json_serving_input_fn():
    """Build the serving inputs."""
    inputs = {}
    for feat in INPUT_COLUMNS:
        inputs[feat.name] = tf.placeholder(shape=[None], dtype=feat.dtype)
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)
You are probably training your model with actual image files, while it is best to send images as encoded byte-strings to a model hosted on CloudML. Therefore you'll need to specify a ServingInputReceiver function when exporting the model, as you mention. Some boilerplate code to do this for a Keras model:
# Convert the Keras model to a TF Estimator
tf_files_path = './tf'
estimator = tf.keras.estimator.model_to_estimator(keras_model=model,
                                                  model_dir=tf_files_path)

# Your serving input function will accept a string
# and decode it into an image
def serving_input_receiver_fn():
    def prepare_image(image_str_tensor):
        image = tf.image.decode_png(image_str_tensor, channels=3)
        return image  # apply additional processing if necessary

    # Ensure the model is batchable
    # https://stackoverflow.com/questions/52303403/
    input_ph = tf.placeholder(tf.string, shape=[None])
    images_tensor = tf.map_fn(
        prepare_image, input_ph, back_prop=False, dtype=tf.float32)
    return tf.estimator.export.ServingInputReceiver(
        {model.input_names[0]: images_tensor},
        {'image_bytes': input_ph})

# Export the estimator - deploy it to CloudML afterwards
export_path = './export'
estimator.export_savedmodel(
    export_path,
    serving_input_receiver_fn=serving_input_receiver_fn)
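Once deployed, the client sends each image as a base64-encoded string. A minimal request-building sketch, assuming the receiver above (test.png is a placeholder; the _bytes suffix in the image_bytes key is what tells CloudML to expect base64 data wrapped in a {"b64": ...} object):
import base64
import json

with open('test.png', 'rb') as f:
    encoded = base64.b64encode(f.read()).decode('utf-8')

# JSON body for the CloudML Engine online-prediction API
body = json.dumps({'instances': [{'image_bytes': {'b64': encoded}}]})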
You can refer to this very helpful answer for a more complete reference and other options for exporting your model.
Edit: If this approach throws a ValueError: Couldn't find trained model at ./tf. error, you can try the workaround solution that I documented in this answer.

Does SystemVerilog support global functions?

I want to make a parity_check function which can be accessed by three different modules. Is this possible in SV? If yes, where do I declare this function and how do I import it into my module?
You can put a function into a separate file and include it using `include:
`include "some_file_name.sv"
However, a much better way is to use a package:
package some_package_name;
  function string some_function_name;
    return "some_function_name called";
  endfunction
endpackage
You would put that into a separate file, and you must compile it before compiling any module that uses it. You then import the package into each module:
module some_module_name;
  import some_package_name::*; // or import some_package_name::some_function_name;
  initial
    $display(some_function_name);
endmodule
Putting a function in a package is better than just putting it into a file and using `include, because a package is a named scope. That means any name clash can be resolved by referring to the function by its full package-qualified name instead of importing it, e.g.:
module some_module_name;
  initial
    $display(some_package_name::some_function_name);
endmodule

ImportError: cannot import name 'persist'

I want to persist a trained model in CNTK and found the 'persist' functionality after some searching. However, importing it fails:
from cntk import persist
This throws an ImportError.
Am I doing something the wrong way? Or is this no longer supported? Is there an alternate way to persist a model?
persist is from an earlier beta. save_model is now a method of every CNTK function, so instead of save_model(z, filename) you do z.save_model(filename). load_model works the same as before, but you import it from cntk.ops.functions. For an example, see: https://github.com/Microsoft/CNTK/blob/v2.0.beta7.0/Tutorials/CNTK_203_Reinforcement_Learning_Basics.ipynb or https://github.com/Microsoft/CNTK/blob/v2.0.beta7.0/bindings/python/cntk/tests/persist_test.py
The functionality has moved to CNTK Function objects. The new way is mynetwork.save_model(...), where mynetwork represents the root of your computation (typically the prediction). For loading the model you can just say mynetwork = C.load_model(...).
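A minimal sketch of the newer API, using the beta-era names quoted above (the tiny network and file name are placeholders standing in for your trained model):
import cntk as C

# A trivial network standing in for the root Function of your trained model
x = C.input_variable(2)
z = C.layers.Dense(1)(x)

# save_model is a method on the Function object itself
z.save_model('my_model.cntk')

# load_model lives at package level (re-exported from cntk.ops.functions)
z_restored = C.load_model('my_model.cntk')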