sqlacodegen issue with geometry type - sqlalchemy

I'm using sqlacodegen to automatically generate model code for SQLAlchemy.
My problem is that I have some columns of type geometry and, as far as I know (according to the link below), sqlacodegen does not support this kind of type.
https://bitbucket.org/agronholm/sqlacodegen/issues/18/did-not-recognize-type-geometry-on
Does anyone know how I can work around this limitation?
Maybe with the Mapper method or something?
Thanks

You have probably already figured this out, so this goes to posterity.
According to the same link you gave (issue #18), there is already a PR to fix it (though with no tests, so it had not been accepted as of this writing).
Update: this PR has been merged as of version 2.1.0. Therefore, the solution below, from the original answer, should no longer be needed. Note that the PR explicitly addressed support for PostGIS, not MySQL.
Nevertheless, the fix is easy. Install GeoAlchemy2 (it brings PostGIS support to SQLAlchemy) and import it in sqlacodegen/codegen.py:
from sqlalchemy.types import Boolean, String
import sqlalchemy
from geoalchemy2 import Geometry  # add this import alongside the existing ones
Otherwise, ignore those SAWarnings and manually fix the geometry types that were not correctly identified, i.e., replace NullType with Geometry.
the_geom = Column(NullType, index=True)
# becomes
the_geom = Column(Geometry(...), index=True)
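A generated model might then look roughly like this after the manual fix; the class, table name, geometry type, and SRID below are placeholders for whatever your actual schema defines:
from geoalchemy2 import Geometry
from sqlalchemy import Column, Integer
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Place(Base):  # hypothetical generated model
    __tablename__ = 'places'  # placeholder table name
    id = Column(Integer, primary_key=True)
    # NullType replaced with an explicit Geometry; adjust the type and SRID to your column
    the_geom = Column(Geometry('POINT', srid=4326), index=True)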

Related

BentoML - Serving a CatBoostClassifier with cat_features

I am trying to create a BentoML service for a CatBoostClassifier model that was trained using a column as a categorical feature. If I save the model and try to make some predictions with the saved model (not as a BentoML service), everything works as expected, but when I create the service using BentoML I get an error:
_catboost.CatBoostError: Bad value for num_feature[non_default_doc_idx=0,feature_idx=2]="Tertiary": Cannot convert 'b'Tertiary'' to float
The value is found in a column named 'road_type', and the model was trained using 'object' as the data type for that column.
If I try to give a float or an integer for the 'road_type' column I get the following error:
_catboost.CatBoostError: catboost/libs/data/model_dataset_compatibility.cpp:53: Feature road_type is Categorical in model but marked different in the dataset
If someone has encountered the same issue and found a solution I would appreciate it. Thanks!
I have tried different approaches to saving and loading the model, but unfortunately it did not work.
You can try to explicitly pass the cat_features to the bentoml runner.
It would be something like this:
import bentoml
from catboost import Pool

runner = bentoml.catboost.get("bentoml_catboost_model:latest").to_runner()
cat_features = [2]  # specify your cat_features indices
prediction = runner.predict.run(Pool(input_data, cat_features=cat_features))
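If you are defining a full service (assuming the BentoML 1.x service API that the snippet above implies), the runner call might sit inside an API function roughly like this; the service name, model tag, and column index are placeholders:
import bentoml
import pandas as pd
from bentoml.io import NumpyNdarray, PandasDataFrame
from catboost import Pool

runner = bentoml.catboost.get("bentoml_catboost_model:latest").to_runner()
svc = bentoml.Service("catboost_classifier_service", runners=[runner])

@svc.api(input=PandasDataFrame(), output=NumpyNdarray())
def predict(input_df: pd.DataFrame):
    # Wrap the incoming frame in a Pool so 'road_type' (index 2 here) is treated as categorical.
    pool = Pool(input_df, cat_features=[2])
    return runner.predict.run(pool)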

SQLModel in Fastapi using __table_args__ is unable to create tables for unit testing [duplicate]

I have a Pylons project and a SQLAlchemy model that implements schema qualified tables:
class Hockey(Base):
    __tablename__ = "hockey"
    __table_args__ = {'schema': 'winter'}

    hockey_id = sa.Column(sa.types.Integer, sa.Sequence('score_id_seq', optional=True), primary_key=True)
    baseball_id = sa.Column(sa.types.Integer, sa.ForeignKey('summer.baseball.baseball_id'))
This code works great with PostgreSQL but fails when using SQLite on table and foreign key names (due to SQLite's lack of schema support):
sqlalchemy.exc.OperationalError: (OperationalError) unknown database "winter" 'PRAGMA "winter".table_info("hockey")' ()
I'd like to continue using SQLite for dev and testing.
Is there a way to have this fail gracefully on SQLite?
I'd like to continue using SQLite for dev and testing.
Is there a way to have this fail gracefully on SQLite?
It's hard to know where to start with that kind of question. So . . .
Stop it. Just stop it.
There are some developers who don't have the luxury of developing on their target platform. Their life is a hard one: moving code (and sometimes compilers) from one environment to the other, debugging twice (sometimes having to debug remotely on the target platform), and gradually coming to an awareness that the gnawing in their gut is actually the start of an ulcer.
Install PostgreSQL.
When you can use the same database environment for development, testing, and deployment, you should.
Not to mention the QA team. Why on earth are they testing stuff they're not going to ship? If you're deploying on PostgreSQL, assure the quality of your work on PostgreSQL.
Seriously.
I'm not sure if this works with foreign keys, but one could try SQLAlchemy's Multi-Tenancy Schema Translation for Table objects. It worked for me, though I used custom primaryjoin and secondaryjoin expressions in combination with composite primary keys.
The schema translation map can be passed directly to the engine creator:
from sqlalchemy import create_engine
...
if dialect == "sqlite":
    url = lambda: "sqlite:///:memory:"
    execution_options = {"schema_translate_map": {"winter": None, "summer": None}}
else:
    url = lambda: f"postgresql://{user}:{password}@{host}:{port}/{name}"
    execution_options = None
engine = create_engine(url(), execution_options=execution_options)
...
Here is the doc for create_engine. There is another question on SO which might be related in that regard.
But one might get colliding table names if all schema names are mapped to None.
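If collisions are a concern, a possible variation (untested here) is to translate each schema to a distinct SQLite schema name instead of None, and attach a database file under each of those names (see the ATTACH-based answer further down):
from sqlalchemy import create_engine

# Hedged sketch: each Postgres schema gets its own attached SQLite database
# name, so same-named tables in "winter" and "summer" no longer collide.
engine = create_engine(
    "sqlite:///main.db",  # placeholder database file
    execution_options={"schema_translate_map": {"winter": "winter_db", "summer": "summer_db"}},
)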
I'm just a beginner myself, and I haven't used Pylons, but...
I notice that you are combining the table and the associated class together. How about if you separate them?
import sqlalchemy as sa

meta = sa.MetaData('sqlite:///tutorial.sqlite')
schema = None
hockey_table = sa.Table('hockey', meta,
    sa.Column('score_id', sa.types.Integer, sa.Sequence('score_id_seq', optional=True), primary_key=True),
    sa.Column('baseball_id', sa.types.Integer, sa.ForeignKey('summer.baseball.baseball_id')),
    schema=schema,
)
meta.create_all()
Then you could create a separate class:
class Hockey(object):
    ...
and map it to the table (mapper comes from sqlalchemy.orm):
mapper(Hockey, hockey_table)
Then just set schema above to None everywhere if you are using SQLite, and to the value(s) you want otherwise (sketched below).
You didn't post a working example, so the example above isn't a working one either. However, as other people have pointed out, trying to maintain portability across databases is in the end a losing game. I'd add a +1 to the people suggesting you just use PostgreSQL everywhere.
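For completeness, a rough sketch of that schema switch; the flag and names below are illustrative, not a tested recipe:
import sqlalchemy as sa

USE_SQLITE = True  # e.g. derived from your Pylons config in dev and testing

schema = None if USE_SQLITE else 'winter'
fk_target = 'baseball.baseball_id' if USE_SQLITE else 'summer.baseball.baseball_id'

meta = sa.MetaData()
hockey_table = sa.Table('hockey', meta,
    sa.Column('score_id', sa.Integer, primary_key=True),
    sa.Column('baseball_id', sa.Integer, sa.ForeignKey(fk_target)),
    schema=schema,
)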
HTH, Regards.
I know this is a 10+ year old question, but I ran into the same problem recently: Postgres in production and SQLite in development.
The solution was to register an event listener for when the engine calls the "connect" method.
@sqlalchemy.event.listens_for(engine, "connect")
def connect(dbapi_connection, connection_record):
    dbapi_connection.execute('ATTACH "your_data_base_name.db" AS "schema_name"')
Using the ATTACH statement only once will not work, because it affects only a single connection. This is why we need the event listener: to issue the ATTACH statement on every connection.
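Put together for the schemas in the question above ("winter" and "summer"), the setup might look roughly like this; the file names are placeholders, and SQLite will create the attached files if they don't exist:
import sqlalchemy
from sqlalchemy import create_engine

engine = create_engine("sqlite:///main.db")  # placeholder dev database

@sqlalchemy.event.listens_for(engine, "connect")
def attach_schemas(dbapi_connection, connection_record):
    # One ATTACH per schema name referenced by the models.
    dbapi_connection.execute('ATTACH DATABASE "winter.db" AS "winter"')
    dbapi_connection.execute('ATTACH DATABASE "summer.db" AS "summer"')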

Using Pickle to serialise large objects - what causes 'Memory error'

I'm pickling a very large (both in terms of properties and in terms of raw size) class. I've been pickling it without problems using pickle.dump, until I hit just under 4 GB, and now I consistently get 'Memory Error'. I've also tried json.dump (and I get an 'is not JSON serializable' error). I've also tried Hickle, but I get the same error with Hickle as I do with Pickle.
I can't post all the code here (it's very long), but in essence it's a class that holds a dictionary of values from another class - something like this:
class one:
    def __init__(self):
        self.somedict = {}

    def addItem(self, name, item):
        self.somedict[name] = item

class two:
    def __init__(self):
        self.values = [0] * 100
Where name is a string and item is an instance of the class two object.
There's a lot more code to it, but this is where the vast majority of things are held. Is there a reliable and ideally fast solution for saving this object to a file and then being able to reload it at a later time? I save it every few thousand iterations (as a backup in case something goes wrong), so I need it to be reasonably quick.
Thanks!
Edit #1:
I've just thought that it might be useful to include some details on my system. I have 64 GB of RAM, so I don't think pickling a 3-4 GB file should cause this type of issue (although I could be wrong on this!).
You probably checked this one first, but just in case: did you make sure your Python installation is 64-bit? The 3-4 GB immediately reminded me of the memory limit of 32-bit applications.
I found this resource quite useful for analyzing and resolving some of the more common memory related issues with Python.
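A quick way to check the build, plus one more thing worth trying if the build is 64-bit: dump with pickle protocol 4 or newer, since protocol 4 is the one that added support for objects larger than 4 GB. The file name and data below are just placeholders:
import pickle
import struct
import sys

# A 64-bit Python build reports 8-byte pointers and sys.maxsize > 2**32.
print(struct.calcsize("P") * 8, "bit build")
print("64-bit:", sys.maxsize > 2**32)

data = {"example": [0] * 100}  # stand-in for the real object being backed up
with open("backup.pkl", "wb") as f:  # placeholder file name
    pickle.dump(data, f, protocol=4)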

Using Play Framework and case class with greater than 22 parameters

I have seen some of the other issues involving the infamous "22 fields/parameters" problem that is an inherent bug (feature?) of Scala versions < 2.11. See here and here. However, as per this blog post, it appears that the 22-parameter limit in case classes has been fixed; at least where the language is concerned.
I have a case class that I want to load an arbitrary (read: > 22) number of values into, which will later be read into a JSON object using the Play library.
It looks something like this:
object L {
    import play.api.libs.json.Reads._
    import play.api.libs.functional.syntax._

    implicit val responseRead: Reads[L] = (
        MyField1.jsPath.Read[MyField1.t] and
        MyField2.jsPath.Read[MyField2.t] and
        ...
        MyField35.jsPath.Read[MyField35.t]
    )(L.apply _)
}

case class L(myField1: MyField1.t, myField2: MyField2.t, ... myField35: MyField35.t)
The issue is that on compile, Scala complains that there are more than 22 parameters in the case class. (Specifically: on the last line of the object definition, when the compiler attempts to build, I get: "implementation restricts functions to 22 parameters".) I'm currently using Scala v2.11.6, so I think it's not a language issue. That makes me think that the Play library hasn't updated their implementation of Read.
If that's the case, then I guess the best bet is to group related fields into Tuples and pass the Tuples in through the JSON API?
As mentioned in the blog post you referenced, the 22-parameter limit is still in effect for functions in Scala 2.11 and later, so what you've encountered is a language issue. The function call in this case is:
L.apply _
Restructuring your model is one way to deal with this limit.
So the answer to this question is actually two parts:
1. Workaround
I'll call this the "workaround" because while it does "work" it usually addresses the symptom and not the problem.
My solution was to use shapeless to provide generic heterogeneous lists of arbitrary length. This solution is already widely discussed and available elsewhere. See, e.g., (1) [SO Post] How to get around the Scala case class limit of 22 fields?; (2) Blog post; (3) Yet another blog post.
2. Solution
As @jeffrey-chung mentions, one option is to restructure the model to deal with this limit. As many in the industry have noted, a function with more than 30 arguments is likely a sign that it is doing too much, or that it should be refactored to take fewer arguments. See, e.g., (1) Rule of 30 – When is a method, class or subsystem too big?; (2) Databricks' style guide.
See the answer here:
https://stackoverflow.com/a/57317220/1606452
It seems this handles it all nicely.
A 22+ field case class formatter and more for play-json:
https://github.com/xdotai/play-json-extensions
It supports Scala 2.11.x, 2.12.x, and 2.13.x and Play 2.3, 2.4, 2.5, and 2.7, and is referenced in the play-json issue as the preferred solution (but not yet merged).

Django order_by() is not working for me on PythonAnywhere

For some reason order_by() is not working for me on a queryset. I've tried everything I can think of, but my Django/MySQL installation doesn't seem to be doing anything with the order_by() method. The list appears to just remain in a fairly unordered state, or is ordered on some basis I cannot see.
My Django installation is 1.8.
An example of one of my models is as follows:
from django.db import models

class PositiveTinyIntegerField(models.PositiveSmallIntegerField):
    def db_type(self, connection):
        if connection.settings_dict['ENGINE'] == 'django.db.backends.mysql':
            return "tinyint unsigned"
        else:
            return super(PositiveTinyIntegerField, self).db_type(connection)

class School(models.Model):
    school_type = models.CharField(max_length=40)
    order = PositiveTinyIntegerField(default=1)

    # Make the identity of db rows clear in admin
    def __str__(self):
        return self.school_type
And here is the relevant line from my view:
schools = School.objects.order_by('order')
At first I thought the problem was related to having used the non-standard PositiveTinyIntegerField() defined by a class I found on a website somewhere, which allows me to use the MySQL tiny integer field. However, when I ordered by 'id' or 'school_type', the list still remained in an order that appeared fairly random to my eye.
I could put in my own loop which orders the queryset after it has been retrieved, but I'd really rather solve this issue so I can use the standard Django way of doing it.
I hope someone can see where the issue may be coming from.
I managed to resolve it with some help from the comments here. I tried writing the school object to stdout using sys.stdout.write(str(school)). The logs then showed me that the data was in fact being ordered correctly, so the problem had to be with how the data was being packaged before being rendered by the template.
I wrote the view some time ago, before I decided I wanted it ordered, so it turned out the problem was caused by each school object (with an attached tree of related data) being read into a dictionary. Once I changed the data type to a list, the schools rendered in my intended order.
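For context, the underlying gotcha is that on the Python versions contemporary with Django 1.8, a plain dict does not preserve insertion order, so an ordered queryset read into a dict can come back out scrambled, while a list (or collections.OrderedDict) keeps the order. A small illustration, independent of Django (the names are made up):
from collections import OrderedDict

ordered_names = ["Infant", "Primary", "Secondary", "Tertiary"]  # e.g. from an ordered queryset

# On Python < 3.7 a plain dict may iterate in an arbitrary order.
by_name = {name: len(name) for name in ordered_names}

# A list of pairs (or an OrderedDict) always preserves the original order.
as_list = [(name, len(name)) for name in ordered_names]
as_ordered = OrderedDict((name, len(name)) for name in ordered_names)

print(list(by_name))            # order not guaranteed on older Pythons
print([n for n, _ in as_list])  # ['Infant', 'Primary', 'Secondary', 'Tertiary']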