SQLAlchemy column that is a SQL expression with bind parameters

I'm trying to map a class with a column that doesn't really exist but is simply a SQL expression that takes bind parameters at query time. The model below is an example of what I'm trying to do.
class House(db.Model):
    __tablename__ = 'houses'
    id = db.Column(db.Integer, primary_key=True)
    @hybrid_property
    def distance_from_current_location(self):
        # what to return here?
        return self.distance_from_current_location
    @distance_from_current_location.expression
    def distance_from_current_location(cls, latitude, longitude):
        return db.func.earth_distance(
            db.func.ll_to_earth(latitude, longitude), cls.earth_location)
    # defer loading, as the raw value is pretty useless in python
    earth_location = deferred(db.Column(EARTH))
Then I'd like to query via flask-sqlalchemy:
latitude = 80.20393
longitude = -90.44380
paged_data = \
    House.query \
        .bindparams({'latitude': latitude, 'longitude': longitude}) \
        .paginate(1, 20, False)
My questions are:
How do I do this? Is it possible to use a hybrid_property like this?
If I can use a hybrid_property, what should the Python method return? (There is no Python way to interpret this; it should just return whatever the DB expression returned.)
latitude and longitude only exist during query time and need to be bound for each query. How do I bind latitude and longitude during query time? The bindparams bit in my code snippet I just made up, but it illustrates what I want to do. Is it possible to do this?
I've read the docs but couldn't find any hybrid_property or method with a bind parameter in the examples...
(Also since this is not a real column, but just something I want to use on my model I don't want this to trigger alembic to generate a new column for it).
Thanks!

You can't do this. distance_from_current_location is not a fake column either, since it depends on query-specific parameters. Imagine you were to write a SQL view for this; how would you write the definition? (Hint: you can't.)
SQLAlchemy uses the identity map pattern, which means that for a particular primary key, only one instance exists in your entire session. How would you handle querying the same instance but with different latitude/longitude values? (The instances returned from the later query would overwrite those returned from the earlier one.)
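The overwrite problem can be illustrated with a toy identity map. This is only a sketch of the pattern under simplified assumptions, not SQLAlchemy's actual implementation; ToySession, load and the distance attribute are hypothetical names made up for the illustration.

```python
class House:
    def __init__(self, id):
        self.id = id
        self.distance = None


class ToySession:
    """Toy identity map: at most one House object per primary key."""

    def __init__(self):
        self._identity_map = {}

    def load(self, id, distance):
        # Reuse the existing instance for this key if there is one.
        house = self._identity_map.setdefault(id, House(id))
        # A per-query value stored on the instance replaces the old one.
        house.distance = distance
        return house


session = ToySession()
near = session.load(1, distance=120.0)   # "query" from one location
far = session.load(1, distance=9800.0)   # same row, different location

# Both names refer to the same object, so the first distance is gone.
same_object = near is far
```

Because near and far are one object, a query-specific value cannot safely live on the mapped instance, which is why the value has to travel next to the entity instead.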
The correct way to do this is through additional entities at query time, like this:
House.query \
    .with_entities(House, db.func.earth_distance(db.func.ll_to_earth(latitude, longitude), House.earth_location)) \
    .filter(...)
Or through a hybrid_method (the usage of which requires passing in latitude and longitude every time):
from sqlalchemy.ext.hybrid import hybrid_method
class House(db.Model):
    ...
    @hybrid_method
    def distance_from_current_location(self, latitude, longitude):
        # implement db.func.earth_distance(db.func.ll_to_earth(latitude, longitude), self.earth_location) **in Python** here
        ...
    @distance_from_current_location.expression
    def distance_from_current_location(cls, latitude, longitude):
        ...
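For the Python side of such a hybrid_method, one possible sketch is a plain great-circle (haversine) distance. It assumes the instance has its own latitude and longitude available in degrees (the model above only stores the opaque earth_location value, so that is an assumption), and it assumes the sphere radius used by PostgreSQL's earthdistance module; great_circle_distance and EARTH_RADIUS_M are names invented for this example.

```python
import math

# Assumption: the sphere radius in metres that earthdistance's earth() uses.
EARTH_RADIUS_M = 6378168.0


def great_circle_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_M):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula; clamp to 1.0 to guard against rounding error.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(min(1.0, a)))
```

The instance-level method could then call great_circle_distance(latitude, longitude, self_lat, self_lon), where self_lat and self_lon stand for wherever the house's own coordinates are kept.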

Related

Can you construct an ActiveRecord scope with a variable query string?

Setup:
I'm using Ruby on Rails with ActiveRecord and MySQL.
I have a Coupon model.
It has an attribute called query, a string which could be run with a where.
For example:
@coupon.query
=> "'http://localhost:3003/hats' = :url OR 'http://localhost:3003/shoes' = :url"
If I were to run this query it would either pass or fail based on the :url value I pass in.
# passes
Coupon.where(@coupon.query, url: 'http://localhost:3003/hats')
Coupon.where(@coupon.query, url: 'http://localhost:3003/shoes')
# fails
Coupon.where(@coupon.query, url: 'http://localhost:3003/some_other_url')
This query varies between Coupon models, but it will always be compared to the current url.
I need a way to say: Given an ActiveRecord collection @coupons, only keep coupons with queries that pass.
The structure of the where is always the same, but the query changes.
Is there any way to do this without a loop? I could potentially have a lot of coupons and I am hoping to do this in an ActiveRecord scope. Something like this?
@coupons.where(self.query, url: @url)
Perhaps I need to write a user defined function in my database?
Using multiple variables in a query is easy, but where the thing you are comparing your variable to is also a variable - that has me stumped. Any suggestions very appreciated.
I would agree with Les Nightingill's comment that this looks like something that should probably be solved at a more architectural level. I'd imagine an easy refactoring to extract a new CouponQuery model that's a 1:n table containing multiple entries for a coupon_id for each query url that should pass. Then you could use a simple join like
Coupon.joins(:coupon_queries).where(coupon_queries: { url: my_url })
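The shape of that 1:n table and the join can be sketched outside Rails as well. The following is a minimal illustration in Python with an in-memory SQLite database rather than ActiveRecord and MySQL; the code column, the sample rows and the coupons_for helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coupons (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE coupon_queries (
        coupon_id INTEGER REFERENCES coupons(id),
        url TEXT
    );
    INSERT INTO coupons VALUES (1, 'HEADWEAR'), (2, 'SOCKS');
    INSERT INTO coupon_queries VALUES
        (1, 'http://localhost:3003/hats'),
        (1, 'http://localhost:3003/shoes'),
        (2, 'http://localhost:3003/socks');
""")


def coupons_for(url):
    """Return the codes of coupons with at least one row matching url."""
    rows = conn.execute(
        """
        SELECT DISTINCT coupons.code
        FROM coupons
        JOIN coupon_queries ON coupon_queries.coupon_id = coupons.id
        WHERE coupon_queries.url = ?
        ORDER BY coupons.code
        """,
        (url,),
    ).fetchall()
    return [code for (code,) in rows]
```

With this layout the current URL is an ordinary bound value compared against an indexed column, so no per-coupon query string has to be evaluated.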
If adding a new table is not an option, and if you're running on a newer MySQL version (>= 5.7), you could consider transforming the query column (or adding a new json_query column) into a MySQL JSON field and using the new JSON_CONTAINS query.
If from the user-side they should be able to manage the queries as a plain text field, you could use a before_save hook on your model to translate this into the separate table structure or JSON format respectively.
But if neither is an option for you and you really need to stick with the query column that stores a plain string, then you could use a LIKE query to match the sub-string 'your-url' = :url:
Coupon.where('query LIKE "%? = :url%"', my_url)
which, if you pass e.g. 'http://localhost:3003/hats' as my_url, would generate something like this SQL query:
SELECT `coupons`.* FROM `coupons`
WHERE (query LIKE "%'http://localhost:3003/hats' = :url%")

Store results of expensive function calls in a MySQL table

Let's suppose I have a set of integers of a variable length. I apply a function on this set of integers and I obtain a result.
myFunction(setOfIntegers) => myResult
Let's suppose a call to myFunction is very expensive and I would like to somehow store the results of this function calls.
In my application I am already using MySQL and what I was thinking was to somehow create a table with the setOfIntegers as a PK and myResult as an additional field.
I was thinking that I could do this by transforming the setOfIntegers to a string before storing it in the DB.
Can this be done in any other way? Or would there be a better way to store results of such function calls in order to avoid calling them a 2nd time with the same set of integers?
I don't know about Java, but Perl has my $str = join(',', $array) and PHP has $str = implode(',', $array). Then the string $str could be used as the PRIMARY KEY (assuming it is not too long). And the result would go in the other column.
Your app code (in Java) would need to first do an implode and SELECT to see if the function has already been evaluated for the given array. If not, then perform the function and end by INSERTing a new row.
If this will be multi-threaded, you could use INSERT IGNORE to deal with dups. (There are other solutions, too.)
Another note: If your set-of-integers is ordered, then what I describe is 'complete'. If it is unordered, then sort it before imploding. This will provide a canonical representation.
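The lookup-then-insert flow with a canonical key could look roughly like this. It is a sketch in Python against an in-memory SQLite database instead of Java and MySQL; my_function is a hypothetical stand-in for the expensive call, the results table layout is an assumption, and INSERT OR IGNORE is SQLite's counterpart of MySQL's INSERT IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (args TEXT PRIMARY KEY, result INTEGER)")

calls = []  # records real evaluations, only to make the caching visible


def my_function(integers):
    # Hypothetical stand-in for the expensive function.
    calls.append(tuple(integers))
    return sum(n * n for n in integers)


def canonical_key(integers):
    # Sort first so {3, 1, 2} and {2, 3, 1} map to the same key.
    return ",".join(str(n) for n in sorted(integers))


def cached_result(integers):
    key = canonical_key(integers)
    row = conn.execute(
        "SELECT result FROM results WHERE args = ?", (key,)
    ).fetchone()
    if row is not None:
        return row[0]
    result = my_function(integers)
    # Ignore the insert if another writer stored the same key first.
    conn.execute("INSERT OR IGNORE INTO results VALUES (?, ?)", (key, result))
    conn.commit()
    return result
```

Calling cached_result twice with the same integers in a different order evaluates my_function only once; the second call is answered from the table.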
If the function can be implemented in MySQL directly, I would suggest using Views.
https://www.mysqltutorial.org/mysql-views-tutorial.aspx/

How to use RETURNING for query.update() with sqlalchemy

I want to specify the return values for a specific update in sqlalchemy.
The documentation of the underlying update statement (sqlalchemy.sql.expression.update) says it accepts a "returning" argument and the docs for the query object state that query.update() accepts a dictionary "update_args" which will be passed as the arguments to the query statement.
Therefore my code looks like this:
session.query(
    ItemClass
).update(
    {ItemClass.value: value_a},
    synchronize_session='fetch',
    update_args={
        'returning': (ItemClass.id,)
    }
)
However, this does not seem to work. It just returns the regular integer.
My question is now: Am I doing something wrong or is this simply not possible with a query object and I need to manually construct statements or write raw sql?
The full solution that worked for me was to use the SQLAlchemy table object directly.
You can get that table object and the columns from your model easily by doing
table = Model.__table__
columns = table.columns
Then with this table object, I can replicate what you did in the question:
from your_settings import db
update_statement = table.update().returning(table.c.id)\
    .where(columns.column_name == value_one)\
    .values(column_name='New column name')
result = db.session.execute(update_statement)
tuple_of_results = result.fetchall()
db.session.commit()
The tuple_of_results variable would contain a list of the returned rows.
Note that you have to run db.session.commit() in order to persist the changes to the database, as the statement is running within a transaction.
You could perform an update based on the current value of a column by doing something like:
update_statement = table.update().returning(table.c.id)\
    .where(columns.column_name == value_one)\
    .values(like_count=columns.like_count + 1)
This would increment our numeric like_count column by one.
Hope this was helpful.
Here's a snippet from the SQLAlchemy documentation:
# UPDATE..RETURNING
result = table.update().returning(table.c.col1, table.c.col2).\
    where(table.c.name=='foo').values(name='bar')
print result.fetchall()
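To see what UPDATE ... RETURNING hands back at the SQL level, independent of SQLAlchemy, here is a small sketch using Python's sqlite3 module. It assumes an items table invented for the example; SQLite only understands RETURNING from version 3.35, so the sketch falls back to a SELECT followed by an UPDATE on older builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT);
    INSERT INTO items (value) VALUES ('a'), ('b'), ('a');
""")


def update_returning_ids(old_value, new_value):
    """Set value = new_value where value = old_value; return affected ids."""
    if sqlite3.sqlite_version_info >= (3, 35, 0):
        cursor = conn.execute(
            "UPDATE items SET value = ? WHERE value = ? RETURNING id",
            (new_value, old_value),
        )
        ids = [row[0] for row in cursor.fetchall()]  # read rows before commit
    else:
        # Older SQLite has no RETURNING: select the ids, then update.
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM items WHERE value = ?", (old_value,))]
        conn.execute(
            "UPDATE items SET value = ? WHERE value = ?",
            (new_value, old_value))
    conn.commit()
    return sorted(ids)


updated_ids = update_returning_ids("a", "z")
```

As in the answers above, the returned rows are fetched from the result of executing the statement, and the commit comes afterwards.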

Multiple, unknown number of fields passed into a query

Is it possible to create a generic query that would work for different types of documents? For example, I have "cases" and "factories".
They have different sets of fields, e.g.:
{
    id: 'case_o1',
    name: 'Case numero uno',
    amount: 40
}
{
    id: 'factory_002',
    location: 'Venezuela',
    workers: 200,
    operating: true
}
Is it possible to create a generic query where I would pass the type of an entity (case or factory) and additional parameters and it would filter results based on those?
I could of course use javascript view, but it doesn't allow me to filter by multiple fields. Let's say I want to fetch all factories located in Venezuela, with number of workers between 20 and 55.
I started with this, but then I got stuck:
select * from `mybucket` as entity
where position(meta(entity).id, $entity_type) == 0
How do I pass multiple predicates and have the query to recognize them?
I can of course list fields like this:
where position(meta(entity).id, $entity_type) == 0
and entity.location == 'Venezuela'
and entity.workers > $workers_min
and entity.workers < $workers_max
but then
I'm gonna have to create a separate query for each entity
And even then it won't solve my problem - I have no idea how to ignore predicates, what if next time $workers_min and $workers_max are not passed, does it mean I have to create a query for every single predicate (column)?
For security reasons I cannot generate free-form queries and pass them to Couchbase server, all the queries are already stored in the database, our api just picks them up out of a document and executes them
I think it's possible to create a query that would be "short-circuiting" for args that's undefined (e.g. WHERE $location IS MISSING OR entity.location == $location or something like that)
Is it possible at all to create a query that would be able to effectively filter and order a dataset based on arbitrary parameters? Or there's no way?
@Agzam Sorry, I was writing my comment when you said it. Anyway, what you are asking for is possible by using COALESCE in not-too-complex expressions, but it is a REALLY bad idea because it will defeat most of the database's internal optimizations, including the use of any existing index. So, unless you are dealing with a relatively small database (and you are sure it will stay approximately the same size), I suggest you try a different approach. This is, in fact, the reason I implemented sqlapi.
If you need to have all queries stored in the database beforehand, it would probably be much better to sort the given arguments by name and to precalculate and store a query for each possible combination.
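A rough sketch of that precalculation, assuming three optional filters for the factory entity: every subset of argument names, sorted by name, becomes the key of one stored query. PREDICATES, BASE, build_stored_queries and pick_query are hypothetical names, and a plain dict stands in for the documents that would hold the stored queries.

```python
from itertools import combinations

# Hypothetical optional filters for the "factory" entity.
PREDICATES = {
    "location": "entity.location == $location",
    "workers_max": "entity.workers < $workers_max",
    "workers_min": "entity.workers > $workers_min",
}

BASE = ("select * from `mybucket` as entity "
        "where position(meta(entity).id, $entity_type) == 0")


def build_stored_queries(predicates):
    """Map each sorted tuple of argument names to its query text."""
    names = sorted(predicates)
    stored = {}
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            clauses = "".join(f" and {predicates[name]}" for name in combo)
            stored[combo] = BASE + clauses
    return stored


STORED_QUERIES = build_stored_queries(PREDICATES)


def pick_query(args):
    # Sorting the supplied names gives the lookup key.
    return STORED_QUERIES[tuple(sorted(args))]
```

The number of stored queries doubles with every optional argument (8 for the three above), so this only stays manageable for a small set of filters.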
You can do it by assigning a default value to the variable when it is not used. For instance, if $location is not used, you can set it to -1 as the default value.
Then the where condition would be:
WHERE ($location=-1 OR entity.location = $location)
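The same short-circuit pattern can be tried out with Python's sqlite3 module. This is a sketch against SQLite rather than Couchbase, and it uses NULL as the "not supplied" marker instead of -1; the factories table, its rows (apart from factory_002) and find_factories are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE factories (id TEXT PRIMARY KEY, location TEXT, workers INTEGER);
    INSERT INTO factories VALUES
        ('factory_001', 'Venezuela', 40),
        ('factory_002', 'Venezuela', 200),
        ('factory_003', 'Peru', 30);
""")

# Each predicate is skipped when its parameter is NULL (None in Python).
QUERY = """
    SELECT id FROM factories
    WHERE (:location IS NULL OR location = :location)
      AND (:workers_min IS NULL OR workers > :workers_min)
      AND (:workers_max IS NULL OR workers < :workers_max)
    ORDER BY id
"""


def find_factories(location=None, workers_min=None, workers_max=None):
    params = {
        "location": location,
        "workers_min": workers_min,
        "workers_max": workers_max,
    }
    return [row[0] for row in conn.execute(QUERY, params)]
```

One stored query text then serves every combination of arguments, at the cost the other answer warns about: the optimizer may no longer be able to use indexes on the filtered columns.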

Django queryset count with extra select

I have a model with PointField for location coordinates. I have a MySQL function that calculates the distance between two points called dist. I use extra() "select" to calculate distance for each returned object in the queryset. I also use extra() "where" to filter those objects that are within a specific range. Like this
query = queryset.extra(
    select={
        "distance": "dist(geomfromtext('%s'),geomfromtext('%s'))" % (loc1, loc2)
    },
    where=["1 having `distance` <= %s" % (km)]
)  # simplified example
This works fine for getting and reading the results, except when I try counting the resultset I get the error that 'distance' is not a field. After exploring a bit further, it seems that count ignores the "select" from extra and just uses "where". While the full SQL query looks like this:
SELECT (dist(geomfromtext('POINT (-4.6858300000000003 36.5154300000000021)'),geomfromtext('POINT (-4.8858300000000003 36.5154300000000021)'))) AS `distance`, `testmodel`.`id`, `testmodel`.`name`, `testmodel`.`email`, (...) FROM `testmodel` WHERE 1 having `distance` <= 50.0
The count query is much shorter and doesn't have the dist selection part:
SELECT COUNT( `testmodel`.`id`) FROM `testmodel` WHERE 1 having `distance` <= 50.0
Logically, MySQL gives an error because "distance" is undefined. Is there a way to tell Django it has to include the extra select for the count?
Thanks for any ideas!
You could use a raw query if you are not planning to use any other database system.
params = {'point1': wktpoint1, 'point2': wktpoint2}
query = """
SELECT
    dist(%(point1)s, %(point2)s)
FROM
    testmodel
;"""
query_set = self.raw(query, params)
Also, if you need more GIS support, you should evaluate PostgreSQL+PostGIS (if you don't like reinventing the wheel, you should not write your own dist function).
Django offers GIS support through GeoDjango. There you get functions like distance. You should check support here
In order to use GeoDjango you need to add a geometry field to your model and tell it to use the GeoManager. Then you can start doing geo queries, and you should have no problems with count.
With MySQL you can do something like this using GeoDjango:
### models.py
from django.contrib.gis.db import models
class YourModel(models.Model):
    your_geo_field = models.PolygonField()
    # your_geo_field = models.PointField()
    # your_geo_field = models.GeometryField()
    objects = models.GeoManager()
### your code
from django.contrib.gis.geos import *
from django.contrib.gis.measure import D
a_geom = fromstr('POINT(-96.876369 29.905320)', srid=4326)
distance = 5
YourModel.objects.filter(your_geo_field__distance_lt=(a_geom, D(m=distance))).count()
you can see better examples here and the reference here