Django queryset count with extra select - mysql

I have a model with a PointField for location coordinates. I have a MySQL function called dist that calculates the distance between two points. I use extra() "select" to calculate the distance for each returned object in the queryset. I also use extra() "where" to filter those objects that are within a specific range. Like this:
query = queryset.extra(
    select={
        "distance": "dist(geomfromtext('%s'),geomfromtext('%s'))" % (loc1, loc2)
    },
    where=["1 having `distance` <= %s" % (km)]
)  # simplified example
This works fine for getting and reading the results, but when I try counting the result set I get an error saying that 'distance' is not a field. After exploring a bit further, it seems that count ignores the "select" from extra and only uses "where". The full SQL query looks like this:
SELECT (dist(geomfromtext('POINT (-4.6858300000000003 36.5154300000000021)'),geomfromtext('POINT (-4.8858300000000003 36.5154300000000021)'))) AS `distance`, `testmodel`.`id`, `testmodel`.`name`, `testmodel`.`email`, (...) FROM `testmodel` WHERE 1 having `distance` <= 50.0
The count query is much shorter and doesn't have the dist selection part:
SELECT COUNT( `testmodel`.`id`) FROM `testmodel` WHERE 1 having `distance` <= 50.0
Logically, MySQL gives an error because "distance" is undefined. Is there a way to tell Django it has to include the extra select for the count?
Thanks for any ideas!

You could use a raw query if you are not planning to use any other database system.
params = {'point1': wktpoint1, 'point2': wktpoint2}
# raw() needs the primary key in the SELECT list to build model instances
query = """
    SELECT
        id,
        dist(%(point1)s, %(point2)s) AS distance
    FROM
        testmodel
;"""
query_set = self.raw(query, params)
Also, if you need more GIS support, you should evaluate PostgreSQL+PostGIS. (If you don't want to reinvent the wheel, you should not write your own dist function.)
Django offers GIS support through GeoDjango, which provides functions like distance. You should check support here.
In order to use GeoDjango you need to add a field to your model and tell the model to use the GeoManager. Then you can start doing geo queries, and you should have no problems with count.
With MySQL you can do something like this using GeoDjango:
### models.py
from django.contrib.gis.db import models
class YourModel(models.Model):
    your_geo_field = models.PolygonField()
    # your_geo_field = models.PointField()
    # your_geo_field = models.GeometryField()
    objects = models.GeoManager()
### your code
from django.contrib.gis.geos import fromstr
from django.contrib.gis.measure import D
a_geom = fromstr('POINT(-96.876369 29.905320)', srid=4326)
distance = 5
YourModel.objects.filter(your_geo_field__distance_lt=(a_geom, D(m=distance))).count()
You can see better examples here and the reference here.
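The failure from the question can be reproduced outside Django, which may make the cause easier to see. The sketch below is only an illustration under assumptions: it uses Python's built-in sqlite3 in place of MySQL, registers a toy planar dist() function instead of the asker's geometry function, and invents the x and y columns. It shows that a count filtering on the distance alias fails when the alias is not selected, and that counting over a subquery that does select it works.

```python
import math
import sqlite3


def dist(x1, y1, x2, y2):
    """Toy planar distance; a stand-in for the asker's MySQL dist()."""
    return math.hypot(x2 - x1, y2 - y1)


conn = sqlite3.connect(":memory:")
conn.create_function("dist", 4, dist)
conn.execute(
    "CREATE TABLE testmodel (id INTEGER PRIMARY KEY, name TEXT, x REAL, y REAL)"
)
conn.executemany(
    "INSERT INTO testmodel (name, x, y) VALUES (?, ?, ?)",
    [("a", 0.0, 0.0), ("b", 3.0, 4.0), ("c", 30.0, 40.0)],
)

# Same shape as the generated count query: the alias is filtered on
# but never defined, so the database rejects the statement.
try:
    conn.execute("SELECT COUNT(id) FROM testmodel WHERE distance <= ?", (10.0,))
    count_error = None
except sqlite3.OperationalError as exc:
    count_error = str(exc)

# Workaround: define the alias in an inner SELECT and count the outer rows.
count_sql = """
SELECT COUNT(*) FROM (
    SELECT id, dist(x, y, ?, ?) AS distance
    FROM testmodel
) AS sub
WHERE distance <= ?
"""
(within_range,) = conn.execute(count_sql, (0.0, 0.0, 10.0)).fetchone()
print(count_error)
print(within_range)
```

Whether a given Django version can be made to generate such a subquery for an extra() queryset is not something this sketch establishes; it only shows the SQL shape that avoids the undefined alias.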


Laravel model scope having sum larger than x

I have the following two models:
ModelA
- id
- population
ModelB
- id
- model_a_id
- population
Now, I want to define a scope "not_full". The aim of this scope is to only return the ModelA instances where the population is larger than the population in the related ModelB's. This brings me to the following scope function in ModelA:
public function scopeTesting(Builder $query): Builder
{
    return $query
        ->join('model_b', 'model_a.id', '=', 'model_b.model_a_id')
        ->groupBy('model_a.id')
        ->havingRaw('SUM(model_b.population) <= model_a.population');
}
Since this becomes a MySQL query, we have to deal with MySQL 8's only_full_group_by mode. The query then fails because Eloquent uses SELECT * and MySQL doesn't know which model_b.id value it should use. However, we don't care about model_b.id; we only care about the values in ModelA. This would normally be solved by changing the select to SELECT model_a.*, but that will probably cause bugs further down the road, as every query with this scope would then only be able to return the values of ModelA. Is there any way to get around this issue in Eloquent?
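One possible direction, sketched here at the SQL level only, is to move the aggregate into a correlated subquery in the WHERE clause, so the outer query needs neither a join nor a GROUP BY. This is an assumption-laden illustration, not a tested Eloquent scope: it uses Python's built-in sqlite3 instead of MySQL, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE model_a (id INTEGER PRIMARY KEY, population INTEGER);
    CREATE TABLE model_b (
        id INTEGER PRIMARY KEY,
        model_a_id INTEGER,
        population INTEGER
    );
    INSERT INTO model_a (id, population) VALUES (1, 10), (2, 5), (3, 7);
    INSERT INTO model_b (model_a_id, population)
        VALUES (1, 4), (1, 5), (2, 6), (2, 1);
    """
)

# No join and no GROUP BY in the outer query, so SELECT * only ever
# contains model_a columns.
sql = """
SELECT *
FROM model_a
WHERE (
    SELECT SUM(model_b.population)
    FROM model_b
    WHERE model_b.model_a_id = model_a.id
) <= model_a.population
ORDER BY model_a.id
"""
rows = conn.execute(sql).fetchall()
print(rows)
```

As with the original join, a model_a row without any model_b rows is left out here, because SUM over no rows is NULL and the comparison is then not true. Wrapping the subquery in COALESCE(..., 0) would include such rows instead.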

sqlalchemy column that is sql expression with bind parameters

I'm trying to map a class that has a column, that doesn't really exist, but is simply a sql expression that takes bind parameters at query time. The model below is an example of what I'm trying to do.
class House(db.Model):
    __tablename__ = 'houses'
    id = db.Column(db.Integer, primary_key=True)
    @hybrid_property
    def distance_from_current_location(self):
        # what to return here?
        return self.distance_from_current_location
    @distance_from_current_location.expression
    def distance_from_current_location(cls, latitude, longitude):
        return db.func.earth_distance(
            db.func.ll_to_earth(latitude, longitude), cls.earth_location)
    # defer loading, as the raw value is pretty useless in python
    earth_location = deferred(db.Column(EARTH))
Then I'd like to query via flask-sqlalchemy:
latitude = 80.20393
longitude = -90.44380
paged_data = \
    House.query \
        .bindparams({'latitude': latitude, 'longitude': longitude}) \
        .paginate(1, 20, False)
My questions are:
How do I do this? Is it possible to use a hybrid_property like this?
If I can use a hybrid_property, what should the Python method return? (There is no Python way to interpret this; it should just return whatever the DB expression returned.)
latitude and longitude only exist during query time and need to be bound for each query. How do I bind latitude and longitude during query time? The bindparams bit in my code snippet I just made up, but it illustrates what I want to do. Is it possible to do this?
I've read the docs but couldn't find any hybrid_property or method with a bind parameter in the examples...
(Also since this is not a real column, but just something I want to use on my model I don't want this to trigger alembic to generate a new column for it).
Thanks!
You can't do this. distance_from_current_location is not a fake column either, since it depends on query-specific parameters. Imagine you were to write a SQL view for this; how would you write the definition? (Hint: you can't.)
SQLAlchemy uses the identity map pattern, which means that for a particular primary key, only one instance exists in your entire session. How would you handle querying the same instance but with different latitude/longitude values? (The instances returned from the later query would overwrite those returned from the earlier one.)
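The overwrite problem can be illustrated with a toy model of an identity map. This is not SQLAlchemy code: ToySession, its load() method, and the distance attribute are hypothetical names made up for the sketch, which only mimics the "one object per primary key" behaviour described above.

```python
class House:
    """Stand-in for a mapped class; not a real SQLAlchemy model."""

    def __init__(self, house_id):
        self.id = house_id
        self.distance = None


class ToySession:
    """Keeps exactly one object per primary key, like an identity map."""

    def __init__(self):
        self._identity_map = {}

    def load(self, house_id, distance):
        # Reuse the existing instance for this primary key if there is one.
        house = self._identity_map.setdefault(house_id, House(house_id))
        # A per-query value stored on the instance replaces the previous one.
        house.distance = distance
        return house


session = ToySession()
from_here = session.load(1, 120.0)     # "query" with one location
from_there = session.load(1, 9800.0)   # "query" with another location

same_object = from_here is from_there
print(same_object, from_here.distance)
```

Both names refer to the same object, so the distance computed for the first location is no longer available once the second "query" has run.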
The correct way to do this is through additional entities at query time, like this:
House.query \
    .with_entities(House, db.func.earth_distance(db.func.ll_to_earth(latitude, longitude), House.earth_location)) \
    .filter(...)
Or through a hybrid_method (the usage of which requires passing in latitude and longitude every time):
class House(db.Model):
    ...
    @hybrid_method
    def distance_from_current_location(self, latitude, longitude):
        # implement earth_distance(ll_to_earth(latitude, longitude), self.earth_location) **in Python** here
        ...
    @distance_from_current_location.expression
    def distance_from_current_location(cls, latitude, longitude):
        ...

Rails - how to fetch random records from an object?

I am doing something like this:
data = Model.where('something="something"')
random_data = data.rand(100..200)
returns:
NoMethodError (private method `rand' called for #<User::ActiveRecord_Relation:0x007fbab27d7ea8>):
Once I get this random data, I need to iterate through that data, like this:
random_data.each do |rd|
...
I know there's a way to fetch random data in MySQL, but I need to pick random data about 400 times, so I think loading the data once from the database and picking random records 400 times is more efficient than running the query 400 times on MySQL.
But - how to get rid of that error?
NoMethodError (private method `rand' called for #<User::ActiveRecord_Relation:0x007fbab27d7ea8>):
Thank you in advance
I would add the following scope to the model (depends on the database you are using):
# to model/model.rb
# 'RANDOM' works with postgresql and sqlite, whereas mysql uses 'RAND'
scope :random, -> { order('RAND()') }
Then the following query would load a random number (in the range of 200-400) of objects in one query:
Model.random.limit(rand(200...400))
If you really want to do that in Rails and not in the database, then load all records and use sample:
Model.all.sample(rand(200..400))
But that is likely to be slower (depending on the number of entries in the database), because Rails would load all records from the database and instantiate them, which might take loads of memory.
It really depends on how much effort you want to put into optimizing this, because there's more than one solution. Here are two options.
Something simple is to use ORDER BY RAND() LIMIT 400 to randomly select 400 items.
Alternatively, just select everything under the moon and then use Ruby to randomly pick 400 out of the total result set, ex:
data = Model.where(something: 'something').all # all is necessary to exec query
400.times do
  data.sample # returns a random model
end
I wouldn't recommend the second method, but it should work.
Another way, which is not DB specific, is:
def self.random_record
  self.where('something = ? and id = ?', "something", rand(self.count))
end
The only catch here is that two queries are performed: self.count runs one query, SELECT COUNT(*) FROM models, and the other is your actual query to get a random record.
Now suppose you want n random records. Then write it like this:
def self.random_records n
  records = self.count
  rand_ids = Array.new(n) { rand(records) }
  self.where('something = ? and id IN (?)',
             "something", rand_ids)
end
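Picking ids with rand(count) quietly assumes that ids run from 0 to count - 1 without gaps, and it can pick the same id twice. A variation that avoids both assumptions is to fetch the ids that actually exist and sample from those. The sketch below shows the idea in Python with the built-in sqlite3 module rather than Rails; the models table, its contents, and the random_records helper are made up for illustration.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models (id INTEGER PRIMARY KEY, something TEXT)")
# Leave gaps in the id sequence (no multiples of 3) to mimic deleted rows.
conn.executemany(
    "INSERT INTO models (id, something) VALUES (?, ?)",
    [(i, "something") for i in range(1, 21) if i % 3 != 0],
)


def random_records(conn, n):
    """Return up to n distinct random rows matching the filter."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM models WHERE something = ?", ("something",)
        )
    ]
    chosen = random.sample(ids, min(n, len(ids)))
    if not chosen:
        return []
    placeholders = ", ".join("?" for _ in chosen)
    return conn.execute(
        f"SELECT id, something FROM models WHERE id IN ({placeholders})",
        chosen,
    ).fetchall()


picked = random_records(conn, 5)
print(picked)
```

This still costs two queries, like the version above, and the id list grows with the table, so it is a trade-off rather than a clear win for very large tables.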
Use data.sample(rand(100..200)).
For more information on why rand is not working, see https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/4555

Retrieving ultimate sql query sentence (with the values in place of any '?')

Since it may be efficient to paste a flawed SQL query directly into a database administration tool such as phpMyAdmin and work on it until it returns the expected result, is there any way to retrieve the final SQL statement that SQLAlchemy Core passes to the MySQL database, in a ready-to-execute form?
This typically means that you want the bound parameters to be rendered inline. There is limited support for this automatically (as of SQLA 0.9 this will work):
from sqlalchemy.sql import table, column, select
t = table('x', column('a'), column('b'))
stmt = select([t.c.a, t.c.b]).where(t.c.a > 5).where(t.c.b == 10)
print(stmt.compile(compile_kwargs={"literal_binds": True}))
Also, you'd probably want the query to be MySQL specific, so if you already have an engine lying around you can pass that in too:
from sqlalchemy import create_engine
engine = create_engine("mysql://")
print(stmt.compile(engine, compile_kwargs={"literal_binds": True}))
and it prints:
SELECT x.a, x.b
FROM x
WHERE x.a > 5 AND x.b = 10
Now, if you have more elaborate values in the parameters, like dates, SQLAlchemy might throw an error, because it only has "literal binds" renderers for a very limited number of types. An approach that bypasses that system and gives you a pretty direct shot at turning those parameters into strings is to do a "search and replace" on the statement object, replacing the bound parameters with literal strings:
from sqlalchemy.sql import visitors, literal_column
from sqlalchemy.sql.expression import BindParameter
def _replace(arg):
    if isinstance(arg, BindParameter):
        return literal_column(
            repr(arg.effective_value)  # <- do any fancier conversion here
        )
stmt = visitors.replacement_traverse(stmt, {}, _replace)
once you do that you can just print it:
print(stmt)
or the MySQL version:
print(stmt.compile(engine))
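If all you have is the SQL string with '?' placeholders and the parameter list, the same "search and replace" idea can be done by hand. The sketch below is a debugging aid only, written with the standard library and not part of SQLAlchemy; sql_literal and render_inline are hypothetical helper names. It assumes the SQL contains no literal '?' characters, and its quoting is deliberately simple (it does not handle MySQL backslash escapes), so the output is meant for pasting into an admin tool, never for executing untrusted input.

```python
import datetime
import re


def sql_literal(value):
    """Render one Python value as a SQL literal (debugging aid only)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, datetime.datetime):
        return "'" + value.isoformat(sep=" ") + "'"
    if isinstance(value, datetime.date):
        return "'" + value.isoformat() + "'"
    # Double any single quotes, the standard SQL way to escape them.
    return "'" + str(value).replace("'", "''") + "'"


def render_inline(sql, params):
    """Replace each '?' placeholder with its literal, left to right."""
    if sql.count("?") != len(params):
        raise ValueError("placeholder count does not match parameter count")
    values = iter(params)
    return re.sub(r"\?", lambda match: sql_literal(next(values)), sql)


rendered = render_inline(
    "SELECT x.a, x.b FROM x WHERE x.a > ? AND x.b = ? AND x.created < ?",
    [5, "O'Brien", datetime.date(2020, 1, 2)],
)
print(rendered)
```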

SQLAlchemy query to return only n results?

I have been googling and reading through the SQLAlchemy documentation but haven't found what I am looking for.
I am looking for a function in SQLAlchemy that limits the number of results returned by a query to a certain number, for example: 5? Something like first() or all().
for sqlalchemy >= 1.0.13
Use the limit method.
query(Model).filter(something).limit(5).all()
Alternative syntax (slicing already returns a list, so no .all() is needed)
query(Model).filter(something)[:5]
If you need it for pagination you can do like this:
query = db.session.query(Table1, Table2, ...).filter(...)
if page_size is not None:
    query = query.limit(page_size)
if page is not None:
    query = query.offset(page*page_size)
query = query.all()
Or if you query one table and have a model for it you can:
query = (Model.query.filter(...)
         .paginate(page=start, per_page=size))
Since v1.4, SQLAlchemy core's select function provides a fetch method for RDBMS that support FETCH clauses*. FETCH was defined in the SQL 2008 standard to provide a consistent way to request a partial result, as LIMIT/OFFSET is not standard.
Example:
# As with limit queries, it's usually sensible to order
# the results to ensure results are consistent.
q = select(tbl).order_by(tbl.c.id).fetch(10)
# Offset is supported, but it is inefficient for large resultsets.
q_with_offset = select(tbl).order_by(tbl.c.id).offset(10).fetch(10)
# A suitable where clause may be more efficient
q = (select(tbl)
     .where(tbl.c.id > max_id_from_previous_query)
     .order_by(tbl.c.id)
     .fetch(10)
)
The syntax is supported in the ORM layer since v1.4.38. It is only supported for 2.0-style select on models; the legacy session.query syntax does not support it.
q = select(Model).order_by(Model.id).fetch(10)
* Currently Oracle, PostgreSQL and MSSQL.
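The "suitable where clause" variant above is often called keyset pagination. The sketch below walks through a whole table with it, as an illustration under assumptions: it uses Python's built-in sqlite3 and LIMIT, because SQLite does not offer the FETCH syntax, and the tbl table, its rows, and the fetch_page helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO tbl (id, label) VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 26)],
)


def fetch_page(conn, after_id, page_size):
    """Return the next page of rows whose id is greater than after_id."""
    return conn.execute(
        "SELECT id, label FROM tbl WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()


pages = []
last_id = 0
while True:
    page = fetch_page(conn, last_id, 10)
    if not page:
        break
    pages.append([row[0] for row in page])
    # Remember where this page ended instead of counting skipped rows.
    last_id = page[-1][0]

print([len(p) for p in pages])
```

Each page starts from the last id already seen, so the database does not have to scan and discard the skipped rows the way it does with a growing OFFSET.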
In my case it works like
def get_members():
    m = Member.query[:30]
    return m