Ordering a queryset by occurrences

I have a Django model:
class Field(models.Model):
    choice = models.CharField(choices=choices)
    value = models.CharField(max_length=255)
In my database there are some cases where 3 "fields" share the same choice, and some cases where only 1 field has that choice.
How can I order the queryset so that it is sorted by choice, but with all the choices that occur in a set of 3 at the start?
For example:
[1,1,1,3,3,3,4,4,4,2,5], where 1, 2, 3, 4, 5 are the possible choices.

This is the best I can do using Django's ORM. Basically, just like in SQL, you have to construct a custom ORDER BY expression. In our case, we'll place it in the SELECT and then order by it:
1) Get a list of choices sorted by frequency: [1, 3, 4, 2, 5]
freq_list = (
    Field.objects.values_list('choice', flat=True)
    .annotate(c=Count('id')).order_by('-c', 'choice')
)
2) Add indexes with enumerate: [(0,1), (1,3), (2,4), (3,2), (4,5)]
enum_list = list(enumerate(freq_list))
3) Create a list of cases: ['CASE', 'WHEN choice=1 THEN 0', ..., 'END']
case_list = ['CASE']
case_list += ["WHEN choice={1} THEN {0}".format(*tup) for tup in enum_list]
case_list += ['END']
4) Combine the case list into one string: 'CASE WHEN choice=1 THEN 0 ...'
case_statement = ' '.join(case_list)
5) Finally, use the case statement to select an extra field 'o' that holds the corresponding order, then order by this field:
Field.objects.extra(select={'o': case_statement}).order_by('o')
To simplify all this, you can put the above code into a Model Manager:
from django.db import models
from django.db.models import Count
class FieldManager(models.Manager):
    # Named get_query_set() before Django 1.6
    def get_queryset(self):
        freq_list = (
            Field.objects.values_list('choice', flat=True)
            .annotate(c=Count('id')).order_by('-c', 'choice')
        )
        enum_list = list(enumerate(freq_list))
        case_list = ['CASE']
        case_list += ["WHEN choice={1} THEN {0}".format(*tup) for tup in enum_list]
        case_list += ['END']
        case_statement = ' '.join(case_list)
        ordered = Field.objects.extra(select={'o': case_statement}).order_by('o')
        return ordered
class Field(models.Model):
    ...
    # Declare the default manager explicitly: Django does not add
    # `objects` automatically once a custom manager is defined.
    objects = models.Manager()
    freq_sorted = FieldManager()
Now you can query:
Field.freq_sorted.all()
This will get you a Field QuerySet sorted by the frequency of its choices.
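The CASE-building steps in this answer can be tried without a database. The sketch below is only an illustration: build_case_statement is a hypothetical helper, the frequency count is done with collections.Counter instead of the ORM, and it assumes integer choices (string choices would need quoting or query parameters rather than plain string formatting).

```python
from collections import Counter


def build_case_statement(choices, column="choice"):
    """Return a SQL CASE expression that ranks values by descending frequency."""
    counts = Counter(choices)
    # Most frequent first; ties broken by the value itself,
    # mirroring order_by('-c', 'choice') in the answer.
    ranked = sorted(counts, key=lambda value: (-counts[value], value))
    parts = ["CASE"]
    parts += [
        "WHEN {}={} THEN {}".format(column, value, rank)
        for rank, value in enumerate(ranked)
    ]
    parts.append("END")
    return " ".join(parts)


sample = [1, 1, 1, 3, 3, 3, 4, 4, 4, 2, 5]
case_statement = build_case_statement(sample)
print(case_statement)
```

The resulting string is what would be passed to extra(select={'o': ...}) in the answer.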

You could write a function that detects which choices are repeated and selects the unique ones, then call it from MySQL as a stored function.

Related

sqlalchemy EXCEPT: select all IDs except some ids

I am trying to query my database with SQLAlchemy in Python to select all rows except those whose IDs belong to a certain list. Something like this:
exceptList = [1, 3, 5]
db.query.all() except those in exceptList
How do I go about this?

Given this initial setup:
class Question(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    category = db.Column(db.String)
db.create_all()
# Assign category alternately: id 1 -> A, id 2 -> B etc.
db.session.add_all([Question(category='AB'[n % 2]) for n in range(5)])
db.session.commit()
Let's try to get a question for category "A", assuming questions 1 - 3 have already been asked.
If you already have the list, you can do
q = Question.query.filter(Question.id.not_in([1, 2, 3]), Question.category == 'A')
next_question = q.one()
print(next_question.id, next_question.category)
If the exception list must be obtained via a query, you can use an EXCEPT clause:
# Create a filter that defines the rows to skip
skip = Question.query.filter(db.or_(Question.id < 4, Question.category == 'B'))
q = Question.query.except_(skip)
next_question = q.one()
print(next_question.id, next_question.category)
This documentation section describes how to use except_ (though it uses UNION as an example).
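The two approaches in this answer map onto plain SQL, which can be checked with Python's built-in sqlite3 module. This is a minimal sketch, not the SQLAlchemy code itself: it assumes a question table shaped like the model above and writes by hand the NOT IN and EXCEPT statements the queries are expected to correspond to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE question (id INTEGER PRIMARY KEY, category TEXT)")
# Same setup as above: ids 1..5 with categories A, B, A, B, A
conn.executemany(
    "INSERT INTO question (category) VALUES (?)",
    [("AB"[n % 2],) for n in range(5)],
)

# Exclusion list known up front: NOT IN
not_in_rows = conn.execute(
    "SELECT id, category FROM question "
    "WHERE id NOT IN (1, 2, 3) AND category = 'A'"
).fetchall()

# Exclusion defined by another query: EXCEPT
except_rows = conn.execute(
    "SELECT id, category FROM question "
    "EXCEPT "
    "SELECT id, category FROM question WHERE id < 4 OR category = 'B'"
).fetchall()

print(not_in_rows, except_rows)
```

Both statements leave only question 5 in category "A" for this data.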

You can try something like below.
except_list = ["password", "another_column"]
result = session.query(*[c for c in User.__table__.c if c.name not in except_list]).all()

How to create union of two different django-models?

I have two Django models:
class ModelA(models.Model):
    title = models.CharField(..., db_column='title')
    text_a = models.CharField(..., db_column='text_a')
    other_column = models.CharField(..., db_column='other_column_a')
class ModelB(models.Model):
    title = models.CharField(..., db_column='title')
    text_a = models.CharField(..., db_column='text_b')
    other_column = None
Then I want to merge the two querysets of these models using union:
ModelA.objects.all().union(ModelB.objects.all())
But in the query I see
(SELECT
    `model_a`.`title`,
    `model_a`.`text_a`,
    `model_a`.`other_column`
FROM `model_a`)
UNION
(SELECT
    `model_b`.`title`,
    `model_b`.`text_b`
FROM `model_b`)
Of course I got the exception "The used SELECT statements have a different number of columns".
How can I create aliases and fake columns so that the union query works?

You can annotate the missing last column to make up for the column-count mismatch.
from django.db.models import CharField, Value
a = ModelA.objects.values_list('text_a', 'title', 'other_column')
b = (
    ModelB.objects.values_list('text_a', 'title')
    .annotate(other_column=Value("Placeholder", CharField()))
)
# for a list of tuples
a.union(b)
# or if you want a list of dicts
# (this has to be the values of the base query, in this case a)
a.union(b).values('text_a', 'title', 'other_column')

In a SQL query, we can use NULL to fill the remaining columns/aliases:
(SELECT
    `model_a`.`title`,
    `model_a`.`text_a`,
    `model_a`.`other_column`
FROM `model_a`)
UNION
(SELECT
    `model_b`.`title`,
    `model_b`.`text_b`,
    NULL
FROM `model_b`)
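The NULL-padding idea can be checked quickly with Python's built-in sqlite3 module. This is only a sketch under assumptions: the table layouts and sample rows are made up for illustration, and SQLite stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE model_a (title TEXT, text_a TEXT, other_column TEXT);
CREATE TABLE model_b (title TEXT, text_b TEXT);
INSERT INTO model_a VALUES ('a-title', 'a-text', 'extra');
INSERT INTO model_b VALUES ('b-title', 'b-text');
""")

# NULL pads the second SELECT so both sides have three columns
rows = conn.execute("""
SELECT title, text_a, other_column FROM model_a
UNION
SELECT title, text_b, NULL FROM model_b
ORDER BY title
""").fetchall()

for row in rows:
    print(row)
```

The padded column comes back as None in Python for the rows taken from model_b.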

In Django, union operations need to have the same columns, so with values() you can select only those specific columns, like this:
qsa = ModelA.objects.all().values('text_a', 'title')
qsb = ModelB.objects.all().values('text_a', 'title')
qsa.union(qsb)
But there is no way (that I know of) to mimic NULL in a union in Django. So there are two ways you can proceed here.
First, add an extra field named other_column to your model. You can leave its values empty like this:
other_column = models.CharField(max_length=255, null=True, default=None)
and use the Django queryset union operations as described here.
Second, a more Pythonic approach. Try something like this:
a = ModelA.objects.values_list('text_a', 'title', 'other_column')
b = ModelB.objects.values_list('text_a', 'title')
# Pad each ModelB row with None so both row shapes match,
# then skip rows that are already in the result
union_list = list(a)
for row in b:
    padded = row + (None,)
    if padded not in union_list:
        union_list.append(padded)
Hope it helps!!

Using GROUP_CONCAT with other annotations in Django

I use an annotation which counts upvotes/downvotes while returning a list of articles:
queryset = queryset.annotate(
    upvotes_count=models.Sum(
        models.Case(
            models.When(likes__like_state=1, then=1),
            default=0,
            output_field=models.IntegerField()
        )
    )
).annotate(
    downvotes_count=models.Sum(
        models.Case(
            models.When(likes__like_state=-1, then=1),
            default=0,
            output_field=models.IntegerField()
        )
    )
)
But each article also has a few categories as a ManyToMany related field, and I needed to return those categories comma-separated, so I wrote this aggregate class:
class GroupConcat(models.Aggregate):
    function = 'GROUP_CONCAT'
    template = "%(function)s(%(distinct)s %(expressions)s %(separator)s)"
    def __init__(self, expression, distinct=False, separator=', ', **extra):
        super(GroupConcat, self).__init__(
            expression,
            distinct='DISTINCT' if distinct else '',
            separator="SEPARATOR '%s'" % separator,
            output_field=models.CharField(),
            **extra
        )
And added it to my annotation:
queryset = queryset.annotate(category=GroupConcat('categories__name'))
It works fine, but upvotes_count and downvotes_count went crazy and started to multiply(!) their results by the number of categories.
So the question is: "Is there a way to use GROUP_CONCAT in Django without breaking down SUM annotations?"
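The multiplication described here can be reproduced outside Django. The sketch below uses Python's built-in sqlite3 module with made-up table names and data, and SQLite's group_concat as a stand-in for MySQL's GROUP_CONCAT. It shows that joining two to-many relations in one query repeats every like once per category, and that aggregating each relation in its own correlated subquery is one possible way to avoid that; this is an illustration, not the fix given in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article (id INTEGER PRIMARY KEY);
CREATE TABLE likes (article_id INTEGER, like_state INTEGER);
CREATE TABLE category (article_id INTEGER, name TEXT);
INSERT INTO article VALUES (1);
INSERT INTO likes VALUES (1, 1), (1, 1), (1, -1);
INSERT INTO category VALUES (1, 'news'), (1, 'tech');
""")

# Both relations joined at once: 3 likes x 2 categories = 6 joined rows,
# so each upvote is counted once per category.
inflated_upvotes = conn.execute("""
SELECT SUM(CASE WHEN l.like_state = 1 THEN 1 ELSE 0 END)
FROM article a
LEFT JOIN likes l ON l.article_id = a.id
LEFT JOIN category c ON c.article_id = a.id
GROUP BY a.id
""").fetchone()[0]

# Each relation aggregated in its own subquery: no cross-multiplication.
upvotes, categories = conn.execute("""
SELECT
    (SELECT COUNT(*) FROM likes l
     WHERE l.article_id = a.id AND l.like_state = 1),
    (SELECT group_concat(c.name, ', ') FROM category c
     WHERE c.article_id = a.id)
FROM article a
""").fetchone()

print(inflated_upvotes, upvotes, categories)
```

With two real upvotes and two categories, the joined query reports four upvotes while the subquery version reports two.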

Very nice solution.
But to make the query group by the field you pass to values(), you should also add an order_by() call.
For example:
Store.objects.all().values('value').order_by('value').annotate(stores=GroupConcat('id'))
would generate the SQL statement
SELECT store.value, GROUP_CONCAT(store.id SEPARATOR ",") AS stores
FROM store WHERE store.value > 0
GROUP BY store.value ORDER BY store.value ASC
and the result would be
value  stores
1      "16,27"
Without order_by it would be like this:
SELECT store.value, GROUP_CONCAT(store.id SEPARATOR ",") AS stores
FROM store WHERE store.value > 0
GROUP BY store.id ORDER BY store.value ASC
and result would be
value  stores
1      16
2      27

How to use the table columns instead of variables in QueryExpression::addCase()

In CakePHP's new ORM, you can use the QueryBuilder to build (in theory) any query.
I want to select the value of one of two columns, depending on another value. In a regular query, that can be done as follows:
SELECT IF(from_id = 1, to_id, from_id) AS other_id FROM messages;
I am trying to achieve the same query using the QueryBuilder and QueryExpression::addCase():
$messagesQuery = $this->Messages->find('all');
$messagesQuery->select([
    'other_id' => $messagesQuery->newExpr()->addCase(
        $messagesQuery->newExpr()->add(['from_id' => $this->authUser->id]),
        ['to_id', 'from_id'],
        ['integer', 'integer']
    )
]);
This does not work, as the passed values are not integers, but rather table columns containing integers.
Through trial and error (using the method add() again), I got the following:
$messagesQuery = $this->Messages->find('all');
$messagesQuery->select([
    'other_id' => $messagesQuery->newExpr()->addCase(
        $messagesQuery->newExpr()->add(['from_id' => $this->authUser->id]),
        [
            $messagesQuery->newExpr()->add(['to_id']),
            $messagesQuery->newExpr()->add(['from_id'])
        ],
        ['integer', 'integer']
    )
]);
This results in the following query:
SELECT (CASE WHEN from_id = 1 THEN to_id END) AS `other_id` FROM messages Messages
Now, the ELSE part is missing, although the CakePHP book states:
Any time there are fewer case conditions than values, addCase will automatically produce an if .. then .. else statement.
The examples in the CakePHP book are not very helpful in this case, as they only use static integers or strings as values, for example:
#SELECT SUM(CASE published = 'Y' THEN 1 ELSE 0) AS number_published, SUM(CASE published = 'N' THEN 1 ELSE 0) AS number_unpublished FROM articles GROUP BY published
$query = $articles->find();
$publishedCase = $query->newExpr()->addCase($query->newExpr()->add(['published' => 'Y']), 1, 'integer');
$notPublishedCase = $query->newExpr()->addCase($query->newExpr()->add(['published' => 'N']), 1, 'integer');
$query->select([
    'number_published' => $query->func()->sum($publishedCase),
    'number_unpublished' => $query->func()->sum($notPublishedCase)
])
->group('published');
Is there a way to get the method addCase to use the two table columns as values instead of just static values?

As it turns out, I was just one logical step short of the solution in my previous edit.
As the CakePHP book correctly states:
Any time there are fewer case conditions than values, addCase will automatically produce an if .. then .. else statement.
For that to work though, both the conditions and values have to be an array, even if there is only one condition. (This the CakePHP book does not state.)
This code:
$messagesQuery = $this->Messages->find('all');
$messagesQuery->select([
    'other_id' => $messagesQuery->newExpr()->addCase(
        [
            $messagesQuery->newExpr()->add(['from_id' => $this->authUser->id])
        ],
        [
            $messagesQuery->newExpr()->add(['to_id']),
            $messagesQuery->newExpr()->add(['from_id'])
        ],
        ['integer', 'integer']
    )
]);
results in this query:
SELECT (CASE WHEN from_id = 1 THEN to_id ELSE from_id END) AS `other_id` FROM messages Messages
Eureka
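The generated CASE ... ELSE statement can be sanity-checked on its own. The sketch below is an assumption-laden illustration in Python with the built-in sqlite3 module (not CakePHP): the messages rows are invented and auth_user_id plays the role of $this->authUser->id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (from_id INTEGER, to_id INTEGER)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, 2), (3, 1), (1, 4)],
)

auth_user_id = 1  # hypothetical id of the logged-in user

# Both THEN and ELSE return column values, not bound literals
other_ids = [
    row[0]
    for row in conn.execute(
        "SELECT CASE WHEN from_id = ? THEN to_id ELSE from_id END AS other_id "
        "FROM messages ORDER BY rowid",
        (auth_user_id,),
    )
]
print(other_ids)
```

For every message, other_id is whichever participant is not the logged-in user.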

Joining 2 Tables on Multiple Non Foreign Key Columns in Flask with SQLAlchemy and Retrieving All Columns

I have a few tables shown below that I would like to join on columns that are not foreign keys to each other's tables and then have access to the columns of both. Here are the classes:
class Yi(db.Model):
    year = db.Column(db.Integer(4), primary_key=True)
    industry_id = db.Column(db.String(5), primary_key=True)
    wage = db.Column(db.Float())
    complexity = db.Column(db.Float())
class Ygi(db.Model, AutoSerialize):
    year = db.Column(db.Integer(4), primary_key=True)
    geo_id = db.Column(db.String(8), primary_key=True)
    industry_id = db.Column(db.String(5), primary_key=True)
    wage = db.Column(db.Float())
So, what I would like to get are the columns of both tables, joined on the columns I specify, in this case year and industry_id. Is this possible? Here is the SQL I've written to achieve this:
SELECT
    yi.complexity, ygi.*
FROM
    yi, ygi
WHERE
    yi.year = ygi.year and
    yi.industry_id = ygi.industry_id

One dirty way is:
q = session.query(Ygi, Yi.complexity).\
    filter(Yi.year == Ygi.year).\
    filter(Yi.industry_id == Ygi.industry_id)
Which gives you:
SELECT ygi.year AS ygi_year, ygi.geo_id AS ygi_geo_id,
ygi.industry_id AS ygi_industry_id, ygi.wage AS ygi_wage,
yi.complexity AS yi_complexity
FROM ygi, yi
WHERE yi.year = ygi.year
AND yi.industry_id = ygi.industry_id
I find this dirty because it does not use the join() method.
You can figure out how to use join() from the SQLAlchemy documentation.
Then, you can choose to use a virtual model: see TokenMacGuy's answer to the question "Mapping a 'fake' object in SQLAlchemy".
That would be a good solution.
Alternatively, you can write a YiYgi class that does not derive from the SQLAlchemy declarative base but is just a plain object. It is more of a hand-made way to do it.
The class has a classmethod that will:
call the query you built just before,
call __init__ with the returned rows and build one instance per row.
Here is an example:
class YiYgi(object):
    def __init__(self, year, geo_id, industry_id, wage, complexity):
        # Initialize all your fields
        self.year = year
        self.geo_id = geo_id
        self.industry_id = industry_id
        self.wage = wage + 100  # You can even make some modifications to the values here
        self.complexity = complexity
    @classmethod
    def get_by_year_and_industry(cls, year, industry_id):
        """Return a list of YiYgi instances, or an empty list if nothing matches."""
        q = session.query(Ygi, Yi.complexity).\
            filter(Yi.year == Ygi.year).\
            filter(Yi.industry_id == Ygi.industry_id).\
            filter(Ygi.year == year).\
            filter(Ygi.industry_id == industry_id)  # restrict to the requested rows
        results = q.all()
        yiygi_list = []
        for result in results:
            # result is a tuple of (Ygi instance, Yi.complexity value)
            ygi_result = result[0]
            yiygi = cls(ygi_result.year,
                        ygi_result.geo_id,
                        ygi_result.industry_id,
                        ygi_result.wage,
                        result[1])
            yiygi_list.append(yiygi)
        return yiygi_list
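For comparison with the filter-based query, here is a minimal sketch of the same join written with an explicit JOIN ... ON over both columns. It uses Python's built-in sqlite3 module rather than SQLAlchemy, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yi (
    year INTEGER, industry_id TEXT, wage REAL, complexity REAL,
    PRIMARY KEY (year, industry_id)
);
CREATE TABLE ygi (
    year INTEGER, geo_id TEXT, industry_id TEXT, wage REAL,
    PRIMARY KEY (year, geo_id, industry_id)
);
INSERT INTO yi VALUES (2010, 'i0001', 10.0, 0.5), (2011, 'i0001', 11.0, 0.7);
INSERT INTO ygi VALUES
    (2010, 'g01', 'i0001', 9.0),
    (2010, 'g02', 'i0001', 8.0),
    (2011, 'g01', 'i0002', 7.0);
""")

# Join on two non-foreign-key columns at once
rows = conn.execute("""
SELECT ygi.year, ygi.geo_id, ygi.industry_id, ygi.wage, yi.complexity
FROM ygi
JOIN yi ON yi.year = ygi.year AND yi.industry_id = ygi.industry_id
ORDER BY ygi.geo_id
""").fetchall()

for row in rows:
    print(row)
```

In SQLAlchemy the equivalent is usually written by passing the ON condition to join() explicitly, for example join(Yi, and_(Yi.year == Ygi.year, Yi.industry_id == Ygi.industry_id)); treat that as a pointer to check against the SQLAlchemy documentation rather than tested code.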
