I am trying to query my database with SQLAlchemy in Python to select all rows except those whose IDs belong to a certain list. Something like this:
exceptList = [1, 3, 5]
db.query.all() except those in exceptList
How do I go about this?
Given this initial setup:
class Question(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    category = db.Column(db.String)
db.create_all()
# Assign categories alternately: id 1 -> A, id 2 -> B, etc.
db.session.add_all([Question(category='AB'[n % 2]) for n in range(5)])
db.session.commit()
Let's try to get the next question for category "A", assuming questions 1 - 3 have already been asked.
If you already have the list, you can do:
q = Question.query.filter(Question.id.not_in([1, 2, 3]), Question.category == 'A')
next_question = q.one()
print(next_question.id, next_question.category)
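For reference, here is roughly the SQL this kind of filter boils down to, sketched with the standard-library sqlite3 module. The question table below is a hypothetical stand-in for the model above, and the statement is an illustration rather than the exact SQL SQLAlchemy emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE question (id INTEGER PRIMARY KEY, category TEXT)")
# Same fixture as above: ids 1..5 with categories A, B, A, B, A
conn.executemany(
    "INSERT INTO question (category) VALUES (?)",
    [("AB"[n % 2],) for n in range(5)],
)

asked = [1, 2, 3]
# One "?" placeholder per excluded id, so the values are bound, not interpolated
placeholders = ", ".join("?" for _ in asked)
sql = (
    "SELECT id, category FROM question "
    f"WHERE id NOT IN ({placeholders}) AND category = ?"
)
remaining = conn.execute(sql, [*asked, "A"]).fetchall()
print(remaining)
```

Only question 5 is left, which matches what the ORM query above returns.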
If the exception list must be obtained via a query, you can use an EXCEPT clause:
# Create a filter that defines the rows to skip
skip = Question.query.filter(db.or_(Question.id < 4, Question.category == 'B'))
q = Question.query.except_(skip)
next_question = q.one()
print(next_question.id, next_question.category)
This documentation section describes how to use except_ (though it uses UNION as an example).
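To see what EXCEPT does at the SQL level, here is a small sketch using the standard-library sqlite3 module. The question table is again a hypothetical stand-in for the model above, and the hand-written SQL only approximates what the ORM would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE question (id INTEGER PRIMARY KEY, category TEXT)")
conn.executemany(
    "INSERT INTO question (category) VALUES (?)",
    [("AB"[n % 2],) for n in range(5)],
)

# All rows, minus the rows matched by the "skip" query
sql = """
    SELECT id, category FROM question
    EXCEPT
    SELECT id, category FROM question WHERE id < 4 OR category = 'B'
"""
rows = conn.execute(sql).fetchall()
print(rows)
```

Both sides of EXCEPT must select the same columns, which is why the second SELECT repeats id and category.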
You can try something like below.
except_list = ["password", "another_column"]
result = session.query(*[c for c in User.__table__.c if c.name not in except_list]).all()
I have an array of usernames, users: [Test1, Test2]. I have to loop through this array and find the usernames that have no match in table b. I have written the query as below:
def usersArray = []
def find
params.users.each {
    find = sql.rows("select distinct name from table a,table b where a.id=b.id and b.name!=:n", [n: it])
    if (find.size > 0)
    {
        def usList = ["nm": find]
        usersArray.push(usList)
    }
}
With the above solution, I see both Test1 and Test2 in my result even though they match. How should I change the query to display only the unmatched users?
Another way - count the existing rows which match the parameter name, then push those that have zero (forgive the bad syntax):
....
def numberFound = sql.rows("select count(*) from table a where a.name=:n", [n: it])[0][0]
if (numberFound == 0)
{
    // no row matched this name, so record the name itself
    def usList = ["nm": it]
    usersArray.push(usList)
}
...
Here is an example of how you might go about solving this problem. This assumes you have a domain class called User with a property called name which you want to match on.
// given a list of user names
List users = ['Test1', 'Test2', 'Test3', 'Test4']
// find all the users that match those names, and collect the matched names into a List
List matched = User.findAll("from User as u where u.name in (:names)", [names: users]).collect { it.name }
// remove the matched names from the user list and arrive at an 'unmatched' names list
List unmatched = users.minus(matched)
This was written off the top of my head so please forgive any typos or other random assumptions.
UPDATED
Since you seem set on using SQL, you might be able to do something like this instead:
List users = ['Test1', 'Test2', 'Test3', 'Test4']
List placeholders = []
users.each { placeholders << '?' }
String select = "select distinct name from table a,table b where a.id=b.id and b.name in (${placeholders.join(',')})"
List matched = sql.rows(select, users).collect { it.name }
List unmatched = users.minus(matched)
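The same "fetch the matches once, then subtract" idea can be sketched in Python with the standard-library sqlite3 module. The users table and its contents here are made up purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")  # hypothetical table
conn.executemany("INSERT INTO users VALUES (?)", [("Test1",), ("Test3",)])

candidates = ["Test1", "Test2", "Test3", "Test4"]
placeholders = ", ".join("?" for _ in candidates)
# One round trip: collect every candidate name that exists in the table
matched = {
    name
    for (name,) in conn.execute(
        f"SELECT DISTINCT name FROM users WHERE name IN ({placeholders})",
        candidates,
    )
}
# Keep the original order of the candidate list
unmatched = [name for name in candidates if name not in matched]
print(unmatched)
```

This avoids running one query per username, which is what the loop in the question does.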
I have two Django models:
class ModelA(models.Model):
    title = models.CharField(..., db_column='title')
    text_a = models.CharField(..., db_column='text_a')
    other_column = models.CharField(..., db_column='other_column_a')
class ModelB(models.Model):
    title = models.CharField(..., db_column='title')
    text_a = models.CharField(..., db_column='text_b')
    other_column = None
Then I want to merge the querysets of these two models using union:
ModelA.objects.all().union(ModelB.objects.all())
But in the generated query I see:
(SELECT
`model_a`.`title`,
`model_a`.`text_a`,
`model_a`.`other_column`
FROM `model_a`)
UNION
(SELECT
`model_b`.`title`,
`model_b`.`text_b`
FROM `model_b`)
Of course, I get the exception "The used SELECT statements have a different number of columns".
How can I create aliases and fake columns so that the union query works?
You can annotate a placeholder as the last column to make up for the column-count mismatch.
from django.db.models import CharField, Value

a = ModelA.objects.values_list('text_a', 'title', 'other_column')
b = (
    ModelB.objects.values_list('text_a', 'title')
    .annotate(other_column=Value("Placeholder", CharField()))
)
# for a list of tuples
a.union(b)
# or if you want list of dict
# (this has to be the values of the base query, in this case a)
a.union(b).values('text_a', 'title', 'other_column')
In a SQL query, we can use NULL to fill the missing columns/aliases:
(SELECT
`model_a`.`title`,
`model_a`.`text_a`,
`model_a`.`other_column`
FROM `model_a`)
UNION
(SELECT
`model_b`.`title`,
`model_b`.`text_b`,
NULL
FROM `model_b`)
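As a quick check of the NULL-padding idea, here is a sketch using the standard-library sqlite3 module. The table layouts and sample rows are assumptions made for the demo (SQLite rather than MySQL, simplified column names).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE model_a (title TEXT, text_a TEXT, other_column TEXT);
    CREATE TABLE model_b (title TEXT, text_b TEXT);
    INSERT INTO model_a VALUES ('a-title', 'a-text', 'extra');
    INSERT INTO model_b VALUES ('b-title', 'b-text');
""")

# The second SELECT pads the missing column with NULL so both sides
# of the UNION have three columns
rows = conn.execute("""
    SELECT title, text_a, other_column FROM model_a
    UNION
    SELECT title, text_b, NULL AS other_column FROM model_b
    ORDER BY title
""").fetchall()
print(rows)
```

The padded column comes back as None for every row that originates from model_b.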
In Django, union operations need the same columns on both sides, so with values() you can select only the shared columns, like this:
qsa = ModelA.objects.all().values('text_a', 'title')
qsb = ModelB.objects.all().values('text_a', 'title')
qsa.union(qsb)
But there is no way (that I know of) to mimic NULL in a union in Django. So there are two ways you can proceed here.
First, add an extra field named other_column to your model. You can leave its values empty like this:
other_column = models.CharField(max_length=255, null=True, default=None)
and use the Django queryset union operations as described here.
Second, a more Pythonic approach: build the union in Python. Try something like this:
a = ModelA.objects.values_list('text_a', 'title', 'other_column')
b = ModelB.objects.values_list('text_a', 'title')
# Pad each ModelB row with None for the missing column, then
# combine and de-duplicate the rows the way SQL UNION would
union_list = list(set(a) | {row + (None,) for row in b})
Hope it helps!!
I have a Django model:
class Field(models.Model):
    choice = models.CharField(choices=choices)
    value = models.CharField(max_length=255)
In my database, there are some cases where 3 "fields" share the same choice, and some cases where there is only 1 field with that choice.
How can I order the queryset so that it is sorted by choice, but with all the sets of 3 at the start?
For example:
[1,1,1,3,3,3,4,4,4,2,5] where 1,2,3,4,5 are possible choices?
This is the best I can do using Django's ORM. Basically, just like in SQL, you have to construct a custom ORDER BY expression. In our case, we'll place it in the SELECT and then order by it:
1) Get a list of choices sorted by frequency: [1, 3, 4, 2, 5]
freq_list = (
    Field.objects.values_list('choice', flat=True)
    .annotate(c=Count('id')).order_by('-c', 'choice')
)
2) Add indexes with enumerate: [(0,1), (1,3), (2,4), (3,2), (4,5)]
enum_list = list(enumerate(freq_list))
3) Create a list of cases: ['CASE', 'WHEN choice=1 THEN 0', ..., 'END']
case_list = ['CASE']
case_list += ["WHEN choice={1} THEN {0}".format(*tup) for tup in enum_list]
case_list += ['END']
4) Combine the case list into one string: 'CASE WHEN choice=1 THEN 0 ...'
case_statement = ' '.join(case_list)
5) Finally, use the case statement to select an extra field 'o' that holds the corresponding order, then just order by this field:
Field.objects.extra(select={'o': case_statement}).order_by('o')
To simplify all this, you can put the above code into a Model Manager:
class FieldManager(models.Manager):
    # Named get_queryset in Django 1.6 and later
    def get_query_set(self):
        freq_list = (
            Field.objects.values_list('choice', flat=True)
            .annotate(c=Count('id')).order_by('-c', 'choice')
        )
        enum_list = list(enumerate(freq_list))
        case_list = ['CASE']
        case_list += ["WHEN choice={1} THEN {0}".format(*tup) for tup in enum_list]
        case_list += ['END']
        case_statement = ' '.join(case_list)
        ordered = Field.objects.extra(select={'o': case_statement}).order_by('o')
        return ordered
class Field(models.Model):
    ...
    freq_sorted = FieldManager()
Now you can query:
Field.freq_sorted.all()
This will get you a Field QuerySet sorted by frequency of choices.
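If it helps to see the target ordering on its own, here is a plain-Python sketch of the same rule (most frequent choice first, ties broken by the choice value). It works on an in-memory list, so it only illustrates the ordering and is not a replacement for doing the sort in the database.

```python
from collections import Counter

# Sample data: choices 1, 3 and 4 appear three times; 2 and 5 appear once
choices = [1, 2, 3, 4, 5, 1, 3, 4, 1, 3, 4]
counts = Counter(choices)

# Mirrors order_by('-c', 'choice'): higher count first, then smaller choice
ordered = sorted(choices, key=lambda c: (-counts[c], c))
print(ordered)
```

The result matches the example ordering from the question.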
Alternatively, you could write a MySQL function that detects which choices are repeated and which are unique, then call that function from your query.
Given the following relationships:
- 1 MasterProduct parent -> many MasterProduct children
- 1 MasterProduct child -> many StoreProducts
- 1 StoreProduct -> 1 Store
I have defined the following declarative models in SQLAlchemy:
class MasterProduct(Base):
    __tablename__ = 'master_products'
    id = Column(Integer, primary_key=True)
    pid = Column(Integer, ForeignKey('master_products.id'))
    children = relationship('MasterProduct', join_depth=1,
                            backref=backref('parent', remote_side=[id]))
    store_products = relationship('StoreProduct', backref='master_product')
class StoreProduct(Base):
    __tablename__ = 'store_products'
    id = Column(Integer, primary_key=True)
    mid = Column(Integer, ForeignKey('master_products.id'))
    sid = Column(Integer, ForeignKey('stores.id'))
    timestamp = Column(DateTime)
    store = relationship('Store', uselist=False)
class Store(Base):
    __tablename__ = 'stores'
    id = Column(Integer, primary_key=True)
My goal is to replicate the following query in SQLAlchemy with eager loading:
SELECT *
FROM master_products mp_parent
INNER JOIN master_products mp_child ON mp_child.pid = mp_parent.id
INNER JOIN store_products sp1 ON sp1.mid = mp_child.id
LEFT JOIN store_products sp2
ON sp1.mid = sp2.mid AND sp1.sid = sp2.sid AND sp1.timestamp < sp2.timestamp
WHERE mp_parent.id = 6752 AND sp2.id IS NULL
The query selects all MasterProduct children for parent 6752 and all
corresponding store products grouped by most recent timestamp using a NULL
self-join (greatest-n-per-group). There are 82 store products returned from the
query, with 14 master product children.
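For readers unfamiliar with the NULL self-join trick, here is a minimal sketch using the standard-library sqlite3 module. The rows are invented for the demo, and only the store_products part of the query is reproduced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE store_products "
    "(id INTEGER PRIMARY KEY, mid INTEGER, sid INTEGER, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO store_products (mid, sid, timestamp) VALUES (?, ?, ?)",
    [
        (1, 10, "2013-01-01"),
        (1, 10, "2013-02-01"),  # newest row for (mid=1, sid=10)
        (1, 20, "2013-01-15"),  # only row for (mid=1, sid=20)
        (2, 10, "2013-03-01"),  # newest row for (mid=2, sid=10)
        (2, 10, "2013-01-01"),
    ],
)

# A row is the newest in its (mid, sid) group exactly when the LEFT JOIN
# finds no newer sibling, i.e. when sp2.id IS NULL
latest = conn.execute("""
    SELECT sp1.mid, sp1.sid, sp1.timestamp
    FROM store_products sp1
    LEFT JOIN store_products sp2
      ON sp1.mid = sp2.mid AND sp1.sid = sp2.sid
     AND sp1.timestamp < sp2.timestamp
    WHERE sp2.id IS NULL
    ORDER BY sp1.mid, sp1.sid
""").fetchall()
print(latest)
```

One row per (mid, sid) pair survives, each carrying the most recent timestamp.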
I've tried the following to no avail:
mp_child = aliased(MasterProduct)
sp1 = aliased(StoreProduct)
sp2 = aliased(StoreProduct)
q = db.session.query(MasterProduct).filter_by(id=6752) \
    .join(mp_child, MasterProduct.children) \
    .join(sp1, mp_child.store_products) \
    .outerjoin(sp2, and_(sp1.mid == sp2.mid, sp1.sid == sp2.sid, sp1.timestamp < sp2.timestamp)) \
    .filter(sp2.id == None) \
    .options(contains_eager(MasterProduct.children, alias=mp_child),
             contains_eager(MasterProduct.children, mp_child.store_products, alias=sp1))
>>> mp_parent = q.first() # the query below looks ok!
SELECT <all columns from master_products, master_products_1, and store_products_1>
FROM master_products INNER JOIN master_products AS master_products_1 ON master_products.id = master_products_1.pid INNER JOIN store_products AS store_products_1 ON master_products_1.id = store_products_1.mid LEFT OUTER JOIN store_products AS store_products_2 ON store_products_1.mid = store_products_2.mid AND store_products_1.sid = store_products_2.sid AND store_products_1.timestamp < store_products_2.timestamp
WHERE master_products.id = %s AND store_products_2.id IS NULL
LIMIT %s
>>> mp_parent.children # only *one* child is eagerly loaded (expected 14)
[<app.models.MasterProduct object at 0x2463850>]
>>> mp_parent.children[0].id # this is correct, 6762 is one of the children
6762L
>>> mp_parent.children[0].pid # this is correct
6752L
>>> mp_parent.children[0].store_products # only *one* store product is eagerly loaded (expected 7 for this child)
[<app.models.StoreProduct object at 0x24543d0>]
Taking a step back and simplifying the query to eagerly load just the children
also results in only 1 child being eagerly loaded instead of all 14:
mp_child = aliased(MasterProduct)
q = db.session.query(MasterProduct).filter_by(id=6752) \
    .join(mp_child, MasterProduct.children) \
    .options(contains_eager(MasterProduct.children, alias=mp_child))
However, when I use a joinedload, joinedload_all, or subqueryload, all
14 children are eagerly loaded, i.e.:
q = db.session.query(MasterProduct).filter_by(id=6752) \
    .options(joinedload_all('children.store_products', innerjoin=True))
So the problem seems to be populating MasterProduct.children from the
explicit join using contains_eager.
Can anyone spot the error in my ways or help point me in the right direction?
OK, what you might observe in the SQL is that there's a "LIMIT 1" coming out. That's because you're using first(). We can just compare the first two queries, the contains_eager one and the joinedload one:
join() + contains_eager():
SELECT master_products_1.id AS master_products_1_id, master_products_1.pid AS master_products_1_pid, master_products.id AS master_products_id, master_products.pid AS master_products_pid
FROM master_products JOIN master_products AS master_products_1 ON master_products.id = master_products_1.pid
WHERE master_products.id = ?
LIMIT ? OFFSET ?
joinedload():
SELECT anon_1.master_products_id AS anon_1_master_products_id, anon_1.master_products_pid AS anon_1_master_products_pid, master_products_1.id AS master_products_1_id, master_products_1.pid AS master_products_1_pid
FROM (SELECT master_products.id AS master_products_id, master_products.pid AS master_products_pid
FROM master_products
WHERE master_products.id = ?
LIMIT ? OFFSET ?) AS anon_1 JOIN master_products AS master_products_1 ON anon_1.master_products_id = master_products_1.pid
You can see the second query is quite different: because first() means a LIMIT is applied, joinedload() knows to wrap the "criteria" query in a subquery, apply the limit to that, then apply the JOIN afterwards. In the join+contains_eager case, the LIMIT is applied to the joined rows that make up the collection, so you get the wrong number of rows.
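The difference between the two shapes of query can be reproduced with plain SQL. Below is a sketch using the standard-library sqlite3 module, with a made-up parent (id 1) that has three children; the SQL is simplified compared to what SQLAlchemy emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master_products (id INTEGER PRIMARY KEY, pid INTEGER)")
conn.executemany(
    "INSERT INTO master_products (id, pid) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 1)],
)

# LIMIT applied to the joined rows: only one parent/child pair survives
limited_join = conn.execute("""
    SELECT parent.id, child.id
    FROM master_products AS parent
    JOIN master_products AS child ON child.pid = parent.id
    WHERE parent.id = 1
    LIMIT 1
""").fetchall()

# LIMIT applied to the parent in a subquery, JOIN afterwards:
# every child of that one parent survives
limited_parent = conn.execute("""
    SELECT parent.id, child.id
    FROM (SELECT id FROM master_products WHERE id = 1 LIMIT 1) AS parent
    JOIN master_products AS child ON child.pid = parent.id
""").fetchall()

print(len(limited_join), len(limited_parent))
```

The first query is the shape produced by join() + contains_eager() + first(); the second is the shape joinedload() produces.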
Just changing the script at the bottom to this:
for q, query_label in queries:
    mp_parent = q.all()[0]
I get the output it says you're expecting:
[explicit join with contains_eager] children=3, store_products=27
[joinedload] children=3, store_products=27
[joinedload_all] children=3, store_products=27
[subqueryload] children=3, store_products=27
[subqueryload_all] children=3, store_products=27
[explicit joins with contains_eager, filtered by left-join] children=3, store_products=9
(this is why getting a user-created example is so important)
I have a posts table that stores 3 types of post: Topic, Reply and Comment. Each post stores its parent's id.
# Single table inheritance
class Post(Base):
    __tablename__ = 'posts'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('posts.id'))
    discriminator = Column(String(1))
    content = Column(UnicodeText)
    added_at = Column(DateTime)
    __mapper_args__ = {'polymorphic_on': discriminator}
class Topic(Post):
    replies = relation("Reply")
    __mapper_args__ = {'polymorphic_identity': 't'}
class Reply(Post):
    comments = relation("Comment")
    __mapper_args__ = {'polymorphic_identity': 'r'}
class Comment(Post):
    __mapper_args__ = {'polymorphic_identity': 'c'}
And I'm using eagerload_all() to get all the replies and comments belonging to one topic:
session.query(Topic).options(eagerload_all('replies.comments')).get(topic_id)
My question is: what if I want to get only the replies, and those replies' comments, from a certain time period, for example this week or this month? How should I use filter to achieve this?
Thank you
Using eagerload_all only means that the children of a Topic are queried immediately rather than on first access to the replies and/or comments. Either way, once you load the Topic object into the session, all of its related children will be loaded as well. This gives you the first option:
Option-1: Filter in the Python code instead of the database:
Basically, create a method on the Topic object similar to:
class Topic(Post):
    ...
    def filter_replies(self, from_date, to_date):
        return [r for r in self.replies
                if r.added_at >= from_date
                and r.added_at <= to_date]
Then you can do similar code on Replies to filter Comments or any combination of those. You get the idea.
Option-2: Filter on the database level:
To achieve this, you need not load the Topic object; instead, filter directly on Reply/Comment. The following query returns all Reply objects for a given Topic with a date filter:
from datetime import date

topic_id = 1
from_date = date(2010, 9, 5)
to_date = date(2010, 9, 15)
q = session.query(Reply)
q = q.filter(Reply.parent_id == topic_id)
q = q.filter(Reply.added_at >= from_date)
q = q.filter(Reply.added_at <= to_date)
for r in q.all():
    print("Reply: %s" % r)
The version for Comment is a little more involved: because all your objects are mapped to the same table, you need an alias to get a valid SQL statement generated:
topic_id = 1
from_date = date(2010, 9, 5)
to_date = date(2010, 9, 15)
ralias = aliased(Reply)
q = session.query(Comment)
q = q.join((ralias, Comment.parent_id == ralias.id))
q = q.filter(ralias.parent_id == topic_id)
q = q.filter(Comment.added_at >= from_date)
q = q.filter(Comment.added_at <= to_date)
for c in q:
    print("Comment: %s" % c)
Obviously, you can create a function that combines both pieces into a more comprehensive query.
To achieve "this week" or "this month" type queries, you can either convert these filters into a date range as shown above or use the expression.func functionality of SQLAlchemy.
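A minimal sketch of the date-range route, using only the standard library. The helper names week_range and month_range are made up for this example, and the week is assumed to run Monday to Sunday.

```python
from datetime import date, timedelta


def week_range(today):
    """Return (Monday, Sunday) of the week containing `today`."""
    start = today - timedelta(days=today.weekday())
    return start, start + timedelta(days=6)


def month_range(today):
    """Return (first day, last day) of the month containing `today`."""
    start = today.replace(day=1)
    # Jump safely into the next month, then step back one day
    next_month = (start + timedelta(days=32)).replace(day=1)
    return start, next_month - timedelta(days=1)


from_date, to_date = week_range(date(2010, 9, 8))
print(from_date, to_date)
```

The resulting from_date and to_date can be plugged into the filters shown above. Since added_at is a DateTime column, you may prefer to compare against the start of the following day with a strict less-than, so that rows from late on the last day are not missed.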