Django - Avoiding joins when querying foreign key ids? - mysql

Say I have a simple blog entry model in Django:
class Entry(models.Model):
    author = models.ForeignKey(Author)
    topic = models.ForeignKey(Topic)
    entry = models.CharField(max_length=50, default='')
Now say I want to query for an author or topic, but exclude a particular topic altogether.
entry_list = Entry.objects.filter(Q(author=12)|Q(topic=123)).exclude(topic=666)
Simple enough, but I've found that the generated raw SQL contains a join on the topic table, even though the join isn't needed:
SELECT `blog_entry`.`id`
FROM `blog_entry`
LEFT OUTER JOIN `blog_topic`
ON (`blog_entry`.`topic_id` = `blog_topic`.`id`)
WHERE ((`blog_entry`.`author_id` = 12
OR `blog_entry`.`topic_id` = 123
)
AND NOT ((`blog_topic`.`id` = 666
AND NOT (`blog_topic`.`id` IS NULL)
AND `blog_topic`.`id` IS NOT NULL
))
)
Why is that? How can I get Django to query only the id columns and not join tables? I've tried the following, but it gives a FieldError:
entry_list = Entry.objects.filter(Q(author_id=12)|Q(topic_id=123)).exclude(topic_id=666)

I wonder whether this is a bug.
Trying a similar example, I get no join when I put the exclude() before the filter() (but I do get the join using your order).
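One way to check that the join really is redundant is to run both forms of the query against the same data. The sketch below uses Python's built-in sqlite3 module instead of Django and MySQL, with a cut-down version of the tables from the question; the sample rows are made up for illustration. It only shows that the two WHERE clauses select the same rows, not how Django decides which SQL to build.

```python
import sqlite3

# In-memory stand-in for the blog tables (sample rows are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog_topic (id INTEGER PRIMARY KEY);
    CREATE TABLE blog_entry (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL,
        topic_id INTEGER NOT NULL REFERENCES blog_topic (id)
    );
    INSERT INTO blog_topic (id) VALUES (7), (123), (666);
    INSERT INTO blog_entry (id, author_id, topic_id) VALUES
        (1, 12, 123), (2, 12, 666), (3, 5, 123), (4, 5, 7);
""")

# Shape of the query from the question: the exclude() goes through the join.
with_join = """
    SELECT blog_entry.id
    FROM blog_entry
    LEFT OUTER JOIN blog_topic ON (blog_entry.topic_id = blog_topic.id)
    WHERE (blog_entry.author_id = 12 OR blog_entry.topic_id = 123)
      AND NOT (blog_topic.id = 666 AND blog_topic.id IS NOT NULL)
    ORDER BY blog_entry.id
"""

# Same conditions expressed on the foreign key column only.
without_join = """
    SELECT blog_entry.id
    FROM blog_entry
    WHERE (blog_entry.author_id = 12 OR blog_entry.topic_id = 123)
      AND NOT (blog_entry.topic_id = 666)
    ORDER BY blog_entry.id
"""

ids_with_join = [row[0] for row in conn.execute(with_join)]
ids_without_join = [row[0] for row in conn.execute(without_join)]
print(ids_with_join, ids_without_join)
```

With these rows both queries return the entries with ids 1 and 3, so the join contributes nothing to the result.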

Related

Different result for @SqlResultSetMapping + Joins on multiple entities | JPA

I am running a NativeQuery with JPA that gives different results compared to running the query in an SQL tool. I probably misunderstand something about the concept of @SqlResultSetMapping.
--- Overview ---
I am mapping the result to two entities, so I am expecting to receive a List of Entity-Pairs. This works.
When you look at the picture below, you'll see the result of the query in an sql tool, where ..
.. the RED BOX maps to one entity
.. the GREEN BOX maps to the second entity
JPA should give me each native row as a pair of two entities.
Problem
This is where things go wrong. Yes, I will receive a list of pairs of both entities, but unlike in the picture the column "pp.id" does not iterate over all rows of the respective table (in the picture "5,6,7,..", from JPA "5,5,5,5,..").
The column pp.id is a joined column, so I guess that I misunderstand something in JPA when it comes to joins + SqlResultSetMappings. It appears to me that the difference is that JPA always joins THE SAME row from table 'propertyprofile' (more details below), unlike when the query is run in SQL.
I hope that somebody takes pity on me and helps me out. :)
--- Details ---
Query
I am basically trying to find out if every 'product', has defined a 'value' (table propertyvalue) for a predefined 'property' (table propertyprofile).
The probably most relevant part is at the bottom, where "propertyprofile" is joined and "propertyvalue" is left-joined.
select sg.ID as 'sg.id', sg.Name as 'sg.name', ppcount.totalppcount as 'sg.totalppcount', ppcount.totalppothercount as 'sg.totalppothercount',
p.ID as 'product.id', pp.id as 'pp.id', pp.Role as 'pp.role', pp.Name as 'pp.name',
(case when pv.id is null then '0' else '1' end) as 'hasPropertyValue', pv.ID as 'pv.id', pv.StringValue, pv.IntervallMin, pv.IntervallMax
from shoppingguide sg
join
(
select sg.ID as 'sgid', count(*) as 'totalppcount', count(pp_other.ID) as 'totalppothercount' from propertyprofile pp_all
left join propertyprofile pp_other on pp_other.id = pp_all.id AND pp_other.Role = '0'
join shoppingguide sg on pp_all.ShoppingGuideID = sg.ID
join shopifyshop ss on sg.ShopifyShopID = ss.ID
where
pp_all.ShoppingGuideID = sg.ID AND
ss.Name = :shopName
GROUP BY pp_all.ShoppingGuideID
) ppcount on ppcount.sgid = sg.id
join shopifyshop ss on sg.ShopifyShopID=ss.ID
join product p on p.ShopifyShopID = ss.ID
join propertyprofile pp on (pp.ShoppingGuideID = sg.id AND pp.Role = '0')
left join propertyvalue pv on (pv.ProductID=p.ID and pv.PropertyProfileID = pp.id)
where
ss.Name = :shopName
order by sg.id asc, p.id asc, pp.id asc
;
Tables
There are a lot of tables involved, but these are the most important ones to understand the query:
product
propertyprofile - a feature that all products have (e.g. height, price)
propertyvalue - data for a specific feature; relates to propertyprofile (e.g. 5cm; $120)
SQLResultSetMapping
The mapping is done onto two entities: ProductDataFillSummary_ShoppingGuideInformation, ProductDataFillSummary_ProductInformation.
@SqlResultSetMapping(
    name = "ProductDataFillSummaryMapping",
    entities = {
        @EntityResult(
            entityClass = ProductDataFillSummary_ShoppingGuideInformation.class,
            fields = {
                @FieldResult(name = "shoppingGuideId", column = "sg.id"),
                @FieldResult(name = "shoppingGuideName", column = "sg.name"),
                @FieldResult(name = "numberOfTotalPropertyProfiles", column = "sg.totalppcount"),
                @FieldResult(name = "numberOfTotalPropertyProfilesOther", column = "sg.totalppothercount")
            }),
        @EntityResult(
            entityClass = ProductDataFillSummary_ProductInformation.class,
            fields = {
                @FieldResult(name = "productID", column = "product.id"),
                @FieldResult(name = "propertyProfileId", column = "pp.id"),
                @FieldResult(name = "propertyProfileRole", column = "pp.role"),
                @FieldResult(name = "propertyValueId", column = "pv.id"),
                @FieldResult(name = "hasPropertyValue", column = "hasPropertyValue")
            }
        )
    })
Analysis
The problem seems to be that Hibernate does NOT ..
.. process each row
.. per row map onto designated entities
.. put the mapped entities for this row into List (in my example a pair of entities)
In fact, Hibernate seems to match both entities, which should go into the same entry of the List, based on their primary key attributes, i.e. something like this:
.. process each row
.. for each row map to respective entities (separately)
.. store the mapped entities using their primary key
.. match respective entities which go into the same entry for List
In my example, pairs of [ProductDataFillSummary_ShoppingGuideInformation, ProductDataFillSummary_ProductInformation] are inserted into the list. When 'ProductDataFillSummary_ProductInformation' is inserted, Hibernate tries to find the correct instance using the primary key (here 'ProductDataFillSummary_ProductInformation.productId'). Because several rows for ProductDataFillSummary_ProductInformation have the same value for productId, the first instance is always fetched and used for the List.
Solution
Either use a compound key that considers 'ProductDataFillSummary_ProductInformation.productId' and '.propertyProfileId', or ..
Use an artificial key (uuid) if it's not possible to use a combined key:
concat(p.ID, '-', pp.ID) as 'uuid'
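The effect described in the analysis can be imitated with a small toy model. The Python sketch below is not Hibernate's implementation; it only assumes, as the analysis does, that mapped entities are cached by their key and reused whenever the same key shows up again. The function name map_rows and the sample rows are made up for illustration.

```python
# Three result rows for the same product, one per property profile.
rows = [
    {"product.id": 1, "pp.id": 5},
    {"product.id": 1, "pp.id": 6},
    {"product.id": 1, "pp.id": 7},
]

def map_rows(rows, key_columns):
    """Map each row to an 'entity', reusing the cached one for a known key."""
    cache = {}
    result = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        # setdefault keeps the first entity stored under this key.
        entity = cache.setdefault(key, dict(row))
        result.append(entity)
    return result

# Keyed by productId only: every row resolves to the first cached entity.
single_key = [e["pp.id"] for e in map_rows(rows, ["product.id"])]

# Keyed by (productId, propertyProfileId): each row gets its own entity.
compound_key = [e["pp.id"] for e in map_rows(rows, ["product.id", "pp.id"])]

print(single_key, compound_key)
```

With the single key the pp.id values come back as 5, 5, 5, which is the symptom from the question; with the compound key they come back as 5, 6, 7.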

Django querysets: Excluding NULL values across multiple joins

I'm trying to avoid using extra() here, but haven't found a way to get the results I want using Django's other queryset methods.
My model relationships are as follows:
Model: Enrollment
FK to Course
FK to User
FK to Mentor (can be NULL)
Model: Course
FK to CourseType
In a single query: given a User, I'm trying to get all of the CourseTypes they have access to. A User has access to a CourseType if they have an Enrollment with both a Course of that CourseType AND an existing Mentor.
This user has 2 Enrollments: one in a Course for CourseType ID 6, and the other for a Course for CourseType ID 7, but her enrollment for CourseType ID 7 does not have a mentor, so she does not have access to CourseType ID 7.
user = User.objects.get(pk=123)
This works fine: Get all of the CourseTypes that the user has enrollments for, but don't (yet) query for the mentor requirement:
In [28]: CourseType.objects.filter(course__enrollment__user=user).values('pk')
Out[28]: [{'pk': 6L}, {'pk': 7L}]
This does not give me the result I want: Excluding enrollments with NULL mentor values. I want it to return only ID 6 since that is the only enrollment with a mentor, but it returns an empty queryset:
In [29]: CourseType.objects.filter(course__enrollment__user=user).exclude(course__enrollment__mentor=None).values('pk')
Out[29]: []
Here's the generated SQL for the last queryset that isn't returning what I want it to:
SELECT `courses_coursetype`.`id` FROM `courses_coursetype` INNER JOIN `courses_course` ON ( `courses_coursetype`.`id` = `courses_course`.`course_type_id` ) INNER JOIN `store_enrollment` ON ( `courses_course`.`id` = `store_enrollment`.`course_id` ) WHERE (`store_enrollment`.`user_id` = 3877 AND NOT (`courses_coursetype`.`id` IN (SELECT U0.`id` AS `id` FROM `courses_coursetype` U0 LEFT OUTER JOIN `courses_course` U1 ON ( U0.`id` = U1.`course_type_id` ) LEFT OUTER JOIN `store_enrollment` U2 ON ( U1.`id` = U2.`course_id` ) WHERE U2.`mentor_id` IS NULL)))
The problem, it seems, is that in implementing the exclude(), Django is creating a subquery which is excluding more rows than I want excluded.
To get the desired results, I had to use extra() to explicitly exclude NULL Mentor values in the WHERE clause:
In [36]: CourseType.objects.filter(course__enrollment__user=user).extra(where=['store_enrollment.mentor_id IS NOT NULL']).values('pk')
Out[36]: [{'pk': 6L}]
Is there a way to get this result without using extra()? If not, should I file a ticket with Django per the docs? I looked at the existing tickets and searched for this issue but unfortunately came up short.
I'm using Django 1.7.10 with MySQL.
Thanks!
Try using isnull.
CourseType.objects.filter(
course__enrollment__user=user,
course__enrollment__mentor__isnull=False,
).values('pk')
Instead of exclude() you can create complex queries using Q(), or in your case ~Q():
filter_q = Q(course__enrollment__user=user) & ~Q(course__enrollment__mentor=None)
CourseType.objects.filter(filter_q).values('pk')
This might lead to a different SQL statement.
See docs:
https://docs.djangoproject.com/en/3.2/topics/db/queries/#complex-lookups-with-q-objects
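To see why the exclude() version can come back empty, it helps to run the two SQL shapes side by side. The sketch below uses Python's built-in sqlite3 module with simplified table names and made-up rows; in particular, the second user (456) with a mentor-less enrollment for CourseType 6 is an assumption added to reproduce the empty result. The first query mirrors the generated SQL from the question, the second puts the IS NOT NULL test on the same joined enrollment row, which is what conditions passed in a single filter() call are expected to do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coursetype (id INTEGER PRIMARY KEY);
    CREATE TABLE course (id INTEGER PRIMARY KEY, course_type_id INTEGER);
    CREATE TABLE enrollment (
        id INTEGER PRIMARY KEY,
        course_id INTEGER,
        user_id INTEGER,
        mentor_id INTEGER  -- NULL means "no mentor"
    );
    INSERT INTO coursetype (id) VALUES (6), (7);
    INSERT INTO course (id, course_type_id) VALUES (60, 6), (70, 7);
    INSERT INTO enrollment (id, course_id, user_id, mentor_id) VALUES
        (1, 60, 123, 9),     -- user 123, CourseType 6, has a mentor
        (2, 70, 123, NULL),  -- user 123, CourseType 7, no mentor
        (3, 60, 456, NULL);  -- another user, CourseType 6, no mentor
""")

# Shape of the exclude() query: the subquery is not restricted to the user,
# so any mentor-less enrollment for a CourseType removes that CourseType.
exclude_sql = """
    SELECT ct.id
    FROM coursetype ct
    INNER JOIN course c ON ct.id = c.course_type_id
    INNER JOIN enrollment e ON c.id = e.course_id
    WHERE e.user_id = ?
      AND NOT (ct.id IN (
          SELECT u0.id
          FROM coursetype u0
          LEFT OUTER JOIN course u1 ON u0.id = u1.course_type_id
          LEFT OUTER JOIN enrollment u2 ON u1.id = u2.course_id
          WHERE u2.mentor_id IS NULL))
    ORDER BY ct.id
"""

# Shape of the isnull=False filter: both conditions hit the same enrollment row.
filter_sql = """
    SELECT ct.id
    FROM coursetype ct
    INNER JOIN course c ON ct.id = c.course_type_id
    INNER JOIN enrollment e ON c.id = e.course_id
    WHERE e.user_id = ? AND e.mentor_id IS NOT NULL
    ORDER BY ct.id
"""

excluded = [row[0] for row in conn.execute(exclude_sql, (123,))]
filtered = [row[0] for row in conn.execute(filter_sql, (123,))]
print(excluded, filtered)
```

With this data the subquery version returns nothing, while the same-join version returns only CourseType 6.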

Facing issue with SQL query in the where clause

I have the following database schema on MySQL and I would like to retrieve all elements for a specific id.
So for instance, I would like to retrieve cities, categories, departments linked to the coupon_id=1 (and other fields).
I wrote the following SQL query but unfortunately could not get the desired result.
SELECT cc_coupon.id_coupon as idCoupon,
cc_coupon.condition_coupon,
cc_coupon.description,
cc_coupon.type_coupon,
cc_coupon_by_categorie.id_categorie,
cc_categorie.categorie as category,
cc_annonceur.raison_sociale,
cc_coupon_active_in_cities.id_ville as ville_slug,
cc_villes_france.ville_slug,
cc_villes_france.ville_nom_departement,
cc_villes_france.ville_departement
FROM cc_coupon,
cc_coupon_by_categorie,
cc_categorie,
cc_annonceur,
cc_coupon_active_in_cities,
cc_coupon_active_in_departments,
cc_villes_france
WHERE cc_coupon.id_coupon = cc_coupon_by_categorie.id_coupon
and cc_categorie.id_categorie = cc_coupon_by_categorie.id_categorie
and cc_coupon.id_annonceur = cc_annonceur.id_annonceur
and cc_coupon.id_coupon = cc_coupon_active_in_cities.id_coupon
and cc_villes_france.id_ville = cc_coupon_active_in_cities.id_ville
and cc_villes_france.ville_departement = cc_coupon_active_in_departments.ville_departement
and cc_coupon.id_coupon = 1
and cc_coupon_active_in_cities.id_coupon = 1
and cc_coupon_active_in_departments.id_coupon = 1
Thanks for your help.
I think you should use explicit JOIN ... ON clauses, rather than WHERE, for the conditions that join two tables. Use the WHERE clause only for the other conditions, such as the coupon id.
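As a rough illustration of that suggestion, here is a cut-down version of the query using Python's built-in sqlite3 module. Only three of the tables are included, the column types and sample rows are assumptions, and the full schema from the question is not reproduced. Note that for inner joins the comma-plus-WHERE form and the JOIN ... ON form select the same rows, so this rewrite mainly makes the join conditions easier to read and check; it does not by itself change the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cc_coupon (id_coupon INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE cc_categorie (id_categorie INTEGER PRIMARY KEY, categorie TEXT);
    CREATE TABLE cc_coupon_by_categorie (id_coupon INTEGER, id_categorie INTEGER);
    INSERT INTO cc_coupon VALUES (1, 'ten percent off'), (2, 'free delivery');
    INSERT INTO cc_categorie VALUES (10, 'food'), (20, 'travel');
    INSERT INTO cc_coupon_by_categorie VALUES (1, 10), (1, 20), (2, 10);
""")

# Join conditions live in ON; the actual filter lives in WHERE.
explicit_sql = """
    SELECT cc_coupon.id_coupon, cc_categorie.categorie
    FROM cc_coupon
    JOIN cc_coupon_by_categorie
      ON cc_coupon_by_categorie.id_coupon = cc_coupon.id_coupon
    JOIN cc_categorie
      ON cc_categorie.id_categorie = cc_coupon_by_categorie.id_categorie
    WHERE cc_coupon.id_coupon = ?
    ORDER BY cc_categorie.id_categorie
"""

# Original style: comma-separated tables with everything in WHERE.
implicit_sql = """
    SELECT cc_coupon.id_coupon, cc_categorie.categorie
    FROM cc_coupon, cc_coupon_by_categorie, cc_categorie
    WHERE cc_coupon.id_coupon = cc_coupon_by_categorie.id_coupon
      AND cc_categorie.id_categorie = cc_coupon_by_categorie.id_categorie
      AND cc_coupon.id_coupon = ?
    ORDER BY cc_categorie.id_categorie
"""

explicit_rows = conn.execute(explicit_sql, (1,)).fetchall()
implicit_rows = conn.execute(implicit_sql, (1,)).fetchall()
print(explicit_rows)
```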

Django ORM: using extra to order models by max(datetime field, max(datetime field of related items))

Given the following models:
class BaseModel(models.Model):
    modified_date = models.DateTimeField(auto_now=True)
    class Meta:
        abstract = True
class Map(BaseModel):
    ...
class MapItem(BaseModel):
    map = models.ForeignKey(Map)
    ...
How do I structure my ORM call to sort Maps by the last time either the Map or one of its MapItems was modified?
In other words, how do I generate a value for each Map that represents the maximum of the Map's own modified_date and the latest modified_date of its related MapItems and sort by it without resorting to raw SQL or Python?
I tried the following query but the last_updated values are blank when my QuerySet is evaluated and I'm not quite sure why:
Map.objects.extra(select={
"last_updated": "select greatest(max(maps_mapitem.modified_date), maps_map.modified_date)
from maps_map join maps_mapitem on maps_map.id = maps_mapitem.id"}).
Thanks in advance.
Edit 0: as Peter DeGlopper points out, my join was incorrect. I've fixed the join and the last_updated values are now all equal instead of being blank:
Map.objects.extra(select={
"last_updated": "select greatest(max(maps_mapitem.modified_date), maps_map.modified_date)
from maps_map join maps_mapitem on maps_map.id = maps_mapitem.maps_id"}).
Your join is wrong. It should be:
maps_map join maps_mapitem on maps_map.id = maps_mapitem.map_id
As it stands you're forcing the PKs to be equal, not the map's PK to match the items' FKs.
edit
I suspect your subquery isn't joining against the main maps_map part of the query. I am sure there are other ways to do this, but this should work:
Map.objects.extra(select={
"last_updated": "greatest(modified_date, (select max(maps_mapitem.modified_date) from maps_mapitem where maps_mapitem.map_id = maps_map.id))"})
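The correlated subquery can be tried outside Django to check the ordering it produces. This is a sketch with Python's built-in sqlite3 module, not the MySQL setup from the question: SQLite has no GREATEST, so the two-argument scalar max() stands in for it, and the dates and rows are made up. The COALESCE is an addition that is not in the answer above; it covers maps without any items, because MySQL's GREATEST returns NULL when any argument is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE maps_map (id INTEGER PRIMARY KEY, modified_date TEXT);
    CREATE TABLE maps_mapitem (
        id INTEGER PRIMARY KEY,
        map_id INTEGER REFERENCES maps_map (id),
        modified_date TEXT
    );
    INSERT INTO maps_map VALUES
        (1, '2020-01-01'), (2, '2020-05-01'), (3, '2020-02-15');
    INSERT INTO maps_mapitem VALUES
        (1, 1, '2020-03-01'), (2, 1, '2020-02-01'), (3, 2, '2020-04-01');
""")

# The subquery is correlated with the outer row through i.map_id = m.id,
# so each map gets its own "latest item" value instead of a global one.
sql = """
    SELECT m.id,
           max(m.modified_date,
               coalesce((SELECT max(i.modified_date)
                         FROM maps_mapitem i
                         WHERE i.map_id = m.id),
                        m.modified_date)) AS last_updated
    FROM maps_map m
    ORDER BY last_updated DESC
"""

ordered = conn.execute(sql).fetchall()
print(ordered)
```

Map 1 is ranked by its newest item, map 2 by its own modification date, and map 3 (which has no items) falls back to its own date.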

Django query based on foreign key relationship

I have two models, a Project and an Action:
class Project(models.Model):
    name = models.CharField("Project Name", max_length=200, unique = True)
class Action(models.Model):
    name = models.CharField("Action Name", max_length=200)
    project = models.ForeignKey(Project, blank=True, null=True, verbose_name="Project")
    notes = models.TextField("Notes", blank=True)
    complete = models.BooleanField(default=False, verbose_name="Complete?")
    status = models.IntegerField("Action Status", choices = STATUS, default=0)
I need a query that returns all the Projects for which there are no actions with status < 2.
I tried:
Project.objects.filter(action__status__gt = 1)
But this returns all the Projects because in each Project, there are some actions with status 2 and some actions with status less than 2. Also, it repeated Projects in the resulting query. My current solution is below:
Project.objects.filter(action__status__gt =1).exclude(action__status__lt =2).annotate()
This collapses the repeating results and shows only actions with action statuses greater than 1. But is this the correct way to construct such a query? What if I wanted to return Projects with actions statuses greater than 1 OR Projects with no actions?
I might have misunderstood your requirement, but I think you can do that using annotations.
Project.objects.annotate(m = Min('action__status')).filter(Q(m = None) | Q(m__gt = 1))
The SQL generated is:
SELECT
"testapp_project"."id", "testapp_project"."name",
MIN("testapp_action"."status") AS "m"
FROM "testapp_project"
LEFT OUTER JOIN "testapp_action" ON ("testapp_project"."id" = "testapp_action"."project_id")
GROUP BY "testapp_project"."id", "testapp_project"."name"
HAVING(
MIN("testapp_action"."status") IS NULL
OR MIN("testapp_action"."status") > 1
)
Which is pretty self-explanatory.
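The generated SQL can be checked against a few sample rows. The sketch below runs it with Python's built-in sqlite3 module; the three projects and their action statuses are made up for illustration, and an ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE testapp_project (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE testapp_action (
        id INTEGER PRIMARY KEY,
        project_id INTEGER REFERENCES testapp_project (id),
        status INTEGER
    );
    INSERT INTO testapp_project VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- A: all statuses > 1, B: one status below 2, C: no actions at all.
    INSERT INTO testapp_action (project_id, status) VALUES
        (1, 2), (1, 3), (2, 1), (2, 2);
""")

sql = """
    SELECT "testapp_project"."id", "testapp_project"."name",
           MIN("testapp_action"."status") AS "m"
    FROM "testapp_project"
    LEFT OUTER JOIN "testapp_action"
      ON ("testapp_project"."id" = "testapp_action"."project_id")
    GROUP BY "testapp_project"."id", "testapp_project"."name"
    HAVING (MIN("testapp_action"."status") IS NULL
            OR MIN("testapp_action"."status") > 1)
    ORDER BY "testapp_project"."id"
"""

names = [row[1] for row in conn.execute(sql)]
print(names)
```

Project A is kept because its minimum status is 2, project C is kept because it has no actions (the minimum is NULL), and project B drops out because one of its actions has status 1. Grouping by project also means each project appears only once.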
Django's ORM is not capable of expressing this. You will need to use a raw query in order to perform this.