What is the equivalent ORM query in Django for a SQL join (MySQL)?

I have two Django models that have no relation to each other but share a JID column (I have not made it a foreign key):
class result(models.Model):
    rid = models.IntegerField(primary_key=True, db_column='RID')
    jid = models.IntegerField(null=True, db_column='JID', blank=True)
    test_case = models.CharField(max_length=135, blank=True)
class job(models.Model):
    jid = models.IntegerField(primary_key=True, db_column='JID')
    client_build = models.IntegerField(max_length=135, null=True, blank=True)
I want to express this SQL query with the ORM:
SELECT *
FROM result
JOIN job
ON job.JID = result.JID
Basically, I want to join the two tables and then run a filter query on the joined result.
I am new to ORMs and Django.

jobs = job.objects.filter(jid__in=result.objects.values('jid').distinct()
).select_related()
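Note that a `jid__in` filter like this behaves as a semi-join: it returns job rows that have at least one result, not the combined result/job rows the SQL JOIN in the question produces. A minimal sketch of that difference at the SQL level, using Python's built-in sqlite3 instead of MySQL and made-up sample rows (both are assumptions for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job (JID INTEGER PRIMARY KEY, client_build INTEGER);
    CREATE TABLE result (RID INTEGER PRIMARY KEY, JID INTEGER, test_case TEXT);
    INSERT INTO job VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO result VALUES (10, 1, 'login'), (11, 1, 'logout'), (12, 2, 'search');
""")

# The JOIN from the question: one row per matching (result, job) pair.
joined = conn.execute(
    "SELECT result.RID, job.JID, job.client_build "
    "FROM result JOIN job ON job.JID = result.JID ORDER BY result.RID"
).fetchall()

# Roughly what the jid__in filter asks for: jobs that have at least one result.
jobs_with_results = conn.execute(
    "SELECT JID, client_build FROM job "
    "WHERE JID IN (SELECT DISTINCT JID FROM result) ORDER BY JID"
).fetchall()

print(joined)             # job 1 appears once per result row
print(jobs_with_results)  # each job appears at most once
```

If columns from both tables are needed in each row, the filter alone will not provide them; a foreign key or raw SQL would be the usual routes.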

I don't know how to do that in the Django ORM, but here are my 2 cents:
Any ORM makes 99% of your queries super easy to write (without any SQL). For the remaining 1%, you have two options: understand the core of the ORM and add custom code, or simply write pure SQL. I'd suggest writing the SQL query for it.
If both tables, result and job, have a JID, why not make it a foreign key? I find that odd.
By convention, a class name starts with an uppercase letter: class Result, class Job.

You can represent a foreign key in Django models by modifying your result class like this:
class result(models.Model):
    rid = models.IntegerField(primary_key=True, db_column='RID')
    # jid = models.IntegerField(null=True, db_column='JID', blank=True)
    # Note: Django 2.0+ also requires an on_delete argument on ForeignKey.
    job = models.ForeignKey(job, db_column='JID', blank=True, null=True, related_name="results")
    test_case = models.CharField(max_length=135, blank=True)
(I've read somewhere that you need both blank=True and null=True to make a foreign key optional in Django; you may want to try different options.)
Now you can access the job of a result simply by writing:
myresult.job # assuming myresult is an instance of class result
With the parameter related_name="results", a new field will automatically be added to the class job by Django, so you will be able to write:
myjob.results
And obtain the results for the job myjob.
This does not necessarily mean the Django ORM will fetch it with a JOIN query (it will probably issue a separate query instead), but the effect will be the same from your code's point of view (performance considerations aside).
You can find more information about models.ForeignKey in the Django documentation.

Related

Is there a specific ordering needed for classes in Peewee models?

I'm currently trying to create an ORM model in Peewee for an application. However, I seem to be running into an issue when querying a specific model. After some debugging, I found that queries fail for whatever is defined below a specific model.
I've moved the models around (keeping the given ForeignKeys valid), and for some odd reason, only what is below a specific class (User) is affected.
def get_user(user_id):
    user = User.select().where(User.id == user_id).get()
    return user
class BaseModel(pw.Model):
    """A base model that will use our MySQL database"""
    class Meta:
        database = db
class User(BaseModel):
    id = pw.AutoField()
    steam_id = pw.CharField(max_length=40, unique=True)
    name = pw.CharField(max_length=40)
    admin = pw.BooleanField(default=False)
    super_admin = pw.BooleanField()
    #...
I expected to be able to query Season like every other model. However, when I query for the User with an id of 1 (i.e. User.select().where(User.id==1).get() or get_user(1)), peewee returns the error below, and the value isn't even passed into the query:
UserDoesNotExist: <Model: User> instance matching query does not exist:
SQL: SELECT `t1`.`id`, `t1`.`steam_id`, `t1`.`name`, `t1`.`admin`, `t1`.`super_admin` FROM `user` AS `t1` WHERE %s LIMIT %s OFFSET %s
Params: [False, 1, 0]
Does anyone have a clue as to why I'm getting this error?
Read the error message. It is telling you that the user with the given ID does not exist.
Peewee raises an exception if the call to .get() does not match any rows. If you want "get or None if not found", you can do a couple of things: wrap the call to .get() in a try / except, or use get_or_none().
http://docs.peewee-orm.com/en/latest/peewee/api.html#Model.get_or_none
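The two options can be sketched without a database. The `FakeUser` class and `DoesNotExist` exception below are hypothetical stand-ins that only mimic the raise-on-miss behaviour described above; they are not peewee's API.

```python
class DoesNotExist(Exception):
    """Stand-in for the exception a model raises when no row matches."""


class FakeUser:
    # Hypothetical in-memory "table" keyed by id.
    _rows = {1: "alice"}

    @classmethod
    def get(cls, user_id):
        # Mimics .get(): raise when no row matches.
        try:
            return cls._rows[user_id]
        except KeyError:
            raise DoesNotExist(f"no user with id {user_id}") from None

    @classmethod
    def get_or_none(cls, user_id):
        # Mimics get_or_none(): swallow the miss and return None instead.
        try:
            return cls.get(user_id)
        except DoesNotExist:
            return None


print(FakeUser.get_or_none(1))
print(FakeUser.get_or_none(2))
```

With real peewee models, the try / except would catch the model's own DoesNotExist (shown as UserDoesNotExist in the traceback above).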
Well, I think I figured it out. Instead of querying directly for the server ID, I just did a User.get(1), as that seems to do the trick. More reading shows there's a get_by_id() as well.

Examining SQLAlchemy query results in Pyramid

I'm trying to add a method to a (quite big) existing project written in Python with the Pyramid framework and the SQLAlchemy ORM. I want to execute an SQL query with SQLAlchemy, but I've never developed with Pyramid or SQLAlchemy before. So I would like to test it and see whether the query returns what I expect, without adding throwaway code just to test it (a new template, a view, etc.). My SQL query is:
select a.account_type, u.user_id from accounts a inner join account_users au on a.account_id=au.account_id inner join users u on u.user_id=au.user_id where u.user_id = ?;
And my method is :
def find_account_type_from_user_id(self, user_id):
    '''
    Method that finds the account type (one/several points of sale...)
    from the id of the user who is linked to this account
    :param user_id:
    :return: (string) account_type
    '''
    q = self.query(Account)\
        .join(AccountUser)\
        .join(User)\
        .filter(User.user_id == user_id)\
        .one()
    return q
PS: I've already searched the internet, but I only find things like unit tests, and I've never written those. (Sorry, I'm a noob.)
Unit tests are a must for testing new services, fixes, refactored code, etc.; you need a good collection of them.
You can start here.
A few ways to see SQLAlchemy query content:
Set sqlalchemy logging level to INFO - see instructions https://opensourcehacker.com/2016/05/22/python-standard-logging-pattern/
Use pyramid_debugtoolbar and it shows all queries your view made
Execute query interactively using pshell - no views need to be added
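For the logging option, a minimal sketch using only the standard library. `sqlalchemy.engine` is the logger name SQLAlchemy documents for statement logging; where this snippet runs (for example, at the start of a pshell session) is an assumption, and a Pyramid project may prefer to set the same level in its .ini logging section.

```python
import logging

# Send log records to stderr with a simple format.
logging.basicConfig(format="%(levelname)s %(name)s: %(message)s")

# INFO on this logger makes SQLAlchemy log each SQL statement it emits;
# DEBUG additionally logs the result rows.
sa_logger = logging.getLogger("sqlalchemy.engine")
sa_logger.setLevel(logging.INFO)
```

After this, calling the method from the question in the same process should print the generated SELECT alongside its parameters.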

Django sort queryset by related model field

I have the following models (abbreviated for clarity):
class Order(models.Model):
    ...  # some fields
class OrderStatus(models.Model):
    order = models.ForeignKey(Order)
    status = models.CharField(choices=['ORDERED', 'IN_TRANSIT', 'RECEIVED'])
    time = models.DateTimeField()
I would like to sort all Orders that contain all three OrderStatuses by their order received time.
In other words, select the orders that have records of Ordered, In Transit, and Received like so:
(Order.objects.filter(orderstatus__status=OrderStatus.ORDERED)
    .filter(orderstatus__status=OrderStatus.IN_TRANSIT)
    .filter(orderstatus__status=OrderStatus.RECEIVED))
... and then sort them by the time field of their related OrderStatus model for which status=OrderStatus.RECEIVED.
This is where I'm stuck. I have read the Django docs on the .extra() queryset modifier and on writing raw SQL directly, but I'm still at a loss. Could I achieve it with an annotated field and Q objects, or am I better off going the .extra() route?
Have you tried doing it like this?
(Order.objects.filter(orderstatus__status=OrderStatus.ORDERED)
    .filter(orderstatus__status=OrderStatus.IN_TRANSIT)
    .filter(orderstatus__status=OrderStatus.RECEIVED)
    .order_by('orderstatus__time'))
On my models it worked as expected: order_by picked the last joined orderstatus, just as you need. If you're not sure, you can check the actual query like this (in the Django shell):
from django.db import connection
# perform query
print(connection.queries)
It can also be done like this:
(OrderStatus.objects.filter(status=OrderStatus.RECEIVED)
    .order_by('time').select_related('order')
    .filter(order__orderstatus__status=OrderStatus.ORDERED)
    .filter(order__orderstatus__status=OrderStatus.IN_TRANSIT))
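When checking the SQL that Django generates, it helps to have the intended query written out for comparison. A sketch with sqlite3, using hypothetical table names and made-up rows (Django's real table names would carry an app prefix): one join per required status, sorted only on the RECEIVED row's time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE order_status (order_id INTEGER, status TEXT, time TEXT);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO order_status VALUES
        (1, 'ORDERED', '2020-01-01'), (1, 'IN_TRANSIT', '2020-01-02'),
        (1, 'RECEIVED', '2020-01-09'),
        (2, 'ORDERED', '2020-01-03'), (2, 'IN_TRANSIT', '2020-01-04'),
        (2, 'RECEIVED', '2020-01-05'),
        (3, 'ORDERED', '2020-01-01'), (3, 'IN_TRANSIT', '2020-01-02');
""")

# Each join requires one status to exist; order 3 has no RECEIVED row,
# so the inner join on alias r drops it.
rows = conn.execute("""
    SELECT o.id, r.time
    FROM orders o
    JOIN order_status s ON s.order_id = o.id AND s.status = 'ORDERED'
    JOIN order_status t ON t.order_id = o.id AND t.status = 'IN_TRANSIT'
    JOIN order_status r ON r.order_id = o.id AND r.status = 'RECEIVED'
    ORDER BY r.time
""").fetchall()

print(rows)
```

If the ORDER BY in the output of connection.queries refers to a different join alias than the RECEIVED one, the ordering is not the one asked for.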

Django admin MySQL slow INNER JOIN

I have a simple model with 3 ForeignKey fields.
class Car(models.Model):
    wheel = models.ForeignKey('Wheel', related_name='wheels')
    created = models.DateTimeField(auto_now_add=True)
    max_speed = models.PositiveSmallIntegerField(null=True)
    dealer = models.ForeignKey('Dealer')
    category = models.ForeignKey('Category')
For the list view in the Django admin I get 4 queries. One of them is a SELECT with 3 INNER JOINs. That one query is way too slow. Replacing the INNER JOINs with STRAIGHT_JOIN would fix the issue. Is there a way to patch the admin-generated query just before it is evaluated?
I've implemented a fix for INNER JOIN in the Django ORM: it uses STRAIGHT_JOIN when ordering with INNER JOINs. I talked to the Django core devs and we decided to do this as a separate backend for now. You can check it out here: https://pypi.python.org/pypi/django-mysql-fix
However, there is one other workaround. Use a snippet from James's answer, but replace select_related with:
qs = qs.select_related('').prefetch_related('wheel', 'dealer', 'category')
It will cancel INNER JOIN and use 4 separate queries: 1 to fetch cars and 3 others with car_id IN (...).
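The idea behind this workaround, trading one joined query for several simple ones and combining the rows in Python, can be sketched with sqlite3. Only one of the three relations is shown, and the schema and rows are made up for illustration; this is not the exact SQL Django emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dealer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE car (id INTEGER PRIMARY KEY, max_speed INTEGER, dealer_id INTEGER);
    INSERT INTO dealer VALUES (1, 'North'), (2, 'South');
    INSERT INTO car VALUES (10, 180, 1), (11, 200, 2), (12, 150, 1);
""")

# Query 1: the cars alone, no join.
cars = conn.execute("SELECT id, max_speed, dealer_id FROM car ORDER BY id").fetchall()

# Query 2: the related dealers, looked up with a single IN (...) query.
dealer_ids = sorted({dealer_id for _, _, dealer_id in cars})
placeholders = ", ".join("?" for _ in dealer_ids)
dealers = dict(
    conn.execute(f"SELECT id, name FROM dealer WHERE id IN ({placeholders})", dealer_ids)
)

# Stitch the two result sets together in Python.
car_dealers = [(car_id, dealers[dealer_id]) for car_id, _, dealer_id in cars]
print(car_dealers)
```

Each query stays trivial for the optimizer to plan, at the cost of extra round trips to the database.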
UPDATE:
I've found one more workaround. Once you specify null=True in your ForeignKey field, Django will use LEFT OUTER JOINs instead of INNER JOIN. LEFT OUTER JOIN works without performance issues in this case, but you may face other issues that I'm not aware of yet.
You may just specify list_select_related = () to prevent django from using inner join:
class CarAdmin(admin.ModelAdmin):
    list_select_related = ()
You could override the
def changelist_view(self, request, extra_context=None):
method in your admin class inherited from ModelAdmin,
something like this (though that question is rather old):
Django Admin: Getting a QuerySet filtered according to GET string, exactly as seen in the change list?
Ok, I found a way to patch the admin generated Query. It is ugly but it seems to work:
from django.contrib import admin
from django.contrib.admin.views.main import ChangeList
class CarChangeList(ChangeList):
    def get_results(self, request):
        """Override to patch ORM generated SQL"""
        super(CarChangeList, self).get_results(request)
        original_qs = self.result_list
        sql = str(original_qs.query)
        new_qs = Car.objects.raw(sql.replace('INNER JOIN', 'STRAIGHT_JOIN'))
        # A raw queryset has no len(); borrow the count from the original queryset.
        def patch_len(self):
            return original_qs.count()
        new_qs.__class__.__len__ = patch_len
        self.result_list = new_qs
class CarAdmin(admin.ModelAdmin):
    list_display = ('wheel', 'max_speed', 'dealer', 'category', 'created')
    def get_changelist(self, request, **kwargs):
        """Return custom Changelist"""
        return CarChangeList
admin.site.register(Car, CarAdmin)
I came across the same issue in the Django admin (version 1.4.9) where fairly simple admin listing pages were very slow when backed by MySQL.
In my case it was caused by the ChangeList.get_query_set() method adding an overly-broad global select_related() to the query set if any fields in list_display were many-to-one relationships. For a proper database (cough PostgreSQL cough) this wouldn't be a problem, but it was for MySQL once more than a few joins were triggered this way.
The cleanest solution I found was to replace the global select_related() directive with a more targeted one that only joined tables that were really necessary. This was easy enough to do by calling select_related() with explicit relationship names.
This approach likely ends up swapping in-database joins for multiple follow-up queries, but if MySQL is choking on the large query many small ones may be faster for you.
Here's what I did, more or less:
from django.contrib.admin.views.main import ChangeList
class CarChangeList(ChangeList):
    def get_query_set(self, request):
        """
        Replace a global select_related() directive added by Django in
        ChangeList.get_query_set() with a more limited one.
        """
        qs = super(CarChangeList, self).get_query_set(request)
        qs = qs.select_related('wheel')  # Don't join on dealer or category
        return qs
class CarAdmin(admin.ModelAdmin):
    def get_changelist(self, request, **kwargs):
        return CarChangeList
I've had slow admin queries on MySQL and found the easiest solution was to add STRAIGHT_JOIN to the query. I figured out a way to add this to a QuerySet rather than being forced to go to .raw(), which won't work with the admin, and have open sourced it as part of django-mysql. You can then just:
def get_queryset(self, request):
    qs = super(MyAdmin, self).get_queryset(request)
    return qs.straight_join()
MySQL still has this problem even in version 8 and Django still doesn't allow you to add STRAIGHT_JOIN in the query set. I found a hackish solution to add STRAIGHT_JOIN...:
This was tested with Django 2.1 and MySQL 5.7 / 8.0
import re
def fixQuerySet(querySet):
    # complete the SQL with params encapsulated in quotes
    sql, params = querySet.query.sql_with_params()
    newParams = ()
    for param in params:
        if not str(param).startswith("'"):
            if isinstance(param, str):
                param = re.sub("'", "\\'", param)
            newParams = newParams + ("'{}'".format(param),)
        else:
            newParams = newParams + (param,)
    rawQuery = sql % newParams
    # escape the percent used in SQL LIKE statements
    rawQuery = re.sub('%', '%%', rawQuery)
    # replace SELECT with SELECT STRAIGHT_JOIN
    rawQuery = rawQuery.replace('SELECT', 'SELECT STRAIGHT_JOIN')
    return querySet.model.objects.raw(rawQuery)
Important: this method returns a raw query set, so it should be called just before consuming the query set.
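One thing to watch in the function above: `str.replace('SELECT', ...)` rewrites every occurrence of SELECT, including any inside subqueries or quoted values. A possible refinement, sketched here with a hypothetical helper name `add_straight_join`, is to touch only the first keyword:

```python
import re


def add_straight_join(sql):
    """Insert STRAIGHT_JOIN after the first SELECT keyword only."""
    # \b keeps identifiers such as "selected" from matching; count=1 leaves
    # subqueries and string literals further along untouched.
    return re.sub(r"\bSELECT\b", "SELECT STRAIGHT_JOIN", sql, count=1, flags=re.IGNORECASE)


sql = "SELECT a.id FROM a INNER JOIN b ON b.a_id = a.id WHERE a.id IN (SELECT a_id FROM c)"
print(add_straight_join(sql))
```

This only changes the string handling; the caveats about building raw SQL from interpolated parameters still apply.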

How does SqlAlchemy handle unique constraint in table definition

I have a table with the following declarative definition:
class Type(Base):
    __tablename__ = 'Type'
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=True)
    def __init__(self, name):
        self.name = name
The column "name" has a unique constraint, but I'm able to do
type1 = Type('name1')
session.add(type1)
type2 = Type(type1.name)
session.add(type2)
So, as can be seen, the unique constraint is not checked at all, since I have added to the session 2 objects with the same name.
When I do session.commit(), I get a mysql error since the constraint is also in the mysql table.
Is it possible for SQLAlchemy to tell me in advance that I cannot do this, or to detect it and avoid inserting 2 entries with the same "name" column?
If not, should I keep all existing names in memory, so I can check whether they exist before creating the object?
SQLAlchemy doesn't handle uniqueness, because there is no good way to do it. Even if you keep track of created objects and/or check whether an object with such a name exists, there is a race condition: another process can insert a new object with the name you just checked. The only solution is to lock the whole table before the check and release the lock after insertion (some databases support such locking).
AFAIK, SQLAlchemy does not enforce uniqueness constraints at the Python level. Those "unique=True" declarations are only used to impose database-level table constraints, and only if you create the table using a SQLAlchemy command, i.e.
Type.__table__.create(engine)
or some such. If you create an SA model against an existing table that does not actually have this constraint present, it will be as if it does not exist.
Depending on your specific use case, you'll probably have to use a pattern like
try:
    existing = session.query(Type).filter_by(name='name1').one()
    # do something with existing
except:
    newobj = Type('name1')
    session.add(newobj)
or a variant, or you'll just have to catch the mysql exception and recover from there.
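The "catch the exception and recover" route can be sketched with sqlite3 standing in for MySQL. The table and the `add_type` helper are hypothetical; in SQLAlchemy the analogous exception would be `sqlalchemy.exc.IntegrityError`, raised on flush or commit and followed by `session.rollback()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE type (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")


def add_type(name):
    """Insert a name; return False if the unique constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO type (name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        return False


print(add_type("name1"))
print(add_type("name1"))
```

Letting the database reject the duplicate avoids the race between checking for a name and inserting it.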
From the docs
class MyClass(Base):
    __tablename__ = 'sometable'
    __table_args__ = (
        ForeignKeyConstraint(['id'], ['remote_table.id']),
        UniqueConstraint('foo'),
        {'autoload': True}
    )
.one() throws two kinds of exceptions:
sqlalchemy.orm.exc.NoResultFound and sqlalchemy.orm.exc.MultipleResultsFound
You should create that object when the first exception occurs; if the second occurs, you're screwed anyway and shouldn't make it worse.
from sqlalchemy.orm.exc import NoResultFound
try:
    existing = session.query(Type).filter_by(name='name1').one()
    # do something with existing
except NoResultFound:
    # only create the object when no row matched
    newobj = Type('name1')
    session.add(newobj)