How to limit results in reverse relation in Django - mysql

I have two tables, Company and User. Each user is related to one company through a ForeignKey, so I can use the reverse relation in Django to get all users for a specific company (e.g. company.users).
In my case, I'm building a ListAPIView that returns multiple companies, and for each company I'd like to return its most recently created user. My problem is that I don't want to use prefetch_related or select_related in a way that loads all the users, as we might end up with thousands of users per company. I also don't want to load each latest user in a separate query, which would mean tens of queries per API request.
I've tried something like this:
users_qs = models.User.objects.filter(active=True).order_by('-created')
company_qs = models.Company.objects.prefetch_related(
    Prefetch('users', queryset=users_qs[:1], to_attr='user')
).order_by('-created')
In this case prefetch_related fails, because we can't set a limit on the Prefetch's queryset (it raises the error "Cannot filter a query once a slice has been taken.").
Any ideas?

I think you are providing an object instead of a queryset here: Prefetch('users', queryset=users_qs[:1], to_attr='user')
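One way to avoid both loading every user and issuing one query per company is to let the database pick the newest user per company with a correlated subquery. In Django this shape is usually written with Subquery and OuterRef from django.db.models (available since Django 1.11). The sketch below shows only the underlying SQL, run through Python's sqlite3 so that it is self-contained. The table and column names (app_company, app_user, company_id, active, created) are assumptions modelled on the question, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_company (id INTEGER PRIMARY KEY, name TEXT, created TEXT);
    CREATE TABLE app_user (
        id INTEGER PRIMARY KEY,
        company_id INTEGER REFERENCES app_company(id),
        name TEXT,
        active INTEGER,
        created TEXT
    );
    INSERT INTO app_company VALUES
        (1, 'Acme', '2020-01-01'),
        (2, 'Globex', '2020-02-01'),
        (3, 'Empty', '2020-03-01');
    INSERT INTO app_user VALUES
        (1, 1, 'alice', 1, '2021-01-01'),
        (2, 1, 'bob',   1, '2021-06-01'),
        (3, 1, 'carol', 0, '2021-09-01'),
        (4, 2, 'dave',  1, '2021-03-01');
""")

# One query: for each company, a correlated subquery selects the single
# newest active user, so no other user rows are returned to the application.
LATEST_USER_SQL = """
    SELECT c.name,
           (SELECT u.name
              FROM app_user AS u
             WHERE u.company_id = c.id AND u.active = 1
             ORDER BY u.created DESC
             LIMIT 1) AS latest_user
      FROM app_company AS c
     ORDER BY c.created DESC
"""
latest_users = conn.execute(LATEST_USER_SQL).fetchall()
for company_name, latest_user in latest_users:
    print(company_name, latest_user)
```

A company without any active user simply gets NULL (None in Python) for latest_user, so the list of companies is never shortened by the lookup.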

Related

Looking for correct query for where.not.any in Rails ActiveRecord

Beginner Rails Question.
I have a table of Users and a table of Teams.
A User has many teams and Teams belong to User.
I want to query if a user does not have a team.
I'm using this query:
User.joins(:teams).where.not(teams: {team_name: 'coconuts'})
This works except if the user has more than one team.
For example User Bill is on the coconuts team and the breadfruit team.
The above query returns Bill when he should be excluded because he is on the coconuts team.
I see why this is happening but I'm having trouble thinking of another query that will work for this scenario.
What is the correct way to grab this data?
I'm using Rails 4.
Try one of the following, trading simple, clean code against performance:
# Assumes a many-to-many association, so that team.user_ids exists
team = Team.find_by(team_name: 'coconuts')
excluded_user_ids = team.user_ids
User.where.not(id: excluded_user_ids)
# Slightly more efficient, assuming you have a join model `Membership`
excluded_user_ids = team.memberships.pluck(:user_id)
# Or, more efficient still (a single query), assuming PostgreSQL and Rails 5+ for left_outer_joins
User
  .left_outer_joins(:teams)
  .group('users.id')
  .having("count(teams.id) filter (where teams.team_name = 'coconuts') = 0")
Filtering in Ruby rather than in the database also works, but it loads every user and runs one query per user:
User.select { |user| user.teams.pluck(:team_name).exclude?('coconuts') }
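Independently of ActiveRecord, the underlying query is an anti-join: keep the users for whom no 'coconuts' team row exists. Below is a minimal sketch of that SQL, run through Python's sqlite3 so it is self-contained. The table layout (users, and teams with user_id and team_name columns) is assumed from the question's description, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE teams (id INTEGER PRIMARY KEY, user_id INTEGER, team_name TEXT);
    INSERT INTO users VALUES (1, 'Bill'), (2, 'Ann'), (3, 'Cy');
    INSERT INTO teams VALUES
        (1, 1, 'coconuts'),
        (2, 1, 'breadfruit'),
        (3, 2, 'breadfruit');
""")

# Bill is on coconuts and breadfruit, Ann only on breadfruit, Cy on no team.
# NOT EXISTS tests the user as a whole, not each joined team row, so a user
# with several teams is excluded as soon as one of them is 'coconuts'.
rows = conn.execute("""
    SELECT u.name
      FROM users AS u
     WHERE NOT EXISTS (
               SELECT 1
                 FROM teams AS t
                WHERE t.user_id = u.id
                  AND t.team_name = 'coconuts')
     ORDER BY u.name
""").fetchall()
users_without_coconuts = [name for (name,) in rows]
print(users_without_coconuts)
```

Unlike the inner join in the question, this form also keeps users who have no team at all.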

iterate over instance field containing multiple instances?

This is my first time using Django and I'm having trouble understanding how data is stored, so I can't really use it the way I want to. I did a lot of research but couldn't find a related question, probably because I don't have the right keywords.
In my app model I created a WebPage and Count class:
class Count(models.Model):
    date = models.DateField(default=date.today)
    count = models.IntegerField(default=0)

class WebPage(models.Model):
    link = models.CharField(max_length=60)
    id = models.AutoField(primary_key=True)
    clicks = models.IntegerField(default=0)
    stats = models.ForeignKey(Count)
Then I created a WebPage object with multiple Count objects and I'd like to create a method to retrieve the sum of the count instances.
def get_clicks(self):
    self.clicks=0
    for object in self.stats:
        self.clicks+=object.count
    return str(self.clicks)
but I get the error 'Count' object is not iterable, which is logical because I defined self.stats as a single Count object. I figured that if the Count instances are not stored in self.stats they might be stored as "global" Count instances, so I iterated over the model's fields with for object in self._meta.fields, but the multiple Count instances are missing:
statistics.WebPage.link
statistics.WebPage.id
statistics.WebPage.clicks
statistics.WebPage.stats
And I don't think iterating over the "global" Count objects is an option, because I would not know which Count instance belongs to which WebPage.
Where are the Count instances for self.stats stored? Thanks for the help!
(I'm using django 1.7)
The problem is that your ForeignKey is the wrong way round. The model that the FK is defined in is the "many" side of the one-to-many relationship. That is, the way you have it, each Count has many WebPages, whereas I think you want the other way round.
Then you can sum the counts attached to your web page in one go via aggregation:
from django.db.models import Sum
total_clicks = my_web_page.count_set.aggregate(Sum('count'))['count__sum']
or, to get clicks for a whole queryset of webpages:
my_web_page_queryset.annotate(clicks=Sum('count__count'))
web_page = WebPage.objects.get(pk=1)
def get_clicks(web_page):
    # Number of Count rows related to this page; aggregate(Sum('count')) gives the summed total
    return web_page.count_set.count()
This will work if you get a single Webpage. If you get a group of webpages, you can loop through them and call get_clicks for each.
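To make the corrected relationship concrete: the foreign key lives on the count rows and points at the page, so one page can own many counts. The sketch below shows that layout and the per-page sum using Python's sqlite3 rather than Django, so that it is self-contained. The table and column names (webpage, daily_count, webpage_id, hits) are illustrative assumptions, not what Django would generate for these models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE webpage (id INTEGER PRIMARY KEY, link TEXT);
    -- The "many" side carries the foreign key: each count row names its page.
    CREATE TABLE daily_count (
        id INTEGER PRIMARY KEY,
        webpage_id INTEGER REFERENCES webpage(id),
        day TEXT,
        hits INTEGER
    );
    INSERT INTO webpage VALUES (1, '/home'), (2, '/about');
    INSERT INTO daily_count VALUES
        (1, 1, '2015-01-01', 3),
        (2, 1, '2015-01-02', 5);
""")

# Sum the hits per page in a single query. The LEFT JOIN keeps pages that
# have no count rows yet, and COALESCE turns their NULL sum into 0.
rows = conn.execute("""
    SELECT w.link, COALESCE(SUM(c.hits), 0)
      FROM webpage AS w
      LEFT JOIN daily_count AS c ON c.webpage_id = w.id
     GROUP BY w.id
     ORDER BY w.id
""").fetchall()
clicks_by_link = dict(rows)
print(clicks_by_link)
```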

correctly fetch nested list in SQL

I have a design problem with SQL request:
I need to return data looking like:
listChannels:
  - idChannel
    name
    listItems:
      - data
      - data
  - idChannel
    name
    listItems:
      - data
      - data
The solution I have now is to send a first request:
"SELECT * FROM Channel WHERE idUser = ..."
and then, in the loop fetching the result, I send another request for each row to fill the nested list:
"SELECT data FROM Item WHERE idChannel = ..."
It's going to kill the app and obviously not the way to go.
I know how to use the join keyword, but it's not exactly what I want as it would return a row for each data of each listChannels with all the information of the channels.
How to solve this common problem in a clean and efficient way ?
The "SQL" way of doing this produces a table with the columns idchannel and channelname plus the columns for the item:
select c.idchannel, c.channelname, i.data
from channel c join
item i
on c.idchannel = i.idchannel
order by c.idchannel, i.item;
Remember that a SQL query returns a result set in the form of a table, which means that all the rows have the same columns. If you want the items as a list in a single column, you can aggregate and concatenate them:
select c.idchannel, c.channelname, group_concat(i.data) as items
from channel c join
item i
on c.idchannel = i.idchannel
group by c.idchannel, c.channelname;
The above uses MySQL syntax, but most databases support similar functionality.
SQL is made for accessing two-dimensional data tables. (There are more possibilities, but they are very complex and maybe not standardized)
So the best way to solve your problem is to use multiple requests. Please also consider using transactions, if possible.
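A common middle ground between the two answers above is a single joined query whose flat rows are regrouped into the nested shape in application code. The sketch below does this with Python's sqlite3 and itertools.groupby. The schema (including the idItem column) and the sample data are assumptions based on the question, and list_channels is a hypothetical helper name.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Channel (idChannel INTEGER PRIMARY KEY, name TEXT, idUser INTEGER);
    CREATE TABLE Item (idItem INTEGER PRIMARY KEY, idChannel INTEGER, data TEXT);
    INSERT INTO Channel VALUES (1, 'news', 7), (2, 'sport', 7), (3, 'other', 8);
    INSERT INTO Item VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")


def list_channels(conn, id_user):
    """Return one user's channels, each with its items nested, using one query."""
    rows = conn.execute(
        """
        SELECT c.idChannel, c.name, i.data
          FROM Channel AS c
          LEFT JOIN Item AS i ON i.idChannel = c.idChannel
         WHERE c.idUser = ?
         ORDER BY c.idChannel, i.idItem
        """,
        (id_user,),
    ).fetchall()

    channels = []
    # groupby only merges adjacent rows, which is why the query orders by channel.
    for (id_channel, name), group in groupby(rows, key=itemgetter(0, 1)):
        # A channel without items yields one row with data = NULL; skip it.
        items = [data for _, _, data in group if data is not None]
        channels.append({"idChannel": id_channel, "name": name, "listItems": items})
    return channels


nested = list_channels(conn, 7)
print(nested)
```

The channel columns are still repeated on every joined row, as the question notes, but that costs one round trip instead of one per channel.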

Valid to return different json-response depending on list or retrieve?

I am currently designing a REST API and am a little stuck on performance for two of the use cases in the system:
List all campaigns (api/campaigns) - needs to return the campaign data required for listing and paging campaigns. It may return up to 1000 records, so retrieving and returning detailed data would take ages. The needed data can be returned in a single DB call.
Retrieve campaign item (api/campaigns/id) - needs to return all data about the campaign and may take up to a second to run. Multiple DB calls are needed to get all the data for a single campaign.
My question is: is it valid to return different JSON responses for those two calls (if well documented) even though they concern the same resource? I am thinking that the list response would be a subset of the retrieve response. The reason is to save DB calls, bandwidth and parsing.
Thanks in advance!
I think it's both fine and expected for /campaigns and /campaigns/{id} to return different information. I would suggest using query parameters to limit the amount of information you need to return. For instance, only return a URI to each player unless you see a ?expand=players query parameter, in which case you return detailed player information.
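As a rough illustration of that idea, the list endpoint can serialize a small summary while the detail endpoint serializes everything, with an optional expand parameter controlling nested data. Everything below is hypothetical and not tied to any framework: the field names, the summary and detail helpers, the expand handling and the /players/{id} URI pattern are all assumptions made for the sketch.

```python
# Hypothetical in-memory record standing in for a fully loaded campaign.
CAMPAIGN = {
    "id": 42,
    "name": "Spring sale",
    "status": "active",
    "budget": 1000,
    "players": [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}],
}


def summary(campaign):
    """Representation for GET /campaigns: only what a list view needs."""
    return {
        "id": campaign["id"],
        "name": campaign["name"],
        "href": "/campaigns/{}".format(campaign["id"]),
    }


def detail(campaign, expand=()):
    """Representation for GET /campaigns/{id}.

    Nested players are returned as URIs unless "players" is listed in
    expand, mirroring a ?expand=players query parameter.
    """
    body = {key: value for key, value in campaign.items() if key != "players"}
    if "players" in expand:
        body["players"] = campaign["players"]
    else:
        body["players"] = ["/players/{}".format(p["id"]) for p in campaign["players"]]
    return body


print(summary(CAMPAIGN))
print(detail(CAMPAIGN))
print(detail(CAMPAIGN, expand=("players",)))
```

Keeping the summary a strict subset of the detail fields means clients can treat a list entry as a partially loaded campaign and follow href for the rest.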

Can I create separate queries for different views?

I'm learning SQLAlchemy and I'm not sure I fully grasp it yet (I'm more used to writing queries by hand, but I like the idea of abstracting the queries and getting objects). I'm going through the tutorial and trying to apply it to my code, and I ran into this part when defining a model:
def __repr__(self):
    return "<User('%s','%s', '%s')>" % (self.name, self.fullname, self.password)
It's useful because I can just search for a username and get only the info about the user that I want, but is there a way to have several of these kinds of views that I can call? Or am I using it wrong, and should I be writing a specific query to get different data for different views?
Some context on why I'm asking: my site has different templates, and most pages will just need the username and first/last name, but some pages will require things like Twitter or Facebook URLs (also fields in the model).
First of all, __repr__ is not a view, so if you have a simple model User with defined columns, and you query for a User, all the columns will get loaded from the database, and not only those used in __repr__.
Let's take the model Book (from the example referred to later) as a basis:
class Book(Base):
    __tablename__ = 'book'
    book_id = Column(Integer, primary_key=True)
    title = Column(String(200), nullable=False)
    summary = Column(String(2000))
    excerpt = Column(Text)
    photo = Column(Binary)
The first option to skip loading some columns is to use Deferred Column Loading:
class Book(Base):
    # ...
    excerpt = deferred(Column(Text))
    photo = deferred(Column(Binary))
In this case when you execute query session.query(Book).get(1), the photo and excerpt columns will not be loaded until accessed from the code, at which point another query against the database will be executed to load the missing data.
But if you know before you query for the Book that you will need the photo column immediately, you can still override the deferred behavior with the undefer option: query = session.query(Book).options(undefer('photo')).get(1).
Basically, the suggestion here is to defer all the columns (in your case: except username, password etc) and in each use case (view) override with undefer those you know you need for that particular view. Please also see the group parameter of deferred, so that you can group the attributes by use case (view).
Another way would be to query only some columns, but in that case you get tuples back instead of model instances (in your case User), so it is potentially OK for form filling but not so good for model validation: session.query(Book.book_id, Book.title).all()