How do I make a newsfeed without many join statements - mysql

What is the best way to approach making a newsfeed?
Currently I have an observer that creates a new newsfeedactivity record every time someone creates a record. But to deal with privacy, I end up with 7 or 8 joins to get the appropriate output.
It seems like this is going to be slow and inefficient. What's another strategy for pulling out the right newsfeedactivity records as a scope?
More details:
Currently I have a site to help users track projects that they're working on. There are public and private projects (where people can invite collaborators).
I want my newsfeed to include: when public projects are created, when you are invited to a private project, when a user follows a project, and all of the actions of the other users that you're following. For the private projects I have another join table to determine who has access to the projects. (There are also comments on each of these projects that I want to show up in the newsfeed as well.)
All of the following relationships are currently in join tables, which is why I have a lot of joins.
To get an idea of the type of query - I'm thinking it would look something like this:
SELECT news_feed_activities.*
FROM news_feed_activities
LEFT JOIN user_following_relationships
  ON user_following_relationships.following_id = news_feed_activities.user_id
LEFT JOIN user_project_relationships
  ON user_project_relationships.project_id = news_feed_activities.responding_to_id
  AND news_feed_activities.responding_to_type = 'Project'
WHERE (user_following_relationships.user_id = 1
  OR user_project_relationships.user_id = 1
  OR news_feed_activities.user_id = 1
  OR up2.user_id = 1)
GROUP BY news_feed_activities.id
ORDER BY news_feed_activities.id DESC
EDIT:
I think I'm probably going to end up using Redis along these lines http://blog.waxman.me/how-to-build-a-fast-news-feed-in-redis
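For readers weighing the precomputed-feed idea, here is a minimal plain-Ruby sketch of fan-out on write. It is not the linked post's implementation: a Hash of arrays stands in for Redis (where each feed would typically be a sorted set), and FeedStore, MAX_ITEMS, publish, and feed_for are hypothetical names. The point is that privacy and follow rules are resolved once, when the activity is written, so reading a feed needs no joins.

```ruby
# In-memory stand-in for per-user feeds (assumption: Redis would hold these in production).
class FeedStore
  MAX_ITEMS = 100 # hypothetical cap on the number of entries kept per feed

  def initialize
    @feeds = Hash.new { |hash, user_id| hash[user_id] = [] }
  end

  # Fan-out on write: push the activity id into every recipient's feed.
  def publish(activity_id, timestamp, recipient_ids)
    recipient_ids.each do |user_id|
      feed = @feeds[user_id]
      feed << [timestamp, activity_id]
      feed.sort_by! { |ts, id| [-ts, -id] } # newest first
      feed.pop while feed.size > MAX_ITEMS  # trim the oldest entries
    end
  end

  # Reading a feed is a single lookup, with no joins.
  def feed_for(user_id, limit = 20)
    @feeds[user_id].first(limit).map { |_ts, id| id }
  end
end

store = FeedStore.new
# Recipients are worked out at write time (followers, collaborators, and so on).
store.publish(1, 100, [1, 2]) # public project created: users 1 and 2 see it
store.publish(2, 200, [2])    # private project invite: only user 2 sees it
store.publish(3, 300, [1, 2])
```

The trade-off is more work on every write, plus some backfill logic when someone starts following a user or gains access to a project, in exchange for cheap reads.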

Assuming this is Rails (RoR):
In your controller:
@user = current_user # (?)
recent_since = 24.hours.ago
@news_feed = []
# 1) I want my newsfeed to include when public projects are created.
@news_feed += Project.recent.open
# 2) When you are invited to a private project.
@news_feed += @user.invites.received.pending
# 3) When a user follows a project.
@news_feed += @user.user_following_relationships.recent
# 4) And then all of the actions of the other users that you're following.
# flat_map keeps the feed a flat list instead of an array of arrays.
@news_feed += @user.follows.flat_map(&:activities)
# 5) Then for the private projects I have another join table to determine who has access to the projects. (There are also comments on each of these projects that I want to show up in the newsfeed as well).
@news_feed += @user.projects.closed
@news_feed.sort! { |a, b| a.created_at <=> b.created_at }
I did some sample scopes for you too.
project.rb
# Use a lambda so 24.hours.ago is evaluated on each call, not once when the class loads.
scope :recent, lambda { where("created_at >= ?", 24.hours.ago) }
scope :open, :conditions => "publicity = 'Public'"
scope :closed, :conditions => "publicity = 'Private'"
This is based on the premise that your news feed is actually a summary of recent activity across models, rather than a separate 'newsfeed' model.
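To make the merge step concrete, here is a small plain-Ruby sketch, assuming every record type exposes created_at. The Item struct and the sample data are hypothetical stand-ins for the real models. It also shows two details that are easy to miss: collect(&:activities) returns one array per followed user, so the result needs flattening, and a feed is usually shown newest first.

```ruby
# Hypothetical stand-in for records from different models; only created_at matters here.
Item = Struct.new(:kind, :label, :created_at)

now = Time.utc(2024, 1, 2, 12, 0, 0)
projects = [Item.new(:project, "Public project A", now - 3600)]
invites  = [Item.new(:invite, "Invite to project B", now - 60)]
# One array of activities per followed user, as collect(&:activities) would return.
activities_per_user = [
  [Item.new(:activity, "Followed user commented", now - 1800)],
  [Item.new(:activity, "Stale comment", now - 2 * 24 * 3600)]
]

recent_since = now - 24 * 3600

# Flatten the nested arrays, drop anything older than the cutoff, sort newest first.
news_feed = projects + invites + activities_per_user.flatten
news_feed = news_feed
  .select { |item| item.created_at >= recent_since }
  .sort_by(&:created_at)
  .reverse

labels = news_feed.map(&:label)
```

Sorting in Ruby is fine for a day's worth of activity; for long feeds it would probably be better to limit each query before merging.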

Related

Rails activerecord find objects with specific children associations

I have a model User who can have many Features:
class User < ActiveRecord::Base
  has_many :features, dependent: :destroy
end
class Feature < ActiveRecord::Base
  belongs_to :user
  # columns id, user_id, name
end
I have 2 features I can put on a user, called "feat1" and "feat2". The 2 features combine for a total of 4 types of users:
User has feat1 ONLY
User has feat2 ONLY
User has BOTH feat1 and feat2
User has NEITHER feat1 nor feat2
I want to create scopes on user to scope out the 4 types of users.
User.only_feat1
User.only_feat2
User.both_feats
User.no_feats
I've been playing around with .merge, .uniq, .joins, and .includes, but can't seem to figure out the ActiveRecord way.
So here's my solution. It involves a lot of raw SQL, but it gets the job done. The problem is that if I join the same table (features) to my users table more than once, the where clauses get overwritten and I can't do multiple joins.
To combat this problem, I wrote raw SQL that explicitly aliases the table on each join, which forces the double join.
scope :without_feature, ->(feat) { joins("LEFT OUTER JOIN features af ON users.id = af.user_id AND af.name = '#{feat}'").where(af: {id: nil}) }
scope :feat1, -> { joins("INNER JOIN features rs ON users.id = rs.user_id AND rs.name = 'feat1'") }
scope :feat2, -> { joins("INNER JOIN features rr ON users.id = rr.user_id AND rr.name = 'feat2'") }
scope :both_feats, -> { feat1.feat2 }
scope :only_feat1, -> { feat1.without_feature('feat2') }
scope :only_feat2, -> { feat2.without_feature('feat1') }
Hope this helps anyone else.
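As a cross-check for scopes like these, the four groups can also be computed in plain Ruby from the feature rows. This is only a sketch over hypothetical sample data (the features array, user_ids, and the classify helper are made up for illustration), but it spells out the set logic, including the no_feats case, that the SQL has to reproduce.

```ruby
require "set"

# Hypothetical sample data: rows of the features table as plain hashes.
features = [
  { user_id: 1, name: "feat1" },
  { user_id: 2, name: "feat2" },
  { user_id: 3, name: "feat1" },
  { user_id: 3, name: "feat2" }
]
user_ids = [1, 2, 3, 4] # user 4 has no features at all

# Collect the set of feature names per user.
names_by_user = Hash.new { |hash, id| hash[id] = Set.new }
features.each { |f| names_by_user[f[:user_id]] << f[:name] }

# Map a set of feature names to one of the four user types.
def classify(names)
  has1 = names.include?("feat1")
  has2 = names.include?("feat2")
  if has1 && has2
    :both_feats
  elsif has1
    :only_feat1
  elsif has2
    :only_feat2
  else
    :no_feats
  end
end

groups = user_ids.group_by { |id| classify(names_by_user[id]) }
```

Comparing groups against the ids returned by each scope on a small fixture is a quick way to catch a join that filters too much or too little.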

ActiveRecord custom has_many association makes multiple calls to the database

I have a pair of ActiveRecord objects that have a belongs_to ... has_many association, with the has_many association being custom-made. Example:
First AR object:
class Car < Vehicle
has_many :wheels, class_name: "RoundObject", foreign_key: :vehicle_id, conditions: "working = 1"
validates_presence_of :wheels
...
end
Second AR object:
class RoundObject < ActiveRecord::Base
belongs_to :vehicle
...
end
Please note that the above is not indicative of my app's function, simply to outline the association between my two AR objects.
The issue I'm having is that, when I reset the cache (and thus my Rails app re-caches all AR objects in the database), when it comes time for the RoundObject object to get re-cached, it makes multiple calls to the database, one for each unique vehicle_id associated with the collection of RoundObjects. The SQL commands being run are output to the console, so this is what my output looked like:
RoundObject Load (2.0ms) SELECT `round_objects`.* FROM `round_objects` WHERE `round_objects`.`vehicle_id` = 28 AND (active = 1)
RoundObject Load (1.0ms) SELECT `round_objects`.* FROM `round_objects` WHERE `round_objects`.`vehicle_id` = 29 AND (active = 1)
RoundObject Load (2.0ms) SELECT `round_objects`.* FROM `round_objects` WHERE `round_objects`.`vehicle_id` = 30 AND (active = 1)
My app has several other AR objects that use the built-in has_many association without any modifications, and I notice that they only hit the database once when resetting the cache. For instance:
Micropost Load (15.0ms) SELECT `microposts`.* FROM `microposts` INNER JOIN `posts` ON `posts`.`id` = `microposts`.`post_id` WHERE `microposts`.`active` = 1 AND `posts`.`active` = 1
My question is, how can I make my AR object only hit the database once on cache reset, while still maintaining the custom has_many association I need? Can I manually force a join on the SQL query being called, and will this help?
Thank you!
You can use the includes method when loading your vehicles to eager-load the associated RoundObjects. Pass the name of the association (wheels in the example above), not the class name.
It will go like this:
Car.where(conditions_for_getting_data).includes(:wheels)
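The difference between the two loading patterns can be sketched in plain Ruby without a database. FakeDb below is a hypothetical stand-in that only counts round trips, and the rows are made-up sample data. The lazy version issues one query per vehicle, as in the log above, while the eager version fetches the rows for all vehicles at once and groups them in memory, which is roughly what eager loading does with a single WHERE vehicle_id IN (...) query.

```ruby
# Hypothetical in-memory "table"; each call to FakeDb#query counts as one database round trip.
class FakeDb
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # Returns the rows for which the given block is true.
  def query
    @query_count += 1
    @rows.select { |row| yield(row) }
  end
end

rows = [
  { id: 1, vehicle_id: 28, active: true },
  { id: 2, vehicle_id: 28, active: false },
  { id: 3, vehicle_id: 29, active: true },
  { id: 4, vehicle_id: 30, active: true }
]
vehicle_ids = [28, 29, 30]

# Lazy loading: one query per vehicle.
lazy_db = FakeDb.new(rows)
lazy = vehicle_ids.map do |vid|
  [vid, lazy_db.query { |r| r[:vehicle_id] == vid && r[:active] }]
end.to_h

# Eager loading: one query for all vehicles, then group in memory.
eager_db = FakeDb.new(rows)
eager = eager_db
  .query { |r| vehicle_ids.include?(r[:vehicle_id]) && r[:active] }
  .group_by { |r| r[:vehicle_id] }
```

Both approaches end up with the same data; only the number of round trips differs, and that number grows with the number of vehicles in the lazy case.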

Django how to select foreign value in query

Given the following models:
class User(models.Model):
    ...
    login = ...

class Asset(models.Model):
    user = models.ForeignKey(User)
    ...
How can I select the user's login in an Asset query using Django's QuerySet capabilities? For example, I would like something like
Asset.objects.extra(select = {'user_login' : 'user__login'})
to return a queryset with a user_login field on each model object.
Each Asset object already has a foreign key to the user, so you can always access the login through it:
asset = Asset.objects.get(pk=any_id)
if asset.user.login == 'some_value':
    do_some_magic()
Please read the documentation.
Use .select_related('user') to select all assets and related users in a single query. Then simply access it through asset.user.login.
assets = Asset.objects.select_related('user').filter(<any filter>)
for asset in assets:
    # no additional queries here, as the user objects are preloaded into memory
    print(asset.user.login)
I have found the following solution:
Asset.objects.extra( select = {'user_login' : '`%s.%s`' % (User._meta.db_table, 'login') } ).order_by('user__login')
The order_by expression forces a JOIN on the User model's table; the user's login can then be accessed in the SELECT expression as user_table.login.

Symfony 2 Self referencing many to many repository

I have a self-referencing many-to-many relationship on my User entity, since users can have many followers and can follow many other users.
I am now trying to write a query in the user repository that determines whether a user is following another user.
I tried to write the query directly on user_relations (the mapping table), but it would not let me because the table is not related to the User entity.
So I tried:
$query = $this->createQueryBuilder('u')
->select('count(u.id)')
->innerJoin('u.following', 'r')
->where('u.id = :userID')
->where('r.from_id = :followingID')
->setParameter('userID', $userId)
->setParameter('followingID', $followingId)
Which results in an error stating the user entity does not have a field named from_uid.
How the hell can I correctly achieve this?
You can use Doctrine's MEMBER OF keyword:
$query = $em->createQuery('SELECT u.id FROM User u WHERE :followingID MEMBER OF u.following');
$query->setParameter('followingID', $followingId);
$ids = $query->getResult();

Foreign keys cause more queries?

I have 2 objects, Order and Product. On the Order only the productID is saved, and when I view orders I want to see the product name. According to ScottGu's blog this is easily done by using a template field with Eval("Product.ProductName"). However, when reviewing the actual queries I see that a separate query is made for each order.
That doesn't sound right to me, because with many rows and/or foreign keys many additional queries will be made. Doesn't that make the whole thing too inefficient (i.e. why doesn't LINQ use a join)?
Thanks
That is because your products are lazy loaded - that is they are loaded when needed.
You can use DataLoadOptions to set your fetching strategy and load the products together with your orders:
MyDataContext db = new MyDataContext();
DataLoadOptions options = new DataLoadOptions();
options.LoadWith<Order>(order => order.Product);
db.LoadOptions = options;
var orders = from o in db.Orders select o;
If you don't like specifying the load options per DataContext, you can do something like this (not tested):
MyDataContext db = new MyDataContext();
db.Orders
.Select(o => new { Order = o, Product = o.Product })
.ToList()
.Select(x => x.Order)
.ToList();
I have implemented something like this guy's fetching strategies, which works out nicely with my repositories and the specification pattern.
This happens because by the time Eval runs, the query has already executed, without said join.
When you're building the query you can use DataLoadOptions to include the related product via the .LoadWith() method:
var dlo = new DataLoadOptions();
dlo.LoadWith<Order>(o => o.Product);
var dc = new MyDataContext(); // your generated DataContext subclass
dc.LoadOptions = dlo;
var orders = from o in dc.Orders select o;