Why does Laravel not optimize model queries automatically?

Until today I relied on Laravel relationships, but after looking at the MySQL logs I was very disappointed.
When I execute this code:
Company::with(['users', 'machines'])->get()
mysql.log looks like this:
select * from `company` where `company`.`id` = '48' limit 1
select * from `user` where `user`.`company_id` in ('48')
select * from `machine` where `machine`.`company_id` in ('48')
Why does Laravel not use joins for eager fetching? Also, are there any ways of improving performance while still using Laravel models?
I know that Doctrine ORM eager loading works pretty nice by using joins.
Thank you for your help.

If you really want to use joins instead of the Eloquent computed queries, I suppose you could just use the fluent query builder (that comes shipped with Laravel through the DB facade) and stick that code into a method of your model to keep everything nice and SRP.
For instance:
class Company extends Model {
public function sqlWithJoin() {
$users = DB::table('company')
->leftJoin('user', 'company.id', '=', 'user.company_id')
->get();
return $users;
}
}
This would generate a proper join query for you.
As for why you would want to do this, you would have to benchmark both options to see which one gives you the best performance for your specific data. I wouldn't generalize that one option always has better/worse performance than the other.
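To make the comparison concrete, here is a minimal sketch of what the eager-loading queries in the log amount to: one query per table, then stitching the rows together in memory. It uses Python's sqlite3 instead of Laravel and MySQL, and the function name, sample rows, and the username column are assumptions for illustration only.

```python
import sqlite3
from collections import defaultdict


def load_companies_with_users(conn):
    """Two-step eager load: one query per table, stitched together in memory."""
    conn.row_factory = sqlite3.Row
    companies = [dict(r) for r in conn.execute("SELECT * FROM company ORDER BY id")]
    ids = [c["id"] for c in companies]
    if not ids:
        return companies

    # Second query: all users of all loaded companies in one IN (...) lookup.
    placeholders = ",".join("?" * len(ids))
    users_by_company = defaultdict(list)
    for r in conn.execute(
        f"SELECT * FROM user WHERE company_id IN ({placeholders}) ORDER BY id", ids
    ):
        users_by_company[r["company_id"]].append(dict(r))

    # Attach each company's users; companies without users get an empty list.
    for c in companies:
        c["users"] = users_by_company.get(c["id"], [])
    return companies


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user (id INTEGER PRIMARY KEY, company_id INTEGER, username TEXT);
    INSERT INTO company VALUES (48, 'Acme'), (49, 'Empty Co');
    INSERT INTO user VALUES (1, 48, 'alice'), (2, 48, 'bob');
""")
companies = load_companies_with_users(conn)
```

The number of queries depends on the number of relations, not on the number of companies, which is why this pattern is often competitive with a join. Benchmarking on your own data is still the only way to know.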

As stated in the comments, I'm not sure why this is the preferred method performance-wise, but from a usability standpoint, being able to access a model's relationships as its own separate property is much easier than working with a join, especially in the case of a one-to-many relationship.
Let's compare the above example using both ->with() and ->leftJoin() methods.
When using ->with(), every relationship is defined as a property of Company, accessed via $company->users. It's easy to run a foreach() loop over this property, e.g. foreach($company->users AS $user), and output information such as username, email, etc. Also, if the Company has no users, you don't have to worry about displaying empty values (especially important when chaining models using . notation, such as users.user_details).
Now, looking at leftJoin(). If you try to chain multiple leftJoins() on each model and their sub-models, there's a chance you won't get the results you're expecting. Essentially, leftJoin() doesn't handle NULL records as well as individual queries can.
Next, to output a list of a company's users, you would have to run a loop such as:
foreach($company AS $row){
echo $row->username;
echo $row->email;
// etc etc
}
This becomes problematic because the result doesn't handle duplicate column names well at all. For example, if the company has an email field as well as the user, it's anyone's guess which one is actually displayed. Unless you do a selectRaw("companies.email AS email, users.email AS user_email"), only one email property is going to be returned. This also applies to columns like id, where several are fetched by leftJoin() but only one is actually accessible.
Long story short, leftJoin() comes with the potential for a lot of issues when trying to join multiple tables with the possibility of duplicate information, null information, etc. While the performance of running multiple queries using the ->with() method may not be the best, it allows for easier use in retrieving and displaying information.
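The duplicate-column and NULL problems described above are not specific to Laravel. Here is a small sketch using Python's sqlite3 that turns each joined row into a dict keyed by column name. The table contents and the fetch_dicts helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, company_id INTEGER, email TEXT);
    INSERT INTO companies VALUES (48, 'info@acme.test'), (49, 'info@empty.test');
    INSERT INTO users VALUES (1, 48, 'alice@acme.test');
""")


def fetch_dicts(sql):
    """Hypothetical helper: map each row to a dict keyed by column name."""
    cur = conn.execute(sql)
    names = [d[0] for d in cur.description]
    # With duplicate column names, later columns silently overwrite earlier ones.
    return [dict(zip(names, row)) for row in cur]


# SELECT * yields two "id" and two "email" columns; only one of each survives.
naive = fetch_dicts("""
    SELECT * FROM companies
    LEFT JOIN users ON users.company_id = companies.id
    ORDER BY companies.id
""")

# Explicit aliases keep every column addressable, including the NULL user columns.
aliased = fetch_dicts("""
    SELECT companies.id AS company_id, companies.email AS company_email,
           users.id AS user_id, users.email AS user_email
    FROM companies
    LEFT JOIN users ON users.company_id = companies.id
    ORDER BY companies.id
""")
```

In the naive version, the company without users ends up with id and email both set to None, because the NULL user columns overwrite the company's own values. With aliases, the company columns stay intact and only user_id and user_email are None.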

Related

Yii2 Dynamic Relational Query Junction with Sort uses 2 queries instead of a JOIN, why?

I'm working with Yii2 Relational Database / Active Query Models and I ran into an issue trying to use the magic method getModelName() with $this->hasMany()->viaTable() to set the relation while trying to sort by a sort column in my junction table.
First I tried to just add my ->orderBy() clause to the main query:
return $this->hasMany(Category::class,
['id' => 'categoryId'])
->viaTable('{{kits_x_categories}}',
['kitId' => 'id'])
->orderBy('{{kits_x_categories}}.sort asc');
That didn't work as expected, and upon further digging I found out that this results in two separate queries: the first one selects my category IDs into an array, and the main (second) query then uses that array in a WHERE IN() clause to get the actual related models.
My first thought was to use the 3rd function($query) {} callback parameter of the ->viaTable() call and putting my $query->orderBy() clause on there:
return $this->hasMany(Category::class,
['id' => 'categoryId'])
->viaTable('{{kits_x_categories}}',
['kitId' => 'id'],
function($query) {
return $query->orderBy('{{kits_x_categories}}.sort asc');
}
);
However, all that did was return the category IDs in my desired order. It ultimately had no effect on the main query that does the IN() condition with those IDs, since the order of the IDs in the IN() condition has no effect on the order of the results.
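For illustration only, here is that behaviour reproduced outside Yii with Python's sqlite3: the ordered IDs from the junction table go into an IN() query, and the desired order has to be reapplied afterwards in application code. The category names and sort values are invented, and this is a sketch of the general two-query pattern, not of Yii's internals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE kits_x_categories (kitId INTEGER, categoryId INTEGER, sort INTEGER);
    INSERT INTO categories VALUES (7, 'Tools'), (9, 'Parts'), (11, 'Manuals');
    INSERT INTO kits_x_categories VALUES (49, 11, 1), (49, 7, 2), (49, 9, 3);
""")

# Step 1: category ids from the junction table, in the desired sort order.
ordered_ids = [
    row[0]
    for row in conn.execute(
        "SELECT categoryId FROM kits_x_categories WHERE kitId = ? ORDER BY sort",
        (49,),
    )
]

# Step 2: the IN() query. The database is free to return these rows in any
# order, regardless of the order in which the ids are listed.
placeholders = ",".join("?" * len(ordered_ids))
rows = conn.execute(
    f"SELECT id, name FROM categories WHERE id IN ({placeholders})", ordered_ids
).fetchall()

# Reapply the junction order in application code.
position = {category_id: i for i, category_id in enumerate(ordered_ids)}
sorted_rows = sorted(rows, key=lambda row: position[row[0]])
```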
Finally, I ended up with this which lets it do what it wants, but then forces in my join to the main query with the IN() condition so that I can have the main query sort by my junction table sort column. This works as desired:
return $this->hasMany(Category::class,
['id' => 'categoryId'])
->viaTable('{{kits_x_categories}}',
['kitId' => 'id'])
->leftJoin('{{kits_x_categories}}', '{{kits_x_categories}}.categoryId = {{categories}}.id')
->where(['{{kits_x_categories}}.kitId' => $this->id])
->orderBy('{{kits_x_categories}}.sort asc');
This results in 2 queries.
First the query gets the category ids from the join table:
SELECT * FROM `kits_x_categories` WHERE `kitId`='49';
Then the main query with the IN() condition and my forced join for sort:
SELECT `categories`.* FROM `categories`
LEFT JOIN `kits_x_categories` ON `kits_x_categories`.categoryId = `categories`.id
WHERE (`kits_x_categories`.`kitId`='49') AND (`categories`.`id` IN ('11', '7', '9'))
ORDER BY `kits_x_categories`.`sort`
So here is my actual question. This seems largely inefficient to me, but I am by no means a database/SQL god, so maybe I just don't fully understand it. What I want is to understand.
Why does Yii do this? What is the point of making one query to get the IDs first, then making another query to get the objects based on the IDs of the relation? Wouldn't it be more efficient to just do a regular join here? Then, in my opinion, sorting by a junction sort column would be intuitive rather than counter-intuitive.
The only thing I can think of is that it has to do with lazy vs. eager loading of data. Maybe with lazy loading it only gets the IDs first, and then pulls the actual data using IN() when it needs it? If I used joinWith() instead of viaTable(), would that make any difference here? I didn't dig into this, as I literally just thought of it while typing this.
Lastly, in this scenario there are only going to be a few categories for each kit, so efficiency is not a big deal. But I'm curious: are there any performance implications in my working solution if I were to use it in the future on a different model set that could have thousands of relations or more?
Yii 2 does that:
To support lazy loading.
To support cross-database relations such as MySQL -> Redis.
To reduce number of edge-cases significantly so internal AR code becomes less complicated.
3rd party software is usually designed to get you started with databases. But then they fall apart when the app grows. This means that you need to learn the details of the underlying database in addition to the details of the layer.
Possibly this specific issue can be solved by improving the indexes on the many-to-many table with the tips here: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
This, of course, depends on whether the layer lets you tweak the schema that it created for you.
If there is a way to write "raw" SQL, that might let you get rid of the 2-step process, but you still need to improve the indexes on that table.
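As a rough sketch of what the raw-SQL route could look like, here is the single JOIN ordered by the junction table's sort column, run through Python's sqlite3 rather than Yii and MySQL. The composite primary key plus a reverse index on the mapping table is a common layout for many-to-many tables. The index name and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE kits_x_categories (
        kitId      INTEGER NOT NULL,
        categoryId INTEGER NOT NULL,
        sort       INTEGER NOT NULL,
        PRIMARY KEY (kitId, categoryId)   -- serves lookups by kit
    );
    -- Reverse index for lookups that start from the category side.
    CREATE INDEX idx_category_kit ON kits_x_categories (categoryId, kitId);
    INSERT INTO categories VALUES (7, 'Tools'), (9, 'Parts'), (11, 'Manuals');
    INSERT INTO kits_x_categories VALUES (49, 11, 1), (49, 7, 2), (49, 9, 3);
""")


def categories_for_kit(kit_id):
    """One query: join through the junction table and sort by its sort column."""
    return conn.execute(
        """
        SELECT c.id, c.name
        FROM categories AS c
        JOIN kits_x_categories AS x ON x.categoryId = c.id
        WHERE x.kitId = ?
        ORDER BY x.sort
        """,
        (kit_id,),
    ).fetchall()


kit_categories = categories_for_kit(49)
```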

Database Access Objects batchInsert() yii2 run another function after each record is inserted

I am using the batchInsert() function of Yii2 Database Access Objects. I need to run another function, such as sending an email from PHP code, after each record is inserted. What is the workaround to achieve this? Or is it possible to get all the AUTO_INCREMENT ids of the inserted rows?
The code is
Yii::$app->db->createCommand()->batchInsert(Address::tableName(),
['form_id','address'], $rows)->execute();
I am using batchInsert() documented in https://www.yiiframework.com/doc/api/2.0/yii-db-command#batchInsert()-detail
First, you're not using ActiveRecord. You can find more about ActiveRecord in the documentation: API and guide. Yii::$app->db->createCommand() is DAO, which is much simpler than ActiveRecord and does not support events in the same way.
Second, there is no batchInsert() for ActiveRecord, and there is a good reason for that: it is hard to detect the IDs of inserted records when you insert them in a batch (at least in a reliable and DB-independent way). You can read more about this on GitHub.
However, if you know the IDs of the records or some unique fields before the insert (for example, a user that has a unique login and/or email in addition to the numeric ID), you can just fetch them after the insert and trigger the event manually:
$models = Address::findAll(['email' => $emailsArray]);
foreach ($models as $model) {
$model->trigger('myEvent');
}
But unless you're inserting millions of records, you should probably stick to simple foreach and $model->save(), for the sake of simplicity and reliability.
There is also a yii2-collection official extension. This is still more like a draft and POC, but it may be interesting in the future.
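The "simple foreach" advice can be sketched as follows, using Python's sqlite3 instead of Yii: insert one row at a time, record each generated ID, and run the follow-up action once the transaction has committed. The function names and the notification stand-in are hypothetical; in the original setup the callback would send the email.

```python
import sqlite3


def insert_addresses(conn, rows, after_insert):
    """Insert rows one by one so that every generated id is known."""
    ids = []
    with conn:  # single transaction: commits on success, rolls back on error
        for form_id, address in rows:
            cur = conn.execute(
                "INSERT INTO address (form_id, address) VALUES (?, ?)",
                (form_id, address),
            )
            ids.append(cur.lastrowid)
    # Run the callback only after the commit, so nothing is sent for rows
    # that were rolled back.
    for new_id, (form_id, address) in zip(ids, rows):
        after_insert(new_id, form_id, address)
    return ids


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address (id INTEGER PRIMARY KEY AUTOINCREMENT, form_id INTEGER, address TEXT)"
)

sent = []  # stand-in for an outgoing mail queue
new_ids = insert_addresses(
    conn,
    [(10, "1 Main St"), (10, "2 Side St")],
    lambda new_id, form_id, address: sent.append((new_id, form_id, address)),
)
```

Wrapping the loop in one transaction keeps most of the speed of a batch insert while still giving you an ID per row.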

SQL Inheritance, get by ID

This question aims to get the most clean and "best" way to handle this kind of problem.
I've read many questions about how to handle inheritance in SQL and like the Table Per Type model most and would like to use it. The problem with this is that you have to know what type you are going to query to do the proper join.
Let's say we have three tables: Son, Daughter and Child.
This works very well if you want to query all daughters, for example. You can simply join the child table and get all the information.
What I'm trying to do is query a Child by ID and get the associated subclass information. I could add a Type column to the child table and select the associated data with a second select, but that does not seem very clean. Another way would be to join all the sub-tables, but that doesn't seem very clean either.
Is there an inheritance model to solve this kind of problem in a clean, nice and performant way?
I'm using MySQL btw
Given the use case you described in the comment:
The server gets the HTTP request domain.com/randomID.
it becomes apparent that you have a single ID at hand, for which you want to retrieve the attributes of the derived entities. For your case, I would recommend the LEFT JOIN approach:
SELECT age,
son.id IS NOT NULL AS isSon,
randomColumn,
daughter.id IS NOT NULL AS isDaughter,
whatEver
FROM child
LEFT JOIN son ON child.id = son.id
LEFT JOIN daughter ON child.id = daughter.id
WHERE
child.id = :yourRandomId
This approach, BTW, stays very close to your current database design and thus you would not have to change much. Yet, you are able to benefit from the storage savings that the improved data model provides.
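A runnable sketch of this lookup, using Python's sqlite3 in place of MySQL. The table layout (each subtype table sharing the child's primary key) and the sample rows are assumptions based on the column names in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE child (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE son (id INTEGER PRIMARY KEY REFERENCES child(id), randomColumn TEXT);
    CREATE TABLE daughter (id INTEGER PRIMARY KEY REFERENCES child(id), whatEver TEXT);
    INSERT INTO child VALUES (1, 12), (2, 9);
    INSERT INTO son VALUES (1, 'plays football');
    INSERT INTO daughter VALUES (2, 'plays chess');
""")


def get_child(child_id):
    """Fetch one child by id together with whichever subtype row exists."""
    row = conn.execute(
        """
        SELECT child.id, child.age,
               son.id IS NOT NULL AS isSon, son.randomColumn,
               daughter.id IS NOT NULL AS isDaughter, daughter.whatEver
        FROM child
        LEFT JOIN son ON son.id = child.id
        LEFT JOIN daughter ON daughter.id = child.id
        WHERE child.id = ?
        """,
        (child_id,),
    ).fetchone()
    if row is None:
        return None
    cid, age, is_son, random_column, is_daughter, what_ever = row
    # The isSon / isDaughter flags tell the caller which subtype it got.
    if is_son:
        return {"type": "son", "id": cid, "age": age, "randomColumn": random_column}
    if is_daughter:
        return {"type": "daughter", "id": cid, "age": age, "whatEver": what_ever}
    return {"type": "child", "id": cid, "age": age}
```

Because every join is on a primary key, the lookup touches at most one row per table, so adding more subtype tables costs one extra key lookup each.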
Besides that, I do not see many chances to do it differently:
You have different columns with different datatypes (esp. if looking at your use case), so it is not possible to reduce the number of columns by combining some of them.
Introducing a type attribute is already rejected in your question; sending single SELECT statements as well.
In the comment you state that you are looking for something like Map<ID, Child> in MySQL. Please note that this Java-like expression is a compile-time construct that is instantiated at runtime with the concrete type of each instance. SQL does not distinguish between compile time and runtime, so there is no need for such a generic expression. Finally, note that in your Java program you would also have to determine (by introspection or by using instanceof) which type each value has, and that is likewise a per-record step.

Separate get request and database hit for each post to get like status

So I am trying to build a social network on Django. As in any other social network, users can like a post, and each like is stored in a model that is separate from the model used for the posts that show up in the news feed. I have tried two approaches to get the like status on the go.
1. Least database hits:
Make one SQL query and get the like entry for every post id, if it exists. I then use a custom Django template tag to check whether a like entry for the current post exists in the QuerySet, by searching an array that contains the like statuses of all posts.
This way I use the database to get all the values and search for a particular value in the list using Python.
2. Separate database query for each post:
Here I use the same custom template tag, but rather than searching through a QuerySet I let the MySQL database do most of the heavy lifting.
I use model.objects.get() for each entry.
Which is the more efficient approach? Also, I was planning on getting another database server. Would this change the choice if network latency is only around 0.1 ms?
Is there any way to get these like statuses on the go as boolean values, along with all the posts, in a single DB query?
An example query for the first method can be like
Let post_list be the post QuerySet
models.likes.objects.filter(user=current_user, post__in=post_list)
This is not a direct answer to your question, but I hope it is useful nonetheless.
and each of these likes are stored in a model that is different from the model used for news feed
I think you have a design issue here. It is better to create a model that describes a post, and then add a field users_that_liked_it as a many-to-many relationship to your user model. Then you can do something like post.users_that_liked_it and get a query set of all users that liked the post.
In my eyes, you should also avoid putting logic in templates as much as possible. They are simply not made for it. As a rule of thumb, logic belongs in the model class or, if it depends on the page visited, in the view.
Lastly, if performance is your main worry, you probably shouldn't be using Django anyway. It is just not that fast. What Django gives you is the ability to write clean, concise code. This is much more important for a new project than performance. Ask yourself: How many (personal) projects fail because their performance is bad? And how many fail because the creator gets caught in messy code?
Here is my advice: Favor clarity over performance. Especially in a young project.
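On the original question of getting the like status as a boolean along with all the posts in a single query: at the SQL level this can be done with an EXISTS subquery. Below is a minimal sketch using Python's sqlite3 rather than the Django ORM. The table and column names are assumptions. In Django, the same idea can usually be expressed by annotating the post queryset with an Exists subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE likes (
        user_id INTEGER NOT NULL,
        post_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, post_id)
    );
    INSERT INTO post VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO likes VALUES (7, 1), (7, 3), (8, 2);
""")


def posts_with_like_status(user_id):
    """One query: every post plus a boolean saying whether this user liked it."""
    sql = """
        SELECT p.id, p.body,
               EXISTS (
                   SELECT 1 FROM likes AS l
                   WHERE l.post_id = p.id AND l.user_id = ?
               ) AS liked
        FROM post AS p
        ORDER BY p.id
    """
    return [(pid, body, bool(liked)) for pid, body, liked in conn.execute(sql, (user_id,))]


feed = posts_with_like_status(7)
```

This keeps the template free of lookup logic: each post already carries its own liked flag.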

Limiting results of System.Data.Linq.Table<T>

I am trying to inherit from my generated datacontext in LinqToSQL - something like this
public class myContext : dbDataContext {
public System.Data.Linq.Table<User>() Users {
return (from x in base.Users() where x.DeletedOn.HasValue == false select x);
}
}
But my LINQ statement returns an IQueryable, which cannot be cast to Table. Does anyone know a way to limit the contents of a Linq.Table? I am trying to make certain that anywhere my Users table is accessed, it doesn't return the rows marked as deleted. Perhaps I am going about this all wrong. Any suggestions would be greatly appreciated.
Hal
Another approach would be to use views.
CREATE VIEW ActiveUsers as SELECT * FROM Users WHERE IsDeleted = 0
As far as linq to sql is concerned, that is just the same as a table. For any table that you needed the DeletedOn filtering, just create a view that uses the filter and use that in place of the table in your data context.
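To show that the view behaves like a pre-filtered table, here is a small sketch using Python's sqlite3 instead of SQL Server and LINQ to SQL. The column names besides IsDeleted and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (
        Id        INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL,
        IsDeleted INTEGER NOT NULL DEFAULT 0
    );
    -- The view hides soft-deleted rows from every query that goes through it.
    CREATE VIEW ActiveUsers AS SELECT * FROM Users WHERE IsDeleted = 0;
    INSERT INTO Users (Id, Name) VALUES (1, 'ann'), (2, 'bob');
""")

# Soft-delete one user in the base table.
conn.execute("UPDATE Users SET IsDeleted = 1 WHERE Id = ?", (2,))

# Queries against the view never see the deleted row.
active_names = [row[0] for row in conn.execute("SELECT Name FROM ActiveUsers ORDER BY Id")]
all_names = [row[0] for row in conn.execute("SELECT Name FROM Users ORDER BY Id")]
```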
You could use discriminator column inheritance on the table, i.e. a DeletedUsers class and an ActiveUsers class, where the discriminator column says which row goes to which. Then in your code, just reference Users.OfType<ActiveUsers>(), which will never include anything deleted.
Encapsulate your DataContext so that developers don't use Table in their queries. I have an 'All' property on my repositories that does a similar filtering to what you need. So then queries are like:
from item in All
where ...
select item
and all might be:
public IQueryable<T> All
{
get { return MyDataContext.GetTable<T>().Where(entity => !entity.DeletedOn.HasValue); }
}
You can use a stored procedure that returns all the mapped columns in the table for all the records that are not marked deleted, then map the LINQ to SQL class to the stored procedure's results. I think you just drag-drop the stored proc in Server Explorer on to the class in the LINQ to SQL designer.
What I did in this circumstance is I created a repository class that passes back IQueryable but basically is just
from t in _db.Table
select t;
this is usually referenced by tableRepository.GetAllXXX(); but you could have a tableRepository.GetAllNonDeletedXXX(); that puts in that preliminary where clause to take out the deleted rows. This would allow you to get back the deleted ones, the undeleted ones and all rows using different methods.
Perhaps my comment to Keven sheffield's response may shed some light on what I am trying to accomplish:
I have a similar repository for most of my data access, but I am trying to be able to traverse my relationships and maintain the DeletedOn logic, without actually calling any additional methods. The objects are interrogated by a StringTemplate processor which can't call methods (just props/fields).
I will ultimately need this DeletedOn filtering for all of the tables in my application. The inherited class solution from Scott Nichols should work (although I will need to derive a class and relationships for around 30 tables, ouch), but I still need to figure out how to check for a null value in my Derived Class Discriminator Value property.
I may just end up extending all my classes specifically for the StringTemplate processing, explicitly adding properties for the relationships I need. I would just love to be able to throw StringTemplate a [user] and have it walk through everything.
There are a couple of views we use in associations and they still appear just like any other relationship. We did need to add the associations manually. The only thing I can think to suggest is to take a look at the properties and decorated attributes generated for those classes and associations.
Add a couple tables that have the same relationship and compare those to the view that isn't showing up.
Also, sometimes the refresh on the server explorer connection doesn't seem to work correctly and the entities aren't created correctly initially, unless we remove them from the designer, close the project, then reopen the project and add them again from the server explorer. This is assuming you are using Visual Studio 2008 with the linq to sql .dbml designer.
I found the problem that I had with the relationships/associations not showing in the views. It seems that you have to go through each class in the dbml and set a primary key for views as it is unable to extract that information from the schema. I am in the process of setting the primary keys now and am planning to go the view route to isolate only non-deleted items.
Thanks and I will update more later.