Strange behavior for includes with conditions in Rails


I've had to query a weird database extracted from an Excel file, and it's pretty badly designed. This has led me to some strange needs with ActiveRecord, one of which is setting conditions when eager loading relations.
So here's my problem and my weird solution.
1.- I include the relation
ModelOne.includes(:relation)
2.- I try to set conditions on the columns of the included table
ModelOne.includes(:relation).where("relation.some_column = something")
And I get the following error
Mysql2::Error: Unknown column some_column in where clause ...
The error, of course, displays the query, which contains no joins or anything else that refers to the included table.
Now, on the other hand, this works:
ModelOne.includes(:relation).where(relation: { some_column: "something" })
This hash syntax is cool, but it doesn't support LIKE queries, for example. The strange thing is that, after passing a hash that mentions the included table, every reference to the columns of this table works. From what I've found, this is how I would do a LIKE query:
ModelOne.includes(:relation).where.not(relation: { id: nil }).where("some_column LIKE ?","")
Note the weird where.not: it does nothing in terms of conditioning the query, but I need it so that I can use the columns of the eager loaded table in the methods that follow (LIKEs, groups and so on).
What is the right way of doing this? Why does Rails behave like this? What am I doing wrong? When does Rails actually include the eager loaded table?
Note: Using the joins method is not an option in this case, for various reasons.

http://guides.rubyonrails.org/active_record_querying.html
Read section 13.2, Specifying Conditions on Eager Loaded Associations:
Using where like this will only work when you pass it a Hash. For SQL-fragments you need to use references to force joined tables:
Article.includes(:comments).where("comments.visible = true").references(:comments)
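To see why the raw SQL fragment fails, it helps to look at what reaches the database. Without references (or a hash condition on the included table), includes loads the association with separate queries, so the main query never mentions the other table; with references it becomes a single LEFT OUTER JOIN. The sketch below reproduces that difference outside Rails, using Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names (model_ones, relations, model_one_id) are made up for illustration.

```python
import sqlite3

# Hypothetical schema standing in for ModelOne and its relation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE model_ones (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE relations (
        id INTEGER PRIMARY KEY,
        model_one_id INTEGER,
        some_column TEXT
    );
    INSERT INTO model_ones VALUES (1, 'first'), (2, 'second');
    INSERT INTO relations VALUES (10, 1, 'something'), (11, 2, 'other');
""")

# Shape of the main query when the association is loaded separately:
# `relations` is not in the FROM clause, so the condition cannot resolve.
try:
    conn.execute(
        "SELECT model_ones.* FROM model_ones WHERE relations.some_column LIKE ?",
        ("some%",),
    )
    unjoined_error = None
except sqlite3.OperationalError as exc:
    unjoined_error = str(exc)

# Shape of the query once the table is joined in: the same LIKE condition works.
joined_rows = conn.execute(
    """
    SELECT model_ones.id, relations.some_column
    FROM model_ones
    LEFT OUTER JOIN relations ON relations.model_one_id = model_ones.id
    WHERE relations.some_column LIKE ?
    """,
    ("some%",),
).fetchall()

print(unjoined_error)
print(joined_rows)
```

This is also why the where.not hash trick in the question appears to work: a hash condition that names the included table makes Rails build the joined query, and later SQL fragments then find the table.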

Related

Yii2 Dynamic Relational Query Junction with Sort uses 2 queries instead of a JOIN, why?

I'm working with Yii2 Relational Database / Active Query Models and I ran into an issue trying to use the magic method getModelName() with $this->hasMany()->viaTable() to set the relation while trying to sort by a sort column in my junction table.
First I tried to just add my ->orderBy() clause to the main query:
return $this->hasMany(Category::class, ['id' => 'categoryId'])
    ->viaTable('{{kits_x_categories}}', ['kitId' => 'id'])
    ->orderBy('{{kits_x_categories}}.sort asc');
That didn't work as expected and upon further digging I found out that this results in two separate queries, the first one selects my category Ids into an array, then uses said array for a WHERE IN() clause in the main (2nd) query to get the actual models that are related.
My first thought was to use the 3rd function($query) {} callback parameter of the ->viaTable() call and putting my $query->orderBy() clause on there:
return $this->hasMany(Category::class, ['id' => 'categoryId'])
    ->viaTable(
        '{{kits_x_categories}}',
        ['kitId' => 'id'],
        function ($query) {
            return $query->orderBy('{{kits_x_categories}}.sort asc');
        }
    );
However, all that did was return the category IDs in my desired order; it ultimately had no effect on the main query that applies the IN() condition with said IDs, since the order of the IDs in an IN() condition has no effect on the order of the results.
Finally, I ended up with this which lets it do what it wants, but then forces in my join to the main query with the IN() condition so that I can have the main query sort by my junction table sort column. This works as desired:
return $this->hasMany(Category::class, ['id' => 'categoryId'])
    ->viaTable('{{kits_x_categories}}', ['kitId' => 'id'])
    ->leftJoin('{{kits_x_categories}}', '{{kits_x_categories}}.categoryId = {{categories}}.id')
    ->where(['{{kits_x_categories}}.kitId' => $this->id])
    ->orderBy('{{kits_x_categories}}.sort asc');
This results in 2 queries.
First the query gets the category ids from the join table:
SELECT * FROM `kits_x_categories` WHERE `kitId`='49';
Then the main query with the IN() condition and my forced join for sort:
SELECT `categories`.* FROM `categories`
LEFT JOIN `kits_x_categories` ON `kits_x_categories`.categoryId = `categories`.id
WHERE (`kits_x_categories`.`kitId`='49') AND (`categories`.`id` IN ('11', '7', '9'))
ORDER BY `kits_x_categories`.`sort`
So here is my actual question. This seems largely inefficient to me, but I am by no means a database/SQL god, so maybe I just don't fully understand. What I want is to understand.
Why does Yii do this? What is the point of making one query to get the IDs first, then making another query to get the objects based on the IDs of the relation? Wouldn't it be more efficient to just do a regular join here? Then, in my opinion, sorting by a junction sort column would be intuitive rather than counter-intuitive.
The only thing I can think of is that it has to do with lazy vs. eager loading of data: maybe with lazy loading it only gets the IDs first, and then, when it needs to load the data, it pulls the actual data using IN()? If I used joinWith() instead of viaTable(), would that make any difference here? I didn't dig into this, as I literally just thought of it while typing this.
Lastly, in this scenario there are only going to be a few categories for each kit, so efficiency is not a big deal, but I'm curious: are there any performance implications in my working solution if I were to use it in the future on a different model set that could have thousands of relations?

Yii 2 does that:
To support lazy loading.
To support cross-database relations such as MySQL -> Redis.
To reduce number of edge-cases significantly so internal AR code becomes less complicated.
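As a rough illustration of the trade-off, here is a sketch of both query shapes. It is not Yii's internal code: it uses Python's built-in sqlite3 module, the table names and IDs are borrowed from the question, and the sort values and category names are invented. In the two-step shape the second query only needs a list of IDs, which is what lets the related rows live in a different store, but the junction order then has to be reapplied in application code. The single JOIN can sort by the junction column directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE kits_x_categories (kitId INTEGER, categoryId INTEGER, sort INTEGER);
    INSERT INTO categories VALUES (7, 'B'), (9, 'C'), (11, 'A');
    -- sort values are assumptions made for this sketch
    INSERT INTO kits_x_categories VALUES (49, 11, 1), (49, 7, 2), (49, 9, 3);
""")


def two_step(kit_id):
    """Junction query first, then an IN() query, as in the question's log."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT categoryId FROM kits_x_categories WHERE kitId = ? ORDER BY sort",
            (kit_id,),
        )
    ]
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, name FROM categories WHERE id IN ({placeholders})", ids
    ).fetchall()
    # IN() says nothing about row order, so reapply the junction order here.
    position = {category_id: index for index, category_id in enumerate(ids)}
    return sorted(rows, key=lambda row: position[row[0]])


def single_join(kit_id):
    """One query: join the junction table and sort by its sort column."""
    return conn.execute(
        """
        SELECT c.id, c.name
        FROM categories c
        JOIN kits_x_categories x ON x.categoryId = c.id
        WHERE x.kitId = ?
        ORDER BY x.sort
        """,
        (kit_id,),
    ).fetchall()


print(two_step(49))
print(single_join(49))
```

Both functions return the categories in junction order; the difference is where the ordering happens and how many round trips are made.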

3rd party software is usually designed to get you started with databases. But then they fall apart when the app grows. This means that you need to learn the details of the underlying database in addition to the details of the layer.
Possibly this specific issue can be solved by improving the indexes on the many-to-many table with the tips here: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
This, of course, depends on whether the layer lets you tweak the schema that it created for you.
If there is a way to write "raw" SQL, that might let you get rid of the 2-step process, but you still need to improve the indexes on that table.

MySql - Select * from 2 tables, but Prefix Table Names in the Resultset?

I'd like to select * from 2 tables, but have each table's column names be prefixed with a string, to avoid duplicate column name collisions.
For example, I'd like to have a view like so:
CREATE VIEW view_user_info as (
SELECT
u.*,
ux.*
FROM
user u,
user_ex ux
);
where the results all had each column prefixed with the name of the table:
e.g.
user_ID
user_EMAIL
user_ex_ID
user_ex_TITLE
user_ex_SIN
etc.
I've put a sql fiddle here that has the concept, but not the correct syntax of course (if it's even possible).
I'm using MySql, but would welcome generic solutions if they exist!
EDIT: I am aware that I could alias each of the fields, as mentioned in one of the comments. That's what I'm currently doing, but I find at the start of a project I keep having to sync up my tables and views as they change. I like the views to have everything in them from each table, and then I manually select out what I need. Kind of a lazy approach, but this would allow me to iterate quicker, and only optimize when it's needed.

I find at the start of a project I keep having to sync up my tables and views as they change.
Since the thing you're trying to do is not really supported by standard SQL, and you keep modifying database structures in development, I wonder if your best approach would be to write a little script that recreates that SELECT statement for you. Maybe wrap it in a method call in the development language of your choice?
Essentially you'd need to query INFORMATION_SCHEMA for the tables and columns of interest, probably via a join, and write the results out in SQL style.
Then just run the script every time you make database structural changes that are important to you, and watch your code magically keep up.
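A minimal sketch of such a script, written in Python against an in-memory SQLite database so that it runs on its own. SQLite has no INFORMATION_SCHEMA, so the sketch reads column names with PRAGMA table_info instead; on MySQL the same list would come from INFORMATION_SCHEMA.COLUMNS. The helper names (column_names, prefixed_select) are hypothetical, and the generated FROM clause simply mirrors the comma join from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (ID INTEGER PRIMARY KEY, EMAIL TEXT);
    CREATE TABLE user_ex (ID INTEGER PRIMARY KEY, TITLE TEXT, SIN TEXT);
""")


def column_names(table):
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    # Table names come from our own mapping below, not from user input.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def prefixed_select(tables):
    """Build a SELECT that aliases every column as <table>_<column>.

    `tables` maps a table name to the alias used in the FROM clause.
    """
    select_list = [
        f"{alias}.{column} AS {table}_{column}"
        for table, alias in tables.items()
        for column in column_names(table)
    ]
    from_list = ", ".join(f"{table} {alias}" for table, alias in tables.items())
    return "SELECT\n  " + ",\n  ".join(select_list) + "\nFROM " + from_list


sql = prefixed_select({"user": "u", "user_ex": "ux"})
print(sql)

# Running the generated statement confirms the prefixed result column names.
result_columns = [d[0] for d in conn.execute(sql).description]
print(result_columns)
```

The printed statement can be pasted into a CREATE VIEW, and rerunning the script after a schema change regenerates the alias list.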

INSERT with Linq omitting some "fake" columns

I have a table in the database with the following columns: ID, Name, Txt. We are using LINQ to SQL to implement our DAL. In the mapping, a colleague added two extra columns, so in the code the same table appears as: ID, Name, Txt, NameTemp, TxtTemp.
These two "fake" columns are used in different parts of the code in LINQ joins. Analyzing with SQL Profiler shows that the generated SQL query uses the "real" columns, and everything works properly.
Now I need to do an INSERT into that table, but I get an exception because the fake columns are also used in the statement.
Since I cannot add the two fake columns to the DB (they are useless there), is there a way to do an insert with LINQ that omits these two columns?

I think I know what you're getting at. You should be able to add properties to a partial LINQ class with no problem; the only thing is that if you try to use a LINQ query against these "fake" columns, you'll get an exception when LINQ to SQL tries to reference a column that doesn't exist in the database. I've been through this before: I wanted to be able to select columns that don't exist in the database (but do in the LINQ to SQL dbml class) and have LINQ to SQL translate the columns into what they really are in the database. The problem is that there's no really easy way to do this. You can add attributes to the "fake" properties so that LINQ to SQL thinks that NameTemp and TxtTemp are in fact Name and Txt in the SQL world, but when it comes to inserting a record, the translated SQL specifies the same column twice (which SQL doesn't like, so it throws an exception).
You can mark the column with IsDbGenerated = true - that'll let you insert records without getting the double column problem, but you can't update a record without linqtosql complaining that you can't update a computed column. I guess you can use a sproc to get around this perhaps?
I logged a bug with Microsoft a while back, which they'll never fix. The info here might help you get what you need -
http://social.msdn.microsoft.com/Forums/eu/linqtosql/thread/5691e0ad-ad67-47ea-ae2c-9432e4e4bd46
https://connect.microsoft.com/VisualStudio/feedback/details/526402/linq2sql-doesnt-like-it-when-you-wrap-column-properties-with-properties-in-an-interface

LINQ is not for inserting data, but for querying only - Language INtegrated Query. Use ADO.NET for inserting the data.
(Leaving the first part to remind my stupidity)
Check ScottGu. The classes generated are partial (mentioned here), so you can put your 2 properties into the editable part and since they won't have any mapping attribute defined, they won't be mapped nor persisted.

mysql don't return results if not from statement but from INDEX table or something

I think my question was a little confusing.....It confused me :)
Working on a media site as a take-over project and it has a custom CMS. The client wants the ability to activate/deactivate media....sort of like Wordpress's publish/unpublish feature.
Instead of digging through all the code looking for MySQL queries (which I'm not opposed to), I was wondering if you can add a sort of INDEX to a table that won't let it return a result row if that row's "active" column is, let's say, 0.
Just trying to be lazy and learn something at the same time, heh.
I don't need examples of queries to make it happen, btw.

What you describe is called a "view". Here is a page describing how to create them in MySQL: http://dev.mysql.com/doc/refman/5.0/en/create-view.html. However, in most cases you will still have to alter your code to use the view instead of the table.

You can consider creating a view (which contains only the active records) and giving that view the name of the actual table, after renaming the table itself. That way you get the filtering without changing any of your source code.
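A small sketch of that swap, using Python's built-in sqlite3 module rather than MySQL so that it runs on its own. The names media, media_all and active are assumptions for illustration. The legacy query text stays the same before and after; only what the name media refers to changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE media (
        id INTEGER PRIMARY KEY,
        title TEXT,
        active INTEGER NOT NULL DEFAULT 1
    );
    INSERT INTO media (id, title, active)
    VALUES (1, 'intro', 1), (2, 'draft', 0), (3, 'trailer', 1);
""")

# A query as it might already exist somewhere in the CMS code.
legacy_query = "SELECT id, title FROM media ORDER BY id"
before = conn.execute(legacy_query).fetchall()

# Rename the real table, then put a filtering view in its place.
conn.executescript("""
    ALTER TABLE media RENAME TO media_all;
    CREATE VIEW media AS SELECT * FROM media_all WHERE active = 1;
""")
after = conn.execute(legacy_query).fetchall()

print(before)
print(after)
```

Reads are the easy part. Inserts and updates that go through the view name behave differently from engine to engine, so the CMS's write paths would need testing before relying on this.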

Limiting results of System.Data.Linq.Table<T>

I am trying to inherit from my generated datacontext in LinqToSQL - something like this
public class myContext : dbDataContext {
public System.Data.Linq.Table<User>() Users {
return (from x in base.Users() where x.DeletedOn.HasValue == false select x);
}
}
But my Linq statement returns IQueryable which cannot cast to Table - does anyone know a way to limit the contents of a Linq.Table - I am trying to be certain that anywhere my Users table is accessed, it doesn't return those marked deleted. Perhaps I am going about this all wrong - any suggestions would be greatly appreciated.
Hal

Another approach would be to use views:
CREATE VIEW ActiveUsers AS SELECT * FROM Users WHERE DeletedOn IS NULL
As far as linq to sql is concerned, that is just the same as a table. For any table that you needed the DeletedOn filtering, just create a view that uses the filter and use that in place of the table in your data context.

You could use discriminator column inheritance on the table, i.e. a DeletedUsers class and an ActiveUsers class, where the discriminator column says which row goes to which. Then in your code, just reference the Users.OfType ActiveUsers, which will never include anything deleted.
As a side note, how the heck do you do this with markdown?
Users.OfType<ActiveUsers>
I can get it in code, but not inline

Encapsulate your DataContext so that developers don't use Table in their queries. I have an 'All' property on my repositories that does a similar filtering to what you need. So then queries are like:
from item in All
where ...
select item
and All might be:
public IQueryable<T> All
{
    get { return MyDataContext.GetTable<T>().Where(entity => !entity.DeletedOn.HasValue); }
}

You can use a stored procedure that returns all the mapped columns in the table for all the records that are not marked deleted, then map the LINQ to SQL class to the stored procedure's results. I think you just drag-drop the stored proc in Server Explorer on to the class in the LINQ to SQL designer.

What I did in this circumstance is I created a repository class that passes back IQueryable but basically is just
from t in _db.Table
select t;
This is usually referenced by tableRepository.GetAllXXX(); but you could have a tableRepository.GetAllNonDeletedXXX(); that adds the preliminary where clause to take out the deleted rows. This would allow you to get back the deleted rows, the undeleted rows, or all rows by using different methods.

Perhaps my comment to Keven sheffield's response may shed some light on what I am trying to accomplish:
I have a similar repository for most of my data access, but I am trying to be able to traverse my relationships and maintain the DeletedOn logic, without actually calling any additional methods. The objects are interrogated (spelling fixed) by a StringTemplate processor which can't call methods (just props/fields).
I will ultimately need this DeletedOn filtering for all of the tables in my application. The inherited class solution from Scott Nichols should work (although I will need to derive a class and relationships for around 30 tables - ouch), although I need to figure out how to check for a null value in my Derived Class Discriminator Value property.
I may just end up extending all my classes specifically for the StringTemplate processing, explicitly adding properties for the relationships I need. I would just love to be able to throw StringTemplate a [user] and have it walk through everything.

There are a couple of views we use in associations and they still appear just like any other relationship. We did need to add the associations manually. The only thing I can think to suggest is to take a look at the properties and decorated attributes generated for those classes and associations.
Add a couple tables that have the same relationship and compare those to the view that isn't showing up.
Also, sometimes the refresh on the server explorer connection doesn't seem to work correctly and the entities aren't created correctly initially, unless we remove them from the designer, close the project, then reopen the project and add them again from the server explorer. This is assuming you are using Visual Studio 2008 with the linq to sql .dbml designer.

I found the problem that I had with the relationships/associations not showing in the views. It seems that you have to go through each class in the dbml and set a primary key for views as it is unable to extract that information from the schema. I am in the process of setting the primary keys now and am planning to go the view route to isolate only non-deleted items.
Thanks and I will update more later.
