NHibernate INNER JOIN on a SubQuery - mysql

I would like to do a subquery and then inner join the result of that to produce a query. I want to do this as I have tested an inner join query and it seems to be far more performant on MySql when compared to a straight IN subquery.
Below is a very basic example of the type of sql I am trying to reproduce.
Tables
ITEM
ItemId
Name
ITEMRELATIONS
ItemId
RelationId
Example Sql I would Like to create
Give me the COUNT of RELATIONs for ITEMs having a name of 'bob':
select ir.itemId, count(ir.relationId)
from ItemRelations ir
inner join (select itemId from Items where name = 'bob') sq
on ir.itemId = sq.itemId
group by ir.itemId
The base Nhibernate QueryOver
var bobItems = QueryOver.Of<Item>(() => itemAlias)
.Where(() => itemAlias.Name == "bob")
.Select(Projections.Id());
var bobRelationCount = session.QueryOver<ItemRelation>(() => itemRelationAlias)
.Inner.Join(/* Somehow join the detached criteria here on the itemId */)
.SelectList(
list =>
list.SelectGroup(() => itemRelationAlias.ItemId)
.WithAlias(() => itemRelationCountAlias.ItemId)
.SelectCount(() => itemRelationAlias.ItemRelationId)
.WithAlias(() => itemRelationCountAlias.Count))
.TransformUsing(Transformers.AliasToBean<ItemRelationCount>())
.List<ItemRelationCount>();
I know it may be possible to refactor this into a single query; however, the above is merely a simple example. I cannot change the detached QueryOver, as it is handed to my bit of code and is used in other parts of the system.
Does anyone know if it is possible to do an inner join on a detached criteria?

MySql 5.6.5 has addressed the performance issue related to the query structure.
See here: http://bugs.mysql.com/bug.php?id=42259
No need for me to change the output format of my NHibernate queries anymore. :)
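For anyone comparing the two query shapes from the question, here is a minimal sketch that runs both against the same data. It assumes an in-memory SQLite database as a stand-in for MySQL (so it says nothing about MySQL performance) and invented sample rows; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (itemId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ItemRelations (itemId INTEGER, relationId INTEGER);
    INSERT INTO Items VALUES (1, 'bob'), (2, 'alice'), (3, 'bob');
    INSERT INTO ItemRelations VALUES (1, 10), (1, 11), (2, 12), (3, 13);
""")

# Shape 1: the straight IN subquery.
in_rows = conn.execute("""
    SELECT ir.itemId, COUNT(ir.relationId)
    FROM ItemRelations ir
    WHERE ir.itemId IN (SELECT itemId FROM Items WHERE name = 'bob')
    GROUP BY ir.itemId
    ORDER BY ir.itemId
""").fetchall()

# Shape 2: INNER JOIN on the subquery used as a derived table.
join_rows = conn.execute("""
    SELECT ir.itemId, COUNT(ir.relationId)
    FROM ItemRelations ir
    INNER JOIN (SELECT itemId FROM Items WHERE name = 'bob') sq
        ON ir.itemId = sq.itemId
    GROUP BY ir.itemId
    ORDER BY ir.itemId
""").fetchall()

print(in_rows)    # counts per 'bob' item via IN
print(join_rows)  # same counts via the derived-table join
```

Both shapes return the same rows here because itemId is unique in Items; which one is faster depends on the database version and optimizer.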

Related

"NOT IN" for Active Record

I have a MySQL query that I am trying to chain a "NOT IN" at the end of it.
Here is what it looks like in ruby using Active Record:
not_in = find_by_sql("SELECT parent_dimension_id FROM relations WHERE relation_type_id = 6;").map(&:parent_dimension_id)
joins('INNER JOIN dimensions ON child_dimension_id = dimensions.id')
.where(relation_type_id: model_relation_id,
parent_dimension_id: sub_type_ids,
child_dimension_id: model_type)
.where.not(parent_dimension_id: not_in)
So the SQL query I'm trying to do looks like this:
INNER JOIN dimensions ON child_dimension_id = dimensions.id
WHERE relations.relation_type_id = 5
AND relations.parent_dimension_id
NOT IN(SELECT parent_dimension_id FROM relations WHERE relation_type_id = 6);
Can someone confirm what I should use for that query?
Do I chain on where.not?
If you really do want
SELECT parent_dimension_id
FROM relations
WHERE relation_type_id = 6
as a subquery, you just need to convert that SQL to an ActiveRecord relation:
Relation.select(:parent_dimension_id).where(:relation_type_id => 6)
then use that as a value in a where call the same way you'd use an array:
not_parents = Relation.select(:parent_dimension_id).where(:relation_type_id => 6)
Relation.joins('...')
.where(relation_type_id: model_relation_id, ...)
.where.not(parent_dimension_id: not_parents)
When you use an ActiveRecord relation as a value in a where and that relation selects a single column:
r = M1.select(:one_column).where(...)
M2.where(:column => r)
ActiveRecord is smart enough to inline r's SQL as an in (select one_column ...) rather than doing two queries.
You could probably replace your:
joins('INNER JOIN dimensions ON child_dimension_id = dimensions.id')
with a simpler joins(:some_relation) if your relations are set up too.
You can feed where clauses with values or arrays of values, in which case they will be translated into in (?) clauses.
Thus, the last part of your query could contain a mapping:
.where.not(parent_dimension_id: Relation.where(relation_type_id: 6).map(&:parent_dimension_id))
Or you can prepare a statement:
.where('parent_dimension_id not in (?)', Relation.where(relation_type_id: 6).map(&:parent_dimension_id))
which amounts to the same thing.
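To make the difference between the two answers concrete, here is a rough sketch at the SQL level. It assumes an in-memory SQLite database and invented rows, with a relations table shaped like the one in the question. The first query inlines the subquery (one round trip); the second mimics the map approach by fetching the ids first and binding them as a list (two round trips).

```python
import sqlite3

# SQLite stand-in with invented data; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE relations (
        id INTEGER PRIMARY KEY,
        relation_type_id INTEGER,
        parent_dimension_id INTEGER,
        child_dimension_id INTEGER
    );
    INSERT INTO relations (relation_type_id, parent_dimension_id, child_dimension_id) VALUES
        (5, 100, 1),
        (5, 101, 2),
        (6, 101, 3),
        (5, 102, 4);
""")

# One round trip: the NOT IN list is a subquery evaluated by the database.
subquery_rows = conn.execute("""
    SELECT id, parent_dimension_id FROM relations
    WHERE relation_type_id = 5
      AND parent_dimension_id NOT IN (
          SELECT parent_dimension_id FROM relations WHERE relation_type_id = 6)
    ORDER BY id
""").fetchall()

# Two round trips: fetch the ids into the application, then bind them one by one.
excluded = [row[0] for row in conn.execute(
    "SELECT parent_dimension_id FROM relations WHERE relation_type_id = 6")]
placeholders = ", ".join("?" for _ in excluded)
array_rows = conn.execute(f"""
    SELECT id, parent_dimension_id FROM relations
    WHERE relation_type_id = 5
      AND parent_dimension_id NOT IN ({placeholders})
    ORDER BY id
""", excluded).fetchall()

print(subquery_rows)
print(array_rows)
```

The results match, but the second form grows one bound parameter per excluded id, which is why passing the relation itself is usually the tidier option when the exclusion list can get long.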

INNER JOIN Results from Select Statement using Doctrine QueryBuilder

Can you use Doctrine QueryBuilder to INNER JOIN a temporary table from a full SELECT statement that includes a GROUP BY?
The ultimate goal is to select the best version of a record. I have a viewVersion table that has multiple versions with the same viewId value but different timeMod. I want to find the version with the latest timeMod (and do a lot of other complex joins and filters on the query).
Initially people assume you can do a GROUP BY viewId and then ORDER BY timeMod, but ORDER BY has no effect on GROUP BY, and MySQL will return random results. There are a ton of answers out there (e.g. here) that explain the problem with using GROUP and offer a solution, but I am having trouble interpreting the Doctrine docs to find a way to implement the SQL with Doctrine QueryBuilder (if it's even possible). Why don't I just use DQL? I may have to, but I have a lot of dynamic filters and joins that are much easier to do with QueryBuilder, so I wanted to see if that's possible.
Sample MySQL to Reproduce in Doctrine QueryBuilder
SELECT vv.*
FROM view_version vv
#inner join only returns where the result sets overlap, i.e. one record
INNER JOIN (
SELECT MAX(timeMod) maxTimeMod, viewId
FROM view_version
GROUP BY viewId
) version ON version.viewId = vv.viewId AND vv.timeMod = version.maxTimeMod
#join other tables for filter, etc
INNER JOIN view v ON v.id = vv.viewId
INNER JOIN content_type c ON c.id = v.contentTypeId
WHERE vv.siteId=1
AND v.contentTypeId IN (2)
ORDER BY vv.title ASC;
Theoretical Solution via Query Builder (not working)
I am thinking that the JOIN needs to inject a DQL statement, e.g.
$em = $this->getDoctrine()->getManager();
$viewVersionRepo = $em->getRepository('GutensiteCmsBundle:View\ViewVersion');
$queryMax = $viewVersionRepo->createQueryBuilder('v')
->addSelect('MAX(v.timeMod) AS timeModMax')
->addSelect('v.viewId')
->groupBy('v.viewId');
$queryBuilder = $viewVersionRepo->createQueryBuilder('vv')
// I tried putting the query in a parenthesis, to no avail
->join('('.$queryMax->getDQL().')', 'version', 'WITH', 'vv.viewId = version.viewId AND vv.timeMod = version.timeModMax')
// Join other Entities
->join('vv.view', 'view')
->addSelect('view')
->join('view.contentType', 'contentType')
->addSelect('contentType')
// Perform random filters
->andWhere('vv.siteId = :siteId')->setParameter('siteId', 1)
->andWhere('view.contentTypeId IN(:contentTypeId)')->setParameter('contentTypeId', $contentTypeIds)
->addOrderBy('vv.title', 'ASC');
$query = $queryBuilder->getQuery();
$results = $query->getResult();
My code (which may not match the above example perfectly) outputs:
SELECT e, view, contentType
FROM Gutensite\CmsBundle\Entity\View\ViewVersion e
INNER JOIN (
SELECT MAX(v.timeMod) AS timeModMax, v.viewId
FROM Gutensite\CmsBundle\Entity\View\ViewVersion v
GROUP BY v.viewId
) version WITH vv.viewId = version.viewId AND vv.timeMod = version.timeModMax
INNER JOIN e.view view
INNER JOIN view.contentType contentType
WHERE e.siteId = :siteId
AND view.contentTypeId IN (:contentTypeId)
ORDER BY e.title ASC
This Answer seems to indicate that it's possible in other contexts like IN statements, but when I try the above method in the JOIN, I get the error:
[Semantical Error] line 0, col 90 near '(SELECT MAX(v.timeMod)': Error: Class '(' is not defined.
A big thanks to @AdrienCarniero for his alternative query structure, which picks the highest version with a simple self-JOIN on rows whose timeMod is greater than the entity's timeMod, keeping only the rows where no such newer row exists.
Alternative Query
SELECT view_version.*
FROM view_version
#left join to look for a newer version of the same view
LEFT JOIN view_version AS best_version ON best_version.viewId = view_version.viewId AND best_version.timeMod > view_version.timeMod
#join other tables for filter, etc
INNER JOIN view ON view.id = view_version.viewId
INNER JOIN content_type ON content_type.id = view.contentTypeId
WHERE view_version.siteId=1
# LIMIT Best Version
AND best_version.timeMod IS NULL
AND view.contentTypeId IN (2)
ORDER BY view_version.title ASC;
Using Doctrine QueryBuilder
$em = $this->getDoctrine()->getManager();
$viewVersionRepo = $em->getRepository('GutensiteCmsBundle:View\ViewVersion');
$queryBuilder = $viewVersionRepo->createQueryBuilder('vv')
// Join Best Version
->leftJoin('GutensiteCmsBundle:View\ViewVersion', 'bestVersion', 'WITH', 'bestVersion.viewId = vv.viewId AND bestVersion.timeMod > vv.timeMod')
// Join other Entities
->join('vv.view', 'view')
->addSelect('view')
->join('view.contentType', 'contentType')
->addSelect('contentType')
// Perform random filters
->andWhere('vv.siteId = :siteId')->setParameter('siteId', 1)
// LIMIT Joined Best Version
->andWhere('bestVersion.timeMod IS NULL')
->andWhere('view.contentTypeId IN(:contentTypeId)')->setParameter('contentTypeId', $contentTypeIds)
->addOrderBy('vv.title', 'ASC');
$query = $queryBuilder->getQuery();
$results = $query->getResult();
In terms of performance, it really depends on the dataset. See this discussion for details.
TIP: The table should include indexes on both these values (viewId and timeMod) to speed up results. I don't know if it would also benefit from a single index on both fields.
A native SQL query using the original JOIN method may be better in some cases, but compiling the query over an extended range of code that dynamically creates it, and getting the mappings correct is a pain. So this is at least an alternative solution that I hope helps others.
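As a sanity check that the two query structures select the same "latest version" rows, here is a small sketch. It assumes an in-memory SQLite database instead of MySQL, a cut-down view_version table with only the columns that matter, and invented rows; the derived-table alias latest is also just an illustrative name.

```python
import sqlite3

# SQLite stand-in; only viewId, timeMod and title are modelled, with invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE view_version (
        id INTEGER PRIMARY KEY, viewId INTEGER, timeMod INTEGER, title TEXT);
    INSERT INTO view_version (viewId, timeMod, title) VALUES
        (1, 100, 'home v1'),
        (1, 200, 'home v2'),
        (2, 150, 'about v1'),
        (2, 120, 'about v0');
""")

# Original structure: join against a grouped derived table holding MAX(timeMod).
derived_rows = conn.execute("""
    SELECT vv.title
    FROM view_version vv
    INNER JOIN (
        SELECT viewId, MAX(timeMod) AS maxTimeMod
        FROM view_version
        GROUP BY viewId
    ) latest ON latest.viewId = vv.viewId AND vv.timeMod = latest.maxTimeMod
    ORDER BY vv.title
""").fetchall()

# Alternative structure: self LEFT JOIN looking for a newer row, keep rows with none.
anti_join_rows = conn.execute("""
    SELECT vv.title
    FROM view_version vv
    LEFT JOIN view_version best
        ON best.viewId = vv.viewId AND best.timeMod > vv.timeMod
    WHERE best.timeMod IS NULL
    ORDER BY vv.title
""").fetchall()

print(derived_rows)
print(anti_join_rows)
```

The second form needs no derived table, which is what makes it expressible through a plain entity join in QueryBuilder. Note that both forms return more than one row per viewId if two versions share the same maximum timeMod.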

Syntax error in complex SQL Query condition

I am having some trouble with my sql statement.
Here is a picture of the relevant tables:
A product can be in multiple categories.
A single product can have multiple varietycategories (ie: size, color, etc)
A varietycategory can have multiple varietycategoryoptions (ie: small, medium, large)
The column searchcriteria.criterianame loosely relates to varietycategory.category.
The column searchcriteriaoption.criteriaoption loosely relates to varietycategoryoption.descriptor.
I get the searchcriteria.criterianame and use that string as the value we want to match with varietycategory.category and we also have to get the various searchcriteriaoption.criteriaoption strings (for that searchcriteria.criterianame) and match that against varietycategoryoption.descriptor for that varietycategory.category.
Here is the sql:
SELECT DISTINCT categories.*, product.*
FROM (categories, product, product_category)
LEFT JOIN varietycategory ON varietycategory.productid = product.id
LEFT JOIN varietycategoryoption ON varietycategoryoption.varietycategoryid = varietycategory.id
WHERE product_category.categoryid=4
AND product.id=product_category.productid
AND categories.category_id=product_category.categoryid
AND (
(varietycategory.category = 'color' AND (varietycategoryoption.descriptor='red' OR varietycategoryoption.descriptor='blue'))
OR
(varietycategory.category = 'size' AND (varietycategoryoption.descriptor = 'small' OR varietycategoryoption.descriptor='medium'))
)
but I get an error:
Unknown column 'varietycategory.id' in 'on clause'
I have tried to figure out what I am doing wrong. I tried simplifying the query a bit (just to try and determine what part of the sql query was causing the problem) to only match the searchcriteria.category string with the varietycategory.category and the query returns the data set correctly.
Here is the working query (this query is simplified and insufficient):
SELECT DISTINCT categories.*, product.*
FROM (categories, product, product_category)
LEFT JOIN varietycategory ON varietycategory.productid = product.id
WHERE product_category.categoryid=4
AND product.id=product_category.productid
AND categories.category_id=product_category.categoryid
AND (varietycategory.category = 'color' OR varietycategory.category = 'size' OR varietycategory.category='shape');
But I also need to be able to match against the varietycategoryoptions as well.
Just to avoid confusion, I am only using searchcriteria to get the field category and use it as a string to match against the varietycategory.category
and I am only using searchcriteriaoption to get the field criteriaoption and use it as a string to match against varietycategoryoption.descriptor
Does anyone know what I am doing wrong with my 1st query?
Please help, as SQL is not my area of expertise.
Thank you!
The error is at:
OR
(varietycategory.category = 'size' (varietycategoryoption.desciptor = 'small' OR varietycategoryoption.descriptor='medium'))
^
|
An operator (AND, OR) is missing here
This has nothing to do with the join syntax, by the way.
Do not mix implicit and explicit joins. Your query should look like:
SELECT DISTINCT c.*, p.*
FROM product_category pc join
categories c
on c.category_id = pc.categoryid join
product p
on p.id = pc.productid join
varietycategory vc
ON vc.productid = p.id
WHERE pc.categoryid = 4 AND
vc.category in ('color', 'size', 'shape');
You probably don't need the distinct, but that depends on the data. The left join is unnecessary because you are filtering on the second table in the where.
A simple rule: Never use commas in the from clause. To help, MySQL has scoping rules that can cause queries to break when you mix implicit and explicit join syntax.
The problem was a misnamed column on the varietycategory table: I had named its primary key vcid, whereas I almost always name my tables' primary keys "id".
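For completeness, here is a sketch of the full query (including the option matching the question asks for) written with explicit joins only. It assumes an in-memory SQLite database, invented sample rows, and a schema guessed from the column names in the question; in particular it assumes varietycategory's primary key is called id, which was not the case in the asker's real table.

```python
import sqlite3

# Hypothetical schema reconstructed from the question's column names; data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_category (productid INTEGER, categoryid INTEGER);
    CREATE TABLE varietycategory (
        id INTEGER PRIMARY KEY, productid INTEGER, category TEXT);
    CREATE TABLE varietycategoryoption (
        id INTEGER PRIMARY KEY, varietycategoryid INTEGER, descriptor TEXT);

    INSERT INTO categories VALUES (4, 'shirts');
    INSERT INTO product VALUES (1, 'tee'), (2, 'polo');
    INSERT INTO product_category VALUES (1, 4), (2, 4);
    INSERT INTO varietycategory VALUES (1, 1, 'color'), (2, 1, 'size'), (3, 2, 'color');
    INSERT INTO varietycategoryoption VALUES
        (1, 1, 'red'), (2, 1, 'blue'), (3, 2, 'small'), (4, 3, 'green');
""")

# Explicit joins throughout; each category is paired with its allowed descriptors.
rows = conn.execute("""
    SELECT DISTINCT c.category_id, p.id, p.name
    FROM product_category pc
    JOIN categories c ON c.category_id = pc.categoryid
    JOIN product p ON p.id = pc.productid
    JOIN varietycategory vc ON vc.productid = p.id
    JOIN varietycategoryoption vco ON vco.varietycategoryid = vc.id
    WHERE pc.categoryid = 4
      AND ((vc.category = 'color' AND vco.descriptor IN ('red', 'blue'))
        OR (vc.category = 'size' AND vco.descriptor IN ('small', 'medium')))
    ORDER BY p.id
""").fetchall()

print(rows)  # only the product with a matching color or size option
```

Here DISTINCT does matter, because a product matching several options would otherwise appear once per matching option row.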

Performing Join with Multiple Criteria in Propel 1.5

This question follows on from the questions here and here.
I have recently upgraded to Propel 1.5, and have started using its Query features over Criteria. I have a query I cannot translate, however - a left join with multiple criteria:
SELECT * FROM person
LEFT JOIN group_membership ON
person.id = group_membership.person_id
AND group_id = 1
WHERE group_membership.person_id is null;
Its aim is to find all people not in the specified group. Previously I was using the following code to accomplish this:
$criteria->addJoin(array(
self::ID,
GroupMembershipPeer::GROUP_ID,
), array(
GroupMembershipPeer::PERSON_ID,
$group_id,
),
Criteria::LEFT_JOIN);
$criteria->add(GroupMembershipPeer::PERSON_ID, null, Criteria::EQUAL);
I considered performing a query for all people in that group, getting the primary keys and adding a NOT IN on the array, but there didn't seem a particularly easy way to get the primary keys from a find, and it didn't seem very elegant.
An article on codenugget.org details how to add extra criteria to a join, which I attempted:
$result = $this->leftJoin('GroupMembership');
$result->getJoin('GroupMembership')
->addCondition(GroupMembershipPeer::GROUP_ID, $group->getId());
return $result
->useGroupMembershipQuery()
->filterByPersonId(null)
->endUse();
Unfortunately, the 'useGroupMembershipQuery' overrides the left join. To solve this, I tried the following code:
$result = $this
->useGroupMembershipQuery('GroupMembership', Criteria::LEFT_JOIN)
->filterByPersonId(null)
->endUse();
$result->getJoin('GroupMembership')
->addCondition(GroupMembershipPeer::GROUP_ID, $group->getId());
return $result;
For some reason this results in a cross join being performed:
SELECT * FROM `person`
CROSS JOIN `group_membership`
LEFT JOIN group_membership GroupMembership ON
(person.ID=GroupMembership.PERSON_ID
AND group_membership.GROUP_ID=3)
WHERE group_membership.PERSON_ID IS NULL
Does anyone know why this might be doing this, or how one might perform this join successfully in Propel 1.5, without having to resort to Criteria, again?
Propel 1.6 supports multiple criteria on joins with addJoinCondition(). If you update the Symfony plugin, or move to sfPropelORMPlugin, you can take advantage of that. The query can then be written like this:
return $this
->leftJoin('GroupMembership')
->addJoinCondition('GroupMembership', 'GroupMembership.GroupId = ?', $group->getId())
->where('GroupMembership.PersonId IS NULL');
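The reason the group condition has to live in the ON clause rather than in WHERE can be seen with a small sketch. It assumes an in-memory SQLite database and invented rows; only the table and column names are taken from the question.

```python
import sqlite3

# SQLite stand-in with invented people and memberships.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE group_membership (person_id INTEGER, group_id INTEGER);
    INSERT INTO person VALUES (1, 'ann'), (2, 'ben'), (3, 'cat');
    INSERT INTO group_membership VALUES (1, 1), (2, 2), (3, 1), (3, 2);
""")

group_id = 1

# Group condition inside ON: people with no membership row for this group
# come back with NULLs on the right side, and the WHERE keeps exactly those.
not_in_group = conn.execute("""
    SELECT person.name
    FROM person
    LEFT JOIN group_membership
        ON person.id = group_membership.person_id
       AND group_membership.group_id = ?
    WHERE group_membership.person_id IS NULL
    ORDER BY person.name
""", (group_id,)).fetchall()

# Group condition moved to WHERE: no row can have group_id = 1 and a NULL
# person_id at the same time, so the result is always empty.
condition_in_where = conn.execute("""
    SELECT person.name
    FROM person
    LEFT JOIN group_membership
        ON person.id = group_membership.person_id
    WHERE group_membership.group_id = ?
      AND group_membership.person_id IS NULL
""", (group_id,)).fetchall()

print(not_in_group)
print(condition_in_where)
```

This is why a plain filter on the joined query cannot replace the extra join condition: the filter runs after the join and discards the NULL-extended rows the anti-join relies on.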

linq to sql where in

I want to translate query like this:
SELECT * FROM Product WHERE Product.ID in (SELECT Product.ID FROM other_table)
into LINQ. I read about using the Contains method, but the problem is that it generates a parameter for each id passed in, like this:
WHERE [Product].[ID] IN (@p0, @p1)
If I had, for example, one billion parameters to pass into my query, the server would not be able to execute such a long query. Is it possible to write the LINQ query in such a way that the generated SQL will be close to the original?
Thanks,
Romek
If you are using large tables then IN statements are a bad idea; they are very slow. You should be doing joins.
Anyway, here is what you want;
using(dbDataContext db = new dbDataContext())
{
var result = from p in db.products
join o in db.other_table
on p.ID equals o.ID
select p;
}
You should be able to use join for this.
other_Table.Join(product, ot => ot.Id, pd => pd.Id, (ot, pd) => pd);
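One caveat when swapping IN for a join: the two are only interchangeable if the ids in other_table are unique. A quick sketch of the difference at the SQL level, assuming an in-memory SQLite database and invented rows (table names follow the question):

```python
import sqlite3

# SQLite stand-in; other_table deliberately contains a repeated ID.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE other_table (ID INTEGER);
    INSERT INTO Product VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
    INSERT INTO other_table VALUES (1), (3), (3);
""")

# IN with a subquery: no per-id parameters, each product appears at most once.
in_rows = conn.execute("""
    SELECT p.ID, p.Name
    FROM Product p
    WHERE p.ID IN (SELECT o.ID FROM other_table o)
    ORDER BY p.ID
""").fetchall()

# JOIN: also parameter-free, but a product repeats once per matching row.
join_rows = conn.execute("""
    SELECT p.ID, p.Name
    FROM Product p
    JOIN other_table o ON p.ID = o.ID
    ORDER BY p.ID
""").fetchall()

print(in_rows)
print(join_rows)
```

If the joined table can contain duplicates, the join version needs a DISTINCT (or a Distinct() call on the LINQ side) to match the IN semantics.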