I have a SELECT statement with a large number of LEFT JOINs, and I want to filter some rows out.
When I take the total record count and subtract the count returned with my NOT LIKE restrictions, I should get the number of records that my restrictions filter out.
But when I negate the restriction to select those filtered-out records directly, I get a different number than calculated.
SQL without restrictions (Record count: 13.251.981)
SELECT p.product_number
FROM product p
LEFT JOIN product_category pc on p.id = pc.product_id
LEFT JOIN product_category_tree pct on p.id = pct.product_id
LEFT JOIN product_configurator_setting pcs on p.id = pcs.product_id
LEFT JOIN product_cross_selling pcs2 on p.id = pcs2.product_id
LEFT JOIN product_cross_selling_assigned_products pcsap on p.id = pcsap.product_id
LEFT JOIN product_cross_selling_translation pcst on pcs2.id = pcst.product_cross_selling_id
LEFT JOIN product_custom_field_set pcfs on p.id = pcfs.product_id
LEFT JOIN product_media pm on p.id = pm.product_id
LEFT JOIN product_option po on p.id = po.product_id
LEFT JOIN product_price pp on p.id = pp.product_id
LEFT JOIN product_property pp2 on p.id = pp2.product_id
LEFT JOIN product_review pr on p.id = pr.product_id
LEFT JOIN product_search_keyword psk on p.id = psk.product_id
LEFT JOIN product_tag pt on p.id = pt.product_id
LEFT JOIN product_translation pt2 on p.id = pt2.product_id
LEFT JOIN product_visibility pv on p.id = pv.product_id
With restriction (Record count: 9.285.545)
WHERE p.product_number NOT LIKE 'SW%'
AND p.product_number NOT LIKE '%.%'
AND pt2.name NOT LIKE '%Gutschein'
AND pt2.name NOT LIKE '%Test%'
With negated restriction (Record count: 100.851)
WHERE p.product_number LIKE 'SW%'
OR p.product_number LIKE '%.%'
OR pt2.name LIKE '%Gutschein'
OR pt2.name LIKE '%Test%';
From my calculations I should get 3.966.436 records that are filtered out (13.251.981 - 9.285.545 = 3.966.436).
But instead I get 100.851.
How is that possible?
The solution for me was actually this WHERE:
WHERE p.product_number < 'SW'
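One plausible cause of the gap, not confirmed in the post: if pt2.name (or p.product_number) is NULL for some rows, for example products without a matching product_translation row after the LEFT JOIN, then both `NOT LIKE` and `LIKE` evaluate to NULL for those rows, so neither the restriction nor its negation returns them. Below is a minimal sketch of that effect using Python's built-in sqlite3 and a made-up two-table schema; the table contents are assumptions for illustration, and SQLite is used only to keep the example self-contained (MySQL treats NULL in `LIKE` the same way).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, product_number TEXT);
    CREATE TABLE product_translation (product_id INTEGER, name TEXT);
    INSERT INTO product VALUES (1, 'A100'), (2, 'SW200'), (3, 'B300');
    -- Product 3 has no translation, so pt2.name is NULL after the LEFT JOIN.
    INSERT INTO product_translation VALUES (1, 'Shirt'), (2, 'Test item');
""")

BASE = """
    SELECT COUNT(*) FROM product p
    LEFT JOIN product_translation pt2 ON p.id = pt2.product_id
"""

def count(where=""):
    """Run the base query with an optional WHERE clause and return the count."""
    return conn.execute(BASE + where).fetchone()[0]

total = count()
restricted = count(
    "WHERE p.product_number NOT LIKE 'SW%' AND pt2.name NOT LIKE '%Test%'"
)
negated = count(
    "WHERE p.product_number LIKE 'SW%' OR pt2.name LIKE '%Test%'"
)
# Adding an explicit IS NULL branch picks up the rows both filters skipped.
null_safe = count(
    "WHERE p.product_number LIKE 'SW%' OR pt2.name LIKE '%Test%' "
    "OR pt2.name IS NULL"
)

print(total, restricted, negated, null_safe)
```

In this toy data the restricted and negated counts do not add up to the total, and the difference is exactly the row whose name is NULL. Checking `COUNT(*) ... WHERE pt2.name IS NULL` against the real tables would show whether the same thing is happening there.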
I am trying to do a left outer join to a subquery, is that possible?
Can I do something like this?:
##this is this weeks targets
select * from targets t
inner join streams s on s.id = t.stream_id
where t.week_no =WEEKOFYEAR(NOW())
left outer join
(
###############This is records selected so far this week
select p.brand_id, p.part_product_family, sum(r.best) from records r
inner join products p on p.id = r.product_id
left outer join streams s on s.body = p.brand_id and s.stream = p.part_product_family
where WEEKOFYEAR(r.date_selected) =WEEKOFYEAR(NOW())
group by p.brand_id, p.part_product_family;
) sq_2
on s.stream = sq_2.part_product_family
This is working:
##this is this weeks targets
select * from targets t
inner join streams s on s.id = t.stream_id
left outer join
(
###############This is records selected so far this week
select p.brand_id, p.part_product_family, sum(r.best) from records r
inner join products p on p.id = r.product_id
left outer join streams s on s.body = p.brand_id and s.stream = p.part_product_family
where WEEKOFYEAR(r.date_selected) =WEEKOFYEAR(NOW()) and YEAR(r.date_selected) = YEAR(now())
group by p.brand_id, p.part_product_family
) sq_2
on s.body = sq_2.brand_id and s.stream = sq_2.part_product_family
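The pattern in the working query, a LEFT OUTER JOIN onto an aggregated subquery, can be tried in isolation. The sketch below uses Python's built-in sqlite3 with a simplified, hypothetical schema (records carry brand_id and part_product_family directly, and the week filter is left out); it is only meant to show that streams without any records still appear, with NULL for the aggregated column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE streams (id INTEGER PRIMARY KEY, body TEXT, stream TEXT);
    CREATE TABLE targets (stream_id INTEGER, target INTEGER);
    CREATE TABLE records (brand_id TEXT, part_product_family TEXT, best INTEGER);
    INSERT INTO streams VALUES (1, 'acme', 'widgets'), (2, 'acme', 'gadgets');
    INSERT INTO targets VALUES (1, 10), (2, 20);
    -- Only the 'widgets' stream has records.
    INSERT INTO records VALUES ('acme', 'widgets', 3), ('acme', 'widgets', 4);
""")

rows = conn.execute("""
    SELECT s.stream, t.target, sq_2.total_best
    FROM targets t
    INNER JOIN streams s ON s.id = t.stream_id
    LEFT OUTER JOIN (
        -- The derived table: one row per brand and product family.
        SELECT r.brand_id, r.part_product_family, SUM(r.best) AS total_best
        FROM records r
        GROUP BY r.brand_id, r.part_product_family
    ) sq_2
      ON s.body = sq_2.brand_id AND s.stream = sq_2.part_product_family
    ORDER BY s.id
""").fetchall()

print(rows)
```

Two details differ from the first attempt above: there is no semicolon inside the parentheses, and no WHERE clause sits between the joins. An outer WHERE, if needed, has to come after the last ON clause.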
I have the following query:
SELECT DISTINCT (
s.styleTitle
), COUNT(p.id) AS `PictureCount`
FROM `style` s
LEFT JOIN `instagram_picture_style` ps ON s.id = ps.style_id
LEFT JOIN `instagram_shop_picture` p ON ps.picture_id = p.id
LEFT JOIN `instagram_picture_category` c ON c.picture_id = p.id
LEFT JOIN `instagram_second_level_category` sl ON c.second_level_category_id = sl.id
WHERE sl.id =25
GROUP BY p.id
ORDER BY PictureCount
However, this query gives me a PictureCount of 1 for every style.
I basically wanted the list to be ordered by the style that has the most pictures in it. What did I do wrong? Why is it giving me 1 for all of the styles? I am pretty sure there are more pictures than that for each style.
ORDER BY doesn't have underscores. But equally important, you are using DISTINCT as though it were a function. It is not. It is a modifier on the SELECT, and it applies to all columns.
You should group by the same column you have in the distinct. Something like this:
SELECT s.styleTitle, COUNT(p.id) AS `PictureCount`
FROM `style` s
LEFT JOIN `instagram_picture_style` ps ON s.id = ps.style_id
LEFT JOIN `instagram_shop_picture` p ON ps.picture_id = p.id
LEFT JOIN `instagram_picture_category` c ON c.picture_id = p.id
LEFT JOIN `instagram_second_level_category` sl ON c.second_level_category_id = sl.id
WHERE sl.id = 25
GROUP BY s.styleTitle
ORDER BY PictureCount DESC;
In fact, you almost never need DISTINCT with GROUP BY. If you are using it, you need to think about why it would be necessary.
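To see why grouping by the picture id yields a count of 1 everywhere, here is a small sketch using Python's built-in sqlite3. The two tables and their contents are made up for illustration (a trimmed-down version of the schema in the question), and SQLite is used only so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE style (id INTEGER PRIMARY KEY, styleTitle TEXT);
    CREATE TABLE picture_style (style_id INTEGER, picture_id INTEGER);
    INSERT INTO style VALUES (1, 'Casual'), (2, 'Formal');
    INSERT INTO picture_style VALUES (1, 10), (1, 11), (1, 12), (2, 20);
""")

# Grouping by the picture id: every group holds exactly one picture,
# so the count is 1 in every row.
per_picture = conn.execute("""
    SELECT s.styleTitle, COUNT(ps.picture_id) AS PictureCount
    FROM style s
    LEFT JOIN picture_style ps ON s.id = ps.style_id
    GROUP BY ps.picture_id
    ORDER BY ps.picture_id
""").fetchall()

# Grouping by the style: one row per style with its picture count.
per_style = conn.execute("""
    SELECT s.styleTitle, COUNT(ps.picture_id) AS PictureCount
    FROM style s
    LEFT JOIN picture_style ps ON s.id = ps.style_id
    GROUP BY s.styleTitle
    ORDER BY PictureCount DESC
""").fetchall()

print(per_picture)
print(per_style)
```

The first result repeats each style once per picture with a count of 1; the second has one row per style, ordered by the number of pictures.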
SELECT p.product_id,p.account_id,i.image_id,a.email,p.title,p.price
FROM products AS p
LEFT OUTER JOIN products_images AS i
ON p.product_id = i.product_id AND i.featured=1 AND i.deleted=0
INNER JOIN accounts AS a
ON p.account_id = a.account_id
MATCH(p.title) AGAINST('+images')
I'm trying to use MATCH for the first time. It says that I have a syntax error, and I am not sure why.
You're missing the WHERE keyword before conditions that aren't part of the join:
SELECT p.product_id,p.account_id,i.image_id,a.email,p.title,p.price
FROM products AS p
LEFT OUTER JOIN products_images AS i
ON p.product_id = i.product_id AND i.featured=1 AND i.deleted=0
INNER JOIN accounts AS a
ON p.account_id = a.account_id
WHERE MATCH(p.title) AGAINST('+images')
I have the following query.
select
Product.*,
(
select
group_concat(features.feature_image order by product_features.feature_order)
from product_features
inner join features
on features.id = product_features.feature_id
where
product_features.product_id = Product.id
and product_features.feature_id in(1)
) feature_image
from products as Product
where
Product.main_product_id=1
and Product.product_category_id='1'
I want to skip the row if feature_image is empty.
Your query looks a bit strange because you are doing most of the work in a subquery:
select p.*, (select group_concat(f.feature_image order by pf.feature_order)
from product_features pf inner join
features f
on f.id = pf.feature_id
where pf.product_id = p.id and pf.feature_id in (1)
) as feature_image
from products p
where p.main_product_id=1 and p.product_category_id='1';
A more common way to phrase the query is as an inner join in the outer query:
select p.*, group_concat(f.feature_image order by pf.feature_order) as feature_image
from products p join
product_features pf
on pf.product_id = p.id and pf.feature_id in (1) join
features f
on f.id = pf.feature_id
where p.main_product_id=1 and p.product_category_id='1'
group by p.id;
This will automatically include only products that have matching features. You would use left outer join to get all products.
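The difference between the two join types can be checked with a small sketch. It uses Python's built-in sqlite3 and a cut-down, hypothetical version of the tables above; the `order by` inside `group_concat` is omitted because older SQLite versions do not accept it, which does not affect the point being shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE features (id INTEGER PRIMARY KEY, feature_image TEXT);
    CREATE TABLE product_features (product_id INTEGER, feature_id INTEGER);
    INSERT INTO products VALUES (1, 'with feature'), (2, 'without feature');
    INSERT INTO features VALUES (1, 'a.png');
    -- Only product 1 has a feature attached.
    INSERT INTO product_features VALUES (1, 1);
""")

# The same query is run twice; only the join keyword changes.
QUERY = """
    SELECT p.id, GROUP_CONCAT(f.feature_image) AS feature_image
    FROM products p
    {join} product_features pf
        ON pf.product_id = p.id AND pf.feature_id IN (1)
    {join} features f
        ON f.id = pf.feature_id
    GROUP BY p.id
    ORDER BY p.id
"""

inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()
left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

With the inner join, the product without a feature image drops out; with the left join it stays, and feature_image is NULL.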
select distinct p.product_id from cscart_products p
left join product_bikes pb on p.product_id = pb.product_id
left join cscart_product_options po on po.product_id = p.product_id
left join cscart_product_option_variants pov on pov.option_id = po.option_id
left join variant_bikes vb on vb.variant_id = pov.variant_id
where pb.bike_id = 111 or vb.bike_id = 111
And:
select distinct p.product_id from cscart_products p
left join product_bikes pb on p.product_id = pb.product_id and pb.bike_id = 111
left join cscart_product_options po on po.product_id = p.product_id
left join cscart_product_option_variants pov on pov.option_id = po.option_id
left join variant_bikes vb on vb.variant_id = pov.variant_id and vb.bike_id = 111
Why do these two queries return different result sets?
The first query has an OR in the WHERE clause:
WHERE pb.bike_id = 111 OR vb.bike_id = 111
The second query effectively has AND instead, via the conditions:
LEFT JOIN product_bikes pb ON p.product_id = pb.product_id AND pb.bike_id = 111
...
LEFT JOIN variant_bikes vb ON vb.variant_id = pov.variant_id AND vb.bike_id = 111
Bonus question: is there a way with joins to make it behave the same and benefit from having smaller joins for performance?
There is a way to write the query, but it isn't necessarily any faster because the method (that I'm thinking of) uses UNION:
select distinct p.product_id from cscart_products p
left join product_bikes pb on p.product_id = pb.product_id and pb.bike_id = 111
left join cscart_product_options po on po.product_id = p.product_id
left join cscart_product_option_variants pov on pov.option_id = po.option_id
left join variant_bikes vb on vb.variant_id = pov.variant_id -- and vb.bike_id = 111
UNION
select distinct p.product_id from cscart_products p
left join product_bikes pb on p.product_id = pb.product_id -- and pb.bike_id = 111
left join cscart_product_options po on po.product_id = p.product_id
left join cscart_product_option_variants pov on pov.option_id = po.option_id
left join variant_bikes vb on vb.variant_id = pov.variant_id and vb.bike_id = 111
There probably is a better way of doing it, such that you have a UNION sub-query, along the lines of:
SELECT DISTINCT p.product_id
FROM cscart_products AS p
LEFT JOIN cscart_product_options AS po ON po.product_id = p.product_id
LEFT JOIN cscart_product_option_variants AS pov ON pov.option_id = po.option_id
LEFT JOIN (SELECT vb.product_id FROM variant_bikes AS vb WHERE vb.bike_id = 111
UNION
SELECT pb.product_id FROM product_bikes AS pb WHERE pb.bike_id = 111
) AS pv ON pv.product_id = p.product_id
Since you aren't (in the example) selecting data from the cscart_product_options or cscart_product_option_variants tables, you could eliminate those from the query. You should also look at whether the LEFT JOIN with the sub-query is appropriate; I think it more likely that you want an inner join. There may well be more work that can be done to improve the performance.
In addition to what Jonathan said: in the first query, the WHERE clause means that you don't get any results unless (pb.bike_id = 111 or vb.bike_id = 111) is true. In the second query, you will get every DISTINCT product_id, even though only some rows will be able to join via the LEFT JOINs.
If you are getting a lot more results from the second query than from the first, that's why. The easiest way to see this is to put more columns in your SELECT, like so:
SELECT p.product_id, pb.bike_id ...
You'll notice if you do that, that the first query will have 111 in every product it displays, but the second query will have a lot of NULL values for pb.bike_id.
Make sense?
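The effect described above can be reproduced with a minimal sketch. It uses Python's built-in sqlite3 and two hypothetical tables that stand in for cscart_products and product_bikes; the data is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY);
    CREATE TABLE product_bikes (product_id INTEGER, bike_id INTEGER);
    INSERT INTO products VALUES (1), (2), (3);
    -- Product 1 fits bike 111, product 2 fits another bike, product 3 none.
    INSERT INTO product_bikes VALUES (1, 111), (2, 222);
""")

# Condition in WHERE: rows where pb.bike_id is not 111 (or is NULL) are removed.
where_rows = conn.execute("""
    SELECT p.product_id, pb.bike_id
    FROM products p
    LEFT JOIN product_bikes pb ON p.product_id = pb.product_id
    WHERE pb.bike_id = 111
    ORDER BY p.product_id
""").fetchall()

# Condition in ON: it only limits which product_bikes rows can match;
# every product is still returned, with NULL where nothing matched.
on_rows = conn.execute("""
    SELECT p.product_id, pb.bike_id
    FROM products p
    LEFT JOIN product_bikes pb
        ON p.product_id = pb.product_id AND pb.bike_id = 111
    ORDER BY p.product_id
""").fetchall()

print(where_rows)
print(on_rows)
```

The WHERE version returns only the product that fits bike 111, while the ON version returns all three products, two of them with NULL for bike_id.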