SELECT with WHERE NOT IN condition without subquery - mysql

I wonder whether it is possible to get the same results as the following query without using a subquery:
SELECT p1.id
FROM products p1
WHERE NOT p1.id IN (
SELECT p2.id
FROM products p2
JOIN product_translations t
ON t.product_id = p2.id
AND t.locale = 'fr'
);
As you can see, a products row can have many product_translations rows (0..n).
The expected result must contain only the products that do not have a product_translations row with locale fr.

The join inside the sub-query is not needed. Your query thus equals:
SELECT id
FROM products
WHERE id NOT IN
(
SELECT product_id
FROM product_translations
WHERE locale = 'fr'
);
Without a sub-query you would have to outer join the translation table and eliminate the matches:
SELECT p.id
FROM products p
LEFT JOIN product_translations pt ON pt.product_id = p.id AND pt.locale = 'fr'
WHERE pt.product_id IS NULL;
Edit: Just for completeness' sake here is the NOT EXISTS version (also needing a sub-query of course):
SELECT id
FROM products
WHERE NOT EXISTS
(
SELECT *
FROM product_translations
WHERE product_id = products.id
AND locale = 'fr'
);
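To check that the three forms above really agree, here is a minimal sketch that runs them against a tiny in-memory database. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the three sample products are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY);
CREATE TABLE product_translations (
    product_id INTEGER NOT NULL REFERENCES products(id),
    locale TEXT NOT NULL
);
INSERT INTO products (id) VALUES (1), (2), (3);
-- hypothetical data: product 1 has fr and en, product 2 en only, product 3 nothing
INSERT INTO product_translations (product_id, locale)
VALUES (1, 'fr'), (1, 'en'), (2, 'en');
""")

queries = {
    "not_in": """
        SELECT id FROM products
        WHERE id NOT IN (SELECT product_id FROM product_translations
                         WHERE locale = 'fr')
        ORDER BY id""",
    "left_join": """
        SELECT p.id FROM products p
        LEFT JOIN product_translations pt
               ON pt.product_id = p.id AND pt.locale = 'fr'
        WHERE pt.product_id IS NULL
        ORDER BY p.id""",
    "not_exists": """
        SELECT id FROM products
        WHERE NOT EXISTS (SELECT * FROM product_translations
                          WHERE product_id = products.id AND locale = 'fr')
        ORDER BY id""",
}

# Run each variant and collect the product ids it returns.
results = {name: [row[0] for row in conn.execute(sql)]
           for name, sql in queries.items()}
for name, ids in results.items():
    print(name, ids)
```

All three variants return products 2 and 3 here. One caveat: product_id is declared NOT NULL in this sketch. If the sub-query could return a NULL, the NOT IN form would return no rows at all, while the outer join and NOT EXISTS forms are unaffected.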

Related

Filter products by options

I have the following database structure to store product options.
Now I have a problem filtering products so that only those matching all of the given options are returned. First I tried WHERE option_id IN (array of options), but that returns products that match any of the options, which is not a solution. The user wants to filter down to only the products with a given material, color, and size, for instance. And if I use WHERE option_id = 4 AND option_id = 6, for instance, I get nothing.
Here is my query:
SELECT DISTINCT p.id AS id,
...
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id = 1 -- only to give the idea
GROUP BY id
ORDER BY id DESC
LIMIT 0, 12
Just to mention, it is a PHP application, where the user selects options from a select element with or without the multiple attribute.
How can I accomplish this?
You can use having:
SELECT p.id AS id, ...
FROM products p JOIN
product_categories pc
ON p.id = pc.product_id LEFT JOIN
product_images pi
ON p.id = pi.product_id JOIN
product_options po
ON p.id = po.product_id
WHERE p.product_active = 1 AND
po.option_id IN (4, 6)
GROUP BY p.id
HAVING COUNT(DISTINCT po.option_id) = 2
ORDER BY p.id DESC
LIMIT 0, 12;
The HAVING clause is specifying that a given id has two matching options. Because of the WHERE clause, these are the only two options that you care about.
I didn't change your approach (you didn't supply the complete query), but you are doing joins along different dimensions -- categories, images, and options. This creates a Cartesian product for each product, and that is often not the best approach to such a query.
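The effect of joining along several dimensions can be seen in a small sketch. It uses Python's sqlite3 module in place of MySQL, a cut-down schema, and invented sample rows (all assumptions for illustration). It compares COUNT(DISTINCT po.option_id) with a plain COUNT(*) in the HAVING clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, product_active INTEGER NOT NULL);
CREATE TABLE product_images (product_id INTEGER NOT NULL, filename TEXT NOT NULL);
CREATE TABLE product_options (product_id INTEGER NOT NULL, option_id INTEGER NOT NULL);
INSERT INTO products VALUES (1, 1), (2, 1), (3, 1);
-- hypothetical data: products 1 and 2 have two images each, product 3 has none
INSERT INTO product_images VALUES (1, 'a.jpg'), (1, 'b.jpg'), (2, 'c.jpg'), (2, 'd.jpg');
-- product 1 has options 4 and 6, product 2 only 4, product 3 has 4, 6 and 7
INSERT INTO product_options VALUES (1, 4), (1, 6), (2, 4), (3, 4), (3, 6), (3, 7);
""")

# Same query twice; only the HAVING condition differs.
base = """
    SELECT p.id
    FROM products p
    LEFT JOIN product_images img ON p.id = img.product_id
    JOIN product_options po ON p.id = po.product_id
    WHERE p.product_active = 1 AND po.option_id IN (4, 6)
    GROUP BY p.id
    HAVING {having}
    ORDER BY p.id
"""
distinct_ids = [r[0] for r in conn.execute(
    base.format(having="COUNT(DISTINCT po.option_id) = 2"))]
plain_ids = [r[0] for r in conn.execute(
    base.format(having="COUNT(*) = 2"))]

print("COUNT(DISTINCT ...):", distinct_ids)
print("COUNT(*):          ", plain_ids)
```

With this data the DISTINCT version returns products 1 and 3, the ones that have both options. The COUNT(*) version returns 2 and 3: the image join doubles the rows, so product 2 (one option, two images) slips through and product 1 (two options, two images, four rows) is dropped.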
There is no need for LEFT JOIN in the solution.
SELECT DISTINCT p.id AS id
FROM products p
JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN (1, 2, 3)
GROUP BY p.id
HAVING COUNT(po.option_id) = 3
My solution keeps only the tables necessary to find the products with the specified options.
In case you want products having exactly these options and no others, you can use NOT EXISTS:
SELECT DISTINCT p.id AS id
FROM products p
JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1 AND
po.option_id IN (1, 2, 3) and
NOT EXISTS (
SELECT 1
FROM product_options po2
WHERE p.id = po2.product_id and po2.option_id NOT IN (1, 2, 3)
)
GROUP BY p.id
HAVING COUNT(po.option_id) = 3
If you want to select products according to other conditions (like product categories and so on), then use IN in the WHERE clause. This approach avoids generating duplicate po.option_id values, so the outer query still works correctly even without DISTINCT in COUNT.
SELECT DISTINCT p.id AS id
FROM products p
JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1 AND
po.option_id IN (1, 2, 3) AND
-- use the following IN predicate to select products with specific features without introducing duplicates in your query
p.id IN (
select product_id FROM product_categories WHERE <your_condition>
)
GROUP BY p.id
HAVING COUNT(po.option_id) = 3
You select products with image lists. Something like:
select products.*, group_concat(product_images.id)
Additionally, there may be options that the product must all have. These are criteria that belong in the WHERE clause.
select
p.*,
(select group_concat(image) from product_images i where i.product_id = p.id) as images
from products p
where product_active = 1
and id in
(
select product_id
from product_options
where option_id in (1,3,55,97)
group by product_id
having count(*) = 4 -- four options in this example
);
Thanks guys, I've managed to return exactly what I wanted.
Now I just have a problem with the pagination query for the filtered products.
Final search query:
SELECT DISTINCT p.id AS id,
main_price,
promotion_price,
NEW,
sale,
recommended,
COUNT(pi.filename) AS image_count,
GROUP_CONCAT(DISTINCT pi.filename
ORDER BY pi.main_image DESC, pi.id ASC) AS images,
name_sr,
uri_sr,
description_sr
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY id
HAVING COUNT(DISTINCT po.option_id) = 1
ORDER BY id DESC
LIMIT 0, 12
The pagination query is something like this; I modified it according to the new filter query:
SELECT COUNT(DISTINCT p.id) AS number
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1
If I leave out DISTINCT in SELECT COUNT, I don't get filtered pagination; if I add DISTINCT, I get a number of rows that corresponds to the pagination. I suppose I could wrap all of this in a subquery with another COUNT(*), but I'm not sure whether that is the way to go or whether there is a more efficient and elegant way to do this.
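The subquery idea mentioned above can be tried out in a small sketch. It assumes a simplified schema with only the products and product_options tables, invented sample rows, and Python's sqlite3 module standing in for MySQL. It shows one possible way to get a single total, not necessarily the most efficient one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, product_active INTEGER, main_price REAL);
CREATE TABLE product_options (product_id INTEGER, option_id INTEGER);
-- hypothetical data: product 4 is outside the price range, product 2 lacks option 2
INSERT INTO products VALUES (1, 1, 6000), (2, 1, 7000), (3, 1, 9000), (4, 1, 20000);
INSERT INTO product_options VALUES (1, 1), (1, 2), (2, 1), (3, 1), (3, 2), (4, 1), (4, 2);
""")

filtered = """
    SELECT p.id
    FROM products p
    JOIN product_options po ON p.id = po.product_id
    WHERE p.product_active = 1
      AND po.option_id IN (1, 2)
      AND p.main_price BETWEEN 5250.00 AND 14000.00
    GROUP BY p.id
    HAVING COUNT(DISTINCT po.option_id) = 2
"""

# Counting inside the grouped query yields one row per product, each holding 1.
per_group = conn.execute(
    filtered.replace("SELECT p.id", "SELECT COUNT(DISTINCT p.id)")).fetchall()

# Wrapping the grouped query in a derived table yields a single total.
total = conn.execute(
    f"SELECT COUNT(*) FROM ({filtered}) AS filtered").fetchone()[0]

print(per_group)
print(total)
```

The grouped COUNT returns as many rows as there are matching products, which is why the row count happened to correspond to the pagination. The wrapped query returns the number directly (2 with this sample data).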

How to optimize query with two inner join

Using this query to get the products with words that fulfill all three required word terms (lenovo, laptop, computer):
SELECT t1.productid, t1.name, t1.price FROM
(SELECT p.id AS productid, name, price
FROM products p JOIN productwords pw ON p.id = pw.productid
JOIN words w ON pw.wordid = w.id WHERE w.term = 'lenovo') t1
INNER JOIN
(SELECT p.id AS productid, name, price
FROM products p JOIN productwords pw ON p.id = pw.productid
JOIN words w ON pw.wordid = w.id WHERE w.term = 'laptop') t2
INNER JOIN
(SELECT p.id AS productid, name, price
FROM products p JOIN productwords pw ON p.id = pw.productid
JOIN words w ON pw.wordid = w.id WHERE w.term = 'computer') t3
ON
t1.productid = t2.productid
AND
t1.productid = t3.productid
ORDER BY t1.name
As far as I can see, the query considers the whole words table for each term (the tables have indexes; the database is MySQL).
Can the query be rewritten in a better way, so it will become faster? (the tables contain millions of rows)
For example with subsets, so the 'laptop' search only considers the rows matching 'lenovo' - and the 'computer' search only considers the rows matching first 'lenovo' and then 'laptop'.
Thanks!
You can use the HAVING clause :
SELECT p.id AS productid, name, price
FROM products p
JOIN productwords pw ON p.id = pw.productid
JOIN words w ON pw.wordid = w.id
WHERE w.term in ('lenovo','computer','laptop')
GROUP BY p.id , name, price
HAVING COUNT(DISTINCT w.term) = 3
That is, if I understood the question correctly: it looks like product -> words is a 1:n relation, and as long as no column from the words table is selected, that should work perfectly.
This might be a quicker way of doing it:
SELECT p.id, name, price
FROM products p
where
EXISTS (select null
from productwords pw1
JOIN words w1 ON pw1.wordid = w1.id
where w1.term = 'lenovo'
and p.id = pw1.productid )
and EXISTS (select null
from productwords pw2
JOIN words w2 ON pw2.wordid = w2.id
where w2.term = 'laptop'
and p.id = pw2.productid )
and EXISTS (select null
from productwords pw3
JOIN words w3 ON pw3.wordid = w3.id
where w3.term = 'computer'
and p.id = pw3.productid )
ORDER BY name;
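As a sanity check that the HAVING and EXISTS approaches select the same products, here is a minimal sketch on a toy dataset. It uses Python's sqlite3 module rather than MySQL, and the product names and prices are made up for illustration. It says nothing about which form is faster on millions of rows; that depends on the indexes and the optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE words (id INTEGER PRIMARY KEY, term TEXT);
CREATE TABLE productwords (productid INTEGER, wordid INTEGER);
-- hypothetical data: only product 1 is tagged with all three terms
INSERT INTO products VALUES (1, 'ThinkPad', 900), (2, 'IdeaCentre', 700), (3, 'MacBook', 1500);
INSERT INTO words VALUES (1, 'lenovo'), (2, 'laptop'), (3, 'computer');
INSERT INTO productwords VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 2), (3, 3);
""")

terms = ["lenovo", "laptop", "computer"]

# Variant 1: one pass over the joins, then count the distinct matched terms.
having_sql = """
    SELECT p.id
    FROM products p
    JOIN productwords pw ON p.id = pw.productid
    JOIN words w ON pw.wordid = w.id
    WHERE w.term IN (?, ?, ?)
    GROUP BY p.id
    HAVING COUNT(DISTINCT w.term) = 3
    ORDER BY p.id
"""
having_ids = [r[0] for r in conn.execute(having_sql, terms)]

# Variant 2: one correlated EXISTS per required term.
exists_clause = """EXISTS (SELECT NULL
                           FROM productwords pw
                           JOIN words w ON pw.wordid = w.id
                           WHERE w.term = ? AND pw.productid = p.id)"""
exists_sql = ("SELECT p.id FROM products p WHERE "
              + " AND ".join([exists_clause] * len(terms))
              + " ORDER BY p.id")
exists_ids = [r[0] for r in conn.execute(exists_sql, terms)]

print(having_ids, exists_ids)
```

Both variants return only product 1, the one product linked to all three terms.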

How to remove a row if sub query returns null value?

I have the following query.
select
Product.*,
(
select
group_concat(features.feature_image order by product_features.feature_order)
from product_features
inner join features
on features.id = product_features.feature_id
where
product_features.product_id = Product.id
and product_features.feature_id in(1)
) feature_image
from products as Product
where
Product.main_product_id=1
and Product.product_category_id='1'
I want to bypass the row if feature_image is empty.
Your query looks a bit strange because you are doing most of the work in a subquery:
select p.*, (select group_concat(f.feature_image order by pf.feature_order)
from product_features pf inner join
features f
on f.id = pf.feature_id
where pf.product_id = p.id and pf.feature_id in (1)
) as feature_image
from products p
where p.main_product_id=1 and p.product_category_id='1';
A more common way to phrase the query is as an inner join in the outer query:
select p.*, group_concat(f.feature_image order by pf.feature_order) as feature_image
from products p join
product_features pf
on pf.product_id = p.id and pf.feature_id in (1) join
features f
on f.id = pf.feature_id
where p.main_product_id=1 and p.product_category_id='1'
group by p.id;
This will automatically include only products that have matching features. You would use left outer join to get all products.
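The difference between the inner join and the left outer join can be seen in a small sketch. It runs on Python's sqlite3 module instead of MySQL, with two invented products of which only the first has a matching feature. The ORDER BY inside group_concat is left out here because older SQLite versions do not accept it; that is a limitation of the stand-in, not of the MySQL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, main_product_id INTEGER,
                       product_category_id TEXT);
CREATE TABLE features (id INTEGER PRIMARY KEY, feature_image TEXT);
CREATE TABLE product_features (product_id INTEGER, feature_id INTEGER,
                               feature_order INTEGER);
-- hypothetical data: product 2 has no features at all
INSERT INTO products VALUES (1, 1, '1'), (2, 1, '1');
INSERT INTO features VALUES (1, 'a.png');
INSERT INTO product_features VALUES (1, 1, 1);
""")

# The same query with either JOIN or LEFT JOIN substituted for {join}.
template = """
    SELECT p.id, group_concat(f.feature_image) AS feature_image
    FROM products p
    {join} product_features pf
           ON pf.product_id = p.id AND pf.feature_id IN (1)
    {join} features f ON f.id = pf.feature_id
    WHERE p.main_product_id = 1 AND p.product_category_id = '1'
    GROUP BY p.id
    ORDER BY p.id
"""
inner_rows = conn.execute(template.format(join="JOIN")).fetchall()
left_rows = conn.execute(template.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

The inner join drops product 2 entirely, while the left join keeps it with an empty feature_image.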

Join with where in condition difficulties

I have a product table and a product_attributes table. I want to filter products with the necessary attributes; here is my SQL:
SELECT * FROM product p
INNER JOIN product_attributes p2 ON p.id = p2.product_id
WHERE p2.attribute_id IN (637, 638, 629)
But it gives me all products, even if a product has only one of the attributes (637, for example). I need the products that have all of the given attributes (637, 638, 629).
There's a fairly standard approach:
select * from product
where id in (
SELECT p.id
FROM product p
JOIN product_attributes p2 ON p.id = p2.product_id
AND p2.attribute_id IN (637, 638, 629)
GROUP BY p.id
HAVING COUNT(distinct attribute_id) = 3)
The HAVING clause ensures there were 3 different attribute ids (ie they were all found).
This can be expressed as a straight join (rather than the ID IN(...)), but it's simpler to read and should perform OK like this.
Of slight interest may be the moving of the attribute id condition into the JOIN's ON condition.
This is an example of a "set-within-sets" subquery. I like to solve these with aggregation and the having clause, because this is the most flexible solution:
SELECT p.*
FROM product p join
product_attributes pa
on p.id = pa.product_id
group by p.id
having sum(pa.attribute_id = 637) > 0 and
sum(pa.attribute_id = 638) > 0 and
sum(pa.attribute_id = 629) > 0
An alternative having clause is:
having count(distinct case when pa.attribute_id IN (637, 638, 629)
then pa.attribute_id
end) = 3
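One way to see why the conditional-sum form is flexible is to change a single condition. The sketch below assumes three invented products and uses Python's sqlite3 module as a stand-in for MySQL; like MySQL, SQLite evaluates a comparison to 0 or 1, so it can be summed. It runs the "all three attributes" query and then a variant that requires 637 and 638 but excludes 629.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY);
CREATE TABLE product_attributes (product_id INTEGER, attribute_id INTEGER);
INSERT INTO product VALUES (1), (2), (3);
-- hypothetical data: product 1 has all three, product 2 lacks 629, product 3 has only 637
INSERT INTO product_attributes VALUES
    (1, 637), (1, 638), (1, 629), (2, 637), (2, 638), (3, 637);
""")


def matching_ids(having):
    """Return the ids of products whose attribute set satisfies `having`."""
    sql = f"""
        SELECT p.id
        FROM product p
        JOIN product_attributes pa ON p.id = pa.product_id
        GROUP BY p.id
        HAVING {having}
        ORDER BY p.id
    """
    return [r[0] for r in conn.execute(sql)]


# Each comparison yields 1 or 0, so the sum counts the matching rows.
all_three = matching_ids(
    "sum(pa.attribute_id = 637) > 0 AND "
    "sum(pa.attribute_id = 638) > 0 AND "
    "sum(pa.attribute_id = 629) > 0")
two_but_not_third = matching_ids(
    "sum(pa.attribute_id = 637) > 0 AND "
    "sum(pa.attribute_id = 638) > 0 AND "
    "sum(pa.attribute_id = 629) = 0")

print(all_three, two_but_not_third)
```

Only the last condition changes between the two calls: the first returns product 1, the second returns product 2.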
You can use a query like this:
SELECT * FROM product p
INNER JOIN product_attributes p21
ON p.id = p21.product_id and p21.attribute_id = 637
INNER JOIN product_attributes p22
ON p.id = p22.product_id and p22.attribute_id = 638
INNER JOIN product_attributes p23
ON p.id = p23.product_id and p23.attribute_id = 629

MySQL LEFT JOIN, GROUP BY and ORDER BY not working as required

I have a table
'products' => ('product_id', 'name', 'description')
and a table
'product_price' => ('product_price_id', 'product_id', 'price', 'date_updated')
I want to perform a query something like
SELECT `p`.*, `pp`.`price`
FROM `products` `p`
LEFT JOIN `product_price` `pp` ON `pp`.`product_id` = `p`.`product_id`
GROUP BY `p`.`product_id`
ORDER BY `pp`.`date_updated` DESC
As you can probably guess the price changes often and I need to pull out the latest one. The trouble is I cannot work out how to order the LEFT JOINed table. I tried using some of the GROUP BY functions like MAX() but that would only pull out the column not the row.
Thanks.
It appears that it is impossible to use an ORDER BY on a GROUP BY summarisation. My fundamental logic is flawed. I will need to run the following subquery.
SELECT `p`.*, `pp`.`price` FROM `products` `p`
LEFT JOIN (
SELECT `product_id`, `price` FROM `product_price` ORDER BY `date_updated` DESC
) `pp`
ON `p`.`product_id` = `pp`.`product_id`
GROUP BY `p`.`product_id`;
This will take a performance hit but as it is the same subquery for each row it shouldn't be too bad.
You need to set aliases properly I think and also set what you are joining on:
SELECT p.*, pp.price
FROM products AS p
LEFT JOIN product_price AS pp
ON pp.product_id = p.product_id
GROUP BY p.product_id
ORDER BY pp.date_updated DESC
This will give you the last updated price:
select
p.*, pp.price
from
products p
-- left join both of the following if products may not have an entry in product_price
-- and you would like to see a null price with the product
join
(
select
product_id,
max(date_updated) as max_date_updated
from product_price
group by product_id
) as pp_max
on p.product_id = pp_max.product_id
join product_price pp on
pp.product_id = pp_max.product_id
and pp.date_updated = pp_max.max_date_updated
Mysqlism:
SELECT p.*, MAX(pp.date_updated), pp.price
FROM products p
LEFT JOIN product_price pp ON pp.product_id = p.product_id
GROUP BY p.product_id
Will work on some RDBMS:
SELECT p.*, pp.date_updated, pp.price
FROM products p
LEFT JOIN product_price pp ON pp.product_id = p.product_id
WHERE (p.product_id, pp.date_updated)
in (select product_id, max(date_updated)
from product_price
group by product_id)
Will work on most RDBMS:
SELECT p.*, pp.date_updated, pp.price
FROM products p
LEFT JOIN product_price pp ON pp.product_id = p.product_id
WHERE EXISTS
(
select null -- inspired by Linq-to-SQL style :-)
from product_price
WHERE product_id = p.product_id
group by product_id
HAVING max(date_updated) = pp.date_updated
)
Will work on all RDBMS:
SELECT p.*, pp.date_updated, pp.price
FROM products p
LEFT JOIN
(
select product_id, max(date_updated) as recent
from product_price
group by product_id
) AS latest
ON latest.product_id = p.product_id
LEFT JOIN product_price pp
ON pp.product_id = latest.product_id AND pp.date_updated = latest.recent
And if the intent of nate c's code is just to get one row from product_price, there is no need for a derived table (i.e. join (select product_price_id, max(date_updated) from products_price) as pp_max); he might as well simplify it (i.e. with no need to use the product_price_id surrogate primary key) like the following:
SELECT p.*, pp.date_updated, pp.price
FROM products p
LEFT JOIN product_price pp ON pp.product_id = p.product_id
WHERE pp.date_updated = (select max(date_updated) from product_price)
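The latest-row-per-product idea discussed in this thread (compute max(date_updated) per product in a derived table, then join back to product_price on both columns) can be sketched end to end. The example assumes invented price rows with ISO date strings, and uses Python's sqlite3 module in place of MySQL. It also assumes a product never has two price rows with the same date_updated; otherwise the join would return both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product_price (
    product_price_id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL,
    price REAL NOT NULL,
    date_updated TEXT NOT NULL
);
-- hypothetical data: product 1 has two prices, product 2 one, product 3 none
INSERT INTO products VALUES (1, 'first'), (2, 'second'), (3, 'third');
INSERT INTO product_price (product_id, price, date_updated) VALUES
    (1, 10.0, '2020-01-01'),
    (1, 12.0, '2020-06-01'),
    (2, 5.0, '2020-03-01');
""")

rows = conn.execute("""
    SELECT p.product_id, pp.date_updated, pp.price
    FROM products p
    LEFT JOIN (
        -- newest update date per product
        SELECT product_id, MAX(date_updated) AS recent
        FROM product_price
        GROUP BY product_id
    ) AS latest ON latest.product_id = p.product_id
    -- pick up only the price row that carries that newest date
    LEFT JOIN product_price pp
           ON pp.product_id = latest.product_id
          AND pp.date_updated = latest.recent
    ORDER BY p.product_id
""").fetchall()

for row in rows:
    print(row)
```

Each product appears exactly once: product 1 with its June price, product 2 with its only price, and product 3 with NULLs because both joins are outer joins.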