Filtering on a LEFT JOINED column - mysql

Is there a more efficient way to filter on a joined table as in the following example? Or is this a fine approach? This query returns the desired results, but I am an amateur at MySQL.
I have indexes on products.id, product_details.product_id and product_details.value.
SELECT p.id
FROM products p
LEFT JOIN product_details d
ON d.product_id = p.id
WHERE d.value = 1
OR p.id = 4
Simplified structure as follows:
products table
id (PRIMARY KEY) | name
--------------------------------
1 | Shirt
2 | Shoes
3 | Dress
4 | A product with no corresponding details row
product_details table
product_id (PRIMARY KEY) | value
---------------------------------
1 | 1
2 | 23
3 | 32

This is your query:
SELECT products.id
FROM products LEFT JOIN
product_details
ON product_details.product_id = products.id
WHERE product_details.value = 1 OR products.id = 4;
This is not a bad practice. I do think the query is easier to follow using EXISTS:
SELECT p.id
FROM products p
WHERE p.id = 4 OR
EXISTS (SELECT 1
FROM product_details pd
WHERE pd.product_id = p.id AND pd.value = 1
);
In addition, EXISTS makes it clear that you don't want to return duplicates if there are duplicate matching rows in product_details.
If performance is your main consideration, then EXISTS is probably your best choice, with an index on product_details(product_id, value).
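To make the point about duplicates concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; both queries use only standard SQL). The second details row for product 1 is hypothetical, and the primary key on product_details.product_id is dropped so that such a duplicate can exist at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    -- Hypothetical variant: no primary key here, so duplicate rows are possible.
    CREATE TABLE product_details (product_id INTEGER, value INTEGER);
    INSERT INTO products VALUES (1, 'Shirt'), (2, 'Shoes'), (3, 'Dress'), (4, 'No details');
    INSERT INTO product_details VALUES (1, 1), (1, 1), (2, 23), (3, 32);
""")

# LEFT JOIN version: one output row per matching details row.
join_ids = [row[0] for row in conn.execute("""
    SELECT p.id
    FROM products p
    LEFT JOIN product_details d ON d.product_id = p.id
    WHERE d.value = 1 OR p.id = 4
    ORDER BY p.id
""")]

# EXISTS version: each product is tested once, so it appears at most once.
exists_ids = [row[0] for row in conn.execute("""
    SELECT p.id
    FROM products p
    WHERE p.id = 4
       OR EXISTS (SELECT 1
                  FROM product_details pd
                  WHERE pd.product_id = p.id AND pd.value = 1)
    ORDER BY p.id
""")]

print(join_ids)    # [1, 1, 4]
print(exists_ids)  # [1, 4]
```

With the primary key from the question in place, both queries return the same rows; the difference only shows up once product_details can hold several rows per product.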

A couple of notes:
As a rule of thumb, UNION ALL often performs better than an OR that spans columns from different tables, because each branch can use its own index. It also makes the query easier to read.
Combining a LEFT OUTER JOIN with a predicate on the joined table in the WHERE clause can get you into trouble: the WHERE clause is applied after the join, so it discards the NULL-extended rows and the join effectively behaves like an INNER JOIN.
It seems you always want to pull back any records that have products.id = 4, and also any products that have product_details.value = 1. That looks like two separate queries to me, and splitting them would probably make the query easier to maintain in the future.
SELECT
p.id
FROM
products p
WHERE
p.id = 4
UNION ALL
SELECT
p.id
FROM
product_details pd
JOIN
products p
ON
p.id = pd.product_id
WHERE
pd.value = 1
Source: https://bertwagner.com/posts/or-vs-union-all-is-one-better-for-performance/
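The note about WHERE predicates on a LEFT OUTER JOIN can be checked against the sample data from the question. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL and the `id` column name used in the queries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_details (product_id INTEGER PRIMARY KEY, value INTEGER);
    INSERT INTO products VALUES (1, 'Shirt'), (2, 'Shoes'), (3, 'Dress'), (4, 'No details');
    INSERT INTO product_details VALUES (1, 1), (2, 23), (3, 32);
""")

# Predicate in WHERE: it runs after the join, so products whose details
# are missing or have another value are filtered out.
where_rows = conn.execute("""
    SELECT p.id, d.value
    FROM products p
    LEFT JOIN product_details d ON d.product_id = p.id
    WHERE d.value = 1
    ORDER BY p.id
""").fetchall()

# Predicate in ON: every product is kept, and details are attached
# only when value = 1 (otherwise the details columns are NULL).
on_rows = conn.execute("""
    SELECT p.id, d.value
    FROM products p
    LEFT JOIN product_details d ON d.product_id = p.id AND d.value = 1
    ORDER BY p.id
""").fetchall()

print(where_rows)  # [(1, 1)]
print(on_rows)     # [(1, 1), (2, None), (3, None), (4, None)]
```

The original query only keeps product 4 because of the extra OR p.id = 4; without it, the WHERE clause would reduce the LEFT JOIN to the single matching row.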

Related

MYSQL Query search in relationship

For the sake of clarity in this question, I will rename the tables so things are a bit clearer for everybody, and explain what I want to achieve:
There is an input form with options that return category IDs. A 'Product' can have one or more 'Categories'. I want to find each 'Product' for which all of its categories are inside the array that is passed from the form.
Products table
ID Title
1 Pizza
2 Ice Cream
Categories table
ID Title
1 Baked food
2 Hot food
ProductsCategories table
ID ProductId CategoryId
1 1 1
2 1 2
So if I pass [1,2], the query should return the Product with id 1, since all of its ProductsCategories are inside the requested array, but if I pass only 1 or only 2, the query should return no results.
Currently I have the following query, which works, but for some reason if I create a second Product and create a ProductCategory with the same CategoryId as the first product, the query returns null...
SELECT products.*
FROM products
JOIN products_categories
ON products_categories.product_id= products.id
WHERE products_categories.category_id IN (1, 2)
HAVING COUNT(*) = (select count(*) from products_categories pc
WHERE pc.product_id = products.id)
All help is deeply appreciated! Cheers!
To match all the values in the IN clause, you also need to know the number of categories passed, and use that number in the HAVING clause:
SELECT
p.*,
GROUP_CONCAT(c.title) AS categories
FROM
Products p
INNER JOIN ProductsCategories pc ON pc.productId = p.ID
INNER JOIN Categories c ON c.ID = pc.categoryId
WHERE
pc.categoryId IN (1,2)
GROUP BY
p.id
HAVING
COUNT(DISTINCT pc.categoryId) = 2 -- this is # of unique categories in IN clause
So in case IN (1,2) result is:
+----+-------+---------------------+
| id | title | categories |
+----+-------+---------------------+
| 1 | Pizza | Baked Food,Hot Food |
+----+-------+---------------------+
1 row in set
In case IN (1,3) result is Empty set (no results).
@mitkosoft, thanks for your answer, but sadly the query does not produce the needed results. If the product's categories are only partially in the passed categories, the product is still returned. Additionally, I might not know how many parameters are sent by the form.
Luckily, I managed to create a query that does the trick and works perfectly fine (at least so far):
SELECT products.*,
COUNT(*) as resultsCount,
(SELECT COUNT(*) FROM products_categories pc WHERE pc.product_id = products.id) as categoriesCount
FROM products
JOIN products_categories AS productsCategories
ON productsCategories.product_id= products.id
WHERE productsCategories.category_id IN (7, 15, 8, 1, 50)
GROUP BY products.id
HAVING resultsCount = categoriesCount
ORDER BY amount DESC #optional
That way the query is flexible and gives me exactly what I needed: only those products whose categories are all inside the search parameters (not just partially).
Cheers! :)
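For anyone who wants to try this approach with a variable number of categories, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the helper name `products_fully_covered` and the third products_categories row are hypothetical. It also assumes each (product_id, category_id) pair is stored only once, otherwise COUNT(*) would over-count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE products_categories (
        id INTEGER PRIMARY KEY, product_id INTEGER, category_id INTEGER);
    INSERT INTO products VALUES (1, 'Pizza'), (2, 'Ice Cream');
    -- Row 3 is hypothetical: a second product sharing category 1 with the first.
    INSERT INTO products_categories VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
""")

def products_fully_covered(conn, category_ids):
    """Return ids of products whose categories are all in category_ids."""
    if not category_ids:
        return []  # an empty IN () list is not valid in MySQL
    # One "?" placeholder per category id, so the list length can vary.
    placeholders = ", ".join("?" for _ in category_ids)
    sql = f"""
        SELECT p.id
        FROM products p
        JOIN products_categories pc ON pc.product_id = p.id
        WHERE pc.category_id IN ({placeholders})
        GROUP BY p.id
        -- matched categories must equal the product's total categories
        HAVING COUNT(*) = (SELECT COUNT(*)
                           FROM products_categories all_pc
                           WHERE all_pc.product_id = p.id)
        ORDER BY p.id
    """
    return [row[0] for row in conn.execute(sql, list(category_ids))]

print(products_fully_covered(conn, [1, 2]))  # [1, 2]
print(products_fully_covered(conn, [1]))     # [2]
print(products_fully_covered(conn, [2]))     # []
```

Building the placeholders from the list length keeps the values parameterized while still allowing any number of categories from the form.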

mysql one-to-many query with negation and/or multiple criteria

I thought a query like this would be pretty easy because of the nature of relational databases but it seems to be giving me a fit. I also searched around but found nothing that really helped. Here's the situation:
Let's say I have a simple relationship for products and product tags. This is a one-to-many relationship, so we could have the following:
productid | tag
========================
1 | Car
1 | Black
1 | Ford
2 | Car
2 | Red
2 | Ford
3 | Car
3 | Black
3 | Lexus
4 | Motorcycle
4 | Black
5 | Skateboard
5 | Black
6 | Skateboard
6 | Green
What's the most efficient way to query for all (Ford OR Black OR Skateboard) AND NOT (Motorcycles OR Green)? Another query I'm going to need to do is something like all (Car) or (Skateboard) or (Green AND Motorcycle) or (Red AND Motorcycle).
There are about 150k records in the products table and 600k records in the tags tables, so the query is going to need to be as efficient as possible. Here's one query that I've been messing around with (example #1), but it seems to be taking about 4 seconds or so. Any help would be much appreciated.
SELECT p.productid
FROM products p
JOIN producttags tag1 USING (productid)
WHERE p.active = 1
AND tag1.tag IN ( 'Ford', 'Black', 'Skateboard' )
AND p.productid NOT IN (SELECT productid
FROM producttags
WHERE tag IN ( 'Motorcycle', 'Green' ));
Update
The quickest query I've found so far is something like this. It takes 100-200ms, but it seems pretty inflexible and ugly. Basically I'm grabbing all products that match Ford, Black, or Skateboard. Then I'm concatenating all of the tags for those matched products into a colon-separated string and removing all products that match on :Green: or :Motorcycle:. Any thoughts?
SELECT p.productid,
Concat(':', Group_concat(alltags.tag SEPARATOR ':'), ':') AS taglist
FROM products p
JOIN producttags tag1 USING (productid)
JOIN producttags alltags USING (productid)
WHERE p.active = 1
AND tag1.tag IN ( 'Ford', 'Black', 'Skateboard' )
GROUP BY tag1.productid
HAVING ( taglist NOT LIKE '%:Motorcycle:%'
AND taglist NOT LIKE '%:Green:%' );
I'd write the exclusion join with no subqueries:
SELECT p.productid
FROM products p
INNER JOIN producttags AS t ON p.productid = t.productid
LEFT OUTER JOIN producttags AS x ON p.productid = x.productid
AND x.tag IN ('Motorcycle', 'Green')
WHERE p.active = 1
AND t.tag IN ( 'Ford', 'Black', 'Skateboard' )
AND x.productid IS NULL;
Make sure you have an index on products over the two columns (active, productid) in that order.
You should also have an index on producttags over the two columns (productid, tag) in that order.
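Here is a minimal sketch of the exclusion join run against the sample tags from the question. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and that all six products are active (the question does not list the products table, so those rows are hypothetical).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productid INTEGER PRIMARY KEY, active INTEGER);
    CREATE TABLE producttags (productid INTEGER, tag TEXT);
    CREATE INDEX idx_producttags ON producttags (productid, tag);
    -- Hypothetical: every product is active.
    INSERT INTO products VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1);
    INSERT INTO producttags VALUES
        (1, 'Car'), (1, 'Black'), (1, 'Ford'),
        (2, 'Car'), (2, 'Red'), (2, 'Ford'),
        (3, 'Car'), (3, 'Black'), (3, 'Lexus'),
        (4, 'Motorcycle'), (4, 'Black'),
        (5, 'Skateboard'), (5, 'Black'),
        (6, 'Skateboard'), (6, 'Green');
""")

# x only ever matches excluded tags, so "x.productid IS NULL" keeps
# exactly the products that have none of them.
ids = [row[0] for row in conn.execute("""
    SELECT p.productid
    FROM products p
    INNER JOIN producttags AS t ON p.productid = t.productid
    LEFT OUTER JOIN producttags AS x ON p.productid = x.productid
        AND x.tag IN ('Motorcycle', 'Green')
    WHERE p.active = 1
      AND t.tag IN ('Ford', 'Black', 'Skateboard')
      AND x.productid IS NULL
    ORDER BY p.productid
""")]

print(ids)               # [1, 1, 2, 3, 5, 5]
print(sorted(set(ids)))  # [1, 2, 3, 5]
```

Products 1 and 5 come back twice because each has two of the wanted tags; SELECT DISTINCT p.productid removes the repeats if you need one row per product.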
"Another query I'm going to need to do is something like all (Car) or (Skateboard) or (Green AND Motorcycle) or (Red AND Motorcycle)."
Sometimes these complex conditions are hard for the MySQL optimizer. One common workaround is to use UNION to combine simpler queries:
SELECT p.productid
FROM products p
INNER JOIN producttags AS t1 ON p.productid = t1.productid
WHERE p.active = 1
AND t1.tag IN ('Car', 'Skateboard')
UNION ALL
SELECT p.productid
FROM products p
INNER JOIN producttags AS t1 ON p.productid = t1.productid
INNER JOIN producttags AS t2 ON p.productid = t2.productid
WHERE p.active = 1
AND t1.tag IN ('Motorcycle')
AND t2.tag IN ('Green', 'Red');
PS: Your tagging table is not an Entity-Attribute-Value table.
I would get all the unique ID matches and the unique IDs to filter out, then LEFT JOIN those lists (as per tigeryan) and filter out any IDs that match. The query should also be easier to read and modify by keeping all the queries separate. It should be fairly quick also, although it may not look like it.
SELECT * FROM products p
WHERE
p.active=1 AND
productid IN (
SELECT matches.productid FROM (
SELECT DISTINCT productid FROM producttags
WHERE tag IN ('Ford','Black','Skateboard')
) AS matches
LEFT JOIN (
SELECT DISTINCT productid FROM producttags
WHERE tag IN ('Motorcycle','Green')
) AS filter ON filter.productid=matches.productid
WHERE filter.productid IS NULL
)
Sometimes a JOIN is faster than an IN, depending on how mysql optimizes the query:
SELECT p.* FROM (
SELECT matches.productid FROM (
SELECT DISTINCT productid FROM producttags
WHERE tag IN ('Ford','Black','Skateboard')
) AS matches
LEFT JOIN (
SELECT DISTINCT productid FROM producttags
WHERE tag IN ('Motorcycle','Green')
) AS filter ON filter.productid=matches.productid
WHERE filter.productid IS NULL
) AS idfilter
JOIN products p ON p.productid=idfilter.productid AND p.active=1
The second query should force the join order since the internal selects have to be done first.
I would usually attack this by trying to eliminate records in the from...
select p.productid
from products p
left join producttags tag1
on p.productid = tag1.productid and tag1.tag NOT IN ('Motorcycle','Green')
where tag1.tag IN ('Ford','Black','Skateboard') and p.active = 1
What about this one:
SELECT DISTINCT p.productid FROM products AS p
JOIN producttags AS included ON (
included.productid = p.productid
AND included.tag IN ('Ford', 'Black', 'Skateboard')
)
WHERE p.active = 1
AND p.productid NOT IN (
SELECT DISTINCT productid FROM producttags
WHERE tag IN ('Motorcycle', 'Green')
)
Alternative to the CONCAT/LIKE solution:
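SELECT p.productid
FROM products p
JOIN producttags USING (productid)
WHERE p.active = 1
GROUP BY p.productid
-- Both tag tests live in HAVING so the excluded tags are still visible to the aggregate
HAVING SUM(IF(tag IN ('Ford', 'Black', 'Skateboard'), 1, 0)) > 0
AND SUM(IF(tag IN ('Motorcycle', 'Green'), 1, 0)) = 0;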
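One thing to watch with this kind of conditional aggregation: rows removed by WHERE never reach HAVING, so the test on the excluded tags only works if those tags survive the WHERE clause. This is a minimal sketch of the difference, assuming Python's built-in sqlite3 module as a stand-in for MySQL (CASE replaces MySQL's IF) and that all products are active (hypothetical rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (productid INTEGER PRIMARY KEY, active INTEGER);
    CREATE TABLE producttags (productid INTEGER, tag TEXT);
    -- Hypothetical: every product is active.
    INSERT INTO products VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1);
    INSERT INTO producttags VALUES
        (1, 'Car'), (1, 'Black'), (1, 'Ford'),
        (2, 'Car'), (2, 'Red'), (2, 'Ford'),
        (3, 'Car'), (3, 'Black'), (3, 'Lexus'),
        (4, 'Motorcycle'), (4, 'Black'),
        (5, 'Skateboard'), (5, 'Black'),
        (6, 'Skateboard'), (6, 'Green');
""")

# Wanted tags filtered in WHERE: the unwanted tags are gone before
# grouping, so the HAVING sum is always 0 and nothing is excluded.
filtered_in_where = [row[0] for row in conn.execute("""
    SELECT p.productid
    FROM products p
    JOIN producttags pt ON pt.productid = p.productid
    WHERE p.active = 1
      AND pt.tag IN ('Ford', 'Black', 'Skateboard')
    GROUP BY p.productid
    HAVING SUM(CASE WHEN pt.tag IN ('Motorcycle', 'Green') THEN 1 ELSE 0 END) = 0
    ORDER BY p.productid
""")]

# Both tests in HAVING: every tag of a product is seen by the aggregates.
filtered_in_having = [row[0] for row in conn.execute("""
    SELECT p.productid
    FROM products p
    JOIN producttags pt ON pt.productid = p.productid
    WHERE p.active = 1
    GROUP BY p.productid
    HAVING SUM(CASE WHEN pt.tag IN ('Ford', 'Black', 'Skateboard') THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN pt.tag IN ('Motorcycle', 'Green') THEN 1 ELSE 0 END) = 0
    ORDER BY p.productid
""")]

print(filtered_in_where)   # [1, 2, 3, 4, 5, 6]
print(filtered_in_having)  # [1, 2, 3, 5]
```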

Select only rows where count of rows in other table is greater than 0

I have 2 tables in my database:
Products:
--------------------------------------------------
| id | product_name | manufacturer |
--------------------------------------------------
Products_photos:
-----------------------------------------------
| id | product_id | image_name |
-----------------------------------------------
I want to select all Products where the Products_photos count is greater than 0.
How can I do that?
Edit:
I don't want to add results from Products_photos to my output. I only want to show entries from Products that have at least one image. Sorry for my English :)
Thanks for the help.
I think the joining solutions already offered are the best bet, in terms of query efficiency. But for clarity - in terms of expressing exactly what you ask for - I would choose an approach like this:
select * from products p
where exists (select * from products_photos pp where pp.product_id = p.id)
SELECT p.id, p.product_name, p.manufacturer
FROM Products p
INNER JOIN Products_photos i on i.product_id = p.id
you can do
Select P.id, P.product_name, P.manufacturer
from Products P
INNER JOIN Products_photos Pp on P.Id = Pp.product_id
An INNER JOIN only returns rows for which the join condition matches, which means the product has at least one row in the Products_photos table. Note that a product with several photos is returned once per photo, so add DISTINCT if you need each product only once.
SELECT P.* FROM Products AS P INNER JOIN Products_photos AS PP ON P.id = PP.product_id
Another method, less efficient but maybe easier for you to understand, would be
SELECT P.* FROM Products AS P
WHERE P.id IN (SELECT DISTINCT product_id FROM Products_photos)

Selecting rows from multiple tables. How?

Let's say that we have three tables.
Products Fields Fields Value
---------------- ------------- --------------
pid catid fid catid fid pid value
-------|-------| -----|------- ------|-----|--------
1 1 1 1 1 1 25%
2 1 2 1 1 2 32.5%
3 2 2 1 45%
2 2 42%
3 1 17.3%
3 2 21%
The usual way is to select Products in one query and loop through the result set (RS1).
Then we select the Fields for the catid of each row (RS2).
Then we do the same with RS2 to select the Fields Value rows.
The only problem is performance, which suffers from executing a lot of queries when there are many rows in each table.
Can you suggest a better solution that executes fewer queries?
Edit:
I want to show each product in a box and show the fields for each product with their proper values. Joining the three tables together returns duplicated values for each FieldValue in Products, which is not usable in a loop.
Guessing what you need, try this:
SELECT f.catid, fv.* FROM Fields f
INNER JOIN Products p
ON f.catid = p.catid
INNER JOIN FieldsValue fv
ON fv.fid = f.fid AND fv.pid = p.pid
SELECT *
FROM Products
NATURAL JOIN Fields
NATURAL JOIN FieldsValue;
Use JOIN syntax:
SELECT * FROM Products as P
LEFT JOIN FieldsValue as FV ON FV.PID = P.PID
LEFT JOIN Fields as F on F.fid = FV.fid
You can join the tables together using left join:
select *
from Products p
left join Fields f on f.catid = p.catid
left join `fields value` fv on fv.fid = f.fid and fv.pid = p.pid
where p.pid = 1
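As for the repeated product columns that make the joined result awkward to loop over, one option is to run the single JOIN ordered by product and group the rows in application code. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL, a table named FieldsValue, and a simplified, hypothetical subset of the sample data:

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (pid INTEGER PRIMARY KEY, catid INTEGER);
    CREATE TABLE Fields (fid INTEGER PRIMARY KEY, catid INTEGER);
    CREATE TABLE FieldsValue (fid INTEGER, pid INTEGER, value TEXT);
    -- Hypothetical subset of the data in the question.
    INSERT INTO Products VALUES (1, 1), (2, 1);
    INSERT INTO Fields VALUES (1, 1), (2, 1);
    INSERT INTO FieldsValue VALUES
        (1, 1, '25%'), (1, 2, '32.5%'), (2, 1, '45%'), (2, 2, '42%');
""")

# One query for everything; ORDER BY keeps each product's rows adjacent,
# which is what itertools.groupby needs.
rows = conn.execute("""
    SELECT p.pid, f.fid, fv.value
    FROM Products p
    LEFT JOIN Fields f ON f.catid = p.catid
    LEFT JOIN FieldsValue fv ON fv.fid = f.fid AND fv.pid = p.pid
    ORDER BY p.pid, f.fid
""").fetchall()

# Collapse the flat rows into one "box" per product: pid -> [(fid, value), ...]
boxes = {
    pid: [(fid, value) for _, fid, value in group]
    for pid, group in groupby(rows, key=lambda row: row[0])
}

for pid, fields in boxes.items():
    print(pid, fields)
```

This keeps the database round trips at one, and the loop that renders each box only sees that product's fields.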

Multiple Attributes from one Query

I am having a problem getting multiple attributes for one item. Below is my Table:
Table: product_attrib
id | product_id | name | value
------------------------------
0 | 33 | age | 25
1 | 33 | size | 25
My problem is when I join the query, I only get one of the attributes with such a query:
Query:
SELECT
p.*
,pa.name
,pa.value
FROM product AS p
LEFT OUTER JOIN product_attrib AS pa ON (
p.id = pa.product_id
)
My Results
"products_id":"0",
"products_price":"0.0000",
"products_name":null,
"products_description":null,
"attrib_name":"color",
"attrib_value":"red"
Do you see how I only get one attribute set?
Is there a way I can get all the attributes for a product?
Most likely, your original query is right as it is. You probably want the product, no matter if attributes can be found.
You can reverse the order of the tables in the JOIN to prevent losing rows from product_attrib like this (if product with product_id 33 does not exist):
SELECT
p.*
,pa.name
,pa.value
FROM product_attrib AS pa
LEFT JOIN product AS p ON p.id = pa.product_id
But that's probably not what you want.
A LEFT [OUTER] JOIN includes all rows from the left hand table and adds values from the right table where the JOIN condition can be fulfilled (potentially creating multiple rows if multiple matches are found in the right hand table.) If no matching row can be found in the right hand table NULL values are substituted for all columns of the right hand table.
Start by reading the manual here.
If you want "all attributes" per product in the same row, you need to aggregate values. Something like this:
SELECT p.*
,group_concat(pa.name) AS att_names
,group_concat(pa.value) AS att_values
FROM product AS p
LEFT JOIN product_attrib AS pa ON p.id = pa.product_id
WHERE p.id = 33
GROUP BY p.id;
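A minimal sketch of the aggregation approach, grouping by the product's primary key. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (SQLite's group_concat also joins with commas by default), and the product rows, including product 34 with no attributes, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_attrib (
        id INTEGER PRIMARY KEY, product_id INTEGER, name TEXT, value TEXT);
    -- Hypothetical products; 34 has no attributes at all.
    INSERT INTO product VALUES (33, 'Example product'), (34, 'No attributes');
    INSERT INTO product_attrib VALUES (0, 33, 'age', '25'), (1, 33, 'size', '25');
""")

# One row per product; the LEFT JOIN keeps products without attributes,
# whose aggregated columns come back as NULL.
rows = conn.execute("""
    SELECT p.id,
           group_concat(pa.name) AS att_names,
           group_concat(pa.value) AS att_values
    FROM product AS p
    LEFT JOIN product_attrib AS pa ON p.id = pa.product_id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

The order of names inside the concatenated string is not guaranteed here; in MySQL, GROUP_CONCAT accepts its own ORDER BY if the order matters.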
I see so many people writing full joins when they aren't necessary. Please correct me if I'm wrong.
SELECT
p.*
,pa.name
,pa.value
FROM product AS p, product_attrib AS pa
WHERE p.id = pa.product_id