I have two tables, products and categories, and a join table products_categories.
Categories are nested (via categories.parent_id). So, for any given product, it can belong to many, potentially nested categories.
The categories are structured like this:
department (depth = 0)
category (depth = 1)
class (depth = 2)
Each product will belong to one "department", one "category", and one "class".
So, a product like "Rad Widget", could belong to the "Electronics" department, the "Miscellaneous" category, and the "Widgets" class. My schema would look like this:
# products
id | name
---------------
1 | Rad Widget
# categories
id | parent_id | depth | name
--------------------------------------
1 | null | 0 | Electronics
2 | 1 | 1 | Miscellaneous
3 | 2 | 2 | Widgets
# products_categories
product_id | category_id
------------------------
1 | 1
1 | 2
1 | 3
I'd like to run a query that lists all of a product's categories in a single row, like this:
product.id | product.name | department | category | class
-----------------------------------------------------------------
1 | Rad Widget | Electronics | Miscellaneous | Widgets
I can't think of a way to do this, so I'm considering denormalizing my data, but I want to make certain I'm not missing something first.
Since each category (and by the way, you might want to rename either the table or the level so that "category" doesn't mean two different things) has a singular known parent, but an indeterminate number of unknown children, you need to "walk up" from the most specific (at depth = 2) to the most general category, performing a self-join on the category table for each additional value you want to insert.
If you're impatient, skip to the SQL Fiddle link at the bottom of the post. If you'd rather be walked through it, continue reading - it's really not that different from any other case where you have a surrogate ID that you want to replace with data from the corresponding table.
You could start by looking at all the information:
SELECT * FROM products AS P
JOIN
products_categories AS PC ON P.id = PC.product_id
JOIN
categories AS C ON PC.category_id = C.id
WHERE P.id = 1 AND C.depth = 2;
+----+------------+------------+-------------+----+-----------+-------+---------+
| id | name | product_id | category_id | id | parent_id | depth | name |
+----+------------+------------+-------------+----+-----------+-------+---------+
| 1 | Rad Widget | 1 | 3 | 3 | 2 | 2 | Widgets |
+----+------------+------------+-------------+----+-----------+-------+---------+
First thing you have to do is recognize which information is useful and which is not. You don't want to be SELECT *-ing all day here. You have the first two columns you want, and the last column (recognize this as your "class"); you need parent_id to find the next column you want, and let's hold onto depth just for illustration. Forget the rest, they're clutter.
So replace that * with specific column names, alias "class", and go after the data represented by parent_id. This information is stored in the category table - you might be thinking, but I already joined that table! Don't care; do it again, only give it a new alias. Remember that your ON condition is a bit different - the products_categories has done its job already, now you want the row that matches C.parent_id - and that you only need certain columns to find the next parent:
SELECT
P.id,
P.name,
C1.parent_id,
C1.depth,
C1.name,
C.name AS 'class'
FROM
products AS P
JOIN
products_categories AS PC ON P.id = PC.product_id
JOIN
categories AS C ON PC.category_id = C.id
JOIN
categories AS C1 ON C.parent_id = C1.id
WHERE
P.id = 1
AND C.depth = 2;
+----+------------+-----------+-------+---------------+---------+
| id | name | parent_id | depth | name | class |
+----+------------+-----------+-------+---------------+---------+
| 1 | Rad Widget | 1 | 1 | Miscellaneous | Widgets |
+----+------------+-----------+-------+---------------+---------+
Repeat the process one more time, aliasing the column you just added and using the new C1.parent_id in your next join condition:
SELECT
P.id,
P.name,
C2.parent_id,
C2.depth,
C2.name,
C1.name AS 'category',
C.name AS 'class'
FROM
products AS P
JOIN
products_categories AS PC ON P.id = PC.product_id
JOIN
categories AS C ON PC.category_id = C.id
JOIN
categories AS C1 ON C.parent_id = C1.id
JOIN
categories AS C2 ON C1.parent_id = C2.id
WHERE
P.id = 1
AND C.depth = 2;
+----+------------+-----------+-------+-------------+---------------+---------+
| id | name | parent_id | depth | name | category | class |
+----+------------+-----------+-------+-------------+---------------+---------+
| 1 | Rad Widget | NULL | 0 | Electronics | Miscellaneous | Widgets |
+----+------------+-----------+-------+-------------+---------------+---------+
Now we're clearly done; we can't join another copy on C2.parent_id = NULL and we also see that depth = 0, so all that's left to do is get rid of the columns we don't want to display and double check our aliases. Here it is in action on SQL Fiddle.
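For reference, here is a minimal sketch of what the cleaned-up query could look like once the helper columns are dropped. It runs against an in-memory SQLite database through Python's sqlite3 module (an assumption made so the example is self-contained; the SQL itself is plain enough to run on MySQL too), using the sample rows from the question.

```python
import sqlite3

# In-memory database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, parent_id INTEGER, depth INTEGER, name TEXT);
CREATE TABLE products_categories (product_id INTEGER, category_id INTEGER);
INSERT INTO products VALUES (1, 'Rad Widget');
INSERT INTO categories VALUES (1, NULL, 0, 'Electronics'), (2, 1, 1, 'Miscellaneous'), (3, 2, 2, 'Widgets');
INSERT INTO products_categories VALUES (1, 1), (1, 2), (1, 3);
""")

# Start at the most specific level (depth = 2) and walk up one self-join per level.
final_query = """
SELECT P.id, P.name, C2.name AS department, C1.name AS category, C.name AS class
FROM products AS P
JOIN products_categories AS PC ON P.id = PC.product_id
JOIN categories AS C  ON PC.category_id = C.id   -- the class
JOIN categories AS C1 ON C.parent_id  = C1.id    -- its parent: the category
JOIN categories AS C2 ON C1.parent_id = C2.id    -- its grandparent: the department
WHERE P.id = 1 AND C.depth = 2
"""
rows = conn.execute(final_query).fetchall()
print(rows)
```

Only the three aliased name columns survive; parent_id and depth were scaffolding for working out the joins.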
If you want a list of all the categories, you can simply do a
Select Distinct p.category_id, c.name
From products_categories p Join
categories c On p.category_id = c.id
Where p.product_id = 1
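If you want that list collapsed into one row, one possible variation (not part of the query above, and a different technique from the self-join answer) is conditional aggregation on depth. This sketch assumes, as in the question's sample data, that products_categories holds one row per level for each product. It uses Python's sqlite3 module purely to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, parent_id INTEGER, depth INTEGER, name TEXT);
CREATE TABLE products_categories (product_id INTEGER, category_id INTEGER);
INSERT INTO products VALUES (1, 'Rad Widget');
INSERT INTO categories VALUES (1, NULL, 0, 'Electronics'), (2, 1, 1, 'Miscellaneous'), (3, 2, 2, 'Widgets');
INSERT INTO products_categories VALUES (1, 1), (1, 2), (1, 3);
""")

# One output column per depth; MAX() simply picks the single non-NULL name in each group.
pivot_query = """
SELECT p.id, p.name,
       MAX(CASE WHEN c.depth = 0 THEN c.name END) AS department,
       MAX(CASE WHEN c.depth = 1 THEN c.name END) AS category,
       MAX(CASE WHEN c.depth = 2 THEN c.name END) AS class
FROM products p
JOIN products_categories pc ON pc.product_id = p.id
JOIN categories c ON c.id = pc.category_id
GROUP BY p.id, p.name
"""
pivot_rows = conn.execute(pivot_query).fetchall()
print(pivot_rows)
```

If a product were linked only to its class, the department and category columns would come back NULL here, which is where the self-join approach is the safer choice.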
The problem is you are putting Classes and Departments into your Category table. Technically you'd be correctly normalizing your data by moving each of these to their own tables. I know the overhead of creating more tables is a pain but it'll simplify your queries (saving processing power and potentially bandwidth).
I am developing a basic e-commerce application. It has two pages (all products and my basket), and an authenticated user can add products to their own basket. I have three tables, which contain the data below. When a user adds a product to their basket, that product should no longer appear on that user's all-products page.
What should the SQL query look like? I am looking for the query behind the all-products page, so its return type must be Product.
PRODUCT TABLE
+-------+--------+
| id | name |
+-------+--------+
| 1 | p1 |
| 2 | p2 |
+-------+--------+
USER TABLE
+-------+--------+
| id | name |
+-------+--------+
| 3 | U1 |
| 4 | U2 |
+-------+--------+
BASKET TABLE
+-------+---------+-------------+
| id | fk_user | fk_product |
+-------+---------+-------------+
| 5 | 3 | 1 |
| 6 | 4 | 2 |
+-------+---------+-------------+
So if the authenticated user's id is 3, that user should see only product p2 on their all-products page.
try this:
SELECT product.name
FROM product
LEFT JOIN basket ON basket.fk_product = product.id
WHERE (basket.fk_user != 3 OR basket.fk_user IS NULL)
Check my demo query
If you want, you can also join the user table, but with the data you gave it is not necessary.
A left join keeps all rows in the first (product) table, plus the matching rows in the second (basket) table where the ON clause evaluates to true.
When the ON clause evaluates to false or NULL, the left join still keeps the row from the first table, with NULL values for the second table's columns.
or, more commonly...
SELECT p.name
FROM product p
LEFT JOIN basket b
on b.fk_product = p.id
AND b.fk_user = 3
WHERE b.fk_user is null
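To see why this works, it can help to look at what the left join produces before the WHERE filter is applied. The following is a small sketch using Python's sqlite3 module (chosen only so it can be run as-is) with the tables from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE basket (id INTEGER PRIMARY KEY, fk_user INTEGER, fk_product INTEGER);
INSERT INTO product VALUES (1, 'p1'), (2, 'p2');
INSERT INTO basket VALUES (5, 3, 1), (6, 4, 2);
""")

# Step 1: the raw left join. Products not in user 3's basket get NULL basket columns.
joined = conn.execute("""
SELECT p.name, b.fk_user
FROM product p
LEFT JOIN basket b ON b.fk_product = p.id AND b.fk_user = 3
ORDER BY p.id
""").fetchall()
print(joined)

# Step 2: keep only the unmatched rows, i.e. products user 3 has not added.
visible = [name for (name,) in conn.execute("""
SELECT p.name
FROM product p
LEFT JOIN basket b ON b.fk_product = p.id AND b.fk_user = 3
WHERE b.fk_user IS NULL
ORDER BY p.id
""")]
print(visible)
```

Because the user filter sits in the ON clause, baskets belonging to other users never take part in the match at all.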
What you are describing sounds like NOT EXISTS:
SELECT p.name
FROM product p
WHERE NOT EXISTS (SELECT 1
FROM basket b
WHERE b.fk_product = p.id AND
b.fk_user = 3
);
This seems like the most direct interpretation of your question.
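One case worth checking is a product that sits in more than one user's basket. The sketch below adds one hypothetical basket row (id 7: user 4 also has p1), which is not in the question's data, and compares a left join that filters the user in the WHERE clause against NOT EXISTS. It uses Python's sqlite3 module only so that it can be run directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE basket (id INTEGER PRIMARY KEY, fk_user INTEGER, fk_product INTEGER);
INSERT INTO product VALUES (1, 'p1'), (2, 'p2');
-- Row 7 is an assumption added for illustration: user 4 also has p1.
INSERT INTO basket VALUES (5, 3, 1), (6, 4, 2), (7, 4, 1);
""")

# User filter in WHERE: p1 slips through via the row that belongs to user 4.
where_filter = [name for (name,) in conn.execute("""
SELECT DISTINCT product.name
FROM product
LEFT JOIN basket ON basket.fk_product = product.id
WHERE (basket.fk_user != 3 OR basket.fk_user IS NULL)
ORDER BY product.name
""")]

# NOT EXISTS: p1 is excluded because user 3 has it, whatever other users hold.
not_exists = [name for (name,) in conn.execute("""
SELECT p.name
FROM product p
WHERE NOT EXISTS (SELECT 1
                  FROM basket b
                  WHERE b.fk_product = p.id AND b.fk_user = 3)
ORDER BY p.name
""")]

print(where_filter, not_exists)
```

With this extra row, only the NOT EXISTS form (or the left join with the user filter in the ON clause) hides p1 from user 3.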
I've made a little database in SQL that has 2 tables, Product (Name, Ingredient) and Available (Ingredient):
Product
| Name | Ingredient |
| 1 | a |
| 1 | b |
| 2 | a |
| 2 | c |
Available
| Ingredient |
| a |
| c |
I want the name of a product only if ALL its ingredients are inside the Available table.
For the previous example, the result should be: Product "2"
and not Product "1", because I don't have the ingredient "b" in the Available table.
Thanks for the help
You can try with left join (to figure out which Products don't have necessary Ingredients) and group by + having to filter Products that have at least one missing Ingredient:
select p.Name
from Product p
left join Available a on a.Ingredient = p.Ingredient
group by p.Name
having sum(a.Ingredient is null) = 0
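Here is a runnable sketch of the same idea against the sample data, using Python's sqlite3 module as a stand-in engine (an assumption for the sake of a self-contained example). The HAVING clause is spelled out with CASE because summing a boolean expression directly is not portable to every database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (Name TEXT, Ingredient TEXT);
CREATE TABLE Available (Ingredient TEXT);
INSERT INTO Product VALUES ('1', 'a'), ('1', 'b'), ('2', 'a'), ('2', 'c');
INSERT INTO Available VALUES ('a'), ('c');
""")

# A missing ingredient shows up as a NULL on the Available side of the left join;
# a product qualifies only if it has zero such rows.
complete_products = [name for (name,) in conn.execute("""
SELECT p.Name
FROM Product p
LEFT JOIN Available a ON a.Ingredient = p.Ingredient
GROUP BY p.Name
HAVING SUM(CASE WHEN a.Ingredient IS NULL THEN 1 ELSE 0 END) = 0
""")]
print(complete_products)
```

Product "1" is dropped because its ingredient "b" has no match in Available.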
You can try something like this also:
WITH TEMP_PRODUCTS AS
(
SELECT NAME, COUNT(1) AS NUMBER_OF_INGREDIENTS
FROM PRODUCT
GROUP BY NAME
)
SELECT PRD.NAME, COUNT(1) AS NUMBER_OF_AVAILABLE_INGREDIENTS
FROM PRODUCT PRD
JOIN TEMP_PRODUCTS TMP ON PRD.NAME = TMP.NAME
WHERE EXISTS (SELECT 1
FROM AVAILABLE ING
WHERE ING.INGREDIENT = PRD.INGREDIENT)
GROUP BY PRD.NAME, TMP.NUMBER_OF_INGREDIENTS
HAVING COUNT(1) = TMP.NUMBER_OF_INGREDIENTS;
I have a standard nested category tree:
| id | parent_id | name |
+----+-----------+----------------+
| 1 | 0 | Category 1 |
| 2 | 0 | Category 2 |
| 3 | 0 | Category 3 |
| 4 | 1 | Category 1.1 |
| 5 | 1 | Category 1.2 |
| 6 | 2 | Category 2.1 |
| 7 | 2 | Category 2.2 |
| 8 | 7 | Category 2.2.1 |
Now I need to get the top-level parent of a specified item, so I do:
SELECT
cat.*
FROM
categories cat
LEFT JOIN
categories subCat
ON
subCat.parent_id = cat.id
AND cat.parent_id = 0
WHERE
subCat.id = 5;
If the item is a first-level child this works fine, but if the item is a second-level child (e.g. 8) I get no records. How can I do this?
Here is SQlFiddle: http://sqlfiddle.com/#!9/5879bd/11
UPDATE
Here is real example: http://sqlfiddle.com/#!9/6f1d1c/1
I want to get parent category of Xiaomi
With MySQL 5.6 you cannot use recursive CTEs.
To do it properly, for an arbitrary tree depth, you need to write a function/procedure, that traverses the hierarchy and returns the top node once reached.
As a workaround, when the maximum number of levels d is fixed, you can left join the parent (d - 1) times. Use coalesce() to get the first non-null value along the path. So in your case, for d = 3:
SELECT c.*
FROM categories c
INNER JOIN (SELECT coalesce(c3.id, c2.id, c1.id) id
FROM categories c1
LEFT JOIN categories c2
ON c2.id = c1.parent_id
LEFT JOIN categories c3
ON c3.id = c2.parent_id
WHERE c1.id = 10) t
ON t.id = c.id;
(I first select the ID of the top node and inner join the rest, to avoid coalesce() on all columns. It might give a false result on nullable columns if the value for the column in the top node is null but not for any child node. It should display NULL then, but will falsely show the non-null value from the child node.)
But note: It will fail if the depth grows!
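For comparison, on an engine that does support recursive CTEs (MySQL 8.0 and later, or SQLite), the walk up the tree works for any depth. This is only a sketch under that assumption: it will not run on MySQL 5.6, it uses Python's sqlite3 module as the test engine, and the helper name top_parent is made up for illustration. The data is the sample tree from the question, where parent_id = 0 marks a top-level node.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
INSERT INTO categories VALUES
  (1, 0, 'Category 1'), (2, 0, 'Category 2'), (3, 0, 'Category 3'),
  (4, 1, 'Category 1.1'), (5, 1, 'Category 1.2'),
  (6, 2, 'Category 2.1'), (7, 2, 'Category 2.2'), (8, 7, 'Category 2.2.1');
""")

def top_parent(conn, category_id):
    """Hypothetical helper: return (id, name) of the top-level ancestor."""
    return conn.execute("""
        WITH RECURSIVE path(id, parent_id, name) AS (
            -- anchor: the node we start from
            SELECT id, parent_id, name FROM categories WHERE id = ?
            UNION ALL
            -- step: move to the parent of the node found so far
            SELECT c.id, c.parent_id, c.name
            FROM categories c
            JOIN path ON c.id = path.parent_id
        )
        SELECT id, name FROM path WHERE parent_id = 0
    """, (category_id,)).fetchone()

print(top_parent(conn, 8))
print(top_parent(conn, 5))
```

A node that is already top-level is returned as its own top parent.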
This answers the original version of the question.
To get the top level, you can use the name column:
SELECT c.*
FROM categories c JOIN
categories sc
ON sc.id = 10 AND
c.name = SUBSTRING_INDEX(sc.name, '.', 1);
This query has been fun to figure out, but I have come to a place where I need some help.
I have several tables and the ultimate question is:
How many total parts are "missing", by vendor?
and / or
How many total parts are "missing", by vendor and category?
Missing: has not been utilized by the vendor (see query 1).
Note that parts are not attributed to a product or a vendor because both of those could change based on the season and often the parts inspire what the product will actually be.
Very basically, which parts each vendor should be aware of is the question we are trying to answer at a high level: which vendors have the most missing parts, and in which categories are those parts missing?
Now, I do have the first query I need working great. What it does is tell me the missing parts by category when I specify the specific vendor.
Here is the SQLfiddle for both the create script for the database and the working query:
Query 1:
http://sqlfiddle.com/#!9/088e7/1
And the query:
SELECT
c.name AS category,
COUNT(pt.id) AS parts,
COUNT(CASE WHEN in_stock IS NULL THEN pt.id END) AS missing_parts
FROM
season AS s
LEFT OUTER JOIN
(
SELECT
s.id AS season_id,
s.type season_type,
max(i.in_stock) AS in_stock
FROM
inventory AS i
JOIN season AS s ON i.season_id = s.id
JOIN product AS p ON i.product_id = p.id
JOIN vendor AS v ON p.vendor_id = v.id
JOIN part AS pt ON s.part_id = pt.id
WHERE
v.id = 2
AND
s.type = 'Type A'
GROUP BY
1,2) AS seas ON seas.season_id = s.id AND seas.season_type = s.type
JOIN part AS pt ON pt.id = s.part_id
JOIN part_data AS pd ON pt.id = pd.part_id
JOIN category AS c ON pt.category_id = c.id
WHERE
s.type = 'Type A'
GROUP BY
1;
The above works like a charm and here are the results:
| name | parts | missing_parts |
|-----------|-------|---------------|
| category3 | 3 | 2 |
| category4 | 2 | 0 |
| category5 | 2 | 2 |
| category6 | 3 | 3 |
My problem is when I try to do a similar query using vendor instead of category at the same time removing the vendor filter. In the following SQL fiddle, you can see that because the parts are in fact missing they of course cannot be attributed to a vendor when querying like I am.
http://sqlfiddle.com/#!9/088e7/2
And then Query 2:
SELECT
seas.vendor AS vendor,
COUNT(pt.id) AS parts,
COUNT(CASE WHEN in_stock IS NULL THEN pt.id END) AS missing_parts
FROM
season AS s
LEFT OUTER JOIN
(SELECT
s.id AS season_id,
v.name AS vendor,
s.type season_type,
max(i.in_stock) AS in_stock
FROM
inventory AS i
JOIN season AS s ON i.season_id = s.id
JOIN product AS p ON i.product_id = p.id
JOIN vendor AS v ON p.vendor_id = v.id
JOIN part AS pt ON s.part_id = pt.id
WHERE
s.type = 'Type A'
GROUP BY
1,2 ) AS seas ON seas.season_id = s.id AND seas.season_type = s.type
JOIN part AS pt ON pt.id = s.part_id
JOIN part_data AS pd ON pt.id = pd.part_id
JOIN category AS c ON pt.category_id = c.id
AND
s.type = 'Type A'
GROUP BY
1;
The results from query 2:
| vendor | parts | missing_parts |
|----------|-------|---------------|
| (null) | 4 | 4 |
| Vendor 1 | 2 | 0 |
| Vendor 2 | 3 | 0 |
| Vendor 3 | 2 | 0 |
| Vendor 4 | 2 | 0 |
| Vendor 5 | 2 | 0 |
Note the null value which makes sense as those are the "missing" parts I am looking for that cannot be attributed to a Vendor.
What I am wondering is whether there is any way to have the missing part count added as an additional column?
The missing_parts column in the desired output is hard to get accurate because, again, that is the very point of this query: I don't know the answer, even with this tiny amount of data. Note again that the missing parts do not have vendors, but here is my best shot.
| vendor | parts | missing_parts |
|----------|-------|---------------|
| Vendor 1 | 2 | 1 |
| Vendor 2 | 3 | 1 |
| Vendor 3 | 2 | 3 |
| Vendor 4 | 2 | 0 |
| Vendor 5 | 2 | 2 |
In an ideal world I would be able to also add category:
| category | vendor | parts | missing_parts |
|------------|----------|-------|---------------|
| category 1 | Vendor 1 | 2 | 1 |
| category 1 | Vendor 2 | 3 | 1 |
| category 1 | Vendor 3 | 2 | 3 |
| category 1 | Vendor 4 | 2 | 0 |
| category 1 | Vendor 5 | 2 | 2 |
| category 2 | Vendor 1 | 1 | 1 |
| category 2 | Vendor 2 | 1 | 1 |
| category 2 | Vendor 3 | 0 | 3 |
| category 2 | Vendor 4 | 2 | 0 |
| category 2 | Vendor 5 | 0 | 2 |
IF I am understanding what you are looking for, I would first start with what you are ultimately looking for..
A list of distinct parts and categories. THEN you are looking for who is missing what. To do so, this is basically a Cartesian of every vendor against this "master list of parts/categories" and who does/not have it.
SELECT DISTINCT
pt.id,
pt.category_id
from
part pt
Now, consider the second part. What are all the possible parts and categories a specific VENDOR has.
SELECT DISTINCT
pt.id,
pt.category_id,
p.vendor_id
FROM
season s
JOIN inventory i
ON s.id = i.season_id
JOIN product p
ON i.product_id = p.id
JOIN part pt
ON s.part_id = pt.id
In the above tables, I did not need the category or actual vendor tables joined as I only cared about the qualifying IDs of who has what. First, all possible part ID and category ID, but in the second, we also grab the VENDOR ID who has it.
Now, tie the pieces together, starting with the vendor table JOINed to category without any "ON" condition. The join is needed so that "v.id" is available to a join further down, and it gives me a Cartesian product of every vendor tested against every category. Then the category table is joined to all the distinct parts, and finally LEFT JOINed to the distinct-parts-per-vendor query.
Finally, add your aggregates and group by. Due to the left-join, if there IS an VndParts.ID, then the record DOES exist, thus Vendor Parts FOUND count is up. If the vendor parts id is NULL, then it is missing (hence my sum case/when) for the missing parts count.
SELECT
v.name Vendor,
c.name category,
count( PQParts.ID ) TotalAvailableParts,
count( VndParts.ID ) VendorParts,
sum( case when VndParts.ID IS NULL then 1 else 0 end ) MissingParts
from
vendor v JOIN
category c
JOIN
( SELECT DISTINCT
pt.id,
pt.category_id
from
part pt ) PQParts
ON c.id = PQParts.category_id
LEFT JOIN
( SELECT DISTINCT
pt.id,
pt.category_id,
p.vendor_id
FROM
season s
JOIN inventory i
ON s.id = i.season_id
JOIN product p
ON i.product_id = p.id
JOIN part pt
ON s.part_id = pt.id ) VndParts
ON v.id = VndParts.vendor_id
AND PQParts.ID = VndParts.ID
AND PQParts.Category_ID = VndParts.Category_ID
group by
v.name,
c.name
Applied against your SQL-Fiddle sample database construct
Now, even though you have created sample data of categories 1-6, all of your PARTS are only defined with categories 3-6 as in my sample data result. I can't force for data that does not exist per the sample query of
SELECT
*
from
category c
JOIN
( SELECT DISTINCT
pt.id,
pt.category_id
from
part pt ) PQParts
ON c.id = PQParts.category_id
If such actual data DID exist, then those missing pieces of other categories would also be displayed.
Now final note. You were also looking for a specific SEASON. I would just add a WHERE clause to accommodate that in the VndParts query. Then change PQParts query to include the season join such as
SELECT DISTINCT
pt.id,
pt.category_id
FROM
season s
JOIN part pt
ON s.part_id = pt.id
WHERE
s.type = 'Type A'
To further restrict the result to a specific vendor, adding a vendor clause is easy enough: vendor "v" is the basis of the outer query, and the second LEFT JOIN subquery also exposes the vendor ID to filter on.
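To make the vendor-by-category approach easier to follow, here is a sketch on a tiny made-up dataset. Everything in it is an assumption for illustration: the rows are invented, and the table layouts are reduced to the columns the queries above actually touch, so it is not the SQL Fiddle schema. It runs on SQLite through Python's sqlite3 module, with the condition-less JOIN written as an explicit CROSS JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical minimal schema and data, not the SQL Fiddle's.
CREATE TABLE vendor (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE part (id INTEGER PRIMARY KEY, category_id INTEGER);
CREATE TABLE season (id INTEGER PRIMARY KEY, type TEXT, part_id INTEGER);
CREATE TABLE product (id INTEGER PRIMARY KEY, vendor_id INTEGER);
CREATE TABLE inventory (product_id INTEGER, season_id INTEGER, in_stock INTEGER);
INSERT INTO vendor VALUES (1, 'V1'), (2, 'V2');
INSERT INTO category VALUES (1, 'c1'), (2, 'c2');
INSERT INTO part VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO season VALUES (1, 'Type A', 1), (2, 'Type A', 2), (3, 'Type A', 3);
INSERT INTO product VALUES (1, 1), (2, 2);
INSERT INTO inventory VALUES (1, 1, 1), (1, 3, 1), (2, 2, 1);
""")

# Every vendor is tested against every part of every category; a NULL on the
# VndParts side means that vendor has never used that part.
missing = conn.execute("""
SELECT v.name, c.name,
       COUNT(PQParts.id)  AS total_parts,
       COUNT(VndParts.id) AS vendor_parts,
       SUM(CASE WHEN VndParts.id IS NULL THEN 1 ELSE 0 END) AS missing_parts
FROM vendor v
CROSS JOIN category c
JOIN (SELECT DISTINCT pt.id, pt.category_id FROM part pt) PQParts
  ON c.id = PQParts.category_id
LEFT JOIN (SELECT DISTINCT pt.id, pt.category_id, p.vendor_id
           FROM season s
           JOIN inventory i ON s.id = i.season_id
           JOIN product p ON i.product_id = p.id
           JOIN part pt ON s.part_id = pt.id) VndParts
  ON v.id = VndParts.vendor_id AND PQParts.id = VndParts.id
GROUP BY v.name, c.name
ORDER BY v.name, c.name
""").fetchall()
for row in missing:
    print(row)
```

In this toy data V1 has used parts 1 and 3 and V2 has used part 2, so each vendor is missing one of the two parts in c1, and V2 is also missing the only part in c2.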
From your description, it seems you are looking to count how many parts in each category each vendor could have listed as a product but hasn't.
That's basically the difference between how many parts can be listed for each category, and how many were actually listed.
So you could count the possible and left join to a count of the actual.
Based on the sqlfiddle, the code below also assumes that you want to be able to focus on one season type, and that only parts (with sales?) listed in part_data are relevant.
select c.name as category
, v.name as vendor
, cpartcount.parts
, cpartcount.parts-coalesce(cvpartcount.parts,0) as missingparts
from vendor v
cross join
(
select pt.category_id, count(pt.id) as parts
from part pt
where pt.id in
(
select s.part_id
from season s
where s.type='Type A'
)
and pt.id in
(
select pd.part_id
from part_data pd
)
group by pt.category_id
) cpartcount
join category c
on cpartcount.category_id=c.id
left join
(
select pt.category_id, v.id as vendor_id, count(pt.id) as parts
from part pt,vendor v
where (v.id,pt.id) IN
(
select p.vendor_id, s.part_id
from product p
join inventory i
on p.id=i.product_id
join season s
on i.season_id = s.id
join part_data pd
on s.part_id=pd.part_id
where s.type='Type A'
)
group by pt.category_id,v.id
) as cvpartcount
on cpartcount.category_id=cvpartcount.category_id
and v.id=cvpartcount.vendor_id
The problem is that the second query has a GROUP BY on a field (vendor) that comes from the sub-query joined with a LEFT JOIN, so it creates one output row per vendor (including a NULL row for the rows from season that have no match in the sub-query).
More specifically - your count is on
COUNT(CASE WHEN in_stock IS NULL THEN pt.id END) AS missing_parts
(I would prefer writing SUM(in_stock IS NULL))
but since in_stock is an aggregation result per each vendor - you'll never have a NULL value there. (check the sub-query results)
I think you should clarify the goal of your queries. For example - the first one is returning -
Per each category the number of parts it has on the given seasons, and the number of seasons that this category wasn't available (and not the number of missing parts, since there is no join on category with the sub-query).
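The NULL-group effect is easy to reproduce in isolation. The sketch below uses a deliberately simplified, hypothetical pair of tables (season and vendor_usage are stand-ins, not the fiddle's schema) and Python's sqlite3 module to show that grouping by a column from the left-joined side puts every unmatched row into a single NULL group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-ins for the real tables.
CREATE TABLE season (id INTEGER PRIMARY KEY, part_id INTEGER);
CREATE TABLE vendor_usage (season_id INTEGER, vendor TEXT, in_stock INTEGER);
INSERT INTO season VALUES (1, 10), (2, 11), (3, 12);
INSERT INTO vendor_usage VALUES (1, 'Vendor 1', 1), (2, 'Vendor 2', 1);
""")

# Season 3 has no usage row, so its vendor and in_stock are NULL after the join.
# Grouping by vendor then collects all such rows under the NULL vendor.
grouped = conn.execute("""
SELECT u.vendor,
       COUNT(s.part_id) AS parts,
       COUNT(CASE WHEN u.in_stock IS NULL THEN s.part_id END) AS missing_parts
FROM season s
LEFT JOIN vendor_usage u ON u.season_id = s.id
GROUP BY u.vendor
ORDER BY u.vendor
""").fetchall()
print(grouped)
```

Every named vendor ends up with missing_parts = 0, because a row can only carry a vendor name if it matched, which is the same pattern as the Query 2 results above.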
I think I have a somewhat trivial question but I can't figure out how this works. I have the following Companies and Products tables with a simple Many-To-Many relationship.
How would I have to extend this query, so that the results just contains let's say all companies which have products with id 1 AND 2?
I tried adding WHEREs and HAVINGs wherever I could imagine, but all I could get was all companies which have products with id x (without the additional AND).
Companies Table
id | name
-----------------
1 | Company 1
2 | Company 2
3 | Company 3
Companies_Products Table
id | product_id | company_id
----------------------------
1 | 1 | 1
2 | 2 | 1
3 | 3 | 1
4 | 1 | 2
5 | 1 | 3
6 | 2 | 3
Products Table
id | name
-----------------
1 | Product A
2 | Product B
3 | Product C
Statement
SELECT companies.name,
companies.id AS company_id,
products.id AS product_id
FROM companies
LEFT JOIN company_products
ON companies.id = company_products.company_id
INNER JOIN products
ON company_products.product_id = products.id
If you want ALL companies with associated products 1 and 2, you can write this query:
SELECT c.name,
c.id AS company_id
FROM companies c
WHERE (SELECT COUNT(*)
FROM company_products cp
WHERE cp.company_id = c.id
AND cp.product_id in ('1', '2')
) = 2
Go to Sql Fiddle
If you also want information about the associated products in the main query, you must add a join to the existing query.
Maybe you could use the following subquery in your query:
SELECT company_id, count(*) as no_products
FROM Companies_Products
WHERE product_id IN (1, 2)
GROUP BY company_id
HAVING count(*) = 2
(In this case a company and a product must be coupled only once.) It returns all the company_ids with product 1 and 2.
There is always some discussion about subqueries and performance, but I don't think you will notice.
You could make this function flexible by using an array.
pseudo code:
$parameter = array(1, 2);
...
WHERE product_id IN $parameter
GROUP BY company_id
HAVING count(*) = count($parameter)
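As a concrete version of that pseudo code, here is a minimal sketch in Python with the sqlite3 module (the language, the helper name companies_with_all and the lower-case table name are assumptions for illustration). It builds one placeholder per product id and counts distinct products, so a duplicated coupling row would not skew the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE companies_products (id INTEGER PRIMARY KEY, product_id INTEGER, company_id INTEGER);
INSERT INTO companies VALUES (1, 'Company 1'), (2, 'Company 2'), (3, 'Company 3');
INSERT INTO companies_products VALUES
  (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 1, 2), (5, 1, 3), (6, 2, 3);
""")

def companies_with_all(conn, product_ids):
    """Hypothetical helper: companies that have every product in product_ids."""
    wanted = sorted(set(product_ids))
    if not wanted:
        return []
    # One "?" per id keeps the values parameterized instead of pasted into the SQL.
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT c.id, c.name
        FROM companies c
        JOIN companies_products cp ON cp.company_id = c.id
        WHERE cp.product_id IN ({placeholders})
        GROUP BY c.id, c.name
        HAVING COUNT(DISTINCT cp.product_id) = ?
        ORDER BY c.id
    """
    return conn.execute(sql, [*wanted, len(wanted)]).fetchall()

print(companies_with_all(conn, [1, 2]))
```

The same function answers "all of 1, 2 and 3" or any other set just by changing the list passed in.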
Please say so if you need more help.