I got a LEFT JOIN exercise at school:
"List all category names with the number of their products."
Two tables from the Northwind DB were used: products (77 rows) and categories (8 rows).
I thought the products table should come first, since the main data (the number of products) is found there and only the 8 category names are needed from the joined table. Our teacher argued that the categories table needs to be the main table, but I still can't understand why.
The two queries are:
SELECT C.CategoryID, CategoryName, COUNT(ProductID) [Count]
FROM Categories C LEFT JOIN Products P
ON C.CategoryID = P.CategoryID
GROUP BY C.CategoryID, CategoryName
and
SELECT P.CategoryID, CategoryName, COUNT(ProductID) [Count]
FROM Products P LEFT JOIN Categories C
ON P.CategoryID = C.CategoryID
GROUP BY CategoryName, P.CategoryID
Can anybody explain to me why, in this case, the order of the tables matters in terms of theoretical performance? And if it does, how so? (Does size matter? ;))
The wording of the exercise tells you which table comes first in your case.
"List all category names with the number of their products."
So get all the category names. Category names are what you HAVE TO SHOW - ALL OF THEM. You want to show all of them regardless of whether there is a matching CategoryID in the Products table.
Conversely, if you wanted to show all product names with the number of their categories, you would show every product name regardless of whether a matching CategoryID exists in the Categories table.
Here is the demo
This demo shows you what the two queries will return if we have 3 categories and one product. It is not the best demo in the world but it does the trick I believe.
The tables:
create table Categories (CategoryID int, CategoryName varchar(20))
create table Products (ProductID int, CategoryID int)
The data:
insert into Categories values(1, 'Cat1');
insert into Categories values(2, 'Cat2');
insert into Categories values(3, 'Cat3');
insert into Products values(1, 1);
Query1:
SELECT C.CategoryID, CategoryName, COUNT(ProductID) as Cnt
FROM Categories C
LEFT JOIN Products P ON C.CategoryID = P.CategoryID
GROUP BY C.CategoryID, CategoryName
Result1:
CategoryID CategoryName Cnt
1 Cat1 1
2 Cat2 0
3 Cat3 0
Query2:
SELECT P.CategoryID, CategoryName, COUNT(ProductID) as Cnt
FROM Products P
LEFT JOIN Categories C ON P.CategoryID = C.CategoryID
GROUP BY CategoryName, P.CategoryID
Result2:
CategoryID CategoryName Cnt
1 Cat1 1
I see in your question that you say:
"Used were two tables from the northwind DB: products (77 rows) and categories (8 rows)"
So it may seem strange to you that my example behaves like this while in yours "the results of both queries are obviously the same".
Here is the demo that will show you how it can be the same with different set of data.
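A minimal sketch of that comparison, using Python's built-in sqlite3 module rather than the original database (the helper name run_both and the extra product rows are assumptions made for illustration). It runs both queries once when every category has at least one product, and once with the single product from the demo above:

```python
import sqlite3

# Categories is the preserved (left) table: every category is listed.
Q_CATEGORIES_FIRST = """
SELECT C.CategoryID, CategoryName, COUNT(ProductID) AS Cnt
FROM Categories C
LEFT JOIN Products P ON C.CategoryID = P.CategoryID
GROUP BY C.CategoryID, CategoryName
ORDER BY CategoryName
"""

# Products is the preserved (left) table: categories without products vanish.
Q_PRODUCTS_FIRST = """
SELECT P.CategoryID, CategoryName, COUNT(ProductID) AS Cnt
FROM Products P
LEFT JOIN Categories C ON P.CategoryID = C.CategoryID
GROUP BY CategoryName, P.CategoryID
ORDER BY CategoryName
"""

def run_both(products):
    """Load three categories plus the given (ProductID, CategoryID) rows, run both queries."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Categories (CategoryID INT, CategoryName VARCHAR(20))")
    con.execute("CREATE TABLE Products (ProductID INT, CategoryID INT)")
    con.executemany("INSERT INTO Categories VALUES (?, ?)",
                    [(1, "Cat1"), (2, "Cat2"), (3, "Cat3")])
    con.executemany("INSERT INTO Products VALUES (?, ?)", products)
    first = con.execute(Q_CATEGORIES_FIRST).fetchall()
    second = con.execute(Q_PRODUCTS_FIRST).fetchall()
    con.close()
    return first, second

# Every category has a product (like Northwind): both queries agree.
full_a, full_b = run_both([(1, 1), (2, 2), (3, 3), (4, 3)])
print(full_a == full_b)

# Only one product: the Products-first query loses Cat2 and Cat3.
sparse_a, sparse_b = run_both([(1, 1)])
print(sparse_a)
print(sparse_b)
```

So the two queries only look interchangeable as long as no category is empty; the choice of the left table is about which rows are guaranteed to appear, not about performance.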
As an aside, here is another way to get the desired results.
SELECT C.CategoryID, C.CategoryName,
( SELECT COUNT(*)
FROM Products AS P
WHERE P.CategoryID = C.CategoryID
) AS "Count"
FROM Categories AS C
The performance will be about the same as the 'correct' LEFT JOIN formulation.
A further note: COUNT(x) does the extra check to see that x IS NOT NULL; COUNT(*) simply counts the number of relevant rows.
In some other situation, you may need COUNT(DISTINCT productID); I suspect you do not need it in this case.
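The difference between COUNT(x) and COUNT(*) matters in the LEFT JOIN formulation, because an empty category still produces one joined row whose product columns are NULL. A small sketch with Python's sqlite3 module (the sample rows and the column aliases star_cnt and col_cnt are made up for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Categories (CategoryID INT, CategoryName VARCHAR(20));
CREATE TABLE Products (ProductID INT, CategoryID INT);
INSERT INTO Categories VALUES (1, 'Cat1'), (2, 'Cat2');
INSERT INTO Products VALUES (1, 1), (2, 1);   -- Cat2 has no products
""")

# In a LEFT JOIN, COUNT(*) counts the NULL-extended row of an empty category,
# while COUNT(P.ProductID) skips it because ProductID is NULL there.
join_counts = con.execute("""
SELECT C.CategoryName, COUNT(*) AS star_cnt, COUNT(P.ProductID) AS col_cnt
FROM Categories C
LEFT JOIN Products P ON C.CategoryID = P.CategoryID
GROUP BY C.CategoryID, C.CategoryName
ORDER BY C.CategoryID
""").fetchall()

# In the correlated subquery there is no NULL-extended row, so COUNT(*) is safe.
subquery_counts = con.execute("""
SELECT C.CategoryName,
       (SELECT COUNT(*) FROM Products P WHERE P.CategoryID = C.CategoryID) AS Cnt
FROM Categories C
ORDER BY C.CategoryID
""").fetchall()

print(join_counts)
print(subquery_counts)
```

This is why the LEFT JOIN versions count ProductID, while the subquery version can use COUNT(*).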
Even though I was warned that questions with a similar title exist, I couldn't find a similar problem here. Let me explain in detail:
I've got two tables (I'm working with MySQL) with these values inserted:
table products:
id name
1 TV
2 RADIO
3 COMPUTER
table sales (product_id is an FK which references products(id)):
id quantity product_id
1 50 2
2 100 3
3 200 3
The TVs haven't been sold, radios got 1 sale (of 50 units) and computers got two sales (one of 100 and another of 200 units).
Now I must create a query that shows the products and their sales, but there are some conditions that make the task difficult:
1 - If there are no sales, show NULL;
2 - If there's 1 sale, show that sale;
3 - If there's more than 1 sale, show the latest sale (I tried to use MAX(id) to keep it simple, but it didn't work);
In the tables example above, I expect to show this, after a proper SQL Query:
products.NAME sales.QUANTITY
TV NULL
RADIO 50
COMPUTER 200
I've been trying lots of joins, inner joins, etc., but couldn't find the result I expect. Which SQL query can give the answer I expect?
Any help will be very appreciated.
Thanks.
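Hope the below query works. Note that it picks the largest quantity per product, which matches the latest sale for the sample data shown.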
SELECT products.name, sl.quantity
FROM products LEFT JOIN (
SELECT product_id, max(quantity) as quantity FROM sales GROUP BY product_id) sl
ON products.id = sl.product_id
In MySQL 8.0 you can do:
with m (product_id, max_id) as ( -- This is a CTE
select product_id, max(id) from sales group by product_id
)
select
p.name,
s.quantity
from products p
left join m on m.product_id = p.id
left join sales s on s.id = m.max_id
If you have an older MySQL, you can use a Table Expression:
select
p.name,
s.quantity
from products p
left join ( -- This is a table expression
select product_id, max(id) as max_id from sales group by product_id
) m on m.product_id = p.id
left join sales s on s.id = m.max_id
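A quick way to check the max(id) approach is to run it against the sample data with Python's sqlite3 module. In this sketch the fourth sales row (10 computers, added last) is an assumption that is not in the question; it is there only to show where "latest sale" and "largest sale" differ:

```python
import sqlite3

# Latest sale per product: find max(id) per product, then join back to sales.
LATEST_SALE = """
select p.name, s.quantity
from products p
left join (
    select product_id, max(id) as max_id from sales group by product_id
) m on m.product_id = p.id
left join sales s on s.id = m.max_id
order by p.id
"""

# Largest sale per product, as in the max(quantity) subquery further up.
LARGEST_SALE = """
select p.name, sl.quantity
from products p
left join (
    select product_id, max(quantity) as quantity from sales group by product_id
) sl on p.id = sl.product_id
order by p.id
"""

con = sqlite3.connect(":memory:")
con.executescript("""
create table products (id int primary key, name varchar(20));
create table sales (id int primary key, quantity int,
                    product_id int references products(id));
insert into products values (1, 'TV'), (2, 'RADIO'), (3, 'COMPUTER');
insert into sales values (1, 50, 2), (2, 100, 3), (3, 200, 3);
""")

before = con.execute(LATEST_SALE).fetchall()   # data exactly as in the question

# Hypothetical extra row: a newer but smaller computer sale.
con.execute("insert into sales values (4, 10, 3)")
latest = con.execute(LATEST_SALE).fetchall()
largest = con.execute(LARGEST_SALE).fetchall()

print(before)
print(latest)
print(largest)
```

With the original three sales both queries give the expected table; once a newer, smaller sale exists, only the max(id) version still returns the latest one.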
For the sake of clarity I will rename the tables and explain what I want to achieve:
There is an input form with options that return category IDs. I want to find every 'Product' that has one or more categories, all of which are inside the array passed from the form.
Products table
ID Title
1 Pizza
2 Ice Cream
Categories table
ID Title
1 Baked food
2 Hot food
ProductsCategories table
ID ProductId CategoryId
1 1 1
2 1 2
So if I pass [1,2] the query should return the Product with id 1, since all of its ProductsCategories are inside the requested array, but if I pass only 1 or 2, the query should return no results.
Currently I have the following query, which works, but for some reason if I create a second Product and give it a ProductCategory with the same CategoryId as the first product, the query returns null...
SELECT products.*
FROM products
JOIN products_categories
ON products_categories.product_id= products.id
WHERE products_categories.category_id IN (1, 2)
HAVING COUNT(*) = (select count(*) from products_categories pc
WHERE pc .product_id = products.id)
All help is deeply appreciated! Cheers!
To match all values in the IN clause, you additionally need to know the number of categories passed, and use that number in the HAVING clause:
SELECT
p.*,
GROUP_CONCAT(c.title) AS categories
FROM
Products p
INNER JOIN ProductsCategories pc ON pc.productId = p.ID
INNER JOIN Categories c ON c.ID = pc.categoryId
WHERE
pc.categoryId IN (1,2)
GROUP BY
p.id
HAVING
COUNT(DISTINCT pc.categoryId) = 2 -- this is # of unique categories in IN clause
So in case IN (1,2) result is:
+----+-------+---------------------+
| id | title | categories |
+----+-------+---------------------+
| 1 | Pizza | Baked Food,Hot Food |
+----+-------+---------------------+
1 row in set
In case IN (1,3) result is Empty set (no results).
#mitkosoft, thanks for your answer, but sadly the query does not produce the needed results. If the product's categories are only partially in the passed categories, the product is still returned. Additionally, I might not know how many parameters are sent by the form.
Luckily I managed to create the query that does the trick and works perfectly fine (at least so far)
SELECT products.*,
COUNT(*) as resultsCount,
(SELECT COUNT(*) FROM products_categories pc WHERE pc.product_id = products.id) as categoriesCount
FROM products
JOIN products_categories AS productsCategories
ON productsCategories.product_id= products.id
WHERE productsCategories.category_id IN (7, 15, 8, 1, 50)
GROUP BY products.id
HAVING resultsCount = categoriesCount
ORDER BY amount DESC #optional
That way the query is flexible and gives me exactly what I needed! - Only those products that have all their categories inside the search parameters(not partially).
Cheers! :)
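The idea behind that query (the number of a product's categories that fall inside the list must equal the product's total number of categories) can be sketched with Python's sqlite3 module. The helper name products_fully_inside, the second product's category rows and category id 3 are assumptions for illustration, and the comparison is written directly in HAVING instead of through select-list aliases:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE products (id INT PRIMARY KEY, title VARCHAR(50));
CREATE TABLE products_categories (id INT PRIMARY KEY, product_id INT, category_id INT);
INSERT INTO products VALUES (1, 'Pizza'), (2, 'Ice Cream');
-- Pizza has categories 1 and 2; Ice Cream (hypothetically) has 1 and 3.
INSERT INTO products_categories VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 3);
""")

def products_fully_inside(category_ids):
    """Return titles of products whose categories are ALL within category_ids."""
    placeholders = ", ".join("?" for _ in category_ids)  # one ? per passed id
    sql = f"""
    SELECT p.title
    FROM products p
    JOIN products_categories pc ON pc.product_id = p.id
    WHERE pc.category_id IN ({placeholders})
    GROUP BY p.id, p.title
    -- matched categories must equal the product's total number of categories
    HAVING COUNT(*) = (SELECT COUNT(*)
                       FROM products_categories pc2
                       WHERE pc2.product_id = p.id)
    ORDER BY p.id
    """
    return [row[0] for row in con.execute(sql, list(category_ids))]

print(products_fully_inside([1, 2]))     # only Pizza is fully covered
print(products_fully_inside([1]))        # nothing is fully covered
print(products_fully_inside([1, 2, 3]))  # both products are covered
```

Because the placeholders are generated from the list, the query does not need to know in advance how many ids the form sends.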
I have 3 tables as follows :
Table 1: Product
id_product [Primary Key],added_time.
Table 2: Category
id_category [Primary Key],Category_name.
Table 3: product_category
id_category,id_product [Both Foreign Keys]
I want to pull Data as
Category_name,No Of Products in this Category,Last time when product was added to Category(Latest product added_time).
You could use this SQL:
SELECT Category.Category_name,
Count(DISTINCT Product.id_product) AS num_products,
Max(Product.added_time) last_added_time
FROM Category
LEFT JOIN product_category
ON product_category.id_category = Category.id_category
LEFT JOIN Product
ON Product.id_product = product_category.id_product
GROUP BY Category.Category_name;
Note that by using LEFT JOIN you will be certain to list all categories even those for which no products exist. If you don't want those, replace both LEFT keywords with INNER.
Note also that in standard SQL you need to GROUP BY any columns you mention in the SELECT list, unless they are aggregated, like with MAX or COUNT.
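As a rough check of the query above, here is a sketch using Python's sqlite3 module. The category names and timestamps are invented, and added_time is stored as ISO-formatted text (an assumption, so that MAX compares the values chronologically in SQLite):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Product (id_product INT PRIMARY KEY, added_time TEXT);
CREATE TABLE Category (id_category INT PRIMARY KEY, Category_name VARCHAR(50));
CREATE TABLE product_category (id_category INT, id_product INT);
INSERT INTO Category VALUES (1, 'Books'), (2, 'Games');
INSERT INTO Product VALUES (1, '2024-01-10 09:00:00'), (2, '2024-02-01 12:00:00');
INSERT INTO product_category VALUES (1, 1), (1, 2);   -- 'Games' stays empty
""")

rows = con.execute("""
SELECT Category.Category_name,
       COUNT(DISTINCT Product.id_product) AS num_products,
       MAX(Product.added_time) AS last_added_time
FROM Category
LEFT JOIN product_category
       ON product_category.id_category = Category.id_category
LEFT JOIN Product
       ON Product.id_product = product_category.id_product
GROUP BY Category.Category_name
ORDER BY Category.Category_name
""").fetchall()

for row in rows:
    print(row)
```

The empty category still appears, with a count of 0 and NULL (None in Python) as its last added time.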
SELECT C.`Category_name`,
(SUM(IF(P.`id_product` IS NULL,0,1))) AS No_of_Products,
MAX(P.`added_time`) AS Latest_time
FROM
Category C
LEFT JOIN
product_category P_C ON C.`id_category` = P_C.`id_category`
LEFT JOIN
Product P ON P.`id_product` = P_C.`id_product`
GROUP BY C.`id_category`
Hope this helps.
I am creating a MySQL database. I have a column named, name of the product. The product has one Main Category. However, it can belong to n-number of sub categories. I need to map these sub categories to that product. How can I do that?
Table1 - Product Info
Columns - ID, Name, MainCategory, SubCategory (Not Sure Exactly)
Table2 - MainCategory
Columns - ID, Name
Table3 - SubCategory
Columns - ID, Name
Table1 has 1-to-1 relationship to Table2.
How do I map Table1 to Table3? Am I doing this wrong?
Thought: I want to do it in the manner so that whenever I click on any subcategory name on a website, I get a list of all the products under that category. Just like it happens in Online Stores Website.
Example: The product External Hard Drive will come under Computer Accessories. But I want to give it sub-categories like offer_running, 500GB, SomeCompanyName, black etc.
Hope I explained my question. Please help me in the designing of database. I have got all the basics of DBMS, but I don't know how to involve keywords and to store & map them in a database.
How about the following structure:
Table1 - Product Info
Columns - ID, Name, Main_Category_ID
Table2 - Category
Columns - ID, Name
Table3 - Product_Category
Columns - ID, Product_ID, Category_ID, Name
Then you could use the following query to get all of the information:
SELECT p.*, c.Name
FROM [Product Info] p
INNER JOIN Product_Category pc ON p.ID = pc.Product_ID
INNER JOIN Category c ON pc.Category_ID = c.ID
Notice that in the Product Info table I added a Main_Category_ID field. If you really need to identify the main category then this would work:
SELECT p.ID, p.Name, NULL AS MainCat, c.Name AS SubCat
FROM [Product Info] p
INNER JOIN Product_Category pc ON p.ID = pc.Product_ID
INNER JOIN Category c ON pc.Category_ID = c.ID
UNION
SELECT p.ID, p.Name, c.Name AS MainCat, NULL AS SubCat
FROM [Product Info] p
INNER JOIN Category c ON p.Main_Category_ID = c.ID
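To illustrate how such a mapping table answers the "click on a sub-category, list its products" case, here is a small sketch with Python's sqlite3 module. The table is named product_info to avoid the space in the name, the ID and Name columns of Product_Category are left out, and the second product and the helper products_in are assumptions for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE category (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE product_info (id INT PRIMARY KEY, name VARCHAR(50),
                           main_category_id INT REFERENCES category(id));
-- One row per (product, sub-category) pair: the many-to-many mapping.
CREATE TABLE product_category (product_id INT REFERENCES product_info(id),
                               category_id INT REFERENCES category(id),
                               PRIMARY KEY (product_id, category_id));
INSERT INTO category VALUES (1, 'Computer Accessories'), (2, '500GB'),
                            (3, 'black'), (4, 'offer_running');
INSERT INTO product_info VALUES (1, 'External Hard Drive', 1), (2, 'USB Mouse', 1);
INSERT INTO product_category VALUES (1, 2), (1, 3), (2, 3);
""")

def products_in(category_name):
    """Names of all products mapped to the given sub-category."""
    sql = """
    SELECT p.name
    FROM product_info p
    INNER JOIN product_category pc ON p.id = pc.product_id
    INNER JOIN category c ON pc.category_id = c.id
    WHERE c.name = ?
    ORDER BY p.name
    """
    return [row[0] for row in con.execute(sql, (category_name,))]

print(products_in("black"))
print(products_in("500GB"))
```

Adding a new keyword to a product is then just one more row in product_category, with no change to the product table itself.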
First of all, if table 1 and table 2 have a 1-1 relationship, there is no need for having two separate tables. This will make query processing faster and will eliminate the need for an extra join.
Can you clarify what the ID column in table 3 refers to? Is this an unique ID for table 3 or the ID from table 1?
If former is the case, then you need to have the ID column of table 1 as the foreign key in table 3. That will resolve your issues.
Hope that helps.
I have 3 tables: books, book_categories, categories.
book_categories table "joins" books and categories. It contains columns: id,book_id,category_id.
So one Book may belong to many categories and one Category may have many books.
I need a query which retrieves all books from given_category except books which also belong to given_set_of_categories. So for example I want all books from category A, but only if they don't also belong to category B or C. I also need to sort (order) the result by the Book.inserted column.
I know how to get all books from given_category with 2 joins, but can't figure out how to exclude books from other categories in the result. I can't filter books in PHP because I am paginating the search result.
select books.*
from books
join book_categories on book_categories.book_id = books.id
where
book_categories.category_id = <given category>
and books.id not in
(
select book_id from book_categories
where category_id in (<given set of cat>)
)
order by books.inserted
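A runnable sketch of this NOT IN approach, using Python's sqlite3 module. The book titles, dates, the numeric category ids (A = 1, B = 2, C = 3) and the helper books_in_but_not_in are assumptions made for illustration, and books is assumed to have an id column:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE books (id INT PRIMARY KEY, title VARCHAR(50), inserted TEXT);
CREATE TABLE book_categories (id INT PRIMARY KEY, book_id INT, category_id INT);
INSERT INTO books VALUES (1, 'First', '2024-03-01'), (2, 'Second', '2024-01-01'),
                         (3, 'Third', '2024-02-01'), (4, 'Fourth', '2024-01-15');
-- Book 2 is in A and B, book 3 only in C, books 1 and 4 only in A.
INSERT INTO book_categories VALUES (1, 1, 1), (2, 2, 1), (3, 2, 2),
                                   (4, 3, 3), (5, 4, 1);
""")

def books_in_but_not_in(given_category, excluded_categories):
    """Titles of books in given_category that are in none of excluded_categories."""
    placeholders = ", ".join("?" for _ in excluded_categories)
    sql = f"""
    SELECT b.title
    FROM books b
    JOIN book_categories bc ON bc.book_id = b.id
    WHERE bc.category_id = ?
      AND b.id NOT IN (SELECT book_id
                       FROM book_categories
                       WHERE category_id IN ({placeholders}))
    ORDER BY b.inserted
    """
    params = [given_category, *excluded_categories]
    return [row[0] for row in con.execute(sql, params)]

# All books from A (1) that are not also in B (2) or C (3), oldest first.
print(books_in_but_not_in(1, [2, 3]))
```

Since the filtering and ordering happen in SQL, a LIMIT/OFFSET clause for pagination could be appended to the same statement.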
So, if you mean it is in one category but not in any other:
AND EXISTS(SELECT * FROM books b JOIN book_categories bc ON b.id = bc.book_id JOIN categories c ON bc.category_id = c.id AND c.id = 'A')
AND NOT EXISTS(SELECT * FROM books b JOIN book_categories bc ON b.id = bc.book_id JOIN categories c ON bc.category_id = c.id AND c.id != 'A')
I think this can be achieved through counting, provided that book_categories entries are unique, i.e. no combination of book_id and category_id repeats. Instead of trying to exclude records directly, we select from the combined set of categories (the given category plus the given set), and then count the book_id entries that belong to the given category:
COUNT(IF(category_id = <given_category>, 1, NULL)) as cnt_exists
and after ensuring that it contains the required category, we count the total to see if it belongs to any other category as well:
COUNT(*) AS cnt_total
SELECT * FROM books b JOIN (
SELECT book_id,
COUNT(IF(category_id = <given_category>, 1, NULL)) AS cnt_exists,
COUNT(*) AS cnt_total
FROM book_categories
WHERE category_id IN (<given_category>, <given_set_of_categories>)
GROUP BY book_id
) bc ON b.id = bc.book_id AND
cnt_exists = 1 AND cnt_total = 1 ORDER BY b.inserted
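The counting idea can be tried out with Python's sqlite3 module. In this sketch the sample rows and the numeric category ids (A = 1, B = 2, C = 3) are invented, the helper name books_only_in is hypothetical, and CASE WHEN stands in for MySQL's IF(), which SQLite does not provide under that name:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE books (id INT PRIMARY KEY, title VARCHAR(50), inserted TEXT);
CREATE TABLE book_categories (id INT PRIMARY KEY, book_id INT, category_id INT,
                              UNIQUE (book_id, category_id));
INSERT INTO books VALUES (1, 'First', '2024-03-01'), (2, 'Second', '2024-01-01'),
                         (3, 'Third', '2024-02-01'), (4, 'Fourth', '2024-01-15');
INSERT INTO book_categories VALUES (1, 1, 1), (2, 2, 1), (3, 2, 2),
                                   (4, 3, 3), (5, 4, 1);
""")

def books_only_in(given_category, excluded_categories):
    """Books in given_category with no row in any of excluded_categories."""
    all_ids = [given_category, *excluded_categories]
    placeholders = ", ".join("?" for _ in all_ids)
    sql = f"""
    SELECT b.title
    FROM books b
    JOIN (
        SELECT book_id,
               -- 1 if the book is in the given category, else 0
               COUNT(CASE WHEN category_id = ? THEN 1 END) AS cnt_exists,
               -- rows within the given category plus the excluded set
               COUNT(*) AS cnt_total
        FROM book_categories
        WHERE category_id IN ({placeholders})
        GROUP BY book_id
    ) bc ON b.id = bc.book_id
    WHERE bc.cnt_exists = 1 AND bc.cnt_total = 1
    ORDER BY b.inserted
    """
    return [row[0] for row in con.execute(sql, [given_category, *all_ids])]

print(books_only_in(1, [2, 3]))
```

The UNIQUE constraint in the sketch enforces the uniqueness assumption stated above; without it, a duplicated (book_id, category_id) row would push the counts above 1 and wrongly exclude a book.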