COUNT(), GROUP BY and NULL values in a MySQL query

I have this MySQL query that only partially works:
SELECT p.product_id, p.product_name, p.sales,
p.length, p.hits, COUNT(w.product_id) AS favorites
FROM `products` AS p, `products_mf_xref` AS m,
`wishlist_items` AS w
WHERE m.manufacturer_id = '1'
AND p.product_id = m.product_id
AND m.product_id = w.product_id
GROUP BY m.product_id ORDER BY p.product_id ASC
I'm retrieving some fields from one table and trying to get the number of times each product is referenced in another table (the last one, wishlist_items). The query works, but I only get the products that are referenced at least once in the wish list table.
I read that COUNT() skips NULL values, which makes sense, but I also need the products that are not referenced in the wish list table, that is, products where COUNT(w.product_id) equals 0.
Any idea how to retrieve all the products, including those with no wish list rows?
How should I change my query? It's driving me mad!
Thanks in advance!

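Use LEFT JOIN for the wish list table, so that products without any wish list rows are kept, and group by the product's own id so that every product gets its own row:
SELECT p.product_id, p.product_name, p.sales,
p.length, p.hits, COUNT(w.product_id) AS favorites
FROM `products` AS p
-- inner join: only products linked to the manufacturer are wanted
INNER JOIN `products_mf_xref` AS m
ON p.product_id = m.product_id
-- left join: keep products that nobody has wish-listed (counted as 0)
LEFT JOIN `wishlist_items` AS w ON p.product_id = w.product_id
WHERE m.manufacturer_id = '1'
GROUP BY p.product_id ORDER BY p.product_id ASC
By the way, as far as possible, use JOIN ... ON to mirror the data relationships and WHERE for filters.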
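To see why the LEFT JOIN matters, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database through Python's sqlite3 module. SQLite only stands in for MySQL here, the table layout is cut down to the columns that matter, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption: the
# real tables have more columns; only the join keys are modelled here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE products_mf_xref (product_id INTEGER, manufacturer_id INTEGER);
    CREATE TABLE wishlist_items (item_id INTEGER PRIMARY KEY, product_id INTEGER);

    INSERT INTO products VALUES (1, 'lamp'), (2, 'desk'), (3, 'chair');
    -- products 1 and 2 belong to manufacturer 1, product 3 to manufacturer 2
    INSERT INTO products_mf_xref VALUES (1, 1), (2, 1), (3, 2);
    -- only product 1 has been wish-listed (twice)
    INSERT INTO wishlist_items (product_id) VALUES (1), (1);
""")

favorites = conn.execute("""
    SELECT p.product_id, p.product_name, COUNT(w.product_id) AS favorites
    FROM products AS p
    INNER JOIN products_mf_xref AS m ON m.product_id = p.product_id
    LEFT JOIN wishlist_items AS w ON w.product_id = p.product_id
    WHERE m.manufacturer_id = 1
    GROUP BY p.product_id, p.product_name
    ORDER BY p.product_id
""").fetchall()

print(favorites)  # [(1, 'lamp', 2), (2, 'desk', 0)]
```

The desk has no wish list rows, so the LEFT JOIN yields one row with w.product_id set to NULL, and COUNT(w.product_id) skips that NULL and reports 0.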

Related

How to get orders count sub-queries

I am having difficulty getting the order count with the following SQL query:
select
d.id,
d.title,
count(distinct o.id)
from store s
left join `order` o on o.store_id = s.id
left join order_product op on op.order_id=o.id
left join store_product sp on sp.id = op.product_id
left join product p on p.id = sp.product_id
left join department_category_to_entity dce1 on dce1.entity_type IN ('Product') and dce1.entity_id = p.id
left join department_category_to_entity dce2 on op.status != 'replaced' and
op.replacement_id is null and
dce2.entity_type IN ('StoreProduct') and
dce2.entity_id = sp.id
left join department_category_to_entity dce3 on op.status = 'replaced' and
op.replacement_id is not null and
dce3.entity_type IN ('StoreProduct') and
dce3.entity_id = op.replacement_id
left join department_category dc on dc.id = p.department_category_id or
dc.id = dce1.category_id or
dc.id = dce2.category_id or
dc.id = dce3.category_id
left join department d on d.id = dc.department_id
where d.id is not null
group by d.id;
Is it possible to get the order count without sub-queries, or at least to get a correct count of orders? Please help. Thank you!
You have LEFT JOIN, which says to keep the row from the 'left' table even if there is no matching row in the 'right' table. But, on the other hand, you are GROUPing BY a column in the last of a chain of LEFT JOINs! Perhaps you meant JOIN instead of LEFT JOIN?
Saying where d.id is not null is roughly equivalent to saying "Oops, all those LEFT JOINs could have been JOINs."
With GROUP BY and JOINs (LEFT or otherwise), you are doing an "inflate-deflate". What logically happens is all the JOINing is done to build a huge intermediate table with all the valid combinations. Then the COUNT(*) and GROUP BY are done. This tends to make the COUNTs (and SUMs, etc) have bigger values than expected.
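The "inflate-deflate" effect is easy to reproduce on a tiny data set. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the three tables (department, orders, order_line) are hypothetical and much simpler than the schema in the question.

```python
import sqlite3

# Hypothetical, simplified schema: one department, two orders,
# and four order lines (three of them on the same order).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, department_id INTEGER);
    CREATE TABLE order_line (id INTEGER PRIMARY KEY, order_id INTEGER);

    INSERT INTO department VALUES (1, 'grocery');
    INSERT INTO orders VALUES (10, 1), (11, 1);
    INSERT INTO order_line (order_id) VALUES (10), (10), (10), (11);
""")

row = conn.execute("""
    SELECT d.id,
           COUNT(*)             AS inflated,     -- counts joined rows
           COUNT(DISTINCT o.id) AS order_count   -- counts actual orders
    FROM department d
    JOIN orders o      ON o.department_id = d.id
    JOIN order_line ol ON ol.order_id = o.id
    GROUP BY d.id
""").fetchone()

print(row)  # (1, 4, 2)
```

The join builds four intermediate rows for the department, so COUNT(*) says 4 although there are only two orders; COUNT(DISTINCT o.id) deflates the result back to 2.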
What's the most direct route to get from department to order? It does not seem to involve store, so get rid of that table.
Are other tables irrelevant?
Even after addressing those issues, you may still be getting the wrong value. Please provide, for starters, SHOW CREATE TABLE for each table.

Pagination count query with multiple joins

I had this question about filtering different products by selecting options. That query has been solved here: Filter products by options.
My problem is now with the count query for the pagination. For instance, this query returns 37 rows, each with a count of 1.
SELECT COUNT(DISTINCT p.id) AS number
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1
But if I remove DISTINCT:
SELECT COUNT(p.id) AS number
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1
this also returns 37 rows, but with mixed numbers.
What am I doing wrong? I know I could work around this by running an additional count on this result set, but I don't think that is the right solution.
Also, in the previous question it was suggested that I should not need DISTINCT and that the query is flawed because of it. Can you tell me what the problem is?
The only difference in your queries is here:
SELECT COUNT(DISTINCT p.id) AS number
vs.
SELECT COUNT(p.id) AS number
So you get the same number of result rows, because FROM, WHERE, and HAVING are all the same. Only the data per row you select is different.
In the first case it's the number of distinct IDs that are not null, which is always 1, because you group by that ID. (You say: Look at all records with ID 5 and count how many different IDs you find in these records. Well, the ID in every record with ID 5 is 5. So it's only one ID.)
In the second case you count IDs that are not null. As the ID is never null, this is the same as counting records: COUNT(*). And you'd better make this clear by using COUNT(*) instead of the obfuscated COUNT(p.id).
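A small reproduction makes the difference visible. This is only a sketch: it uses Python's sqlite3 module in place of MySQL, keeps just the columns needed for the joins, and the rows are invented.

```python
import sqlite3

# Cut-down version of the tables from the question (assumed columns only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    CREATE TABLE product_images (id INTEGER PRIMARY KEY, product_id INTEGER);
    CREATE TABLE product_options (product_id INTEGER, option_id INTEGER);

    INSERT INTO products VALUES (1), (2);
    -- product 1 has three images, product 2 has one
    INSERT INTO product_images (product_id) VALUES (1), (1), (1), (2);
    INSERT INTO product_options VALUES (1, 1), (2, 1);
""")

rows = conn.execute("""
    SELECT p.id,
           COUNT(DISTINCT p.id) AS distinct_ids,  -- always 1 inside a p.id group
           COUNT(p.id)          AS joined_rows    -- same as COUNT(*) here
    FROM products p
    LEFT JOIN product_images pi  ON p.id = pi.product_id
    LEFT JOIN product_options po ON p.id = po.product_id
    WHERE po.option_id IN (1)
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

print(rows)  # [(1, 1, 3), (2, 1, 1)]
```

Each group produces one result row, so neither variant gives a single total for pagination; the second column is always 1 and the third simply reflects how many joined rows (here, images) each product has.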
OK guys, it is clearer now. Would something like this be good enough to solve it?
SELECT COUNT(*) AS number
FROM (SELECT p.id
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1) AS count_table
I will go with this one. Thanks.
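For reference, here is a minimal sketch of that wrapped count, run with Python's sqlite3 module as a stand-in for MySQL. The tables are reduced to the join columns and the sample rows are made up; the price and active filters are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    CREATE TABLE product_images (id INTEGER PRIMARY KEY, product_id INTEGER);
    CREATE TABLE product_options (product_id INTEGER, option_id INTEGER);

    INSERT INTO products VALUES (1), (2), (3);
    INSERT INTO product_images (product_id) VALUES (1), (1);
    -- products 1 and 2 have option 1; product 3 only has option 2
    INSERT INTO product_options VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")

# Inner query: one row per matching product. Outer query: count those rows.
total = conn.execute("""
    SELECT COUNT(*) AS number
    FROM (SELECT p.id
          FROM products p
          LEFT JOIN product_images pi  ON p.id = pi.product_id
          LEFT JOIN product_options po ON p.id = po.product_id
          WHERE po.option_id IN (1)
          GROUP BY p.id
          HAVING COUNT(DISTINCT po.option_id) = 1) AS count_table
""").fetchone()[0]

print(total)  # 2
```

The duplicate rows caused by the two images of product 1 disappear in the GROUP BY, so the outer COUNT(*) sees exactly one row per matching product.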

Using cases to determine which table should join

I have four tables: products, product_histories, vendor_invoices and invoices.
This is the query I have developed:
SELECT p.product_id, product_name, vendor_name FROM products AS p
INNER JOIN product_histories AS ph ON p.product_id = ph.product_id
CASE
WHEN ph.history_type = "P" THEN
LEFT JOIN vendor_invoices AS vi ON link_id = vi.vi_id
WHEN ph.history_type = "S" THEN
LEFT JOIN invoices AS i ON i.invoice_id = link_id
END
ORDER BY ph_id ASC
What I want is this: if ph.history_type is P, the query should join vendor_invoices, and if it is S, it should join invoices. But MySQL reports a syntax error.
Can anyone help me out with it, or show a better way to achieve this?
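SQL does not allow CASE to choose between JOIN clauses, but the condition can be moved into each ON clause so that both tables are LEFT JOINed and only the matching one contributes a row. The following is a sketch of that idea, not a tested answer for the real schema: it runs on SQLite through Python's sqlite3 module, and the column layout (including customer_name on invoices) is an assumption made for the example.

```python
import sqlite3

# Hypothetical column layout; only the names used in the question are real.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE product_histories (ph_id INTEGER PRIMARY KEY, product_id INTEGER,
                                    history_type TEXT, link_id INTEGER);
    CREATE TABLE vendor_invoices (vi_id INTEGER PRIMARY KEY, vendor_name TEXT);
    CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, customer_name TEXT);

    INSERT INTO products VALUES (1, 'lamp');
    -- same link_id on purpose: history_type decides which table it points to
    INSERT INTO product_histories VALUES (1, 1, 'P', 100), (2, 1, 'S', 100);
    INSERT INTO vendor_invoices VALUES (100, 'Acme');
    INSERT INTO invoices VALUES (100, 'Bob');
""")

history = conn.execute("""
    SELECT p.product_id, p.product_name, ph.history_type,
           vi.vendor_name, i.customer_name
    FROM products AS p
    INNER JOIN product_histories AS ph ON p.product_id = ph.product_id
    LEFT JOIN vendor_invoices AS vi
           ON ph.history_type = 'P' AND vi.vi_id = ph.link_id
    LEFT JOIN invoices AS i
           ON ph.history_type = 'S' AND i.invoice_id = ph.link_id
    ORDER BY ph.ph_id ASC
""").fetchall()

print(history)
# [(1, 'lamp', 'P', 'Acme', None), (1, 'lamp', 'S', None, 'Bob')]
```

For each history row only one of the two joins can match, so the columns from the other table come back as NULL (None in Python); COALESCE can merge them into a single column if that is more convenient.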

Best way to write this query? Several JOINS

I have the query below. It works, but I am wondering whether it is the best approach, as it will be going against thousands of records. I will try to explain it as best I can.
SELECT i.*,
p.file AS item_pic,
i_f.id AS favorite_id,
COALESCE(f.favorite_count, 0) AS favorite_count,
COALESCE(b.num_buys, 0) AS num_buys,
COALESCE(c.comment_count, 0) AS comment_count
FROM items i
INNER JOIN (SELECT file,
item_id
FROM item_pics
ORDER BY item_pics.id ASC) AS p
ON p.item_id = i.id
LEFT JOIN (SELECT COUNT(*) AS favorite_count,
item_id
FROM item_favorites
GROUP BY item_id) AS f
ON f.item_id = i.id
LEFT JOIN (SELECT COUNT(*) AS num_buys,
item_id
FROM purchases
GROUP BY item_id) AS b
ON b.item_id = i.id
LEFT JOIN (SELECT COUNT(*) AS comment_count,
item_id
FROM comments
GROUP BY item_id) AS c
ON c.item_id = i.id
LEFT JOIN item_favorites AS i_f
ON i.id = i_f.item_id
AND i_f.userid = '14'
GROUP BY i.id
LIMIT 0, 20
So we are selecting the items in the database. The first join is for a picture (Items have multiple pictures but I only want one).
The next join is for favorite count. Each time a user favorites something it adds it to the table favorites with some info, so I am just trying to get the total number of favorites for that item.
Next up is the number of purchases for this item. Pretty much the same as favorites.
After that it is for comments. Again this is just like the purchases and favorites count.
The last join is to see whether the logged-in user (id 14) has favorited this item; if not, I use COALESCE to return 0.
Like I said, this all works correctly, but it takes a few seconds to load on a table of about 6700 items and about 180K rows in the purchases table, even though only 20 rows are loaded at a time (I do a scrolling/load similar to Facebook/Twitter). Indexes have been properly set up on all tables. Once this is complete/correct, I would like to know how to limit results to purchases in the last seven days and order by number of purchases (num_buys).
EDIT: Results from EXPLAIN
I suppose you want the first picture (lowest id), and that pictures are required, whereas everything else is optional.
I guess you're doing subqueries because you think joining on uncorrelated subqueries (hitting the joined tables just once) will be faster than correlated subqueries or a plain JOIN. However, you end up having to look up the records twice, and the second lookup (for the actual join) doesn't get to use an index because derived (temporary) tables don't have indexes.
Try normal JOINs:
SELECT i.*,
p.file AS item_pic,
COALESCE(i_f.id, 0) AS favorite_id,
COUNT(f.item_id) AS favorite_count,
COUNT(b.item_id) AS num_buys,
COUNT(c.item_id) AS comment_count
FROM items i
STRAIGHT_JOIN item_pics p
ON p.item_id = i.id
LEFT JOIN item_pics p2
ON p2.item_id = i.id
AND p2.id < p.id
LEFT JOIN item_favorites f
ON f.item_id = i.id
LEFT JOIN purchases b
ON b.item_id = i.id
LEFT JOIN comments c
ON c.item_id = i.id
LEFT JOIN item_favorites AS i_f
ON i_f.item_id = i.id
AND i_f.userid = '14'
WHERE p2.id IS NULL
GROUP BY i.id
LIMIT 20
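The double join on pictures, combined with WHERE p2.id IS NULL, is an anti-join: it keeps only the picture for which no other picture of the same item has a lower id. Be aware that joining several one-to-many tables at once multiplies the rows per item, so these COUNTs can come out too high; counting a unique column of each table with COUNT(DISTINCT ...) avoids that.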
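Here is the picture anti-join in isolation, as a minimal sketch on SQLite via Python's sqlite3 module (a stand-in for MySQL, with invented rows and only the columns the join needs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item_pics (id INTEGER PRIMARY KEY, item_id INTEGER, file TEXT);

    INSERT INTO items VALUES (1, 'lamp'), (2, 'desk');
    INSERT INTO item_pics VALUES (1, 1, 'a.jpg'), (2, 1, 'b.jpg'),
                                 (3, 2, 'c.jpg'), (4, 2, 'd.jpg');
""")

first_pics = conn.execute("""
    SELECT i.id, p.file
    FROM items i
    JOIN item_pics p
      ON p.item_id = i.id
    LEFT JOIN item_pics p2            -- look for an earlier picture of the item
      ON p2.item_id = i.id
     AND p2.id < p.id
    WHERE p2.id IS NULL               -- keep p only if no earlier picture exists
    ORDER BY i.id
""").fetchall()

print(first_pics)  # [(1, 'a.jpg'), (2, 'c.jpg')]
```

For b.jpg and d.jpg the LEFT JOIN finds a picture with a lower id, so p2.id is not NULL and those rows are filtered out, leaving one picture per item.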

MySQL query, dealing with active and inactive products

I have been stuck on a problem for a few hours. Maybe someone can help me out.
I have the following query, which shows the top sellers. The status of each product (active or not) is stored in b.Article_Status (0=inactive, 1=active).
How do I drop from the result list the products that have no active product in the product family at the moment? A product should still be shown if an old article was ordered (and so is in table order_items) and is now inactive, while the active one has not been ordered yet.
The current query looks as follows. I already found a solution that works when the currently active product has been ordered at least once, but the case mentioned above is still a problem.
SELECT count( a.order_itemid ) AS numOrders, c.Product_ID, c.Product_Name, d.producer_name
FROM order_items a
LEFT OUTER JOIN product_article b ON b.Article_ID = a.order_itemid
LEFT OUTER JOIN product c ON b.Article_Productid = c.Product_ID
LEFT OUTER JOIN producer d ON c.Product_Producer = d.producer_id
GROUP BY c.Product_ID
ORDER BY `numOrders` DESC
The solution was a WHERE EXISTS subquery:
SELECT count( a.order_itemid ) AS numOrders, c.Product_ID, c.Product_Name, d.producer_name
FROM order_items a
LEFT OUTER JOIN product_article b ON b.Article_ID = a.order_itemid
LEFT OUTER JOIN product c ON b.Article_Productid = c.Product_ID
LEFT OUTER JOIN producer d ON c.Product_Producer = d.producer_id
WHERE EXISTS (SELECT * FROM product_article x WHERE c.Product_ID = x.Article_Productid AND x.Article_Status = 1)
GROUP BY c.Product_ID
ORDER BY `numOrders` DESC
LIMIT 5
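A minimal sketch of how the EXISTS filter behaves, using Python's sqlite3 module as a stand-in for MySQL. The producer table is left out, the remaining tables only carry the columns used in the joins, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_items (order_itemid INTEGER);
    CREATE TABLE product_article (Article_ID INTEGER PRIMARY KEY,
                                  Article_Productid INTEGER,
                                  Article_Status INTEGER);
    CREATE TABLE product (Product_ID INTEGER PRIMARY KEY, Product_Name TEXT);

    INSERT INTO product VALUES (1, 'Shoe'), (2, 'Hat');
    -- Shoe: old article 10 is inactive, new article 11 is active
    -- Hat:  its only article 20 is inactive
    INSERT INTO product_article VALUES (10, 1, 0), (11, 1, 1), (20, 2, 0);
    -- only the inactive articles have been ordered so far
    INSERT INTO order_items VALUES (10), (10), (20);
""")

top_sellers = conn.execute("""
    SELECT COUNT(a.order_itemid) AS numOrders, c.Product_ID, c.Product_Name
    FROM order_items a
    LEFT OUTER JOIN product_article b ON b.Article_ID = a.order_itemid
    LEFT OUTER JOIN product c ON b.Article_Productid = c.Product_ID
    WHERE EXISTS (SELECT 1 FROM product_article x
                  WHERE x.Article_Productid = c.Product_ID
                    AND x.Article_Status = 1)
    GROUP BY c.Product_ID, c.Product_Name
    ORDER BY numOrders DESC
    LIMIT 5
""").fetchall()

print(top_sellers)  # [(2, 1, 'Shoe')]
```

The Shoe stays in the list because its family contains an active article, even though only the inactive one was ordered; the Hat is dropped because none of its articles is active.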