SQL -- Counting with nested subquery - mysql

I have a collection of tables in a relational database
products
categories
orders
line_items
customers
Products has a many-to-many relationship with categories (join table categories_products) and also has and belongs to many orders through line_items, which is a join table for products and orders with an id. A customer also has many orders.
I'm trying to put together some SQL that will give me this sort of response:
customer_id | customer_first_name | category_id | category_name | number_purchased
-----------------------------------
1 |Jack | 1 | Electronics | 15
2 |Jill | 1 | Electronics | 2
2 |Jill | 2 | Hiking | 3
This is the giant hunk of SQL I've been trying to use to get these values:
SELECT
DISTINCT customers.id AS customer_id,
customers.first_name AS customer_first_name,
categories.id AS category_id,
categories.name AS category_name,
(
SELECT count(li.id) FROM line_items li
INNER JOIN orders o ON li.order_id = o.id
INNER JOIN products p ON li.product_id = p.id
INNER JOIN categories_products cp ON cp.product_id = p.id
WHERE
o.customer_id = customer_id
AND o.status = 3
AND cp.category_id = category_id
) AS number_purchased
FROM orders
LEFT JOIN customers ON orders.customer_id = customers.id
LEFT JOIN line_items li ON li.order_id = orders.id
LEFT JOIN products ON products.id = li.product_id
LEFT JOIN categories_products catpr ON catpr.product_id = products.id
LEFT JOIN categories ON catpr.category_id = categories.id
Only the count itself is wrong. Instead of getting the number of line items a customer has bought in a specific category, I'm instead getting a count for all LineItems that have been part of a completed order.
How can I get the count to correctly represent the number of line_items purchased by a specific customer within a category?
NOTE: in the SQL text, o.status = 3 is using an enum to indicate that an Order is "complete."

I think your inner join with categories_products is screwing this up. You should set up a fiddle, like @Strawberry suggested, or try this:
SELECT
DISTINCT customers.id AS customer_id,
customers.first_name AS customer_first_name,
categories.id AS category_id,
categories.name AS category_name,
(
SELECT count(li.id) FROM line_items li
INNER JOIN orders o ON li.order_id = o.id
INNER JOIN products p ON li.product_id = p.id
WHERE
o.customer_id = customer_id
AND o.status = 3
) AS number_purchased
FROM orders
LEFT JOIN customers ON orders.customer_id = customers.id
LEFT JOIN line_items li ON li.order_id = orders.id
LEFT JOIN products ON products.id = li.product_id
LEFT JOIN categories_products catpr ON catpr.product_id = products.id
LEFT JOIN categories ON catpr.category_id = categories.id

If you want to correct your count, I would advise using a GROUP BY clause in the subquery. If you GROUP BY the order, each count covers only the specific order you matched when checking that the customer id was correct. I would also encourage you to look for mistakes in other parts of your SQL to clean up this hulking query. For example, make sure you really want DISTINCT, and that you really want left joins rather than inner joins; both choices can seriously hurt the performance of your program.
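One likely reason the count ignores the customer and the category: inside the subquery, the unqualified names customer_id and category_id resolve to the subquery's own columns (o.customer_id and cp.category_id), so both comparisons are always true. Below is a minimal sketch of an alternative that drops the subquery and counts with GROUP BY. It runs the SQL through Python's sqlite3 module so it can be checked without a MySQL server; the sample rows are made up, and the schema is trimmed to the columns the question mentions.

```python
import sqlite3

# SQLite stands in for MySQL here; schema and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY);
CREATE TABLE categories_products (category_id INTEGER, product_id INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status INTEGER);
CREATE TABLE line_items (id INTEGER PRIMARY KEY, order_id INTEGER, product_id INTEGER);

INSERT INTO customers VALUES (1, 'Jack'), (2, 'Jill');
INSERT INTO categories VALUES (1, 'Electronics'), (2, 'Hiking');
INSERT INTO products VALUES (10), (20);
INSERT INTO categories_products VALUES (1, 10), (2, 20);
INSERT INTO orders VALUES (100, 1, 3), (101, 2, 3), (102, 2, 1);
INSERT INTO line_items VALUES
  (1, 100, 10), (2, 100, 10),   -- Jack: two Electronics items on a completed order
  (3, 101, 10), (4, 101, 20),   -- Jill: one Electronics and one Hiking item, completed
  (5, 102, 20);                 -- Jill: Hiking item on an order that is not complete
""")

# One row per (customer, category); only completed orders (status = 3) are counted.
counts = conn.execute("""
SELECT c.id, c.first_name, cat.id, cat.name, COUNT(li.id) AS number_purchased
FROM customers c
JOIN orders o ON o.customer_id = c.id AND o.status = 3
JOIN line_items li ON li.order_id = o.id
JOIN categories_products cp ON cp.product_id = li.product_id
JOIN categories cat ON cat.id = cp.category_id
GROUP BY c.id, c.first_name, cat.id, cat.name
ORDER BY c.id, cat.id
""").fetchall()

for row in counts:
    print(row)
```

On this data the result is (1, 'Jack', 1, 'Electronics', 2), (2, 'Jill', 1, 'Electronics', 1) and (2, 'Jill', 2, 'Hiking', 1). Note that a product belonging to two categories is counted once in each of them. If you prefer to keep the correlated subquery, qualifying the outer columns explicitly (customers.id, categories.id) should have the same effect.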

Related

Filter products by options

I have the following database structure to store product options.
Now I have a problem filtering for products that match all of the given options. First I tried WHERE option_id IN (array of options), but that returns products that match any of the options, which is not what I need. The user wants to see only products with a given material, color, and size, for instance. And if I use WHERE option_id = 4 AND option_id = 6, for instance, I get nothing.
Here is my query:
SELECT DISTINCT p.id AS id,
...
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id = 1 -- only to get the idea
GROUP BY id
ORDER BY id DESC
LIMIT 0, 12
Just to mention, it is a PHP application where the user selects options from a select element, with or without the multiple attribute.
How can I accomplish this?
You can use having:
SELECT p.id AS id, ...
FROM products p JOIN
product_categories pc
ON p.id = pc.product_id LEFT JOIN
product_images pi
ON p.id = pi.product_id JOIN
product_options po
ON p.id = po.product_id
WHERE p.product_active = 1 AND
po.option_id IN (4, 6)
GROUP BY p.id
HAVING COUNT(DISTINCT po.option_id) = 2
ORDER BY p.id DESC
LIMIT 0, 12;
The HAVING clause is specifying that a given id has two matching options. Because of the WHERE clause, these are the only two options that you care about.
I didn't change your approach (you didn't supply the complete query), but you are doing joins along different dimensions -- categories, images, and options. This creates a Cartesian product for each product, and that is often not the best approach to such a query.
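To see why the DISTINCT inside COUNT matters once other one-to-many tables are joined in, here is a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL; the table contents and the helper name products_with_all are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, product_active INTEGER);
CREATE TABLE product_options (product_id INTEGER, option_id INTEGER);
CREATE TABLE product_images (product_id INTEGER, filename TEXT);

INSERT INTO products VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO product_options VALUES (1, 4), (1, 6), (2, 4), (3, 6), (3, 7);
INSERT INTO product_images VALUES (2, 'a.jpg'), (2, 'b.jpg');
""")

def products_with_all(option_ids, distinct=True):
    """Hypothetical helper: ids of active products that have every option in option_ids."""
    placeholders = ", ".join("?" for _ in option_ids)
    count_expr = "COUNT(DISTINCT po.option_id)" if distinct else "COUNT(po.option_id)"
    sql = f"""
        SELECT p.id
        FROM products p
        JOIN product_options po ON po.product_id = p.id
        LEFT JOIN product_images img ON img.product_id = p.id
        WHERE p.product_active = 1 AND po.option_id IN ({placeholders})
        GROUP BY p.id
        HAVING {count_expr} = ?
        ORDER BY p.id
    """
    return [row[0] for row in conn.execute(sql, (*option_ids, len(option_ids)))]

matching = products_with_all([4, 6])                          # [1]
without_distinct = products_with_all([4, 6], distinct=False)  # [1, 2]
print(matching, without_distinct)
```

Product 2 has only option 4, but its two images duplicate that row, so a plain COUNT reaches 2 and lets it through. COUNT(DISTINCT ...) is not fooled by the extra rows from the image join.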
There is no need for LEFT JOIN in the solution.
SELECT DISTINCT p.id AS id
FROM products p
JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN (1, 2, 3)
GROUP BY p.id
HAVING COUNT(po.option_id) = 3
My solution keeps only the tables necessary to find the products with the specified options.
If you want products that have exactly these options and no others, you can use NOT EXISTS:
SELECT DISTINCT p.id AS id
FROM products p
JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1 AND
po.option_id IN (1, 2, 3) and
NOT EXISTS (
SELECT 1
FROM product_options po2
WHERE p.id = po2.product_id and po2.option_id NOT IN (1, 2, 3)
)
GROUP BY p.id
HAVING COUNT(po.option_id) = 3
If you want to select products according to other conditions (like product categories and so on), use IN in the WHERE clause. This approach avoids generating duplicate po.option_id values, so the outer query still works correctly even without DISTINCT in COUNT.
SELECT DISTINCT p.id AS id
FROM products p
JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1 AND
po.option_id IN (1, 2, 3) AND
-- use the following IN predicate to select products with specific features without introducing duplicates in your query
p.id IN (
select product_id FROM product_categories WHERE <your_condition>
)
GROUP BY p.id
HAVING COUNT(po.option_id) = 3
You select products with image lists. Something like:
select products.*, group_concat(product_images.id)
Additionally, there may be options that the product must all meet. These are criteria that belong in the WHERE clause.
select
p.*,
(select group_concat(image) from product_images i where i.product_id = p.id) as images
from products p
where product_active = 1
and id in
(
select product_id
from product_options
where option_id in (1,3,55,97)
group by product_id
having count(*) = 4 -- four options in this example
);
Thanks guys, I've managed to return exactly what I wanted.
Now I just have a problem with the pagination query for the filtered products.
Final search query:
SELECT DISTINCT p.id AS id,
main_price,
promotion_price,
NEW,
sale,
recommended,
COUNT(pi.filename) AS image_count,
GROUP_CONCAT(DISTINCT pi.filename
ORDER BY pi.main_image DESC, pi.id ASC) AS images,
name_sr,
uri_sr,
description_sr
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY id
HAVING COUNT(DISTINCT po.option_id) = 1
ORDER BY id DESC
LIMIT 0, 12
The pagination query is something like this; I modified it according to the new filter query:
SELECT COUNT(DISTINCT p.id) AS number
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1
If I leave out DISTINCT in SELECT COUNT I don't get filtered pagination; if I add DISTINCT I get a number of rows that corresponds to the pagination. I suppose I could wrap all of this in a subquery with another COUNT(*), but I'm not sure whether that is the way to go or whether there is a more efficient and elegant way to do this.
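For what it's worth, the subquery idea is a common way to get a single total: a grouped query returns one row per product, so an outer COUNT(*) over it gives the number of filtered products. A minimal sketch follows, using Python's sqlite3 module as a stand-in for MySQL. The rows are made up, the image and category joins are left out because the count does not need them, and two options are required here instead of one.

```python
import sqlite3

# SQLite stands in for MySQL; sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, product_active INTEGER, main_price REAL);
CREATE TABLE product_options (product_id INTEGER, option_id INTEGER);

INSERT INTO products VALUES (1, 1, 6000), (2, 1, 9000), (3, 1, 20000), (4, 1, 8000);
INSERT INTO product_options VALUES (1, 1), (1, 2), (2, 1), (3, 1), (3, 2), (4, 1), (4, 2);
""")

# The filter itself: one row per product that has both options and is in the price range.
filtered_sql = """
    SELECT p.id
    FROM products p
    JOIN product_options po ON po.product_id = p.id
    WHERE p.product_active = 1
      AND po.option_id IN (1, 2)
      AND p.main_price BETWEEN 5250.00 AND 14000.00
    GROUP BY p.id
    HAVING COUNT(DISTINCT po.option_id) = 2
"""

# Pagination total: count the rows the grouped query produces.
total = conn.execute(
    f"SELECT COUNT(*) AS number FROM ({filtered_sql}) AS filtered"
).fetchone()[0]

# First page of ids, using the same filter.
page_ids = [row[0] for row in conn.execute(
    filtered_sql + "ORDER BY p.id DESC LIMIT 12 OFFSET 0"
)]

print(total, page_ids)
```

Here total is 2 and page_ids is [4, 1]: product 2 lacks option 2 and product 3 is outside the price range. Keeping the filter in one place and reusing it for both the page and the total also keeps the two queries from drifting apart.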

find records grouped by two columns in mysql

I want to find all sellers who have uploaded products in the categories (electronics, clothing, furniture), so for 3 categories there can be up to 3 rows for each seller. The tables I have are:
1. category {category_id, name}
2. seller {seller_id, username}
3. products {product_id, seller_id, category_id, title}
Note: there can be a maximum of 3 results (because I'm searching in 3 categories) for one seller, even if he added more than one product in a single category.
Expected result:
product_id | category | sellerUsername
-----------------------------------
101 | electronics | kuldeep
211 | furniture | kuldeep
322 | clothing | kuldeep
167 | electronics | roman
245 | furniture | roman
247 | clothing | dangi
246 | furniture | dangi
..
..
If you need only the matching rows, use an inner join:
select a.product_id, b.username, c.name
from products as a
inner join seller as b on b.seller_id = a.seller_id
inner join category as c on c.category_id = a.category_id
Otherwise, use a left join:
select a.product_id, b.username, c.name
from products as a
left join seller as b on b.seller_id = a.seller_id
left join category as c on c.category_id = a.category_id
The general solution to your problem is to join the three tables together and then aggregate by seller and category. In my solution, I have arbitrarily chosen the max product ID, in the absence of any logic for doing otherwise. The query is slightly tricky, in that we need to join this result again to the category and seller tables to get the human-readable category and seller names. The reason for this is that the GROUP BY should ideally be done by ID and not by name, since two categories (or sellers) could conceivably have the same name but different IDs.
SELECT t1.product_id,
COALESCE(t2.name, 'NA'),
COALESCE(t3.username, 'NA')
FROM
(
SELECT MAX(p.product_id) AS product_id,
c.category_id,
s.seller_id
FROM products p
LEFT JOIN category c
ON p.category_id = c.category_id
LEFT JOIN seller s
ON p.seller_id = s.seller_id
WHERE c.name IN ('electronics', 'clothing', 'furniture')
GROUP BY s.seller_id,
c.category_id
) t1
LEFT JOIN category t2
ON t1.category_id = t2.category_id
LEFT JOIN seller t3
ON t1.seller_id = t3.seller_id
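Here is a runnable sketch of the same aggregate-then-join idea, so the result can be checked on a few rows. It goes through Python's sqlite3 module as a stand-in for MySQL; the sample rows are made up, and inner joins are used on the assumption that every product has a valid seller and category.

```python
import sqlite3

# SQLite stands in for MySQL; rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE seller (seller_id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, seller_id INTEGER,
                       category_id INTEGER, title TEXT);

INSERT INTO category VALUES (1, 'electronics'), (2, 'clothing'), (3, 'furniture'), (4, 'toys');
INSERT INTO seller VALUES (1, 'kuldeep'), (2, 'roman');
INSERT INTO products VALUES
  (100, 1, 1, 'radio'), (101, 1, 1, 'tv'),   -- two electronics products from one seller
  (211, 1, 3, 'chair'), (167, 2, 1, 'phone'),
  (300, 2, 4, 'kite');                       -- outside the searched categories
""")

# Group by the ids first, then join back to pick up the readable names.
rows = conn.execute("""
SELECT t1.product_id, t2.name, t3.username
FROM (
    SELECT MAX(p.product_id) AS product_id, p.category_id, p.seller_id
    FROM products p
    JOIN category c ON c.category_id = p.category_id
    WHERE c.name IN ('electronics', 'clothing', 'furniture')
    GROUP BY p.seller_id, p.category_id
) t1
JOIN category t2 ON t2.category_id = t1.category_id
JOIN seller t3 ON t3.seller_id = t1.seller_id
ORDER BY t3.username, t2.name
""").fetchall()

for row in rows:
    print(row)
```

This gives (101, 'electronics', 'kuldeep'), (211, 'furniture', 'kuldeep') and (167, 'electronics', 'roman'): one row per seller and category, with the highest product_id standing in for the group.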
Check Below Code.
SET @row_number:=0;
SET @db_names:= '';
SET @db_names2:= '';
select product_id,name as category ,username as sellerUsername
from (
select a.product_id, c.name ,b.username,
@row_number:=CASE WHEN @db_names=username and @db_names2=name THEN @row_number+1
ELSE 1 END AS row_number,@db_names:=username AS username2,@db_names2:=name AS name2
from products as a
left join seller as b on b.seller_id = a.seller_id
left join category as c on c.category_id = a.category_id
where name IN ('electronics', 'clothing', 'furniture')
)a where row_number < 2
order by sellerUsername,name;

MySQL One-To-Many SUM of COUNTs

After 2 days of searching and trying similar questions, it's got to the point where I need to ask the question!
I have the following database structure (simplified)..
mt_product | mt_sku      | mt_price
========== | =========== | ========
id         | id          | id
brand_id   | product_id  | sku_id
mpn        | retailer_id | price
           | sku         | date
For instance...
* A can of Coca-Cola is ONE product.
* It can be sold in many different retailers, who will all have a SKU for it.
* This SKU will have a price, which can change day-by-day.
I want to list the total number of prices for the product.
To list this I currently have the following query which nearly works...
SELECT
p.id AS pid, p.title AS p_title, p.cat, p.mpn,
b.id AS bid, b.name AS brand,
(SELECT COUNT(s.id) FROM mt_sku AS s WHERE s.pid = p.id) AS num_sku,
(SELECT COUNT(gbp.id) FROM mt_price AS gbp INNER JOIN mt_sku ON mt_sku.id = gbp.sid ) AS num_price
FROM mt_product AS p
INNER JOIN mt_brand b ON p.bid = b.id
INNER JOIN mt_sku s ON p.id = s.pid
num_sku returns as expected, however when I introduce the second sub query for num_price (and I have revised this many times) I either get...
* no duplication of the pid, but num_price is the total number of prices across all SKUs, not the number of prices for this product_id (as in the query above)
* the correct num_price per SKU, but instead of being totalled per product, the pid is duplicated in the result (as in the query below), so this does not give me the result I want. I added DISTINCT because it helped an earlier version of the query; it now makes no difference.
SELECT
DISTINCT(p.id) AS pid, p.title AS p_title, p.cat, p.mpn,
b.id AS bid, b.name AS brand,
(SELECT COUNT(s.id) FROM mt_sku AS s WHERE s.pid = p.id) AS num_sku,
(SELECT COUNT(gbp.id) FROM mt_price AS gbp WHERE s.id = gbp.sid) AS num_price
FROM mt_product AS p
INNER JOIN mt_brand b ON p.bid = b.id
INNER JOIN mt_sku s ON p.id = s.pid
I'm pretty sure the key to this is that a product can have multiple SKUs, and each SKU has multiple price-history rows.
Any help or ideas on the schema would be superb.
Try this:
SELECT
p.id AS pid, p.title AS p_title, p.cat, p.mpn,
b.id AS bid, b.name AS brand,
COUNT(DISTINCT s.id) AS num_sku,
COUNT(gbp.id) AS num_price
FROM mt_product AS p
INNER JOIN mt_brand b ON p.brand_id = b.id
INNER JOIN mt_sku s ON p.id = s.product_id
INNER JOIN mt_price gbp ON s.id = gbp.sku_id
GROUP BY b.id, p.id
The products that don't have SKUs defined will not appear in the result set. Use LEFT JOIN mt_sku to make them appear in the result set (having 0 for num_sku and num_price):
LEFT JOIN mt_sku s ON p.id = s.product_id
In both variants of the query, the products that do not have prices defined will not appear in the result set. Use LEFT JOIN mt_price to include them into the result set (having 0 for num_price):
LEFT JOIN mt_price gbp ON s.id = gbp.sku_id
Take a look at the MySQL documentation for JOINs, GROUP BY and GROUP BY aggregate functions.
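A small sketch of the LEFT JOIN variant, to show what the two counts look like for products with and without SKUs or prices. It uses Python's sqlite3 module as a stand-in for MySQL; the rows are made up and the brand table is left out to keep it short.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mt_product (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE mt_sku (id INTEGER PRIMARY KEY, product_id INTEGER, retailer_id INTEGER);
CREATE TABLE mt_price (id INTEGER PRIMARY KEY, sku_id INTEGER, price REAL);

INSERT INTO mt_product VALUES (1, 'Coca-Cola can'), (2, 'Lemonade'), (3, 'Unlisted');
INSERT INTO mt_sku VALUES (10, 1, 1), (11, 1, 2), (12, 2, 1);
INSERT INTO mt_price VALUES (1, 10, 0.50), (2, 10, 0.55), (3, 11, 0.60);
""")

# The joins produce one row per price, so SKUs repeat: count them with DISTINCT.
# COUNT(column) skips NULLs, which is why missing SKUs or prices show up as 0.
rows = conn.execute("""
SELECT p.id AS pid,
       COUNT(DISTINCT s.id) AS num_sku,
       COUNT(gbp.id) AS num_price
FROM mt_product p
LEFT JOIN mt_sku s ON s.product_id = p.id
LEFT JOIN mt_price gbp ON gbp.sku_id = s.id
GROUP BY p.id
ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

The output is (1, 2, 3), (2, 1, 0) and (3, 0, 0): product 1 has two SKUs with three prices in total, product 2 has a SKU with no prices yet, and product 3 has no SKU at all.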
If you want to list the total number of prices, the subqueries need to be correlated with the outer query.
Your first count is fine, because it is correlated to the outer query. The second has no correlation, so that seems strange. The following fixes the num_price subquery:
SELECT p.id AS pid, p.title AS p_title, p.cat, p.mpn,
b.id AS bid, b.name AS brand,
(SELECT COUNT(s2.id) FROM mt_sku s2 WHERE s2.pid = p.id) AS num_sku,
(SELECT COUNT(gbp.id) FROM mt_price gbp WHERE s.id = gbp.sid ) AS num_price
FROM mt_product p INNER JOIN
mt_brand b
ON p.bid = b.id INNER JOIN
mt_sku s
ON p.id = s.pid;
I'm also not sure why you have all the joins in the outer query. I assume that a given product is going to have multiple rows, and you want the multiple rows to have the same num_sku and num_price values.

MySQL Join Question

Hi, I'm struggling to write a particular MySQL join query.
I have a table containing product data, each product can belong to multiple categories. This m:m relationship is satisfied using a link table.
For this particular query I wish to retrieve all products belonging to a given category, but with each product record, I also want to return the other categories that product belongs to.
Ideally I would like to achieve this using an Inner Join on the categories table, rather than performing an additional query for each product record, which would be quite inefficient.
My simplifed schema is designed roughly as follows:
products table:
product_id, name, title, description, is_active, date_added, publish_date, etc....
categories table:
category_id, name, title, description, etc...
product_category table:
product_id, category_id
I have written the following query, which allows me to retrieve all the products belonging to the specified category_id. However, I'm really struggling to work out how to retrieve the other categories a product belongs to.
SELECT p.product_id, p.name, p.title, p.description
FROM prod_products AS p
LEFT JOIN prod_product_category AS pc
ON pc.product_id = p.product_id
WHERE pc.category_id = $category_id
AND UNIX_TIMESTAMP(p.publish_date) < UNIX_TIMESTAMP()
AND p.is_active = 1
ORDER BY p.name ASC
I'd be happy just retrieving the category ids related to each returned product row, as I will have all category data stored in an object, and my application code can take care of the rest.
Many thanks,
Richard
SELECT p.product_id, p.name, p.title, p.description,
GROUP_CONCAT(otherc.category_id) AS other_categories
FROM prod_products AS p
JOIN prod_product_category AS pc
ON pc.product_id = p.product_id
LEFT JOIN prod_product_category AS otherc
ON otherc.product_id = p.product_id AND otherc.category_id != pc.category_id
WHERE pc.category_id = $category_id
AND UNIX_TIMESTAMP(p.publish_date) < UNIX_TIMESTAMP()
AND p.is_active = 1
GROUP BY p.product_id
ORDER BY p.name ASC
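Since the question only asks for the ids of the other categories, the GROUP_CONCAT approach can be tried on a few rows like this. The sketch uses Python's sqlite3 module as a stand-in for MySQL (SQLite also has GROUP_CONCAT); the rows are made up, and the publish_date check is left out for brevity.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prod_products (product_id INTEGER PRIMARY KEY, name TEXT, is_active INTEGER);
CREATE TABLE prod_product_category (product_id INTEGER, category_id INTEGER);

INSERT INTO prod_products VALUES (1, 'anorak', 1), (2, 'boots', 1), (3, 'compass', 1);
INSERT INTO prod_product_category VALUES (1, 5), (1, 6), (1, 7), (2, 5), (3, 6);
""")

category_id = 5  # the category being browsed

# pc selects products in the requested category; otherc collects their remaining categories.
rows = conn.execute("""
SELECT p.product_id, p.name, GROUP_CONCAT(otherc.category_id) AS other_categories
FROM prod_products p
JOIN prod_product_category pc ON pc.product_id = p.product_id
LEFT JOIN prod_product_category otherc
       ON otherc.product_id = p.product_id AND otherc.category_id != pc.category_id
WHERE pc.category_id = ? AND p.is_active = 1
GROUP BY p.product_id, p.name
ORDER BY p.name
""", (category_id,)).fetchall()

# GROUP_CONCAT returns a comma-separated string, or NULL when there is nothing to concatenate.
other = {
    name: sorted(int(c) for c in cats.split(",")) if cats else []
    for _, name, cats in rows
}
print(other)
```

Here other comes out as {'anorak': [6, 7], 'boots': []}; 'compass' is not in category 5 and is left out. The ids are sorted in application code because the order inside the concatenated string is not guaranteed unless you ask for one.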
You would use an inner join to the product_category table; a left join there is pointless because you use a value from it in the WHERE condition. Then you do a left join on the product_category table again to get the other categories, and join in the categories table for the data:
select
p.product_id, p.name, p.title, p.description,
c.category_id, c.name, c.title
from
prod_products p
inner join prod_product_category pc on pc.product_id = p.product_id
left join prod_product_category pc2 on pc2.product_id = p.product_id
left join prod_categories c on c.category_id = pc2.category_id
where
pc.category_id = @category_id and
unix_timestamp(p.publish_date) < unix_timestamp() and
p.is_active = 1
order by
p.name

MySQL join with a "bounce" off a third table

I have 3 MySQL tables.
companies with company_id and company_name
products with product_id and company_id
names with product_id, product_name and other info about the product
I'm trying to output the product_name and the company_name in one query for a given product_id.
Basically I need information from the names and companies tables and the link between them is the products table.
How do I do a join that needs to "bounce" off a third table?
Something like this but this obviously doesn't work:
SELECT product_name, company_name
FROM names
LEFT OUTER JOIN companies ON
(names.product_id = products.product_id and products.company_id = companies.company_id)
WHERE product_id = '12345'
select n.product_name, c.company_name
from names n
left outer join products p on n.product_id = p.product_id
left outer join companies c on p.company_id = c.company_id
where n.product_id = '12345'
You nearly have it, you just need to include the third table as another join in your query:
SELECT product_name, company_name
FROM names
LEFT JOIN products ON names.product_id = products.product_id
LEFT JOIN companies ON products.company_id = companies.company_id
WHERE names.product_id = '12345'
Also note that if you use LEFT JOIN, the company name could be NULL if the company that made the product is unknown, so you need to test for that in your code to avoid an exception. If you know that it should never be NULL, or if you want to explicitly exclude products for which you don't know the company, then use an INNER JOIN instead of a LEFT JOIN in both cases.
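The difference between the two join types is easy to see on a product whose company is not recorded. A minimal sketch, using Python's sqlite3 module as a stand-in for MySQL; the rows and the helper name lookup are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (company_id INTEGER PRIMARY KEY, company_name TEXT);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, company_id INTEGER);
CREATE TABLE names (product_id INTEGER, product_name TEXT);

INSERT INTO companies VALUES (1, 'Acme');
INSERT INTO products VALUES (12345, 1), (67890, NULL);  -- 67890 has no known company
INSERT INTO names VALUES (12345, 'Widget'), (67890, 'Gadget');
""")

def lookup(product_id, join="LEFT JOIN"):
    """Hypothetical helper: product and company name, hopping through products."""
    sql = f"""
        SELECT n.product_name, c.company_name
        FROM names n
        {join} products p ON p.product_id = n.product_id
        {join} companies c ON c.company_id = p.company_id
        WHERE n.product_id = ?
    """
    return conn.execute(sql, (product_id,)).fetchall()

known = lookup(12345)                             # [('Widget', 'Acme')]
unknown_left = lookup(67890)                      # [('Gadget', None)]
unknown_inner = lookup(67890, join="INNER JOIN")  # []
print(known, unknown_left, unknown_inner)
```

With LEFT JOIN the product still comes back, with None (NULL) for the company; with INNER JOIN the row disappears entirely, so the calling code has to handle either a NULL column or an empty result.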