After 2 days of searching and trying similar questions, it's got to the point where I need to ask the question!
I have the following database structure (simplified)..
mt_product: id | brand_id | mpn
mt_sku:     id | product_id | retailer_id | sku
mt_price:   id | sku_id | price | date
For instance...
* A can of Coca-Cola is ONE product.
* It can be sold in many different retailers, who will all have a SKU for it.
* This SKU will have a price, which can change day-by-day.
I want to list the total number of prices for the product.
To list this I currently have the following query which nearly works...
SELECT
p.id AS pid, p.title AS p_title, p.cat, p.mpn,
b.id AS bid, b.name AS brand,
(SELECT COUNT(s.id) FROM mt_sku AS s WHERE s.pid = p.id) AS num_sku,
(SELECT COUNT(gbp.id) FROM mt_price AS gbp INNER JOIN mt_sku ON mt_sku.id = gbp.sid ) AS num_price
FROM mt_product AS p
INNER JOIN mt_brand b ON p.bid = b.id
INNER JOIN mt_sku s ON p.id = s.pid
num_sku returns as expected, however when I introduce the second sub query for num_price (and I have revised this many times) I either get...
* no duplicated pid, but num_price is the total number of prices across all SKUs, not the number of prices for this product_id (as in the query above), or
* the correct num_price per SKU, but instead of being summed per product, the pid is duplicated in the result (as in the query below), which is not the result I want. I added DISTINCT because it helped an earlier version of the query; it now makes no difference.
SELECT
DISTINCT(p.id) AS pid, p.title AS p_title, p.cat, p.mpn,
b.id AS bid, b.name AS brand,
(SELECT COUNT(s.id) FROM mt_sku AS s WHERE s.pid = p.id) AS num_sku,
(SELECT COUNT(gbp.id) FROM mt_price AS gbp WHERE s.id = gbp.sid) AS num_price
FROM mt_product AS p
INNER JOIN mt_brand b ON p.bid = b.id
INNER JOIN mt_sku s ON p.id = s.pid
I'm pretty sure the key to this is that a product can have multiple SKUs, and each SKU has multiple price history entries.
Any help or ideas of the schema would be superb.
Try this:
SELECT
p.id AS pid, p.title AS p_title, p.cat, p.mpn,
b.id AS bid, b.name AS brand,
COUNT(DISTINCT s.id) AS num_sku,
COUNT(gbp.id) AS num_price
FROM mt_product AS p
INNER JOIN mt_brand b ON p.brand_id = b.id
INNER JOIN mt_sku s ON p.id = s.product_id
INNER JOIN mt_price gbp ON s.id = gbp.sku_id
GROUP BY b.id, p.id
The products that don't have SKUs defined will not appear in the result set. Use LEFT JOIN mt_sku to make them appear in the result set (having 0 for num_sku and num_price):
LEFT JOIN mt_sku s ON p.id = s.product_id
In both variants of the query, the products that do not have prices defined will not appear in the result set. Use LEFT JOIN mt_price to include them into the result set (having 0 for num_price):
LEFT JOIN mt_price gbp ON s.id = gbp.sku_id
Take a look at the MySQL documentation for JOINs, GROUP BY and GROUP BY aggregate functions.
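As a quick check of the join-and-group approach above, here is a minimal sketch that runs the LEFT JOIN variant against an in-memory SQLite database from Python. SQLite only stands in for MySQL here, the sample rows are invented for illustration, and the brand join is left out because the simplified schema in the question does not list the brand table's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mt_product (id INTEGER PRIMARY KEY, brand_id INTEGER, mpn TEXT);
    CREATE TABLE mt_sku (id INTEGER PRIMARY KEY, product_id INTEGER,
                         retailer_id INTEGER, sku TEXT);
    CREATE TABLE mt_price (id INTEGER PRIMARY KEY, sku_id INTEGER,
                           price REAL, date TEXT);

    -- Hypothetical sample data: product 1 has two SKUs and three prices,
    -- product 2 has one SKU without prices, product 3 has no SKUs at all.
    INSERT INTO mt_product VALUES (1, 10, 'COLA-330'), (2, 10, 'COLA-500'),
                                  (3, 11, 'WATER-1L');
    INSERT INTO mt_sku VALUES (1, 1, 100, 'A-1'), (2, 1, 200, 'B-1'),
                              (3, 2, 100, 'A-2');
    INSERT INTO mt_price VALUES (1, 1, 0.50, '2015-01-01'),
                                (2, 1, 0.55, '2015-01-02'),
                                (3, 2, 0.60, '2015-01-01');
""")

# One row per product; LEFT JOINs keep products without SKUs or prices.
rows = conn.execute("""
    SELECT p.id AS pid,
           COUNT(DISTINCT s.id) AS num_sku,
           COUNT(gbp.id) AS num_price
    FROM mt_product AS p
    LEFT JOIN mt_sku AS s ON p.id = s.product_id
    LEFT JOIN mt_price AS gbp ON s.id = gbp.sku_id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

COUNT(DISTINCT s.id) is needed because a SKU with several prices appears once per price row after the join, while COUNT(gbp.id) skips the NULLs produced by the LEFT JOIN.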
If you want to list the total prices then you need correlations.
Your first count is fine, because it is correlated to the outer query. The second has no correlation, so that seems strange. The following fixes the num_price subquery:
SELECT p.id AS pid, p.title AS p_title, p.cat, p.mpn,
b.id AS bid, b.name AS brand,
(SELECT COUNT(s2.id) FROM mt_sku s2 WHERE s2.pid = p.id) AS num_sku,
(SELECT COUNT(gbp.id) FROM mt_price gbp WHERE s.id = gbp.sid ) AS num_price
FROM mt_product p INNER JOIN
mt_brand b
ON p.bid = b.id INNER JOIN
mt_sku s
ON p.id = s.pid;
I'm also not sure why you have all the joins in the outer query. I assume that a given product is going to have multiple rows, and you want the multiple rows to have the same num_sku and num_price values.
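If one row per product is what you are after, a possible variation is to drop the mt_sku join from the outer query and correlate both subqueries to the product. The sketch below assumes the column names from the simplified schema in the question (product_id, sku_id), uses SQLite from Python as a stand-in for MySQL, and works on made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mt_product (id INTEGER PRIMARY KEY, mpn TEXT);
    CREATE TABLE mt_sku (id INTEGER PRIMARY KEY, product_id INTEGER);
    CREATE TABLE mt_price (id INTEGER PRIMARY KEY, sku_id INTEGER, price REAL);

    -- Hypothetical data: product 1 has 2 SKUs and 3 prices, product 2 has none.
    INSERT INTO mt_product VALUES (1, 'COLA-330'), (2, 'WATER-1L');
    INSERT INTO mt_sku VALUES (1, 1), (2, 1);
    INSERT INTO mt_price VALUES (1, 1, 0.50), (2, 1, 0.55), (3, 2, 0.60);
""")

# Both subqueries refer to p.id, so each one is evaluated per product
# and the outer query never multiplies product rows.
rows = conn.execute("""
    SELECT p.id AS pid,
           (SELECT COUNT(*)
              FROM mt_sku s
             WHERE s.product_id = p.id) AS num_sku,
           (SELECT COUNT(*)
              FROM mt_price gbp
              JOIN mt_sku s2 ON s2.id = gbp.sku_id
             WHERE s2.product_id = p.id) AS num_price
    FROM mt_product p
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```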
Related
This is a MySQL question. I have three tables with the following columns:
transactions (table): transact_id, customer_id, transact_amt, product_id,
products (table): product_id, product_cost, product_name, product_category
customers (table): customer_id, joined_at, last_login_at, state, name, email
I'd like a query that finds the most popular item in every state, together with the state. One of the tricky parts is that some product_name values have multiple product_id values. Therefore I thought of joining the three tables to generate an output with two columns: state and product_name. Up to this point that worked fine with this:
SELECT p.product_name, c.state
FROM products p
INNER JOIN transactions t
ON p.product_id = t.product_id
INNER JOIN customers c
ON c.customer_id = t.customer_id
This selects all the products and the states the customers are from. The problem is that I can't find a way to rank the most popular product per state. I tried different GROUP BY and ORDER BY clauses and subqueries without success. I suspect I need subqueries, but I can't find the way to resolve it. The expected outcome should look like this:
most_popular_product | state
Bamboo | WA
Walnut | MO
Any help will be greatly appreciated.
Thank you!
You need a subquery that gets the count of transactions for each product in each state.
SELECT p.product_name, c.state, COUNT(*) AS count
FROM products p
INNER JOIN transactions t
ON p.product_id = t.product_id
INNER JOIN customers c
ON c.customer_id = t.customer_id
GROUP BY p.product_name, c.state
Then write another query that has this as a subquery, and gets the highest count for each state.
SELECT state, MAX(count) AS maxcount
FROM (
SELECT p.product_name, c.state, COUNT(*) AS count
FROM products p
INNER JOIN transactions t
ON p.product_id = t.product_id
INNER JOIN customers c
ON c.customer_id = t.customer_id
GROUP BY p.product_name, c.state
) AS t
GROUP BY state
Finally, join them together:
SELECT t1.product_name AS most_popular_product, t1.state
FROM (
SELECT p.product_name, c.state, COUNT(*) AS count
FROM products p
INNER JOIN transactions t
ON p.product_id = t.product_id
INNER JOIN customers c
ON c.customer_id = t.customer_id
GROUP BY p.product_name, c.state
) AS t1
JOIN (
SELECT state, MAX(count) AS maxcount
FROM (
SELECT p.product_name, c.state, COUNT(*) AS count
FROM products p
INNER JOIN transactions t
ON p.product_id = t.product_id
INNER JOIN customers c
ON c.customer_id = t.customer_id
GROUP BY p.product_name, c.state
) AS t
GROUP BY state
) AS t2 ON t1.state = t2.state AND t1.count = t2.maxcount
This is basically the same pattern as SQL select only rows with max value on a column, just using the first grouped query as the table you're trying to group.
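The same max-per-group pattern can be tried out on a few made-up rows. The sketch below uses SQLite from Python as a stand-in for MySQL and writes the grouped counts once as a common table expression, which assumes a server that supports WITH (MySQL 8.0 or later); on older versions the repeated subquery from the answer is needed. The sample data is hypothetical, with 'Bamboo' deliberately spread over two product_id values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE transactions (transact_id INTEGER PRIMARY KEY,
                               customer_id INTEGER, product_id INTEGER);

    INSERT INTO products VALUES (1, 'Bamboo'), (2, 'Bamboo'), (3, 'Walnut');
    INSERT INTO customers VALUES (1, 'WA'), (2, 'MO');
    INSERT INTO transactions VALUES
        (1, 1, 1), (2, 1, 1), (3, 1, 2), (4, 1, 3),   -- WA: Bamboo x3, Walnut x1
        (5, 2, 3), (6, 2, 3), (7, 2, 1);              -- MO: Walnut x2, Bamboo x1
""")

rows = conn.execute("""
    WITH counts AS (
        -- transactions per product name and state
        SELECT p.product_name, c.state, COUNT(*) AS cnt
        FROM products p
        JOIN transactions t ON p.product_id = t.product_id
        JOIN customers c ON c.customer_id = t.customer_id
        GROUP BY p.product_name, c.state
    )
    SELECT c1.product_name AS most_popular_product, c1.state
    FROM counts c1
    JOIN (SELECT state, MAX(cnt) AS maxcount
          FROM counts
          GROUP BY state) c2
      ON c1.state = c2.state AND c1.cnt = c2.maxcount
    ORDER BY c1.state
""").fetchall()

for row in rows:
    print(row)
```

Note that if two products tie for the highest count in a state, this pattern returns both of them.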
I had this question about filtering different products by selecting options. That query has been solved here: Filter products by options.
My problem is now with the count query for the pagination. For instance this query returns 37 rows with the count of 1.
SELECT COUNT(DISTINCT p.id) AS number
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1
But if i remove DISTINCT:
SELECT COUNT(p.id) AS number
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1
this returns also 37 rows with mixed numbers.
What am I doing wrong? I know I could work around this by running an additional count on this result set, but I don't think that is the right solution.
Also, it was suggested in the previous question that I should not need DISTINCT and that the query is flawed because of it. Can you tell me what the problem is?
The only difference in your queries is here:
SELECT COUNT(DISTINCT p.id) AS number
vs.
SELECT COUNT(p.id) AS number
So you get the same number of result rows, because FROM, WHERE, and HAVING are all the same. Only the data per row you select is different.
In the first case it's the number of distinct IDs that are not null, which is always 1, because you group by that ID. (You say: Look at all records with ID 5 and count how many different IDs you find in these records. Well, the ID in every record with ID 5 is 5. So it's only one ID.)
In the second case you count IDs that are not null. As the ID is never null, this is the same as counting records: COUNT(*). And you'd better make this clear by using COUNT(*) instead of the obfuscated COUNT(p.id).
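To see the difference on a small scale, here is a rough sketch using SQLite from Python as a stand-in for MySQL. The tables are cut down to the columns the query needs and the rows are invented: product 1 has two images, product 2 has one, and product 3 does not have option 1. Wrapping the grouped query in an outer COUNT(*) is one way to get a single number for pagination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, product_active INTEGER);
    CREATE TABLE product_images (id INTEGER PRIMARY KEY, product_id INTEGER);
    CREATE TABLE product_options (product_id INTEGER, option_id INTEGER);

    INSERT INTO products VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO product_images VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO product_options VALUES (1, 1), (2, 1), (3, 2);
""")

# Shared tail of all three queries: one group per matching product.
grouped = """
    FROM products p
    LEFT JOIN product_images pi ON p.id = pi.product_id
    LEFT JOIN product_options po ON p.id = po.product_id
    WHERE p.product_active = 1
      AND po.option_id IN (1)
    GROUP BY p.id
    HAVING COUNT(DISTINCT po.option_id) = 1
"""

# One row per group: the number of distinct ids inside a group is always 1.
distinct_per_group = conn.execute(
    "SELECT COUNT(DISTINCT p.id)" + grouped + " ORDER BY p.id").fetchall()

# One row per group: the number of joined rows in each group ("mixed numbers").
rows_per_group = conn.execute(
    "SELECT COUNT(p.id)" + grouped + " ORDER BY p.id").fetchall()

# A single row: the number of groups, i.e. the number of matching products.
total = conn.execute(
    "SELECT COUNT(*) FROM (SELECT p.id" + grouped + ") AS count_table"
).fetchone()[0]

print(distinct_per_group, rows_per_group, total)
```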
OK guys it is more clear now. Is something like this good enough to solve this?
SELECT COUNT(*) AS number
FROM (SELECT p.id
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1) AS count_table
I will go with this one. Thanks.
I have following database structure to store product options.
Now I have a problem filtering for products that match all of the given options. First I did WHERE option_id IN (array of options), but that gives me products that match any of the options, which is not a solution. The user wants only products with a given material, color, and size, for instance. And if I do WHERE option_id = 4 AND option_id = 6, for instance, I get nothing.
Here is my query:
SELECT DISTINCT p.id AS id,
...
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id = 1 -- only to get the idea
GROUP BY id
ORDER BY id DESC
LIMIT 0, 12
Just to mention, it is a PHP application where the user selects options from a select element, with or without the multiple attribute.
How can I accomplish this?
You can use having:
SELECT p.id AS id, ...
FROM products p JOIN
product_categories pc
ON p.id = pc.product_id LEFT JOIN
product_images pi
ON p.id = pi.product_id JOIN
product_options po
ON p.id = po.product_id
WHERE p.product_active = 1 AND
po.option_id IN (4, 6)
GROUP BY p.id
HAVING COUNT(DISTINCT po.option_id) = 2
ORDER BY p.id DESC
LIMIT 0, 12;
The HAVING clause is specifying that a given id has two matching options. Because of the WHERE clause, these are the only two options that you care about.
I didn't change your approach (you didn't supply the complete query), but you are doing joins along different dimensions -- categories, images, and options. This creates a Cartesian product for each product, and that is often not the best approach to such a query.
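The effect of joining along several dimensions can be seen in a small experiment. The sketch below is only an illustration on invented rows, with SQLite from Python standing in for MySQL and the tables reduced to the columns used here. Product 1 has two images and both options, so the join yields four rows for it, which is why the plain count and the distinct count differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, product_active INTEGER);
    CREATE TABLE product_images (id INTEGER PRIMARY KEY, product_id INTEGER);
    CREATE TABLE product_options (product_id INTEGER, option_id INTEGER);

    INSERT INTO products VALUES (1, 1), (2, 1), (3, 1);
    -- product 1: two images; product 2: one image; product 3: no images
    INSERT INTO product_images VALUES (1, 1), (2, 1), (3, 2);
    -- products 1 and 3 have both options, product 2 only option 4
    INSERT INTO product_options VALUES (1, 4), (1, 6), (2, 4), (3, 4), (3, 6);
""")

rows = conn.execute("""
    SELECT p.id,
           COUNT(po.option_id) AS joined_rows,            -- inflated by images
           COUNT(DISTINCT po.option_id) AS matched_options
    FROM products p
    LEFT JOIN product_images pi ON p.id = pi.product_id
    JOIN product_options po ON p.id = po.product_id
    WHERE p.product_active = 1
      AND po.option_id IN (4, 6)
    GROUP BY p.id
    HAVING COUNT(DISTINCT po.option_id) = 2
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

With the image join in place, HAVING COUNT(po.option_id) = 2 would wrongly reject product 1, so the DISTINCT matters as long as the extra joins stay in the query.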
There is no need for LEFT JOIN in the solution.
SELECT DISTINCT p.id AS id
FROM products p
JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN (1, 2, 3)
GROUP BY p.id
HAVING COUNT(po.option_id) = 3
My solution keeps only the tables necessary to find the products with the specified options.
If you want products having exactly these options and no others, you can use NOT EXISTS:
SELECT DISTINCT p.id AS id
FROM products p
JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1 AND
po.option_id IN (1, 2, 3) and
NOT EXISTS (
SELECT 1
FROM product_options po2
WHERE p.id = po2.product_id and po2.option_id NOT IN (1, 2, 3)
)
GROUP BY p.id
HAVING COUNT(po.option_id) = 3
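To compare "has at least these options" with "has exactly these options", here is a small sketch on invented rows, using SQLite from Python as a stand-in for MySQL. It assumes each (product_id, option_id) pair is stored only once, which the hypothetical primary key below enforces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, product_active INTEGER);
    CREATE TABLE product_options (product_id INTEGER, option_id INTEGER,
                                  PRIMARY KEY (product_id, option_id));

    INSERT INTO products VALUES (1, 1), (2, 1), (3, 1);
    INSERT INTO product_options VALUES
        (1, 1), (1, 2), (1, 3),            -- exactly the three options
        (2, 1), (2, 2), (2, 3), (2, 4),    -- the three options plus one more
        (3, 1), (3, 2);                    -- one option missing
""")

at_least_sql = """
    SELECT p.id
    FROM products p
    JOIN product_options po ON p.id = po.product_id
    WHERE p.product_active = 1
      AND po.option_id IN (1, 2, 3)
    GROUP BY p.id
    HAVING COUNT(po.option_id) = 3
    ORDER BY p.id
"""

exactly_sql = """
    SELECT p.id
    FROM products p
    JOIN product_options po ON p.id = po.product_id
    WHERE p.product_active = 1
      AND po.option_id IN (1, 2, 3)
      AND NOT EXISTS (
          -- reject products that also carry an option outside the list
          SELECT 1
          FROM product_options po2
          WHERE po2.product_id = p.id
            AND po2.option_id NOT IN (1, 2, 3)
      )
    GROUP BY p.id
    HAVING COUNT(po.option_id) = 3
    ORDER BY p.id
"""

at_least = [r[0] for r in conn.execute(at_least_sql)]
exactly = [r[0] for r in conn.execute(exactly_sql)]
print(at_least, exactly)
```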
If you want to select products according to other conditions (like product categories and so on), then use IN in the WHERE clause. This approach avoids generating duplicate po.option_id values, so the outer query still works correctly even without DISTINCT in COUNT.
SELECT DISTINCT p.id AS id
FROM products p
JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1 AND
po.option_id IN (1, 2, 3) AND
-- use the following IN predicate to select products with specific features without introducing duplicates in your query
p.id IN (
select product_id FROM product_categories WHERE <your_condition>
)
GROUP BY p.id
HAVING COUNT(po.option_id) = 3
You select products with image lists. Something like:
select products.*, group_concat(product_images.id)
Additionally, there may be options that the product must all meet. These are criteria that belong in the WHERE clause.
select
p.*,
(select group_concat(image) from product_images i where i.product_id = p.id) as images
from products p
where product_active = 1
and id in
(
select product_id
from product_options
where option_id in (1,3,55,97)
group by product_id
having count(*) = 4 -- four options in this example
);
Thanks guys, i've managed to return exactly what i wanted.
Now i just have problem with pagination query for the filtered products.
Final search query:
SELECT DISTINCT p.id AS id,
main_price,
promotion_price,
NEW,
sale,
recommended,
COUNT(pi.filename) AS image_count,
GROUP_CONCAT(DISTINCT pi.filename
ORDER BY pi.main_image DESC, pi.id ASC) AS images,
name_sr,
uri_sr,
description_sr
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY id
HAVING COUNT(DISTINCT po.option_id) = 1
ORDER BY id DESC
LIMIT 0, 12
The pagination query is something like this; I modified it according to the new filter query:
SELECT COUNT(DISTINCT p.id) AS number
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1
If I leave out DISTINCT in SELECT COUNT I don't get a filtered pagination count; if I set DISTINCT I get a number of rows that corresponds to the pagination. I suppose I could wrap all of this in a subquery with another COUNT(*), but I'm not sure whether that is the way to go or whether there is a more efficient and elegant way to do this.
I have a collection of tables in a relational database
products
categories
orders
line_items
customers
Products has a many-to-many relationship with categories (join table categories_products) and also has and belongs to many orders through line_items, which is a join table for products and orders with an id. A customer also has many orders.
I'm trying to put together some SQL that will give me this sort of response:
customer_id | customer_first_name | category_id | category_name | number_purchased
-----------------------------------------------------------------------------------
1           | Jack                | 1           | Electronics   | 15
2           | Jill                | 1           | Electronics   | 2
2           | Jill                | 2           | Hiking        | 3
This is the giant hunk of SQL I've been trying to use to get these values:
SELECT
DISTINCT customers.id AS customer_id,
customers.first_name AS customer_first_name,
categories.id AS category_id,
categories.name AS category_name,
(
SELECT count(li.id) FROM line_items li
INNER JOIN orders o ON li.order_id = o.id
INNER JOIN products p ON li.product_id = p.id
INNER JOIN categories_products cp ON cp.product_id = p.id
WHERE
o.customer_id = customer_id
AND o.status = 3
AND cp.category_id = category_id
) AS number_purchased
FROM orders
LEFT JOIN customers ON orders.customer_id = customers.id
LEFT JOIN line_items li ON li.order_id = orders.id
LEFT JOIN products ON products.id = li.product_id
LEFT JOIN categories_products catpr ON catpr.product_id = products.id
LEFT JOIN categories ON catpr.category_id = categories.id
Only the count itself is wrong. Instead of getting the number of line items a customer has bought in a specific category, I'm instead getting a count for all LineItems that have been part of a completed order.
How can I get the count to correctly represent the number of line_items purchased by a specific customer within a category?
NOTE: in the SQL text, o.status = 3 is using an enum to indicate that an Order is "complete."
I think your inner join with categories_products is screwing this up. You should set up a fiddle, like @Strawberry suggested, or try this:
SELECT
DISTINCT customers.id AS customer_id,
customers.first_name AS customer_first_name,
categories.id AS category_id,
categories.name AS category_name,
(
SELECT count(li.id) FROM line_items li
INNER JOIN orders o ON li.order_id = o.id
INNER JOIN products p ON li.product_id = p.id
WHERE
o.customer_id = customer_id
AND o.status = 3
) AS number_purchased
FROM orders
LEFT JOIN customers ON orders.customer_id = customers.id
LEFT JOIN line_items li ON li.order_id = orders.id
LEFT JOIN products ON products.id = li.product_id
LEFT JOIN categories_products catpr ON catpr.product_id = products.id
LEFT JOIN categories ON catpr.category_id = categories.id
If you wanted to correct your count, I would advise using a GROUP BY clause in the subquery. If you GROUP BY orders, you will only get the specific order you retrieved when checking that the user id was correct. I would also encourage you to look for mistakes in other parts of your SQL to clean up this hulking query. For example, make sure you really want DISTINCT, and that you actually want left joins rather than inner joins; both choices could seriously affect the performance of your program.
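One thing worth checking in the original subquery is that the unqualified customer_id and category_id are likely resolved against the subquery's own tables (o.customer_id and cp.category_id), which would make both conditions always true and explain the global count. A different way to get the per-customer, per-category number is a plain join with GROUP BY instead of a correlated subquery. The sketch below is only an illustration on invented rows, using SQLite from Python as a stand-in for MySQL, with the tables reduced to the columns mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    CREATE TABLE categories_products (category_id INTEGER, product_id INTEGER);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status INTEGER);
    CREATE TABLE line_items (id INTEGER PRIMARY KEY, order_id INTEGER, product_id INTEGER);

    INSERT INTO customers VALUES (1, 'Jack'), (2, 'Jill');
    INSERT INTO categories VALUES (1, 'Electronics'), (2, 'Hiking');
    INSERT INTO products VALUES (1), (2);
    INSERT INTO categories_products VALUES (1, 1), (2, 2);
    -- status 3 means "complete"; order 3 is not complete
    INSERT INTO orders VALUES (1, 1, 3), (2, 2, 3), (3, 2, 1);
    INSERT INTO line_items VALUES (1, 1, 1), (2, 1, 1),
                                  (3, 2, 1), (4, 2, 2), (5, 2, 2),
                                  (6, 3, 2);
""")

rows = conn.execute("""
    SELECT c.id AS customer_id,
           c.first_name AS customer_first_name,
           cat.id AS category_id,
           cat.name AS category_name,
           COUNT(li.id) AS number_purchased
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    JOIN line_items li ON li.order_id = o.id
    JOIN categories_products cp ON cp.product_id = li.product_id
    JOIN categories cat ON cat.id = cp.category_id
    WHERE o.status = 3
    GROUP BY c.id, c.first_name, cat.id, cat.name
    ORDER BY c.id, cat.id
""").fetchall()

for row in rows:
    print(row)
```

A product that belongs to several categories is counted once in each of them, which may or may not be what you want.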
I want to find all sellers who have uploaded products in the categories (electronics, clothing, furniture), so for 3 categories there can be up to 3 rows for each seller. The tables I have are:
1.category{category_id,name},
2.seller {seller_id,username},
3.products{product_id,seller_id,category_id,title}
Note: There can be a maximum of 3 results per seller (because I'm searching in 3 categories), even if the seller added more than one product in a single category.
expected result:
**product_id** **category** **sellerUsername**
101 electronics kuldeep
211 furniture kuldeep
322 clothing kuldeep
167 electronics roman
245 furniture roman
247 clothing dangi
246 furniture dangi
..
..
if you need only the matching relation use inner join
select a.product_id, b.username, c.name
from products as a
inner join seller as b on b.seller_id = a.seller_id
inner join category as c on c.category_id = a.category_id
else use left join
select a.product_id, b.username, c.name
from products as a
left join seller as b on b.seller_id = a.seller_id
left join category as c on c.category_id = a.category_id
The general solution to your problem is to join the three tables together and then aggregate by seller and category. In my solution, I have arbitrarily chosen the max product ID, in the absence of any logic for doing otherwise. The query is slightly tricky, in that we need to additionally join this result again to the category and seller tables to get the human readable category and seller names. The reason for this is the GROUP BY query should ideally be done by ID and not name, since conceivably two categories (or sellers) could have the same name but have different IDs.
SELECT t1.product_id,
COALESCE(t2.name, 'NA'),
COALESCE(t3.username, 'NA')
FROM
(
SELECT MAX(p.product_id) AS product_id,
c.category_id,
s.seller_id
FROM products p
LEFT JOIN category c
ON p.category_id = c.category_id
LEFT JOIN seller s
ON p.seller_id = s.seller_id
WHERE c.name IN ('electronics', 'clothing', 'furniture')
GROUP BY s.seller_id,
c.category_id
) t1
LEFT JOIN category t2
ON t1.category_id = t2.category_id
LEFT JOIN seller t3
ON t1.seller_id = t3.seller_id
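Here is a rough way to try the aggregate-then-join idea on a few invented rows, using SQLite from Python as a stand-in for MySQL. The data is hypothetical: kuldeep has two electronics products, so only the larger product ID survives for that category, and roman's product in a fourth category (toys) is filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE seller (seller_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, seller_id INTEGER,
                           category_id INTEGER, title TEXT);

    INSERT INTO category VALUES (1, 'electronics'), (2, 'clothing'),
                                (3, 'furniture'), (4, 'toys');
    INSERT INTO seller VALUES (1, 'kuldeep'), (2, 'roman');
    INSERT INTO products VALUES
        (101, 1, 1, 'phone'), (102, 1, 1, 'radio'), (211, 1, 3, 'chair'),
        (167, 2, 1, 'camera'), (400, 2, 4, 'robot');
""")

rows = conn.execute("""
    SELECT g.product_id, c.name AS category, s.username AS sellerUsername
    FROM (
        -- one row per seller and category, grouped by ID rather than name
        SELECT MAX(p.product_id) AS product_id, p.category_id, p.seller_id
        FROM products p
        JOIN category c ON p.category_id = c.category_id
        WHERE c.name IN ('electronics', 'clothing', 'furniture')
        GROUP BY p.seller_id, p.category_id
    ) g
    JOIN category c ON c.category_id = g.category_id
    JOIN seller s ON s.seller_id = g.seller_id
    ORDER BY s.username, c.name
""").fetchall()

for row in rows:
    print(row)
```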
Check Below Code.
SET @row_number:=0;
SET @db_names:= '';
SET @db_names2:= '';
select product_id,name as category ,username as sellerUsername
from (
select a.product_id, c.name ,b.username,
@row_number:=CASE WHEN @db_names=username and @db_names2=name THEN @row_number+1
ELSE 1 END AS row_number,@db_names:=username AS username2,@db_names2:=name AS name2
from products as a
left join seller as b on b.seller_id = a.seller_id
left join category as c on c.category_id = a.category_id
where name IN ('electronics', 'clothing', 'furniture')
)a where row_number < 2
order by sellerUsername,name;
Output :