SELECT
bp.product_id,bs.step_number,
p.price, pd.name as product_name
FROM
builder_product bp
JOIN builder_step bs ON bp.builder_step_id = bs.builder_step_id
JOIN builder b ON bp.builder_id = b.builder_id
JOIN product p ON p.product_id = bp.product_id
JOIN product_description pd ON p.product_id = pd.product_id
WHERE b.builder_id = '74' and bs.optional != '1'
group by bs.step_number
ORDER by bs.step_number, p.price
But here are my results:
88 1 575.0000 Lenovo Thinkcentre POS PC
244 2 559.0000 Touchscreen with MSR - Firebox 15"
104 3 285.0000 Remote Order Printer - Epson
97 4 395.0000 Aldelo Lite
121 5 549.0000 Cash Register Express - Pro
191 6 349.0000 Integrated Payment Processing
155 7 369.0000 Accessory - Posiflex 12.1" LCD Customer Display
That's not how GROUP BY is supposed to work. If you group by a number of columns, your select can only return:
The columns you group by
Aggregation functions from other columns, such as MIN(), MAX(), AVG()...
So you'd need to do this:
SELECT
bs.step_number,
MIN(p.price) AS min_price, pd.name as product_name
FROM
builder_product bp
JOIN builder_step bs ON bp.builder_step_id = bs.builder_step_id
JOIN builder b ON bp.builder_id = b.builder_id
JOIN product p ON p.product_id = bp.product_id
JOIN product_description pd ON p.product_id = pd.product_id
WHERE b.builder_id = '74' and bs.optional != '1'
group by bs.step_number, pd.name
ORDER by bs.step_number, min_price
(MySQL allows a very relaxed syntax and will happily pick the values of an arbitrary row from each group, but other DBMSs will raise an error with your original query.)
Join to a subselect that contains only the minimum value of each group.
In this example, the subselect returns min(Amt) for each myGroup, i.e. the lowest dollar amount in the group.
I then join this back to the main table with an inner join to limit the records to only those that match that minimum.
Select A.myGROUP, A.Amt
from mtest A
INNER JOIN (Select myGroup, min(Amt) as minAmt from mtest group by mygroup) B
ON B.myGroup=A.mygroup
and B.MinAmt = A.Amt
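To see the join-back pattern above run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows and the extra item column are invented for illustration; only mtest, myGroup and Amt come from the answer.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mtest (myGroup TEXT, Amt REAL, item TEXT)")
conn.executemany(
    "INSERT INTO mtest VALUES (?, ?, ?)",
    [
        ("step1", 575.0, "PC A"),
        ("step1", 625.0, "PC B"),
        ("step2", 559.0, "Screen A"),
        ("step2", 499.0, "Screen B"),
    ],
)

# Derived table B holds one row per group with its minimum amount.
# Joining it back keeps only the full rows that carry that minimum.
rows = conn.execute(
    """
    SELECT A.myGroup, A.Amt, A.item
    FROM mtest A
    INNER JOIN (SELECT myGroup, MIN(Amt) AS minAmt
                FROM mtest
                GROUP BY myGroup) B
            ON B.myGroup = A.myGroup
           AND B.minAmt = A.Amt
    ORDER BY A.myGroup
    """
).fetchall()

print(rows)
```

Because the non-aggregated column (item) is read from the outer table rather than from the grouped subselect, it always belongs to the row that actually has the minimum amount.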
Yes. Each distinct group key is returned only once. This problem is not easily solved. Run two separate queries and combine the results afterwards. If this is not an option, create a temporary table holding the minimum price for each step and join it to the other tables in the query.
I had this question about filtering different products by selecting options. That query has been solved here: Filter products by options.
My problem is now with the count query for the pagination. For instance, this query returns 37 rows, each with a count of 1.
SELECT COUNT(DISTINCT p.id) AS number
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1
But if I remove DISTINCT:
SELECT COUNT(p.id) AS number
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1
this also returns 37 rows, but with mixed numbers.
What am I doing wrong? I know I could work around this by running an additional count on this result set, but I don't think that is the right solution.
Also, in the previous question it was suggested that I should not need DISTINCT and that the query is flawed because of it. Can you tell me what the problem is?
The only difference in your queries is here:
SELECT COUNT(DISTINCT p.id) AS number
vs.
SELECT COUNT(p.id) AS number
So you get the same number of result rows, because FROM, WHERE, and HAVING are all the same. Only the data per row you select is different.
In the first case it's the number of distinct IDs that are not null, which is always 1, because you group by that ID. (You say: Look at all records with ID 5 and count how many different IDs you find in these records. Well, the ID in every record with ID 5 is 5. So it's only one ID.)
In the second case you count IDs that are not null. As the ID is never null, this is the same as counting records: COUNT(*). And you'd better make this clear by using COUNT(*) instead of the obfuscated COUNT(p.id).
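The difference between the two counts can be reproduced with a small sketch. It uses Python's sqlite3 module in place of MySQL, and the two tables are cut down to invented sample rows (the file column is hypothetical).

```python
import sqlite3

# Hypothetical data: product 1 has three images, product 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    CREATE TABLE product_images (product_id INTEGER, file TEXT);
    INSERT INTO products VALUES (1), (2);
    INSERT INTO product_images VALUES
        (1, 'a.jpg'), (1, 'b.jpg'), (1, 'c.jpg'), (2, 'd.jpg');
    """
)

# One output row per p.id. Within a group every row has the same id,
# so COUNT(DISTINCT p.id) is 1, while COUNT(p.id) equals COUNT(*).
rows = conn.execute(
    """
    SELECT p.id,
           COUNT(DISTINCT p.id) AS distinct_ids,
           COUNT(p.id)          AS id_count,
           COUNT(*)             AS row_count
    FROM products p
    LEFT JOIN product_images pi ON p.id = pi.product_id
    GROUP BY p.id
    ORDER BY p.id
    """
).fetchall()

print(rows)
```

The "mixed numbers" in the second query are simply the number of joined rows per product, which depends on how many images, categories and options each product has.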
OK guys, it is clearer now. Is something like this good enough to solve it?
SELECT COUNT(*) AS number
FROM (SELECT p.id
FROM products p
LEFT JOIN product_categories pc ON p.id = pc.product_id
LEFT JOIN product_images pi ON p.id = pi.product_id
LEFT JOIN product_options po ON p.id = po.product_id
WHERE p.product_active = 1
AND po.option_id IN(1)
AND p.main_price BETWEEN 5250.00 AND 14000.00
GROUP BY(p.id)
HAVING COUNT(DISTINCT po.option_id) = 1) AS count_table
I will go with this one. Thanks.
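As a rough check of the wrapped count, the sketch below runs the same shape of query against invented rows using Python's sqlite3 module instead of MySQL. The product_categories join is left out for brevity, and all data values are assumptions.

```python
import sqlite3

# Hypothetical data: only products 1 and 2 should be counted.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE products (id INTEGER PRIMARY KEY, product_active INTEGER, main_price REAL);
    CREATE TABLE product_images (product_id INTEGER);
    CREATE TABLE product_options (product_id INTEGER, option_id INTEGER);
    INSERT INTO products VALUES (1, 1, 6000), (2, 1, 7000), (3, 1, 20000), (4, 1, 6000);
    INSERT INTO product_images VALUES (1), (1), (2);
    INSERT INTO product_options VALUES (1, 1), (2, 1), (2, 2), (3, 1), (4, 2);
    """
)

# The inner query yields one row per matching product;
# the outer COUNT(*) turns those rows into a single total.
total = conn.execute(
    """
    SELECT COUNT(*) AS number
    FROM (SELECT p.id
          FROM products p
          LEFT JOIN product_images pi ON p.id = pi.product_id
          LEFT JOIN product_options po ON p.id = po.product_id
          WHERE p.product_active = 1
            AND po.option_id IN (1)
            AND p.main_price BETWEEN 5250.00 AND 14000.00
          GROUP BY p.id
          HAVING COUNT(DISTINCT po.option_id) = 1) AS count_table
    """
).fetchone()[0]

print(total)
```

Product 3 falls outside the price range and product 4 lacks option 1, so two products remain even though product 1 contributes two joined rows.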
I've looked at similar GROUP_CONCAT MySQL optimisation threads but none seem relevant to my issue, and my MySQL knowledge is being stretched with this one.
I have been tasked with improving the speed of a script that contains an extremely heavy MySQL query.
The query in question uses GROUP_CONCAT to create a list of colours, tags and sizes all relevant to a particular product. It then uses HAVING / FIND_IN_SET to filter these concatenated lists for the attributes set by the user controls, and displays the results.
In the example below it's looking for all products with product_tag=1, product_colour=18 and product_size=17. So this could be a blue product (colour) in medium (size) for a male (tag).
The shop_products table contains about 3500 rows, so it is not particularly large, but the query below takes around 30 seconds to execute. It works OK with 1 or 2 joins, but adding the third just kills it.
SELECT shop_products.id, shop_products.name, shop_products.default_image_id,
GROUP_CONCAT( DISTINCT shop_product_to_colours.colour_id ) AS product_colours,
GROUP_CONCAT( DISTINCT shop_products_to_tag.tag_id ) AS product_tags,
GROUP_CONCAT( DISTINCT shop_product_colour_to_sizes.tag_id ) AS product_sizes
FROM shop_products
LEFT JOIN shop_product_to_colours ON shop_products.id = shop_product_to_colours.product_id
LEFT JOIN shop_products_to_tag ON shop_products.id = shop_products_to_tag.product_id
LEFT JOIN shop_product_colour_to_sizes ON shop_products.id = shop_product_colour_to_sizes.product_id
WHERE shop_products.category_id = '50'
GROUP BY shop_products.id
HAVING((FIND_IN_SET( 1, product_tags ) >0)
AND(FIND_IN_SET( 18, product_colours ) >0)
AND(FIND_IN_SET( 17, product_sizes ) >0))
ORDER BY shop_products.name ASC
LIMIT 0 , 30
I was hoping somebody could advise a better way to structure this query without restructuring the database (which isn't really an option at this point without weeks of data migration and script changes), or offer any general advice on optimisation. Using EXPLAIN currently returns the below (as you can see, the indexes are all over the place!).
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | SIMPLE | shop_products | ref | category_id,category_id_2 | category_id | 2 | const | 3225 | Using where; Using temporary; Using filesort
1 | SIMPLE | shop_product_to_colours | ref | product_id,product_id_2,product_id_3 | product_id | 4 | candymix_db.shop_products.id | 13 |
1 | SIMPLE | shop_products_to_tag | ref | product_id,product_id_2 | product_id | 4 | candymix_db.shop_products.id | 4 |
1 | SIMPLE | shop_product_colour_to_sizes | ref | product_id | product_id | 4 | candymix_db.shop_products.id | 133 |
Rewrite the query to use WHERE instead of HAVING. WHERE is applied while MySQL searches for rows, so it can use an index. HAVING is applied after the rows have been selected and grouped, to filter the already-built result, so by design it can't use indexes.
You can do it, for example, this way:
SELECT p.id, p.name, p.default_image_id,
GROUP_CONCAT( DISTINCT pc.colour_id ) AS product_colours,
GROUP_CONCAT( DISTINCT pt.tag_id ) AS product_tags,
GROUP_CONCAT( DISTINCT ps.tag_id ) AS product_sizes
FROM shop_products p
JOIN shop_product_to_colours pc_test ON p.id = pc_test.product_id AND pc_test.colour_id = 18
JOIN shop_products_to_tag pt_test ON p.id = pt_test.product_id AND pt_test.tag_id = 1
JOIN shop_product_colour_to_sizes ps_test ON p.id = ps_test.product_id AND ps_test.tag_id = 17
JOIN shop_product_to_colours pc ON p.id = pc.product_id
JOIN shop_products_to_tag pt ON p.id = pt.product_id
JOIN shop_product_colour_to_sizes ps ON p.id = ps.product_id
WHERE p.category_id = '50'
GROUP BY p.id
ORDER BY p.name ASC
Update
We are joining each table two times.
First to check if it contains some value (condition from FIND_IN_SET).
Second join will produce data for GROUP_CONCAT to select all product values from table.
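The two roles of the doubled joins can be seen in a reduced sketch. It uses Python's sqlite3 module rather than MySQL, only two of the three attribute tables, and invented sample rows.

```python
import sqlite3

# Hypothetical data: only the shirt has colour 18 and tag 1.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shop_products (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER);
    CREATE TABLE shop_product_to_colours (product_id INTEGER, colour_id INTEGER);
    CREATE TABLE shop_products_to_tag (product_id INTEGER, tag_id INTEGER);
    INSERT INTO shop_products VALUES (1, 'Shirt', 50), (2, 'Hat', 50);
    INSERT INTO shop_product_to_colours VALUES (1, 18), (1, 20), (2, 20);
    INSERT INTO shop_products_to_tag VALUES (1, 1), (1, 2), (2, 1);
    """
)

# *_test joins filter products; the plain joins collect every value
# of the surviving products for GROUP_CONCAT.
rows = conn.execute(
    """
    SELECT p.id, p.name,
           GROUP_CONCAT(DISTINCT pc.colour_id) AS product_colours,
           GROUP_CONCAT(DISTINCT pt.tag_id)    AS product_tags
    FROM shop_products p
    JOIN shop_product_to_colours pc_test
      ON p.id = pc_test.product_id AND pc_test.colour_id = 18
    JOIN shop_products_to_tag pt_test
      ON p.id = pt_test.product_id AND pt_test.tag_id = 1
    JOIN shop_product_to_colours pc ON p.id = pc.product_id
    JOIN shop_products_to_tag pt ON p.id = pt.product_id
    WHERE p.category_id = 50
    GROUP BY p.id
    ORDER BY p.name
    """
).fetchall()


def as_set(csv):
    """Turn a GROUP_CONCAT string such as '18,20' into a set of ints."""
    return {int(value) for value in csv.split(",")}


result = [(pid, name, as_set(colours), as_set(tags)) for pid, name, colours, tags in rows]
print(result)
```

The concatenated lists are compared as sets because the order of values inside GROUP_CONCAT is not guaranteed without an explicit ordering.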
Update 2
As @Matt Raines commented, if we don't need to list the product values with GROUP_CONCAT, the query becomes even simpler:
SELECT p.id, p.name, p.default_image_id
FROM shop_products p
JOIN shop_product_to_colours pc ON p.id = pc.product_id
JOIN shop_products_to_tag pt ON p.id = pt.product_id
JOIN shop_product_colour_to_sizes ps ON p.id = ps.product_id
WHERE p.category_id = '50'
AND (pc.colour_id = 18 AND pt.tag_id = 1 AND ps.tag_id = 17)
GROUP BY p.id
ORDER BY p.name ASC
This will select all products with three filtered attributes.
If I understand the question correctly, what you need to do is:
Find a list of all of the shop_products.id values that have the correct tag/colour/size options
Get a list of all of the tag/colour/size combinations available for each of those product ids.
I was trying to make you a SQLFiddle for this, but the site seems broken at the moment. Try something like:
SELECT shop_products.id, shop_products.name, shop_products.default_image_id,
GROUP_CONCAT( DISTINCT shop_product_to_colours.colour_id ) AS product_colours,
GROUP_CONCAT( DISTINCT shop_products_to_tag.tag_id ) AS product_tags,
GROUP_CONCAT( DISTINCT shop_product_colour_to_sizes.tag_id ) AS product_sizes
FROM
shop_products INNER JOIN
(SELECT shop_products.id id
FROM
shop_products
LEFT JOIN shop_product_to_colours ON shop_products.id = shop_product_to_colours.product_id
LEFT JOIN shop_products_to_tag ON shop_products.id = shop_products_to_tag.product_id
LEFT JOIN shop_product_colour_to_sizes ON shop_products.id = shop_product_colour_to_sizes.product_id
WHERE
shop_products.category_id = '50'
AND shop_products_to_tag.tag_id=1
AND shop_product_to_colours.colour_id=18
AND shop_product_colour_to_sizes.tag_id=17
) matches ON shop_products.id = matches.id
LEFT JOIN shop_product_to_colours ON shop_products.id = shop_product_to_colours.product_id
LEFT JOIN shop_products_to_tag ON shop_products.id = shop_products_to_tag.product_id
LEFT JOIN shop_product_colour_to_sizes ON shop_products.id = shop_product_colour_to_sizes.product_id
GROUP BY shop_products.id
ORDER BY shop_products.name ASC
LIMIT 0 , 30;
The problem with your first approach is that it requires the database to create every combination for every product and then filter. In my example, I'm filtering down the product ids first and then generating the combinations.
My query is untested as I don't have a MySQL environment handy and SQLFiddle is down, but it should give you the idea.
First, I aliased your tables to improve readability.
SP = shop_products
PC = shop_product_to_colours
PT = shop_products_to_tag
PS = shop_product_colour_to_sizes
Next, your HAVING should be a WHERE, since you are explicitly looking for something. There is no need to query the entire system just to throw records away after the result is built. Third, you had LEFT JOINs, but when a WHERE or HAVING condition applies to the joined table and does not allow for NULL, it effectively becomes an inner JOIN (both sides required). Finally, your WHERE clause has quotes around the ID you are looking for, but that column is probably an integer anyway. Remove the quotes.
Now, for indexes and optimization. To help with the criteria, grouping, and JOINs, I would have the following composite (multi-column) indexes instead of indexes on individual columns only.
table index
shop_products ( category_id, id, name )
shop_product_to_colours ( product_id, colour_id )
shop_products_to_tag ( product_id, tag_id )
shop_product_colour_to_sizes ( product_id, tag_id )
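As a sketch of how those composite indexes could be declared, the snippet below creates them with Python's sqlite3 module standing in for MySQL. The index names are hypothetical, the column types are guesses, and SQLite's query planner is not MySQL's, so the printed plan is only indicative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shop_products (id INTEGER PRIMARY KEY, category_id INTEGER, name TEXT);
    CREATE TABLE shop_product_to_colours (product_id INTEGER, colour_id INTEGER);
    CREATE TABLE shop_products_to_tag (product_id INTEGER, tag_id INTEGER);
    CREATE TABLE shop_product_colour_to_sizes (product_id INTEGER, tag_id INTEGER);

    -- Composite indexes: the join column first, then the filtered column.
    CREATE INDEX idx_products_cat_id_name ON shop_products (category_id, id, name);
    CREATE INDEX idx_colours_product_colour ON shop_product_to_colours (product_id, colour_id);
    CREATE INDEX idx_tags_product_tag ON shop_products_to_tag (product_id, tag_id);
    CREATE INDEX idx_sizes_product_tag ON shop_product_colour_to_sizes (product_id, tag_id);
    """
)

# List the indexes that now exist.
index_names = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(index_names)

# Show how SQLite would plan a lookup on both indexed columns.
for row in conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT product_id FROM shop_product_to_colours "
    "WHERE product_id = 1 AND colour_id = 18"
):
    print(row)
```

In MySQL the equivalent statements would be CREATE INDEX or ALTER TABLE ... ADD INDEX, followed by EXPLAIN on the real query to confirm which keys are chosen.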
Revised query
SELECT
SP.id,
SP.name,
SP.default_image_id,
GROUP_CONCAT( DISTINCT PC.colour_id ) AS product_colours,
GROUP_CONCAT( DISTINCT PT.tag_id ) AS product_tags,
GROUP_CONCAT( DISTINCT PS.tag_id ) AS product_sizes
FROM
shop_products SP
JOIN shop_product_to_colours PC
ON SP.id = PC.product_id
AND PC.colour_id = 18
JOIN shop_products_to_tag PT
ON SP.id = PT.product_id
AND PT.tag_id = 1
JOIN shop_product_colour_to_sizes PS
ON SP.id = PS.product_id
AND PS.tag_id = 17
WHERE
SP.category_id = 50
GROUP BY
SP.id
ORDER BY
SP.name ASC
LIMIT
0 , 30
One final comment. Since you are ordering by name but grouping by id, the final sort may add a delay. However, if you change the query to group by name plus id, each group is still unique by id, and an adjusted index on shop_products
table index
shop_products ( category_id, name, id )
will help both the GROUP BY and the ORDER BY, since the rows will come out of the index in that natural order.
GROUP BY
SP.name,
SP.id
ORDER BY
SP.name ASC,
SP.ID
QUERY 1(FIRST TABLE - STOCK IN)
select sum(s.liquid_quantity) as 'stock in total' from stockin_detail s
left join reagent r on r.id = s.reagent_id group by r.name
QUERY 2(SECOND TABLE - STOCK OUT)
select sum(t.consumption)as 'stock out total' ,r.name from stock_out s
inner join test_consumption t on s.consumption_id = t.id
inner join reagent r on r.id = t.reagent_id group by r.name
QUERY 1
stock in |r.name
100 |Reagent2
100 |Reagent3
QUERY 2
stock out |r.name
40 |Reagent2
20 |Reagent3
I tried doing this, but it won't subtract, because each of the nested SELECT statements triggers a 'more than one column' error due to the GROUP BY.
I also tried removing the GROUP BY, but that ended up combining the two different stocks before subtracting.
SELECT QUERY1 - QUERY2 AS 'current stocks'
EXPECTED OUTCOME
current stock|r.name
60 |Reagent2
80 |Reagent3
Why not dump the two queries into inline views and then join them together?
Something like this should get you going. Inline view aliased 'i' is the stock in and inline view aliased 'o' is the stock out:
-- IN and OUT are reserved words in MySQL, so the totals use other alias names
select i.name, i.total_in - ifnull(o.total_out, 0) as 'current stock'
from
(
select sum(s.liquid_quantity) as total_in, r.name
from stockin_detail s
left join reagent r on r.id = s.reagent_id
group by r.name
) i
left outer join
(
select sum(t.consumption) as total_out, r.name from stock_out s
inner join test_consumption t on s.consumption_id = t.id
inner join reagent r on r.id = t.reagent_id
group by r.name
) o on i.name = o.name;
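A quick way to check the inline-view approach is to run it on made-up rows. The sketch below uses Python's sqlite3 module instead of MySQL; the column types, the alias names total_in and total_out, and the extra Reagent4 row (stock in, nothing out) are assumptions added to exercise the IFNULL branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE reagent (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stockin_detail (reagent_id INTEGER, liquid_quantity INTEGER);
    CREATE TABLE test_consumption (id INTEGER PRIMARY KEY, reagent_id INTEGER, consumption INTEGER);
    CREATE TABLE stock_out (consumption_id INTEGER);
    INSERT INTO reagent VALUES (2, 'Reagent2'), (3, 'Reagent3'), (4, 'Reagent4');
    INSERT INTO stockin_detail VALUES (2, 60), (2, 40), (3, 100), (4, 50);
    INSERT INTO test_consumption VALUES (1, 2, 40), (2, 3, 20);
    INSERT INTO stock_out VALUES (1), (2);
    """
)

# View i totals stock in per reagent, view o totals stock out.
# The LEFT JOIN keeps reagents that were never consumed.
rows = conn.execute(
    """
    SELECT i.name, i.total_in - IFNULL(o.total_out, 0) AS current_stock
    FROM (SELECT SUM(s.liquid_quantity) AS total_in, r.name
          FROM stockin_detail s
          LEFT JOIN reagent r ON r.id = s.reagent_id
          GROUP BY r.name) i
    LEFT OUTER JOIN
         (SELECT SUM(t.consumption) AS total_out, r.name
          FROM stock_out s
          INNER JOIN test_consumption t ON s.consumption_id = t.id
          INNER JOIN reagent r ON r.id = t.reagent_id
          GROUP BY r.name) o
      ON i.name = o.name
    ORDER BY i.name
    """
).fetchall()

print(rows)
```

Each inline view is aggregated on its own before the join, so the sums are not multiplied by rows from the other side.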
I have two tables, Product and Benchmark
A benchmark is linked to only one product. There can only be one benchmark per year per product.
I would like to retrieve every product's name for a set of years, and count how many benchmarks there are for each product.
SELECT p.name,
p.id,
COUNT(p.id) AS nb_benchmark
FROM product p
INNER JOIN benchmark b0 ON b0.product_id = p.id
INNER JOIN benchmark b1 ON b1.product_id = p.id
WHERE p.owner = "MyCompany"
AND b0.year = 2011
AND b1.year = 2012
GROUP BY p.id
ORDER BY nb_benchmark DESC
But the count is wrong: it's way too high, and it even gives me more results than there actually are in the database. I guess it's because of the JOINs, but I don't know how to build the query.
Remember that the basis of SQL joining is the Cartesian product of the rows in the referenced tables, which is then reduced by filters and join conditions. You are joining twice to the benchmark table, which, judging from the results of your query, we can assume has many rows per product per benchmark year, so each join multiplies the row count.
e.g. 1 Product with 3 Benchmark rows each for 2011 and 2012
FROM product p -- 1 Product Row
INNER JOIN benchmark b0 ON b0.product_id = p.id -- 1 x 3 = 3
INNER JOIN benchmark b1 ON b1.product_id = p.id -- 1 x 3 x 3 = 9
So the multiple joins to benchmark introduce duplicate rows for each product, which are then counted.
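The 1 x 3 x 3 = 9 arithmetic above can be reproduced directly. This is a small sketch with Python's sqlite3 module standing in for MySQL, using one invented product with three benchmark rows per year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, owner TEXT);
    CREATE TABLE benchmark (id INTEGER PRIMARY KEY, product_id INTEGER, year INTEGER);
    INSERT INTO product VALUES (1, 'Widget', 'MyCompany');
    INSERT INTO benchmark (product_id, year) VALUES
        (1, 2011), (1, 2011), (1, 2011),
        (1, 2012), (1, 2012), (1, 2012);
    """
)

# Two joins to benchmark: every 2011 row pairs with every 2012 row.
double_join = conn.execute(
    """
    SELECT COUNT(p.id),
           COUNT(DISTINCT b0.id) + COUNT(DISTINCT b1.id)
    FROM product p
    INNER JOIN benchmark b0 ON b0.product_id = p.id AND b0.year = 2011
    INNER JOIN benchmark b1 ON b1.product_id = p.id AND b1.year = 2012
    WHERE p.owner = 'MyCompany'
    GROUP BY p.id
    """
).fetchone()

# One join with both years in the filter: no multiplication.
single_join = conn.execute(
    """
    SELECT COUNT(b.id)
    FROM product p
    INNER JOIN benchmark b ON b.product_id = p.id
    WHERE p.owner = 'MyCompany' AND b.year IN (2011, 2012)
    GROUP BY p.id
    """
).fetchone()

print(double_join, single_join)
```

The plain count over the double join sees 9 rows, while both the DISTINCT-based sum and the single join arrive at the 6 benchmark rows that actually exist.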
You can use COUNT(DISTINCT xx) to count distinct values, so your query should be of the form:
SELECT p.name,
p.id,
COUNT(DISTINCT p.id) AS distinct_products,
COUNT(DISTINCT b.name) AS distinct_benchmark_names
-- etc
FROM ...
Other Notes
For correctness' sake you should GROUP BY both p.id and p.name. Although MySQL allows grouping by p.id alone, other RDBMSs are stricter.
Try this:
SELECT p.name,
p.id,
COUNT(b0.id) AS nb_benchmark
FROM product p
INNER JOIN benchmark b0 ON b0.product_id = p.id
WHERE p.owner = "MyCompany"
AND b0.year IN (2011, 2012)
GROUP BY p.name, p.id
ORDER BY nb_benchmark DESC
I have found a way to achieve what I wanted
SELECT p.name, p.id, COUNT(DISTINCT(b0.id)) + COUNT(DISTINCT(b1.id)) as nb_benchmark
FROM product p
INNER JOIN benchmark b0 ON b0.product_id = p.id AND b0.year = 2011
INNER JOIN benchmark b1 ON b1.product_id = p.id AND b1.year = 2012
WHERE
p.owner = "myCompany"
GROUP BY p.id
ORDER BY nb_benchmark DESC
Try this.
SELECT p.id, p.name, b.nb_benchmark
FROM product p
JOIN (
/* number of benchmarks per product for years 2011 and 2012 */
SELECT product_id, COUNT(*) AS nb_benchmark
FROM benchmark
WHERE year = 2011 OR year = 2012
GROUP BY product_id
) b ON p.id = b.product_id
WHERE p.owner = "MyCompany"
ORDER BY nb_benchmark DESC
I want to find the last payment (or NULL if n/a) made for each specified product_id. Below is a representation of the tables I'm working with (simplified version).
+----------+
|Products |
+----------+
|product_id|
+----------+
+---------------+
|Orders |
+---------------+
|order_id |
|order_timestamp|
|order_status |
+---------------+
+-----------------+
|ProductsOrdersMap|
+-----------------+
|product_id |
|order_id |
+-----------------+
After JOINs, MAXs, GROUP BYs, LEFT JOINs, and multiple INNER JOINs to get the greatest-n-per-group, I still can't get the right result. Most of the time, products with multiple orders return multiple rows. The best results I have so far are these (I was searching for specific products):
product_id order_id order_timestamp order_status
8 NULL NULL NULL
9 NULL NULL NULL
10 NULL NULL NULL
12 NULL NULL NULL
13 NULL NULL NULL
14 11 2013-08-13 07:22:01 finished
15 11 2013-08-13 07:22:01 finished
15 12 2013-08-14 00:00:00 finished
32 11 2013-08-13 07:22:01 finished
83 9 2013-08-13 07:04:02 finished
83 10 2013-08-13 07:11:42 finished
Edit: After PP.'s answer, I ended up with the following query:
SELECT p.product_id, o.order_id, MAX(order_timestamp) AS order_timestamp, order_status
FROM Products p LEFT JOIN (ProductsOrdersMap m, Orders o)
ON (p.product_id = m.product_id AND m.order_id = o.order_id)
WHERE p.product_id IN (8,9,10,12,13,14,15,32,83)
GROUP BY p.product_id
Which returns
product_id order_id order_timestamp order_status
8 NULL NULL NULL
9 NULL NULL NULL
10 NULL NULL NULL
12 NULL NULL NULL
13 NULL NULL NULL
14 11 2013-08-13 07:22:01 finished
15 11 2013-08-13 07:22:01 finished
32 11 2013-08-13 07:22:01 finished
83 9 2013-08-13 07:04:02 finished
At first glance it seems correct, but only the product IDs and the timestamps are right. Comparing the two result sets above, you can see that, for products 15 and 83, order_id is wrong (order_status might be wrong as well).
This query should return the specified result set (it is only desk-checked, not tested).
To return ALL product_id values:
SELECT p.product_id
, m.order_id
, m.order_timestamp
, m.order_status
FROM Products p
LEFT
JOIN ( SELECT kl.product_id
, MAX(ko.order_timestamp) AS latest_timestamp
FROM ProductsOrdersMap kl
JOIN Orders ko
ON ko.order_id = kl.order_id
GROUP
BY kl.product_id
) l
ON l.product_id = p.product_id
LEFT
JOIN ( SELECT ml.product_id
, mo.order_id
, mo.order_timestamp
, mo.order_status
FROM ProductsOrdersMap ml
JOIN Orders mo
ON mo.order_id = ml.order_id
) m
ON m.product_id = l.product_id
AND m.order_timestamp = l.latest_timestamp
GROUP
BY p.product_id
The inline view "l" gets us the latest "order_timestamp" for each "product_id". This is joined to inline view "m" to get us the whole row for the order that has the latest timestamp.
If there happens to be more than one order with the same latest "order_timestamp" (i.e. order_timestamp is not guaranteed to be unique for a given product_id) then the outermost GROUP BY ensures that only one of those order rows is returned.
If only particular product_id values need to be returned, add a WHERE clause in the outermost query. For performance, that same predicate can be repeated in the inline views.
To return only SPECIFIC product_id values, we add three WHERE clauses:
SELECT p.product_id
, m.order_id
, m.order_timestamp
, m.order_status
FROM Products p
LEFT
JOIN ( SELECT kl.product_id
, MAX(ko.order_timestamp) AS latest_timestamp
FROM ProductsOrdersMap kl
JOIN Orders ko
ON ko.order_id = kl.order_id
WHERE kl.product_id IN (8,9,10,12,13,14,15,32,83)
GROUP
BY kl.product_id
) l
ON l.product_id = p.product_id
LEFT
JOIN ( SELECT ml.product_id
, mo.order_id
, mo.order_timestamp
, mo.order_status
FROM ProductsOrdersMap ml
JOIN Orders mo
ON mo.order_id = ml.order_id
WHERE ml.product_id IN (8,9,10,12,13,14,15,32,83)
) m
ON m.product_id = l.product_id
AND m.order_timestamp = l.latest_timestamp
WHERE p.product_id IN (8,9,10,12,13,14,15,32,83)
GROUP
BY p.product_id
Only the WHERE clause on the outermost query is required. The other two are added just to improve performance by limiting the size of each of the derived tables.
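To sanity-check the two-inline-view approach, here is a sketch run through Python's sqlite3 module in place of MySQL. The rows are a subset of the data shown in the question, timestamps are stored as text, and the outermost GROUP BY is left out because this sample has no tied timestamps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Products (product_id INTEGER PRIMARY KEY);
    CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, order_timestamp TEXT, order_status TEXT);
    CREATE TABLE ProductsOrdersMap (product_id INTEGER, order_id INTEGER);
    INSERT INTO Products VALUES (8), (14), (15), (83);
    INSERT INTO Orders VALUES
        (9,  '2013-08-13 07:04:02', 'finished'),
        (10, '2013-08-13 07:11:42', 'finished'),
        (11, '2013-08-13 07:22:01', 'finished'),
        (12, '2013-08-14 00:00:00', 'finished');
    INSERT INTO ProductsOrdersMap VALUES (14, 11), (15, 11), (15, 12), (83, 9), (83, 10);
    """
)

# View l: latest timestamp per product.
# View m: every product/order pair with the order's columns.
# Joining m to l on product and timestamp keeps the latest order's row.
rows = conn.execute(
    """
    SELECT p.product_id, m.order_id, m.order_timestamp, m.order_status
    FROM Products p
    LEFT JOIN (SELECT kl.product_id, MAX(ko.order_timestamp) AS latest_timestamp
               FROM ProductsOrdersMap kl
               JOIN Orders ko ON ko.order_id = kl.order_id
               GROUP BY kl.product_id) l
           ON l.product_id = p.product_id
    LEFT JOIN (SELECT ml.product_id, mo.order_id, mo.order_timestamp, mo.order_status
               FROM ProductsOrdersMap ml
               JOIN Orders mo ON mo.order_id = ml.order_id) m
           ON m.product_id = l.product_id
          AND m.order_timestamp = l.latest_timestamp
    ORDER BY p.product_id
    """
).fetchall()

for row in rows:
    print(row)
```

Products 15 and 83 now come back with orders 12 and 10, the ones that actually carry the latest timestamps, and product 8 keeps its NULL columns.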
SELECT
P.product_id
,MAX(order_timestamp)
FROM
Products P
,Orders O
,ProductsOrdersMap M
WHERE
P.product_id = M.product_id
AND O.order_id = M.order_id
GROUP BY
P.product_id
To return all products, even those without orders, a LEFT JOIN is definitely the way to go. The answer from @PP above uses "old-style" inner joins and is equivalent to this:
SELECT
P.product_id
,MAX(order_timestamp)
FROM Products P
INNER JOIN ProductsOrdersMap M ON P.product_id = M.product_id
INNER JOIN Orders O ON O.order_id = M.order_id
GROUP BY
P.product_id
Starting with this syntax it's a lot easier to get to the LEFT JOIN - just replace INNER with LEFT:
SELECT
P.product_id
,MAX(order_timestamp)
FROM Products P
LEFT JOIN ProductsOrdersMap M ON P.product_id = M.product_id
LEFT JOIN Orders O ON O.order_id = M.order_id
GROUP BY
P.product_id
Addendum: Renato needed something more than just reworking the other answer as a LEFT JOIN, because the order_id and order_status have to come along with the maximum timestamp. The easiest approach is to start with a list of product IDs and order IDs where the order has the maximum timestamp for its product:
SELECT
p2.product_id,
o2.order_id
FROM Products p2
INNER JOIN ProductsOrdersMap m ON p2.product_id = m.product_id
INNER JOIN Orders o2 ON m.order_id = o2.order_id
WHERE (p2.product_id, o2.order_timestamp) IN (
SELECT m2.product_id, MAX(o3.order_timestamp)
FROM ProductsOrdersMap m2
INNER JOIN Orders o3 ON m2.order_id = o3.order_id
GROUP BY m2.product_id)
Then, instead of using ProductsOrdersMap to resolve products to orders, use the results from the query above:
SELECT
p.product_id,
o.order_id,
o.order_timestamp,
o.order_status
FROM Products p
LEFT JOIN (
SELECT
p2.product_id,
o2.order_id
FROM Products p2
INNER JOIN ProductsOrdersMap m ON p2.product_id = m.product_id
INNER JOIN Orders o2 ON m.order_id = o2.order_id
WHERE (p2.product_id, o2.order_timestamp) IN (
SELECT m2.product_id, MAX(o3.order_timestamp)
FROM ProductsOrdersMap m2
INNER JOIN Orders o3 ON m2.order_id = o3.order_id
GROUP BY m2.product_id)
) MaxTS ON p.product_id = MaxTS.product_id
LEFT JOIN Orders o ON MaxTS.order_id = o.order_id