Optimize joined order by query - mysql

I have the following query:
SELECT `p_products`.`id`, `p_products`.`name`, `p_products`.`date`,
`p_products`.`img`, `p_products`.`safe_name`, `p_products`.`sku`,
`p_products`.`productstatusid`, `op`.`quantity`
FROM `p_products`
INNER JOIN `p_product_p_category`
ON `p_products`.`id` = `p_product_p_category`.`p_product_id`
LEFT JOIN (SELECT `p_product_id`,`order_date`,SUM(`product_quantity`) as quantity
FROM `p_orderedproducts`
WHERE `order_date`>='2013-03-01 16:51:17'
GROUP BY `p_product_id`) AS op
ON `p_products`.`id` = `op`.`p_product_id`
WHERE `p_product_p_category`.`p_category_id` IN ('15','23','32')
AND `p_products`.`active` = '1'
GROUP BY `p_products`.`id`
ORDER BY `date` DESC
Explain says:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY p_product_p_category ref p_product_id,p_category_id,p_product_id_2 p_category_id 4 const 8239 Using temporary; Using filesort
1 PRIMARY p_products eq_ref PRIMARY PRIMARY 4 pdev.p_product_p_category.p_product_id 1 Using where
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 78
2 DERIVED p_orderedproducts index order_date p_product_id 4 NULL 201 Using where
And I have indexes on a number of columns including p_products.date.
The problem is the speed when there are more than 5,000 products in a number of categories; 60,000 products take >1 second. Is there any way to speed things up?
This also holds true if I remove the LEFT JOIN, in which case the result is:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p_product_p_category index p_product_id,p_category_id,p_product_id_2 p_product_id_2 8 NULL 91167 Using where; Using index; Using temporary; Using filesort
1 SIMPLE p_products eq_ref PRIMARY PRIMARY 4 pdev.p_product_p_category.p_product_id 1 Using where
The intermediate table p_product_p_category has indexes on both p_product_id and p_category_id, as well as a combined index on both.
I tried Ochi's suggestion and ended up with:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 62087 Using temporary; Using filesort
1 PRIMARY nr1media_products eq_ref PRIMARY PRIMARY 4 cats.nr1media_product_id 1 Using where
2 DERIVED nr1media_product_nr1media_category range nr1media_category_id nr1media_category_id 4 NULL 62066 Using where
I think I can simplify the question to: how can I join my products to the category intermediate table to fetch all unique products for the selected categories, sorted by date?
EDIT:
This gives me all unique products in the categories without using a temp table for ordering or grouping:
SELECT
`p_products`.`id`,
`p_products`.`name`,
`p_products`.`img`,
`p_products`.`safe_name`,
`p_products`.`sku`,
`p_products`.`productstatusid`
FROM
p_products
WHERE
EXISTS (
SELECT
1
FROM
p_product_p_category
WHERE
p_product_p_category.p_product_id = p_products.id
AND p_category_id IN ('15', '23', '32')
)
AND p_products.active = 1
ORDER BY
`date` DESC
The above query is very fast, much faster than the join using GROUP BY/ORDER BY (0.04 vs 0.7 sec), although I don't understand why it can run this query without temp tables.
I think I need to find another solution for the orderedproducts join; it still slows the query down to >1 sec. I might make a cron job to update the ranking of the products sold once every night and save that info to the p_products table.
Unless someone has a definitive solution...
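For what it's worth, a minimal sketch of that nightly denormalization idea (the recent_quantity column name and the 30-day window are assumptions, not taken from the original schema):

-- one-time: add a denormalized column to hold recent sales (name is an assumption)
ALTER TABLE p_products ADD COLUMN recent_quantity INT NOT NULL DEFAULT 0;

-- run from cron once per night: refresh the column from p_orderedproducts
UPDATE p_products AS p
LEFT JOIN (
    SELECT p_product_id, SUM(product_quantity) AS quantity
    FROM p_orderedproducts
    WHERE order_date >= NOW() - INTERVAL 30 DAY
    GROUP BY p_product_id
) AS op ON op.p_product_id = p.id
SET p.recent_quantity = COALESCE(op.quantity, 0);

With that in place the main query could rank on p_products.recent_quantity directly, without touching p_orderedproducts at read time.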

You are joining every category to products - only then does it get filtered by category id.
Try to limit your query as early as possible. For example, instead of
INNER JOIN `p_product_p_category`
do
INNER JOIN ( SELECT * FROM `p_product_p_category` WHERE `p_category_id` IN ('15','23','32') ) AS cats
so that you will be working on a smaller subset of products right from the beginning (note that the derived table needs its own alias).
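Spelled out against the query from the question, that suggestion looks roughly like this (cats matches the alias visible in the asker's follow-up EXPLAIN; everything else is unchanged from the question):
SELECT `p_products`.`id`, `p_products`.`name`, `p_products`.`date`,
`p_products`.`img`, `p_products`.`safe_name`, `p_products`.`sku`,
`p_products`.`productstatusid`, `op`.`quantity`
FROM `p_products`
INNER JOIN (SELECT * FROM `p_product_p_category`
WHERE `p_category_id` IN ('15','23','32')) AS cats
ON `p_products`.`id` = `cats`.`p_product_id`
LEFT JOIN (SELECT `p_product_id`, SUM(`product_quantity`) AS quantity
FROM `p_orderedproducts`
WHERE `order_date` >= '2013-03-01 16:51:17'
GROUP BY `p_product_id`) AS op
ON `p_products`.`id` = `op`.`p_product_id`
WHERE `p_products`.`active` = '1'
GROUP BY `p_products`.`id`
ORDER BY `date` DESC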

One possible solution would be to remove the derived table and just do a single Group By:
Select P.id, P.name, P.date
, P.img, P.safe_name, P.sku
, P.productstatusid
, Sum( OP.product_quantity ) As quantity
From p_products As P
Join p_product_p_category As CAT
On P.id = CAT.p_product_id
Left Join p_orderedproducts As OP
On OP.p_product_id = P.id
And OP.order_date >= '2013-03-01 16:51:17'
Where CAT.p_category_id In ('15','23','32')
And P.active = '1'
Group By P.id, P.name, P.date
, P.img, P.safe_name, P.sku
, P.productstatusid
Order By P.date Desc
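Whichever shape of the query you use, the LEFT JOIN side has to locate p_orderedproducts rows by product and date, so a composite index along these lines may help (the index name is an assumption):
CREATE INDEX idx_orderedproducts_product_date
ON p_orderedproducts (p_product_id, order_date);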

Related

Improve performance of a last status retrieval from history table

I want to retrieve the latest status for an item from a history table. The history table has a record of all status changes for an item. The query must be quick to run.
Below is the query that I use to get the latest status per item:
SELECT item_history.*
FROM item_history
INNER JOIN (
SELECT MAX(created_at) as created_at, item_id
FROM item_history
GROUP BY item_id
) as latest_status
on latest_status.item_id = item_history.item_id
and latest_status.created_at = item_history.created_at
WHERE item_history.status_id = 1
and item_history.created_at BETWEEN "2020-12-16" AND "2020-12-23"
I've tried putting the query above into another inner join to link the data with an item:
SELECT *
FROM `items`
INNER JOIN ( [query from above] )
WHERE items.category_id = 3
Notes about the item_history table: I have indexes on the following columns: status_id, created_at and listing_id. I have also turned those 3 into a compound primary key.
My issue is that MySQL keeps scanning the full table to grab MAX(created_at), which is a very slow operation, even though I only have 3 million records in the history table.
Query plan as follows:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY items NULL ref "PRIMARY,district" district 18 const 694 100.00 NULL
1 PRIMARY item_history NULL ref "PRIMARY,status_id,created_at,item_history_item_id_index" PRIMARY 9 "main.items.id,const" 1 100.00 "Using where"
1 PRIMARY <derived2> NULL ref <auto_key0> <auto_key0> 14 "func,main.items.id" 10 100.00 "Using where; Using index"
2 DERIVED item_history NULL range "PRIMARY,status_id,created_at,item_history_item_id_index" item_history_item_id_index 8 NULL 2751323 100.00 "Using index"
I want to retrieve the latest status for an item from a history table.
If you want the results for just one item, then use order by and limit:
select *
from item_history
where item_id = ? and created_at between '2020-12-16' and '2020-12-23'
order by created_at desc limit 1
This query would benefit from an index on (item_id, created_at).
If you want the latest status per item, I would recommend a correlated subquery:
select *
from item_history h
where created_at = (
select max(h1.created_at)
from item_history h1
where h1.item_id = h.item_id
and h1.created_at between '2020-12-16' and '2020-12-23'
)
The same index should be beneficial.
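A sketch of that index (the index name is an assumption); it lets the correlated MAX(created_at) lookup be answered from the index per item instead of scanning the table:
ALTER TABLE item_history ADD INDEX idx_item_created_at (item_id, created_at);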
Using a window function (MySQL 8.0.14+):
WITH cte AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY item_id ORDER BY created_at DESC) r
FROM item_history
WHERE item_history.status_id = 1
and item_history.created_at BETWEEN '2020-12-16' AND '2020-12-23'
)
SELECT *
FROM cte WHERE r = 1;
An index on (item_id, created_at) will also help.

select last record in each group for large database

I want to fetch the last record in each group. I have used the following query with a very small database and it works perfectly:
SELECT * FROM logs
WHERE id IN (
SELECT max(id) FROM logs
WHERE id_search_option = 31
GROUP BY items_id
)
ORDER BY id DESC
But when it comes to the actual database with millions of rows (8,000,000+ rows), the system hangs.
I also tried another query, which gives the result in 6.6 sec on average:
SELECT p1.id, p1.itemtype, p1.items_id, p1.date_mod
FROM logs p1
INNER JOIN (
SELECT max(id) as max_id, itemtype, items_id, date_mod
FROM logs
WHERE id_search_option = 31
GROUP BY items_id) p2
ON (p1.id = p2.max_id)
ORDER BY p1.items_id DESC;
Please help!
EDIT: EXPLAIN of the 2nd query:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 1177 Using temporary; Using filesort
1 PRIMARY p1 eq_ref PRIMARY PRIMARY 4 p2.max_id 1
2 DERIVED logs ALL NULL NULL NULL NULL 7930527 Using where; Using temporary; Using filesort
SELECT * FROM tablename ORDER BY unique_column DESC LIMIT 0,1;
Try it, it will work.
Here 0 is the offset (the 0th record) and 1 is the number of records to return.
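The LIMIT trick above returns just one row for the whole table. For the per-group query in the question, the EXPLAIN shows the derived SELECT scanning ~7.9 million rows with no usable key; assuming InnoDB with id as the primary key, a composite index along these lines should at least turn that full scan into a covering index scan (the index name is an assumption):
CREATE INDEX idx_logs_search_item ON logs (id_search_option, items_id);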

ORDER BY not working inside JOIN SELECT

I have 2 tables: ticket and ticket_message.
I want to select all tickets that were not answered by our support team. This means the last message left in the ticket will have type client.
I am trying code like this:
SELECT `ticket`.*,`message`.*
FROM `ticket`
LEFT JOIN (SELECT * FROM `ticket_message` ORDER BY `timeCreated` DESC) AS `message` ON `message`.`ticketId` = `ticket`.`id`
GROUP BY `ticket`.`id`
HAVING `message`.`type` = 'client'
The thing is, this code works perfectly on my dev server with MySQL 5.5.42, but the messages are not sorted in the subquery on the production server with MySQL 5.7.9.
Here is EXPLAIN results:
for 5.5.42:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY ticket ALL NULL NULL NULL NULL 38
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 130
2 DERIVED ticket_message ALL NULL NULL NULL NULL 127 Using filesort
for 5.7.9:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE ticket NULL index PRIMARY,ticket_ibfk_1,ticket_ibfk_2 PRIMARY 4 NULL 38 100.00 NULL
1 SIMPLE ticket_message NULL ref ticket_message_ibfk_1 ticket_message_ibfk_1 5 ticket.id 3 100.00 NULL
There's no guarantee that the ordering inside a subquery will be preserved. And in this case you can just use a join rather than a subquery.
Does this work?
SELECT `ticket`.*,`ticket_message`.* FROM `ticket`
LEFT JOIN `ticket_message` ON `ticket_message`.`ticketId` = `ticket`.`id`
GROUP BY `ticket`.`id` ORDER BY `ticket_message`.`timeCreated` DESC
I'm not sure what you want to achieve, but you may want to put ticket_message.timeCreated into the group by or you may get unexpected results.
Note on using "Group By"
It is easy to get caught out using GROUP BY, and there are rules that should be followed:
Fields you are selecting should appear in the GROUP BY clause. If they are not in the GROUP BY clause, they should have an aggregate function applied.
https://mariadb.com/kb/en/sql-99/rules-for-grouping-columns/
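If the goal is specifically "tickets whose latest message came from the client", one common pattern that does not depend on ordering inside a derived table is a correlated subquery on the latest timeCreated per ticket. A rough sketch, assuming timeCreated is unique per ticket:
SELECT `ticket`.*, `m`.*
FROM `ticket`
JOIN `ticket_message` AS `m` ON `m`.`ticketId` = `ticket`.`id`
WHERE `m`.`type` = 'client'
AND `m`.`timeCreated` = (SELECT MAX(`m2`.`timeCreated`)
FROM `ticket_message` AS `m2`
WHERE `m2`.`ticketId` = `ticket`.`id`)
An index on ticket_message (ticketId, timeCreated) would help the subquery.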

MySQL: LEFT JOIN runs slowly

I have two tables (table 1 and table 2). Then I do a LEFT JOIN on both tables:
SELECT DATE(`Inspection_datetime`) AS Date, `Line`,`Model`, `Lot_no`,
COUNT(A.`Serial_number`) AS Qty,B.`name`
FROM `inspection_report` AS A
LEFT JOIN `Employee` AS B
ON A.`NIK` LIKE B.NIK
GROUP BY Date , A.Model ,A.Lot_no,A.Line,B.`name`
ORDER BY Date DESC
This query makes my jQuery DataTables plugin run very slowly; the data doesn't even show. The strange point is the B.name field: if I mention it in the query the data won't appear, but if I delete it (I mean, not doing the LEFT JOIN), the data does appear.
Is my query not good enough? This is my EXPLAIN:
TABLE 1
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE inspection_report ALL NULL NULL NULL NULL 334518
TABLE 2
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Employee ALL NULL NULL NULL NULL 100
QUERY
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE B ALL NULL NULL NULL NULL 100 Using temporary; Using filesort
1 SIMPLE A ALL NULL NULL NULL NULL 334520 Using where; Using join buffer
Any special reason you are joining with a LIKE clause instead of equality?
SELECT DATE(`Inspection_datetime`) AS Date, `Line`,`Model`, `Lot_no`,
COUNT(A.`Serial_number`) AS Qty,B.`name`
FROM `inspection_report` AS A
LEFT JOIN `Employee` AS B
ON A.`NIK` = B.NIK
GROUP BY Date , A.Model ,A.Lot_no,A.Line,B.`name`
ORDER BY Date DESC
should yield better results.
Also, adding an index to A.NIK should help the join operation tremendously.
CREATE INDEX inspection_report_nik ON inspection_report (NIK);
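If Employee.NIK is not already the primary key or otherwise indexed, an index on that side may matter as well, since the join looks up Employee for every inspection_report row (the index name is an assumption):
CREATE INDEX employee_nik ON Employee (NIK);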

Which query is better

EXPLAIN EXTENDED SELECT id, name
FROM member
INNER JOIN group_assoc ON ( member.id = group_assoc.member_id
AND group_assoc.group_id =2 )
ORDER BY registered DESC
LIMIT 0 , 1
Outputs:
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE group_assoc ref member_id,group_id group_id 4 const 3 100.00 Using temporary; Using filesort
1 SIMPLE member eq_ref PRIMARY PRIMARY 4 source_member.group_assoc.member_id 1 100.00
EXPLAIN EXTENDED SELECT id, name
FROM member
WHERE id NOT IN (
SELECT member_id
FROM group_assoc
WHERE group_id = 2
)
ORDER BY registered DESC
LIMIT 0, 1
Outputs:
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY member ALL NULL NULL NULL NULL 2635 100.00 Using where; Using filesort
2 DEPENDENT SUBQUERY group_assoc index_subquery member_id,group_id member_id 8 func,const 1 100.00 Using index; Using where
The first query I'm not so sure about: it uses a temporary table, which seems like a worse idea. But I also see that it examines fewer rows than the 2nd query...
These queries return completely different resultsets: the first one returns members of group 2, the second one returns everybody who is not a member of group 2.
If you meant this:
SELECT id, name
FROM member
LEFT JOIN
group_assoc
ON member.id = group_assoc.member_id
AND group_assoc.group_id = 2
WHERE group_assoc.member_id IS NULL
ORDER BY
registered DESC
LIMIT 0, 1
, then the plans should be identical.
You may find this article interesting:
NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: MySQL
Create an index on member.registered to get rid of both filesort and temporary.
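For example (the index name is an assumption):
CREATE INDEX idx_member_registered ON member (registered);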
I would say the first is better. The temporary table might not be a good idea, but a subquery isn't much better. And you will give MySQL more options to optimize the query plan with an inner join than you have with a subquery.
The subquery solution is fast as long as there are just a few rows that will be returned.
But... the first and second queries don't seem to be the same; should it be that way?