MySQL: LEFT JOIN runs slowly

I have two tables, inspection_report (table 1) and Employee (table 2).
I then LEFT JOIN the two tables:
SELECT DATE(`Inspection_datetime`) AS Date, `Line`,`Model`, `Lot_no`,
COUNT(A.`Serial_number`) AS Qty,B.`name`
FROM `inspection_report` AS A
LEFT JOIN `Employee` AS B
ON A.`NIK` LIKE B.NIK
GROUP BY Date , A.Model ,A.Lot_no,A.Line,B.`name`
ORDER BY Date DESC
This query makes my jQuery DataTables plugin run very slowly, and the data never shows. The strange part is the B.name field: if I include it in the query, the data does not appear, but if I remove it (that is, without the LEFT JOIN), the data appears.
Is my query not good enough? This is my EXPLAIN output:
TABLE 1
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE inspection_report ALL NULL NULL NULL NULL 334518
TABLE 2
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Employee ALL NULL NULL NULL NULL 100
QUERY
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE B ALL NULL NULL NULL NULL 100 Using temporary; Using filesort
1 SIMPLE A ALL NULL NULL NULL NULL 334520 Using where; Using join buffer

Any special reason you are joining with a LIKE clause instead of equality?
SELECT DATE(`Inspection_datetime`) AS Date, `Line`,`Model`, `Lot_no`,
COUNT(A.`Serial_number`) AS Qty,B.`name`
FROM `inspection_report` AS A
LEFT JOIN `Employee` AS B
ON A.`NIK` = B.NIK
GROUP BY Date , A.Model ,A.Lot_no,A.Line,B.`name`
ORDER BY Date DESC
should yield better results.
Also, adding an index to A.NIK should help the join operation tremendously.
CREATE INDEX inspection_report_nik ON inspection_report (NIK);
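To make the suggestion concrete, here is a minimal sketch of the equality join. It runs against an in-memory SQLite database through Python's sqlite3 module, purely as a stand-in for MySQL, so it checks the query's results and not MySQL's execution plan. The sample rows, the column types, the `day` alias, and the primary key on Employee.NIK are all assumptions. The primary key is there because a LEFT JOIN that reads inspection_report first would look up Employee rows by NIK, so an index on that side may matter as much as the one on inspection_report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema; only the column names come from the question.
    CREATE TABLE Employee (NIK TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE inspection_report (
        Serial_number TEXT, NIK TEXT, Line TEXT, Model TEXT,
        Lot_no TEXT, Inspection_datetime TEXT
    );
    -- Index on the join column, as suggested in the answer.
    CREATE INDEX inspection_report_nik ON inspection_report (NIK);
    INSERT INTO Employee VALUES ('E1', 'Ani'), ('E2', 'Budi');
    INSERT INTO inspection_report VALUES
        ('S1', 'E1', 'L1', 'M1', 'LOT1', '2013-01-02 08:00:00'),
        ('S2', 'E1', 'L1', 'M1', 'LOT1', '2013-01-02 09:00:00'),
        ('S3', 'E2', 'L1', 'M1', 'LOT1', '2013-01-02 10:00:00'),
        ('S4', 'E9', 'L2', 'M2', 'LOT2', '2013-01-01 10:00:00');
""")

# Same shape as the answer's query, joining with = instead of LIKE.
rows = conn.execute("""
    SELECT DATE(A.Inspection_datetime) AS day, A.Line, A.Model, A.Lot_no,
           COUNT(A.Serial_number) AS Qty, B.name
    FROM inspection_report AS A
    LEFT JOIN Employee AS B ON A.NIK = B.NIK
    GROUP BY day, A.Model, A.Lot_no, A.Line, B.name
    ORDER BY day DESC, B.name
""").fetchall()

for row in rows:
    print(row)
```

The report row whose NIK has no matching employee ('E9') is still counted, with a NULL name, which is what LEFT JOIN is for.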

Related

select last record in each group for large database

I want to fetch the last record in each group. I used the following query on a very small database and it works perfectly:
SELECT * FROM logs
WHERE id IN (
SELECT max(id) FROM logs
WHERE id_search_option = 31
GROUP BY items_id
)
ORDER BY id DESC
But on the actual database, which has millions of rows (8,000,000+), the system hangs.
I also tried another query, which returns a result in 6.6 seconds on average:
SELECT p1.id, p1.itemtype, p1.items_id, p1.date_mod
FROM logs p1
INNER JOIN (
SELECT max(id) as max_id, itemtype, items_id, date_mod
FROM logs
WHERE id_search_option = 31
GROUP BY items_id) p2
ON (p1.id = p2.max_id)
ORDER BY p1.items_id DESC;
Please help!
EDIT: EXPLAIN output for the second query:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 1177 Using temporary; Using filesort
1 PRIMARY p1 eq_ref PRIMARY PRIMARY 4 p2.max_id 1
2 DERIVED logs ALL NULL NULL NULL NULL 7930527 Using where; Using temporary; Using filesort
select * from tablename order by unique_column desc limit 0,1;
Try this; it should work.
In LIMIT 0,1 the 0 is the offset (start at the first row) and the 1 is the number of rows to return.
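Note that a single ORDER BY ... LIMIT returns one row for the whole table, not one row per group. A minimal sketch of the per-group variant follows, run on SQLite through Python's sqlite3 as a stand-in for MySQL. The sample rows are invented, and the composite index (id_search_option, items_id, id) is a hypothetical addition not mentioned in the thread. The idea is that such an index might let the inner GROUP BY be answered from the index alone instead of scanning all rows, but that would need to be confirmed with EXPLAIN on the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema; column names are taken from the question.
    CREATE TABLE logs (
        id INTEGER PRIMARY KEY, itemtype TEXT, items_id INTEGER,
        id_search_option INTEGER, date_mod TEXT
    );
    -- Hypothetical index: filter column, then group column, then id.
    CREATE INDEX logs_opt_item_id ON logs (id_search_option, items_id, id);
    INSERT INTO logs VALUES
        (1, 'Computer', 10, 31, '2013-01-01'),
        (2, 'Computer', 10, 31, '2013-01-05'),
        (3, 'Monitor',  20, 31, '2013-01-03'),
        (4, 'Computer', 10, 99, '2013-01-09'),
        (5, 'Monitor',  20, 31, '2013-01-07');
""")

# The derived table selects only MAX(id) per items_id; the other columns
# are read from the joined row, so they always belong to that last record.
latest = conn.execute("""
    SELECT p1.id, p1.itemtype, p1.items_id, p1.date_mod
    FROM logs AS p1
    INNER JOIN (
        SELECT MAX(id) AS max_id
        FROM logs
        WHERE id_search_option = 31
        GROUP BY items_id
    ) AS p2 ON p1.id = p2.max_id
    ORDER BY p1.items_id DESC
""").fetchall()

print(latest)
```

Row 4 is ignored because its id_search_option is not 31, so the last record for items_id 10 is row 2.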

ORDER BY not working inside JOIN SELECT

I have 2 tables: ticket and ticket_message.
I want to select all tickets that have not been answered by our support team. This means the last message left in the ticket has the type client.
I am trying code like this:
SELECT `ticket`.*,`message`.*
FROM `ticket`
LEFT JOIN (SELECT * FROM `ticket_message` ORDER BY `timeCreated` DESC) AS `message` ON `message`.`ticketId` = `ticket`.`id`
GROUP BY `ticket`.`id`
HAVING `message`.`type` = 'client'
The thing is that this code works perfectly on my dev server with MySQL 5.5.42, but on the production server with MySQL 5.7.9 the messages are not sorted in the subquery.
Here are the EXPLAIN results:
for 5.5.42:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY ticket ALL NULL NULL NULL NULL 38
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 130
2 DERIVED ticket_message ALL NULL NULL NULL NULL 127 Using filesort
for 5.7.9:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE ticket NULL index PRIMARY,ticket_ibfk_1,ticket_ibfk_2 PRIMARY 4 NULL 38 100.00 NULL
1 SIMPLE ticket_message NULL ref ticket_message_ibfk_1 ticket_message_ibfk_1 5 ticket.id 3 100.00 NULL
There's no guarantee that the ordering inside a subquery will be preserved. And in this case you can just use a join rather than a subquery.
Does this work?
SELECT `ticket`.*,`ticket_message`.* FROM `ticket`
LEFT JOIN `ticket_message` ON `ticket_message`.`ticketId` = `ticket`.`id`
GROUP BY `ticket`.`id` ORDER BY `ticket_message`.`timeCreated` DESC
I'm not sure what you want to achieve, but you may want to put ticket_message.timeCreated into the group by or you may get unexpected results.
Note on using "Group By"
It is easy to get caught out when using Group By, and there are rules that should be followed:
Fields you select should appear in the "Group By" clause. If they are not in the Group By clause, they should have an aggregate function applied.
https://mariadb.com/kb/en/sql-99/rules-for-grouping-columns/
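One way to get the unanswered tickets without relying on subquery ordering or on non-aggregated columns is to join each ticket to its newest message explicitly. The sketch below is one possible approach, not something proposed in the thread. It runs on SQLite through Python's sqlite3 as a stand-in for MySQL. The `subject` column and the sample rows are hypothetical, and it assumes no two messages in one ticket share the same timeCreated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema; "subject" is a hypothetical column for illustration.
    CREATE TABLE ticket (id INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE ticket_message (
        id INTEGER PRIMARY KEY, ticketId INTEGER, type TEXT, timeCreated TEXT
    );
    INSERT INTO ticket VALUES (1, 'Login fails'), (2, 'Refund'), (3, 'Empty');
    INSERT INTO ticket_message VALUES
        (1, 1, 'client',  '2016-01-01 10:00:00'),
        (2, 1, 'support', '2016-01-01 11:00:00'),
        (3, 1, 'client',  '2016-01-01 12:00:00'),
        (4, 2, 'client',  '2016-01-02 10:00:00'),
        (5, 2, 'support', '2016-01-02 11:00:00');
""")

# "latest" holds the newest timeCreated per ticket; joining on it keeps
# only each ticket's last message, which is then filtered by type.
unanswered = conn.execute("""
    SELECT t.id, t.subject, m.id AS message_id, m.timeCreated
    FROM ticket AS t
    INNER JOIN ticket_message AS m ON m.ticketId = t.id
    INNER JOIN (
        SELECT ticketId, MAX(timeCreated) AS lastCreated
        FROM ticket_message
        GROUP BY ticketId
    ) AS latest ON latest.ticketId = m.ticketId
               AND latest.lastCreated = m.timeCreated
    WHERE m.type = 'client'
    ORDER BY t.id
""").fetchall()

print(unanswered)
```

Ticket 2 is excluded because its last message came from support, and ticket 3 is excluded because it has no messages at all.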

Optimize joined order by query

I have the following query:
SELECT `p_products`.`id`, `p_products`.`name`, `p_products`.`date`,
`p_products`.`img`, `p_products`.`safe_name`, `p_products`.`sku`,
`p_products`.`productstatusid`, `op`.`quantity`
FROM `p_products`
INNER JOIN `p_product_p_category`
ON `p_products`.`id` = `p_product_p_category`.`p_product_id`
LEFT JOIN (SELECT `p_product_id`,`order_date`,SUM(`product_quantity`) as quantity
FROM `p_orderedproducts`
WHERE `order_date`>='2013-03-01 16:51:17'
GROUP BY `p_product_id`) AS op
ON `p_products`.`id` = `op`.`p_product_id`
WHERE `p_product_p_category`.`p_category_id` IN ('15','23','32')
AND `p_products`.`active` = '1'
GROUP BY `p_products`.`id`
ORDER BY `date` DESC
Explain says:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY p_product_p_category ref p_product_id,p_category_id,p_product_id_2 p_category_id 4 const 8239 Using temporary; Using filesort
1 PRIMARY p_products eq_ref PRIMARY PRIMARY 4 pdev.p_product_p_category.p_product_id 1 Using where
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 78
2 DERIVED p_orderedproducts index order_date p_product_id 4 NULL 201 Using where
And I have indexes on a number of columns including p_products.date.
The problem is the speed when there are more than 5000 products in a number of categories. 60000 products take more than 1 second. Is there any way to speed things up?
This also holds true if I remove the left join in which case the result is:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p_product_p_category index p_product_id,p_category_id,p_product_id_2 p_product_id_2 8 NULL 91167 Using where; Using index; Using temporary; Using filesort
1 SIMPLE p_products eq_ref PRIMARY PRIMARY 4 pdev.p_product_p_category.p_product_id 1 Using where
The intermediate table p_product_p_category has indexes on both p_product_id and p_category_id, as well as a combined index on both.
I tried Ochi's suggestion and ended up with:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 62087 Using temporary; Using filesort
1 PRIMARY nr1media_products eq_ref PRIMARY PRIMARY 4 cats.nr1media_product_id 1 Using where
2 DERIVED nr1media_product_nr1media_category range nr1media_category_id nr1media_category_id 4 NULL 62066 Using where
I think I can simplify the question to: how can I join my products to the category intermediate table to fetch all unique products for the selected categories, sorted by date?
EDIT:
This gives me all unique products in the categories without using a temp table for ordering or grouping:
SELECT
`p_products`.`id`,
`p_products`.`name`,
`p_products`.`img`,
`p_products`.`safe_name`,
`p_products`.`sku`,
`p_products`.`productstatusid`
FROM
p_products
WHERE
EXISTS (
SELECT
1
FROM
p_product_p_category
WHERE
p_product_p_category.p_product_id = p_products.id
AND p_category_id IN ('15', '23', '32')
)
AND p_products.active = 1
ORDER BY
`date` DESC
The above query is very fast, much faster than the join using GROUP BY and ORDER BY (0.04 vs. 0.7 sec), although I don't understand why it can run without temp tables.
I think I need to find another solution for the orderedproducts join; it still slows the query down to more than 1 sec. I might set up a cron job to update the sales ranking of the products once every night and save that info to the p_products table.
Unless someone has a definitive solution...
You are joining every category row to products; only afterwards is the result filtered by category id.
Try to restrict the rows as early as possible. For example, instead of
INNER JOIN `p_product_p_category`
do
INNER JOIN ( SELECT * FROM `p_product_p_category` WHERE `p_category_id` IN ('15','23','32') ) AS `p_product_p_category`
so that you work on a smaller subset of products right from the beginning.
One possible solution would be to remove the derived table and just do a single Group By:
Select P.id, P.name, P.date
, P.img, P.safe_name, P.sku
, P.productstatusid
, Sum( OP.product_quantity ) As quantity
From p_products As P
Join p_product_p_category As CAT
On P.id = CAT.p_product_id
Left Join p_orderedproducts As OP
On OP.p_product_id = P.id
And OP.order_date >= '2013-03-01 16:51:17'
Where CAT.p_category_id In ('15','23','32')
And P.active = '1'
Group By P.id, P.name, P.date
, P.img, P.safe_name, P.sku
, P.productstatusid
Order By P.date Desc
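One caveat with the single Group By is worth checking: if a product belongs to more than one of the selected categories, each order row is joined once per category and the summed quantity is inflated. The sketch below illustrates this on SQLite through Python's sqlite3 as a stand-in for MySQL. The sample rows and the trimmed-down column list are assumptions. It compares the single Group By with a variant that combines the asker's EXISTS filter with the pre-aggregated derived table from the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed, trimmed-down schema; product 1 sits in two selected categories.
    CREATE TABLE p_products (id INTEGER PRIMARY KEY, name TEXT, date TEXT, active INTEGER);
    CREATE TABLE p_product_p_category (p_product_id INTEGER, p_category_id INTEGER);
    CREATE TABLE p_orderedproducts (p_product_id INTEGER, order_date TEXT, product_quantity INTEGER);
    INSERT INTO p_products VALUES (1, 'Lamp', '2013-03-10', 1), (2, 'Desk', '2013-03-05', 1);
    INSERT INTO p_product_p_category VALUES (1, 15), (1, 23), (2, 32);
    INSERT INTO p_orderedproducts VALUES
        (1, '2013-03-02 09:00:00', 3),
        (1, '2013-03-03 09:00:00', 2),
        (2, '2013-02-01 09:00:00', 7);
""")

# Single Group By: category rows multiply the order rows before SUM runs.
joined = conn.execute("""
    SELECT P.id, P.name, SUM(OP.product_quantity) AS quantity
    FROM p_products AS P
    JOIN p_product_p_category AS CAT ON P.id = CAT.p_product_id
    LEFT JOIN p_orderedproducts AS OP
           ON OP.p_product_id = P.id
          AND OP.order_date >= '2013-03-01 16:51:17'
    WHERE CAT.p_category_id IN (15, 23, 32) AND P.active = 1
    GROUP BY P.id, P.name, P.date
    ORDER BY P.date DESC
""").fetchall()

# EXISTS filter plus pre-aggregated orders: one row per product, no fan-out.
exists_based = conn.execute("""
    SELECT P.id, P.name, op.quantity
    FROM p_products AS P
    LEFT JOIN (
        SELECT p_product_id, SUM(product_quantity) AS quantity
        FROM p_orderedproducts
        WHERE order_date >= '2013-03-01 16:51:17'
        GROUP BY p_product_id
    ) AS op ON op.p_product_id = P.id
    WHERE P.active = 1
      AND EXISTS (
          SELECT 1 FROM p_product_p_category AS c
          WHERE c.p_product_id = P.id AND c.p_category_id IN (15, 23, 32)
      )
    ORDER BY P.date DESC
""").fetchall()

print(joined)
print(exists_based)
```

With this data the single Group By reports 10 for the lamp, while the actual quantity ordered since the cutoff is 5. Whether the second form is also faster on MySQL would have to be measured.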

Mysql Optimize Query: Trying to Get Average of Subquery

I have the following query:
SELECT AVG(time) FROM
(SELECT UNIX_TIMESTAMP(max(datelast)) - UNIX_TIMESTAMP(min(datestart)) AS time
FROM table
WHERE id IN
(SELECT DISTINCT id
FROM table
WHERE product_id = 12394 AND datelast > '2011-04-13 00:26:59'
)
GROUP BY id
)
as T
For every ID, the query takes the greatest datelast value and subtracts the smallest datestart value from it (which gives the length of a user session), and then averages the results.
The outermost query is there only to average the resulting times. Is there any way to optimize this query?
Output from EXPLAIN:
id select_type table type possible_keys key key_len ref rows extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 7
2 DERIVED table index NULL id 16 NULL 26 Using where
3 DEPENDENT SUBQUERY table index_subquery id,product_id,datelast id 12 func 2 Using index; Using where
Is the innermost SELECT (the IN subquery) really necessary?
SELECT
AVG(time)
FROM
(
SELECT
UNIX_TIMESTAMP(max(datelast)) - UNIX_TIMESTAMP(min(datestart)) AS time
FROM
table
WHERE
product_id = 12394 AND datelast > '2011-04-13 00:26:59'
GROUP BY
id
) AS T
I can't test it right now, but I think it would work too. Otherwise, your query looks good.
You can optimize the query by adding a (product_id, datelast) key: put the equality column first and the range column second, so that the index can serve both conditions.
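Before dropping the IN subquery, it may be worth checking whether the two forms are equivalent on your data. The original keeps every row of a qualifying id, while the simplified form keeps only the rows that pass the filter. The sketch below shows a case where they differ. It runs on SQLite through Python's sqlite3 as a stand-in for MySQL, with strftime('%s', ...) replacing UNIX_TIMESTAMP(). The table name `sessions` and the two sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "sessions" is a hypothetical name; the original table is called "table".
    CREATE TABLE sessions (
        id TEXT, product_id INTEGER, datestart TEXT, datelast TEXT
    );
    -- One session id with a row before the cutoff and a row after it.
    INSERT INTO sessions VALUES
        ('a', 12394, '2011-04-12 23:00:00', '2011-04-12 23:30:00'),
        ('a', 12394, '2011-04-13 01:00:00', '2011-04-13 02:00:00');
""")

# strftime('%s', ...) is SQLite's stand-in for MySQL's UNIX_TIMESTAMP().
LENGTH = """CAST(strftime('%s', MAX(datelast)) AS INTEGER)
          - CAST(strftime('%s', MIN(datestart)) AS INTEGER)"""
FILTER = "product_id = 12394 AND datelast > '2011-04-13 00:26:59'"

# Original form: the filter picks ids, then all rows of those ids are used.
with_in = conn.execute(f"""
    SELECT AVG(time) FROM (
        SELECT {LENGTH} AS time FROM sessions
        WHERE id IN (SELECT DISTINCT id FROM sessions WHERE {FILTER})
        GROUP BY id
    ) AS T
""").fetchone()[0]

# Simplified form: only rows that pass the filter are used.
simplified = conn.execute(f"""
    SELECT AVG(time) FROM (
        SELECT {LENGTH} AS time FROM sessions
        WHERE {FILTER}
        GROUP BY id
    ) AS T
""").fetchone()[0]

print(with_in, simplified)
```

Here the original form measures three hours (10800 seconds) and the simplified form one hour (3600 seconds). If every row of a session always passes the filter, the two forms agree and the simpler one is safe.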

Which query is better

EXPLAIN EXTENDED SELECT id, name
FROM member
INNER JOIN group_assoc ON ( member.id = group_assoc.member_id
AND group_assoc.group_id =2 )
ORDER BY registered DESC
LIMIT 0 , 1
Outputs:
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE group_assoc ref member_id,group_id group_id 4 const 3 100.00 Using temporary; Using filesort
1 SIMPLE member eq_ref PRIMARY PRIMARY 4 source_member.group_assoc.member_id 1 100.00
explain extended SELECT
id, name
FROM member WHERE
id
NOT IN (
SELECT
member_id
FROM group_assoc WHERE group_id = 2
)
ORDER BY registered DESC LIMIT 0,1
Outputs:
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY member ALL NULL NULL NULL NULL 2635 100.00 Using where; Using filesort
2 DEPENDENT SUBQUERY group_assoc index_subquery member_id,group_id member_id 8 func,const 1 100.00 Using index; Using where
I'm not so sure about the first query: it uses a temporary table, which seems like a worse idea. But I also see that it examines fewer rows than the second query.
These queries return completely different resultsets: the first one returns members of group 2, the second one returns everybody who is not a member of group 2.
If you meant this:
SELECT id, name
FROM member
LEFT JOIN
group_assoc
ON member.id = group_assoc.member_id
AND group_assoc.group_id = 2
WHERE group_assoc.member_id IS NULL
ORDER BY
registered DESC
LIMIT 0, 1
, then the plans should be identical.
You may find this article interesting:
NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: MySQL
Create an index on member.registered to get rid of both filesort and temporary.
I would say the first is better. The temporary table might not be a good idea, but a subquery isn't much better. And you will give MySQL more options to optimize the query plan with an inner join than you have with a subquery.
The subquery solution is fast as long as there are just a few rows that will be returned.
But... the first and second query don't seem to be the same, should it be that way?
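The point made in the answers can be checked on a toy dataset. The sketch below runs the question's two queries and the LEFT JOIN / IS NULL rewrite on SQLite through Python's sqlite3, as a stand-in for MySQL. The sample rows and the NOT NULL constraint on group_assoc.member_id are assumptions. The constraint matters because NOT IN behaves differently once the subquery can return NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema; column names are taken from the question.
    CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT, registered TEXT);
    CREATE TABLE group_assoc (member_id INTEGER NOT NULL, group_id INTEGER NOT NULL);
    -- Index suggested in the answer to avoid the filesort.
    CREATE INDEX member_registered ON member (registered);
    INSERT INTO member VALUES
        (1, 'ann', '2010-01-01'), (2, 'bob', '2010-02-01'), (3, 'cy', '2010-03-01');
    INSERT INTO group_assoc VALUES (3, 2), (1, 5), (2, 1);
""")

# First query from the question: newest member of group 2.
inner_join = conn.execute("""
    SELECT member.id, member.name
    FROM member
    INNER JOIN group_assoc
            ON member.id = group_assoc.member_id AND group_assoc.group_id = 2
    ORDER BY registered DESC LIMIT 1
""").fetchall()

# Second query from the question: newest member NOT in group 2.
not_in = conn.execute("""
    SELECT id, name FROM member
    WHERE id NOT IN (SELECT member_id FROM group_assoc WHERE group_id = 2)
    ORDER BY registered DESC LIMIT 1
""").fetchall()

# LEFT JOIN / IS NULL rewrite from the answer.
left_join_null = conn.execute("""
    SELECT member.id, member.name
    FROM member
    LEFT JOIN group_assoc
           ON member.id = group_assoc.member_id AND group_assoc.group_id = 2
    WHERE group_assoc.member_id IS NULL
    ORDER BY registered DESC LIMIT 1
""").fetchall()

print(inner_join, not_in, left_join_null)
```

The inner join returns a member of group 2, while the NOT IN query and the LEFT JOIN / IS NULL rewrite both return the newest non-member, which matches the observation that the two original queries are not interchangeable.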