Which query is better - mysql

EXPLAIN EXTENDED SELECT id, name
FROM member
INNER JOIN group_assoc ON ( member.id = group_assoc.member_id
AND group_assoc.group_id =2 )
ORDER BY registered DESC
LIMIT 0 , 1
Outputs:
id select_type table type possible_keys key key_len ref rows filtered Extra
1 SIMPLE group_assoc ref member_id,group_id group_id 4 const 3 100.00 Using temporary; Using filesort
1 SIMPLE member eq_ref PRIMARY PRIMARY 4 source_member.group_assoc.member_id 1 100.00
explain extended SELECT
id, name
FROM member WHERE
id
NOT IN (
SELECT
member_id
FROM group_assoc WHERE group_id = 2
)
ORDER BY registered DESC LIMIT 0,1
Outputs:
id select_type table type possible_keys key key_len ref rows filtered Extra
1 PRIMARY member ALL NULL NULL NULL NULL 2635 100.00 Using where; Using filesort
2 DEPENDENT SUBQUERY group_assoc index_subquery member_id,group_id member_id 8 func,const 1 100.00 Using index; Using where
I'm not so sure about the first query: it uses a temporary table, which seems like a worse idea. But I also see that it examines fewer rows than the second query.

These queries return completely different resultsets: the first one returns members of group 2, the second one returns everybody who is not a member of group 2.
If you meant this:
SELECT id, name
FROM member
LEFT JOIN
group_assoc
ON member.id = group_assoc.member_id
AND group_assoc.group_id = 2
WHERE group_assoc.member_id IS NULL
ORDER BY
registered DESC
LIMIT 0, 1
, then the plans should be identical.
You may find this article interesting:
NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: MySQL
Create an index on member.registered to get rid of both filesort and temporary.
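As a quick check of the equivalence described above, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL. That substitution is an assumption: the two forms return the same rows here as long as member_id is never NULL, but SQLite's query plans say nothing about MySQL's. The sample rows and the index name are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT, registered TEXT);
    CREATE TABLE group_assoc (member_id INTEGER NOT NULL, group_id INTEGER NOT NULL);
    INSERT INTO member VALUES
        (1, 'ann', '2010-01-01'), (2, 'bob', '2010-02-01'),
        (3, 'cat', '2010-03-01'), (4, 'dan', '2010-04-01');
    -- Members 1 and 3 belong to group 2; members 2 and 4 do not.
    INSERT INTO group_assoc VALUES (1, 2), (3, 2), (2, 5), (4, 5);
    -- Hypothetical index name; mirrors the advice to index member.registered.
    CREATE INDEX member_registered ON member (registered);
""")

# Anti-join written with NOT IN.
not_in = conn.execute("""
    SELECT id, name FROM member
    WHERE id NOT IN (SELECT member_id FROM group_assoc WHERE group_id = 2)
    ORDER BY registered DESC
""").fetchall()

# The same anti-join written as LEFT JOIN / IS NULL.
left_join = conn.execute("""
    SELECT id, name FROM member
    LEFT JOIN group_assoc
        ON member.id = group_assoc.member_id AND group_assoc.group_id = 2
    WHERE group_assoc.member_id IS NULL
    ORDER BY registered DESC
""").fetchall()

print(not_in)
print(left_join)
```

Both queries list only the members outside group 2, newest registration first. The LIMIT 0, 1 is left out so that the whole result can be compared.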

I would say the first is better. The temporary table might not be a good idea, but a subquery isn't much better. And you will give MySQL more options to optimize the query plan with an inner join than you have with a subquery.
The subquery solution is fast as long as there are just a few rows that will be returned.
But... the first and second query don't seem to be the same, should it be that way?

Related

select last record in each group for large database

I want to fetch last record in each group. I have used following query with very small database and it works perfectly -
SELECT * FROM logs
WHERE id IN (
SELECT max(id) FROM logs
WHERE id_search_option = 31
GROUP BY items_id
)
ORDER BY id DESC
But on the actual database, which has millions of rows (8,000,000+), the system hangs.
I also tried another query, which returns a result in 6.6 sec on average:
SELECT p1.id, p1.itemtype, p1.items_id, p1.date_mod
FROM logs p1
INNER JOIN (
SELECT max(id) as max_id, itemtype, items_id, date_mod
FROM logs
WHERE id_search_option = 31
GROUP BY items_id) p2
ON (p1.id = p2.max_id)
ORDER BY p1.items_id DESC;
Please help !
EDIT:: Explain 2nd query
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 1177 Using temporary; Using filesort
1 PRIMARY p1 eq_ref PRIMARY PRIMARY 4 p2.max_id 1
2 DERIVED logs ALL NULL NULL NULL NULL 7930527 Using where; Using temporary; Using filesort
SELECT * FROM tablename ORDER BY unique_column DESC LIMIT 0, 1;
Try this, it will work.
Here 0 is the offset (start from the 0th record) and 1 is the number of records to return.
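The LIMIT query above returns only the single latest row overall. For the per-group case the question asks about, here is a minimal sketch of the join-on-MAX(id) pattern, run against Python's built-in sqlite3 as a stand-in for MySQL. The sample rows and the composite index logs_option_item are assumptions, not part of the original schema. The EXPLAIN in the question shows a full scan of logs for the derived table, so an index that leads with the filtered column is the kind of thing that might help, but that would need to be measured on the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logs (
        id INTEGER PRIMARY KEY,
        items_id INTEGER,
        id_search_option INTEGER,
        date_mod TEXT
    );
    -- Hypothetical composite index: filter column first, then the grouping column.
    CREATE INDEX logs_option_item ON logs (id_search_option, items_id);
    INSERT INTO logs VALUES
        (1, 10, 31, '2013-01-01'),
        (2, 10, 31, '2013-01-02'),
        (3, 20, 31, '2013-01-03'),
        (4, 20, 7,  '2013-01-04'),
        (5, 30, 31, '2013-01-05'),
        (6, 10, 31, '2013-01-06');
""")

# The derived table picks the highest id per items_id; the outer join then
# fetches the full row for each of those ids by primary key.
latest = conn.execute("""
    SELECT p1.id, p1.items_id, p1.date_mod
    FROM logs AS p1
    INNER JOIN (
        SELECT MAX(id) AS max_id
        FROM logs
        WHERE id_search_option = 31
        GROUP BY items_id
    ) AS p2 ON p1.id = p2.max_id
    ORDER BY p1.items_id DESC
""").fetchall()

print(latest)
```

The derived table selects only MAX(id), so every column in the final result comes from the row that actually holds that id, not from an arbitrary row of the group.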

ORDER BY not working inside JOIN SELECT

I have 2 tables: ticket and ticket_message.
I want to select all tickets that have not been answered by our support team. This means the last message left in the ticket will have type client.
I am trying code like this:
SELECT `ticket`.*,`message`.*
FROM `ticket`
LEFT JOIN (SELECT * FROM `ticket_message` ORDER BY `timeCreated` DESC) AS `message` ON `message`.`ticketId` = `ticket`.`id`
GROUP BY `ticket`.`id`
HAVING `message`.`type` = 'client'
The thing is that this code works perfectly on my dev server with MySQL 5.5.42, but the messages are not sorted in the subquery on the production server with MySQL 5.7.9.
Here are the EXPLAIN results:
for 5.5.42:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY ticket ALL NULL NULL NULL NULL 38
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 130
2 DERIVED ticket_message ALL NULL NULL NULL NULL 127 Using filesort
for 5.7.9:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE ticket NULL index PRIMARY,ticket_ibfk_1,ticket_ibfk_2 PRIMARY 4 NULL 38 100.00 NULL
1 SIMPLE ticket_message NULL ref ticket_message_ibfk_1 ticket_message_ibfk_1 5 ticket.id 3 100.00 NULL
There's no guarantee that the ordering inside a subquery will be preserved. And in this case you can just use a join rather than a subquery.
Does this work?
SELECT `ticket`.*,`ticket_message`.* FROM `ticket`
LEFT JOIN `ticket_message` ON `ticket_message`.`ticketId` = `ticket`.`id`
GROUP BY `ticket`.`id` ORDER BY `ticket_message`.`timeCreated` DESC
I'm not sure what you want to achieve, but you may want to put ticket_message.timeCreated into the group by or you may get unexpected results.
Note on using "Group By"
It is easy to get caught out using Group By and there are rules that should be followed
Fields you are selecting should appear in the "Group By" clause. If they are not in the group by clause they should have an aggregate function applied.
https://mariadb.com/kb/en/sql-99/rules-for-grouping-columns/
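One way to avoid relying on subquery ordering is to compute the latest message time per ticket explicitly and join back to it. The sketch below uses Python's built-in sqlite3 as a stand-in for MySQL. The subject column, the sample rows, and the alias latest are invented for illustration, and the sketch assumes timeCreated is unique within a ticket.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ticket (id INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE ticket_message (
        id INTEGER PRIMARY KEY, ticketId INTEGER, type TEXT, timeCreated TEXT
    );
    INSERT INTO ticket VALUES (1, 'login'), (2, 'billing'), (3, 'export');
    INSERT INTO ticket_message VALUES
        (1, 1, 'client',  '2016-01-01 10:00:00'),
        (2, 1, 'support', '2016-01-01 11:00:00'),
        (3, 2, 'client',  '2016-01-02 09:00:00'),
        (4, 2, 'support', '2016-01-02 10:00:00'),
        (5, 2, 'client',  '2016-01-02 12:00:00'),
        (6, 3, 'client',  '2016-01-03 08:00:00');
""")

# Step 1 (derived table): latest timeCreated per ticket.
# Step 2: join back to ticket_message to get that message's row.
# Step 3: keep only tickets whose latest message came from the client.
unanswered = conn.execute("""
    SELECT ticket.id, ticket.subject, message.timeCreated
    FROM ticket
    INNER JOIN (
        SELECT ticketId, MAX(timeCreated) AS lastTime
        FROM ticket_message
        GROUP BY ticketId
    ) AS latest ON latest.ticketId = ticket.id
    INNER JOIN ticket_message AS message
        ON message.ticketId = latest.ticketId
       AND message.timeCreated = latest.lastTime
    WHERE message.type = 'client'
    ORDER BY ticket.id
""").fetchall()

print(unanswered)
```

Every selected column is either grouped or aggregated inside the derived table, so the result does not depend on which row of a group the server happens to pick.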

mysql:choosing the most efficient query from the two

Both of these MySQL queries produce exactly the same result. Query A is a simple union with the postType condition embedded in each individual query, whereas query B applies the same condition in the outer select over a virtual table that is the union of the individual query results. I am concerned that the virtual table sigma in query B might grow large for no good reason if there are a lot of rows. But I am also a bit confused about how ORDER BY works for query A: would it not also have to build a virtual table or something similar to sort the results? It may all depend on how ORDER BY works for a union. If ORDER BY on a union also creates a temp table, would query A then cost about the same as query B in resources? (Query B would be much easier for us to implement in our system than query A.) Please advise in any way possible, thanks.
Query A
SELECT `t1`.*, `t2`.*
FROM `t1` INNER JOIN `t2` ON
`t1`.websiteID= `t2`.ownerID
AND `t1`.authorID= `t2`.authorID
AND `t1`.authorID=1559 AND `t1`.postType="simplePost"
UNION
SELECT `t1`.*
FROM `t1` where websiteID=1559 AND postType="simplePost"
ORDER BY postID limit 0,50
Query B
Select * from (
SELECT `t1`.*,`t2`.*
FROM `t1` INNER JOIN `t2` ON
`t1`.websiteID= `t2`.ownerID
AND `t1`.authorID= `t2`.authorID
AND `t1`.authorID=1559
UNION
SELECT `t1`.*
FROM `t1` where websiteID=1559
)
As sigma where postType="simplePost" ORDER BY postID limit 0,50
EXPLAIN FOR QUERY A
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t2 ref userID userID 4 const 1
1 PRIMARY t1 ref authorID authorID 4 const 2 Using where
2 UNION t1 ref websiteID websiteID 4 const 9 Using where
NULL UNION RESULT <union1,2> ALL NULL NULL NULL NULL NULL Using filesort
EXPLAIN FOR QUERY B
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 10 Using where; Using filesort
2 DERIVED t2 ref userID userID 4 1
2 DERIVED t1 ref authorID authorID 4 2 Using where
3 UNION t1 ref websiteID websiteID 4 9
NULL UNION RESULT <union2,3> ALL NULL NULL NULL NULL NULL
There is no doubt that version 1 (query A, with a separate where clause on each side of the union) will be faster. Let's look at why version 2 (query B, with the where clause over the union result) is worse:
data volume: there are always going to be at least as many rows in the union result, because there are fewer conditions on which rows are returned. This means more disk I/O (depending on indexes) and more temporary storage to hold the rowset, which means more processing time
repeated scan: the entire result of the union must be scanned again to apply the condition, when it could have been handled during the initial scan. This means handling the rowset twice; it is probably in memory, but it is still extra work.
indexes aren't used for where clauses on a union result. If you have an index over the foreign key fields and postType, it would not be used
If you want maximum performance, use UNION ALL, which passes the rows through without a duplicate-removal step, instead of UNION, which removes duplicates (usually by sorting), can be expensive, and is unnecessary based on your comments
Define these indexes and use version 1 for maximum performance:
create index t1_authorID_postType on t1(authorID, postType);
create index t1_websiteID_postType on t1(websiteID, postType);
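To make the comparison concrete, here is a minimal sketch run against Python's built-in sqlite3 as a stand-in for MySQL. It is simplified on purpose: the join to t2 is dropped so that both union branches read only t1, and the sample rows are invented. It shows that filtering inside each branch and filtering over the union result return the same rows, and that UNION ALL keeps a row that matches both branches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (postID INTEGER PRIMARY KEY, websiteID INTEGER,
                     authorID INTEGER, postType TEXT);
    CREATE INDEX t1_authorID_postType ON t1 (authorID, postType);
    CREATE INDEX t1_websiteID_postType ON t1 (websiteID, postType);
    INSERT INTO t1 VALUES
        (1, 1559, 1559, 'simplePost'),
        (2, 1559, 42,   'simplePost'),
        (3, 7,    1559, 'simplePost'),
        (4, 1559, 1559, 'poll'),
        (5, 7,    42,   'simplePost');
""")

# Version 1: the postType condition sits inside each branch.
filter_inside = conn.execute("""
    SELECT * FROM t1 WHERE authorID = 1559 AND postType = 'simplePost'
    UNION
    SELECT * FROM t1 WHERE websiteID = 1559 AND postType = 'simplePost'
    ORDER BY postID LIMIT 50
""").fetchall()

# Version 2: the condition is applied to the union result.
filter_outside = conn.execute("""
    SELECT * FROM (
        SELECT * FROM t1 WHERE authorID = 1559
        UNION
        SELECT * FROM t1 WHERE websiteID = 1559
    ) AS sigma
    WHERE postType = 'simplePost'
    ORDER BY postID LIMIT 50
""").fetchall()

# UNION ALL skips duplicate removal, so a row matching both branches repeats.
union_all = conn.execute("""
    SELECT * FROM t1 WHERE authorID = 1559 AND postType = 'simplePost'
    UNION ALL
    SELECT * FROM t1 WHERE websiteID = 1559 AND postType = 'simplePost'
    ORDER BY postID
""").fetchall()

print([row[0] for row in filter_inside])
print([row[0] for row in union_all])
```

In this sample, post 1 matches both branches, so UNION ALL returns it twice. Swapping UNION for UNION ALL is only safe when the two branches cannot return the same row.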
perhaps this would work in lieu:
SELECT
`t1`.*
,`t2`.*
FROM `t1`
LEFT JOIN `t2` ON `t1`.websiteID = `t2`.ownerID
AND `t1`.authorID = `t2`.authorID
AND `t1`.authorID = 1559
WHERE ( `t1`.authorID = 1559 OR `t1`.websiteID = 1559 )
AND `t1`.postType = 'simplePost'
ORDER BY postID limit 0 ,50

Optimize joined order by query

I have the following query:
SELECT `p_products`.`id`, `p_products`.`name`, `p_products`.`date`,
`p_products`.`img`, `p_products`.`safe_name`, `p_products`.`sku`,
`p_products`.`productstatusid`, `op`.`quantity`
FROM `p_products`
INNER JOIN `p_product_p_category`
ON `p_products`.`id` = `p_product_p_category`.`p_product_id`
LEFT JOIN (SELECT `p_product_id`,`order_date`,SUM(`product_quantity`) as quantity
FROM `p_orderedproducts`
WHERE `order_date`>='2013-03-01 16:51:17'
GROUP BY `p_product_id`) AS op
ON `p_products`.`id` = `op`.`p_product_id`
WHERE `p_product_p_category`.`p_category_id` IN ('15','23','32')
AND `p_products`.`active` = '1'
GROUP BY `p_products`.`id`
ORDER BY `date` DESC
Explain says:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY p_product_p_category ref p_product_id,p_category_id,p_product_id_2 p_category_id 4 const 8239 Using temporary; Using filesort
1 PRIMARY p_products eq_ref PRIMARY PRIMARY 4 pdev.p_product_p_category.p_product_id 1 Using where
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 78
2 DERIVED p_orderedproducts index order_date p_product_id 4 NULL 201 Using where
And I have indexes on a number of columns including p_products.date.
The problem is the speed when there are more than 5000 products in a number of categories. 60000 products take >1 second. Is there any way to speed things up?
This also holds true if I remove the left join in which case the result is:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE p_product_p_category index p_product_id,p_category_id,p_product_id_2 p_product_id_2 8 NULL 91167 Using where; Using index; Using temporary; Using filesort
1 SIMPLE p_products eq_ref PRIMARY PRIMARY 4 pdev.p_product_p_category.p_product_id 1 Using where
The intermediate table p_product_p_category has indexes on both p_product_id and p_category_id, as well as a combined index on both.
I tried Ochi's suggestion and ended up with:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 62087 Using temporary; Using filesort
1 PRIMARY nr1media_products eq_ref PRIMARY PRIMARY 4 cats.nr1media_product_id 1 Using where
2 DERIVED nr1media_product_nr1media_category range nr1media_category_id nr1media_category_id 4 NULL 62066 Using where
I think I can simplify the question to how can I join my products on the category intermediate table to fetch all unique products for the selected categories, sorted by date.
EDIT:
This gives me all unique products in the categories without using a temp table for ordering or grouping:
SELECT
`p_products`.`id`,
`p_products`.`name`,
`p_products`.`img`,
`p_products`.`safe_name`,
`p_products`.`sku`,
`p_products`.`productstatusid`
FROM
p_products
WHERE
EXISTS (
SELECT
1
FROM
p_product_p_category
WHERE
p_product_p_category.p_product_id = p_products.id
AND p_category_id IN ('15', '23', '32')
)
AND p_products.active = 1
ORDER BY
`date` DESC
The above query is very fast, much faster than the join using GROUP BY and ORDER BY (0.04 vs 0.7 sec), although I don't understand why it can run this query without temp tables.
I think I need to find another solution for the orderedproducts join, it still slows the query down to >1 sec. Might make a cron to update the ranking of the products sold once every night and save that info to the p_products table.
Unless someone has a definitive solution...
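One likely reason for the difference is that a plain join returns one row per matching category, so a product in two of the selected categories appears twice and has to be collapsed with GROUP BY, whereas EXISTS only tests whether a match exists and yields each product at most once. Here is a minimal sketch of that row-count difference, using Python's built-in sqlite3 as a stand-in for MySQL. The sample rows are invented and the column list is shortened. It says nothing about MySQL's actual plan, only about the rows each form produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p_products (id INTEGER PRIMARY KEY, name TEXT, date TEXT, active INTEGER);
    CREATE TABLE p_product_p_category (p_product_id INTEGER, p_category_id INTEGER);
    INSERT INTO p_products VALUES
        (1, 'lamp', '2013-01-01', 1), (2, 'desk', '2013-02-01', 1),
        (3, 'sofa', '2013-03-01', 0), (4, 'rug',  '2013-04-01', 1);
    -- Product 1 is in two of the selected categories (15 and 23).
    INSERT INTO p_product_p_category VALUES
        (1, 15), (1, 23), (2, 32), (3, 15), (4, 99);
""")

# Plain join: one output row per (product, matching category) pair.
plain_join = [row[0] for row in conn.execute("""
    SELECT p.id
    FROM p_products AS p
    INNER JOIN p_product_p_category AS c ON c.p_product_id = p.id
    WHERE c.p_category_id IN (15, 23, 32) AND p.active = 1
    ORDER BY p.date DESC
""")]

# EXISTS: each product is emitted at most once, so no GROUP BY is needed.
with_exists = [row[0] for row in conn.execute("""
    SELECT p.id
    FROM p_products AS p
    WHERE EXISTS (
        SELECT 1 FROM p_product_p_category AS c
        WHERE c.p_product_id = p.id AND c.p_category_id IN (15, 23, 32)
    ) AND p.active = 1
    ORDER BY p.date DESC
""")]

print(plain_join)
print(with_exists)
```

Because the EXISTS form never multiplies product rows, the server has no duplicates to remove and can, given a suitable index on date, return products in index order.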
You are joining every category row to products, and only then is the result filtered by category id.
Try to limit your query as early as possible, e.g. instead of
INNER JOIN `p_product_p_category`
do
INNER JOIN ( SELECT * FROM `p_product_p_category` WHERE `p_category_id` IN ('15','23','32') ) AS cats
(MySQL requires an alias on a derived table; the rest of the query then refers to it by that alias)
so that you will be working on a smaller subset of products right from the beginning
One possible solution would be to remove the derived table and just do a single Group By:
Select P.id, P.name, P.date
, P.img, P.safe_name, P.sku
, P.productstatusid
, Sum( OP.product_quantity ) As quantity
From p_products As P
Join p_product_p_category As CAT
On P.id = CAT.p_product_id
Left Join p_orderedproducts As OP
On OP.p_product_id = P.id
And OP.order_date >= '2013-03-01 16:51:17'
Where CAT.p_category_id In ('15','23','32')
And P.active = '1'
Group By P.id, P.name, P.date
, P.img, P.safe_name, P.sku
, P.productstatusid
Order By P.date Desc

MYSQL : LEFT JOIN run slowly

I have two tables :
table 1
table 2
Then I left join the two tables:
SELECT DATE(`Inspection_datetime`) AS Date, `Line`,`Model`, `Lot_no`,
COUNT(A.`Serial_number`) AS Qty,B.`name`
FROM `inspection_report` AS A
LEFT JOIN `Employee` AS B
ON A.`NIK` LIKE B.NIK
GROUP BY Date , A.Model ,A.Lot_no,A.Line,B.`name`
ORDER BY Date DESC
This query makes my jQuery DataTables plugin run very slowly, and the data does not even show. The strange part is the field B.name: if I include it in the query the data won't appear, but if I remove it (that is, without the LEFT JOIN), the data appears.
Is my query not good enough? This is my EXPLAIN:
TABLE 1
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE inspection_report ALL NULL NULL NULL NULL 334518
TABLE 2
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Employee ALL NULL NULL NULL NULL 100
QUERY
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE B ALL NULL NULL NULL NULL 100 Using temporary; Using filesort
1 SIMPLE A ALL NULL NULL NULL NULL 334520 Using where; Using join buffer
Any special reason you are joining with a LIKE clause instead of equality?
SELECT DATE(`Inspection_datetime`) AS Date, `Line`,`Model`, `Lot_no`,
COUNT(A.`Serial_number`) AS Qty,B.`name`
FROM `inspection_report` AS A
LEFT JOIN `Employee` AS B
ON A.`NIK` = B.NIK
GROUP BY Date , A.Model ,A.Lot_no,A.Line,B.`name`
ORDER BY Date DESC
should yield better results.
Also, adding an index to A.NIK should help the join operation tremendously.
CREATE INDEX inspection_report_nik ON inspection_report (NIK);