why is this query so slow with MySQL? - mysql

Yesterday I found a slow query running on the server (it takes more than 1 minute). It looks like this:
select a.* from a
left join b on a.hotel_id=b.hotel_id and a.hotel_type=b.hotel_type
where b.hotel_id is null
There are 40000+ rows in table a and 10000+ rows in table b. A unique key had already been created on columns hotel_id and hotel_type in table b, like UNIQUE KEY idx_hotel_id (hotel_id,hotel_type). So I used the EXPLAIN keyword to check the query plan for this SQL and got the following result:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE a ALL NULL NULL NULL NULL 36804
1 SIMPLE b index NULL idx_hotel_id 185 NULL 8353 Using where; Using index; Not exists
According to the MySQL reference manual, when all parts of an index are used by the join and the index is a PRIMARY KEY or UNIQUE NOT NULL index, the join type will be "eq_ref". But in the second row of the query plan, the value of the type column is "index", even though I really do have a unique index on hotel_id and hotel_type and both columns are used by the join. The join type "eq_ref" is more efficient than "ref", and "ref" is more efficient than "range"; "index" is the last join type we want to have, apart from "ALL". This is what confuses me, and I want to know why the join type here is "index". I hope I have described my question clearly, and I look forward to your answers. Thanks!

WHERE ... IS NULL checks can be slow, so maybe that is the cause. You could try NOT EXISTS instead:
select * from a
where not exists ( select 1 from b where a.hotel_id=b.hotel_id and a.hotel_type=b.hotel_type )
Also: how many records are you returning? If you are returning all 36804 records this could slow things down as well.

Thanks to everyone above! I found the solution to my problem myself. The columns hotel_id and hotel_type didn't have the same character set. After I made them both "utf8", my query returned its result in less than 10 milliseconds. There is a good article about LEFT JOIN and indexes in MySQL that I strongly recommend: http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/
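For anyone hitting the same symptom, here is a rough sketch of how the character set mismatch could be checked and fixed. It assumes the tables are really named a and b and live in the current schema, and that converting table b is the right direction; the post does not say which side was changed, so treat that part as a guess.

```sql
-- Compare the character set and collation of the join columns in both tables.
-- A mismatch here can stop MySQL from using the unique index for eq_ref lookups.
SELECT table_name, column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = DATABASE()
  AND table_name IN ('a', 'b')
  AND column_name IN ('hotel_id', 'hotel_type');

-- Assumption: table b is the one to convert. This rewrites every character
-- column in the table, so try it on a copy first.
ALTER TABLE b CONVERT TO CHARACTER SET utf8;

-- Re-check the plan: the type column for b should no longer be "index".
EXPLAIN
SELECT a.* FROM a
LEFT JOIN b ON a.hotel_id = b.hotel_id AND a.hotel_type = b.hotel_type
WHERE b.hotel_id IS NULL;
```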

Related

Seeking optimised proper mysql query for multiple table joins

Hi developer/DBA friends, I have a small issue fetching details from MySQL tables, as follows:
I have a cid, let's say cid=xxx.
I want the output (cid, ruser_id, dtoken, akey, prase) in one record, using cid as the key input.
What MySQL query should I run so that this fetch is also optimised and executes quickly?
The table structure is as follows:
tbl_mk(cid,pro_id) -> pro_id primary key of tbl_pro
tbl_luser(cid,ruser_id) -> ruser_id primary key of tbl_ruser
tbl_ruser(id,dtoken)->id is primary key of this tbl_ruser where its referenced in tbl_luser as ruser_id
tbl_pro(id,akey)-> id is primary key of this tbl_pro which its referenced in tbl_mk as pro_id
tbl_app(akey,prase)
The primary id/reference naming convention is: if the name of a table is
tbl_name, then the id referenced in other tables for tbl_name is name_id, where id is the primary key of tbl_name.
I know there are a lot of MySQL experts here, so how can I make this work with the least effort? FYI, I am basically a mobile app developer, but at the moment I'm working on some MySQL stuff for web API needs :)
Thanks, I would really appreciate it if someone could solve this problem for me. I wrote one query that gets the details, but it doesn't seem to be the proper way; I need a more efficient way, which is why I'm posting here.
Waiting for a good reply with the expected answer.
Just join the tables. Assuming :cid is your input:
SELECT l.cid, l.ruser_id, r.dtoken, p.akey, a.prase
FROM tbl_luser l
JOIN tbl_ruser r ON l.ruser_id = r.id
JOIN tbl_mk m ON l.cid = m.cid
JOIN tbl_pro p ON p.id = m.pro_id
JOIN tbl_app a ON a.akey = p.akey
WHERE l.cid = :cid
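Since the question also asks about speed, a sketch of the supporting indexes might help. The index names below are made up, and some of these indexes may already exist (the question does not list any beyond the primary keys), so check with SHOW INDEX before creating anything.

```sql
-- Hypothetical index names; skip any that duplicate an existing key.
CREATE INDEX idx_luser_cid ON tbl_luser (cid);  -- supports WHERE l.cid = ...
CREATE INDEX idx_mk_cid    ON tbl_mk (cid);     -- supports the join on m.cid
CREATE INDEX idx_app_akey  ON tbl_app (akey);   -- supports the join on a.akey

-- Check the plan with a sample value ('xxx' stands in for the real cid).
EXPLAIN
SELECT l.cid, l.ruser_id, r.dtoken, p.akey, a.prase
FROM tbl_luser l
JOIN tbl_ruser r ON l.ruser_id = r.id
JOIN tbl_mk m ON l.cid = m.cid
JOIN tbl_pro p ON p.id = m.pro_id
JOIN tbl_app a ON a.akey = p.akey
WHERE l.cid = 'xxx';
```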

query with multiple left joins leading to query lock

I am trying to optimize this query as well as possible, but I am still getting query locks because of it. Can anyone provide some suggestions for improving it? The query fetches the last day's entries from the table.
The QUERY:
SELECT CR.id,
CR.servicecode,
CR.leadtime,
CR.redirecturl,
CRE.custemail,
CRE.custlname,
CRE.custfname,
CRE.duration,
CR.userid,
AA.lpintrotimearr,
AA.lpintrotimedep,
AA.landdatetimearr,
AA.landdatetimedep,
CR.newcustid,
CRE.custmobilephone,
CRE.brandname
FROM response CR
LEFT JOIN agreement AA
ON CR.id = AA.id
LEFT JOIN request CRE
ON CRE.id = CR.id
WHERE CR.id > '20120617145243'
AND CR.approved = 1
AND CR.chlapproved != 0
AND CR.chlapproved IS NOT NULL
AND AA.id IS NOT NULL
AND ( AA.stdsign != 'on'
OR AA.stdsign IS NULL )
AND ( AA.ivaflag = 0
OR AA.ivaflag IS NULL )
AND ( AA.opt IS NULL
OR AA.opt = 0 );
The EXPLAIN:
One way is to index all three flag columns (AA.stdsign, AA.ivaflag and AA.opt), but each of them can have only 3 different values. Will indexing these reduce the query run time?
All the ids are of the varchar(60) data type.
There isn't much to be improved on the query itself.
On the other hand, setting an index on AA.stdsign, AA.ivaflag and AA.opt should help a lot.
As your EXPLAIN indicates, no suitable key is found for your AA table and all 534956 rows must be scanned to satisfy the WHERE clause.
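As a sketch, the suggested index could be created as one composite index. The index name is invented, and the column is written as opt because that is the name the query uses. With only three distinct values per flag the index may not be very selective, so it is worth confirming with EXPLAIN that MySQL picks it up.

```sql
-- Hypothetical composite index over the three flag columns.
CREATE INDEX idx_agreement_flags ON agreement (stdsign, ivaflag, opt);

-- Also worth checking that the join column agreement.id is indexed at all,
-- since the plan reportedly shows no usable key for AA.
SHOW INDEX FROM agreement;
```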
[edit]
One last tip: using large column types (such as VARCHAR(60)) for your primary keys is probably sub-optimal.
First reason: every time you need to reference a row (e.g. in a foreign key), you need another VARCHAR(60).
Second reason: comparisons on strings are slower than on integers (hence it may render a JOIN slower than necessary)
You may want to add an INT column to your tables, and use it as primary key.

MySQL: simple schema, joining in a view and sorting on unrelated attribute causes unbearable performance hit

I'm creating a database model for use by a diverse range of applications and different kinds of database servers (though I'm mostly testing on MySQL and SQLite now). It's a really simple model that basically consists of one central matches table and many attribute tables that have the match_id as their primary key and one other field (the attribute value itself). In other words, every match has exactly one of every type of attribute, and every attribute is stored in a separate table. After experiencing some rather bad performance whilst sorting and filtering on these attributes (FROM matches LEFT JOIN attributes_i_want on primary index), I decided to try to improve it. To this end I added an index on every attribute value column. Sorting and filtering performance increased a lot for easy queries.
This simple schema is basically a requirement for the application, so that it is able to auto-discover and use attributes. Thus, to create more complex attributes that are actually based on other results, I decided to use VIEWs that turn one or more other tables that don't necessarily match the attribute-like schema into an attribute schema. I call these meta-attributes (they aren't directly editable either). However, to the application this is all transparent, and so it happily joins in the VIEW as well when it wants to. The problem: it kills performance. When the VIEW is joined in without sorting on any attribute, performance is still acceptable, but combining a retrieval of the VIEW with sorting is unacceptably slow (on the order of 1s). Even after reading quite a few tutorials on indexing and some questions here on Stack Overflow, I can't seem to fix it.
_Prerequisites for a solution: in one way or another, num_duplicates must exist as a table or view with the columns match_id and num_duplicates to look like an attribute. I can't change the way attributes are discovered and used. So if I want to see num_duplicates appear in the application it'll have to be as some kind of view or materialized table that makes a num_duplicates table._
Relevant parts of the schema
Main table:
CREATE TABLE `matches` (
`match_id` int(11) NOT NULL,
`source_name` text,
`target_name` text,
`transformation` text,
PRIMARY KEY (`match_id`)
) ENGINE=InnoDB;
Example of a normal attribute (indexed):
CREATE TABLE `error` (
`match_id` int(11) NOT NULL,
`error` double DEFAULT NULL,
PRIMARY KEY (`match_id`),
KEY `error_index` (`error`)
) ENGINE=InnoDB;
(all normal attributes, like error, are basically the same)
Meta-attribute / VIEW:
CREATE VIEW num_duplicates
AS SELECT duplicate AS match_id, COUNT(duplicate) AS num_duplicates
FROM duplicate
GROUP BY duplicate
(this is the only meta-attribute I'm using right now)
Simple query with indexing on the attribute value columns (the part improved by indexes)
SELECT matches.match_id, source_name, target_name, transformation FROM matches
INNER JOIN error ON matches.match_id = error.match_id
ORDER BY error.error
(the performance on this query increased a lot because of the index on error)
(the runtime of this query is on the order of 0.0001 sec)
Slightly more complex queries and their runtimes including the meta-attribute (the still bad part)
SELECT
matches.match_id, source_name, target_name, transformation, STATUS , volume, error, COMMENT , num_duplicates
FROM matches
INNER JOIN STATUS ON matches.match_id = status.match_id
INNER JOIN error ON matches.match_id = error.match_id
LEFT JOIN num_duplicates ON matches.match_id = num_duplicates.match_id
INNER JOIN volume ON matches.match_id = volume.match_id
INNER JOIN COMMENT ON matches.match_id = comment.match_id
(runtime: 0.0263sec) <--- still acceptable
SELECT matches.match_id, source_name, target_name, transformation, STATUS , volume, error, COMMENT , num_duplicates
FROM matches
INNER JOIN STATUS ON matches.match_id = status.match_id
INNER JOIN error ON matches.match_id = error.match_id
LEFT JOIN num_duplicates ON matches.match_id = num_duplicates.match_id
INNER JOIN volume ON matches.match_id = volume.match_id
INNER JOIN COMMENT ON matches.match_id = comment.match_id
ORDER BY error.error
LIMIT 20, 20
(runtime: 0.8866 sec) <--- not acceptable (the query speed is exactly the same with the LIMIT as without it. Note: if I could get the version with the LIMIT to be fast, that would already be a big win. I presume it has to scan the entire table, so the LIMIT doesn't matter much.)
EXPLAIN of the last query
Of course I tried to solve it myself before coming here, but I must admit I'm not that good at these things and haven't found a way to remove the offending performance killer yet. I know it's most likely the using filesort but I don't know how to get rid of it.
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY error index PRIMARY,match_id error_index 9 NULL 53909 Using index; Using temporary; Using filesort
1 PRIMARY COMMENT eq_ref PRIMARY PRIMARY 4 tangbig4.error.match_id 1
1 PRIMARY STATUS eq_ref PRIMARY PRIMARY 4 tangbig4.COMMENT.match_id 1 Using where
1 PRIMARY matches eq_ref PRIMARY PRIMARY 4 tangbig4.COMMENT.match_id 1 Using where
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 2
1 PRIMARY volume eq_ref PRIMARY PRIMARY 4 tangbig4.matches.match_id 1 Using where
2 DERIVED duplicate index NULL duplicate_index 5 NULL 49222 Using index
By the way, the query without the sort, which still runs acceptably, is EXPLAIN'ed like this:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY COMMENT ALL PRIMARY NULL NULL NULL 49610
1 PRIMARY error eq_ref PRIMARY,match_id PRIMARY 4 tangbig4.COMMENT.match_id 1
1 PRIMARY matches eq_ref PRIMARY PRIMARY 4 tangbig4.COMMENT.match_id 1
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 2
1 PRIMARY STATUS eq_ref PRIMARY PRIMARY 4 tangbig4.COMMENT.match_id 1
1 PRIMARY volume eq_ref PRIMARY PRIMARY 4 tangbig4.matches.match_id 1 Using where
2 DERIVED duplicate index NULL duplicate_index 5 NULL 49222 Using index
Question
So, my question is whether someone who knows more about databases/MySQL can point me to an approach I can use or research to increase the performance of my last query.
I've been thinking quite a lot about materialized views, but they are not natively supported in MySQL, and since I'm going for as wide a range of SQL servers as possible, this might not be ideal. I'm hoping that a change to the queries or views might help, or possibly an extra index.
EDIT: Some random thoughts I've been having about the query:
VERY FAST: joining all tables, excluding the VIEW, sorting
ACCEPTABLE: joining all tables, including the VIEW, no sorting
DOG SLOW: joining all tables, including the VIEW, sorting
But: the VIEW has no influence at all on the sorting; none of its attributes, or even the attributes in its constituent tables, are used to sort. Why does including the sort impact performance that much, then? Is there any way I can convince the database to sort first and then just join up the VIEW? Or can I convince it that the VIEW is not important for sorting?
EDIT2: Following @ace's suggestion of creating a VIEW and then joining, at first it didn't seem to help:
DROP VIEW IF EXISTS `matches_joined`;
CREATE VIEW `matches_joined` AS (
SELECT matches.match_id, source_name, target_name, transformation, STATUS , volume, error, COMMENT
FROM matches
INNER JOIN STATUS ON matches.match_id = status.match_id
INNER JOIN error ON matches.match_id = error.match_id
INNER JOIN volume ON matches.match_id = volume.match_id
INNER JOIN COMMENT ON matches.match_id = comment.match_id
ORDER BY error.error
);
followed by:
SELECT matches_joined.*, num_duplicates
FROM matches_joined
LEFT JOIN num_duplicates ON matches_joined.match_id = num_duplicates.match_id
However, using LIMIT on the view did make a difference:
DROP VIEW IF EXISTS `matches_joined`;
CREATE VIEW `matches_joined` AS (
SELECT matches.match_id, source_name, target_name, transformation, STATUS , volume, error, COMMENT
FROM matches
INNER JOIN STATUS ON matches.match_id = status.match_id
INNER JOIN error ON matches.match_id = error.match_id
INNER JOIN volume ON matches.match_id = volume.match_id
INNER JOIN COMMENT ON matches.match_id = comment.match_id
ORDER BY error.error
LIMIT 0, 20
);
Afterwards, the query ran at an acceptable speed. This is already a nice result. However, I feel that I'm jumping through hoops to force the database to do what I want and the reduction in time is probably only caused by the fact that it now only has to sort 20 rows. What if I have more rows? Is there any other way to force the database to see that joining in the num_duplicates VIEW doesn't influence the sorting in the least? Could I perhaps change the query that makes the VIEW a bit?
Some things that can be tested if you haven't tried them yet.
Create a view for all joins with sorting.
DROP VIEW IF EXISTS `matches_joined`;
CREATE VIEW `matches_joined` AS (
SELECT matches.match_id, source_name, target_name, transformation, STATUS , volume, error, COMMENT
FROM matches
INNER JOIN STATUS ON matches.match_id = status.match_id
INNER JOIN error ON matches.match_id = error.match_id
INNER JOIN volume ON matches.match_id = volume.match_id
INNER JOIN COMMENT ON matches.match_id = comment.match_id
ORDER BY error.error
);
Then join them with num_duplicates
SELECT matches_joined.*, num_duplicates
FROM matches_joined
LEFT JOIN num_duplicates ON matches_joined.match_id = num_duplicates.match_id
I'm assuming that, as pointed out here, this query will utilize the ORDER BY clause in the view matches_joined.
Some information that may help on optimization.
MySQL :: MySQL 5.0 Reference Manual :: 7.3.1.11 ORDER BY Optimization
The problem was more or less solved by the "VIEW" suggestion that @ace made, but several other types of queries still had performance issues (notably large OFFSETs). In the end, a large improvement on all queries of this form was had by simply forcing late-row lookup. Note that it is commonly claimed that this is only necessary for MySQL, because MySQL always performs early-row lookup, and that other databases like PostgreSQL don't suffer from this problem. However, extensive benchmarks of my application have shown that PostgreSQL benefits greatly from this approach as well.
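The exact query used for late-row lookup is not shown, so the following is only a sketch of the general idea: sort and page on the narrow indexed table first, then join the wide rows and the VIEW for just that page. It assumes the status, volume and comment attribute tables follow the same pattern as error (a match_id plus a value column named after the table), and it relies on the statement above that every match has exactly one row per attribute, so limiting before the inner joins does not change the result.

```sql
SELECT m.match_id, m.source_name, m.target_name, m.transformation,
       s.`status`, v.volume, e.error, c.`comment`, nd.num_duplicates
FROM (
    -- Late-row lookup: only the index on error.error is touched here,
    -- so the sort and OFFSET work on (error, match_id) pairs alone.
    SELECT match_id, error
    FROM error
    ORDER BY error
    LIMIT 20, 20
) AS e
INNER JOIN matches   AS m ON m.match_id = e.match_id
INNER JOIN `status`  AS s ON s.match_id = e.match_id
INNER JOIN volume    AS v ON v.match_id = e.match_id
INNER JOIN `comment` AS c ON c.match_id = e.match_id
LEFT JOIN num_duplicates AS nd ON nd.match_id = e.match_id
-- The outer ORDER BY re-sorts only the 20 rows of the page.
ORDER BY e.error;
```

If error values can tie, adding match_id as a second sort key in both ORDER BY clauses would keep the paging stable.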

sql on mysql about join

The code below returns far too many rows. In fact, I want to list the customers that never bought anything. How can I fix the code below?
SELECT
webboard.listweb.id,
webboard.listweb.iditempro,
webboard.listweb.url,
webboard.listweb.useradddate,
webboard.listweb.expiredate,
webboard.prorecord.urlpostonweb
webboard.prorecord.urlpostonweb
FROM
webboard.listweb ,
webboard.prorecord
Where listweb.id Not In
(select webboard.prorecord.idlist From webboard.prorecord )
Using the syntax
FROM
webboard.listweb ,
webboard.prorecord
will perform a cartesian, or cross, join on the tables involved. So for every row in the table listweb all the rows in prorecord are displayed.
You need to use an INNER JOIN to only select the rows in listweb that have related rows in the prorecord table. What are the fields which identify the rows (your Primary Keys) and what is the name of the foreign key field in the prorecord table?
EDIT: Just re-read the question and comments and I see you want the rows in listweb which do not have an entry in prorecord
Your SELECT will then look like:
SELECT
webboard.listweb.id,
webboard.listweb.iditempro,
webboard.listweb.url,
webboard.listweb.useradddate,
webboard.listweb.expiredate,
webboard.prorecord.urlpostonweb
-- webboard.prorecord.urlpostonweb -- You have this field twice
FROM webboard.listweb LEFT JOIN webboard.prorecord
ON webboard.listweb.id = webboard.prorecord.idlist -- I'm guessing at the foreign key here
WHERE webboard.prorecord.idlist IS NULL
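An equivalent way to express the same anti-join is NOT EXISTS, sketched below under the same guess that prorecord.idlist references listweb.id. It avoids one pitfall of the original NOT IN: if idlist ever contains a NULL, NOT IN returns no rows at all. The prorecord column is left out of the select list because it would always be NULL for unmatched rows.

```sql
SELECT l.id, l.iditempro, l.url, l.useradddate, l.expiredate
FROM webboard.listweb AS l
WHERE NOT EXISTS (
    -- Assumption: idlist is the foreign key pointing at listweb.id.
    SELECT 1
    FROM webboard.prorecord AS p
    WHERE p.idlist = l.id
);
```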

How to optimize MySQL Views

I have some queries using views, and these run a lot slower than I would expect them to, given that all relevant tables are indexed (and not that large anyway).
I hope I can explain this:
My main Query looks like this (grossly simplified)
select [stuff] from orders as ord
left join calc_order_status as ors on (ors.order_id = ord.id)
calc_order_status is a view, defined thusly:
create view calc_order_status as
select ord.id AS order_id,
(sum(itm.items * itm.item_price) + ord.delivery_cost) AS total_total
from orders ord
left join order_items itm on itm.order_id = ord.id
group by ord.id
Orders (ord) contain orders, order_items contain the individual items associated with each order and their prices.
All tables are properly indexed, BUT the thing runs slowly, and when I do an EXPLAIN I get
# id select_type table type possible_keys key key_len ref rows Extra
1 1 PRIMARY ord ALL customer_id NULL NULL NULL 1002 Using temporary; Using filesort
2 1 PRIMARY <derived2> ALL NULL NULL NULL NULL 1002
3 1 PRIMARY cus eq_ref PRIMARY PRIMARY 4 db135147_2.ord.customer_id 1 Using where
4 2 DERIVED ord ALL NULL NULL NULL NULL 1002 Using temporary; Using filesort
5 2 DERIVED itm ref order_id order_id 4 db135147_2.ord.id 2
My guess is, "derived2" refers to the view. The individual items (itm) seem to work fine, indexed by order_id. The problem seems to be Line # 4, which indicates that the system doesn't use a key for the orders table (ord). But in the MAIN query, the order id is already defined:
left join calc_order_status as ors on (ors.order_id = ord.id)
and ord.id (both in the main query and within the view) refer to the primary key.
I have read somewhere that MySQL simply does not optimize views that well and might not utilize keys under some conditions even when available. This seems to be one of those cases.
I would appreciate any suggestions. Is there a way to force MySQL to realize "it's all simpler than you think, just use the primary key and you'll be fine"? Or are views the wrong way to go about this at all?
If it is at all possible to remove those joins, remove them. Replacing them with subqueries can speed it up a lot.
You could also try running something like this to see whether it makes any speed difference at all.
select [stuff] from orders as ord
left join (
select ord.id AS order_id,
(sum(itm.items * itm.item_price) + ord.delivery_cost) AS total_total
from orders ord
left join order_items itm on itm.order_id = ord.id
group by ord.id
) as ors on (ors.order_id = ord.id)
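The "replace the join with a subquery" idea could look roughly like the sketch below: a correlated subquery computes the total per order, so MySQL can use the order_id index on order_items instead of materializing the whole view. The select list is a guess, since the original only shows [stuff]. As in the view, an order with no items yields NULL for total_total.

```sql
SELECT ord.id,
       -- Correlated subquery: one indexed lookup on order_items per order row.
       (SELECT SUM(itm.items * itm.item_price)
        FROM order_items AS itm
        WHERE itm.order_id = ord.id) + ord.delivery_cost AS total_total
FROM orders AS ord;
```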
An index is useful for finding a few rows in a big table, but when you query every row, an index just slows things down. So here MySQL probably expects to read the whole orders table, and it is better off not using an index.
You can test whether it would be faster by forcing MySQL to use an index:
from orders as ord force index for join (yourindex)