When I run a query on a MySQL database, it takes around 3 seconds. When we run a performance test with 50 concurrent users, the same query takes 120 seconds.
The query joins multiple tables with an order by clause and a limit condition.
We are using an RDS instance (16 GB memory, 4 vCPU).
Can anyone suggest how to improve the performance in this case?
Query:
SELECT
person0_.person_id AS person_i1_131_,
person0_.uuid AS uuid2_131_,
person0_.gender AS gender3_131_,
CASE
WHEN
EXISTS( SELECT * FROM patient p WHERE p.patient_id = person0_.person_id)
THEN 1
ELSE 0
END AS formula1_,
CASE
WHEN person0_1_.patient_id IS NOT NULL THEN 1
WHEN person0_.person_id IS NOT NULL THEN 0
END AS clazz_
FROM
person person0_
LEFT OUTER JOIN
patient person0_1_ ON person0_.person_id = person0_1_.patient_id
INNER JOIN
person_attribute attributes1_ ON person0_.person_id = attributes1_.person_id
CROSS JOIN
person_attribute_type personattr2_
WHERE
attributes1_.person_attribute_type_id = personattr2_.person_attribute_type_id
AND personattr2_.name = 'PersonImageAttribute'
AND (person0_.person_id IN (SELECT
person3_.person_id
FROM
person person3_
INNER JOIN
person_attribute attributes4_ ON person3_.person_id = attributes4_.person_id
CROSS JOIN
person_attribute_type personattr5_
WHERE
attributes4_.person_attribute_type_id = personattr5_.person_attribute_type_id
AND personattr5_.name = 'LocationAttribute'
AND (attributes4_.value IN ('d31fe20e-6736-42ff-a3ed-b3e622e80842'))))
ORDER BY person0_1_.date_changed , person0_1_.patient_id
LIMIT 25
Plan
There appear to be some redundant query components, and the CROSS JOIN does not look appropriate when you are relating rows on specific patient and/or attribute info.
Your "clazz_" expression first tests whether patient_id is NOT NULL, and then whether person_id is NOT NULL. Under what condition would the person_id coming from the person table EVER be null? That looks like a key ID and would NEVER be null, so why test for it? It seems like a duplicate field that, in essence, is just the condition of a person actually being a patient or not.
The query below SHOULD return the same results. I also suggest making sure the following indexes are available:
table                    index
person                   ( person_id )
person_attribute         ( person_id, person_attribute_type_id )
person_attribute_type    ( person_attribute_type_id, name )
patient                  ( patient_id )
select
p1.person_id AS person_i1_131_,
p1.uuid AS uuid2_131_,
p1.gender AS gender3_131_,
CASE WHEN p2.patient_id IS NULL
then 0 else 1 end formula1_,
-- appears to be a redundant result, just trying to qualify
-- some specific column value for later calculations.
CASE WHEN p2.patient_id IS NULL
THEN 0 else 1 end clazz_
from
-- pre-get only those people based on the P4 attribute in question
-- and attribute type of location. Get small list vs everything else
( SELECT distinct
pa.person_id
FROM
person_attribute pa
JOIN person_attribute_type pat
on pa.person_attribute_type_id = pat.person_attribute_type_id
AND pat.name = 'LocationAttribute'
WHERE
pa.value = 'd31fe20e-6736-42ff-a3ed-b3e622e80842' ) PQ
join person p1
on PQ.person_id = p1.person_id
LEFT JOIN patient p2
ON p1.person_id = p2.patient_id
JOIN person_attribute pa1
ON p1.person_id = pa1.person_id
JOIN person_attribute_type pat1
on pa1.person_attribute_type_id = pat1.person_attribute_type_id
AND pat1.name = 'PersonImageAttribute'
order by
p2.date_changed,
p2.patient_id
LIMIT
25
Finally, your query orders by date_changed and patient_id, both of which come from the PATIENT table. Because that table is LEFT-JOINed, you may have a bunch of PERSON records that are not patients. Those rows have NULL in both columns and sort first in ascending order, so the LIMIT may not return the records you really intend. So, this is just some personal review of what is presented in the question.
Speeding up the query is the best hope for handling more connections.
A simplification (but no speed difference), since TRUE=1 and FALSE=0:
CASE WHEN (boolean_expression) THEN 1 ELSE 0 END
-->
(boolean_expression)
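As a quick illustration of that simplification (a sketch only; the join below reuses the person and patient tables from the question and is not the full query), MySQL evaluates a comparison to 1 or 0, so the CASE wrapper can be dropped:

```sql
-- In MySQL a boolean expression evaluates to 1 (true) or 0 (false).
SELECT (1 = 1) AS is_true,   -- 1
       (1 = 2) AS is_false;  -- 0

-- Applied to the LEFT JOINed patient row. IS NOT NULL never returns NULL
-- itself, so the result is always exactly 1 or 0.
SELECT p.person_id,
       (pt.patient_id IS NOT NULL) AS formula1_
FROM person p
LEFT JOIN patient pt ON pt.patient_id = p.person_id;
```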
Index suggestions:
patient: INDEX(patient_id, date_changed)
person_attribute: INDEX(person_attribute_type_id, person_id)
person_attribute: INDEX(person_attribute_type_id, value, person_id)
person_attribute_type: INDEX(person_attribute_type_id, name)
If value is of type TEXT, it cannot be indexed in full; MySQL requires a prefix length for a TEXT column in an index.
Assuming that person has PRIMARY KEY(person_id) and patient -- patient_id, I have no extra recommendations for them.
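The person_attribute suggestions above could be created roughly as follows. This is a sketch: the index names (idx_...) are made up for the example, and it assumes value is a VARCHAR short enough to index in full.

```sql
-- Hypothetical index names; column lists are taken from the suggestions above.
CREATE INDEX idx_pa_type_person
    ON person_attribute (person_attribute_type_id, person_id);

-- Assumes `value` is a VARCHAR. If it is TEXT, a prefix length is required,
-- for example value(36), which is enough to cover a UUID string.
CREATE INDEX idx_pa_type_value_person
    ON person_attribute (person_attribute_type_id, value, person_id);

CREATE INDEX idx_pat_type_name
    ON person_attribute_type (person_attribute_type_id, name);
```

After adding them, running EXPLAIN on the original query shows whether the optimizer actually picks them up.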
The Entity-Attribute-Value schema pattern, which this seems to be, is hard to optimize when there are a large number of rows. Sorry.
The CROSS JOIN seems to be just an INNER JOIN, but with the condition in the WHERE instead of in ON, where it belongs.
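For example, the outer part of the query could express the same relationship with the condition moved into ON. This is a trimmed-down sketch that keeps the generated aliases from the question and omits the other columns and the IN subquery:

```sql
SELECT person0_.person_id
FROM person person0_
INNER JOIN person_attribute attributes1_
        ON attributes1_.person_id = person0_.person_id
-- Was: CROSS JOIN person_attribute_type ... with the match in WHERE.
INNER JOIN person_attribute_type personattr2_
        ON personattr2_.person_attribute_type_id = attributes1_.person_attribute_type_id
WHERE personattr2_.name = 'PersonImageAttribute';
```

The result is the same; the rewrite only makes the relationship between the tables explicit.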
person0_1_.patient_id can be NULL because of the LEFT JOIN, but I don't see how person0_.person_id can be NULL. Please check your logic.
Related
I would like to find a way to improve a query, but it seems I've done it all. Let me give you some details.
Below is my query :
SELECT
`u`.`id` AS `id`,
`p`.`lastname` AS `lastname`,
`p`.`firstname` AS `firstname`,
COALESCE(`r`.`value`, 0) AS `rvalue`,
SUM(`rat`.`category` = 'A') AS `count_a`,
SUM(`rat`.`category` = 'B') AS `count_b`,
SUM(`rat`.`category` = 'C') AS `count_c`
FROM
`user` `u`
JOIN `user_customer` `uc` ON (`u`.`id` = `uc`.`user_id`)
JOIN `profile` `p` ON (`p`.`id` = `u`.`profile_id`)
JOIN `ad` FORCE INDEX (fk_ad_customer_idx) ON (`uc`.`customer_id` = `ad`.`customer_id`)
JOIN `ac` ON (`ac`.`id` = `ad`.`ac_id`)
JOIN `a` ON (`a`.`id` = `ac`.`a_id`)
JOIN `rat` ON (`rat`.`code` = `a`.`rat_code`)
LEFT JOIN `r` ON (`r`.`id` = `u`.`r_id`)
GROUP BY `u`.`id`
;
Note : Some table and column names are voluntarily hidden.
Now let me give you some volumetric data :
user => 6534 rows
user_customer => 12 923 rows
profile => 6511 rows
ad => 320 868 rows
ac => 4505 rows
a => 536 rows
rat => 6 rows
r => 3400 rows
And finally, my execution plan :
My query currently runs in around 1.3 to 1.7 seconds, which is of course slow enough to annoy users of my application. Also, FYI, the result set is composed of 165 rows.
Is there a way I can improve this ?
Thanks.
EDIT 1 (answer to Rick James below) :
What are the speed and EXPLAIN when you don't use FORCE INDEX?
Surprisingly, it gets faster when I don't use FORCE INDEX. To be honest, I don't really remember why I made that change. I probably saw better performance with it during one of my various tries and never removed it.
When I don't use FORCE INDEX, it uses another index, ad_customer_ac_id_blocked_idx(customer_id, ac_id, blocked), and times are around 1.1 sec.
I don't really get it, because fk_ad_customer_idx(customer_id) is equivalent as far as indexing on customer_id goes.
Get rid of FORCE INDEX. Even if it helped yesterday; it may hurt tomorrow.
Some of these indexes may be beneficial. (It is hard to predict; so simply add them all.)
a: (rat_code, id)
rat: (code, category)
ac: (a_id, id)
ad: (ac_id, customer_id)
ad: (customer_id, ac_id)
uc: (customer_id, user_id)
uc: (user_id, customer_id)
u: (profile_id, r_id, id)
(This assumes that id is the PRIMARY KEY of each table. Note that none have id first.) Most of the above are "covering".
Another approach that sometimes helps: Gather the SUMs before joining to any unnecessary table. But it seems that p is the only table not involved in getting from u (the target of GROUP BY) to r and rat (used in aggregates). It would look something like:
SELECT ..., firstname, lastname
FROM ( everything as above except for `p` ) AS most
JOIN `profile` `p` ON (`p`.`id` = most.`profile_id`)
GROUP BY most.id
This avoids hauling around firstname and lastname while doing most of the joins and the GROUP BY.
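Spelled out against the query in the question, that idea might look like the sketch below. It is untested against the real schema; it drops FORCE INDEX, aggregates inside the derived table rather than outside it, and adds profile_id and r.value to the GROUP BY so that it also runs with ONLY_FULL_GROUP_BY enabled.

```sql
SELECT most.id, p.lastname, p.firstname,
       most.rvalue, most.count_a, most.count_b, most.count_c
FROM (
    -- Everything except `profile`, aggregated down to one row per user.
    SELECT u.id,
           u.profile_id,
           COALESCE(r.value, 0)    AS rvalue,
           SUM(rat.category = 'A') AS count_a,
           SUM(rat.category = 'B') AS count_b,
           SUM(rat.category = 'C') AS count_c
    FROM user u
    JOIN user_customer uc ON uc.user_id = u.id
    JOIN ad  ON ad.customer_id = uc.customer_id
    JOIN ac  ON ac.id = ad.ac_id
    JOIN a   ON a.id = ac.a_id
    JOIN rat ON rat.code = a.rat_code
    LEFT JOIN r ON r.id = u.r_id
    GROUP BY u.id, u.profile_id, r.value
) AS most
JOIN profile p ON p.id = most.profile_id;
```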
When doing JOINs and GROUP BY, be sure to sanity check the aggregates. Your COUNTs and SUMs may be larger than they should be.
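A toy example of how a join can inflate an aggregate (the invoice and invoice_tag tables are invented purely for illustration and have nothing to do with the schema in the question):

```sql
CREATE TABLE invoice     (id INT PRIMARY KEY, amount INT);
CREATE TABLE invoice_tag (invoice_id INT, tag VARCHAR(10));

INSERT INTO invoice     VALUES (1, 100);
INSERT INTO invoice_tag VALUES (1, 'a'), (1, 'b');

-- The single invoice row is repeated once per tag, so the SUM is doubled.
SELECT SUM(i.amount) AS inflated_total       -- 200
FROM invoice i
JOIN invoice_tag t ON t.invoice_id = i.id;

-- Without the join, the total is what you would expect.
SELECT SUM(amount) AS true_total FROM invoice;  -- 100
```

Comparing an aggregate from the joined query with the same aggregate computed on the base table alone is a cheap way to spot this kind of fan-out.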
First, you don't need to tick.everyTableAndColumn in your queries, nor result columns, aliases, etc. The tick marks are used primarily when you are in conflict with a reserved word, so the parser knows you are referring to a specific column, like having a table with a COLUMN named "JOIN" when JOIN is part of the SQL language. See the confusion it would cause? Dropping the ticks helps readability too.
Next, and this is just personal preference and can help you and others following behind you on data and their relationships. I show the join as indented from where it is coming from. As you can see below, I see the chain on how do I get from the User (u alias) to the rat alias table... You get there only by going 5 levels deep, and I put the first table on the left-side of the join (coming from table) then = the table joining TO right-side of join.
Now, that I can see the relationships, I would suggest the following. Make COVERING indexes on your tables that have the criteria, and id/value where appropriate. This way the query gets as best it needs, the data from the index page vs having to go to the raw data. So here are suggestions for indexes.
table            index
user_customer    ( user_id, customer_id )  -- (don't know what your fk_ad_customer_idx parts are)
ad               ( customer_id, ac_id )
ac               ( id, a_id )
a                ( id, rat_code )
rat              ( code, category )
Reformatted query for readability and seeing relationships between the tables
SELECT
u.id,
p.lastname,
p.firstname,
COALESCE(r.value, 0) AS rvalue,
SUM(rat.category = 'A') AS count_a,
SUM(rat.category = 'B') AS count_b,
SUM(rat.category = 'C') AS count_c
FROM
user u
JOIN user_customer uc
ON u.id = uc.user_id
JOIN ad FORCE INDEX (fk_ad_customer_idx)
ON uc.customer_id = ad.customer_id
JOIN ac
ON ad.ac_id = ac.id
JOIN a
ON ac.a_id = a.id
JOIN rat
ON a.rat_code = rat.code
JOIN profile p
ON u.profile_id = p.id
LEFT JOIN r
ON u.r_id = r.id
GROUP BY
u.id
I've looked all over, and unfortunately, I can't seem to figure out what I'm doing wrong. I'm developing a personal financial management application that uses a MySQL server. For this problem, I have 4 tables I'm working with.
The TRANSACTIONS table contains columns CATID and BILLID which refer to primary keys in the SECONDARYCATEGORIES and BILLS tables. Both the TRANSACTIONS and BILLS tables have a column PCATID which refers to a primary key in the PRIMARYCATEGORIES table.
I'm building a SQL query that sums an "amount" column in the TRANSACTIONS table and returns the primary key from PCATID and the sum from all records that are associated with that value. If the BILLID is set to -1, it should find the PCATID in SECONDARYCATEGORIES where SECONDARYCATEGORIES.ID = TRANSACTIONS.CATID, otherwise (since -1 indicates this is NOT a bill), it should find the PCATID from the BILL record where BILLS.ID matches TRANSACTIONS.BILLID.
I'm looking for something like this (not valid SQL, obviously):
SELECT
SECONDARYCATEGORIES.PCATID,
SUM(TRANSACTIONS.AMOUNT)
FROM
TRANSACTIONS
IF (BILLID = -1) JOIN SECONDARYCATEGORIES ON SECONDARYCATEGORIES.ID = TRANSACTIONS.CATID
ELSE JOIN SECONDARYCATEGORIES ON SECONDARYCATEGORIES.ID = BILLS.CATID WHERE BILLS.ID = TRANSACTIONS.BILLID
I have tried a myriad of different JOINs, IF statements, etc, and I just can't seem to make this work. I had thought of breaking this up into different SQL queries based on the value of BILLID, and summing the values, but I'd really like to do this all in one SQL query if possible.
I know I'm missing something obvious here; any help is very much appreciated.
Edit: I forgot to describe the BILLS table. It contains a primary category, ID, as well as some descriptive data.
You can use OR in your JOIN, like this:
SELECT S.PCATID,
SUM(T.AMOUNT)
FROM TRANSACTIONS T
LEFT JOIN BILLS ON BILLS.ID = T.BILLID
JOIN SECONDARYCATEGORIES S ON (S.ID = T.CATID AND T.BILLID = -1)
OR (S.ID = BILLS.CATID AND BILLS.ID = T.BILLID)
GROUP BY S.PCATID
You can also use COALESCE and CASE in your JOINs.
SELECT COALESCE(s.PCATID, b.PCATID) AS ID
,SUM(t.AMOUNT) AS Total
FROM TRANSACTIONS t
LEFT JOIN BILLS b ON b.ID = CASE WHEN t.BILLID <> -1 THEN t.BILLID END
LEFT JOIN SECONDARYCATEGORIES s ON s.ID = CASE WHEN t.BILLID = -1 THEN t.CATID END
GROUP BY COALESCE(s.PCATID, b.PCATID)
I would use UNION to pick either query. Note that the second query won't work as written, because it never joins the BILLS table.
SELECT SECONDARYCATEGORIES.PCATID
, SUM(TRANSACTIONS.AMOUNT)
FROM TRANSACTIONS
JOIN SECONDARYCATEGORIES ON SECONDARYCATEGORIES.ID = TRANSACTIONS.CATID AND BILLID = -1
UNION
SELECT SECONDARYCATEGORIES.PCATID
, SUM(TRANSACTIONS.AMOUNT)
FROM TRANSACTIONS
JOIN SECONDARYCATEGORIES ON SECONDARYCATEGORIES.ID = BILLS.CATID AND BILLID <> -1
WHERE BILLS.ID = TRANSACTIONS.BILLID
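One way the two-branch idea could be made to work is sketched below. It assumes, as the question describes, that both SECONDARYCATEGORIES and BILLS carry a PCATID column and that BILLS.ID is the key matched by TRANSACTIONS.BILLID; the alias x and the column name total are made up here.

```sql
SELECT x.PCATID, SUM(x.AMOUNT) AS total
FROM (
    -- Not a bill: take the primary category from the secondary category.
    SELECT s.PCATID, t.AMOUNT
    FROM TRANSACTIONS t
    JOIN SECONDARYCATEGORIES s ON s.ID = t.CATID
    WHERE t.BILLID = -1

    UNION ALL

    -- A bill: take the primary category from the bill itself.
    SELECT b.PCATID, t.AMOUNT
    FROM TRANSACTIONS t
    JOIN BILLS b ON b.ID = t.BILLID
    WHERE t.BILLID <> -1
) AS x
GROUP BY x.PCATID;
```

UNION ALL is used rather than UNION so that two transactions with the same category and amount are not collapsed into one before summing.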
I'm using MySQL and I have a query. There is also a subquery.
SELECT * FROM rg, list, status
WHERE (
(rg.required_status_id IS NULL AND rg.incorrect_status_id IS NULL) ||
(status.season_id = rg.required_status_id AND status.user_id = list.user_id) ||
(rg.incorrect_status_id IS NOT NULL AND
list.user_id NOT IN (SELECT user_id FROM status WHERE user_id = list.user_id AND season_id = rg.incorrect_status_id)
)
)
The problem is the following part of the code:
(rg.incorrect_status_id IS NOT NULL AND
list.user_id NOT IN (SELECT user_id FROM status WHERE user_id = list.user_id AND season_id = rg.incorrect_status_id)
)
How could I check if the table "status" has a row where user_id is same as list.user_id and season_id is same as rg.incorrect_status_id?
Update
Here is my current code, but it does not work at all. I do not know what to do.
SELECT * FROM rg, list, status
LEFT JOIN status AS stat
INNER JOIN rg AS rglist
ON rglist.incorrect_status_id = stat.season_id
ON stat.season_id = rglist.incorrect_status_id
WHERE (
(rg.required_status_id IS NULL AND rg.incorrect_status_id IS NULL) ||
(status.season_id = rg.required_status_id AND status.user_id = list.user_id) ||
(rg.incorrect_status_id IS NOT NULL AND stat.user_id IS NULL)
)
)
Update 2
I modified the names, but the basic idea is same.
FROM sarjojen_rglistat, sarjojen_rglistojen_osakilpailut, kilpailukausien_kilpailut, sarjojen_osakilpailuiden_rgpisteet
, sarjojen_kilpailukaudet, sarjojen_kilpailukausien_kilpailusysteemit
/* , kayttajien_ilmoittautumiset */
/* , sarjojen_kilpailukausien_pelaajastatukset */
LEFT OUTER JOIN sarjojen_kilpailukausien_pelaajastatukset
ON sarjojen_kilpailukausien_pelaajastatukset.sarjan_kilpailukausi_id = sarjojen_rglistat.vaadittu_pelaajastatus_id
LEFT OUTER JOIN kayttajien_ilmoittautumiset
ON kayttajien_ilmoittautumiset.kayttaja_id = sarjojen_kilpailukausien_pelaajastatukset.kayttaja_id
Now this says:
Column not found: 1054 Unknown column 'sarjojen_rglistat.vaadittu_pelaajastatus_id' in 'on clause'
Why is that so?
I have a table called "sarjojen_rglistat" and there is a column "vaadittu_pelaajastatus_id".
1) Simpler queries are easier for the query engine to interpret and produce an efficient plan.
If you pay careful attention to the following part of your query, you may realise something a little "weird" is going on. This is a clue that the approach is perhaps a little too complicated.
...(
list.user_id NOT IN (
SELECT user_id
FROM status
/* Note the sub-query cannot ever return a user_id different
to the one checked with "NOT IN" above */
WHERE user_id = list.user_id
AND season_id = rg.incorrect_status_id)
)
The query is filtering for rows where list.user_id is not in a result set that cannot contain any user_id other than list.user_id. Of course, the sub-query could return zero rows. So basically it boils down to a simple existence check.
So for a start, you should rather write:
...(
NOT EXISTS (
SELECT *
FROM status
WHERE user_id = list.user_id
AND season_id = rg.incorrect_status_id)
)
2) Be clear about your "what joins the tables together" (this refers back to 1 as well).
Your query selects from 3 tables without specifying any join conditions:
FROM rg, list, status
This results in a cross join, producing a result set that contains every possible combination of rows from the three tables. If your WHERE clause were simple, the query engine might be able to implicitly promote certain filter conditions into join conditions, but that's not the case here. So even if, for example, you have a very small number of rows in each table:
status 20
rg 100
list 1000
Your intermediate result set (before WHERE is applied),
would need 1000 * 100 * 20 = 2000000 rows!
It helps tremendously to make it clear with join conditions how the rows of each table are intended to match up. Not only does it make the query easier to read and understand, but it also helps avoid overlooking join conditions which can be the bane of performance considerations.
Note that when specifying join conditions, some rows might not have matches, and this is where knowing and understanding the different types of joins is extremely important. Particularly in your case, most of the complexity in your WHERE clause seems to come from trying to resolve when rows do or do not match. See this answer for some useful information.
Your FROM/WHERE clause should probably look more like the following. (Difficult to be certain because you haven't stated your table relationships or expected input/output of your query. But it should set you on the right track.)
FROM rg
/* Assumes rg rows form the base of the query, and not to have
some rg rows excluded due to non-matches in list or status. */
LEFT OUTER JOIN status ON
status.season_id = rg.required_status_id
LEFT OUTER JOIN list ON
status.user_id = list.user_id
WHERE rg.incorrect_status_id IS NULL
/* As Barmar commented, it may also be useful to break this
OR condition out as a separate query UNION to the above. */
OR (
rg.incorrect_status_id IS NOT NULL
AND NOT EXISTS (
SELECT *
FROM status
WHERE user_id = list.user_id
AND season_id = rg.incorrect_status_id)
)
Note that this query is very clear about the distinction between how the tables are joined, and what is used to filter the joined result set.
3) Finally and very importantly, even the best queries are of little benefit without the correct indexes!
A good query with bad indexes (or conversely a bad query with good indexes) is going to be inefficient either way. Computers are fast enough that you might not notice on small databases, but you should experiment with candidate indexes to find the best combination for your data and workload.
In the above query you likely need indexes on the following. (Some may already be covered by Primary Key constraints.)
status.season_id
status.user_id
list.user_id
rg.required_status_id
rg.incorrect_status_id
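As a starting point, those could be declared along these lines. The index names are invented, and the composite index on status is my own variation on the list above (it serves the NOT EXISTS lookup on user_id plus season_id with a single index); treat it as something to test rather than a given.

```sql
-- Hypothetical index names. Skip any that duplicate an existing key.
CREATE INDEX idx_status_user_season ON status (user_id, season_id);
CREATE INDEX idx_status_season      ON status (season_id);
CREATE INDEX idx_list_user          ON list (user_id);
CREATE INDEX idx_rg_required        ON rg (required_status_id);
CREATE INDEX idx_rg_incorrect       ON rg (incorrect_status_id);
```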
Use a UNION of subqueries that handle the 3 cases that you're combining with OR. You can then use explicit JOIN in each subquery to make it clear how the tables are related to each other (or not related at all when you're doing a full cross-product, as is the case when rg.required_status_id IS NULL AND rg.incorrect_status_id IS NULL).
SELECT rg.*, list.*, status.*
FROM rg
CROSS JOIN list
CROSS JOIN status
WHERE rg.required_status_id IS NULL AND rg.incorrect_status_id IS NULL
UNION ALL
SELECT rg.*, list.*, status.*
FROM rg
JOIN status ON rg.required_status_id = status.season_id
JOIN list ON status.user_id = list.user_id
UNION ALL
SELECT rg.*, list.*, status.*
FROM rg
CROSS JOIN list
LEFT JOIN status ON status.user_id = list.user_id AND status.season_id = rg.incorrect_status_id
WHERE rg.incorrect_status_id IS NOT NULL AND status.season_id IS NULL
I have a query like the one below. I have a compound index on CC.key1, CC.key2.
I am executing this in a big database.
Select * from CC where
( (
(select count(*) from Service s
where CC.key1=s.sr2 and CC.key2=s.sr1) > 2
AND
CC.key3='new'
)
OR
(
(select count(*) from Service s
where CC.key1=s.sr2 and CC.key2=s.sr1) <= 2
)
)
limit 10000;
I tried to rewrite it as an inner join, but it got slower. How can I optimize this query?
The trick here is being able to articulate a query for the problem:
SELECT *
FROM CC t1
INNER JOIN
(
SELECT cc.key1, cc.key2
FROM CC cc
LEFT JOIN Service s
ON cc.key1 = s.sr2 AND
cc.key2 = s.sr1
GROUP BY cc.key1, cc.key2
HAVING COUNT(*) <= 2 OR
SUM(CASE WHEN cc.key3 = 'new' THEN 1 ELSE 0 END) > 2
) t2
ON t1.key1 = t2.key1 AND
t1.key2 = t2.key2
Explanation:
Your original two subqueries would only add to the count if a given record in CC, with a given key1 and key2 value, matched to a corresponding record in the Service table. The strategy behind my inner query is to use GROUP BY to count the number of times that this happens, and use this instead of your subqueries. The first count condition is your bottom subquery, and the second one is the top.
The inner query finds all key1, key2 pairs in CC corresponding to records which should be retained. And recognize that these two columns are the only criteria in your original query for determining whether a record from CC gets retained. Then, this inner query can be inner joined to CC again to get your final result set.
In terms of performance, even this answer could leave something to be desired, but it should be better than a massive correlated subquery, which is what you had.
Basically get the Columns that must not have a duplicate then join them together. Example:
select *
FROM Table_X A
WHERE exists (SELECT 1
FROM Table_X B
WHERE 1=1
and a.SHOULD_BE_UNIQUE = b.SHOULD_BE_UNIQUE
and a.SHOULD_BE_UNIQUE2 = b.SHOULD_BE_UNIQUE2
/* excluded because these columns are null or can be Duplicated*/
--and a.GENERIC_COLUMN = b.GENERIC_COLUMN
--and a.GENERIC_COLUMN2 = b.GENERIC_COLUMN2
--and a.NULL_COLUMN = b.NULL_COLUMN
--and a.NULL_COLUMN2 = b.NULL_COLUMN2
and b.rowid > a.ROWID);
Here SHOULD_BE_UNIQUE and SHOULD_BE_UNIQUE2 are the columns whose combination should not be repeated. The GENERIC_COLUMN and NULL_COLUMN columns may be null or legitimately duplicated, so just leave them out of the query.
I have been using this approach when we have issues with duplicate records.
With the limited information you've given us, this could be a rewrite using 'simplified' logic:
SELECT *
FROM CC NATURAL JOIN
( SELECT sr2 AS key1, sr1 AS key2, COUNT(*) AS tally
FROM Service
GROUP
BY sr2, sr1 ) AS t
WHERE key3 = 'new' OR tally <= 2;
Not sure whether it will perform better but might give you some ideas of what to try next?
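One more variation to try, offered as a sketch rather than a measured improvement: count the Service rows once per (sr2, sr1) pair in a derived table, then LEFT JOIN it so that CC rows with no matching Service rows (a count of 0) are still kept. The alias names s and tally are arbitrary.

```sql
SELECT cc.*
FROM CC cc
LEFT JOIN (
    -- One row per key pair, counted once instead of once per CC row.
    SELECT sr2, sr1, COUNT(*) AS tally
    FROM Service
    GROUP BY sr2, sr1
) AS s
  ON s.sr2 = cc.key1 AND s.sr1 = cc.key2
-- (count > 2 AND key3 = 'new') OR count <= 2
-- is logically the same as: count <= 2 OR key3 = 'new'
WHERE COALESCE(s.tally, 0) <= 2
   OR cc.key3 = 'new'
LIMIT 10000;
```

An index on Service (sr2, sr1) would likely help the inner GROUP BY, though that is an assumption about the schema.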
In the following query, I show the latest status of the sale (by stage, in this case the number 3). The query is based on a subquery in the status history of the sale:
SELECT v.id_sale,
IFNULL((
SELECT (CASE WHEN IFNULL( vec.description, '' ) = ''
THEN ve.name
ELSE vec.description
END)
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
WHERE veh.id_sale = v.id_sale
AND vec.id_stage = 3
ORDER BY veh.id_record DESC
LIMIT 1
), 'x') sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
WHERE 1 =1
AND v.flag =1
AND v.id_quarters =4
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
The query takes 0.0057 sec and returns 1011 records.
Because I have to filter the sales by the name of the state, which would mean repeating the subquery in a WHERE clause, I decided to rewrite the same query using joins. In this case, I'm using the MAX function to obtain the latest status:
SELECT
v.id_sale,
IFNULL(veh3.State3,'x') AS sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
LEFT JOIN (
SELECT veh.id_sale,
(CASE WHEN IFNULL(vec.description,'') = ''
THEN ve.name
ELSE vec.description END) AS State3
FROM t_record veh
INNER JOIN (
SELECT id_sale, MAX(id_record) AS max_rating
FROM(
SELECT veh.id_sale, id_record
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign AND vec.id_stage = 3
) m
GROUP BY id_sale
) x ON x.max_rating = veh.id_record
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
) veh3 ON veh3.id_sale = v.id_sale
WHERE v.flag = 1
AND v.id_quarters = 4
This query returns the same results (1011 records), but the problem is that it takes 0.0753 sec.
Reviewing the possibilities I have found the factor that makes the difference in the speed of the query:
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
If I remove this clause, both queries take the same time. Why does it work better with it? Is there any way to use this clause with the joins? I hope you can help.
EDIT
I will show the results of EXPLAIN for each query respectively:
q1:
q2:
Interesting, so that little statement basically determines if there is a match between t_record.id_sale and t_sale.id_sale.
Why is this making your query run faster? Because WHERE conditions are applied before the subselects in the SELECT list are evaluated, so if there is no record to go with the sale, the subselect is never processed for that row. That saves you some time, and that's why it works better.
Is it going to work in your join syntax? I don't really know without having your tables to test against but you can always just apply it to the end and find out. Add the keyword EXPLAIN to the beginning of your query and you will get a plan of execution which will help you optimize things. Probably the best way to get better results in your join syntax is to add some indexes to your tables.
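For instance, prefixing a cut-down version of the first query with EXPLAIN looks like this (a sketch that keeps only the filter and the EXISTS check from the question; the alias r is added here for clarity):

```sql
EXPLAIN
SELECT v.id_sale
FROM t_sale v
WHERE v.flag = 1
  AND v.id_quarters = 4
  AND EXISTS (
        SELECT 1
        FROM t_record r
        WHERE r.id_sale = v.id_sale
      );
```

In the output, the key column shows which index (if any) is used for each table, and the rows column gives the optimizer's estimate of how many rows it expects to examine.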
But I ask you, is this even necessary? You have a query returning in under 8 hundredths of a second. Unless this query is being run thousands of times an hour, it is not really taxing your DB at all, and your time is probably better spent making improvements elsewhere in your application.