I am currently running into an issue where, when I use a "LIKE" in my query I get the result in 2 seconds. But when I use the '=' instead, it takes around 1 minute for the result to show up.
The following is my query:
QUERY1
The following query takes 2 seconds:
select distinct p.Name from Timeset s
join table1 f on (f.id = s.id)
join table2 p on (p.source=f.table_name)
join table3 d on (d.Name = p.Name) WHERE
s.Active = 'Y' AND p.sourcefrom like '%sometable%'
QUERY2
The same query replacing the 'like' by '=' takes 1 minute:
select distinct p.Name from Timeset s
join table1 f on (f.id = s.id)
join table2 p on (p.source=f.table_name)
join table3 d on (d.Name = p.Name) WHERE
s.Active = 'Y' AND p.sourcefrom = 'sometable'
I am really puzzled because I know that 'LIKE' is usually slower than '=', since MySQL needs to look for different possibilities. But I am not sure why, in my case, '=' is slower by such a substantial margin.
thank you kindly for the help in advance,
regards,
When you use = MySQL is probably using a different index than when you use LIKE. Compare the EXPLAIN output of the two queries and see what the difference is. Then you can force the use of the better-performing index with FORCE INDEX. It might also be worth running ANALYZE TABLE on each of the tables involved.
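A minimal sketch of comparing two execution plans side by side. It uses Python's built-in sqlite3 module as a stand-in, so the planner and the plan format differ from MySQL (where you would prefix each query with EXPLAIN instead of EXPLAIN QUERY PLAN). The table layout and the index name idx_sourcefrom are assumptions made for illustration, not taken from the question.

```python
import sqlite3

# In-memory SQLite stand-in; column layout and index name are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table2 (Name TEXT, source TEXT, sourcefrom TEXT)")
conn.execute("CREATE INDEX idx_sourcefrom ON table2 (sourcefrom)")


def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN, one string per step."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


like_plan = plan("SELECT Name FROM table2 WHERE sourcefrom LIKE '%sometable%'")
eq_plan = plan("SELECT Name FROM table2 WHERE sourcefrom = 'sometable'")

# A leading-wildcard LIKE cannot seek into the index; '=' can.
print("LIKE:", like_plan)
print("=   :", eq_plan)
```

In this toy case the '=' query is the one that uses the index. The point is only the workflow: capture both plans, diff them, and look at which index (if any) each step picked.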
I have the below query, which I appreciate probably isn't well written, but on my local PC with Xampp and MariaDB it executes in 0.1719 seconds, which is about the speed I would hope for.
However, on my development server with Plesk and MariaDB the same query with the same data takes over 12 seconds. Obviously that would be no use.
Probably the query could be modified to make it better, but can somebody explain why the performance difference? The server is a VPS, it has no shortage of resources - it isn't live so usage is almost none at all, yet still 12+ seconds for this query.
The query:
SELECT m.id AS match_id, e.event AS event1
FROM matches m
JOIN competitions co ON co.id = m.competition
JOIN clubs h ON h.id = m.hometeam
JOIN clubs a ON a.id = m.awayteam
LEFT JOIN match_events e ON e.match = m.id
AND e.player = '7138'
WHERE (m.hometeam = '1'
OR m.awayteam = '1'
)
AND m.season = '121'
Are you sure you need AND e.player = '7138' in the ON clause of a LEFT JOIN and not in the WHERE clause?
Better indexing
Recommend these composite, covering, indexes:
m: (season, awayteam, hometeam, competition, id)
e: (player, match, event)
Avoiding OR
OR optimizes poorly. A common trick is to turn it into UNION. Such may work for your query:
SELECT ...
FROM matches JOIN ...
WHERE m.season = 121
AND m.hometeam = 1
UNION ALL
SELECT ...
FROM matches JOIN ...
WHERE m.season = 121
AND m.awayteam = 1
And have these two indexes:
INDEX(season, hometeam) -- will be used by one part of the UNION
INDEX(season, awayteam) -- will be used by the other
I chose UNION ALL because it is faster than UNION DISTINCT. But if you get unwanted dups, change it.
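A small sketch checking that the UNION ALL rewrite returns the same matches as the OR version. It runs against an in-memory SQLite database through Python's sqlite3 module rather than MariaDB, and the trimmed-down matches table, its sample rows, and the index names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches (id INTEGER PRIMARY KEY, season INT, hometeam INT, awayteam INT)"
)
# Hypothetical sample data: team 1 plays at home once and away once in season 121.
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?, ?)",
    [(1, 121, 1, 2), (2, 121, 3, 1), (3, 121, 2, 3), (4, 120, 1, 3)],
)
conn.execute("CREATE INDEX idx_season_home ON matches (season, hometeam)")
conn.execute("CREATE INDEX idx_season_away ON matches (season, awayteam)")

or_rows = conn.execute(
    "SELECT id FROM matches "
    "WHERE season = 121 AND (hometeam = 1 OR awayteam = 1) ORDER BY id"
).fetchall()

union_rows = conn.execute(
    """
    SELECT id FROM (
        SELECT id FROM matches WHERE season = 121 AND hometeam = 1
        UNION ALL
        SELECT id FROM matches WHERE season = 121 AND awayteam = 1
    ) AS u
    ORDER BY id
    """
).fetchall()

print(or_rows, union_rows)
```

A row would only appear twice in the UNION ALL result if it satisfied both branches, that is, if hometeam and awayteam were both 1.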
I am converting old software (which uses an MS Access MDB) to MySQL.
I have a query that takes a long time to run (actually, I cancel it after 5 minutes of waiting).
How can I write it?
SELECT pa_ID, pa_PRODUCT_ID, pr_ID,pr_NAME,Sum(pa_KILOS) as IN_KILOS,
(select sum(pl_KILOS) from POLHSH where POLHSH.pl_PRODUCT_ID = pa_PRODUCT_ID and POLHSH.pl_PARALABH_ID = pa_ID) as OUT_KILOS From PARALABH, PRODUCTS WHERE pa_company_id=1 GROUP BY pa_ID, pa_PRODUCT_ID,pr_ID, pr_NAME HAVING pa_ID=241 and pr_id=pa_PRODUCT_ID
Thanks in advance
Consider avoiding the correlated subquery which runs a SUM separately for each row and use a join of two aggregate queries each of which runs SUM once by grouping fields. Additionally, use explicit joins, the current SQL standard in joining tables/views.
Please adjust column aliases and names to actuals as assumptions were made below.
SELECT t1.*, t2.OUT_KILOS
FROM
(SELECT pa.pa_ID,
pa.pa_PRODUCT_ID,
pr.pr_ID,
pr.pr_NAME,
SUM(pa.pa_KILOS) AS IN_KILOS
FROM PARALABH pa
INNER JOIN PRODUCTS pr
ON pr.pr_id = pa.pa_PRODUCT_ID
WHERE pa.pa_company_id = 1
GROUP BY pa.pa_ID,
pa.pa_PRODUCT_ID,
pr.pr_ID,
pr.pr_NAME
HAVING pa.pa_ID = 241
) AS t1
INNER JOIN
(SELECT POLHSH.pl_PRODUCT_ID,
POLHSH.pl_PARALABH_ID,
SUM(pl_KILOS) As OUT_KILOS
FROM POLHSH
GROUP BY POLHSH.pl_PRODUCT_ID,
POLHSH.pl_PARALABH_ID
) AS t2
ON t2.pl_PRODUCT_ID = t1.pa_PRODUCT_ID
AND t2.pl_PARALABH_ID = t1.pa_ID
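A sketch comparing the correlated-subquery form with the join of two aggregate queries, run on an in-memory SQLite database through Python's sqlite3 module rather than MySQL. The tables are cut down to the columns needed for the sums, and the sample rows are invented. One thing it makes visible: with INNER JOIN, a PARALABH group that has no POLHSH rows disappears, whereas the correlated subquery returns it with a NULL OUT_KILOS; a LEFT JOIN keeps that row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PARALABH (pa_ID INT, pa_PRODUCT_ID INT, pa_KILOS REAL);
CREATE TABLE POLHSH (pl_PARALABH_ID INT, pl_PRODUCT_ID INT, pl_KILOS REAL);
-- Hypothetical data: receipt 242 has no outgoing rows in POLHSH.
INSERT INTO PARALABH VALUES (241, 10, 100), (241, 10, 50), (242, 11, 70);
INSERT INTO POLHSH VALUES (241, 10, 30), (241, 10, 20);
""")

# Original style: the inner SUM is evaluated once per output row.
correlated = conn.execute("""
    SELECT pa_ID, pa_PRODUCT_ID, SUM(pa_KILOS) AS IN_KILOS,
           (SELECT SUM(pl_KILOS) FROM POLHSH
             WHERE pl_PRODUCT_ID = pa_PRODUCT_ID
               AND pl_PARALABH_ID = pa_ID) AS OUT_KILOS
    FROM PARALABH
    GROUP BY pa_ID, pa_PRODUCT_ID
    ORDER BY pa_ID
""").fetchall()


def pre_aggregated(join_type):
    # join_type is "INNER" or "LEFT"; each derived table runs SUM once per group.
    return conn.execute(f"""
        SELECT t1.pa_ID, t1.pa_PRODUCT_ID, t1.IN_KILOS, t2.OUT_KILOS
        FROM (SELECT pa_ID, pa_PRODUCT_ID, SUM(pa_KILOS) AS IN_KILOS
                FROM PARALABH GROUP BY pa_ID, pa_PRODUCT_ID) AS t1
        {join_type} JOIN
             (SELECT pl_PARALABH_ID, pl_PRODUCT_ID, SUM(pl_KILOS) AS OUT_KILOS
                FROM POLHSH GROUP BY pl_PARALABH_ID, pl_PRODUCT_ID) AS t2
          ON t2.pl_PRODUCT_ID = t1.pa_PRODUCT_ID
         AND t2.pl_PARALABH_ID = t1.pa_ID
        ORDER BY t1.pa_ID
    """).fetchall()


inner_rows = pre_aggregated("INNER")
left_rows = pre_aggregated("LEFT")
print(correlated, inner_rows, left_rows, sep="\n")
```

Since the question filters on a single pa_ID, either join type may be acceptable there; the choice only matters when a receipt can have no matching POLHSH rows.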
I am always both amused and confused whenever I am asked to prepare and run a join query on the SQL console.
Most of the confusion comes from not knowing whether the order of the operands in the join predicate has any effect on the join result.
Example.
SELECT "zones"."name", "ip_addresses".*
FROM "ip_addresses"
INNER JOIN "zones" ON "zones"."id" = "ip_addresses"."zone_id"
WHERE "ip_addresses"."resporg_accnt_id" = 1
AND "zones"."name" = 'us-central1'
LIMIT 1;
Given this SQL query, the join predicate looks like this:
... INNER JOIN "zones" ON "zones"."id" = "ip_addresses"."zone_id" WHERE "ip_addresses"."resporg_accnt_id"
Would it make any difference to the performance of the join, or to the correctness of the result, if I changed the predicate to look like this?
... INNER JOIN "zones" ON "ip_addresses"."zone_id" = "zones"."id" WHERE "ip_addresses"."resporg_accnt_id"
The predicate order won't make a performance difference in your case, which is a simple equality condition, but personally I like to place the columns from the table I'm JOINing to on the left-hand side (LHS) of each ON condition:
SELECT ...
FROM ip_addresses ia
JOIN zones z
ON z.id = ia.zone_id
WHERE ...
The optimiser can use any index available on these columns during the JOIN and I find it easier to visualise this way.
Any additional conditions also tend to be on columns of the table being JOINed to and I find again this reads better when this table is consistently on the LHS
Not quite the same, but I did see a case where performance was affected by the choice of column to isolate
I think the JOIN looked something like
SELECT ...
FROM table_a a
JOIN table_b b
ON a.id = b.id - 1
Changing this to
SELECT ...
FROM table_a a
JOIN table_b b
ON b.id = a.id + 1
allowed the optimiser to use an index on b.id, but presumably at the cost of an index on a.id
I suspect this kind of query might need analysing on a case by case basis
Furthermore, I would probably switch your table order round too and write your original query:
SELECT z.name,
ia.*
FROM zones z
JOIN ip_addresses ia
ON ia.zone_id = z.id
AND ia.resporg_accnt_id = 1
WHERE z.name = 'us-central1'
LIMIT 1
Conceptually, you are saying "Start with the 'us-central1' zone and fetch me all the ip_addresses associated with a resporg_accnt_id of 1"
Check the EXPLAIN plans if you want to verify that there is no difference in your case
I have the following query:
SELECT PKID, QuestionText, Type
FROM Questions
WHERE PKID IN (
SELECT FirstQuestion
FROM Batch
WHERE BatchNumber IN (
SELECT BatchNumber
FROM User
WHERE RandomString = '$key'
)
)
I've heard that sub-queries are inefficient and that joins are preferred. I can't find anything explaining how to convert a 3+ tier sub-query to join notation, however, and can't get my head around it.
Can anyone explain how to do it?
SELECT DISTINCT a.*
FROM Questions a
INNER JOIN Batch b
ON a.PKID = b.FirstQuestion
INNER JOIN User c
ON b.BatchNumber = c.BatchNumber
WHERE c.RandomString = '$key'
DISTINCT is specified because a row in Questions may match multiple rows in the other tables, which would produce duplicate records in the result. Since you are only interested in records from the Questions table, the DISTINCT keyword is enough to remove them.
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins
Try :
SELECT q.PKID, q.QuestionText, q.Type
FROM Questions q
INNER JOIN Batch b ON q.PKID = b.FirstQuestion
INNER JOIN User u ON u.BatchNumber = b.BatchNumber
WHERE u.RandomString = '$key'
select
q.pkid,
q.questiontext,
q.type
from user u
join batch b
on u.batchnumber = b.batchnumber
join questions q
on b.firstquestion = q.pkid
where u.randomstring = '$key'
Since your WHERE clause filters on the USER table, start with that in the FROM clause. Next, apply your joins backwards.
In order to do this correctly, you need distinct in the subquery. Otherwise, you might multiply rows in the join version:
SELECT q.PKID, q.QuestionText, q.Type
FROM Questions q join
(select distinct FirstQuestion
from Batch b join user u
on b.batchnumber = u.batchnumber and
u.RandomString = '$key'
) fq
on q.pkid = fq.FirstQuestion
As to whether the in or join version is better . . . that depends. In some cases, particularly if the fields are indexed, the in version might be fine.
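A sketch of the row-multiplication issue discussed in these answers, using Python's sqlite3 module with an in-memory database instead of MySQL. The sample rows are invented so that two batches (and two users) point at the same first question. The key is bound as a parameter here instead of being interpolated as '$key'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Questions (PKID INT, QuestionText TEXT, Type TEXT);
CREATE TABLE Batch (BatchNumber INT, FirstQuestion INT);
CREATE TABLE "User" (BatchNumber INT, RandomString TEXT);
-- Hypothetical data: batches 100 and 101 both start with question 1.
INSERT INTO Questions VALUES (1, 'First?', 'text'), (2, 'Second?', 'text');
INSERT INTO Batch VALUES (100, 1), (101, 1);
INSERT INTO "User" VALUES (100, 'abc'), (101, 'abc');
""")
key = ("abc",)

# Nested IN: each question appears at most once.
in_rows = conn.execute("""
    SELECT PKID FROM Questions
    WHERE PKID IN (SELECT FirstQuestion FROM Batch
                   WHERE BatchNumber IN (SELECT BatchNumber FROM "User"
                                         WHERE RandomString = ?))
""", key).fetchall()

join_sql = """
    SELECT {distinct} q.PKID
    FROM Questions q
    JOIN Batch b ON q.PKID = b.FirstQuestion
    JOIN "User" u ON u.BatchNumber = b.BatchNumber
    WHERE u.RandomString = ?
"""
# Plain join: one output row per matching (Batch, User) pair.
plain_join_rows = conn.execute(join_sql.format(distinct=""), key).fetchall()
# DISTINCT collapses the duplicates again.
distinct_join_rows = conn.execute(join_sql.format(distinct="DISTINCT"), key).fetchall()

print(in_rows, plain_join_rows, distinct_join_rows)
```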
In the following query, I show the latest status of the sale (by stage, in this case the number 3). The query is based on a subquery in the status history of the sale:
SELECT v.id_sale,
IFNULL((
SELECT (CASE WHEN IFNULL( vec.description, '' ) = ''
THEN ve.name
ELSE vec.description
END)
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
WHERE veh.id_sale = v.id_sale
AND vec.id_stage = 3
ORDER BY veh.id_record DESC
LIMIT 1
), 'x') sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
WHERE 1 =1
AND v.flag =1
AND v.id_quarters =4
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
This query takes 0.0057 sec and returns 1011 records.
Because I have to filter the sales by the name of the state, and would otherwise have to repeat the subquery in the WHERE clause, I decided to rewrite the query using joins. In this case, I'm using the MAX function to obtain the latest status:
SELECT
v.id_sale,
IFNULL(veh3.State3,'x') AS sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
LEFT JOIN (
SELECT veh.id_sale,
(CASE WHEN IFNULL(vec.description,'') = ''
THEN ve.name
ELSE vec.description END) AS State3
FROM t_record veh
INNER JOIN (
SELECT id_sale, MAX(id_record) AS max_rating
FROM(
SELECT veh.id_sale, id_record
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign AND vec.id_stage = 3
) m
GROUP BY id_sale
) x ON x.max_rating = veh.id_record
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
) veh3 ON veh3.id_sale = v.id_sale
WHERE v.flag = 1
AND v.id_quarters = 4
This query shows the same results (1011). But the problem is it takes 0.0753 sec
Reviewing the possibilities I have found the factor that makes the difference in the speed of the query:
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
If I remove this clause, both queries take the same time. Why does it work better? Is there any way to use this clause in the join version? I would appreciate your help.
EDIT
I will show the results of EXPLAIN for each query respectively:
q1:
q2:
Interesting, so that little statement basically determines if there is a match between t_record.id_sale and t_sale.id_sale.
Why is this making your query run faster? Because WHERE conditions are applied before the subselects in the SELECT list are evaluated, so if there is no record to go with the sale, the database doesn't bother processing the subselect. That saves you some time, which is why it works better.
Is it going to work in your join syntax? I don't really know without having your tables to test against but you can always just apply it to the end and find out. Add the keyword EXPLAIN to the beginning of your query and you will get a plan of execution which will help you optimize things. Probably the best way to get better results in your join syntax is to add some indexes to your tables.
But I ask you, is this even necessary? You have a query returning in under 8 hundredths of a second. Unless this query is being run thousands of times an hour, it is not really taxing your DB at all, and your time is probably better spent making improvements elsewhere in your application.