Improving Existing MySQL LEFT JOIN query

My query runs a bit slow (slower than I would like). I know the general basics of MySQL but not enough to optimize queries. My query is as follows:
SELECT a.index, a.date, b.description, a.type, a.place, a.value, b.space, a.val2
FROM a
LEFT JOIN b ON a.index = b.index
WHERE a.date >= ? AND a.date < ?
  AND a.type = 100;
I have indexes on a's index and type columns, as well as on b's index column. I know this query could be faster, so I was wondering if anyone knows of a proper optimization.
Thanks
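For reference, a composite index is usually the first thing to try for this shape of query: it filters on type with equality and on date with a range, so a single index covering (type, date) lets MySQL seek directly to the matching rows. A sketch, assuming the table and column names from the query:

```sql
-- Equality column (type) first, range column (date) second,
-- so one index seek can satisfy both predicates.
CREATE INDEX idx_a_type_date ON a (type, date);

-- The join side only needs b.index indexed, which is already in place.
```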

Related

Optimized sql query is slower than not optimized one? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 4 years ago.
A fellow programmer showed me a query he created which looked like this:
SELECT a.row, b.row, c.row
FROM a
LEFT JOIN b ON (a.id = b.id)
LEFT JOIN c ON (c.otherid = b.otherid)
WHERE a.id NOT IN (SELECT DISTINCT b.id bb
                   INNER JOIN c cc ON (bb.a_id = cc.a_id)
                   WHERE (bb.date BETWEEN '2018-08-04 00:00:00' AND '2018-08-06 23:59:59'))
GROUP BY a.id
ORDER BY c.otherid DESC;
So I shortened it by removing the second query and applying the WHERE clause directly:
SELECT a.row, b.row, c.row
FROM a
LEFT JOIN b ON (a.id = b.id)
LEFT JOIN c ON (c.otherid = b.otherid)
WHERE b.date NOT BETWEEN '2018-08-04 00:00:00' AND '2018-08-06 23:59:59'
GROUP BY a.id
ORDER BY c.otherid DESC;
Up to this point everything seems fine, and both queries return the same result set. The problem is that the second query takes three times longer to execute than the first one. How is that possible?
Thanks
The queries are significantly different. (We're assuming that the missing FROM keyword in the subquery of the first version is an artifact of pasting it into the question, and that the original query doesn't have that syntax error. Also, the reference to b.id in the SELECT list of the subquery is highly suspicious; we suspect that's really meant to be a reference to bb.id, but we're just guessing.)
If the two queries return exactly the same result set, that's a coincidence of the data. (We could construct data sets where the results of the two queries would be different.)
"Shortening" a query does not necessarily optimize it.
What really matters (in terms of performance) is the execution plan. That is, what operations are being performed, in what order, and with large tables, which indexes are available and being used.
Without table and index definitions, it's not possible to give a definitive diagnosis.
Suggestion: Use MySQL EXPLAIN to view the execution plan of each query.
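As a minimal sketch of that suggestion (using the second query from the question), prefix the statement with EXPLAIN to see the join order, chosen indexes, and estimated row counts:

```sql
-- EXPLAIN reports the plan without fetching result rows.
EXPLAIN
SELECT a.row, b.row, c.row
FROM a
LEFT JOIN b ON (a.id = b.id)
LEFT JOIN c ON (c.otherid = b.otherid)
WHERE b.date NOT BETWEEN '2018-08-04 00:00:00' AND '2018-08-06 23:59:59'
GROUP BY a.id
ORDER BY c.otherid DESC;

-- On MySQL 8.0+, EXPLAIN ANALYZE additionally executes the query
-- and reports measured per-step timings.
```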
Assuming that the original query has a WHERE clause of the form:
WHERE a.id NOT IN ( SELECT DISTINCT bb.id
                    FROM b bb
                    JOIN c cc ON bb.a_id = cc.a_id
                    WHERE bb.date BETWEEN '2018-08-04 00:00:00'
                                      AND '2018-08-06 23:59:59'
                      AND bb.id IS NOT NULL
                  )
(assuming that we have a guarantee that a value returned by the subquery will never be NULL...)
That could be re-written as a NOT EXISTS correlated subquery to achieve an equivalent result:
WHERE NOT EXISTS ( SELECT 1
                   FROM b bb
                   JOIN c cc ON cc.a_id = bb.a_id
                   WHERE bb.date >= '2018-08-04 00:00:00'
                     AND bb.date <  '2018-08-07 00:00:00'
                     AND bb.id = a.id
                 )
Or it could be re-written as an anti-join:
LEFT JOIN b bb
       ON bb.id = a.id
      AND bb.date >= '2018-08-04 00:00:00'
      AND bb.date <  '2018-08-07 00:00:00'
LEFT JOIN c cc
       ON cc.a_id = bb.a_id
WHERE cc.a_id IS NULL
With large sets, appropriate indexes would need to be available for optimal performance.
The re-write presented in the question is not guaranteed to return an equivalent result.
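As a sketch of the "appropriate indexes" point, and assuming the column names used in the rewrites above (the real schema may differ), indexes supporting the NOT EXISTS / anti-join forms might look like:

```sql
-- Lets the subquery seek on bb.id = a.id and then range-scan bb.date.
CREATE INDEX idx_b_id_date ON b (id, date);

-- Lets the join to c probe by a_id instead of scanning.
CREATE INDEX idx_c_a_id ON c (a_id);
```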

Optimizing a MySQL NOT IN() query

I am trying to optimize this MySQL query. I want to get a count of the number of customers that do not have an appointment prior to the current appointment being looked at. In other words, if they have a prior appointment (which is what the NOT IN() subquery is checking for), then exclude them.
However, this query is absolutely killing performance. I know that MySQL is not very good with NOT IN() queries, but I am not sure of the best way to go about optimizing this one. It takes anywhere from 15 to 30 seconds to run. I have created indexes on CustNo, AptStatus, and AptNum.
SELECT
    COUNT(*) AS NumOfCustomersWithPriorAppointment
FROM
    transaction_log AS tl
LEFT JOIN
    appointment AS a ON a.AptNum = tl.AptNum
INNER JOIN
    customer AS c ON c.CustNo = tl.CustNo
WHERE
    a.AptStatus IN (2)
    AND a.CustNo NOT IN
    (
        SELECT a2.CustNo
        FROM appointment a2
        WHERE a2.AptDateTime < a.AptDateTime
    )
    AND a.AptDateTime > BEGIN_QUERY_DATE
    AND a.AptDateTime < END_QUERY_DATE
Thank you in advance.
Try the following:
SELECT
    COUNT(*) AS NumOfCustomersWithPriorAppointment
FROM
    transaction_log AS tl
INNER JOIN
    appointment AS a ON a.AptNum = tl.AptNum
LEFT OUTER JOIN
    appointment AS earlier_a
        ON earlier_a.CustNo = a.CustNo
        AND earlier_a.AptDateTime < a.AptDateTime
INNER JOIN
    customer AS c ON c.CustNo = tl.CustNo
WHERE
    a.AptStatus IN (2)
    AND earlier_a.AptNum IS NULL
    AND a.AptDateTime > BEGIN_QUERY_DATE
    AND a.AptDateTime < END_QUERY_DATE
This will benefit from a composite index on (CustNo,AptDateTime). Make it unique if that fits your business model (logically it seems like it should, but practically it may not, depending on how you handle conflicts in your application.)
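A sketch of that composite index on the appointment table (the index name is illustrative; use the UNIQUE variant only if duplicates really cannot occur):

```sql
-- Supports the earlier_a lookup: seek on CustNo, then range on AptDateTime.
CREATE INDEX idx_appointment_cust_dt ON appointment (CustNo, AptDateTime);

-- If at most one appointment per customer per instant is guaranteed:
-- CREATE UNIQUE INDEX idx_appointment_cust_dt ON appointment (CustNo, AptDateTime);
```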
Provide SHOW CREATE TABLE statements for all tables if this does not create a sufficient performance improvement.

How to speed up MySQL query for Joomla 1.5 mod_mostcomment

I have one query that is loading the db so much that my hosting provider complains:
SELECT count(a.id),
       a.*,
       CASE
           WHEN CHAR_LENGTH(a.alias) THEN CONCAT_WS(":", a.id, a.alias)
           ELSE a.id
       END AS slug,
       CASE
           WHEN CHAR_LENGTH(cc.alias) THEN CONCAT_WS(":", cc.id, cc.alias)
           ELSE cc.id
       END AS catslug
FROM jos_chrono_comments AS com,
     jos_content AS a
LEFT JOIN jos_content_frontpage AS f ON f.content_id = a.id
INNER JOIN jos_content AS c ON f.content_id = c.id
INNER JOIN jos_categories AS cc ON cc.id = a.catid
INNER JOIN jos_sections AS s ON s.id = a.sectionid
WHERE (a.state = 1 AND s.id > 0)
  AND s.published = 1
  AND cc.published = 1
  AND a.id = com.pageid
  AND DATE_SUB(CURDATE(), INTERVAL 30 DAY) <= c.publish_up
GROUP BY (com.pageid)
ORDER BY 1 DESC
LIMIT 0, 10
It's Joomla 1.5 related: the Chronocomments module for "most commented in 30 days".
I have some hints here https://goo.gl/0wF2ex but I am not good enough to rewrite the query in a better way without using a temp table.
Looking for help to make that query less heavy for the MySQL server; maybe eliminating the GROUP BY, or any other hint, would be useful.
Thanks,
K#m0
There are multiple possible answers, this is focusing on the database with no change to the PHP code.
Explain query
With the MySQL EXPLAIN command, which you can run e.g. from phpMyAdmin, you can get optimization hints from MySQL itself, mostly about adding indexes where required.
Possible indexes
As a wild guess, I would make sure that all the following fields are indexed:
jos_content.catid
jos_content.sectionid
jos_content.state
jos_content_frontpage.content_id (if you have many items in frontpage)
jos_categories.published (if you have many categories)
jos_sections.published (if you have many sections)
jos_chrono_comments.pageid
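As a sketch, the indexes listed above could be created like this (the index names are illustrative; check SHOW INDEX FROM each table first and skip any that Joomla already created):

```sql
CREATE INDEX idx_content_catid     ON jos_content (catid);
CREATE INDEX idx_content_sectionid ON jos_content (sectionid);
CREATE INDEX idx_content_state     ON jos_content (state);
CREATE INDEX idx_frontpage_content ON jos_content_frontpage (content_id);
CREATE INDEX idx_categories_pub    ON jos_categories (published);
CREATE INDEX idx_sections_pub      ON jos_sections (published);
CREATE INDEX idx_comments_pageid   ON jos_chrono_comments (pageid);
```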
Final notes
The ORDER BY 1 DESC part seems useless to me.
The increase in performance really depends on the size of your tables, so don't expect miracles. But it's definitely worth a try.

Optimizing this mysql query by rewriting?

I am trying to optimize MySQL to decrease my server load. I have a complex query which runs about 1k times/minute on a quad-core server with 8 GB RAM, and my server is going down.
I have tried many ways to rewrite this query:
SELECT *
FROM (
    SELECT a.id,
           a.url
    FROM surf a
    LEFT JOIN users b
        ON b.id = a.user
    LEFT JOIN surfed c
        ON c.user = 'asdf' AND c.site = a.id
    WHERE a.active = '0'
      AND (b.coins >= a.cpc AND a.cpc >= '2')
      AND (c.site IS NULL AND a.user != 'asdf')
    ORDER BY a.cpc DESC, b.premium DESC
    LIMIT 100
) AS records
ORDER BY RAND()
LIMIT 1
But nothing worked. Can you help me rewrite the above query so that it doesn't waste resources?
Also, this query doesn't have any indexes :( . It would be very helpful to get guidance on creating indexes for it.
The problem is most likely the inner sort.
You should have indexes on surfed(site, user, cpc), surf(active, user, site), and users(id, coins).
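A sketch of those indexes, assuming the table names from the query (verify that each column actually exists on the named table before creating them):

```sql
-- Composite indexes mirroring the suggestion above; names are illustrative.
CREATE INDEX idx_surfed_site_user_cpc  ON surfed (site, user, cpc);
CREATE INDEX idx_surf_active_user_site ON surf (active, user, site);
-- users.id is likely already the primary key; this adds coins for a
-- covering lookup of the b.coins >= a.cpc check.
CREATE INDEX idx_users_id_coins        ON users (id, coins);
```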
You can also perhaps make a minor improvement by switching the users join from an outer join to an inner join. The WHERE clause (b.coins >= a.cpc) is undoing that left outer join anyway, so this won't affect the results.
But I don't think these changes will really help. The problem is the sort of the result set in the inner query. The outer sort by rand() is a minor issue, because you are only doing that on 100 rows.
If you are running this 1,000/minute, you will need to rethink your data structures and develop an approach that has the data you need more readily available.

Mysql query with '=' is much slower than 'LIKE'

I am currently running into an issue where, when I use a "LIKE" in my query I get the result in 2 seconds. But when I use the '=' instead, it takes around 1 minute for the result to show up.
The following is my query:
QUERY1
The following query takes 2 seconds:
select distinct p.Name
from Timeset s
join table1 f on (f.id = s.id)
join table2 p on (p.source = f.table_name)
join table3 d on (d.Name = p.Name)
WHERE s.Active = 'Y'
  AND p.sourcefrom like '%sometable%'
QUERY2
The same query replacing the 'like' by '=' takes 1 minute:
select distinct p.Name
from Timeset s
join table1 f on (f.id = s.id)
join table2 p on (p.source = f.table_name)
join table3 d on (d.Name = p.Name)
WHERE s.Active = 'Y'
  AND p.sourcefrom = 'sometable'
I am really puzzled because I know that LIKE is usually slower than '=', since MySQL needs to consider more possibilities. But I am not sure why, in my case, '=' is slower by such a substantial margin.
Thank you kindly for the help in advance,
regards,
When you use = MySQL is probably using a different index than when you use LIKE. Check the output of the two execution plans and see what the difference is. Then you can FORCE the use of the better-performing index. It might also be worth running ANALYZE TABLE for each of the tables involved.
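A sketch of those diagnostic steps, using the table names from the question (idx_sourcefrom is a hypothetical index name):

```sql
-- 1. Compare the plans; the key column shows which index each query uses.
EXPLAIN SELECT DISTINCT p.Name FROM table2 p WHERE p.sourcefrom = 'sometable';
EXPLAIN SELECT DISTINCT p.Name FROM table2 p WHERE p.sourcefrom LIKE '%sometable%';

-- 2. Refresh index statistics so the optimizer has accurate row estimates.
ANALYZE TABLE Timeset, table1, table2, table3;

-- 3. If the optimizer still picks the slower index, force the better one.
SELECT DISTINCT p.Name
FROM table2 p FORCE INDEX (idx_sourcefrom)
WHERE p.sourcefrom = 'sometable';
```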