What should I index in MySQL? - mysql

I'm looking to speed this query up. I currently have indexes on:
users_scores.appID
app_names.name
SELECT users_scores.username, users_scores.avatar, users_scores.score
FROM users_scores
RIGHT JOIN app_names ON app_names.id = users_scores.appID
WHERE app_names.name = "testapp1"
ORDER BY users_scores.score DESC
LIMIT 0 , 30

Do you have an index on your primary key (users_scores.id, or whatever you've named it)? In MySQL a PRIMARY KEY is always indexed; in fact, it is the index. app_names.id should also be a primary key.
An index on appID is good, but I see you are searching for apps by name. Comparing integers is generally cheaper than comparing strings in a WHERE clause, so it is more efficient to search by app ID, although with an index on app_names.name the difference may be small. Since the app name is known ('testapp1'), you could use an inner query to determine the ID before searching, like this:
WHERE app_names.id = (SELECT id FROM app_names WHERE app_names.name = "testapp1")

You should never use RIGHT JOIN. Any RIGHT JOIN can and should be written as a LEFT JOIN.
Regardless, your WHERE clause automatically turns the query into an INNER JOIN:
SELECT users_scores.username, users_scores.avatar, users_scores.score
FROM app_names
INNER JOIN users_scores
ON users_scores.appID = app_names.id
WHERE app_names.name = 'testapp1'
ORDER BY users_scores.score DESC
LIMIT 30
Since you're not returning any data from app_names, you can get rid of the JOIN entirely by using a subquery:
SELECT username, avatar, score
FROM users_scores
WHERE appID = (SELECT id FROM app_names WHERE name = 'testapp1' LIMIT 1)
ORDER BY score DESC
LIMIT 30
MySQL executes non-correlated subqueries first, so MySQL can use an index on the app_names table for the search, then is able to utilize an index on users_scores for the search and sort.
For optimum read performance, add a multi-column index on users_scores(appID, score) to satisfy both the search and the sort, or go further with a larger covering index: users_scores(appID, score, username, avatar).
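As a rough check that the subquery form returns the same rows as the JOIN form, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL, so it says nothing about MySQL's query plans; the sample rows, the index names (idx_app_name, idx_app_score) and the helper top_scores are made up for illustration.

```python
import sqlite3

def top_scores(app_name, limit=30):
    """Return the top scores for one app via the JOIN form and the subquery form."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    conn.executescript("""
        CREATE TABLE app_names (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE users_scores (
            id INTEGER PRIMARY KEY, appID INTEGER,
            username TEXT, avatar TEXT, score INTEGER);
        CREATE INDEX idx_app_name ON app_names (name);
        -- Composite index: equality column first, then the sort column.
        CREATE INDEX idx_app_score ON users_scores (appID, score);
    """)
    # Hypothetical sample data.
    conn.executemany("INSERT INTO app_names VALUES (?, ?)",
                     [(1, "testapp1"), (2, "testapp2")])
    conn.executemany(
        "INSERT INTO users_scores (appID, username, avatar, score) "
        "VALUES (?, ?, ?, ?)",
        [(1, "ann", "a.png", 50), (1, "bob", "b.png", 90),
         (2, "cy", "c.png", 70), (1, "dee", "d.png", 10)])
    join_rows = conn.execute("""
        SELECT s.username, s.avatar, s.score
        FROM app_names a
        INNER JOIN users_scores s ON s.appID = a.id
        WHERE a.name = ?
        ORDER BY s.score DESC
        LIMIT ?""", (app_name, limit)).fetchall()
    sub_rows = conn.execute("""
        SELECT username, avatar, score
        FROM users_scores
        WHERE appID = (SELECT id FROM app_names WHERE name = ? LIMIT 1)
        ORDER BY score DESC
        LIMIT ?""", (app_name, limit)).fetchall()
    conn.close()
    return join_rows, sub_rows
```

Calling top_scores("testapp1") returns two identical lists, highest score first. On a real MySQL server, EXPLAIN is the way to confirm which index each form actually uses.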

SELECT users_scores.username, users_scores.avatar, users_scores.score
FROM users_scores
RIGHT JOIN app_names -- Probably forces app_names to be first
ON app_names.id = users_scores.appID -- users_scores second; Step 2
WHERE app_names.name = "testapp1" -- app_names first; Step 1
ORDER BY users_scores.score DESC
LIMIT 0 , 30
app_names needs INDEX(name)
users_scores needs INDEX(appID)
Even if you remove RIGHT (which might be noise), the Optimizer will pick app_names first because the WHERE clause mentions only app_names.
All of this, plus more, is found in my blog on Creating an Index from a SELECT.

Related

Mysql how to set best indexes for my query

Please check my query and suggest which indexes to add, and how I can decide which columns belong in an index. The query is very slow when the WHERE clause is present; without it the query is fine. A large OFFSET value also slows the query down.
SELECT
attachment.attachment_id AS attachmentID,
attachment.data_item_id AS candidateID,
attachment.title AS title,
candidate.first_name AS firstName,
candidate.last_name AS lastName,
candidate.city AS city,
candidate.state AS state
FROM
attachment
LEFT JOIN candidate
ON attachment.data_item_id = candidate.candidate_id
where candidate.is_active = 1
ORDER BY
lastName ASC
LIMIT 92000, 20
Your query is basically:
SELECT . . .
FROM attachment a JOIN
candidate c
ON a.data_item_id = c.candidate_id
WHERE c.is_active = 1
ORDER BY c.last_name ASC
LIMIT 92000, 20;
Note that the WHERE clause turns the LEFT JOIN into an INNER JOIN anyway, so there is no reason to use LEFT JOIN.
I would recommend the following indexes:
candidate(is_active, candidate_id, last_name)
attachment(data_item_id)
You could expand the indexes to include all columns being selected.
Note that offsetting 92,000 rows takes a bit of effort so the query will never be lightning fast.
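The point about the WHERE clause turning the LEFT JOIN into an INNER JOIN can be checked with a small sketch. It is written in Python with the standard-library sqlite3 module as a stand-in for MySQL; the three sample attachments, the two candidates and the helper compare_joins are hypothetical.

```python
import sqlite3

def compare_joins():
    """Compare LEFT JOIN + WHERE, INNER JOIN, and LEFT JOIN with the filter in ON."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    conn.executescript("""
        CREATE TABLE candidate (
            candidate_id INTEGER PRIMARY KEY, last_name TEXT, is_active INTEGER);
        CREATE TABLE attachment (
            attachment_id INTEGER PRIMARY KEY, data_item_id INTEGER, title TEXT);
        INSERT INTO candidate VALUES (10, 'Zed', 1), (11, 'Amy', 0);
        -- Attachment 3 points at a candidate that does not exist.
        INSERT INTO attachment VALUES (1, 10, 'cv'), (2, 11, 'letter'), (3, 99, 'orphan');
    """)
    select = "SELECT a.attachment_id, c.last_name FROM attachment a "
    # Filter on the outer-joined table in WHERE: NULL-extended rows are discarded.
    left_where = conn.execute(
        select + "LEFT JOIN candidate c ON a.data_item_id = c.candidate_id "
        "WHERE c.is_active = 1 ORDER BY a.attachment_id").fetchall()
    inner = conn.execute(
        select + "INNER JOIN candidate c ON a.data_item_id = c.candidate_id "
        "WHERE c.is_active = 1 ORDER BY a.attachment_id").fetchall()
    # Same filter inside ON: every attachment survives, unmatched ones get NULL.
    left_on = conn.execute(
        select + "LEFT JOIN candidate c ON a.data_item_id = c.candidate_id "
        "AND c.is_active = 1 ORDER BY a.attachment_id").fetchall()
    conn.close()
    return left_where, inner, left_on
```

The first two results are identical, which is why the LEFT keyword adds nothing in the original query. Only the third form, with the filter moved into ON, keeps attachments that have no active candidate.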

Turn a MySQL subquery into a JOIN

How can I turn this subquery into a JOIN?
I read that subqueries are slower than JOINs.
SELECT
reklamation.id,
reklamation.titel,
(
SELECT reklamation_status.status
FROM reklamation_status
WHERE reklamation_status.id_reklamation = reklamation.id
ORDER BY reklamation_status.id DESC
LIMIT 1
) as status
FROM reklamation
WHERE reklamation.aktiv=1
This should do it:
SELECT r.id, r.titel, s.status
FROM reklamation r
LEFT JOIN (
SELECT id_reklamation, MAX(id) AS max_id
FROM reklamation_status
GROUP BY id_reklamation
) m ON m.id_reklamation = r.id
LEFT JOIN reklamation_status s ON s.id = m.max_id
WHERE r.aktiv = 1
The key point here is to use aggregation to manage the cardinality between reklamation and reklamation_status. In your original code, the inline subquery uses ORDER BY reklamation_status.id DESC LIMIT 1 to pick the row with the highest id in reklamation_status for the current reklamation. The derived table m computes that highest id per reklamation with MAX(id), and the second join fetches the status stored on that row (selecting MAX(s.id) directly would return the id, not the status). Without aggregation, we would probably get multiple records in the result set for each reklamation (one for each corresponding reklamation_status).
Another thing to consider is the type of JOIN. INNER JOIN would filter out reklamation records that do not have a reklamation_status: the original query with the inline subquery does not behave like that, so I chose LEFT JOIN. If you can guarantee that every reklamation has at least one child in reklamation_status, you can safely switch back to INNER JOIN (which might perform more efficiently).
PS:
I read that subqueries are slower than JOINs.
This is not a universal truth. It depends on many factors and cannot be told without seeing your exact use case.
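Using a JOIN, the query can be rewritten as follows (note that this returns one row per matching reklamation_status row, not only the latest one):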
SELECT reklamation.id, reklamation.titel, reklamation_status.status
FROM reklamation
JOIN reklamation_status ON reklamation_status.id_reklamation = reklamation.id
WHERE reklamation.aktiv=1
What you read is incorrect. Subqueries can be slower, faster, or the same as joins. I would write the query as:
SELECT r.id, r.titel,
(SELECT rs.status
FROM reklamation_status rs
WHERE rs.id_reklamation = r.id
ORDER BY rs.id DESC
LIMIT 1
) as status
FROM reklamation r
WHERE r.aktiv = 1;
For performance, you want an index on reklamation_status(id_reklamation, id, status).
By the way, this is a case where the subquery is probably the fastest method for expressing this query. If you attempt a JOIN, then you need some sort of aggregation to get the most recent status. That can be expensive.
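To compare the approaches discussed in this thread, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in for MySQL. The sample rows, the index name idx_status and the helper latest_status are assumptions for illustration; the derived-table join is one possible aggregated JOIN form, not necessarily the fastest.

```python
import sqlite3

def latest_status():
    """Return results of the correlated subquery, an aggregated JOIN, and a plain JOIN."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    conn.executescript("""
        CREATE TABLE reklamation (id INTEGER PRIMARY KEY, titel TEXT, aktiv INTEGER);
        CREATE TABLE reklamation_status (
            id INTEGER PRIMARY KEY, id_reklamation INTEGER, status TEXT);
        CREATE INDEX idx_status ON reklamation_status (id_reklamation, id, status);
        -- Reklamation 2 has no status rows; reklamation 3 is inactive.
        INSERT INTO reklamation VALUES (1, 'A', 1), (2, 'B', 1), (3, 'C', 0);
        INSERT INTO reklamation_status VALUES
            (1, 1, 'open'), (2, 1, 'closed'), (3, 3, 'open');
    """)
    correlated = conn.execute("""
        SELECT r.id, r.titel,
               (SELECT rs.status FROM reklamation_status rs
                WHERE rs.id_reklamation = r.id
                ORDER BY rs.id DESC LIMIT 1) AS status
        FROM reklamation r
        WHERE r.aktiv = 1
        ORDER BY r.id""").fetchall()
    aggregated = conn.execute("""
        SELECT r.id, r.titel, s.status
        FROM reklamation r
        LEFT JOIN (SELECT id_reklamation, MAX(id) AS max_id
                   FROM reklamation_status
                   GROUP BY id_reklamation) m ON m.id_reklamation = r.id
        LEFT JOIN reklamation_status s ON s.id = m.max_id
        WHERE r.aktiv = 1
        ORDER BY r.id""").fetchall()
    plain = conn.execute("""
        SELECT r.id, r.titel, s.status
        FROM reklamation r
        JOIN reklamation_status s ON s.id_reklamation = r.id
        WHERE r.aktiv = 1
        ORDER BY r.id, s.id""").fetchall()
    conn.close()
    return correlated, aggregated, plain
```

With this data the correlated subquery and the aggregated join agree, including a NULL status for the reklamation without any status rows. The plain JOIN instead returns every status row of reklamation 1 and drops reklamation 2 entirely.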

How to fix SQL query with Left Join and subquery?

I have SQL query with LEFT JOIN:
SELECT COUNT(stn.stocksId) AS count_stocks
FROM MedicalFacilities AS a
LEFT JOIN stocks stn ON
(stn.stocksIdMF = ( SELECT b.MedicalFacilitiesIdUser
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY stn.stocksId DESC LIMIT 1)
AND stn.stocksEndDate >= UNIX_TIMESTAMP() AND stn.stocksStartDate <= UNIX_TIMESTAMP())
With this query I want to select one row from the stocks table that matches the conditions and whose stocksIdMF equals a.MedicalFacilitiesIdUser.
I always get count_stocks = 0 in the result, but I need to get 1.
The count(...) aggregate doesn't count null, so its argument matters:
COUNT(stn.stocksId)
Since stn is your right hand table, this will not count anything if the left join misses. You could use:
COUNT(*)
which counts every row, even if all its columns are null. Or a column from the left hand table (a) that is never null:
COUNT(a.ID)
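The difference between the three COUNT forms is easy to see on a LEFT JOIN that finds no match. The following is a minimal sketch in Python with the standard-library sqlite3 module standing in for MySQL; the tables are reduced to the columns needed, the single facility row is made up, and COUNT(a.MedicalFacilitiesIdUser) is used as the never-null left-hand column.

```python
import sqlite3

def count_variants():
    """Return (COUNT(right col), COUNT(*), COUNT(left col)) for a LEFT JOIN miss."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    conn.executescript("""
        CREATE TABLE MedicalFacilities (MedicalFacilitiesIdUser INTEGER);
        CREATE TABLE stocks (stocksId INTEGER PRIMARY KEY, stocksIdMF INTEGER);
        -- One facility and no stocks at all, so the LEFT JOIN misses.
        INSERT INTO MedicalFacilities VALUES (7);
    """)
    row = conn.execute("""
        SELECT COUNT(stn.stocksId),
               COUNT(*),
               COUNT(a.MedicalFacilitiesIdUser)
        FROM MedicalFacilities AS a
        LEFT JOIN stocks AS stn ON stn.stocksIdMF = a.MedicalFacilitiesIdUser
    """).fetchone()
    conn.close()
    return row
```

Here count_variants() returns (0, 1, 1): the NULL-extended stocksId is not counted, while COUNT(*) and the left-hand column each count the single joined row.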
Your subquery in the on looks very strange to me:
on stn.stocksIdMF = ( SELECT b.MedicalFacilitiesIdUser
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY stn.stocksId DESC LIMIT 1)
This is comparing MedicalFacilitiesIdUser to stocksIdMF. Admittedly, you have no sample data or data layouts, but the naming of the columns suggests that these are not the same thing. Perhaps you intend:
on stn.stocksIdMF = ( SELECT b.stocksId
-----------------------------^
FROM medicalfacilities AS b
WHERE b.MedicalFacilitiesIdUser = a.MedicalFacilitiesIdUser
ORDER BY b.stocksId DESC
LIMIT 1)
Also, ordering by stn.stocksid wouldn't do anything useful, because that would be coming from outside the subquery.
Your subquery seems redundant, and the main query is hard to read because many of the join conditions could be placed in the WHERE clause. Additionally, the original query might have a performance issue.
Recall that a join condition in WHERE is an implicit join and JOIN ... ON is an explicit join. For inner joins, query optimizers
make no distinction between the two if they use the same expressions, but readability and maintainability are also worth considering.
Consider the revised version (notice I added a GROUP BY; because the date filters are now in WHERE, facilities without a matching stocks row are excluded):
SELECT COUNT(stn.stocksId) AS count_stocks
FROM MedicalFacilities AS a
LEFT JOIN stocks stn ON stn.stocksIdMF = a.MedicalFacilitiesIdUser
WHERE stn.stocksEndDate >= UNIX_TIMESTAMP()
AND stn.stocksStartDate <= UNIX_TIMESTAMP()
GROUP BY stn.stocksId
ORDER BY stn.stocksId DESC
LIMIT 1

How to make faster queries on my mysql table?

I have the following table
As you can see, it has 1868155 rows. I am attempting to make a real-time graph, but that is impossible since almost any query takes 1 or 2 seconds.
For example, this query
SELECT sensor.nombre, temperatura.temperatura
FROM sensor, temperatura
WHERE sensor.id = temperatura.idsensor
ORDER BY temperatura.fecha DESC, idsensor ASC
LIMIT 4
is supposed to show this
I've tried everything: using indexes (perhaps not correctly), selecting only the fields I need instead of *, etc., but the results are the same!
These are the indexes of the table.
Explain of the query
EDITED
This is the explain of the query after implementing
ALTER TABLE temperatura
ADD INDEX `sensor_temp` (`idsensor`,`fecha`,`temperatura`)
And using inner join syntax for the query
SELECT s.nombre, t.temperatura
FROM sensor s
INNER JOIN temperatura t
ON s.id = t.idsensor
ORDER BY t.fecha DESC, t.idsensor ASC
LIMIT 4
This is my whole sensor table
Try the following:
ALTER TABLE temperatura
ADD INDEX `sensor_temp` (`idsensor`,`fecha`,`temperatura`)
I also recommend using modern join syntax:
SELECT s.nombre, t.temperatura
FROM sensor s
INNER JOIN temperatura t
ON s.id = t.idsensor
ORDER BY t.fecha DESC, t.idsensor ASC
LIMIT 4
Report the EXPLAIN again after making the above changes, if performance is still not good enough.
Attempt #2
After looking closely at what it appears you are trying to do, I believe this next query may be more effective:
SELECT
s.nombre, t.temperatura
FROM temperatura t
LEFT OUTER JOIN temperatura later_t
ON later_t.idsensor = t.idsensor
AND later_t.fecha > t.fecha
INNER JOIN sensor s
ON s.id = t.idsensor
WHERE later_t.idsensor IS NULL
ORDER BY t.idsensor ASC
You can also try:
SELECT
s.nombre, t.temperatura
FROM temperatura t
INNER JOIN (
SELECT
t.idsensor,
MAX(t.fecha) AS fecha
FROM temperatura t
GROUP BY t.idsensor
) max_fecha
ON max_fecha.idsensor = t.idsensor
AND max_fecha.fecha = t.fecha
INNER JOIN sensor s
ON s.id = t.idsensor
ORDER BY t.idsensor ASC
In my experience, if you are trying to find the most recent record, one of the two queries above will work. Which works best depends on various factors, so try them both.
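A small sketch can confirm that the two "most recent record" patterns agree on the same data. It is written in Python with the standard-library sqlite3 module as a stand-in for MySQL, so it checks results only, not speed. The sample readings and the helper latest_readings are hypothetical, and the grouped form here joins on equality with the maximum fecha per sensor.

```python
import sqlite3

# Pattern 1: keep a reading only if no later reading exists for the same sensor.
ANTI_JOIN = """
    SELECT s.nombre, t.temperatura
    FROM temperatura t
    LEFT OUTER JOIN temperatura later_t
      ON later_t.idsensor = t.idsensor AND later_t.fecha > t.fecha
    INNER JOIN sensor s ON s.id = t.idsensor
    WHERE later_t.idsensor IS NULL
    ORDER BY t.idsensor ASC
"""

# Pattern 2: compute MAX(fecha) per sensor, then join back on equality.
GROUP_MAX = """
    SELECT s.nombre, t.temperatura
    FROM temperatura t
    INNER JOIN (
        SELECT idsensor, MAX(fecha) AS fecha
        FROM temperatura
        GROUP BY idsensor
    ) max_fecha
      ON max_fecha.idsensor = t.idsensor AND max_fecha.fecha = t.fecha
    INNER JOIN sensor s ON s.id = t.idsensor
    ORDER BY t.idsensor ASC
"""

def latest_readings():
    """Run both patterns on a tiny data set and return their results."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
    conn.executescript("""
        CREATE TABLE sensor (id INTEGER PRIMARY KEY, nombre TEXT);
        CREATE TABLE temperatura (idsensor INTEGER, fecha TEXT, temperatura REAL);
        CREATE INDEX sensor_temp ON temperatura (idsensor, fecha, temperatura);
        INSERT INTO sensor VALUES (1, 's1'), (2, 's2');
        -- ISO-formatted timestamps compare correctly as text.
        INSERT INTO temperatura VALUES
            (1, '2020-01-01 10:00', 20.0), (1, '2020-01-01 11:00', 21.5),
            (2, '2020-01-01 10:30', 18.0), (2, '2020-01-01 09:00', 17.0);
    """)
    anti = conn.execute(ANTI_JOIN).fetchall()
    grouped = conn.execute(GROUP_MAX).fetchall()
    conn.close()
    return anti, grouped
```

Both queries return one row per sensor with its latest temperature. Which one is faster on a table with 1.8 million rows has to be measured on the real server.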
Let me know how those perform, and if they still get you the data you want. Also, any query you run, run at least 3 times, and report all 3 times. That will help get an accurate measure of how fast a given query is, since various external factors can affect the speed of a query.
Before MySQL 8.0, which introduced descending indexes, it is not possible to optimize a mixture of ASC and DESC, as in
ORDER BY t.fecha DESC, t.idsensor ASC
You tried a covering index:
INDEX `sensor_temp` (`idsensor`,`fecha`,`temperatura`)
However, this covering index may be better:
INDEX `sensor_temp` (`fecha`,`idsensor`,`temperatura`)
Then, if you are willing to get the sensors in a different order, use
ORDER BY t.fecha DESC, t.idsensor DESC
This will give you up to 4 sensors for the last fecha:
sensor: PRIMARY KEY(id)
temperatura: INDEX(fecha, idsensor, temperatura)
SELECT
( SELECT nombre FROM sensor WHERE id = t.idsensor ) AS nombre,
t.temperatura
FROM
( SELECT MAX(fecha) AS max_fecha FROM temperatura ) AS z
JOIN temperatura AS t ON t.fecha = z.max_fecha
ORDER BY t.idsensor ASC
LIMIT 4;

Tips for improving this slow mysql query?

I'm using a query which generally executes in under a second, but sometimes takes between 10-40 seconds to finish. I'm actually not totally clear on how the subquery works, I just know that it works, in that it gives me 15 rows for each faverprofileid.
I'm logging slow queries and it's telling me 5823244 rows were examined, which is odd because there aren't anywhere close to that many rows in any of the tables involved (the favorites table has the most at 50,000 rows).
Can anyone offer me some pointers? Is it an issue with the subquery and needing to use filesort?
EDIT: Running explain shows that the users table is not using an index (even though id is the primary key). Under extra it says: Using temporary; Using filesort.
SELECT F.id,F.created,U.username,U.fullname,U.id,I.*
FROM favorites AS F
INNER JOIN users AS U ON F.faver_profile_id = U.id
INNER JOIN items AS I ON F.notice_id = I.id
WHERE faver_profile_id IN (360,379,95,315,278,1)
AND F.removed = 0
AND I.removed = 0
AND F.collection_id is null
AND I.nudity = 0
AND (SELECT COUNT(*) FROM favorites WHERE faver_profile_id = F.faver_profile_id
AND created > F.created AND removed = 0 AND collection_id is null) < 15
ORDER BY F.faver_profile_id, F.created DESC;
The number of rows examined is large because many rows have been examined more than once. This happens because of a poorly optimized query plan that results in table scans where index lookups should have been performed. In this case the number of rows examined grows multiplicatively, i.e. it is of an order of magnitude comparable to the product of the row counts of more than one table.
Make sure that you have run ANALYZE TABLE on your three tables.
Read up on how to avoid table scans, then identify and create any missing indexes.
Rerun ANALYZE TABLE and run EXPLAIN on your queries again.
The number of examined rows should drop dramatically.
If not, post the full EXPLAIN plan.
Use query hints to force the use of indexes (to see the index names for a table, use SHOW INDEX):
SELECT
F.id,F.created,U.username,U.fullname,U.id,I.*
FROM favorites AS F FORCE INDEX (faver_profile_id_key)
INNER JOIN users AS U FORCE INDEX FOR JOIN (PRIMARY) ON F.faver_profile_id = U.id
INNER JOIN items AS I FORCE INDEX FOR JOIN (PRIMARY) ON F.notice_id = I.id
WHERE faver_profile_id IN (360,379,95,315,278,1)
AND F.removed = 0
AND I.removed = 0
AND F.collection_id is null
AND I.nudity = 0
AND (SELECT COUNT(*) FROM favorites FORCE INDEX (faver_profile_id_key) WHERE faver_profile_id = F.faver_profile_id
AND created > F.created AND removed = 0 AND collection_id is null) < 15
ORDER BY F.faver_profile_id, F.created DESC;
You may also change your query to use GROUP BY faver_profile_id/HAVING count < 15 instead of the nested SELECT COUNT(*) subquery, as suggested by vartec. The performance of both your original and vartec's query should be comparable if both are properly optimized, e.g. using hints (your query would use nested index lookups, whereas vartec's query would use a hash-based strategy).
I think with GROUP BY and HAVING it should be faster.
Is that what you want?
SELECT F.id,F.created,U.username,U.fullname,U.id, I.field1, I.field2, count(*) as CNT
FROM favorites AS F
INNER JOIN users AS U ON F.faver_profile_id = U.id
INNER JOIN items AS I ON F.notice_id = I.id
WHERE faver_profile_id IN (360,379,95,315,278,1)
AND F.removed = 0
AND I.removed = 0
AND F.collection_id is null
AND I.nudity = 0
GROUP BY F.id,F.created,U.username,U.fullname,U.id,I.field1, I.field2
HAVING CNT < 15
ORDER BY F.faver_profile_id, F.created DESC;
Don't know which fields from items you need, so I've put placeholders.
I suggest you use MySQL's EXPLAIN to see how your MySQL server handles the query. My bet is that your indexes aren't optimal, but EXPLAIN will tell you much more than my bet.
You could do a loop on each id and use limit instead of the count(*) subquery:
foreach $id in [123,456,789]:
SELECT
F.id,
F.created,
U.username,
U.fullname,
U.id,
I.*
FROM
favorites AS F INNER JOIN
users AS U ON F.faver_profile_id = U.id INNER JOIN
items AS I ON F.notice_id = I.id
WHERE
F.faver_profile_id = {$id} AND
I.removed = 0 AND
I.nudity = 0 AND
F.removed = 0 AND
F.collection_id is null
ORDER BY
F.faver_profile_id,
F.created DESC
LIMIT
15;
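The loop above is pseudocode. A runnable sketch of the same idea, one small LIMIT query per profile id, might look like this in Python with the standard-library sqlite3 module standing in for MySQL. The reduced schema, the sample rows and the helper names demo_connection and latest_favorites are assumptions for illustration.

```python
import sqlite3

QUERY = """
    SELECT F.id, F.created, U.username
    FROM favorites AS F
    INNER JOIN users AS U ON F.faver_profile_id = U.id
    INNER JOIN items AS I ON F.notice_id = I.id
    WHERE F.faver_profile_id = ?
      AND I.removed = 0 AND I.nudity = 0
      AND F.removed = 0 AND F.collection_id IS NULL
    ORDER BY F.created DESC
    LIMIT ?
"""

def demo_connection():
    """Build a tiny in-memory data set (hypothetical rows, SQLite not MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE items (id INTEGER PRIMARY KEY, removed INTEGER, nudity INTEGER);
        CREATE TABLE favorites (
            id INTEGER PRIMARY KEY, faver_profile_id INTEGER, notice_id INTEGER,
            created TEXT, removed INTEGER, collection_id INTEGER);
        INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO items VALUES (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 1);
        INSERT INTO favorites VALUES
            (1, 1, 1, '2020-01-01', 0, NULL), (2, 1, 2, '2020-01-02', 0, NULL),
            (3, 1, 3, '2020-01-03', 0, NULL), (4, 2, 1, '2020-01-01', 0, NULL),
            (5, 2, 4, '2020-01-02', 0, NULL), (6, 1, 1, '2020-01-04', 1, NULL);
    """)
    return conn

def latest_favorites(conn, profile_ids, per_profile=15):
    """Run one parameterized LIMIT query per profile id and collect the rows."""
    result = {}
    for pid in profile_ids:
        result[pid] = conn.execute(QUERY, (pid, per_profile)).fetchall()
    return result
```

For example, latest_favorites(demo_connection(), [1, 2], per_profile=2) returns the two newest favorites for profile 1 and the single qualifying favorite for profile 2. Binding the id as a parameter, rather than interpolating it into the SQL string, also avoids injection problems.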
I'll suppose the result of that query is intended to be shown as a paged list. In that case, perhaps you could consider running a simpler query without joins, followed by a second query per row that reads only the 15, 20 or 30 elements actually shown. Isn't a JOIN a heavy operation? This would simplify the query, and it wouldn't become slower as the joined tables grow.
Tell me if I'm wrong, please.