Slow MySQL query with AS and subquery - mysql

I have a problem with this slow query that runs for 10+ seconds:
SELECT DISTINCT siteid,
storyid,
added,
title,
subscore1,
subscore2,
subscore3,
( 1 * subscore1 + 0.8 * subscore2 + 0.1 * subscore3 ) AS score
FROM articles
WHERE added > '2011-10-23 09:10:19'
AND ( articles.feedid IN (SELECT userfeeds.siteid
FROM userfeeds
WHERE userfeeds.userid = '1234')
OR ( articles.title REGEXP '[[:<:]]keyword1[[:>:]]' = 1
OR articles.title REGEXP '[[:<:]]keyword2[[:>:]]' = 1 ) )
ORDER BY score DESC
LIMIT 0, 25
This outputs a list of stories based on the sites that a user has added to his account. The ranking is determined by score, which is computed from the subscore columns.
The query uses filesort and uses indices on PRIMARY and feedid.
Results of an EXPLAIN:
id  select_type         table      type            possible_keys                  key      ref   rows    Extra
1   PRIMARY             articles   range           PRIMARY,added,storyid          PRIMARY        729263  Using where; Using filesort
2   DEPENDENT SUBQUERY  userfeeds  index_subquery  storyid,userid,siteid_storyid  siteid   func  1       Using where
Any suggestions to improve this query? Thank you.

I would move the score calculation to the client and load only the raw fields from the database. This keeps the query simpler and may make both the query and the calculation faster; doing this kind of computation in SQL is not good style.
The regex is also slow; another matching mode such as LIKE may be faster.
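As a rough sketch of the LIKE suggestion, the two REGEXP conditions could be swapped for LIKE patterns as below. This is not an exact replacement: '%keyword1%' matches the keyword anywhere in the title, including inside longer words, and a pattern with a leading % still cannot use an index on title, so any gain would have to be measured.

```sql
-- Sketch only: the query from the question with REGEXP replaced by LIKE.
-- Caveat: '%keyword1%' also matches inside longer words, unlike the
-- [[:<:]]...[[:>:]] word-boundary regex.
SELECT DISTINCT siteid, storyid, added, title,
       subscore1, subscore2, subscore3,
       ( 1 * subscore1 + 0.8 * subscore2 + 0.1 * subscore3 ) AS score
FROM articles
WHERE added > '2011-10-23 09:10:19'
  AND ( articles.feedid IN (SELECT userfeeds.siteid
                            FROM userfeeds
                            WHERE userfeeds.userid = '1234')
        OR articles.title LIKE '%keyword1%'
        OR articles.title LIKE '%keyword2%' )
ORDER BY score DESC
LIMIT 0, 25;
```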

Looking at your EXPLAIN, the query cannot use an index for the sort (hence the filesort). This is caused by sorting on the calculated column (score).
Another barrier is the number of rows involved (the EXPLAIN estimates 729263). You don't want to create an index that is too wide, as it will take much more space and slow down your CUD (create, update, delete) operations. Normally we would target the columns that are being selected and sorted, but here we can't, because score is a calculated column. You can try creating a VIEW, removing the sort, or sorting at the application layer.
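On newer MySQL versions, one possible way around the calculated-column limitation is a stored generated column with its own index. The sketch below assumes MySQL 5.7 or later (the post does not state a version), fixed weights, and numeric subscore columns; the names score_stored and idx_articles_score are made up for illustration. Whether the optimizer actually uses the index for this query would need to be checked with EXPLAIN.

```sql
-- Assumption: MySQL 5.7+ and weights that never change per query.
-- score_stored and idx_articles_score are hypothetical names.
ALTER TABLE articles
  ADD COLUMN score_stored DOUBLE
    GENERATED ALWAYS AS (1 * subscore1 + 0.8 * subscore2 + 0.1 * subscore3) STORED,
  ADD INDEX idx_articles_score (score_stored);

-- Same filter as in the question, but sorting on the stored column
-- instead of an expression computed at query time.
SELECT DISTINCT siteid, storyid, added, title,
       subscore1, subscore2, subscore3,
       score_stored AS score
FROM articles
WHERE added > '2011-10-23 09:10:19'
  AND ( articles.feedid IN (SELECT userfeeds.siteid
                            FROM userfeeds
                            WHERE userfeeds.userid = '1234')
        OR articles.title REGEXP '[[:<:]]keyword1[[:>:]]' = 1
        OR articles.title REGEXP '[[:<:]]keyword2[[:>:]]' = 1 )
ORDER BY score_stored DESC
LIMIT 0, 25;
```

If the weights differ per user or per request, a stored column like this does not apply, and sorting at the application layer remains the simpler option.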

Related

SQL gets slow on a simple query with ORDER BY

I have a problem with MySQL ORDER BY: it slows down my query and I really don't know why. My query was a little more complex, so I simplified it to a light query with no joins, but it still runs really slowly.
Query:
SELECT
W.`oid`
FROM
`z_web_dok` AS W
WHERE
W.`sent_eRacun` = 1 AND W.`status` IN(8, 9) AND W.`Drzava` = 'BiH'
ORDER BY W.`oid` ASC
LIMIT 0, 10
The table has 946,566 rows and takes up 500 MB. The fields I am filtering on are all indexed, as follows:
oid - INT PRIMARY KEY AUTOINCREMENT
status - INT INDEXED
sent_eRacun - TINYINT INDEXED
Drzava - VARCHAR(3) INDEXED
(The original post included three screenshots here: the EXPLAIN output, the query as executed against the database, and the timing after removing ORDER BY.)
I have also tried sorting by a DATETIME field, which is also indexed, but the query is just as slow as when ordering by the primary key. This started today; until now it was always fast and light.
What can cause something like this?
The kind of query you use here calls for a composite covering index. This one should handle your query very well.
CREATE INDEX someName ON z_web_dok (Drzava, sent_eRacun, status, oid);
Why does this work? You're filtering on the first three columns (equality on Drzava and sent_eRacun, an IN list on status) and sorting on the fourth. The query planner can use this index to satisfy the entire query without reading the table rows. It can random-access the index to find the first entry matching your query, then scan through the index in order to get the rows it needs.
Pro tip: Indexes on single columns are generally harmful to performance unless they happen to match the requirements of particular queries in your application, or are used for primary or foreign keys. You generally choose your indexes to match your most active, or your slowest, queries.
Edit: You asked whether it's better to create specific indexes for each query in your application. The answer is yes.
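One way to check whether the planner picks up the composite index is to run the original query under EXPLAIN after creating it. The sketch below reuses the index name someName from the statement above. If the index is used as a covering index, the key column should name it and Extra should mention "Using index"; a "Using filesort" may still appear because of the IN list on status.

```sql
-- Inspect the plan after the composite index has been created.
-- Things to look for: key = someName and "Using index" in Extra.
EXPLAIN
SELECT W.`oid`
FROM `z_web_dok` AS W
WHERE W.`sent_eRacun` = 1
  AND W.`status` IN (8, 9)
  AND W.`Drzava` = 'BiH'
ORDER BY W.`oid` ASC
LIMIT 0, 10;
```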
There may be an even faster way. (Or it may not be any faster.)
The IN(8, 9) makes it hard to handle the WHERE..ORDER BY..LIMIT combination with full efficiency. A possible solution is to treat the IN as an OR, convert that to a UNION, and do some tricks with the LIMIT, especially if you might also be using OFFSET.
( SELECT ... WHERE .. = 8 AND ... ORDER BY oid LIMIT 10 )
UNION ALL
( SELECT ... WHERE .. = 9 AND ... ORDER BY oid LIMIT 10 )
ORDER BY oid LIMIT 10
This will allow the covering index described by OJones to be fully used in each of the subqueries. Furthermore, each will provide up to 10 rows without any temp table or filesort. Then the outer part will sort up to 20 rows and deliver the 'correct' 10.
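Spelled out against the table from the question, the UNION ALL pattern might look like the sketch below. The column order in each WHERE clause is only for readability; it assumes the composite index on (Drzava, sent_eRacun, status, oid) exists.

```sql
-- Each branch has a single status value, so the index can deliver
-- its rows already ordered by oid, and each branch stops after 10.
( SELECT W.`oid`
  FROM `z_web_dok` AS W
  WHERE W.`Drzava` = 'BiH' AND W.`sent_eRacun` = 1 AND W.`status` = 8
  ORDER BY W.`oid` ASC
  LIMIT 10 )
UNION ALL
( SELECT W.`oid`
  FROM `z_web_dok` AS W
  WHERE W.`Drzava` = 'BiH' AND W.`sent_eRacun` = 1 AND W.`status` = 9
  ORDER BY W.`oid` ASC
  LIMIT 10 )
-- The outer part merges at most 20 rows and keeps the first 10.
ORDER BY `oid` ASC
LIMIT 10;
```

With an OFFSET, each inner query would need a LIMIT of offset plus page size, with the OFFSET itself applied only in the outer LIMIT.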
For OFFSET, see http://mysql.rjweb.org/doc.php/index_cookbook_mysql#or

How to sort by sum(:field) without using temp filesort

We want to find the objects that received the most points from user actions (comments, image uploads, etc.). Each action is stored with its points value and its target.
select sum(points) as points, target_type,target_id from user_actions where target_type="Modification" group by target_id order by points DESC limit 100
Showing rows 0 - 99 (100 total, Query took 200.7865 seconds.)
Table size 4M rows.
Index on target_type, target_id.
If I EXPLAIN the query, it says "Using temporary; Using filesort", which is obviously killing it.
Question: do I have any chance of speeding this query up?
You can add another index, or change the existing one if it isn't relied on elsewhere.
The points column is not indexed. Adding it to the index makes the index cover the query, so the sums can be computed from the index alone without reading the table rows, which should improve performance significantly:
CREATE INDEX user_actions_indx
ON user_actions (target_type,target_id,points);

How to optimize MIN + ORDER + LIMIT

I am trying to implement backward pagination in my app. The data comes from a large NoSQL database, and if I paginate in the trivial way, the further the page I jump to, the longer it takes to get there. To improve performance I plan to use a MySQL table that stores just the indices. What I want from MySQL is to find the starting index of a page as fast as possible. This approach takes almost 3 seconds to get an index on a table with 3 million rows:
SELECT MIN(id) FROM index_77635_ ORDER BY id DESC LIMIT $large_skip_number
As you can see, I try to find the row with the smallest index, so that I can jump to the rows that were added earlier. There is probably a better way to implement this.
EDIT
The correct query, which works quite well (relatively fast, or at least faster than in pure Mongo), turned out to be this one:
SELECT a.id FROM index_77635_ a
INNER JOIN (
SELECT MAX(id) AS id FROM (
SELECT id FROM index_77635_ ORDER BY id DESC LIMIT $skip,$limit
) t
) b ON a.id = b.id
In this case I find the starting index (the maximum one, for backward pagination), and then in Mongo I query a chunk of data up to this index.
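Because the inner rows are already in descending order, the MAX over a page is simply that page's first row. Assuming id is unique (for example the primary key, which the post does not state), the same starting index could likely be fetched with a simpler query. The skip value 200000 below is a made-up example standing in for $skip.

```sql
-- Sketch: fetch only the first id of the page, which is the largest
-- id on that page when ordering descending.
-- 200000 is a hypothetical value for $skip; $limit is not needed here.
SELECT id
FROM index_77635_
ORDER BY id DESC
LIMIT 200000, 1;
```

MySQL still has to walk past the skipped index entries, so the cost continues to grow with the offset; this only removes the extra subquery and join.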

Best way to use indexes on large mysql like query

This MySQL query is run on a large MyISAM table (about 200,000 records, 41 columns):
select t1.* from table t1 where 1 and t1.inactive = '0' and (t1.code like '%searchtext%' or t1.name like '%searchtext%' or t1.ext like '%searchtext%' ) order by t1.id desc LIMIT 0, 15
id is the primary index.
I tried adding a multi-column index on all 3 searched (LIKE) columns. It works OK, but the results feed an auto-filled AJAX table on a website, and the 2-second delay is a bit too slow.
I also tried adding separate indexes on all 3 columns and a full-text index on all 3 columns, without significant improvement.
What would be the best way to optimize this type of query? I would like to get it under 1 second; is that doable?
The best thing you can do is implement paging. No matter what you do, the I/O cost is going to be huge. Returning only one page of records (10, 25, or whatever) will help a lot.
As for the index, you need to check the plan to see if your index is actually being used. A full-text index might help, but that depends on how many rows you return and what you pass in. Wildcards such as % really drain performance. An index can still be used if the pattern ends with % but does not start with it. If you put % on both sides of the text you are searching for, indexes can't help much.
You can create a full-text index that covers the three columns: code, name, and ext. Then perform a full-text query using the MATCH() AGAINST () function:
select t1.*
from table t1
where match(code, name, ext) against ('searchtext')
order by t1.id desc
limit 0, 15
If you omit the ORDER BY clause, the rows are sorted by default by the relevance value that MATCH returns, highest first. For more information, read the Full-Text Search Functions documentation.
As @Vulcronos notes, the query optimizer is not able to use the index when the LIKE operator is used with a pattern that starts with the wildcard %.
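A minimal sketch of the full-text setup follows. The table name search_items and the index name ft_code_name_ext are hypothetical stand-ins, since the question only calls the table "table". Full-text search matches whole words (or word prefixes in boolean mode), not arbitrary substrings like '%searchtext%', and words shorter than the server's minimum word length are not indexed, so results can differ from the LIKE version.

```sql
-- search_items and ft_code_name_ext are hypothetical names.
-- The MATCH() column list must be the same as the index column list.
ALTER TABLE search_items
  ADD FULLTEXT INDEX ft_code_name_ext (code, name, ext);

-- Boolean mode with a trailing * matches words that START with
-- 'searchtext'; it does not match the text in the middle of a word.
SELECT t1.*
FROM search_items t1
WHERE t1.inactive = '0'
  AND MATCH(t1.code, t1.name, t1.ext)
      AGAINST ('searchtext*' IN BOOLEAN MODE)
ORDER BY t1.id DESC
LIMIT 0, 15;
```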

Why does order by primary index make this query slow?

This query gets the newest videos uploaded by the user's subscriptions. It was running very slowly, so I rewrote it to use joins, but that didn't make a difference. After tinkering with it, I found that removing ORDER BY makes it run fast (which, however, defeats the purpose of the query).
Query:
SELECT vid.*
FROM video AS vid
INNER JOIN subscriptions AS sub ON vid.uploader = sub.subscription_id
WHERE sub.subscriber_id = '1'
AND vid.privacy = 0 AND vid.blocked <> 1 AND vid.converted = 1
ORDER BY vid.id DESC
LIMIT 8
Running EXPLAIN shows "Using temporary; Using filesort" on the subscriptions table, and the query is slow (0.0900 seconds).
Without ORDER BY vid.id DESC it doesn't show "Using temporary; Using filesort", so it's fast (0.0004 seconds), but I don't understand how the other table can affect it like this.
All the fields are indexed (the privacy, blocked and converted fields don't affect performance by more than 10%).
I would paste the full EXPLAIN output, but I can't seem to make it fit nicely in the layout of this site.
You're limiting the query to 8 results. When you run it without an order by, it can grab the first 8 rows it comes across that pass the condition, and then hand them back. Boom, it's done.
When you use the order by, you're not asking for any 8 records. You're asking for the first 8 records in terms of vid.id. So it has to figure out which those are, and if it can't read the rows in vid.id order from an index, it has to collect every matching row and sort them by vid.id. That's a lot more work.
Is there actually an index on the column? If so, its statistics may be out of date. You could try refreshing them (for example with ANALYZE TABLE) or rebuilding the index.
Fixed it by suggesting that MySQL use the primary index with USE INDEX (PRIMARY):
SELECT vid.*
FROM video AS vid USE INDEX ( PRIMARY )
INNER JOIN subscriptions AS sub ON vid.uploader = sub.subscription_id
WHERE sub.subscriber_id = '1'
AND vid.privacy =0
AND vid.blocked <>1
AND vid.converted =1
ORDER BY vid.id DESC
LIMIT 8