MySQL order by makes query slow solved, but not sure why

I have the following query:
select *
from
`twitter_posts`
where
`main_handle_id` in (
select
`twitter`.`main_handle_id`
from
`users`
inner join `twitter` on `twitter`.`user_id` = `user`.`id`
where
`users` LIKE 'foo'
)
order by created_at
This query runs dramatically slower when the order by is used. The twitter_posts table has indexes on the created_at timestamp and on the main_handle_id column.
I used "explain" to see what the engine was planning to do and noticed a filesort, which I was not expecting to see because an index exists on the twitter_posts table. I skimmed through the various posts covering this on Stack Overflow and on blogs, but none of the examples worked for me. I had a hunch that the SQL optimizer might be getting confused by the nested select, so I wrapped the query and ordered the result of that by created_at:
select * from (select *
from
`twitter_posts`
where
`main_handle_id` in (
select
`twitter`.`main_handle_id`
from
`users`
inner join `twitter` on `twitter`.`user_id` = `user`.`id`
where
`users` LIKE 'foo'
)
) as wrapped
order by created_at
Running explain on this one shows that it DOES use the index to sort, and it returns the result in a fraction of the time the filesort version takes. So I've "solved" my problem, but I don't understand why MySQL would make a different choice for the wrapped query. As far as I can see, I'm just wrapping the result and selecting everything from it in a seemingly redundant statement...
Edit: I still don't know why the optimizer started using the index on created_at when given a redundant subquery, and in some cases it wasn't faster. I've now gone with the solution of adding the hint "FORCE INDEX FOR ORDER BY (created_at_index_name)".
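For reference, a minimal sketch of where such a hint goes in the statement. The index name created_at_index_name is the placeholder from the edit above, and the users.id and users.name columns are assumptions, since the posted query does not show the real schema:

```sql
-- Sketch only: created_at_index_name is a placeholder for the real index
-- on twitter_posts.created_at; users.id and users.name are assumed columns.
SELECT *
FROM `twitter_posts` FORCE INDEX FOR ORDER BY (`created_at_index_name`)
WHERE `main_handle_id` IN (
    SELECT `twitter`.`main_handle_id`
    FROM `users`
    INNER JOIN `twitter` ON `twitter`.`user_id` = `users`.`id`
    WHERE `users`.`name` LIKE 'foo'
)
ORDER BY `created_at`;
```

The index hint follows the table name it applies to, and the index list goes in parentheses. Forcing the index is not guaranteed to be faster for every search value, so it is worth comparing EXPLAIN output with and without the hint.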

Don't use IN ( SELECT ... ); convert that to a JOIN. Or convert it to EXISTS ( SELECT ... ), which may optimize better.
Please provide the EXPLAINs. Even if they are both the same. Provide EXPLAIN FORMAT=JSON SELECT ... if your version has such.
If the Optimizer is confused by a nested query, why add another nesting??
Which table is users in?
You mention user.id, yet there is no user table in FROM or JOIN.
What version of MySQL (or MariaDB) are you using? 5.6 and especially 5.7 have significant differences in the optimizer, with respect to MariaDB.
Please provide SHOW CREATE TABLE for each table mentioned in the query.
There are hints in your comments that the queries you presented are not the ones giving you trouble. Please do not ask about a query that does not itself demonstrate the problem.
Oh, is it users.name? Need to see the CREATE TABLEs to see if you have a suitable index.
Since MariaDB was mentioned, I am adding that tag. (This is probably the only way to get someone with specific MariaDB expertise involved.)
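The JOIN and EXISTS rewrites suggested in this answer might look roughly like the following. This is a sketch under assumptions: it guesses that the filter column is users.name and that the join column is users.id, neither of which is confirmed by the question.

```sql
-- Variant 1: JOIN instead of IN ( SELECT ... ).
-- Assumes users.id and users.name exist (not confirmed by the question).
-- If several twitter rows can share one main_handle_id, this form can
-- return duplicate posts; the EXISTS form below cannot.
SELECT tp.*
FROM users AS u
INNER JOIN twitter AS t ON t.user_id = u.id
INNER JOIN twitter_posts AS tp ON tp.main_handle_id = t.main_handle_id
WHERE u.name LIKE 'foo'
ORDER BY tp.created_at;

-- Variant 2: EXISTS ( SELECT ... ) instead of IN ( SELECT ... ).
SELECT tp.*
FROM twitter_posts AS tp
WHERE EXISTS (
    SELECT 1
    FROM twitter AS t
    INNER JOIN users AS u ON u.id = t.user_id
    WHERE t.main_handle_id = tp.main_handle_id
      AND u.name LIKE 'foo'
)
ORDER BY tp.created_at;
```

Which variant the optimizer handles better depends on the server version and the data, so both should be checked with EXPLAIN.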

Related

Should I rather use a subquery or a combined WHERE?

This specific situation may seem a bit silly, but I just want to know how I should solve it: there is a table (schools), and in this table you find all students with their school id. The order is completely random, but with a SELECT statement you can sort it.
CREATE TABLE schools (school_id int, name varchar(32), age ...);
Now I want to search for a student by name (with LIKE '%name%'), but only if the student is in a certain school.
I already tried this:
SELECT * FROM `schools` WHERE `school_id` = 33 and `name` LIKE '%max%';
But then I realized that I could also use a subquery like:
SELECT * FROM (SELECT * FROM `schools` WHERE `school_id` = 33) AS a
WHERE a.name LIKE '%max%';
Which way is more efficient and performs better?

You can use the EXPLAIN keyword to see exactly how each query is executed.
I'd say it's almost a definite that these two will execute identically.
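As a sketch of that check, the two statements from the question can each be prefixed with EXPLAIN and the resulting plans compared (the output is not shown here because it depends on the server version, indexes, and data):

```sql
-- Plan for the combined WHERE.
EXPLAIN
SELECT * FROM `schools` WHERE `school_id` = 33 AND `name` LIKE '%max%';

-- Plan for the subquery form; a DERIVED row in the output would indicate
-- that the subquery is materialized as a temporary table.
EXPLAIN
SELECT * FROM (SELECT * FROM `schools` WHERE `school_id` = 33) AS a
WHERE a.name LIKE '%max%';
```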

The query optimizer will probably choose the same plan for both queries. If you want to know for sure, look at the execution plan when you execute each query.

The query without the subquery is probably more efficient in MySQL:
SELECT *
FROM `schools`
WHERE `school_id` = 33 and `name` LIKE '%max%';
MySQL has this nasty tendency to materialize subqueries -- that is, to actually run the subquery and save it as a temporary table (it is getting better, though). Most other databases do not do this. So, in other databases, the two should be equivalent.
MySQL is smart enough to use an index, if available, for school_id, even though there are other comparisons. If no indexes are available, it will be doing a full table scan, which will probably dominate the performance.
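If no such index exists yet, a minimal sketch of adding one follows. The index name idx_schools_school_id is made up for illustration:

```sql
-- Hypothetical index name; school_id is the column from the question.
CREATE INDEX idx_schools_school_id ON `schools` (`school_id`);

-- List the indexes on the table to confirm it was created.
SHOW INDEX FROM `schools`;
```

The LIKE '%max%' condition has a leading wildcard, so an ordinary index on name is unlikely to help with it; the index on school_id only narrows down the rows that the LIKE then has to check.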

explain plan meanings in mysql

I use EXPLAIN, but I am confused about what its output really means.
explain extended
select *
from (select type_id from con_consult_type cct
where cct.consult_id = (select id
from con_consult
where id = 1))
cctt left join con_type ct on cctt.type_id = ct.id;
The result is:
I googled and found that DERIVED means a temporary table, but what is the SQL of that temporary table? Is it the cctt table?
And is step 2 the result of cctt left join con_type ct on cctt.type_id = ct.id?
FK_CONSULT_TO_CONSULT_TYPE is the foreign key from consult_id to the id column of con_consult.
How is the index used in this SQL?
Does it fetch all rows of cctt and then filter using the index?
Please help me understand what the EXPLAIN output means.

This is a bad query for learning the basics of EXPLAIN output; there is simply too much happening with all the subqueries and joins.
I can give a rundown of some of the essentials:
'rows' column: Less is better. It shows the estimated number of rows the database has to examine; anything less than a couple of hundred is good. It generally indicates how well the engine can find your data using the indexes.
'possible_keys' and 'key': If 'rows' is big, you may have to tweak your keys to give the engine some help finding your data.
'type': Type of join
To answer some of your questions:
'sql of the temporary table' - it's the first subquery in your SQL
With FK_CONSULT_TO_CONSULT_TYPE you don't have to do anything; the engine has already picked this up as an index, which is what the EXPLAIN output is saying.
Queries are broken into 3 essential steps: select data, filter, and join. Each row in the EXPLAIN output details one or more of these operations; it may not necessarily relate to a specific section of your SQL, as the engine may have combined various parts into one.

How to remove this filesort?

I have a problem: my slow query is still using filesort.
I cannot get rid of this Extra flag. Could you please help me?
My query looks like this:
select `User`,
`Friend`
from `friends`
WHERE
(`User`='3053741' || `Friend`='3053741')
AND `Status`='1'
ORDER by `Recent` DESC;
Explain says this:
http://blindr.eu/stack/1.jpg
My table structure and indexes (they are shown in Slovak, but it should be the same):
http://blindr.eu/stack/2.jpg
How can I get rid of this filesort?

"Using filesort" does not mean that MySQL is actually using a file. The naming is a bit unlucky. The sorting is done in memory as long as the data fits into memory. That said, have a look at the variable sort_buffer_size. But keep in mind that this is a per-session variable: every connection to your server can allocate this much memory.
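A minimal sketch of inspecting and adjusting that variable for the current connection only. The 4 MiB value is an arbitrary example, not a recommendation:

```sql
-- Current value for this session, in bytes.
SHOW VARIABLES LIKE 'sort_buffer_size';

-- Raise it for this connection only (4194304 bytes = 4 MiB, arbitrary).
SET SESSION sort_buffer_size = 4194304;

-- If this counter keeps growing while the query runs, sorts are still
-- spilling past the buffer and needing merge passes.
SHOW SESSION STATUS LIKE 'Sort_merge_passes';
```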
The other option is of course, to do the sorting on application level. You have no LIMIT clause anyway.
Oh, and you have too many indexes. Combined indexes can also be used when just the left-most column is used. So some of your indexes are redundant.
If performance is really that much of an issue, try another approach, because the OR in the WHERE makes it hard, if not impossible, to get rid of "Using filesort" via indexes. Try this query:
SELECT * FROM (
select "this is a user" AS friend_or_user, `User`, Recent
from `friends`
WHERE
`User`='3053741'
AND `Status`='1'
UNION ALL
select "this is a friend",
`Friend`, Recent
from `friends`
WHERE
`Friend`='3053741'
AND `Status`='1'
) here_needs_to_be_an_alias
ORDER by `Recent` DESC

SELECT with JOIN where joined row is NULL

I am trying to select rows from a table which don't have a correspondence in the other table.
For this purpose, I'm currently using LEFT JOIN and WHERE joined_table.any_column IS NULL, but I don't think that's the fastest way.
SELECT * FROM main_table mt LEFT JOIN joined_table jt ON mt.foreign_id=jt.id WHERE jt.id IS NULL
This query works, but as I said, I'm looking for a faster alternative.

Your query is a standard query for this:
SELECT *
FROM main_table mt LEFT JOIN
joined_table jt
ON mt.foreign_id=jt.id
WHERE jt.id IS NULL;
You can try this as well:
SELECT mt.*
FROM main_table mt
WHERE not exists (select 1 from joined_table jt where mt.foreign_id = jt.id);
In some versions of MySQL, it might produce a better execution plan.

In my experience with MSSQL, the syntax used (usually) produces exactly the same query plan as the WHERE NOT EXISTS() syntax. However, this is MySQL, so I can't be sure about performance!
That said, I'm a much bigger fan of using the WHERE NOT EXISTS() syntax for the following reasons :
it's easier to read. If you speak a bit of English anyone can deduce the meaning of the query
it's more foolproof, I've seen people test for NULL on a NULL-able field
it can't have side effects like 'doubled records' due to the JOIN. If the referenced field is unique there is no problem, but again I've seen situations where people chose 'insufficient keys', causing the main table to get multiple hits against the joined table... and of course they solved it again using DISTINCT (aarrgg!!! =)
As for performance, make sure to have a (unique) index on the referenced field(s) and if possible put a FK-relationship between both tables. Query-wise I doubt you can squeeze much more out of it.
My 2 cents.

The query that you are running is usually the fastest option; just make sure that you have an index for both mt.foreign_id and jt.id.
You mentioned that this query is more complex, so it might be possible that the problem is in another part of the query. You should check the execution plan to see what is wrong and fix it.
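A sketch of the index advice above, followed by the plan check. The index names are made up for illustration, and the second index is only needed if joined_table.id is not already a primary or unique key:

```sql
-- Hypothetical index names; table and column names are from the question.
CREATE INDEX idx_main_table_foreign_id ON main_table (foreign_id);

-- Skip this one if joined_table.id is already the primary key.
CREATE INDEX idx_joined_table_id ON joined_table (id);

-- Inspect the plan of the anti-join.
EXPLAIN
SELECT mt.*
FROM main_table mt
LEFT JOIN joined_table jt ON mt.foreign_id = jt.id
WHERE jt.id IS NULL;
```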

Improve a Join performance

This query runs slowly:
SELECT UserAccountNumber, balance, username FROM users JOIN balances ON
users.UserAccountNumber=balances.UserAccountNumber WHERE date < '2011-02-02'
What can i do to improve its performance?
I thought about using the user ID for the join instead of the UserAccountNumber.
Apart from that, as far as I know, the JOIN and WHERE users.id = balances.idUser perform at the same speed...
So, what else could I change to improve it?
Thanks.

The query itself looks OK to me, but make sure you have indexes on the UserAccountNumber columns (since they're involved in the join) and date (the column you're searching on). If the database has to do a sequential scan of a lot of records, that'll be slow. Using EXPLAIN SELECT may help you to understand how the database is actually performing the query.
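As a sketch, the indexes described above could be created as follows. The index names are invented for illustration, and the date column is assumed to live in the balances table, which the question does not state explicitly:

```sql
-- Hypothetical index names. Assumes `date` is a column of balances.
CREATE INDEX idx_users_account ON users (UserAccountNumber);
CREATE INDEX idx_balances_account ON balances (UserAccountNumber);
CREATE INDEX idx_balances_date ON balances (`date`);

-- Check whether the indexes are used.
EXPLAIN
SELECT users.UserAccountNumber, balance, username
FROM users
JOIN balances ON users.UserAccountNumber = balances.UserAccountNumber
WHERE `date` < '2011-02-02';
```

If a large share of the rows falls before the cutoff date, the optimizer may still prefer a full scan of balances over the date index.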

If the tables are huge, you might get some improvement by using a temporary table rather than letting MySQL's optimiser sort it out for you. It'll be a trade-off for sure, though.
CREATE TEMPORARY TABLE `tmp_balances`
(
`UserAccountNumber` INT,
`balance` INT,
INDEX (`UserAccountNumber`)
) ENGINE=MEMORY
SELECT `UserAccountNumber`, `balance`
FROM balances
WHERE date < "2011-02-02";
SELECT `tmp_balances`.`UserAccountNumber`,
`tmp_balances`.`balance`,
`users`.`username`
FROM `tmp_balances` INNER JOIN users USING (`UserAccountNumber`);
Other than that (and it's a bit of a long shot), I'd have to echo what the earlier commenters said about indexing.
