How can I make this SQL non sargable? - mysql

I've used an online tool to analyse one of my SQL queries (the query took me ages to write).
My query takes a word (in this example, 'dog.') and tries to find it in the 'qa' table. When it finds a match, it joins row data from the login table where login.pid = qa.u:
SELECT login.pid,login.name,
qa.id,qa.end,qa.react,qa.win,qa.stock,qa.num,qa.ratio,qa.u,qa.t,qa.k,qa.swipes,qa.d
FROM login,qa WHERE login.pid=qa.u AND (qa.k LIKE '%dog.%' OR qa.k='.dog.')
ORDER BY qa.d DESC LIMIT 0,15
I understand what the tool is telling me:
Argument with leading wildcard
An argument has a leading wildcard character, such as "%foo". The predicate with
this argument is not sargable and cannot use an index if one exists.
but I don't know how to use an index inside the '()' without damaging or changing the results... could someone please explain how I could use an index in the middle of a query's conditions?
I take it that if this was non-sargable then the result would be faster?

First, learn to use modern join syntax:
SELECT login.pid, login.name,
qa.id, qa.end, qa.react, qa.win, qa.stock, qa.num, qa.ratio, qa.u, qa.t,qa.k, qa.swipes, qa.d
FROM login
JOIN qa ON login.pid = qa.u
WHERE (qa.k LIKE '%dog.%' OR qa.k = '.dog.')
ORDER BY qa.d DESC
LIMIT 0,15;
Basically, "sargable" means that an index can be used for a particular expression (it is not an English word; it is a contraction of "Search ARGument ABLE"). The expression on qa.k cannot use an index because of the leading wildcard.
This may not make a difference, depending on the query plan for the query. For instance, if the engine decides to scan the login table and then look up values in qa, an index on qa.k wouldn't help. It helps going the other way, though.
The bad news is that you cannot make this expression sargable in MySQL. The good news is that you can use a full text index to do what you want and possibly more. The MySQL manual describes them in its section on full-text search functions. One small note is that the default settings ignore short words, up to three letters. So you need to change the default setting if you actually want to search for "dog".
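A minimal sketch of the full-text approach, assuming qa.k is a CHAR, VARCHAR, or TEXT column and the table uses an engine that supports FULLTEXT indexes (MyISAM, or InnoDB from MySQL 5.6 onwards). The index name ft_qa_k is made up for illustration. Full-text search matches whole words rather than arbitrary substrings, so the results are not guaranteed to be identical to LIKE '%dog.%':

```sql
-- Hypothetical index name; builds a full-text index on the searched column.
ALTER TABLE qa ADD FULLTEXT INDEX ft_qa_k (k);

-- Word search that can use the full-text index instead of scanning every row.
SELECT login.pid, login.name, qa.id, qa.k, qa.d
FROM login
JOIN qa ON login.pid = qa.u
WHERE MATCH(qa.k) AGAINST('dog' IN NATURAL LANGUAGE MODE)
ORDER BY qa.d DESC
LIMIT 0, 15;
```

The minimum indexed word length is controlled by ft_min_word_len (MyISAM, default 4) or innodb_ft_min_token_size (InnoDB, default 3). After changing either one, the full-text index has to be rebuilt.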
By the way, the following expression can use an index on qa.k:
WHERE (qa.k LIKE 'dog.%' OR qa.k = '.dog.')
(I'm not sure if MySQL actually would use the index, because it sometimes gets confused by or.)
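One way to find out is to create the index and look at the plan. This is only a sketch: it assumes qa.k is a VARCHAR column (a TEXT column would need a prefix length in the index definition), and idx_qa_k is a hypothetical index name:

```sql
-- Hypothetical index on the searched column.
CREATE INDEX idx_qa_k ON qa (k);

-- Ask the optimizer how it would run the prefix search.
EXPLAIN
SELECT qa.id, qa.k
FROM qa
WHERE (qa.k LIKE 'dog.%' OR qa.k = '.dog.');
```

If the key column of the EXPLAIN output shows idx_qa_k, the optimizer chose the index. NULL there means it fell back to scanning the table.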

Related

Should I avoid ORDER BY in queries for large tables?

In our application, we have a page that shows the user part of a data set. It also allows the user to order the data by a custom field. So in the end it all comes down to a query like this:
SELECT name, info, description FROM mytable
WHERE active = 1 -- Some filtering by indexed column
ORDER BY name LIMIT 0,50; -- Just a part of it
This worked just fine as long as the table was relatively small (it was used only locally in our department). But now we have to scale this application. Let's assume the table has about a million records (we expect that to happen soon). What will happen with ordering? Do I understand correctly that, in order to run this query, MySQL will have to sort a million records each time and return only a part of them? This seems like a very resource-heavy operation.
My idea is simply to turn off that feature and not let users select a custom ordering (maybe just filtering), so that the order would be a natural one (by id in descending order, which I believe the indexing can handle).
Or is there a way to make this query work much faster with ordering?
UPDATE:
Here is what I read from the official MySQL developer page.
In some cases, MySQL cannot use indexes to resolve the ORDER BY,
although it still uses indexes to find the rows that match the WHERE
clause. These cases include the following:
....
The key used to
fetch the rows is not the same as the one used in the ORDER BY:
SELECT * FROM t1 WHERE key2=constant ORDER BY key1;
So yes, it does seem like mysql will have a problem with such a query? So, what do I do - don't use an order part at all?
The 'problem' here seems to be that you have 2 requirements (in the example):
active = 1
order by name LIMIT 0, 50
The former you can easily solve by adding an index on the active field.
The latter you can improve by adding an index on name.
Since you do both in the same query, you'll need to combine this into an index that lets you resolve the active value quickly and then from there on fetches the first 50 names.
As such, I'd guess that something like this will help you out:
CREATE INDEX idx_test ON mytable (active, name);
(in theory, as always, try before you buy!)
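A quick way to "try before you buy" is to compare the query plan before and after creating the index. The sketch below simply reuses the query from the question:

```sql
-- Run once without the index and once with it, then compare the output.
EXPLAIN
SELECT name, info, description
FROM mytable
WHERE active = 1
ORDER BY name
LIMIT 0, 50;
```

If the Extra column still shows "Using filesort", MySQL is sorting the matching rows itself rather than reading them in index order.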
Keep in mind though that there is no such thing as a free lunch; adding an index also comes with downsides:
the index will make your INSERT/UPDATE/DELETE statements (slightly) slower; usually the effect is negligible, but only testing will show
the index will require extra space in the database; think of it as an additional (hidden) special table sitting next to your actual data. The index will only hold the fields required + the PK of the originating table, which usually is a lot less data than the entire table, but for 'millions of rows' it can add up.
if your query selects one or more fields that are not part of the index, then the system will have to fetch the matching PK fields from the index first and then go look for the other fields in the actual table by means of the PK. This probably is still (a lot) faster than not having the index, but keep this in mind when doing something like SELECT * FROM ... : do you really need all the fields?
In the example you use active and name, but from the text I gather that these might be 'dynamic', in which case you'd have to foresee all kinds of combinations. From a practical point of view this might not be feasible, as each index comes with the downsides above, and each index you add incurs those costs again (they are cumulative).
PS: I use PK for simplicity but in MSSQL it's actually the fields of the clustered index, which USUALLY is the same thing. I'm guessing MySQL works similarly.
Run EXPLAIN on your query and check whether it uses a filesort.
If ORDER BY cannot use an index, or if the MySQL optimizer prefers not to use the existing index(es) for sorting, it falls back to a filesort.
If you are getting a filesort, you should preferably either avoid the ORDER BY or create appropriate index(es).
If the data is small enough, the sort is done in memory; otherwise it spills to disk.
So you may also try changing the sort_buffer_size variable.
There are always tradeoffs. One way to improve the performance of an ORDER BY query is to set the buffer size and then run the query:
set sort_buffer_size=100000;
If the size is increased beyond a certain point, performance will start to decrease.
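A hedged sketch of how the effect of the buffer size could be measured for one session. The value 262144 is an arbitrary example, and the SELECT is borrowed from the question; substitute the query you are tuning:

```sql
-- Number of merge passes the sort algorithm has needed so far in this session.
SHOW SESSION STATUS LIKE 'Sort_merge_passes';

-- Change the buffer for the current session only (value in bytes, arbitrary example).
SET SESSION sort_buffer_size = 262144;

-- Run the query being tuned, then look at the counter again.
SELECT name, info, description
FROM mytable
WHERE active = 1
ORDER BY name
LIMIT 0, 50;

SHOW SESSION STATUS LIKE 'Sort_merge_passes';
```

A counter that keeps climbing suggests the sort does not fit in the buffer. Setting the variable per session avoids allocating a large buffer for every connection.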

What drives the natural result order for an unordered MySQL request

How does mysql return lines when there is no ORDER BY in the request?
What drives the natural order?
There can obviously be many different queries but let's say a simple
select column from table where date < NOW()
There is no natural predictable order when you don't specify one.
Be very careful with this. In SQL there is no defined implied order. Never count on one. Even if you see a specific behavior at a point in time, it could change in a future release or even when an index is added. If you are expecting an order and counting on it, then specify it explicitly.
The problem is that the "natural order" of results is often affected completely or partly by the access plan the DB engine uses. For instance, if you do a group by FieldA there is a good chance (not a guarantee) that the results will come back in FieldA sequence. If you do a very simple select, chances are the results will be in the sequence they are stored in the database, which may or may not be the order of the IDs or the primary key. If you don't specify the order, you are giving the DB engine the option to do whatever is most convenient for it at the time, based on how it got the results. So it really does become unpredictable and open to change.
I wish I could explain it better, but I am trying to convey the real randomness of the process from an observer's viewpoint.
If the query is using an index, it will prefer the ordering of that index. In MySQL versions before 8.0, GROUP BY also forces an ordering, which is why combining GROUP BY and ORDER BY can have a performance penalty.
In your case, if you have an index on date, it will probably order by that, though it is hard to say how it handles ties. For more information, as usual, run EXPLAIN on the query.
Of course there's a caveat to ordering on the index used as well. If the index is on an autoincremented field and the data was added with prespecified ids, you may find it prefers the order the data was added in.
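If the order matters, the only dependable approach is to state it, including a tie-breaker. A small sketch with hypothetical names (an events table with a created_at column and a unique id):

```sql
-- created_at decides the order; the unique id breaks ties,
-- so repeated runs return the rows in the same sequence.
SELECT created_at, id
FROM events
WHERE created_at < NOW()
ORDER BY created_at, id;
```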

Does the order of index creation matter

Assuming I have an index on two columns on foo table indexed on (x,y)
If I search it as select * from foo where x=1 and y=2 or as select * from foo where y=2 and x=1, does it really matter in MySQL?
Short answer - no, it doesn't matter. MySQL will try to pick the best index to use regardless of whether the column appears first or second in your WHERE clause.
You can prove this by running an EXPLAIN statement on each one to get more information about how MySQL will execute the query - it should show that the same index is used in both cases.
If you're talking about the order the columns appear in the index - (x,y) vs (y,x), it also doesn't matter in this case since you're selecting using both columns. If you sometimes select on just one of the columns though, that column should appear first in the index so MySQL can use the partial index to help optimize the query when only one value is provided.
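Both points can be checked with EXPLAIN. The table definition below is an assumption made for illustration (the question only says that foo is indexed on (x, y)); the payload column is added so that the index does not cover the whole row, and the plans are only meaningful on a populated table:

```sql
-- Assumed table layout; idx_x_y and payload are hypothetical names.
CREATE TABLE foo (
  x INT NOT NULL,
  y INT NOT NULL,
  payload VARCHAR(100),
  INDEX idx_x_y (x, y)
);

-- Same conditions in either order: both should report the same key.
EXPLAIN SELECT * FROM foo WHERE x = 1 AND y = 2;
EXPLAIN SELECT * FROM foo WHERE y = 2 AND x = 1;

-- Only x: x is the leftmost column of the index, so a lookup is possible.
EXPLAIN SELECT * FROM foo WHERE x = 1;

-- Only y: y is not a leftmost prefix, so expect a scan rather than a lookup.
EXPLAIN SELECT * FROM foo WHERE y = 2;
```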
Index sort order (ASC, DESC) is very useful in PostgreSQL for optimizing some queries, but MySQL simply ignores it, and I don't know which version is going to support it. (Descending indexes were later added in MySQL 8.0.)
I thought this was a Workbench bug, but the Workbench team answered me:
Our manual, http://dev.mysql.com/doc/refman/5.5/en/create-index.html,
says:
"An index_col_name specification can end with ASC or DESC. These
keywords are permitted for future extensions for specifying ascending
or descending index value storage. Currently, they are parsed but
ignored; index values are always stored in ascending order."
So, probably this is the reason for what you see in Workbench: it
allows to add that DESC option, but server itself ignores it.
Earlier comments can be viewed at http://bugs.mysql.com/65893

Order of condition execution in MySQL

Suppose I have a MySQL query with two conditions:
SELECT * FROM `table` WHERE `field_1` = 1 AND `field_2` LIKE '%term%';
The first condition is obviously going to be a lot cheaper than the second, so I'd like to be sure that it runs first, limiting the pool of rows which will be compared with the LIKE clause. Do MySQL query conditions run in the order they're listed or, if not, is there a way to specify order?
The optimiser will evaluate the WHERE conditions in the order it sees fit.
SQL is declarative: you tell the optimiser what you want, not how to do it.
In a procedural/imperative language (.NET, Java, PHP, etc.) you say how, and you would choose which condition is evaluated first.
Note: "left to right" does apply in certain expressions like (a+b)*c as you'd expect
MySQL has an internal query optimizer that takes care of such things in most cases. So, typically, you don't need to worry about it.
But, of course, the query optimizer is not foolproof. So...
Sorry to do this to you, but you'll want to get familiar with EXPLAIN if you suspect that a query may be running less efficiently than it should.
http://dev.mysql.com/doc/refman/5.0/en/explain.html
If you have doubts about which index MySQL uses, you can suggest the index that should be used with an index hint.
http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
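A short sketch of both steps for the query in the question. The index name idx_field_1 is hypothetical and assumes an index on field_1 already exists:

```sql
-- See which index (if any) the optimizer picks on its own.
EXPLAIN
SELECT * FROM `table`
WHERE `field_1` = 1 AND `field_2` LIKE '%term%';

-- Suggest an index to the optimizer; it may still decide on a table scan.
SELECT * FROM `table` USE INDEX (idx_field_1)
WHERE `field_1` = 1 AND `field_2` LIKE '%term%';
```

USE INDEX is only a suggestion. FORCE INDEX is the stronger form and makes a table scan a last resort.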

mysql where condition

I'm interested in where condition; if I write:
Select * from table_name
where insert_date > '2010-01-03'
and text like '%friend%';
is it different from:
Select * from table_name
where text like '%friend%'
and insert_date > '2010-01-03';
I mean, if the table is very big and has a lot of rows, then taking the records that satisfy " where insert_date > '2010-01-03' " first and searching only those records for the word "friend" can be much faster than first searching for "friend" rows and then looking at the date field.
Is it important to write the where condition smartly, or does MySQL analyze the condition and rewrite it in the best way?
thanks
No, the two where clauses should be equivalent. The optimizer should pick the same index whichever you use.
The order of columns in an index does matter though.
If you think the optimizer is using the wrong index, you could give it a hint. More often than not though, there's a good reason for using the index it has chosen to use, so unless you know exactly what you are doing, giving the optimizer hints will often make things worse not better.
I don't know about MySQL in particular, but typically this kind of optimization is left to the database engine, as which order is faster depends on indexes, cardinality of data, and quantity of data among other things.
I think it's true that both where clauses are equivalent at the database abstraction level.
By definition, a logical conjunction (the AND operator) is commutative. This means that WHERE A AND B is equal to WHERE B AND A.
It makes no difference in which order you write your conditions.
However, what makes a difference is what indexes you have in place on your table. The query analyzer takes these into account. It is also smart enough to find the part of the condition that is easiest to check and apply that one first.
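To see this for the two queries in the question, one can add an index on the date column and compare the plans. The index name idx_insert_date is hypothetical, and whether the optimizer actually uses it depends on how many rows fall inside the date range:

```sql
-- Hypothetical index on the cheap-to-check column.
CREATE INDEX idx_insert_date ON table_name (insert_date);

-- Both spellings of the WHERE clause should produce the same plan.
EXPLAIN
SELECT * FROM table_name
WHERE insert_date > '2010-01-03' AND text LIKE '%friend%';

EXPLAIN
SELECT * FROM table_name
WHERE text LIKE '%friend%' AND insert_date > '2010-01-03';
```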