Optimizing MySQL LIKE '%string%' queries in InnoDB

Having this table:
CREATE TABLE `example` (
`id` int(11) unsigned NOT NULL auto_increment,
`keywords` varchar(200) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
We would like to optimize the following query:
SELECT id FROM example WHERE keywords LIKE '%whatever%'
The table is InnoDB (so no FULLTEXT for now). Which would be the best index to use in order to optimize such a query?
We've tried a simple:
ALTER TABLE `example` ADD INDEX `idxSearch` (`keywords`);
But an EXPLAIN shows that it needs to scan the whole table.
If our queries were LIKE 'whatever%' instead, this index would perform well, but otherwise it has no value.
Is there any way to optimize this for InnoDB?
Thanks!

Indexes are built from the start of the string towards the end. When you use a LIKE 'whatever%' type clause, MySQL can use those start-based indexes to look for 'whatever' very quickly.
But switching to LIKE '%whatever%' removes that anchor at the start of the string. Now the start-based indexes can't be used, because your search term is no longer anchored at the start of the string - it's "floating" somewhere in the middle, and the entire field has to be searched. Any LIKE '%...' query can never use an index.
That's why you use fulltext indexes if all you're doing are 'floating' searches, because they're designed for that type of usage.
Of major note: InnoDB supports fulltext indexes as of version 5.6.4. So unless you can't upgrade to at least 5.6.4, there's nothing holding you back from using InnoDB AND fulltext searches.
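For example, a minimal sketch using the table from the question (the index name ftKeywords is hypothetical; requires MySQL 5.6.4+ for InnoDB fulltext):
ALTER TABLE `example` ADD FULLTEXT INDEX `ftKeywords` (`keywords`);
SELECT id FROM example
WHERE MATCH(keywords) AGAINST('whatever' IN NATURAL LANGUAGE MODE);
Note that a fulltext search matches whole words rather than arbitrary substrings, so it is not an exact drop-in replacement for LIKE '%whatever%'.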

I would like to comment that, surprisingly, creating an index also helped speed up LIKE '%abc%' queries in my case.
Running MySQL 5.5.50 on Ubuntu (leaving everything on default), I have created a table with a lot of columns and inserted 100,000 dummy entries. In one column, I inserted completely random strings with 32 characters (i.e. they are all unique).
I ran some queries and then added an index on this column.
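For reference, the index was added roughly like this (a sketch; idx_searchcolumn is a hypothetical name, table_x and searchcolumn are the placeholder names used below):
ALTER TABLE table_x ADD INDEX idx_searchcolumn (searchcolumn);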
A simple
select id, searchcolumn from table_x where searchcolumn like '%ABC%'
returns a result in ~2 seconds without the index and in 0.05 seconds with the index.
This does not fit the explanations above (and in many other posts). What could be the reason for that?
EDIT
I have checked the EXPLAIN output. The output says rows is 100,000, but Extra info is "Using where; Using index". So somehow, the DBMS has to search all rows, but still is able to utilise the index?

MySQL: Slow SELECT because of Index / FKEY?

Dear StackOverflow Members
It's my first post, so please be nice :-)
I have a strange SQL behavior which I can't explain, and I can't find any resources which explain it.
I have built a web honeypot which records all accesses and attacks and displays them on a statistics page.
However, since the data increased, the generation of the statistics page is getting slower and slower.
I narrowed it down to some SELECT statements which take quite a long time.
The "issue" seems to be an index on a specific column.
(For sure the real issue is my lack of knowledge :-))
Database: MySQL
DB schema
Event table (removed unrelated columns):
Event table size: 30MB
Event table records: 335k
CREATE TABLE `event` (
`EventID` int(11) NOT NULL,
`EventTime` datetime NOT NULL DEFAULT current_timestamp(),
`WEBURL` varchar(50) COLLATE utf8_bin DEFAULT NULL,
`IP` varchar(15) COLLATE utf8_bin NOT NULL,
`AttackID` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
ALTER TABLE `event`
ADD PRIMARY KEY (`EventID`),
ADD KEY `AttackID` (`AttackID`);
ALTER TABLE `event`
ADD CONSTRAINT `event_ibfk_1` FOREIGN KEY (`AttackID`) REFERENCES `attack` (`AttackID`);
Attack Table
attack table size: 32KB
attack Table records: 11
CREATE TABLE attack (
`AttackID` int(4) NOT NULL,
`AttackName` varchar(30) COLLATE utf8_bin NOT NULL,
`AttackDescription` varchar(70) COLLATE utf8_bin NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
ALTER TABLE `attack`
ADD PRIMARY KEY (`AttackID`);
SLOW Query:
SELECT Count(EventID), IP
FROM event
WHERE AttackID > 0
GROUP BY IP
ORDER BY Count(EventID) DESC
LIMIT 5;
RESULT: 5 rows in set (1.220 sec)
(This seems quite long to me, for a simple query.)
Now the strange thing:
If I remove the foreign key relationship, the performance of the query is the same.
But if I remove the index on event.AttackID, the same SELECT statement is much faster:
(ALTER TABLE `event` DROP INDEX `AttackID`;)
The result of the SQL SELECT query:
5 rows in set (0.242 sec)
From my understanding, indexes on columns which are used in "WHERE" should improve performance.
Why does removing the index have such an impact on the query?
What can I do to keep the relations between the tables and have a faster SELECT execution?
Cheers
Why does removing the index improve performance?
The query optimizer has multiple ways to resolve a query. For instance, two methods for filtering data are:
Look up the rows that match the where clause in the index and then fetch related data from the data pages.
Scan the table.
This doesn't get into the use of indexes for joins or aggregations or alternative algorithms.
Which is better? Under some circumstances, the first method is horribly slower than the second. This occurs when the data for the table does not fit into memory. Under such circumstances, using the index can mean reading a record from page 124, then from page 1068, then from page 124 again and -- well, all sorts of random intertwined reading of pages. Reading data pages in order is usually faster. And when the data doesn't fit into memory, thrashing occurs, which means that a page in memory is aged out (overwritten) -- and then needed again.
I'm not saying that is occurring in your case. I am simply saying that what optimizers do is not always obvious. The optimizer has to make judgements based on the nature of the data -- and those judgements are not right 100% of the time. They are usually correct. But there are borderline cases. Sometimes, the issue is out-of-date statistics. Sometimes the issue is that what looks best to the optimizer is not best in practice.
Let me emphasize that optimizers usually do a very good job, and a better job than a person would do. Even if they occasionally come up with suboptimal plans, they are still quite useful.
Get rid of your redundant UNIQUE KEYs. A primary key is a unique key.
Use COUNT(*) rather than COUNT(IP) in your query. They mean the same thing because you declared IP to be NOT NULL.
Your query can be much faster if you stop saying WHERE AttackId>0. Because that column is a FK to the PK of your other table, those values should be nonzero anyway. But to get that speedup you'll need an index on event(IP), something like this:
CREATE INDEX IpDex ON event (IP)
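With that index in place, the simplified query (a sketch of the suggested rewrite, with the AttackID filter dropped) can scan the IP index in order:
SELECT COUNT(*), IP
FROM event
GROUP BY IP
ORDER BY COUNT(*) DESC
LIMIT 5;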
But you're still summarizing a large table, and that will always take time.
It looks like you want to display some kind of leaderboard. You could add a top_ips table, and use an EVENT to populate it, using your query, every few minutes. Then you could display it to your users without incurring the cost of the query every time. This of course would display slightly stale data; only you know whether that's acceptable in your app.
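A minimal sketch of that idea; top_ips and refresh_top_ips are hypothetical names, and the event scheduler must be enabled (SET GLOBAL event_scheduler = ON):
CREATE TABLE top_ips (
  IP varchar(15) NOT NULL,
  hits int NOT NULL,
  PRIMARY KEY (IP)
);
DELIMITER //
CREATE EVENT refresh_top_ips
ON SCHEDULE EVERY 5 MINUTE
DO
BEGIN
  -- rebuild the leaderboard from scratch so stale entries disappear
  DELETE FROM top_ips;
  INSERT INTO top_ips (IP, hits)
    SELECT IP, COUNT(*)
    FROM event
    GROUP BY IP
    ORDER BY COUNT(*) DESC
    LIMIT 5;
END //
DELIMITER ;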
Pro Tip. Read https://use-the-index-luke.com by Markus Winand.
Essentially every part of your query, except for the FKey, conspires to make the query slow.
Your query is equivalent to
SELECT Count(*), IP
FROM event
WHERE AttackID >0
GROUP BY IP
ORDER BY Count(*) DESC
LIMIT 5;
Please use COUNT(*) unless you need to avoid NULL.
If AttackID is rarely >0, the optimal index is probably
ALTER TABLE event ADD INDEX(AttackID,  -- for filtering
                            IP);       -- for covering
Else, the optimal index is probably
ALTER TABLE event ADD INDEX(IP,        -- to avoid sorting
                            AttackID); -- for covering
You could simply add both indexes and let the Optimizer decide. Meanwhile, get rid of these, if they exist:
ALTER TABLE event DROP INDEX AttackID;
ALTER TABLE event DROP INDEX IP;
because any uses of them are handled by the new indexes.
Furthermore, leaving the 1-column indexes around can confuse the Optimizer into using them instead of the covering index. (This seems to be a design flaw in at least some versions of MySQL/MariaDB.)
"Covering" means that the query can be performed entirely in the index's BTree. EXPLAIN will indicate it with "Using index". A "covering" index speeds up a query by 2x -- but there is a very wide variation on this prediction. ("Using index condition" is something different.)
More on index creation: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

MySQL Index sometimes not being used

I have a table with 150k rows of data and a column with a UNIQUE INDEX; it has a type of VARCHAR(10) and stores 10-digit account numbers.
Now whenever I query, like a simple one:
SELECT * FROM table WHERE account_number LIKE '0103%'
It returns 30,000+ rows, and when I run an EXPLAIN on my query it shows no index is used.
But when I do:
SELECT * FROM table WHERE account_number LIKE '0104%'
It returns 4,000+ rows, with the index used.
Can anyone explain this?
I'm using MySQL 5.7 Percona XtraDB.
30k+/150k is more than 20%, so I guess it is faster to do a table scan. From 8.2.1.19 Avoiding Full Table Scans:
The output from EXPLAIN shows ALL in the type column when MySQL uses a full table scan to resolve a query. This usually happens under the following conditions:
You are using a key with low cardinality (many rows match the key value) through another column. In this case, MySQL assumes that by using the key it probably will do many key lookups and that a table scan would be faster.
If you don't need all values try to use:
SELECT account_number FROM table WHERE account_number LIKE '0103%'
instead of SELECT *. Then your index will become a covering index and the optimizer should always use it (as long as the WHERE condition is sargable).
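One way to check (a sketch, reusing the names from the question):
EXPLAIN SELECT account_number FROM `table`
WHERE account_number LIKE '0103%';
The Extra column should then show Using where; Using index, indicating the covering index is scanned instead of the table.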
Most databases use a B-tree for indexing. In this case the database optimizer doesn't use the index because it's faster to scan without it, as lad2025 explained.
Your column is unique, so I think the cardinality of your index is high. But since your query uses a LIKE filter, the optimizer decides not to use the index.
You can try FORCE INDEX to see the result. You're using VARCHAR with a unique index; I would choose another data type or change your index type. If your table only contains numbers, change it to a numeric type. This will help to optimize your query a lot.
In some cases where you have to use LIKE, you can use a full-text index.
If you need help optimizing your query and table, provide us more info on which info you want to fetch from your table.
lad2025 is correct. The database is attempting to make an intelligent optimization.
Benchmark with:
SELECT * FROM table FORCE INDEX(table_index) WHERE account_number LIKE '0103%'
and see who is smarter :-) You can always try your hand at questioning the optimizer. That's what index hints are for...
https://dev.mysql.com/doc/refman/5.7/en/index-hints.html

Cannot achieve a cover index with this table (2 equalities and one selection)?

CREATE TABLE `discount_base` (
`id` varchar(12) COLLATE utf8_unicode_ci NOT NULL,
`amount` decimal(13,4) NOT NULL,
`description` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`family` varchar(4) COLLATE utf8_unicode_ci NOT NULL,
`customer_id` varchar(8) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `IDX_CUSTOMER` (`customer_id`),
KEY `IDX_FAMILY_CUSTOMER_AMOUNT` (`family`,`customer_id`,`amount`),
CONSTRAINT `FK_CUSTOMER` FOREIGN KEY (`customer_id`)
REFERENCES `customer` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
I've added a covering index IDX_FAMILY_CUSTOMER_AMOUNT on family, customer_id and amount because most of the time I use the following query:
SELECT amount FROM discount_base WHERE family = :family AND customer_id = :customer_id
However, using EXPLAIN with a bunch of records (~250,000), it says:
'1', 'SIMPLE', 'discount_base', 'ref', 'IDX_CUSTOMER,IDX_FAMILY_CUSTOMER_AMOUNT', 'IDX_FAMILY_CUSTOMER_AMOUNT', '40', 'const,const', '1', 'Using where; Using index'
Why am I getting Using where; Using index instead of just Using index?
EDIT: Fiddle with a small amount of data (Using where; Using index):
EXPLAIN SELECT amount
FROM discount_base
WHERE family = '0603' and customer_id = '20000275';
Another fiddle where id is family + customer_id (const):
EXPLAIN SELECT amount
FROM discount_base
WHERE `id` = '060320000275';
Interesting problem. It would seem "obvious" that the IDX_FAMILY_CUSTOMER_AMOUNT index would be used for this query:
SELECT amount
FROM discount_base
WHERE family = :family AND customer_id = :customer_id;
"Obvious" to us people, but clearly not to the optimizer. What is happening?
This aspect of index usage is poorly documented. I (intelligently) speculate that when doing comparisons on strings using case-insensitive collations (and perhaps others), the = operation is really more like an IN. Something sort of like this, conceptually:
WHERE family IN (lower(:family), upper(:family), . . .) and . . .
This is conceptual. But it means that an index scan is required for the = rather than an index lookup. A minor change typographically; very important semantically. It prevents the use of the second key. Yup, that is an unfortunate consequence of inequalities, even when they look like =.
So, the optimizer compares the two possible indexes, and it decides that customer_id is more selective than family, and chooses the former.
Alas, both of your keys are case-insensitive strings. My suggestion would be to replace at least one of them with an auto-incrementing integer id. In fact, my suggestion is that basically all tables have an auto-incrementing integer id, which is then used for all foreign key references.
Another solution would be to use a trigger to create a single column CustomerFamily with the values concatenated together. Then this index:
KEY IDX_CUSTOMERFAMILY_AMOUNT (CustomerFamily, amount)
should do what you want. It is also possible that a case-sensitive collation would solve the problem.
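A minimal sketch of the trigger idea; the column, index, and trigger names are hypothetical (on MySQL 5.7+ a generated column would be simpler):
ALTER TABLE discount_base
  ADD COLUMN CustomerFamily varchar(12) COLLATE utf8_unicode_ci DEFAULT NULL,
  ADD KEY IDX_CUSTOMERFAMILY_AMOUNT (CustomerFamily, amount);
DELIMITER //
CREATE TRIGGER discount_base_bi BEFORE INSERT ON discount_base
FOR EACH ROW
BEGIN
  -- keep the concatenated lookup column in sync on insert;
  -- a matching BEFORE UPDATE trigger would be needed as well
  SET NEW.CustomerFamily = CONCAT(NEW.customer_id, NEW.family);
END //
DELIMITER ;
The query would then filter on WHERE CustomerFamily = CONCAT(:customer_id, :family).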
Are family and customer_id strings? You could be passing customer_id as an integer, which could cause a type conversion to take place, so the index would not be used for that particular column.
Ensure you pass customer_id as a string, or consider changing your table to store customer_id as INT.
If you are using alphanumeric IDs, then this doesn't apply.
I'm pretty sure Using index is the important part, and it means "using a covering index".
Two things to further check:
EXPLAIN FORMAT=JSON SELECT ...
may give further clues.
FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';
will show you how many rows were read/written/etc. in various ways. If some number is about 250000 (in your case), it indicates a table scan. If all the numbers are small (approximately the number of rows returned by the query), then you can be assured that the query was executed efficiently.
The numbers there do not distinguish between reads of an index versus reads of data, but they do ignore caching: timings for two identical runs can differ significantly due to caching, while the Handler% values won't change.
The answer to your question relies on what the engine is actually using your index for.
In given query, you ask the engine to:
Lookup for values (WHERE/JOIN)
Retrieve information (SELECT) based on this lookup result
For the first part, as soon as you filter the results (lookup), there's an entry in Extra indicating USING WHERE, so this is the reason you see it in your explain plan.
For the second part, the engine does not need to go anywhere outside of one given index because it is a covering index. The explain plan notes this by showing USING INDEX. This USING INDEX, combined with USING WHERE, means your index is also used in the lookup portion of the query, as explained in the MySQL documentation:
https://dev.mysql.com/doc/refman/5.0/en/explain-output.html
Using index
The column information is retrieved from the table using only
information in the index tree without having to do an additional seek
to read the actual row. This strategy can be used when the query uses
only columns that are part of a single index.
If the Extra column also says Using where, it means the index is being
used to perform lookups of key values. Without Using where, the
optimizer may be reading the index to avoid reading data rows but not
using it for lookups. For example, if the index is a covering index
for the query, the optimizer may scan it without using it for lookups.
Check this fiddle:
http://sqlfiddle.com/#!9/8cdf2/10
I removed the where clause and the query now displays USING INDEX only. This is because no lookup is necessary in your table.
The MySQL documentation on EXPLAIN has this to say:
Using index
The column information is retrieved from the table using
only information in the index tree without having to do an additional
seek to read the actual row. This strategy can be used when the query
uses only columns that are part of a single index.
If the Extra column
also says Using where, it means the index is being used to perform
lookups of key values. Without Using where, the optimizer may be
reading the index to avoid reading data rows but not using it for
lookups. For example, if the index is a covering index for the query,
the optimizer may scan it without using it for lookups.
My best guess, based on the information you have provided, is that the optimizer first uses your IDX_CUSTOMER index and then performs a key lookup to retrieve non-key data (amount and family) from the actual data page based on the key (customer_id).
This is most likely caused by the cardinality (i.e. uniqueness) of the columns in your indexes. You should check the cardinality of the columns used in your WHERE clause and put the one with the highest cardinality first in your index. Guessing from the column names and your current results, customer_id has the highest cardinality.
So change this:
KEY `IDX_FAMILY_CUSTOMER_AMOUNT` (`family`,`customer_id`,`amount`)
to this:
KEY `IDX_FAMILY_CUSTOMER_AMOUNT` (`customer_id`,`family`,`amount`)
After making the change, you should run ANALYZE TABLE. This will update the table statistics, which can affect the choices the optimizer makes regarding your indexes.
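For example:
ANALYZE TABLE discount_base;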
This sounds fine. According to the MySQL documentation:
If the Extra column also says Using where, it means the index is being
used to perform lookups of key values. Without Using where, the
optimizer may be reading the index to avoid reading data rows but not
using it for lookups. For example, if the index is a covering index
for the query, the optimizer may scan it without using it for lookups.
That means Using index alone would mean reading the entire index to retrieve the results, without using the index structure to find specific values. You could probably get this with SELECT family, customer_id, amount FROM discount_base. Using where; Using index means the optimizer exploits the index to find and retrieve the rows matching the query parameters (family, customer_id).
This indeed could be a problem.
Note that there might be millions of strings matching a single index key when using the utf8_unicode_ci collation. For example, all these letters are matched by the same index key:
A, a, À, Á, Â, Ã, Ä, Å, à, á, â, ã, ä, å, Ā, ā, Ă, ă, Ą, ą, Ǎ, ǎ, Ǟ, ǟ, Ǡ, ǡ, Ǻ, ǻ, Ȁ, ȁ, Ȃ, ȃ, Ȧ, ȧ, Ḁ, ḁ, Ạ, ạ, Ả, ả, Ấ, ấ, Ầ, ầ, Ẩ, ẩ, Ẫ, ẫ, Ậ, ậ, Ắ, ắ, Ằ, ằ, Ẳ, ẳ, Ẵ, ẵ, Ặ, ặ.
And there are substantial grounds for believing that when processing a query using a CHAR/VARCHAR index, MySQL, in addition to the regular index lookup, performs a full linear scan of all the values matched by the index to make sure each is indeed matched by the original query parameter. This may really be needed when the index collation and the WHERE collation do not match, but I don't know why it does so all the time, even when this is clearly not needed (in your case, for example).
See this question for an evidence and additional details: Why performance of MySQL queries are so bad when using a CHAR/VARCHAR index?
I would only recommend this solution:
Remove amount from the index, like:
KEY `IDX_FAMILY_CUSTOMER_AMOUNT` (`family`,`customer_id`)
When you run the query, force the index with USE INDEX:
USE INDEX (`IDX_FAMILY_CUSTOMER_AMOUNT`)
That trick allows you to avoid Using where. Hopefully performance will also be at an acceptable level:
http://sqlfiddle.com/#!9/86f46/2
SELECT amount FROM discount_base
USE INDEX (`IDX_FAMILY_CUSTOMER_AMOUNT`)
WHERE family = '1' AND customer_id = '1'
Based on the fiddle provided, it appears that only numeric values are being used for family and customer id. If this assumption is correct, changing these columns to numeric and using just a single key on customer and family appears to have resolved the issue.
Please check this fiddle

MySQL indexing issue

I am having some difficulties finding an answer to this question...
For simplicity, let's use this situation.
I create a table like this...
CREATE TABLE `test` (
`MerchID` int(10) DEFAULT NULL,
KEY `MerchID` (`MerchID`)
) ENGINE=InnoDB AUTO_INCREMENT=32769 DEFAULT CHARSET=utf8;
I will insert some data into the column of this table...
INSERT INTO test
SELECT 1
UNION
SELECT 2
UNION
SELECT null
Now I examine the query using MYSQL's explain feature...
EXPLAIN
SELECT * FROM test
WHERE merchid IS NOT NULL
Resulting in:
id=1, select_type=SIMPLE, table=test, type=index, possible_keys=MerchID, key=MerchID, key_len=5, ref=NULL, rows=3, Extra=Using where; Using index
In production, in my real procedure, something like this takes a long time with this index. If I redeclare the table with the index line reading "KEY MerchID (MerchID) USING BTREE" I get much better results. The EXPLAIN feature seems to return the same results too. I have read some basics about the BTREE, HASH and RTREE storage types for indexes/keys. When no storage type is specified, I was under the assumption that BTREE would be assumed. However, I am kind of stumped why my procedure seems to fly when I modify my index to use this storage type. Any ideas?
I am using MySQL 5.1 and coding in MySQL Workbench. The part of the procedure that appears to be held up is like the one I illustrated above, where the column of a joined table is tested for NULL.
I think you are on the wrong path. For InnoDB storage the only available index method is BTREE, so you are safe to omit the BTREE keyword from your table create script. Supported index types are listed here along with other useful information.
The performance issue is coming from a different place.
Whenever testing performance, be sure to always use the SQL_NO_CACHE directive, otherwise, with query caching, the second time you run a query, your results may be returned a lot faster simply due to caching.
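For example, when re-running the test query from above:
SELECT SQL_NO_CACHE * FROM test WHERE MerchID IS NOT NULL;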
With a covering index (all of the selected and filtered columns are in the index), the query is rather efficient. Using index in the EXPLAIN result shows that it's being used as a covering index.
However, if the index were not a covering index, MySQL would have to perform a seek for each row returned by the index in order to grab the actual table data. While this would still be fast for a small result set, with a result set of 1 million rows, that would be 1 million seeks. If the number of NULL rows were a high percentage, MySQL would abandon the index altogether to avoid the seeks.
Ensure that your real "production" index is a covering index as well.

How to optimize this query in a large database?

Query
SELECT id FROM `user_tmp`
WHERE `code` = '9s5xs1sy'
AND `go` NOT REGEXP 'http://www.xxxx.example.com/aflam/|http://xx.example.com|http://www.xxxxx..example.com/aflam/|http://www.xxxxxx.example.com/v/|http://www.xxxxxx.example.com/vb/'
AND check='done'
AND `dataip` <1319992460
ORDER BY id DESC
LIMIT 50
MySQL returns:
Showing rows 0 - 29 ( 50 total, Query took 21.3102 sec) [id: 2622270 - 2602288]
If I remove
AND dataip <1319992460
MySQL returns
Showing rows 0 - 29 ( 50 total, Query took 0.0859 sec) [id: 3637556 - 3627005]
And if there is no data, MySQL returns:
MySQL returned an empty result set (i.e. zero rows). ( Query took 21.7332 sec )
Explain plan:
SQL query: Explain SELECT * FROM `user_tmp` WHERE `code` = '93mhco3s5y' AND `too` NOT REGEXP 'http://www.10neen.com/aflam/|http://3ltool.com|http://www.10neen.com/aflam/|http://www.10neen.com/v/|http://www.m1-w3d.com/vb/' and checkopen='2010' and `dataip` <1319992460 ORDER BY id DESC LIMIT 50;
Rows: 1
id=1, select_type=SIMPLE, table=user_tmp, type=index, possible_keys=NULL, key=PRIMARY, key_len=4, ref=NULL, rows=50, Extra=Using where
Example of the database used
CREATE TABLE IF NOT EXISTS `user_tmp` (
  `id` int(9) NOT NULL AUTO_INCREMENT,
  `ip` text NOT NULL,
  `dataip` bigint(20) NOT NULL,
  `ref` text NOT NULL,
  `click` int(20) NOT NULL,
  `code` text NOT NULL,
  `too` text NOT NULL,
  `name` text NOT NULL,
  `checkopen` text NOT NULL,
  `contry` text NOT NULL,
  `vOperation` text NOT NULL,
  `vBrowser` text NOT NULL,
  `iconOperation` text NOT NULL,
  `iconBrowser` text NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=4653425;
--
-- Dumping data for table user_tmp
INSERT INTO `user_tmp` (`id`, `ip`, `dataip`, `ref`, `click`, `code`, `too`, `name`, `checkopen`, `contry`, `vOperation`, `vBrowser`, `iconOperation`, `iconBrowser`) VALUES
(1, '54.125.78.84', 1319506641, 'http://xxxx.example.com/vb/showthread.php%D8%AA%D8%AD%D9%85%D9%8A%D9%84-%D8%A7%D8%BA%D9%86%D9%8A%D8%A9-%D8%A7%D9%84%D8%A8%D9%88%D9%85-giovanni-marradi-lovers-rendezvous-3cd-1999-a-155712.html', 0, '4mxxxxx5', 'http://www.xxx.example.com/aflam/', 'xxxxe', '2010', 'US', 'Linux', 'Chrome 12.0.742 ', 'linux.png', 'chrome.png');
I want the correct way to do the query and optimize the database.
You don't have any indexes besides the primary key. You need to make an index on the fields that you use in your WHERE statement. Whether you need to index only one field or a combination of several fields depends on the other SELECTs you will be running against that table.
Keep in mind that REGEXP cannot use indexes at all; LIKE can use an index only when it does not begin with a wildcard (so LIKE 'a%' can use an index, but LIKE '%a' cannot), and greater-than / less-than comparisons (<, >) usually don't use indexes either.
So you are left with the code and check fields. I suppose many rows will have the same value for check, so I would begin the index with the code field. Multi-field indexes can be used only in the order in which they are defined...
Imagine an index created on the fields (code, check), as sketched below. This index can be used in your query (where the WHERE clause contains both fields), and also in a query with only the code field, but not in a query with only the check field.
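A sketch of such an index; since code and checkopen are TEXT columns they require explicit prefix lengths, and the lengths here are assumptions (the question's WHERE uses `check`, but the schema has `checkopen`):
ALTER TABLE user_tmp ADD INDEX idx_code_check (code(10), checkopen(10));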
Is it important to ORDER BY id? If not, leave it out, it will prevent the sort pass and your query will finish faster.
I will assume you are using MySQL <= 5.1.
The answers above fall into two basic categories:
1. You are using the wrong column type
2. You need indexes
I will deal with each as both are relevant for performance which is ultimately what I take your questions to be about:
Column Types
The difference between bigint/int or int/char for the dataip question is basically not relevant to your issue. The fundamental issue has more to do with index strategy. However, when considering performance holistically, the fact that you are using MyISAM as your engine for this table leads me to ask if you really need "text" column types. If you have short (less than 255, say) character columns, then making them fixed-length columns will most likely increase performance. Keep in mind that if any one column is of variable length (varchar, text, etc.) then this is not worth changing any of them.
Vertical Partitioning
The fact to keep in mind here is that even though you are only requesting the id column, from the standpoint of disk IO and memory you are getting the entire row back. Since so many of the columns are text, this could mean a massive amount of data. Any of these columns that are not used for lookups of users or are not often accessed could be moved into another table where the foreign key has a unique key placed on it, keeping the relationship 1:1, as sketched below.
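A sketch of the suggested 1:1 split; exactly which columns to move out is an assumption:
CREATE TABLE `user_tmp_details` (
  `id` int(9) NOT NULL,
  `ref` text NOT NULL,
  `name` text NOT NULL,
  `vOperation` text NOT NULL,
  `vBrowser` text NOT NULL,
  `iconOperation` text NOT NULL,
  `iconBrowser` text NOT NULL,
  PRIMARY KEY (`id`)  -- same id as user_tmp keeps the relationship 1:1
) ENGINE=MyISAM DEFAULT CHARSET=utf8;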
Index Strategy
Most likely the problem is simply indexing, as is noted above. The reason the "AND dataip <1319992460" condition causes your current situation is that it forces a full table scan.
As stated above, placing all the columns of the WHERE clause in a single composite index will help. The order of the columns in the index will not matter so long as all of them appear in the WHERE clause.
However, the order could matter a great deal for other queries. A quick example would be an index made of (colA, colB). A query with "where colA = 'foo'" will use this index. But a query with "where colB = 'bar'" will not because colB is not the left most column in the index definition. So, if you have other queries that use these columns in some combination it is worth minimizing the number of indexes created on the table. This is b/c every index increases the cost of a write and uses disk space. Writes are expensive b/c of necessary disk activity. Don't make them more expensive.
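A hypothetical illustration of that left-most prefix rule (table and column names are made up):
CREATE TABLE t (
  colA varchar(10),
  colB varchar(10),
  INDEX idx_ab (colA, colB)
);
EXPLAIN SELECT * FROM t WHERE colA = 'foo';  -- idx_ab is usable
EXPLAIN SELECT * FROM t WHERE colB = 'bar';  -- idx_ab is not usable: colB is not left-most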
You need to add index like this:
ALTER TABLE `user_tmp` ADD INDEX(`dataip`);
And if your column 'dataip' contains only unique values you can add unique key like this:
ALTER TABLE `user_tmp` ADD UNIQUE(`dataip`);
Keep in mind that adding an index can take a long time on a big table, so don't do it on a production server without testing.
You need to create the index on the fields in the same order that they are used in the WHERE clause; otherwise the index will not be used. Index the fields of your WHERE clause.
Does dataip really need to be a BIGINT? According to MySQL, the signed range is -9223372036854775808 to 9223372036854775807 (it is a 64-bit number).
You need to choose the right column type for the job, and add the right type of index too. Else these queries will take forever.