I am dealing with MySQL tables that are essentially results of raytracing simulations on a simulated office room with a single venetian blind. I usually need to retrieve the simulation's result for a unique combination of time and blind's settings. So I end up doing a lot of
SELECT result FROM results WHERE timestamp='2005-05-05 12:30:25'
AND opening=40 AND slatangle=60
This looks suspiciously optimizable, since this query should never ever return more than one row. Does it make sense to define an index on the three columns that uniquely identify each row? Or are there other techniques I can use?
The answer is most definitely yes. If you define a unique index on timestamp, opening and slatangle, MySQL should be able to find your row with very few disk seeks.
You might experiment with creating an index on timestamp, opening, slatangle and result. MySQL may then be able to fetch your data from the index alone, without touching the data file at all.
The MySQL Manual has a section about optimizing queries.
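As a rough sketch of the two suggestions above, assuming the table and column names from the question (the index names uq_time_blind and idx_time_blind_result are made up for illustration, and the second index assumes result is a type that can be indexed in full):

```sql
-- Option 1: a unique index on the three columns that identify a row.
ALTER TABLE results
  ADD UNIQUE INDEX uq_time_blind (`timestamp`, opening, slatangle);

-- Option 2 (try it instead of option 1, not in addition): also include
-- the selected column so the query can be answered from the index alone.
-- Assumption: result is a numeric or short string column, not TEXT/BLOB.
ALTER TABLE results
  ADD INDEX idx_time_blind_result (`timestamp`, opening, slatangle, result);

-- Check which index the optimizer picks. "Using index" in the Extra
-- column means the row was read from the index without a table lookup.
EXPLAIN
SELECT result
FROM results
WHERE `timestamp` = '2005-05-05 12:30:25'
  AND opening = 40
  AND slatangle = 60;
```

Whether option 2 is worth the extra index size depends on the data, so it is worth comparing both with EXPLAIN on a realistic copy of the table.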
I would suggest adding
LIMIT 1;
to the end of the query.
William
I wouldn't suggest adding three separate indexes. A single index covering all three columns will likely be better, and making that combination the primary key would be best, but only if you're sure the combination is unique.
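One way to check the uniqueness assumption before committing to it is sketched below. It assumes the table from the question and that the table does not already have a primary key:

```sql
-- Any row returned here is a combination that occurs more than once,
-- in which case a primary key or unique index on it would be rejected.
SELECT `timestamp`, opening, slatangle, COUNT(*) AS n
FROM results
GROUP BY `timestamp`, opening, slatangle
HAVING COUNT(*) > 1;

-- If the query above returns nothing, the combination can become the key.
-- Assumption: results has no existing PRIMARY KEY.
ALTER TABLE results
  ADD PRIMARY KEY (`timestamp`, opening, slatangle);
```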
Yes, creating an index on multiple columns helps. You should also test the performance of different column orders, since an index on (c1, c2, c3) does not behave the same as one on (c2, c1, c3).
Have a look
http://joekuan.wordpress.com/2009/01/23/mysql-optimize-your-query-to-be-more-scalable-part-12/
http://joekuan.wordpress.com/2009/01/23/mysql-optimize-your-query-to-be-more-scalable-part-22/
Related
Suppose my table has columns A, B, C, D, E, F, G and H. I created a composite index on ABCDE because those columns appear in a WHERE clause. Another query filters on ABCEF, so I created a second composite index on ABCEF in the same table, and a third query needs ABCGH, so I added yet another composite index for that.
So my question is: can we create as many composite indexes as our requirements call for? The columns in our WHERE clauses sometimes change and we need to keep the queries fast, so is there another way to optimise the queries, or is it fine to keep adding composite indexes as needed?
I have tried multiple composite indexes and they have not caused any problems so far, but I would like to know whether they will cause problems in the future, and whether it is OK to use only composite indexes and no single-column indexes.
Your answers will help me a lot.
Thanks in advance.
You can have as many as you want. However, each additional index has a cost when updating, inserting or deleting. The trick is to find common segments and make indexes for those.
Or create them as required when queries are too slow.
As an example, if you are "needing" indexes for ABCDE, ABDEF, and ABGIH, then create an index on just AB.
InnoDB supports up to 64 indexes per table (cf. https://dev.mysql.com/doc/refman/8.0/en/innodb-limits.html).
If you try to create a composite index for every permutation of N columns, you would need N-factorial indexes. So for 8 columns, you would need 40,320 indexes. Clearly this is more than InnoDB supports.
You probably don't need that many indexes. In practice, I've rarely seen more than 6 indexes in a given table. All queries that are needed are optimized by one of those.
I understand you said sometimes you change the terms in your query's WHERE clause, so it might need a composite index with different columns.
You can rely on indexes that have a subset of all the columns that would be optimal. That won't be 100% optimized, but it will be better than no index.
You can't predict the optimal set of indexes for a given query until you write that query.
There is a limit of 64 secondary indexes (at least in InnoDB).
Order the columns in each index so that columns being tested with = come first. (The order of those = columns among themselves does not matter.)
The leftmost columns in an index are the most important.
There is little or no use in including more than one column that will be searched by a range.
Study your likely queries, and find the most common combinations of 2 or 3 columns; build indexes starting with those.
In your two examples (ABCDEFGH and ABCEF), ABC would work for both (assuming at least A and B are tested with =). If you do throw on more columns, that one INDEX can still be used for both cases.
Maybe you would want to declare both ABCDEFGH and BCEFA. This handles your ABCDEF case, plus cases that have B, but not A. (Remember 'leftmost'.)
Use the SlowLog to find the slowest queries and make better indexes for them.
More on indexing: Index Cookbook
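The slow log mentioned above can be switched on at runtime. A minimal sketch, assuming an account with the privileges needed for SET GLOBAL and a threshold of one second chosen purely for illustration:

```sql
-- Log every statement that takes longer than long_query_time seconds.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- illustrative threshold, tune to taste

-- Find out where the log file is being written.
SHOW VARIABLES LIKE 'slow_query_log_file';
```

Settings changed with SET GLOBAL are lost on restart unless they are also put in the server configuration file.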
Each index requires space on the disk to be stored, and time to be updated every time you update(/insert/delete) an indexed column value.
So as long as you don't run out of storage, and write operations don't become too slow because too many indexes have to be updated, you can create as many specific indexes as you want.
This depends on your use case and should be measured with production like data.
A common solution would be to create one index specific for your most important query e.g. in your case ABCDE.
Other queries can still use as many columns of that index as match from left to right, up to the first difference, e.g. a query searching for ABCEF could still use ABC of the previously mentioned index.
To also utilise column E, you could add a WHERE condition on D to your query in a way you know matches all values, e.g. D < 100 if you know there are only values 1-99.
Reading this I now understand when to use indexes and when not to use them. But I have a question: would using an index on a column with a limited number of possible values help speed up queries (SELECTs)? Consider the following:
Table "companies": id, district_id, name
Table "districts": id, name
The number of districts would never exceed 5 entries. Should I use an index on companies.district_id then or not? I read somewhere (can't find the link :( ) that it won't help since there are so few distinct values, and that it would actually slow down the query in many cases.
PS: both tables are MyISAM
Almost never is an INDEX on a low-cardinality column used by the optimizer.
On the other hand, a "compound index" may be useful. For example, does INDEX(district_id, name) have any use?
Having INDEX(district_id) will slow down INSERTs because the index is added to whenever a row is inserted. It will not slow down SELECTs, other than the minor amount of time for the Optimizer to notice the index and reject it.
(My statements apply to both MyISAM and InnoDB.)
More discussion of this answer:
MySQL: Building the best INDEX for a given SELECT: Flags and Low Cardinality
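To see how this plays out on the tables from the question, something along these lines could be tried. The index name idx_district_name and the sample district id 3 are made up for illustration:

```sql
-- How selective is district_id? A handful of distinct values spread
-- over many rows is what "low cardinality" means here.
SELECT COUNT(DISTINCT district_id) AS distinct_districts,
       COUNT(*)                    AS total_rows
FROM companies;

-- A compound index that may be useful where INDEX(district_id) alone is not.
ALTER TABLE companies
  ADD INDEX idx_district_name (district_id, name);

-- Check whether the optimizer actually uses it for a typical query.
EXPLAIN
SELECT name
FROM companies
WHERE district_id = 3
ORDER BY name;
```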
In my Java application I have found a small performance issue, which is caused by this simple query:
SELECT DISTINCT a
FROM table
WHERE checked = 0
LIMIT 10000
I have an index on the checked column.
In the beginning, the query is very fast (i.e. where almost all rows have checked = 0). But as I mark more and more rows as checked, the query becomes greatly inefficient (up to several minutes).
How can I improve the performance of this query? Should I add a composite index on
a, checked
or rather
checked, a?
My table has many millions of rows, which is why I do not want to test it by trial and error and hope for a lucky guess.
I would add an index on checked, a. This means that the value you're returning has already been found in the index and there's no need to go back to the table to fetch it. Secondly, if you're doing lots of individual updates of the table, there's a good chance both the table and the index have become fragmented on disk. Rebuilding (compacting) a table and its indexes can significantly increase performance.
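A sketch of both suggestions follows. The question calls the table "table", which is a reserved word in MySQL, so the hypothetical name items is used here, along with a made-up index name:

```sql
-- Composite index: filter column first, returned column second, so the
-- query can be answered from the index without reading the table rows.
ALTER TABLE items
  ADD INDEX idx_checked_a (checked, a);

-- "Using index" in the Extra column indicates the index covers the query.
EXPLAIN
SELECT DISTINCT a
FROM items
WHERE checked = 0
LIMIT 10000;

-- Rebuild the table and its indexes to reduce fragmentation.
-- On a table with many millions of rows this can take a long time.
OPTIMIZE TABLE items;
```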
You can also use the query rewritten as (just in case the optimizer does not understand that it's equivalent):
SELECT a
FROM table
WHERE checked = 0
GROUP BY a
LIMIT 10000
Add an index on the DISTINCT column (a in this case). MySQL is able to use this index for the DISTINCT.
MySQL may also take advantage of a compound index on (a, checked) (the order matters: the DISTINCT column has to be at the start of the index). Try both and compare the results with your data and your queries.
(After adding this index you should see Using index for group-by in the EXPLAIN output.)
See GROUP BY optimization on the manual. (A DISTINCT is very similar to a GROUP BY.)
The most efficient way to process GROUP BY is when an index is used to directly retrieve the grouping columns. With this access method, MySQL uses the property of some index types that the keys are ordered (for example, BTREE). This property enables use of lookup groups in an index without having to consider all keys in the index that satisfy all WHERE conditions.
My table has a lot of millions of rows <...> where almost all rows have
checked=0
In this case it seems that the best index would be a simple (a).
UPDATE:
It was not clear how many rows get checked. From your comment below the question:
At the beginning 0 is in 100% rows, but at the end of the day it will
become 0%
This changes everything. So @Ben has the correct answer.
I have found a completely different solution which would do the trick. I will simply create a new table with all possible unique "a" values. This will allow me to avoid DISTINCT.
You don't state it, but are you maintaining the table and its index statistics regularly? MySQL keeps the index itself in step with the data automatically, but as the underlying data changes, the statistics the optimizer relies on can become stale and the index can fragment, so processing gets worse over time. If you have an index on checked, and checked is being updated over time, running ANALYZE TABLE (and occasionally OPTIMIZE TABLE) on a regular basis is worth trying.
I have a table with 7 columns. There is a primary key on the first column and one other index (a foreign key).
My app does:
SELECT `comment_vote`.`ip`, `comment_vote`.`comment_id`, COUNT(*) AS `nb` FROM `comment_vote`
SELECT `comment_vote`.`type` FROM `comment_vote` WHERE (comment_id = 123) AND (ip = "127.0.0.1")
Is it worth adding an index on the ip column? It is often used in my SELECT queries.
By the way, is there anything I can do to speed up those queries? Sometimes they take a long time and lock the table, preventing other queries from running.
If you are searching by ip quite often then yes, you can create an index. However, your inserts and updates might take a bit longer because of it. I'm not sure how your data is structured, but if the data is collected by ip then maybe you can consider partitioning it by ip.
A good rule of thumb: If a column appears in the WHERE clause, there should be an index for it. If a query is slow, there's a good chance an index could help, particularly one that contains all fields in the WHERE clause.
In MySQL, you can use the EXPLAIN keyword to see an approximate query plan for your query, including indexes used. This should help you find out where your queries spend their time.
Yes, do create an index on ip if you're using it in other queries.
This one uses the columns comment_id and ip, so I'd create an index on the combination. An index on ip alone won't help that query much.
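For the second query in the question, the combined index and the EXPLAIN check could look roughly like this (the index name idx_comment_ip is made up for illustration):

```sql
-- Both columns are tested with =, so either order works for this query.
-- comment_id is put first here on the assumption that other queries
-- filter on comment_id alone and can reuse the leftmost column.
ALTER TABLE comment_vote
  ADD INDEX idx_comment_ip (comment_id, ip);

-- The key column of the output shows which index, if any, was chosen.
EXPLAIN
SELECT `type`
FROM comment_vote
WHERE comment_id = 123
  AND ip = '127.0.0.1';
```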
YES! Almost always add an INDEX or two or three! (multi-column indexes?) to every column.
If it is not in a WHERE clause today, you can bet it will be tomorrow.
Most data is WORM (written once read many times) so making the read most effective is where you will get the most value. And, as many have pointed out, the argument about having to maintain the index during a write is just plain silly.
I've been reading about indexes in MySQL recently, and some of the principles are quite straightforward but one concept is still bugging me: basically, if in a hypothetical table with, let's say, 10 columns, we have two single-column indexes (for column01 and column02 respectively), plus a primary key column (some other column), then are they going to be used in a simple SELECT query like this one or not:
SELECT * FROM table WHERE column01 = 'aaa' AND column02 = 'bbb'
Looking at it, my first instinct is telling me that the first index is going to retrieve a set of rows (or primary keys in InnoDB, if I got the idea right) that satisfy the first condition, and the second index will get another set. And the final result set will be just the intersection of these two. In the books that I've been going through I cannot find anything about this particular scenario. Of course, for this particular query one index on both columns seems like the best option, but I am struggling with understanding the real process behind this whole thing if I try to use two indexes that I described above.
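It will usually use only a single index per table. (MySQL can sometimes combine two single-column indexes through its index merge optimization, but a composite index is generally the better choice.) You need to create a composite index on multiple columns if you want it to be able to use the index for each column you are testing. You may want to read the manual to find out how MySQL uses each type of index, and how to order the columns of your composite indexes correctly to get the best use out of them.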
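EXPLAIN shows what the optimizer actually does with the two single-column indexes. In the sketch below the table is called t, a hypothetical stand-in because "table" is a reserved word, and the index name is made up:

```sql
-- With only the two single-column indexes in place, look at the plan.
-- The key column names the index that was chosen. A type of index_merge
-- with "Using intersect(...)" in Extra would mean both were combined.
EXPLAIN
SELECT *
FROM t
WHERE column01 = 'aaa'
  AND column02 = 'bbb';

-- The composite alternative: one index that matches both conditions.
ALTER TABLE t
  ADD INDEX idx_column01_column02 (column01, column02);
```

Running the same EXPLAIN again after adding the composite index shows whether the optimizer prefers it on your data.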
It's actually the most common question
about indexing at all: is it better to
have one index with all columns or one
individual index for every column?
http://use-the-index-luke.com/sql/where-clause/searching-for-ranges/index-combine-performance