MySQL Multiple Composite Index with ranges

I have some doubts about MySQL multiple-column (composite) indexes with range conditions.
For example, if I have the following index:
Multiple column index in columns (A, B, C)
And the following query
WHERE A=2 AND B>5 AND C=3
Question:
- Will the index use columns (A, B, C) or only (A, B)?
And what about this one:
WHERE A=2 AND B IN (1,2,3) AND C=4
Thanks!

Question: - The index will use the columns (A,B,C) or only (A, B)
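Most likely (A, B). A composite index is ordered by its columns from left to right, so once a column is matched with a range condition (here B>5), the columns after it can no longer narrow the index range; C=3 is then checked as a filter on the rows in that range. Depending on table statistics, a full table scan may still be faster.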
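One way to check this on your own data is to look at the execution plan. The following is a minimal sketch; the table name t_range, the index name idx_abc, and the column types are assumptions made for illustration, not part of the question.

```sql
-- Hypothetical table; names and types are assumptions for illustration.
CREATE TABLE t_range (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  A INT NOT NULL,
  B INT NOT NULL,
  C INT NOT NULL,
  payload VARCHAR(100),
  INDEX idx_abc (A, B, C)
);

-- Inspect the key and key_len columns of the plan.
-- With NOT NULL INT columns each key part is 4 bytes, so a key_len of 8
-- would suggest that only A and B bound the range scan.
EXPLAIN SELECT * FROM t_range WHERE A = 2 AND B > 5 AND C = 3;
```

On an empty or very small table the optimizer may pick a different plan, so it is worth running this against realistic data.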
WHERE A=2 AND B IN (1,2,3) AND C=4
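Maybe:
only (A), or
(A, B) three times, once per value in the IN list (MySQL's range optimizer treats IN as a set of equalities, so it may be able to use C as well), or
none of them, by doing a full table scan.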
Statistics will decidedly make a big difference. In essence, is A very selective or not? If not, then a full table scan may be faster than using the index.
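To see which of these plans the optimizer actually picks, you can compare the plan with and without the index. This is only a sketch; the table name t_in and the index name idx_abc are hypothetical.

```sql
-- Hypothetical table; names and types are assumptions for illustration.
CREATE TABLE t_in (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  A INT NOT NULL,
  B INT NOT NULL,
  C INT NOT NULL,
  INDEX idx_abc (A, B, C)
);

-- Refresh index statistics so the optimizer's estimates are up to date.
ANALYZE TABLE t_in;

-- Plan chosen by the optimizer; check key, key_len and rows.
EXPLAIN SELECT * FROM t_in
WHERE A = 2 AND B IN (1, 2, 3) AND C = 4;

-- Same query with the index hidden from the optimizer, for comparison
-- against a full table scan.
EXPLAIN SELECT * FROM t_in IGNORE INDEX (idx_abc)
WHERE A = 2 AND B IN (1, 2, 3) AND C = 4;
```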

Related

Does it make sense to create a separate index against a column, that is also a part of the composite primary key?

I am using MySQL as my RDBMS.
But I think it must be applicable to other relational DBs.
I have a table Z, where I have 5 columns: a, b, c, d, e.
Columns a, b, c comprise a composite primary key.
Now, when it comes down to querying in the WHERE clause there will be times when I will be fetching data based on the values of columns a, b, c. But only one column out of 3 will be set.
Do I need to create 3 indices against these columns?
Follow-up question: what if I need to query my table knowing values for 2 columns out of 3? Will the creation of an additional 3 indices help to speed up my queries? (a, b) (a, c) (b, c)
Please advise.
...will be fetching data based on the values of columns a, b, c. But only one column out of 3 will be set.
If that's the case you'll need three indexes:
If a is set your primary key index (a, b, c) will suffice. You don't need to create an extra index for this case.
If b is set you'll need the index (b) for this query to be fast.
If c is set you'll need the index (c) for this query to be fast.
The index (a, b, c) is not useful when the query does not specify a value for a, because a is the leading column of the index.
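As a rough sketch of that setup (the column types and the index names idx_b and idx_c are assumptions, since the question does not give them):

```sql
-- Table Z from the question; column types are assumed for illustration.
CREATE TABLE Z (
  a INT NOT NULL,
  b INT NOT NULL,
  c INT NOT NULL,
  d INT,
  e INT,
  PRIMARY KEY (a, b, c),  -- serves lookups by a
  INDEX idx_b (b),        -- serves lookups by b alone
  INDEX idx_c (c)         -- serves lookups by c alone
);

-- Check the key column of each plan to see which index is considered.
EXPLAIN SELECT * FROM Z WHERE a = 1;
EXPLAIN SELECT * FROM Z WHERE b = 1;
EXPLAIN SELECT * FROM Z WHERE c = 1;
```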
Short answer: yes.
INDEX (a, b, c)
-- creates 1 index of unique combinations of a&b&c, not unlike CONCAT(a, b,c)
INDEX (a),
INDEX (b),
INDEX (c)
-- creates 3 indexes of unique values for all a, b, c separately
INDEX (a, b),
INDEX (c)
-- creates 2 indexes:
-- 1st for a&b unique values
-- 2nd for c unique values
Follow-up: with WHERE a = '...' AND b = '...', searching through INDEX(a, b) will be faster than searching through INDEX(a), INDEX(b). However, if the values of a or b are (at least mostly) unique, the performance increase will not be significant.
When debugging index performance, always start with checking your indexes' cardinality, and then check your queries' index usage with EXPLAIN SELECT.
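A minimal sketch of that workflow, using a hypothetical table t_card (the table and index names are assumptions):

```sql
-- Hypothetical table; names and types are assumptions for illustration.
CREATE TABLE t_card (
  a INT NOT NULL,
  b INT NOT NULL,
  c INT NOT NULL,
  INDEX idx_ab (a, b),
  INDEX idx_c (c)
);

-- Refresh statistics, then read the Cardinality column for each index part.
ANALYZE TABLE t_card;
SHOW INDEX FROM t_card;

-- Then check which index a given query would use.
EXPLAIN SELECT * FROM t_card WHERE a = 1 AND b = 2;
```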

MySQL how are indexes used in this example?

This is probably in the MySQL documentation, but I have not been able to find it. So I know that if I'm selecting a record from a database, the fastest results are when the fields I'm selecting and the fields in the WHERE clause are parts of an index. Say that I have a statement like this:
SELECT a FROM t1 WHERE b=X AND c=Y
What key or combination of keys would give me the fastest result?
Option 1: one key that's (a, b, c).
Option 2: one key that's (b, c) because those are in the where statement.
Option 3: one key that's (b, c, a) because b and c are in the where statement, and a is the value that ultimately needs to be looked up. (Seems logical to me, but I have no idea if this makes any MySQL sense...)
Option 4: two keys, one that's (b, c) and one that is just (a).
Sorry, I'm really a MySQL newbie...
In your case a composite index on (b,c) should do the job. You do not need an index on a since it is not in your WHERE clause. Its presence in the SELECT list doesn't affect how the rest of the query has to be indexed.
You could also use (b,c,a) in that order, since MySQL uses the columns of a composite index from left to right. That isn't necessary for this use case, but it could future-proof your code if you ever needed to filter on all three columns:
WHERE b='X' AND c='Y' AND a='Z'
Indexing (a,b,c) would not work for the original query for the same reason: b and c do not form a leftmost prefix of that index.
From the MySQL docs on index usage:
If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3).
As always, when in doubt, check the query's execution plan after creating your index to verify that it can be used as intended.
EXPLAIN SELECT a FROM t1 WHERE b='X' AND c='Y'
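For a fuller picture, here is a sketch that creates both candidate indexes and compares the plans. The column types and the index names idx_bc and idx_bca are assumptions; only t1 and its columns a, b, c come from the question.

```sql
-- Table t1 from the question; column types are assumed for illustration.
CREATE TABLE t1 (
  a VARCHAR(20),
  b VARCHAR(20),
  c VARCHAR(20),
  INDEX idx_bc (b, c),
  INDEX idx_bca (b, c, a)
);

-- Let the optimizer choose between the two indexes.
EXPLAIN SELECT a FROM t1 WHERE b = 'X' AND c = 'Y';

-- Force each index in turn to compare. With idx_bca, "Using index" in the
-- Extra column would indicate that the query is answered from the index alone.
EXPLAIN SELECT a FROM t1 FORCE INDEX (idx_bc)  WHERE b = 'X' AND c = 'Y';
EXPLAIN SELECT a FROM t1 FORCE INDEX (idx_bca) WHERE b = 'X' AND c = 'Y';
```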

Multiple Column Index vs Multiple Indexes/Index Merge

Let's assume we have a table with 4 columns: A, B, C, and D
Let's assume we have a few queries that will join or perform a clause against these columns:
Q1: WHERE A = ?
Q2: WHERE A = ? AND B = ?
Q3: WHERE A = ? AND B = ? AND C = ?
Since we know we will use these columns in three different contexts, is it best to create three single-column indexes? Or three multiple-column indexes?
Index Merge:
Idx1: Create index A_idx ON table (A)
Idx2: Create index B_idx ON table (B)
Idx3: Create index C_idx ON table (C)
Multiple-column indexes:
Idx1: Create index A_idx ON table(A)
Idx2: Create index AB_idx ON table(A,B)
Idx3: Create index ABC_idx ON table(A,B,C)
This is a simplified case. Let's assume we have 10-15 columns, that will be joined or where'd in different ways and combinations. Is it best to create multiple column indexes for these combinations they will receive? Or just find the smallest set of multiple columns that are most frequently used, build a multiple column index on those, and then create individual indexes for the rest?
A composite index on (A,B,C) will cover all 3 queries, so you don't need the indexes on (A) and on (A,B). It's also faster than index_merge.
The only reason to have more than one index is if some queries won't be covered by the index (for example, they filter on B and C but not A).
Also keep in mind that one of the most important characteristics of a column, when deciding whether it should be included in the index, is not whether it's used in a query, but its cardinality. If a condition on this column won't exclude a lot of the rows, you should not include it in the index.
Let's say you have A,B,C.
For a given value of A you have 20% of the rows. From those rows, for a given value of B you have 1% of the rows. Let's say the conditions on (A,B) narrow the table down to 1000 rows. After applying the condition on C, you are left with 850 rows. Indexing C is not effective here, and (A,B) is the best index for this query.
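As a sketch of the single-index approach for Q1 to Q3 (the table name t_q and the column types are hypothetical; the index name ABC_idx follows the question):

```sql
-- Hypothetical table; name and types are assumptions for illustration.
CREATE TABLE t_q (
  A INT NOT NULL,
  B INT NOT NULL,
  C INT NOT NULL,
  D INT
);

-- One composite index instead of separate (A), (A,B) and (A,B,C) indexes.
CREATE INDEX ABC_idx ON t_q (A, B, C);

-- Each query filters on a leftmost prefix of the index; compare key_len
-- across the three plans to see how many key parts are used.
EXPLAIN SELECT * FROM t_q WHERE A = 1;
EXPLAIN SELECT * FROM t_q WHERE A = 1 AND B = 2;
EXPLAIN SELECT * FROM t_q WHERE A = 1 AND B = 2 AND C = 3;
```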

Efficiency of multicolumn indexes in MySQL

If I have a MyISAM table with a 3-column index, something like
create table t (
a int,
b int,
c int,
index abc (a, b, c)
) engine=MyISAM;
the question is, can the following query fully utilize the index:
select * from t where a=1 and c=2;
in other words, considering that an index is a b-tree, can MySQL skip the column in the middle and still do a quick search on first and last columns?
EXPLAIN does seem to be showing that the index will be used, however, the Extra says: Using where; Using index and I have no idea what this really means.
The answer is "no".
The MySQL documentation is quite clear on how indexes are used:
If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3). (http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html.)
What happens is that the index gets used for "a=1" only. All entries that match are read and then checked to see if "c=2" is true. The query ends up using a combination of index lookup and explicit record filtering.
By the way, if you want to handle all combinations of two columns, you need several indexes, so that every pair appears as a leftmost prefix:
(a, b, c)
(b, c, a)
(c, a, b)
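For the specific query in the question (a=1 and c=2), one option is an index whose leading columns are a and c. The sketch below reuses the table definition from the question; the extra index name ac is an assumption.

```sql
-- Table from the question, plus a hypothetical index on (a, c).
CREATE TABLE t (
  a INT,
  b INT,
  c INT,
  INDEX abc (a, b, c),
  INDEX ac (a, c)
) ENGINE=MyISAM;

-- Check which index is chosen and compare key_len: with ac, both
-- conditions can be used to locate matching index entries.
EXPLAIN SELECT * FROM t WHERE a = 1 AND c = 2;
```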
Even if you are using an index for all parts of a WHERE clause, you may see Using where if the column can be NULL.
This statement from the MySQL documentation explains the output: the columns in your table can be NULL, so EXPLAIN reports Using where even though the index covers every column in the WHERE clause.
http://dev.mysql.com/doc/refman/5.1/en/explain-output.html#explain-extra-information

Does a MySQL index on columns (A,B,C) optimize for queries that just select/order by (A,B) only?

Given a table with columns A,B,C,D,E,F.
I have two queries:
one that orders by A then B.
one that orders by A, then B, then C
I want to add an index on (A,B,C) to speed up the second query.
I'm thinking this will also speed up the first query. Is that correct? Or should I add a second index on (A,B)?
Or am I oversimplifying the problem of performance-tuning here?
Just put an index on all three of them. You don't need a second index, because (A,B) is a leftmost prefix of (A,B,C) and the same index can serve both orderings.
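You can verify this with EXPLAIN. A minimal sketch, assuming a hypothetical table t_sort with integer columns:

```sql
-- Hypothetical table; name and types are assumptions for illustration.
CREATE TABLE t_sort (
  A INT NOT NULL,
  B INT NOT NULL,
  C INT NOT NULL,
  D INT,
  E INT,
  F INT,
  INDEX idx_abc (A, B, C)
);

-- If the index is used for ordering, the Extra column should not show
-- "Using filesort". The optimizer may still prefer a filesort when it
-- estimates that reading the table in index order is more expensive.
EXPLAIN SELECT A, B, C FROM t_sort ORDER BY A, B;
EXPLAIN SELECT A, B, C FROM t_sort ORDER BY A, B, C;
```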