I have a question. If I have a table with 3 columns (firstname, lastname, address), all string/varchar(255),
and I have a composite index my_idx on 2 columns
CREATE INDEX my_idx ON my_table (firstname,lastname)
and I run this SQL, will it use the index I defined?
select * from my_table where address="zzz" and firstname="xxxx" and lastname="yyyy"
Or should I put the indexed columns first, as the leftmost conditions?
select * from my_table where firstname="xxxx" and lastname="yyyy" and address="zzz"
Thank you
First of all: if you prefix your query with the keyword EXPLAIN, MySQL will print the indexes it could use and which one it chose.
From my understanding, yes, it will use the index. The order of the conditions in the query is not relevant.
What matters is the order of the columns in the index, but only if you do not provide all of them in the query (or you apply a function to the column value, or use e.g. the LIKE operator for the rest of a string). For example, if you only queried for lastname, the index cannot be used. If you only queried for firstname, the index will be used. If you queried for firstname and address, the index will be used (for the firstname part), and so on.
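A quick way to check this yourself is to run EXPLAIN on a few variants. This is a minimal sketch using the table and index from the question; the literal values are placeholders, and the plan you actually get depends on your data and MySQL version.

```sql
-- Table and index as described in the question.
CREATE TABLE my_table (
  firstname VARCHAR(255),
  lastname  VARCHAR(255),
  address   VARCHAR(255)
);
CREATE INDEX my_idx ON my_table (firstname, lastname);

-- The order of the conditions in WHERE does not matter:
-- both statements can use my_idx.
EXPLAIN SELECT * FROM my_table
WHERE address = 'zzz' AND firstname = 'xxxx' AND lastname = 'yyyy';

EXPLAIN SELECT * FROM my_table
WHERE firstname = 'xxxx' AND lastname = 'yyyy' AND address = 'zzz';

-- Only the leftmost column: still a prefix of the index, so my_idx is usable.
EXPLAIN SELECT * FROM my_table WHERE firstname = 'xxxx';

-- Only the second column: not a leftmost prefix,
-- so my_idx cannot be used for a direct lookup.
EXPLAIN SELECT * FROM my_table WHERE lastname = 'yyyy';
```

In the output, compare the possible_keys and key columns across the four statements.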
Related
I have googled a lot and couldn't find a clear answer to my question.
Assume we have this query:
SELECT * FROM mytable WHERE user_id = x ORDER BY date_created
If we have a single-column index on user_id and another one on date_created, does the optimizer use both indexes, or just the user_id index?
This is your query:
SELECT *
FROM mytable
WHERE user_id = 123
ORDER BY date_created
If you have two distinct indexes, then MySQL might use the index on user_id to apply the where predicate (if it believes that it will speed up the query, depending on the cardinality of your data and other factors). It will not use the index on date_created, because it has no way to relate the intermediate resultset that satisfies the where predicate to that index.
For this query, you want a compound index on (user_id, date_created). The database uses the first key in the index to filter the dataset: in the index B-tree, matching rows are already sorted by date, so the order by operation becomes a no-op.
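As a sketch of that suggestion (the index name idx_user_date is made up for illustration, and mytable is assumed to already exist with these columns):

```sql
-- Compound index: the equality column first, the sort column second.
CREATE INDEX idx_user_date ON mytable (user_id, date_created);

-- Check the plan for the query from the question.
EXPLAIN
SELECT *
FROM mytable
WHERE user_id = 123
ORDER BY date_created;
```

If the index is handling the sort, the Extra column of the EXPLAIN output should not contain "Using filesort".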
I notice that you are using select *; this is not good practice in general, and not good for performance. If the table has columns other than the user and date, this forces the database to go back to the table to fetch the corresponding rows after filtering and ordering through the index, which can be more expensive than not using the index at all. If you just need a few columns, then enumerate them:
SELECT date_created, first_name, last_name
FROM mytable
WHERE user_id = 123
ORDER BY date_created
And have an index on (user_id, date_created, first_name, last_name). That's a covering index: the database can execute the whole query using only the index, without looking up the table itself.
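A possible way to create and verify such a covering index, assuming mytable has the columns named above (the index name is hypothetical):

```sql
-- Covering index: filter column, sort column, then the remaining selected columns.
CREATE INDEX idx_user_date_names
  ON mytable (user_id, date_created, first_name, last_name);

EXPLAIN
SELECT date_created, first_name, last_name
FROM mytable
WHERE user_id = 123
ORDER BY date_created;
```

When the query is answered from the index alone, the Extra column of EXPLAIN shows "Using index".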
I have a table in MySQL with two columns
id int(11) unsigned NOT NULL AUTO_INCREMENT,
B varchar(191) CHARACTER SET utf8mb4 DEFAULT NULL,
The id being the PK.
I need to do a lookup in a query using either one of these: id in (:idList) or B in (:bList)
Would this query perform better if there were a composite index containing these two columns?
No, it will not.
Indexes can be used to look up values from the leftmost columns in an index:
MySQL can use multiple-column indexes for queries that test all the columns in the index, or queries that test just the first column, the first two columns, the first three columns, and so on. If you specify the columns in the right order in the index definition, a single composite index can speed up several kinds of queries on the same table.
So, if you have a composite index on the id, B fields (in this order), then the index can be used to look up values based on their id, or on a combination of id and B values, but it cannot be used to look up values based on B only. However, in the case of an or condition that is exactly what you need to do: look up values based on B only.
If each field in the or condition is the leftmost field of an index, then MySQL attempts to do an index merge optimisation, so you may actually be better off having separate indexes for these two fields.
Note: if you use the InnoDB table engine, then there is no point in adding the primary key to any multi-column index, because InnoDB silently adds the PK to every secondary index.
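To see whether the optimizer actually picks an index merge, something like the following can be tried. It is only a sketch: the table name t, the index name idx_b, and the literal values are assumptions, and the chosen plan depends on the data.

```sql
-- Columns as in the question; id is the PK and B gets its own index.
CREATE TABLE t (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  B  VARCHAR(191) CHARACTER SET utf8mb4 DEFAULT NULL,
  PRIMARY KEY (id),
  INDEX idx_b (B)
);

-- Each side of the OR is the leftmost column of its own index.
EXPLAIN
SELECT * FROM t
WHERE id IN (1, 2, 3) OR B IN ('a', 'b');
```

If an index merge is used, the type column of the EXPLAIN output reads index_merge and the key column lists both indexes.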
For OR, I don't think so.
With OR, each side of the condition has to be matched on its own, so an individual index for each column will be better.
For AND, a composite index will help.
MySQL index TIPS
Of course you can always add the index and compare the explain plan.
MySQL Explain Plan
The trick for optimizing OR is to use UNION. (At least, it works well in some cases.)
( SELECT ... FROM ... WHERE id IN (...) )
UNION DISTINCT
( SELECT ... FROM ... WHERE B IN (...) )
Notes:
Need separate indexes on id and B.
No benefit from any composite index (unless it is also "covering").
Change DISTINCT to ALL if you know that there won't be any rows found by both the id and B tests. (This avoids a de-dup pass.)
If you need ORDER BY, add it after the SQL above.
If you need LIMIT, it gets messier. (This is probably not relevant for IN, but it often is with ORDER BY.)
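A concrete version of the pattern above might look like this. It assumes a hypothetical table t(id, B) with PRIMARY KEY(id) and a separate INDEX(B); the literal values are placeholders.

```sql
-- Each branch can use its own index: PRIMARY for id, the index on B for B.
( SELECT id, B FROM t WHERE id IN (1, 2, 3) )
UNION DISTINCT
( SELECT id, B FROM t WHERE B IN ('a', 'b') )
-- ORDER BY goes after the last branch and applies to the whole UNION.
ORDER BY id;
```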
If the rows are 'wide' and the resultset has very few rows, it may be further beneficial to do something like this:
SELECT t...
FROM t
JOIN (
( SELECT id FROM t WHERE id IN (...) )
UNION DISTINCT
( SELECT id FROM t WHERE B IN (...) )
) AS u USING(id);
Notes:
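This needs PRIMARY KEY(id) and INDEX(B, id). (Actually, with InnoDB there is no difference between INDEX(B, id) and INDEX(B), as Michael pointed out, because the PK is appended to every secondary index.)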
The UNION is cheaper here because of collecting only id, not the bulky columns.
The SELECTs in the UNION are faster because you should be able to provide "covering" indexes.
ORDER BY would go at the very end.
Suppose I have a table in MySQL which has a multi-column index on the fields (phone_number, name).
While querying this table, if I just group by phone_number, will this index be used?
And if I perform any operation involving just phone_number, will the index be used?
And if I want to group by phone_number, name, will this index be used?
In some cases, MySQL is able to perform GROUP BY using index access.
the preconditions for using indexes for GROUP BY are that all GROUP
BY columns reference attributes from the same index, and that the
index stores its keys in order.
This means that in your case the index can be used for both GROUP BY queries.
You can find more here http://dev.mysql.com/doc/refman/5.7/en/group-by-optimization.html
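This can be checked with EXPLAIN. The sketch below uses a hypothetical table name (contacts) and made-up column types; only the index definition comes from the question.

```sql
CREATE TABLE contacts (
  phone_number VARCHAR(20)  NOT NULL,
  name         VARCHAR(100) NOT NULL,
  INDEX idx_phone_name (phone_number, name)
);

-- GROUP BY on the leftmost column of the index.
EXPLAIN SELECT phone_number, COUNT(*) FROM contacts GROUP BY phone_number;

-- GROUP BY on both columns, in index order.
EXPLAIN SELECT phone_number, name, COUNT(*) FROM contacts
GROUP BY phone_number, name;

-- For comparison: GROUP BY on the second column only,
-- which is not a leftmost prefix of the index.
EXPLAIN SELECT name, COUNT(*) FROM contacts GROUP BY name;
```

A "Using temporary" entry in the Extra column suggests the grouping could not be resolved from the index order alone.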
Suppose the table T has three columns,
id int not null auto_increment,
my_id int,
name varchar(200).
and the query is "select * from T where my_id in (var_1, var_2, ..., var_n) and name = 'name_var'".
Is there any performance difference between the two indices below?
Index1: (my_id, name)
Index2: (name, my_id).
Yes, the two indices above would differ slightly when it comes to query performance.
The leftmost fields are always the most important in determining the efficiency and selectivity of an index.
An index should be built on the column(s) which are frequently used in the WHERE, ORDER BY, and GROUP BY clauses.
Hope this helps!
In a composite index, the column to be searched should appear first. So, if you are searching for a set of id values, you'll want id to show up first in the index.
But if id is the primary key and you're using a SELECT * clause to retrieve the whole row, it doesn't make sense to add another index. The way tables are organized, all the data of the row appears clustered with each id value. So just use the index on the primary key.
tl;dr: neither (id,name) nor (name,id) will help this query.
In general, it is best to start the INDEX with the columns that the WHERE clause tests with col = const. One range can come last. IN is sort of like =, sort of like a range. Hence, this is best:
INDEX(name, id)
Think of it this way. The index is an ordered list. With this index, it will start at the name=... and then have to scan or leapfrog through all the ids with that name.
I suspect the PRIMARY KEY is not (id). If it were, why would you be checking the name?
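Applied to the table from the question (where the filtered column is my_id), the suggestion could be tried as follows. PRIMARY KEY(id), the index name, and the literal values are assumptions made for the sake of a runnable sketch.

```sql
CREATE TABLE T (
  id    INT NOT NULL AUTO_INCREMENT,
  my_id INT,
  name  VARCHAR(200),
  PRIMARY KEY (id),
  -- Equality column (name) first, the IN column (my_id) last.
  INDEX idx_name_my_id (name, my_id)
);

EXPLAIN
SELECT * FROM T
WHERE my_id IN (10, 20, 30) AND name = 'name_var';
```

Adding the alternative (my_id, name) index and comparing the key and rows columns of EXPLAIN on real data shows which one the optimizer prefers.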
I have a MySQL table of the form
CREATE TABLE `myTable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`timestamp` datetime NOT NULL,
`fieldA` int(11) NOT NULL,
`fieldB` int(11) NOT NULL,
....
)
The table will have around 500,000,000 rows, with the remaining fields being floats.
The queries I will be using will be of the form:
SELECT * FROM myTable
WHERE fieldA= AND fieldB= AND timestamp>'' and timestamp<=''
ORDER BY timestamp;
At the moment I have two indices: a primary key on id, and a unique key on timestamp,fieldA,fieldB (hashed). At the moment, a select query like the above takes around 6 minutes on a reasonably powerful desktop PC.
What would be the optimal index to apply? Does the ordering of the 3 fields in the key matter, and should I be using a B-tree instead of a hash index? Is there a conflict between my primary key and the second index? Or do I have the best performance I can expect for such a large db without more serious hardware?
Thanks!
For that particular query, adding an index on fieldA and fieldB would probably be optimal. The order of the columns in the index does matter.
Index Order
For MySQL to even consider using a particular index for a query, the first column of the index must appear in the query's conditions. For example:
alter table mytable add index a_b_index(a, b);
select * from mytable where a = 1 and b = 2;
The above query should use the index a_b_index. Now take this next example:
alter table mytable add index a_b_index(a, b);
select * from mytable where b = 2;
This will not use the index, because the index starts with a, but a is never used in the query.
Comparison
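MySQL can use a B-tree index for both equality and range comparisons (<, >, BETWEEN). However, once a column is used in a range comparison, the columns after it in the index cannot be used to narrow the lookup. Hash indexes support equality comparisons only.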
LIKE
MySQL does use indexes for LIKE, but only when the pattern does not start with a wildcard, i.e. the % is at the end, like this:
select * from mytable where cola like 'hello%';
Whereas these will not use an index:
select * from mytable where cola like '%hello';
select * from mytable where cola like '%hello%';
Hashed indexes are not used for ranges. They are used for equality comparisons only. Therefore, a hashed index cannot be used for the range portion of your query.
Since you have a range in your query, you should use a standard B-tree index. Ensure that fieldA and fieldB are the first columns in the index, then timestamp. MySQL cannot utilize the index for searches beyond the first range.
Consider a multi-column index on (fielda, fieldb, timestamp).
The index should also be able to satisfy the ORDER BY.
To improve the query further, select only those three columns or consider a larger "covering" index.
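A sketch of the suggested index and a plan check follows. The index name and the literal values are made up for illustration (the question leaves the values blank).

```sql
-- Equality columns first, the range/sort column last.
ALTER TABLE myTable ADD INDEX idx_a_b_ts (fieldA, fieldB, `timestamp`);

EXPLAIN
SELECT * FROM myTable
WHERE fieldA = 1 AND fieldB = 2
  AND `timestamp` >  '2020-01-01 00:00:00'
  AND `timestamp` <= '2020-02-01 00:00:00'
ORDER BY `timestamp`;
```

On a table of this size, building the index will itself take a while, so it is worth trying on a copy or a sample first.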