I have a table with 500k rows. I have specific table which takes really long time to run every query.
One of the queries is:
SELECT *
FROM player_data
WHERE `user_id` = '61120'
AND `opzak` = 'ja'
ORDER BY opzak_nummer ASC
the opzak_nummer column is a tinyint with a number.
EXPLAIN:
Is there any way to improve this query performance and the general of this query/table?
The table name is player_data and includes about 25 columns, most of them are integers with values of stats.
The index is id AUTO_INCREMENT.
You need to run that query, it will alter table and add index. You can read more details here http://dev.mysql.com/doc/refman/5.7/en/drop-index.html
ALTER TABLE pokemon_speler ADD INDEX index_name (user_id, opzak);
The optimal index for that query is either of these:
INDEX(user_id, opzak, opzak_nummer)
INDEX(opzak, user_id, opzak_nummer)
The first two columns do the filtering; the last avoids a tmp table and sort by consuming the ORDER BY.
Is any combination of columns 'unique' (other than id)? If so, we might be able to make it run even faster.
Related
Step 1:
I am creating a simple table.
CREATE TABLE `indexs`.`table_one` (
`id` INT NOT NULL AUTO_INCREMENT,
`name` VARCHAR(45) NULL,
PRIMARY KEY (`id`));
Step 2:
I make two inserts into this table.
insert into table_one (name) values ("B");
insert into table_one (name) values ("A");
Step 3:
I make a select, I get a table, the records in which are ordered by id.
SELECT * FROM table_one;
This is the expected result, because in mysql the primary key is a clustered index, therefore the data will be physically ordered by it.
Now the part I don't understand.
Step 4:
I am creating an index on the name column.
CREATE INDEX index_name ON table_one(name)
I repeat step 3 again, but I get a different result. The lines are now ordered according to the name column.
Why is this happening? why the order of the rows in the table changes in accordance with the new index on the name column, because as far as I understand, in mysql, the primary key is the only clustered index, and all indexes created additionally are secondary.
I make a select, I get a table, the records in which are ordered by id. [...] This is the expected result, because in mysql the primary key is a clustered index, therefore the data will be physically ordered by it.
There is some misunderstanding of a concept here.
Table rows have no inherent ordering: they represent unordered set of rows. While the clustered index enforces a physical ordering of data in storage, it does not guarantee the order in which rows are returned by a select query.
If you want the results of the query to be ordered, then use an order by clause. Without such clause, the ordering or the rows is undefined: the database is free to return results in whichever order it likes, and results are not guaranteed to be consistent over consecutive executions of the same query.
select * from table_one order by id;
select * from table_one order by name;
(GMB explains most)
Why is this happening? why the order of the rows in the table changes in accordance with the new index on the name column
Use EXPLAIN SELECT ... -- it might give a clue of what I am about to suggest.
You added INDEX(name). In InnoDB, the PRIMARY KEY column(s) are tacked onto the end of each secondary index. So it is effectively a BTree ordered by (name,id) and containing only those columns.
Now, the Optimizer is free to fetch the data from the index, since it has everything you asked for (id and name). (This index is called "covering".)
Since you did not specify an ORDER BY, the result set ordering is valid (see GMB's discussion).
Moral of the story: If you want an ordering, specify ORDER BY. (The Optimizer is smart enough to "do no extra work" if it can see how to provide the data without doing a sort.
Further experiment: Add another column to the table but don't change the indexes. Now you will find SELECT * FROM t is ordered differently than SELECT id, name FROM t. I think I have given you enough clues to predict this difference, if not, ask.
There is my SQL query. Table system_mailer is for logging sent e-mails. When i want to search some data, query is 10 seconds long. It is possible on any way to optimize this query?
SELECT `ID`
FROM `system_mailer`
WHERE `group` = 'selling_center'
AND `group_parameter_1` = '1'
AND `group_parameter_2` = '2138'
Timins is around couple of seconds, how could it be optimised?
You might find the following index on 4 columns would help performance:
CREATE INDEX idx ON system_mailer (`group`, group_parameter_1, group_parameter_2, ID);
MySQL should be able to use this index on your current query. By the way, if you are using InnoDB, and ID is the primary key, then you might be able to drop it from the explicit index definition, and just use this:
CREATE INDEX idx ON system_mailer (`group`, group_parameter_1, group_parameter_2);
Please avoid naming your columns and tables with reserved MySQL keywords like group. Because you made this design decision, you will now be forced to forever escape that column name with backticks (ugly).
just be sure you have a composite index on table system_mailer for the columns
(`group`, `group_parameter_1`, `group_parameter_2`)
and you can use redudancy adding the id to index for avoid data table access in query
(`group`, `group_parameter_1`, `group_parameter_2`, ID)
I would like to know if it is necessary to create an index for all fields within a table if one of your queries will use SELECT *.
To explain, if we had a table that 10M records and we did a SELECT * query on it would the query run faster if we have created an index for all fields within the table or does MySQL handle SELECT * in a different way to SELECT first_field, a_field, last_field.
To my understanding, if I had a query that did SELECT first_field, a_field FROM table then it would bring performance benefits if we created an index on first_field, a_field but if we use SELECT * is there even a benefit from creating an index for all fields?
Performing a SELECT * FROM mytable query would have to read all the data from the table. This could, theoretically, be done from an index if you have an index on all the columns, but it would be just faster for the database to read the table itself.
If you have a where clause, having an index on (some of) the columns you have conditions on may dramatically improve the query's performance. It's a gross simplification, but what basically happens is the following:
The appropriate rows are filtered according to the where clause. It's much faster to search for these rows in an index (which is, essentially, a sorted tree) than a table (which is an unordered set of rows).
For the columns that where in the index used in the previous step the values are returned.
For the columns that aren't, the table is accessed (according to a pointer kept in the index).
indexing a mysql table for a column improves performance when there is a need to search or edit a row/record based on that column of that table.
for example, if there is an 'id' column and if it is a primary key; And in that case if you want to search a record using where clause on that 'id' column then you don't need to create index for the 'id' column because primary key column will act as an indexed column.
In another case, if there is an 'pid' column in the table and if it is not a primary key; Then in order to search based on 'pid' column then to improve performance it is better to create an index for the 'pid' column. That will make query fast to search the expected record.
I have a table with a large number of records ( > 300,000). The most relevant fields in the table are:
CREATE_DATE
MOD_DATE
Those are updated every time a record is added or updated.
I now need to query this table to find the date of the record that was modified last. I'm currently using
SELECT mod_date FROM table ORDER BY mod_date DESC LIMIT 1;
But I'm wondering if this is the most efficient way to get the answer.
I've tried adding a where clause to limit the date to the last month, but it looks like that's actually slower (and I need the most recent date, which could be older than the last month).
I've also tried the suggestion I read elsewhere to use:
SELECT UPDATE_TIME
FROM information_schema.tables
WHERE TABLE_SCHEMA = 'db'
AND TABLE_NAME = 'table';
But since I might be working on a dump of the original that query might result into NULL. And it looks like this is actually slower than the original query.
I can't resort to last_insert_id() because I'm not updating or inserting.
I just want to make sure I have the most efficient query possible.
The most efficient way for this query would be to use an index for the column MOD_DATE.
From How MySQL Uses Indexes
8.3.1 How MySQL Uses Indexes
Indexes are used to find rows with specific column values quickly.
Without an index, MySQL must begin with the first row and then read
through the entire table to find the relevant rows. The larger the
table, the more this costs. If the table has an index for the columns
in question, MySQL can quickly determine the position to seek to in
the middle of the data file without having to look at all the data. If
a table has 1,000 rows, this is at least 100 times faster than reading
sequentially.
You can use
SHOW CREATE TABLE UPDATE_TIME;
to get the CREATE statement and see, if an index on MOD_DATE is defined.
To add an Index you can use
CREATE INDEX
CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name
[index_type]
ON tbl_name (index_col_name,...)
[index_option]
[algorithm_option | lock_option] ...
see http://dev.mysql.com/doc/refman/5.6/en/create-index.html
Make sure that both of those fields are indexed.
Then I would just run -
select max(mod_date) from table
or create_date, whichever one.
Make sure to create 2 indexes, one on each date field, not a compound index on both.
As for a discussion of the difference between this and using limit, see MIN/MAX vs ORDER BY and LIMIT
Use EXPLAIN:
http://dev.mysql.com/doc/refman/5.0/en/explain.html
This tells You how mysql executes statement, thanks to that You can figure out most efficient way, cause it depends on Your db structure and there is no one universal solution.
I have a query that looks like the following:
select count(*) from `foo` where expires_at < now()”
since expires_at is indexed, the query hits the index no problem. however the following query:
select count(*) from `foo` where expires_at < now() and some_id != 5
the index never gets hit.
both expires_at and some_id are indexed.
is my index not properly created?
This query:
SELECT COUNT(*)
FROM foo
WHERE expires_at < NOW()
can be satisfied by the index only, without referring to the table itself. You may see it from the using index in the plan.
This query:
SELECT COUNT(*)
FROM foo
WHERE expires_at < NOW()
AND some_id <> 5
needs to look into the table to find the value of some_id.
Since the table lookup is quite an expensive thing, it is more efficient to use the table scan and filter the records.
If you had a composite index on expires_at, some_id, the query would probably use the index both for ranging on expires_at and filtering on some_id.
SQL Server even offers a feature known as included fields for this. This command
CREATE INDEX ix_foo_expires__someid ON foo (expires_at) INCLUDE (some_id)
would create an index on expires_at which would additionally store some_id in the leaf entires (without overhead of sorting).
MySQL, unfortunately, does not support it.
Probably what's happening is that for the first query, the index can be used to count the rows satisfying the WHERE clause. In other words, the query would result in a table scan, but happily all the columns involved in the WHERE condition are in an index, so the index is scanned instead.
In the second query though, there's no single index that contains all the columns in the WHERE clause. So MySQL resorts to a full table scan. In the case of the first query, it was using your index, but not to find the rows to check - in the special case of a COUNT() query, it could use the index to count rows. It was doing the equivalent of a table scan, but on the index instead of the table.
1) It seems you have two single-column indices. You can try to create a multi-column index.
For a detailed explanation why this is different than multiple single column indices, see the following:
http://www.mysqlfaqs.net/mysql-faqs/Indexes/When-does-multi-column-index-come-into-use-in-MySQL
2) Do you have a B-tree index on the expires_at column? Since you are doing a range query (<), that might give better performance.
http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
Best of luck!