I have id, member_id, and topic_id fields. Sometimes I use id, sometimes member_id, and sometimes topic_id in WHERE clauses. Can I add indexes to all of them? Will that make things slower? I am new to MySQL optimization, so thank you.
Unused indexes won't make a SELECT slower, but each index you add will slow down INSERTs and UPDATEs.
The maximum number of indexes a MyISAM table can have is 64.
In general, you would want a separate index on each field if you will be filtering your queries only on single fields, such as in the following case:
SELECT * FROM your_table WHERE id = ?;
SELECT * FROM your_table WHERE member_id = ?;
SELECT * FROM your_table WHERE topic_id = ?;
If the id field is the primary key, then that is probably already using a clustered index. Therefore it looks like you may want to try creating two separate non-clustered indexes on member_id and topic_id:
CREATE INDEX ix_your_table_member_id ON your_table (member_id);
CREATE INDEX ix_your_table_topic_id ON your_table (topic_id);
You may also be interested in researching the topic of covering indexes.
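As a rough sketch of what "covering" means here, assume a hypothetical query that filters on member_id and only needs topic_id back (the index name and the literal 42 are made up for illustration):

```sql
-- Hypothetical composite index: the filter column first, then the column being read.
CREATE INDEX ix_your_table_member_topic ON your_table (member_id, topic_id);

-- Every column the query touches is in the index, so MySQL can answer it
-- from the index alone. EXPLAIN reports "Using index" in the Extra column
-- when that happens.
EXPLAIN SELECT topic_id FROM your_table WHERE member_id = 42;
```

This only pays off when the query reads a small, fixed set of columns; a SELECT * still has to visit the table rows. If you create an index like this one, a separate single-column index on member_id becomes redundant.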
Related
I have a many-to-many relationship database in MySQL
And this Query:
SELECT main_id FROM posts_tag
WHERE post_id IN ('134','140','187')
GROUP BY main_id
HAVING COUNT(DISTINCT post_id) = 3
There are ~5,300,000 rows in this table, and the query is slow: about 5 seconds, and slower if I add more ids to the search.
I want to ask if there is any way to make it faster?
EXPLAIN shows this:
By the way, I want to add more conditions, such as NOT IN, and possibly JOIN new tables that have the same structure but different data. Nothing much beyond this, but first I want to know if there is any way to make that simple query faster.
Any advice would be helpful, even another method, or structure etc.
PS: Hardware is an Intel Core i9 3.6 GHz, 64GB RAM, 480GB SSD, so I think the server specs are not the problem.
Use a "composite" and "covering" index:
INDEX(post_id, main_id)
And get rid of INDEX(post_id) since it will then be redundant.
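Applied to the table from the question, that could look roughly like this. The new index name is made up, and the name of the existing single-column index is an assumption; check SHOW CREATE TABLE posts_tag for the real one.

```sql
-- Assumption: the existing single-column index is named `post_id`.
ALTER TABLE posts_tag
    ADD INDEX ix_post_main (post_id, main_id),  -- composite, covers the query below
    DROP INDEX post_id;                         -- now redundant

-- Re-check the plan afterwards; "Using index" in Extra means the query
-- is being answered from the index without touching the table rows.
EXPLAIN
SELECT main_id FROM posts_tag
WHERE post_id IN (134, 140, 187)
GROUP BY main_id
HAVING COUNT(DISTINCT post_id) = 3;
```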
"Covering" helps speed up a query.
Assuming this is a normal "many-to-many" table, then:
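CREATE TABLE post_main (
post_id INT UNSIGNED NOT NULL, -- similar to `id` in table `posts` (type assumed; match posts.id)
main_id INT UNSIGNED NOT NULL, -- similar to `id` in table `main` (type assumed; match main.id)
PRIMARY KEY(post_id, main_id),
INDEX(main_id, post_id)
) ENGINE=InnoDB;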
There is no need for AUTO_INCREMENT anywhere in a many-to-many table.
(You could add FK constraints, but I say 'why bother'.)
More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
And NOT IN
This gets a bit tricky. I think this is one way; there may be others.
SELECT main_id
FROM post_main AS x
WHERE post_id IN (244,229,193,93,61)
GROUP BY main_id
HAVING COUNT(*) = 5
AND NOT EXISTS ( SELECT 1
FROM post_main
WHERE main_id = x.main_id
AND post_id IN (92,10,234) );
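Alexfsk, the second line of your query has the IN values surrounded by single quotes. When the column is defined as INT or MEDIUMINT (or any kind of integer) datatype, drop the quotes so the literals match the column type. MySQL normally converts a quoted constant to a number once and can still use the index, so this alone is unlikely to account for the 5 seconds; the expensive case is the reverse, a string column compared with an unquoted number, which forces a conversion on every row considered.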
I have googled a lot and couldn't find a clear answer to my question.
Assume we have this query:
SELECT * WHERE user_id = x ORDER BY date_created
If we have a single-column index on user_id and another one on date_created, does the optimizer use both indexes, or just the user_id index?
This is your query:
SELECT *
FROM mytable
WHERE user_id = 123
ORDER BY date_created
If you have two distinct indexes, then MySQL might use the index on user_id to apply the where predicate (if it believes that it will speed up the query, depending on the cardinality of your data and other factors). It will not use the index on date_created, because it has no way to relate the intermediate resultset that satisfies the where predicate to that index.
For this query, you want a compound index on (user_id, date_created). The database uses the first column of the index to filter the dataset; within a given user_id, the entries in the index B-tree are already sorted by date_created, so the order by operation becomes a no-op.
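A minimal sketch of that compound index, with a made-up index name, plus a way to check whether the sort really goes away:

```sql
-- Filter column first, sort column second.
ALTER TABLE mytable ADD INDEX ix_user_date (user_id, date_created);

-- If the index serves both the WHERE and the ORDER BY, the Extra column
-- should not show "Using filesort".
EXPLAIN
SELECT *
FROM mytable
WHERE user_id = 123
ORDER BY date_created;
```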
I notice that you are using select *; this is not good practice in general, and it is not good for performance. If the table has columns other than the user and date, this forces the database to go back to the table to fetch the corresponding rows after filtering and ordering through the index, which can be more expensive than not using the index at all. If you just need a few columns, then enumerate them:
SELECT date_created, first_name, last_name
FROM mytable
WHERE user_id = 123
ORDER BY date_created
And have an index on (user_id, date_created, first_name, last_name). That's a covering index: the database can execute the whole query using only the index, without looking up the table itself.
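Spelled out as DDL, that would be something like the following (the index name is hypothetical):

```sql
-- Filter column, then sort column, then the remaining selected columns.
CREATE INDEX ix_user_date_names
    ON mytable (user_id, date_created, first_name, last_name);

-- "Using index" in the Extra column confirms the query is covered.
EXPLAIN
SELECT date_created, first_name, last_name
FROM mytable
WHERE user_id = 123
ORDER BY date_created;
```

The trade-off is a wider index that costs more to maintain on writes, so this is worth doing mainly for queries that run often.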
I'm using partitioning RANGE BY(person_id) (10 users per sub-table) and I have these PRIMARY keys: id, person_id. id is a UNIQUE, auto-increment index. The table holds articles that were written by person_id.
If I want to retrieve all articles that were written by, let's say, person_id = 748172, I can run this query: SELECT * FROM articles WHERE person_id = 748172. But what I want to achieve is to be able to get older articles by running this query: SELECT * FROM articles WHERE person_id = 748172 AND id < 472785478 (or older...).
Should I use a composite index, ALTER TABLE articles ADD INDEX '...' (person_id, id), for this case? This table is designed to hold up to 1 billion rows. Performance is very important here.
You should go with a non-clustered index when the table has a large number of rows and performance is critical.
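For what it's worth, the index proposed in the question could be tried like this. It is only a sketch: the index name is hypothetical, and the ORDER BY and LIMIT are assumptions about how the "older articles" paging would be done.

```sql
-- Secondary (non-clustered) index matching the WHERE clause column order.
ALTER TABLE articles ADD INDEX ix_person_id (person_id, id);

-- Compare the plan before and after adding the index. On MySQL 5.7 and
-- later, the partitions column also shows which partitions are read.
EXPLAIN
SELECT *
FROM articles
WHERE person_id = 748172
  AND id < 472785478
ORDER BY id DESC   -- assumption: newest of the older articles first
LIMIT 20;          -- assumption: page size
```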
I have a table in MySQL with two columns
id int(11) unsigned NOT NULL AUTO_INCREMENT,
B varchar(191) CHARACTER SET utf8mb4 DEFAULT NULL,
The id being the PK.
I need to do a lookup in a query using either one of these: id in (:idList) or B in (:bList).
Would this query perform better if there were a composite index on these two columns?
No, it will not.
Indexes can be used to look up values from the leftmost columns in an index:
MySQL can use multiple-column indexes for queries that test all the columns in the index, or queries that test just the first column, the first two columns, the first three columns, and so on. If you specify the columns in the right order in the index definition, a single composite index can speed up several kinds of queries on the same table.
So, if you have a composite index on the id and B fields (in this order), the index can be used to look up values by id, or by a combination of id and B, but it cannot be used to look up values by B alone. With an or condition, however, that is exactly what you need to do: look up values by B alone.
If each field in the or condition is the leftmost field of some index, then MySQL can attempt an index merge optimisation, so you may actually be better off having separate indexes on these two fields.
Note: if you use the InnoDB table engine, there is no point in adding the primary key to the end of a multi-column index, because InnoDB silently appends the PK to every secondary index.
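A quick way to see whether index merge kicks in, as a sketch. The table name t, the index name, and the literal values are placeholders, since the question does not give them.

```sql
-- id is already the PRIMARY KEY; B needs its own index.
ALTER TABLE t ADD INDEX ix_B (B);

-- If the optimizer chooses index merge, the type column shows index_merge
-- and the key column lists both PRIMARY and ix_B.
EXPLAIN
SELECT *
FROM t
WHERE id IN (1, 2, 3)
   OR B IN ('x', 'y');
```

Whether the optimizer actually picks index merge depends on its cost estimates, so check the plan against your real data.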
For OR, I don't think so.
The optimizer will try to find a match on the first side; if that fails, it will try the second side. So an individual index for each search will be better.
For AND, a composite index will help.
MySQL index TIPS
Of course you can always add the index and compare the explain plan.
MySQL Explain Plan
The trick for optimizing OR is to use UNION. (At least, it works well in some cases.)
( SELECT ... FROM ... WHERE id IN (...) )
UNION DISTINCT
( SELECT ... FROM ... WHERE B IN (...) )
Notes:
Need separate indexes on id and B.
No benefit from any composite index (unless it is also "covering").
Change DISTINCT to ALL if you know that there won't be any rows found by both the id and B tests. (This avoids a de-dup pass.)
If you need ORDER BY, add it after the SQL above.
If you need LIMIT, it gets messier. (This is probably not relevant for IN, but it often is with ORDER BY.)
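Filled in with placeholder names (table t, columns id and B, made-up literal values), the pattern looks like this:

```sql
-- Each branch can use its own index: PRIMARY KEY(id) and INDEX(B).
( SELECT id, B FROM t WHERE id IN (1, 2, 3) )
UNION DISTINCT   -- de-duplicates rows matched by both branches
( SELECT id, B FROM t WHERE B IN ('x', 'y') )
ORDER BY id;     -- applies to the combined result
```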
If the rows are 'wide' and the resultset has very few rows, it may be further beneficial to do something like this:
SELECT t...
FROM t
JOIN (
( SELECT id FROM t WHERE id IN (...) )
UNION DISTINCT
( SELECT id FROM t WHERE B IN (...) )
) AS u USING(id);
Notes:
This needs PRIMARY KEY(id) and INDEX(B, id). (Actually, on InnoDB a plain INDEX(B) is no different, since the PK is appended to every secondary index, as Michael pointed out.)
The UNION is cheaper here because of collecting only id, not the bulky columns.
The SELECTs in the UNION are faster because you should be able to provide "covering" indexes.
ORDER BY would go at the very end.
I have a table with 500k rows, and every query against it takes a really long time to run.
One of the queries is:
SELECT *
FROM player_data
WHERE `user_id` = '61120'
AND `opzak` = 'ja'
ORDER BY opzak_nummer ASC
the opzak_nummer column is a tinyint with a number.
EXPLAIN:
Is there any way to improve the performance of this query, and of this table in general?
The table name is player_data and includes about 25 columns, most of them are integers with values of stats.
The only index is on id (AUTO_INCREMENT).
You need to run this query; it will alter the table and add an index. You can read more details here: http://dev.mysql.com/doc/refman/5.7/en/drop-index.html
ALTER TABLE player_data ADD INDEX index_name (user_id, opzak);
The optimal index for that query is either of these:
INDEX(user_id, opzak, opzak_nummer)
INDEX(opzak, user_id, opzak_nummer)
The first two columns do the filtering; the last avoids a tmp table and sort by consuming the ORDER BY.
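As a sketch, the first of those two could be added and checked like this (the index name is made up):

```sql
ALTER TABLE player_data
    ADD INDEX ix_user_opzak_nummer (user_id, opzak, opzak_nummer);

-- With the index in place, the Extra column should no longer show
-- "Using filesort" or "Using temporary" for this query.
EXPLAIN
SELECT *
FROM player_data
WHERE `user_id` = '61120'
  AND `opzak` = 'ja'
ORDER BY opzak_nummer ASC;
```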
Is any combination of columns 'unique' (other than id)? If so, we might be able to make it run even faster.