Note: I searched around to see if this question has been asked before. All the existing questions I've been able to find are asking about composite index ordering, or the ordering of columns for queries on an existing table.
Say I have the following table:
CREATE TABLE `foobar` (
  `foo_id` int(11),
  `bar_id` int(11),
  KEY `foo_id` (`foo_id`),
  KEY `bar_id` (`bar_id`)
);
It has two unrelated indexes on it. If I swap the definition of the two indexes, it might look like this:
CREATE TABLE `foobar` (
  `foo_id` int(11),
  `bar_id` int(11),
  KEY `bar_id` (`bar_id`),
  KEY `foo_id` (`foo_id`)
);
If I run SHOW CREATE TABLE foobar on each of these, I can see that the ordering of the KEYs differs between the two tables. My question is, does the ordering in this specific case matter? I know it would matter if foo_id and bar_id were used together in a composite index, but here they are not.
If it does indeed matter, is there a way to arbitrarily rearrange the keys once the table has been created? (Something akin to ALTER TABLE foobar ADD INDEX foo_id (foo_id) AFTER bar_id, which I'm pretty sure is invalid as written.)
Unlike columns, the keys have no position within the table's layout, and arranging them one way or another adds no overhead. A KEY definition simply adds an index on existing columns, and such definitions can be listed in any order.
The only exception I could see would be if you went through some IDE (DB Forge, HeidiSQL, SequelPro, etc.) that sorts the keys to the top of some list it generates. That, however, is on the side of the tool interpreting the schema and has nothing to do with database performance.
No.
All keys are 'equal'. All are considered when deciding which to use. The key that has the least 'cost' is used. ('Cost' is a complicated formula involving the effort to perform the query one way versus another.)
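As a quick illustration using the foobar table from the question (a hedged sketch; the literal value is just a placeholder), you can ask the optimizer which index it actually chose with EXPLAIN; the order in which the keys were declared has no influence on that choice:

EXPLAIN SELECT * FROM foobar WHERE foo_id = 1;
-- The "key" column of the EXPLAIN output shows the chosen index
-- (typically the one on foo_id here), regardless of how the KEYs
-- were ordered in the table definition.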
If you are using InnoDB, it is really a good idea to have a PRIMARY KEY.
If your table is a "many-to-many" mapping table, here is further advice:
http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
A related topic: A "composite" index is one that has multiple columns. (E.g., INDEX(a,b).) The order of the columns does make a difference.
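To make that concrete, here is a minimal sketch with made-up table and column names: an index on (a,b) can be used for conditions on a alone, or on a and b together, but not efficiently for conditions on b alone (the leftmost-prefix rule):

CREATE INDEX idx_ab ON t (a, b);
SELECT * FROM t WHERE a = 1;            -- can use idx_ab
SELECT * FROM t WHERE a = 1 AND b = 2;  -- can use idx_ab
SELECT * FROM t WHERE b = 2;            -- cannot use idx_ab efficiently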
You can use MODIFY in the ALTER TABLE command:
ALTER TABLE table_name MODIFY foo_id INT(11) AFTER bar_id;
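Note that MODIFY ... AFTER repositions a column, not an index. If the goal is only to change where a key appears in the SHOW CREATE TABLE output, one sketch of a way to do it (with no effect on behavior) is to drop and re-add the index, which should move it to the end of the key list:

ALTER TABLE foobar DROP KEY foo_id, ADD KEY foo_id (foo_id);
-- foo_id should now be listed after bar_id in SHOW CREATE TABLE,
-- but queries are unaffected.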
I realized that when I create foreign keys in a table, indexes are added automatically.
In my table:
CREATE TABLE `SupplierOrderGoods` (
  `shopOrder_id` INT(11) NOT NULL,
  `supplierGood_id` INT(11) NOT NULL,
  `count` INT(11) NOT NULL,
  PRIMARY KEY (`shopOrder_id`, `supplierGood_id`),
  CONSTRAINT `FK_SupplierOrderGoods_ShopOrders` FOREIGN KEY (`shopOrder_id`) REFERENCES `shoporders` (`id`),
  CONSTRAINT `FK_SupplierOrderGoods_SupplierGoods` FOREIGN KEY (`supplierGood_id`) REFERENCES `suppliergoods` (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB;
The index
INDEX `FK_SupplierOrderGoods_SupplierGoods` (`supplierGood_id`)
has been created automatically.
It is okay that the index has been created, as I found in another post. I was looking into what indexes are used for and found that they are used to optimize searches in tables.
Now I know that I have to use indexes to optimize work with the database.
I also found that indexes can be composite (not on one field, but on several fields). In that case, I want to ask: should I use a composite index:
INDEX `FK_ShopOrders_SupplierGoods` (`shopOrder_id`, `supplierGood_id`),
or two simple indexes?
INDEX `FK_SupplierOrderGoods_SupplierGoods` (`supplierGood_id`),
INDEX `FK_SupplierOrderGoods_ShopOrders` (`shopOrder_id`),
I'm still learning about indexes myself, but I believe it's going to depend on what kind of data you will be querying the DB for.
For example, if you have a report for a certain record that will be run a lot, you'll want an index on it. If the report pulls just one column, then make a one-column index; if it's composed of two, like a first name and a last name, you'll probably want one index covering both.
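For instance, here is a minimal sketch with hypothetical table and column names: a report that filters on both name columns would typically be served by one composite index rather than two separate ones:

-- Hypothetical table and column names, for illustration only.
CREATE INDEX idx_name ON people (last_name, first_name);
SELECT * FROM people
WHERE last_name = 'Smith' AND first_name = 'Jane';  -- can use idx_name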
You do not want to put an index on everything, though, as that can cause performance problems: both the record and the index need to be updated on every write. So for tables that see a high volume of inserts or updates, you'll want to think about whether an index hurts or helps.
There is a lot of information to cover with indexes.
I am a real beginner when it comes to databases, and I am facing a problem.
I have a table with many rows. Each row has a primary key, called MY_KEY. MY_KEY is CHAR(20).
I have another table. In each of its rows, one of the fields will hold many MY_KEY values separated by spaces and stored as TEXT, but never the same MY_KEY twice.
I am not sure I explained this well, but how can I design those two tables to be more efficient?
My program will take the TEXT, add its contents to a vector, and then binary search it. This will be slow if there are 1000 20-character MY_KEY values.
Don't store delimited values in the database! Normalize your data by introducing a many-to-many table.
Can you tell me how to improve this design, please?
Your schema might look something like
CREATE TABLE table1
(
    table1_key CHAR(20) NOT NULL PRIMARY KEY,
    -- other columns
);

CREATE TABLE table2
(
    table2_key CHAR(20) NOT NULL PRIMARY KEY,
    -- other columns
);

CREATE TABLE table2_table1
(
    table2_key CHAR(20),
    table1_key CHAR(20),
    PRIMARY KEY (table2_key, table1_key),
    FOREIGN KEY (table2_key) REFERENCES table2 (table2_key),
    FOREIGN KEY (table1_key) REFERENCES table1 (table1_key)
);
Here is a SQLFiddle demo.
Check out some readings on database normalization. The basic idea is that you don't want to have any column that stores more than one piece of data. While this isn't an absolute rule, it's a good rule of thumb, and will probably be more performant than what you're describing.
Instead of one row with a bunch of associated keys, consider having a bunch of rows with the pairs of associated keys. This is a superior way to represent a many to many relation in a relational database. You can do a join to retrieve the data.
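Using the table2_table1 schema sketched in the other answer, such a join might look like this (the key value is just a placeholder):

SELECT t1.*
FROM table2_table1 AS tt
JOIN table1 AS t1 ON t1.table1_key = tt.table1_key
WHERE tt.table2_key = 'SOME_TABLE2_KEY';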
I think the following is a standard coding "pattern" for when we need to change the primary key of a table so that the index better serves our queries:
ALTER TABLE employees
ADD UNIQUE INDEX tmp (employee_id, subsidiary_id);
ALTER TABLE employees
DROP PRIMARY KEY;
ALTER TABLE employees
ADD PRIMARY KEY (subsidiary_id, employee_id);
My understanding is that the tmp index is created before dropping the primary key in order to facilitate queries using the current primary key and not lose performance.
But I don't understand this.
When we execute an ALTER TABLE (I am referring to the ALTER TABLE that drops the primary key), the table will be locked until the operation finishes, right?
So the queries will not be able to run anyway. So why create tmp in the first place?
I haven't seen this pattern. But, I would expect it for a slightly different reason. The purpose is less "facilitating" queries than ensuring that the pair of keys remains unique throughout the entire set of transactions.
In other words, between dropping the primary key and creating the new key, there is a short window of opportunity for someone to insert a duplicate pair of keys. Then the second operation will fail. By creating the unique index first, you prevent this from happening.
If you know there are no other users/queries using the system when you are modifying the table (say you are in single user mode), then there is no need to create the additional index.
I have a table with thousands of records. I do a lot of selects like this to find if a person exists.
SELECT * from person WHERE personid='U244A902'
Because the person ID is not purely numerical, I didn't use it as the primary key and went with an auto-increment column instead. But now I'm rethinking my strategy, because I think SELECTs are getting slower as the table fills up. I'm thinking the reason behind this slowness is that personid is not the primary key.
So my question, if I were to go through the trouble of restructuring the table and use the personid as the primary key instead, without an auto-increment, would that significantly speed up the selects? I'm talking about a table that has 200,000 records now and will fill up to about 5 million when done.
The slowness is due, indirectly, to the fact that personid is not a primary key: it isn't indexed at all because it wasn't defined as a key. The quickest fix is to simply index it:
CREATE UNIQUE INDEX `idx_personid` ON `person` (`personid`);
However, if it is a unique value, it should be the table's primary key. There is no real need for a separate auto_increment key.
ALTER TABLE person DROP COLUMN the_auto_increment_column;
ALTER TABLE person ADD PRIMARY KEY (personid);
Note however, that if you were also using the_auto_increment_column as a FOREIGN KEY in other tables and dropped it in favor of personid, you would need to modify all your other tables to use personid instead. The difficulty of doing so may not be completely worth the gain for you.
You can create an index on personid.
CREATE INDEX id_index ON person (personid);
ALTER TABLE `person` ADD INDEX `index1` (`personid`);
Try to index the columns that you use in your WHERE clauses or that you are selecting on.
What would be an appropriate way to do this, since MySQL obviously doesn't like it?
Leaving either partitioning or the foreign keys out of the database design does not seem like a good idea to me. I'd guess that there is a workaround for this?
Update 03/24:
http://opendba.blogspot.com/2008/10/mysql-partitioned-tables-with-trigger.html
How to handle foreign key while partitioning
Thanks!
It depends on the extent to which the size of rows in the partitioned table is the reason for partitions being necessary.
If the row size is small and the reason for partitioning is the sheer number of rows, then I'm not sure what you should do.
If the row size is quite big, then have you considered the following:
Let P be the partitioned table and F be the table referenced in the would-be foreign key. Create a new table X:
CREATE TABLE `X` (
    `P_id` INT UNSIGNED NOT NULL,
        -- I'm assuming an INT is adequate, but perhaps
        -- you will actually require a BIGINT
    `F_id` INT UNSIGNED NOT NULL,
    PRIMARY KEY (`P_id`, `F_id`),
    CONSTRAINT `Constr_X_P_fk`
        FOREIGN KEY `P_fk` (`P_id`) REFERENCES `P` (`id`)
        ON DELETE CASCADE ON UPDATE RESTRICT,
    CONSTRAINT `Constr_X_F_fk`
        FOREIGN KEY `F_fk` (`F_id`) REFERENCES `F` (`id`)
        ON DELETE RESTRICT ON UPDATE RESTRICT
) ENGINE=INNODB CHARACTER SET ascii COLLATE ascii_general_ci;
and crucially, create a stored procedure for adding rows to table P. Your stored procedure should make certain (use transactions) that whenever a row is added to table P, a corresponding row is added to table X. You must not allow rows to be added to P in the "normal" way! You can only guarantee that referential integrity will be maintained if you keep to using your stored procedure for adding rows. You can freely delete from P in the normal way, though.
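A minimal sketch of what such a procedure could look like, assuming P's id is supplied explicitly and ignoring P's other columns (the procedure and parameter names here are placeholders, not part of the original answer):

DELIMITER //
CREATE PROCEDURE add_P_row (IN p_id INT UNSIGNED, IN f_id INT UNSIGNED)
BEGIN
    -- Insert into P and X within one transaction so that X can never
    -- fall out of sync with P.
    START TRANSACTION;
    INSERT INTO P (id /* , other columns */) VALUES (p_id /* , ... */);
    INSERT INTO X (P_id, F_id) VALUES (p_id, f_id);
    COMMIT;
END //
DELIMITER ;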
The idea here is that your table X has sufficiently small rows that you should hopefully not need to partition it, even though it has many many rows. The index on the table will nevertheless take up quite a large chunk of memory, I guess.
Should you need to query P on the foreign key, you will of course query X instead, as that is where the foreign key actually is.
I would strongly suggest sharding, using the date as the key, to archive data into archive tables. If you need to report across multiple archive tables, you can use Views, or build the logic into your application.
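As a rough sketch of the Views approach, assuming hypothetical archive tables orders_2023 and orders_2024 with the same structure as a current orders table:

CREATE VIEW orders_all AS
    SELECT * FROM orders_2023
    UNION ALL
    SELECT * FROM orders_2024
    UNION ALL
    SELECT * FROM orders;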
However, with a properly structured DB, you should be able to handle tens of millions of rows in a table before partitioning, or sharding is really needed.