Implicit column index - mysql

Given the following MySQL table (InnoDB type):
CREATE TABLE `table` (
`id` INT NOT NULL,
`foo_id` INT NOT NULL,
`bar_id` INT NOT NULL,
`name` VARCHAR(255) NOT NULL, -- length assumed; MySQL requires one for VARCHAR
PRIMARY KEY (`id`),
INDEX `on_foo_id` (`foo_id`),
INDEX `on_bar_id` (`bar_id`),
UNIQUE `on_foo_bar_id` (`foo_id`, `bar_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Note: although I'm not using MySQL FOREIGN KEY constraints (I'm a Rails developer who handles this at the application level), the columns foo_id and bar_id are foreign keys.
Given the presence of:
the index on bar_id
the index on (foo_id, bar_id)
...I'm wondering whether the index on foo_id is really needed. Perhaps MySQL already indexes foo_id even without this column being explicitly declared as an index.
In other words, is it possible to remove this line:
INDEX `on_foo_id` (`foo_id`),
without affecting performance?
Thank you for the light.
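One way to check this is to ask the optimizer directly. MySQL can use the leftmost prefix of a composite index, so a unique index on (foo_id, bar_id) can also serve lookups on foo_id alone. A minimal sketch, assuming the unique index really is defined on (`foo_id`, `bar_id`); the value 42 is an arbitrary placeholder:

```sql
-- Which index does the optimizer pick for a lookup on foo_id alone?
-- While on_foo_id still exists, it may choose either index.
EXPLAIN SELECT * FROM `table` WHERE `foo_id` = 42;

-- Drop the single-column index, then run the same EXPLAIN again.
-- The `key` column should now name on_foo_bar_id, because foo_id is
-- the leftmost column of that composite index.
ALTER TABLE `table` DROP INDEX `on_foo_id`;
EXPLAIN SELECT * FROM `table` WHERE `foo_id` = 42;
```

The index on bar_id is a different matter: bar_id is not the leftmost column of the composite index, so lookups on bar_id alone still need on_bar_id.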

Related

MYSQL How to safely remove UNIQUE KEY? What to have in mind?

I have a table in database.
CREATE TABLE `comment_sheets` (
`id` mediumint(9) NOT NULL AUTO_INCREMENT,
`doc_id` mediumint(9) NOT NULL,
`level` varchar(10) DEFAULT NULL,
`author` varchar(30) DEFAULT NULL,
`status` varchar(10) DEFAULT NULL,
`creation_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `cs` (`doc_id`,`level`,`author`)
) ENGINE=InnoDB AUTO_INCREMENT=3961075 DEFAULT CHARSET=utf8 ;
My UNIQUE KEY cs (doc_id,level,author) is a problem now. I want to remove it because I need to allow duplicate values.
My question is: what should I keep in mind, or what should I worry about, when I delete a unique key?
Thanks.
To drop unique key
ALTER TABLE table_name
DROP INDEX index_name;
To drop primary key
ALTER TABLE table_name
DROP INDEX `PRIMARY`;
You need to alter the table:
ALTER TABLE comment_sheets DROP INDEX `cs`;
It really depends on how the key is used apart from enforcing uniqueness of data.
Check if the key is used in any foreign key relationships. If yes, you need to drop the foreign key before you can drop the unique one. (MySQL won't let you drop the index while the foreign key exists anyway.)
Check what queries may make use of the key and how its removal would affect their performance. You may have to add a non-unique key on these fields back.
In this particular case, dropping the index is a relatively simple task because it is neither the primary key nor a fulltext index. The only thing that may take long is the removal of the index data if your table is big (judging from the auto-increment value, it is not a small table).
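The checks above could be sketched as follows. This assumes the statements run in the schema that holds comment_sheets; the index name `cs_lookup` is hypothetical and not part of the original question.

```sql
-- Foreign keys in other tables that reference comment_sheets.
SELECT TABLE_NAME, COLUMN_NAME, CONSTRAINT_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE REFERENCED_TABLE_SCHEMA = DATABASE()
  AND REFERENCED_TABLE_NAME = 'comment_sheets';

-- Foreign keys declared on comment_sheets itself; one on doc_id
-- could be relying on `cs` as its supporting index.
SELECT COLUMN_NAME, CONSTRAINT_NAME, REFERENCED_TABLE_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'comment_sheets'
  AND REFERENCED_TABLE_NAME IS NOT NULL;

-- Swap the unique key for a plain index in a single statement, so
-- queries that filter on doc_id still have an index to use.
ALTER TABLE comment_sheets
  DROP INDEX `cs`,
  ADD INDEX `cs_lookup` (`doc_id`, `level`, `author`);
```

If both SELECTs return no rows, no foreign key depends on the index and the ALTER TABLE should go through without further preparation.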

MySql - Better Schema for large table IP pairings?

I'm trying to manage some internet logs. I'm essentially capturing what IPs are reaching out to what other IPs and making reports on it.
Problem is there's a ton of chatter and I'm not sure if I can make my schema any better.
my table schema:
CREATE TABLE `IpChatter` (
`Id` bigint(20) NOT NULL AUTO_INCREMENT,
`SourceIp` bigint(20) NULL,
`DestinationIp` bigint(20) NULL,
`SourcePort` int(11) NULL,
`DestinationPort` int(11) NULL,
`FKToSomeTableWithExtraMetaDataId` bigint(20) NOT NULL,
CONSTRAINT `PK_IpChatter` PRIMARY KEY (`Id` ASC)
) ENGINE=InnoDB;
CREATE INDEX `IX_IpChatter_FKToSomeTableWithExtraMetaDataId` ON `IpChatter` (`FKToSomeTableWithExtraMetaDataId`) using HASH;
CREATE INDEX `IX_IpChatter_Main_Query_SourceIp` ON `IpChatter` (`SourceIp`);
CREATE INDEX `IX_IpChatter_Main_Query_DestinationIp` ON `IpChatter` (`DestinationIp`);
CREATE INDEX `IX_IpChatter_Main_Query_SourcePort` ON `IpChatter` (`SourcePort`);
CREATE INDEX `IX_IpChatter_Main_Query_DestinationPort` ON `IpChatter` (`DestinationPort`);
ALTER TABLE `IpChatter` ADD CONSTRAINT `FK_IpChatter_FKToSomeTableWithExtraMetaData`
FOREIGN KEY (`FKToSomeTableWithExtraMetaDataId`) REFERENCES `FKToSomeTableWithExtraMetaData` (`Id`)
ON DELETE CASCADE;
Right now I've got 2 million rows of data, and the query pulls back the data I need in about 4 seconds. However, this is with relatively light testing data. I'd imagine the data being 30 times larger in the final product, so that 4 seconds will surely become 2 minutes. Is there a better way I could normalize this data, or have I hit a bottleneck with not much left to do? Also, are the indexes I picked OK?
Never mind, I figured it out. I guess I just needed to type out the problem to help me think up a solution.
So after looking at my data I've noticed a lot of pairings are repeated but under a different FKToSomeTableWithExtraMetaDataId value.
That tells me I can normalize the data by creating a table with the distinct pairings of SourceIp, DestinationIp, SourcePort, DestinationPort, then creating a lookup table to join that table with the FKToSomeTableWithExtraMetaData table.
This shrinks my raw IP data by roughly 17x (1700%)! That should give a tremendous increase in performance when searching for a range of IPs, since the query now has to go through far fewer rows. Plus, with the lookup table I have greater flexibility in how I can query.
CREATE TABLE `IpChatter` (
`Id` bigint(20) NOT NULL AUTO_INCREMENT,
`SourceIp` bigint(20) NULL,
`DestinationIp` bigint(20) NULL,
`SourcePort` int(11) NULL,
`DestinationPort` int(11) NULL,
`FKToSomeLookupTableId` bigint(20) NOT NULL,
CONSTRAINT `PK_IpChatter` PRIMARY KEY (`Id` ASC)
) ENGINE=InnoDB;
CREATE INDEX `IX_IpChatter_FKToSomeLookupTableId` ON `IpChatter` (`FKToSomeLookupTableId`) using HASH;
CREATE INDEX `IX_IpChatter_Main_Query_SourceIp` ON `IpChatter` (`SourceIp`);
CREATE INDEX `IX_IpChatter_Main_Query_DestinationIp` ON `IpChatter` (`DestinationIp`);
CREATE INDEX `IX_IpChatter_Main_Query_SourcePort` ON `IpChatter` (`SourcePort`);
CREATE INDEX `IX_IpChatter_Main_Query_DestinationPort` ON `IpChatter` (`DestinationPort`);
ALTER TABLE `IpChatter` ADD CONSTRAINT `FK_IpChatter_FKToSomeLookupTable`
FOREIGN KEY (`FKToSomeLookupTableId`) REFERENCES `FKToSomeLookupTable` (`Id`)
ON DELETE CASCADE;
CREATE TABLE `FKToSomeLookupTable` (
`Id` bigint(20) NOT NULL AUTO_INCREMENT,
`FKToSomeTableWithExtraMetaDataId` bigint(20) NOT NULL,
`IpChatterId` bigint(20) NOT NULL,
CONSTRAINT `PK_FKToSomeLookupTable` PRIMARY KEY (`Id` ASC)
) ENGINE=InnoDB;
CREATE INDEX `IX_IpChatter_FKToSomeTableWithExtraMetaDataId` ON `FKToSomeLookupTable` (`FKToSomeTableWithExtraMetaDataId`) using HASH;
CREATE INDEX `IX_IpChatter_IpChatterId` ON `FKToSomeLookupTable` (`IpChatterId`) using HASH;
ALTER TABLE `FKToSomeLookupTable` ADD CONSTRAINT `FK_FKToSomeLookupTable_FKToSomeTableWithExtraMetaData`
FOREIGN KEY (`FKToSomeTableWithExtraMetaDataId`) REFERENCES `FKToSomeTableWithExtraMetaData` (`Id`)
ON DELETE CASCADE;
ALTER TABLE `FKToSomeLookupTable` ADD CONSTRAINT `FK_FKToSomeLookupTable_IpChatter`
FOREIGN KEY (`IpChatterId`) REFERENCES `IpChatter` (`Id`)
ON DELETE CASCADE;
Shrink the table size. A smaller table is one way to help (somewhat) with speed.
IPv4 can be packed into INT UNSIGNED, which is 4 bytes versus your current 8-byte BIGINT. IPv6, on the other hand, needs BINARY(16); what you have will not work.
Port numbers range from 0 to 65535, so they fit in a 2-byte SMALLINT UNSIGNED.
Are you expecting your tables to be bigger than 4 billion rows? If not, use INT UNSIGNED instead of BIGINT for ids.
Get rid of FOREIGN KEYs, they slow down things; meanwhile, the constraints have never triggered an error, have they? Do you really use the overhead of CASCADE?
Don't index every column. Look at your queries and index the columns or combinations of columns that would benefit SELECTs, UPDATEs, and DELETEs.
Please show the queries; without them, we cannot judge the performance.
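As an illustration of the type suggestions above, here is a minimal sketch of a slimmer table. It assumes IPv4 only and fewer than about 4 billion rows, and it leaves out the foreign key column for brevity; the sample addresses and ports are made up.

```sql
CREATE TABLE `IpChatter` (
  `Id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
  `SourceIp` INT UNSIGNED NULL,              -- 4 bytes instead of 8
  `DestinationIp` INT UNSIGNED NULL,
  `SourcePort` SMALLINT UNSIGNED NULL,       -- 0..65535 fits in 2 bytes
  `DestinationPort` SMALLINT UNSIGNED NULL,
  PRIMARY KEY (`Id`)
) ENGINE=InnoDB;

-- INET_ATON / INET_NTOA convert between dotted-quad text and the
-- integer form stored in the table.
INSERT INTO `IpChatter` (`SourceIp`, `DestinationIp`, `SourcePort`, `DestinationPort`)
VALUES (INET_ATON('192.168.0.1'), INET_ATON('10.0.0.5'), 51234, 443);

-- Range search over a /24, converting back to text for display.
SELECT INET_NTOA(`SourceIp`) AS src, INET_NTOA(`DestinationIp`) AS dst
FROM `IpChatter`
WHERE `SourceIp` BETWEEN INET_ATON('192.168.0.0') AND INET_ATON('192.168.0.255');
```

Which indexes to add on top of this depends on the actual queries, as noted above.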

Mysql too slow on simple query between two tables

Good morning,
I have two tables, ANALISI with 1462632 records and PAZIENTE with 1408146 records. This simple count, which uses one of the indexes on PAZIENTE, takes about 30 seconds to count about 65000 records:
SELECT COUNT(analisi0_.ID_ANALISI) AS col_0_0_
FROM Analisi analisi0_
INNER JOIN Paziente paziente1_ ON analisi0_.ID_PAZIENTE = paziente1_.ID_PAZIENTE
WHERE (paziente1_.nome LIKE 'MARIA%')
I've also tried adding an index on analisi0_.ID_PAZIENTE, but it did not help.
Is there a way to enhance performance?
The corresponding EXPLAIN output seems OK to me. These are the table definitions:
CREATE TABLE ANALISI
(
ID_ANALISI INT UNSIGNED NOT NULL AUTO_INCREMENT,
ID_PAZIENTE INT UNSIGNED NOT NULL,
ID_SESSIONE INT UNSIGNED NOT NULL,
TRACCIATO TINYINT UNSIGNED NOT NULL,
CAMPIONE VARCHAR(30),
ID_PATOLOGICO TINYINT UNSIGNED,
REPARTO VARCHAR(40),
TOTALE_PROTEINE FLOAT,
RAPP_AG FLOAT,
ID_ANALISI_LINK INT UNSIGNED,
ID_ANALISI_IFE INT UNSIGNED,
ID_ANALISI_DATI INT UNSIGNED,
ID_ANALISI_NOTA INT UNSIGNED,
DATA_MODIFICA DATETIME,
ID_UTENTE_MODIFICA SMALLINT UNSIGNED,
DATA_VALIDAZIONE DATETIME,
ID_TIPO_VALIDAZIONE TINYINT UNSIGNED NOT NULL,
ID_UTENTE_VALIDAZIONE SMALLINT UNSIGNED,
DATA_CANCELLAZIONE DATETIME,
ID_UTENTE_CANCELLAZIONE SMALLINT UNSIGNED,
PRIMARY KEY (ID_ANALISI),
INDEX IDX_CAMPIONE (CAMPIONE),
INDEX IDX_REPARTO (REPARTO),
CONSTRAINT FK_ANALISI_PAZIENTE FOREIGN KEY (ID_PAZIENTE) REFERENCES PAZIENTE(ID_PAZIENTE),
CONSTRAINT FK_ANALISI_SESSIONE FOREIGN KEY (ID_SESSIONE) REFERENCES SESSIONE(ID_SESSIONE),
CONSTRAINT FK_ANALISI_PATOLOGICO FOREIGN KEY (ID_PATOLOGICO) REFERENCES PATOLOGICO(ID_PATOLOGICO),
CONSTRAINT FK_ANALISI_TIPO_VALIDAZIONE FOREIGN KEY (ID_TIPO_VALIDAZIONE) REFERENCES TIPO_VALIDAZIONE(ID_TIPO_VALIDAZIONE),
CONSTRAINT FK_ANALISI_UTENTE_MODIFICA FOREIGN KEY (ID_UTENTE_MODIFICA) REFERENCES UTENTE(ID_UTENTE),
CONSTRAINT FK_ANALISI_UTENTE_VALIDAZIONE FOREIGN KEY (ID_UTENTE_VALIDAZIONE) REFERENCES UTENTE(ID_UTENTE),
CONSTRAINT FK_ANALISI_UTENTE_CANCELLAZIONE FOREIGN KEY (ID_UTENTE_CANCELLAZIONE) REFERENCES UTENTE(ID_UTENTE),
CONSTRAINT FK_ANALISI_ANALISI_LINK FOREIGN KEY (ID_ANALISI_LINK) REFERENCES ANALISI(ID_ANALISI),
CONSTRAINT FK_ANALISI_ANALISI_IFE FOREIGN KEY (ID_ANALISI_IFE) REFERENCES ANALISI_IFE(ID_ANALISI_IFE),
CONSTRAINT FK_ANALISI_ANALISI_NOTA FOREIGN KEY (ID_ANALISI_NOTA) REFERENCES ANALISI_NOTA(ID_ANALISI_NOTA),
CONSTRAINT FK_ANALISI_ANALISI_DATI FOREIGN KEY (ID_ANALISI_DATI) REFERENCES ANALISI_DATI(ID_ANALISI_DATI)
)
ENGINE=InnoDB;
CREATE TABLE PAZIENTE
(
ID_PAZIENTE INT UNSIGNED NOT NULL AUTO_INCREMENT,
ID_PAZIENTE_LAB VARCHAR(20),
COGNOME VARCHAR(30),
NOME VARCHAR(30),
DATA_NASCITA DATE,
ID_SESSO TINYINT UNSIGNED NOT NULL,
RECAPITO VARCHAR(50),
CODICE_FISCALE VARCHAR(30),
ID_SPECIE TINYINT UNSIGNED NOT NULL,
PRIMARY KEY (ID_PAZIENTE),
INDEX IDX_DATA_NASCITA (DATA_NASCITA),
INDEX IDX_COGNOME (COGNOME),
INDEX IDX_NOME (NOME),
INDEX IDX_SESSO (ID_SESSO),
CONSTRAINT FK_PAZIENTE_SPECIE FOREIGN KEY (ID_SPECIE) REFERENCES SPECIE(ID_SPECIE),
CONSTRAINT FK_PAZIENTE_SESSO FOREIGN KEY (ID_SESSO) REFERENCES SESSO(ID_SESSO)
)
ENGINE=InnoDB;
In InnoDB every index contains the primary key implicitly.
The explain plan shows that index IDX_NOME is used on table Paziente. The DBMS looks up the name in the index and finds ID_PAZIENTE in there, which is the key we need to access the other table. So there is nothing to add. (In another DBMS we would have added a composite index on (NOME, ID_PAZIENTE) for this to happen.)
Then there is table Analisi to consider. We find a record via FK_ANALISI_PAZIENTE, which contains the ID_PAZIENTE used to find the match and, implicitly, the primary key ID_ANALISI, which could be used to access the table. But this is not even necessary, because we have all the information we need from the index. There is nothing left that we need to find in the table. (Again, in another DBMS we would have added a composite index on (ID_PAZIENTE, ID_ANALISI) to have a covering index.)
So what happens is merely: read one index in order to read the other index in order to count. Perfect. There is nothing to add.
We could replace COUNT(analisi0_.ID_ANALISI) with COUNT(*) as the former only says "count records where ID_ANALISI is not null", which is always the case as ID_ANALISI is the table's primary key. So it's simpler to use the latter and say "count records". However, I don't expect this to speed up the query significantly if at all.
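For reference, the COUNT(*) variant described above would look like this. It is the query from the question with only the counted expression changed; prefixing it with EXPLAIN is just a way to confirm that the plan still uses the two indexes.

```sql
-- Same join and filter as in the question, counting rows instead of
-- non-null ID_ANALISI values (equivalent here, since ID_ANALISI is
-- the primary key and can never be NULL).
EXPLAIN
SELECT COUNT(*) AS col_0_0_
FROM Analisi analisi0_
INNER JOIN Paziente paziente1_ ON analisi0_.ID_PAZIENTE = paziente1_.ID_PAZIENTE
WHERE paziente1_.nome LIKE 'MARIA%';
```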
So from a query point of view, there is nothing to speed this up. Here are further things that come to mind:
Partitioned tables? No, I would see no benefit in this. It could be faster were the query executed in parallel threads then, but as far as I know, there is no parallel execution on multiple partitions in MySQL. (I may be wrong though.)
Defragmenting the tables? No, the tables themselves are not even accessed in the query.
That leaves us with: Buy better hardware. (Sorry not to have any better advice for you.)

Should the normal Key include the primary key?

This question is related to
Do MySQL tables need an ID?
There is a meaningless auto-increment ID acting as the PRIMARY KEY of a table. When I create other KEYs, should I include this ID in them?
For example, in this table:
CREATE TABLE `location` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`country` varchar(50),
`states` varchar(50),
`city` varchar(50),
`county` varchar(50),
`zip` int(5),
PRIMARY KEY(ID),
KEY zip1 (zip),
KEY zip2 (zip, ID)
) ENGINE=InnoDB;
Because I search the table by zip code a lot, I need a KEY that starts with the zip code. I could use either KEY zip1 or KEY zip2. Which of the two is better?
For InnoDB, the primary key is always included in secondary indexes. As the MySQL documentation puts it:
All indexes other than the clustered index are known as secondary indexes. In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index.
In other words, ID is already included in zip1 and does not have to be listed explicitly as it is in zip2.
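A quick way to see this for yourself, as a sketch: drop zip2 and check that a query needing only zip and ID is still answered from zip1. The zip value 90210 is an arbitrary example.

```sql
-- zip2 adds nothing over zip1 on InnoDB, so remove it.
ALTER TABLE `location` DROP INDEX `zip2`;

-- Only zip and ID are needed, and both live in every zip1 entry.
-- If the Extra column reports "Using index", the query was served
-- from the secondary index without visiting the clustered index.
EXPLAIN SELECT `ID` FROM `location` WHERE `zip` = 90210;
```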

What does the line " KEY `idx_pid` (`person_id`), " mean?

I am new to MySQL and am working on an online server (MySQL version 5.1.69), and I have the following table:
CREATE TABLE `person_info` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`person_id` int(11) NOT NULL,
`info_type_id` int(11) NOT NULL,
`info` text NOT NULL,
`note` text,
PRIMARY KEY (`id`),
KEY `idx_pid` (`person_id`),
KEY `person_info_info_type_id_exists` (`info_type_id`)
)
Can someone explain to me what " KEY idx_pid (person_id)," does?
KEY, in MySQL, is an alias for INDEX; you can see this in the pseudo grammar in the CREATE TABLE documentation:
[INDEX|KEY] [index_name] (index_col_name,...)
It represents the definition of an index on a table, and nothing more. Here,
KEY `idx_pid` (`person_id`),
…creates an index named "idx_pid" on the column "person_id". This could have also been written as,
INDEX `idx_pid` (`person_id`),
However, MySQL's SHOW CREATE TABLE command (and other commands) will prefer KEY. It is an unfortunate choice for a keyword here, as it has nothing to do with a “key¹” in the relational databases sense of the word.
¹A key, in relational database theory, is a set of columns that uniquely identify a row.
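The equivalence of the two keywords can be checked on a scratch table. A small sketch; the table name `key_demo` is hypothetical and exists only for this illustration:

```sql
-- Declare the index with INDEX rather than KEY.
CREATE TABLE `key_demo` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `person_id` INT NOT NULL,
  PRIMARY KEY (`id`),
  INDEX `idx_pid` (`person_id`)
);

-- MySQL prints the same definition back using the KEY keyword.
SHOW CREATE TABLE `key_demo`;

-- idx_pid is listed with Non_unique = 1: it is a plain index and
-- does not enforce uniqueness.
SHOW INDEX FROM `key_demo`;
```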
It means you're creating an index named "idx_pid" on the person_info.person_id column.
This adds an index named idx_pid on the person_id column, which speeds up queries that use person_id as a condition.
You can read up on MySQL indexes here.