I am trying to optimize my table schemas by removing redundant keys. Both Percona Toolkit and common_schema tell me that the following key is redundant:
mysql> SELECT redundant_index_name, sql_drop_index FROM redundant_keys;
+----------------------+-------------------------------------------------------------------------------+
| redundant_index_name | sql_drop_index |
+----------------------+-------------------------------------------------------------------------------+
| deviceName | ALTER TABLE `reporting`.`tbCardData` DROP INDEX `deviceName` |
+----------------------+-------------------------------------------------------------------------------+
1 row in set (0.18 sec)
mysql> show create table `reporting`.`tbCardData`;
CREATE TABLE `tbCardData` (
`pkCardDataId` bigint(12) unsigned NOT NULL AUTO_INCREMENT,
`deviceName` varchar(64) DEFAULT NULL,
`shelfId` smallint(3) unsigned DEFAULT NULL,
`cardId` smallint(3) unsigned DEFAULT NULL,
`cardName` varchar(64) DEFAULT NULL,
`cardType` smallint(3) unsigned DEFAULT NULL,
`cardSubType` smallint(3) unsigned DEFAULT NULL,
`cardSpareGroupId` smallint(3) unsigned DEFAULT NULL,
`cardSerialNum` varchar(64) DEFAULT NULL,
`cardCarrierSerialNum` varchar(64) DEFAULT NULL,
`dom` tinyint(2) unsigned NOT NULL DEFAULT '0',
`updateTime` int(11) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`pkCardDataId`),
UNIQUE KEY `devchascarddom` (`deviceName`,`shelfId`,`cardId`,`dom`),
KEY `deviceName` (`deviceName`),
KEY `dom` (`dom`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I understand that the deviceName key and the unique key devchascarddom share the leftmost column, deviceName. But it seems to me that each combination in the unique key occurs only once, whereas the same deviceName appears many times across rows. In other words, dropping the deviceName key doesn't seem right to me, but I am no MySQL guru: should I drop it, or is this just the way these tools report things, and a suggestion I should disregard?
MySQL can use the first part of the compound index devchascarddom in the same way it can use deviceName. These tools are telling you the truth. The deviceName index will be smaller, and if you can get rid of devchascarddom instead, that would be better. You'll have to look at the EXPLAIN output for your queries to see if that's possible.
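A quick way to see this for yourself on a test copy: drop the deviceName index and run EXPLAIN on a query that filters only on deviceName (the device name below is made up):
EXPLAIN SELECT cardName, cardType
FROM reporting.tbCardData
WHERE deviceName = 'some-device';
-- expect key = devchascarddom: the leftmost column of the compound
-- unique key satisfies this WHERE clause on its own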
It's saying that if it needed an index on deviceName, it would use the compound unique key devchascarddom to get it, because that index is itself ordered by deviceName first.
Mind you, I've no idea whether that is true, but it would make more sense than replicating each member.
e.g.
Device1
  Shelf1
    Card1
      dom1
      dom2
etc.
Now if you had an index on shelfId alone, it wouldn't be reported as redundant, because shelfId is not the leftmost column of the compound key.
Related
I have an InnoDB table named "transaction" with ~1.5 million rows. I would like to partition this table (probably on the column "gas_station_id", since it is used a lot in join queries), but I've read in the MySQL 5.7 Reference Manual that
All columns used in the table's partitioning expression must be part of every unique key that the table may have, including any primary key.
I have two questions:
1. The column "gas_station_id" is not part of any unique key or the primary key. How could I partition this table then?
2. Even if I could partition the table, which partitioning type would be better in this case? I was thinking about LIST partitioning (we have about 40 distinct gas stations), but I am not sure, since there would be only one value in each list partition, like the following:
ALTER TABLE transaction
PARTITION BY LIST (gas_station_id)
    (PARTITION p1 VALUES IN (9001),
     PARTITION p2 VALUES IN (9002), ...);
I tried partitioning by KEY, but I receive the following error (I think because id is not part of all unique keys):
#1503 - A UNIQUE INDEX must include all columns in the table's partitioning function
This is the structure of the "transaction" table, as shown by SHOW CREATE TABLE:
CREATE TABLE `transaction` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`terminal_transaction_id` int(11) NOT NULL,
`fuel_terminal_id` int(11) NOT NULL,
`fuel_terminal_serial` int(11) NOT NULL,
`xboard_id` int(11) NOT NULL,
`gas_station_id` int(11) NOT NULL,
`operator_id` varchar(16) NOT NULL,
`shift_id` int(11) NOT NULL,
`xboard_total_counter` int(11) NOT NULL,
`fuel_type` tinyint(2) NOT NULL,
`start_fuel_time` int(11) NOT NULL,
`end_fuel_time` int(11) DEFAULT NULL,
`preset_amount` int(11) NOT NULL,
`actual_amount` int(11) DEFAULT NULL,
`fuel_cost` int(11) DEFAULT NULL,
`payment_cost` int(11) DEFAULT NULL,
`purchase_type` int(11) NOT NULL,
`payment_ref_id` text,
`unit_fuel_price` int(11) NOT NULL,
`fuel_status_id` int(11) DEFAULT NULL,
`fuel_mode_id` int(11) NOT NULL,
`payment_result` int(11) NOT NULL,
`card_pan` varchar(20) DEFAULT NULL,
`state` int(11) DEFAULT NULL,
`totalizer` int(11) NOT NULL DEFAULT '0',
`shift_start_time` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `terminal_transaction_id` (`terminal_transaction_id`,`fuel_terminal_id`,`start_fuel_time`) USING BTREE,
KEY `start_fuel_time_idx` (`start_fuel_time`),
KEY `fuel_terminal_idx` (`fuel_terminal_id`),
KEY `xboard_idx` (`xboard_id`),
KEY `gas_station_id` (`gas_station_id`) USING BTREE,
KEY `purchase_type` (`purchase_type`) USING BTREE,
KEY `shift_start_time` (`shift_start_time`) USING BTREE,
KEY `fuel_type` (`fuel_type`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1665335 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT
Short answer: Don't use PARTITION. Let's see the query to help speed it up.
Long answer:
1.5M rows is only marginally big enough to consider partitioning.
PARTITION BY LIST is probably useless for performance.
You have not given enough info to give you answers other than vague hints. Please provide at least SHOW CREATE TABLE and the slow SELECT.
It is possible to add the partition key onto the end of the PRIMARY or UNIQUE key (sketched just after this list); you will lose the uniqueness test.
Don't index a low-cardinality column; it won't be used.
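For instance, a rough sketch of the "add the partition key onto the end" option from the list above (it weakens both uniqueness checks, since gas_station_id becomes part of them):
ALTER TABLE transaction
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (id, gas_station_id),
    DROP KEY terminal_transaction_id,
    ADD UNIQUE KEY terminal_transaction_id (terminal_transaction_id, fuel_terminal_id, start_fuel_time, gas_station_id);
Only after such a change would PARTITION BY LIST (gas_station_id) be accepted.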
More on PARTITION
I have a table with millions of rows, and the growth rate will probably increase in the future; so far about 4.3 million rows are added per month, causing the database to slow down. I have already applied indexing, but it is not really improving the speed. Is applying partitioning to such data favorable?
Also, how can I apply partitioning to a table with millions of rows? I know it will look something like this:
ALTER TABLE gpslogss
PARTITION BY KEY(DeviceCode)
PARTITIONS 10;
The problem is that I was partitioning on DeviceCode, which is not part of the primary key, so partitioning isn't permissible.
DROP TABLE IF EXISTS `gpslogss`;
CREATE TABLE `gpslogss` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`DeviceCode` varchar(255) DEFAULT NULL,
`Latitude` varchar(255) DEFAULT NULL,
`Longitude` varchar(255) DEFAULT NULL,
`Speed` double DEFAULT NULL,
`rowStamp` datetime DEFAULT NULL,
`Date` varchar(255) DEFAULT NULL,
`Time` varchar(255) DEFAULT NULL,
`AlarmCode` int(11) DEFAULT NULL,
PRIMARY KEY (`Id`) USING BTREE,
KEY `DeviceCode` (`DeviceCode`) USING BTREE
);
So I altered the table this way, creating it in a new database with 0 records, and it worked fine:
DROP TABLE IF EXISTS `gpslogss`;
CREATE TABLE `gpslogss` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`DeviceCode` varchar(255) DEFAULT NULL,
`Latitude` varchar(255) DEFAULT NULL,
`Longitude` varchar(255) DEFAULT NULL,
`Speed` double DEFAULT NULL,
`rowStamp` datetime DEFAULT NULL,
`Date` varchar(255) DEFAULT NULL,
`Time` varchar(255) DEFAULT NULL,
`AlarmCode` int(11) DEFAULT NULL,
KEY `Id` (`Id`) USING BTREE,
KEY `DeviceCode` (`DeviceCode`) USING BTREE
)
PARTITION BY KEY (DeviceCode)
PARTITIONS 10;
How should I write this so that I can apply partitioning to a table that already has millions of rows? And how should I drop keys and alter the table to apply partitioning without damaging the data?
Short answer: Don't.
Long answer: PARTITION BY KEY does not provide any performance benefit (that I know of). And why else use PARTITION?
Other notes:
You should use InnoDB for virtually all tables.
InnoDB tables should have an explicit PRIMARY KEY.
There is a DATETIME datatype; don't use VARCHAR for date or time, and don't split them.
latitude and longitude are numeric; don't use VARCHAR. FLOAT is a likely candidate (precise enough to differentiate vehicles, but not people).
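Putting those notes together, a rough sketch of what the table could look like (types and sizes are suggestions, not a drop-in migration):
CREATE TABLE gpslogss (
    Id int unsigned NOT NULL AUTO_INCREMENT,
    DeviceCode varchar(32) DEFAULT NULL,
    Latitude float DEFAULT NULL,        -- numeric, not VARCHAR
    Longitude float DEFAULT NULL,
    Speed double DEFAULT NULL,
    rowStamp datetime DEFAULT NULL,     -- one DATETIME; no split Date/Time strings
    AlarmCode int DEFAULT NULL,
    PRIMARY KEY (Id),                   -- explicit PK for InnoDB
    KEY DeviceCode (DeviceCode)
) ENGINE=InnoDB;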
Your real question is about speed. Let's see the slow SELECTs and work backward from them. Adding PARTITIONing is rarely a solution to a performance problem.
I am trying to convert a table from MyISAM to InnoDB. This is the definition, and I am getting the error #1075 - Incorrect table definition; there can be only one auto column and it must be defined as a key.
The table has an AUTO_INCREMENT column, the field is indexed, and it works with MyISAM. I am new to InnoDB, so it might be a dumb question:
CREATE TABLE `cart_item` (
`cart_id` int(10) unsigned NOT NULL DEFAULT '0',
`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`design_number` int(10) unsigned NOT NULL,
`logo_position_id` smallint(5) unsigned NOT NULL,
`subst_style_id` varchar(10) DEFAULT NULL,
`style_id` varchar(10) NOT NULL DEFAULT '',
`subst_color_id` smallint(5) unsigned DEFAULT NULL,
`color_id` smallint(5) unsigned NOT NULL,
`size_id` smallint(5) unsigned NOT NULL,
`qty` mediumint(8) unsigned NOT NULL,
`active` enum('y','n') NOT NULL DEFAULT 'y',
`date_last_modified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`last_modified_by_id` mediumint(5) unsigned NOT NULL,
`date_last_locked` datetime DEFAULT NULL,
`last_locked_by_id` smallint(5) unsigned NOT NULL,
`date_added` datetime NOT NULL,
`subsite_logo_group_id` int(11) NOT NULL,
`bundle` varchar(32) NOT NULL,
`color_stop_1` varchar(4) DEFAULT NULL,
PRIMARY KEY (`cart_id`,`id`),
KEY `color_id` (`color_id`),
KEY `style_id` (`style_id`),
KEY `size_id` (`size_id`),
KEY `design_number` (`design_number`),
KEY `subsite_logo_group_id` (`subsite_logo_group_id`),
KEY `date_added` (`date_added`),
KEY `bundle` (`bundle`)
) ENGINE=InnoDB
What you were doing on the MyISAM table cannot be done with InnoDB. See my answer to a (similar) problem: creating primary key based on date
MySQL docs, in the Using AUTO_INCREMENT section, explain it:
For MyISAM tables you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.
You may get similar behaviour in InnoDB, but not with AUTO_INCREMENT. You'll have to use either some fancy trigger or a stored procedure for your INSERTs that takes care of the (per cart_id) auto-increment.
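A minimal sketch of the trigger route (my own illustration, not from the manual): it assumes you first remove AUTO_INCREMENT from id, and it is racy under concurrent inserts into the same cart unless those are serialized:
DELIMITER //
CREATE TRIGGER cart_item_before_insert
BEFORE INSERT ON cart_item
FOR EACH ROW
BEGIN
    -- emulate MyISAM's per-prefix counter: next id within this cart_id
    SET NEW.id = (SELECT COALESCE(MAX(ci.id), 0) + 1
                  FROM cart_item ci
                  WHERE ci.cart_id = NEW.cart_id);
END//
DELIMITER ;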
You have a composite PRIMARY KEY defined on (cart_id, id), but AUTO_INCREMENT requires an index in which id is the leftmost column. You can add a KEY for it (not a primary key, just a plain index):
KEY `idx_id` (`id`)
I question the use of the composite PK on (cart_id, id), though, since id alone is already unique by definition. Perhaps you should make id the PK and create a separate index across the combination:
PRIMARY KEY (`id`),
KEY (`cart_id`, `id`)
It doesn't even need to be specified as UNIQUE, because the AUTO_INCREMENT values can't repeat anyway; there is no way to violate uniqueness on the combination (cart_id, id).
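For completeness, a sketch of that change as a single ALTER (the index name cart_id_id is arbitrary, and this assumes nothing else depends on the current composite PK):
ALTER TABLE cart_item
    DROP PRIMARY KEY,
    ADD PRIMARY KEY (id),
    ADD KEY cart_id_id (cart_id, id);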
The AUTO_INCREMENT column should be defined as a key, as the error implies:
`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
and set a UNIQUE key on the two columns instead of the primary key:
UNIQUE (`cart_id`,`id`),
I have the following MySQL statement:
CREATE TABLE `sampledata`.`ORDERFACT` (
`ORDERNUMBER` int(11) DEFAULT NULL,
`PRODUCTCODE` varchar(50) DEFAULT NULL,
`QUANTITYORDERED` int(11) DEFAULT NULL,
`PRICEEACH` decimal(17,5) DEFAULT NULL,
`ORDERLINENUMBER` int(11) DEFAULT NULL,
`TOTALPRICE` decimal(17,0) DEFAULT NULL,
`ORDERDATE` datetime DEFAULT NULL,
`REQUIREDDATE` datetime DEFAULT NULL,
`SHIPPEDDATE` datetime DEFAULT NULL,
`STATUS` varchar(15) DEFAULT NULL,
`COMMENTS` longtext,
`CUSTOMERNUMBER` int(11) DEFAULT NULL,
`TIME_ID` varchar(10) DEFAULT NULL,
`QTR_ID` bigint(20) DEFAULT NULL,
`MONTH_ID` bigint(20) DEFAULT NULL,
`YEAR_ID` bigint(20) DEFAULT NULL,
KEY `idx_ORDERFACT_lookup` (`ORDERNUMBER`,`PRODUCTCODE`,`QUANTITYORDERED`,`PRICEEACH`,`ORDERLINENUMBER`,`TOTALPRICE`,`ORDERDATE`,`REQUIREDDATE`,`SHIPPEDDATE`,`STATUS`,`CUSTOMERNUMBER`,`TIME_ID`,`QTR_ID`,`MONTH_ID`,`YEAR_ID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
Can someone explain to me what the last three lines do:
(KEY `idx_ORDERFACT_lookup` ...)
(KEY idx_ORDERFACT_lookup ...)
will create a composite index (not a primary key) on all the columns inside the brackets.
If you want to learn more, read MySQL :: MySQL 5.0 Reference Manual :: 7.5.2 Multiple-Column Indexes.
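For example, a query that filters on the index's leftmost column can use it (an illustrative query, not from the original post):
EXPLAIN SELECT PRODUCTCODE, QUANTITYORDERED
FROM sampledata.ORDERFACT
WHERE ORDERNUMBER = 10100;
-- can be resolved from idx_ORDERFACT_lookup alone: ORDERNUMBER is the
-- leftmost column, and the selected columns are also in the index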
Addendum:
Actually, if you are designing this table, I might have a few suggestions for you:
It is more cross-platform compatible to use lowercase letters in column names, since identifier case sensitivity differs between systems/DBMSs.
PRICEEACH... do you really need to store 5 decimal places for a price? And then TOTALPRICE has 0 decimal places...
Those MONTH_ID, QTR_ID, etc. columns look sort of weird, especially as BIGINT, but probably you have a requirement for that.
Just a few thoughts.
(KEY idx_ORDERFACT_lookup ...) will add an index to the table.
If you really want to dig deeper, then visit:
http://dev.mysql.com/doc/refman/5.0/en/create-table.html
It is topic 12.1.10. Focus on the third grey box. You will find something like this:
{INDEX|KEY} [index_name] [index_type] (index_col_name,...) [index_type]
Everything in square brackets is optional.
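For instance, both of the following are valid forms of the same clause (the index name idx_demo is made up):
KEY (ORDERNUMBER, PRODUCTCODE)
KEY idx_demo USING BTREE (ORDERNUMBER, PRODUCTCODE)
The first omits every optional part; the second supplies a name and an index type.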
Hope it helps!
The following query is using temporary and filesort. I'd like to avoid that if possible.
SELECT lib_name, description, count(seq_id), floor(avg(size))
FROM libraries l JOIN sequence s ON (l.lib_id=s.lib_id)
WHERE s.is_contig=0 and foreign_seqs=0 GROUP BY lib_name;
The EXPLAIN says:
id  select_type  table  type    possible_keys    key      key_len  ref       rows   Extra
1   SIMPLE       s      ref     libseq,contigs   contigs  4        const     28447  Using temporary; Using filesort
1   SIMPLE       l      eq_ref  PRIMARY          PRIMARY  4        s.lib_id  1      Using where
The tables look like this:
libraries
CREATE TABLE `libraries` (
`lib_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`lib_name` varchar(30) NOT NULL,
`method_id` int(10) unsigned DEFAULT NULL,
`lib_efficiency` decimal(4,2) unsigned DEFAULT NULL,
`insert_avg` decimal(5,2) DEFAULT NULL,
`insert_high` decimal(5,2) DEFAULT NULL,
`insert_low` decimal(5,2) DEFAULT NULL,
`amtvector` decimal(4,2) unsigned DEFAULT NULL,
`description` text,
`foreign_seqs` tinyint(1) NOT NULL DEFAULT '0' COMMENT '1 means the sequences in this library are not ours',
PRIMARY KEY (`lib_id`),
UNIQUE KEY `lib_name` (`lib_name`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=latin1;
sequence
CREATE TABLE `sequence` (
`seq_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`seq_name` varchar(40) NOT NULL DEFAULT '',
`lib_id` int(10) unsigned DEFAULT NULL,
`size` int(10) unsigned DEFAULT NULL,
`add_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`sequencing_date` date DEFAULT '0000-00-00',
`comment` text DEFAULT NULL,
`is_contig` int(10) unsigned NOT NULL DEFAULT '0',
`fasta_seq` longtext,
`primer` varchar(15) DEFAULT NULL,
`gc_count` int(10) DEFAULT NULL,
PRIMARY KEY (`seq_id`),
UNIQUE KEY `seq_name` (`seq_name`),
UNIQUE KEY `libseq` (`lib_id`,`seq_id`),
KEY `primer` (`primer`),
KEY `sgitnoc` (`seq_name`,`is_contig`),
KEY `contigs` (`is_contig`,`seq_name`) USING BTREE,
CONSTRAINT `FK_sequence_1` FOREIGN KEY (`lib_id`) REFERENCES `libraries` (`lib_id`)
) ENGINE=InnoDB AUTO_INCREMENT=61508 DEFAULT CHARSET=latin1 ROW_FORMAT=DYNAMIC;
Are there any changes I can make to speed this query up? If not, when (for a web application) is it worth putting the results of a query like the one above into a MEMORY table?
First strategy: make it faster for MySQL to locate the records you want summarized.
You've already got an index on sequence.is_contig. You might try indexing on libraries.foreign_seqs. I don't know if that will help, but it's worth a try.
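e.g., a one-line experiment (the index name is arbitrary):
ALTER TABLE libraries ADD INDEX foreign_seqs_idx (foreign_seqs);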
Second strategy: see if you can get your sort to run in memory, rather than in a file. Try making the sort_buffer_size parameter bigger. This will consume RAM on your server, but that's what RAM is for.
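For example, to experiment in a single session before touching the server configuration (the 4 MB figure is only illustrative; measure before and after):
SET SESSION sort_buffer_size = 4 * 1024 * 1024;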
Third strategy: IF your application runs this query a lot but updates the underlying data only a little, take your own suggestion and create a summary table. Perhaps use an EVENT to rebuild the summary table every few minutes. If you're going to follow that strategy, start by creating a view with this query in it and have your app retrieve information from the view. Then get the summary table working, drop the view, and give the summary table the same name as the view. That way your data-model work and your application-design work can proceed independently of each other.
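A minimal sketch of that approach (assumes the event scheduler is enabled with event_scheduler=ON; the lib_summary and ev_refresh_lib_summary names are made up):
CREATE TABLE lib_summary AS
    SELECT lib_name, description, COUNT(seq_id) AS n_seqs, FLOOR(AVG(size)) AS avg_size
    FROM libraries l JOIN sequence s ON (l.lib_id = s.lib_id)
    WHERE s.is_contig = 0 AND foreign_seqs = 0
    GROUP BY lib_name;

DELIMITER //
CREATE EVENT ev_refresh_lib_summary
ON SCHEDULE EVERY 10 MINUTE
DO BEGIN
    TRUNCATE lib_summary;
    INSERT INTO lib_summary
        SELECT lib_name, description, COUNT(seq_id), FLOOR(AVG(size))
        FROM libraries l JOIN sequence s ON (l.lib_id = s.lib_id)
        WHERE s.is_contig = 0 AND foreign_seqs = 0
        GROUP BY lib_name;
END//
DELIMITER ;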
Final suggestion: if this is truly slowly-changing summary data, switch to MyISAM. It's a little faster for this kind of data wrangling.