MySQL Partition by Range on YEAR(date) wrong results - mysql

I have a table defined as below:
CREATE TABLE `adverts_stats_clicks` (
`advert_stats_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`advertisment_id` int(10) unsigned NOT NULL DEFAULT '0',
`domain_id` int(10) unsigned NOT NULL DEFAULT '0',
`catdom_id` int(10) unsigned NOT NULL DEFAULT '0',
`added_on` date NOT NULL DEFAULT '0000-00-00',
`ipfrom` varchar(64) NOT NULL DEFAULT '',
`click_on` varchar(10) NOT NULL DEFAULT '',
`click_from` varchar(5) NOT NULL DEFAULT '',
`click_from_path` char(3) NOT NULL DEFAULT '',
`country_id` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`advert_stats_id`,`added_on`),
KEY `domain_id` (`domain_id`),
KEY `click_on` (`click_on`),
KEY `advertisment_id` (`advertisment_id`),
KEY `country_id` (`country_id`),
KEY `added_on` (`added_on`)
) ENGINE=MyISAM AUTO_INCREMENT=48176808 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (YEAR(added_on))
(PARTITION ac0 VALUES LESS THAN (2005) ENGINE = MyISAM,
PARTITION ac1 VALUES LESS THAN (2006) ENGINE = MyISAM,
PARTITION ac2 VALUES LESS THAN (2007) ENGINE = MyISAM,
PARTITION ac3 VALUES LESS THAN (2009) ENGINE = MyISAM,
PARTITION ac4 VALUES LESS THAN (2010) ENGINE = MyISAM,
PARTITION ac5 VALUES LESS THAN (2011) ENGINE = MyISAM,
PARTITION ac6 VALUES LESS THAN (2012) ENGINE = MyISAM,
PARTITION ac7 VALUES LESS THAN (2013) ENGINE = MyISAM,
PARTITION ac8 VALUES LESS THAN (2014) ENGINE = MyISAM,
PARTITION ac9 VALUES LESS THAN (2015) ENGINE = MyISAM,
PARTITION ac99 VALUES LESS THAN MAXVALUE ENGINE = MyISAM) */
Then if I run the following query:
SELECT advert_stats_id
FROM adverts_stats_clicks
WHERE domain_id = 618
AND click_on = 'www'
AND click_from IN('sel','top','s','e','adv')
AND advertisment_id = 6122
AND added_on BETWEEN '2012-10-01' AND '2014-12-31';
I get, for example, 152 results. But if I change the date range as follows:
SELECT advert_stats_id
FROM adverts_stats_clicks
WHERE domain_id = 618
AND click_on = 'www'
AND click_from IN('sel','top','s','e','adv')
AND advertisment_id = 6122
AND added_on BETWEEN '2013-10-01' AND '2014-12-31';
I get 643 results, which makes no sense, because the first date range fully contains the second.
If I run explain partitions I get the following:
id select_type table partitions type possible_keys key key_len ref rows Extra
1 SIMPLE adverts_stats_clicks ac7,ac8,ac9 index_merge domain_id,click_on,advertisment_id,added_on advertisment_id,domain_id 4,4 NULL 114 Using intersect(advertisment_id,domain_id); Using where
So in the first query the correct partitions (ac7, ac8, ac9) were selected, which shows that pruning is working. For the second query the correct partitions (ac8, ac9) were also selected; however, the results returned by both queries cannot be trusted.
What is happening? Why does it behave like this?
Some extra info from the server variables, in case it is important for figuring this out:
Variable_name Value
character_set_client utf8mb4
character_set_connection utf8mb4
character_set_database latin1
character_set_filesystem binary
character_set_results utf8mb4
character_set_server latin1
character_set_system utf8
character_sets_dir /usr/share/mysql/charsets/
Please help and ask additional questions if you need more info in order to help.
Thank you!

When you use BETWEEN '2013-10-01' AND '2014-12-31', both bounds are written as string literals.
To be sure the comparison is made between dates, convert the strings to dates explicitly.
Use the STR_TO_DATE() function for that.
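As a minimal sketch of that suggestion, the second query from the question could be written with explicit conversions. The format string '%Y-%m-%d' is an assumption based on the literals shown in the question; whether this changes the row count on the affected table is not something this example can confirm.

```sql
-- Same filter as in the question, but with the date bounds converted explicitly.
-- '%Y-%m-%d' is assumed to match the 'YYYY-MM-DD' literals used above.
SELECT advert_stats_id
FROM adverts_stats_clicks
WHERE domain_id = 618
  AND click_on = 'www'
  AND click_from IN ('sel', 'top', 's', 'e', 'adv')
  AND advertisment_id = 6122
  AND added_on BETWEEN STR_TO_DATE('2013-10-01', '%Y-%m-%d')
                   AND STR_TO_DATE('2014-12-31', '%Y-%m-%d');
```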

Related

MySQL partition contains more records than expected

I have partitioned a MySQL table containing 53 rows. Now when I query the number of records in all partitions, the total is almost 3 times what I expect. Even phpMyAdmin thinks there are 156 records.
Have I done something wrong in my table design and partitioning?
Below picture shows count of records in partitions:
phpMyAdmin:
Finally, this is my table:
CREATE TABLE cl_inbox (
id int(11) NOT NULL AUTO_INCREMENT,
user int(11) NOT NULL,
contact int(11) DEFAULT NULL,
sdate timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
body text NOT NULL,
userstatus tinyint(4) NOT NULL DEFAULT 1 COMMENT '0: new, 1:read, 2: deleted',
contactstatus tinyint(4) NOT NULL DEFAULT 0,
class tinyint(4) NOT NULL DEFAULT 0,
attachtype tinyint(4) NOT NULL DEFAULT 0,
attachsrc varchar(255) DEFAULT NULL,
PRIMARY KEY (id, user),
INDEX i_class (class),
INDEX i_contact_user (contact, user),
INDEX i_contactstatus (contactstatus),
INDEX i_user_contact (user, contact),
INDEX i_userstatus (userstatus)
)
ENGINE = INNODB
AUTO_INCREMENT = 69
AVG_ROW_LENGTH = 19972
CHARACTER SET utf8
COLLATE utf8_general_ci
ROW_FORMAT = DYNAMIC
PARTITION BY KEY (`user`)
(
PARTITION partition1 ENGINE = INNODB,
PARTITION partition2 ENGINE = INNODB,
PARTITION partition3 ENGINE = INNODB,
.....
PARTITION partition128 ENGINE = INNODB
);
Those numbers are approximations, just as with SHOW TABLE STATUS and EXPLAIN.
Meanwhile, you will probably find that PARTITION BY KEY provides no performance improvement. If you find otherwise, I would be very interested to hear about it.
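To see the difference between the estimates and the real counts, something along these lines could be run. This is only a sketch: the explicit `PARTITION (...)` selection in a SELECT assumes MySQL 5.6 or later, and `partition1` is simply the first partition name from the table definition above.

```sql
-- Exact row count for the whole table (scans the data, so it is accurate).
SELECT COUNT(*) FROM cl_inbox;

-- Exact row count for a single partition (assumes MySQL 5.6+).
SELECT COUNT(*) FROM cl_inbox PARTITION (partition1);

-- Estimated row counts per partition; for InnoDB these are approximations.
SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'cl_inbox';
```

If the sum of `TABLE_ROWS` differs from `COUNT(*)`, that is expected for InnoDB and does not by itself indicate a problem with the partitioning.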

Querying record in a single partition very slow

I have a large partitioned table (over 2 billion records). Each partition contains roughly 500 million records. I have recently moved from physical hardware to AWS, and I used mysqldump to back up and restore the MySQL data. I have also recently created a new partition (p108). Queries against the old partitions (created on the old server) run as normal and return data in seconds. However, queries against the newly created partition (p108) are very slow and take minutes.
show create table results
CREATE TABLE `termusage`
(
`id` BIGINT(20) NOT NULL auto_increment,
`terminal` BIGINT(20) DEFAULT NULL,
`date` DATETIME DEFAULT NULL,
`dest` VARCHAR(255) DEFAULT NULL,
`feattrans` BIGINT(20) DEFAULT NULL,
`cost_type` TINYINT(4) DEFAULT NULL,
`cost` DECIMAL(16, 6) DEFAULT NULL,
`gprsup` BIGINT(20) DEFAULT NULL,
`gprsdown` BIGINT(20) DEFAULT NULL,
`duration` TIME DEFAULT NULL,
`file` BIGINT(20) DEFAULT NULL,
`custcost` DECIMAL(16, 6) DEFAULT '0.000000',
`invoice` BIGINT(20) NOT NULL DEFAULT '99999999',
`carriertrans` BIGINT(20) DEFAULT NULL,
`session_start` DATETIME DEFAULT NULL,
`session_end` DATETIME DEFAULT NULL,
`mt_mo` VARCHAR(4) DEFAULT NULL,
`grps_rounded` BIGINT(20) DEFAULT NULL,
`gprs_rounded` BIGINT(20) DEFAULT NULL,
`country` VARCHAR(25) DEFAULT NULL,
`network` VARCHAR(25) DEFAULT NULL,
`ctn` VARCHAR(20) DEFAULT NULL,
`pricetrans` BIGINT(20) DEFAULT NULL,
PRIMARY KEY (`id`, `invoice`),
KEY `idx_terminal` (`invoice`, `terminal`),
KEY `idx_feattrans` (`invoice`, `feattrans`),
KEY `idx_file` (`invoice`, `file`),
KEY `termusage_carriertrans_idx` (`carriertrans`),
KEY `idx_ctn` (`invoice`, `ctn`),
KEY `idx_pricetrans` (`invoice`, `pricetrans`)
)
engine=innodb
auto_increment=17449438880
DEFAULT charset=latin1
/*!50500 PARTITION BY RANGE COLUMNS(invoice)
(PARTITION p103 VALUES LESS THAN (621574) ENGINE = InnoDB,
PARTITION p104 VALUES LESS THAN (628214) ENGINE = InnoDB,
PARTITION p106 VALUES LESS THAN (634897) ENGINE = InnoDB,
PARTITION p107 VALUES LESS THAN (649249) ENGINE = InnoDB,
PARTITION p108 VALUES LESS THAN (662763) ENGINE = InnoDB,
PARTITION plast VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */
I created the partition p108 using the following query
ALTER TABLE termusage reorganize partition plast
INTO ( partition p108 VALUES less than (662763),
partition plast VALUES less than maxvalue )
I can see the file termusage#p#p108.ibd and it looks "normal"; the data is there, as I can get results from the query.
information_schema.PARTITIONS shows the following for the table, which indicates there is some kind of issue:
Name   Pos  Rows       Avg  Data Length  Method
p103   1    412249206  124  51124371456  RANGE COLUMNS
p104   2    453164890  133  60594061312  RANGE COLUMNS
p106   3    542767414  135  73562849280  RANGE COLUMNS
p107   4    587042147  129  76288098304  RANGE COLUMNS
p108   5    0          0    16384        RANGE COLUMNS
plast  6    0          0    16384        RANGE COLUMNS
How can I fix the partition?
Updated
Explain for good query
# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
1, SIMPLE, t, p107, ref, idx_terminal,idx_feattrans,idx_file,idx_ctn,idx_pricetrans, idx_terminal, 17, const,const, 603, 100.00, Using index condition; Using temporary; Using filesort
Explain for poor query
# id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
1, SIMPLE, t, p108, ALL, idx_terminal,idx_feattrans,idx_file,idx_ctn,idx_pricetrans, , , , 1, 100.00, Using where; Using temporary; Using filesort
For future readers, the issue was resolved by running ALTER TABLE ... ANALYZE PARTITION p108.
The table and index statistics that guide the optimizer to choose the best way to read the table were out of date. It's common to use ANALYZE to make sure these statistics are updated after a significant data load or delete.
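For reference, the fix described above could look roughly like this for the table in the question. The follow-up query against information_schema is an added suggestion for checking the result, not something the original answer included.

```sql
-- Refresh optimizer statistics for the newly created partition only.
ALTER TABLE termusage ANALYZE PARTITION p108;

-- Check whether the row estimate for p108 is no longer 0.
SELECT PARTITION_NAME, TABLE_ROWS, AVG_ROW_LENGTH, DATA_LENGTH
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'termusage';
```

Re-running EXPLAIN on the slow query afterwards should show whether the optimizer now picks an index (type `ref`) instead of a full partition scan (type `ALL`).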

LOAD DATA FROM S3 - Fastest way for big data and indexes

I have a table that consists of holiday deals, so just to give you an idea, each row will contain the following bits of data:
Departure airport
Arrival airport
Start date
Duration
Hotel destination
Resort
Hotel name
Hotel rating
A few tiny integer columns for 1s and 0s.
Price
Date time the row was updated
All of these deals are packaged up from 3 tables: flights, accommodation and transfers. The packaging finds the cheapest deal per variation, such as per departure airport, duration, board basis, etc.
The table I am importing into will consist of around 50 million rows, and the import is extremely slow.
I have removed the indexes, which made a massive difference, but now when I re-add the indexes after all the data is loaded, it takes forever to complete.
I would like to know whether there is a way of bulk loading data quickly, or a quicker way of adding indexes back to the table after the data has been added.
Create Table
```
CREATE TABLE `iv_deals` (
`aid` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Deal Autonumber PK',
`startdate` DATE NULL DEFAULT NULL COMMENT 'Holiday Start Date',
`startdatet` TINYINT(2) NOT NULL DEFAULT '0',
`depairport` CHAR(3) NULL DEFAULT NULL COMMENT 'Departure Airport IATA Code',
`arrairport` CHAR(3) NULL DEFAULT NULL COMMENT 'Arrival Airport IATA Code',
`destination` VARCHAR(30) NULL DEFAULT NULL COMMENT 'Holiday Destination',
`resort` VARCHAR(30) NULL DEFAULT NULL COMMENT 'Holiday Resort',
`hotel` VARCHAR(50) NULL DEFAULT NULL COMMENT 'Holiday Property Name',
`iv_PropertyID` INT(11) UNSIGNED NOT NULL DEFAULT '0' COMMENT 'Holiday Property ID',
`rating` VARCHAR(2) NULL DEFAULT NULL COMMENT 'Holiday Property Star Rating',
`board` VARCHAR(10) NULL DEFAULT NULL COMMENT 'Holiday Meal Option',
`duration` TINYINT(2) UNSIGNED NULL DEFAULT '0' COMMENT 'Holiday Duration',
`2for1` TINYINT(1) UNSIGNED NULL DEFAULT '0' COMMENT 'Is 2nd Week FREE Offer, 0 = False, 1 = True',
`3for2` TINYINT(1) UNSIGNED NULL DEFAULT '0' COMMENT 'Is 3rd Week FREE Offer, 0 = False, 1 = True',
`3and4` TINYINT(1) UNSIGNED NULL DEFAULT '0' COMMENT 'Is 3rd and 4th Week FREE Offer, 0 = False, 1 = True',
`4for3` TINYINT(1) UNSIGNED NULL DEFAULT '0' COMMENT 'Is 4th Week FREE Offer, 0 = False, 1 = True',
`freebb` VARCHAR(2) NULL DEFAULT NULL COMMENT 'Free Week Meal Option',
`adults` TINYINT(1) UNSIGNED NULL DEFAULT '0' COMMENT 'Number of Adults',
`children` TINYINT(1) UNSIGNED NULL DEFAULT '0' COMMENT 'Number of Children',
`infants` TINYINT(1) UNSIGNED NULL DEFAULT '0' COMMENT 'Number of Infants',
`price` SMALLINT(4) UNSIGNED NULL DEFAULT '9999' COMMENT 'Price',
`carrier` VARCHAR(40) NULL DEFAULT NULL COMMENT 'Flight Carrier IATA Code',
`DateUpdated` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`aid`, `startdatet`),
UNIQUE INDEX `Unique` (`startdate`, `depairport`, `arrairport`, `iv_PropertyID`, `board`, `duration`, `adults`, `children`, `startdatet`),
INDEX `ik_Price` (`price`),
INDEX `ik_Destination` (`destination`),
INDEX `ik_Resort` (`resort`),
INDEX `ik_DepAirport` (`depairport`),
INDEX `ik_Startdate` (`startdate`),
INDEX `ik_Board` (`board`),
INDEX `ik_FILTER_ALL` (`price`, `depairport`, `destination`, `resort`, `board`, `startdate`),
INDEX `iv_PropertyID` (`iv_PropertyID`),
INDEX `ik_Duration` (`duration`),
INDEX `rating` (`rating`),
INDEX `adults` (`adults`),
INDEX `DirectFromPrice` (`iv_PropertyID`, `depairport`, `arrairport`, `board`, `duration`, `adults`, `children`, `startdate`),
INDEX `DirectFromPrice_wo_depairport` (`iv_PropertyID`, `arrairport`, `board`, `duration`, `adults`, `children`),
INDEX `DirectFromPrice_w_pid_dep` (`iv_PropertyID`, `depairport`, `adults`, `children`, `price`),
INDEX `DirectFromPrice_w_pid_night` (`iv_PropertyID`, `duration`, `adults`, `children`),
INDEX `DirectFromPrice_Dur_Board` (`iv_PropertyID`, `duration`, `board`, `adults`, `children`),
INDEX `join_index` (`destination`, `startdate`, `duration`)
)
COLLATE='utf8_general_ci'
AUTO_INCREMENT=1258378560
/*!50100 PARTITION BY LIST (startdatet)
(PARTITION part0 VALUES IN (1) ENGINE = InnoDB,
PARTITION part1 VALUES IN (2) ENGINE = InnoDB,
PARTITION part2 VALUES IN (3) ENGINE = InnoDB,
PARTITION part3 VALUES IN (4) ENGINE = InnoDB,
PARTITION part4 VALUES IN (5) ENGINE = InnoDB,
PARTITION part5 VALUES IN (6) ENGINE = InnoDB,
PARTITION part6 VALUES IN (7) ENGINE = InnoDB,
PARTITION part7 VALUES IN (8) ENGINE = InnoDB,
PARTITION part8 VALUES IN (9) ENGINE = InnoDB,
PARTITION part9 VALUES IN (10) ENGINE = InnoDB,
PARTITION part10 VALUES IN (11) ENGINE = InnoDB,
PARTITION part11 VALUES IN (12) ENGINE = InnoDB,
PARTITION part12 VALUES IN (0) ENGINE = InnoDB) */;
```
If there are 50M rows, but AUTO_INCREMENT=1258378560, let me point out another problem that is looming. (It may be related to the slow load.)
`aid` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT
allows only 4 billion; you are already at 1.2 billion. Do a little math to estimate when you will run out of ids. The brute force solution is to change to BIGINT, but let's analyze why the ids are being 'burned'. There are several ways that INSERT/REPLACE/etc can throw away ids. Please describe how the import is working. REPLACE is perhaps the worst -- it burns ids and is effectively DELETE + INSERT. Other techniques are faster.
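The "little math" can be done directly in SQL. This is a sketch using the AUTO_INCREMENT value from the CREATE TABLE above; the ALTER at the end is the brute-force option mentioned in the answer and will rebuild the table, so it is shown only for completeness.

```sql
-- Remaining ids before INT UNSIGNED (max 4294967295) is exhausted,
-- using the AUTO_INCREMENT value shown in the table definition.
SELECT 4294967295 - 1258378560 AS ids_left;   -- 3036588735

-- Rough burn rate: ids consumed per row actually kept in the table.
SELECT MAX(aid) / COUNT(*) AS ids_per_row FROM iv_deals;

-- Brute-force fix: widen the column (rebuilds the table, so expect it to be slow).
ALTER TABLE iv_deals MODIFY aid BIGINT UNSIGNED NOT NULL AUTO_INCREMENT;
```

A high `ids_per_row` value would support the suspicion that the import method is throwing ids away.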
(I will now ramble in many directions...)
The partitioning by month (which I assume you are doing with startdatet) probably does not add any performance. What has your experience been? (I usually argue against using PARTITION except for the few use cases where there are benefits. I see no benefit in your case.)
19 indexes means 19 BTrees that must be updated. The 2 unique ones must be checked before the INSERT is finished; the 17 others can be delayed, but not forever. (The details are discussed under "change buffer".)
How much RAM? What is the setting of innodb_buffer_pool_size? It should be about 70% of RAM. The Change buffer is a portion of that.
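A quick way to check and adjust that setting is sketched below. The 16 GiB of RAM is a made-up figure for illustration, and changing the value at runtime assumes MySQL 5.7.5 or later; on older versions it has to be set in the configuration file and the server restarted.

```sql
-- Current buffer pool size in GiB.
SELECT @@innodb_buffer_pool_size / 1024 / 1024 / 1024 AS buffer_pool_gib;

-- Hypothetical server with 16 GiB of RAM: about 70% is roughly 11 GiB.
-- Online resizing assumes MySQL 5.7.5+.
SET GLOBAL innodb_buffer_pool_size = 11 * 1024 * 1024 * 1024;
```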
I see at least 4 indexes that can be dropped, since other indexes handle their need. In general, if you have INDEX(a, b), you don't also need INDEX(a). (Shrinking from 19 indexes to 15 will help some.)
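The answer does not name the four indexes. Applying the "INDEX(a, b) makes INDEX(a) redundant" rule to the table definition above, the following single-column indexes each appear to be a leftmost prefix of a longer index, so this is one possible reading, to be verified against your own queries before dropping anything:

```sql
-- Each dropped index is the leading column of a longer existing index:
--   ik_Price        -> covered by ik_FILTER_ALL (price, ...)
--   ik_Destination  -> covered by join_index (destination, ...)
--   ik_Startdate    -> covered by Unique (startdate, ...)
--   iv_PropertyID   -> covered by DirectFromPrice (iv_PropertyID, ...)
ALTER TABLE iv_deals
  DROP INDEX ik_Price,
  DROP INDEX ik_Destination,
  DROP INDEX ik_Startdate,
  DROP INDEX iv_PropertyID;
```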
Flags and other things of low cardinality are virtually useless by themselves as indexes. The Optimizer will decide that it is cheaper to scan the table than to bounce between the index's BTree and the data BTree. I'm thinking of INDEX(rating).
Any SELECT that does not have startdatet in the WHERE is likely to be slower than without partitioning. This is because the query must check all 13 partitions. Even with AND startdatet = 4, performance won't be any better than if there had been an index that included startdatet.
Let me discuss any index starting with a column (perhaps price, rating, startdate) that is queried as a "range" (eg, WHERE price BETWEEN ...). The processing cannot use any columns after that column. I suspect ik_FILTER_ALL will scan a big chunk of the index, since it only filtered on price. Rearrange the columns. Based on the name, I am guessing this is a "covering" index. That is, a common query references only those 6 columns? Note: SELECT * ... references more than just those 6, so the index is not "covering". (Show us the query; I can discuss it more.)
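To illustrate the column-order point, here is a sketch with a made-up query, since the real one has not been shown. The filter values and the index name `ik_filter_eq_first` are hypothetical; the idea is only that columns tested with `=` come first and the range column comes last.

```sql
-- Hypothetical query: three equality filters plus one range on price.
SELECT hotel, price, startdate
FROM iv_deals
WHERE depairport = 'MAN'
  AND destination = 'Spain'
  AND board = 'AI'
  AND price BETWEEN 300 AND 600;

-- Equality columns first, range column last, so all four columns can narrow the scan.
-- With price first (as in ik_FILTER_ALL), only the price range would be used.
ALTER TABLE iv_deals
  ADD INDEX ik_filter_eq_first (depairport, destination, board, price);
```

Comparing EXPLAIN output for the query before and after adding the index should show whether the `rows` estimate drops.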
The 5 "DirectFromPrice" indexes are probably each 'perfect' for some query. But they are awfully long (lots of columns). I would guess that 2 shorter lists would come close to handling the 5 cases "well enough". (Remember, decreasing the number of indexes will help in the goal of insert time.)
What version of MySQL/MariaDB are you using?
The main action item at this point: Show us the import. (I will discuss sorting the input after seeing the method being used.)

Indexing: Invalid default value for '[column]'

I am working on a 10-year-old web app (!!!) and am currently running MySQL locally, version 5.7.
This is the table I am currently working on:
CREATE TABLE `processes_history` (
`p_id` bigint(20) UNSIGNED NOT NULL DEFAULT '0',
`exec_id` bigint(20) UNSIGNED NOT NULL DEFAULT '0',
`feature` varchar(100) NOT NULL DEFAULT '',
`macro` tinyint(1) UNSIGNED NOT NULL DEFAULT '0',
`ts` date NOT NULL DEFAULT '0000-00-00',
`seen` int(10) UNSIGNED NOT NULL DEFAULT '1',
`seen_time` bigint(20) UNSIGNED NOT NULL DEFAULT '0',
`focus` int(10) UNSIGNED NOT NULL DEFAULT '0',
`focus_time` bigint(20) UNSIGNED NOT NULL DEFAULT '0',
`mouse` int(10) UNSIGNED NOT NULL DEFAULT '0',
`keyboard` int(10) UNSIGNED NOT NULL DEFAULT '0',
`interactive` int(10) UNSIGNED NOT NULL DEFAULT '0',
`interactive_time` bigint(20) UNSIGNED NOT NULL DEFAULT '0',
`last_seen` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=MyISAM DEFAULT CHARSET=utf8
PARTITION BY RANGE (TO_DAYS(`ts`))
(
PARTITION p0 VALUES LESS THAN (736695) ENGINE=MyISAM,
PARTITION p201701 VALUES LESS THAN (736726) ENGINE=MyISAM,
PARTITION p201702 VALUES LESS THAN (736754) ENGINE=MyISAM,
PARTITION p201703 VALUES LESS THAN (736785) ENGINE=MyISAM,
PARTITION p201704 VALUES LESS THAN (736815) ENGINE=MyISAM,
PARTITION p201705 VALUES LESS THAN (736846) ENGINE=MyISAM,
PARTITION p201706 VALUES LESS THAN (736876) ENGINE=MyISAM,
PARTITION p201707 VALUES LESS THAN (736907) ENGINE=MyISAM,
PARTITION p201708 VALUES LESS THAN (736938) ENGINE=MyISAM,
PARTITION p201709 VALUES LESS THAN (736968) ENGINE=MyISAM,
PARTITION p201710 VALUES LESS THAN (736999) ENGINE=MyISAM,
PARTITION p201711 VALUES LESS THAN (737029) ENGINE=MyISAM,
PARTITION p201712 VALUES LESS THAN (737060) ENGINE=MyISAM,
PARTITION p201801 VALUES LESS THAN (737091) ENGINE=MyISAM,
PARTITION pmax VALUES LESS THAN MAXVALUE ENGINE=MyISAM
);
--
-- Indexes for dumped tables
--
--
-- Indexes for table `processes_history`
--
ALTER TABLE `processes_history`
ADD PRIMARY KEY (`p_id`,`exec_id`,`feature`,`ts`),
ADD KEY `ts` (`ts`),
ADD KEY `exec_ts` (`exec_id`,`ts`),
ADD KEY `last_seen` (`last_seen`);
I keep getting an error when adding an index to p_id, exec_id, ts:
ALTER TABLE `dbname`.`processes_history` ADD INDEX `p_id,exec_id,ts` (`p_id`, `exec_id`, `ts`);
Error
SQL query:
ALTER TABLE dbname.processes_history ADD INDEX p_id,exec_id,ts (p_id, exec_id, ts)
MySQL said:
#1067 - Invalid default value for 'ts'
Following this post: https://dba.stackexchange.com/questions/192186/on-create-index-invalid-default-value
From what I understood, using 0000-00-00 as a default value breaks the 'date' type, and that's why it's not working.
But I just couldn't work out what the solution is for this situation. Should I use the TIMESTAMP type instead?
Is there a way to solve this problem without breaking the structure (for now at least) until I finish the whole web app? Many things depend on that table and I really don't want to do something risky just to index it the way I want.
Changing SQL_mode solved the problem:
I compared MySQL 5.7 and 5.6, and it seems like 5.7 has the default restrictions: ONLY_FULL_GROUP_BY, STRICT_TRANS_TABLES, NO_ZERO_IN_DATE, NO_ZERO_DATE, ERROR_FOR_DIVISION_BY_ZERO, NO_AUTO_CREATE_USER, and NO_ENGINE_SUBSTITUTION set (https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_no_zero_in_date).
And on 5.6: just NO_ENGINE_SUBSTITUTION
(mysql 5.6 manual https://dev.mysql.com/doc/refman/5.6/en/sql-mode.html#sql-mode-setting)
To get my local MySQL (5.7) to behave like the production version (5.6), I just had to set sql_mode to NO_ENGINE_SUBSTITUTION:
SET GLOBAL sql_mode = 'NO_ENGINE_SUBSTITUTION';
SET SESSION sql_mode = 'NO_ENGINE_SUBSTITUTION';
You can review the post on db-stackexchange with more info.
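A narrower variant, if you would rather not change the global setting, is to relax only the zero-date checks for the current session while the index is added. This is a sketch: the mode list is the MySQL 5.7 default quoted above minus NO_ZERO_IN_DATE and NO_ZERO_DATE, and the index name `idx_pid_exec_ts` is a hypothetical replacement for the comma-containing name in the question.

```sql
-- Remember the current session mode so it can be restored afterwards.
SET @old_sql_mode = @@SESSION.sql_mode;

-- 5.7 defaults without NO_ZERO_IN_DATE and NO_ZERO_DATE (session only).
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION';

-- The DEFAULT '0000-00-00' on ts is accepted again while the table is altered.
ALTER TABLE `processes_history`
  ADD INDEX `idx_pid_exec_ts` (`p_id`, `exec_id`, `ts`);

-- Put the original mode back.
SET SESSION sql_mode = @old_sql_mode;
```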
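You need to change to InnoDB -- the next release (8.0) no longer supports partitioned MyISAM tables, because partitioning there requires a storage engine with native partitioning support, such as InnoDB.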
Somewhere around 5.6, the handling of default for TIMESTAMP changed. Think about what values you have and what values you need.
PARTITIONing rarely improves performance; what query drove you to use it?
Temporarily making the column in question nullable also solves this issue.

Convert to a Partitioned Table

I have the following table structure with live data in it:
CREATE TABLE IF NOT EXISTS `userstatistics` (
`user_id` int(10) unsigned NOT NULL,
`number_logons` int(7) unsigned NOT NULL DEFAULT '0',
`number_profileminiviews` int(7) unsigned NOT NULL DEFAULT '0',
`number_profilefullviews` int(7) unsigned NOT NULL DEFAULT '0',
`number_mailsreceived` int(7) unsigned NOT NULL DEFAULT '0',
`number_interestreceived` int(7) unsigned NOT NULL DEFAULT '0',
`number_favouratesreceived` int(7) unsigned NOT NULL DEFAULT '0',
`number_friendshiprequestreceived` int(7) unsigned NOT NULL DEFAULT '0',
`number_imchatrequestreceived` int(7) unsigned NOT NULL DEFAULT '0',
`yearweek` int(6) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`user_id`,`yearweek`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I want to convert this to a partitioned table with the following structure:
CREATE TABLE IF NOT EXISTS `userstatistics` (
`user_id` int(10) unsigned NOT NULL,
`number_logons` int(7) unsigned NOT NULL DEFAULT '0',
`number_profileminiviews` int(7) unsigned NOT NULL DEFAULT '0',
`number_profilefullviews` int(7) unsigned NOT NULL DEFAULT '0',
`number_mailsreceived` int(7) unsigned NOT NULL DEFAULT '0',
`number_interestreceived` int(7) unsigned NOT NULL DEFAULT '0',
`number_favouratesreceived` int(7) unsigned NOT NULL DEFAULT '0',
`number_friendshiprequestreceived` int(7) unsigned NOT NULL DEFAULT '0',
`number_imchatrequestreceived` int(7) unsigned NOT NULL DEFAULT '0',
`yearweek` int(6) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`user_id`,`yearweek`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (yearweek)
(PARTITION userstats_201108 VALUES LESS THAN (201108) ENGINE = InnoDB,
PARTITION userstats_201109 VALUES LESS THAN (201109) ENGINE = InnoDB,
PARTITION userstats_201110 VALUES LESS THAN (201110) ENGINE = InnoDB,
PARTITION userstats_201111 VALUES LESS THAN (201111) ENGINE = InnoDB,
PARTITION userstats_201112 VALUES LESS THAN (201112) ENGINE = InnoDB,
PARTITION userstats_201113 VALUES LESS THAN (201113) ENGINE = InnoDB,
PARTITION userstats_201114 VALUES LESS THAN (201114) ENGINE = InnoDB,
PARTITION userstats_201115 VALUES LESS THAN (201115) ENGINE = InnoDB,
PARTITION userstats_201116 VALUES LESS THAN (201116) ENGINE = InnoDB,
PARTITION userstats_201117 VALUES LESS THAN (201117) ENGINE = InnoDB,
PARTITION userstats_201118 VALUES LESS THAN (201118) ENGINE = InnoDB,
PARTITION userstats_201119 VALUES LESS THAN (201119) ENGINE = InnoDB,
PARTITION userstats_201120 VALUES LESS THAN (201120) ENGINE = InnoDB,
PARTITION userstats_201121 VALUES LESS THAN (201121) ENGINE = InnoDB,
PARTITION userstats_max VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;
How can I do this conversion?
Simply changing the first line of the second SQL statement to
ALTER TABLE 'userstatistics' (
Would this do it?
Going from MySQL 5.0 to 5.1.
First, you need to be running MySQL 5.1 or later. MySQL 5.0 does not support partitioning.
Second, please be aware of the difference between single-quotes (which delimit strings and dates) and back-ticks (which delimit table and column identifiers in MySQL). Use the correct type where appropriate. I mention this, because your example uses the wrong type of quotes:
ALTER TABLE 'userstatistics' (
That should be:
ALTER TABLE `userstatistics` (
Finally, yes, you can restructure a table into partitions with ALTER TABLE. Here's an exact copy & paste from a statement I tested on MySQL 5.1.57:
ALTER TABLE userstatistics PARTITION BY RANGE (yearweek) (
PARTITION userstats_201108 VALUES LESS THAN (201108) ENGINE = InnoDB,
PARTITION userstats_201109 VALUES LESS THAN (201109) ENGINE = InnoDB,
PARTITION userstats_201110 VALUES LESS THAN (201110) ENGINE = InnoDB,
PARTITION userstats_201111 VALUES LESS THAN (201111) ENGINE = InnoDB,
PARTITION userstats_201112 VALUES LESS THAN (201112) ENGINE = InnoDB,
PARTITION userstats_201113 VALUES LESS THAN (201113) ENGINE = InnoDB,
PARTITION userstats_201114 VALUES LESS THAN (201114) ENGINE = InnoDB,
PARTITION userstats_201115 VALUES LESS THAN (201115) ENGINE = InnoDB,
PARTITION userstats_201116 VALUES LESS THAN (201116) ENGINE = InnoDB,
PARTITION userstats_201117 VALUES LESS THAN (201117) ENGINE = InnoDB,
PARTITION userstats_201118 VALUES LESS THAN (201118) ENGINE = InnoDB,
PARTITION userstats_201119 VALUES LESS THAN (201119) ENGINE = InnoDB,
PARTITION userstats_201120 VALUES LESS THAN (201120) ENGINE = InnoDB,
PARTITION userstats_201121 VALUES LESS THAN (201121) ENGINE = InnoDB,
PARTITION userstats_max VALUES LESS THAN MAXVALUE ENGINE = InnoDB);
Note that this causes a table restructure, so if you already have a lot of data in this table, it will take a while to run. Exactly how long depends on how much data you have, and your hardware speed, and other factors. Be aware that while the table is being restructured, it is locked and unavailable for reading and writing by other queries.
See the ALTER TABLE documentation at http://dev.mysql.com/doc/refman/5.1/en/alter-table.html.
In particular, ALTER TABLE ... ADD/DROP/COALESCE/REORGANIZE PARTITION provides almost all the functions you need to manage your partitions.
Note that HASH partitioning can only be used with integer expressions.
ALTER TABLE ... ADD PARTITION creates no temporary table except when used with NDB tables. ADD or DROP operations for RANGE or LIST partitions are immediate operations or nearly so. ADD or COALESCE operations for HASH or KEY partitions copy data between changed partitions; unless LINEAR HASH or LINEAR KEY was used, this is much the same as creating a new table (although the operation is done partition by partition). REORGANIZE operations copy only changed partitions and do not touch unchanged ones.
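For the `userstatistics` table above, routine maintenance with these statements might look like the sketch below. Because the table has a MAXVALUE partition, a new week cannot be appended with ADD PARTITION; splitting `userstats_max` with REORGANIZE is used instead. The partition name `userstats_201122` is a hypothetical next week.

```sql
-- Split the catch-all partition to make room for a new week.
-- Only rows currently in userstats_max are copied.
ALTER TABLE userstatistics
  REORGANIZE PARTITION userstats_max INTO (
    PARTITION userstats_201122 VALUES LESS THAN (201122),
    PARTITION userstats_max    VALUES LESS THAN MAXVALUE
  );

-- Discard the oldest week; for RANGE partitions this is nearly immediate.
-- Note that DROP PARTITION deletes the rows stored in that partition.
ALTER TABLE userstatistics DROP PARTITION userstats_201108;
```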