This is my table:
CREATE TABLE `fa_nls_og` (
`Incr_Dollar_YAG_pct_Chg` double(50,4) DEFAULT NULL,
`Incr_U_YAG_pct_Chg` double(50,4) DEFAULT NULL,
`Incr_U_YAG_Chg` double(50,4) DEFAULT NULL,
`Incr_EQ_YAG_pct_Chg` double(50,4) DEFAULT NULL,
`Incr_EQ_YAG_Chg` double(50,4) DEFAULT NULL,
`Baseline_EQ_YAG_pct_Chg` double(50,4) DEFAULT NULL,
`Baseline_EQ_YAG_Chg` double(50,4) DEFAULT NULL,
`Baseline_Units_YAG_pct_Chg` double(50,4) DEFAULT NULL,
`Baseline_Units_YAG_Chg` double(50,4) DEFAULT NULL,
`Units_YAG_Period` double(50,4) DEFAULT NULL,
`Units_YAG_pct_Chg` double(50,4) DEFAULT NULL,
`Units_YAG_Chg` double(50,4) DEFAULT NULL,
`PERIOD_YEAR` int(50) DEFAULT NULL,
`CAT_NO` int(10) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (PERIOD_YEAR)
SUBPARTITION BY KEY (CAT_NO) SUBPARTITIONS 12
(PARTITION pytd VALUES LESS THAN (2) ENGINE = InnoDB,
PARTITION p VALUES LESS THAN (200000) ENGINE = InnoDB,
PARTITION p0 VALUES LESS THAN (201401) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (201402) ENGINE = InnoDB,
PARTITION p4 VALUES LESS THAN (201403) ENGINE = InnoDB,
PARTITION p6 VALUES LESS THAN (201404) ENGINE = InnoDB,
PARTITION p8 VALUES LESS THAN (201405) ENGINE = InnoDB,
PARTITION p10 VALUES LESS THAN (201406) ENGINE = InnoDB,
PARTITION p12 VALUES LESS THAN (201407) ENGINE = InnoDB,
PARTITION p14 VALUES LESS THAN (201408) ENGINE = InnoDB,
PARTITION p16 VALUES LESS THAN (201409) ENGINE = InnoDB,
PARTITION p18 VALUES LESS THAN (201410) ENGINE = InnoDB,
PARTITION p20 VALUES LESS THAN (201411) ENGINE = InnoDB,
PARTITION p22 VALUES LESS THAN (201412) ENGINE = InnoDB,
PARTITION p24 VALUES LESS THAN (201501) ENGINE = InnoDB,
PARTITION p26 VALUES LESS THAN (201502) ENGINE = InnoDB,
PARTITION p28 VALUES LESS THAN (201503) ENGINE = InnoDB,
PARTITION p30 VALUES LESS THAN (201504) ENGINE = InnoDB,
PARTITION p32 VALUES LESS THAN (201505) ENGINE = InnoDB,
PARTITION p34 VALUES LESS THAN (201506) ENGINE = InnoDB,
PARTITION p36 VALUES LESS THAN (201507) ENGINE = InnoDB,
PARTITION p38 VALUES LESS THAN (201508) ENGINE = InnoDB,
PARTITION p40 VALUES LESS THAN (201509) ENGINE = InnoDB,
PARTITION p42 VALUES LESS THAN (201510) ENGINE = InnoDB,
PARTITION p44 VALUES LESS THAN (201511) ENGINE = InnoDB,
PARTITION p46 VALUES LESS THAN (201512) ENGINE = InnoDB,
PARTITION p48 VALUES LESS THAN (201601) ENGINE = InnoDB,
PARTITION p50 VALUES LESS THAN (201602) ENGINE = InnoDB,
PARTITION p52 VALUES LESS THAN (201603) ENGINE = InnoDB,
PARTITION p54 VALUES LESS THAN (201604) ENGINE = InnoDB,
PARTITION p56 VALUES LESS THAN (201605) ENGINE = InnoDB,
PARTITION p58 VALUES LESS THAN (201606) ENGINE = InnoDB,
PARTITION p60 VALUES LESS THAN (201607) ENGINE = InnoDB,
PARTITION p62 VALUES LESS THAN (201608) ENGINE = InnoDB,
PARTITION p64 VALUES LESS THAN (201609) ENGINE = InnoDB,
PARTITION p66 VALUES LESS THAN (201610) ENGINE = InnoDB,
PARTITION p68 VALUES LESS THAN (201611) ENGINE = InnoDB,
PARTITION p70 VALUES LESS THAN (201612) ENGINE = InnoDB,
PARTITION p72 VALUES LESS THAN (201701) ENGINE = InnoDB,
PARTITION p74 VALUES LESS THAN (201702) ENGINE = InnoDB,
PARTITION p76 VALUES LESS THAN (201703) ENGINE = InnoDB,
PARTITION p78 VALUES LESS THAN (201704) ENGINE = InnoDB,
PARTITION p80 VALUES LESS THAN (201705) ENGINE = InnoDB,
PARTITION p82 VALUES LESS THAN (201706) ENGINE = InnoDB,
PARTITION p84 VALUES LESS THAN (201707) ENGINE = InnoDB,
PARTITION p86 VALUES LESS THAN (201708) ENGINE = InnoDB,
PARTITION p88 VALUES LESS THAN (201709) ENGINE = InnoDB,
PARTITION p90 VALUES LESS THAN (201710) ENGINE = InnoDB,
PARTITION p92 VALUES LESS THAN (201711) ENGINE = InnoDB,
PARTITION p94 VALUES LESS THAN (201712) ENGINE = InnoDB,
PARTITION p96 VALUES LESS THAN (201801) ENGINE = InnoDB,
PARTITION p98 VALUES LESS THAN (201802) ENGINE = InnoDB,
PARTITION p100 VALUES LESS THAN (201803) ENGINE = InnoDB,
PARTITION p102 VALUES LESS THAN (201804) ENGINE = InnoDB,
PARTITION p104 VALUES LESS THAN (201805) ENGINE = InnoDB,
PARTITION p106 VALUES LESS THAN (201806) ENGINE = InnoDB,
PARTITION p108 VALUES LESS THAN (201807) ENGINE = InnoDB,
PARTITION p110 VALUES LESS THAN (201808) ENGINE = InnoDB,
PARTITION p112 VALUES LESS THAN (201809) ENGINE = InnoDB,
PARTITION p114 VALUES LESS THAN (201810) ENGINE = InnoDB,
PARTITION p116 VALUES LESS THAN (201811) ENGINE = InnoDB,
PARTITION p118 VALUES LESS THAN (201812) ENGINE = InnoDB,
PARTITION p144 VALUES LESS THAN (201901) ENGINE = InnoDB,
PARTITION p146 VALUES LESS THAN (201902) ENGINE = InnoDB,
PARTITION p148 VALUES LESS THAN (201903) ENGINE = InnoDB,
PARTITION p150 VALUES LESS THAN (201904) ENGINE = InnoDB,
PARTITION p152 VALUES LESS THAN (201905) ENGINE = InnoDB,
PARTITION p154 VALUES LESS THAN (201906) ENGINE = InnoDB,
PARTITION p156 VALUES LESS THAN (201907) ENGINE = InnoDB,
PARTITION p158 VALUES LESS THAN (201908) ENGINE = InnoDB,
PARTITION p160 VALUES LESS THAN (201909) ENGINE = InnoDB,
PARTITION p162 VALUES LESS THAN (201910) ENGINE = InnoDB,
PARTITION p164 VALUES LESS THAN (201911) ENGINE = InnoDB,
PARTITION p166 VALUES LESS THAN (201912) ENGINE = InnoDB,
PARTITION p120 VALUES LESS THAN (202001) ENGINE = InnoDB,
PARTITION p122 VALUES LESS THAN (202002) ENGINE = InnoDB,
PARTITION p124 VALUES LESS THAN (202003) ENGINE = InnoDB,
PARTITION p126 VALUES LESS THAN (202004) ENGINE = InnoDB,
PARTITION p128 VALUES LESS THAN (202005) ENGINE = InnoDB,
PARTITION p130 VALUES LESS THAN (202006) ENGINE = InnoDB,
PARTITION p132 VALUES LESS THAN (202007) ENGINE = InnoDB,
PARTITION p134 VALUES LESS THAN (202008) ENGINE = InnoDB,
PARTITION p136 VALUES LESS THAN (202009) ENGINE = InnoDB,
PARTITION p138 VALUES LESS THAN (202010) ENGINE = InnoDB,
PARTITION p140 VALUES LESS THAN (202011) ENGINE = InnoDB,
PARTITION p142 VALUES LESS THAN (202012) ENGINE = InnoDB
) */
I update one of the columns with a WHERE clause on PERIOD_YEAR and CAT_NO.
The first update takes 5 seconds; running the update a second time takes 30 minutes. Can anyone help?
Too many partitions. After about 50, performance degrades. (You have 86 partitions times 12 subpartitions each, which is over 1000!)
Don't pre-build partitions; it slows down operations.
SUBPARTITIONs have no performance benefit.
Don't use DOUBLE(m,n); it leads to extra rounding. Either use plain DOUBLE (with 16 significant digits) or DECIMAL(m,n) with reasonable values for m and n. DOUBLE (with or without (m,n)) takes 8 bytes; DECIMAL(50,4) takes about 25 bytes!
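int(50) -- The (50) means nothing. INT always takes 4 bytes. If the column held only a year, the YEAR datatype (1 byte) would do; since the partition bounds suggest YYYYMM values such as 201401, MEDIUMINT UNSIGNED (3 bytes) is a better fit.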
Have a PRIMARY KEY.
If that is your main query, have INDEX(PERIOD_YEAR, CAT_NO).
After all of that, get rid of all partitioning -- it is not providing anything useful (based on what you have said so far). The INDEX will give you the speed you are missing. My other tips help in various other ways, some of them helping directly or indirectly (eg, small = faster) with speed.
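Putting the tips above together, the table might look roughly like this. It is a minimal sketch, not a drop-in replacement: the `id` surrogate key and the index name are hypothetical (the original table shows no obvious unique column), only three of the measure columns are written out, and MEDIUMINT UNSIGNED assumes PERIOD_YEAR holds YYYYMM values like the partition bounds.

```sql
-- Sketch only: `id` and `idx_period_cat` are hypothetical names.
CREATE TABLE fa_nls_og (
  id                 BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  Units_YAG_Period   DOUBLE DEFAULT NULL,
  Units_YAG_pct_Chg  DOUBLE DEFAULT NULL,
  Units_YAG_Chg      DOUBLE DEFAULT NULL,
  -- ... remaining measure columns, also as plain DOUBLE ...
  PERIOD_YEAR        MEDIUMINT UNSIGNED DEFAULT NULL,  -- assumes YYYYMM values
  CAT_NO             INT DEFAULT NULL,
  PRIMARY KEY (id),
  INDEX idx_period_cat (PERIOD_YEAR, CAT_NO)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- The update from the question can then locate its rows through the index.
-- The literal values here are made up for illustration.
UPDATE fa_nls_og
SET Units_YAG_Chg = 0
WHERE PERIOD_YEAR = 201401
  AND CAT_NO = 7;
```

With no partitions, the composite index narrows the UPDATE to the matching rows.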
I have a table in my database with 10+M rows.
I have never worked with so many records, and I realize I need a little help with indexing / partitioning the table.
The table looks like this:
CREATE TABLE `stock` (
`ID` bigint(20) NOT NULL AUTO_INCREMENT,
`data` date NOT NULL,
`cod_pdv` varchar(200) DEFAULT NULL,
`cod_art` varchar(200) DEFAULT NULL,
`xstock` int(11) DEFAULT NULL,
`sellout` int(11) DEFAULT NULL,
`backorder` int(11) DEFAULT NULL,
`id_insegna` int(11) DEFAULT NULL,
PRIMARY KEY (`ID`,`data`),
KEY `index_stock` (`cod_art`,`cod_pdv`,`data`),
KEY `index_data` (`data`),
KEY `index_trac_stock` (`cod_art`,`id_insegna`)
) ENGINE=InnoDB AUTO_INCREMENT=10120378 DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (YEAR(data))
SUBPARTITION BY HASH (MONTH(data))
(PARTITION part0 VALUES LESS THAN (2015)
(SUBPARTITION subpart0 ENGINE = InnoDB,
SUBPARTITION subpart1 ENGINE = InnoDB,
SUBPARTITION subpart2 ENGINE = InnoDB,
SUBPARTITION subpart3 ENGINE = InnoDB,
SUBPARTITION subpart4 ENGINE = InnoDB,
SUBPARTITION subpart5 ENGINE = InnoDB,
SUBPARTITION subpart6 ENGINE = InnoDB,
SUBPARTITION subpart7 ENGINE = InnoDB,
SUBPARTITION subpart8 ENGINE = InnoDB,
SUBPARTITION subpart9 ENGINE = InnoDB,
SUBPARTITION subpart10 ENGINE = InnoDB,
SUBPARTITION subpart11 ENGINE = InnoDB,
SUBPARTITION subpart12 ENGINE = InnoDB),
PARTITION part1 VALUES LESS THAN (2016)
(SUBPARTITION subpart13 ENGINE = InnoDB,
SUBPARTITION subpart14 ENGINE = InnoDB,
SUBPARTITION subpart15 ENGINE = InnoDB,
SUBPARTITION subpart16 ENGINE = InnoDB,
SUBPARTITION subpart17 ENGINE = InnoDB,
SUBPARTITION subpart18 ENGINE = InnoDB,
SUBPARTITION subpart19 ENGINE = InnoDB,
SUBPARTITION subpart20 ENGINE = InnoDB,
SUBPARTITION subpart21 ENGINE = InnoDB,
SUBPARTITION subpart22 ENGINE = InnoDB,
SUBPARTITION subpart23 ENGINE = InnoDB,
SUBPARTITION subpart24 ENGINE = InnoDB,
SUBPARTITION subpart25 ENGINE = InnoDB),
PARTITION part2 VALUES LESS THAN (2017)
(SUBPARTITION subpart26 ENGINE = InnoDB,
SUBPARTITION subpart27 ENGINE = InnoDB,
SUBPARTITION subpart28 ENGINE = InnoDB,
SUBPARTITION subpart29 ENGINE = InnoDB,
SUBPARTITION subpart30 ENGINE = InnoDB,
SUBPARTITION subpart31 ENGINE = InnoDB,
SUBPARTITION subpart32 ENGINE = InnoDB,
SUBPARTITION subpart33 ENGINE = InnoDB,
SUBPARTITION subpart34 ENGINE = InnoDB,
SUBPARTITION subpart35 ENGINE = InnoDB,
SUBPARTITION subpart36 ENGINE = InnoDB,
SUBPARTITION subpart37 ENGINE = InnoDB,
SUBPARTITION subpart38 ENGINE = InnoDB)) */;
Most of the queries filter on data, cod_art, cod_pdv and id_insegna.
An example query:
explain
SELECT s.data, s.cod_art, s.giacenza, s.sellout, s.backorder
FROM stock AS s
WHERE s.cod_art IN ("103666","103672","20509","39730","5000016",
"7004009","7004010","7004055","7004064","7004065","7004105",
"7004133","7004161","7004163","7004178","7004213","7005932",
"7023139","7023142","7031974","7049009","7074201","7074204",
"7082052","7082058","7082062","7082067","7082072","7082077",
"7084113","7084127","7088599","7091092","7091094","7094124",
"7095505","7103663","7103678","7103681","7103684","7103687",
"7103690","7103691","7103748","7103766","7103814","7103832",
"7103834","7103835","7103840","7103860","7103902","7103903",
"7103905","7103906","7103907","7104915","7104916","7104936",
"7104957","7105357","7106936","7106937","7106938","7106943",
"7106945","7106946","7106950","7108714","7108716","7108719",
"7108770","7108771","7108778","7108779","7113920","7113921",
"7113925","7113936","7114837","7115099","7115711","7115712",
"7115713","7115714","7115715","7115716","7115717","7115719",
"7115720","7115722","7118620","7118660","7118663","7118664",
"7118665","7118666","7118667","7121650","7121826","7122100",
"7122101","7122102","7122104","7122105","7122106","7122108",
"7122112","7122113","7122115","7122119","7122120","7122123",
"7122124","7122125","7122130","7122141","7122154","7122157",
"7122158","7122159","7122162","7122224","7122238","7122239",
"7122242","7122245","7122246","7122249","7122251","7122252",
"7122256","7122257","7122262","7122266","7122272","7122273", "7122274","7122275","7122276","7122282","7122295","7122296",
"7122297","7122298","7122304","7122308","7122309","7122310", "7122311","7122312","7122314","7122315","7122318","7122319",
"7122326","7122327","7122370","7122371","7122372","7122374", "7122375","7122376","7122377","7122381","7122382","7122386",
"7122388","7122422","7122423","7122425","7122426","7122432", "7122434","7122435","7122436","7122459","7122460","7122469",
"7122470","7122474","7122475","7122479","7122480","7122483","7122484","7122486","7122489","7122496","7122498","7122504","7122505","7122680","7122682","7123119","7123141","7123151","7123152","7123368","7123900","7123945","7123949","7123950","7124025","7134016","7170052","7170055","7170058","7170062","7170067","7170072","7170077","7275297","7275298","7275299","7287682","7292630","7292631","7292644","7292645","7292659","7411238","7411627","7411628","7411629","7411630","7411631","7411632","7411633","7411678","7411679","7411933","7411944","7411955","7411961","7411995","7411996","7411997","7411998","7411999","7412007","7412008","7412009","7414118","7420171","7430455","7430722","7430724","7430728","7442570","7443160","7443178","7443179","7444126","7444220","7444225","7444733","7446020","7446060","7446080","7448499","7449594","7449645","7456561","7456607","7474229","7478302","7480817","7480834","7480836","7480865","7480868","7480869","7481360","7483186","7483199","7484430","7484431","7495863","7496040","7540619","7544450","7544452","7544459","7544460","7544490","7544491","7544842","7544843","7544854","7544855","7544856","7544930","7544931","7544934","7544935","7556801","7620180","7629100","7630592","7630598","7634033","7634035","7634694","7639626","7639628","7639629","7639658","7639661","7639664","7639669","7639670","7639694","7639709","7639894","7639895","7641894","7641895","7641896","7641898","7641933","7642087","7642089","7642098","7642099","7642141","7644215","7644219","7645021","7645022","7645023","7645027","7645040","7645047","7645058","7645059","7645073","7645074","7645076","7646558","7646741","7646742","7646743","7646745","7646746","7646764","7648910","7648912","7648913","7648925","7649244","7653052","7653056","7653325","7653750","7654141","7654500","7654505","7654507","7654514","7654518","7654542","7654554","7654556","7654560","7654565","7654566","7654567","7654568","7654569","7654587","7654588","7654734","7654736","7654745","7654750","7654770","7654777","7654779",
"7678400","7678419","7678420","7678421","7678426","7678470","7742625","7743642","7745754","7745762","7746585","7762401","7762409","7762417","7762423","7778939","7786921","7786999","7788416","7788473","7788512","7793723","7793731","7793779","7793780","7793783","H17103829","7108269","7108299","7135533","7135534","7135535","7135502","7135504","7135503","7135505","7135511","7135514","7135512","7135515","7135510","7135513","7123600","7123601","7123602","7123791","7123792","7123793","7123794","7123795","7123797","7123798","7123799","7654737","7250055","7286503","7250172","7250176","7250183","7250184","7250188","7208206","7286520","7451195","7017336","7017335","7495647","7495645","7495646","7451400","7451300","7451302","7451289","7451288","7451290","7451297","7451180","7451184","7444746","7444744","7444745","7451179","7451173","7451339","7101223","7101096","7101226","7101222","7101097","7101095","7101197","7101203","7101207","7101212","7113946","7113972"
)
AND s.id_insegna = '3'
AND s.data >= DATE_SUB(CURDATE(), INTERVAL 26 WEEK)
EXPLAIN:
id  select_type  table  type  possible_keys                            key   key_len  ref   rows     Extra
1   SIMPLE       s      ALL   index_stock,index_data,index_trac_stock  NULL  NULL     NULL  4917092  Using where
However, as you can see, it doesn't use any index. That could be because I don't have a combined index on data, cod_art and id_insegna, but I've read that adding many indexes may not help.
Have I made a mistake in creating the table, for example by including data in the primary key? I'm a bit lost and could really use some help.
Thanks in advance to everyone.
Plan A -- No PARTITION, just good INDEX
SUBPARTITION BY HASH is useless for performance.
In fact, no form of PARTITIONing will speed up the SELECT you proposed. And most forms will slow down the query.
The optimal index for that SELECT should:
Start with any = constant parts of the WHERE (id_insegna).
Continue with one 'range' field (cod_art or data).
So, I would first recommend either of these. Or both, and let the optimizer pick between them:
INDEX(id_insegna, cod_art)
INDEX(id_insegna, data)
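As a sketch of how Plan A could be tried, the statements below add both candidate indexes and re-run EXPLAIN on a shortened version of the query. The column names are the ones in the table definition (id_insegna, cod_art); the index names are made up, the IN list is cut to three values for readability, and giacenza is left out because it does not appear in the posted table definition.

```sql
-- Hypothetical index names; both indexes start with the '=' column.
ALTER TABLE stock
  ADD INDEX idx_insegna_art  (id_insegna, cod_art),
  ADD INDEX idx_insegna_data (id_insegna, data);

-- Check which index the optimizer picks (IN list shortened for the example).
EXPLAIN
SELECT s.data, s.cod_art, s.sellout, s.backorder
FROM stock AS s
WHERE s.cod_art IN ('103666', '103672', '20509')
  AND s.id_insegna = 3
  AND s.data >= DATE_SUB(CURDATE(), INTERVAL 26 WEEK);
```

If the `key` column in the EXPLAIN output still shows NULL, the optimizer has judged a full scan cheaper for that particular data distribution.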
Plan B -- better PARTITION, plus good INDEX
Now, I will partially retract a previous statement and say that, since you have two ranges in the WHERE, we can try to take advantage of PARTITION BY RANGE to help with one of them.
So, I think this would be best:
INDEX(id_insegna, cod_art)
PARTITION BY RANGE (TO_DAYS(data))
( PARTITION start VALUES LESS THAN (0),
PARTITION p201501 VALUES LESS THAN (TO_DAYS('2015-02-01')),
PARTITION p201502 VALUES LESS THAN (TO_DAYS('2015-03-01')),
PARTITION p201503 VALUES LESS THAN (TO_DAYS('2015-04-01')),
PARTITION p201504 VALUES LESS THAN (TO_DAYS('2015-05-01')),
PARTITION p201505 VALUES LESS THAN (TO_DAYS('2015-06-01')),
PARTITION p201506 VALUES LESS THAN (TO_DAYS('2015-07-01')),
PARTITION future VALUES LESS THAN MAXVALUE
)
With that, and the 26 WEEK range, it will hit about 7 partitions. Today that is all the partitions, but in the future, it will continue to be only about 7.
See my partition blog for more details.
Do not have lots of empty partitions for the future; it is a performance hit.
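Instead of pre-building future partitions, the catch-all partition can be split when a new month is about to start. The following is a sketch that assumes the table is already partitioned as in Plan B, with the partition names used there; p201507 is the hypothetical next month.

```sql
-- Split the catch-all partition to add the next month just before it is needed.
ALTER TABLE stock
  REORGANIZE PARTITION future INTO (
    PARTITION p201507 VALUES LESS THAN (TO_DAYS('2015-08-01')),
    PARTITION future  VALUES LESS THAN MAXVALUE
  );

-- Drop the oldest month once it is no longer needed (this deletes its rows).
ALTER TABLE stock DROP PARTITION p201501;
```

Because `future` should be empty or nearly empty when it is split, the REORGANIZE has little data to move.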
Other comments
I do not know the distribution of cod_art, nor whether the IN is typical, so I hesitate to even mention the option of PARTITION BY RANGE(cod_art) and use the other INDEX. Oops, it is not possible, since cod_art is a VARCHAR and BY RANGE does not work with that datatype.
Caveat: If you have other SELECTs, they need to be considered simultaneously with this one -- Optimization, especially when PARTITIONs are involved, cannot be done one query at a time.
Why use VARCHAR for what looks like numbers?
For further discussion, please include EXPLAIN PARTITIONS SELECT ...\G.
I have a very large table on a mysql 5.6.10 instance (roughly 480 million rows).
The storage engine is InnoDB. (Table and DB Default).
The table was partitioned by hash of merchantId (bigint: a kind of client identifier) which helped when queries related to a single merchant. Due to significant performance degradation when queries spanned multiple merchants, I decided to repartition the table by Range on ACTION_DATE (the DATE that an activity occurred). Thinking I was being clever, I decided to add a few (5) new fields for future use (unused_varchar1 varchar(200), etc.), since the table is so large, adding new fields essentially requires a rebuild anyway, so why not...
I created the new table structure as _new and dumped the existing table to a file on a secondary server using mysqldump. I then used an awk script to finesse the name and a few other details to fit the new table (change tableName to tableName_new), and started the load.
The existing table was approximately 430 GB. The text file similarly was about 403 GB. I was surprised therefore that the new table ended up taking about 840 GB!! (Based on the Linux file size of the .ibd files)
So, I have 2 basic questions, which really amount to why and what now...
I imagine that the new table is larger because the dump file was in the order of the previous partition (merchantId) while the load was inserting into the new partitioning (Activity date) creating a semi-random insertion order. The randomness led mysql to leave plenty of space (roughly 50%) in the pages for future insertions. (I'm a little fuzzy on the terminology here, having spent much more time in my career with Sql Server DBs than MySql Dbs...) I'm not able to find any internal statistics in mysql for space free per page. The INFORMATION_SCHEMA.TABLES DATA_FREE stat is an unconvincing 68MB.
If it helps these are the relevant stats from I_S.TABLES:
TABLE_TYPE: BASE TABLE
Engine: InnoDB
VERSION: 10
ROW_FORMAT: Compact
TABLE_ROWS: 488,094,271
AVG_ROW_LENGTH: 1,564
DATA_LENGTH: 763,509,358,592 (711 GB)
INDEX_LENGTH: 100,065,574,912 (93.19 GB)
DATA_FREE: 68,157,440 (0.06 GB)
I realize that that doesn't add up to 840 GB, but as I said, that was the size of the .ibd files which seems to be slightly different than the I_S.TABLES stats. Either way, it is significantly more than the text dump file.
I digress...
My question is whether the repartitioning explains the roughly doubled size, or whether there is another explanation. I think the extra columns (2 Bigint, 2 Varchar(200), 1 Date) are not the culprit since they are all null. My napkin calculation was that the additional columns would add < 9 GB. Likewise, one additional index on UID should be a relatively small addition.
The follow up question is what can I do now if I want to try to compact the table. (Server now only has about 385 GB free...)
If I repeated the procedure, dump to file, reload, this time in the current partition order, would I end up with a table more like the size of my original table ~430 GB?
Following are relevant parts of DDL.
OLD TABLE:
CREATE TABLE table_name (
`AUTO_SEQ` bigint(20) NOT NULL,
`MERCHANT_ID` bigint(20) NOT NULL,
`AFFILIATE_ID` bigint(20) DEFAULT NULL,
`PROGRAM_ID` bigint(20) NOT NULL,
`ACTION_DATE` date DEFAULT NULL,
`UID` varchar(128) DEFAULT NULL,
... additional columns ...
PRIMARY KEY (`AUTO_SEQ`,`MERCHANT_ID`,`PROGRAM_ID`),
KEY `oc_rpt_mpad_idx` (`MERCHANT_ID`,`PROGRAM_ID`,`ACTION_DATE`,`AFFILIATE_ID`),
KEY `oc_rpt_mapd` (`MERCHANT_ID`,`ACTION_DATE`),
KEY `oc_rpt_apda_idx` (`AFFILIATE_ID`,`PROGRAM_ID`,`ACTION_DATE`,`MERCHANT_ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY HASH (merchant_id)
PARTITIONS 16 */
NEW TABLE:
CREATE TABLE `tableName_new` (
`AUTO_SEQ` bigint(20) NOT NULL,
`MERCHANT_ID` bigint(20) NOT NULL,
`AFFILIATE_ID` bigint(20) DEFAULT NULL,
`PROGRAM_ID` bigint(20) NOT NULL,
`ACTION_DATE` date NOT NULL DEFAULT '0000-00-00',
`UID` varchar(128) DEFAULT NULL,
... additional columns...
# NEW COLUMNS (ALL NULL)
`UNUSED_BIGINT1` bigint(20) DEFAULT NULL,
`UNUSED_BIGINT2` bigint(20) DEFAULT NULL,
`UNUSED_VARCHAR1` varchar(200) DEFAULT NULL,
`UNUSED_VARCHAR2` varchar(200) DEFAULT NULL,
`UNUSED_DATE1` date DEFAULT NULL,
PRIMARY KEY (`AUTO_SEQ`,`ACTION_DATE`),
KEY `oc_rpt_mpad_idx` (`MERCHANT_ID`,`PROGRAM_ID`,`ACTION_DATE`,`AFFILIATE_ID`),
KEY `oc_rpt_mapd` (`ACTION_DATE`),
KEY `oc_rpt_apda_idx` (`AFFILIATE_ID`,`PROGRAM_ID`,`ACTION_DATE`,`MERCHANT_ID`),
KEY `oc_uid` (`UID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50500 PARTITION BY RANGE COLUMNS(ACTION_DATE)
(PARTITION p01 VALUES LESS THAN ('2012-01-01') ENGINE = InnoDB,
PARTITION p02 VALUES LESS THAN ('2012-04-01') ENGINE = InnoDB,
PARTITION p03 VALUES LESS THAN ('2012-07-01') ENGINE = InnoDB,
PARTITION p04 VALUES LESS THAN ('2012-10-01') ENGINE = InnoDB,
PARTITION p05 VALUES LESS THAN ('2013-01-01') ENGINE = InnoDB,
PARTITION p06 VALUES LESS THAN ('2013-04-01') ENGINE = InnoDB,
PARTITION p07 VALUES LESS THAN ('2013-07-01') ENGINE = InnoDB,
PARTITION p08 VALUES LESS THAN ('2013-10-01') ENGINE = InnoDB,
PARTITION p09 VALUES LESS THAN ('2014-01-01') ENGINE = InnoDB,
PARTITION p10 VALUES LESS THAN ('2014-04-01') ENGINE = InnoDB,
PARTITION p11 VALUES LESS THAN ('2014-07-01') ENGINE = InnoDB,
PARTITION p12 VALUES LESS THAN ('2014-10-01') ENGINE = InnoDB,
PARTITION p13 VALUES LESS THAN ('2015-01-01') ENGINE = InnoDB,
PARTITION p14 VALUES LESS THAN ('2015-04-01') ENGINE = InnoDB,
PARTITION p15 VALUES LESS THAN ('2015-07-01') ENGINE = InnoDB,
PARTITION p16 VALUES LESS THAN ('2015-10-01') ENGINE = InnoDB,
PARTITION p17 VALUES LESS THAN ('2016-01-01') ENGINE = InnoDB,
PARTITION p18 VALUES LESS THAN ('2016-04-01') ENGINE = InnoDB,
PARTITION p19 VALUES LESS THAN ('2016-07-01') ENGINE = InnoDB,
PARTITION p20 VALUES LESS THAN ('2016-10-01') ENGINE = InnoDB,
PARTITION p21 VALUES LESS THAN ('2017-01-01') ENGINE = InnoDB,
PARTITION p22 VALUES LESS THAN ('2017-04-01') ENGINE = InnoDB,
PARTITION p23 VALUES LESS THAN ('2017-07-01') ENGINE = InnoDB,
PARTITION p24 VALUES LESS THAN ('2017-10-01') ENGINE = InnoDB,
PARTITION p25 VALUES LESS THAN ('2018-01-01') ENGINE = InnoDB,
PARTITION p26 VALUES LESS THAN ('2018-04-01') ENGINE = InnoDB,
PARTITION p27 VALUES LESS THAN ('2018-07-01') ENGINE = InnoDB,
PARTITION p28 VALUES LESS THAN ('2018-10-01') ENGINE = InnoDB,
PARTITION p29 VALUES LESS THAN ('2019-01-01') ENGINE = InnoDB,
PARTITION p30 VALUES LESS THAN (MAXVALUE) ENGINE = InnoDB) */
"adding new fields essentially requires a rebuild anyway, so why not"
I predict you will regret it.
"The existing table was approximately 430 GB."
According to size of .ibd? Or SHOW TABLE STATUS? Or the dump size, which would be bogus (see below).
"it is significantly more than the text dump file"
The lengths in TABLE STATUS include several flavors of overhead (BTree, free space, extra extents, etc), plus the indexes (which are not in the dump file).
Also, think about a BIGINT that contains 1234. The .ibd will use 8 bytes plus some overhead; the dump will have 5 ('1234', plus a comma). That leads to my next point...
Are there really more than 4 billion merchants? merchant_id is BIGINT (8 bytes); INT UNSIGNED is only 4 bytes and allows 0..4 billion.
What's in uid? If it is some sort of UUID, it seems awfully long.
Do you happen to have the "stats from I_S.TABLES" from the old table?
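If the old table still exists, its figures can be pulled next to the new table's with a query along these lines. This is a sketch: 'dbname' is a placeholder, and the two table names are taken from the DDL in the question.

```sql
-- Compare the size statistics of the old and new tables side by side.
SELECT TABLE_NAME, ENGINE, ROW_FORMAT, TABLE_ROWS, AVG_ROW_LENGTH,
       DATA_LENGTH, INDEX_LENGTH, DATA_FREE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'dbname'  -- change as needed
  AND TABLE_NAME IN ('table_name', 'tableName_new');
```

For InnoDB, TABLE_ROWS and AVG_ROW_LENGTH are estimates, so small differences between the two tables are not meaningful.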
So far, I have not addressed "whether the repartitioning explains the roughly doubled size".
"extra columns (2 Bigint, 2 Varchar(200), 1 Date)"
That's about 29 bytes per row (15GB of Data_length), perhaps less since they are NULL.
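As a rough check of that estimate, the arithmetic is simply rows times bytes per row. The sketch below uses the TABLE_ROWS figure from the question (an InnoDB estimate, not an exact count) and the 29 bytes quoted above; the column alias is made up.

```sql
-- 488,094,271 rows (TABLE_ROWS estimate) x 29 bytes, expressed in decimal GB.
SELECT 488094271 * 29 / POW(1000, 3) AS worst_case_extra_gb;
```

In the COMPACT row format, a NULL column occupies only a bit in the row's NULL bitmap, so the real overhead of five all-NULL columns should be far below this worst case.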
You seem to be using the default ROW_FORMAT. I suspect this did not change in the conversion.
It is usually unwise to start an index with the "partition key" (merchant_id or action_date). This is because you are already "pruning" on that key; you are better off starting the index with something else. (Caveat: There are exceptions.)
Check the CHARACTER SET and datatype of the "additional columns". If something changed, that could be significant.
"would I end up with a table more like the size of my original table ~430 GB?"
Alas, until we figure out why it grew, I can't answer that question.
"I'm more interested in whether random insertion vs. the partition (ACTION_DATE) would lead to wasted space / half empty pages."
I recommend you try the following experiment. Do not use OPTIMIZE PARTITION; see http://bugs.mysql.com/bug.php?id=42822. Instead, do this to defragment one partition (such as p02):
ALTER TABLE table_name REBUILD PARTITION p02;
You could do this SELECT before and after in order to see the change(s) to the PARTITIONs:
SELECT *
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = 'dbname' -- change as needed
AND TABLE_NAME = 'table_name' -- change as needed
ORDER BY PARTITION_ORDINAL_POSITION,
SUBPARTITION_ORDINAL_POSITION;
It's a generic query to get the table-status-like info for the partitions of one table.
If the REBUILD cuts the partition by about 50%, then we have the answer.
Generally, randomly inserting into a BTree leaves blocks about 69% full (not 50%), so the table ends up at roughly 1.45 times its fully packed size rather than double. Hence, I'm not 'expecting' this to be the solution/answer.
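To judge the REBUILD experiment, it may be easier to look at just the size columns for the one partition, before and after. This is a sketch with placeholder schema and table names and made-up column aliases; the sizes are converted to GB of 1024^3 bytes.

```sql
-- Run once before and once after REBUILD PARTITION p02, then compare.
SELECT PARTITION_NAME,
       TABLE_ROWS,
       ROUND(DATA_LENGTH  / POW(1024, 3), 2) AS data_gb,
       ROUND(INDEX_LENGTH / POW(1024, 3), 2) AS index_gb,
       ROUND(DATA_FREE    / POW(1024, 3), 2) AS free_gb
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = 'dbname'      -- change as needed
  AND TABLE_NAME = 'table_name'    -- change as needed
  AND PARTITION_NAME = 'p02';
```

A drop in data_gb of around 30% would be consistent with ordinary BTree fragmentation; a drop near 50% would point to something else.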