Partiton mysql table by column timestamp - mysql

I am trying to partition my table MySQL innoDB. Right now there are approximately 2 million rows in the location table (and growing always) rows of history data. I must perodicly delete the dataset old by year
I use MySQL 5.7.22 Community Server.
CREATE TABLE `geo_data` (
`ID` bigint(20) NOT NULL AUTO_INCREMENT,
`ID_DISP` bigint(20) DEFAULT NULL,
`SYS_TIMESTAMP` datetime DEFAULT NULL,
`DATA_TIMESTAMP` bigint(20) DEFAULT NULL,
`X` double DEFAULT NULL,
`Y` double DEFAULT NULL,
`SPEED` bigint(20) DEFAULT NULL,
`HEADING` bigint(20) DEFAULT NULL,
`ID_DATA_TYPE` bigint(20) DEFAULT NULL,
`PROCESSED` bigint(20) DEFAULT NULL,
`ALTITUDE` bigint(20) DEFAULT NULL,
`ID_UNIT` bigint(20) DEFAULT NULL,
`ID_DRIVER` bigint(20) DEFAULT NULL,
UNIQUE KEY `part_id` (`ID`,`DATA_TIMESTAMP`,`ID_DISP`),
KEY `Index_idDisp_dataTS_type` (`ID_DISP`,`DATA_TIMESTAMP`,`ID_DATA_TYPE`),
KEY `Index_idDisp_dataTS` (`ID_DISP`,`DATA_TIMESTAMP`),
KEY `Index_TS` (`DATA_TIMESTAMP`),
KEY `idx_sysTS_idDisp` (`ID_DISP`,`SYS_TIMESTAMP`),
KEY `idx_clab_geo_data_ID_UNIT_DATA_TIMESTAMP_ID_DATA_TYPE` (`ID_UNIT`,`DATA_TIMESTAMP`,`ID_DATA_TYPE`),
KEY `idx_idUnit_dataTS` (`ID_UNIT`,`DATA_TIMESTAMP`),
KEY `idx_clab_geo_data_ID_DRIVER_DATA_TIMESTAMP_ID_DATA_TYPE` (`ID_DRIVER`,`DATA_TIMESTAMP`,`ID_DATA_TYPE`)
) ENGINE=InnoDB AUTO_INCREMENT=584390 DEFAULT CHARSET=latin1;
I have to partition by DATA_TIMESTAMP (format timestamp date gps).
ALTER TABLE geo_data
PARTITION BY RANGE (year(from_unixtime(data_timestamp)))
(
PARTITION p2018 VALUES LESS THAN ('2018'),
PARTITION p2019 VALUES LESS THAN ('2019'),
PARTITION pmax VALUES LESS THAN MAXVALUE
);
Error Code: 1697. VALUES value for partition 'p2018' must have type INT
How can I do?
I would like to add later a subpartion range by ID_DISP. How can I do?
Thanks in advance!

Since data_timestamp was actually a BIGINT, you are not permitted to use date functions. It seemed there were two errors, and this probably fixes them:
ALTER TABLE geo_data
PARTITION BY RANGE (data_timestamp)
(
PARTITION p2018 VALUES LESS THAN (UNIX_TIMESTAMP('2018-01-01') * 1000),
PARTITION p2019 VALUES LESS THAN (UNIX_TIMESTAMP('2019-01-01') * 1000),
PARTITION pmax VALUES LESS THAN MAXVALUE
);
I am assuming your data_timestamp is really milliseconds, a la Java? If not, the decide what to do with the * 1000.
SUBPARTITIONs are useless; don't bother with them. If you really want to partition by Month or Quarter, then simply do it at the PARTITION level.
Recommendation: Don't have more than about 50 partitions.
How many "drivers" do you have? I suspect you do not have trillions. So, don't blindly use BIGINT for ids. Each takes 8 bytes. SMALLINT UNSIGNED, for example, would take only 2 bytes and allow 64K drivers (etc).
If X and Y are latitude and longitude, it might be clearer to name them such. Here are what datatype to use instead of the 8-byte DOUBLE, depending on the resolution you have (and need). 4-byte FLOATs are probably good enough for vehicles.
The table has several redundant indexes; toss them. Also, note that when you have INDEX(a,b,c), it is redundant to also have INDEX(a,b).
See also my discussion on Partitioning, especially related to time-series, such as yours.
Hmmm... I wonder if the 63 bits of precision for SPEED will let you record them when they go at the speed of light?
Another point: Don't create p2019 until just before the start of 2019. You have pmax in case you goof and fail to add that partition in time. And the REORGANIZE PARTITION mentioned in my discussion covers how to recover from such a goof.

Update:
Seems you cannot use from_unixtime in a PARTITION BY RANGE query because hash partitions must be based on an integer expression. More info see this answer
Its expecting an INT not a STRING (as per the error message), so try :
ALTER TABLE geo_data
PARTITION BY RANGE (year(from_unixtime(data_timestamp)))
(
PARTITION p2018 VALUES LESS THAN (2018),
PARTITION p2019 VALUES LESS THAN (2019),
PARTITION pmax VALUES LESS THAN MAXVALUE
);
Here I have specified the year in the partition values as an int ie 2018 / 2019, and not strings as in '2018' / '2019'

Related

How to partition a table by year and then subpartition by month in mysql 8

I have a table that contains a month and a year column.
I have a query which usually looks something like WHERE month=1 AND year=2022
Given how large this table is i would like to make it more efficient using partitions and sub partitions.
table 1
Querying the data i need took around 2 minutes and 30 seconds.
CREATE TABLE `table_1` (
`id` int NOT NULL AUTO_INCREMENT,
`entity_id` varchar(36) NOT NULL,
`entity_type` varchar(36) NOT NULL,
`score` decimal(4,3) NOT NULL,
`month` int NOT NULL DEFAULT '0',
`year` int NOT NULL DEFAULT '0',
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_month_year` (`month`,`year`, `entity_type`)
)
Partitioning by "month"
Querying the data i need took around 21 seconds (big improvement).
CREATE TABLE `table_1` (
`id` int NOT NULL AUTO_INCREMENT,
`entity_id` varchar(36) NOT NULL,
`entity_type` varchar(36) NOT NULL,
`score` decimal(4,3) NOT NULL,
`month` int NOT NULL DEFAULT '0',
`year` int NOT NULL DEFAULT '0',
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`,`month`),
KEY `idx_month_year` (`month`,`year`, `entity_type`)
) ENGINE=InnoDB AUTO_INCREMENT=21000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
/*!50100 PARTITION BY LIST (`month`)
(PARTITION p0 VALUES IN (0) ENGINE = InnoDB,
PARTITION p1 VALUES IN (1) ENGINE = InnoDB,
PARTITION p2 VALUES IN (2) ENGINE = InnoDB,
PARTITION p3 VALUES IN (3) ENGINE = InnoDB,
PARTITION p4 VALUES IN (4) ENGINE = InnoDB,
PARTITION p5 VALUES IN (5) ENGINE = InnoDB,
PARTITION p6 VALUES IN (6) ENGINE = InnoDB,
PARTITION p7 VALUES IN (7) ENGINE = InnoDB,
PARTITION p8 VALUES IN (8) ENGINE = InnoDB,
PARTITION p9 VALUES IN (9) ENGINE = InnoDB,
PARTITION p10 VALUES IN (10) ENGINE = InnoDB,
PARTITION p11 VALUES IN (11) ENGINE = InnoDB,
PARTITION p12 VALUES IN (12) ENGINE = InnoDB) */
I would like to see if i can improve the performance even further by partitioning by year and then subpartitioning by month. How can i do that?
I'm not sure the following question Partition by year and sub-partition by month mysql is relevant with no marked answers and that question looks to be particular to mysql 5* and php. Im asking about mysql 8, are there no changes since then regarding partioning/subpartioning/list columns/range columns etc? which could help me.
Broader query im making
SELECT
table_1.entity_id AS entity_id,
table_1.entity_type,
table_1.score
FROM table_1
WHERE table_1.month = 12 AND table_1.year = 2022
AND table_1.score > 0
AND table_1.entity_type IN ('type1', 'type2', 'type3', 'type4') # only ever 4 types usually all 4 are present in the query
To answer your question directly, below is example syntax that accomplishes the subpartitioning. Notice the PRIMARY KEY must include all columns used for partitioning or subpartitioning. Read the manual on subpartitioning for more information: https://dev.mysql.com/doc/refman/8.0/en/partitioning-subpartitions.html
Schema (MySQL v8.0)
CREATE TABLE `table_1` (
`id` int NOT NULL AUTO_INCREMENT,
`entity_id` varchar(36) NOT NULL,
`entity_type` varchar(36) NOT NULL,
`score` decimal(4,3) NOT NULL,
`month` int NOT NULL DEFAULT '0',
`year` int NOT NULL DEFAULT '0',
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`,`month`, `year`),
KEY `idx_month_year` (`month`,`year`, `score`, `entity_type`)
) ENGINE=InnoDB AUTO_INCREMENT=21000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
PARTITION BY LIST (`month`)
SUBPARTITION BY HASH(`year`)
SUBPARTITIONS 10 (
PARTITION p0 VALUES IN (0) ENGINE = InnoDB,
PARTITION p1 VALUES IN (1) ENGINE = InnoDB,
PARTITION p2 VALUES IN (2) ENGINE = InnoDB,
PARTITION p3 VALUES IN (3) ENGINE = InnoDB,
PARTITION p4 VALUES IN (4) ENGINE = InnoDB,
PARTITION p5 VALUES IN (5) ENGINE = InnoDB,
PARTITION p6 VALUES IN (6) ENGINE = InnoDB,
PARTITION p7 VALUES IN (7) ENGINE = InnoDB,
PARTITION p8 VALUES IN (8) ENGINE = InnoDB,
PARTITION p9 VALUES IN (9) ENGINE = InnoDB,
PARTITION p10 VALUES IN (10) ENGINE = InnoDB,
PARTITION p11 VALUES IN (11) ENGINE = InnoDB,
PARTITION p12 VALUES IN (12) ENGINE = InnoDB
);
Using EXPLAIN on your query reveals that the query references only one subpartition.
Query #1
EXPLAIN
SELECT
table_1.entity_id AS entity_id,
table_1.entity_type,
table_1.score
FROM table_1
WHERE table_1.month = 12
AND table_1.year = 2022
AND table_1.score > 0
AND table_1.entity_type IN ('type1', 'type2', 'type3', 'type4');
id
select_type
table
partitions
type
possible_keys
key
key_len
ref
rows
filtered
Extra
1
SIMPLE
table_1
p12_p12sp2
range
idx_month_year
idx_month_year
11
1
100
Using index condition
The partitions field of the EXPLAIN shows that it accesses only partition p12_p12sp2. The year the query references, 2022, modulus the number of subpartitions, 10, will read from the subpartition 2.
In addition to the partitioning by month and year, it is also helpful to use an index. In this case, I added score to the index so it would filter out rows where score <= 0. The note in the EXPLAIN "Using index condition" shows that it is delegating further filtering on entity_type to the storage engine. Though in your example, you said there are only four values for entity type, and all four are selected, so that condition won't filter out any rows anyway.
View on DB Fiddle
Re your questions in comments below:
a little bit confused on SUBPARTITIONS 10 , why 10
It's just an example. You can choose a different number of subpartitions. Whatever you feel is required to reduce the search as much as you want.
To be honest, I've never encountered a situation that required subpartitioning at all, if the search is also optimized with indexes. So I have no guidance on what is an appropriate number of subpartitions.
It's your responsibility to test performance until you are satisfied.
also bit confusd on the partition name p12_p12sp2 how do i know it selected the partition with year 2022 from looking at that?
The query has a condition year = 2022.
There are 10 subpartitions in my example.
Hash partitioning just uses the integer value to be partitioned, modulus the number of partitions.
2022 modulus 10 is 2. Hence the partition ending in ...sp2 is the one used.
I also came across this anothermysqldba.blogspot.com/2014/12/… do you know how yours differs from what it shown here ( bare in mind that blog is from 2014)
They chose to name the subpartitions. There's no need to do that.
would there be any performance difference in having a single date e.g (2022-12-21) instead of sepreate columns month and year.
That depends on the query, and I'll leave it to you to test. Any predictions I make won't be accurate with your data on your server.
i can also see that you partition by month and subpartition by year, as oppose to partition by year and subpartition by month. can you explain the reasoning?
Subpartitioning works only if the outer partitions are LIST or RANGE partitions, and the subpartitions are HASH or KEY partitions. This is in the manual page I linked to.
There are a finite number of months (12). This makes it easy to partition by LIST as you did. You won't ever need more partitions. If you had partitioned by YEAR as the outer partition, you would have needed to specify year values in the list, and this is a growing set, so you would periodically have to alter the table to extend the list or range to account for new years.
Whereas when partitioning by HASH for the subpartitioning, the new year values are mapped into the finite set of subpartitions, so it's okay that it's not a finite list. You won't have to alter table to repartition (unless you want to change the number of subpartitions).
Splitting a date into columns is usually counterproductive. It is much easier to split during SELECT.
PARTITIONing is usually useless for performance of any SELECT.
When partitioning (or unpartitioning), the indexes usually need changing.
For that query, I recommend a combined date column,
WHERE date >= '2022-01-01'
AND date < '2022-01-01' + INTERVAL 1 MONTH
and some INDEX starting with date.
(You probably have other queries; let's see some of them; they may need a different index.)
Covering index -- This is an index that contains all the columns found anywhere in the SELECT. It is may be better (faster) than having only the columns needed for WHERE or WHERE + GROUP BY + ORDER BY. It depends on a lot of variables.
Order of columns in an index (or PK): The leftmost column(s) have priority. That is the order of the index rows on disk. PK(id, date) is useful if looking up by id (in the WHERE), but not if you are just searching by date.
Sargable -- sargable -- Hiding a column in a function disables the use of an index. That is MONTH(date) cannot use INDEX(date).
Blogs -- Index Cookbook and Partition
Test plan
I recommend you time all your queries against a variety of Create Tables.
For the WHERE clause:
The order of ANDs does not matter.
When using IN, a single value os equivalent to = and optimizes better. Multiple values may optimize more poorly. As Bill hints at, when the IN list contains all the options, you should eliminate the clause since the Optimizer is not smart enough. So, be sure to test with 1 and/or many items, so as to be realistic to your app.
For the table
Try Partition BY year + Subpartition by month.
Try Partition by a column that is the combination of year and month.
Try without partitioning.
For indexes
Order of the columns (in a composite index) does matter, so try different orderings.
When partitioning, be sure to tack onto the end of the PK the partition key(s).
A partitioned table needs different indexes than a non-partitioned table. That is, what works well for one may work poorly for the other.
Simply use something like this pattern to test various layouts:
CREATE TABLE (( a new layout with or without partitioning and with indexes ))
INSERT INTO test_table SELECT ... FROM real_table;
Change the "..." to adapt to any extra/missing columns in test_table
SELECT ...
Run various 'real' queries
Run each query twice (caching sometimes messes with the timing)
Report the results -- If you provide sufficient info (CREATE TABLE and SELECT), I may have suggestions on further speeding up the test (whether it is partitioned or not).

Efficient Method To Partition MySQL Table By Year and Month

I am exploring ways of partitioning a MySQL table by year and month. Can you please analyze my table creation below and see if this method of partitioning would end up putting data by month and year in these sub partitions? I'm using MySQL 5.5 and I can't use
SELECT * FROM points_log PARTITION (p0_p0sp0);
to validate if the partitioning is working. If there is a way to validate this in MySQL 5.5 please comment. I appreciate your feedback and criticisms on this table partitioning.
Here is my table creation:
CREATE TABLE `points_log` (
`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`nick` char(25) NOT NULL,
`amount` decimal(7,4) NOT NULL,
`stream_online` tinyint(1) NOT NULL,
`modification_type` tinyint(3) unsigned NOT NULL,
`dt` datetime NOT NULL,
PRIMARY KEY (`id`,`dt`,`nick`),
KEY `nick_idx` (`nick`),
KEY `amount_idx` (`amount`),
KEY `modification_type_idx` (`modification_type`),
KEY `dt_idx` (`dt`),
KEY `stream_online_idx` (`stream_online`)
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=latin1
PARTITION BY RANGE( YEAR(dt) )
SUBPARTITION BY HASH( MONTH(dt) )
SUBPARTITIONS 12 (
PARTITION p0 VALUES LESS THAN (2014),
PARTITION p1 VALUES LESS THAN (2015),
PARTITION p2 VALUES LESS THAN (2016),
PARTITION p3 VALUES LESS THAN (2017),
PARTITION p4 VALUES LESS THAN (2018),
PARTITION p5 VALUES LESS THAN (2019),
PARTITION p6 VALUES LESS THAN (2020),
PARTITION p7 VALUES LESS THAN MAXVALUE
);
SUBPARTITIONs are probably useless. (That is, I have yet to find any advantage to their use. That especially applies to performance.)
Don't split the date; keep it as a single field.
Use BY RANGE(TO_DAYS(dt)) VALUES LESS THAN (TO_DAYS('2015-02-01'))
BY HASH is probably totally useless for performance.
WHERE dt BETWEEN .. AND .. cannot do partition pruning in the structure you have.
Do not use more than about 50 partitions (for performance reasons).
Do not create more than one 'future' partition; build them as needed. (This is a minor performance improvement.)
Do not use CHAR for variable length fields. Use VARCHAR.

Partition MySQL table with primary key and concatonated unique index

I have a table storing weekly viewing statistic for around 40K businesses, the tables passed 2.2M records and is starting to slow things down, I'm looking at partitioning it to speed things up but I'm not sure how best to do it.
My ORM requires an id field as a primary key, but that field has no relevance to the data, I've been using a unique index on fields for year, week number and business ID.
As I need the primary key to be involved in the partition map, I'm not sure how best to organise this (I've never used partitioning before).
Currently I have...
CREATE TABLE `weekly_views` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`business_id` int(11) NOT NULL,
`year` smallint(4) UNSIGNED NOT NULL,
`week` tinyint(2) UNSIGNED NOT NULL,
`hits` int(5) NOT NULL,
`created` timestamp NOT NULL ON UPDATE CURRENT_TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`updated` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
UNIQUE `search` USING BTREE (business_id, `year`, `week`),
UNIQUE `id` USING BTREE (id, `week`)
) ENGINE=`InnoDB` AUTO_INCREMENT=2287009 DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci ROW_FORMAT=COMPACT CHECKSUM=0 DELAY_KEY_WRITE=0 PARTITION BY LIST(week) PARTITIONS 52 (PARTITION p1 VALUES IN (1) ENGINE = InnoDB,
PARTITION p2 VALUES IN (2) ENGINE = InnoDB,
PARTITION p3 VALUES IN (3) ENGINE = InnoDB,
PARTITION p4 VALUES IN (4) ENGINE = InnoDB,
(5 ... 51)
PARTITION p52 VALUES IN (52) ENGINE = InnoDB);
One partition per week seemed the only logical way to break them up. Am I right that when I search for a record for the current week/business using 'business_id = xx and week = xx and year = xx' it's going to know which partition to use without searching them all? But, when I get the result and save it via the ORM, it's going to use the id field and not know which partition to use?
I guess I could use a custom query to insert or update (I haven't originally done this as the ORM doesn't support it).
Am I going the right way about this, or is there a better way to partition a table like this?
Thanks for your help!
As long as the query has week column in WHERE clause, MySQL will look in correct partition. However, weeks repeat each year and you'll end up with data from different years in the same partition.
Also you need 53 not 52 partitions, as you'll need to deal with leap years.

How to create a table in MySQL 5.5 that has fulltext indexing and partitioning?

I update MySQL versition from 5.0 to 5.5. and I am new for studying mysql partition. firstly, I type:
SHOW VARIABLES LIKE '%partition%'
Variable_name Value
have_partitioning YES
Make sure that the new version support partition. I tried to partition my table by every 10 minutes, then INSERT, UPDATE, QUERY huge data into this table for a test.
First, I need create a new table, I type my code:
CREATE TABLE test (
`id` int unsigned NOT NULL auto_increment,
`words` varchar(100) collate utf8_unicode_ci NOT NULL,
`date` varchar(10) collate utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `index` (`words`)
)
ENGINE=MyISAM
DEFAULT CHARSET=utf8
COLLATE=utf8_unicode_ci
AUTO_INCREMENT=0
PARTITION BY RANGE (MINUTE(`date`))
(
PARTITION p0 VALUES LESS THAN (1322644000),
PARTITION p1 VALUES LESS THAN (1322644600) ,
PARTITION p2 VALUES LESS THAN (1322641200) ,
PARTITION p3 VALUES LESS THAN (1322641800) ,
PARTITION p4 VALUES LESS THAN MAXVALUE
);
It return alert: #1564 - This partition function is not allowed, so what is this problem? thanks.
UPDATE
Modify date into int NOT NULL, and PARTITION BY RANGE MINUTE(date) into PARTITION BY RANGE COLUMNS(date)
CREATE TABLE test (
`id` int unsigned NOT NULL auto_increment,
`words` varchar(100) collate utf8_unicode_ci NOT NULL,
`date` int NOT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `index` (`words`)
)
ENGINE=MyISAM
DEFAULT CHARSET=utf8
COLLATE=utf8_unicode_ci
AUTO_INCREMENT=0
PARTITION BY RANGE COLUMNS(`date`)
(
PARTITION p0 VALUES LESS THAN (1322644000),
PARTITION p1 VALUES LESS THAN (1322644600) ,
PARTITION p2 VALUES LESS THAN (1322641200) ,
PARTITION p3 VALUES LESS THAN (1322641800) ,
PARTITION p4 VALUES LESS THAN MAXVALUE
);
Then caused new error: #1214 - The used table type doesn't support FULLTEXT indexes
I am so sorry, mysql not support fulltext and partition at the same time.
See partitioning limitations
FULLTEXT indexes. Partitioned tables do not support FULLTEXT indexes or searches. This includes partitioned tables employing the MyISAM storage engine.
One issue might be
select MINUTE('2008-10-10 56:56:98') returns null, the reason is Minute function returns minute from time or datetime value, where as in your case date is varchar
MINUTE function returns in either date/datetime expression. Again, A partitioning key must be either an integer column or an expression that resolves to an
integer but inyour case it's VARCHAR

how to partition a table by datetime column?

I want to partition a mysql table by datetime column. One day a partition.The create table scripts is like this:
CREATE TABLE raw_log_2011_4 (
id bigint(20) NOT NULL AUTO_INCREMENT,
logid char(16) NOT NULL,
tid char(16) NOT NULL,
reporterip char(46) DEFAULT NULL,
ftime datetime DEFAULT NULL,
KEY id (id)
) ENGINE=InnoDB AUTO_INCREMENT=286802795 DEFAULT CHARSET=utf8
PARTITION BY hash (day(ftime)) partitions 31;
But when I select data of some day.It could not locate the partition.The select statement is like this:
explain partitions select * from raw_log_2011_4 where day(ftime) = 30;
when i use another statement,it could locate the partition,but I coluld not select data of some day.
explain partitions select * from raw_log_2011_4 where ftime = '2011-03-30';
Is there anyone tell me How I could select data of some day and make use of partition.Thanks!
Partitions by HASH is a very bad idea with datetime columns, because it cannot use partition pruning. From the MySQL docs:
Pruning can be used only on integer columns of tables partitioned by
HASH or KEY. For example, this query on table t4 cannot use pruning
because dob is a DATE column:
SELECT * FROM t4 WHERE dob >= '2001-04-14' AND dob <= '2005-10-15';
However, if the table stores year values in an INT column, then a
query having WHERE year_col >= 2001 AND year_col <= 2005 can be
pruned.
So you can store the value of TO_DAYS(DATE()) in an extra INTEGER column to use pruning.
Another option is to use RANGE partitioning:
CREATE TABLE raw_log_2011_4 (
id bigint(20) NOT NULL AUTO_INCREMENT,
logid char(16) NOT NULL,
tid char(16) NOT NULL,
reporterip char(46) DEFAULT NULL,
ftime datetime DEFAULT NULL,
KEY id (id)
) ENGINE=InnoDB AUTO_INCREMENT=286802795 DEFAULT CHARSET=utf8
PARTITION BY RANGE( TO_DAYS(ftime) ) (
PARTITION p20110401 VALUES LESS THAN (TO_DAYS('2011-04-02')),
PARTITION p20110402 VALUES LESS THAN (TO_DAYS('2011-04-03')),
PARTITION p20110403 VALUES LESS THAN (TO_DAYS('2011-04-04')),
PARTITION p20110404 VALUES LESS THAN (TO_DAYS('2011-04-05')),
...
PARTITION p20110426 VALUES LESS THAN (TO_DAYS('2011-04-27')),
PARTITION p20110427 VALUES LESS THAN (TO_DAYS('2011-04-28')),
PARTITION p20110428 VALUES LESS THAN (TO_DAYS('2011-04-29')),
PARTITION p20110429 VALUES LESS THAN (TO_DAYS('2011-04-30')),
PARTITION future VALUES LESS THAN MAXVALUE
);
Now the following query will only use partition p20110403:
SELECT * FROM raw_log_2011_4 WHERE ftime = '2011-04-03';
Hi You are doing the wrong partition in definition of the table the table definition would like this:
CREATE TABLE raw_log_2011_4 (
id bigint(20) NOT NULL AUTO_INCREMENT,
logid char(16) NOT NULL,
tid char(16) NOT NULL,
reporterip char(46) DEFAULT NULL,
ftime datetime DEFAULT NULL,
KEY id (id)
) ENGINE=InnoDB AUTO_INCREMENT=286802795 DEFAULT CHARSET=utf8
PARTITION BY hash (TO_DAYS(ftime)) partitions 31;
And your select command would be:
explain partitions
select * from raw_log_2011_4 where TO_DAYS(ftime) = '2011-03-30';
The above command would select all the date required, as if you use the TO_DAYS command as
mysql> SELECT TO_DAYS(950501);
-> 728779
mysql> SELECT TO_DAYS('2007-10-07');
-> 733321
Why to use the TO_DAYS AS The MySQL optimizer will recognize two date-based functions for partition pruning purposes:
1.TO_DAYS()
2.YEAR()
and this would solve your problem..
I just recently read a MySQL blog post relating to this, at http://dev.mysql.com/tech-resources/articles/mysql_55_partitioning.html.
Versions earlier than 5.1 required special gymnastics in order to do partitioning based on dates. The link above discusses it and shows examples.
Versions 5.5 and later allowed you to do direct partitioning using non-numeric values such as dates and strings.
Don't use CHAR, use VARCHAR. That will save a lot of space, hence decrease I/O, hence speed up queries. (Exception: If the column is really fixed length, then use CHAR. And it will probably be CHARACTER SET ascii.)
reporterip: (46) is unnecessarily big for an IP address, even IPv6. See My blog for further discussion, including how to shrink it to 16 bytes.
PARTITION BY RANGE(TO_DAYS(...)) as #Steyx suggested, but don't have more than about 50 partitions. The more partitions you have, the slower queries get, in spite of the "pruning". HASH partitioning is essentially useless.
More discussion of partitioning, especially the type you are looking at. That includes code for a sliding set of partitions over time.