How to partition a MySQL table after the table is already created

I am trying to partition my table to narrow down the records a query has to touch, so that accessing data does not take as long as it does now.
The table I want to partition has 2 key fields:
(1) 'tigger_on', a datetime field that I use a lot as a lookup key.
(2) 'status', which has 3 values: 1=active, 2=completed, 0=purged.
I am not sure what the best way to partition this table is so that SELECT statements can access it more easily.
First, when I partition, does this create a new table, so that I will have to alter my queries? Or is it something like an index, which narrows down the search so that less data has to be looked up?
Second, how can I alter my existing table to add the partitioning? Should I partition based on date range or on status, or can I do it by both?
I have never done partitioning before, so I am clueless about how it is done.
Note that this table has 5 million records and I have already added indexes, so I am looking for a solution beyond indexing at this point.
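For reference, here is a minimal sketch of what partitioning an existing table in place could look like. The table name `tasks` and the `id` column are assumptions (the question names neither), and the boundaries are only illustrative. A partitioned table remains a single logical table, so queries keep using the same table name.

```sql
-- Assumed names: table `tasks` and column `id` are hypothetical;
-- `tigger_on` is the datetime column described in the question.

-- MySQL requires every PRIMARY/UNIQUE key to include the partitioning column,
-- so the key is widened first (assumes the current primary key is just `id`).
ALTER TABLE tasks
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (id, tigger_on);

-- Repartition the existing table by date range; boundaries are illustrative.
ALTER TABLE tasks
  PARTITION BY RANGE (TO_DAYS(tigger_on)) (
    PARTITION p2013 VALUES LESS THAN (TO_DAYS('2014-01-01')),
    PARTITION p2014 VALUES LESS THAN (TO_DAYS('2015-01-01')),
    PARTITION pmax  VALUES LESS THAN (MAXVALUE)
  );

-- Pruning applies when the WHERE clause compares tigger_on directly with
-- constants. (On MySQL 8.0, use plain EXPLAIN; EXPLAIN PARTITIONS was removed.)
EXPLAIN PARTITIONS
SELECT * FROM tasks
WHERE tigger_on >= '2014-03-01' AND tigger_on < '2014-04-01';
```

Both ALTER statements rebuild the table, so on 5 million rows it is probably worth trying them on a copy first.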

Related

Partitioning table on YEAR and create view in MYSQL

I have 2 problems with a partitioned table in mysql.
My table has three columns
id_row INT NOT NULL AUTO_INCREMENT
name_element VARCHAR(45) NULL
date_element DATETIME NOT NULL
I modified the table to apply partitioning by range on YEAR(date_element) as follows:
ALTER TABLE `orderslist`
PARTITION BY RANGE(YEAR(date_element))
PARTITIONS 5(
PARTITION part_2013 VALUES LESS THAN (2014),
PARTITION part_2014 VALUES LESS THAN (2015),
PARTITION part_2015 VALUES LESS THAN (2016),
PARTITION part_2016 VALUES LESS THAN (2017),
PARTITION part_2017 VALUES LESS THAN (MAXVALUE));
but when I use
EXPLAIN PARTITIONS SELECT * FROM ordersList WHERE YEAR(date_element) > '2015';
the query uses all the partitions and not only part_2015, part_2016 and part_2017.
Instead if I use
EXPLAIN PARTITIONS SELECT * FROM ordersList WHERE date_element > '2015-10-10 10:00:00';
it works.
So my questions are:
How can I make the first query work?
Is there a way to create a materialized view from this table without losing the partitions?
Thank you
In your first example, EXPLAIN PARTITIONS SELECT * FROM ordersList WHERE YEAR(date_element) > '2015';, there's no way for the engine to identify beforehand which partition your data is in.
It must evaluate YEAR(date_element) for every row to find out the year. It's a classic example of filtering by a function's result. DBMSs in general can't use indexes to find data this way, since the function's result is unknown and must be evaluated for every row, so your search turns into a full scan.
I understand your point here, since you used the same function to define the partitioning and to find data, but for some reason this optimization is not there. In other words: the engine doesn't notice that both functions are the same.
In the second statement, you're directly comparing a column to an arbitrary value. This is what the engine prefers, and it is where indexes (and partition pruning) come into play.
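As a sketch of how the first query could be rewritten along those lines: the same condition expressed as a range on the bare column, using the table and partitions from the question. Note that YEAR(date_element) > 2015 means 2016 onwards, so I would expect only part_2016 and part_2017 to be needed, not part_2015.

```sql
-- Equivalent to YEAR(date_element) > 2015, but the column is compared
-- directly with a constant, which is the form the pruning logic understands.
EXPLAIN PARTITIONS
SELECT * FROM ordersList
WHERE date_element >= '2016-01-01 00:00:00';
```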
MySQL's PARTITIONing is quite finicky. YEAR() is recognized in the partitioning expression, but apparently not when it wraps the column in a WHERE clause with >; there it plays dumb.
Why are you partitioning on YEAR? It may not be useful.
If your queries are like what you described, then an appropriate index on a non-partitioned table is likely to run just as fast.
Please provide the important queries and SHOW CREATE TABLE (with or without partitioning) so we can analyze what makes the most sense.
Also, what is PARTITIONS 5??

MySQL - Access Partitions Through Views

I've created a view on a partitioned table. When I filter on the partitioning column in a SELECT against the view, EXPLAIN shows that the optimizer does not restrict the query to that particular partition.
Is there any way to make the view access a single partition of its table?
[Edit] : Here is how I created the view on two partitioned tables
CREATE TABLE Partition1 (ID INT,NAME VARCHAR(100),DOB DATE)
PARTITION BY LIST (YEAR(DOB))
(
PARTITION P_2000 VALUES IN (2000),
PARTITION P_2001 VALUES IN (2001)
);
CREATE TABLE NOPART (ID INT,DOB DATE)
PARTITION BY LIST (YEAR(DOB))
(
PARTITION P_2000 VALUES IN (2000),
PARTITION P_2001 VALUES IN (2001)
);
CREATE OR REPLACE VIEW P_VIEW
AS
SELECT ID,DOB
FROM Partition1
UNION
SELECT ID,DOB
FROM NOPART;
EXPLAIN
SELECT * FROM P_VIEW
WHERE DOB = '2001-01-01';
When I run the "Explain" it shows optimizer is going to both partitions "p_2000" and "p_2001".
There are many deficiencies in the implementation of VIEWs. You may have hit one.
There are many uses of PARTITIONing that do not provide any performance benefit. BY RANGE is probably the only variant that helps performance for some use cases. A table with fewer than a million rows is not worth partitioning.
Without seeing your CREATE TABLE, CREATE VIEW, and SELECT, we can only give you vague answers like I have.
(Responding to added code) Unless there is more to it than that, PARTITIONing in that way provides no benefit over having an index on DOB.
Furthermore, the VIEW + PARTITION approach (without an index) must scan the entire 2001 partition looking for the few rows for '2001-01-01'. Instead, the simple index approach can find them immediately -- 365 times as fast. (OK, not really that much faster, but still.)
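To illustrate the index approach, a minimal sketch: the table name `people_by_dob` and the index name are made up for this example, and the columns mirror the question's first table.

```sql
-- Hypothetical non-partitioned variant with a plain secondary index on DOB.
CREATE TABLE people_by_dob (
  ID   INT,
  NAME VARCHAR(100),
  DOB  DATE,
  INDEX idx_dob (DOB)
);

-- The lookup can go through idx_dob instead of scanning a whole year of rows.
EXPLAIN
SELECT ID, DOB
FROM people_by_dob
WHERE DOB = '2001-01-01';
```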

Partitioning a MySQL table based on a column value.

I want to partition a table in MySQL while preserving the table's structure.
I have a column, 'Year', based on which I want to split up the table into different tables for each year respectively. The new tables will have names like 'table_2012', 'table_2013' and so on. The resultant tables need to have all the fields exactly as in the source table.
I have tried the following two pieces of SQL script with no success:
1.
CREATE TABLE all_data_table
( column1 int default NULL,
column2 varchar(30) default NULL,
column3 date default NULL
) ENGINE=InnoDB
PARTITION BY RANGE ((year))
(
PARTITION p0 VALUES LESS THAN (2010),
PARTITION p1 VALUES LESS THAN (2011) , PARTITION p2 VALUES LESS THAN (2012) ,
PARTITION p3 VALUES LESS THAN (2013), PARTITION p4 VALUES LESS THAN MAXVALUE
);
2.
ALTER TABLE all_data_table PARTITION BY RANGE COLUMNS (`year`) (
PARTITION p0 VALUES LESS THAN (2011),
PARTITION p1 VALUES LESS THAN (2012),
PARTITION p2 VALUES LESS THAN (2013),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);
Any assistance would be appreciated!
This is old, but seeing as it comes up highly ranked in partitioning searches, I figured I'd give some additional details for people who might hit this page. What you are talking about in having a table_2012 and table_2013 is not "MySQL Partitioning" but "Manual Partitioning".
Partitioning means that you have one "logical table" with a single table name, which--behind the scenes--is divided among multiple files. When you have millions to billions of rows, over years, but typically you are only searching a single month, partitioning by Year/Month can have a great performance benefit because MySQL only has to search against the file that contains the Year/Month that you are searching for...so long as you include the partition key in your WHERE.
When you create multiple tables like table_2012 and table_2013, you are MANUALLY partitioning the tables, which you don't do with the MySQL PARTITION configuration. To manually partition the tables, during 2012, you put all data into the 2012 table. When you hit 2013, you start putting all the data into the 2013 table. You have to make sure to create the table before you hit 2013 or it won't have any place to go. Then, when you query across the years (e.g. from Nov 2012 - Jan 2013), you have to do a UNION between table_2012 and table_2013.
SELECT * FROM table_2012 WHERE #...
UNION
SELECT * FROM table_2013 WHERE #...
With partitioning, this manual work is not necessary. You do the initial setup of the partitions, then you treat it as a single table. No unions required, no checking the date before you insert, etc. This makes life much easier. MySQL handles figuring out which partitions it needs to query. However, you MUST make sure to query against the Year column or it will have to scan ALL files. E.g. SELECT * FROM all_data_table WHERE Month=12 will scan all partitions for Month=12. To ensure you are only scanning the partition files that you need to scan, include the partition column in every query that you can.
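A small sketch of that difference, assuming a hypothetical layout for all_data_table with integer `Year` and `Month` columns (the question does not show these columns, so treat the definition as illustrative):

```sql
-- Hypothetical layout: one logical table, range-partitioned on `Year`.
CREATE TABLE all_data_table (
  column1 INT DEFAULT NULL,
  `Year`  INT NOT NULL,
  `Month` INT NOT NULL
) ENGINE=InnoDB
PARTITION BY RANGE (`Year`) (
  PARTITION p2012 VALUES LESS THAN (2013),
  PARTITION p2013 VALUES LESS THAN (2014),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- Filters on the partition column: pruning can narrow this down to p2012.
EXPLAIN PARTITIONS
SELECT * FROM all_data_table WHERE `Year` = 2012 AND `Month` = 12;

-- No partition column in the WHERE clause: every partition must be scanned.
EXPLAIN PARTITIONS
SELECT * FROM all_data_table WHERE `Month` = 12;
```

Comparing the partitions column of the two EXPLAIN outputs is a quick way to check whether a given query is being pruned.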
Possible negatives to partitioning...if you have billions of rows and you do an ALTER TABLE on the table to--say--add a column...it's going to have to update every row taking a VERY long time. At the company I currently work for, the boss doesn't think it's worth the time it takes to update the billion rows historically when we are adding a new column for going forward...so this is one of the reasons we do manual partitioning instead of letting MySQL do it.
DISCLAIMER: I am not an expert at partitioning...so if I'm wrong in any of this, please let me know and I'll fix the incorrect parts.
From what I see, you want to create many tables from one big table.
I think you should try creating views instead.
From what I have read about partitioning, it actually splits the physical storage of the table and stores the pieces separately. Seen from above, though, they still appear as a single table.

Partition strategy for MySQL 5.5 (InnoDB)

Trying to implement a partition strategy for a MySQL 5.5 (InnoDB) table and I am not sure my understanding is right or if I need to change the syntax in creating the partition.
Table "Apple" has 10 million rows...Columns "A" to "H"
PK is columns "A", "B" and "C"
Column "A" is a char column and can identify groups of 2 million rows.
I thought column "A" would be a nice candidate to try and implement a partition around since
I select and delete by this column and could really just truncate the partition when the data is no longer needed.
I issued this command:
ALTER TABLE Apple
PARTITION BY KEY (A);
After looking at the partition info using this command:
SELECT PARTITION_NAME, TABLE_ROWS FROM
INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME = 'Apple';
I see all the data is on partition p0
Am I wrong in thinking that MySQL was going to break out the partitions in groups of 2 million automagically?
Did I need to specify the number of partitions in the Alter command?
I was hoping this would create groups of 2 million rows in a partition and then create a new partition as new data comes in with a unique value for column "A".
Sorry if this was too wordy.
Thanks - JeffSpicoli
Yes, you need to specify the number of partitions (I assume the default was to create 1 partition). Partitioning by KEY uses an internal hashing function (http://dev.mysql.com/doc/refman/5.1/en/partitioning-key.html), so the partition is not selected based on the value of the column itself, but on a hash computed from it. Hashing functions return the same result for the same input, so yes, all rows having the same value will be in the same partition.
But maybe you want to partition by RANGE if you want to be able to DROP PARTITION (because if partitioned by KEY, you only know that the rows are spread roughly evenly across the partitions, and many different values may end up in the same partition).
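A rough sketch of both options. The partition count of 5 and the group values 'G1' and 'G2' are placeholders, since the question does not list the actual values of column A. The second variant uses LIST COLUMNS rather than RANGE, simply because it maps one known value to one partition; it is an alternative I am naming explicitly, not what the answer above proposes.

```sql
-- Option 1: KEY partitioning with an explicit partition count.
-- Rows are placed by a hash of A, so two groups may share a partition.
ALTER TABLE Apple
  PARTITION BY KEY (A)
  PARTITIONS 5;

-- Option 2 (alternative): one partition per known value of A.
-- 'G1' and 'G2' are placeholder values; the ALTER fails if the table
-- contains a value of A that is not listed in any partition.
ALTER TABLE Apple
  PARTITION BY LIST COLUMNS (A) (
    PARTITION p_g1 VALUES IN ('G1'),
    PARTITION p_g2 VALUES IN ('G2')
  );

-- With option 2, a whole group can be emptied without a row-by-row DELETE.
ALTER TABLE Apple TRUNCATE PARTITION p_g1;
```

With option 2, a new partition has to be added (ALTER TABLE ... ADD PARTITION) before rows with a new value of A can be inserted.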

How to partition a MyISAM table by day in MySQL

I want to keep the last 45 days of log data in a MySQL table for statistical reporting purposes. Each day could be 20-30 million rows. I'm planning on creating a flat file and using LOAD DATA INFILE to get the data in there each day. Ideally I'd like to have each day in its own partition without having to write a script to create a partition every day.
Is there a way in MySQL to just say each day gets its own partition automatically?
thanks
I would strongly suggest using Redis or Cassandra rather than MySQL to store high traffic data such as logs. Then you could stream it all day long rather than doing daily imports.
You can read more on those two (and more) in this comparison of "NoSQL" databases.
If you insist on MySQL, I think the easiest would just be to create a new table per day, like logs_2011_01_13 and then load it all in there. It makes dropping older dates very easy and you could also easily move different tables on different servers.
er.., number them in Mod 45 with a composite key and cycle through them...
Seriously 1 table per day was a valid suggestion, and since it is static data I would create packed MyISAM, depending upon my host's ability to sort.
Building queries to union some or all of them would be only moderately challenging.
1 table per day, and partition those to improve load performance.
Yes, you can partition MySQL tables by date:
CREATE TABLE ExampleTable (
id INT AUTO_INCREMENT,
d DATE,
PRIMARY KEY (id, d)
) PARTITION BY RANGE COLUMNS(d) (
PARTITION p1 VALUES LESS THAN ('2014-01-01'),
PARTITION p2 VALUES LESS THAN ('2014-01-02'),
PARTITION pN VALUES LESS THAN (MAXVALUE)
);
Later, when you get close to overflowing into partition pN, you can split it:
ALTER TABLE ExampleTable REORGANIZE PARTITION pN INTO (
PARTITION p3 VALUES LESS THAN ('2014-01-03'),
PARTITION pN VALUES LESS THAN (MAXVALUE)
);
This doesn't automatically partition by date, but you can reorganize when you need to. Best to reorganize before you fill the last partition, so the operation will be quick.
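If you want the split to happen on a schedule, one possibility is to wrap the REORGANIZE in a stored procedure and call it from a MySQL event. This is only a sketch under assumptions: the procedure and event names are made up, the partition naming scheme (p plus the date) is my own, the Event Scheduler has to be enabled, and the DELIMITER lines are for the mysql command-line client.

```sql
DELIMITER //

-- Hypothetical helper: carve tomorrow's partition out of the catch-all pN.
CREATE PROCEDURE add_next_day_partition()
BEGIN
  DECLARE next_day DATE DEFAULT CURDATE() + INTERVAL 1 DAY;

  -- A partition holding the rows of day D needs the upper bound D + 1 day.
  SET @ddl = CONCAT(
    'ALTER TABLE ExampleTable REORGANIZE PARTITION pN INTO (',
    'PARTITION p', DATE_FORMAT(next_day, '%Y%m%d'),
    ' VALUES LESS THAN (''',
    DATE_FORMAT(next_day + INTERVAL 1 DAY, '%Y-%m-%d'),
    '''), PARTITION pN VALUES LESS THAN (MAXVALUE))');

  -- Partition names cannot be parameters, so the statement is built as text.
  PREPARE stmt FROM @ddl;
  EXECUTE stmt;
  DEALLOCATE PREPARE stmt;
END//

DELIMITER ;

-- Runs once a day; requires event_scheduler=ON on the server.
CREATE EVENT add_partition_daily
  ON SCHEDULE EVERY 1 DAY
  DO CALL add_next_day_partition();
```

The call fails if a partition for that day already exists or if an existing partition already has a later boundary, so it assumes the procedure runs exactly once per day on a table whose newest dated partition ends today. Old days can then be removed with ALTER TABLE ExampleTable DROP PARTITION followed by the partition name.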
I have stumbled on this question while looking for something else and wanted to point out the MERGE storage engine (http://dev.mysql.com/doc/refman/5.7/en/merge-storage-engine.html).
The MERGE storage engine is more or less a simple pointer to multiple tables, and a MERGE table can be redone in seconds. For cycling logs, it can be very powerful! Here's what I'd do:
Create one table per day and use LOAD DATA, as the OP mentioned, to fill it up. Once that is done, drop the MERGE table and recreate it, including the new table while omitting the oldest one. After that, I could delete/archive the old table. This would allow me to rapidly query a specific day, or all of them, as both the original tables and the MERGE table are valid.
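-- CREATE TABLE ... LIKE takes no table options; the copy inherits MyISAM from logs_day_45
CREATE TABLE logs_day_46 LIKE logs_day_45;
DROP TABLE IF EXISTS logs;
-- copy the column definitions, then turn the empty copy into the MERGE table
CREATE TABLE logs LIKE logs_day_46;
ALTER TABLE logs ENGINE=MERGE UNION=(logs_day_2,[...],logs_day_46);
DROP TABLE logs_day_1;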
Note that a MERGE table is not the same as a PARTITIONED one, and each has its advantages and drawbacks. But do remember that if you are trying to aggregate across all tables, it will be slower than if all the data were in a single table (the same is true for partitions, as they are basically different tables under the hood). If you are going to query mostly on specific days, you will need to choose the table yourself, but if partitions are defined on the day values, MySQL will automatically pick the correct partition(s), which might come out faster and easier to write.