Partition strategy for MySQL 5.5 (InnoDB)

I am trying to implement a partition strategy for a MySQL 5.5 (InnoDB) table, and I am not sure whether my understanding is right or whether I need to change the syntax used to create the partitions.
Table "Apple" has 10 million rows, with columns "A" to "H".
PK is columns "A", "B" and "C"
Column "A" is a char column and can identify groups of 2 million rows.
I thought column "A" would be a nice candidate to try and implement a partition around since
I select and delete by this column and could really just truncate the partition when the data is no longer needed.
I issued this command:
ALTER TABLE Apple
PARTITION BY KEY (A);
After looking at the partition info using this command:
SELECT PARTITION_NAME, TABLE_ROWS FROM
INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME = 'Apple';
I see all the data is on partition p0
Am I wrong in thinking that MySQL was going to break out the partitions in groups of 2 million automagically?
Did I need to specify the number of partitions in the Alter command?
I was hoping this would create groups of 2 million rows in a partition and then create a new partition as new data comes in with a unique value for column "A".
Sorry if this was too wordy.
Thanks - JeffSpicoli

Yes, you need to specify the number of partitions (I assume the default was to create 1 partition). Partitioning by KEY uses an internal hashing function (http://dev.mysql.com/doc/refman/5.1/en/partitioning-key.html), so the partition is not selected from the column value directly but from a hash computed from it. Hashing functions return the same result for the same input, so yes, all rows having the same value will be in the same partition.
But you may want to partition by RANGE or LIST instead if you want to be able to DROP or TRUNCATE a partition. With KEY partitioning you only know that the rows are spread roughly evenly across the partitions, and many different values can end up in the same partition. For a char column such as "A", MySQL 5.5 provides the RANGE COLUMNS and LIST COLUMNS variants.
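A minimal sketch of both options, assuming MySQL 5.5. The partition count of 5 and the values 'G1' through 'G5' are hypothetical, since the question does not show what column "A" actually contains.

```sql
-- Option 1: KEY partitioning with an explicit partition count.
-- Rows are spread by a hash of A, so several A values may share a partition.
ALTER TABLE Apple
  PARTITION BY KEY (A)
  PARTITIONS 5;

-- Option 2: one partition per value of A, so a group can be emptied in one step.
-- Every value of A already present in the table must appear in some list,
-- otherwise the ALTER fails with error 1526.
ALTER TABLE Apple
  PARTITION BY LIST COLUMNS (A) (
    PARTITION p_g1 VALUES IN ('G1'),
    PARTITION p_g2 VALUES IN ('G2'),
    PARTITION p_g3 VALUES IN ('G3'),
    PARTITION p_g4 VALUES IN ('G4'),
    PARTITION p_g5 VALUES IN ('G5')
  );

-- Remove all rows of one group without a row-by-row DELETE.
ALTER TABLE Apple TRUNCATE PARTITION p_g1;

-- MySQL does not create partitions on its own; a new value of A needs
-- its partition added before rows with that value are inserted.
ALTER TABLE Apple ADD PARTITION (PARTITION p_g6 VALUES IN ('G6'));
```

Option 2 works here because "A" is part of the primary key (A, B, C), which MySQL requires of every partitioning column.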

Related

How does hash(row id + year) partition work?

I'm new to partitions. I didn't know they existed, but became aware of them when I tried to make our new 'url_hash' column unique in a table in our database and got this error message:
A UNIQUE INDEX must include all columns in the table's partitioning function
This database was created by another person whom I don't know and who is no longer involved in the project.
I have tried to read the MySQL documentation and forum posts about partitioning: what it is and how it works. I understand the purpose, which is to "divide" a table into several "parts" so that relevant data can be retrieved faster. A common example is to partition by year intervals. But most examples show a manual method, where you decide the boundaries yourself, for example "less than" three specific years:
PARTITION BY RANGE ( YEAR(separated) ) (
PARTITION p0 VALUES LESS THAN (1991),
PARTITION p1 VALUES LESS THAN (1996),
PARTITION p2 VALUES LESS THAN (2001),
PARTITION p3 VALUES LESS THAN MAXVALUE
);
But in our table, the partitions are created this way:
PARTITION BY HASH ( `feeditemsID` + YEAR(`feeddate`))
PARTITIONS 3;
What does that mean? How does our partition work?
feeditemsID is the unique ID for every row in our table.
When you use hash partitioning, the partition that holds each record is determined by evaluating the expression feeditemsID + YEAR(feeddate) and taking the result modulo the number of partitions. So if the expression evaluates to 123 for a row, MySQL calculates 123 % 3, which is 0, and the record goes into partition 0.
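As a worked example of that calculation, the query below reproduces the arithmetic by hand. The sample values (feeditemsID 121, feeddate '2013-05-01') and the table name feeditems are made up for illustration; the question does not give them.

```sql
-- 121 + YEAR('2013-05-01') = 121 + 2013 = 2134, and 2134 mod 3 = 1,
-- so a row with these values would be stored in the partition numbered 1.
SELECT MOD(121 + YEAR('2013-05-01'), 3) AS partition_number;

-- Inspect how rows are spread over the partitions
-- (TABLE_ROWS is only an estimate for InnoDB tables).
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'feeditems';
```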
This is explained in the MySQL documentation.
As stated there,
Note
If a table to be partitioned has a UNIQUE key, then any columns supplied as arguments to the HASH user function or to the KEY's column_list must be part of that key.
In your case, the table's primary key needs to be:
PRIMARY KEY (feeditemsID, feeddate)
Assuming feeditemsID is already unique (presumably it's an auto-increment column), adding feeddate to the primary key is redundant as far as keeping the data unique is concerned, but it's needed to satisfy the partitioning requirement. Putting feeditemsID first in the composite key will allow it to be used by itself to optimize table lookups.
This requirement is probably because each partition has its own index. When inserting/updating a row and checking for uniqueness, it only checks the index of the partition where that row will be stored. So when it finds the partition using the hash function, it needs to be sure that this partition will uniquely contain the indexed columns.
For more information see
Partitioning Keys, Primary Keys, and Unique Keys
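Applied to the 'url_hash' column from the question, the rule leaves roughly two choices. The following is only a sketch: the table name feeditems and the index name uq_url_hash are hypothetical, and it assumes url_hash is short enough to be indexed in full.

```sql
-- Choice 1: keep the partitioning. The unique key must then contain every
-- column used in the partitioning expression.
-- Caveat: this enforces uniqueness of the three-column combination only,
-- not of url_hash on its own.
ALTER TABLE feeditems
  ADD UNIQUE KEY uq_url_hash (url_hash, feeditemsID, feeddate);

-- Choice 2: give up partitioning so that url_hash alone can be unique.
-- REMOVE PARTITIONING rebuilds the table and keeps all rows.
ALTER TABLE feeditems REMOVE PARTITIONING;
ALTER TABLE feeditems
  ADD UNIQUE KEY uq_url_hash (url_hash);
```

The two choices are alternatives, not steps to run in sequence.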

How to partition a mysql table after the table is already created

I am trying to partition my table so that I can narrow down the records and accessing data won't take as long as it does now.
The table that I want to partition has 2 key fields:
(1) 'tigger_on', which is a datetime field that I use a lot as a lookup key.
(2) 'status', which has 3 values: 1=active, 2=completed, 0=purged.
I am not sure what the best way to partition this table is so that it will be faster to access in SELECT statements.
First, one question: when I partition a table, does this create a new table, so that I will have to alter my queries? Or is it something like an index, where it narrows down the search so that less data has to be looked up?
Second, how can I alter my existing table to add the partitions? Should I partition based on date range or on status, or can I do both?
I have never done partitioning before, so I am clueless about how it's done.
Note that this table has 5 million records and I have already added indexes, so I am looking for a solution beyond indexing at this point.

Partitioning a MySQL table based on a column value.

I want to partition a table in MySQL while preserving the table's structure.
I have a column, 'Year', based on which I want to split up the table into different tables for each year respectively. The new tables will have names like 'table_2012', 'table_2013' and so on. The resultant tables need to have all the fields exactly as in the source table.
I have tried the following two pieces of SQL script with no success:
1.
CREATE TABLE all_data_table
( column1 int default NULL,
column2 varchar(30) default NULL,
column3 date default NULL
) ENGINE=InnoDB
PARTITION BY RANGE ((year))
(
PARTITION p0 VALUES LESS THAN (2010),
PARTITION p1 VALUES LESS THAN (2011) , PARTITION p2 VALUES LESS THAN (2012) ,
PARTITION p3 VALUES LESS THAN (2013), PARTITION p4 VALUES LESS THAN MAXVALUE
);
2.
ALTER TABLE all_data_table PARTITION BY RANGE COLUMNS (`year`) (
PARTITION p0 VALUES LESS THAN (2011),
PARTITION p1 VALUES LESS THAN (2012),
PARTITION p2 VALUES LESS THAN (2013),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);
Any assistance would be appreciated!
This is old, but seeing as it comes up highly ranked in partitioning searches, I figured I'd give some additional details for people who might hit this page. What you are talking about in having a table_2012 and table_2013 is not "MySQL Partitioning" but "Manual Partitioning".
Partitioning means that you have one "logical table" with a single table name, which--behind the scenes--is divided among multiple files. When you have millions to billions of rows, over years, but typically you are only searching a single month, partitioning by Year/Month can have a great performance benefit because MySQL only has to search against the file that contains the Year/Month that you are searching for...so long as you include the partition key in your WHERE.
When you create multiple tables like table_2012 and table_2013, you are MANUALLY partitioning the tables, which you don't do with the MySQL PARTITION configuration. To manually partition the tables, during 2012, you put all data into the 2012 table. When you hit 2013, you start putting all the data into the 2013 table. You have to make sure to create the table before you hit 2013 or it won't have any place to go. Then, when you query across the years (e.g. from Nov 2012 - Jan 2013), you have to do a UNION between table_2012 and table_2013.
SELECT * FROM table_2012 WHERE #...
UNION
SELECT * FROM table_2013 WHERE #...
With partitioning, this manual work is not necessary. You do the initial setup of the partitions, then you treat it as a single table. No unions required, no checking the date before you insert, etc. This makes life much easier. MySQL handles figuring out which partitions it needs to query. However, you MUST make sure to filter on the Year column or it will have to scan ALL files. E.g. SELECT * FROM all_data_table WHERE Month=12 will scan all partitions for Month=12. To ensure you are only scanning the partition files that you need to scan, include the partition column in every query that you can.
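One way to check which partitions a query touches is EXPLAIN PARTITIONS, sketched below. It assumes all_data_table is partitioned by RANGE on a Year column and also has a Month column, as in the example above; neither column appears in the question's CREATE TABLE. The PARTITIONS keyword applies to MySQL 5.1 through 5.6; later versions show the partitions column in plain EXPLAIN.

```sql
-- Filters on the partitioning column: the "partitions" column of the output
-- should list only the partition that holds 2012.
EXPLAIN PARTITIONS
SELECT * FROM all_data_table WHERE Year = 2012;

-- No filter on the partitioning column: every partition is listed,
-- meaning all of them have to be scanned.
EXPLAIN PARTITIONS
SELECT * FROM all_data_table WHERE Month = 12;
```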
Possible negatives to partitioning...if you have billions of rows and you do an ALTER TABLE on the table to--say--add a column...it's going to have to update every row taking a VERY long time. At the company I currently work for, the boss doesn't think it's worth the time it takes to update the billion rows historically when we are adding a new column for going forward...so this is one of the reasons we do manual partitioning instead of letting MySQL do it.
DISCLAIMER: I am not an expert at partitioning...so if I'm wrong in any of this, please let me know and I'll fix the incorrect parts.
From what I can see, you want to create many tables from one big table.
I think you should try creating views instead.
From what I have read about partitioning, it actually splits the physical storage of the table and stores the parts separately, but seen from above it still appears as a single table.

Error #1526 when partitioning table on mysql

Sorry, I don't know English, but I need help :(
I'm partitioning by LIST COLUMNS using an ALTER TABLE statement.
My table:
table member_list:
id int,
name varchar(255),
company varchar(255),
cell_phone varchar(20)
It has no keys.
I currently have more than 900,000 records. After inserting them, I tried partitioning the table by LIST COLUMNS:
alter table member_list
partition by list columns(company)(
partition p1 values in ('Lavasoft','Cakewalk','Lycos'),
partition p2 values in ('Adobe','Vivoo','Apple Systems','Sibelius'),
partition p3 values in ('Finale','Borland','Macromedia','FPT'),
partition p4 values in ('Chami','Yahoo','Google','Altavista')
)
After running it:
#1526 - Table has no partition for value from column_list
MySQL returned this error, and I cannot find any help on the Oracle pages. I hope you can help me. Thanks
#1526 - Table has no partition for value from column_list
The error message is telling you that there is a value in your data in one of the columns you have chosen for partitioning that is not accounted for in your defined partitions.
In this case, there is something in the "company" field that cannot be placed into any of the partitions. For instance, on some record, company="Blackberry." MySQL cannot put this row into any of your partitions.
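A query along the following lines can list the offending values before you retry the ALTER TABLE. It is a sketch that simply repeats the fifteen company names from the partition definition in the question.

```sql
-- Values of company that no partition in the question's definition accepts.
-- NULL is checked separately because NOT IN never matches NULL, and LIST
-- partitioning rejects NULL unless some partition lists it explicitly.
SELECT company, COUNT(*) AS row_count
FROM member_list
WHERE company IS NULL
   OR company NOT IN (
        'Lavasoft', 'Cakewalk', 'Lycos',
        'Adobe', 'Vivoo', 'Apple Systems', 'Sibelius',
        'Finale', 'Borland', 'Macromedia', 'FPT',
        'Chami', 'Yahoo', 'Google', 'Altavista'
      )
GROUP BY company;
```

Each value returned needs to be added to one of the VALUES IN lists, or the rows need to be changed or removed, before the partitioning statement can succeed.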
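Plain LIST partitioning allows only integer values. For a VARCHAR column, use LIST COLUMNS (available from MySQL 5.5, as in the question) or KEY partitioning. Also, if the table has a primary or unique key, every column used for partitioning must be part of that key; a table with no keys, like this one, is not affected by that rule.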

Table partitioning using 2 columns

Is it possible to partition a table using 2 columns instead of only 1 for the partition function?
Consider a table with 3 columns
ID (int, primary key),
Date (datetime),
Num (int)
I want to partition this table by 2 columns: Date and Num.
This is what I do to partition a table using 1 column (date):
create PARTITION FUNCTION PFN_MonthRange (datetime)
AS
RANGE left FOR VALUES ('2009-11-30 23:59:59:997',
'2009-12-31 23:59:59:997',
'2010-01-31 23:59:59:997',
'2010-02-28 23:59:59:997',
'2010-03-31 23:59:59:997')
go
Bad News: The partition function has to be defined on a single column.
Good News: That single column could be a persisted computed column that is a combination of the two columns you're trying to partition by.
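A rough sketch of the computed-column idea in T-SQL follows. All object names, the way the two columns are combined, and the boundary values are hypothetical; it also assumes Num is never negative and that splitting on Num % 100 is acceptable.

```sql
-- Partition function over the combined integer key.
-- The key is built as YYYYMM followed by two digits taken from Num.
CREATE PARTITION FUNCTION PFN_DateNum (int)
AS RANGE RIGHT FOR VALUES (20091200, 20091250, 20100100, 20100150);

-- Map every partition to the PRIMARY filegroup to keep the sketch short.
CREATE PARTITION SCHEME PSC_DateNum
AS PARTITION PFN_DateNum ALL TO ([PRIMARY]);

CREATE TABLE dbo.Example (
    ID     int      NOT NULL,
    [Date] datetime NOT NULL,
    Num    int      NOT NULL,
    -- A computed column must be PERSISTED to serve as the partitioning column.
    PartKey AS (YEAR([Date]) * 10000 + MONTH([Date]) * 100 + Num % 100)
        PERSISTED NOT NULL,
    -- The partitioning column has to be part of the unique clustered key.
    CONSTRAINT PK_Example PRIMARY KEY CLUSTERED (ID, PartKey)
) ON PSC_DateNum (PartKey);
```

With this layout, queries need to filter on PartKey for partition elimination to apply.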
I found this was an easier solution
select ROW_NUMBER() over (partition by CHECKSUM(value,ID) order by SortOrder) as Row From your_table
Natively, no, you cannot partition by two columns in SQL Server.
There are a few things you could do. One is a lookup table that maps each value to an arbitrary integer (the partition number), but you only have 1000 partitions at most, so different values will start sharing partitions. The computed column approach suffers from the same problem: with a 1,000-partition limit, chances are you will exceed it.
I would probably just stick to a date partition, and range right on the 1st of the month, instead of ranging left on the last part of the month.
What do you intend to gain from the second partition value?