MySQL key partitioning - Uneven distribution of data

We are facing issues with uneven distribution of data across KEY partitions. The partition key is a UUID column. When we create 36 partitions, only 9 of them are populated, and when we create 100 partitions, only 25 are populated. What could be the cause of that? Is there some issue with MySQL KEY partitioning when working with UUIDs?
Also, is there a way to customize the MySQL KEY partitioning implementation so that we could use something like mod(crc32(),36)?
Thanks for your help.
Prikshat
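One possible workaround for the mod(crc32(),36) idea above, offered as a sketch rather than a confirmed fix. As far as I know, MySQL does not let you swap out the hash function used by KEY partitioning, and CRC32() is not among the functions permitted in a partitioning expression. An alternative is to compute the bucket yourself, store it in an ordinary column, and partition on that column. The table name events, its columns, and the sample payload are all hypothetical.

```sql
-- Hypothetical table: bucket holds MOD(CRC32(id), 36), computed at insert time.
CREATE TABLE events (
  id      CHAR(36)         NOT NULL,
  bucket  TINYINT UNSIGNED NOT NULL,
  payload VARCHAR(255),
  -- Every unique key must include all columns used in the partitioning expression.
  PRIMARY KEY (id, bucket)
) ENGINE=InnoDB
PARTITION BY HASH (bucket)
PARTITIONS 36;

-- Compute the bucket from the UUID before inserting.
SET @u = UUID();
INSERT INTO events (id, bucket, payload)
VALUES (@u, MOD(CRC32(@u), 36), 'example row');

-- Check how rows spread across partitions
-- (TABLE_ROWS is only an estimate for InnoDB).
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'events'
ORDER BY PARTITION_ORDINAL_POSITION;
```

The trade-off is that a lookup by id alone cannot be pruned to a single partition. The query would also need to supply the bucket, for example WHERE id = @u AND bucket = MOD(CRC32(@u), 36).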

Related

how to alter a mysql innodb partition to use another key?

I have a table with 5 hash(key_1) partitions. I want to change that, so it instead has 5 hash(key_2) partitions, but without losing data.
How do I do this? I have searched, but it's hard to find confirmation that I don't lose data by deleting partitions.
Deleting, truncating, or dropping partitions will definitely lose data. You can change the partitioning with ALTER TABLE instead, for example ALTER TABLE t PARTITION BY HASH (key_2) PARTITIONS 5. This won't lose data, but (at least with InnoDB) the table will be locked for writes and rebuilt with the new partitioning.
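A minimal sketch of the repartitioning described above, reusing the table name t and the column key_2 from the example. It assumes key_2 is an integer column and is part of every unique key on the table (including the primary key), which MySQL requires of partitioning columns. Comparing the row counts before and after is only a sanity check.

```sql
-- Row count before the change.
SELECT COUNT(*) FROM t;

-- Rebuild the table with the new partitioning; existing rows are redistributed.
ALTER TABLE t PARTITION BY HASH (key_2) PARTITIONS 5;

-- Row count after the change should match the first count.
SELECT COUNT(*) FROM t;

-- Confirm the new partitioning clause.
SHOW CREATE TABLE t;
```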

MySQL (InnoDB): GUID as Primary Key for a Distributed Database

I come from the MSSQL world and have no expert knowledge in MySQL.
Having a GUID as the primary key is possible in both of these RDBMSs. In MSSQL I have to take a few precautions so that I don't run into a performance nightmare as the row count increases (many millions of rows).
I create the primary key as a nonclustered index so that inserting a new row does not reshuffle the database pages. Otherwise the system would insert the row between existing rows, and to do that the hard drive has to find the right position of the page on disk. I then create a second column of a numeric type, this time with a clustered index. This guarantees that new rows are appended on insert.
Question
But how do I do this in MySQL? If my information is right, I cannot force MySQL to use a nonclustered primary key. Is this necessary, or does MySQL store the data in a manner that will not result in a performance disaster later?
Update: But why?
The reason I want to do this is that I want to be able to build a distributed database.
I ended up using sequential GUIDs as described in the CodeProject article "GUIDs as fast primary keys under multiple databases".
Great performance!
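In InnoDB the primary key is always the clustered index, so the MSSQL layout described above cannot be copied directly. One way to approximate it, sketched under assumptions, is to make an AUTO_INCREMENT column the primary key so rows are appended in insert order, and to keep the GUID in a separate unique secondary index. The table name customer and its columns are hypothetical, and this is not the sequential-GUID approach from the CodeProject article.

```sql
-- Hypothetical table: seq is the clustered (primary) key, guid is a secondary unique key.
CREATE TABLE customer (
  seq  BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  guid BINARY(16)      NOT NULL,
  name VARCHAR(100)    NOT NULL,
  PRIMARY KEY (seq),
  UNIQUE KEY uk_customer_guid (guid)
) ENGINE=InnoDB;

-- Store the GUID as 16 raw bytes instead of a 36-character string.
INSERT INTO customer (guid, name)
VALUES (UNHEX(REPLACE(UUID(), '-', '')), 'example');

-- Read it back in hexadecimal form.
SELECT seq, HEX(guid) AS guid_hex, name FROM customer;
```

In this sketch seq is only meaningful on the local server; the GUID remains the identifier shared across databases.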

reorder a column with phpMyAdmin using InnoDB storage engine does not work

Today I tried to reorder a column of a table using phpMyAdmin (as I have done many times before).
Although the result was displayed as successful no reordering effectively happened.
It appears the problem is caused by using InnoDB as storage engine which is the default value from MySQL 5.5 onward.
When I changed back to MyISAM the problem was solved. That also explained why it had been working on some tables.
Is this a solvable MySQL problem? Or is this the expected behavior for InnoDB?
In the latter case phpMyAdmin should perhaps be adapted so that it does not offer the functionality for InnoDB tables.
MySQL: 5.5.29
phpMyAdmin: 4.0.4
If by "reordering a column" you meant
ALTER TABLE ... ORDER BY ...
then it doesn't work for an InnoDB table that has a PRIMARY or UNIQUE KEY. That is by design:
ALTER TABLE
ORDER BY does not make sense for InnoDB tables that contain a
user-defined clustered index (PRIMARY KEY or NOT NULL UNIQUE index).
InnoDB always orders table rows according to such an index if one is
present.
On the other hand, if you don't have a PRIMARY or UNIQUE KEY in your table, which is highly unlikely, then MySQL will allow you to change the order.
Here is a SQLFiddle demo that demonstrates that behavior.
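The behavior described above can be reproduced with a small sketch like the following. The table names demo_no_key and demo_pk and their rows are made up for illustration.

```sql
-- No PRIMARY or UNIQUE key: ALTER TABLE ... ORDER BY can rearrange the rows.
CREATE TABLE demo_no_key (id INT, name VARCHAR(20)) ENGINE=InnoDB;
INSERT INTO demo_no_key VALUES (3, 'c'), (1, 'a'), (2, 'b');
ALTER TABLE demo_no_key ORDER BY id;

-- With a PRIMARY KEY, InnoDB keeps rows in primary key order,
-- so the ORDER BY clause has no effect.
CREATE TABLE demo_pk (id INT NOT NULL PRIMARY KEY, name VARCHAR(20)) ENGINE=InnoDB;
INSERT INTO demo_pk VALUES (3, 'c'), (1, 'a'), (2, 'b');
ALTER TABLE demo_pk ORDER BY name;
SHOW WARNINGS;

-- Row order is only guaranteed when the query itself asks for it.
SELECT * FROM demo_pk ORDER BY name;
```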

How to partition and subpartition MySQL by key?

I want to add partitions to my InnoDB table. I have tried to search for the syntax, but have not found specifics.
Is this syntax wrong?
ALTER TABLE Product PARTITION BY HASH(catetoryID1) PARTITIONS 6
SUBPARTITION BY KEY(catetoryID2) SUBPARTITIONS 10;
Does SUBPARTITIONS 10 mean each main partition has 10 subpartitions, or does it mean all main partitions have 10 subpartitions divided among them?
It's strange you didn't find the syntax. The MySQL online documentation has quite detailed syntax listed for most common operations.
Look here for overall syntax of the alter table to work with partitions:
http://dev.mysql.com/doc/refman/5.5/en/create-table.html
The syntax for partition management remains the same when used with the ALTER TABLE statement, with a few nuances that are listed on the ALTER TABLE syntax pages in the MySQL docs.
To answer your first question, the problem is not your syntax but rather that you are trying to subpartition a table that is partitioned by HASH. This is not allowed, at least in MySQL 5.5. Only RANGE or LIST partitions can be subpartitioned.
Look here for a complete list of partitioning types:
http://dev.mysql.com/doc/refman/5.5/en/partitioning-types.html
As for the second question, assuming what you were trying would work, you'd be creating 6 partitions hashed by catetoryID1, and then within these you'd have 10 sub-partitions hashed by catetoryID2. So you'd have in all
6 x 10 = 60 partitions
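For comparison, here is a sketch of a form that MySQL does accept: RANGE partitions, each subpartitioned by KEY. The column names catetoryID1 and catetoryID2 come from the question; the productID column, the range boundaries, and the partition names are invented for illustration. SUBPARTITIONS 10 applies to each partition, so this layout would give 3 x 10 = 30 subpartitions in total.

```sql
CREATE TABLE Product (
  productID   INT NOT NULL,
  catetoryID1 INT NOT NULL,
  catetoryID2 INT NOT NULL,
  -- The primary key includes every column used for partitioning and subpartitioning.
  PRIMARY KEY (productID, catetoryID1, catetoryID2)
) ENGINE=InnoDB
PARTITION BY RANGE (catetoryID1)
SUBPARTITION BY KEY (catetoryID2)
SUBPARTITIONS 10 (
  PARTITION p0 VALUES LESS THAN (100),
  PARTITION p1 VALUES LESS THAN (200),
  PARTITION p2 VALUES LESS THAN MAXVALUE
);
```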
Rules of thumb:
SUBPARTITION is useless. It provides no speed, and nothing else.
Due to various inefficiencies, don't have more than about 50 partitions.
PARTITION BY RANGE is the only useful one.
Often an INDEX can provide better performance than PARTITION; let's see your SELECT.
My blog on partitioning: http://mysql.rjweb.org/doc.php/partitionmaint

InnoDB Performance on primary index added/altering

So I have a huge update where I have to insert around 40 GB of data into an InnoDB table. It's taking quite a while, so I'm wondering which method would be the fastest (and more importantly why, as I could just do a split test).
Method 1)
a) Insert all rows
b) Run ALTER TABLE su_tmp_matches ADD PRIMARY KEY ( id )
Method 2)
a) ALTER TABLE su_tmp_matches ADD PRIMARY KEY ( id )
b) Insert all rows
Currently we are using method 1, but step b) seems to take a shitload of time. So I'm wondering whether the size has any implications here (40 GB - 5 million rows).
---- so I decided to test this as well ---
A pretty quick, brand new MySQL server: loads and loads of RAM (fast RAM), fast discs as well, and pretty well tuned (we handle more than 5000 requests per second on one of these):
1.6 million rows / 6 GB of data:
81 seconds to "delete" a primary index
550 seconds to "add" a primary index (after the data is added)
120 seconds to create a copy of the table with the primary index created BEFORE the data insert
80 seconds to create a copy of the table without the primary index (which then takes 550 seconds to create afterwards)
Seems pretty absurd. The question is whether the resulting indexes are the same thing.
From the documentation :
InnoDB does not have a special optimization for separate index
creation the way the MyISAM storage engine does. Therefore, it does
not pay to export and import the table and create indexes afterward.
The fastest way to alter a table to InnoDB is to do the inserts
directly to an InnoDB table.
It seems to me that adding the uniqueness constraint before the insert could only help the engine if the primary key column is an auto-incremented integer. But I really doubt there would be a notable difference.
A useful recommendation :
During the conversion of big tables, increase the size of the InnoDB
buffer pool to reduce disk I/O, to a maximum of 80% of physical
memory. You can also increase the sizes of the InnoDB log files.
EDIT: In my experience MySQL doesn't always perform the way the documentation suggests, so I think any benchmark you do on this would be interesting, even if it is not a definitive answer per se.
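A rough sketch of method 2 (primary key first, then insert), for anyone who wants to benchmark it. It assumes su_tmp_matches currently has no primary key and that id is unique; the target table name su_tmp_matches_new is hypothetical. Inserting in primary key order and committing once are general InnoDB bulk-load suggestions, not something measured in this thread, and for a 40 GB load it may be preferable to split the work into several smaller transactions.

```sql
-- Hypothetical target table with the primary key defined before loading.
CREATE TABLE su_tmp_matches_new LIKE su_tmp_matches;
ALTER TABLE su_tmp_matches_new ADD PRIMARY KEY (id);

-- Load in primary key order inside one transaction instead of
-- committing after every statement.
SET autocommit = 0;
INSERT INTO su_tmp_matches_new
SELECT * FROM su_tmp_matches ORDER BY id;
COMMIT;
SET autocommit = 1;

-- Check the current buffer pool size before deciding whether to raise it.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
```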