I have a large table with about 40 partitions.
Each partition holds the data for a different area.
I found that some partitions have crashed, and I want to keep working on the other partitions while leaving the crashed partitions as they are.
So can I query the other partitions, using PARTITION in a SELECT statement, while some partitions are crashed?
I would appreciate it if somebody could help me. Thanks in advance.
You can, in a sense, restrict SELECT statements to certain partitions. Before MySQL 5.6 there is no syntax for naming a partition in a query, since partition access is determined by the partitioning limits; from MySQL 5.6 on, explicit selection with FROM tbl PARTITION (p0, p1) is available. In any version you can write your query so that it only retrieves data from specific partitions. For instance, if you have partitioned by date, a WHERE clause that only addresses specific dates will touch only the matching partitions.
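As a rough illustration of both approaches, here is a minimal sketch. The table `area_data`, its columns, and the partition names are hypothetical, chosen only to mirror the "one partition per area" layout described in the question. It does not show how MySQL behaves when a partition is actually crashed.

```sql
-- Hypothetical table: one LIST partition per area.
CREATE TABLE area_data (
  id      INT NOT NULL,
  area_id INT NOT NULL,
  reading DECIMAL(10,2),
  PRIMARY KEY (id, area_id)          -- the partitioning column must be part of every unique key
)
PARTITION BY LIST (area_id) (
  PARTITION p_north VALUES IN (1),
  PARTITION p_south VALUES IN (2),
  PARTITION p_east  VALUES IN (3)
);

-- Pruning through the WHERE clause: the partitions column should list only p_south.
-- (EXPLAIN PARTITIONS applies to 5.1 through 5.6; from 5.7 on, plain EXPLAIN shows the column.)
EXPLAIN PARTITIONS
SELECT * FROM area_data WHERE area_id = 2;

-- Explicit partition selection, available from MySQL 5.6 on.
SELECT * FROM area_data PARTITION (p_south);
```

Checking the EXPLAIN output first is a cheap way to confirm that a query stays away from the partitions you do not want to touch.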
I am facing a performance issue in MySQL due to the large index size on my table. The index has grown to 6GB and my instance is running on 32GB of memory. The majority of rows in that table are not required after a few hours and can be removed selectively. But removing them is time-consuming and doesn't reduce the index size.
Please suggest a way to manage this index.
You can optimize the table to rebuild its indexes and reclaim the space that deleting rows alone does not free:
optimize table table_name;
But because your table is bulky, it will be locked while OPTIMIZE TABLE runs, and you still have the problem of how to remove old data when you don't need anything older than a few hours. So you can proceed as follows:
Step 1: During night hours, or whenever there is little traffic on your database, rename the main table and create a new table with the original name. Then insert the last few hours of data from the old table into the new table.
This removes the unwanted data, and the new table will also be optimized.
Step 2: To avoid this issue in future, create a stored procedure that runs once per day during night hours and either deletes data up to the previous day (as per your requirement) from this table or moves it to a historical table.
Step 3: Since the table will then hold only a single day of data, you can easily run an OPTIMIZE TABLE statement on it to rebuild the indexes and reclaim space.
Note: a DELETE statement does not rebuild the index and does not free space on the server. For that you need to optimize the table, which can be done in various ways, for example with an ALTER TABLE statement or an OPTIMIZE TABLE statement.
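The steps above could be sketched roughly as follows. The table name `events_log`, the `created_at` column, the 6-hour window, and the event name are all assumptions for illustration. The sketch also swaps in two variations on the answer: a single RENAME TABLE that exchanges both names at once, and a scheduled event in place of a stored procedure called from outside.

```sql
-- Step 1: swap in an empty copy, then carry over only the recent rows.
CREATE TABLE events_log_new LIKE events_log;

RENAME TABLE events_log     TO events_log_old,
             events_log_new TO events_log;      -- both renames happen as one atomic operation

INSERT INTO events_log
SELECT * FROM events_log_old
WHERE created_at >= NOW() - INTERVAL 6 HOUR;    -- assumed retention window

-- Drop the old table only after verifying the new one.
-- DROP TABLE events_log_old;

-- Step 2: nightly purge using the event scheduler (it must be enabled).
SET GLOBAL event_scheduler = ON;

CREATE EVENT purge_events_log
  ON SCHEDULE EVERY 1 DAY
  STARTS CURRENT_DATE + INTERVAL 1 DAY + INTERVAL 2 HOUR   -- first run at 02:00 tomorrow
  DO
    DELETE FROM events_log WHERE created_at < CURDATE();

-- Step 3: with only about a day of data left, rebuilding is cheap.
OPTIMIZE TABLE events_log;
```

Rows written between the rename and the INSERT land in the new table, so the copy from `events_log_old` only has to cover data that existed before the swap.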
If you can remove all the rows older than X hours, then PARTITIONing is the way to go. PARTITION BY RANGE on the hour and use DROP PARTITION to remove an old hour and REORGANIZE PARTITION to create a new hour. You should have X+2 partitions. More details.
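A minimal sketch of that hourly rotation, assuming MySQL 5.5 or later (for RANGE COLUMNS) and a hypothetical table `recent_rows` with a `created_at` column; the partition names and dates are placeholders:

```sql
CREATE TABLE recent_rows (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  created_at DATETIME NOT NULL,
  payload    VARCHAR(255),
  PRIMARY KEY (id, created_at)       -- the partitioning column must be in every unique key
)
PARTITION BY RANGE COLUMNS (created_at) (
  PARTITION p2024010100 VALUES LESS THAN ('2024-01-01 01:00:00'),
  PARTITION p2024010101 VALUES LESS THAN ('2024-01-01 02:00:00'),
  PARTITION p2024010102 VALUES LESS THAN ('2024-01-01 03:00:00'),
  PARTITION pfuture     VALUES LESS THAN (MAXVALUE)
);

-- Purge the oldest hour: this drops the partition's data without a row-by-row DELETE.
ALTER TABLE recent_rows DROP PARTITION p2024010100;

-- Carve the next hour out of the catch-all partition.
ALTER TABLE recent_rows REORGANIZE PARTITION pfuture INTO (
  PARTITION p2024010103 VALUES LESS THAN ('2024-01-01 04:00:00'),
  PARTITION pfuture     VALUES LESS THAN (MAXVALUE)
);
```

Splitting `pfuture` while it is still empty keeps the REORGANIZE cheap, which is one reason to create the next hour ahead of time.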
If the deletes are more complex, please provide more details; perhaps we can come up with another solution that deals with the question about index size. Please include SHOW CREATE TABLE.
Even if you cannot use partitions for purging, it may be useful to have partitions for OPTIMIZE. Do not use OPTIMIZE PARTITION; it optimizes the entire table. Instead, use REORGANIZE PARTITION if you see you need to shrink the index.
How big is the table?
How big is innodb_buffer_pool_size?
(6GB index does not seem that bad, especially since you have 32GB of RAM.)
I have partitioned tables in MySQL 5.1.41 that hold a very large amount of data. Recently I deleted a lot of data, which left around 500 GB of fragmentation, yet there is still a lot of data in the partitions.
To reclaim that space to the OS, I had to de-fragment the partitions. I referred to MySQL documentation, https://dev.mysql.com/doc/refman/5.1/en/partitioning-maintenance.html which confused me with the following statements,
Rebuilding partitions: Rebuilds the partition; this has the same effect as dropping all records stored in the partition, then reinserting them. This can be useful for purposes of defragmentation.
Optimizing partitions: If you have deleted a large number of rows from a partition or if you have made many changes to a partitioned table with variable-length rows (that is, having VARCHAR, BLOB, or TEXT columns), you can use ALTER TABLE ... OPTIMIZE PARTITION to reclaim any unused space and to defragment the partition data file.
I tried both and observed that sometimes "rebuild" is faster and sometimes "optimize". Each partition I run these commands on holds from millions to sometimes billions of records. I'm aware of what MySQL does for each of the statements above.
Should the choice depend on the number of rows in the partition? If so, up to how many rows can I use "optimize", and from how many should I use "rebuild"?
Also, which is better to use?
MyISAM or InnoDB? (The answer will be different.)
For MyISAM, REBUILD/REORGANIZE/OPTIMIZE will take about the same effort per partition.
For InnoDB, OPTIMIZE PARTITION rebuilds all partitions. So, don't use this if you want to do the partitions one at a time. REORGANIZE PARTITION of the partition into an identical partition definition should act only on the one partition. I recommend that.
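A minimal sketch of reorganizing one partition into an identical definition. The table `readings`, its columns, and the partition boundaries are hypothetical; the point is only that the INTO clause repeats the existing name and limit, so the data is rewritten without changing the layout.

```sql
CREATE TABLE readings (
  id       INT NOT NULL,
  taken_on DATE NOT NULL,
  note     VARCHAR(100),
  PRIMARY KEY (id, taken_on)
) ENGINE = InnoDB
PARTITION BY RANGE (TO_DAYS(taken_on)) (
  PARTITION p2015 VALUES LESS THAN (TO_DAYS('2016-01-01')),
  PARTITION p2016 VALUES LESS THAN (TO_DAYS('2017-01-01')),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- Rewrite only p2015; the new definition matches the old one exactly.
ALTER TABLE readings REORGANIZE PARTITION p2015 INTO (
  PARTITION p2015 VALUES LESS THAN (TO_DAYS('2016-01-01'))
);
```

Running this one partition at a time keeps each operation bounded by the size of that partition rather than the whole table.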
It is generally not worth using partitioning unless you have at least a million rows. Also, BY RANGE is the only form that has any performance benefits that I have found.
Perhaps the main use of partitioning is with a time-series where you want to delete "old" data. PARTITION BY RANGE with weekly or monthly partitions lets you very efficiently DROP PARTITION rather than DELETE. More in my blog.
(My answer applies to all versions through 5.7, not just your antique 5.1.)
I have a database partitioned by range on to_days(created_at).
The partitions are monthly (p1 - p50) with a pmax catchall on the end. In the below example, I'm expecting only partition p45 to be hit.
When I run explain partitions select * from units where created_at > "2013-01-01 00:00:00" and NOW()
I get p1,p45 listed under the partitions column.
This happens in both 5.1 and 5.5.
Why is the optimizer including the first partition for an inequality check?
You asked this a long time ago, but I also ran into this issue and found a workaround here:
http://datacharmer.blogspot.com/2010/05/two-quick-performance-tips-with-mysql.html
... basically you should create a first partition that contains values less than (0), which will always be empty. The MySQL query optimizer will still include this first partition, but at the least it shouldn't be doing any resource-intensive scanning.
UPDATE: Here's a short summary of the URL linked in my original answer:
The official MySQL bugtracker acknowledges this behavior as a feature:
Bug Description:
Regardless of the range in the BETWEEN clause a table partitioned by RANGE using TO_DAYS function always includes the first partition in the table when pruning.
Response:
This is not a bug, since TO_DAYS() returns NULL for invalid dates, it needs to scan the first partition as well (since that holds all NULL values) for ranges.
...
A performance workaround is to create a specific partition to hold all NULL values (like '... LESS THAN (0)'), which also would catch all bad dates.
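The workaround might look like the sketch below. The table `units_demo` is a hypothetical stand-in for the `units` table in the question, with only three monthly partitions for brevity.

```sql
CREATE TABLE units_demo (
  id         INT NOT NULL,
  created_at DATETIME NOT NULL,
  PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (TO_DAYS(created_at)) (
  PARTITION pnull   VALUES LESS THAN (0),                      -- stays empty; absorbs NULL from TO_DAYS()
  PARTITION p201301 VALUES LESS THAN (TO_DAYS('2013-02-01')),
  PARTITION p201302 VALUES LESS THAN (TO_DAYS('2013-03-01')),
  PARTITION pmax    VALUES LESS THAN MAXVALUE
);

-- On affected versions the partitions column is expected to still list pnull
-- next to p201302, but scanning an empty partition costs very little.
EXPLAIN PARTITIONS
SELECT * FROM units_demo
WHERE created_at BETWEEN '2013-02-01 00:00:00' AND '2013-02-28 23:59:59';
```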
I want to add partitioning to my InnoDB table. I have tried to search for the syntax, but have not found specifics.
Is this syntax wrong? :
ALTER TABLE Product PARTITION BY HASH(catetoryID1) PARTITIONS 6
SUBPARTITION BY KEY(catetoryID2) SUBPARTITIONS 10;
Does SUBPARTITIONS 10 mean each main partition has 10 subpartitions, or does it mean all main partitions have 10 subpartitions divided among them?
It's strange you didn't find the syntax. The MySQL online documentation has quite detailed syntax listed for most common operations.
Look here for overall syntax of the alter table to work with partitions:
http://dev.mysql.com/doc/refman/5.5/en/create-table.html
The syntax for partition management would remain same even when used with the alter table statement, with a few nuances that are listed on the alter table syntax pages in the MySQL docs.
To answer your first question, the problem is not your syntax but rather that you are trying to sub-partition a table that is first partitioned by HASH. This is not allowed, at least in MySQL 5.5. Only RANGE or LIST partitions can be sub-partitioned.
Look here for a complete list of partitioning types:
http://dev.mysql.com/doc/refman/5.5/en/partitioning-types.html
As for the second question, assuming what you were trying would work, you'd be creating 6 partitions hashed by catetoryID1, and then within these you'd have 10 sub-partitions hashed by catetoryID2. So you'd have in all
6 x 10 = 60 partitions
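For comparison, here is a sketch of a layout that MySQL does accept: RANGE on the first column, sub-partitioned by KEY on the second. The table name `Product_demo`, the extra `productID` column, and the range boundaries are made up for illustration; the column names `catetoryID1` and `catetoryID2` are taken from the question as written.

```sql
CREATE TABLE Product_demo (
  productID   INT NOT NULL,
  catetoryID1 INT NOT NULL,
  catetoryID2 INT NOT NULL
)
PARTITION BY RANGE (catetoryID1)
SUBPARTITION BY KEY (catetoryID2)
SUBPARTITIONS 10 (
  PARTITION p0 VALUES LESS THAN (100),
  PARTITION p1 VALUES LESS THAN (200),
  PARTITION p2 VALUES LESS THAN (300),
  PARTITION p3 VALUES LESS THAN (400),
  PARTITION p4 VALUES LESS THAN (500),
  PARTITION p5 VALUES LESS THAN MAXVALUE
);

-- information_schema.PARTITIONS has one row per subpartition, so with
-- 6 partitions of 10 subpartitions each this count should come to 60.
SELECT COUNT(*) AS total_subpartitions
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'Product_demo';
```

SUBPARTITIONS 10 applies to each main partition separately; it is not a total to be divided among them.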
Rules of thumb:
SUBPARTITION is useless. It provides no speed, and nothing else.
Due to various inefficiencies, don't have more than about 50 partitions.
PARTITION BY RANGE is the only useful one.
Often an INDEX can provide better performance than PARTITION; let's see your SELECT.
My blog on partitioning: http://mysql.rjweb.org/doc.php/partitionmaint
One day I suspect I'll have to learn Hadoop and transfer all this data to a non-structured database, but I'm surprised to find the performance degrading so significantly in such a short period of time.
I have a mysql table with just under 6 million rows.
I am doing a very simple query on this table, and believe I have all the correct indexes in place.
the query is
SELECT date, time FROM events WHERE venid='47975' AND date>='2009-07-11' ORDER BY date
the explain returns
id  select_type  table        type   possible_keys  key       key_len  ref   rows    Extra
1   SIMPLE       updateshows  range  date_idx       date_idx  7        NULL  648997  Using where
So I am using the correct index as far as I can tell, but this query is taking 11 seconds to run.
The database is MyISAM, and phpMyAdmin says the table is 1.0GiB.
Any ideas here?
Edited:
The date_idx index covers both the date and venid columns. Should those be two separate indexes?
What you want to make sure of is that the query uses ONLY the index, so make sure that the index covers all the fields you are selecting. Also, since a range condition is involved, you need to have venid first in the index, because it is compared against a constant. I would therefore create an index like so:
ALTER TABLE events ADD INDEX indexNameHere (venid, date, time);
With this index, all the information that is needed to complete the query is in the index. This means that, hopefully, the storage engine is able to fetch the information without actually seeking inside the table itself. MyISAM can also answer a query from a covering index (EXPLAIN then shows "Using index"), but you might still not get the speed increase you desire. If that's the case, try creating a copy of the table that uses the InnoDB engine. Repeat the same steps there and see if you get a significant speed increase. InnoDB allows covering indexes as well.
Now, hopefully you'll see the following when you explain the query:
mysql> EXPLAIN SELECT date, time FROM events WHERE venid='47975' AND date>='2009-07-11' ORDER BY date;
id  select_type  table   type   possible_keys            key            [..]  Extra
1   SIMPLE       events  range  date_idx, indexNameHere  indexNameHere        Using index, Using where
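If you want to try the InnoDB comparison suggested above, it could be set up along these lines. The copy's name `events_innodb` and the index name `venid_date_time` are placeholders, and the sketch assumes the new index has not already been added to `events` (CREATE TABLE ... LIKE copies existing indexes).

```sql
-- Build an InnoDB copy of the table without touching the original.
CREATE TABLE events_innodb LIKE events;
ALTER TABLE events_innodb ENGINE = InnoDB;
INSERT INTO events_innodb SELECT * FROM events;

-- Same covering index as above: constant column first, then the range column.
ALTER TABLE events_innodb ADD INDEX venid_date_time (venid, `date`, `time`);

-- Compare the plan and the timing against the MyISAM original.
EXPLAIN
SELECT `date`, `time`
FROM events_innodb
WHERE venid = '47975' AND `date` >= '2009-07-11'
ORDER BY `date`;
```

Changing the engine while the copy is still empty avoids converting 6 million rows in place.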
Try adding a key that spans venid and date (or the other way around, or both...)
I would imagine that a 6M row table should be able to be optimised with quite normal techniques.
I assume that you have a dedicated database server, and it has a sensible amount of ram (say 8G minimum).
You will want to ensure you've tuned MySQL to use your RAM efficiently. If you're running a 32-bit OS, don't. If you are using MyISAM, tune your key buffer to use a significant proportion, but not too much, of your RAM.
In any case you want to run repeated performance testing on production-grade hardware.
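As a starting point for the key buffer tuning mentioned above, something like the following could be used. The 2 GB figure is only an assumed example for a dedicated server with 8 GB of RAM, not a recommendation.

```sql
-- Current MyISAM key cache size, in bytes.
SHOW VARIABLES LIKE 'key_buffer_size';

-- Key_reads (reads from disk) versus Key_read_requests (all index block reads)
-- gives a rough idea of how often the cache misses.
SHOW GLOBAL STATUS LIKE 'Key_read%';

-- Assumed example value: 2 GB. Set it in my.cnf as well so it survives a restart.
SET GLOBAL key_buffer_size = 2 * 1024 * 1024 * 1024;
```

Re-check the Key_read counters after the change, under a realistic load, before settling on a value.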
Try putting an index on the venid column.