MySQL - Hash Partition not working - mysql

I am using MySQL 5.6 Server. I created a table with HASH partitioning, but somehow my query does not use specific partitions.
Table Structure
CREATE TABLE `testtable` (
`id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`purchased` DATE DEFAULT NULL,
KEY `ìd` (`id`),
KEY `Purchased` (`purchased`)
) ENGINE=INNODB
/*!50100 PARTITION BY RANGE ( YEAR(purchased))
SUBPARTITION BY HASH ( dayofyear(purchased))
SUBPARTITIONS 366
(PARTITION p0 VALUES LESS THAN (2015) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (2016) ENGINE = InnoDB) */
My Query
EXPLAIN PARTITIONS
SELECT *
FROM testtable
WHERE purchased BETWEEN '2014-12-29' AND '2014-12-31';
Check SQL FIDDLE Page
My EXPLAIN plan tells me that the server is using all partitions instead of specific partitions.
How can I write a query so that the server scans only specific partitions?
I also want to know what the problem is with my current query and why it is not working.
Thanks in advance...

True. HASH partitioning is essentially useless.
Other things to note...
Having more than about 50 partitions leads to certain inefficiencies.
If you will be purging "old" rows, then consider BY RANGE and have a month in each partition. Then do the purging via DROP PARTITION. More details, including sample code: http://mysql.rjweb.org/doc.php/partitionmaint
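The RANGE-by-month suggestion above could look roughly like the following sketch. The table name testtable_by_month and the partition names are made up for illustration, and purchased is declared NOT NULL here (unlike the question's table) so that it can be part of the primary key.

```sql
-- Hypothetical monthly RANGE layout; names and boundaries are illustrative only.
CREATE TABLE testtable_by_month (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
  purchased DATE NOT NULL,
  PRIMARY KEY (id, purchased),   -- the partitioning column must be in every unique key
  KEY purchased (purchased)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(purchased)) (
  PARTITION p2014_11 VALUES LESS THAN (TO_DAYS('2014-12-01')),
  PARTITION p2014_12 VALUES LESS THAN (TO_DAYS('2015-01-01')),
  PARTITION p2015_01 VALUES LESS THAN (TO_DAYS('2015-02-01')),
  PARTITION pfuture  VALUES LESS THAN MAXVALUE
);

-- Check which partitions the optimizer plans to read (MySQL 5.6 syntax).
EXPLAIN PARTITIONS
SELECT *
FROM testtable_by_month
WHERE purchased BETWEEN '2014-12-29' AND '2014-12-31';

-- Purge a whole month by dropping its partition instead of running a large DELETE.
ALTER TABLE testtable_by_month DROP PARTITION p2014_11;
```

Whether this layout actually helps the SELECT is something to verify with EXPLAIN on your own data; the main gain described above is the cheap purge.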

Related

How to partition a table by year and then subpartition by month in mysql 8

I have a table that contains a month and a year column.
I have a query which usually looks something like WHERE month=1 AND year=2022
Given how large this table is, I would like to make it more efficient using partitions and subpartitions.
table 1
Querying the data I need took around 2 minutes and 30 seconds.
CREATE TABLE `table_1` (
`id` int NOT NULL AUTO_INCREMENT,
`entity_id` varchar(36) NOT NULL,
`entity_type` varchar(36) NOT NULL,
`score` decimal(4,3) NOT NULL,
`month` int NOT NULL DEFAULT '0',
`year` int NOT NULL DEFAULT '0',
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_month_year` (`month`,`year`, `entity_type`)
)
Partitioning by "month"
Querying the data I need took around 21 seconds (a big improvement).
CREATE TABLE `table_1` (
`id` int NOT NULL AUTO_INCREMENT,
`entity_id` varchar(36) NOT NULL,
`entity_type` varchar(36) NOT NULL,
`score` decimal(4,3) NOT NULL,
`month` int NOT NULL DEFAULT '0',
`year` int NOT NULL DEFAULT '0',
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`,`month`),
KEY `idx_month_year` (`month`,`year`, `entity_type`)
) ENGINE=InnoDB AUTO_INCREMENT=21000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
/*!50100 PARTITION BY LIST (`month`)
(PARTITION p0 VALUES IN (0) ENGINE = InnoDB,
PARTITION p1 VALUES IN (1) ENGINE = InnoDB,
PARTITION p2 VALUES IN (2) ENGINE = InnoDB,
PARTITION p3 VALUES IN (3) ENGINE = InnoDB,
PARTITION p4 VALUES IN (4) ENGINE = InnoDB,
PARTITION p5 VALUES IN (5) ENGINE = InnoDB,
PARTITION p6 VALUES IN (6) ENGINE = InnoDB,
PARTITION p7 VALUES IN (7) ENGINE = InnoDB,
PARTITION p8 VALUES IN (8) ENGINE = InnoDB,
PARTITION p9 VALUES IN (9) ENGINE = InnoDB,
PARTITION p10 VALUES IN (10) ENGINE = InnoDB,
PARTITION p11 VALUES IN (11) ENGINE = InnoDB,
PARTITION p12 VALUES IN (12) ENGINE = InnoDB) */
I would like to see if I can improve the performance even further by partitioning by year and then subpartitioning by month. How can I do that?
I'm not sure whether the question "Partition by year and sub-partition by month mysql" is relevant: it has no marked answers and looks to be specific to MySQL 5.x and PHP. I'm asking about MySQL 8. Have there been any changes since then to partitioning, subpartitioning, LIST COLUMNS, RANGE COLUMNS, etc. that could help me?
Broader query I'm making
SELECT
table_1.entity_id AS entity_id,
table_1.entity_type,
table_1.score
FROM table_1
WHERE table_1.month = 12 AND table_1.year = 2022
AND table_1.score > 0
AND table_1.entity_type IN ('type1', 'type2', 'type3', 'type4') # only ever 4 types usually all 4 are present in the query
To answer your question directly, below is example syntax that accomplishes the subpartitioning. Notice the PRIMARY KEY must include all columns used for partitioning or subpartitioning. Read the manual on subpartitioning for more information: https://dev.mysql.com/doc/refman/8.0/en/partitioning-subpartitions.html
Schema (MySQL v8.0)
CREATE TABLE `table_1` (
`id` int NOT NULL AUTO_INCREMENT,
`entity_id` varchar(36) NOT NULL,
`entity_type` varchar(36) NOT NULL,
`score` decimal(4,3) NOT NULL,
`month` int NOT NULL DEFAULT '0',
`year` int NOT NULL DEFAULT '0',
`created_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`,`month`, `year`),
KEY `idx_month_year` (`month`,`year`, `score`, `entity_type`)
) ENGINE=InnoDB AUTO_INCREMENT=21000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
PARTITION BY LIST (`month`)
SUBPARTITION BY HASH(`year`)
SUBPARTITIONS 10 (
PARTITION p0 VALUES IN (0) ENGINE = InnoDB,
PARTITION p1 VALUES IN (1) ENGINE = InnoDB,
PARTITION p2 VALUES IN (2) ENGINE = InnoDB,
PARTITION p3 VALUES IN (3) ENGINE = InnoDB,
PARTITION p4 VALUES IN (4) ENGINE = InnoDB,
PARTITION p5 VALUES IN (5) ENGINE = InnoDB,
PARTITION p6 VALUES IN (6) ENGINE = InnoDB,
PARTITION p7 VALUES IN (7) ENGINE = InnoDB,
PARTITION p8 VALUES IN (8) ENGINE = InnoDB,
PARTITION p9 VALUES IN (9) ENGINE = InnoDB,
PARTITION p10 VALUES IN (10) ENGINE = InnoDB,
PARTITION p11 VALUES IN (11) ENGINE = InnoDB,
PARTITION p12 VALUES IN (12) ENGINE = InnoDB
);
Using EXPLAIN on your query reveals that the query references only one subpartition.
Query #1
EXPLAIN
SELECT
table_1.entity_id AS entity_id,
table_1.entity_type,
table_1.score
FROM table_1
WHERE table_1.month = 12
AND table_1.year = 2022
AND table_1.score > 0
AND table_1.entity_type IN ('type1', 'type2', 'type3', 'type4');
           id: 1
  select_type: SIMPLE
        table: table_1
   partitions: p12_p12sp2
         type: range
possible_keys: idx_month_year
          key: idx_month_year
      key_len: 11
          ref: NULL
         rows: 1
     filtered: 100
        Extra: Using index condition
The partitions field of the EXPLAIN shows that it accesses only subpartition p12_p12sp2. The year the query references, 2022, modulo the number of subpartitions, 10, is 2, so the query reads from subpartition 2.
In addition to the partitioning by month and year, it is also helpful to use an index. In this case, I added score to the index so it would filter out rows where score <= 0. The note in the EXPLAIN "Using index condition" shows that it is delegating further filtering on entity_type to the storage engine. Though in your example, you said there are only four values for entity type, and all four are selected, so that condition won't filter out any rows anyway.
View on DB Fiddle
Re your questions in comments below:
I'm a little bit confused about SUBPARTITIONS 10. Why 10?
It's just an example. You can choose a different number of subpartitions. Whatever you feel is required to reduce the search as much as you want.
To be honest, I've never encountered a situation that required subpartitioning at all, if the search is also optimized with indexes. So I have no guidance on what is an appropriate number of subpartitions.
It's your responsibility to test performance until you are satisfied.
I'm also a bit confused by the partition name p12_p12sp2. How do I know from looking at it that it selected the partition with year 2022?
The query has a condition year = 2022.
There are 10 subpartitions in my example.
Hash partitioning just uses the integer value to be partitioned, modulo the number of partitions.
2022 modulo 10 is 2. Hence the subpartition ending in ...sp2 is the one used.
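The arithmetic can be checked directly in SQL. The second statement is only a sketch: it assumes the default subpartition name seen in the EXPLAIN output (p12sp2) and uses explicit partition selection, which is available from MySQL 5.6 onward.

```sql
-- HASH(year) with 10 subpartitions: 2022 MOD 10 picks the subpartition number.
SELECT MOD(2022, 10) AS subpartition_number;

-- Read only from that subpartition of p12 (name taken from the EXPLAIN output above).
SELECT COUNT(*) FROM table_1 PARTITION (p12sp2);
```

Note that any other year ending in 2 (2012, 2032, ...) lands in the same subpartition, so the index still has to separate the years within it.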
I also came across this: anothermysqldba.blogspot.com/2014/12/… Do you know how yours differs from what is shown there (bear in mind that the blog is from 2014)?
They chose to name the subpartitions. There's no need to do that.
Would there be any performance difference in having a single date, e.g. (2022-12-21), instead of separate month and year columns?
That depends on the query, and I'll leave it to you to test. Any predictions I make won't be accurate with your data on your server.
I can also see that you partition by month and subpartition by year, as opposed to partitioning by year and subpartitioning by month. Can you explain the reasoning?
Subpartitioning works only if the outer partitions are LIST or RANGE partitions, and the subpartitions are HASH or KEY partitions. This is in the manual page I linked to.
There are a finite number of months (12). This makes it easy to partition by LIST as you did. You won't ever need more partitions. If you had partitioned by YEAR as the outer partition, you would have needed to specify year values in the list, and this is a growing set, so you would periodically have to alter the table to extend the list or range to account for new years.
Whereas when partitioning by HASH for the subpartitioning, the new year values are mapped into the finite set of subpartitions, so it's okay that it's not a finite list. You won't have to alter table to repartition (unless you want to change the number of subpartitions).
Splitting a date into columns is usually counterproductive. It is much easier to split during SELECT.
PARTITIONing is usually useless for performance of any SELECT.
When partitioning (or unpartitioning), the indexes usually need changing.
For that query, I recommend a combined date column,
WHERE date >= '2022-01-01'
AND date < '2022-01-01' + INTERVAL 1 MONTH
and some INDEX starting with date.
(You probably have other queries; let's see some of them; they may need a different index.)
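As a rough sketch of the combined-date suggestion, the layout below replaces the month and year columns with a single DATE column. The table name table_1_by_date, the column name score_date, and the index name idx_date are hypothetical, and the index shown is only one candidate to test.

```sql
-- Hypothetical unpartitioned layout with one DATE column instead of month/year.
CREATE TABLE table_1_by_date (
  id          INT NOT NULL AUTO_INCREMENT,
  entity_id   VARCHAR(36) NOT NULL,
  entity_type VARCHAR(36) NOT NULL,
  score       DECIMAL(4,3) NOT NULL,
  score_date  DATE NOT NULL,
  PRIMARY KEY (id),
  KEY idx_date (score_date, entity_type, score)  -- starts with the date, as suggested
) ENGINE=InnoDB;

-- The month is expressed as a half-open date range, so the index on score_date is usable.
SELECT entity_id, entity_type, score
FROM table_1_by_date
WHERE score_date >= '2022-12-01'
  AND score_date <  '2022-12-01' + INTERVAL 1 MONTH
  AND score > 0;
```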
Covering index -- This is an index that contains all the columns found anywhere in the SELECT. It may be faster than an index having only the columns needed for WHERE or WHERE + GROUP BY + ORDER BY. It depends on a lot of variables.
Order of columns in an index (or PK): The leftmost column(s) have priority. That is the order of the index rows on disk. PK(id, date) is useful if looking up by id (in the WHERE), but not if you are just searching by date.
Sargable -- Hiding a column in a function disables the use of an index. That is, MONTH(date) cannot use INDEX(date).
Blogs -- Index Cookbook and Partition
Test plan
I recommend you time all your queries against a variety of Create Tables.
For the WHERE clause:
The order of ANDs does not matter.
When using IN, a single value is equivalent to = and optimizes better. Multiple values may optimize more poorly. As Bill hints at, when the IN list contains all the options, you should eliminate the clause since the Optimizer is not smart enough. So, be sure to test with 1 and/or many items, so as to be realistic to your app.
For the table
Try Partition BY year + Subpartition by month.
Try Partition by a column that is the combination of year and month.
Try without partitioning.
For indexes
Order of the columns (in a composite index) does matter, so try different orderings.
When partitioning, be sure to tack onto the end of the PK the partition key(s).
A partitioned table needs different indexes than a non-partitioned table. That is, what works well for one may work poorly for the other.
Simply use something like this pattern to test various layouts:
CREATE TABLE test_table (( a new layout with or without partitioning and with indexes ));
INSERT INTO test_table SELECT ... FROM real_table;
-- Change the "..." to adapt to any extra/missing columns in test_table
SELECT ...
-- Run various 'real' queries
-- Run each query twice (caching sometimes messes with the timing)
Report the results -- If you provide sufficient info (CREATE TABLE and SELECT), I may have suggestions on further speeding up the test (whether it is partitioned or not).

Estimating How Long It Takes To Partition A Large Table

I'm trying to figure out how long it will take to partition a large table. I'm about 2 weeks into partitioning this table and don't have a good feeling for how much longer it will take. Is there any way to calculate how long this query might take?
The following is the query in question.
ALTER TABLE pIndexData REORGANIZE PARTITION pMAX INTO (
PARTITION p2022 VALUES LESS THAN (UNIX_TIMESTAMP('2023-01-01 00:00:00 UTC')),
PARTITION pMAX VALUES LESS THAN (MAXVALUE)
)
For context, the pIndexData table has about 6 billion records and the pMAX partition has roughly 2 billion records. This is an Amazon Aurora instance and the server is running MySQL 5.7.12. The DB Engine is InnoDB. The following is the table syntax.
CREATE TABLE `pIndexData` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`DateTime-UNIX` bigint(20) NOT NULL DEFAULT '0',
`pkl_PPLT_00-PIndex` int(11) NOT NULL DEFAULT '0',
`DataValue` decimal(14,4) NOT NULL DEFAULT '0.0000',
PRIMARY KEY (`pkl_PPLT_00-PIndex`,`DateTime-UNIX`),
KEY `id` (`id`),
KEY `DateTime` (`DateTime-UNIX`) USING BTREE,
KEY `pIndex` (`pkl_PPLT_00-PIndex`) USING BTREE,
KEY `DataIndex` (`DataValue`),
KEY `pIndex-Data` (`pkl_PPLT_00-PIndex`,`DataValue`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (`DateTime-UNIX`)
(PARTITION p2016 VALUES LESS THAN (1483246800) ENGINE = InnoDB,
PARTITION p2017 VALUES LESS THAN (1514782800) ENGINE = InnoDB,
PARTITION p2018 VALUES LESS THAN (1546318800) ENGINE = InnoDB,
PARTITION p2019 VALUES LESS THAN (1577854800) ENGINE = InnoDB,
PARTITION p2020 VALUES LESS THAN (1609477200) ENGINE = InnoDB,
PARTITION p2021 VALUES LESS THAN (1641013200) ENGINE = InnoDB,
PARTITION pMAX VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
In researching this question, I found that Performance Schema could provide the answer. However, Performance Schema is not enabled on this server and enabling it requires a reboot. Rebooting is not an option because doing so could corrupt the database while this query is processing.
To get some sense of how long this will take, I recreated the pIndexData table in a separate Aurora instance. I then imported a sample set of data (about 3 million records). The sample set had DateTime values spread out over 2021, 2022 and 2023, with the lion's share of the data in 2022. I then ran the same REORGANIZE PARTITION query and timed it. The partition query took 2 minutes, 29 seconds. If the time scaled linearly with the number of records, I estimate the query on the original table should take roughly 18 hours. It seems the relationship is not linear. Even with a large margin of error, this is way off. Clearly, there are factors (perhaps many) I'm missing.
I'm not sure what else to try other than to run the sample data test again with an even larger data sample. Before I do, I'm hoping someone might have some insight into how best to calculate how long this might take to finish.
Adding (or removing) partitioning will necessarily copy all the data over and rebuild all the tables. So, if your table is large enough to warrant partitioning (over 1M rows), it will take a noticeable amount of time.
When you REORGANIZE one (or a few) partitions (e.g., pMAX) "INTO ...", the time depends on how many rows are in pMAX.
What you should have done is to create the 2022 partition late in 2021, when pMAX was still empty.
I recommend you reorganize pMAX into 2022, 2023 and pMAX now. Again, the time is proportional to the size of pMAX. Then be sure to create 2024 in Dec 2023, when pMAX is still empty.
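Assuming the same boundary convention as the question's own ALTER statement (UNIX_TIMESTAMP evaluated in the session time zone), the reorganization recommended above might be written as follows. The partition names p2023 and p2024 are illustrative, and the boundaries should be checked against the offsets used by the existing partitions before running anything.

```sql
-- Split pMAX once into 2022, 2023 and a new pMAX that should be nearly empty.
ALTER TABLE pIndexData REORGANIZE PARTITION pMAX INTO (
  PARTITION p2022 VALUES LESS THAN (UNIX_TIMESTAMP('2023-01-01 00:00:00')),
  PARTITION p2023 VALUES LESS THAN (UNIX_TIMESTAMP('2024-01-01 00:00:00')),
  PARTITION pMAX  VALUES LESS THAN MAXVALUE
);

-- Late in 2023, while pMAX holds no rows yet, carve out the next year.
-- With an empty pMAX there is almost nothing to copy.
ALTER TABLE pIndexData REORGANIZE PARTITION pMAX INTO (
  PARTITION p2024 VALUES LESS THAN (UNIX_TIMESTAMP('2025-01-01 00:00:00')),
  PARTITION pMAX  VALUES LESS THAN MAXVALUE
);
```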
What is the advantage of partitioning by Year? Will you be purging old data eventually? (That may be the only advantage.)
As for your test -- was there nothing in the other partitions when you measured 2m29s? That test would be about correct. There may be a small burden in adding the 2021 index rows.
A side note: The following is unnecessary since there are 2 other indexes handling it:
KEY `pIndex` (`pkl_PPLT_00-PIndex`) USING BTREE,
However, I don't know if dropping it would be "instant".

MySQL partitioning and temporary tables

A large table (~10.5M rows) has been causing issues lately. I previously modified my application to use temporary tables for faster selects, but was still having issues due to UPDATE statements. Today I implemented partitions so that the writes happen more quickly, but now the statements that create my temporary tables fail with an error. The purpose of the UPDATE is to group events, placing the first event ID of a set in the EVENT_ID column. Example: writing 4 events beginning at 1000 would result in events 1000, 1001, 1002, 1003, all with an EVENT_ID of 1000. I have tried to do away with the UPDATE statements, but that would require too much refactoring, so it is not an option. Here is the table definition:
CREATE TABLE `all_events` (
`ID` bigint NOT NULL AUTO_INCREMENT,
`EVENT_ID` bigint unsigned DEFAULT NULL,
`LAST_UPDATE` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`EMPLOYEE_ID` int unsigned NOT NULL,
`QUANTITY` float unsigned NOT NULL,
`OPERATORS` float unsigned NOT NULL DEFAULT '0',
`SECSEARNED` decimal(10,2) unsigned NOT NULL DEFAULT '0.00' COMMENT 'for all parts in QUANTITY',
`SECSBURNED` decimal(10,2) unsigned NOT NULL DEFAULT '0.00',
`YR` smallint unsigned NOT NULL DEFAULT (year(curdate())),
PRIMARY KEY (`ID`,`YR`),
KEY `LAST_UPDATE` (`LAST_UPDATE`),
KEY `EMPLOYEE_ID` (`EMPLOYEE_ID`),
KEY `EVENT_ID` (`EVENT_ID`)
) ENGINE=InnoDB AUTO_INCREMENT=17464583 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
/*!50100 PARTITION BY RANGE (`YR`)
(PARTITION p2015 VALUES LESS THAN (2016) ENGINE = InnoDB,
PARTITION p2016 VALUES LESS THAN (2017) ENGINE = InnoDB,
PARTITION p2017 VALUES LESS THAN (2018) ENGINE = InnoDB,
PARTITION p2018 VALUES LESS THAN (2019) ENGINE = InnoDB,
PARTITION p2019 VALUES LESS THAN (2020) ENGINE = InnoDB,
PARTITION p2020 VALUES LESS THAN (2021) ENGINE = InnoDB,
PARTITION p2021 VALUES LESS THAN (2022) ENGINE = InnoDB,
PARTITION p2022 VALUES LESS THAN (2023) ENGINE = InnoDB,
PARTITION p2023 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
Now in my application when running a report the statement:
CREATE TEMPORARY TABLE IF NOT EXISTS ape ENGINE=MEMORY AS
SELECT * FROM all_events
WHERE LAST_UPDATE BETWEEN '2022-05-01 00:00:00' AND CURRENT_TIMESTAMP()
Produces the error: 'Specified storage engine' is not supported for default value expressions.
Is there a way to still use temporary tables with ENGINE=MEMORY, or is there another high performance engine I can use? The statement worked until the partitioning was implemented. InnoDB is the only engine my tables can be in due to the MySQL implementation, and it has been InnoDB since before partitioning.
Edit: When removing ENGINE=MEMORY it does work, but running SHOW CREATE TABLE tells me that it's using InnoDB. I would prefer the performance increase of MEMORY vs InnoDB.
Second Edit:
The MySQL server has been crashing 2 to 3 times daily, and every time I catch it I find this error:
TRANSACTION 795211228, ACTIVE 0 sec fetching rows
mysql tables in use 13, locked 13
LOCK WAIT 866 lock struct(s), heap size 106704, 4800 row lock(s), undo log entries 1
MySQL thread id 5032986, OS thread handle 140442167994112, query id 141216988 myserver 192.168.1.100 my-user Searching rows for update
UPDATE `all_events` SET `EVENT_ID`=LAST_INSERT_ID() WHERE `EVENT_ID` IS NULL
RECORD LOCKS space id 30558 page no 16 n bits 792 index EVENT_ID of table `mydb`.`all_events` trx id 795211228 lock_mode X
It's running Galera Cluster with 3 nodes. Node 3 is the main, becomes unavailable, and 1 comes offline to resync 3. I fail over to 2 and we're usually good until it catches up, but it's causing downtime. The temp tables I'm using are for faster reads, the partitioning is my attempt at improving write performance.
Third edit:
Added example SELECT - note there are fields not in the table definition, I reduced what was displayed for simplicity of the post, but all fields in the SELECT do in fact exist.
CREATE TEMPORARY TABLE IF NOT EXISTS allpe AS
SELECT * FROM all_events
WHERE LAST_UPDATE BETWEEN ? AND ?;
CREATE TEMPORARY TABLE IF NOT EXISTS ap1 AS SELECT * FROM allpe;
CREATE TEMPORARY TABLE IF NOT EXISTS ap2 AS SELECT * FROM allpe;
SELECT PART_NUMBER, WORKCENTER_NAME, SUM(SECSEARNED) AS EARNED, SUM(SECSBURNED) AS BURNED, SUM(QUANTITY) AS QUANTITY, (
SELECT SUM(ap1.SECSEARNED)
FROM ap1
WHERE ap1.PART_NUMBER = ape.PART_NUMBER AND ap1.WORKCENTER_ID = ape.WORKCENTER_ID
) AS EARNEDALL, (
SELECT SUM(ap2.SECSBURNED)
FROM ap2
WHERE ap2.PART_NUMBER = ape.PART_NUMBER AND ap2.WORKCENTER_ID = ape.WORKCENTER_ID
) AS BURNEDALL
FROM allpe ape
WHERE EMPLOYEE_ID = ?
GROUP BY PART_NUMBER, WORKCENTER_ID, WORKCENTER_NAME, EMPLOYEE_ID
ORDER BY EARNED;
DROP TEMPORARY TABLE allpe;
DROP TEMPORARY TABLE ap1;
DROP TEMPORARY TABLE ap2;
Fourth edit:
Writing inside of stored procedure - this is not in a loop, but multiple rows can come from multiple joins to employee_presence, so I cannot get the ID and store it for writing subsequent rows.
INSERT INTO `all_events`(`EVENT_ID`,`LAST_UPDATE`,`PART_NUMBER`, `WORKCENTER_ID`,`XPPS_WC`, `EMPLOYEE_ID`,`WORKCENTER_NAME`, `QUANTITY`, `LEVEL_PART_NUMBER`,`OPERATORS`,`SECSEARNED`,`SECSBURNED`)
SELECT NULL,NOW(),NEW.PART_NUMBER,NEW.ID,OLD.XPPS_WC,ep.EMPLOYEE_ID,NEW.NAME,(NEW.PARTS_MADE-OLD.PARTS_MADE)*WorkerContrib(ep.EMPLOYEE_ID,OLD.ID),IFNULL(NEW.LEVEL_PART_NUMBER,NEW.PART_NUMBER),WorkerCount(NEW.ID)*WorkerContrib(ep.EMPLOYEE_ID,OLD.ID),WorkerContrib(ep.EMPLOYEE_ID,OLD.ID)*CreditSeconds,WorkerCount(NEW.ID)*WorkerContrib(ep.EMPLOYEE_ID,OLD.ID)*IFNULL(TIMESTAMPDIFF(SECOND, GREATEST(NEW.LAST_PART_TIME,NEW.JOB_START_TIME), now()),0)
FROM employee_presence ep WHERE ep.WORKCENTER_ID=OLD.ID;
UPDATE `all_events` SET `EVENT_ID`=LAST_INSERT_ID() WHERE `WORKCENTER_ID`=NEW.ID AND `EVENT_ID` IS NULL;
I would suggest reading the following from dev.mysql.com:
You cannot use CREATE TEMPORARY TABLE ... LIKE to create an empty
table based on the definition of a table that resides in the mysql
tablespace, InnoDB system tablespace (innodb_system), or a general
tablespace. The tablespace definition for such a table includes a
TABLESPACE attribute that defines the tablespace where the table
resides, and the aforementioned tablespaces do not support temporary
tables. To create a temporary table based on the definition of such a
table, use this syntax instead:
CREATE TEMPORARY TABLE new_tbl SELECT * FROM orig_tbl LIMIT 0;
So it seems the correct syntax for your case will be:
CREATE TEMPORARY TABLE ape
SELECT * FROM all_events
WHERE...
In the current issue the problematic column is YR smallint unsigned NOT NULL DEFAULT (year(curdate())). This DEFAULT value is not legal for a column that is used in a partitioning expression. The error will be "Constant, random or timezone-dependent expressions in (sub)partitioning function are not allowed ...".
Only once you fix this by removing the partitioning will you receive the error "'Specified storage engine' is not supported for default value expressions".
CREATE TABLE ... SELECT inherits the main column properties from the source tables.
Here the problematic column is again YR smallint unsigned NOT NULL DEFAULT (year(curdate())). The column in the temporary table must inherit the main properties, including the DEFAULT expression, but this expression is not allowed for the MEMORY engine.
As the error suggests, the expression default does not work with the MEMORY storage engine.
One solution would be to remove that default from your all_events.yr column.
The other solution is to create an empty temporary table initially as an InnoDB table, then use ALTER TABLE to remove the expression default and convert to MEMORY engine before filling it with data.
Example:
mysql> create temporary table t as select * from all_events where false;
mysql> alter table t alter column yr drop default, engine=memory;
mysql> insert into t select * from all_events;
Sufficient? If I am not mistaken, this is equivalent to what your SELECT finds (no temp tables needed):
SELECT PART_NUMBER, WORKCENTER_ID, WORKCENTER_NAME, EMPLOYEE_ID,
SUM(SECSEARNED) AS TOT_EARNED,
SUM(SECSBURNED) AS TOT_BURNED,
SUM(QUANTITY) AS TOT_QUANTITY
FROM all_events
WHERE EMPLOYEE_ID = ?
AND LAST_UPDATE >= '2022-05-01'
GROUP BY PART_NUMBER, WORKCENTER_ID, WORKCENTER_NAME;
For performance, it would need this.
INDEX(EMPLOYEE_ID, LAST_UPDATE)
Also, removing the partitioning might speed it up a little more.
Otherwise (notes on other fixes to the path you have taken):
Since yr is not needed, avoid it by changing '*' to a list of needed columns in
CREATE TEMPORARY TABLE IF NOT EXISTS ape ENGINE=MEMORY AS
SELECT * FROM all_events
WHERE LAST_UPDATE BETWEEN '2022-05-01 00:00:00' AND CURRENT_TIMESTAMP()
For the subquery condition
WHERE ap2.PART_NUMBER = ape.PART_NUMBER AND ap2.WORKCENTER_ID = ape.WORKCENTER_ID
add this composite index to all_events:
INDEX(PART_NUMBER, WORKCENTER_ID)
That will probably suffice to make the query fast enough without the temp tables.
Also add that index to `allpe` after building it.
If you are running MySQL 8.0, you can use WITH instead of needing the two extra temp tables.
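A minimal sketch of the WITH approach on MySQL 8.0 follows. It keeps the question's report query and its placeholders as they are and only replaces the three temporary tables with one common table expression, which can be referenced several times in the same statement. It assumes, as the question states, that PART_NUMBER, WORKCENTER_ID and WORKCENTER_NAME exist in all_events.

```sql
-- One CTE stands in for allpe, ap1 and ap2.
WITH allpe AS (
  SELECT *
  FROM all_events
  WHERE LAST_UPDATE BETWEEN ? AND ?
)
SELECT ape.PART_NUMBER,
       ape.WORKCENTER_NAME,
       SUM(ape.SECSEARNED) AS EARNED,
       SUM(ape.SECSBURNED) AS BURNED,
       SUM(ape.QUANTITY)   AS QUANTITY,
       (SELECT SUM(ap1.SECSEARNED)
          FROM allpe ap1
         WHERE ap1.PART_NUMBER = ape.PART_NUMBER
           AND ap1.WORKCENTER_ID = ape.WORKCENTER_ID) AS EARNEDALL,
       (SELECT SUM(ap2.SECSBURNED)
          FROM allpe ap2
         WHERE ap2.PART_NUMBER = ape.PART_NUMBER
           AND ap2.WORKCENTER_ID = ape.WORKCENTER_ID) AS BURNEDALL
FROM allpe ape
WHERE ape.EMPLOYEE_ID = ?
GROUP BY ape.PART_NUMBER, ape.WORKCENTER_ID, ape.WORKCENTER_NAME, ape.EMPLOYEE_ID
ORDER BY EARNED;
```

Because no temporary table is created, the MEMORY engine and the expression default on YR never come into play. Whether it is fast enough depends on the indexes discussed above, so it is worth timing against the temp-table version.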

MySQL commenting out alter table partition statement

I have a table that is empty for now but will be loaded with hundreds of millions of records. Before I do this load, I want to create some partitions on the table to improve query performance and to enable better deletion later on (just truncate an entire partition).
The alter table code I am using is:
ALTER TABLE `TABLE_NAME`
PARTITION BY RANGE (YEAR(DATE_FIELD)) (
PARTITION y1 VALUES LESS THAN (2017),
PARTITION y2 VALUES LESS THAN (2018),
PARTITION y3 VALUES LESS THAN (2019),
PARTITION ymax VALUES LESS THAN (2050)
);
When I run the code in MySQL Workbench, it executes without any errors. However, when I inspect the table, the partitions do not show up in the list,
and in the auto-generated DDL, the partition clause is commented out:
CREATE TABLE `TABLE_NAME` (
`field1` decimal(5,2) DEFAULT NULL,
`field2` decimal(5,2) DEFAULT NULL,
`DATE_FIELD` date NOT NULL,
`field3` float DEFAULT NULL,
`field4` float DEFAULT NULL,
`field5` datetime DEFAULT NULL,
`field6` varchar(50) NOT NULL,
PRIMARY KEY (`field6`,`DATE_FIELD`),
KEY `dd_IDX1` (`DATE_FIELD`,`field1`,`field2`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (year(`DATE_FIELD`))
(PARTITION y1 VALUES LESS THAN (2017) ENGINE = InnoDB,
PARTITION y2 VALUES LESS THAN (2018) ENGINE = InnoDB,
PARTITION y3 VALUES LESS THAN (2019) ENGINE = InnoDB,
PARTITION ymax VALUES LESS THAN (2050) ENGINE = InnoDB) */
I cannot figure out why this would be. I loaded some fake records to see if the lack of data was causing the issue. I also tried commenting out the partitions and created a new table with no luck.
/*!50100 ... */ is a special type of comment. It says: "If the version is 5.1.0 or later, treat the text as real SQL; otherwise leave it as just a comment."
So, if you ran this on a 5.0 server, it would not have partitions. (5.0 did not have PARTITIONs implemented.) But 5.1 and later will.
You will see variations on this in mysqldump output.
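A small way to see the version-comment behavior for yourself is sketched below. The version number 99999 in the second statement is an arbitrary value chosen to be higher than any server you are likely to run.

```sql
-- On MySQL 5.1.0 or later the text inside /*!50100 ... */ is executed,
-- so this statement is evaluated as SELECT 1 + 1.
SELECT 1 /*!50100 + 1 */ AS result;

-- A server older than version 9.99.99 treats this one as an ordinary comment,
-- so the statement is evaluated as plain SELECT 1.
SELECT 1 /*!99999 + 1 */ AS result;
```

The PARTITION BY clause in the SHOW CREATE TABLE output is wrapped the same way, which is why it is active on your server even though it looks commented out.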
Meanwhile, you will probably find that you gain no performance by using PARTITIONing. What were you hoping for?
After more research, I am going to chalk up the fact that the partitions do not show up in the Partitions screen of the Table Inspector to a bug in the GUI. When you select Alter Table and look at the partitioning tab, they show up there. Additionally, when checking the PARTITIONS information table, the partitions show up there as well. See Rick James's answer to understand the comment syntax.
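For reference, the check against the PARTITIONS information table can be done with a query along these lines. It assumes the table lives in the currently selected schema and uses the question's placeholder table name.

```sql
-- List the partitions MySQL has actually registered for the table.
-- TABLE_ROWS is only an estimate for InnoDB tables.
SELECT PARTITION_NAME,
       PARTITION_METHOD,
       PARTITION_EXPRESSION,
       PARTITION_DESCRIPTION,
       TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'TABLE_NAME'
ORDER BY PARTITION_ORDINAL_POSITION;
```

An unpartitioned table returns a single row with PARTITION_NAME set to NULL, so seeing y1, y2, y3 and ymax here confirms the ALTER took effect.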

mysql drop partition does not work

I have created main partition 20170621 and 24 sub partitions
20170621_0 .. 20170621_23
Now I would like to delete the main partition. But I get an error.
alter table VAL90W02 drop PARTITION `20180621`
#1508 - Cannot remove all partitions, use DROP TABLE instead.
I can't drop subpartitions either. So, how do I drop the partition?
(from Comment)
create table mytable (
id int(11) NOT NULL AUTO_INCREMENT,
...,
x_date datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id, x_date)
) ENGINE = MYISAM
PARTITION BY RANGE (day(x_date))
SUBPARTITION BY HASH (hour(x_date))
( PARTITION 20180621 VALUES LESS THAN (24)
( SUBPARTITION 20180621_0 ENGINE = MyISAM,
SUBPARTITION 20180621_1 ENGINE = MyISAM, ...)
), ...;
Irritatingly, when deleting the last partition of a partitioned table, you have to use
ALTER TABLE VAL90W02 REMOVE PARTITIONING;
instead.
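One difference is worth keeping in mind: DROP PARTITION discards the rows stored in the partition, whereas REMOVE PARTITIONING keeps them and only turns the table back into an ordinary one. If the intent was to get rid of the data as well, a sketch like the following (using the table name from the question) covers both steps.

```sql
-- Turn the table back into an unpartitioned table; the rows are kept.
ALTER TABLE VAL90W02 REMOVE PARTITIONING;

-- Only if the rows themselves should go too, as DROP PARTITION would have done:
TRUNCATE TABLE VAL90W02;
```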
This is a misleading error thrown by MySQL (I'm using 5.7 Aurora, not sure which versions this affects).
Arguably, it's a failure of MySQL to handle the edge case on the part of the ALTER TABLE DROP PARTITION command.