partitioning by data mysql - mysql

I have table 'items'. 18 mln records:
CREATE TABLE IF NOT EXISTS `items` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`log_id` int(11) NOT NULL,
`res_id` int(11) NOT NULL,
`link` varchar(255) NOT NULL,
`title` text NOT NULL,
`content` text NOT NULL,
`n_date` varchar(255) NOT NULL,
`nd_date` int(11) NOT NULL,
`s_date` int(11) NOT NULL,
`not_date` date NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `link_2` (`link`),
KEY `log_id` (`log_id`),
KEY `res_id` (`res_id`),
KEY `now_date` (`not_date`),
KEY `sql_index` (`res_id`,`id`,`not_date`)
) ENGINE=Aria DEFAULT CHARSET=utf8 PAGE_CHECKSUM=0 AUTO_INCREMENT=18382133 ;
Trying to partition this table I created a mini copy of it and include column 'not_date' in primary and uniq keys:
CREATE TABLE IF NOT EXISTS `part_items` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`log_id` int(11) NOT NULL,
`res_id` int(11) NOT NULL,
`link` varchar(255) NOT NULL,
`title` text NOT NULL,
`content` text NOT NULL,
`n_date` varchar(255) NOT NULL,
`nd_date` int(11) NOT NULL,
`s_date` int(11) NOT NULL,
`not_date` date NOT NULL,
PRIMARY KEY (`not_date`,`id`),
UNIQUE KEY `link_2` (`not_date`,`link`),
KEY `log_id` (`log_id`),
KEY `res_id` (`res_id`),
KEY `now_date` (`not_date`),
KEY `sql_index` (`res_id`,`id`,`not_date`)
) ENGINE=Aria DEFAULT CHARSET=utf8 PAGE_CHECKSUM=0
/*!50100 PARTITION BY RANGE ( TO_DAYS(not_date))
(PARTITION p_1 VALUES LESS THAN (735963) ENGINE = Aria,
PARTITION p_2 VALUES LESS THAN (736694) ENGINE = Aria) */ AUTO_INCREMENT=18414661 ;
Then I run sql_query:
alter table `part_items` PARTITION BY RANGE( TO_DAYS(not_date) ) (
PARTITION p_1 VALUES LESS THAN( TO_DAYS('2014-12-31') ),
PARTITION p_2 VALUES LESS THAN( TO_DAYS('2016-12-31') )
);
Then I trying to select records that must de in p_1 and explain partitions show me that searching was only in p_1. But when I select records that must be in p_2 explain partitions show full-scan(p_1,p_2).
What wrong in my code?
Queries:
explain partitions SELECT * FROM `part_items` where content like '%k%' and not_date < '2014-05-12'
explain partitions SELECT * FROM `part_items` where content like '%k%' and not_date > '2015-01-01'
And one more question: Is it possible to partitioning views?

When PARTITIONing by some date function, there is a chance of an invalid date being provided. That would lead to NULL; such values are stored in the first partition.
This is an issue that has tripped up many a developer. The typical 'workaround' is to have the first partition empty so that the effort of looking in it is (usually) minimal. In your case:
PARTITION p_0 VALUES LESS THAN(0)
Partitioning on less than about 6 partitions is usually not worth the effort; will you be adding more partitions?
(Caveat: My advice comes from years of MyISAM/InnoDB partitioning; I don't know of Aria works differently. I suspect that partitioning is mostly handled at then Engine-independent layer.)

Related

Partitioning large table by dates

I have implemented custom url shortener in my app and I have one table for that. table structure looks like this:
CREATE TABLE `urls` (
`id` int(11) NOT NULL,
`url_id` varchar(10) DEFAULT NULL,
`long_url` varchar(255) DEFAULT NULL,
`clicked` mediumint(5) NOT NULL DEFAULT 0,
`user_id` varchar(7) DEFAULT NULL,
`type` varchar(15) DEFAULT NULL,
`ad_id` int(11) DEFAULT NULL,
`campaign` int(11) DEFAULT,
`increment` tinyint(1) NOT NULL DEFAULT 0,
`date` date DEFAULT NULL,
`del` enum('1','0') NOT NULL DEFAULT '0'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT
ALTER TABLE `urls`
ADD PRIMARY KEY (`id`),
ADD KEY `url_id` (`url_id`),
ADD KEY `type` (`type`),
ADD KEY `campaign` (`campaign`),
ADD KEY `ad_id` (`ad_id`),
ADD KEY `date` (`date`),
ADD KEY `user_id` (`user_id`);
The table now has 20.000.000 records and currently growing by 300k-400k records per day.
url_id column is unique varchar(10) and url looks like that: http://example.com/asdfghjklu
Now i have partitioned this table into 10 partitions by HASH(id):
PARTITION BY HASH (`id`)
PARTITIONS 10;
When I try to generate reports and join this table on others query is getting really slow, so slow even can't get 1 week report.
When I try to make big query in this table I filter almost every query with dates and I think it will be much better if I partition this table by date column.
Is it good idea?
As I read if I want to partition this table by date I need to add date in composite primary key: PRIMARY KEY(id, date)
What do you think about this? How do I improve my query performance?
I wold recommend use hash partition using date or month or YEAR
CREATE TABLE `urls` (
`id` int(11) NOT NULL,
`url_id` varchar(10) DEFAULT NULL,
`long_url` varchar(255) DEFAULT NULL,
`clicked` mediumint(5) NOT NULL DEFAULT 0,
`user_id` varchar(7) DEFAULT NULL,
`type` varchar(15) DEFAULT NULL,
`ad_id` int(11) DEFAULT NULL,
`campaign` int(11) DEFAULT,
`increment` tinyint(1) NOT NULL DEFAULT 0,
`date` date DEFAULT NULL,
`del` enum('1','0') NOT NULL DEFAULT '0',
PartitionsID int(4) unsigned NOT NULL,
KEY PartitionsID (PartitionsID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY HASH (PartitionsID)
PARTITIONS 366;
IN PARTITION ID you just need to insert TO_DAYS(date) so you have only one value for entire day .
SOURCE
and it will make easy for partition for each day or you can do with month wise also depending on your data size .
for select
you can use below query as example
SELECT *
FROM TT ACT
WHERE ACT.CustomerID = vCustomerID
AND ACT.TransactionTime BETWEEN vInvoiceEndDate AND vPaymentDueDate
AND ACT.TrxnInfoTypeID IN (19, 23)
AND ACT.PaymentType = '1'
AND ACT.PartitionsID BETWEEN TO_DAYS(vInvoiceEndDate) AND TO_DAYS(vPaymentDueDate);

Mysql Partitioning Query Performance

i have created partitions on pricing table. below is the alter statement.
ALTER TABLE `price_tbl`
PARTITION BY HASH(man_code)
PARTITIONS 87;
one partition consists of 435510 records. total records in price_tbl is 6 million.
EXPLAIN query showing only one partion is used for the query . Still the query takes 3-4 sec to execute. below is the query
EXPLAIN SELECT vrimg.image_cap_id,vm.man_name,vr.range_code,vr.range_name,vr.range_url, MIN(`finance_rental`) AS from_price, vd.der_id AS vehicle_id FROM `range_tbl` vr
LEFT JOIN `image_tbl` vrimg ON vr.man_code = vrimg.man_code AND vr.type_id = vrimg.type_id AND vr.range_code = vrimg.range_code
LEFT JOIN `manufacturer_tbl` vm ON vr.man_code = vm.man_code AND vr.type_id = vm.type_id
LEFT JOIN `derivative_tbl` vd ON vd.man_code=vm.man_code AND vd.type_id = vr.type_id AND vd.range_code=vr.range_code
LEFT JOIN `price_tbl` vp ON vp.vehicle_id = vd.der_id AND vd.type_id = vp.type_id AND vp.product_type_id=1 AND vp.maintenance_flag='N' AND vp.man_code=164
AND vp.initial_rentals_id =(SELECT rental_id FROM `rentals_tbl` WHERE rental_months='9')
AND vp.annual_mileage_id =(SELECT annual_mileage_id FROM `mileage_tbl` WHERE annual_mileage='8000')
WHERE vr.type_id = 1 AND vm.man_url = 'audi' AND vd.type_id IS NOT NULL GROUP BY vd.der_id
Result of EXPLAIN.
Same query without partitioning takes 3-4 sec.
Query with partitioning takes 2-3 sec.
how we can increase query performance as it is too slow yet.
attached create table structure.
price table - This consists 6 million records
CREATE TABLE `price_tbl` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`lender_id` bigint(20) DEFAULT NULL,
`type_id` bigint(20) NOT NULL,
`man_code` bigint(20) NOT NULL,
`vehicle_id` bigint(20) DEFAULT NULL,
`product_type_id` bigint(20) DEFAULT NULL,
`initial_rentals_id` bigint(20) DEFAULT NULL,
`term_id` bigint(20) DEFAULT NULL,
`annual_mileage_id` bigint(20) DEFAULT NULL,
`ref` varchar(255) DEFAULT NULL,
`maintenance_flag` enum('Y','N') DEFAULT NULL,
`finance_rental` decimal(20,2) DEFAULT NULL,
`monthly_rental` decimal(20,2) DEFAULT NULL,
`maintenance_payment` decimal(20,2) DEFAULT NULL,
`initial_payment` decimal(20,2) DEFAULT NULL,
`doc_fee` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`,`type_id`,`man_code`),
KEY `type_id` (`type_id`),
KEY `vehicle_id` (`vehicle_id`),
KEY `term_id` (`term_id`),
KEY `product_type_id` (`product_type_id`),
KEY `finance_rental` (`finance_rental`),
KEY `type_id_2` (`type_id`,`vehicle_id`),
KEY `maintenanace_idx` (`maintenance_flag`),
KEY `lender_idx` (`lender_id`),
KEY `initial_idx` (`initial_rentals_id`),
KEY `man_code_idx` (`man_code`)
) ENGINE=InnoDB AUTO_INCREMENT=5830708 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY HASH (man_code)
PARTITIONS 87 */
derivative table - This consists 18k records.
CREATE TABLE `derivative_tbl` (
`type_id` bigint(20) DEFAULT NULL,
`der_cap_code` varchar(20) DEFAULT NULL,
`der_id` bigint(20) DEFAULT NULL,
`body_style_id` bigint(20) DEFAULT NULL,
`fuel_type_id` bigint(20) DEFAULT NULL,
`trans_id` bigint(20) DEFAULT NULL,
`man_code` bigint(20) DEFAULT NULL,
`range_code` bigint(20) DEFAULT NULL,
`model_code` bigint(20) DEFAULT NULL,
`der_name` varchar(255) DEFAULT NULL,
`der_url` varchar(255) DEFAULT NULL,
`der_intro_year` date DEFAULT NULL,
`der_disc_year` date DEFAULT NULL,
`der_last_spec_date` date DEFAULT NULL,
KEY `der_id` (`der_id`),
KEY `type_id` (`type_id`),
KEY `man_code` (`man_code`),
KEY `range_code` (`range_code`),
KEY `model_code` (`model_code`),
KEY `body_idx` (`body_style_id`),
KEY `capcodeidx` (`der_cap_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
range table - This consists 1k records
CREATE TABLE `range_tbl` (
`type_id` bigint(20) DEFAULT NULL,
`man_code` bigint(20) DEFAULT NULL,
`range_code` bigint(20) DEFAULT NULL,
`range_name` varchar(255) DEFAULT NULL,
`range_url` varchar(255) DEFAULT NULL,
KEY `range_code` (`range_code`),
KEY `type_id` (`type_id`),
KEY `man_code` (`man_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY HASH is essentially useless if you are hoping for improved performance. BY RANGE is useful in a few use cases_.
In most situations, improvements in indexes are as good as trying to use partitioning.
Some likely problems:
No explicit PRIMARY KEY for InnoDB tables. Add a natural PK, if applicable, else an AUTO_INCREMENT.
No "composite" indexes -- they often provide a performance boost. Example: The LEFT JOIN between vr and vrimg involves 3 columns; a composite index on those 3 columns in the 'right' table will probably help performance.
Blind use of BIGINT when smaller datatypes would work. (This is an I/O issue when the table is big.)
Blind use of 255 in VARCHAR.
Consider whether most of the columns should be NOT NULL.
That query may be a victim of the "explode-implode" syndrome. This is where you do JOIN(s), which create a big intermediate table, followed by a GROUP BY to bring the row-count back down.
Don't use LEFT unless the 'right' table really is optional. (I see LEFT JOIN vd ... vd.type_id IS NOT NULL.)
Don't normalize "continuous" values (annual_mileage and rental_months). It is not really beneficial for "=" tests, and it severely hurts performance for "range" tests.
Same query without partitioning takes 3-4 sec. Query with partitioning takes 2-3 sec.
The indexes almost always need changing when switching between partitioning and non-partitioning. With the optimal indexes for each case, I predict that performance will be close to the same.
Indexes
These should help performance whether or not it is partitioned:
vm: (man_url)
vr: (man_code, type_id) -- either order
vd: (man_code, type_id, range_code, der_id)
-- `der_id` 4th, else in any order (covering)
vrimg: (man_code, type_id, range_code, image_cap_id)
-- `image_cap_id` 4th, else in any order (covering)
vp: (type_id, der_id, product_type_id, maintenance_flag,
initial_rentals, annual_mileage, man_code)
-- any order (covering)
A "covering" index is an extra boost, in that it can do all the work just in the index's BTree, without touching the data's BTree.
Implement a bunch of what I recommend, then come back (in another Question) for further tweaking.
Usually the "partition key" should be last in a composite index.

MySQL | Error - A UNIQUE INDEX must include all columns in the table's partitioning function

I'm having a problem with partition on a table in MySQL. I want to partition table reminders depend on quarter of year, because I think I can delete list of reminders was expired later.
Maybe, I should add field in primary key or unique key to partition work. Please help me this problem.
CREATE TABLE `reminders`
(
`id` bigint(20) NOT NULL,
`mer_reminder_id` varchar(255) UNIQUE NOT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT NULL,
`deleted_at` timestamp NULL DEFAULT NULL,
`type` varchar(32) DEFAULT NULL,
`priority` int(11) DEFAULT NULL,
`due` timestamp NULL DEFAULT NULL,
`title` varchar(255) DEFAULT NULL,
`description` varchar(255) DEFAULT NULL,
`template` varchar(32) DEFAULT NULL,
`action_id` bigint(20) NOT NULL,
`merchant_id` bigint(20) NOT NULL,
`expired` bool NULL DEFAULT NULL,
PRIMARY KEY (`id`, `created_at`)
) ENGINE = InnoDB
DEFAULT CHARSET = utf8mb4
PARTITION BY RANGE ( UNIX_TIMESTAMP(created_at) ) (
PARTITION zpm_reminder_reminders_q1_2019 VALUES LESS THAN ( UNIX_TIMESTAMP('2019-04-01 00:00:00') ),
PARTITION zpm_reminder_reminders_q2_2019 VALUES LESS THAN ( UNIX_TIMESTAMP('2019-07-01 00:00:00') ),
PARTITION zpm_reminder_reminders_q3_2019 VALUES LESS THAN ( UNIX_TIMESTAMP('2019-10-01 00:00:00') ),
PARTITION zpm_reminder_reminders_q4_2019 VALUES LESS THAN ( UNIX_TIMESTAMP('2020-01-01 00:00:00') ),
PARTITION zpm_reminder_reminders_q1_2020 VALUES LESS THAN ( UNIX_TIMESTAMP('2020-04-01 00:00:00') ),
PARTITION zpm_reminder_reminders_q2_2020 VALUES LESS THAN ( UNIX_TIMESTAMP('2020-07-01 00:00:00') ),
PARTITION zpm_reminder_reminders_q3_2020 VALUES LESS THAN ( UNIX_TIMESTAMP('2020-10-01 00:00:00') ),
PARTITION zpm_reminder_reminders_q4_2020 VALUES LESS THAN ( UNIX_TIMESTAMP('2021-01-01 00:00:00') )
);
Problem: #1503 - A UNIQUE INDEX must include all columns in the table's partitioning function
The error message is pretty self-explanatory: if your table is partitioned and has a unique key, every column used in the unique key must also be used as part of the partition. You are using UNIX_TIMESTAMP(created_at) for partitioning, but you have a unique key on mer_reminder_id, so this configuration isn't valid.
My advice: don't use MySQL partitions for this. They are extremely limiting, and are often buggy as well. If you need to discard old rows in this table, use DELETE with a condition -- unless your table is extremely large, you'll spend much more time maintaining the partitions than you will save on faster delete operations.
(Unrelatedly, I see a potential design issue in your table -- the primary key in your table should probably be id, not (id, created_at). Timestamps almost never belong in primary keys.)

cannot partition this mysql table

I have an innoDB table named "transaction" with ~1.5 million rows. I would like to partition this table (probably on column "gas_station_id" since it is used a lot in join queries) but I've read in MySQL 5.7 Reference Manual that
All columns used in the table's partitioning expression must be part of every unique key that the table may have, including any primary key.
I have two questions:
The column "gas_station_id" is not part of unique key or primary key. How could I partition this table then?
even if I could partition this table, I am not sure which partitioning type would be better in this case? (I was thinking about LIST partitioning (we have about 40 different(distinct) gas stations) but I am not sure since there will be only one value in each list partition like the following :
ALTER TABLE transaction
PARTITION BY LIST(gas_station_id)
( PARTITION p1 VALUES IN (9001),
PARTITION p2 VALUES IN (9002),.....)
I tried partitioning by KEY, but I receive the following error (I think because id is not part of all unique keys..):
#1053 - a UNIQUE INDEX must include all columns in the table's partitioning function
This is the structure of the "transaction" table:
EDIT
and this is what SHOW CREATE TABLE shows:
CREATE TABLE `transaction` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`terminal_transaction_id` int(11) NOT NULL,
`fuel_terminal_id` int(11) NOT NULL,
`fuel_terminal_serial` int(11) NOT NULL,
`xboard_id` int(11) NOT NULL,
`gas_station_id` int(11) NOT NULL,
`operator_id` varchar(16) NOT NULL,
`shift_id` int(11) NOT NULL,
`xboard_total_counter` int(11) NOT NULL,
`fuel_type` tinyint(2) NOT NULL,
`start_fuel_time` int(11) NOT NULL,
`end_fuel_time` int(11) DEFAULT NULL,
`preset_amount` int(11) NOT NULL,
`actual_amount` int(11) DEFAULT NULL,
`fuel_cost` int(11) DEFAULT NULL,
`payment_cost` int(11) DEFAULT NULL,
`purchase_type` int(11) NOT NULL,
`payment_ref_id` text,
`unit_fuel_price` int(11) NOT NULL,
`fuel_status_id` int(11) DEFAULT NULL,
`fuel_mode_id` int(11) NOT NULL,
`payment_result` int(11) NOT NULL,
`card_pan` varchar(20) DEFAULT NULL,
`state` int(11) DEFAULT NULL,
`totalizer` int(11) NOT NULL DEFAULT '0',
`shift_start_time` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `terminal_transaction_id` (`terminal_transaction_id`,`fuel_terminal_id`,`start_fuel_time`) USING BTREE,
KEY `start_fuel_time_idx` (`start_fuel_time`),
KEY `fuel_terminal_idx` (`fuel_terminal_id`),
KEY `xboard_idx` (`xboard_id`),
KEY `gas_station_id` (`gas_station_id`) USING BTREE,
KEY `purchase_type` (`purchase_type`) USING BTREE,
KEY `shift_start_time` (`shift_start_time`) USING BTREE,
KEY `fuel_type` (`fuel_type`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1665335 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT
Short answer: Don't use PARTITION. Let's see the query to help speed it up.
Long answer:
1.5M rows is only marginally big enough to consider partitioning.
PARTITION BY LIST is probably useless for performance.
You have not given enough info to give you answers other that vague hints. Please provide at least SHOW CREATE TABLE and the slow SELECT.
It is possible to add the partition key onto the end of the PRIMARY or UNIQUE key; you will lose the uniqueness test.
Don't index a low-cardinality column; it won't be used.
More on PARTITION

partitioning and sub partitioning mysql table

My Table structure is
CREATE TABLE IF NOT EXISTS `billing_total_success` (
`bill_id` int(11) NOT NULL AUTO_INCREMENT,
`location` char(10) NOT NULL,
`circle` varchar(2) NOT NULL,
`amount` int(11) NOT NULL,
`reference_id` varchar(100) NOT NULL,
`source` varchar(100) NOT NULL,
`time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`bill_id`),
KEY `location` (`location`),
KEY `soutime` (`source`,`time`),
KEY `circle` (`circle`,`source`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=80527470 ;
I need to partition this based on circle and subpartition on source.
Circle: Is 2 character from a set of 11 values ("AA","XB","BT"...)
Source: can be either "RNE"(sub partition 1) or "PR"(sub partition 2) or any other string(sub partition 3).
How do I do this partitioning?
Take a look at MySQL range partitioning. The closest you can get is to define a range less than "RNE", then a partition less than the next value which would occur after "RNE". Similar with PR, and then you can have a catch-all partition. The only issue here is that you need multiple partitions for before "RNE", the hole between "RNE" and "PR", and after "PR".
Perhaps if you could explain why you want to partition in this way, we could provide an alternative solution :)