Mysql Partitioning Query Performance

Mysql Partitioning Query Performance - mysql

i have created partitions on pricing table. below is the alter statement.
ALTER TABLE `price_tbl`
PARTITION BY HASH(man_code)
PARTITIONS 87;
one partition consists of 435510 records. total records in price_tbl is 6 million.
EXPLAIN query showing only one partion is used for the query . Still the query takes 3-4 sec to execute. below is the query
EXPLAIN SELECT vrimg.image_cap_id,vm.man_name,vr.range_code,vr.range_name,vr.range_url, MIN(`finance_rental`) AS from_price, vd.der_id AS vehicle_id FROM `range_tbl` vr
LEFT JOIN `image_tbl` vrimg ON vr.man_code = vrimg.man_code AND vr.type_id = vrimg.type_id AND vr.range_code = vrimg.range_code
LEFT JOIN `manufacturer_tbl` vm ON vr.man_code = vm.man_code AND vr.type_id = vm.type_id
LEFT JOIN `derivative_tbl` vd ON vd.man_code=vm.man_code AND vd.type_id = vr.type_id AND vd.range_code=vr.range_code
LEFT JOIN `price_tbl` vp ON vp.vehicle_id = vd.der_id AND vd.type_id = vp.type_id AND vp.product_type_id=1 AND vp.maintenance_flag='N' AND vp.man_code=164
AND vp.initial_rentals_id =(SELECT rental_id FROM `rentals_tbl` WHERE rental_months='9')
AND vp.annual_mileage_id =(SELECT annual_mileage_id FROM `mileage_tbl` WHERE annual_mileage='8000')
WHERE vr.type_id = 1 AND vm.man_url = 'audi' AND vd.type_id IS NOT NULL GROUP BY vd.der_id
Result of EXPLAIN.
Same query without partitioning takes 3-4 sec.
Query with partitioning takes 2-3 sec.
how we can increase query performance as it is too slow yet.
attached create table structure.
price table - This consists 6 million records
CREATE TABLE `price_tbl` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`lender_id` bigint(20) DEFAULT NULL,
`type_id` bigint(20) NOT NULL,
`man_code` bigint(20) NOT NULL,
`vehicle_id` bigint(20) DEFAULT NULL,
`product_type_id` bigint(20) DEFAULT NULL,
`initial_rentals_id` bigint(20) DEFAULT NULL,
`term_id` bigint(20) DEFAULT NULL,
`annual_mileage_id` bigint(20) DEFAULT NULL,
`ref` varchar(255) DEFAULT NULL,
`maintenance_flag` enum('Y','N') DEFAULT NULL,
`finance_rental` decimal(20,2) DEFAULT NULL,
`monthly_rental` decimal(20,2) DEFAULT NULL,
`maintenance_payment` decimal(20,2) DEFAULT NULL,
`initial_payment` decimal(20,2) DEFAULT NULL,
`doc_fee` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`,`type_id`,`man_code`),
KEY `type_id` (`type_id`),
KEY `vehicle_id` (`vehicle_id`),
KEY `term_id` (`term_id`),
KEY `product_type_id` (`product_type_id`),
KEY `finance_rental` (`finance_rental`),
KEY `type_id_2` (`type_id`,`vehicle_id`),
KEY `maintenanace_idx` (`maintenance_flag`),
KEY `lender_idx` (`lender_id`),
KEY `initial_idx` (`initial_rentals_id`),
KEY `man_code_idx` (`man_code`)
) ENGINE=InnoDB AUTO_INCREMENT=5830708 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY HASH (man_code)
PARTITIONS 87 */
derivative table - This consists 18k records.
CREATE TABLE `derivative_tbl` (
`type_id` bigint(20) DEFAULT NULL,
`der_cap_code` varchar(20) DEFAULT NULL,
`der_id` bigint(20) DEFAULT NULL,
`body_style_id` bigint(20) DEFAULT NULL,
`fuel_type_id` bigint(20) DEFAULT NULL,
`trans_id` bigint(20) DEFAULT NULL,
`man_code` bigint(20) DEFAULT NULL,
`range_code` bigint(20) DEFAULT NULL,
`model_code` bigint(20) DEFAULT NULL,
`der_name` varchar(255) DEFAULT NULL,
`der_url` varchar(255) DEFAULT NULL,
`der_intro_year` date DEFAULT NULL,
`der_disc_year` date DEFAULT NULL,
`der_last_spec_date` date DEFAULT NULL,
KEY `der_id` (`der_id`),
KEY `type_id` (`type_id`),
KEY `man_code` (`man_code`),
KEY `range_code` (`range_code`),
KEY `model_code` (`model_code`),
KEY `body_idx` (`body_style_id`),
KEY `capcodeidx` (`der_cap_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
range table - This consists 1k records
CREATE TABLE `range_tbl` (
`type_id` bigint(20) DEFAULT NULL,
`man_code` bigint(20) DEFAULT NULL,
`range_code` bigint(20) DEFAULT NULL,
`range_name` varchar(255) DEFAULT NULL,
`range_url` varchar(255) DEFAULT NULL,
KEY `range_code` (`range_code`),
KEY `type_id` (`type_id`),
KEY `man_code` (`man_code`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

PARTITION BY HASH is essentially useless if you are hoping for improved performance. BY RANGE is useful in a few use cases_.
In most situations, improvements in indexes are as good as trying to use partitioning.
Some likely problems:
No explicit PRIMARY KEY for InnoDB tables. Add a natural PK, if applicable, else an AUTO_INCREMENT.
No "composite" indexes -- they often provide a performance boost. Example: The LEFT JOIN between vr and vrimg involves 3 columns; a composite index on those 3 columns in the 'right' table will probably help performance.
Blind use of BIGINT when smaller datatypes would work. (This is an I/O issue when the table is big.)
Blind use of 255 in VARCHAR.
Consider whether most of the columns should be NOT NULL.
That query may be a victim of the "explode-implode" syndrome. This is where you do JOIN(s), which create a big intermediate table, followed by a GROUP BY to bring the row-count back down.
Don't use LEFT unless the 'right' table really is optional. (I see LEFT JOIN vd ... vd.type_id IS NOT NULL.)
Don't normalize "continuous" values (annual_mileage and rental_months). It is not really beneficial for "=" tests, and it severely hurts performance for "range" tests.
Same query without partitioning takes 3-4 sec. Query with partitioning takes 2-3 sec.
The indexes almost always need changing when switching between partitioning and non-partitioning. With the optimal indexes for each case, I predict that performance will be close to the same.
Indexes
These should help performance whether or not it is partitioned:
vm: (man_url)
vr: (man_code, type_id) -- either order
vd: (man_code, type_id, range_code, der_id)
-- `der_id` 4th, else in any order (covering)
vrimg: (man_code, type_id, range_code, image_cap_id)
-- `image_cap_id` 4th, else in any order (covering)
vp: (type_id, der_id, product_type_id, maintenance_flag,
initial_rentals, annual_mileage, man_code)
-- any order (covering)
A "covering" index is an extra boost, in that it can do all the work just in the index's BTree, without touching the data's BTree.
Implement a bunch of what I recommend, then come back (in another Question) for further tweaking.
Usually the "partition key" should be last in a composite index.

Related

MySQL Query Optimization that touches three tables via a union of two of them

I have a query that returns results from a single table based on the provided ID existing in a column in one of two, or both, tables. The DB schema for the relevant tables is provided below as well as the initial query and then what was later recommended to me by a peer. I go into some details below as to why this query works but I need to optimize it farther for larger datasets and pagination.
CREATE TABLE `killmails` (
`id` BIGINT(20) UNSIGNED NOT NULL,
`hash` VARCHAR(255) NOT NULL,
`moon_id` BIGINT(20) NULL DEFAULT NULL,
`solar_system_id` BIGINT(20) UNSIGNED NOT NULL,
`war_id` BIGINT(20) NULL DEFAULT NULL,
`is_npc` TINYINT(1) NOT NULL DEFAULT '0',
`is_awox` TINYINT(1) NOT NULL DEFAULT '0',
`is_solo` TINYINT(1) NOT NULL DEFAULT '0',
`dropped_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`destroyed_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`fitted_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`total_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`killmail_time` DATETIME NOT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL,
PRIMARY KEY (`id`, `hash`),
INDEX `total_value` (`total_value`),
INDEX `killmail_time` (`killmail_time`),
INDEX `solar_system_id` (`solar_system_id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
CREATE TABLE `killmail_attackers` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`killmail_id` BIGINT(20) UNSIGNED NOT NULL,
`alliance_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`character_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`corporation_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`faction_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`damage_done` BIGINT(20) UNSIGNED NOT NULL,
`final_blow` TINYINT(1) NOT NULL DEFAULT '0',
`security_status` DECIMAL(17,15) NOT NULL,
`ship_type_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`weapon_type_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL,
PRIMARY KEY (`id`),
INDEX `ship_type_id` (`ship_type_id`),
INDEX `weapon_type_id` (`weapon_type_id`),
INDEX `alliance_id` (`alliance_id`),
INDEX `corporation_id` (`corporation_id`),
INDEX `killmail_id_character_id` (`killmail_id`, `character_id`),
CONSTRAINT `killmail_attackers_killmail_id_killmails_id_foreign_key` FOREIGN KEY (`killmail_id`) REFERENCES `killmails` (`id`) ON UPDATE CASCADE ON DELETE CASCADE
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
CREATE TABLE `killmail_victim` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`killmail_id` BIGINT(20) UNSIGNED NOT NULL,
`alliance_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`character_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`corporation_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`faction_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`damage_taken` BIGINT(20) UNSIGNED NOT NULL,
`ship_type_id` BIGINT(20) UNSIGNED NOT NULL,
`ship_value` DECIMAL(18,4) NOT NULL DEFAULT '0.0000',
`pos_x` DECIMAL(30,10) NULL DEFAULT NULL,
`pos_y` DECIMAL(30,10) NULL DEFAULT NULL,
`pos_z` DECIMAL(30,10) NULL DEFAULT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL,
PRIMARY KEY (`id`),
INDEX `corporation_id` (`corporation_id`),
INDEX `alliance_id` (`alliance_id`),
INDEX `ship_type_id` (`ship_type_id`),
INDEX `killmail_id_character_id` (`killmail_id`, `character_id`),
CONSTRAINT `killmail_victim_killmail_id_killmails_id_foreign_key` FOREIGN KEY (`killmail_id`) REFERENCES `killmails` (`id`) ON UPDATE CASCADE ON DELETE CASCADE
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
This first query is where the problem started:
SELECT
*
FROM
killmails k
LEFT JOIN killmail_attackers ka ON k.id = ka.killmail_id
LEFT JOIN killmail_victim kv ON k.id = kv.killmail_id
WHERE
ka.character_id = ?
OR kv.character_id = ?
ORDER BY killmails.killmail_time DESC
LIMIT ? OFFSET ?
This worked okay, but long query times. We optimized to this
SELECT
killmails.*,
FROM (
SELECT killmail_victim.killmail_id FROM killmail_victim
WHERE killmail_victim.corporation_id = ?
UNION
SELECT killmail_attackers.killmail_id FROM killmail_attackers
WHERE killmail_attackers.corporation_id = ?
) SELECTED_KMS
LEFT JOIN killmails ON killmails.id = SELECTED_KMS.killmail_id
ORDER BY killmails.killmail_time DESC
LIMIT ? OFFSET ?
I saw a huge improvement in query times when looking up killmails for characters, however when I started querying for larger datasets like corporation and alliance killmails, the query slows down. This is because the queries that are union'd together can potentially return large sets of data and the time it takes to read all that into memory so that the SELECTED_KMS table can be created is what I believe is taking so much time. Most of the time, with alliances, my connection to the database times out from the application. One alliance returned 900K killmailIDs from one of the union'd tables, not sure what the other returned.
I can easily add limit statements to the internal queries, but this will introduce a lot of complications when I get to paginating the data or when I introduce a feature to search for KMs by date for example.
I am looking for suggestions on how this query can be optimized and still allow for easy pagination in the near future.
Thank You

Change INDEX(corporation_id) in both tables to INDEX(corporation_id, killmail_id) so that the inner queries will be "covering".
In general, INDEX(a) is useless when you also have INDEX(a,b). Any query that needs just a, can use either of those indexes. (This rule does not apply to b; only the "leftmost" column(s).)
Where does killmails.id come from? It's not AUTO_INCREMENT; it is not alone in the PRIMARY KEY, so there is no specified "uniqueness" constraint. Is it unique by some other design? Is it computed somewhere else in the code? (I ask because I need a feel for its uniqueness and other characteristics.)
Add INDEX(id, killmails_time).
What version are you using?
Perhaps UNION ALL give the same results? It would be faster because it would not need to de-dup.
How much RAM do you have? What is the value of innodb_buffer_pool_size?
Do you really need 8-byte BIGINTs? Even if your application is using longlong (or whatever it calls it), you can probably change the schema without changing the app.
Do you need this much precision and range? DECIMAL(30,10) -- it takes 14 bytes each. DOUBLE would give you about 16 significant digits in 8 bytes, with a wider range of values (up to about 10^308). What "units" are you using? (Overkill for light-years or parsecs; inadequate for miles or km. Perhaps AUs? Then the bottom digit would be a precision of a few meters?)
The last few questions are aimed at shrinking the table and seeing if we can avoid it being as I/O-bound as it apparently is now.
Important
innodb_buffer_pool_size = 128M is terribly small, especially for a 32GB machine, and especially if your dataset is much bigger than 128MB. If there are not any other apps running on the server, bump that setting up to 20G.

cannot partition this mysql table

I have an innoDB table named "transaction" with ~1.5 million rows. I would like to partition this table (probably on column "gas_station_id" since it is used a lot in join queries) but I've read in MySQL 5.7 Reference Manual that
All columns used in the table's partitioning expression must be part of every unique key that the table may have, including any primary key.
I have two questions:
The column "gas_station_id" is not part of unique key or primary key. How could I partition this table then?
even if I could partition this table, I am not sure which partitioning type would be better in this case? (I was thinking about LIST partitioning (we have about 40 different(distinct) gas stations) but I am not sure since there will be only one value in each list partition like the following :
ALTER TABLE transaction
PARTITION BY LIST(gas_station_id)
( PARTITION p1 VALUES IN (9001),
PARTITION p2 VALUES IN (9002),.....)
I tried partitioning by KEY, but I receive the following error (I think because id is not part of all unique keys..):
#1053 - a UNIQUE INDEX must include all columns in the table's partitioning function
This is the structure of the "transaction" table:
EDIT
and this is what SHOW CREATE TABLE shows:
CREATE TABLE `transaction` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`terminal_transaction_id` int(11) NOT NULL,
`fuel_terminal_id` int(11) NOT NULL,
`fuel_terminal_serial` int(11) NOT NULL,
`xboard_id` int(11) NOT NULL,
`gas_station_id` int(11) NOT NULL,
`operator_id` varchar(16) NOT NULL,
`shift_id` int(11) NOT NULL,
`xboard_total_counter` int(11) NOT NULL,
`fuel_type` tinyint(2) NOT NULL,
`start_fuel_time` int(11) NOT NULL,
`end_fuel_time` int(11) DEFAULT NULL,
`preset_amount` int(11) NOT NULL,
`actual_amount` int(11) DEFAULT NULL,
`fuel_cost` int(11) DEFAULT NULL,
`payment_cost` int(11) DEFAULT NULL,
`purchase_type` int(11) NOT NULL,
`payment_ref_id` text,
`unit_fuel_price` int(11) NOT NULL,
`fuel_status_id` int(11) DEFAULT NULL,
`fuel_mode_id` int(11) NOT NULL,
`payment_result` int(11) NOT NULL,
`card_pan` varchar(20) DEFAULT NULL,
`state` int(11) DEFAULT NULL,
`totalizer` int(11) NOT NULL DEFAULT '0',
`shift_start_time` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `terminal_transaction_id` (`terminal_transaction_id`,`fuel_terminal_id`,`start_fuel_time`) USING BTREE,
KEY `start_fuel_time_idx` (`start_fuel_time`),
KEY `fuel_terminal_idx` (`fuel_terminal_id`),
KEY `xboard_idx` (`xboard_id`),
KEY `gas_station_id` (`gas_station_id`) USING BTREE,
KEY `purchase_type` (`purchase_type`) USING BTREE,
KEY `shift_start_time` (`shift_start_time`) USING BTREE,
KEY `fuel_type` (`fuel_type`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1665335 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT

Short answer: Don't use PARTITION. Let's see the query to help speed it up.
Long answer:
1.5M rows is only marginally big enough to consider partitioning.
PARTITION BY LIST is probably useless for performance.
You have not given enough info to give you answers other that vague hints. Please provide at least SHOW CREATE TABLE and the slow SELECT.
It is possible to add the partition key onto the end of the PRIMARY or UNIQUE key; you will lose the uniqueness test.
Don't index a low-cardinality column; it won't be used.
More on PARTITION

Alter table to apply partitioning by key in mysql

I have a table with million of rows and the frequency of growth will probably increase in future, so far about 4.3 million rows are added in a month, causing the database to slow down. I have already applied indexing but it's not really optimizing the speed. Is applying Partitioning to such data favorable?
Also how can I apply partitioning on a table with million of rows? I know it will look something like this
ALTER TABLE gpsloggs
PARTITION BY KEY(DeviceCode)
PARTITIONS 10;
The problem is I was Partitioning on DeviceCode which is not a primary key so partitioning isn't permissible.
DROP TABLE IF EXISTS `gpslogss`;
CREATE TABLE `gpslogss` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`DeviceCode` varchar(255) DEFAULT NULL,
`Latitude` varchar(255) DEFAULT NULL,
`Longitude` varchar(255) DEFAULT NULL,
`Speed` double DEFAULT NULL,
`rowStamp` datetime DEFAULT NULL,
`Date` varchar(255) DEFAULT NULL,
`Time` varchar(255) DEFAULT NULL,
`AlarmCode` int(11) DEFAULT NULL,
PRIMARY KEY `Id` (`Id`) USING BTREE,
KEY `DeviceCode` (`DeviceCode`) USING BTREE
);
So I altered the table and made the table in a new database with 0 records this way and it worked fine
DROP TABLE IF EXISTS `gpslogss`;
CREATE TABLE `gpslogss` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`DeviceCode` varchar(255) DEFAULT NULL,
`Latitude` varchar(255) DEFAULT NULL,
`Longitude` varchar(255) DEFAULT NULL,
`Speed` double DEFAULT NULL,
`rowStamp` datetime DEFAULT NULL,
`Date` varchar(255) DEFAULT NULL,
`Time` varchar(255) DEFAULT NULL,
`AlarmCode` int(11) DEFAULT NULL,
KEY `Id` (`Id`) USING BTREE,
KEY `DeviceCode` (`DeviceCode`) USING BTREE
);
PARTITION BY KEY(DeviceCode)
PARTITIONS 10;
How should I render the code so that I can apply partitioning to the table with million of rows? How should I drop keys and alter the table to apply partitioning without damaging data?

Short answer: Don't.
Long answer: PARTITION BY KEY does not provide any performance benefit (that I know of). And why else use PARTITION?
Other notes:
You should use InnoDB for virtually all tables.
InnoDB tables should have an explicit PRIMARY KEY.
There is a DATETIME datatype; don't use VARCHAR for date or time, and don't split them.
latitude and longitude are numeric; don't use VARCHAR. FLOAT is a likely candidate (precise enough to differentiate vehicles, but not people).
Your real question is about speed. Let's see the slow SELECTs and work backward from them. Adding PARTITIONing is rarely a solution to performance.

Optimizing aggregation on MySQL Table with 850 million rows

I have a query that I'm using to summarize via aggregations.
The table is called 'connections' and has about 843 million rows.
CREATE TABLE `connections` (
`app_id` varchar(16) DEFAULT NULL,
`user_id` bigint(20) DEFAULT NULL,
`time_started_dt` datetime DEFAULT NULL,
`device` varchar(255) DEFAULT NULL,
`os` varchar(255) DEFAULT NULL,
`firmware` varchar(255) DEFAULT NULL,
KEY `app_id` (`bid`),
KEY `time_started_dt` (`time_started_dt`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
When I try to run a query, such as the one below, it takes over 10 hours and I end up killing it. Does anyone see any mistakes that I'm making, of have any suggestions as to how I could optimize the query?
SELECT
app_id,
MAX(time_started_dt),
MIN(time_started_dt),
COUNT(*)
FROM
connections
GROUP BY
app_id

I suggest you create a composite index on (app_id, time_started_dt):
ALTER TABLE connections ADD INDEX(app_id, time_started_dt)

To get that query to perform, you really need a suitable covering index, with app_id as the leading column, e.g.
CREATE INDEX `connections_IX1` ON `connections` (`app_id`,` time_start_dt`);
NOTE: creating the index may take hours, and the operation will prevent insert/update/delete to the table while it is running.
An EXPLAIN will show the proposed execution plan for your query. With the covering index in place, you'll see "Using index" in the plan. (A "covering index" is an index that can be used by MySQL to satisfy a query without having to access the underlying table. That is, the query can be satisfied entirely from the index.)
With the large number of rows in this table, you may also want to consider partitioning.

I have tried your query on randomly generated data (around 1 million rows). Adding PRIMATY KEY will improve performance of your query by 10%.
As already suggested by other people composite index should be added to the table. Index time_started_dt is useless.
CREATE TABLE `connections` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`app_id` varchar(16) DEFAULT NULL,
`user_id` bigint(20) DEFAULT NULL,
`time_started_dt` datetime DEFAULT NULL,
`device` varchar(255) DEFAULT NULL,
`os` varchar(255) DEFAULT NULL,
`firmware` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `composite_idx` (`app_id`,`time_started_dt`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Why do i HAVE to optimize tables?

I have a pretty big table with contains about 3 million records.
When running a very simple query, joining this table on a few others (all with indexes and/or primary keys), the query will take about 25 seconds to complete!
The value of "Handler_read_next" is about 7 million!
Number of requests to read the next row in key order, incremented if you are querying an index column with a range constraint or if you are doing an index scan.
This problem have only started since this table began to grow big.
Now if I do an "optimize tables" on this table, the query will run in about 0.02 seconds and "Handler_read_next" will have a value of about 1500.
How can the difference be so extreme, and do I really have to setup a scheduled query, optimizing this table once a week or so? Even so, I would like to know the meaning behind this and why mysql behaves like this. Sure, rows are deleted and updated pretty much in this table, but should it get so badly fragmented in only one week that the query goes from 0.02 sec to 25 sec?
Edit: After request, here comes the query in question:
SELECT *
FROM budget_expenses
JOIN budget_categories
ON budget_categories.BudgetAreaId = budget_expenses.BudgetAreaId
AND budget_categories.BudgetCategoryId = budget_expenses.BudgetCategoryId
LEFT JOIN budget_types
ON budget_types.BudgetAreaId = budget_expenses.BudgetAreaId
AND budget_types.BudgetCategoryId = budget_expenses.BudgetCategoryId
AND budget_types.BudgetTypeId = budget_expenses.BudgetTypeId
WHERE budget_expenses.BudgetId = 1
AND budget_expenses.ExpenseDate >= '2012-11-25'
AND budget_expenses.ExpenseDate <= '2012-12-24'
AND budget_expenses.BudgetAreaId = 2
ORDER BY budget_expenses.ExpenseDate DESC,
budget_expenses.ExpenseTime IS NULL ASC,
budget_expenses.ExpenseTime DESC
(BudgetAreaId, BudgetCategoryId) is the primary key in budget_categories and (BudgetAreaId, BudgetCategoryId, BudgetTypeId) is the primary key in budget_types. In budget_expenses these 3 keys are indexes and also ExpenseDate has an index. This query returns about 20 rows.
Show create table:
CREATE TABLE `budget_areas` (
`BudgetAreaId` int(11) NOT NULL,
`Name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`BudgetAreaId`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `budget_categories` (
`BudgetAreaId` int(11) NOT NULL,
`BudgetCategoryId` int(11) NOT NULL AUTO_INCREMENT,
`Name` varchar(255) DEFAULT NULL,
`SortOrder` int(11) DEFAULT NULL,
PRIMARY KEY (`BudgetAreaId`,`BudgetCategoryId`),
KEY `BudgetAreaId` (`BudgetAreaId`,`BudgetCategoryId`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
CREATE TABLE `budget_types` (
`BudgetAreaId` int(11) NOT NULL,
`BudgetCategoryId` int(11) NOT NULL,
`BudgetTypeId` int(11) NOT NULL,
`Name` varchar(255) DEFAULT NULL,
`SortId` int(11) DEFAULT NULL,
PRIMARY KEY (`BudgetAreaId`,`BudgetCategoryId`,`BudgetTypeId`),
KEY `BudgetAreaId` (`BudgetAreaId`,`BudgetCategoryId`,`BudgetTypeId`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `budget_expenses` (
`ExpenseId` int(11) NOT NULL AUTO_INCREMENT,
`BudgetId` int(11) NOT NULL,
`TempId` int(11) DEFAULT NULL,
`BudgetAreaId` int(11) DEFAULT NULL,
`BudgetCategoryId` int(11) DEFAULT NULL,
`BudgetTypeId` int(11) DEFAULT NULL,
`Company` varchar(255) DEFAULT NULL,
`ImportCompany` varchar(255) DEFAULT NULL,
`Sum` double(50,2) DEFAULT NULL,
`ExpenseDate` date DEFAULT NULL,
`ExpenseTime` time DEFAULT NULL,
`Inserted` datetime DEFAULT NULL,
`Changed` datetime DEFAULT NULL,
`InsertType` int(1) DEFAULT NULL,
`AccountId` int(11) DEFAULT NULL,
`BankCardId` int(11) DEFAULT NULL,
PRIMARY KEY (`ExpenseId`),
KEY `BudgetId` (`BudgetId`),
KEY `AccountId` (`AccountId`),
KEY `Company` (`Company`) USING BTREE,
KEY `ExpenseDate` (`ExpenseDate`),
KEY `BudgetAreaId` (`BudgetAreaId`),
KEY `BudgetCategoryId` (`BudgetCategoryId`),
KEY `BudgetTypeId` (`BudgetTypeId`),
CONSTRAINT `budget_expenses_ibfk_1` FOREIGN KEY (`BudgetId`) REFERENCES `budgets` (`BudgetId`)
) ENGINE=InnoDB AUTO_INCREMENT=3604462 DEFAULT CHARSET=latin1
After I copy pasted this I changed from MyIsam to Innodb on the budget_categories table.
Edit: The change from myisam to innodb didn't make any difference. The query is now very slow, just 12 hours after i optimized the budget_expenses table!
Here is the explain for the query which now takes about 9 seconds:
http://jsfiddle.net/dmVPY/1/

Ahhh MyISAM....
Try changing the table type (aka 'storage engine') to InnoDB instead.
If you do this, make sure innodb_buffer_pool_size in your my.cnf is a sensible value - the default is too small.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008