SQL: Refactoring a multi-join query (MySQL)

I have a query that should be quite simple and yet it causes me a lot of headaches.
I have a simple ads system that requires filtering ads according to a few variables.
I need to limit the number of views/clicks per day and the total number of views/clicks for a given ad. Also, each ad is linked to one or more slots in which the ad can appear. I have a table that saves the statistics that I need about each ad. Note that the statistics table changes very frequently.
These are the tables that I'm using:
CREATE TABLE `t_ads` (
`id` int(10) unsigned NOT NULL auto_increment,
`name` varchar(255) NOT NULL,
`content` text NOT NULL,
`is_active` tinyint(1) unsigned NOT NULL,
`start_date` date NOT NULL,
`end_date` date NOT NULL,
`max_views` int(10) unsigned NOT NULL,
`type` tinyint(3) unsigned NOT NULL default '0',
`refresh` smallint(5) unsigned NOT NULL default '0',
`max_clicks` int(10) unsigned NOT NULL,
`max_daily_clicks` int(10) unsigned NOT NULL,
`max_daily_views` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `t_ad_slots` (
`id` int(10) unsigned NOT NULL auto_increment ,
`name` varchar(255) NOT NULL,
`width` int(10) unsigned NOT NULL,
`height` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `t_ads_to_slots` (
`ad_id` int(10) unsigned NOT NULL,
`slot_id` int(10) unsigned NOT NULL,
`value` int(10) unsigned NOT NULL,
PRIMARY KEY (`ad_id`,`slot_id`),
KEY `slot_id` (`slot_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `t_ads_to_slots`
ADD CONSTRAINT `t_ads_to_slots_ibfk_1` FOREIGN KEY (`ad_id`) REFERENCES `t_ads` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
ADD CONSTRAINT `t_ads_to_slots_ibfk_2` FOREIGN KEY (`slot_id`) REFERENCES `t_ad_slots` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION;
CREATE TABLE `t_ad_stats` (
`ad_id` int(10) unsigned NOT NULL,
`slot_id` int(10) unsigned NOT NULL,
`date` date NOT NULL,
`views` int(10) unsigned NOT NULL,
`unique_views` int(10) unsigned NOT NULL,
`clicks` int(10) unsigned NOT NULL default '0',
PRIMARY KEY (`ad_id`,`slot_id`,`date`),
KEY `slot_id` (`slot_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `t_ad_stats`
ADD CONSTRAINT `t_ad_stats_ibfk_1` FOREIGN KEY (`ad_id`) REFERENCES `t_ads` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
ADD CONSTRAINT `t_ad_stats_ibfk_2` FOREIGN KEY (`slot_id`) REFERENCES `t_ad_slots` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION;
This is the query that I use to get ads for a given slot. (Note that in this example I hard-coded 20 as the slot ID and 0,1,2 as the ad types; I get this data from a PHP script which invokes this query.)
SELECT `ads`.`content`, `slots`.`value`, `ads`.`id`, `ads`.`refresh`, `ads`.`type`,
SUM(`total_stats`.`views`) AS "total_views",
SUM(`total_stats`.`clicks`) AS "total_clicks"
FROM (`t_ads` AS `ads`,
`t_ads_to_slots` AS `slots`)
LEFT JOIN `t_ad_stats` AS `total_stats`
ON `total_stats`.`ad_id` = `ads`.`id`
LEFT JOIN `t_ad_stats` AS `daily_stats`
ON (`daily_stats`.`ad_id` = `ads`.`id`) AND
(`daily_stats`.`date` = CURDATE())
WHERE (`ads`.`id` = `slots`.`ad_id`) AND
(`ads`.`type` IN(0,1,2)) AND
(`slots`.`slot_id` = 20) AND
(`ads`.`is_active` = 1) AND
(`ads`.`end_date` >= NOW()) AND
(`ads`.`start_date` <= NOW()) AND
((`ads`.`max_views` = 0) OR
(`ads`.`max_views` > "total_views")) AND
((`ads`.`max_clicks` = 0) OR
(`ads`.`max_clicks` > "total_clicks")) AND
((`ads`.`max_daily_clicks` = 0) OR
(`ads`.`max_daily_clicks` > IFNULL(`daily_stats`.`clicks`,0))) AND
((`ads`.`max_daily_views` = 0) OR
(`ads`.`max_daily_views` > IFNULL(`daily_stats`.`views`,0)))
GROUP BY (`ads`.`id`)
I believe that this query is self-explanatory, even though it's quite long. Note that the MySQL version that I'm using is 5.0.51a-community. It seems to me that the big issue here is the double join to the stats table (I did that so that I can get the data both from a specific record and, summed, from multiple records).
How would you implement this query in order to get better results? (Note that I can't change from InnoDB).
Hopefully everything is clear about my question, but if that is not the case, please ask and I will clarify.
Thanks in advance,
Kfir

Add indexes to the following columns:
t_ads.is_active
t_ads.start_date
t_ads.end_date
Change the order of the primary key on t_ad_stats to:
(`ad_id`,`date`,`slot_id`)
or add a covering index to t_ad_stats
(`ad_id`, `date`)
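As a rough sketch, the corresponding DDL could look like this (index names are illustrative):
ALTER TABLE `t_ads`
ADD INDEX `idx_is_active` (`is_active`),
ADD INDEX `idx_start_date` (`start_date`),
ADD INDEX `idx_end_date` (`end_date`);
ALTER TABLE `t_ad_stats` ADD INDEX `idx_ad_date` (`ad_id`, `date`);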
Change from 0 meaning "no limit" to 2147483647 meaning no limit, so you can change things like:
((`ads`.`max_views` = 0) OR (`ads`.`max_views` > "total_views"))
to
(`ads`.`max_views` > "total_views")
You could greatly improve this if you kept running totals instead of having to calculate them each time.
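A minimal sketch of the running-totals idea, assuming a new summary table (the t_ad_totals name is hypothetical) maintained by the same code that records views and clicks:
CREATE TABLE `t_ad_totals` (
`ad_id` int(10) unsigned NOT NULL,
`total_views` int(10) unsigned NOT NULL DEFAULT 0,
`total_clicks` int(10) unsigned NOT NULL DEFAULT 0,
PRIMARY KEY (`ad_id`)
) ENGINE=InnoDB;
-- run whenever a view is recorded, in the same transaction:
INSERT INTO `t_ad_totals` (`ad_id`, `total_views`, `total_clicks`)
VALUES (123, 1, 0)
ON DUPLICATE KEY UPDATE `total_views` = `total_views` + 1;
The filtering query could then read the totals directly instead of summing t_ad_stats on every request.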

Expanding on a comment above, I believe that the following columns should be indexed:
ads.id
ads.type
ads.start_date
ads.end_date
daily_stats.date
As well as these:
slots.slot_id
ads.is_active
And these as well:
ads.max_views
ads.max_clicks
ads.max_daily_clicks
ads.max_daily_views
daily_stats.clicks
daily_stats.views
Do note that applying indexes on these columns will speed up your SELECTs but slow down your INSERTs, since the indexes will need updating as well. But you don't have to apply all of this at once: you can do it incrementally and see how the performance shakes out for selects as well as inserts (one such increment is sketched below). If you cannot find a good middle ground, then I would suggest denormalization.
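For instance, one increment could be a single composite index followed by a plan check (the index name is illustrative):
ALTER TABLE `t_ads` ADD INDEX `idx_active_dates` (`is_active`, `start_date`, `end_date`);
-- then re-run EXPLAIN on the slow query and compare the estimated rows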

Related

Partitioning large table by dates

I have implemented a custom URL shortener in my app, and I have one table for that. The table structure looks like this:
CREATE TABLE `urls` (
`id` int(11) NOT NULL,
`url_id` varchar(10) DEFAULT NULL,
`long_url` varchar(255) DEFAULT NULL,
`clicked` mediumint(5) NOT NULL DEFAULT 0,
`user_id` varchar(7) DEFAULT NULL,
`type` varchar(15) DEFAULT NULL,
`ad_id` int(11) DEFAULT NULL,
`campaign` int(11) DEFAULT NULL,
`increment` tinyint(1) NOT NULL DEFAULT 0,
`date` date DEFAULT NULL,
`del` enum('1','0') NOT NULL DEFAULT '0'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT
ALTER TABLE `urls`
ADD PRIMARY KEY (`id`),
ADD KEY `url_id` (`url_id`),
ADD KEY `type` (`type`),
ADD KEY `campaign` (`campaign`),
ADD KEY `ad_id` (`ad_id`),
ADD KEY `date` (`date`),
ADD KEY `user_id` (`user_id`);
The table now has 20,000,000 records and is currently growing by 300k-400k records per day.
The url_id column is a unique varchar(10), and a URL looks like this: http://example.com/asdfghjklu
Now I have partitioned this table into 10 partitions by HASH(id):
PARTITION BY HASH (`id`)
PARTITIONS 10;
When I try to generate reports and join this table with others, the query gets really slow, so slow that I can't even get a one-week report.
Almost every big query on this table filters by date, so I think it would be much better to partition this table by the date column.
Is it a good idea?
As I've read, if I want to partition this table by date I need to include date in a composite primary key: PRIMARY KEY(id, date).
What do you think about this? How do I improve my query performance?
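For reference, a minimal sketch of what the question describes might look like this (partition names and boundary dates are purely illustrative; MySQL requires the partitioning column to appear in every unique key, hence the composite primary key):
ALTER TABLE `urls` DROP PRIMARY KEY, ADD PRIMARY KEY (`id`, `date`);
ALTER TABLE `urls`
PARTITION BY RANGE (TO_DAYS(`date`)) (
PARTITION p201801 VALUES LESS THAN (TO_DAYS('2018-02-01')),
PARTITION p201802 VALUES LESS THAN (TO_DAYS('2018-03-01')),
PARTITION pmax VALUES LESS THAN MAXVALUE
);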
I would recommend using hash partitioning on the day (or month or year):
CREATE TABLE `urls` (
`id` int(11) NOT NULL,
`url_id` varchar(10) DEFAULT NULL,
`long_url` varchar(255) DEFAULT NULL,
`clicked` mediumint(5) NOT NULL DEFAULT 0,
`user_id` varchar(7) DEFAULT NULL,
`type` varchar(15) DEFAULT NULL,
`ad_id` int(11) DEFAULT NULL,
`campaign` int(11) DEFAULT NULL,
`increment` tinyint(1) NOT NULL DEFAULT 0,
`date` date DEFAULT NULL,
`del` enum('1','0') NOT NULL DEFAULT '0',
PartitionsID int(4) unsigned NOT NULL,
KEY PartitionsID (PartitionsID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY HASH (PartitionsID)
PARTITIONS 366;
Into PartitionsID you just need to insert TO_DAYS(date), so you have only one value for the entire day.
This makes it easy to have one partition per day, or you can do it month-wise instead, depending on your data size.
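For example, an insert might look like this (the values are illustrative):
INSERT INTO `urls` (`id`, `url_id`, `long_url`, `date`, `PartitionsID`)
VALUES (1, 'asdfghjklu', 'http://example.org/some/very/long/landing-page', '2018-01-15', TO_DAYS('2018-01-15'));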
For SELECTs, you can use the query below as an example:
SELECT *
FROM TT ACT
WHERE ACT.CustomerID = vCustomerID
AND ACT.TransactionTime BETWEEN vInvoiceEndDate AND vPaymentDueDate
AND ACT.TrxnInfoTypeID IN (19, 23)
AND ACT.PaymentType = '1'
AND ACT.PartitionsID BETWEEN TO_DAYS(vInvoiceEndDate) AND TO_DAYS(vPaymentDueDate);

MariaDB INNER JOIN with foreign keys is MUCH slower than without them

Please help me, I'm stuck with the strange behaviour of MariaDB server.
I have 3 tables.
CREATE TABLE `default_work` (
`add_date` datetime(6) NOT NULL,
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) NOT NULL,
`keywords` varchar(255) DEFAULT NULL,
`short_text` longtext DEFAULT NULL,
`downloads` int(10) unsigned NOT NULL,
`published` tinyint(1) NOT NULL,
`subject_id` int(11) NOT NULL,
`work_type_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `default_work_subject_id_IDX` (`subject_id`) USING BTREE,
KEY `default_work_work_type_id_IDX` (`work_type_id`) USING BTREE,
CONSTRAINT `default_work_FK` FOREIGN KEY (`subject_id`) REFERENCES `default_subject` (`id`),
CONSTRAINT `default_work_FK_1` FOREIGN KEY (`work_type_id`) REFERENCES `default_worktype` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=210673 DEFAULT CHARSET=utf8
CREATE TABLE `default_subject` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`subject` varchar(255) NOT NULL,
`old_id` int(10) unsigned NOT NULL,
`subject_literal` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=43 DEFAULT CHARSET=utf8
CREATE TABLE `default_worktype` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`work_type` varchar(250) NOT NULL,
`description` longtext DEFAULT NULL,
`old_id` int(10) unsigned NOT NULL,
`work_type_literal` varchar(250) NOT NULL,
`title` varchar(255) NOT NULL,
`multiple` varchar(255) NOT NULL,
`keywords` varchar(255) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `default_worktype_old_id_a8b508fe_uniq` (`old_id`),
UNIQUE KEY `default_worktype_work_type_literal_1e609434_uniq` (`work_type_literal`)
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=utf8
These tables were created by the Django ORM, but they seem to be OK.
The default_work table has about 200,000 records, default_subject - 42, and default_worktype - 12.
After making a request in the Django admin with simple joins between those tables, I got about 9 seconds of query time.
Looking in the SQL log, I found the raw query:
SELECT `default_work`.`id`, `default_work`.`title`, `default_worktype`.`work_type`,`default_subject`.`subject`
FROM `default_work`
INNER JOIN `default_subject` ON (`default_work`.`subject_id` = `default_subject`.`id`)
INNER JOIN `default_worktype` ON (`default_work`.`work_type_id` = `default_worktype`.`id`)
ORDER BY `default_work`.`id` DESC LIMIT 100
The EXPLAIN shows:
Explain result of the query with indexes
And this is a bit confusing, because when I deleted all indexes on the default_work table except the primary key, the results were completely different. The request time was about 3.4 msec, and EXPLAIN shows that all primary keys are used correctly.
Explain result of the query without indexes
PS. I tried to reproduce this situation on PostgreSQL and got 1.3 msec for the request with indexes and foreign keys.
Looking at your EXPLAIN results you can see that when the foreign keys are turned on, the system is using that key in the join instead of choosing to use the primary key in the target table (row 2).
As there will be many records with the same value, this massively increases the number of records being evaluated.
I don't know why it's choosing to do that. You may find that rewriting the SELECT statement in a different order changes how it chooses the indexes. You may find the choice is different if, in the ON clause, you specify the target table first and then the source table (default_subject.id = default_work.subject_id), as sketched below.
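A sketch of that variant (semantically identical to the original query; it may or may not change the chosen plan):
SELECT `default_work`.`id`, `default_work`.`title`, `default_worktype`.`work_type`, `default_subject`.`subject`
FROM `default_work`
INNER JOIN `default_subject` ON (`default_subject`.`id` = `default_work`.`subject_id`)
INNER JOIN `default_worktype` ON (`default_worktype`.`id` = `default_work`.`work_type_id`)
ORDER BY `default_work`.`id` DESC LIMIT 100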

Make LEFT JOIN query more efficient

The following query with LEFT JOIN is drawing too much memory (~4GB), but the host only allows about 120MB for this process.
SELECT grades.grade, grades.evaluation_id, evaluations.evaluation_name, evaluations.value, evaluations.maximum
FROM grades
LEFT JOIN evaluations ON grades.evaluation_id = evaluations.evaluation_id
WHERE grades.registrar_id = ?
Create table syntax for grades:
CREATE TABLE `grades` (
`grade_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`evaluation_id` int(10) unsigned DEFAULT NULL,
`registrar_id` int(10) unsigned DEFAULT NULL,
`grade` float unsigned DEFAULT NULL,
`entry_date` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`grade_id`),
KEY `registrarGrade_key` (`registrar_id`),
KEY `evaluationKey` (`evaluation_id`),
KEY `grades_id_index` (`grade_id`),
KEY `eval_id_index` (`evaluation_id`),
KEY `grade_index` (`grade`),
CONSTRAINT `evaluationKey` FOREIGN KEY (`evaluation_id`) REFERENCES `evaluations` (`evaluation_id`),
CONSTRAINT `registrarGrade_key` FOREIGN KEY (`registrar_id`) REFERENCES `registrar` (`reg_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1627 DEFAULT CHARSET=utf8;
evaluations table:
CREATE TABLE `evaluations` (
`evaluation_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`instance_id` int(11) unsigned DEFAULT NULL,
`evaluation_col` varchar(255) DEFAULT NULL,
`evaluation_name` longtext,
`evaluation_method` enum('class','email','online','lab') DEFAULT NULL,
`evaluation_deadline` date DEFAULT NULL,
`maximum` int(11) unsigned DEFAULT NULL,
`value` float DEFAULT NULL,
PRIMARY KEY (`evaluation_id`),
KEY `instanceID_key` (`instance_id`),
KEY `eval_name_index` (`evaluation_name`(3)),
KEY `eval_method_index` (`evaluation_method`),
KEY `eval_deadline_index` (`evaluation_deadline`),
KEY `maximum` (`maximum`),
KEY `value_index` (`value`),
KEY `eval_id_index` (`evaluation_id`),
CONSTRAINT `instanceID_key` FOREIGN KEY (`instance_id`) REFERENCES `course_instance` (`instance_id`)
) ENGINE=InnoDB AUTO_INCREMENT=72 DEFAULT CHARSET=utf8;
The PHP code to pull the data:
$sql = "SELECT grades.grade, grades.evaluation_id, evaluations.evaluation_name, evaluations.value, evaluations.maximum FROM grades LEFT JOIN evaluations ON grades.evaluation_id = evaluations.evaluation_id WHERE grades.registrar_id = ? AND YEAR(entry_date) = YEAR(CURDATE())";
$result = $mysqli->prepare($sql);
if($result === FALSE)
die($mysqli->error);
$result->bind_param('i',$reg_ids[$i]);
$result->execute();
$result->bind_result($grade, $eval_id, $evalname, $evalval, $max);
while($result->fetch()){
    // process each fetched row here ($grade, $eval_id, $evalname, $evalval, $max)
}
And the fatal error message
Is there a way to drastically reduce the memory load on this query?
Thanks!
Curiously, changing the MySQL query did not change the amount of memory it attempted to allocate.
Please provide SHOW CREATE TABLE for both tables; I want to see if you have anything like these:
grades: INDEX(registration_id)
evaluations: PRIMARY KEY(evaluation_id)
Edit
You now have redundant indexes in both tables -- probably because of my suggestion. That is, you already have both of the indexes that would help with the query.
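For instance, going by the DDL above, the duplicates could be dropped along these lines (verify with SHOW INDEX first):
ALTER TABLE `grades` DROP INDEX `grades_id_index`, DROP INDEX `eval_id_index`;
ALTER TABLE `evaluations` DROP INDEX `eval_id_index`;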
Since you have LONGTEXT and it is trying to allocate exactly 4GB, the max size of LONGTEXT, I guess that is the problem. Suggest you ALTER that column to be TEXT (64KB max) or MEDIUMTEXT (16MB max). I have not seen this behavior before in PHP, but then I rarely use anything bigger than TEXT.
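For example, something like this (back up first, and check that no existing value is longer than 64KB before shrinking the type):
ALTER TABLE `evaluations` MODIFY `evaluation_name` TEXT;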

How to optimize this mysql join on large table?

I have a project where the admin needs to create multiple newsletters with some crawled posts from the web.
I insert the posts into the posts table after crawling has completed and assign them a feed_id to identify the source. This is the structure of the posts table (truncated):
CREATE TABLE `posts` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`feed_id` int(11) NOT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT NULL,
`identifier` varchar(255) DEFAULT NULL,
`published` timestamp NULL DEFAULT NULL,
`content` longtext,
...
...
`is_unread` int(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Every admin (user) has access to one or more "feeds". So on the newsletter creation page I want to show them a list of posts from the feeds they are allowed to see, along with a button to put a post into a specific category of that newsletter; if the user previously selected a post, I should show that and let them remove it from the category. So I have some other tables too: newsletters, categories, newsletter_post, category_post. Here are their structures:
newsletters:
CREATE TABLE `newsletters` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT NULL,
`sent_at` timestamp NULL DEFAULT NULL,
`title` varchar(255) DEFAULT NULL,
`date` date DEFAULT NULL,
`topic_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
categories:
CREATE TABLE `categories` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`topic_id` int(11) NOT NULL,
`title` varchar(255) DEFAULT NULL,
`slug` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
newsletter_post:
CREATE TABLE `newsletter_post` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT NULL,
`newsletter_id` int(11) NOT NULL,
`post_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
category_post:
CREATE TABLE `category_post` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NULL DEFAULT NULL,
`category_id` int(11) NOT NULL,
`post_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
So I'm using this query to find posts from the allowed feeds and to check whether a post is in a specific category of this specific newsletter:
SELECT DISTINCT `posts`.`id`, `published`, `posts`.`title`, `posts`.`content`, `source_name`, `category_id`, `newsletter_id`, `link_href`, categories.title as category_title
FROM `posts`
LEFT JOIN `category_post` ON `posts`.`id` = `category_post`.`post_id`
LEFT JOIN `categories` ON `categories`.`id` = `category_post`.`category_id`
LEFT JOIN `newsletter_post` ON `posts`.`id` = `newsletter_post`.`post_id`
LEFT JOIN `newsletters` ON `newsletters`.`id` = `newsletter_post`.`newsletter_id`
WHERE `feed_id` IN (6, 7) ORDER BY `posts`.`published` DESC LIMIT 40 OFFSET 0
But the problem is that this is horrible and not optimized. My posts table grows by up to 50,000 rows each month, each row with 3~10 KB of data on average, so sometimes when I try to run the query (which is frequently run by the admin to build the newsletter, paginate, etc.), MySQL shows an error about having too many rows to join, and most of the time it's really slow.
The reason I'm doing all this in one query is that I want the result in one JSON response, so I can show it to the user quickly without doing additional requests.
I want to know if there is a better way to do this query, use indexes, or something else.
Thank you in advance for your help.
Index your posts table on
( feed_id, published )
so the data is already optimized for your WHERE clause, and pre-sorted to help your ORDER BY.
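For example (the index name is illustrative):
ALTER TABLE `posts` ADD INDEX `feed_published` (`feed_id`, `published`);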
For read queries with a lot of demand, InnoDB is very inefficient. I recommend using a NoSQL database, but if you don't want to, or the cost of the change is too high... you can try this:
1) Like Sallar Kaboli told you, you have to index your tables on the columns used in JOIN queries. For example:
CREATE INDEX index1 ON newsletter_post (post_id);
2) Use only important columns for JOINs.
I mean, you should select only the columns you actually need in the SELECT part of the query.
I hope this is helpful.
To complement the other answers, I suggest changing these types on the posts table:
1) Change feed_id to a smaller integer type such as smallint. Do you really have more than 65,535 feeds? (Note that int(4) would not help: the number in parentheses is only a display width, not a storage size.)
2) Change is_unread to bit instead of int(1). This may not improve the query given in the question, but going by the field name, the correct type is bit.
A further improvement along these lines: never use the default int(11) for numeric or ID fields; assign more specific types. Using smaller types will also make your indexes smaller. I don't think you need more than a mediumint for the id fields.
For example, indexing and querying a smallint column is faster than an int(11) one.
Please create the following indexes on:
1) `post_id` in `category_post`
2) `post_id` in `newsletter_post`
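For example (the index names are illustrative):
ALTER TABLE `category_post` ADD INDEX `category_post_post_id` (`post_id`);
ALTER TABLE `newsletter_post` ADD INDEX `newsletter_post_post_id` (`post_id`);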

Why do I HAVE to optimize tables?

I have a pretty big table which contains about 3 million records.
When running a very simple query, joining this table on a few others (all with indexes and/or primary keys), the query will take about 25 seconds to complete!
The value of "Handler_read_next" is about 7 million!
Number of requests to read the next row in key order, incremented if you are querying an index column with a range constraint or if you are doing an index scan.
This problem has only started since this table began to grow big.
Now if I do an "optimize tables" on this table, the query will run in about 0.02 seconds and "Handler_read_next" will have a value of about 1500.
How can the difference be so extreme, and do I really have to set up a scheduled query optimizing this table once a week or so? Even so, I would like to know the reason behind this and why MySQL behaves like this. Sure, rows are deleted and updated a lot in this table, but should it get so badly fragmented in only one week that the query goes from 0.02 sec to 25 sec?
Edit: After request, here comes the query in question:
SELECT *
FROM budget_expenses
JOIN budget_categories
ON budget_categories.BudgetAreaId = budget_expenses.BudgetAreaId
AND budget_categories.BudgetCategoryId = budget_expenses.BudgetCategoryId
LEFT JOIN budget_types
ON budget_types.BudgetAreaId = budget_expenses.BudgetAreaId
AND budget_types.BudgetCategoryId = budget_expenses.BudgetCategoryId
AND budget_types.BudgetTypeId = budget_expenses.BudgetTypeId
WHERE budget_expenses.BudgetId = 1
AND budget_expenses.ExpenseDate >= '2012-11-25'
AND budget_expenses.ExpenseDate <= '2012-12-24'
AND budget_expenses.BudgetAreaId = 2
ORDER BY budget_expenses.ExpenseDate DESC,
budget_expenses.ExpenseTime IS NULL ASC,
budget_expenses.ExpenseTime DESC
(BudgetAreaId, BudgetCategoryId) is the primary key in budget_categories and (BudgetAreaId, BudgetCategoryId, BudgetTypeId) is the primary key in budget_types. In budget_expenses these 3 columns are indexed, and ExpenseDate also has an index. This query returns about 20 rows.
Show create table:
CREATE TABLE `budget_areas` (
`BudgetAreaId` int(11) NOT NULL,
`Name` varchar(255) DEFAULT NULL,
PRIMARY KEY (`BudgetAreaId`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `budget_categories` (
`BudgetAreaId` int(11) NOT NULL,
`BudgetCategoryId` int(11) NOT NULL AUTO_INCREMENT,
`Name` varchar(255) DEFAULT NULL,
`SortOrder` int(11) DEFAULT NULL,
PRIMARY KEY (`BudgetAreaId`,`BudgetCategoryId`),
KEY `BudgetAreaId` (`BudgetAreaId`,`BudgetCategoryId`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
CREATE TABLE `budget_types` (
`BudgetAreaId` int(11) NOT NULL,
`BudgetCategoryId` int(11) NOT NULL,
`BudgetTypeId` int(11) NOT NULL,
`Name` varchar(255) DEFAULT NULL,
`SortId` int(11) DEFAULT NULL,
PRIMARY KEY (`BudgetAreaId`,`BudgetCategoryId`,`BudgetTypeId`),
KEY `BudgetAreaId` (`BudgetAreaId`,`BudgetCategoryId`,`BudgetTypeId`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
CREATE TABLE `budget_expenses` (
`ExpenseId` int(11) NOT NULL AUTO_INCREMENT,
`BudgetId` int(11) NOT NULL,
`TempId` int(11) DEFAULT NULL,
`BudgetAreaId` int(11) DEFAULT NULL,
`BudgetCategoryId` int(11) DEFAULT NULL,
`BudgetTypeId` int(11) DEFAULT NULL,
`Company` varchar(255) DEFAULT NULL,
`ImportCompany` varchar(255) DEFAULT NULL,
`Sum` double(50,2) DEFAULT NULL,
`ExpenseDate` date DEFAULT NULL,
`ExpenseTime` time DEFAULT NULL,
`Inserted` datetime DEFAULT NULL,
`Changed` datetime DEFAULT NULL,
`InsertType` int(1) DEFAULT NULL,
`AccountId` int(11) DEFAULT NULL,
`BankCardId` int(11) DEFAULT NULL,
PRIMARY KEY (`ExpenseId`),
KEY `BudgetId` (`BudgetId`),
KEY `AccountId` (`AccountId`),
KEY `Company` (`Company`) USING BTREE,
KEY `ExpenseDate` (`ExpenseDate`),
KEY `BudgetAreaId` (`BudgetAreaId`),
KEY `BudgetCategoryId` (`BudgetCategoryId`),
KEY `BudgetTypeId` (`BudgetTypeId`),
CONSTRAINT `budget_expenses_ibfk_1` FOREIGN KEY (`BudgetId`) REFERENCES `budgets` (`BudgetId`)
) ENGINE=InnoDB AUTO_INCREMENT=3604462 DEFAULT CHARSET=latin1
After I copy-pasted this, I changed the budget_categories table from MyISAM to InnoDB.
Edit: The change from MyISAM to InnoDB didn't make any difference. The query is now very slow again, just 12 hours after I optimized the budget_expenses table!
Here is the explain for the query which now takes about 9 seconds:
http://jsfiddle.net/dmVPY/1/
Ahhh MyISAM....
Try changing the table type (aka 'storage engine') to InnoDB instead.
If you do this, make sure innodb_buffer_pool_size in your my.cnf is a sensible value - the default is too small.
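For example, in my.cnf (the 1G figure is only a placeholder; a common rule of thumb for a dedicated database server is roughly 50-70% of available RAM):
[mysqld]
innodb_buffer_pool_size = 1G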