I have a MySQL table that is filled with mails from a postfix mail log. The table is updated very often, some times multiple times per second. Here's the SHOW CREATE TABLE output:
CREATE TABLE `postfix_mails` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`mail_id` varchar(20) COLLATE utf8_danish_ci NOT NULL,
`host` varchar(30) COLLATE utf8_danish_ci NOT NULL,
`queued_at` datetime NOT NULL COMMENT 'When the message was received by the MTA',
`attempt_at` datetime NOT NULL COMMENT 'When the MTA last attempted to relay the message',
`attempts` smallint(5) unsigned NOT NULL,
`from` varchar(254) COLLATE utf8_danish_ci DEFAULT NULL,
`to` varchar(254) COLLATE utf8_danish_ci NOT NULL,
`source_relay` varchar(100) COLLATE utf8_danish_ci DEFAULT NULL,
`target_relay` varchar(100) COLLATE utf8_danish_ci DEFAULT NULL,
`target_relay_status` enum('sent','deferred','bounced','expired') COLLATE utf8_danish_ci NOT NULL,
`target_relay_comment` varchar(4098) COLLATE utf8_danish_ci NOT NULL,
`dsn` varchar(10) COLLATE utf8_danish_ci NOT NULL,
`size` int(11) unsigned NOT NULL,
`delay` float unsigned NOT NULL,
`delays` varchar(50) COLLATE utf8_danish_ci NOT NULL,
`nrcpt` smallint(5) unsigned NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `mail_signature` (`host`,`mail_id`,`to`),
KEY `from` (`from`),
KEY `to` (`to`),
KEY `source_relay` (`source_relay`),
KEY `target_relay` (`target_relay`),
KEY `target_relay_status` (`target_relay_status`),
KEY `mail_id` (`mail_id`),
KEY `last_attempt_at` (`attempt_at`),
KEY `queued_at` (`queued_at`)
) ENGINE=InnoDB AUTO_INCREMENT=111592 DEFAULT CHARSET=utf8 COLLATE=utf8_danish_ci
I want to know how many mails were relayed through a specific host on a specific date, so I'm using this query:
SELECT COUNT(*) as `count`
FROM `postfix_mails`
WHERE `queued_at` LIKE '2016-04-11%'
AND `host` = 'mta03'
The query takes between 100 and 110 ms.
Currently the table contains about 70 000 mails, and the query returns around 31 000. This is only a couple of days' worth of mails, and I plan to keep at least a month. The query cache doesn't help much because the table is getting updated constantly.
I have tried doing this instead:
SELECT SQL_NO_CACHE COUNT(*) as `count`
FROM `postfix_mails`
WHERE `queued_at` >= '2016-04-11'
AND `queued_at` < '2016-04-12'
AND `host` = 'mta03'
But the query takes the exact same time to run. I have made these changes to the MySQL configuration:
[mysqld]
query_cache_size = 128M
key_buffer_size = 256M
read_buffer_size = 128M
sort_buffer_size = 128M
innodb_buffer_pool_size = 4096M
And confirmed that they are all in effect (SHOW VARIABLES) but the query doesn't run any faster.
Am I doing something stupid that makes this query take this long? Can you spot any obvious or non-obvious ways to make it faster? Is there another database engine that works better than InnoDB in this scenario?
mysql> EXPLAIN SELECT SQL_NO_CACHE COUNT(*) as `count`
-> FROM `postfix_mails`
-> WHERE `queued_at` >= '2016-04-11'
-> AND `queued_at` < '2016-04-12'
-> AND `host` = 'mta03';
+----+-------------+---------------+------+--------------------------+----------------+---------+-------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+------+--------------------------+----------------+---------+-------+-------+-------------+
| 1 | SIMPLE | postfix_mails | ref | mail_signature,queued_at | mail_signature | 92 | const | 53244 | Using where |
+----+-------------+---------------+------+--------------------------+----------------+---------+-------+-------+-------------+
1 row in set (0.00 sec)
queued_at is a datetime value. Don't use LIKE. That converts it to a string, preventing the use of indexes and imposing a full-table scan. Instead, you want an appropriate index and to fix the query.
The query is:
SELECT COUNT(*) as `count`
FROM `postfix_mails`
WHERE `queued_at` >= '2016-04-11' AND `queued_at` < DATE_ADD('2016-04-11', interval 1 day) AND
`host` = 'mta03';
Then you want a composite index on postfix_mails(host, queued_at). The host column needs to be first.
Note: If your current version is counting 31,000 out of 70,000 emails, then an index will not be much help for that. However, this will make the code more scalable for the future.
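The range rewrite plus the composite index can be sketched end to end. This is a minimal illustration using SQLite (via Python's sqlite3) rather than MySQL, so planner output differs, but the principle — equality column (host) first, then the range column (queued_at) — is the same; the table is trimmed to the two relevant columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE postfix_mails (
                    id        INTEGER PRIMARY KEY,
                    host      TEXT NOT NULL,
                    queued_at TEXT NOT NULL)""")
conn.executemany(
    "INSERT INTO postfix_mails (host, queued_at) VALUES (?, ?)",
    [("mta03", "2016-04-11 08:00:00"),
     ("mta03", "2016-04-11 17:30:00"),
     ("mta03", "2016-04-12 00:00:00"),   # outside the half-open range
     ("mta01", "2016-04-11 09:00:00")])  # different host

# Equality column (host) first, then the range column (queued_at)
conn.execute("CREATE INDEX host_queued_at ON postfix_mails (host, queued_at)")

count, = conn.execute("""SELECT COUNT(*) AS count
                         FROM postfix_mails
                         WHERE host = 'mta03'
                           AND queued_at >= '2016-04-11'
                           AND queued_at <  '2016-04-12'""").fetchone()
print(count)  # 2

# The plan confirms an index seek instead of a full-table scan
plan = conn.execute("""EXPLAIN QUERY PLAN
                       SELECT COUNT(*) FROM postfix_mails
                       WHERE host = 'mta03'
                         AND queued_at >= '2016-04-11'
                         AND queued_at <  '2016-04-12'""").fetchall()
print(plan[0][3])
```

The half-open range (`>=` start, `<` next day) keeps the predicate sargable, which is exactly what `LIKE '2016-04-11%'` on a datetime column defeats.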
If you need your query to be really fast, you'll need to materialize it.
MySQL lacks native materialized views, so you'll have to create a table like this:
CREATE TABLE mails_host_day
(
host VARCHAR(30) NOT NULL,
day DATE NOT NULL,
mails BIGINT NOT NULL,
PRIMARY KEY (host, day)
)
and update it either in a trigger on postfix_mails or with a script once in a while:
INSERT
INTO mails_host_day (host, day, mails)
SELECT host, CAST(queued_at AS DATE), COUNT(*)
FROM postfix_mails
WHERE id > :last_sync_id
GROUP BY
host, CAST(queued_at AS DATE)
ON DUPLICATE KEY
UPDATE mails = mails + VALUES(mails)
This way, querying a host-day entry is a single primary key seek.
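The incremental rollup can be exercised in miniature. Again this is a sketch in SQLite via Python's sqlite3 (SQLite 3.24+ spells the upsert `ON CONFLICT ... DO UPDATE` rather than MySQL's `ON DUPLICATE KEY UPDATE`); the `:last_sync_id` bookkeeping is simulated with a plain variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE postfix_mails (
        id        INTEGER PRIMARY KEY,
        host      TEXT NOT NULL,
        queued_at TEXT NOT NULL);
    CREATE TABLE mails_host_day (
        host  TEXT NOT NULL,
        day   TEXT NOT NULL,
        mails INTEGER NOT NULL,
        PRIMARY KEY (host, day));
""")
conn.executemany(
    "INSERT INTO postfix_mails (host, queued_at) VALUES (?, ?)",
    [("mta03", "2016-04-11 08:00:00"),
     ("mta03", "2016-04-11 17:30:00"),
     ("mta01", "2016-04-11 09:00:00")])

last_sync_id = 0  # in production this would be persisted after each sync

# SQLite's equivalent of MySQL's INSERT ... ON DUPLICATE KEY UPDATE
conn.execute("""
    INSERT INTO mails_host_day (host, day, mails)
    SELECT host, DATE(queued_at), COUNT(*)
    FROM postfix_mails
    WHERE id > ?
    GROUP BY host, DATE(queued_at)
    ON CONFLICT (host, day) DO UPDATE SET mails = mails + excluded.mails
""", (last_sync_id,))

# Reading a host-day entry is now a single primary-key lookup
mails, = conn.execute(
    "SELECT mails FROM mails_host_day WHERE host = 'mta03' AND day = '2016-04-11'"
).fetchone()
print(mails)  # 2
```

After each sync the script would advance `last_sync_id` to `MAX(id)` so already-counted rows are never re-aggregated.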
Note that the trigger-based solution will affect DML performance, while the script-based solution will return slightly stale data.
However, you can improve the freshness of the script-based solution if you union the most recent live data with the stored results:
SELECT host, day, SUM(mails) AS mails
FROM (
    SELECT host, day, mails
    FROM mails_host_day
    UNION ALL
    SELECT host, CAST(queued_at AS DATE) AS day, COUNT(*) AS mails
    FROM postfix_mails
    WHERE id > :last_sync_id
    GROUP BY
        host, CAST(queued_at AS DATE)
) q
GROUP BY
    host, day
It's no longer a single index seek; however, if you run the update script often enough, there will be fewer live records to read.
You have a unique key on `host`, `mail_id`, and `to`; however, when the query engine tries to use that index, you aren't filtering on `mail_id` and `to`, so it may not be as efficient. A solution could be to add another index on just `host`, or to add AND `mail_id` IS NOT NULL AND `to` IS NOT NULL to your query to fully utilize the existing unique index.
You could use pagination to speed up queries in PHP which is usually how I resolve anything that contains a large amount of data - but this depends on your Table hierarchy.
Integrate your LIMIT in the SQL query.
PHP:
$stmt = $db->prepare("SELECT COUNT(*) as `count`
    FROM `postfix_mails`
    WHERE DATEDIFF(`queued_at`, '2016-04-11') = 0
    AND `mail_id` < :limit");
$stmt->execute(array(':limit' => $_POST['limit']));
foreach ($stmt as $row)
{
    // normal output
}
jQuery:
$(document).ready( function() {
    var starting = 1;
    $('#next').click( function() {
        starting = starting + 10;
        $.post('phpfilehere.php', { limit: starting })
            .done( function(data) {
                $('#mail-output').html(data);
            });
    });
});
Here, each page shows 10 emails; of course you can change this, modify it, and even add a search (I actually have an object I use for that in all my projects).
I just thought I'd share the idea - it also adds a real-time data flow to your site.
This was inspired by Facebook's scrolling "show more" - which really isn't hard to build but is such a good way to query a lot of data.
Related
I have a table defined as follows:
CREATE TABLE `book` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`provider_id` int(10) unsigned DEFAULT '0',
`source_id` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
`title` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`description` longtext COLLATE utf8_unicode_ci,
PRIMARY KEY (`id`),
UNIQUE KEY `provider` (`provider_id`,`source_id`),
KEY `idx_source_id` (`source_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1605425 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
when there are about 10 concurrent reads with the following SQL:
SELECT * FROM `book` WHERE (provider_id = '1' AND source_id = '1037122800') ORDER BY `book`.`id` ASC LIMIT 1
it becomes slow; it takes about 100 ms.
However, if I change it to
SELECT * FROM `book` WHERE (provider_id = '1' AND source_id = '221630001') LIMIT 1
then it is normal, it takes several ms.
I don't understand why adding ORDER BY id makes the query much slower - could anyone explain?
Try selecting only the desired columns (SELECT column_name, ...) instead of *, or refer to this:
Why is my SQL Server ORDER BY slow despite the ordered column being indexed?
I'm not a mysql expert, and not able to perform a detailed analysis, but my guess would be that because you are providing values for the UNIQUE KEY in the WHERE clause, the engine can go and fetch that row directly using an index.
However, when you ask it to ORDER BY the id column, which is a PRIMARY KEY, that changes the access path. The engine now guesses that since it has an index on id, and you want to order by id, it is better to fetch that data in PK order, which will avoid a sort. In this case though, it leads to a slower result, as it has to compare every row to the criteria (a table scan).
Note that this is just conjecture. You would need to EXPLAIN both statements to see what is going on.
A table with a few Million rows, something like this:
CREATE TABLE my_table (
`CONTVISITID` bigint(20) NOT NULL AUTO_INCREMENT,
`NODE_ID` bigint(20) DEFAULT NULL,
`CONT_ID` bigint(20) DEFAULT NULL,
`NODE_NAME` varchar(50) DEFAULT NULL,
`CONT_NAME` varchar(100) DEFAULT NULL,
`CREATE_TIME` datetime DEFAULT NULL,
`HITS` bigint(20) DEFAULT NULL,
`UPDATE_TIME` datetime DEFAULT NULL,
`CLIENT_TYPE` varchar(20) DEFAULT NULL,
`TYPE` bigint(1) DEFAULT NULL,
`PLAY_TIMES` bigint(20) DEFAULT NULL,
`FIRST_PUBLISH_TIME` bigint(20) DEFAULT NULL,
PRIMARY KEY (`CONTVISITID`),
KEY `cont_visit_contid` (`CONT_ID`),
KEY `cont_visit_createtime` (`CREATE_TIME`),
KEY `cont_visit_publishtime` (`FIRST_PUBLISH_TIME`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=57676834 DEFAULT CHARSET=utf8
I had a query that I have managed to optimize to the following departing from a flat select:
SELECT a.cont_id, SUM(a.hits)
FROM (
SELECT cont_id,hits,type,first_publish_time
FROM my_table
where create_time > '2017-03-10 00:00:00'
AND first_publish_time>1398310263000
AND type=1) as a group by a.cont_id
order by sum(HITS) DESC LIMIT 10;
Can this be further optimized?
Edit:
I started with a flat SELECT, as I mentioned before; by "flat select" I mean a single query without a derived subselect like my current one, i.e. the single select that someone responded with. A single select is twice as slow, so it's not viable in my case.
Edit2: I have a DBA friend who suggested me to change the query to this:
SELECT a.cont_id, SUM(a.hits)
FROM (
SELECT cont_id,hits
FROM my_table
where create_time > '2017-03-10 00:00:00'
AND first_publish_time>1398310263000
AND type=1) as a group by a.cont_id
order by sum(HITS) DESC LIMIT 10;
As I do not need the extra fields (type, first_publish_time) and the temporary table is smaller, this brings the query down to about 1/4 of the total time of the fastest version I had. He also suggested adding a composite index on (create_time, cont_id, hits). He says that with this index I will get really good performance, but I have not done that yet, as this is a production DB and the ALTER might affect replication. I will post results once it's done.
INDEX(type, first_publish_time)
INDEX(type, create_time)
Then do
SELECT cont_id, SUM(hits) AS tot_hits
FROM my_table
where create_time > '2017-03-10 00:00:00'
AND first_publish_time > 1398310263000
AND type = 1
group by cont_id
order by tot_hits DESC
LIMIT 10;
Start the index with any = filters (type, in this case); then you get one chance to use a range.
The reason for 2 indexes -- the optimizer will look at statistics and decide which looks better based on the values given.
Consider shrinking the BIGINTs (8 bytes) to some smaller INT type. Saving space will help speed, especially if the table is too big to be cached.
For further discussion, please provide EXPLAIN SELECT ...;.
In MySQL slow query log I have the following query:
SELECT * FROM `news_items`
WHERE `ctime` > 1465013901 AND `feed_id` IN (1, 2, 9) AND
`moderated` = '1' AND `visibility` = '1'
ORDER BY `views` DESC
LIMIT 5;
Here is the result of EXPLAIN:
+----+-------------+------------+-------+---------------------------------------------------------------------------------------+-------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------------------------------------------------------------------------------+-------+---------+------+------+-------------+
| 1 | SIMPLE | news_items | index | feed_id,ctime,ctime_2,feed_id_2,moderated,visibility,feed_id_3,cday_complex,feed_id_4 | views | 4 | NULL | 5 | Using where |
+----+-------------+------------+-------+---------------------------------------------------------------------------------------+-------+---------+------+------+-------------+
1 row in set (0.00 sec)
When I run this query manually, it takes like 0.00 sec but for some reason it appears in MySQL's slow log taking 1-5 seconds sometimes. I believe it happens when server is under high load.
Here is the table structure:
CREATE TABLE IF NOT EXISTS `news_items` (
`item_id` int(10) NOT NULL,
`category_id` int(10) NOT NULL,
`source_id` int(10) NOT NULL,
`feed_id` int(10) NOT NULL,
`title` varchar(255) CHARACTER SET utf8 NOT NULL,
`announce` varchar(255) CHARACTER SET utf8 NOT NULL,
`content` text CHARACTER SET utf8 NOT NULL,
`hyperlink` varchar(255) CHARACTER SET utf8 NOT NULL,
`ctime` varchar(11) CHARACTER SET utf8 NOT NULL,
`cday` tinyint(2) NOT NULL,
`img` varchar(100) CHARACTER SET utf8 NOT NULL,
`video` text CHARACTER SET utf8 NOT NULL,
`gallery` text CHARACTER SET utf8 NOT NULL,
`comments` int(11) NOT NULL DEFAULT '0',
`views` int(11) NOT NULL DEFAULT '0',
`visibility` enum('1','0') CHARACTER SET utf8 NOT NULL DEFAULT '0',
`pin` tinyint(1) NOT NULL,
`pin_dttm` varchar(10) CHARACTER SET utf8 NOT NULL,
`moderated` tinyint(1) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
The index named as "views" consists of 1 field only -- views.
I also have many other indexes consisting of (for example):
feed_id + views + visibility + moderated
moderated + visibility + feed_id + ctime
moderated + visibility + feed_id + views + ctime
I used fields in mentioned order because that was the only reason MySQL started to use them. However, I never got "Using where; using index" in EXPLAIN.
Any ideas on how to make EXPLAIN to show me "using index"?
If you change the storage engine to InnoDB and create the correct composite index, you can try this. The inner query only fetches the item_id values for the first 5 rows. LIMIT is applied only after the complete SELECT is done, so it's better to run that part without any big columns and then fetch the whole row only for those 5 ids:
SELECT idata.* FROM (
SELECT item_id FROM `news_items`
WHERE `ctime` > 1465013901 AND `feed_id` IN (1, 2, 9) AND
`moderated` = '1' AND `visibility` = '1'
ORDER BY `views` DESC
LIMIT 5 ) as i_ids
LEFT JOIN news_items AS idata ON idata.item_id = i_ids.item_id
ORDER BY `views` DESC;
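The pattern above — select only the ids under the LIMIT, then join back for the wide rows — can be sketched in miniature. This uses SQLite via Python's sqlite3 with a trimmed-down, hypothetical schema; an inner join suffices here since every id comes from the same table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE news_items (
                    item_id INTEGER PRIMARY KEY,
                    views   INTEGER NOT NULL,
                    content TEXT NOT NULL)""")
conn.executemany(
    "INSERT INTO news_items VALUES (?, ?, ?)",
    [(i, i * 10, "wide row payload %d" % i) for i in range(1, 8)])

# Step 1 (inner query): rank narrow rows and keep 5 ids.
# Step 2 (outer join): fetch the wide rows only for those ids.
rows = conn.execute("""
    SELECT idata.item_id, idata.views
    FROM (SELECT item_id FROM news_items
          ORDER BY views DESC
          LIMIT 5) AS i_ids
    JOIN news_items AS idata ON idata.item_id = i_ids.item_id
    ORDER BY idata.views DESC""").fetchall()
print([r[0] for r in rows])  # [7, 6, 5, 4, 3]
```

The win in MySQL comes from sorting and discarding only narrow index entries instead of full rows; the final lookup touches just the 5 survivors by primary key.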
If your table "also have many other indexes", why do they not show in the SHOW CREATE TABLE?
There are two ways that
WHERE `ctime` > 1465013901
AND `feed_id` IN (1, 2, 9)
AND `moderated` = '1'
AND `visibility` = '1'
ORDER BY `views` DESC
could use indexing:
INDEX(views) and hope that the desired 5 rows (see LIMIT) show up early.
INDEX(moderated, visibility, feed_id, ctime)
This last 'composite' index starts with the two columns (in either order) that are compared = constant, then moves on to IN and finally a "range" (ctime > const). Older versions won't get past IN; newer versions will leapfrog through the IN values and make use of the range on ctime. More discussion.
It is useless to include the ORDER BY columns in a composite index before all of the WHERE columns. However, it will not be useful to include views in your case, because of the "range" on ctime.
The tip about 'lazy evaluation' that Bernd suggests will also help.
I agree that InnoDB would probably be better. Conversion tips.
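The composite index suggested above can be sanity-checked on a toy table. This is a sketch in SQLite via Python's sqlite3 — planner details differ from MySQL and SQLite has no leapfrog optimization for the IN list, but the index is still chosen thanks to the two leading = columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE news_items (
                    item_id    INTEGER PRIMARY KEY,
                    feed_id    INTEGER NOT NULL,
                    ctime      INTEGER NOT NULL,
                    moderated  TEXT NOT NULL,
                    visibility TEXT NOT NULL,
                    views      INTEGER NOT NULL)""")
# = columns first (moderated, visibility), then IN (feed_id), then range (ctime)
conn.execute("""CREATE INDEX mod_vis_feed_ctime
                ON news_items (moderated, visibility, feed_id, ctime)""")
conn.executemany(
    "INSERT INTO news_items (feed_id, ctime, moderated, visibility, views) "
    "VALUES (?, ?, ?, ?, ?)",
    [(1, 1465013999, '1', '1', 50),
     (2, 1465014999, '1', '1', 90),
     (9, 1465013000, '1', '1', 70),   # ctime too old: filtered out
     (3, 1465015999, '1', '1', 99)])  # feed_id not in (1, 2, 9)

rows = conn.execute("""SELECT item_id, views FROM news_items
                       WHERE ctime > 1465013901
                         AND feed_id IN (1, 2, 9)
                         AND moderated = '1' AND visibility = '1'
                       ORDER BY views DESC
                       LIMIT 5""").fetchall()
print([v for _, v in rows])  # [90, 50]

plan = conn.execute("""EXPLAIN QUERY PLAN
                       SELECT item_id FROM news_items
                       WHERE ctime > 1465013901
                         AND feed_id IN (1, 2, 9)
                         AND moderated = '1' AND visibility = '1'""").fetchall()
print(plan[0][3])
```

Note the `ORDER BY views` still needs a sort here; the index only narrows the candidate rows, which is the trade-off discussed above.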
To answer your question: "using index" means that MySQL will use only the index to satisfy your query. To do this we would need a "covering" index (an index which "covers" the query), i.e. one which covers the "where", the "order by/group by", and all fields from the "select". However, you are doing "select *", so that will not be practical.
MySQL chooses the index on views because you have LIMIT 5 in the query. It does that because 1) the index is small and 2) it can avoid a filesort in this case.
I believe the problem is not with the index but rather with ENGINE=MyISAM. MyISAM uses table-level locks, so if you change news_items, the whole table is locked. I would suggest converting the table to InnoDB.
Another possibility is that if the table is large, the index on (views) may not be the best option.
If you use Percona Server you can enable slow log verbosity option and see the query plan for the slow query as described here: https://www.percona.com/doc/percona-server/5.5/diagnostics/slow_extended_55.html
I have an app developed over NodeJS on AWS that has a MySQL RDS database (server class: db.r3.large - Engine: InnoDB) associated. We are having a performance problem, when we execute simultaneous queries (at the same time), the database is returning the results after finishing the last query and not after each query is finished.
So, as an example: if we execute a process that has 10 simultaneous queries of 3 seconds each, we start receiving the results at approximately 30 seconds and we want to start receiving when the first query is finished (3 seconds).
It seems that the database is receiving the queries and queueing them up.
I'm kind of lost here, since I've changed several things (separate connections, connection pools, etc.) in the code and in the AWS settings, but none of it seems to improve the result.
TableA (13M records) schema:
CREATE TABLE `TableA` (
`columnA` int(11) NOT NULL AUTO_INCREMENT,
`columnB` varchar(20) DEFAULT NULL,
`columnC` varchar(15) DEFAULT NULL,
`columnD` varchar(20) DEFAULT NULL,
`columnE` varchar(255) DEFAULT NULL,
`columnF` varchar(255) DEFAULT NULL,
`columnG` varchar(255) DEFAULT NULL,
`columnH` varchar(10) DEFAULT NULL,
`columnI` bigint(11) DEFAULT NULL,
`columnJ` bigint(11) DEFAULT NULL,
`columnK` varchar(5) DEFAULT NULL,
`columnL` varchar(50) DEFAULT NULL,
`columnM` varchar(20) DEFAULT NULL,
`columnN` int(1) DEFAULT NULL,
`columnO` int(1) DEFAULT '0',
`columnP` datetime NOT NULL,
`columnQ` datetime NOT NULL,
PRIMARY KEY (`columnA`),
KEY `columnB` (`columnB`),
KEY `columnO` (`columnO`),
KEY `columnK` (`columnK`),
KEY `columnN` (`columnN`),
FULLTEXT KEY `columnE` (`columnE`)
) ENGINE=InnoDB AUTO_INCREMENT=13867504 DEFAULT CHARSET=utf8;
TableB (15M records) schema:
CREATE TABLE `TableB` (
`columnA` int(11) NOT NULL AUTO_INCREMENT,
`columnB` varchar(50) DEFAULT NULL,
`columnC` varchar(50) DEFAULT NULL,
`columnD` int(1) DEFAULT NULL,
`columnE` datetime NOT NULL,
`columnF` datetime NOT NULL,
PRIMARY KEY (`columnA`),
KEY `columnB` (`columnB`),
KEY `columnC` (`columnC`)
) ENGINE=InnoDB AUTO_INCREMENT=19153275 DEFAULT CHARSET=utf8;
Query:
SELECT COUNT(*) AS total
FROM TableA
WHERE TableA.columnB IN (
    SELECT TableB.columnC
    FROM TableB
    WHERE TableB.columnB = "3764301"
    AND TableB.columnC NOT IN (
        SELECT field
        FROM table
        WHERE table.field = 10
    )
    AND TableB.columnC NOT IN (
        SELECT field
        FROM table
        WHERE table.field = 10
    )
    AND TableB.columnC NOT IN (
        SELECT field
        FROM table
        WHERE table.field = 10
    )
    AND TableB.columnC NOT IN (
        SELECT field
        FROM table
        WHERE table.field = 10
    )
)
AND columnM > 2;
1 execution returns in 2 s.
10 executions return the first result in 20 s, and the other results after that.
To see that queries are running I'm using "SHOW FULL PROCESSLIST" and the queries are most of the time with state "sending data".
It is not a performance issue with the query itself; it is a problem of concurrent access to the database. Even a very simple query like "SELECT COUNT(*) FROM TableA WHERE columnM = 5" has the same problem.
UPDATE
Only for testing purposes, I reduced the query to a single subquery condition. Both results have 65k records.
-- USING IN
SELECT COUNT(*) as total
FROM TableA
WHERE TableA.columnB IN (
SELECT TableB.columnC
FROM TableB
WHERE TableB.columnB = "103550181"
AND TableB.columnC NOT IN (
SELECT field
FROM tableX
WHERE fieldX = 15
)
)
AND columnM > 2;
-- USING EXISTS
SELECT COUNT(*) as total
FROM TableA
WHERE EXISTS (
SELECT *
FROM TableB
WHERE TableB.columnB = "103550181"
AND TableA.columnB = TableB.columnC
AND NOT EXISTS (
SELECT *
FROM tableX
WHERE fieldX = 15
AND fieldY = TableB.columnC
)
)
AND columnM > 2;
-- Result
Query using IN : 1.7 sec
Query using EXISTS : 141 sec (:O)
Using IN or EXISTS the problem is the same: when I execute this query many times, the database develops a delay and the responses come back much later.
Example: if one query responds in 1.7 sec and I execute this query 10 times, the first result arrives in 20 sec.
Recommendation 1
Change the NOT IN ( SELECT ... ) to NOT EXISTS ( SELECT * ... ). (And you may need to change the WHERE clause a bit.)
AND TableB.columnC NOT IN (
SELECT field
FROM table
WHERE table.field = 10
-->
AND NOT EXISTS ( SELECT * FROM table WHERE field = TableB.columnC )
table needs an index on field.
IN ( SELECT ... ) performs very poorly. EXISTS is much better optimized.
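The rewrite keeps the same result set as long as the subquery can't return NULLs (NOT IN and NOT EXISTS diverge when it can). A toy equivalence check in SQLite via Python's sqlite3, with hypothetical table and column names (`blocked`, `field`) standing in for the real schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableB (columnB TEXT NOT NULL, columnC TEXT NOT NULL);
    CREATE TABLE blocked (field TEXT NOT NULL);     -- hypothetical lookup table
    CREATE INDEX blocked_field ON blocked (field);  -- the index EXISTS needs
""")
conn.executemany("INSERT INTO TableB VALUES (?, ?)",
                 [("3764301", "a"), ("3764301", "b"), ("other", "c")])
conn.execute("INSERT INTO blocked VALUES ('b')")

# Original shape: NOT IN ( SELECT ... )
not_in, = conn.execute("""
    SELECT COUNT(*) FROM TableB
    WHERE columnB = '3764301'
      AND columnC NOT IN (SELECT field FROM blocked)""").fetchone()

# Rewritten shape: NOT EXISTS with a correlated predicate
not_exists, = conn.execute("""
    SELECT COUNT(*) FROM TableB
    WHERE columnB = '3764301'
      AND NOT EXISTS (SELECT * FROM blocked
                      WHERE blocked.field = TableB.columnC)""").fetchone()

print(not_in, not_exists)  # 1 1
```

Both forms exclude the row whose columnC appears in the lookup table; the EXISTS form lets the optimizer probe the `blocked_field` index per row instead of materializing the subquery.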
Recommendation 2
To deal with the concurrency, consider doing SET SESSION TRANSACTION READ UNCOMMITTED before the query. This may keep one connection from interfering with another.
Recommendation 3
Show us the EXPLAIN, the indexes (SHOW CREATE TABLE) (what you gave is not sufficient), and the WHERE clauses so we can critique the indexes.
Recommendation 4
It might help for TableB to have a composite INDEX(ColumnB, ColumnC) in that order.
What I can see here is that a HUGE temporary table is being built for each query. Consider a different architecture.
I have a table using InnoDB that stores all messages sent by my system. Currently the table have 40 million rows and grows 3/4 million per month.
My query is basically to select messages sent from an user and within a data range. Here is a simplistic create table:
CREATE TABLE `log` (
`id` int(10) NOT NULL DEFAULT '0',
`type` varchar(10) NOT NULL DEFAULT '',
`timeLogged` int(11) NOT NULL DEFAULT '0',
`orig` varchar(128) NOT NULL DEFAULT '',
`rcpt` varchar(128) NOT NULL DEFAULT '',
`user` int(10) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `timeLogged` (`timeLogged`),
KEY `user` (`user`),
KEY `user_timeLogged` (`user`,`timeLogged`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Note: I have individual indexes too because of other queries.
Query looks like this:
SELECT COUNT(*) FROM log WHERE timeLogged BETWEEN 1282878000 AND 1382878000 AND user = 20
The issue is that this query takes from 2 minutes to 10 minutes, depending on the user and server load, which is far too long to wait for a page to load. I have the MySQL query cache enabled and cache in the application, but the problem is that when a user searches new ranges, it won't hit the cache.
My question are:
Would changing user_timeLogged index make any difference?
Is this a problem with MySQL and big databases? I mean, does Oracle or other DBs also suffer from this problem?
AFAIK, my indexes are correctly created and this query shouldn't take so long.
Thanks to anyone who can help!
you're using innodb but not taking full advantage of your innodb clustered index (primary key) as it looks like your typical query is of the form:
select <fields> from <table> where user_id = x and <datefield> between y and z
not
select <fields> from <table> where id = x
the following article should help you optimise your table design for your query.
http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/
If you've understood the article correctly, you should find yourself with something like the following:
drop table if exists user_log;
create table user_log
(
user_id int unsigned not null,
created_date datetime not null,
log_type_id tinyint unsigned not null default 0, -- 1 byte vs varchar(10)
...
...
primary key (user_id, created_date, log_type_id)
)
engine=innodb;
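The clustered-key idea can be approximated outside MySQL, too. In SQLite (via Python's sqlite3), a WITHOUT ROWID table stores rows physically in primary-key order, much like InnoDB's clustered index, so a (user_id, created_date) range touches contiguous rows. A minimal sketch of the design above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# WITHOUT ROWID keeps rows physically ordered by the primary key,
# similar to InnoDB's clustered index
conn.execute("""CREATE TABLE user_log (
                    user_id      INTEGER NOT NULL,
                    created_date TEXT NOT NULL,
                    log_type_id  INTEGER NOT NULL DEFAULT 0,
                    PRIMARY KEY (user_id, created_date, log_type_id)
                ) WITHOUT ROWID""")
conn.executemany("INSERT INTO user_log VALUES (?, ?, ?)",
                 [(4755, "2010-09-15 12:00:00", 1),
                  (4755, "2010-10-01 08:00:00", 2),
                  (4755, "2011-01-01 00:00:00", 1),   # outside the date range
                  (99,   "2010-09-20 09:00:00", 1)])  # different user

# The user + date-range query becomes a single contiguous primary-key scan
counter, = conn.execute("""
    SELECT COUNT(*) AS counter FROM user_log
    WHERE user_id = 4755
      AND created_date BETWEEN '2010-09-01 00:00:00'
                           AND '2010-11-30 00:00:00'""").fetchone()
print(counter)  # 2
```

Because all of one user's rows in a date range sit next to each other on disk, the count needs far fewer page reads than a secondary-index lookup per row — which is what the timings above reflect.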
Here's some query performance stats from the above design:
Counts
select count(*) as counter from user_log
counter
=======
37770394
select count(*) as counter from user_log where
created_date between '2010-09-01 00:00:00' and '2010-11-30 00:00:00'
counter
=======
35547897
User and date based queries (all queries run with cold buffers)
select count(*) as counter from user_log where user_id = 4755
counter
=======
7624
runtime = 0.215 secs
select count(*) as counter from user_log where
user_id = 4755 and created_date between '2010-09-01 00:00:00' and '2010-11-30 00:00:00'
counter
=======
7404
runtime = 0.015 secs
select
user_id,
created_date,
count(*) as counter
from
user_log
where
user_id = 4755 and created_date between '2010-09-01 00:00:00' and '2010-11-30 00:00:00'
group by
user_id, created_date
order by
counter desc
limit 10;
runtime = 0.031 secs
Hope this helps :)
COUNT(*) is not served from the cached table row count because you have a WHERE clause. Using EXPLAIN as @jason mentioned, try changing it to COUNT(id) and see if that helps.
I could be wrong, but I also think that your indexes have to be in the same order as your WHERE clause. Since your WHERE clause uses timeLogged before user, your index should be KEY `user_timeLogged` (`timeLogged`, `user`).
Again, EXPLAIN will tell you whether this index change makes a difference.