Indexing needs to be sped up - mysql

I have a table with the following details:
CREATE TABLE `test` (
`seenDate` datetime NOT NULL DEFAULT '0001-01-01 00:00:00',
`corrected_test` varchar(45) DEFAULT NULL,
`corrected_timestamp` timestamp NULL DEFAULT NULL,
`unable_to_correct` tinyint(1) DEFAULT '0',
`fk_zone_for_correction` int(11) DEFAULT NULL,
PRIMARY KEY (`sightinguid`),
KEY `corrected_test` (`corrected_test`),
KEY `idx_seenDate` (`seenDate`),
KEY `idx_corrected_test_seenDate` (`corrected_test`,`seenDate`),
KEY `zone_for_correction_fk_idx` (`fk_zone_for_correction`),
KEY `idx_corrected_test_zone` (`fk_zone_for_correction`,`corrected_test`,`seenDate`),
CONSTRAINT `zone_for_correction_fk` FOREIGN KEY (`fk_zone_for_correction`) REFERENCES `zone_test` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I am then using the following query:
SELECT
*
FROM
test
WHERE
fk_zone_for_correction = 1
AND (unable_to_correct = 0
OR unable_to_correct IS NULL)
AND (corrected_test = ''
OR corrected_test IS NULL)
AND (last_accessed_timestamp IS NULL
OR last_accessed_timestamp < (NOW() - INTERVAL 30 MINUTE))
ORDER BY seenDate ASC
LIMIT 1
Here is a screenshot of the optimiser output. The ORDER BY is what slows things down, even though the table seems to me to be properly indexed and the expected index (idx_corrected_test_zone) is being selected. What can be done to improve it?

There is no INDEX that will help much.
This might help:
INDEX(fk_zone_for_correction, seenDate)
Both columns can perhaps be used -- the first for filtering, the second for avoiding having to sort. But it could backfire if the one matching row is not found quickly, because the rows for that zone are then checked one by one in seenDate order.
The killer is OR. If you could avoid ever populating any of those 3 columns with NULL, then this might be better:
INDEX(fk_zone_for_correction, unable_to_correct, corrected_test, last_accessed_timestamp)
-- the range thing needs to be last
-- this index would do the filtering, but fail to help with `ORDER` and `LIMIT`.
Even though it is using idx_corrected_test_zone, it is probably not using more than the first two columns -- because of OR.
You have two cases of redundant indexes. For example, the first of these is the left part of the second; so the first is redundant and can be DROPped:
KEY `corrected_test` (`corrected_test`),
KEY `idx_corrected_test_seenDate` (`corrected_test`,`seenDate`),
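As a rough sketch, the two suggestions above (adding the two-column index and dropping the redundant single-column one) could be applied like this. The index name idx_zone_seenDate is made up for illustration; compare the EXPLAIN output on your own data before keeping either change.

```sql
-- Hypothetical index name; the column list comes from the suggestion above.
ALTER TABLE test
  ADD INDEX idx_zone_seenDate (fk_zone_for_correction, seenDate);

-- The index `corrected_test` is the left part of `idx_corrected_test_seenDate`,
-- so it is redundant and can go.
ALTER TABLE test
  DROP INDEX corrected_test;

-- Re-check the plan for the original query.
EXPLAIN
SELECT *
FROM test
WHERE fk_zone_for_correction = 1
  AND (unable_to_correct = 0 OR unable_to_correct IS NULL)
  AND (corrected_test = '' OR corrected_test IS NULL)
  AND (last_accessed_timestamp IS NULL
       OR last_accessed_timestamp < (NOW() - INTERVAL 30 MINUTE))
ORDER BY seenDate ASC
LIMIT 1;
```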

Related

MySQL estimates many more rows to examine than expected

I have a user_rates table with two foreign key references to users: user_id_owner and user_id_rated.
This is my CREATE TABLE statement:
CREATE TABLE `user_rates` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_id_owner` int(10) unsigned NOT NULL,
`user_id_rated` int(10) unsigned NOT NULL,
`value` int(11) NOT NULL COMMENT '0 - dislike, 1 - like',
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `user_rates_user_id_rated_foreign` (`user_id_rated`),
KEY `user_rates_user_id_owner_foreign` (`user_id_owner`),
CONSTRAINT `user_rates_user_id_owner_foreign` FOREIGN KEY (`user_id_owner`) REFERENCES `users` (`id`),
CONSTRAINT `user_rates_user_id_rated_foreign` FOREIGN KEY (`user_id_rated`) REFERENCES `users` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1825767 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
When I execute this query:
EXPLAIN SELECT
user_id_rated
FROM
`user_rates` AS ur
WHERE
ur.user_id_owner = 10101;
EXPLAIN shows an estimate of 107000 rows to examine, but the query returns only 60000.
Can you explain why it examines so many rows when the comparison uses the equality operator and the compared field is a foreign key?
EDIT
I am getting this on EXPLAIN
I also want to add several more WHERE conditions. In the end my query looks like this:
Explain SELECT
user_id_rated
FROM
`user_rates` AS ur
WHERE
ur.user_id_owner = 10101
AND (ur.value IN (1, 2, 3)
OR (ur.value = 0
AND ur.created_at > '2020-02-04 00:00:00'));
Output:
It would be nice if the query could be optimized further. I don't understand why the row estimate is not going down.
Steps I tried when optimizing:
Added a composite index on (user_id_owner, value, created_at).
But the row estimate is not going down; it is filtering even more rows.
Maybe I am indexing wrong? I really don't know how to build proper indexes. Sorry for the bad question, I am new here. Thanks in advance.
The "rows" is an estimate, often quite far off -- sometimes even worse than your example. The incorrectness of the estimate rarely impacts performance.
You can run ANALYZE TABLE tablename to improve the estimate. But it may still not be better.
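For the table in the question, refreshing the statistics and re-checking the estimate would look roughly like this (the "rows" figure may or may not change):

```sql
-- Recompute index statistics for the table.
ANALYZE TABLE user_rates;

-- Look at the "rows" column again; it remains an estimate.
EXPLAIN
SELECT user_id_rated
FROM user_rates AS ur
WHERE ur.user_id_owner = 10101;
```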
For the current query, use:
( SELECT user_id_rated
FROM `user_rates` AS ur
WHERE ur.user_id_owner = 10101
AND ur.value IN (1, 2, 3)
)
UNION ALL
( SELECT user_id_rated
FROM `user_rates` AS ur
WHERE ur.user_id_owner = 10101
AND ur.value = 0
AND ur.created_at > '2020-02-04 00:00:00'
);
And have the composite (and "covering") indexes:
INDEX(user_id_owner, value, user_id_rated)
INDEX(user_id_owner, value, created_at, user_id_rated)
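One way to create these two indexes is sketched below. The index names are invented for illustration; only the column lists come from the suggestion above.

```sql
-- Hypothetical index names; column order matters (equality columns first).
ALTER TABLE user_rates
  ADD INDEX idx_owner_value_rated (user_id_owner, value, user_id_rated),
  ADD INDEX idx_owner_value_created_rated (user_id_owner, value, created_at, user_id_rated);
```

Running EXPLAIN on each half of the UNION ALL afterwards should show "Using index" in the Extra column if the index is actually covering for that half.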
If there are other variations of the query, show us. As you may guess, the details are important.
(The simplified version of the query does not provide any useful information when discussing the real query.)

MySQL: why can't I update only one column?

I have a table:
CREATE TABLE `cold_water_volume_value` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`parameter_value_id` int(11) NOT NULL,
`time` timestamp(4) NOT NULL DEFAULT CURRENT_TIMESTAMP(4) ON UPDATE CURRENT_TIMESTAMP(4),
`value` double NOT NULL,
`device_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_cold_water_volume_value_id_device_time` (`parameter_value_id`,`device_id`,`time`),
KEY `idx_cold_water_volume_value_id_time` (`parameter_value_id`,`time`),
KEY `fk_cold_water_volume_value_device_id_idx` (`device_id`),
CONSTRAINT `fk_cold_water_volume_value_device_id` FOREIGN KEY (`device_id`) REFERENCES `device` (`id`) ON UPDATE SET NULL,
CONSTRAINT `fk_cold_water_volume_value_id` FOREIGN KEY (`parameter_value_id`) REFERENCES `cold_water_volume_parameter` (`id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=684740 DEFAULT CHARSET=utf8;
All rows have device_id = NULL. I want to update it with this statement:
UPDATE cold_water_volume_value SET device_id = 130101 WHERE parameter_value_id = 2120101;
But instead of just changing device_id from NULL to the given value for the chosen parameter_value_id, it sets the entire content of the time column to NOW() and the value column to some number (seemingly picked at random from previous values).
Why does this happen, and how do I do it the correct way?
time is automatically updated as per your schema.
`time` timestamp(4) NOT NULL DEFAULT CURRENT_TIMESTAMP(4) ON UPDATE CURRENT_TIMESTAMP(4)
                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
To get around that you can set time to itself in your update.
UPDATE cold_water_volume_value
SET device_id = 130101, time = time
WHERE parameter_value_id = 2120101;
But that clause is likely there to track the last time a row was updated. If so, it's working as intended; leave it to do its thing.
As for value, there might be an update trigger on it. Check with SHOW TRIGGERS and look for triggers on that table.
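A minimal sketch of both checks follows. The ALTER TABLE part is an assumption about what you might want, not something the schema requires; it only applies if time is not meant to track the last modification.

```sql
-- List triggers defined on the table (LIKE matches the table name, not the trigger name).
SHOW TRIGGERS LIKE 'cold_water_volume_value';

-- Optional: redefine the column without the ON UPDATE clause,
-- so that later UPDATEs no longer touch `time` automatically.
ALTER TABLE cold_water_volume_value
  MODIFY `time` timestamp(4) NOT NULL DEFAULT CURRENT_TIMESTAMP(4);
```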
Your device_id is updated using content of time probably because in your index definition you mixed datatypes. It's worth noting that you should not mix datatypes especially on where clause when indexing.
Try to separate your indexes for example:
KEY idx_cold_water_volume_value_id_device_time (time),
KEY idx_cold_water_volume_value_id_device (parameter_value_id,device_id),
Try above statements for your definition and run query again.
It makes sense for the indexed column to have the same datatypes.
e.g. parameter_value_id and device_id

Optimise mysql where, group by, scanning too many rows

Table representing statuses. User can re-share a status as on FB, therefore the original_id.
CREATE TABLE `user_status` (
`status_id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`destination_user_id` int(11) NOT NULL,
`original_id` int(11) DEFAULT NULL,
`type` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`status_id`),
KEY `IDX_1E527E21A76ED395` (`user_id`),
KEY `IDX_1E527E21C957ECED` (`destination_user_id`),
KEY `core_index` (`destination_user_id`,`original_id`),
CONSTRAINT `FK_1E527E21A76ED395` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`),
CONSTRAINT `FK_1E527E21C957ECED` FOREIGN KEY (`destination_user_id`) REFERENCES `users` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=161362 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
I am trying to optimise a query for newsfeed (I removed everything unnecessary just to be able to optimise the core of the query but it's still simply slow).
Query:
EXPLAIN SELECT MAX(us.status_id)
FROM user_status us
WHERE us.destination_user_id IN (25,30,31,32,33,34,35,36,37,38,39,40,42,43,44,46,49,50,51,52,53,55,56,57,58,59,60,62,64,66,68,74,78,79,81,88,91,92,94,96,98,99,100,101,102,106,110,112,113,114,117,124,128,129,133,138,140,144,149,150,151,154,155,156,158,159,160,164,170,174,175,180,184,186,187,210,211,222,225,227,228,231,234,235,236,237,240,264,269,271,276,282,287,289,295,297,298,301,302,311,315,318,322,326,328,345,350,379,396,398,403,404,418,426,428,431,449,460,471,476,477,495,496,506,538,539,540,542,546,551,554,557,559,561,564,571,572,575,585,586,588,590,616,617,624,629,630,641,645,649,654,655,656,657,658,659,660,662,663,673,685,690,693,696,698,724,728,734,737,746,757,760,762,769,791,797,808,829,833,841,857,858,865,878,879,881,888,889,898,919,921,932,937,944,949,950,958,961,965,966,974,980,986,994,996,1005,1012,1013,1019,1020,1027,1044,1062,1079,1081,1097,1121,1122,1131,1140,1174,1178,1199,1214,1219,1221,1259,1261,1262,1268,1277,1282,1294,1300,1307,1320,1330,1331,1333,1336,1350,1361,1371,1388,1393,1440,1464,1482,1497,1507,1509,1511,1513,1514,1525,1537,1558,1569,1572,1573,1577,1584,1588,1591,1593,1627,1644,1645,1666,1688,1716,1729,1735,1751,1756,1803,1818,1828,1867,1871,1876,1914,1935,2038,2047,2058,2072,2074,2085,2106,2153,2168,2197,2232,2279,2355,2359,2511,2560,2651,2773,2803,2812,2818,2829,2835,2841,2865,2891,3032,3051,3095,3100,3148,3412,3476,3578,3623,3808,3853,3968,3976,3992,4045,4047,4069,4077,4119,4156,4237,4271,4280,4285,4337,4348,4644,4711,4872,4898,5084,5108,5110,5248,5254,5266,5268,5315,5318,5553,5716,5744,5768,5782,5784,5794,5815,5883,5920,5921,5985,5987,6016,6070,6364,7067,7522,7571,7733,7800,8259,8421,8640,9743,10039,11900,12344,12794,13419,13468,13548,13778,13829,13892,13902,13910,13976,13977,14042,14056,14171,14175,14176,14210,14255,14258,14279,14301,14343,14394,14465,14501,14538,14650,14656,14657,14805,14807,14813,14970,14975,15110,15174,15277,15284,15306,15354,15404,15649,15710,15776,16084,16099,14752,16516,1130,9770,1127,14200,13950,15842,16406,1561
4,16566,16209,16672,13887,16122,14857,16877,10093,15752,16131,17618,17767,5783,17867,16081,18224,6972,14273,18471,15403,16261,6641,18669,15153,18708,18534,17447,18843,18840,27,61,18656,18336,18006,15337,17197,18999,14360,19023,19002,16856,2885,17237,16560,15575,16297,11199,17836,14313,759,18403,19421,19514,2828,14562,1792,18131,19703,1280,18314,15944,17078,18316,19695,20017,16493,19566,17028,19104,17518,2045,16312,15508,20092,5060,18207,1773,17129,17154,18786,17077,15155,17640,2845,19480,20943,107,2775,21247,3989,20292,19077,20046,18230,18241,18102,19225,
14230,21011,5765,15344,21732,11249,15532,14105,4136,17373,14612,17944,17040,15505,17528,20461,22200,14059,11701,19410,3085,12180,22730,22631,17673,2820,20826,21895,23992,24080,24249,25144,25146,25171,25177,25181,25222,25223,25232,25245,25248,25250,25252,25255,25264,25267,25276,25279,25280,25284,25294,25298,25300,25312,25324,25332,25359,25373,25374,25381,25402,25412,25430,25434,25437,25442,25444,25446,25454,25465,25474,25486,25490,25491,25494,25535,25540,25549,25555,25568,25671,25711,25713,25714,25722,25737,25755,25768,25774,
25783,25784,25839,25854,25886,25889,25891,25913,25926,25956,25967,26026,26043)
GROUP BY us.original_id
ORDER BY us.status_id DESC
LIMIT 0,10;
Explain of the query:
Execution time: 10 rows in set (0,41 sec), MySQL 5.7 (strict mode turned off)
IMO a table as small as 100k rows should perform much better. I tried changing the indexes in various ways, but they seem to be set up properly.
Any idea how I could get this query down to 0.0x or 0.1x seconds?
Update
The linked duplicate is not related to my issue and shouldn't be linked, IMO.
Removing the unnecessary extra join resolved the issue.
Now it works as it is supposed to, using a tight index scan: http://dev.mysql.com/doc/refman/5.7/en/group-by-optimization.html
I can't get rid of the "Using temporary; Using filesort" even if I change the ORDER BY to "us.original_id", but the execution time is now as expected: 0.08
Remove the two extra tables, since us seems to be the only one relevant.
It is not valid to ORDER BY status_id, since it is not in the GROUP BY, nor (technically) in the SELECT.
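A sketch of a form that sorts by the aggregate itself instead. The IN list is shortened for readability and the alias latest_status_id is invented here:

```sql
SELECT MAX(us.status_id) AS latest_status_id
FROM user_status us
WHERE us.destination_user_id IN (25, 30, 31)   -- shortened list
GROUP BY us.original_id
ORDER BY latest_status_id DESC                 -- order by the aggregate, not the raw column
LIMIT 0, 10;
```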

MySQL SELECT returns wrong results

I'm working with MySQL 5.7. I created a table with a virtual column (not stored) of type DATETIME with an index on it. While I was working on it, I noticed that order by was not returning all the data (some data I was expecting at the top was missing). Also the results from MAX and MIN were wrong.
After I ran
ANALYZE TABLE
CHECK TABLE
OPTIMIZE TABLE
the results were correct. I guess there was an issue with the index data, so I have a few questions:
When and why could this happen?
Is there a way to prevent it?
Among the three commands I ran, which is the correct one to use?
I'm worried that this could happen again in the future without my noticing.
EDIT:
As requested in the comments, I have added the table definition:
CREATE TABLE `items` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned DEFAULT NULL,
`image` json DEFAULT NULL,
`status` json DEFAULT NULL,
`status_expired` tinyint(1) GENERATED ALWAYS AS (ifnull(json_contains(`status`,'true','$.expired'),false)) VIRTUAL COMMENT 'used for index: it checks if status contains expired=true',
`lifetime` tinyint(4) NOT NULL,
`expiration` datetime GENERATED ALWAYS AS ((`create_date` + interval `lifetime` day)) VIRTUAL,
`last_update` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`create_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `expiration` (`status_expired`,`expiration`) USING BTREE,
CONSTRAINT `ts_competition_item_ibfk_2` FOREIGN KEY (`user_id`) REFERENCES `ts_user_core` (`user_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1312459 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED
Queries that were returning the wrong results:
SELECT * FROM items ORDER BY expiration DESC;
SELECT max(expiration),min(expiration) FROM items;
Thanks
TLDR;
The trouble is that your data comes from virtual columns materialized via indexes. The CHECK, OPTIMIZE and ANALYZE operations you are running force the indexes to be synced and fix any errors. That gives you the correct results from then on, at least until the index gets out of sync again.
Why it may happen
Many of the problems are caused by issues with your table design. Let's start with:
`status_expired` tinyint(1) GENERATED ALWAYS AS (ifnull(json_contains(`status`,'true','$.expired'),false)) VIRTUAL
No doubt this was created to overcome the fact that you cannot directly index a JSON column in MySQL. You have created a virtual column and indexed that instead. That's all very well, but this column can hold only one of two values, true or false, which means it has very poor cardinality. As a result, MySQL is unlikely to use this index for anything.
But we can see that you have combined the status_expired column with the expiration column when creating the index, perhaps with the idea of overcoming the poor cardinality mentioned above. But wait...
`expiration` datetime GENERATED ALWAYS AS ((`create_date` + interval `lifetime` day)) VIRTUAL,
Expiration is another virtual column. This has some repercussions.
When a secondary index is created on a generated virtual column,
generated column values are materialized in the records of the index.
If the index is a covering index (one that includes all the columns
retrieved by a query), generated column values are retrieved from
materialized values in the index structure instead of computed “on the
fly”.
Ref: https://dev.mysql.com/doc/refman/5.7/en/create-table-secondary-indexes.html#json-column-indirect-index
This is contrary to
VIRTUAL: Column values are not stored, but are evaluated when rows are
read, immediately after any BEFORE triggers. A virtual column takes no
storage.
Ref: https://dev.mysql.com/doc/refman/5.7/en/create-table-generated-columns.html
We create virtual columns on the sound principle that values generated by simple operations on other columns shouldn't be stored, to avoid redundancy. But by creating an index on such a column, we reintroduce that redundancy.
Proposed fixes
Based on the information provided, you don't really seem to need the status_expired column or even the expired column. An item that's past its expiry date is expired!
CREATE TABLE `items` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`user_id` bigint(20) unsigned DEFAULT NULL,
`image` json DEFAULT NULL,
`status` json DEFAULT NULL,
`lifetime` tinyint(4) NOT NULL,
`expire_date` datetime GENERATED ALWAYS AS ((`create_date` + interval `lifetime` day)) VIRTUAL,
`last_update` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`create_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `expiration` (`expire_date`) USING BTREE,
CONSTRAINT `ts_competition_item_ibfk_2` FOREIGN KEY (`user_id`) REFERENCES `ts_user_core` (`user_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1312459 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED
Simply compare the current date with the expire_date column in the above table when you need to find out which items have expired. The difference here is that instead of expired being a calculated item in every query, you calculate the expire_date once, when you create the record.
This makes your table a lot neater and your queries possibly faster.
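For example, finding expired items against the revised table could look like the sketch below. The selected columns are only an illustration.

```sql
-- Items whose expiry date has passed (column names as in the revised table above).
SELECT id, user_id, expire_date
FROM items
WHERE expire_date < NOW()
ORDER BY expire_date DESC;
```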

Flat MySQL table with enum-based filters is unexpectedly slow

I have a site where there is an activity feed, similar to the one social sites like Facebook have. It is a "newest first" list that describes actions taken by users. In production, there are about 200k entries in that table.
Since this is going to be asked anyway, I'll first share the full table structure:
CREATE TABLE `karmalog` (
`id` int(11) NOT NULL auto_increment,
`guid` char(36) default NULL,
`user_id` int(11) default NULL,
`user_name` varchar(45) default NULL,
`user_avat_url` varchar(255) default NULL,
`user_sec_id` int(11) default NULL,
`user_sec_name` varchar(45) default NULL,
`user_sec_avat_url` varchar(255) default NULL,
`event` enum('EDIT_PROFILE','EDIT_AVATAR','EDIT_EMAIL','EDIT_PASSWORD','FAV_IMG_ADD','FAV_IMG_ADDED','FAV_IMG_REMOVE','FAV_IMG_REMOVED','FOLLOW','FOLLOWED','UNFOLLOW','UNFOLLOWED','COM_POSTED','COM_POST','COM_VOTE','COM_VOTED','IMG_VOTED','IMG_UPLOAD','LIST_CREATE','LIST_DELETE','LIST_ADMINDELETE','LIST_VOTE','LIST_VOTED','IMG_UPD','IMG_RESTORE','IMG_UPD_LIC','IMG_UPD_MOD','IMG_GEO','IMG_UPD_MODERATED','IMG_VOTE','IMG_VOTED','TAG_FAV_ADD','CLASS_DOWN','CLASS_UP','IMG_DELETE','IMG_ADMINDELETE','IMG_ADMINDELETEFAV','SET_PASSWORD','IMG_RESTORED','IMG_VIEW','FORUM_CREATE','FORUM_DELETE','FORUM_ADMINDELETE','FORUM_REPLY','FORUM_DELETEREPLY','FORUM_ADMINDELETEREPLY','FORUM_SUBSCRIBE','FORUM_UNSUBSCRIBE','TAG_INFO_EDITED','IMG_ADDSPECIE','IMG_REMOVESPECIE','SPECIE_ADDVIDEO','SPECIE_REMOVEVIDEO','EARN_MEDAL','JOIN') NOT NULL,
`event_type` enum('follow','tag','image','class','list','forum','specie','medal','user') NOT NULL,
`active` bit(1) NOT NULL,
`delete` bit(1) NOT NULL default '\0',
`object_id` int(11) default NULL,
`object_cache` text,
`object_sec_id` int(11) default NULL,
`object_sec_cache` text,
`karma_delta` int(11) NOT NULL,
`gold_delta` int(11) NOT NULL,
`newkarma` int(11) NOT NULL,
`newgold` int(11) NOT NULL,
`migrated` int(11) NOT NULL default '0',
`date_created` timestamp NOT NULL default '0000-00-00 00:00:00',
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `user_sec_id` (`user_sec_id`),
KEY `image_id` (`object_id`),
KEY `date_event` (`date_created`,`event`),
KEY `event` (`event`),
KEY `date_created` (`date_created`),
CONSTRAINT `karmalog_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `user` (`id`) ON DELETE SET NULL,
CONSTRAINT `karmalog_ibfk_2` FOREIGN KEY (`user_sec_id`) REFERENCES `user` (`id`) ON DELETE SET NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Before optimizing this table, my query had 5 joins and I ran into slow query times. I have denormalized all of that data, so that not a single join is left. The table and the query are now both flat.
As you can see in the table design, there's an "event" field which is an enum, holding a few dozen possible values. Throughout the site, I show activity feeds based on specific event types. Typically that query looks like this:
SELECT * FROM karmalog as k
WHERE k.event IN ($events) AND k.delete=0
ORDER BY k.date_created DESC, k.id DESC
LIMIT 0,30
What this query does is to find the latest 30 entries in the total set that match any of the events passed in $events, which can be multiple.
Due to removing the joins and having indices on most fields, I was expecting this to perform very well, but it doesn't. On 200k entries, it still takes over 3 seconds and I don't understand why.
Regarding solutions, I know I could archive older entries or partition the table per event type, but that will have quite a code impact, and I first would like to understand why the above is so slow.
As a temporary work-around, I'm now doing this:
SELECT * FROM
(SELECT * FROM karmalog ORDER BY date_created DESC, id DESC LIMIT 0,1000) as karma
WHERE karma.event IN ($events) AND karma.delete=0
LIMIT $page,$pagesize
What this does is limit the base set being searched to the latest 1000 entries only, hoping that 30 entries matching the filters I pass in can be found among them. It's not very robust, though. It will not work for rarer events, and it causes pagination issues.
Therefore, I would first like to get to the root cause of why my initial query is slow, contrary to my expectation.
Edit: I was asked to share the execution plan. Here's the test query:
EXPLAIN SELECT * FROM karmalog
WHERE event IN ('FAV_IMG_ADD','FOLLOW','COM_POST','IMG_VOTE','LIST_VOTE','JOIN','CLASS_UP','LIST_CREATE','FORUM_REPLY','FORUM_CREATE','FORUM_SUBSCRIBE','IMG_GEO','IMG_ADDSPECIE','SPECIE_ADDVIDEO','EARN_MEDAL') AND karmalog.delete=0
ORDER BY date_created DESC, id DESC
LIMIT 0,36
Execution plan:
id = 1
select_type = SIMPLE
table = karmalog
type = range
possible_keys = event
key = event
key_len = 1
ref = NULL
rows = 80519
Extra = Using where; Using filesort
I'm not sure how to read into the above, but I do know that the sort clause really seems to kill this query. With this sorting, it takes 4.3 secs, without 0.03 secs.
SELECT * sometimes slows down ordered queries by a huge amount, so let's start by refactoring your query as follows:
SELECT k.*
FROM karmalog AS k
JOIN (
SELECT id
FROM karmalog
WHERE event IN ($events)
AND `delete` = 0
ORDER BY date_created DESC, id DESC
LIMIT 0,30
) AS m ON k.id = m.id
ORDER BY k.date_created DESC, k.id DESC
This will do your ORDER BY ... LIMIT operation without having to haul the whole table around in the sorting phase. Finally it will look up the appropriate thirty rows from the original table and sort just those again. This might save a whole lot of I/O and in-memory data shuffling.
Second, if id column values are assigned in ascending order as records are inserted, then the use of date_created in your ORDER BY operation is redundant. But MySQL doesn't know that, so leaving it out might help. This will be true if you always use the current date when inserting, and never update the dates.
Third, you might be able to use a compound covering index for the selection (inner) query. This is an index that contains all the fields you need. When you use a covering index, the whole query can be satisfied from the index, and there's no need to bounce back to the original table. This saves disk access time.
Try this compound covering index: (delete, event, id). If you decide you can't get rid of the use of date_created in your ordering, try this instead: (delete, event, date_created, id)
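A sketch of creating the second variant and checking whether the inner query uses it. The index name is invented, and `delete` needs backticks because it is a reserved word. The event list is shortened for illustration.

```sql
-- Hypothetical index name; column order follows the suggestion above.
ALTER TABLE karmalog
  ADD INDEX idx_delete_event_date_id (`delete`, `event`, `date_created`, `id`);

-- "Using index" in the Extra column would indicate the index covers the inner query.
EXPLAIN
SELECT id
FROM karmalog
WHERE `event` IN ('FAV_IMG_ADD', 'FOLLOW', 'COM_POST')   -- shortened list
  AND `delete` = 0
ORDER BY date_created DESC, id DESC
LIMIT 0, 30;
```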
Add a compound index over the two relevant columns. In your table, you can do that by specifying e.g.
KEY `date_created` (`date_created`, `event`)
This key can still be used to satisfy plain old date_created range searches. In addition, the event data is included as well, so the database will be able to identify the relevant rows by looking only at the index.
If you want, you can try the other order as well: first event and then date. This might allow some optimization if there are many event types but your filter only contains few. On the other hand, I'm not sure the system will be able to make use of the LIMIT clause in this case, so I'm not certain that this other order will be any help at all.
Edit: I completely missed that your date_event index already has this info. According to your execution plan, though, that one isn't used. Looks like the optimizer is getting things wrong. You could try removing the event index, and perhaps the date index as well, and see what happens then.
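Before actually dropping anything, one way to see what the optimizer does without the event index is an index hint. This is a substitute for the drop suggested above, not the same thing, and the event list is shortened for illustration:

```sql
-- Ask the optimizer to plan the query as if the `event` index did not exist.
EXPLAIN
SELECT *
FROM karmalog IGNORE INDEX (`event`)
WHERE `event` IN ('FAV_IMG_ADD', 'FOLLOW', 'COM_POST')   -- shortened list
  AND karmalog.`delete` = 0
ORDER BY date_created DESC, id DESC
LIMIT 0, 36;
```

If the plan and timing look better, dropping the index for real is the follow-up step.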