MySQL Composite Index Question - Performance issue - mysql

I have done some research on the index before posting the question here. So far I believe I have done this correctly, but for some reason, the performance of a query that returns around 2400 records has not been good.
Here is the table schema
CREATE TABLE `tblCheck` (
`id` VARCHAR(50) NULL DEFAULT NULL COLLATE 'latin1_swedish_ci',
`token` VARCHAR(50) NULL DEFAULT NULL COLLATE 'latin1_swedish_ci',
`domainId` VARCHAR(50) NULL DEFAULT NULL COLLATE 'latin1_swedish_ci',
`time` DATETIME NULL DEFAULT NULL,
`responseCode` INT(11) NULL DEFAULT NULL,
`totalTime` DECIMAL(10,2) NULL DEFAULT NULL,
`namelookupTime` INT(11) NULL DEFAULT NULL,
`connectTime` INT(11) NULL DEFAULT NULL,
`pretransferTime` INT(11) NULL DEFAULT NULL,
`startTransferTime` INT(11) NULL DEFAULT NULL,
`redirectTime` INT(11) NULL DEFAULT NULL,
`appconnectTime` INT(11) NULL DEFAULT NULL,
`responseText` TEXT(65535) NULL DEFAULT NULL COLLATE 'latin1_swedish_ci',
`agentId` VARCHAR(50) NULL DEFAULT NULL COLLATE 'latin1_swedish_ci',
`isHealthy` CHAR(50) NULL DEFAULT NULL COLLATE 'latin1_swedish_ci',
`ftp_connect_time` DECIMAL(10,4) NULL DEFAULT NULL,
`ftp_login_time` DECIMAL(10,4) NULL DEFAULT NULL,
`ftp_change_mode_time` DECIMAL(10,4) NULL DEFAULT NULL,
`ftp_list_time` DECIMAL(10,4) NULL DEFAULT NULL,
`syntheticToken` VARCHAR(50) NULL DEFAULT NULL COLLATE 'latin1_swedish_ci',
UNIQUE INDEX `id` (`id`) USING BTREE,
INDEX `domainId` (`domainId`) USING BTREE,
INDEX `deleteTime` (`time`) USING BTREE,
INDEX `SearchIndex` (`domainId`, `time`, `agentId`) USING BTREE
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
ROW_FORMAT=COMPACT
;
The Query
SELECT *
FROM `tblCheck`
WHERE (`time` BETWEEN '2020-05-04 22:15:04' AND '2020-05-05 22:15:04')
AND `domainId` = '03d4c1ce-8b13-11ea-abf5-124e96b5f417'
AND `agentId` != '145a-f6bb-11e8-983f-1231322cbdb6'
ORDER BY `time` DESC
;
/* Affected rows: 0 Found rows: 2,418 Warnings: 0 Duration for 1 query: 0.109 sec. (+ 10.360 sec. network) */
It returned 2418 rows but took almost 10s.
Running it with EXPLAIN
EXPLAIN SELECT *
FROM `tblCheck`
WHERE (`time` BETWEEN '2020-05-04 22:15:04' AND '2020-05-05 22:15:04')
AND `domainId` = '03d4c1ce-8b13-11ea-abf5-124e96b5f417'
AND `agentId` != '145a-f6bb-11e8-983f-1231322cbdb6'
ORDER BY `time` DESC
Returns this
This looks like it is using the index "SearchIndex". However, I can't figure out why it would take to process 10s for 2k rows

For this query:
SELECT *
FROM `tblCheck`
WHERE
`time` BETWEEN '2020-05-04 22:15:04' AND '2020-05-05 22:15:04'
AND `domainId` = '03d4c1ce-8b13-11ea-abf5-124e96b5f417'
AND `agentId` != '145a-f6bb-11e8-983f-1231322cbdb6'
ORDER BY `time` DESC
The right index would be: (domainId, agentId, time), or (domainId, time, agentId). You have an the second index in place, and the query plan shows that MySQL happily uses it.
Looking at the explain summary, you can see:
Duration for 1 query: 0.109 sec. (+ 10.360 sec. network)
The query runs fast in the database. The bottleneck is the network, that is the time taken to return the 2000+ rows from the database to the client. Not much can be done from database perspective. Speed up your network, or switch to a local database if you can.
As a side note: select * is a not good for performance; you should try and reduce the number of columns that the query returns (this might also reduce the amount that needs to transmitted over the network).

You don't have the primary key.
You only have a UNIQUE INDEX.

If you con't nee that bulky responseText column in the results, don't include it. This may dramatically speed up the query.
(This is because large columns are stored "off-record", thereby taking an extra disk read if the table is huge.)

Related

Mysql slow query on large table

With Mysql 5.6.10, I have a table like this:
CREATE TABLE `es_user_action` (
`id` bigint(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` bigint(11) unsigned NOT NULL,
`company_id` bigint(11) unsigned NOT NULL,
`work_id` bigint(11) unsigned NOT NULL,
`action` tinyint(2) NOT NULL COMMENT '10, 20, 30, 40',
`action_id` bigint(11) unsigned DEFAULT NULL,
`apply_id` bigint(11) unsigned DEFAULT '0',
`apply_display_id` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`action_time` datetime NOT NULL,
`scout_time` datetime NOT NULL,
`register_datetime_sorting` datetime DEFAULT NULL,
`score` tinyint(2) DEFAULT '0',
`is_pending` tinyint(2) DEFAULT '0',
`apply_status` tinyint(2) DEFAULT '2' COMMENT '1: paid, 2: free',
`has_response` tinyint(2) DEFAULT '0',
`response_time` datetime DEFAULT NULL,
`is_shown` tinyint(2) DEFAULT '1',
`source` varchar(15) COLLATE utf8mb4_unicode_ci NOT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `user_company_work` (`user_id`,`work_id`,`company_id`,`apply_id`,`source`),
KEY `IDX_2` (`company_id`,`is_shown`,`apply_status`,`apply_id`,`work_id`,`action_time`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=436896779 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
The full query I want to perform on the table would be the following:
SELECT
`id` AS `seqNo`,
`company_id` AS `companyId`,
`work_id` AS `workId`,
`action` AS `userActionType`,
`action_time` AS `userActionDatetime`,
`is_pending`,
`apply_status`,
`action_id`,
`source`,
`user_id` AS `userId`,
`apply_display_id`
FROM
`es_user_action`
WHERE
`company_id` = 449664
AND `is_shown` = 1
AND `apply_status` = 1
AND `apply_id` = 0
AND `action_time` >= '2021-01-05 15:56:14'
ORDER BY
is_pending ASC,
action ASC,
score DESC,
CASE
source
WHEN "entenshoku" THEN
action_time
END DESC,
CASE
WHEN source <> "entenshoku" THEN
action_time
END ASC
LIMIT 100 OFFSET 0;
The table has around 15 million rows and the following query takes around 15 seconds. The query becomes very slow.
Can anyone help out? Thanks in advance.
UPDATED:
This is the explain query result:
KEY `IDX_2` (`company_id`,`is_shown`,`apply_status`,`apply_id`,`work_id`,`action_time`)
Work_id is in the way of that index being effective for that SELECT. Remove it. That should speed up the query some. Still the are too many columns and expressions involved in the WHER and ORDER BY to do much more to help.
Changing BIGINTs to smaller INT types will help some (by shrinking the table).
I'm pretty sure that what you have cannot work.
You have is a dynamically changing ORDER BY, depending on the value of source. But which row's source?
If you could decide before issuing the query whether to scan action_time ascending or descending, which value of source would you pick?
Delete this Question, rewrite the query, then open another Question if you still have an issue.

MySQL Query Optimization that touches three tables via a union of two of them

I have a query that returns results from a single table based on the provided ID existing in a column in one of two, or both, tables. The DB schema for the relevant tables is provided below as well as the initial query and then what was later recommended to me by a peer. I go into some details below as to why this query works but I need to optimize it farther for larger datasets and pagination.
CREATE TABLE `killmails` (
`id` BIGINT(20) UNSIGNED NOT NULL,
`hash` VARCHAR(255) NOT NULL,
`moon_id` BIGINT(20) NULL DEFAULT NULL,
`solar_system_id` BIGINT(20) UNSIGNED NOT NULL,
`war_id` BIGINT(20) NULL DEFAULT NULL,
`is_npc` TINYINT(1) NOT NULL DEFAULT '0',
`is_awox` TINYINT(1) NOT NULL DEFAULT '0',
`is_solo` TINYINT(1) NOT NULL DEFAULT '0',
`dropped_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`destroyed_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`fitted_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`total_value` DECIMAL(18,4) UNSIGNED NOT NULL DEFAULT '0.0000',
`killmail_time` DATETIME NOT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL,
PRIMARY KEY (`id`, `hash`),
INDEX `total_value` (`total_value`),
INDEX `killmail_time` (`killmail_time`),
INDEX `solar_system_id` (`solar_system_id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
CREATE TABLE `killmail_attackers` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`killmail_id` BIGINT(20) UNSIGNED NOT NULL,
`alliance_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`character_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`corporation_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`faction_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`damage_done` BIGINT(20) UNSIGNED NOT NULL,
`final_blow` TINYINT(1) NOT NULL DEFAULT '0',
`security_status` DECIMAL(17,15) NOT NULL,
`ship_type_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`weapon_type_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL,
PRIMARY KEY (`id`),
INDEX `ship_type_id` (`ship_type_id`),
INDEX `weapon_type_id` (`weapon_type_id`),
INDEX `alliance_id` (`alliance_id`),
INDEX `corporation_id` (`corporation_id`),
INDEX `killmail_id_character_id` (`killmail_id`, `character_id`),
CONSTRAINT `killmail_attackers_killmail_id_killmails_id_foreign_key` FOREIGN KEY (`killmail_id`) REFERENCES `killmails` (`id`) ON UPDATE CASCADE ON DELETE CASCADE
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
CREATE TABLE `killmail_victim` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
`killmail_id` BIGINT(20) UNSIGNED NOT NULL,
`alliance_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`character_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`corporation_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`faction_id` BIGINT(20) UNSIGNED NULL DEFAULT NULL,
`damage_taken` BIGINT(20) UNSIGNED NOT NULL,
`ship_type_id` BIGINT(20) UNSIGNED NOT NULL,
`ship_value` DECIMAL(18,4) NOT NULL DEFAULT '0.0000',
`pos_x` DECIMAL(30,10) NULL DEFAULT NULL,
`pos_y` DECIMAL(30,10) NULL DEFAULT NULL,
`pos_z` DECIMAL(30,10) NULL DEFAULT NULL,
`created_at` DATETIME NOT NULL,
`updated_at` DATETIME NOT NULL,
PRIMARY KEY (`id`),
INDEX `corporation_id` (`corporation_id`),
INDEX `alliance_id` (`alliance_id`),
INDEX `ship_type_id` (`ship_type_id`),
INDEX `killmail_id_character_id` (`killmail_id`, `character_id`),
CONSTRAINT `killmail_victim_killmail_id_killmails_id_foreign_key` FOREIGN KEY (`killmail_id`) REFERENCES `killmails` (`id`) ON UPDATE CASCADE ON DELETE CASCADE
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
;
This first query is where the problem started:
SELECT
*
FROM
killmails k
LEFT JOIN killmail_attackers ka ON k.id = ka.killmail_id
LEFT JOIN killmail_victim kv ON k.id = kv.killmail_id
WHERE
ka.character_id = ?
OR kv.character_id = ?
ORDER BY killmails.killmail_time DESC
LIMIT ? OFFSET ?
This worked okay, but long query times. We optimized to this
SELECT
killmails.*,
FROM (
SELECT killmail_victim.killmail_id FROM killmail_victim
WHERE killmail_victim.corporation_id = ?
UNION
SELECT killmail_attackers.killmail_id FROM killmail_attackers
WHERE killmail_attackers.corporation_id = ?
) SELECTED_KMS
LEFT JOIN killmails ON killmails.id = SELECTED_KMS.killmail_id
ORDER BY killmails.killmail_time DESC
LIMIT ? OFFSET ?
I saw a huge improvement in query times when looking up killmails for characters, however when I started querying for larger datasets like corporation and alliance killmails, the query slows down. This is because the queries that are union'd together can potentially return large sets of data and the time it takes to read all that into memory so that the SELECTED_KMS table can be created is what I believe is taking so much time. Most of the time, with alliances, my connection to the database times out from the application. One alliance returned 900K killmailIDs from one of the union'd tables, not sure what the other returned.
I can easily add limit statements to the internal queries, but this will introduce a lot of complications when I get to paginating the data or when I introduce a feature to search for KMs by date for example.
I am looking for suggestions on how this query can be optimized and still allow for easy pagination in the near future.
Thank You
Change INDEX(corporation_id) in both tables to INDEX(corporation_id, killmail_id) so that the inner queries will be "covering".
In general, INDEX(a) is useless when you also have INDEX(a,b). Any query that needs just a, can use either of those indexes. (This rule does not apply to b; only the "leftmost" column(s).)
Where does killmails.id come from? It's not AUTO_INCREMENT; it is not alone in the PRIMARY KEY, so there is no specified "uniqueness" constraint. Is it unique by some other design? Is it computed somewhere else in the code? (I ask because I need a feel for its uniqueness and other characteristics.)
Add INDEX(id, killmails_time).
What version are you using?
Perhaps UNION ALL give the same results? It would be faster because it would not need to de-dup.
How much RAM do you have? What is the value of innodb_buffer_pool_size?
Do you really need 8-byte BIGINTs? Even if your application is using longlong (or whatever it calls it), you can probably change the schema without changing the app.
Do you need this much precision and range? DECIMAL(30,10) -- it takes 14 bytes each. DOUBLE would give you about 16 significant digits in 8 bytes, with a wider range of values (up to about 10^308). What "units" are you using? (Overkill for light-years or parsecs; inadequate for miles or km. Perhaps AUs? Then the bottom digit would be a precision of a few meters?)
The last few questions are aimed at shrinking the table and seeing if we can avoid it being as I/O-bound as it apparently is now.
Important
innodb_buffer_pool_size = 128M is terribly small, especially for a 32GB machine, and especially if your dataset is much bigger than 128MB. If there are not any other apps running on the server, bump that setting up to 20G.

How to find the reason for the difference in the execution time of a query against different databases?

I have two databases with identical schemas. The one database is from production, the other is a test database. I'm doing a query against a single table from the database. On the production table the query takes around 4.3 seconds, while on the test database it takes about 130 ms. . However, the production table has less then 50.000 records, while I've seeded the test table with more than 100.000. I've compared the two tables and both have the same indexes. To me, it seems that the problem is in the data. While seeding I tried to generate as random data as possible, so that I can simulate production conditions, but still I couldn't reproduce the slow query.
I looked the the results from EXPLAIN for the two queries. They have significant differences in the last two columns.
Production:
+-------+-------------------------+
| rows | Extra |
+-------+-------------------------+
| 24459 | Using where |
| 46 | Using where; Not exists |
+-------+-------------------------+
Test:
+------+------------------------------------+
| rows | Extra |
+------+------------------------------------+
| 3158 | Using index condition; Using where |
| 20 | Using where; Not exists |
+------+------------------------------------+
The create statement for the table on production is:
CREATE TABLE `usage_logs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`operation` varchar(30) COLLATE utf8_unicode_ci NOT NULL,
`check_time` datetime NOT NULL,
`check_in_log_id` int(11) DEFAULT NULL,
`daily_usage_id` int(11) DEFAULT NULL,
`duration_units` decimal(11,2) DEFAULT NULL,
`is_deleted` tinyint(1) NOT NULL DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`facility_id` int(11) NOT NULL,
`notes` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`mac_address` varchar(20) COLLATE utf8_unicode_ci NOT NULL DEFAULT '00:00:00:00:00:00',
`login` varchar(40) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_usage_logs_on_user_id` (`user_id`),
KEY `index_usage_logs_on_check_in_log_id` (`check_in_log_id`),
KEY `index_usage_logs_on_facility_id` (`facility_id`),
KEY `index_usage_logs_on_check_time` (`check_time`),
KEY `index_usage_logs_on_mac_address` (`mac_address`),
KEY `index_usage_logs_on_operation` (`operation`)
) ENGINE=InnoDB AUTO_INCREMENT=145147 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
while the same in the test database is:
CREATE TABLE `usage_logs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`operation` varchar(30) COLLATE utf8_unicode_ci NOT NULL,
`check_time` datetime NOT NULL,
`check_in_log_id` int(11) DEFAULT NULL,
`daily_usage_id` int(11) DEFAULT NULL,
`duration_units` decimal(11,2) DEFAULT NULL,
`is_deleted` tinyint(1) NOT NULL DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`facility_id` int(11) NOT NULL,
`notes` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`mac_address` varchar(20) COLLATE utf8_unicode_ci NOT NULL DEFAULT '00:00:00:00:00:00',
`login` varchar(40) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_usage_logs_on_check_in_log_id` (`check_in_log_id`),
KEY `index_usage_logs_on_check_time` (`check_time`),
KEY `index_usage_logs_on_facility_id` (`facility_id`),
KEY `index_usage_logs_on_mac_address` (`mac_address`),
KEY `index_usage_logs_on_operation` (`operation`),
KEY `index_usage_logs_on_user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=104001 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
The full query is:
SELECT `usage_logs`.*
FROM `usage_logs`
LEFT OUTER JOIN usage_logs AS usage_logs_latest ON usage_logs.facility_id = usage_logs_latest.facility_id
AND usage_logs.user_id = usage_logs_latest.user_id
AND usage_logs.mac_address = usage_logs_latest.mac_address
AND usage_logs.check_time < usage_logs_latest.check_time
WHERE `usage_logs`.`facility_id` = 5
AND `usage_logs`.`operation` = 'checkIn'
AND (usage_logs.check_time >= '2018-06-08 00:00:00')
AND (usage_logs.check_time <= '2018-06-08 11:23:05')
AND (usage_logs_latest.id IS NULL)
I execute the query on the same machine against two different databases, so I don't think that other processes are interfering in the result.
What does this result mean and what further steps can I take in order to find out the reason for the big difference in the execution time?
What MySQL version(s) are you using?
There are many factors that lead to the decision by the Optimizer as to
which table to start with; (we can't see if they are different)
which index(es) to use; (we can't see)
etc.
Some of the factors:
the distribution of the index values at the moment,
the MySQL version,
the phase of the moon.
These can also lead to different numbers (estimates) in the EXPLAIN, which may lead to different query plans.
Also other activity in the server can interfere with the availability of CPU/IO/etc. In particular caching of the data can easily show a 10x difference. Did you run each query twice? Is the Query cache turned off? Is innodb_buffer_pool_size the same? Is RAM size the same?
I see Using index condition and no "composite" indexes. Often performance can be improved by providing a suitable composite index. More
I gotta see the query!
Seeding
Random, or not-so-random, rows can influence the Optimizer's choice of which index (etc) to use. This may have led to picking a better way to run the query on 'test'.
We need to see EXPLAIN SELECT ... to discuss this angle further.
Composite indexes
These are likely to help on both servers:
INDEX(facility_id, operation, -- either order
check_time) -- last
INDEX(facility_id, user_id, max_address, check_time, -- any order
id) -- last
There is a quick improvement. Instead of finding all the later rows, but not use the contents of them, use a 'semi-join' which asks of the non-existence of any such rows:
SELECT `usage_logs`.*
FROM `usage_logs`
WHERE `usage_logs`.`facility_id` = 5
AND `usage_logs`.`operation` = 'checkIn'
AND (usage_logs.check_time >= '2018-06-08 00:00:00')
AND (usage_logs.check_time <= '2018-06-08 11:23:05')
AND NOT EXISTS ( SELECT 1 FROM usage_logs AS latest
WHERE usage_logs.facility_id = latest.facility_id
AND usage_logs.user_id = latest.user_id
AND usage_logs.mac_address = latest.mac_address
AND usage_logs.check_time < latest.check_time )
(The same indexes will be fine.)
The query seems to be getting "all but the latest"; is that what you wanted?

query slows down if add field to where condition

I have a table Mysql fiddle with about 500k records.
CREATE TABLE IF NOT EXISTS `p_transactions` (
`transaction_id` bigint(10) unsigned NOT NULL,
`amount` decimal(19,2) NOT NULL,
`dt` bigint(1) NOT NULL,
`transaction_status` int(1) NOT NULL,
`transaction_type` varchar(15) NOT NULL,
`payment_method` varchar(25) NOT NULL,
`notes` text NOT NULL,
`member_id` int(10) unsigned NOT NULL,
`new_amount` decimal(19,2) NOT NULL,
`paid_amount` decimal(19,2) NOT NULL,
`secret_code` char(40) NOT NULL,
`internal_status` varchar(40) NOT NULL,
`ip_addr` varchar(15) NOT NULL,
`description` text NOT NULL,
`seller_transaction_id` varchar(50) DEFAULT NULL,
`return_url` varchar(255) DEFAULT NULL,
`fail_url` varchar(255) DEFAULT NULL,
`success_url` varchar(255) DEFAULT NULL,
`result_url` varchar(255) DEFAULT NULL,
`user_fee` decimal(19,3) DEFAULT '0.000',
`currency` char(255) DEFAULT 'USD',
`gateway_transaction_id` char(255) DEFAULT NULL,
`load_amount` decimal(19,2) NOT NULL,
`transaction_mode` varchar(1) NOT NULL DEFAULT '',
`p_fee` decimal(19,2) NOT NULL,
`country` varchar(2) NOT NULL,
`email` varchar(255) NOT NULL,
`vat` decimal(19,2) NOT NULL DEFAULT '0.00',
`name` varchar(255) NOT NULL,
`bdate` varchar(255) NOT NULL,
`child_method` varchar(255) NOT NULL,
`processing_fee` decimal(19,2) NOT NULL DEFAULT '0.00',
`flat_fee` varchar(1) NOT NULL DEFAULT 'n',
`user_fee_sum` decimal(19,2) NOT NULL DEFAULT '0.00',
`p_fee_sum` decimal(19,2) NOT NULL DEFAULT '0.00',
`dt_open` bigint(1) NOT NULL DEFAULT '0',
`user_fee_type` varchar(1) NOT NULL DEFAULT 'r',
`custom_gateway_fee` decimal(19,2) NOT NULL DEFAULT '0.00',
`paid_currency` varchar(3) NOT NULL DEFAULT 'USD',
`paid_microtime` bigint(10) unsigned NOT NULL,
`check_ballance` varchar(1) NOT NULL DEFAULT 'n',
PRIMARY KEY (`transaction_id`),
KEY `member_id` (`member_id`),
KEY `payment_method` (`payment_method`),
KEY `child_method` (`child_method`),
KEY `check_ballance` (`check_ballance`),
KEY `dt` (`dt`),
KEY `transaction_type` (`transaction_type`),
KEY `paid_microtime` (`paid_microtime`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
When I execute query
SELECT *
FROM `p_transactions`
WHERE dt >= 1517443200
AND dt <= 1523404799
AND member_id = 2051
ORDER BY `paid_microtime` DESC
LIMIT 50;
it runs 0,000 sec. (+ 0,016 sec. network)
but if I add to query this condition AND transaction_status = 7
SELECT *
FROM `p_transactions`
WHERE dt >= 1517443200
AND dt <= 1523404799
AND member_id = 2051
AND transaction_status = 7
ORDER BY `paid_microtime` DESC
LIMIT 50
query run 12,938 sec. (+ 0,062 sec. network)
Please help me to find out the reason of such behavior
PS. There was index on transaction_status and it increased execution time even more.
Add a suitable index, such as:
ON payzoff_transactions (member_id, dt)
or
ON payzoff_transactions (member_id, dt, transaction_status)
We want member_id column as the leading column in the index, because of the equality comparison, and we expect the result to be a substantially smaller subset of the entire table. We want dt column after that, because of the "range scan" on that.
Including additional columns in the index may allow MySQL to check that condition using values from the index, without a visit/lookup of the row in the underlying table pages.
Either of these indexes would be suitable for both of the queries shown in the question.
Use EXPLAIN to see the execution plan... which index is being used.
There's really no getting around the "Using filesort" operation, since we're pulling a small subset of the entire table.
(If we were pulling the entire table (or a huge subset), we might be able to avoid an expensive sort operation with an access plan that pulls rows in reverse index order, with that has an index with leading column of paid_microtime.)
For the original query have these
INDEX(member_id, dt)
INDEX(member_id, paid_microtime)
For the secondary query, have
INDEX(transaction_status, member_id, dt)
INDEX(transaction_status, member_id, paid_microtime)
Without getting into the details of the distribution of the data values, we cannot explain why one query so much slower; however, my 4 indexes should make both queries run faster most of the time.
More discussion of how I came up with those indexes (and why (member_id, dt, transaction_status) is not so good): http://mysql.rjweb.org/doc.php/index_cookbook_mysql

Why does MySQL stop using an index for a join when I select non-indexed fields in the field list

I have the following two tables:
CREATE TABLE `temporal_expressions` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`dated_obj_type` varchar(255) DEFAULT NULL,
`dated_obj_id` int(11) DEFAULT NULL,
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
`start_time` int(11) DEFAULT NULL,
`end_time` int(11) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`lock_version` int(11) NOT NULL DEFAULT '0',
`wday` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `te_search` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`),
KEY `te_calendar` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`,`start_time`,`end_time`),
KEY `te_search_wday` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`,`wday`),
KEY `te_calendar_wday` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`,`start_time`,`end_time`,`wday`),
KEY `te_index` (`wday`,`dated_obj_type`,`start_date`,`end_date`,`start_time`,`end_time`,`dated_obj_id`)
) ENGINE=InnoDB AUTO_INCREMENT=8162445 DEFAULT CHARSET=latin1
CREATE TABLE `asset_blocks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`block_type` int(11) DEFAULT '0',
`spaces_left` int(11) DEFAULT NULL,
`provider_note` varchar(255) DEFAULT NULL,
`extra_data` text,
`lock_version` int(11) DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`type` varchar(255) DEFAULT NULL,
`service_provider_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `type` (`type`,`id`),
KEY `service_provider_id` (`service_provider_id`,`type`,`id`),
) ENGINE=InnoDB AUTO_INCREMENT=516867 DEFAULT CHARSET=latin1
If I run explain on this query (note that I am only selecting fields in the te_calendar_wday index from temporal_expressions) it uses the index for the join as expected
EXPLAIN SELECT asset_blocks.*, temporal_expressions.id,
temporal_expressions.dated_obj_type, temporal_expressions.dated_obj_id,
temporal_expressions.start_date, temporal_expressions.end_date,
temporal_expressions.start_time
FROM `asset_blocks`
LEFT OUTER JOIN `temporal_expressions`
ON `temporal_expressions`.dated_obj_id = `asset_blocks`.id
AND `temporal_expressions`.dated_obj_type = 'AssetBlock'
WHERE ( temporal_expressions.start_date <= '2010-11-25'
AND temporal_expressions.end_date >= '2010-11-01'
AND temporal_expressions.start_time < 1000 AND temporal_expressions.end_time > 1200
AND temporal_expressions.wday IN (1,2,3,4,5,6)
AND asset_blocks.id IN (1,2,3,4,5,6,7,8,9) )
1 SIMPLE temporal_expressions range te_search,te_calendar,te_search_wday,te_calendar_wday,te_index te_calendar_wday 272 NULL 9 Using where; Using index
1 SIMPLE asset_blocks eq_ref PRIMARY PRIMARY 4 lb_production.temporal_expressions.dated_obj_id 1
However, if I run this query (note that I have added a non-indexed field to the field list) it no longer uses the index (it uses a join buffer). Is this intentional or am I missing something?
EXPLAIN SELECT asset_blocks.*, temporal_expressions.id,
temporal_expressions.dated_obj_type, temporal_expressions.dated_obj_id,
temporal_expressions.start_date, temporal_expressions.end_date,
temporal_expressions.start_time, temporal_expressions.created_at
FROM `asset_blocks`
LEFT OUTER JOIN `temporal_expressions`
ON `temporal_expressions`.dated_obj_id = `asset_blocks`.id
AND `temporal_expressions`.dated_obj_type = 'AssetBlock'
WHERE ( temporal_expressions.start_date <= '2010-11-25'
AND temporal_expressions.end_date >= '2010-11-01'
AND temporal_expressions.start_time < 1000 AND temporal_expressions.end_time > 1200
AND temporal_expressions.wday IN (1,2,3,4,5,6)
AND asset_blocks.id IN (1,2,3,4,5,6,7,8,9) )
1 SIMPLE asset_blocks range PRIMARY PRIMARY 4 NULL 9 Using where
1 SIMPLE temporal_expressions range te_search,te_calendar,te_search_wday,te_calendar_wday,new_te_index te_search 272 NULL 9 Using where; Using join buffer
I cannot be sure if this is the case here, but:
If you select only indexed fields, MySQL can answer the whole query out of the index and does not even load the table data file.
If you select a field that is not indexed, it has to load the table data.
When making its execution plan, in certain cases (see comment) MySQL decides to do a full table scan although an index is present. This is because it's much quicker to read all data blindly than to look up every entry in the index and then read the data.