Why does MySQL/MariaDB consider a string (uuid) equal to integer? - mysql

I have the following table (mariadb:10.3):
CREATE TABLE `people` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`public_id` varchar(20) COLLATE utf8mb4_unicode_ci NOT NULL,
`uuid` varchar(36) COLLATE utf8mb4_unicode_ci NOT NULL,
`name` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`slug` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`old_slug` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`created_at` timestamp NULL DEFAULT NULL,
`updated_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `people_uuid_unique` (`uuid`),
UNIQUE KEY `people_slug_unique` (`slug`),
UNIQUE KEY `people_public_id_unique` (`public_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
When I run the below query against this table, I am getting the below result. For some reason the first number inside the string is matching against the ID which is an auto incrementing integer.
Query:
select * from `people` where `id` = "1f028a28-f032-482b-a8b4-dc5483489552"
Result:
+----+-------------+--------------------------------------+-------------+-------------------------+---------------------+---------------------+
| id | public_id | uuid | name | slug | created_at | updated_at |
+----+-------------+--------------------------------------+-------------+-------------------------+---------------------+---------------------+
| 1 | 8ipui1pn9ln | 52ea30f4-cafa-4ddb-8b74-1502a34c3c21 | Mark Hamill | 8ipui1pn9ln-mark-hamill | 2020-04-15 18:47:33 | 2020-04-15 18:47:35 |
+----+-------------+--------------------------------------+-------------+-------------------------+---------------------+---------------------+
Just a note, this is slimmed down example. The real reason I am searching a string against an integer is because I am searching a dynamic value against multiple columns with and OR.

MySQL will implicitly try to cast a string to an int by taking the first sequence of digits and ignoring anything from the first letter onwards.
Not an intuitive behavior, but once you're aware of it, you can avoid it by explicitly casting such conditions to strings.

Related

mariadb - join/order by clauses cause slow query

have a small blog site, host provider runs Server version: 5.5.5-10.3.34-MariaDB-cll-lve MariaDB Server. The last weeks i noticed fetching the posts is slower, and it turns out the query now takes 6 seconds.
the in question query:
SELECT * FROM POSTS
JOIN POST_CATEGORIES on POST_CATEGORIES.CATEG_ID = POSTS.POST_CATEGORY
LEFT JOIN USERS on POSTS.POST_USERNAME = USERS.USER_USERNAME
WHERE POST_STATUS = 'ENA'
AND POST_IS_PUBLIC = 'Y'
ORDER BY POST_IS_STICKY DESC, (TIMESTAMPDIFF(SECOND, POST_CREATION_DATE, now())-(POST_LIKES-POST_DISLIKES)*3600) ASC
LIMIT 0, 12;
if i remove the order by the query time drops to 0.04 sec, but obviously it is needed for proper sorting. Same effect happens if i remove the left join to USERS, but i would like to focus on the order by clause.
Table rows:
POSTS: 62K
POST_CATEGORIES: 20
USERS: 180
Interesting finding:
There are already indexes (primary/unique/foreign key) on the tables, but not on the columns involved, for the simplicity i am not adding them here.
But i did try adding the index:
INDEX `POSTS_OPTIMIZING_INDEX` (`POST_CREATION_DATE`, `POST_LIKES`, `POST_DISLIKES`) USING BTREE,
and FORCE INDEX (POSTS_OPTIMIZING_INDEX) and saw something interesting:
on the MariaDB, literally no difference.
on mysql community server Server version: 5.7.20 MySQL Community Server (GPL) that i have home, the query duration from 6 seconds went down to 0.12sec, which is definitely acceptable.
Any suggestions on optimizing the SQL, or tuning mariadb?
EDIT
the table definitions:
POSTS:
CREATE TABLE `POSTS` (
`POST_ID` int(11) NOT NULL AUTO_INCREMENT,
`POST_URL_ID` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_TYPE` varchar(3) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_CATEGORY` int(3) DEFAULT NULL,
`POST_IS_PUBLIC` varchar(1) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_IS_STICKY` varchar(1) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_FREE_TEXT` varchar(5000) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_IMAGE_FILE` varchar(1000) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_VIDEO_URL` varchar(100) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_IS_FROM_USER` varchar(1) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_USERNAME` varchar(20) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_CREATION_DATE` timestamp NOT NULL DEFAULT current_timestamp(),
`POST_STATUS` varchar(3) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'ENA',
`POST_LIKES` int(5) DEFAULT 0,
`POST_DISLIKES` int(5) DEFAULT 0,
`POST_ADDITIONAL_INFO` varchar(10) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_FB_OBJ_ID` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_FB_OBJ_TYPE` varchar(10) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_FB_IMG_LAST_CHECKED` timestamp NULL DEFAULT NULL,
`POST_FB_IMG_LAST_CHECK_RESULT` varchar(3) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`POST_TWITTER_OBJ_ID` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`POST_ID`),
UNIQUE KEY `POST_URL_ID` (`POST_URL_ID`),
KEY `FK_POSTS_POST_CATEGORIES` (`POST_CATEGORY`),
CONSTRAINT `FK_POSTS_POST_CATEGORIES` FOREIGN KEY (`POST_CATEGORY`) REFERENCES `POST_CATEGORIES` (`CATEG_ID`)
) ENGINE=InnoDB AUTO_INCREMENT=62529 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
POST_CATEGORIES:
CREATE TABLE `POST_CATEGORIES` (
`CATEG_ID` int(11) NOT NULL AUTO_INCREMENT,
`CATEG_URL_NAME` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`CATEG_NAME` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`CATEG_ID`)
) ENGINE=InnoDB AUTO_INCREMENT=51 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
USERS:
CREATE TABLE `USERS` (
`USER_ID` int(11) NOT NULL AUTO_INCREMENT,
`USER_MAIL_ADDRESS` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_USERNAME` varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_PASSWORD` varchar(60) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_CREATION_DATE` timestamp NULL DEFAULT current_timestamp(),
`USER_LAST_LOGIN_DATE` timestamp NULL DEFAULT NULL,
`USER_FROM_FB` varchar(1) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_FB_MAIL_ADDRESS` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_FB_FULL_NAME` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_SPECIAL` varchar(1) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_ACCT_STATUS` varchar(3) COLLATE utf8_unicode_ci DEFAULT 'ENA',
`USER_POSTS_COUNT` int(5) DEFAULT 0,
`USER_NAME` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_QUOTE` varchar(200) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_ABOUT_MYSELF` varchar(500) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_WEBSITE` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_FB` varchar(150) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_TWITTER` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_LOCATION` varchar(150) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_BIRTHDAY` varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_PROFILE_PIC` varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
`USER_PROFILE_PIC_FROM_FB` varchar(1) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`USER_ID`),
UNIQUE KEY `USER_USERNAME` (`USER_USERNAME`)
) ENGINE=InnoDB AUTO_INCREMENT=188 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
the EXPLAIN:
mysql> EXPLAIN
-> SELECT * FROM POSTS
-> JOIN POST_CATEGORIES on POST_CATEGORIES.CATEG_ID = POSTS.POST_CATEGORY
-> LEFT JOIN USERS on POSTS.POST_USERNAME = USERS.USER_USERNAME
-> WHERE POST_STATUS = 'ENA'
-> AND POST_IS_PUBLIC = 'Y'
-> ORDER BY POST_IS_STICKY DESC, (TIMESTAMPDIFF(SECOND, POST_CREATION_DATE, now())-(POST_LIKES-POST_DISLIKES)*3600) ASC
-> LIMIT 0, 12;
+------+-------------+-----------------+--------+--------------------------+---------+---------+--------------------------------+-------+-------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-----------------+--------+--------------------------+---------+---------+--------------------------------+-------+-------------------------------------------------+
| 1 | SIMPLE | POSTS | ALL | FK_POSTS_POST_CATEGORIES | NULL | NULL | NULL | 61407 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | POST_CATEGORIES | eq_ref | PRIMARY | PRIMARY | 4 | MY_DB.POSTS.POST_CATEGORY | 1 | |
| 1 | SIMPLE | USERS | ALL | NULL | NULL | NULL | NULL | 178 | Using where; Using join buffer (flat, BNL join) |
+------+-------------+-----------------+--------+--------------------------+---------+---------+--------------------------------+-------+-------------------------------------------------+
3 rows in set (0.03 sec)
mysql>
and removing the 2 joins:
mysql> EXPLAIN
-> SELECT * FROM POSTS
-> WHERE POST_STATUS = 'ENA'
-> AND POST_IS_PUBLIC = 'Y'
-> ORDER BY POST_IS_STICKY DESC, (TIMESTAMPDIFF(SECOND, POST_CREATION_DATE, now())-(POST_LIKES-POST_DISLIKES)*3600) ASC
-> LIMIT 0, 12;
+------+-------------+-------+------+---------------+------+---------+------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+------+-------+-----------------------------+
| 1 | SIMPLE | POSTS | ALL | NULL | NULL | NULL | NULL | 61407 | Using where; Using filesort |
+------+-------------+-------+------+---------------+------+---------+------+-------+-----------------------------+
1 row in set (0.02 sec)
mysql>
TIA
"Explode-implode" syndrome. Rewrite the query to get the 12 rows first, then JOIN to the other tables.
SELECT ...
FROM ( SELECT user_id,
is_sticky,
(TIMESTAMPDIFF(SECOND, POST_CREATION_DATE, now()) -
(POST_LIKES-POST_DISLIKES)*3600) AS metric
FROM POSTS
WHERE POST_STATUS = 'ENA'
AND POST_IS_PUBLIC = 'Y'
ORDER BY
POST_IS_STICKY DESC, metric ASC
LIMIT 0, 12
) AS x
JOIN POSTS AS p USING(user_id)
JOIN POST_CATEGORIES AS pc ON pc.CATEG_ID = p.POST_CATEGORY
LEFT JOIN USERS AS u ON p.POST_USERNAME = u.USER_USERNAME
ORDER BY x.POST_IS_STICKY DESC, x.metric ASC
Replace the elipsis with the columns you really need.
Yes, POSTS is looked into a second time, but only for 12 rows, and by the PK. Similarly, the other rows are reached into only 12 times.
Yes, the ORDER BY needs to be repeated. The LIMIT may need repeating if the JOIN creates extra rows (eg, multiple categories).
Consider using
SELECT ...
( SELECT GROUP_CONCAT(CATEG_NAME) FROM POST_CATEGORIES
WHERE categ_id = p.post_categories ) AS Categories,
...
instead of JOINing to POST_CATEGORIES, especially if there are multiple categories.
This may help with performance, but only if it is selective enough:
INDEX(post_tatus, post_is_public)

Order by slow on joined large table

I have 3 large table around 1 table have 1,000,000 rows other 2 have 500,000 rows.
When I use ORDER BY in query it takes 15 seconds (without ORDER BY 0.001s).
I already try add index on invoice.created_at and invoice.invoice_id
But still take a lot of time.
SELECT
`i`.*,
GROUP_CONCAT(DISTINCT ii.item_id SEPARATOR ',') AS item_id,
GROUP_CONCAT(DISTINCT it.transaction_id SEPARATOR ',') AS transactions,
SUM(DISTINCT IF(ii.status = 1, ii.subtotal, 0)) AS subtotal,
SUM(DISTINCT it.subtotal) AS total_paid_amount,
SUM(DISTINCT it.additional_fee) AS total_additional_fee_amount,
SUM(ii.quantity) AS total_quantity,
(total_amount - SUM(DISTINCT COALESCE(it.amount, 0))) AS total_balance
FROM
`invoices` `i`
LEFT JOIN
`invoices_items` `ii` ON `ii`.`invoice_id` = `i`.`invoice_id`
LEFT JOIN
`invoices_transactions` `it` ON `it`.`invoice_id` = `i`.`invoice_id`
AND `it`.`process_status` = 1
AND `it`.`status` = 1
WHERE
`i`.`status` = '1'
GROUP BY `i`.`invoice_id`
ORDER BY `i`.`created_at` DESC
LIMIT 50
Query with EXPLAIN:
+----+-------------+-------+------------+-------+--------------------------------------------------+------------+---------+------------------------------+--------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+--------------------------------------------------+------------+---------+------------------------------+--------+----------+----------------------------------------------+
| 1 | SIMPLE | i | NULL | index | PRIMARY,customer_id,code,invoice_id_2,created_at | PRIMARY | 4 | NULL | 473309 | 10.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | ii | NULL | ref | invoice_id | invoice_id | 5 | test.i.invoice_id | 2 | 100.00 | NULL |
| 1 | SIMPLE | it | NULL | ref | invoice_id,status | invoice_id | 5 | test.i.invoice_id | 1 | 100.00 | Using where |
+----+-------------+-------+------------+-------+--------------------------------------------------+------------+---------+------------------------------+--------+----------+----------------------------------------------+
I try to remove SUM ,GROUP_CONCAT and WHERE CASE,
But still need 10 sec
SELECT
`i`.*
FROM
`invoices` `i`
LEFT JOIN
`invoices_items` `ii` ON `ii`.`invoice_id` = `i`.`invoice_id`
LEFT JOIN
`invoices_transactions` `it` ON `it`.`invoice_id` = `i`.`invoice_id`
GROUP BY `i`.`invoice_id`
ORDER BY `i`.`created_at` DESC
LIMIT 50
Create Table:
CREATE TABLE `invoices` (
`invoice_id` int NOT NULL AUTO_INCREMENT,
`warehouse_id` tinyint DEFAULT NULL,
`invoice_type` varchar(1) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i NOT NULL DEFAULT 'C',
`invoice_date` date DEFAULT NULL,
`customer_id` int DEFAULT NULL,
`contact_name` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i DEFAULT NULL,
`contact_no` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i DEFAULT NULL,
`contact_email` varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i DEFAULT NULL,
`address` varchar(1000) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i DEFAULT NULL,
`code` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i DEFAULT NULL,
`total_amount` deimal(10,2) DEFAULT '0.00',
`total_cost` deimal(20,2) DEFAULT NULL,
`delivery_type` tinyint DEFAULT '0',
`remark` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i,
`payment_status` tinyint(1) NOT NULL DEFAULT '3',
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` datetime DEFAULT NULL,
`updated_by` int NOT NULL,
`confirmed_at` datetime DEFAULT NULL,
`status` tinyint(1) NOT NULL DEFAULT '2'
PRIMARY KEY (`invoice_id`),
KEY `customer_id` (`customer_id`),
KEY `code` (`code`),
KEY `invoice_id_2` (`invoice_id`,`created_at`),
KEY `created_at` (`created_at`)
) ENGINE=InnoDB AUTO_INCREMENT=513697 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_i;
CREATE TABLE `invoices_items` (
`item_id` int NOT NULL AUTO_INCREMENT,
`invoice_id` int NOT NULL,
`warehouse_id` int DEFAULT NULL,
`product_id` int DEFAULT NULL,
`variant_id` int DEFAULT NULL,
`name` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i DEFAULT NULL,
`quantity` int DEFAULT '1',
`unit_price` deimal(10,2) DEFAULT NULL,
`cost` deimal(10,2) DEFAULT NULL,
`subtotal` deimal(10,2) DEFAULT NULL,
`serial_no` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i DEFAULT NULL,
`StockoutDate` datetime DEFAULT NULL,
`ReturnDate` datetime DEFAULT NULL,
`Return_StockLocationID` tinyint DEFAULT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` datetime DEFAULT NULL,
`status` tinyint(1) NOT NULL DEFAULT '1',
`priority` int NOT NULL DEFAULT '0',
PRIMARY KEY (`item_id`),
KEY `invoice_id` (`invoice_id`),
KEY `variant_id` (`variant_id`),
KEY `quantity` (`quantity`),
KEY `unit_price` (`unit_price`),
KEY `subtotal` (`subtotal`),
KEY `created_at` (`created_at`),
KEY `updated_at` (`updated_at`),
KEY `status` (`status`),
KEY `priority` (`priority`),
KEY `variant_id_2` (`variant_id`,`quantity`),
KEY `variant_id_3` (`variant_id`,`name`,`quantity`),
KEY `product_id` (`product_id`),
KEY `StockoutDate` (`StockoutDate`)
) ENGINE=InnoDB AUTO_INCREMENT=1200951 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_i;
CREATE TABLE `invoices_transactions` (
`transaction_id` int NOT NULL AUTO_INCREMENT,
`invoice_id` int DEFAULT NULL,
`warehouse_id` int DEFAULT NULL,
`invoice_date` date DEFAULT NULL,
`payment_id` int DEFAULT NULL,
`balance` deimal(10,2) DEFAULT NULL,
`amount` deimal(10,2) DEFAULT NULL,
`additional_rate` deimal(10,2) DEFAULT NULL,
`additional_fee` deimal(10,2) DEFAULT NULL,
`subtotal` deimal(10,2) DEFAULT NULL,
`process_status` int DEFAULT '1',
`remark` varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i DEFAULT NULL,
`payment_status` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_i DEFAULT NULL,
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` datetime DEFAULT NULL,
`status` tinyint NOT NULL DEFAULT '1',
`CreatedByID` int DEFAULT NULL,
`ModifiedByID` int DEFAULT NULL,
`priority` int DEFAULT '0',
PRIMARY KEY (`transaction_id`),
KEY `invoice_id` (`invoice_id`),
KEY `payment_status` (`payment_status`),
KEY `created_at` (`created_at`),
KEY `updated_at` (`updated_at`),
KEY `status` (`status`),
KEY `amount` (`amount`),
KEY `additional_fee` (`additional_fee`),
KEY `subtotal` (`subtotal`),
KEY `transaction_id` (`transaction_id`,`amount`,`additional_fee`,`subtotal`),
KEY `invoice_id_2` (`invoice_id`,`process_status`,`status`),
KEY `process_status` (`process_status`),
KEY `transaction_id_2` (`transaction_id`,`invoice_id`,`amount`,`additional_fee`,`subtotal`,`process_status`,`status`)
) ENGINE=InnoDB AUTO_INCREMENT=543606 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_i;
Let's find the 50 invoices first, then reach for the rest of the stuff:
SELECT ... (all the current stuff)
FROM ( SELECT invoice_id
FROM invoices
WHERE status = 1
ORDER BY created_at DESC
LIMIT 50
) AS ids
JOIN invoices AS i USING(invoice_id)
LEFT JOIN ... ON ...
LEFT JOIN ... ON ...
GROUP BY invoice_id
ORDER BY created_id DESC
These composite indexes may help:
i: (status, created_at, invoice_id)
it: (status, process_status, invoice_id)
ii: (invoice_id, quantity, subtotal, status, item_id)
Redundant index:
KEY `variant_id` (`variant_id`) -- since this is the prefix of others
Because of PRIMARY KEY(transaction_id), the keys starting with transaction_id are redundant.
Summary:
I provided a "covering index" with the columns in the right order to allow finding the 50 ids directly from that index.
The rest of the work is limited to the 50 rows (or sets of rows, since I suspect that the rest are many:1 with respect to invoices.)
Your EXPLAIN shows lots of "Rows" and a "filesort" -- indicating that it gathered everything, then did the group by, then sorted, and finally, peeled of 50.
I suspect the "filesort" was really two sorts. See EXPLAIN FORMAT=JSON ... to find out.

MySQL shows index in execution plan, but doesn't really use it

I have a table with 20M rows and a query which takes 10 seconds.
select id from entity
where (entity.class_id = 67 and entity.name like '%321%' )
order by id desc
In execution plan there is an index, but it's not really used.
explain extended select id from entity
where (entity.class_id = 67 and entity.name like '%321%' )
order by id desc
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
| 1 | SIMPLE | entity | ref | constraint_unique_class_legacy_id,entity_tag,entity_class_modification_date_int_idx,entity_name_idx | entity_class_modification_date_int_idx | 8 | const | 288440 | 100.00 | Using where; Using filesort |
If I flush status and run this query, handlers show that there was a full scan
Handler_read_next: 20318800
But if I give a hint to use index which was in 'explain extended', then there is no full scan and query finishes in 250ms.
select id from entity
use index (entity_class_modification_date_int_idx)
where (entity.class_id = 67 and entity.name like '%321%' )
order by id desc
Only 166K of entities was scanned
Handler_read_next: 165894
Why do I have to give hint to use index which is already in execution plan?
If I add + 0 to order by, query finishes in 250ms as well.
select id from entity
where (entity.class_id = 67 and entity.name like '%321%' )
order by id + 0 desc
'explain extended' shows the same execution plan in every case, 'analyze' doesn't help.
Table 'entity':
CREATE TABLE `entity` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`name` varchar(4096) COLLATE utf8_bin DEFAULT NULL,
`tag` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`revision` int(11) NOT NULL,
`class_id` bigint(20) NOT NULL,
`legacy_id` bigint(20) DEFAULT NULL,
`description` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`last_modified_by` varchar(64) COLLATE utf8_bin DEFAULT NULL,
`removed` tinyint(1) NOT NULL DEFAULT '0',
`modification_date_int` bigint(20) DEFAULT NULL,
`creation_date_int` bigint(20) DEFAULT NULL,
`created_by` varchar(64) COLLATE utf8_bin DEFAULT NULL,
`ancestor_class_id` bigint(20) NOT NULL,
`acu_id` bigint(20) DEFAULT NULL,
`secured` tinyint(1) DEFAULT '1',
`system_modification_date` bigint(20) DEFAULT NULL,
`archived` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `constraint_unique_class_legacy_id` (`class_id`,`legacy_id`),
UNIQUE KEY `entity_tag` (`class_id`,`tag`),
UNIQUE KEY `class_hierarchy_tag` (`tag`,`ancestor_class_id`),
KEY `entity_legacy_id_idx` (`legacy_id`),
KEY `entity_modification_date_int_idx` (`modification_date_int`),
KEY `entity_class_modification_date_int_idx` (`class_id`,`removed`,`modification_date_int`),
KEY `ancestor_class_id` (`ancestor_class_id`),
KEY `acu_id` (`acu_id`),
KEY `entity_name_idx` (`class_id`,`name`(255)),
KEY `entity_archived_idx` (`archived`),
CONSTRAINT `entity_ibfk_1` FOREIGN KEY (`class_id`) REFERENCES `class` (`id`),
CONSTRAINT `entity_ibfk_2` FOREIGN KEY (`ancestor_class_id`) REFERENCES `class` (`id`),
CONSTRAINT `entity_ibfk_3` FOREIGN KEY (`acu_id`) REFERENCES `acu` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=60382455 DEFAULT CHARSET=utf8 COLLATE=utf8_bin
MySQL version:
SELECT ##version;
+--------------------+
| ##version |
+--------------------+
| 5.6.30-76.3-56-log |
+--------------------+
OK, I partially found an answer: MySQL Workbench which I'm using is implicitly adding 'limit 1000' to queries and it drastically reduce performance even though there are much less rows in response. With limit 'explain extended' shows PRIMARY as a key and this is not a question anymore. If I increase limit to 10000, then query finishes in 250ms. Looks like there is some heuristics in MySQL optimizer which forces it to use PRIMARY index in case of low 'limit'.

Slow Query on Rails joins

The following rails query throws back in slow query log:
Class ParserRun
scope :active, -> {
where(completed_at: nil)
.joins('LEFT JOIN system_events ON parser_runs.id = system_events.parser_run_id')
.where("system_events.created_at > '#{active_system_events_threshold}' OR parser_runs.created_at > '#{1.minute.ago.to_s(:db)}'")
}
How can I optimize this?
Slow querylog:
SELECT `parser_runs`.*
FROM `parser_runs`
INNER JOIN `system_events` ON `system_events`.`parser_run_id` = `parser_runs`.`id`
WHERE `parser_runs`.`type` IN ('DatasetParserRun')
AND `parser_runs`.`completed_at` IS NULL
AND (system_events.created_at <= '2017-08-05 04:03:09');
# Time: 170805 5:03:43
Output of 'show create table parser_runs;'
| parser_runs | CREATE TABLE `parser_runs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`type` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`customer_id` int(11) DEFAULT NULL,
`options` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`completed_at` datetime DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_parser_runs_on_customer_id` (`customer_id`)
) ENGINE=InnoDB AUTO_INCREMENT=143327 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
output of 'show create table system_events;'
| system_events | CREATE TABLE `system_events` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`log_level` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`customer_id` int(11) DEFAULT NULL,
`classification` int(11) DEFAULT NULL,
`information` text COLLATE utf8_unicode_ci,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`parser_run_id` int(11) DEFAULT NULL,
`notified` tinyint(1) DEFAULT '0',
`dataset_log_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_system_events_on_classification` (`classification`),
KEY `index_system_events_on_customer_id` (`customer_id`),
KEY `index_system_events_on_parser_run_id` (`parser_run_id`),
KEY `index_system_events_on_dataset_log_id` (`dataset_log_id`)
) ENGINE=InnoDB AUTO_INCREMENT=730539 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
Output of EXPLAIN:
EXPLAIN for: SELECT `parser_runs`.* FROM `parser_runs` LEFT JOIN system_events ON parser_runs.id = system_events.parser_run_id WHERE `parser_runs`.`completed_at` IS NULL AND (system_events.created_at > '2017-08-07 10:09:03')
+----+-------------+---------------+--------+------------------------- -------------+---------+---------+--------------------------------------+- -------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------+--------+--------------------------------------+---------+---------+--------------------------------------+--------+-------------+
| 1 | SIMPLE | system_events | ALL | index_system_events_on_parser_run_id | NULL | NULL | NULL | 655946 | Using where |
| 1 | SIMPLE | parser_runs | eq_ref | PRIMARY | PRIMARY | 4 | ashblood.system_events.parser_run_id | 1 | Using where |
+----+-------------+---------------+--------+--------------------------------------+---------+---------+--------------------------------------+--------+-------------+
2 rows in set (0.00 sec)
The first step in the query execution plan (the output of EXPLAIN SELECT ...) indicates that the whole system_events table is being scanned in order to check which rows in the system_events table will be used in the join with the parser_runs table.
Please, add an index on the created_at column in the system_events and repeat the query. Please, check the new execution path to verify whether the whole table is being scanned, or if the new index is being used.
In addition, although probably not the root of the problem, you could add an index on the type and completed_at columns of the table parser_runs. Please, note that I mean an index on both columns (in the given order) instead of an index on each column.
INDEX(type, completed_at)
INDEX(completed_at, type)
INDEX(created_at, parser_run_id)
INDEX(parser_run_id, created_at)
It is not obvious which indexes the Optimizer will prefer; add all of those.
Don't use joins. Instead break the join queries in separate queries and store those data in variables. And later get your desired results from those data.

How to improve this MySQL query?

This is what I saw in slow-query.log
# User#Host: appUser[appUser] # [192.168.1.10]
# Query_time: 65.118800 Lock_time: 0.000257 Rows_sent: 7 Rows_examined: 78056
SET timestamp=1312883519;
select
this_.id as id2_0_,
this_.appearance_count as appearance2_2_0_,
this_.author as author2_0_,
this_.date_added as date4_2_0_,
this_.date_last_ranked as date5_2_0_,
this_.date_updated as date6_2_0_,
this_.description as descript7_2_0_,
this_.frequency_updated as frequency8_2_0_,
this_.hidden as hidden2_0_,
this_.link as link2_0_,
this_.lock_id as lock11_2_0_,
this_.plain_text_description as plain12_2_0_,
this_.processor_done as processor13_2_0_,
this_.processor_lock_id as processor14_2_0_,
this_.published_date as published15_2_0_,
this_.raw_link as raw16_2_0_,
this_.retweet_count as retweet17_2_0_,
this_.source_url as source18_2_0_,
this_.subtype as subtype2_0_,
this_.thumbnail_img_url as thumbnail20_2_0_,
this_.title as title2_0_,
this_.total_hit_count as total22_2_0_,
this_.total_lift_count as total23_2_0_,
this_.total_share_count as total24_2_0_,
this_.trend_value as trend25_2_0_,
this_.type as type2_0_
from
news this_
where
exists (select 1 from news_sub_topic where this_.id=news_sub_topics_id)
order by
this_.trend_value desc,
this_.retweet_count desc,
this_.total_lift_count desc,
this_.published_date desc
limit 7;
this is what EXPLAIN gives:
+----+--------------------+----------------+------+--------------------------------------+-------------------+---------+------------------------+-------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------------+------+--------------------------------------+-------------------+---------+------------------------+-------+-----------------------------+
| 1 | PRIMARY | this_ | ALL | NULL | NULL | NULL | NULL | 50910 | Using where; Using filesort |
| 2 | DEPENDENT SUBQUERY | news_sub_topic | ref | FK420C980470B0479,FK420C98041F791FA1 | FK420C980470B0479 | 9 | causelift_web.this_.id | 1 | Using where; Using index |
+----+--------------------+----------------+------+--------------------------------------+-------------------+---------+------------------------+-------+-----------------------------+
2 rows in set (3.28 sec)
this is the CREATE sytax for the news and news_sub_topic table:
CREATE TABLE `news` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`appearance_count` int(11) NOT NULL,
`author` longtext COLLATE utf8_unicode_ci,
`date_added` datetime NOT NULL,
`date_updated` datetime DEFAULT NULL,
`description` longtext COLLATE utf8_unicode_ci,
`link` longtext COLLATE utf8_unicode_ci,
`published_date` datetime NOT NULL,
`raw_link` longtext COLLATE utf8_unicode_ci,
`subtype` varchar(5) COLLATE utf8_unicode_ci NOT NULL,
`thumbnail_img_url` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`title` longtext COLLATE utf8_unicode_ci,
`type` varchar(5) COLLATE utf8_unicode_ci NOT NULL,
`date_last_ranked` datetime DEFAULT NULL,
`lock_id` int(11) DEFAULT NULL,
`plain_text_description` longtext COLLATE utf8_unicode_ci,
`source_url` longtext COLLATE utf8_unicode_ci,
`retweet_count` int(11) DEFAULT NULL,
`frequency_updated` bit(1) DEFAULT NULL,
`hidden` bit(1) DEFAULT NULL,
`processor_done` bit(1) DEFAULT NULL,
`processor_lock_id` int(11) DEFAULT NULL,
`total_hit_count` int(11) DEFAULT NULL,
`total_lift_count` int(11) DEFAULT NULL,
`total_share_count` int(11) DEFAULT NULL,
`trend_value` float DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `link_idx` (`link`(255)),
KEY `type_idx` (`type`),
KEY `processor_lock_id_idx` (`processor_lock_id`),
KEY `processor_done_idx` (`processor_done`),
KEY `hidden_idx` (`hidden`),
KEY `title_idx` (`title`(255)),
KEY `published_date_idx` (`published_date`)
) ENGINE=InnoDB AUTO_INCREMENT=321136 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE `news_sub_topic` (
`news_sub_topics_id` bigint(20) DEFAULT NULL,
`sub_topic_id` bigint(20) DEFAULT NULL,
KEY `FK420C98045D8019D4` (`sub_topic_id`),
KEY `FK420C980470B0479` (`news_sub_topics_id`),
CONSTRAINT `FK420C98041F791FA1` FOREIGN KEY (`news_sub_topics_id`) REFERENCES `news` (`id`),
CONSTRAINT `FK420C98045D8019D4` FOREIGN KEY (`sub_topic_id`) REFERENCES `sub_topic` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
anyone can explain why sometimes it took 65 seconds for this query to finish? but sometimes it is fast? what other factors should i look into (DB config, OS, etc.).
thanks for any leads on this.
my feeling is the 'WHERE' clause is your problem.
Try this code instead:
from
news this_
inner join news_sub_topic topic on this.id=topic.news_sub_topics_id
The explanation is the because the lookup against news is using a filesort and not an index, likely because the where clause isn't giving a good hint to the optimize.