MySQL fulltext search combined with an integer field filter is extremely slow - mysql

My table consists of 600K records. The queries below have been troubling me for a while.
DDL:
CREATE TABLE `cvprofiles` (
`tid` bigint(20) NOT NULL AUTO_INCREMENT,
`partnerId` bigint(20) DEFAULT '0' COMMENT 'resume owner',
`tenant` bigint(20) DEFAULT '0' COMMENT 'sellercompany',
`lngId` int(4) DEFAULT '0',
`firstName` varbinary(150) DEFAULT '',
`lastName` varbinary(150) DEFAULT '',
`profilePicture` varchar(150) COLLATE utf8mb4_unicode_ci DEFAULT '',
`mobileIsd` varchar(20) COLLATE utf8mb4_unicode_ci DEFAULT '',
`mobileNumber` varbinary(80) DEFAULT '',
`emailId` varbinary(250) DEFAULT '',
`altEmailId` varchar(150) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`cvPath` varchar(150) COLLATE utf8mb4_unicode_ci DEFAULT '',
`cvFileName` varchar(250) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`genderCode` int(4) DEFAULT '0',
`dob` date DEFAULT NULL,
`presentLocation` varchar(350) COLLATE utf8mb4_unicode_ci DEFAULT '',
`presentLocationId` bigint(20) DEFAULT NULL,
`countryId` bigint(20) DEFAULT '0' COMMENT 'country to be maintained',
`latitude` decimal(19,15) DEFAULT '0.000000000000000',
`longitude` decimal(19,15) DEFAULT '0.000000000000000',
`totalExp` decimal(5,2) DEFAULT '0.00',
`anyKeywords` varchar(4000) COLLATE utf8mb4_unicode_ci DEFAULT '',
`cvKeywords` mediumtext COLLATE utf8mb4_unicode_ci,
`presentEmployer` varchar(400) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`noticePeriod` int(4) DEFAULT '0',
`presentSalaryCurrId` bigint(20) DEFAULT '0',
`crDate` datetime DEFAULT NULL,
`crUserId` bigint(20) DEFAULT '0',
`luDate` datetime DEFAULT NULL,
`luUserId` bigint(20) DEFAULT '0',
`skillKeywords` text COLLATE utf8mb4_unicode_ci,
`industryKeywords` varchar(2000) COLLATE utf8mb4_unicode_ci DEFAULT '',
`certiKeywords` varchar(2000) COLLATE utf8mb4_unicode_ci DEFAULT '',
`prefLocKeywords` varchar(2000) COLLATE utf8mb4_unicode_ci DEFAULT '',
`educationKeywords` varchar(2000) COLLATE utf8mb4_unicode_ci DEFAULT '',
`roleKeywords` tinytext COLLATE utf8mb4_unicode_ci,
`abilityKeywords` text COLLATE utf8mb4_unicode_ci,
`lngKeywords` text COLLATE utf8mb4_unicode_ci,
`infoKeywords` varchar(600) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'combination of fn,ln,mno,email,present employer designation',
`jobTitleId` bigint(20) DEFAULT NULL,
`source` bigint(6) DEFAULT NULL,
`portalUid` varchar(150) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`salutationId` int(5) DEFAULT NULL,
PRIMARY KEY (`tid`),
KEY `partnerId` (`partnerId`),
KEY `crDate` (`crDate`),
KEY `idx_cvprofiles_firstName` (`firstName`),
KEY `idx_cvprofiles_mobileIsd` (`mobileIsd`),
KEY `idx_cvprofiles_mobileNumber` (`mobileNumber`),
KEY `idx_cvprofiles_emailId` (`emailId`),
KEY `idx_cvprofiles_dob` (`dob`),
KEY `luDate` (`luDate`),
KEY `idx_s_p_m_e` (`tenant`,`partnerId`,`mobileNumber`,`emailId`),
KEY `idx_s_f_l_c_s` (`tenant`,`firstName`,`lastName`,`crDate`),
KEY `sel` (`tenant`),
KEY `c_c` (`crDate`,`crUserId`),
FULLTEXT KEY `fx_cat` (`skillKeywords`,`anyKeywords`,`cvKeywords`,`infoKeywords`,`abilityKeywords`,`lngKeywords`,`roleKeywords`,`industryKeywords`,`educationKeywords`,`prefLocKeywords`)
) ;
Query1: The fulltext query below takes 0.43 sec for the word "java":
select count(*)
from cvprofiles
where match(`skillKeywords`,`anyKeywords`,`cvKeywords`,`infoKeywords`,
`abilityKeywords`,`lngKeywords`,`roleKeywords`,
`industryKeywords`,`educationKeywords`,`prefLocKeywords`)
against ('java' in boolean mode);
Result: 168944 records
Explain:
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------------------+
Query2:
select count(*) from cvprofiles where tenant=429;
Response time: 0.18 sec , Result : 845 records
Explain:
+----+-------------+-----------------+------------+------+-------------------------------+------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------+------------+------+-------------------------------+------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | icrd_resumeBank | NULL | ref | idx_s_p_m_e,idx_s_f_l_c_s,sel | sel | 9 | const | 845 | 100.00 | Using index |
Query3: Combining the fulltext search with the integer field takes more than 45 sec.
select count(*)
from cvprofiles
where match(`skillKeywords`,`anyKeywords`,`cvKeywords`,`infoKeywords`,
`abilityKeywords`,`lngKeywords`,`roleKeywords`,
`industryKeywords`,`educationKeywords`,`prefLocKeywords`)
against ('java' in boolean mode)
and tenant=429;
Response time: 40.12 sec, Result : 452 records
Explain:
+----+-------------+-----------------+------------+----------+--------------------------------------+--------+---------+-------+------+----------+-----------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------+------------+----------+--------------------------------------+--------+---------+-------+------+----------+-----------------------------------+
| 1 | SIMPLE | icrd_resumeBank | NULL | fulltext | idx_s_p_m_e,idx_s_f_l_c_s,sel,fx_cat | fx_cat | 0 | const | 1 | 5.00 | Using where; Ft_hints: no_ranking |
+----+-------------+-----------------+------------+----------+--------------------------------------+--------+---------+-------+------+----------+-----------------------------------+
None of the combinations of the query are working. How can I improve the performance of a fulltext query combined with integer fields?
Using MySQL version: 5.7
Storage Engine: InnoDB

The Optimizer is caught between a rock and a hard place --
whether to use the FULLTEXT index, then filter out those with the wrong tenant. (This is probably faster, and the formulation below may trick it into using it.)
Or to start by fetching on tenant first, then do the FULLTEXT tests. (Please provide SHOW CREATE TABLE cvprofiles so we can see if this is even viable.)
This kludge may make it run faster:
select count(*)
from cvprofiles
where match(`skillKeywords`,`anyKeywords`,`cvKeywords`,`infoKeywords`,
`abilityKeywords`,`lngKeywords`,`roleKeywords`,
`industryKeywords`,`educationKeywords`,`prefLocKeywords`)
against ('java' in boolean mode)
HAVING tenant=429;
The CREATE TABLE may point out flaws in the schema that led to sluggishness.
On second thought, my kludge may not help in this case. It will fetch 168944 rows, then reach into the table to check that many values of tenant; this will be 168944 random lookups. The speed of that depends on disk type (SSD vs HDD) and the value of innodb_buffer_pool_size. So, which disk type and what is that setting, plus how much RAM on the server?
Where I am going with the previous paragraph... If cache space is tight, it will be slow.
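To answer the cache question, it may help to compare the buffer pool size with the size of the table and its indexes. This is a minimal sketch, assuming cvprofiles lives in the currently selected database; the sizes reported by information_schema are estimates.

```sql
-- Configured InnoDB buffer pool size, in bytes.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Approximate size of cvprofiles (data and indexes), in MB.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name = 'cvprofiles';
```

If data_mb plus index_mb is much larger than the buffer pool, many of the 168944 row lookups are likely to go to disk.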
This looks like a case where "index merge intersect" would be useful, but the EXPLAIN says it did not try that. So,... Consider filing a bug report at bugs.mysql.com .

Hard to say without the EXPLAIN statement and table indexes, however most likely MySQL is making a huge temporary table to store the result of the first match and then tries the second filter.
I would recommend setting the 'tenant=429' part as a subquery instead.
select count(*) from
( SELECT `skillKeywords`,`anyKeywords`,`cvKeywords`,`infoKeywords`,`abilityKeywords`,`lngKeywords`,`roleKeywords`,`industryKeywords`,`educationKeywords`,`prefLocKeywords`
from cvprofiles
where tenant=429
) AS x
WHERE match(`skillKeywords`,`anyKeywords`,`cvKeywords`,`infoKeywords`,`abilityKeywords`,`lngKeywords`,`roleKeywords`,`industryKeywords`,`educationKeywords`,`prefLocKeywords`)
against ('java' in boolean mode) ;
Note that you will most likely need to name the columns in the subquery and access them that way.

Related

speed up join between two databases

I have this query and it takes 40 seconds. Is there a way to speed it up? Thank you.
SELECT *, last.Date
FROM constant.derogation der
LEFT JOIN variable.last last
ON der.code = last.code
WHERE 1 = 1
AND status != 'removed'
ORDER BY status;
+------+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | der | ALL | NULL | NULL | NULL | NULL | 318 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | last | ALL | NULL | NULL | NULL | NULL | 250950 | Using where |
+------+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------+
This is the structure of both tables; both databases are on the same server.
I only need one value from the Last table.
DDL
| derogation | CREATE TABLE `derogation` (
`xxx` char(10) NOT NULL,
`xxx` varchar(100) DEFAULT NULL,
`code` char(17) DEFAULT NULL,
`xxx` varchar(20) DEFAULT NULL,
`xxx` char(6) DEFAULT NULL,
`xxx` varchar(50) DEFAULT NULL,
`xxx` varchar(200) DEFAULT NULL,
`xxxx` varchar(200) DEFAULT NULL,
`xxxx` varchar(100) DEFAULT NULL,
`xxxx` datetime DEFAULT NULL,
`xxx` varchar(100) DEFAULT NULL,
`xxxx` varchar(20) DEFAULT NULL,
`xxx` varchar(50) DEFAULT NULL,
`xxx` varchar(10) DEFAULT NULL,
`xxx` datetime DEFAULT NULL,
`xxx` varchar(10) DEFAULT NULL,
`xxx` datetime DEFAULT NULL,
`xxx` datetime DEFAULT NULL,
`status` varchar(20) DEFAULT NULL,
`xxx` varchar(1000) DEFAULT NULL,
KEY `code_index_derogation` (`code`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
| Last | CREATE TABLE `Last` (
`code` char(17) DEFAULT NULL,
`xxx` decimal(10,2) DEFAULT NULL,
`Date` datetime DEFAULT NULL,
`xxxx` datetime DEFAULT NULL,
`xxxx` datetime DEFAULT NULL,
`xxxx` text DEFAULT NULL,
`xxxxx` datetime DEFAULT NULL,
`xxxx` char(6) DEFAULT NULL,
KEY `idx_Last_Code` (`code`),
KEY `idx_Last_xxx` (`xxx`),
KEY `code_index` (`code`),
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
You can use temporary tables, staging tables, or CTEs, or you can add indexes on the columns used in the join.
Temporary tables are often a good way to speed up the execution of a SQL query.
(The problem is subtle, and often overlooked in this forum.)
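Don't mix collations. You are joining on code, but the CHARACTER SET and COLLATION are different between the two tables (utf8 for derogation, latin1 for Last), which can keep the index on code from being used for the join. Suggest using ALTER TABLE .. CONVERT TO .. to get them to be the same.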
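A sketch of what that conversion might look like. It assumes utf8 with utf8_general_ci is the target (matching the DEFAULT CHARSET=utf8 of derogation) and that the table is named Last as in the posted DDL; check the actual column collations first, and note that the ALTER rebuilds the table.

```sql
-- Compare the character set and collation of the two join columns.
SELECT table_schema, table_name, column_name,
       character_set_name, collation_name
FROM information_schema.columns
WHERE column_name = 'code'
  AND table_name IN ('derogation', 'Last');

-- Assumption: convert Last so that it matches derogation.
ALTER TABLE `variable`.`Last`
  CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
```

After the conversion, re-run EXPLAIN on the original query to see whether the join on code now uses an index.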
Other tips:
Use CHAR only for truly fixed-length columns. Use VARCHAR otherwise.
Have a PRIMARY KEY on every table.
Don't say LEFT JOIN if you really expect JOIN; it adds confusion to the reader.

Optimizing a MySQL query on big tables

I have a SQL query joining 3 tables on a remote MySQL DB. Two of these tables (STEP_RESULT and meas_numericlimit) are about 18 GB each, and the remote server creates a TMP table, which takes ages (about 25 min) to finish.
How can I optimize this query ?
select
t1.UUT_NAME,
t1.STATION_NUM,
t1.START_DATE_TIME,
t3.LOW_LIMIT,
t3.DATA,
t3.HIGH_LIMIT,
t3.UNITS,
t2b.STEP_NAME
from
meas_numericlimit t3
inner join STEP_RESULT t2a on t3.ID = t2a.STEP_ID
inner join STEP_RESULT t2b on t2a.STEP_PARENT = t2b.STEP_ID
inner join uut_result t1 on t2b.UUT_RESULT = t1.ID
where
t1.UUT_NAME like 'Variable1-1%' and
t1.STATION_NUM = 'variable2' and
t2b.STEP_NAME = 'variable3' and
t2b.STEP_TYPE = 'constant'
Here are the SHOW TABLES and EXPLAIN outputs:
+--------------------+
| Tables_in_spectrum |
+--------------------+
| cal_dates |
| calibrage |
| execution_time |
| meas_numericlimit |
| station_feature |
| step_callexe |
| step_graph |
| step_msgjnl |
| step_msgpopup |
| step_passfail |
| step_result |
| step_seqcall |
| step_stringvalue |
| syst_event |
| uptime |
| users |
| uut_result |
+--------------------+
and
+----+-------------+-------+--------+-------------------------+---------+---------+-------------------------+----------+--------------------------------+
| id | select_type | table | type   | possible_keys           | key     | key_len | ref                     | rows     | Extra                          |
+----+-------------+-------+--------+-------------------------+---------+---------+-------------------------+----------+--------------------------------+
| 1  | SIMPLE      | t2a   | ALL    | NULL                    | NULL    | NULL    | NULL                    | 48120004 |                                |
| 1  | SIMPLE      | t3    | eq_ref | PRIMARY                 | PRIMARY | 40      | spectrum.t2a.STEP_ID    | 1        |                                |
| 1  | SIMPLE      | t2b   | ALL    | NULL                    | NULL    | NULL    | NULL                    | 48120004 | Using where; Using join buffer |
| 1  | SIMPLE      | t1    | eq_ref | PRIMARY,FK_uut_result_1 | PRIMARY | 40      | spectrum.t2b.UUT_RESULT | 1        | Using where                    |
+----+-------------+-------+--------+-------------------------+---------+---------+-------------------------+----------+--------------------------------+
Here is the SHOW CREATE TABLE output:
CREATE TABLE `uut_result` (
`ID` varchar(38) NOT NULL DEFAULT '',
`STATION_NUM` varchar(255) DEFAULT NULL,
`SOFTVER_ODTGEN` varchar(10) DEFAULT NULL,
`HARDVER_ODTGEN` varchar(10) DEFAULT NULL,
`NEXT_CAL_DATE` date DEFAULT NULL,
`UUT_NAME` varchar(255) DEFAULT NULL,
`UUT_SERIAL_NUMBER` varchar(255) DEFAULT NULL,
`UUT_VERSION` varchar(255) DEFAULT NULL,
`USER_LOGIN_NAME` varchar(255) DEFAULT NULL,
`USER_LOGIN_LOGIN` varchar(255) NOT NULL DEFAULT '',
`START_DATE_TIME` datetime DEFAULT NULL,
`EXECUTION_TIME` float DEFAULT NULL,
`UUT_STATUS` varchar(255) DEFAULT NULL,
`UUT_ERROR_CODE` int(11) DEFAULT NULL,
`UUT_ERROR_MESSAGE` varchar(1023) DEFAULT NULL,
`PAT_NAME` varchar(255) NOT NULL DEFAULT '',
`PAT_VERSION` varchar(10) NOT NULL DEFAULT '',
`TEST_LEVEL` varchar(50) DEFAULT NULL,
`INTERFACE_ID` int(10) unsigned NOT NULL DEFAULT '0',
`EXECUTION_MODE` varchar(45) DEFAULT NULL,
`LOOP_MODE` varchar(45) DEFAULT NULL,
`STOP_ON_FAIL` tinyint(4) unsigned NOT NULL DEFAULT '0',
`EXECUTION_COMMENT` text,
PRIMARY KEY (`ID`),
KEY `FK_uut_result_1` (`STATION_NUM`)
) ENGINE=MyISAM DEFAULT CHARSET=latin;
and
CREATE TABLE `meas_numericlimit` (
`ID` varchar(38) NOT NULL DEFAULT '',
`STEP_RESULT` varchar(38) NOT NULL DEFAULT '',
`NAME` varchar(255) DEFAULT NULL,
`COMP_OPERATOR` varchar(30) DEFAULT NULL,
`HIGH_LIMIT` double DEFAULT NULL,
`LOW_LIMIT` double DEFAULT NULL,
`UNITS` varchar(255) DEFAULT NULL,
`DATA` double DEFAULT NULL,
`STATUS` varchar(255) DEFAULT NULL,
`FORMAT` varchar(15) DEFAULT NULL,
`NANDATA` int(11) DEFAULT '0',
PRIMARY KEY (`ID`),
KEY `FK_meas_numericlimit_1` (`STEP_RESULT`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
and
CREATE TABLE `step_result` (
`ID` varchar(38) NOT NULL DEFAULT '',
`UUT_RESULT` varchar(38) NOT NULL DEFAULT '',
`STEP_PARENT` varchar(38) DEFAULT NULL,
`STEP_NAME` varchar(255) DEFAULT NULL,
`STEP_ID` varchar(38) NOT NULL DEFAULT '',
`STEP_TYPE` varchar(255) DEFAULT NULL,
`STATUS` varchar(255) DEFAULT NULL,
`REPORT_TEXT` text,
`DIAG` text,
`ERROR_OCCURRED` tinyint(1) NOT NULL DEFAULT '0',
`ERROR_CODE` int(11) DEFAULT NULL,
`ERROR_MESSAGE` varchar(1023) DEFAULT NULL,
`MODULE_TIME` float DEFAULT NULL,
`TOTAL_TIME` float DEFAULT NULL,
`NUM_LOOPS` int(11) DEFAULT NULL,
`NUM_PASSED` int(11) DEFAULT NULL,
`NUM_FAILED` int(11) DEFAULT NULL,
`ENDING_LOOP_INDEX` int(11) DEFAULT NULL,
`LOOP_INDEX` int(11) DEFAULT NULL,
`INTERACTIVE_EXENUM` int(11) DEFAULT NULL,
`STEP_GROUP` varchar(30) DEFAULT NULL,
`STEP_INDEX` int(11) DEFAULT NULL,
`ORDER_NUMBER` int(11) DEFAULT NULL,
PRIMARY KEY (`ID`),
KEY `FK_step_result_1` (`UUT_RESULT`),
KEY `IDX_step_parent` (`STEP_PARENT`)
) ENGINE=MyISAM DEFAULT CHARSET=latin
Thank you for your help
What is the value of join_buffer_size? It should not be more than about 1% of RAM. If it is much bigger, you run the risk of swapping, which is especially bad for performance.
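The current setting can be read directly from the server; a small sketch:

```sql
-- Current join buffer size, in bytes and in MB.
SHOW VARIABLES LIKE 'join_buffer_size';
SELECT @@join_buffer_size / 1024 / 1024 AS join_buffer_mb;
```

Comparing join_buffer_mb with the server's total RAM shows whether it is within the roughly 1% guideline mentioned above.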
One thing jumps out in the EXPLAIN: NULL | 48120004 saying that this is needed: INDEX(STEP_ID);
However, the SELECT and the EXPLAIN do not seem to match. Please double check.
uut_result needs INDEX(station_num, uut_name) -- in that order; replaces just (station_num).
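The two index suggestions above could be applied roughly as follows. The index names idx_step_id and idx_station_uut are made up for illustration, and on tables of this size each ALTER is likely to run for a long time, so treat this as a sketch to try on a copy first.

```sql
-- Index for the joins on STEP_ID (t3.ID = t2a.STEP_ID and t2a.STEP_PARENT = t2b.STEP_ID).
ALTER TABLE step_result
  ADD INDEX idx_step_id (STEP_ID);

-- Composite index replacing the single-column index on STATION_NUM.
-- The equality column (STATION_NUM) comes first, the LIKE 'prefix%' column (UUT_NAME) second.
ALTER TABLE uut_result
  DROP INDEX FK_uut_result_1,
  ADD INDEX idx_station_uut (STATION_NUM, UUT_NAME);
```

Re-running EXPLAIN afterwards should show whether the two ALL scans on step_result are gone.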
What is varchar(38)? UUIDs are only 36. IPv6 needs 39.
UUIDs are terribly inefficient when the data is too big to be cached. More discussion: http://mysql.rjweb.org/doc.php/uuid
Lots of datatypes could (should) be shrunken -- this shrinkage will cut down on I/O, which will speed up queries. If you provide some sample values for some typical columns, I can give more advice.
For example, STATUS is (usually) a small number of distinct values. That could be represented as a 1-byte ENUM or a 1-byte TINYINT; but maybe your app has hundreds of different status values? If so, "normalizing" it may be the better answer.
DOUBLE takes 8 bytes; FLOAT takes only 4 bytes, but limits precision to only ~7 significant digits -- perhaps that is sufficient?
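Sample values of the kind asked for above could be collected with queries like these. This is only a sketch; the GROUP BY scans the whole step_result table, so it may be slow.

```sql
-- How many distinct STATUS values exist, and how common is each?
SELECT STATUS, COUNT(*) AS n
FROM step_result
GROUP BY STATUS
ORDER BY n DESC;

-- A few sample IDs, to see what the varchar(38) values actually look like.
SELECT ID, CHAR_LENGTH(ID) AS len
FROM uut_result
LIMIT 5;
```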
(Presumably you meant latin1, not latin?)
Also consider switching to InnoDB.
How much RAM do you have? How big (GB) are the tables?
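The table sizes and the engine switch can be sketched as below, assuming the schema is named spectrum as the EXPLAIN output suggests. Converting rebuilds the table, and InnoDB relies on innodb_buffer_pool_size rather than the MyISAM key cache, so that setting would need reviewing as well.

```sql
-- Engine, estimated row count, Data_length and Index_length (in bytes) per table.
SHOW TABLE STATUS FROM spectrum
WHERE Name IN ('step_result', 'meas_numericlimit', 'uut_result');

-- Convert one table at a time, starting with the smallest.
ALTER TABLE uut_result ENGINE=InnoDB;
```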

how to convert a field to string in a mysql join

I am trying to join a field that is an int(13) to a field that is a varchar(50).
If I only use (a.id = b.id) the DESCRIBE says type: ref.
If I use (a.id = CONCAT(b.id)) the DESCRIBE says type: eq_ref. (where b.id is the integer)
The use of CONCAT to cast a field is ugly, so I tried to use CAST() or CONVERT().
If I use (a.id = CAST(b.id AS CHAR(50))) the DESCRIBE says type: ref.
How do I write a correct cast/convert that gives an eq_ref join?
UPDATE 1:
DESCRIBE SELECT.. with CONCAT
+------+-------------+-----------------------+--------+-----------------------------------+----------------+---------+-------------------------------------+------+----------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-----------------------+--------+-----------------------------------+----------------+---------+-------------------------------------+------+----------------------------------------+
| 1 | SIMPLE | ext_icecat_prodmatch | ref | PRIMARY,our_article_id,product_id | our_article_id | 152 | const | 3016 | Using index condition; Using temporary |
| 1 | SIMPLE | ext_icecat_product | eq_ref | PRIMARY,product_id | PRIMARY | 4 | ext_icecat_prodmatch.product_id | 1 | |
| 1 | SIMPLE | ext_icecat_supplier | eq_ref | PRIMARY | PRIMARY | 4 | ext_icecat_product.supplier_id | 1 | |
| 1 | SIMPLE | products | eq_ref | PRIMARY | PRIMARY | 152 | ext_icecat_prodmatch.our_article_id | 1 | |
| 1 | SIMPLE | partner_product_saved | eq_ref | PRIMARY | PRIMARY | 155 | const,func | 1 | Using where |
| 1 | SIMPLE | category_names | eq_ref | PRIMARY | PRIMARY | 6 | products.category_id,const | 1 | Using where |
+------+-------------+-----------------------+--------+-----------------------------------+----------------+---------+-------------------------------------+------+----------------------------------------+
The Select:
SELECT
partner_product_saved.*,
ext_icecat_product.product_id,
CONCAT(ext_icecat_supplier.name, ' ', ext_icecat_product.name) AS export_product_name,
ext_icecat_product.catid_match AS category_id,
GROUP_CONCAT(ext_icecat_prodmatch.our_article_id) AS oais,
products.file_name,
category_names.category_path
FROM ext_icecat_product
LEFT JOIN ext_icecat_prodmatch USING (product_id)
LEFT JOIN ext_icecat_supplier USING (supplier_id)
LEFT JOIN products USING (our_article_id)
LEFT JOIN partner_product_saved ON (partner_product_saved.partner_id = 29 AND partner_product_saved.product_id = CONCAT(ext_icecat_product.product_id))
LEFT JOIN category_names ON (category_names.category_id = products.category_id AND category_names.language_id = 2)
WHERE ext_icecat_prodmatch.our_article_id = '0EF03850-D25A-1174-BCDC-EC67352010A6'
GROUP BY ext_icecat_product.product_id
ORDER BY NULL;
SHOW CREATE TABLE
CREATE TABLE `partner_product_saved` (
`partner_id` mediumint(8) NOT NULL,
`product_id` varchar(50) CHARACTER SET utf8 NOT NULL,
`product_name` varchar(100) CHARACTER SET utf8 NOT NULL,
`our_article_id` varchar(50) CHARACTER SET utf8 DEFAULT NULL,
`our_category_id` mediumint(8) DEFAULT NULL,
`manufacture_id` mediumint(8) DEFAULT NULL,
`manufacturer_partnr` varchar(255) COLLATE utf8_bin NOT NULL,
`manufacturer_upc` varchar(255) COLLATE utf8_bin NOT NULL,
`image` tinytext COLLATE utf8_bin NOT NULL,
`image_small` tinytext COLLATE utf8_bin NOT NULL,
`image_big` tinytext COLLATE utf8_bin NOT NULL,
`image_200` tinytext COLLATE utf8_bin NOT NULL,
`image_original` tinytext COLLATE utf8_bin NOT NULL,
`image_width` int(11) DEFAULT NULL,
`image_height` int(11) DEFAULT NULL,
`birth` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`last_updated` timestamp NULL DEFAULT NULL,
`saved` tinyint(3) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`partner_id`,`product_id`),
KEY `our_article_id` (`our_article_id`),
KEY `our_category_id` (`our_category_id`),
KEY `manufacture_id` (`manufacture_id`,`manufacturer_partnr`),
KEY `manufacturer_upc` (`manufacturer_upc`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
CREATE TABLE `ext_icecat_product` (
`product_id` int(13) NOT NULL,
`supplier_id` int(13) NOT NULL DEFAULT '0',
`prod_id` varchar(235) COLLATE utf8_bin NOT NULL DEFAULT '',
`prod_id_clean` varchar(255) CHARACTER SET utf8 NOT NULL,
`catid` int(13) NOT NULL DEFAULT '0',
`catid_match` varchar(50) CHARACTER SET utf8 NOT NULL,
`name` varchar(255) CHARACTER SET utf8 NOT NULL,
`name_clean` varchar(255) CHARACTER SET utf8 NOT NULL,
`low_pic` varchar(255) COLLATE utf8_bin NOT NULL DEFAULT '',
`high_pic` varchar(255) COLLATE utf8_bin NOT NULL DEFAULT '',
`thumb_pic` varchar(255) COLLATE utf8_bin DEFAULT NULL,
`family_id` int(13) NOT NULL DEFAULT '0',
`low_pic_size` int(13) DEFAULT '0',
`high_pic_size` int(13) DEFAULT '0',
`thumb_pic_size` int(13) DEFAULT '0',
`import_date` datetime NOT NULL,
`release_date` datetime NOT NULL,
`updated` datetime NOT NULL,
`need_update` tinyint(1) NOT NULL DEFAULT '0',
`deleted` tinyint(1) NOT NULL DEFAULT '0',
`keyword` tinyint(1) NOT NULL DEFAULT '0',
`special_match` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`product_id`),
KEY `supplier_id` (`supplier_id`),
KEY `catid` (`catid`),
KEY `prod_id` (`prod_id`),
KEY `product_id` (`product_id`,`prod_id`,`supplier_id`),
KEY `release_Date` (`release_date`),
KEY `prod_id_clean` (`prod_id_clean`),
KEY `name_clean` (`name_clean`),
KEY `need_update` (`need_update`),
KEY `deleted` (`deleted`),
KEY `keyword` (`keyword`),
KEY `catid_2` (`catid`,`import_date`),
KEY `catid_match` (`catid_match`),
KEY `special_match` (`special_match`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
WHERE indexed_column = any_function(any_column) -- can use index.
WHERE non_indexed_column = any_function(indexed_column) -- cannot use index.
The difference between ref and eq_ref is minor. I think that eq_ref is where the optimizer decides that there cannot be more than one match, often because of UNIQUE.
WHERE ext_icecat_prodmatch.our_article_id = '0EF03850-D25A-1174-BCDC-EC67352010A6' -- is our_article_id INDEXed? or UNIQUE? Sounds like it is only an INDEX, so multiple rows might ensue. To make it eq_ref, you need UNIQUE. But only if the data supports such. The stats imply there might be 3016 rows with that article_id.
Do not use LEFT unless you need it. Note how the Optimizer turned LEFT JOIN ext_icecat_prodmatch USING (product_id) into JOIN and decided (rightly) to start with ext_icecat_prodmatch.
Back to other discussions...
AND partner_product_saved.product_id = CONCAT(ext_icecat_product.product_id))
can go one way, but not the other. That is, it can efficiently go from eip to pps, but not the other way. And EXPLAIN indicated such with const,func.
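The one-way nature of that condition can be checked by comparing both join orders with EXPLAIN. In this sketch STRAIGHT_JOIN is used only to force the join order for the experiment, and the aliases eip and pps are shorthand for ext_icecat_product and partner_product_saved.

```sql
-- eip first, then pps: CONCAT() is applied to a row already read from eip,
-- so the PRIMARY KEY (partner_id, product_id) of pps can be used for the lookup.
EXPLAIN
SELECT pps.product_name
FROM ext_icecat_product AS eip
STRAIGHT_JOIN partner_product_saved AS pps
   ON pps.partner_id = 29
  AND pps.product_id = CONCAT(eip.product_id);

-- pps first, then eip: the indexed column eip.product_id is hidden inside
-- CONCAT(), so its PRIMARY KEY is not expected to be usable for the lookup.
EXPLAIN
SELECT pps.product_name
FROM partner_product_saved AS pps
STRAIGHT_JOIN ext_icecat_product AS eip
   ON pps.product_id = CONCAT(eip.product_id)
WHERE pps.partner_id = 29;
```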

Optimize a joined MySQL query

This query has been bugging me for two days now. It used to work well, but now it slows down the entire cluster environment. The query is shown below:
SELECT userUploads.*,
users_avatar.avatar AS avatar
FROM userUploads
LEFT JOIN users_avatar
ON userUploads.udid = users_avatar.udid
INNER JOIN user_subscription
ON (
user_subscription.sub_1 = 'G:123456789'
AND user_subscription.sub_2 = userUploads.udid
)
WHERE userUploads.platform = 'Private'
AND userUploads.STATUS IN ( 'featured', 'approved' )
ORDER BY userUploads.id DESC
LIMIT 50 OFFSET 0
I would really appreciate if anyone can help out with this query.
Below is the explain of the query:
+----+-------------+-------------------+--------+----------------------+----------+---------+------------------------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------+--------+----------------------+----------+---------+------------------------+------+-----------------------------+
| 1 | SIMPLE | userUploads | range | platform,udid,status | platform | 154 | NULL | 12 | Using where; Using filesort |
| 1 | SIMPLE | users_avatar | eq_ref | PRIMARY | PRIMARY | 182 | Seeds.userUploads.udid | 1 | |
| 1 | SIMPLE | user_subscription | ref | sub_1,sub_2 | sub_1 | 93 | const | 7 | Using where |
+----+-------------+-------------------+--------+----------------------+----------+---------+------------------------+------+-----------------------------+
Thanks in advance
EDIT: The SHOW CREATE TABLE output for the tables is below. I hope you have some ideas, dancrumb.
| users_avatar | CREATE TABLE `users_avatar` (
`udid` varchar(60) COLLATE utf8_unicode_ci NOT NULL DEFAULT '',
`avatar` varchar(448) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`udid`)
) ENGINE=ndbcluster DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
| userUploads | CREATE TABLE `userUploads` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`bdaha` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
`user` varchar(60) COLLATE utf8_unicode_ci DEFAULT NULL,
`direktoren` text COLLATE utf8_unicode_ci,
`filnamnet` varchar(180) COLLATE utf8_unicode_ci DEFAULT NULL,
`karhes` varchar(150) COLLATE utf8_unicode_ci DEFAULT NULL,
`version` char(10) COLLATE utf8_unicode_ci DEFAULT NULL,
`rostat` int(10) DEFAULT NULL,
`stars` int(11) DEFAULT NULL,
`statyn` varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
`platform` char(30) COLLATE utf8_unicode_ci DEFAULT NULL,
`images` int(2) DEFAULT NULL,
`date` char(10) COLLATE utf8_unicode_ci DEFAULT NULL,
`udid` varchar(50) COLLATE utf8_unicode_ci DEFAULT NULL,
`favorirepris` int(8) DEFAULT NULL,
`hikes` char(4) COLLATE utf8_unicode_ci DEFAULT 'no',
`dbn` char(6) COLLATE utf8_unicode_ci DEFAULT NULL,
`timestamp` char(20) COLLATE utf8_unicode_ci DEFAULT NULL,
`comments` int(5) DEFAULT NULL,
`klistret` enum('no','yes') COLLATE utf8_unicode_ci NOT NULL DEFAULT 'no',
PRIMARY KEY (`id`),
KEY `platform` (`platform`,`status`),
KEY `udid` (`udid`),
KEY `hikes` (`hikes`),
KEY `bdaha` (`bdaha`),
KEY `statyn` (`statyn`),
KEY `version` (`version`)
) ENGINE=ndbcluster AUTO_INCREMENT=118831 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
| user_subscription | CREATE TABLE `user_subscription` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`sub_1` varchar(30) COLLATE utf8_unicode_ci DEFAULT NULL,
`sub_2` varchar(30) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `sub_1` (`sub_1`),
KEY `sub_2` (`sub_2`)
) ENGINE=ndbcluster AUTO_INCREMENT=155184 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
Well, you have a filesort on userUploads, which is often slow. You may want to play with indices to remove that. For example, you may want to start with an index on udid, platform, and status.
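That suggested index could be created along these lines. The name idx_udid_platform_status is made up, and the sketch assumes the status column exists as used in the query and in KEY platform, even though the posted column list does not show it.

```sql
-- Composite index on the join column plus the two filter columns.
ALTER TABLE userUploads
  ADD INDEX idx_udid_platform_status (udid, platform, status);
```

Re-run EXPLAIN afterwards to see whether the optimizer picks the new index and whether the filesort remains.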
In general when querying you want to perform the most limiting operations in terms of data returned first so that other operations are only performed on the data that is actually in the results.
In this case try reordering the inner join to user_subscription and the left join to users_avatar. This way it will only attempt to even get the avatar for a user if they are actually in the result set rather than it looking up all the avatars first then filtering based on joins and where clauses.
SELECT userUploads.*,
users_avatar.avatar AS avatar
FROM userUploads
INNER JOIN user_subscription
ON (
user_subscription.sub_1 = 'G:123456789'
AND user_subscription.sub_2 = userUploads.udid
)
LEFT JOIN users_avatar
ON userUploads.udid = users_avatar.udid
WHERE userUploads.platform = 'Private'
AND userUploads.STATUS IN ( 'featured', 'approved' )
ORDER BY userUploads.id DESC
LIMIT 50 OFFSET 0

Filesort query, has index, but not using it

Why is the query below failing to use the index story_id in the story_keywords table?
mysql> EXPLAIN SELECT `stories`.*
-> FROM (`stories`)
-> JOIN `story_keywords` ON `story_keywords`.`story_id` = `stories`.`id`
-> WHERE `image_full_url` != ''
-> AND `order` != 0
-> AND `news_type` IN ('movie', 'movie_review')
-> AND `keyword` IN ('topnews', 'toptablet')
-> GROUP BY `stories`.`id`
-> ORDER BY `created` DESC, `order` DESC
-> LIMIT 5 ;
+----+-------------+----------------+--------+---------------+---------+---------+---------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+--------+---------------+---------+---------+---------------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | story_keywords | ALL | story_id | NULL | NULL | NULL | 42 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | stories | eq_ref | PRIMARY | PRIMARY | 767 | entertainment.story_keywords.story_id | 1 | Using where |
+----+-------------+----------------+--------+---------------+---------+---------+---------------------------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)
mysql> show create table stories;
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| stories | CREATE TABLE `stories` (
`id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`news_type` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`title` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`created` datetime DEFAULT NULL,
`author` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`author_title` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`image_caption` text COLLATE utf8_unicode_ci,
`image_credit` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`image_full_url` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`body` text COLLATE utf8_unicode_ci,
`summary` text COLLATE utf8_unicode_ci,
`external_url` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`order` int(10) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> show create table story_keywords;
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| story_keywords | CREATE TABLE `story_keywords` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`story_id` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`keyword` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `story_id` (`story_id`)
) ENGINE=MyISAM AUTO_INCREMENT=85 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci |
+----------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
It is probably because MySQL believes it is cheaper to fetch ALL rows from the story_keywords table and join them in than to use the index. That sounds odd at first, but if you have to perform 100 index lookups on a table that has only about 100 rows, reading every row costs less. The explanation is simple: one index lookup (for BTREE indexes) is O(ln N), while reading all N rows is O(N), so N lookups cost N * O(ln N), which is more than O(N). Ignoring constant factors, for N = 100 that is roughly 100 row reads versus about 100 * ln(100) ≈ 460 index steps.
To verify this, try selecting just one row from stories (and by one I mean a single row picked by its id, not sorting the whole table and limiting the result), like this:
SELECT `stories`.*
FROM (`stories`)
JOIN `story_keywords` ON `story_keywords`.`story_id` = `stories`.`id`
WHERE `stories`.id = SOMETHING
This query is much more likely to use the index on story_keywords.
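One way to check this is to prefix the query with EXPLAIN and look at the `type` and `key` columns for story_keywords. This is only a sketch: 'some-story-id' is a placeholder for a real id from your table, and the plan you get depends on your data and MySQL version.

```sql
-- 'some-story-id' is a hypothetical value; substitute an id that exists in stories.
EXPLAIN
SELECT `stories`.*
FROM `stories`
JOIN `story_keywords` ON `story_keywords`.`story_id` = `stories`.`id`
WHERE `stories`.`id` = 'some-story-id';
```

If the optimizer picks the index, the story_keywords row should show `story_id` under `key` instead of a NULL key with type ALL.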
Hope this answers your question :)
Anton is on the right track, but I believe there is more to the problem. As my comment on the OP says, the id columns should most likely be INT types. As the EXPLAIN output shows, the key length of the primary key on stories is 767 bytes (255 utf8 characters at up to 3 bytes each, plus 2 length bytes). For an INT column the key length would be 4 bytes, but since the column is a VARCHAR(255), every index entry and every join comparison is far larger than it needs to be.
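If the ids really are numeric, the conversion could look roughly like the following. This is a sketch under that assumption, not something to run blindly: check the data first and take a backup, since MODIFY rewrites the table and non-numeric values will not survive the conversion.

```sql
-- Step 1: count ids that are NOT purely numeric; this should return 0 before converting.
SELECT COUNT(*) AS non_numeric_ids
FROM `stories`
WHERE `id` NOT REGEXP '^[0-9]+$';

-- Step 2: only if the count above is 0 and all values fit in an INT UNSIGNED,
-- change both sides of the join to the same integer type.
ALTER TABLE `stories` MODIFY `id` INT UNSIGNED NOT NULL;
ALTER TABLE `story_keywords` MODIFY `story_id` INT UNSIGNED NOT NULL;
```

Keeping `stories`.`id` and `story_keywords`.`story_id` the same type matters, because the join compares them directly.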
Back to the main problem: since there are no indexes on stories.news_type, stories.order, or story_keywords.keyword, the optimizer decided to do a full scan of story_keywords because that yields the smallest initial result set. If there were an index on one of those columns, it would likely use that first. Once you add an index that the query can use, it will no longer need a full table scan.
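As a rough sketch, the missing indexes could be added like this. The index names are made up for illustration, and which columns are worth indexing depends on the WHERE and ORDER BY clauses of your actual query.

```sql
-- Hypothetical index names; adjust the column choice to match your query.
ALTER TABLE `story_keywords` ADD INDEX `idx_keyword` (`keyword`);
ALTER TABLE `stories` ADD INDEX `idx_news_type_order` (`news_type`, `order`);

-- Refresh index statistics so the optimizer has current row estimates.
ANALYZE TABLE `stories`, `story_keywords`;
```

After that, run EXPLAIN on the original query again and check whether story_keywords still shows type ALL.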