MySQL index with text fields

I am trying to build an index for the article (own_a) table because the query can't use the existing indexes.
The SELECT contains 6 fields from the article table: 3 integer, 1 varchar(30) and 2 text. If I remove the 2 text fields from the query, the index is used successfully. But how do I build an index that includes 2 large text fields? I get the error: "Specified key was too long; max key length is 1000 bytes".
SELECT
own_a.id AS article_id,
own_a.article_number,
own_d.id AS dealer_id,
own_d.name AS dealer,
own_a.tecdoc_article_id,
own_a.description,
own_a.extra_description,
'art' AS type,
al.genart_id,
1 AS own
FROM own_db.article AS own_a
JOIN own_db.dealer AS own_d ON own_a.dealer_id = own_d.id
JOIN `common_db`.`article` AS a ON a.id = own_a.t_article_id
JOIN `common_db`.`article_linkage` AS al ON al.article_id = a.id
WHERE
own_a.id IS NOT NULL
AND 1=1
AND al.genart_id IN (273,251,2462,3229,1334,1080,854,919,188,1632,1626,1213,12,3845,51,191,1547,653,2070,572,654,188,854)
AND ( 1=0 OR a.dealer_id IN (10,101,110,134,140,156,161,192,301,316,317,32,35,55,6,85,89,9,95) )
AND al.type_type = 2
AND al.type_id = 19799
GROUP BY own_a.id
ORDER BY own_a.id
explain SELECT...:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE own_a ALL PRIMARY,article_number,id,idx_id_an,idx_id_did,idx... NULL NULL NULL 20088 Using where; Using temporary; Using filesort
1 SIMPLE al ref ArtId ArtId 17 commondb.own_a.t_article_id,const,con... 2 Using where
1 SIMPLE own_d eq_ref PRIMARY PRIMARY 8 own_db.own_a.dealer_id
1 SIMPLE a eq_ref PRIMARY,dealer_id,dealer_id_index PRIMARY 8 own_db.own_a.t_article_id 1 Using where
USE INDEX (..) didn't help.
Thanks for any help.
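One common way around the "Specified key was too long" error is a prefix index, which indexes only the first N characters of each text column. The sketch below is illustrative only: the index name and the prefix length of 100 are assumptions, not values from the question, and the lengths have to be chosen so the total key stays under the 1000-byte limit for the table's character set.

```sql
-- Sketch only: idx_article_text_prefix and the prefix length 100 are
-- hypothetical; adjust them to the real data and character set.
-- Each TEXT column contributes only its first 100 characters to the key.
CREATE INDEX idx_article_text_prefix
    ON own_db.article (description(100), extra_description(100));
```

Note that a prefix index cannot act as a covering index for those columns, so MySQL still has to read the table rows to return the full description and extra_description values.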

Related

Why is this Mysql statement slow AND using wrong indices

The SQL at the bottom is very slow (~12-15 seconds), and I don't understand why. Before you read the whole thing, just look at the first COALESCE inside GREATEST. If I replace it with "0", the query is very fast (0.0051s). If I run only the contained subquery with a literal "client_id" filled in, it is very fast, too.
The table "rest_io_log" used in that COALESCE contains a lot of entries (more than 5 million) and therefore has many indices so that its contents can be checked quickly.
The two most important indices for this topic are these:
timestamp - contains only this column
account_id, client_id, timestamp - contains these 3 columns in this order
When I prepend this statement with an "EXPLAIN" it says:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
1 | PRIMARY | cl | NULL | range | PRIMARY, index_user_id | index_user_id | 485 | NULL | 2 | 100.00 | Using index condition
1 | PRIMARY | rates | NULL | eq_ref | PRIMARY | PRIMARY | 4 | oauth2.cl.rate | 1 | 100.00 | NULL
4 | DEPENDENT SUBQUERY | traffic | NULL | ref | unique, unique_account_id_client_id_date, index_date, index_account_id_warning_100_client_id_date | unique | 162 | const, const, oauth2.cl.client_id | 1 | 100.00 | Using index condition
3 | DEPENDENT SUBQUERY | traffic | NULL | ref | unique, unique_account_id_client_id_date, index_account_id_warning_100_client_id_date | unique_account_id_client_id_date | 158 | const, oauth2.cl.client_id | 56 | 100.00 | Using where; Using index; Using filesort
2 | DEPENDENT SUBQUERY | rest_io_log | NULL | index | index_client_id, index_account_id_client_id_timestamp, index_account_id_timestamp, index_account_id_duration_timestamp, index_account_id_statuscode, index_account_id_client_id_statuscode, index_account_id_rest_path, index_account_id_client_id_rest_path | timestamp | 5 | NULL | 2 | 5.00 | Using where
On the bottom line we can see that there are many indices available, yet it chooses "timestamp", which is not the best choice because an index that starts with account_id and client_id is available, too.
If I force the right index by adding "USE INDEX (index_account_id_client_id_timestamp)" to the subquery, the execution time drops to 8 seconds and the EXPLAIN looks like this:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
1 | PRIMARY | cl | NULL | range | PRIMARY, index_user_id | index_user_id | 485 | NULL | 2 | 100.00 | Using index condition
1 | PRIMARY | rates | NULL | eq_ref | PRIMARY | PRIMARY | 4 | oauth2.cl.rate | 1 | 100.00 | NULL
4 | DEPENDENT SUBQUERY | traffic | NULL | ref | unique, unique_account_id_client_id_date, index_date... | unique | 162 | const, const, oauth2.cl.client_id | 1 | 100.00 | Using index condition
3 | DEPENDENT SUBQUERY | traffic | NULL | ref | unique, unique_account_id_client_id_date, index_acco... | unique_account_id_client_id_date | 158 | const, oauth2.cl.client_id | 56 | 100.00 | Using where; Using index; Using filesort
2 | DEPENDENT SUBQUERY | rest_io_log | NULL | ref | index_account_id_client_id_timestamp | index_account_id_client_id_timestamp | 157 | const, oauth2.cl.client_id | 1972 | 100.00 | Using where; Using index; Using filesort
SELECT
cl.timestamp AS active_since,
GREATEST
(
COALESCE
(
(
SELECT
timestamp AS last_request
FROM
rest_io_log USE INDEX (index_account_id_client_id_timestamp)
WHERE
account_id = 12345 AND
client_id = cl.client_id
ORDER BY
timestamp DESC
LIMIT
1
),
"0000-00-00 00:00:00"
),
COALESCE
(
(
SELECT
CONCAT(date, " 00:00:00") AS last_request
FROM
traffic
WHERE
account_id = 12345 AND
client_id = cl.client_id
ORDER BY
date DESC
LIMIT
1
),
"0000-00-00 00:00:00"
)
) AS last_request,
(
SELECT
requests
FROM
traffic
WHERE
account_id = 12345 AND
client_id = cl.client_id AND
date=NOW()
) AS traffic_today,
cl.client_id AS user_account_name,
t.rate_name,
t.rate_traffic,
t.rate_price
FROM
clients AS cl
LEFT JOIN
(
SELECT
id AS rate_id,
name AS rate_name,
daily_max_traffic AS rate_traffic,
price AS rate_price
FROM
rates
) AS t
ON cl.rate=t.rate_id
WHERE
cl.user_id LIKE "12345|%"
AND
cl.client_id LIKE "api_%"
AND
cl.client_id LIKE "%_12345"
;
the response of the total query looks like this:
active_since | last_request | traffic_today | user_account_name | rate_name | rate_traffic | rate_price
2019-01-16 15:40:34 | 2019-04-23 00:00:00 | NULL | api_some_account_12345 | Some rate name | 1000 | 0.00
2019-01-16 15:40:34 | 2022-10-27 00:00:00 | NULL | api_some_other_account_12345 | Some rate name | 1000 | 0.00
Can you help?

Why is this Mysql statement slow
Fetching the same row multiple times. Use a JOIN instead of repeated subqueries.
Use MAX instead of ORDER BY and LIMIT 1:
SELECT MAX(timestamp)
FROM ...
WHERE a=12345 AND c=...
Don't use USE INDEX -- what helps today may hurt tomorrow.
Do you really need to fetch both date and timestamp?? Don't they mean the same thing? Or does the data entry need to simplify those down to a single column?
CONCAT(date, " 00:00:00") is identical to date. Making that change lets you combine those first two subqueries.
cl.client_id LIKE "api_%" AND cl.client_id LIKE "%_12345" ==> cl.client_id LIKE 'api%12345'.
Don't use LEFT JOIN ( SELECT ... ) ON ... Instead, simply do LEFT JOIN rates ON ...
Suggested indexes:
rest_io_log: INDEX(account_id, client_id, timestamp)
clients: INDEX(user_id, client_id, rate, timestamp)
rates: INDEX(rate_id, rate_name, rate_traffic, rate_price) -- assuming the above change
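Applied to the original query, the suggestions above might look roughly like the sketch below. It is not a tested rewrite: it keeps the two correlated subqueries rather than converting them to a JOIN, and it assumes that traffic.date is a DATE column, which is why "today" is written as CURDATE() where the question used NOW(). The alias r is introduced here for illustration.

```sql
-- Sketch under stated assumptions; column names are taken from the question.
SELECT
    cl.timestamp AS active_since,
    GREATEST(
        COALESCE(
            -- MAX() instead of ORDER BY ... DESC LIMIT 1, no USE INDEX hint
            (SELECT MAX(timestamp)
               FROM rest_io_log
              WHERE account_id = 12345
                AND client_id = cl.client_id),
            '0000-00-00 00:00:00'),
        COALESCE(
            -- plain date instead of CONCAT(date, ' 00:00:00')
            (SELECT MAX(date)
               FROM traffic
              WHERE account_id = 12345
                AND client_id = cl.client_id),
            '0000-00-00 00:00:00')
    ) AS last_request,
    (SELECT requests
       FROM traffic
      WHERE account_id = 12345
        AND client_id = cl.client_id
        AND date = CURDATE()) AS traffic_today,  -- assumption: DATE column
    cl.client_id AS user_account_name,
    r.name AS rate_name,
    r.daily_max_traffic AS rate_traffic,
    r.price AS rate_price
FROM clients AS cl
LEFT JOIN rates AS r ON cl.rate = r.id   -- direct join, no derived table
WHERE cl.user_id LIKE '12345|%'
  AND cl.client_id LIKE 'api%12345';
```

With the date no longer padded by CONCAT, the formatting of last_request may differ from the original output when the traffic date is the greater value.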

MySQL GROUP BY slows down query x1000 times

I'm struggling to set up a proper, effective index for my Django application, which uses a MySQL database.
The problem concerns the article table, which currently has a little more than 1 million rows, and querying isn't as fast as we want.
Article table structure looks more or less like below:
Field Type
id int
date_published datetime(6)
date_retrieved datetime(6)
title varchar(500)
author varchar(200)
content longtext
source_id int
online tinyint(1)
main_article_of_duplicate_group tinyint(1)
After many attempts, I found that the index below gives the best performance:
CREATE INDEX search_index ON newsarticle(date_published DESC, main_article_of_duplicate_group, source_id, online);
And the problematic query is:
SELECT
`newsarticle`.`id`,
`newsarticle`.`url`,
`newsarticle`.`date_published`,
`newsarticle`.`date_retrieved`,
`newsarticle`.`title`,
`newsarticle`.`summary_provided`,
`newsarticle`.`summary_generated`,
`newsarticle`.`source_id`,
COUNT(CASE WHEN `newsarticlefeedback`.`is_relevant` THEN `newsarticlefeedback`.`id` ELSE NULL END) AS `count_relevent`,
COUNT(`newsarticlefeedback`.`id`) AS `count_nonrelevent`,
(
SELECT U0.`is_relevant`
FROM `newsarticlefeedback` U0
WHERE (U0.`news_id_id` = `newsarticle`.`id` AND U0.`user_id_id` = 27)
ORDER BY U0.`created_date` DESC
LIMIT 1
) AS `is_relevant`,
CASE
WHEN `newsarticle`.`content` = '' THEN 0
ELSE 1
END AS `is_content`,
`newsproviders_newsprovider`.`id`,
`newsproviders_newsprovider`.`name_long`
FROM
`newsarticle` USE INDEX (SEARCH_INDEX)
INNER JOIN
`newsarticle_topics` ON (`newsarticle`.`id` = `newsarticle_topics`.`newsarticle_id`)
LEFT OUTER JOIN
`newsarticlefeedback` ON (`newsarticle`.`id` = `newsarticlefeedback`.`news_id_id`)
LEFT OUTER JOIN
`newsproviders_newsprovider` ON (`newsarticle`.`source_id` = `newsproviders_newsprovider`.`id`)
WHERE
((1)
AND `newsarticle`.`main_article_of_duplicate_group`
AND `newsarticle`.`online`
AND `newsarticle_topics`.`newstopic_id` = 42
AND `newsarticle`.`date_published` >= '2020-08-08 08:39:03.199488')
GROUP BY `newsarticle`.`id`
ORDER BY `newsarticle`.`date_published` DESC
LIMIT 30
NOTE: I have to use the index explicitly; otherwise the query is much slower.
This query takes about 1.4s.
But when I remove only the GROUP BY clause, the query takes an acceptable 1-10ms.
I tried adding the newsarticle id to the index at different positions, but without luck.
This is the output from EXPLAIN (from Django):
ID SELECT_TYPE TABLE PARTITIONS TYPE POSSIBLE_KEYS KEY KEY_LEN REF ROWS FILTERED EXTRA
1 PRIMARY newsarticle_topics None ref newsarticle_t_newsarticle_id_newstopic_6b1123b3_uniq,newsartic_newstopic_id_ddd996b6_fk_summarize newsartic_newstopic_id_ddd996b6_fk_summarize 4 const 312628 100.0 Using temporary; Using filesort
1 PRIMARY newsarticle None eq_ref PRIMARY,newsartic_source_id_6ea2b978_fk_summarize,newsartic_topic_id_b67ae2c9_fk_summarize,kek,last_updated,last_update,search_index,fulltext_idx_content PRIMARY 4 newstech.newsarticle_topics.newsarticle_id 1 22.69 Using where
1 PRIMARY newsarticlefeedback None ref newsartic_news_id_id_5af7594b_fk_summarize newsartic_news_id_id_5af7594b_fk_summarize 5 newstech.newsarticle_topics.newsarticle_id 1 100.0 None
1 PRIMARY newsproviders_newsprovider None eq_ref PRIMARY, PRIMARY 4 newstech.newsarticle.source_id 1 100.0 None
2 DEPENDENT SUBQUERY U0 None ref newsartic_news_id_id_5af7594b_fk_summarize,newsartic_user_id_id_fc217cfe_fk_auth_user newsartic_user_id_id_fc217cfe_fk_auth_user 5 const 1 10.0 Using where; Using filesort
Interestingly, the same query gives a different EXPLAIN in MySQL Workbench and in Django debug toolbar (I can paste the EXPLAIN from Workbench as well if you want). But the performance is more or less the same.
Do you have an idea how to improve the index so that the query runs quickly?
Thanks
EDIT:
Here is the EXPLAIN from MySQL Workbench, which is different but seems more realistic (I'm not sure why Django debug toolbar explains it differently):
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY newsarticle NULL range PRIMARY,newsartic_source_id_6ea2b978_fk_,newsartic_topic_id_b67ae2c9_fk,kek,last_updated,last_update,search_index,fulltext_idx_content search_index 8 NULL 227426 81.00 Using index condition; Using MRR; Using temporary; Using filesort
1 PRIMARY newsarticle_topics NULL eq_ref newsarticle_t_newsarticle_id_newstopic_6b1123b3_uniq,newsartic_newstopic_id_ddd996b6_fk newsarticle_t_newsarticle_id_newstopic_6b1123b3_uniq 8 newstech.newsarticle.id,const 1 100.00 Using index
1 PRIMARY newsarticlefeedback NULL ref newsartic_news_id_id_5af7594b_fk newsartic_news_id_id_5af7594b_fk 5 newstech.newsarticle.id 1 100.00 NULL
1 PRIMARY newsproviders_newsprovider NULL eq_ref PRIMARY PRIMARY 4 newstech.newsarticle.source_id 1 100.00 NULL
2 DEPENDENT SUBQUERY U0 NULL ref newsartic_news_id_id_5af7594b_fk,newsartic_user_id_id_fc217cfe_fk_auth_user newsartic_user_id_id_fc217cfe_fk_auth_user 5 const 1 10.00 Using where; Using filesort
EDIT2:
Below is EXPLAIN when I remove GROUP BY from the query (used MySQL Workbench):
id,select_type,table,partitions,type,possible_keys,key,key_len,ref,rows,filtered,Extra
1,SIMPLE,newsarticle,NULL,range,search_index,search_index,8,NULL,227426,81.00,"Using index condition"
1,SIMPLE,newsarticle_topics,NULL,eq_ref,"newsarticle_t_newsarticle_id_newstopic_6b1123b3_uniq,newsartic_newstopic_id_ddd996b6_fk",newsarticle_t_newsarticle_id_newstopic_6b1123b3_uniq,8,"newstech.newsarticle.id,const",1,100.00,"Using index"
1,SIMPLE,newsarticlefeedback,NULL,ref,newsartic_news_id_id_5af7594b_fk,newsartic_news_id_id_5af7594b_fk,5,newstech.newsarticle.id,1,100.00,"Using index"
1,SIMPLE,newsproviders_newsprovider,NULL,eq_ref,"PRIMARY,",PRIMARY,4,newstech.newsarticle.source_id,1,100.00,NULL
EDIT3:
After applying the changes suggested by Rick (thanks!):
newsarticle(id, online, main_article_of_duplicate_group, date_published)
two indexes for newsarticle_topics: (newstopic_id, newsarticle_id) and (newsarticle_id, newstopic_id)
With USE INDEX (takes 1.2s)
EXPLAIN:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY newsarticle_topics NULL ref newsarticle_t_newsarticle_id_newstopic_6b1123b3_uniq,opposite opposite 4 const 346286 100.00 Using index; Using temporary; Using filesort
1 PRIMARY newsarticle NULL ref search_index search_index 4 newstech.newsarticle_topics.newsarticle_id 1 27.00 Using index condition
1 PRIMARY newsproviders_newsprovider NULL eq_ref PRIMARY,filter_index PRIMARY 4 newstech.newsarticle.source_id 1 100.00 NULL
4 DEPENDENT SUBQUERY U0 NULL ref newsartic_news_id_id_5af7594b_fk,feedback_index feedback_index 5 newstech.newsarticle.id 1 100.00 Using filesort
3 DEPENDENT SUBQUERY U0 NULL ref newsartic_news_id_id_5af7594b_fk,feedback_index newsartic_news_id_id_5af7594b_fk 5 newstech.newsarticle.id 1 10.00 Using where
2 DEPENDENT SUBQUERY U0 NULL ref newsartic_news_id_id_5af7594b_fk,feedback_index newsartic_news_id_id_5af7594b_fk 5 newstech.newsarticle.id 1 90.00 Using where
Without the USE INDEX clause (takes 2.6s)
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY newsarticle_topics NULL ref newsarticle_t_newsarticle_id_newstopic_6b1123b3_uniq,opposite opposite 4 const 346286 100.00 Using index; Using temporary; Using filesort
1 PRIMARY newsarticle NULL eq_ref PRIMARY,search_index PRIMARY 4 newstech.newsarticle_topics.newsarticle_id 1 27.00 Using where
1 PRIMARY newsproviders_newsprovider NULL eq_ref PRIMARY,filter_index PRIMARY 4 newstech.newsarticle.source_id 1 100.00 NULL
4 DEPENDENT SUBQUERY U0 NULL ref newsartic_news_id_id_5af7594b_fk,feedback_index feedback_index 5 newstech.newsarticle.id 1 100.00 Using filesort
3 DEPENDENT SUBQUERY U0 NULL ref newsartic_news_id_id_5af7594b_fk,feedback_index newsartic_news_id_id_5af7594b_fk 5 newstech.newsarticle.id 1 10.00 Using where
2 DEPENDENT SUBQUERY U0 NULL ref newsartic_news_id_id_5af7594b_fk,feedback_index newsartic_news_id_id_5af7594b_fk 5 newstech.newsarticle.id 1 90.00 Using where
For comparison, the index newsarticle(date_published DESC, main_article_of_duplicate_group, source_id, online) with USE INDEX (takes only 1-3ms!)
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY newsarticle NULL range search_index search_index 8 NULL 238876 81.00 Using index condition
1 PRIMARY newsproviders_newsprovider NULL eq_ref PRIMARY,filter_index PRIMARY 4 newstech.newsarticle.source_id 1 100.00 NULL
1 PRIMARY newsarticle_topics NULL eq_ref newsarticle_t_newsarticle_id_newstopic_6b1123b3_uniq,opposite newsarticle_t_newsarticle_id_newstopic_6b1123b3_uniq 8 newstech.newsarticle.id,const 1 100.00 Using index
4 DEPENDENT SUBQUERY U0 NULL ref newsartic_news_id_id_5af7594b_fk,feedback_index feedback_index 5 newstech.newsarticle.id 1 100.00 Using filesort
3 DEPENDENT SUBQUERY U0 NULL ref newsartic_news_id_id_5af7594b_fk,feedback_index feedback_index 6 newstech.newsarticle.id,const 1 100.00 Using index
2 DEPENDENT SUBQUERY U0 NULL ref newsartic_news_id_id_5af7594b_fk,feedback_index feedback_index 5 newstech.newsarticle.id 1 90.00 Using where; Using index

Is main_article_of_duplicate_group a true/false flag?
If the Optimizer chooses to start with newsarticle_topics:
newsarticle_topics: INDEX(newstopic_id, newsarticle_id)
newsarticle: INDEX(id, online, main_article_of_duplicate_group, date_published)
If newsarticle_topics is a many-to-many mapping table, get rid of id and make the PRIMARY KEY be that pair, plus a secondary index in the opposite direction. More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
If the Optimizer chooses to start with newsarticle (which seems more likely):
newsarticle_topics: INDEX(newsarticle_id, newstopic_id)
newsarticle: INDEX(online, main_article_of_duplicate_group, date_published)
Meanwhile, newsarticlefeedback needs this, in the order given:
INDEX(news_id_id, user_id_id, created_date, is_relevant)
Instead of
COUNT(`newsarticlefeedback`.`id`) AS `count_nonrelevent`,
LEFT OUTER JOIN `newsarticlefeedback`
ON (`newsarticle`.`id` = `newsarticlefeedback`.`news_id_id`)
have
( SELECT COUNT(*) FROM newsarticlefeedback
WHERE `newsarticle`.`id` = `newsarticlefeedback`.`news_id_id`
) AS `count_nonrelevent`,
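As a rough illustration, the index suggestions above for the case where the Optimizer starts with newsarticle could be written as the DDL below. The index names are hypothetical, and the existing unique key on newsarticle_topics may already cover the (newsarticle_id, newstopic_id) pair, in which case the first statement is unnecessary.

```sql
-- Sketch only: all index names are made up for illustration.
-- Mapping table, article-first direction.
ALTER TABLE newsarticle_topics
    ADD INDEX idx_article_topic (newsarticle_id, newstopic_id);

-- Flags first (equality tests), then the range/sort column.
ALTER TABLE newsarticle
    ADD INDEX idx_online_main_date
        (online, main_article_of_duplicate_group, date_published);

-- Supports the per-user "latest feedback" subquery, in the order given.
ALTER TABLE newsarticlefeedback
    ADD INDEX idx_feedback_lookup
        (news_id_id, user_id_id, created_date, is_relevant);
```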

I happen to have a technique that works well with "news articles" that are categorized, filtered, and ordered by date. It even handles "embargo", "expired", soft "deleted", etc.
The big goal is to touch only 30 rows when performing
ORDER BY `newsarticle`.`date_published` DESC
LIMIT 30
But currently the WHERE clause must look at two tables to do the filtering. And that leads to touching 35K, or probably more, rows.
It requires building a simple table on the side that has 3 columns:
topic (or other filtering category),
date (for fetching only the latest 30),
article_id (for doing only 30 JOINs to get the rest of the article info)
Suitable indexing on that table makes the search very efficient.
With suitable DELETEs in this table, simple flags like online or main_article can be efficiently handled. Do not include flags in this extra table; instead do not include any rows that should not be shown.
More details: http://mysql.rjweb.org/doc.php/lists
(I have watched other "news" sites melt down because they did not use this technique.)
Note that the difference between 30 and 35K is about 1000x.
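A minimal sketch of such a side table follows. The table name article_list and its column names are hypothetical, as is the choice of primary key; the linked page describes the full technique. Rows are only inserted for articles that should be shown, so no flag columns are needed.

```sql
-- Sketch only: article_list and its columns are illustrative names.
CREATE TABLE article_list (
    topic_id       INT         NOT NULL,  -- filtering category
    date_published DATETIME(6) NOT NULL,  -- for fetching the latest 30
    article_id     INT         NOT NULL,  -- for the 30 JOINs
    PRIMARY KEY (topic_id, date_published, article_id)
) ENGINE=InnoDB;

-- Find the 30 newest article ids for one topic using only the side table,
-- then join to newsarticle for the remaining columns.
SELECT a.id, a.title, a.date_published
FROM (
    SELECT article_id
    FROM article_list
    WHERE topic_id = 42
      AND date_published >= '2020-08-08 08:39:03.199488'
    ORDER BY date_published DESC
    LIMIT 30
) AS latest
JOIN newsarticle AS a ON a.id = latest.article_id
ORDER BY a.date_published DESC;
```

Because the primary key starts with topic_id and date_published, the inner query can read the matching range in index order and stop after 30 rows.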

Finally, I figured out what the problem with this query is.
First of all, in Django a GROUP BY clause is added automatically when Count is used in an annotation. So the easiest solution was to avoid it by using nested annotations.
This is explained well in the answer here: https://stackoverflow.com/a/43771738/4464554
Thanks, everyone, for your time and help :)

Radically slower subquery without autokey in MySQL 5.7 vs 5.6 - any way to force index?

I have a datehelper table with every YYYY-MM-DD as DATE between the years 2000 and 2100. To this I'm joining a subquery for all unit transactions. unit.end is a DATETIME so my subquery simplifies it to DATE and uses that to join to the datehelper table.
In 5.6 this query takes a couple seconds to run a massive amount of transactions, and it derives a table that is auto keyed based on the DATE(unit.end) in the subquery and uses that to join everything else fairly quickly.
In 5.7, it takes 600+ seconds and I can't get it to derive a table or follow the much better execution plan that 5.6 used. Is there a flag I need to set or some way to prefer the old execution plan?
Here's the query:
EXPLAIN SELECT datehelper.id AS date, MONTH(datehelper.id)-1 AS month, DATE_FORMAT(datehelper.id,'%d')-1 AS day,
IFNULL(SUM(a.total),0) AS total, IFNULL(SUM(a.tax),0) AS tax, IFNULL(SUM(a.notax),0) AS notax
FROM datehelper
LEFT JOIN
(SELECT
DATE(unit.end) AS endDate,
getFinalPrice(unit.id) AS total, tax, getFinalPrice(unit.id)-tax AS notax
FROM unit
INNER JOIN products ON products.id=unit.productID
INNER JOIN prodtypes FORCE INDEX(primary) ON prodtypes.id=products.prodtypeID
WHERE franchiseID='1' AND void=0 AND checkout=1
AND end BETWEEN '2020-01-01' AND DATE_ADD('2020-01-01', INTERVAL 1 YEAR)
AND products.prodtypeID NOT IN (1,10)
) AS a ON a.endDate=datehelper.id
WHERE datehelper.id BETWEEN '2020-01-01' AND '2020-12-31'
GROUP BY datehelper.id ORDER BY datehelper.id;
5.6 result (much faster):
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY datehelper range PRIMARY PRIMARY 3 NULL 365 Using where; Using index
1 PRIMARY <derived2> ref <auto_key0> <auto_key0> 4 datehelper.id 10 NULL
2 DERIVED prodtypes index PRIMARY PRIMARY 4 NULL 10 Using where; Using index
2 DERIVED products ref PRIMARY,prodtypeID prodtypeID 4 prodtypes.id 9 Using index
2 DERIVED unit ref productID,end,void,franchiseID productID 9 products.id 2622 Using where
5.7 result (much slower, no auto key found):
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE datehelper NULL range PRIMARY PRIMARY 3 NULL 366 100.00 Using where; Using index
1 SIMPLE unit NULL ref productID,end,void,franchiseID franchiseID 4 const 181727 100.00 Using where
1 SIMPLE products NULL eq_ref PRIMARY,prodtypeID PRIMARY 8 barkops3.unit.productID 1 100.00 Using where
1 SIMPLE prodtypes NULL eq_ref PRIMARY PRIMARY 4 barkops3.products.prodtypeID 1 100.00 Using index

I found the problem. It was the optimizer_switch 'derived_merge' flag which is new to 5.7.
https://dev.mysql.com/doc/refman/5.7/en/derived-table-optimization.html
This flag overrides materialization of derived tables if the optimizer thinks the outer WHERE can be pushed down into a subquery. In this case, that optimization was enormously more costly than joining a materialized table on an auto_key.
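For reference, the flag can be inspected and switched off for the current session as sketched below. Whether to change it per session or server-wide is a judgment call the answer does not address; the session form is shown here only because it limits the effect to the one connection.

```sql
-- Show the current optimizer switches, including derived_merge.
SELECT @@optimizer_switch;

-- Disable merging of derived tables for this session only, so the
-- subquery is materialized as it was in 5.6.
SET SESSION optimizer_switch = 'derived_merge=off';

-- ... run the report query here ...

-- Restore the 5.7 default afterwards.
SET SESSION optimizer_switch = 'derived_merge=on';
```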

EXPLAIN type "all". Not using index on inner join for no reason

I'm running a query like the one below (where '70' is an example, because I use the same query in my PHP code, changing only that value for every "clasificacion"), and I wanted to optimize it, so I used EXPLAIN:
EXPLAIN SELECT TE.PK_ID_TAREA_EMPRESA AS ID_EMPRESA, TE.NOMBRE AS NOMBRE_EMPRESA, TCA.PK_ID_TAREA_CATEGORIA AS ID_CATEGORIA, TCA.NOMBRE AS NOMBRE_CATEGORIA, T.TIPO AS TIPO, T.CLIENTE AS CLIENTE, T.PETICION AS PETICION, T.FACTURABLE AS FACTURABLE, TCOM.COMENTARIO AS COMENTARIO, TCOM.PENDIENTE AS PENDIENTE FROM TAREA_COMENTARIO TCOM
INNER JOIN TAREA T
ON TCOM.TAREA_TIPO = T.TIPO
AND TCOM.TAREA_CLIENTE = T.CLIENTE
AND TCOM.TAREA_PETICION = T.PETICION
INNER JOIN TAREA_EMPRESA_ TE
ON TCOM.FK_ID_TAREA_EMPRESA = TE.PK_ID_TAREA_EMPRESA
INNER JOIN TAREA_CATEGORIA_ TCA
ON TCOM.FK_ID_TAREA_CATEGORIA = TCA.PK_ID_TAREA_CATEGORIA
INNER JOIN TAREA_CLASIFICACION_ TCL
ON TCOM.FK_ID_TAREA_CLASIFICACION = TCL.PK_ID_TAREA_CLASIFICACION
WHERE TCOM.FK_ID_TAREA_CLASIFICACION = 70
GROUP BY TCOM.FK_ID_TAREA_EMPRESA, TCOM.FK_ID_TAREA_CATEGORIA, TCOM.FK_ID_TAREA_CLASIFICACION, TCOM.TAREA_TIPO, TCOM.TAREA_CLIENTE, TCOM.TAREA_PETICION;
EXPLAIN returns me this:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE TCL const PRIMARY PRIMARY 4 const 1 Using index; Using temporary; Using filesort
1 SIMPLE TCOM ref PRIMARY,FK_COMENTARIO_CLASIFICACION,FK_COMENTARIO_EMPRESA,FK_COMENTARIO_CATEGORIA FK_COMENTARIO_CLASIFICACION 4 const 18
1 SIMPLE TCA eq_ref PRIMARY PRIMARY 4 FACTURACION_WEB_DEV.TCOM.FK_ID_TAREA_CATEGORIA 1
1 SIMPLE TE ALL PRIMARY (NULL) (NULL) (NULL) 3 Using where
1 SIMPLE T ref TIPO,CLIENTE,PETICION CLIENTE 768 FACTURACION_WEB_DEV.TCOM.TAREA_CLIENTE 76 Using where
That "type ALL" on TE (TAREA_EMPRESA_) keeps bothering me because it doesn't make sense. I have indexes on every single column (that PRIMARY included), so I don't know why it's not using the index when joining both tables.
This is what I have on TE:
[image: Columns]
[image: Indexes]
Any ideas? Thanks in advance!
EDIT:
TAREA_EMPRESA_ contains this:
PK_ID_TAREA_EMPRESA NOMBRE ORDEN FECHA_BAJA
1 CR ENERGIA 3 (NULL)
2 SPAIRAL COMMERCE 2 (NULL)
3 KNET COMUNICACIONES 4 (NULL)
4 IR SOLUCIONES 1 (NULL)

Slow SQL Query on MATCH sorting by Relevance

I have the following Query
SELECT
product.AID,
product.ART_ID,
product.EAN,
productdetails.DESCRIPTION_SHORT,
MAX(
(100000 * (MATCH(productdetails.DESCRIPTION_SHORT) AGAINST ('"psen in1p"' IN BOOLEAN MODE)))+
(100000 * (MATCH(product.ART_ID) AGAINST ('"psen in1p"' IN BOOLEAN MODE)))+
(100000 * (MATCH(product.EAN) AGAINST ('"psen in1p"' IN BOOLEAN MODE)))+
(100000 * (MATCH(product.SUPPLIER_ALT_PID) AGAINST ('"psen in1p"' IN BOOLEAN MODE)))+
(10 * (MATCH(productdetails.DESCRIPTION_LONG) AGAINST ('*psen in1p*' IN BOOLEAN MODE)))+
(2 * (MATCH(productdetails.KEYWORD) AGAINST ('+psen +in1p' IN BOOLEAN MODE)))
) AS relevance
FROM
tbl_product as product
INNER JOIN
`tbl_product_details` as productdetails ON product.AID = productdetails.AID
WHERE MATCH
(product.ART_ID,
product.EAN,
product.SUPPLIER_ALT_PID,
product.ERP_GROUP_SUPPLIER) AGAINST ('*psen* *in1p*' IN BOOLEAN MODE)
OR MATCH
(productdetails.DESCRIPTION_SHORT,
productdetails.DESCRIPTION_LONG,
productdetails.MANUFACTURER_TYPE_DESC,
productdetails.KEYWORD) AGAINST ('*psen* *in1p*' IN BOOLEAN MODE)
GROUP BY
product.AID
ORDER BY
relevance DESC
My problem is that the query takes about 3 seconds, which is far too much. If I run the statement without ORDER BY, it takes about 0,0096 seconds, which is perfect. I don't know why it takes so long. I already tried wrapping it in a subselect and ordering the subselect, with the same result (about 3 seconds to finish). The same goes for a subselect without ORDER BY.
The database has about 600k records and over 1 million records in tbl_product_details.
I'm thankful for any help with this problem.
Explain for the Query with Order By (3 Seconds)
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE product index PRIMARY,tbl_product_catalog_id_foreign,tbl_product_supplier_id_foreign,tbl_product_art_id_index,tbl_product_ean_index,SUPPLIER_ALT_PID,ART_ID_2,ft_artid,ft_ean,ft_sapid PRIMARY 4 NULL 569643 Using temporary; Using filesort
1 SIMPLE productdetails ref tbl_product_details_aid_foreign tbl_product_details_aid_foreign 5 shop_meyle1.product.AID 1 Using where
Explain for the Query without Order By (0,01 Seconds)
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE product index PRIMARY,tbl_product_catalog_id_foreign,tbl_product... PRIMARY 4 NULL 569643 NULL
1 SIMPLE productdetails ref tbl_product_details_aid_foreign tbl_product_details_aid_foreign 5 shop_meyle1.product.AID 1 Using where
Explain for the Query without Order By and with Subselect (3 Seconds)
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 569643 NULL
2 DERIVED product index PRIMARY,tbl_product_catalog_id_foreign,tbl_product... PRIMARY 4 NULL 569643 NULL
2 DERIVED productdetails ref tbl_product_details_aid_foreign tbl_product_details_aid_foreign 5 shop_meyle1.product.AID 1 Using where
