Paginate count query in CakePHP 3.x slow - MySQL

I have a problem with paginate's count() query, still present in CakePHP 3.3:
My table has 6,000,000 records.
The fields involved here are name and cityf; both are indexed in MySQL.
I'm paging 10 records at a time, and although the main query is very fast, paginate's count() is taking more than 50 seconds.
How can I solve this in CakePHP 3.3? The two SQL statements and their times follow below.
Main select query:
SELECT
Rr.id AS `Rr__id`,
Rr.idn AS `Rr__idn`,
Rr.aniver AS `Rr__aniver`,
Rr.pessoa AS `Rr__pessoa`,
Rr.name AS `Rr__name`,
Rr.phoner AS `Rr__phoner`,
Rr.tipolf AS `Rr__tipolf`,
Rr.addressf AS `Rr__addressf`,
Rr.num_endf AS `Rr__num_endf`,
Rr.complem AS `Rr__complem`,
Rr.bairrof AS `Rr__bairrof`,
Rr.cityf AS `Rr__cityf`,
Rr.statef AS `Rr__statef`,
Rr.cepf AS `Rr__cepf`,
Rr.n1 AS `Rr__n1`,
Rr.n2 AS `Rr__n2`,
Rr.smerc AS `Rr__smerc`,
Rr.n3 AS `Rr__n3`,
Rr.n4 AS `Rr__n4`,
Rr.fone AS `Rr__fone`,
Rr.numero AS `Rr__numero`
FROM
`MG` Rr
WHERE
(
Rr.name like 'MARCOS%'
AND Rr.cityf like 'BELO HORIZONTE%'
)
ORDER BY
name asc
LIMIT
10 OFFSET 0
= 10 ms
Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Rr range NAME,CITYF,cityfbairrof,cityfaddressf,cityfbairrofaddressf,namen1n2n3n4 NAME 63 NULL 21345 Using index condition; Using where
Count select query:
SELECT
(
COUNT(*)
) AS `count`
FROM
`MG` Rr
WHERE
(
Rr.name like 'MARCOS%'
AND Rr.cityf like 'BELO HORIZONTE%'
)
= 51,247 ms (about 51 seconds)
Explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE Rr range NAME,CITYF,cityfbairrof,cityfaddressf,cityfbairrofaddressf,namen1n2n3n4 NAME 63 NULL 21345 Using index condition; Using where
The same thing happens in several other cases: the count query is always very slow.
I appreciate any help.
Marcos
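One thing worth sketching here: the count has to examine every row matching the WHERE, and with separate single-column indexes MySQL still fetches table rows to check the second condition. A composite index covering both filter columns would let it satisfy COUNT(*) from the index alone. A minimal sketch (the index name is made up):
ALTER TABLE MG ADD INDEX idx_name_cityf (name, cityf);
-- LIKE 'MARCOS%' seeks on the name prefix, cityf is filtered within the
-- same index entries, and COUNT(*) never touches the 6,000,000 table rows
-- (EXPLAIN should then show "Using index").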

Related

Why is this MySQL statement slow AND using the wrong indices

The SQL at the bottom is super slow (~12-15 seconds), and I don't understand why. Before you read the whole thing, just check the first COALESCE. If I replace it with "0", the query is super fast (0.0051 s). If I run only the subquery it contains, with client_id set to a fixed value, it is super fast, too.
The table rest_io_log used in that COALESCE contains a lot of entries (more than 5 million) and therefore has lots of indexes so lookups stay fast.
The two most important indexes for this topic are these:
timestamp - contains only this column
account_id, client_id, timestamp - contains these 3 columns in this order
When I prepend this statement with an "EXPLAIN" it says:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY cl NULL range PRIMARY,index_user_id index_user_id 485 NULL 2 100.00 Using index condition
1 PRIMARY rates NULL eq_ref PRIMARY PRIMARY 4 oauth2.cl.rate 1 100.00 NULL
4 DEPENDENT SUBQUERY traffic NULL ref unique,unique_account_id_client_id_date,index_date,index_account_id_warning_100_client_id_date unique 162 const,const,oauth2.cl.client_id 1 100.00 Using index condition
3 DEPENDENT SUBQUERY traffic NULL ref unique,unique_account_id_client_id_date,index_account_id_warning_100_client_id_date unique_account_id_client_id_date 158 const,oauth2.cl.client_id 56 100.00 Using where; Using index; Using filesort
2 DEPENDENT SUBQUERY rest_io_log NULL index index_client_id,index_account_id_client_id_timestamp,index_account_id_timestamp,index_account_id_duration_timestamp,index_account_id_statuscode,index_account_id_client_id_statuscode,index_account_id_rest_path,index_account_id_client_id_rest_path timestamp 5 NULL 2 5.00 Using where
On the bottom line we can see there are tons of indexes available, and it chooses timestamp, which is actually not the best choice because account_id and client_id are available, too.
If I force the right index by adding USE INDEX (index_account_id_client_id_timestamp) to the subquery, the execution time is reduced to 8 seconds and the EXPLAIN looks like this:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY cl NULL range PRIMARY,index_user_id index_user_id 485 NULL 2 100.00 Using index condition
1 PRIMARY rates NULL eq_ref PRIMARY PRIMARY 4 oauth2.cl.rate 1 100.00 NULL
4 DEPENDENT SUBQUERY traffic NULL ref unique,unique_account_id_client_id_date,index_date... unique 162 const,const,oauth2.cl.client_id 1 100.00 Using index condition
3 DEPENDENT SUBQUERY traffic NULL ref unique,unique_account_id_client_id_date,index_acco... unique_account_id_client_id_date 158 const,oauth2.cl.client_id 56 100.00 Using where; Using index; Using filesort
2 DEPENDENT SUBQUERY rest_io_log NULL ref index_account_id_client_id_timestamp index_account_id_client_id_timestamp 157 const,oauth2.cl.client_id 1972 100.00 Using where; Using index; Using filesort
SELECT
cl.timestamp AS active_since,
GREATEST
(
COALESCE
(
(
SELECT
timestamp AS last_request
FROM
rest_io_log USE INDEX (index_account_id_client_id_timestamp)
WHERE
account_id = 12345 AND
client_id = cl.client_id
ORDER BY
timestamp DESC
LIMIT
1
),
"0000-00-00 00:00:00"
),
COALESCE
(
(
SELECT
CONCAT(date, " 00:00:00") AS last_request
FROM
traffic
WHERE
account_id = 12345 AND
client_id = cl.client_id
ORDER BY
date DESC
LIMIT
1
),
"0000-00-00 00:00:00"
)
) AS last_request,
(
SELECT
requests
FROM
traffic
WHERE
account_id = 12345 AND
client_id = cl.client_id AND
date=NOW()
) AS traffic_today,
cl.client_id AS user_account_name,
t.rate_name,
t.rate_traffic,
t.rate_price
FROM
clients AS cl
LEFT JOIN
(
SELECT
id AS rate_id,
name AS rate_name,
daily_max_traffic AS rate_traffic,
price AS rate_price
FROM
rates
) AS t
ON cl.rate=t.rate_id
WHERE
cl.user_id LIKE "12345|%"
AND
cl.client_id LIKE "api_%"
AND
cl.client_id LIKE "%_12345"
;
The response of the full query looks like this:
active_since  last_request  traffic_today  user_account_name  rate_name  rate_traffic  rate_price
2019-01-16 15:40:34  2019-04-23 00:00:00  NULL  api_some_account_12345  Some rate name  1000  0.00
2019-01-16 15:40:34  2022-10-27 00:00:00  NULL  api_some_other_account_12345  Some rate name  1000  0.00
Can you help?
You are fetching from the same row multiple times. Use a JOIN instead of repeated subqueries.
Use MAX instead of ORDER BY and LIMIT 1:
SELECT MAX(timestamp)
FROM ...
WHERE a=12345 AND c=...
Don't use USE INDEX -- what helps today may hurt tomorrow.
Do you really need to fetch both date and timestamp? Don't they mean the same thing? Or should data entry simplify those down to a single column?
CONCAT(date, " 00:00:00") compares identically to date by itself. Making that change lets you combine those first two subqueries.
cl.client_id LIKE "api_%" AND cl.client_id LIKE "%_12345" ==> cl.client_id LIKE 'api%12345'.
Don't use LEFT JOIN ( SELECT ... ) ON ...; instead, simply do LEFT JOIN rates ON ...
Suggested indexes:
rest_io_log: INDEX(account_id, client_id, timestamp)
clients: INDEX(user_id, client_id, rate, timestamp)
rates: INDEX(rate_id, rate_name, rate_traffic, rate_price) -- assuming the above change
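Putting those suggestions together, a sketch of the rewritten query (untested; it keeps the column names from the question, swaps ORDER BY ... LIMIT 1 for MAX(), joins rates directly, and assumes CURDATE() was intended where the original compared the DATE column to NOW()):
SELECT
  cl.timestamp AS active_since,
  GREATEST(
    -- MAX() replaces ORDER BY timestamp DESC LIMIT 1
    COALESCE((SELECT MAX(timestamp)
              FROM rest_io_log
              WHERE account_id = 12345 AND client_id = cl.client_id),
             '0000-00-00 00:00:00'),
    -- per the point above, the bare DATE compares the same as
    -- CONCAT(date, ' 00:00:00')
    COALESCE((SELECT MAX(date)
              FROM traffic
              WHERE account_id = 12345 AND client_id = cl.client_id),
             '0000-00-00 00:00:00')
  ) AS last_request,
  (SELECT requests
   FROM traffic
   WHERE account_id = 12345 AND client_id = cl.client_id
     AND date = CURDATE()) AS traffic_today,  -- CURDATE() is an assumption
  cl.client_id AS user_account_name,
  r.name AS rate_name,
  r.daily_max_traffic AS rate_traffic,
  r.price AS rate_price
FROM clients AS cl
LEFT JOIN rates AS r ON cl.rate = r.id
WHERE cl.user_id LIKE '12345|%'
  AND cl.client_id LIKE 'api%12345';  -- merged LIKE patterns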

Slow SQL Query on MATCH sorting by Relevance

I have the following Query
SELECT
product.AID,
product.ART_ID,
product.EAN,
productdetails.DESCRIPTION_SHORT,
MAX(
(100000 * (MATCH(productdetails.DESCRIPTION_SHORT) AGAINST ('"psen in1p"' IN BOOLEAN MODE)))+
(100000 * (MATCH(product.ART_ID) AGAINST ('"psen in1p"' IN BOOLEAN MODE)))+
(100000 * (MATCH(product.EAN) AGAINST ('"psen in1p"' IN BOOLEAN MODE)))+
(100000 * (MATCH(product.SUPPLIER_ALT_PID) AGAINST ('"psen in1p"' IN BOOLEAN MODE)))+
(10 * (MATCH(productdetails.DESCRIPTION_LONG) AGAINST ('*psen in1p*' IN BOOLEAN MODE)))+
(2 * (MATCH(productdetails.KEYWORD) AGAINST ('+psen +in1p' IN BOOLEAN MODE)))
) AS relevance
FROM
tbl_product as product
INNER JOIN
`tbl_product_details` as productdetails ON product.AID = productdetails.AID
WHERE MATCH
(product.ART_ID,
product.EAN,
product.SUPPLIER_ALT_PID,
product.ERP_GROUP_SUPPLIER) AGAINST ('*psen* *in1p*' IN BOOLEAN MODE)
OR MATCH
(productdetails.DESCRIPTION_SHORT,
productdetails.DESCRIPTION_LONG,
productdetails.MANUFACTURER_TYPE_DESC,
productdetails.KEYWORD) AGAINST ('*psen* *in1p*' IN BOOLEAN MODE)
GROUP BY
product.AID
ORDER BY
relevance DESC
My problem is that the query takes about 3 seconds, which is way too much. If I run the statement without ORDER BY, it takes about 0.0096 seconds, which is perfect. I don't know why it takes so long. I already tried a subselect and ordering the subselect, with the same result (about 3 seconds to finish; the same goes for a subselect without ORDER BY).
The database has about 600k records (tbl_product) and over 1 million records in tbl_product_details.
I'm thankful for any help on this problem.
EXPLAIN for the query with ORDER BY (3 seconds):
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE product index PRIMARY,tbl_product_catalog_id_foreign,tbl_product_supplier_id_foreign,tbl_product_art_id_index,tbl_product_ean_index,SUPPLIER_ALT_PID,ART_ID_2,ft_artid,ft_ean,ft_sapid PRIMARY 4 NULL 569643 Using temporary; Using filesort
1 SIMPLE productdetails ref tbl_product_details_aid_foreign tbl_product_details_aid_foreign 5 shop_meyle1.product.AID 1 Using where
EXPLAIN for the query without ORDER BY (0.01 seconds):
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE product index PRIMARY,tbl_product_catalog_id_foreign,tbl_product... PRIMARY 4 NULL 569643 NULL
1 SIMPLE productdetails ref tbl_product_details_aid_foreign tbl_product_details_aid_foreign 5 shop_meyle1.product.AID 1 Using where
EXPLAIN for the query with a subselect and without ORDER BY (3 seconds):
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 569643 NULL
2 DERIVED product index PRIMARY,tbl_product_catalog_id_foreign,tbl_product... PRIMARY 4 NULL 569643 NULL
2 DERIVED productdetails ref tbl_product_details_aid_foreign tbl_product_details_aid_foreign 5 shop_meyle1.product.AID 1 Using where

I need a MySQL query to be optimized

I have a query running on a MySQL DB, and it is very slow.
Is there any way I can optimize the following?
SELECT mcm.merchant_name,
( ( Sum(ot.price) + Sum(ot.handling_charges)
+ Sum(ot.sales_tax_recd)
+ Sum(ot.shipping_cost) - Sum(ot.sales_tax_payable) ) -
Sum(im.break_even_cost) ) AS PL,
ot.merchant_id
FROM order_table ot,
item_master im,
merchant_master mcm
WHERE ot.item_id = im.item_id
AND ot.merchant_id = mcm.merchant_id
GROUP BY mcm.merchant_name
ORDER BY pl DESC
LIMIT 0, 10;
The above query takes more than 200 seconds to execute.
Explain Result:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE ot ALL merchant_id,item_id NULL NULL NULL 507910 Using temporary; Using filesort
1 SIMPLE mcm eq_ref PRIMARY,merchant_id PRIMARY 4 stores.ot.merchant_id 1
1 SIMPLE im eq_ref PRIMARY,item_id PRIMARY 4 stores.ot.item_id 1
Also, I got error 1003 when I ran EXPLAIN EXTENDED.
Use the MySQL EXPLAIN plan to find out why it is taking so long, and then maybe create some indexes or change your code.
Update
Based on the EXPLAIN above, make sure you have a composite index on order_table covering merchant_id,item_id.
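A sketch of both changes (the index name is hypothetical; the explicit JOINs are equivalent to the comma joins in the question):
-- Composite index covering the join columns of the big table:
ALTER TABLE order_table ADD INDEX idx_merchant_item (merchant_id, item_id);

-- The same query with explicit JOIN syntax:
SELECT mcm.merchant_name,
       (SUM(ot.price) + SUM(ot.handling_charges) + SUM(ot.sales_tax_recd)
        + SUM(ot.shipping_cost) - SUM(ot.sales_tax_payable)
        - SUM(im.break_even_cost)) AS PL,
       ot.merchant_id
FROM order_table ot
JOIN item_master im ON ot.item_id = im.item_id
JOIN merchant_master mcm ON ot.merchant_id = mcm.merchant_id
GROUP BY mcm.merchant_name
ORDER BY PL DESC
LIMIT 0, 10;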

Optimize Given mySql Query

I have been going through my slow queries and doing what I can to properly optimize each one. I ran across this one, which I have been stuck on.
EXPLAIN SELECT pID FROM ds_products WHERE pLevel >0
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE ds_products ALL pLevel NULL NULL NULL 45939 Using where
I have indexed pLevel [tinyint(1)], but the query is not using the index and is doing a full table scan.
Here is the row count of this table for each value of pLevel:
pLevel count
0 34040
1 3078
2 7143
3 865
4 478
5 279
6 56
If I run the query for a specific value of pLevel, it does use the index:
EXPLAIN SELECT pID FROM ds_products WHERE pLevel =6
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE ds_products ref pLevel pLevel 1 const 1265
I've tried pLevel >= 1 AND pLevel <= 6... but it still does a full scan.
I've tried (pLevel=1 OR pLevel=2 OR pLevel=3 OR pLevel=4 OR pLevel=5 OR pLevel=6)... but it still does a full table scan.
Try using MySQL GROUP BY.
SELECT pLevel, COUNT(*)
FROM ds_products
GROUP BY pLevel
Edit:
This MySQL documentation article may be useful to you: How to Avoid Table Scans
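For the original SELECT pID ... WHERE pLevel > 0, a covering composite index is another option worth sketching (this is not from the answer above): with roughly a quarter of the ~46,000 rows matching, the optimizer prefers one full table scan over thousands of per-row lookups, but if the index holds every column the query reads, MySQL can scan just the narrow index instead. A sketch (the index name is made up; if the table is InnoDB and pID is the primary key, the existing pLevel index may already behave this way):
ALTER TABLE ds_products ADD INDEX idx_plevel_pid (pLevel, pID);
-- The range pLevel > 0 still matches ~12,000 rows, but they are read
-- from the index alone, so EXPLAIN should show "Using index".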

How can I optimise a joined query which runs on 130,000 rows?

I'm using the following SQL statement:
SELECT
SUM(t.Points) AS `Points`,
CONCAT(s.Firstname, " ", s.Surname) AS `Name`
FROM transactions t
INNER JOIN student s
ON t.Recipient_ID = s.Frog_ID
GROUP BY t.Recipient_ID
The query takes 21 seconds to run. Bizarrely, even if I LIMIT 0, 30 it still takes 20.7 seconds to run!
If I run an EXPLAIN on this statement, the results are as follows:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE s ALL PRIMARY NULL NULL NULL 877 Using temporary; Using filesort
1 SIMPLE t ALL NULL NULL NULL NULL 135140 Using where
A transaction takes the following form:
Transaction_ID Datetime Giver_ID Recipient_ID Points Category_ID Reason
1 2011-09-07 36754 34401 5 6 Gave excellent feedback on the new student noteboo...
There are 130,000 rows in the transactions table.
A student takes the following form:
Frog_ID UPN Firstname Surname Intake_Year
101234 K929221234567 Madeup Student 2010
There are 835 rows in the student table.
Indexes
Is there a way I can make this query more efficient?
You both join on Recipient_ID and group by it, yet it's not indexed, so I assume this is the problem.
Try adding an index on transactions.Recipient_ID.
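A sketch of that index, plus a covering variant (index names are made up):
ALTER TABLE transactions ADD INDEX idx_recipient (Recipient_ID);
-- Covering variant: also holds Points, so the SUM can be computed
-- from the index without touching the table rows.
-- ALTER TABLE transactions ADD INDEX idx_recipient_points (Recipient_ID, Points);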