I have a problem with mysql query. It is too slow about 101 seconds for limited 10 row. What could be the problem ?
Query is :
SELECT isteksikayet.BASVURU_NO AS BasvuruNo, isteksikayet.BASVURU_TARIHI AS BasvuruTarihi, mahalle.ad AS MahalleAdi, konular.ADI AS KonuAdi,
sonucturleri.ADI AS Durum, isteksikayetdetay.GUNCELLEME_TARIHI AS BilgiTarihi, birimler.ad AS BirimAdi
FROM isb_istek_sikayet isteksikayet
INNER JOIN tbl_sistem_mahalle mahalle
ON isteksikayet.MAHALLE_KODU = mahalle.kod
INNER JOIN isb_konular konular
ON isteksikayet.KONU_KODU = konular.KODU
INNER JOIN isb_istek_sikayet_detay isteksikayetdetay
ON isteksikayet.BASVURU_NO = isteksikayetdetay.BASVURU_NO
INNER JOIN isb_sonuc_turleri sonucturleri
ON isteksikayetdetay.SONUC_KODU = sonucturleri.KODU
INNER JOIN mubim_birim birimler
ON isteksikayetdetay.DAIRE_KODU = birimler.kod
ORDER BY BasvuruNo DESC LIMIT 10;
It is true that the query returns only 10 row, but it has to order ALL the rows of a six tables join, this can really grow out of control quickly (like it already did on your database).
To avoid that I would suggest to order ONLY the table containing the BasvuruNocolumn and extracting the first 10 records in a subquery, and only then join with the rest of the tables. This way you should avoid ordering an overwhelming amount of records
Related
I'm trying to run a query that joins 3 tables and I want to limit the first join to just 5 rows. The end result can return any number of rows so I don't want to add LIMIT to the end of the query.
Here is the query I have, which works but obviously does not limit the first join to 5 rows. I've attempted a subquery, which I believe is the only way to accomplish this, and everything I try gives an error. I can't seem to apply examples I have seen, to my situation.
SELECT mw_customer.customer_id, mw_customer.customer_uid, mw_campaign.customer_id, mw_campaign.campaign_id, mw_campaign.type, mw_campaign.status, mw_campaign_delivery_log.campaign_id, mw_campaign_delivery_log.subscriber_id
FROM mw_customer
JOIN mw_campaign
ON mw_customer.customer_id = mw_campaign.customer_id
AND mw_customer.customer_uid = 'XYZ'
AND mw_campaign.type = 'regular'
AND mw_campaign.status = 'sent'
JOIN mw_campaign_delivery_log
ON mw_campaign.campaign_id = mw_campaign_delivery_log.campaign_id
So what I want to do is limit the "JOIN mw_customer" to a maximum of 5 rows and then after the JOIN mw_campaign_delivery_log, there can be any number of rows.
Thanks
Wrap the first join in a subquery with LIMIT 5.
SELECT t.customer_id, t.customer_uid, t.campaign_id, t.type, t.status, l.subscriber_id
FROM (SELECT cus.customer_id, cus.customer_uid, cam.campaign_id, cam.type, cam.status
FROM mw_customer AS cus
JOIN mw_campaign AS cam
ON cus.customer_id = cam.customer_id
WHERE cus.customer_uid = 'XYZ'
AND cam.type = 'regular'
AND cam.status = 'sent'
LIMIT 5) AS t
JOIN mw_campaign_delivery_log AS l
ON t.campaign_id = l.campaign_id
Note that LIMIT without ORDER BY means that the 5 rows selected will be unpredictable.
I have this two version of the same query. Both produce same results (164 rows). But the second one takes .5 sec while the 1st one takes 17 sec. Can someone explain what's going on here?
TABLE organizations : 11988 ROWS
TABLE transaction_metas : 58232 ROWS
TABLE contracts_history : 219469 ROWS
# TAKES 17 SEC
SELECT contracts_history.buyer_id as id, org.name, SUM(transactions_count) as transactions_count, GROUP_CONCAT(DISTINCT(tm.value)) as balancing_authorities
From `contracts_history`
INNER JOIN `organizations` as `org`
ON `org`.`id` = `contracts_history`.`buyer_id`
LEFT JOIN `transaction_metas` as `tm`
ON `tm`.`contract_token` = `contracts_history`.`token` and `tm`.`field` = '1'
WHERE `contracts_history`.`seller_id` = '850'
GROUP BY `contracts_history`.`buyer_id` ORDER BY `balancing_authorities` DESC
# TAKES .6 SEC
SELECT contracts_history.buyer_id as id, org.name, SUM(transactions_count) as transactions_count, GROUP_CONCAT(DISTINCT(tm.value)) as balancing_authorities
From `contracts_history`
INNER JOIN `organizations` as `org`
ON `org`.`id` = `contracts_history`.`buyer_id`
left join (select * from `transaction_metas` where contract_token in (select token from `contracts_history` where seller_id = 850)) as `tm`
ON `tm`.`contract_token` = `contracts_history`.`token` and `tm`.`field` = '1'
WHERE `contracts_history`.`seller_id` = '850'
GROUP BY `contracts_history`.`buyer_id` ORDER BY `balancing_authorities` DESC
Explain Results:
First Query: https://prnt.sc/hjtiw6
Second Query: https://prnt.sc/hjtjjg
As based on my debugging of the first query it was clear that left join to transaction_metas table was making it slow, So I tried to limit its rows instead of joining to the full table. It seems to work but I don't understand why.
Join is a set of combinations from rows in your tables. That in mind, in the first query the engine combines all the results to filter just after. In second case one it applies the filter before it tries make the combinations.
The best case would make use of filter in JOIN clause without subquery.
Much like this:
SELECT contracts_history.buyer_id as id, org.name, SUM(transactions_count) as transactions_count, GROUP_CONCAT(DISTINCT(tm.value)) as balancing_authorities
From `contracts_history`
INNER JOIN `organizations` as `org`
ON `org`.`id` = `contracts_history`.`buyer_id`
AND `contracts_history`.`seller_id` = '850'
LEFT JOIN `transaction_metas` as `tm`
ON `tm`.`contract_token` = `contracts_history`.`token`
AND `tm`.`field` = 1
GROUP BY `contracts_history`.`buyer_id` ORDER BY `balancing_authorities` DESC
Note: When you reduce the size of the join tables by filtering with subqueries, it may allow the rows fit into the buffer. Nice trick to small buffer limit.
A Better explication:
https://dev.mysql.com/doc/refman/5.5/en/explain-output.html
My query works fine without ORDER BY (<1sec). But takes about 20s with it:
I have single column indexes on all columns
SELECT SQL_NO_CACHE * FROM file_results data
INNER JOIN file_result_filters gender ON data.id = gender.file_result_id AND gender.filter = 'filter1' AND gender.value = 'male'
INNER JOIN file_result_filters age ON data.id = age.file_result_id AND age.filter = 'filter2' AND age.value > 20
INNER JOIN file_result_filters height ON data.id = height.file_result_id AND height.filter = 'filter3' AND height.value > 150
ORDER BY age.value DESC
file_results table:
id, ...
file_result_filters:
id, file_result_id, filter, value
The funny thing is that it works when I only have 2 inner joins instead of 3
The reason is based on when MySQL can start returning data.
Without an ORDER BY, MySQL can start returning data as soon as any rows match. It looks like the results are coming in fast.
With an ORDER BY, MySQL has to process all the data to find all possible matches. Results are not returned immediately, but only after the ORDER BY.
Of course, the ORDER BY itself also takes time, so that is also contributing to the delay.
When I check SHOW PROCESSLIST; in database I got below query. It heavily uses CPU (more than 100%), it took 80 seconds to complete the query. We have a separate server for database(64GB RAM).
INSERT INTO `search_tmp_598075de5c7e67_73335919`
SELECT `main_select`.`entity_id`, MAX(score) AS `relevance`
FROM (SELECT `search_index`.`entity_id`, (((0)) * 1) AS score
FROM `catalogsearch_fulltext_scope1` AS `search_index`
LEFT JOIN `catalog_eav_attribute` AS `cea`
ON search_index.attribute_id = cea.attribute_id
LEFT JOIN `catalog_category_product_index` AS `category_ids_index`
ON search_index.entity_id = category_ids_index.product_id
LEFT JOIN `review_entity_summary` AS `rating`
ON `rating`.`entity_pk_value`=`search_index`.entity_id
AND `rating`.entity_type = 1
AND `rating`.store_id = 1
WHERE (category_ids_index.category_id = 2299)
) AS `main_select`
GROUP BY `entity_id`
ORDER BY `relevance` DESC
LIMIT 10000
why does this query use my full CPU resources?
Some inefficiencies:
There is a non-null condition on the records of the outer joined catalog_category_product_index. This turns the outer join into an inner join. It will be more efficient to use an inner join clause.
There is no need to have a nested query: the grouping, ordering and limiting can be done directly on the inner query.
(((0)) * 1) is just a complex way of saying 0, and taking the MAX of that will obviously still return a relevance of 0 for all records. Not only is this an inefficient way to output 0, it also makes no sense. I assume your real query has some less evident calculation there, which might need optimisation.
If catalog_eav_attribute.attribute_id is a unique field, then there is no sense in outer joining that table, because that data is not used anywhere
If review_entity_summary.entity_pk_value is unique (at least when entity_type = 1 and store_id = 1), then again there is no use in outer joining that table, because that data is not used anywhere
If the fields in the above 2 bullet points are non-unique, but the number of records returned per search_index.entity_id value is not influencing the result (as it currently stands with the obscure (((0)) * 1) value, it does not), then neither outer join is needed either.
With these assumptions, the select part can be reduced to:
SELECT search_index.entity_id,
MAX(((0)) * 1) AS relevance
FROM catalogsearch_fulltext_scope1 AS search_index
INNER JOIN catalog_category_product_index AS category_ids_index
ON search_index.entity_id = category_ids_index.product_id
WHERE category_ids_index.category_id = 2299
GROUP BY search_index.entity_id
ORDER BY relevance DESC
LIMIT 10000
I still left the (((0)) * 1) in there, but it really makes no sense.
i have the following query
select q.has_content_violation,
if(q.max_finding_status = 1, 1,
if(q.min_finding_status >=3, 3, 2)) scan_item_finding_status,
count(*)
from (
SELECT si.id, if(f.id is not null, true, false) has_content_violation,
min(f.finding_status) min_finding_status,
max(f.finding_status) max_finding_status
FROM bank b
JOIN merchant m ON m.bank_id = b.id
JOIN merchant_has_scan ms ON m.last_scan_completed_id = ms.id
JOIN scan_item si ON si.merchant_has_scan_id = ms.id
LEFT JOIN finding f ON f.scan_item_id = si.id
WHERE f.finding_type = 1
GROUP BY si.id
) q
GROUP BY scan_item_finding_status, q.has_content_violation
the query takes data from table finding and agregate it for scan item, finding if it has several key parameters like finding_type=1 and using the left join to find if the scan_item has findings in the sub-table with finding_type=1.
after that the outer query takes the agregated data and sums it up by some columns.
the query works great but the outer query is not using indexes and doing a full-table search.
Is there a way to make the outer query run better and use indexes? this query is going to run on 10,000 rows each run and will run all day, so it is most important that it will be as fast as possible