sql orderby destroys inner self join query

sql orderby destroys inner self join query - mysql

My query works fine without ORDER BY (<1sec). But takes about 20s with it:
I have single column indexes on all columns
SELECT SQL_NO_CACHE * FROM file_results data
INNER JOIN file_result_filters gender ON data.id = gender.file_result_id AND gender.filter = 'filter1' AND gender.value = 'male'
INNER JOIN file_result_filters age ON data.id = age.file_result_id AND age.filter = 'filter2' AND age.value > 20
INNER JOIN file_result_filters height ON data.id = height.file_result_id AND height.filter = 'filter3' AND height.value > 150
ORDER BY age.value DESC
file_results table:
id, ...
file_result_filters:
id, file_result_id, filter, value
The funny thing is that it works when I only have 2 inner joins instead of 3

The reason is based on when MySQL can start returning data.
Without an ORDER BY, MySQL can start returning data as soon as any rows match. It looks like the results are coming in fast.
With an ORDER BY, MySQL has to process all the data to find all possible matches. Results are not returned immediately, but only after the ORDER BY.
Of course, the ORDER BY itself also takes time, so that is also contributing to the delay.

Related

MySQL query that limits first join to specific number of rows

I'm trying to run a query that joins 3 tables and I want to limit the first join to just 5 rows. The end result can return any number of rows so I don't want to add LIMIT to the end of the query.
Here is the query I have, which works but obviously does not limit the first join to 5 rows. I've attempted a subquery, which I believe is the only way to accomplish this, and everything I try gives an error. I can't seem to apply examples I have seen, to my situation.
SELECT mw_customer.customer_id, mw_customer.customer_uid, mw_campaign.customer_id, mw_campaign.campaign_id, mw_campaign.type, mw_campaign.status, mw_campaign_delivery_log.campaign_id, mw_campaign_delivery_log.subscriber_id
FROM mw_customer
JOIN mw_campaign
ON mw_customer.customer_id = mw_campaign.customer_id
AND mw_customer.customer_uid = 'XYZ'
AND mw_campaign.type = 'regular'
AND mw_campaign.status = 'sent'
JOIN mw_campaign_delivery_log
ON mw_campaign.campaign_id = mw_campaign_delivery_log.campaign_id
So what I want to do is limit the "JOIN mw_customer" to a maximum of 5 rows and then after the JOIN mw_campaign_delivery_log, there can be any number of rows.
Thanks

Wrap the first join in a subquery with LIMIT 5.
SELECT t.customer_id, t.customer_uid, t.campaign_id, t.type, t.status, l.subscriber_id
FROM (SELECT cus.customer_id, cus.customer_uid, cam.campaign_id, cam.type, cam.status
FROM mw_customer AS cus
JOIN mw_campaign AS cam
ON cus.customer_id = cam.customer_id
WHERE cus.customer_uid = 'XYZ'
AND cam.type = 'regular'
AND cam.status = 'sent'
LIMIT 5) AS t
JOIN mw_campaign_delivery_log AS l
ON t.campaign_id = l.campaign_id
Note that LIMIT without ORDER BY means that the 5 rows selected will be unpredictable.

Understaing the difference between two queries from performance point

I have this two version of the same query. Both produce same results (164 rows). But the second one takes .5 sec while the 1st one takes 17 sec. Can someone explain what's going on here?
TABLE organizations : 11988 ROWS
TABLE transaction_metas : 58232 ROWS
TABLE contracts_history : 219469 ROWS
# TAKES 17 SEC
SELECT contracts_history.buyer_id as id, org.name, SUM(transactions_count) as transactions_count, GROUP_CONCAT(DISTINCT(tm.value)) as balancing_authorities
From `contracts_history`
INNER JOIN `organizations` as `org`
ON `org`.`id` = `contracts_history`.`buyer_id`
LEFT JOIN `transaction_metas` as `tm`
ON `tm`.`contract_token` = `contracts_history`.`token` and `tm`.`field` = '1'
WHERE `contracts_history`.`seller_id` = '850'
GROUP BY `contracts_history`.`buyer_id` ORDER BY `balancing_authorities` DESC
# TAKES .6 SEC
SELECT contracts_history.buyer_id as id, org.name, SUM(transactions_count) as transactions_count, GROUP_CONCAT(DISTINCT(tm.value)) as balancing_authorities
From `contracts_history`
INNER JOIN `organizations` as `org`
ON `org`.`id` = `contracts_history`.`buyer_id`
left join (select * from `transaction_metas` where contract_token in (select token from `contracts_history` where seller_id = 850)) as `tm`
ON `tm`.`contract_token` = `contracts_history`.`token` and `tm`.`field` = '1'
WHERE `contracts_history`.`seller_id` = '850'
GROUP BY `contracts_history`.`buyer_id` ORDER BY `balancing_authorities` DESC
Explain Results:
First Query: https://prnt.sc/hjtiw6
Second Query: https://prnt.sc/hjtjjg
As based on my debugging of the first query it was clear that left join to transaction_metas table was making it slow, So I tried to limit its rows instead of joining to the full table. It seems to work but I don't understand why.

Join is a set of combinations from rows in your tables. That in mind, in the first query the engine combines all the results to filter just after. In second case one it applies the filter before it tries make the combinations.
The best case would make use of filter in JOIN clause without subquery.
Much like this:
SELECT contracts_history.buyer_id as id, org.name, SUM(transactions_count) as transactions_count, GROUP_CONCAT(DISTINCT(tm.value)) as balancing_authorities
From `contracts_history`
INNER JOIN `organizations` as `org`
ON `org`.`id` = `contracts_history`.`buyer_id`
AND `contracts_history`.`seller_id` = '850'
LEFT JOIN `transaction_metas` as `tm`
ON `tm`.`contract_token` = `contracts_history`.`token`
AND `tm`.`field` = 1
GROUP BY `contracts_history`.`buyer_id` ORDER BY `balancing_authorities` DESC
Note: When you reduce the size of the join tables by filtering with subqueries, it may allow the rows fit into the buffer. Nice trick to small buffer limit.
A Better explication:
https://dev.mysql.com/doc/refman/5.5/en/explain-output.html

MySQL Query Too Slow For Limited Row

I have a problem with mysql query. It is too slow about 101 seconds for limited 10 row. What could be the problem ?
Query is :
SELECT isteksikayet.BASVURU_NO AS BasvuruNo, isteksikayet.BASVURU_TARIHI AS BasvuruTarihi, mahalle.ad AS MahalleAdi, konular.ADI AS KonuAdi,
sonucturleri.ADI AS Durum, isteksikayetdetay.GUNCELLEME_TARIHI AS BilgiTarihi, birimler.ad AS BirimAdi
FROM isb_istek_sikayet isteksikayet
INNER JOIN tbl_sistem_mahalle mahalle
ON isteksikayet.MAHALLE_KODU = mahalle.kod
INNER JOIN isb_konular konular
ON isteksikayet.KONU_KODU = konular.KODU
INNER JOIN isb_istek_sikayet_detay isteksikayetdetay
ON isteksikayet.BASVURU_NO = isteksikayetdetay.BASVURU_NO
INNER JOIN isb_sonuc_turleri sonucturleri
ON isteksikayetdetay.SONUC_KODU = sonucturleri.KODU
INNER JOIN mubim_birim birimler
ON isteksikayetdetay.DAIRE_KODU = birimler.kod
ORDER BY BasvuruNo DESC LIMIT 10;

It is true that the query returns only 10 row, but it has to order ALL the rows of a six tables join, this can really grow out of control quickly (like it already did on your database).
To avoid that I would suggest to order ONLY the table containing the BasvuruNocolumn and extracting the first 10 records in a subquery, and only then join with the rest of the tables. This way you should avoid ordering an overwhelming amount of records

SQL Query with Inner Joins

I am trying to run this Query in SQL:
SELECT c.company, sum(i.grand_total) FROM billing_invoices i
INNER JOIN billing_salesman_commission b ON i.invoice_number = b.invoice
INNER JOIN customer c ON i.customer_sequence = c.sequence
WHERE
i.status = 'Unpaid' and DATE(i.datetime) >= '2015-10-01'
GROUP BY c.sequence
Which returns the correct data, however its moving the decimal point for the grand_total column that its summing up
As an example, when i run SELECT sum(grand_total) from billing_invoices WHERE customer_sequence = '270', it is returning 35.29 however when i run my first query, its returning 352.90000915527344

You are using two different where clauses. Are you sure that customer_sequence = 270 returns the exact same results as status = "unpaid" and datetime >= 2015-10-01? I bet it's not.
You could also be getting different results because you are running a sum() in a group by in one query, and then a sum() without a group by in the second. It's hard to say without knowing the underlying data. There are other factors with the relationship to the join tables and the data integrity on "sequence."
Work through the joins one at a time and verify you are getting the data you expect. Start by removing the "group by" clause and see

MySql query runs very slow(actually never gives output) without where clause

I have a mysql query and it works fine when i use where clause, but when i donot use
where clause it gone and never gives the output and finally timeout.
Actually i have used Explain command to check the performance of the query and in both cases the Explain gives the same number of rows used in joining.
I have attached the image of output got with Explain command.
Below is the query.
I couldn't figure whats the problem here.
Any help is highly appreciated.
Thanks.
SELECT
MCI.CLIENT_ID AS CLIENT_ID, MCI.NAME AS CLIENT_NAME, MCI.PRIMARY_CONTACT AS CLIENT_PRIMARY_CONTACT,
MCI.ADDED_BY AS SP_ID, CONCAT(MUD_SP.FIRST_NAME, ' ', MUD_SP.LAST_NAME) AS SP_NAME,
MCI.FK_PROSPECT_ID AS PROSPECT_ID, MCI.DATE_ADDED AS ADDED_ON,
(SELECT GROUP_CONCAT(LT.TAG_TEXT SEPARATOR ', ')
FROM LK_TAG LT
INNER JOIN M_OBJECT_TAG_MAPPING MOTM
ON LT.PK_ID = MOTM.FK_TAG_ID
WHERE MOTM.FK_OBJECT_ID = MCI.FK_PROSPECT_ID
AND MOTM.OBJECT_TYPE = 1
AND MOTM.IS_ACTIVE = 1
) AS TAGS,
IFNULL(SUM(GET_DIGITS(MMR.RCP_AMOUNT)), 0) AS REVENUE_SO_FAR,
IFNULL(SUM(GET_DIGITS(MMR.RCP_RUPEES)), 0) AS REVENUE_INR,
COUNT(DISTINCT PMI_MONTHLY.PROJECT_ID) AS MONTHLY,
COUNT(DISTINCT PMI_FIXED.PROJECT_ID) AS FIXED,
COUNT(DISTINCT PMI_HOURLY.PROJECT_ID) AS HOURLY,
COUNT(DISTINCT PMI_ANNUAL.PROJECT_ID) AS ANNUAL,
COUNT(DISTINCT PMI_CURRENTLY_RUNNING.PROJECT_ID) AS CURRENTLY_RUNNING_PROJECTS,
COUNT(DISTINCT PMI_YET_TO_START.PROJECT_ID) AS YET_TO_START_PROJECTS,
COUNT(DISTINCT PMI_TECH_SALES_CLOSED.PROJECT_ID) AS TECH_SALES_CLOSED_PROJECTS
FROM
M_CLIENT_INFO MCI
INNER JOIN M_USER_DETAILS MUD_SP
ON MCI.ADDED_BY = MUD_SP.PK_ID
LEFT OUTER JOIN M_MONTH_RECEIPT MMR
ON MMR.CLIENT_ID = MCI.CLIENT_ID
LEFT OUTER JOIN M_PROJECT_INFO PMI_FIXED
ON PMI_FIXED.CLIENT_ID = MCI.CLIENT_ID AND PMI_FIXED.PROJECT_TYPE = 1
LEFT OUTER JOIN M_PROJECT_INFO PMI_MONTHLY
ON PMI_MONTHLY.CLIENT_ID = MCI.CLIENT_ID AND PMI_MONTHLY.PROJECT_TYPE = 2
LEFT OUTER JOIN M_PROJECT_INFO PMI_HOURLY
ON PMI_HOURLY.CLIENT_ID = MCI.CLIENT_ID AND PMI_HOURLY.PROJECT_TYPE = 3
LEFT OUTER JOIN M_PROJECT_INFO PMI_ANNUAL
ON PMI_ANNUAL.CLIENT_ID = MCI.CLIENT_ID AND PMI_ANNUAL.PROJECT_TYPE = 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_CURRENTLY_RUNNING
ON PMI_CURRENTLY_RUNNING.CLIENT_ID = MCI.CLIENT_ID AND PMI_CURRENTLY_RUNNING.STATUS = 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_YET_TO_START
ON PMI_YET_TO_START.CLIENT_ID = MCI.CLIENT_ID AND PMI_YET_TO_START.STATUS < 4
LEFT OUTER JOIN M_PROJECT_INFO PMI_TECH_SALES_CLOSED
ON PMI_TECH_SALES_CLOSED.CLIENT_ID = MCI.CLIENT_ID AND PMI_TECH_SALES_CLOSED.STATUS > 4
WHERE YEAR(MCI.DATE_ADDED) = '2012'
GROUP BY MCI.CLIENT_ID ORDER BY CLIENT_NAME ASC

Yes, as many people have said, the key is that when you have the where clause, mysql engine filters the table M_CLIENT_INFO --probably drammatically--.
A similar result as removing the where clause is to to add this where clause:
where 1 = 1
You will see that the performance is degraded also because mysql will try to get all the data.

Remove the where clause and all columns from select and add a count to see how many records you get. If it is reasonable, say up to 10k, then do the following,
put back the select columns related to M_CLIENT_INFO
do not include the nested one "TAGS"
remove all your joins
run your query without where clause and gradually include the joins
this way you'll find out when the timeout is caused.

I would try the following. First, MySQL has a keyword "STRAIGHT_JOIN" which tells the optimizer to do the query in the table order you've specified. Since all you left-joins are child-related (like a lookup table), you don't want MySQL to try and interpret one of those as a primary basis of the query.
SELECT STRAIGHT_JOIN ... rest of query.
Next, your M_PROJECT_INFO table, I dont know how many columns of data are out there, but you appear to be concentrating on just a few columns on your DISTINCT aggregates. I would make sure you have a covering index on these elements to help the query via an index on
( Client_ID, Project_Type, Status, Project_ID )
This way the engine can apply the criteria and get the distinct all out of the index instead of having to go back to the raw data pages for the query.
Third, your M_CLIENT_INFO table. Ensure that has an index on both your criteria, group by AND your Order By, and change your order by from the aliased "CLIENT_NAME" to the actual column of the SQL table so it matches the index
( Date_Added, Client_ID, Name )
I have "name" in ticks as it is also a reserved word and helps clarify the column, not the keyword.
Next, the WHERE clause. Whenever you apply a function to an indexed column name, it doesn't work the greatest, especially on date/time fields... You might want to change your where clause to
WHERE MCI.Date_Added between '2012-01-01' and '2012-12-31 23:59:59'
so the BETWEEN range is showing the entire year and the index can better be utilized.
Finally, if the above do not help, I would consider splitting your query some. The GROUP_CONCACT inline select for the TAGS might be a bit of a killer for you. You might want to have all the distinct elements first for the grouping per client, THEN get those details.... Something like
select
PQ.*,
group_concat(...) tags
from
( the entire primary part of the query ) as PQ
Left join yourGroupConcatTableBasis on key columns

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

sql orderby destroys inner self join query - mysql

Related

MySQL query that limits first join to specific number of rows

Understaing the difference between two queries from performance point

MySQL Query Too Slow For Limited Row

SQL Query with Inner Joins

MySql query runs very slow(actually never gives output) without where clause

Categories

Resources