Query taking lot of time to execute - mysql

I am trying to run a one-time query to copy data from a client database into our database, but the query takes a long time to execute when I change the ORDER BY from the primary key user_appoint.id to user_appoint.u_id. Below is my query:
SELECT
CONCAT('D',user_appoint.`id`) AS ApptId,
user_appoint.`u_id`,
tbl_questions.CandAns,
tbl_questions.ExamAns,
tbl_questions.QueNote,
CONCAT("[",GROUP_CONCAT(CONCAT('"',`tbl_investigations`.`test_id`,'":"',tbl_investigations.`result`,'"')),"]") AS CandInv,
CONCAT("[",GROUP_CONCAT(CONCAT('"',`tbl_investigations`.`test_id`,'":"',tbl_investigations.`comments`,'"')),"]") AS IntComm,
IF(tbl_questions.LastUpdatedDateTime>MAX(tbl_investigations.`ModifiedAt`),tbl_questions.LastUpdatedDateTime,MAX(tbl_investigations.`ModifiedAt`)) AS LastUpdatedDateTime,
CONCAT('D',user_appoint.`id`) AS UniqueId
FROM user_appoint
LEFT JOIN tbl_investigations ON tbl_investigations.`appt_id`=user_appoint.`id` AND tbl_investigations.`ModifiedAt`>'2011-01-01 00:00:00'
LEFT JOIN tbl_questions ON tbl_questions.`appt_id` =user_appoint.`id` AND tbl_questions.`LastUpdatedDateTime`>'2011-01-01 00:00:00'
GROUP BY user_appoint.`id`
HAVING LastUpdatedDateTime>'2011-01-01 00:00:00'
ORDER BY user_appoint.`u_id`
LIMIT 0, 2000;
user_appoint.u_id is properly indexed.

Please check the EXPLAIN plan of your query. It is always better to share the EXPLAIN plan along with your original question.
explain format=json
SELECT CONCAT('D',user_appoint.id) AS ApptId, user_appoint.u_id,
tbl_questions.CandAns, tbl_questions.ExamAns, tbl_questions.QueNote,
CONCAT("[",GROUP_CONCAT(CONCAT('"',tbl_investigations.test_id,'":"',tbl_investigations.result,'"')),"]")
AS CandInv,
CONCAT("[",GROUP_CONCAT(CONCAT('"',tbl_investigations.test_id,'":"',tbl_investigations.comments,'"')),"]")
AS IntComm,
IF(tbl_questions.LastUpdatedDateTime>MAX(tbl_investigations.ModifiedAt),tbl_questions.LastUpdatedDateTime,MAX(tbl_investigations.ModifiedAt))
AS LastUpdatedDateTime, CONCAT('D',user_appoint.id) AS UniqueId FROM
user_appoint LEFT JOIN tbl_investigations ON
tbl_investigations.appt_id=user_appoint.id AND
tbl_investigations.ModifiedAt>'2011-01-01 00:00:00' LEFT JOIN
tbl_questions ON tbl_questions.appt_id =user_appoint.id AND
tbl_questions.LastUpdatedDateTime>'2011-01-01 00:00:00' GROUP BY
user_appoint.id HAVING LastUpdatedDateTime>'2011-01-01 00:00:00'
ORDER BY user_appoint.u_id LIMIT 0, 2000;

Looking at your query, I can see many CONCAT calls, aggregate functions, and joins being performed in a single statement.
Because the ORDER BY column (u_id) differs from the GROUP BY column (id), the LIMIT likely cannot be applied early: every group has to be built and sorted before the first 2000 rows can be returned.
This is probably what slows the query down.

You have 2 identical columns with different aliases
CONCAT('D',user_appoint.`id`) AS ApptId,
CONCAT('D',user_appoint.`id`) AS UniqueId
(Changed) Assuming NULLs may occur in these date columns, comparing the MAX() values is intended to reduce the impact of NULLs:
if(max(tbl_questions.lastupdateddatetime) > max(tbl_investigations.`modifiedat`) , max(tbl_questions.lastupdateddatetime), max(tbl_investigations.`modifiedat`)) AS LastUpdatedDateTime
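One caveat worth checking with the expression above: a `>` comparison against NULL is never true, so when the investigations side has no rows the IF falls through to its last argument and returns NULL. Below is a small sketch of that behaviour, using Python's built-in sqlite3 module as a stand-in for MySQL. CASE replaces IF, and the helper names `latest_naive` and `latest_null_safe` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def latest_naive(q, i):
    # Mirrors IF(q > i, q, i): when either side is NULL the comparison
    # is not true, so the ELSE value (i) is returned.
    sql = "SELECT CASE WHEN :q > :i THEN :q ELSE :i END"
    return conn.execute(sql, {"q": q, "i": i}).fetchone()[0]

def latest_null_safe(q, i):
    # Substitutes the other value for a NULL before comparing.
    # A MySQL spelling would be GREATEST(COALESCE(q, i), COALESCE(i, q)).
    sql = "SELECT max(coalesce(:q, :i), coalesce(:i, :q))"
    return conn.execute(sql, {"q": q, "i": i}).fetchone()[0]

# An appointment with a questions timestamp but no investigations rows.
only_questions = ("2012-05-01 00:00:00", None)
print(latest_naive(*only_questions))      # None
print(latest_null_safe(*only_questions))  # 2012-05-01 00:00:00
```

If rows like this should survive the HAVING filter, the COALESCE form may be the safer choice. Whether they should is a question about your data, not something the sketch can decide.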
Try this:
SELECT *
FROM (
SELECT
Concat('D', user_appoint.`id`) AS ApptId
, user_appoint.`u_id`
, tbl_questions.candans
, tbl_questions.examans
, tbl_questions.quenote
, Concat("[", Group_concat(Concat('"', `tbl_investigations`.`test_id`, '":"', tbl_investigations.`result`, '"')), "]") AS CandInv
, Concat("[", Group_concat(Concat('"', `tbl_investigations`.`test_id`, '":"', tbl_investigations.`comments`, '"')), "]") AS IntComm
, if(max(tbl_questions.lastupdateddatetime) > max(tbl_investigations.`modifiedat`) , max(tbl_questions.lastupdateddatetime), max(tbl_investigations.`modifiedat`) ) AS LastUpdatedDateTime
, Concat('D', user_appoint.`id`) AS UniqueId
FROM user_appoint
LEFT JOIN tbl_investigations
ON tbl_investigations.`appt_id` = user_appoint.`id`
AND tbl_investigations.`modifiedat` > '2011-01-01 00:00:00'
LEFT JOIN tbl_questions
ON tbl_questions.`appt_id` = user_appoint.`id`
AND tbl_questions.`lastupdateddatetime` > '2011-01-01 00:00:00'
GROUP BY user_appoint.`id`
HAVING lastupdateddatetime > '2011-01-01 00:00:00'
) d
ORDER BY `u_id`
LIMIT 0, 2000
;
HOWEVER
You are using a non-standard form of the GROUP BY clause. MySQL historically allowed you to select many columns while grouping by only one of them, which is not standard SQL.
Since MySQL 5.7.5 the ONLY_FULL_GROUP_BY SQL mode is enabled by default, and selecting non-aggregated columns that are not functionally dependent on the GROUP BY columns causes an error.
So, you may have to change the way you perform the grouping to
GROUP BY
user_appoint.`id`
, user_appoint.`u_id`
, tbl_questions.candans
, tbl_questions.examans
, tbl_questions.quenote
If none of these improve performance please provide the execution plan (as text).

Related

MySql is null vs is not null performance

I have a query where I am basically doing a left outer join and checking if the joined value is null
select count(T1.code)
from ( select code
from asset
where type = 'meter'
and creation_time <= '2022-04-29 00:00:00'
and (deactivation_time > '2022-04-28 00:00:00' or deactivation_time is null )
group by code
) as T1
left join ( select asset_code
from amr_midnight_data
where server_time between '2022-04-28 00:00:00' and '2022-04-29 00:00:00'
group by asset_code
) as T2 on T1.code = T2.asset_code
Where T2.asset_code is null;
This query takes 3 seconds to execute, but if I replace the IS NULL at the end with IS NOT NULL, it takes less than a second. Why is there a performance difference here, and what alternatives do I have to make my original query faster?
Look at the EXPLAIN. A guess... Changing to IS NOT NULL lets the Optimizer change LEFT JOIN to JOIN, which lets it start with amr_midnight_data which might optimize better.
I think that the LEFT JOIN ( SELECT ... ) .. IS [NOT] NULL can be replaced with
WHERE [NOT] EXISTS ( SELECT 1 FROM amr_midnight_data
WHERE asset_code = T1.code
AND server_time >= '2022-04-28'
AND server_time < '2022-04-28' + INTERVAL 1 DAY )
That query would benefit from INDEX(asset_code, server_time).
EXISTS is faster than SELECT .. GROUP BY because it can stop as soon as one matching row is found.
The asset table would probably benefit from INDEX(type, creation_time) or (to make it "covering"):
INDEX(type, creation_time, deactivation_time, code)
If you wish to discuss further, please provide SHOW CREATE TABLE for both tables and EXPLAIN for each SELECT.
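To check that the two formulations return the same count, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are invented, the creation and deactivation filters are left out for brevity, and the timing difference discussed above is not reproduced. Only the equivalence of the results is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE asset (code TEXT, type TEXT);
CREATE TABLE amr_midnight_data (asset_code TEXT, server_time TEXT);
INSERT INTO asset VALUES ('A','meter'), ('B','meter'), ('C','meter'), ('D','valve');
INSERT INTO amr_midnight_data VALUES
  ('A','2022-04-28 00:00:00'),
  ('A','2022-04-28 12:00:00'),
  ('C','2022-04-27 23:00:00');
""")

# Original shape: anti-join through LEFT JOIN ... IS NULL on two grouped subqueries.
left_join_sql = """
SELECT COUNT(T1.code)
FROM (SELECT code FROM asset WHERE type = 'meter' GROUP BY code) AS T1
LEFT JOIN (SELECT asset_code FROM amr_midnight_data
           WHERE server_time >= '2022-04-28' AND server_time < '2022-04-29'
           GROUP BY asset_code) AS T2
  ON T1.code = T2.asset_code
WHERE T2.asset_code IS NULL
"""

# Suggested shape: NOT EXISTS can stop at the first matching row per asset.
not_exists_sql = """
SELECT COUNT(DISTINCT a.code)
FROM asset AS a
WHERE a.type = 'meter'
  AND NOT EXISTS (SELECT 1 FROM amr_midnight_data AS m
                  WHERE m.asset_code = a.code
                    AND m.server_time >= '2022-04-28'
                    AND m.server_time < '2022-04-29')
"""

left_join_count = conn.execute(left_join_sql).fetchone()[0]
not_exists_count = conn.execute(not_exists_sql).fetchone()[0]
print(left_join_count, not_exists_count)  # 2 2 (meters B and C have no data that day)
```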

How do I get the SUM of a group in an sql query

My query is
SELECT *
FROM acodes
WHERE datenumber >= '2016-12-09'
GROUP BY campaignid, acode
LIMIT 0 , 30
Results are
Is there a way to SUM() the maxworth column? I want to add up all the maxworth shown above in an sql query. The answer is not SELECT *, SUM(maxworth) as there are multiple maxworth for the same acode and campaignid.
Reference the existing query as an inline view. Take the existing query, and wrap in parens, and then use that in place of a table name in another query. (The inline view will need to be assigned an alias.)
For example:
SELECT SUM(v.maxworth)
FROM (
-- existing query goes here, between the parens
SELECT *
FROM acodes
WHERE datenumber >= '2016-12-09'
GROUP BY campaignid, acode
LIMIT 0 , 30
) v
In MySQL, that inline view is referred to as a derived table. Here is how the query works: MySQL first runs the query in the inline view and materializes the result set into a derived table. Once the derived table is populated, the outer query runs against it.
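Here is a small sketch of the difference between summing every row and summing one row per (campaignid, acode) group, run through Python's built-in sqlite3 module rather than MySQL. The sample rows are invented. The inner query uses MAX(maxworth) instead of SELECT * so that the value picked per group is deterministic, which is an assumption about what you want.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE acodes (campaignid INTEGER, acode TEXT, maxworth INTEGER, datenumber TEXT)"
)
conn.executemany("INSERT INTO acodes VALUES (?, ?, ?, ?)", [
    (1, "x", 10, "2016-12-09"),
    (1, "x", 10, "2016-12-10"),   # same campaignid/acode as the row above
    (1, "y", 5, "2016-12-09"),
    (2, "x", 7, "2016-12-11"),
    (2, "z", 100, "2016-12-01"),  # excluded by the date filter
])

# Summing directly counts the duplicated (1, 'x') group twice.
naive_total = conn.execute(
    "SELECT SUM(maxworth) FROM acodes WHERE datenumber >= '2016-12-09'"
).fetchone()[0]

# Summing over the derived table counts each group once.
grouped_total = conn.execute("""
SELECT SUM(v.maxworth)
FROM (
  SELECT campaignid, acode, MAX(maxworth) AS maxworth
  FROM acodes
  WHERE datenumber >= '2016-12-09'
  GROUP BY campaignid, acode
  LIMIT 30
) v
""").fetchone()[0]

print(naive_total, grouped_total)  # 32 22
```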
Not sure what you're asking here.
SELECT
a.MAXWORTH1,
SUM(a.MAXWORTH1) AS "MAXWORTH2"
FROM (
SELECT
CAMPAIGNID,
SUM(maxworth) AS "MAXWORTH1"
FROM acodes
WHERE datenumber ='2016-12-05'
GROUP BY campaignid
) a
GROUP BY a.MAXWORTH1
The following calculates the SUM() per unique campaignid, acode, and maxworth. You mention "there are multiple maxworth for the same acode and campaignid", which makes me think you might want to treat "maxworth" as unique.
SELECT
campaignid,
acode,
maxworth,
SUM(maxworth) AS 'MAXWORTH1'
FROM acodes
WHERE datenumber >= '2016-12-09'
GROUP BY campaignid, acode, maxworth
Here is another attempt at answering your question.
SELECT
a.campaignid,
a.acode,
SUM(a.maxworth) as "SUMMAXWORTH"
FROM
(SELECT
*
FROM acodes
WHERE datenumber >= '2016-12-09'
GROUP BY campaignid, acode
LIMIT 0 , 30
) a
GROUP BY a.campaignid, a.acode

MySQL query is slow - difference in successive dates at the group level

Below is my MySQL query to find the difference between successive date for each account and then using the results to prepare a frequency count table. This query is of course very slow but before that am I doing the right thing? Please help if you can. Also embedded is a small data sample.
Appreciate your time.
OZooHA
ID DATE
403 2008-06-01
403 2012-06-01
403 2011-06-01
403 2010-06-01
403 2009-06-01
15028 2011-07-01
15028 2010-07-01
15028 2009-07-01
15028 2008-07-01
SELECT
month_diff,
count(*)
FROM
(SELECT t1.id,
t1.date,
MIN(t2.date) AS lag_date,
TIMESTAMPDIFF(MONTH, t1.date, MIN(t2.date)) AS month_diff
FROM tbl_name t1
INNER JOIN tbl_name t2
ON t1.id = t2.id
AND t2.date > t1.date
GROUP BY t1.id, t1.date
ORDER BY t1.id, t1.date
) d
GROUP BY month_diff
ORDER BY month_diff
Likely, materializing the inline view is taking most of the time. Ensure you have suitable indexes available to improve performance of the join operation; a covering index ON tbl_name (id, date) would likely be optimal for this query.
With a suitable index available (as above) it may be possible to get better performance with a query something like this:
SELECT d.month_diff
, COUNT(*)
FROM ( SELECT IF(@prev_id = t.id
, TIMESTAMPDIFF(MONTH, t.date, @prev_date )
, NULL
) AS month_diff
, @prev_date := t.date
, @prev_id := t.id
FROM tbl_name t
CROSS
JOIN (SELECT @prev_date := NULL, @prev_id := NULL) i
GROUP BY t.id DESC, t.date DESC
) d
WHERE d.month_diff IS NOT NULL
GROUP BY d.month_diff
Note that the behavior of MySQL user-defined variables that are assigned and read within the same statement is not guaranteed. But we do observe consistent behavior with queries written in a particular way. (Future versions of MySQL may change the behavior we observe.)
EDIT: I modified the query above, to replace the ORDER BY t.id, t.date with a GROUP BY t.id, t.date... It's not clear from the example data whether (id,date) is guaranteed to be unique. (If we do have that guarantee, then we don't need the GROUP BY, we can just use ORDER BY. Otherwise, we need the GROUP BY to get the same result returned by the original query.)
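On the "am I doing the right thing?" part of the question, the expected frequency table for the sample data can be cross-checked with a short sketch like the one below. It uses Python's built-in sqlite3 module, not MySQL, and follows the self-join shape of the original query. SQLite has no TIMESTAMPDIFF, so the month difference is computed from the year and month parts, which is only equivalent here because every sample date falls on the first of a month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_name (id INTEGER, date TEXT)")
conn.executemany("INSERT INTO tbl_name VALUES (?, ?)", [
    (403, "2008-06-01"), (403, "2012-06-01"), (403, "2011-06-01"),
    (403, "2010-06-01"), (403, "2009-06-01"),
    (15028, "2011-07-01"), (15028, "2010-07-01"),
    (15028, "2009-07-01"), (15028, "2008-07-01"),
])

# For each (id, date), find the next later date for the same id, then
# count how often each month gap occurs. Note the alias on the derived table.
sql = """
SELECT month_diff, COUNT(*)
FROM (
  SELECT t1.id, t1.date,
         (CAST(strftime('%Y', MIN(t2.date)) AS INTEGER)
            - CAST(strftime('%Y', t1.date) AS INTEGER)) * 12
         + (CAST(strftime('%m', MIN(t2.date)) AS INTEGER)
            - CAST(strftime('%m', t1.date) AS INTEGER)) AS month_diff
  FROM tbl_name t1
  INNER JOIN tbl_name t2
    ON t1.id = t2.id
   AND t2.date > t1.date
  GROUP BY t1.id, t1.date
) d
GROUP BY month_diff
ORDER BY month_diff
"""

frequency = conn.execute(sql).fetchall()
print(frequency)  # [(12, 7)]: four yearly gaps for id 403, three for id 15028
```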

I am trying to add the results of two queries together

I am trying to create a single query to display the results of 2 queries. The headings are identical, but I just can't seem to figure this out. Here is what I have written:
SELECT ut.question_id, ut.question, ut.response_value, ut.response_text, SUM(ut.total)
FROM
((SELECT survey_questions.id AS 'question_id', survey_questions.question, (survey_responses.sort_order+1) AS 'response_value',
survey_responses.response AS 'response_text', COUNT(survey_responses.response) AS 'total'
FROM voters, group_precincts, voters_surveys, survey_questions, survey_responses
WHERE survey_questions.survey_id = 1
AND voters.id=voters_surveys.voter_id
AND voters.precinct = group_precincts.precincts
AND group_precincts.group_id IN (0)
AND voters_surveys.question_id = survey_questions.id
AND voters_surveys.response_id = survey_responses.id
AND voters_surveys.timestamp BETWEEN '2014-01-01 00:00:00' AND '2014-04-01 00:00:00') AS 'T'
UNION ALL
(SELECT survey_questions.id AS 'question_id', survey_questions.question, (survey_responses.sort_order+1) AS 'response_value',
survey_responses.response AS 'response_text', COUNT(voters_surveys_responses.response_id) AS 'total'
FROM groups, `voters_surveys_responses`, survey_questions, survey_responses
WHERE `voters_surveys_responses`.question_id = survey_questions.id
AND `voters_surveys_responses`.response_id = survey_responses.id
AND `voters_surveys_responses`.timestamp BETWEEN '2014-01-01 00:00:00' AND '2014-04-01 00:00:00'
AND survey_questions.survey_id = 1
AND groups.id IN (0)) AS 'U') AS 'ut'
GROUP BY ut.question_id, ut.response_value;
You have a syntax error near the UNION ALL. I don't think you can use AS 'T' and AS 'U' where you added them. You are not using these aliases, so try removing them and re-running.
Another possible problem is that you are grouping by question_id and response_value but also selecting question. You will probably only be able to select fields that you group by, or that you apply an aggregate function to (like how you apply SUM() to total).
A possible solution is to add question to the GROUP BY.
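As a rough sketch of the overall shape, the two result sets can be stacked with UNION ALL inside a single derived table, with only that outer derived table given an alias. The example below runs through Python's built-in sqlite3 module rather than MySQL. The tables `counts_a` and `counts_b` are hypothetical stand-ins for the two inner SELECTs, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE counts_a (question_id INTEGER, question TEXT, response_value INTEGER, total INTEGER);
CREATE TABLE counts_b (question_id INTEGER, question TEXT, response_value INTEGER, total INTEGER);
INSERT INTO counts_a VALUES (1, 'Q1', 1, 3), (1, 'Q1', 2, 4);
INSERT INTO counts_b VALUES (1, 'Q1', 1, 2), (2, 'Q2', 1, 5);
""")

# The UNION ALL branches carry no alias of their own; only the combined
# derived table is named (ut). Every selected non-aggregated column is
# also listed in the GROUP BY.
sql = """
SELECT ut.question_id, ut.question, ut.response_value, SUM(ut.total) AS total
FROM (
  SELECT question_id, question, response_value, total FROM counts_a
  UNION ALL
  SELECT question_id, question, response_value, total FROM counts_b
) AS ut
GROUP BY ut.question_id, ut.question, ut.response_value
ORDER BY ut.question_id, ut.response_value
"""

combined = conn.execute(sql).fetchall()
print(combined)  # [(1, 'Q1', 1, 5), (1, 'Q1', 2, 4), (2, 'Q2', 1, 5)]
```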

need the work around with the mysql query

Folks,
When I run the query below, I get an "invalid use of group by function" error:
SELECT `Margin`.`destination`,
ROUND(sum(duration),2) as total_duration,
sum(calls) as total_calls
FROM `ilax`.`margins` AS `Margin`
WHERE `date1` = '2013-08-30' and `destination` like "af%"
AND ROUND(sum(duration),2) like "3%"
group by `destination`
ORDER BY duration Asc LIMIT 0, 20;
Please let me know a workaround.
The WHERE clause is evaluated before grouping takes place, so SUM() cannot be used therein; use the HAVING clause instead, which is evaluated after grouping:
SELECT destination,
ROUND(SUM(duration), 2) AS total_duration,
SUM(calls) AS total_calls
FROM ilax.margins
WHERE date1 = '2013-08-30'
AND destination LIKE 'af%'
GROUP BY destination
HAVING total_duration LIKE '3%'
ORDER BY total_duration ASC
LIMIT 0, 20
Note also that one really ought to use numeric comparison operations for numeric values, rather than string pattern matching. For example:
HAVING total_duration >= 3000 AND total_duration < 4000
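The WHERE versus HAVING distinction can be tried out with a small sketch like this one, which uses Python's built-in sqlite3 module as a stand-in for MySQL. The table contents are invented, the error text differs between the two engines, and the HAVING clause repeats the aggregate expression rather than relying on the alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE margins (destination TEXT, duration REAL, calls INTEGER)")
conn.executemany("INSERT INTO margins VALUES (?, ?, ?)", [
    ("afghanistan", 1500.0, 10),
    ("afghanistan", 2000.25, 5),
    ("africa", 100.0, 1),
    ("albania", 3500.0, 7),
])

# An aggregate in WHERE is rejected, because WHERE filters rows before grouping.
try:
    conn.execute(
        "SELECT destination FROM margins WHERE SUM(duration) >= 3000 GROUP BY destination"
    )
    where_error = None
except sqlite3.Error as exc:
    where_error = str(exc)

# The same condition in HAVING is applied after the groups are formed.
rows = conn.execute("""
SELECT destination,
       ROUND(SUM(duration), 2) AS total_duration,
       SUM(calls) AS total_calls
FROM margins
WHERE destination LIKE 'af%'
GROUP BY destination
HAVING ROUND(SUM(duration), 2) >= 3000 AND ROUND(SUM(duration), 2) < 4000
ORDER BY total_duration ASC
LIMIT 20
""").fetchall()

print(where_error is not None)  # True
print(rows)                     # [('afghanistan', 3500.25, 15)]
```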