How to join two aggregate views, when one is a derived view? - mysql

I'm pretty inexperienced with SQL in general, so I'm struggling with this. I know this looks messy and inefficient (so open to suggestions on improving that too!)
I have two queries that are pulling and aggregating data from three different tables. The first query is pulling from a single table, the second is aggregating data from 2 different tables.
SELECT
ifnull( `rt_poes`.`Account`, 'Total Compliant Spend' ) AS `Account`,
round( sum( IF ((( `rt_poes`.`Platform` = 'Google' ) AND ( `rt_poes`.`Ad_Type` = 'Search' )), `rt_poes`.`Cost`, 0 )), 2 ) AS `Compliant Google Search`,
round( sum( IF ((( `rt_poes`.`Platform` = 'Microsoft' ) AND ( `rt_poes`.`Ad_Type` = 'Search' )), `rt_poes`.`Cost`, 0 )), 2 ) AS `Compliant Bing Search`,
round( sum( IF ((( `rt_poes`.`Platform` = 'Google' ) AND ( `rt_poes`.`Ad_Type` = 'Shopping' )), `rt_poes`.`Cost`, 0 )), 2 ) AS `Compliant Google Shopping`,
round( sum( IF ((( `rt_poes`.`Platform` = 'Microsoft' ) AND ( `rt_poes`.`Ad_Type` = 'Shopping' )), `rt_poes`.`Cost`, 0 )), 2 ) AS `Compliant Bing Shopping`,
round( sum( IF ((( `rt_poes`.`Platform` = 'Google' ) AND ( `rt_poes`.`Ad_Type` = 'Search' )), `rt_poes`.`Cost`, 0 )), 2 ) +
round( sum( IF ((( `rt_poes`.`Platform` = 'Microsoft' ) AND ( `rt_poes`.`Ad_Type` = 'Search' )), `rt_poes`.`Cost`, 0 )), 2 )
AS `Compliant Country Spend`
FROM
`rt_poes`
WHERE
(
`rt_poes`.`field1` LIKE '% condition1%'
)
OR ( `rt_poes`.`field2` LIKE '% condition2%' )
GROUP BY
`rt_poes`.`Account` WITH ROLLUP
ORDER BY
sum( `rt_poes`.`Cost` )
This works well and generates a nice overview of the data I'm looking for as below.
The second query is:
SELECT ifnull( Account, 'Total All Platform Spend' ) AS Account, Round(SUM(Cost), 2) AS total_platform_costs
FROM
(SELECT Account, Cost FROM QTD_Account_Report
UNION ALL
SELECT AccountName, Spend FROM Bing_QTD_Account_Report) AS DerivedTable
GROUP BY DerivedTable.Account WITH ROLLUP
ORDER BY total_platform_costs
This generates another nice view of the data from another table that I'm looking for too.
The output of both is correct.
Now, I'm trying to join both on the account column. Essentially just add the second column from the second output on to the output of the first query.
The problem I'm having is that whenever I try to use a standard join, I get an error that MySQL does not support this syntax.
Is there another way to join the two or modify the second so that I can have a single view with both like below? Or do I have to create a whole new table for this?

I haven’t done MySQL in a while, but simply joining the second query should work. Is this what you did?
select
ifnull(rt_poes.Account, 'Total Compliant Spend') as Account,
round( sum( if(rt_poes.Platform = 'Google' AND rt_poes.Ad_Type = 'Search', rt_poes.Cost, 0)), 2) as `Compliant Google Search`,
round( sum( if(rt_poes.Platform = 'Microsoft' AND rt_poes.Ad_Type = 'Search', rt_poes.Cost, 0)), 2) as `Compliant Bing Search`,
round( sum( if(rt_poes.Platform = 'Google' AND rt_poes.Ad_Type = 'Shopping', rt_poes.Cost, 0)), 2) as `Compliant Google Shopping`,
round( sum( if(rt_poes.Platform = 'Microsoft' AND rt_poes.Ad_Type = 'Shopping', rt_poes.Cost, 0)), 2) as `Compliant Bing Shopping`,
round( sum( if(rt_poes.Platform = 'Google' AND rt_poes.Ad_Type = 'Search', rt_poes.Cost, 0)), 2) +
round( sum( if(rt_poes.Platform = 'Microsoft' AND rt_poes.Ad_Type = 'Search', rt_poes.Cost, 0)), 2) as `Compliant Country Spend`,
round(SUM(DerivedTable.Cost), 2) as total_platform_costs
from rt_poes
left join (select Account, Cost
from QTD_Account_Report
UNION ALL
select AccountName, Spend
from Bing_QTD_Account_Report) as DerivedTable on DerivedTable.Account = rt_poes.Account
where rt_poes.field1 like '%condition1%' or rt_poes.field2 like '%condition2%'
group by rt_poes.Account WITH ROLLUP
order by sum(rt_poes.Cost)
It would be helpful if you quoted the specific error message you got, as well as the query that caused it.
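One thing to watch with a direct join like the one above: every rt_poes row is paired with every matching DerivedTable row before the SUMs run, so an account with several rows on either side can end up with multiplied totals. A possible way around that, sketched here with the table and column names from the question, is to aggregate each side down to one row per account first and only then join. The aliases compliant_spend and platform_spend are made up for this example, only two of the compliant columns are shown, and the WITH ROLLUP total rows are left out to keep it short.

```sql
-- Sketch: one row per account on each side, then join on Account.
SELECT
    compliant_spend.Account,
    compliant_spend.`Compliant Google Search`,
    compliant_spend.`Compliant Bing Search`,
    platform_spend.total_platform_costs
FROM (
    -- First query from the question, reduced to one row per account.
    SELECT
        Account,
        ROUND(SUM(IF(Platform = 'Google'    AND Ad_Type = 'Search', Cost, 0)), 2) AS `Compliant Google Search`,
        ROUND(SUM(IF(Platform = 'Microsoft' AND Ad_Type = 'Search', Cost, 0)), 2) AS `Compliant Bing Search`
    FROM rt_poes
    WHERE field1 LIKE '%condition1%' OR field2 LIKE '%condition2%'
    GROUP BY Account
) AS compliant_spend
LEFT JOIN (
    -- Second query from the question, also one row per account.
    SELECT Account, ROUND(SUM(Cost), 2) AS total_platform_costs
    FROM (
        SELECT Account, Cost FROM QTD_Account_Report
        UNION ALL
        SELECT AccountName, Spend FROM Bing_QTD_Account_Report
    ) AS DerivedTable
    GROUP BY Account
) AS platform_spend ON platform_spend.Account = compliant_spend.Account
ORDER BY platform_spend.total_platform_costs;
```

Because both derived tables are already grouped by Account, the join matches at most one row to one row and the sums stay as each original query reported them.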

Related

What kind of query optimization can be done on this query?

I'm listing out orders data based on this query. This query basically pulls the recurring orders data from the table. I'm also using some dropdowns and an input field to search / filter query results.
SELECT
orders.id,
parent_id,
(
SELECT
COUNT(*)
FROM
orders o
WHERE
o.parent_id = orders.id
) AS recurring_order_count,
shopify_order_type,
shopify_order_id,
shopify_order_customer_ID,
coupon_code AS coupon,
FORMAT(shopify_order_total_price, 2) AS shopify_order_total_price,
FORMAT(
shopify_order_subtotal_price,
2
) AS shopify_order_subtotal_price,
FORMAT(
shopify_order_total_line_items_price,
2
) AS shopify_order_total_line_items_price,
FORMAT(commission_amount, 2) AS commission_amount,
(
CASE WHEN is_paid = 0 THEN 'No' WHEN is_paid = 1 THEN 'Yes' ELSE 'Rejected'
END
) AS is_paid,
(
CASE WHEN is_invoice_generated = 1 THEN 'Pending' ELSE 'Invoice Generated'
END
) AS is_invoice_generated,
DATE_FORMAT(
shopify_order_created_at,
'%m-%d-%Y'
) AS shopify_order_created_at,
(
CASE WHEN is_paused = 0 THEN 'Running' ELSE 'Paused'
END
) AS is_paused,
DATE_FORMAT(
shopify_recurring_date,
'%m-%d-%Y'
) AS shopify_recurring_date
FROM
`orders`
WHERE
coupon_code LIKE '%GERALD8314%' OR shopify_order_id LIKE '%GERALD8314%' OR(
CASE WHEN is_paid = 0 THEN 'No' WHEN is_paid = 1 THEN 'Yes' ELSE 'Rejected'
END
) LIKE '%GERALD8314%' OR(
CASE WHEN is_invoice_generated = 1 THEN 'Pending' ELSE 'Invoice Generated'
END
) LIKE '%GERALD8314%' OR DATE_FORMAT(
shopify_order_created_at,
'%m-%d-%Y'
) LIKE '%GERALD8314%' OR(
CASE WHEN is_paused = 0 THEN 'Running' ELSE 'Paused'
END
) LIKE '%GERALD8314%' OR DATE_FORMAT(
shopify_recurring_date,
'%m-%d-%Y'
) LIKE '%GERALD8314%' AND DATE_FORMAT(
shopify_order_created_at,
'%Y-%m-%d'
) BETWEEN ? AND ?
GROUP BY
`id`
HAVING
parent_id = 0 AND shopify_order_type = 1
ORDER BY
`id`
DESC
LIMIT 10 OFFSET 0
Is this query optimized? Is the (SELECT COUNT(*) FROM orders o WHERE o.parent_id = orders.id) AS recurring_order_count subquery the most expensive part in terms of query execution speed? Is there anything I should take care of to improve the query speed here? Please advise.
OR and LIKE with leading wildcard are terrible for performance. Consider having a FULLTEXT index across the relevant columns. If it is practical, it will be immensely faster.
WHERE coupon_code LIKE '%GERALD8314%'
OR shopify_order_id LIKE '%GERALD8314%' OR( CASE WHEN is_paid = 0 THEN 'No' WHEN is_paid = 1 THEN 'Yes' ELSE 'Rejected' END ) LIKE '%GERALD8314%' OR( CASE WHEN is_invoice_generated = 1 THEN 'Pending' ELSE 'Invoice Generated' END ) LIKE '%GERALD8314%'
OR DATE_FORMAT( shopify_order_created_at, '%m-%d-%Y' ) LIKE '%GERALD8314%' OR( CASE WHEN is_paused = 0 THEN 'Running' ELSE 'Paused' END ) LIKE '%GERALD8314%'
OR DATE_FORMAT( shopify_recurring_date, '%m-%d-%Y' ) LIKE '%GERALD8314%'
AND DATE_FORMAT( shopify_order_created_at, '%Y-%m-%d' ) BETWEEN ? AND ?
Also, there may be surprises in the results. Note that you have
a OR b OR c AND d
which is the same as
a OR b OR (c AND d)
I suspect you wanted
(a OR b OR c) AND d
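The difference is easy to confirm with constants, since AND binds more tightly than OR. A throwaway query with no tables involved, where 1 and 0 stand in for the conditions a, b, c and d:

```sql
-- a = 1, b = 0, c = 0, d = 0
SELECT
    1 OR 0 OR 0 AND 0   AS without_parentheses,  -- parsed as 1 OR 0 OR (0 AND 0)
    (1 OR 0 OR 0) AND 0 AS with_parentheses;     -- d = 0 now rejects the row
```

The first column comes back as 1 and the second as 0, so without the parentheses a match on any of the earlier LIKE conditions bypasses the date filter entirely.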
I think the GROUP BY is unnecessary. And the HAVING clauses can be merged into the WHERE.
GROUP BY `id`
HAVING parent_id = 0
AND shopify_order_type = 1
ORDER BY `id` DESC
This has multiple issues:
DATE_FORMAT( shopify_order_created_at, '%Y-%m-%d' ) BETWEEN ? AND ?
BETWEEN is "inclusive". The way you have written the query, it will include the entire ending day. This may not be what you wanted.
Assuming the column is a DATE or DATETIME, it can be simplified to
shopify_order_created_at BETWEEN ? AND ?
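Note that if the column is a DATETIME, a bare BETWEEN with date-only bounds stops at midnight at the start of the ending day. A common pattern that covers the whole last day while leaving the column unwrapped (so an index on it can be considered) is a half-open range. A sketch, with literal dates standing in for the two ? parameters, the search conditions omitted, and the parent_id / shopify_order_type filters moved from HAVING into WHERE:

```sql
-- The dates are placeholder values, not taken from the question.
SELECT id, shopify_order_id, shopify_order_created_at
FROM orders
WHERE parent_id = 0
  AND shopify_order_type = 1
  AND shopify_order_created_at >= '2021-01-01'
  AND shopify_order_created_at <  '2021-01-31' + INTERVAL 1 DAY  -- everything before Feb 1
ORDER BY id DESC
LIMIT 10 OFFSET 0;
```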
There may be more tips; fix these then come back for more.

complex query - look back on same table data to output results in mysql (VERY slow query)

I'm trying to optimize a query whose performance is suffering badly as the db grows.
Background: we are trying to find a list of users who have taken a course and whose credential is now due for renewal (or has not been renewed). To do that, the query has to look into the registration table (the same table that holds all their registration history) and find records where they have not renewed. (Each time the client takes a course, a registration record is added.) The query I want to optimize checks whether the client has taken the same course type on a date/time after the last class (of the same type) they took. If there is no such record, it should return row(s) showing that they didn't renew their course. It sounds easy, but, as you know, when you're in the heat of writing a query it gets very complex, and even more so once the db has grown so large that it takes almost 5-6 minutes to find the data. So, I'm asking for help on how I can optimize the efforts of my predecessor, below.
Here is the query, thus far (don't laugh, it wasn't started by me--I took over the project).
I have no clue where to begin with optimizing this MySQL. I think it needs to have select statements within the JOINs, but I'm at your mercy to direct me as to where to start! (I'm not a db guy, but offered to take a look and see where we can fix this.)
Thanks a million for reading.
Lee
SELECT
r.GUID AS `A/C #`,
concat( a.AttendeeLastName, ', ', a.AttendeeFirstName ) AS `Full Name (Last, First)`,
r.CourseExpirationDateFull AS `Exp Date`,
m.type_master_abbrev AS Course,
a.EmailName AS Email,
r.EventID,
r.EventTypeMasterID,
m.type_master_name,
IF( ( r.CourseExpirationDateFull < curdate( ) ), 'Expired', 'Valid' ) AS Status,
e.StartDateTime,
( to_days( curdate( ) ) - to_days( r.ExpNoticeSent ) ) AS `Last Notice`,
r.AttendeeID,
a.AttendeeCredentials,
r.RegistrationID,
r.RenewedExternalYYYY,
r.ExpNoticeSent,
q.RenewedRegID,
rs.reg_status_name AS `Reg Status`,
( to_days( r.CourseExpirationDateFull ) - to_days( curdate( ) ) ) AS Days2Exp,
a.flgReturnEmail,
a.flgSendEmail,
r.reg_type_ID,
a._usr_flg_do_not_call,
a.flgPrintLetter,
e.EventTypeMasterID AS MasterID,
c.`Last: yy-mm-dd - by - topic` AS LastComm,
r.reg_renewal_status_id
FROM
vjgzuqrr_wtsql.registration r LEFT JOIN vjgzuqrr_wtsql.events e ON ( r.EventID = e.EventID ) LEFT JOIN
vjgzuqrr_wtsql.attendees a ON a.ID = r.GUID LEFT JOIN
vjgzuqrr_wtsql.tbl_crs_type_master m ON r.EventTypeMasterID = m.ID_crs_type_master LEFT JOIN
vjgzuqrr_wtsql.qryrenreg q ON r.RegistrationID = q.OrigRegID LEFT JOIN
vjgzuqrr_wtsql.tbl_reg_status rs ON rs.ID_reg_status = r.RegistrationStatus LEFT JOIN
vjgzuqrr_wtsql.v_last_contact c ON c.registrationid = r.RegistrationID
WHERE
r.Role = 1
AND r.reg_type_ID IN ( 1, 2 )
AND r.CompletionStatus IN ( 9, 8 )
AND r.r IN ( 1, 14, 9 )
AND ( r.EventTypeMasterID IS NOT NULL OR r.EventTypeMasterID = 17 )
AND r.flgDelete = 0
AND r.flgTest = 0
AND e.flgDelete = 0
AND e.flgTestCourse = 0
AND e.flgDelete = 0
AND a.flgTest = 0
AND isnull( q.RenewedRegID )
AND a.flgReturnEmail = 0
AND m.type_master_abbrev NOT IN ( 'EKGPHARM', 'IVCERT', 'sem', 'fam&friends', 'cccc' )
Edit to include Explain:
Sorry, I'm a bit slow with MySQL.
This does not speed anything up (I think, though it may help a bit), but it should make the query readable in a non-mindbreaking way. (Hopefully this will also help others look at it.)
SELECT
r.GUID AS `A/C #`,
concat( a.AttendeeLastName, ', ', a.AttendeeFirstName ) AS `Full Name (Last, First)`,
r.CourseExpirationDateFull AS `Exp Date`,
m.type_master_abbrev AS Course,
a.EmailName AS Email,
r.EventID,
r.EventTypeMasterID,
m.type_master_name,
IF( ( r.CourseExpirationDateFull < curdate( ) ), 'Expired', 'Valid' ) AS Status,
e.StartDateTime,
( to_days( curdate( ) ) - to_days( r.ExpNoticeSent ) ) AS `Last Notice`,
r.AttendeeID,
a.AttendeeCredentials,
r.RegistrationID,
r.RenewedExternalYYYY,
r.ExpNoticeSent,
q.RenewedRegID,
rs.reg_status_name AS `Reg Status`,
( to_days( r.CourseExpirationDateFull ) - to_days( curdate( ) ) ) AS Days2Exp,
a.flgReturnEmail,
a.flgSendEmail,
r.reg_type_ID,
a._usr_flg_do_not_call,
a.flgPrintLetter,
e.EventTypeMasterID AS MasterID,
c.`Last: yy-mm-dd - by - topic` AS LastComm,
r.reg_renewal_status_id
FROM
vjgzuqrr_wtsql.registration r LEFT JOIN
vjgzuqrr_wtsql.events e ON r.EventID = e.EventID LEFT JOIN
vjgzuqrr_wtsql.attendees a ON a.ID = r.GUID LEFT JOIN
vjgzuqrr_wtsql.tbl_crs_type_master m ON r.EventTypeMasterID = m.ID_crs_type_master LEFT JOIN
vjgzuqrr_wtsql.qryrenreg q ON r.RegistrationID = q.OrigRegID LEFT JOIN
vjgzuqrr_wtsql.tbl_reg_status rs ON rs.ID_reg_status = r.RegistrationStatus LEFT JOIN
vjgzuqrr_wtsql.v_last_contact c ON c.registrationid = r.RegistrationID
WHERE
r.Role = 1
AND r.reg_type_ID IN ( 1, 2 )
AND r.CompletionStatus IN ( 9, 8 )
AND r.r IN ( 1, 14, 9 )
AND ( r.EventTypeMasterID IS NOT NULL OR r.EventTypeMasterID = 17 )
AND r.flgDelete = 0
AND r.flgTest = 0
AND e.flgDelete = 0
AND e.flgTestCourse = 0
AND e.flgDelete = 0
AND a.flgTest = 0
AND isnull( q.RenewedRegID )
AND a.flgReturnEmail = 0
AND m.type_master_abbrev NOT IN ( 'EKGPHARM', 'IVCERT', 'sem', 'fam&friends', 'cccc' )
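For the "has this client taken the same course type again later" part, one possible shape is an anti-join with NOT EXISTS. This is only a sketch: it assumes AttendeeID identifies the client, EventTypeMasterID identifies the course type, and events.StartDateTime is the class date. Whether that matches what the existing qryrenreg lookup does is not known from the question.

```sql
-- Registrations with no later registration by the same attendee for the same course type.
SELECT r.RegistrationID, r.AttendeeID, r.EventTypeMasterID, r.CourseExpirationDateFull
FROM vjgzuqrr_wtsql.registration r
JOIN vjgzuqrr_wtsql.events e ON e.EventID = r.EventID
WHERE r.flgDelete = 0
  AND e.flgDelete = 0
  AND NOT EXISTS (
        SELECT 1
        FROM vjgzuqrr_wtsql.registration r2
        JOIN vjgzuqrr_wtsql.events e2 ON e2.EventID = r2.EventID
        WHERE r2.AttendeeID        = r.AttendeeID          -- same client (assumed key)
          AND r2.EventTypeMasterID = r.EventTypeMasterID   -- same course type
          AND r2.flgDelete         = 0
          AND e2.StartDateTime     > e.StartDateTime       -- taken after this class
      );
```

A composite index on registration (AttendeeID, EventTypeMasterID) would probably help the inner lookup, though that index is a suggestion and not something shown in the question. Also note that WHERE filters such as e.flgDelete = 0 and a.flgTest = 0 discard the NULL rows a LEFT JOIN produces, so those joins in the original query already behave like inner joins.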

Sql query refactor from mysql 5.6 to 8.0 (GROUP BY problem)

I get this error
SQL Error (1055): Expression #7 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'ifu.amount' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
after migrating to MySQL 8.0 from 5.6. I know that it can be easily fixed by disabling the ONLY_FULL_GROUP_BY flag, but I want the query to be compatible with MySQL 8.0. So the question is: if I add ifu.amount to the GROUP BY, will it work fine without missing any query results? Without ifu.amount in the GROUP BY, the query currently looks like this:
select
`i`.`id` AS `institution_id`,
`i`.`name` AS `institution_name`,
`cr`.`check_date` AS `check_date`,
sum(
(
case when (`cr`.`status` = '1') then 1 else 0 end
)
) AS `can_accept`,
sum(
(
case when (`cr`.`status` = '0') then 1 else 0 end
)
) AS `cannot_accept`,(
sum(
(
case when (`cr`.`status` = '1') then 1 else 0 end
)
) + sum(
(
case when (`cr`.`status` = '0') then 1 else 0 end
)
)
) AS `suma`,
`ifu`.`amount` AS `amount`,
round(
(
(
(
(
sum(
(
case when (`cr`.`status` = '1') then 1 else 0 end
)
) * 100
) / (
sum(
(
case when (`cr`.`status` = '1') then 1 else 0 end
)
) + sum(
(
case when (`cr`.`status` = '0') then 1 else 0 end
)
)
)
) * `ifu`.`amount`
) * 0.01
),
2
) AS `financed_amount`
from
(
(
(
`check_results` `cr`
join `family_doctors` `fd` on((`fd`.`id` = `cr`.`doctor_id`))
)
join `institutions` `i` on((`i`.`id` = `fd`.`institution_id`))
)
join `institutions_funding` `ifu` on((`ifu`.`institution_id` = `i`.`id`))
)
where
(`cr`.`status` in (1, 0))
group by
`i`.`id`,
`i`.`name`,
`cr`.`check_date`
Thanks for help in advance!
Include amount in your group by clause.
where
(`cr`.`status` in (1, 0))
group by
`i`.`id`,
`i`.`name`,
`cr`.`check_date`,
`ifu`.`amount`
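If amount is left out of the GROUP BY clause (which is only allowed with ONLY_FULL_GROUP_BY disabled), MySQL returns the amount from an arbitrary row in each id, name and check date group; which row it picks is not guaranteed.
Alternatively, keep the original GROUP BY and wrap the column in an aggregate wherever it is selected, including the ifu.amount used inside financed_amount:
min(`ifu`.`amount`) as `amount`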
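Whether adding ifu.amount to the GROUP BY changes the result depends on the data: the groups only split if some institution has more than one distinct amount in institutions_funding. A quick check, using the table and column names from the question:

```sql
-- Institutions whose funding rows carry more than one distinct amount.
-- If this returns no rows (and amount is never NULL), grouping by ifu.amount
-- as well produces the same groups as the original GROUP BY.
SELECT institution_id, COUNT(DISTINCT amount) AS distinct_amounts
FROM institutions_funding
GROUP BY institution_id
HAVING COUNT(DISTINCT amount) > 1;
```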

MySQL equivalent in MSSQL to assigning a variable and using it

I tried finding the answer, but maybe I am too new to MSSQL, I come from MySQL, so this is my question super simplified to go straight to the point:
Imagine we have a table "Things"
Thingie | Value
--------+-------
Thing1 | 10
Thing1 | 15
Thing1 | 16
In MySQL I could do something like this in a query:
SET @halfvalue := 0;
SELECT Thingie, Value,
(@halfvalue := Value / 2) AS HalfValue,
(@halfvalue / 2) AS HalfOfHalf
FROM Things
Which would return
Thingie | Value | HalfValue | HalfofHalf
--------+-------+-----------+------------
Thing1 | 10 | 5.00 | 2.50
Thing1 | 15 | 7.50 | 3.75
Thing1 | 16 | 8.00 | 4.00
Looks pretty simple, the actual one is a tad more complicated.
My problem is, in MSSQL I can't assign, and use a variable on the same SELECT. And I can't find anything similar to this functionality on this simple level.
Any solutions?
Edit, this is the select that contains all those nasty operations:
SELECT
fvh.DocEntry,
MAX( fvs.SeriesName ) AS "Serie",
MAX( fvh.DocNum - 1000000 ) AS "Número",
MAX( fvh.DocDate ) AS "Fecha",
MAX( fvh.U_FacNit ) AS "NIT",
MAX( fvh.U_FacNom ) AS "Nombre",
MAX( IIF( ISNULL( fvh.Address, '' ) = '', fvh.Address2, fvh.Address ) ) AS "Dirección",
SUM( fvd.GTotal - ISNULL( ncd.GTotal, 0 ) ) AS "Total",
IIF( MAX( fvh.CANCELED ) = 'Y' OR ( SUM( fvd.GTotal - ISNULL( ncd.GTotal, 0 ) ) = 0 ),
'Anulada',
IIF( SUM( fvd.GTotal ) > SUM( ISNULL( ncd.GTotal, 0 ) ) AND ( SUM( ISNULL( ncd.GTotal, 0 ) ) > 0 ),
'Devuelta',
'Emitida' )
) AS "Estado",
ROUND( ( ( SUM( fvd.GTotal - ISNULL( ncd.GTotal, 0 ) ) / 1.12 ) * 0.12 ), 4 ) AS "IVA",
ROUND( SUM( IIF( fvd.U_TipoA = 'BB',
( fvd.GTotal - ISNULL( ncd.GTotal, 0 ) ) - ( ( ( fvd.GTotal - ISNULL( ncd.GTotal, 0 ) ) / 1.12 ) * 0.12 ),
0 ) ), 4) AS "Bien",
ROUND( SUM( IIF( fvd.U_TipoA = 'S',
( fvd.GTotal - ISNULL( ncd.GTotal, 0 ) ) - ( ( ( fvd.GTotal - ISNULL( ncd.GTotal, 0 ) ) / 1.12 ) * 0.12 ),
0 ) ), 4) AS "Servicio",
ROUND( SUM( IIF( fvd.U_TipoA = 'N',
( fvd.GTotal - ISNULL( ncd.GTotal, 0 ) ) - ( ( ( fvd.GTotal - ISNULL( ncd.GTotal, 0 ) ) / 1.12 ) * 0.12 ),
0 ) ), 4) AS "No Aplica",
COUNT(fvd.LineNum) AS "Lineas", SUM(fvd.GTotal) AS "FCTotal",
SUM(ISNULL( ncd.GTotal, 0 )) AS "NCTotal"
/* Facturas */
FROM OINV AS fvh
LEFT JOIN NNM1 AS fvs ON fvs.Series = fvh.Series
LEFT JOIN INV1 as fvd ON fvd.DocEntry = fvh.DocEntry
/* Notas de Credito */
LEFT JOIN RIN1 AS ncd ON ncd.BaseEntry = fvh.DocEntry AND ncd.LineNum = fvd.LineNum
WHERE fvh.DocDate BETWEEN ? AND ? /*AND fvh.DocEntry = 1108*/
GROUP BY fvh.DocEntry
Thank you all for your time. I will dismantle my query and re-do it taking into consideration all of your input. Gracias, totales.
You think you can do this in MySQL:
SET @halfvalue := 0;
SELECT Thingie, Value,
(@halfvalue := Value / 2) AS HalfValue,
(@halfvalue / 2) AS HalfOfHalf
FROM Things;
But you are wrong. Why? MySQL -- as with every other database -- does not guarantee the order of evaluation of expressions in a SELECT. The documentation even warns about this:
In the following statement, you might think that MySQL will evaluate @a first and then do an assignment second:
SELECT @a, @a:=@a+1, ...;
However, the order of evaluation for expressions involving user variables is undefined.
In both databases, you can use a subquery. In the most recent versions of MySQL (and just about any other database), you can also use a CTE:
SELECT Thingie, Value, HalfValue,
(HalfValue / 2) AS HalfOfHalf
FROM (SELECT t.*, (Value / 2) AS HalfValue
FROM Things t
) t;
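For comparison, the CTE form of the same idea might look like the sketch below. The CTE name halved is made up for the example. Dividing by 2.0 rather than 2 is deliberate: if Value is an integer column, SQL Server's / does integer division, whereas MySQL's / returns a decimal result either way.

```sql
-- Compute HalfValue once in the CTE, then reuse it by name in the outer SELECT.
WITH halved AS (
    SELECT Thingie, Value, Value / 2.0 AS HalfValue
    FROM Things
)
SELECT Thingie, Value, HalfValue,
       HalfValue / 2 AS HalfOfHalf
FROM halved;
```

This needs MySQL 8.0 or later; SQL Server has supported CTEs for much longer.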
The answer is simple: you can't do that in MSSQL, because when you try it you'll get:
Msg 141, Level 15, State 1, Line 3
A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations.
which you most probably experienced.
The most simple workaround would be:
SELECT Thingie, Value, Value/2, Value/4 from Things
Other method:
select Thingie, Value, HalfValue, HalfValue / 2 from (
SELECT Thingie, Value, Value / 2 HalfValue from Things
) a
No, that doesn't work in SQL. The parameter value is not set until the query completes. You can do it in two steps:
DECLARE @halfvalue FLOAT = 0;
SELECT @halfvalue = ([Value] / 2)
FROM Things ;
SELECT Thingie
, [Value]
, HalfValue = [Value]/2
, HalfAgainValue = @halfvalue / 2
FROM Things ;

MySQL IN Condition Subquery

I have a question and answers listing and an option to filter the questions based on the % of correct answers. So I am using the following query for the listing:
SELECT
question_id,
text
FROM
test_answers LEFT JOIN test_questions ON test_questions.id = test_answers.question_id
LEFT JOIN test_categories ON test_questions.`category_id` = test_categories.id
WHERE `question_id` IN(question IDS)
GROUP BY `question_id`
ORDER BY `question_id` DESC;
and using another query to find the question IDs for which the % of correct answers is in the given range. The query is as follows:
SELECT q1.question_id FROM (
SELECT test_answers.question_id AS question_id,
SUM( IF( test_answers.correct_answer =1, 1, 0 ) ) AS correct_answers,
SUM( IF( test_answers.correct_answer !=1, 1, 0 ) ) AS incorrect_answers,
round( ( SUM( IF( test_answers.correct_answer =1, 1, 0 ) ) / ( SUM( IF( test_answers.correct_answer =1, 1, 0 ) ) + SUM( IF( test_answers.correct_answer !=1, 1, 0 ) ) ) *100 ) , 2 ) AS percentage
FROM test_replies
JOIN test_answers ON test_replies.answer_id = test_answers.id
GROUP BY test_answers.question_id
HAVING percentage between 80 and 89 AND correct_answers >25
) AS q1
Now the issue is that the second query returns almost 4000 question IDs, and that number will increase in the near future and might become 10k or more. So I would seriously like to optimize the query, as it is going to impact performance a great deal. Can anyone suggest a better method for doing it?
Try a join instead of IN and see if it helps. (SQL not tested.)
SELECT
ta.question_id, text
FROM
(
SELECT test_answers.question_id AS question_id,
SUM( IF( test_answers.correct_answer =1, 1, 0 ) ) AS correct_answers,
SUM( IF( test_answers.correct_answer !=1, 1, 0 ) ) AS incorrect_answers,
round( ( SUM( IF( test_answers.correct_answer =1, 1, 0 ) ) / ( SUM( IF( test_answers.correct_answer =1, 1, 0 ) ) + SUM( IF( test_answers.correct_answer !=1, 1, 0 ) ) ) *100 ) , 2 ) AS percentage
FROM test_replies
JOIN test_answers ON test_replies.answer_id = test_answers.id
GROUP BY test_answers.question_id
HAVING percentage between 80 and 89 AND correct_answers >25
) AS q1
INNER JOIN
test_answers ta USING (question_id)
LEFT JOIN
test_questions ON test_questions.id = ta.question_id
LEFT JOIN
test_categories ON test_questions.`category_id` = test_categories.id
GROUP BY
ta.question_id
ORDER BY
ta.question_id DESC;
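The derived table itself can probably be written more compactly. In MySQL a comparison such as correct_answer = 1 evaluates to 1 or 0, so SUM counts the correct answers and AVG gives their share directly. A sketch of an equivalent ID-producing subquery, using the table and column names from the question:

```sql
-- Question IDs with more than 25 correct answers and 80-89 % correct.
SELECT ta.question_id
FROM test_replies tr
JOIN test_answers ta ON tr.answer_id = ta.id
GROUP BY ta.question_id
HAVING SUM(ta.correct_answer = 1) > 25
   AND ROUND(100 * AVG(ta.correct_answer = 1), 2) BETWEEN 80 AND 89;
```

It returns only the one column the outer query needs, which keeps the intermediate result small before it is joined back to the listing.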