MySQL sum case value equals max value of that group - mysql

I'm trying to write a query that returns, for each group, the total count of items, the number of items whose value equals the group's maximum for one field, and that maximum value itself.
So far:
SELECT
limid, COUNT(*), SUM(CASE WHEN cotavertical=MAX(cotavertical) THEN 1 ELSE 0 END), MAX(cotavertical)
FROM limites
LEFT JOIN tbparentchild ON parent=limid
LEFT JOIN tbspatialbi ON child=rgi
WHERE limtipo=4 AND x=1
GROUP BY limid
So far MySQL returns
"Invalid use of group function."
Is this too complex to solve in MySQL alone? Would it be better to do it in application code?

You are trying to use the max value of each group inside an aggregate function (SUM) before the aggregation has finished, so it is not yet available. The query below joins a subquery that contains the max value of cotavertical for each limid group. That way, the per-group max is available from another source, and you can sum against it.
SELECT l.limid,
COUNT(*),
SUM(CASE WHEN cotavertical = t.cotamax THEN 1 ELSE 0 END),
MAX(cotavertical)
FROM limites l
LEFT JOIN tbparentchild pc
ON pc.parent = l.limid
LEFT JOIN tbspatialbi s
ON pc.child = s.rgi
LEFT JOIN
(
SELECT limid, MAX(cotavertical) AS cotamax
FROM limites
LEFT JOIN tbparentchild
ON parent = limid
LEFT JOIN tbspatialbi
ON child = rgi
WHERE limtipo = 4 AND x = 1
GROUP BY limid
) t
ON l.limid = t.limid
WHERE limtipo = 4 AND l.x = 1
GROUP BY l.limid
Another option for solving your problem would be to use a subquery directly in the CASE expression. But, given the size and number of joins in your original query, this would be way uglier than the query above. MySQL versions before 8.0 do not support common table expressions, which would have helped with both these solutions.
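The join-against-a-per-group-max strategy can be tried on toy data. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table `items` with columns `grp` and `val` is hypothetical, not part of the original schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (grp INTEGER, val INTEGER);  -- hypothetical toy table
    INSERT INTO items VALUES (1, 5), (1, 9), (1, 9), (2, 3), (2, 1);
""")

# The derived table m supplies each group's max, so the outer SUM
# compares against a plain column instead of a nested aggregate.
rows = conn.execute("""
    SELECT i.grp,
           COUNT(*) AS total,
           SUM(CASE WHEN i.val = m.maxval THEN 1 ELSE 0 END) AS at_max,
           MAX(i.val) AS maxval
    FROM items i
    JOIN (SELECT grp, MAX(val) AS maxval FROM items GROUP BY grp) m
      ON m.grp = i.grp
    GROUP BY i.grp
    ORDER BY i.grp
""").fetchall()

print(rows)
```

Group 1 has three rows, two of which equal its max of 9; group 2 has two rows, one of which equals its max of 3.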

Related

I was wondering if anyone knew how to find the min value and a max value and find the timeframe between them?

I'm working with a movie database I made, and I want a select query that separately selects the movie with the highest revenue and the movie with the lowest revenue and finds the timeframe between them.
I've tried to use the MIN and MAX functions to separately select the lowest and the highest movies, and DATEDIFF() to find the timeframe between them. My code is below:
SELECT titles.title, min(financial_info.revenue),max(financial_info.revenue),
DATEDIFF(year, date(min(financial_info.revenue), date(max(financial.revenue)))
production_company.release_date
from titles
left join financial_info on titles.id = financial_info.id
left join production_company on
titles.imdb_id = production_company.imdb_id
I got an aggregation error and a syntax error.
The general format for this kind of query would be:
SELECT stuff
, maxB.Y - minB.Y -- Or, in this case DATEDIFF instead of minus
FROM (SELECT MIN(x) AS minX, MAX(x) AS maxX FROM a) AS minMax
INNER JOIN a AS minA ON minMax.minX = minA.X
INNER JOIN a AS maxA ON minMax.maxX = maxA.X
INNER JOIN b AS minB ON minA.b_id = minB.b_id
INNER JOIN b AS maxB ON maxA.b_id = maxB.b_id
;
You can end up with more than one result if there are multiple entries with the same revenue amount. (2 at min revenue and 3 at max revenue would yield 6 results).
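As a rough illustration of that general format, here is a sketch run against SQLite through Python's sqlite3 module rather than MySQL. The `movies` table and its rows are invented for the example, and `julianday()` is SQLite's way of subtracting dates; in MySQL the two-argument `DATEDIFF(date1, date2)` (which returns days) would take its place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical single-table version of the movie data.
    CREATE TABLE movies (title TEXT, revenue INTEGER, release_date TEXT);
    INSERT INTO movies VALUES
        ('Low',  10,  '2000-01-01'),
        ('Mid',  50,  '2005-06-15'),
        ('High', 900, '2000-01-31');
""")

# mm holds the min and max revenue; joining movies back twice recovers
# the row for each extreme so their dates can be compared.
row = conn.execute("""
    SELECT lo.title, hi.title,
           CAST(julianday(hi.release_date) - julianday(lo.release_date)
                AS INTEGER) AS days_apart
    FROM (SELECT MIN(revenue) AS min_rev, MAX(revenue) AS max_rev
          FROM movies) mm
    JOIN movies lo ON lo.revenue = mm.min_rev
    JOIN movies hi ON hi.revenue = mm.max_rev
""").fetchone()

print(row)
```

With this data the lowest-revenue and highest-revenue movies were released 30 days apart.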

How to do INNER JOIN with 2 Column COUNTS equal

I am trying to perform the following query:
SELECT wwpqsr.statistic_ref_id,
wwpqsr.create_time,
wwpqm.name
FROM wp_wp_pro_quiz_statistic_ref AS wwpqsr
INNER JOIN wp_wp_pro_quiz_statistic AS wwpqs
ON ( wwpqs.statistic_ref_id = wwpqsr.statistic_ref_id
AND COUNT(wwpqs.correct_count) AS correct =
COUNT(wwpqs.incorrect_count) AS incorrect)
INNER JOIN wp_wp_pro_quiz_master AS wwpqm
ON (wwpqm.id = wwpqsr.quiz_id)
WHERE wwpqsr.user_id = 1;
I also need a LIMIT on the result at the end, which is omitted here for simplicity. I only want results from the wp_wp_pro_quiz_statistic table where the count of correct_count equals the count of rows in the incorrect_count column. How can I do this within an INNER JOIN, all in one query? Is it possible? The above code returns an empty result when it should not. How should something like this be done?
As I said in comments, you can't use aggregate functions in a WHERE or ON clause; the aggregate either has to come from a subquery or be tested in a HAVING clause after grouping. For your case I think you are looking for:
SELECT wwpqsr.statistic_ref_id,
wwpqsr.create_time,
wwpqm.name
FROM wp_wp_pro_quiz_statistic_ref AS wwpqsr
INNER JOIN wp_wp_pro_quiz_statistic AS wwpqs
ON ( wwpqs.statistic_ref_id = wwpqsr.statistic_ref_id )
INNER JOIN wp_wp_pro_quiz_master AS wwpqm
ON (wwpqm.id = wwpqsr.quiz_id)
WHERE wwpqsr.user_id = 1
GROUP BY wwpqsr.statistic_ref_id,
wwpqsr.create_time,
wwpqm.name
HAVING COUNT(wwpqs.correct_count) = COUNT(wwpqs.incorrect_count);
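One caveat when adapting this: COUNT(column) counts non-NULL values, so two columns that are never NULL always have equal counts. If the intent is to compare totals, SUM may be what is wanted; that is an assumption about the asker's data, not something the question confirms. The sketch below uses Python's sqlite3 module as a stand-in for MySQL and a hypothetical `stats` table to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical simplified statistics table.
    CREATE TABLE stats (ref_id INTEGER, correct_count INTEGER, incorrect_count INTEGER);
    INSERT INTO stats VALUES
        (1, 1, 0), (1, 0, 1),   -- ref 1: one correct, one incorrect
        (2, 1, 0), (2, 1, 0);   -- ref 2: two correct, none incorrect
""")

def refs_having(condition):
    # Return the ref_ids whose group satisfies the HAVING condition.
    sql = f"SELECT ref_id FROM stats GROUP BY ref_id HAVING {condition} ORDER BY ref_id"
    return [r[0] for r in conn.execute(sql)]

# COUNT compares the number of non-NULL values in each column.
by_count = refs_having("COUNT(correct_count) = COUNT(incorrect_count)")
# SUM compares the column totals.
by_sum = refs_having("SUM(correct_count) = SUM(incorrect_count)")

print(by_count, by_sum)
```

Both groups pass the COUNT test because neither column contains NULLs, while only ref 1 passes the SUM test.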

Better way to write MySQL sub-query

I have two tables in my MySQL database: allele and locus. I want to know for a given locus how many alleles there are and of those how many have the status Tentative. I currently have the following query with subquery:
SELECT COUNT(*) as alleleCount,
(SELECT COUNT(*)
FROM allele
INNER JOIN locus ON allele.LocusID = locus.PrimKey
WHERE Status = 'Tentative'
AND locus.ID = 762
) as newAlleleCount
FROM allele
INNER JOIN locus ON allele.LocusID = locus.PrimKey
WHERE locus.ID = 762
but I feel there must be a better way to write this query.
You can use SUM() with a condition. In MySQL a comparison evaluates to 1 or 0, so summing it gives the count of rows that satisfy the condition:
SELECT locus.ID,COUNT(*) `all_alleles_per_locus`,
SUM(Status = 'Tentative') `tentative_alleles_762`
FROM allele
INNER JOIN locus ON allele.LocusID = locus.PrimKey
GROUP BY locus.ID
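The conditional SUM can be checked on a small data set. This is a sketch only: it runs on SQLite via Python's sqlite3 module (where a comparison also evaluates to 1 or 0), and the flattened `allele` table with a `locus_id` column is a hypothetical simplification of the two-table schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened table: one row per allele.
    CREATE TABLE allele (locus_id INTEGER, status TEXT);
    INSERT INTO allele VALUES
        (762, 'Tentative'), (762, 'Confirmed'), (762, 'Tentative'),
        (7,   'Confirmed');
""")

# status = 'Tentative' yields 1 for matching rows and 0 otherwise,
# so SUM over it counts the matching rows in each group.
rows = conn.execute("""
    SELECT locus_id,
           COUNT(*) AS all_alleles,
           SUM(status = 'Tentative') AS tentative_alleles
    FROM allele
    GROUP BY locus_id
    ORDER BY locus_id
""").fetchall()

print(rows)
```

A single pass over the table produces both counts, which is what replaces the correlated subquery in the question.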
One way would be to group the locus by its statuses and fetch each status's respective count; using the WITH ROLLUP modifier will add a NULL status at the end representing the total:
SELECT status, COUNT(*)
FROM allele JOIN locus ON locus.PrimKey = allele.LocusID
WHERE locus.ID = 762
GROUP BY status WITH ROLLUP
If you absolutely do not want a list of all statuses, you can instead GROUP BY status = 'Tentative' (optionally WITH ROLLUP if desired)—but it will not be sargable.

How to fix a count() in a query with a "group by" clause?

I have a function that takes a SQL query, inserts a count field into it, and executes it to return the number of rows. The goal is to get the record count of any dynamically generated query, because I use it in a record filter window and I never know what SQL may be generated: the user can add as many filters as he/she wants.
But when the query uses a GROUP BY clause, the result is wrong, because it counts the number of times each main record appears as a result of the many joins.
The code below should return a single row with one column containing 10, but I get a table whose first column has a 2 in the first row and a 1 in the other rows.
If I remove the GROUP BY clause I get 11 as the count, because the first record is counted twice.
What should I do to get a single row with the correct number?
SELECT
COUNT(*) QUERYRECORDCOUNT, -- this line appears only in the Count() function
ARTISTA.*,
CATEGORIA.NOME AS CATEGORIA,
ATIVIDADE.NOME AS ATIVIDADE,
LOCALIDADE.NOME AS CIDADE,
MATRICULA.NUMERO AS MAP
FROM
ARTISTA
LEFT JOIN PERFIL ON PERFIL.REGISTRO = ARTISTA.ARTISTA_ID
LEFT JOIN CATEGORIA ON CATEGORIA.CATEGORIA_ID = PERFIL.CATEGORIA
LEFT JOIN ATIVIDADE ON ATIVIDADE.ATIVIDADE_ID = PERFIL.ATIVIDADE
LEFT JOIN LOCALIDADE ON LOCALIDADE.LOCALIDADE_ID = ARTISTA.LOCAL_ATIV_CIDADE
LEFT JOIN MATRICULA ON MATRICULA.REGISTRO = ARTISTA.ARTISTA_ID
WHERE
((ARTISTA.SIT_PERFIL <> 'NORMAL') AND (ARTISTA.SIT_PERFIL <> 'PRIVADO'))
GROUP BY
ARTISTA.ARTISTA_ID
ORDER BY
ARTISTA.ARTISTA_ID;
This always gives you the number of rows for any query you have:
Select count(*) as rowcount from
(
Paste your query here
) as countquery
Since you are grouping by ARTISTA.ARTISTA_ID, COUNT(*) QUERYRECORDCOUNT returns the record count for each ARTISTA.ARTISTA_ID value.
If you want a global count, you need to use a nested query:
SELECT COUNT(*) AS QUERYRECORDCOUNT
FROM (SELECT
ARTISTA.*,
CATEGORIA.NOME AS CATEGORIA,
ATIVIDADE.NOME AS ATIVIDADE,
LOCALIDADE.NOME AS CIDADE,
MATRICULA.NUMERO AS MAP
FROM
ARTISTA
LEFT JOIN PERFIL ON PERFIL.REGISTRO = ARTISTA.ARTISTA_ID
LEFT JOIN CATEGORIA ON CATEGORIA.CATEGORIA_ID = PERFIL.CATEGORIA
LEFT JOIN ATIVIDADE ON ATIVIDADE.ATIVIDADE_ID = PERFIL.ATIVIDADE
LEFT JOIN LOCALIDADE ON LOCALIDADE.LOCALIDADE_ID = ARTISTA.LOCAL_ATIV_CIDADE
LEFT JOIN MATRICULA ON MATRICULA.REGISTRO = ARTISTA.ARTISTA_ID
WHERE
((ARTISTA.SIT_PERFIL <> 'NORMAL') AND (ARTISTA.SIT_PERFIL <> 'PRIVADO'))
GROUP BY
ARTISTA.ARTISTA_ID
ORDER BY
ARTISTA.ARTISTA_ID) AS countquery;
In this case, you may not need to select that many columns in the inner query.
If you need both the total record count and the detail rows, it is better to use two separate queries.
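The difference between counting joined rows and counting grouped rows can be reproduced on toy data. The following sketch uses Python's sqlite3 module in place of MySQL, with hypothetical `artist` and `profile` tables standing in for ARTISTA and PERFIL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: artist 1 has two profiles, artist 3 has none.
    CREATE TABLE artist (id INTEGER PRIMARY KEY);
    CREATE TABLE profile (artist_id INTEGER);
    INSERT INTO artist VALUES (1), (2), (3);
    INSERT INTO profile VALUES (1), (1), (2);
""")

# The generated query: one row per artist after grouping.
inner = """
    SELECT a.id
    FROM artist a
    LEFT JOIN profile p ON p.artist_id = a.id
    GROUP BY a.id
"""

# Wrapping the whole query counts its result rows, whatever it contains.
wrapped = conn.execute(
    f"SELECT COUNT(*) FROM ({inner}) AS countquery"
).fetchone()[0]

# Counting without GROUP BY counts joined rows, so artist 1 counts twice.
joined = conn.execute("""
    SELECT COUNT(*)
    FROM artist a
    LEFT JOIN profile p ON p.artist_id = a.id
""").fetchone()[0]

print(wrapped, joined)
```

Because the wrapper treats the inner query as an opaque derived table, it works for dynamically generated SQL without having to parse or modify the inner SELECT list.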

optimize Mysql: get latest status of the sale

In the following query, I show the latest status of the sale (by stage, in this case the number 3). The query is based on a subquery in the status history of the sale:
SELECT v.id_sale,
IFNULL((
SELECT (CASE WHEN IFNULL( vec.description, '' ) = ''
THEN ve.name
ELSE vec.description
END)
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
WHERE veh.id_sale = v.id_sale
AND vec.id_stage = 3
ORDER BY veh.id_record DESC
LIMIT 1
), 'x') sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
WHERE 1 =1
AND v.flag =1
AND v.id_quarters =4
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
This query takes 0.0057 sec and returns 1011 records.
Because I need to filter the sales by the name of the state, which would mean repeating the subquery in a WHERE clause, I decided to rewrite the query using joins. In this case, I'm using the MAX function to obtain the latest status:
SELECT
v.id_sale,
IFNULL(veh3.State3,'x') AS sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
LEFT JOIN (
SELECT veh.id_sale,
(CASE WHEN IFNULL(vec.description,'') = ''
THEN ve.name
ELSE vec.description END) AS State3
FROM t_record veh
INNER JOIN (
SELECT id_sale, MAX(id_record) AS max_rating
FROM(
SELECT veh.id_sale, id_record
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign AND vec.id_stage = 3
) m
GROUP BY id_sale
) x ON x.max_rating = veh.id_record
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
) veh3 ON veh3.id_sale = v.id_sale
WHERE v.flag = 1
AND v.id_quarters = 4
This query returns the same results (1011 records), but it takes 0.0753 sec.
Reviewing the possibilities, I found the factor that makes the difference in the speed of the query:
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
If I remove this clause, both queries take the same time. Why does it work better? Is there any way to use this clause with the joins? I would appreciate your help.
EDIT
I will show the results of EXPLAIN for each query respectively:
q1:
q2:
Interesting, so that little statement basically determines if there is a match between t_record.id_sale and t_sale.id_sale.
Why is this making your query run faster? Because WHERE conditions are applied before the subselects in the SELECT list are evaluated, so if there is no record to go with the sale, the subselect is never processed for that row. That saves you some time, and that's why it works better.
Is it going to work in your join syntax? I don't really know without having your tables to test against, but you can always append it to the end of the query and find out. Add the keyword EXPLAIN to the beginning of your query and you will get an execution plan, which will help you optimize things. Probably the best way to get better results with the join syntax is to add some indexes to your tables.
But I ask you, is this even necessary? You have a query returning in less than 8 hundredths of a second. Unless this query is being run thousands of times an hour, it is not really taxing your DB at all, and your time is probably better spent making improvements elsewhere in your application.