Mysql average based on sum if in another column - mysql

For example purposes, let's say I'm trying to figure out the average score for males and females for each parent.
Example data looks like this:
parentID  childID  sex  score
------------------------------------
1         21       m    17
1         23       f    12
2         33       f    55
2         55       m    22
3         67       m    26
3         78       f    29
3         93       m    31
This is the result I want:
parentID  offspring  m  f  avg-m  avg-f  avg-both
----------------------------------------------------
1         2          1  1  17     12     14.5
2         2          1  1  22     55     38.5
3         3          2  1  28.5   29     28.67
With the below query I can find the average for both males and females but I'm not sure how to get the average for either male or female
SELECT parentID, COUNT( childID ) AS offspring, SUM( IF( sex = 'm', 1, 0 ) ) AS m, SUM( IF( sex = 'f', 1, 0 ) ) AS f, max(score) as avg-both
FROM sexb_1
WHERE avg-both > 11
GROUP BY parentID
I tried something like this in the query but it returns an error
AVG(IF(sex = 'm', max(score),0)) as avg-m

You can't nest one aggregate function inside another (in this case, MAX() within AVG()), and it would not mean anything if you could: once MAX() has reduced the group to a single value, there is nothing left to average.
Instead, you want to take the AVG() of score values where the sex matches your requirement; since AVG() ignores NULL values and the default for unmatched CASE expressions is NULL, one can simply do:
SELECT parentID,
COUNT(*) offspring,
SUM(sex='m') m,
SUM(sex='f') f,
AVG(CASE sex WHEN 'm' THEN score END) `avg-m`,
AVG(CASE sex WHEN 'f' THEN score END) `avg-f`,
AVG(score) `avg-both`
FROM sexb_1
GROUP BY parentID
HAVING `avg-both` > 11
See it on sqlfiddle.
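As a quick check of the "AVG() ignores NULL" point, here is a minimal sketch that runs the same idea against the sample rows from the question. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and it uses underscore aliases (avg_m and so on) instead of the backticked ones; both choices are mine, not part of the answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sexb_1 (parentID INT, childID INT, sex TEXT, score INT)")
conn.executemany(
    "INSERT INTO sexb_1 VALUES (?, ?, ?, ?)",
    [(1, 21, "m", 17), (1, 23, "f", 12), (2, 33, "f", 55), (2, 55, "m", 22),
     (3, 67, "m", 26), (3, 78, "f", 29), (3, 93, "m", 31)],
)

# A CASE without ELSE yields NULL for non-matching rows, and AVG skips NULLs,
# so each conditional average only sees the rows of one sex.
rows = conn.execute("""
    SELECT parentID,
           COUNT(*)                              AS offspring,
           AVG(CASE sex WHEN 'm' THEN score END) AS avg_m,
           AVG(CASE sex WHEN 'f' THEN score END) AS avg_f,
           AVG(score)                            AS avg_both
    FROM sexb_1
    GROUP BY parentID
    HAVING AVG(score) > 11
    ORDER BY parentID
""").fetchall()

for row in rows:
    print(row)
```

For parent 3 the male average comes out as 28.5 (the mean of 26 and 31), which matches the expected result in the question.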

Using if
SELECT parentID, COUNT( childID ) AS offspring,
SUM(iF( sex='m', 1 ,0 )) AS m,
SUM(iF( sex='f', 1 ,0 )) AS f,
AVG(if(sex='m', score, null)) as avg_m,
AVG(if(sex='f', score, null)) as avg_f,
AVG(score) as avgboth
FROM sexb_1
GROUP BY parentID
HAVING avgboth > 11
fiddle
In your query, the error comes from the alias avg-both: MySQL parses it as avg minus both. Quote the alias with backticks or use an underscore instead.
Also, you cannot use a column alias in the WHERE clause. WHERE is evaluated before the SELECT list, so the database doesn't know the alias yet. A condition on an aggregate belongs in HAVING, which is applied after GROUP BY.

You can try below query-
SELECT
parentID, COUNT(childID) AS `offspring`,
COUNT(IF(sex = 'm',sex ,NULL )) AS `m`, COUNT(IF(sex = 'f', sex,NULL)) AS `f`,
AVG(IF(sex = 'm',score,NULL )) AS `avg-m`, AVG(IF(sex = 'f', score,NULL)) AS `avg-f`,
AVG(score) AS `avg-both`
FROM sexb_1
GROUP BY parentID
HAVING `avg-both` > 11;

Related

Better way to select max values in multiple-column group?

Simplified the following question I got from a coding challenge...
I have a table grades like:
year sex person mark
2000 M Mark 70
2010 F Alyssa 23
2020 M Robert 54
I want to select the people per year for both sexes that have the highest marks.
My Attempt:
SELECT
year,
MAX(CASE
WHEN sex = ‘F’ THEN person
ELSE ‘’ END) AS person_f,
MAX(CASE
WHEN sex = ‘M’ THEN person
ELSE ‘’ END) AS person_m
FROM (
SELECT
year,
sex,
person,
**mark
FROM grades
WHERE mark IN (
SELECT MAX(mark) AS mark
FROM grades
GROUP BY year, sex)
**) AS t
WHERE x = 1
GROUP BY 1
ORDER BY 1
I modified everything within the ** ** but the rest of the code was pre-populated. The code seemed right to me, but somehow only passed 2/4 test cases, and there were no tiebreaker records.
Also, I omitted the WHERE x = 1 line, but the correct solution apparently needs that. (yes, x isn't a column in any table)
Is there a more elegant/efficient way to solve this?
Can't seem to figure it out, and it's really bugging me.
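First, you need to use straight single quotes (') for strings, not typographic quotes.
The problem with your query is the subquery for the marks: it selects the highest mark of every group without tying each mark to its year and sex, so a row qualifies if its mark equals the maximum of any group.
MySQL allows an IN clause with multiple columns, so you can compare (year, sex, mark) together: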
SELECT
year,
MAX(CASE
WHEN sex = 'F' THEN person
ELSE '' END) AS person_f,
MAX(CASE
WHEN sex = 'M' THEN person
ELSE '' END) AS person_m
FROM (
SELECT
year,
sex,
person,
mark
FROM grades
WHERE (year,sex,mark) IN (
SELECT year, sex,MAX(mark) AS mark
FROM grades
GROUP BY year, sex)
) AS t
GROUP BY 1
ORDER BY 1
| year | person_f | person_m |
|-----:|:---------|:---------|
| 2000 |          | Mark     |
| 2010 | Alyssa   |          |
| 2020 |          | Robert   |
fiddle
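To see why the single-column IN goes wrong, here is a small sketch under some assumptions: it uses Python's sqlite3 module rather than MySQL (SQLite also accepts row values in IN), and the row for "Bob" is a hypothetical addition that is not in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (year INT, sex TEXT, person TEXT, mark INT)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?, ?)",
    [(2000, "M", "Mark", 70),
     (2000, "M", "Bob", 54),      # hypothetical row: not the best of 2000/M ...
     (2020, "M", "Robert", 54)],  # ... but 54 is the best mark of 2020/M
)

# Single-column IN: any mark that is the maximum of *some* group qualifies.
loose = [p for (p,) in conn.execute("""
    SELECT person FROM grades
    WHERE mark IN (SELECT MAX(mark) FROM grades GROUP BY year, sex)
    ORDER BY person
""")]

# Multi-column IN: the mark must be the maximum of the row's own group.
strict = [p for (p,) in conn.execute("""
    SELECT person FROM grades
    WHERE (year, sex, mark) IN (
        SELECT year, sex, MAX(mark) FROM grades GROUP BY year, sex)
    ORDER BY person
""")]

print(loose)   # includes Bob, who is not a top scorer in his own group
print(strict)
```

Bob slips through the first query only because his mark happens to equal the 2020 maximum, which may be the kind of case the failing tests covered.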
I believe this approach incorporates the WHERE x = 1 clause as well.
SELECT
year,
MAX(CASE
WHEN sex = 'F' THEN person
ELSE '' END) AS person_f,
MAX(CASE
WHEN sex = 'M' THEN person
ELSE '' END) AS person_m
FROM (
SELECT
year,
sex,
person,
RANK() OVER (PARTITION BY year, sex ORDER BY mark DESC) AS x
FROM grades) AS t
WHERE x = 1
GROUP BY 1
ORDER BY 1
You can use rank. I added info to the table so you could see what happens in different scenarios including a tie.
select year
,sex
,person
,mark
from (
select *
,rank() over(partition by year, sex order by mark desc) as rnk
from t
) t
where rnk = 1
order by year, sex
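| year | sex | person | mark |
|-----:|:----|:-------|-----:|
| 2000 | F   | Alyssa |   23 |
| 2000 | M   | Mark   |   70 |
| 2000 | M   | Danny  |   70 |
| 2010 | F   | Alma   |  100 |
| 2010 | M   | Dudu   |   47 |
| 2020 | F   | Noga   |   98 |
| 2020 | M   | Moshe  |   56 |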
Fiddle
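The tie behaviour is the part worth checking, so here is a rough sketch comparing RANK() with ROW_NUMBER(). It assumes Python's sqlite3 module built against SQLite 3.25 or later (needed for window functions) instead of MySQL 8, and the row for "Eli" is a hypothetical non-winner added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (year INT, sex TEXT, person TEXT, mark INT)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?, ?)",
    [(2000, "M", "Mark", 70), (2000, "M", "Danny", 70),
     (2000, "M", "Eli", 40),  # hypothetical row below the top mark
     (2000, "F", "Alyssa", 23)],
)

# Same query shape as above; only the window function name is swapped in.
query = """
    SELECT year, sex, person
    FROM (SELECT g.*,
                 {fn}() OVER (PARTITION BY year, sex ORDER BY mark DESC) AS x
          FROM grades AS g) AS t
    WHERE x = 1
    ORDER BY sex, person
"""

with_rank = conn.execute(query.format(fn="RANK")).fetchall()
with_row_number = conn.execute(query.format(fn="ROW_NUMBER")).fetchall()

print(with_rank)        # both tied top scorers of 2000/M are kept
print(with_row_number)  # exactly one row per (year, sex); which tied row is arbitrary
```

If ties should collapse to a single person, ROW_NUMBER() with an extra ORDER BY column as a tiebreaker is one option; that choice depends on what the challenge expects.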

MySQL Where Clause with Union All getting wrong results

I will preface this by saying I am still very much learning MySQL, and I am absolutely at that stage where I know just enough to be dangerous.
I have a database with data for scorekeeping for a sports league. We record wins/losses as either 1 or zero points. There is a night that has double play involved (meaning the players play twice in a single night, for 2 different formats). My data is structured like so (just a sample, I have hundreds of rows, over different formats):
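| ID | FID | WK | Type | HomeTeam | AwayTeam | HF1 | HF2 | AF1 | AF2 |
|---:|----:|---:|:-----|:---------|:---------|----:|----:|----:|----:|
|  1 |  44 |  1 | PL   | TM1      | TM2      |   1 |   0 |   0 |   1 |
|  2 |  44 |  1 | PL   | TM3      | TM4      |   0 |   0 |   1 |   1 |
|  3 |  44 |  2 | PL   | TM2      | TM3      |   1 |   1 |   0 |   0 |
|  4 |  44 |  2 | PL   | TM4      | TM1      |   0 |   1 |   1 |   0 |
|  5 |  44 |  3 | PL   | TM3      | TM1      | 999 |   0 | 999 |   1 |
|  6 |  44 |  3 | PL   | Tm2      | TM4      |   1 |   0 |   0 |   1 |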
Where the 999 is used as a code number for us to know that the match hasn't yet been played, or the scoresheet hasn't been turned in to us for recordkeeping. (I use PHP to call these to a website for users to see what is going on, and am using an IF statement to convert that 999 to "TBD" on the website)
I can pull the Format 1 and Format 2 scores separately and get a listing just fine, but when I try to pull them together and get a total score, I am getting an incorrect count. I know the error lies with my WHERE Clause, but I've been banging my head trying to get it to work correctly, and I think I just need an extra set of eyes on this.
My current SQL Query is as follows:
SELECT Team,
SUM(TotalF1) AS TotalF1,
SUM(TotalF2) AS TotalF2,
SUM(TotalF1+TotalF2) AS Total
FROM ( ( SELECT HomeTeam AS Team,
HF1 AS TotalF1,
HF2 AS TotalF2
FROM tbl_teamscores
WHERE FID = 44
AND Type = 'PL'
AND HF1 != 999
AND HF2 != 999 )
UNION ALL
( SELECT AwayTeam,
AF1,
AF2
FROM tbl_teamscores
WHERE FID = 44
AND Type = 'PL'
AND AF1 != 999
AND AF2 != 999 )
) CC
GROUP BY Team
ORDER BY Total desc, Team ASC;
I am getting incorrect totals though, and I know the reason is because of those 999 designations, as the WHERE clause is skipping over ALL lines where either home or away score matches 999.
I tried separating it out to 4 separate Select Statements, and unioning them, but I just get an error when I do that. I also tried using Inner Join, but MySQL doesn't seem to like that either.
Edit to add DBFiddle with Real World Table Data and queries: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=1d4d090b08b8280e734218ba32db6d88
An example of the problem can be observed when looking at the data for Player 10. The overall total should be 13, but I am only getting 12.
Any suggestions would be very helpful.
Thanks in advance!
You can use conditional aggregation:
SELECT Team,
SUM(CASE WHEN Total8 <> 999 THEN Total8 END) AS Total8,
SUM(CASE WHEN TotalLO <> 999 THEN TotalLO END) AS TotalLO,
SUM(CASE WHEN Total8 <> 999 THEN Total8 END) + SUM(CASE WHEN TotalLO <> 999 THEN TotalLO END) AS Total
FROM (
SELECT HomeTeam AS Team, Home8PTS AS Total8, HomeLOPTS AS TotalLO FROM tbl_teamscores WHERE FID = 44 AND Type = 'PL'
UNION ALL
SELECT AwayTeam, Away8PTS, AwayLOPTS FROM tbl_teamscores WHERE FID = 44 AND Type = 'PL'
) CC
GROUP BY Team
ORDER BY Team ASC;
or:
SELECT Team,
SUM(NULLIF(Total8, 999)) AS Total8,
SUM(NULLIF(TotalLO, 999)) AS TotalLO,
SUM(NULLIF(Total8, 999)) + SUM(NULLIF(TotalLO, 999)) AS Total
FROM (
SELECT HomeTeam AS Team, Home8PTS AS Total8, HomeLOPTS AS TotalLO FROM tbl_teamscores WHERE FID = 44 AND Type = 'PL'
UNION ALL
SELECT AwayTeam, Away8PTS, AwayLOPTS FROM tbl_teamscores WHERE FID = 44 AND Type = 'PL'
) CC
GROUP BY Team
ORDER BY Team ASC;
If you get nulls in the results then you should also use COALESCE():
SELECT Team,
COALESCE(SUM(NULLIF(Total8, 999)), 0) AS Total8,
COALESCE(SUM(NULLIF(TotalLO, 999)), 0) AS TotalLO,
COALESCE(SUM(NULLIF(Total8, 999)), 0) + COALESCE(SUM(NULLIF(TotalLO, 999)), 0) AS Total
FROM (
SELECT HomeTeam AS Team, Home8PTS AS Total8, HomeLOPTS AS TotalLO FROM tbl_teamscores WHERE FID = 44 AND Type = 'PL'
UNION ALL
SELECT AwayTeam, Away8PTS, AwayLOPTS FROM tbl_teamscores WHERE FID = 44 AND Type = 'PL'
) CC
GROUP BY Team
ORDER BY Team ASC;
See the demo.
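Here is a sketch of the difference between filtering rows in WHERE and neutralising the 999 sentinel per value. The assumptions are mine: it uses the simplified column names from the question's sample table (HF1, HF2, AF1, AF2) rather than the real-world ones, it writes the team in row 6 as TM2 so that grouping is not split by letter case, and it runs on Python's sqlite3 module instead of MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tbl_teamscores (
    ID INT, FID INT, WK INT, Type TEXT, HomeTeam TEXT, AwayTeam TEXT,
    HF1 INT, HF2 INT, AF1 INT, AF2 INT)""")
conn.executemany("INSERT INTO tbl_teamscores VALUES (?,?,?,?,?,?,?,?,?,?)", [
    (1, 44, 1, "PL", "TM1", "TM2", 1, 0, 0, 1),
    (2, 44, 1, "PL", "TM3", "TM4", 0, 0, 1, 1),
    (3, 44, 2, "PL", "TM2", "TM3", 1, 1, 0, 0),
    (4, 44, 2, "PL", "TM4", "TM1", 0, 1, 1, 0),
    (5, 44, 3, "PL", "TM3", "TM1", 999, 0, 999, 1),  # format 1 not played yet
    (6, 44, 3, "PL", "TM2", "TM4", 1, 0, 0, 1),
])

# Row-level filter: a 999 in either column throws away the whole row,
# including the valid score in the other column.
row_filtered = dict(conn.execute("""
    SELECT Team, SUM(F1 + F2)
    FROM (SELECT HomeTeam AS Team, HF1 AS F1, HF2 AS F2 FROM tbl_teamscores
          WHERE HF1 != 999 AND HF2 != 999
          UNION ALL
          SELECT AwayTeam, AF1, AF2 FROM tbl_teamscores
          WHERE AF1 != 999 AND AF2 != 999) AS CC
    GROUP BY Team
""").fetchall())

# Value-level filter: NULLIF turns only the sentinel into NULL, which SUM skips.
value_filtered = dict(conn.execute("""
    SELECT Team,
           COALESCE(SUM(NULLIF(F1, 999)), 0) + COALESCE(SUM(NULLIF(F2, 999)), 0)
    FROM (SELECT HomeTeam AS Team, HF1 AS F1, HF2 AS F2 FROM tbl_teamscores
          UNION ALL
          SELECT AwayTeam, AF1, AF2 FROM tbl_teamscores) AS CC
    GROUP BY Team
""").fetchall())

print(row_filtered["TM1"], value_filtered["TM1"])
```

With this data TM1 loses the format 2 point from week 3 under the row-level filter, which is the same kind of off-by-one described for Player 10.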

Highest Matches as an umpire

There are two columns, umpire1 and umpire2.
I need to find the name of the umpire who officiated the most matches. It doesn't matter whether he was umpire1 or umpire2 in a match; every occurrence should count.
For example, S Ravi:
37 times as umpire1
57 times as umpire2
94 in total
The output should look like:
Umpire Total_count
S Ravi 94
Error in SQL statement: AnalysisException: cannot resolve 'level' given input columns:
line 2 pos 14;
'Aggregate [umpire1#2497], [umpire1#2497, 'sum(CASE WHEN ('level = S Ravi) THEN 1 ELSE 0 END) AS u1#2464, 'sum(CASE WHEN ('level = SJ Davis) THEN 1 ELSE 0 END) AS u2#2465, 'sum(('u1 + 'u2)) AS total#2466]
select umpire1,
sum(case when level ='umpire1' then 1 else 0 end) as u1,
sum(case when level ='umpire2' then 1 else 0 end) as u2,
sum(u1+u2) as total
from IPL_MATCHES group by umpire1;
I think umpire1 and umpire2 are two columns in your table and you want to unpivot and count:
select umpire, count(*)
from ((select umpire1 as umpire from ipl_matches
) union all
(select umpire2 as umpire from ipl_matches
)
) u
group by umpire;
If you want the umpire with the most matches, then add:
order by count(*) desc
limit 1
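Put together, the unpivot-and-count idea can be sketched like this. The table layout is assumed (the question does not show it), the third umpire name is a placeholder, and the sketch runs on Python's sqlite3 module rather than Spark SQL or MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal layout: one row per match with its two umpires.
conn.execute("CREATE TABLE ipl_matches (id INT, umpire1 TEXT, umpire2 TEXT)")
conn.executemany("INSERT INTO ipl_matches VALUES (?, ?, ?)", [
    (1, "S Ravi", "SJ Davis"),
    (2, "SJ Davis", "S Ravi"),
    (3, "S Ravi", "Umpire X"),  # placeholder name for illustration
])

# UNION ALL stacks both columns into one, keeping duplicates,
# so each appearance in either role counts once.
top = conn.execute("""
    SELECT umpire, COUNT(*) AS total_count
    FROM (SELECT umpire1 AS umpire FROM ipl_matches
          UNION ALL
          SELECT umpire2 FROM ipl_matches) AS u
    GROUP BY umpire
    ORDER BY total_count DESC
    LIMIT 1
""").fetchone()

print(top)
```

UNION ALL matters here: a plain UNION would remove duplicate names and every count would collapse to 1.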

Selecting multiple columns from two tables in which one column of a table has multiple where conditions and group them by two columns and order by one

I have two tables namely "appointment" and "skills_data".
Structure of appointment table is:
id_ap || ap_meet_date || id_skill || ap_status.
And the value of ap_status are complete, confirm, cancel and missed.
And the skills_data table contains two columns namely:
id_skill || skill
I want to get the count of total number of appointments for each of these conditions
ap_status = ('complete' and 'confirm'),
ap_status = 'cancel' and
ap_status = 'missed'
GROUP BY id_skill and year and
order by year DESC
I tried this query which only gives me count of one condition but I want to get other two based on group by and order by clauses as mentioned.
If there is no record(for example: zero appointments missed in 2018 for a skill) matching for certain conditions, then it should display the output value 0 for zero count.
Could someone please suggest a query, and whether I should use multiple SELECT queries or a CASE expression to get my expected result? I have a lot of records in the appointment table and want an efficient way to query them. Thank you!
SELECT a.id_skill, YEAR(a.ap_meet_date) As year, s.skill,COUNT(*) as count_comp_conf
FROM appointment a,skills_data s WHERE a.id_skill=s.id_skill and a.ap_status IN ('complete', 'confirm')
GROUP BY `id_skill`, `year`
ORDER BY `YEAR` DESC
Output from my query:
id_skill | year | skill | count_comp_conf
-----------------------------------------
1 2018 A 20
2 2018 B 15
1 2019 A 10
2 2019 B 12
3 2019 C 10
My expected output should be like this:
id_skill | year | skill | count_comp_conf | count_cancel | count_missed
------------------------------------------------------------------------
1 2018 A 20 5 1
2 2018 B 15 8 0
1 2019 A 10 4 1
2 2019 B 12 0 5
3 2019 C 10 2 2
You can use conditional aggregation with a CASE WHEN expression:
SELECT a.id_skill, YEAR(a.ap_meet_date) As year, s.skill,
COUNT(case when a.ap_status IN ('complete', 'confirm') then 1 end) as count_comp_conf,
COUNT(case when a.ap_status = 'cancel' then 1 end) as count_cancel,
COUNT(case when a.ap_status = 'missed' then 1 end) as count_missed
FROM appointment a inner join skills_data s on a.id_skill=s.id_skill
GROUP BY `id_skill`, `year`
ORDER BY `YEAR` DESC
SELECT a.id_skill,
YEAR(a.ap_meet_date) As year,
s.skill,
SUM(IF(a.ap_status IN ('complete', 'confirm'),1,0)) AS count_comp_conf,
SUM(IF(a.ap_status='cancel',1,0)) AS count_cancel,
SUM(IF(a.ap_status='missed',1,0)) AS count_missed
FROM appointment a,skills_data s WHERE a.id_skill=s.id_skill
GROUP BY `id_skill`, `year`
ORDER BY `YEAR` DESC;
Please try to use if condition along with sum.
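Both styles (COUNT over a CASE without ELSE, and SUM over a 1/0 flag) give a 0 when no row matches, which is what the question asks for. Here is a minimal sketch with made-up appointments; it assumes Python's sqlite3 module, so YEAR() is replaced by strftime('%Y', ...) and IF() by CASE, neither of which is needed in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE skills_data (id_skill INT, skill TEXT)")
conn.execute("""CREATE TABLE appointment (
    id_ap INT, ap_meet_date TEXT, id_skill INT, ap_status TEXT)""")
conn.executemany("INSERT INTO skills_data VALUES (?, ?)", [(1, "A"), (2, "B")])
# Made-up sample rows, not taken from the question.
conn.executemany("INSERT INTO appointment VALUES (?, ?, ?, ?)", [
    (1, "2018-03-01", 1, "complete"),
    (2, "2018-04-01", 1, "confirm"),
    (3, "2018-05-01", 1, "cancel"),
    (4, "2018-06-01", 2, "missed"),
    (5, "2019-01-15", 2, "complete"),
])

rows = conn.execute("""
    SELECT a.id_skill,
           CAST(strftime('%Y', a.ap_meet_date) AS INTEGER) AS year,
           s.skill,
           -- COUNT ignores the NULLs produced by a CASE without ELSE
           COUNT(CASE WHEN a.ap_status IN ('complete', 'confirm') THEN 1 END) AS count_comp_conf,
           -- SUM over a 1/0 flag is the equivalent of SUM(IF(...)) in MySQL
           SUM(CASE WHEN a.ap_status = 'cancel' THEN 1 ELSE 0 END) AS count_cancel,
           COUNT(CASE WHEN a.ap_status = 'missed' THEN 1 END) AS count_missed
    FROM appointment a
    JOIN skills_data s ON a.id_skill = s.id_skill
    GROUP BY a.id_skill, year, s.skill
    ORDER BY year DESC, a.id_skill
""").fetchall()

for row in rows:
    print(row)
```

Note that a zero only appears for a skill and year that has at least one appointment of some status; a combination with no rows at all does not show up.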
With the query below you will get the output.
select id_skill,
year,
skill,
sum(is_comp_conf) as count_comp_conf,
sum(is_cancel) as count_cancel,
sum(is_missed) as count_missed
from ( select a.id_skill, year(a.ap_meet_date) as year, s.skill,
if(a.ap_status in ('complete', 'confirm'), 1, 0) as is_comp_conf,
if(a.ap_status = 'cancel', 1, 0) as is_cancel,
if(a.ap_status = 'missed', 1, 0) as is_missed
from appointment a join skills_data s on (a.id_skill = s.id_skill) ) t
group by id_skill, year, skill
order by year desc;

How to add a null value in the group_concat if the record does not exist?

I have this SQL query with substring_index and group_concat, but the values in the result end up in the wrong columns because some records do not exist.
I need to add a null or zero value so that each value lands in the right position in the SQL result.
In the table there are three lid values (1, 2, 3). The lid should determine which of the P columns (P1, P2, P3) a value goes into.
This is the table:
lid class_id class total
----- ------- ----- -----
1 73 Leader 10000
1 77 Consultant 8000
1 83 Coordinator 6000
2 73 Leader 20000
2 76 Staff 8000
2 77 Consultant 10000
3 73 Leader 30000
3 78 Team Leader 8000
This is the SQL query I used: group_concat collects the totals, and substring_index splits the grouped values into their columns (P1, P2, P3).
SELECT *, GROUP_CONCAT(lid) as lids, GROUP_CONCAT(pyear) as pyears,
COUNT(DISTINCT lib_id) AS total_count,
CASE WHEN COUNT(*)>=1 THEN SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(if(total is null,0,total) ORDER BY lid ASC SEPARATOR ' '),' ',1),' ',-1) END AS P1,
CASE WHEN COUNT(*)>=2 THEN SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(if(total is null,0,total) ORDER BY lid ASC SEPARATOR ' '),' ',2),' ',-1) END AS P2,
CASE WHEN COUNT(*)>=3 THEN SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(if(total is null,0,total) ORDER BY lid ASC SEPARATOR ' '),' ',3),' ',-1) END AS P3
FROM (
SELECT * FROM view_items WHERE lid='1'
UNION
SELECT * FROM view_items WHERE lid='2'
UNION
SELECT * FROM view_items WHERE lid='3'
) AS AZ GROUP BY class_id
This is the result of the above query:
class id class lids P1 P2 P3
--------- ----- ----- ---- ---- ----
73 Leader 1,2,3 10000 20000 30000
76 Staff 2 8000
77 Consultant 1,2 8000 10000
78 Team Leader 3 8000
83 Coordinator 1 6000
The lids should always contain three entries: where a record does not exist, a zero or null value should be added. How can I add that value?
This is the expected result I need.
class id class lids P1 P2 P3
--------- ----- ----- ---- ---- ----
73 Leader 1,2,3 10000 20000 30000
76 Staff 0,2,0 0 8000 0
77 Consultant 1,2,0 8000 10000 0
78 Team Leader 0,0,3 0 0 8000
83 Coordinator 1,0,0 6000 0 0
To get 0 values where no lid is present in the table, you need to generate a list of all lid values for all class_id values, which you can do with a CROSS JOIN of two SELECT DISTINCT queries (one for lid, one for class_id). This can then be LEFT JOINed to the table, to get the required total value for each P group using conditional aggregation:
SELECT c.class_id,
MAX(v.class),
GROUP_CONCAT(COALESCE(v.lid, 0) ORDER BY l.lid) AS lids,
MAX(CASE WHEN v.lid=1 THEN total ELSE 0 END) AS P1,
MAX(CASE WHEN v.lid=2 THEN total ELSE 0 END) AS P2,
MAX(CASE WHEN v.lid=3 THEN total ELSE 0 END) AS P3
FROM (SELECT DISTINCT lid FROM view_items) l
CROSS JOIN (SELECT DISTINCT class_id FROM view_items) c
LEFT JOIN view_items v ON v.lid = l.lid AND v.class_id = c.class_id
GROUP BY c.class_id
Output:
class_id class lids P1 P2 P3
73 Leader 1,2,3 10000 20000 30000
76 Staff 0,2,0 0 8000 0
77 Consultant 1,2,0 8000 10000 0
78 Team Leader 0,0,3 0 0 8000
83 Coordinator 1,0,0 6000 0 0
Demo on dbfiddle
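The CROSS JOIN scaffold is easier to see before aggregation. The sketch below lists one row per (class_id, lid) pair, with zeros where view_items has no record, and then builds the lids string in Python. It assumes Python's sqlite3 module in place of MySQL and a view_items table holding only the four columns shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE view_items (lid INT, class_id INT, class TEXT, total INT)")
conn.executemany("INSERT INTO view_items VALUES (?, ?, ?, ?)", [
    (1, 73, "Leader", 10000), (1, 77, "Consultant", 8000),
    (1, 83, "Coordinator", 6000), (2, 73, "Leader", 20000),
    (2, 76, "Staff", 8000), (2, 77, "Consultant", 10000),
    (3, 73, "Leader", 30000), (3, 78, "Team Leader", 8000),
])

# Every lid paired with every class_id, then LEFT JOINed back to the data:
# missing combinations survive as NULLs, which COALESCE turns into 0.
rows = conn.execute("""
    SELECT c.class_id,
           COALESCE(v.lid, 0)   AS lid,
           COALESCE(v.total, 0) AS total
    FROM (SELECT DISTINCT lid FROM view_items) AS l
    CROSS JOIN (SELECT DISTINCT class_id FROM view_items) AS c
    LEFT JOIN view_items AS v
           ON v.lid = l.lid AND v.class_id = c.class_id
    ORDER BY c.class_id, l.lid
""").fetchall()

# Collapse the scaffold into one "lids" string per class, in lid order.
lids_by_class = {}
for class_id, lid, _total in rows:
    lids_by_class.setdefault(class_id, []).append(str(lid))
lids_by_class = {k: ",".join(v) for k, v in lids_by_class.items()}

print(lids_by_class)
```

The scaffold always has 3 rows per class here, which is why every lids value ends up with exactly three entries.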
Use else 0 in case expression
SELECT *, GROUP_CONCAT(lid) as lids, GROUP_CONCAT(pyear) as pyears,
COUNT(DISTINCT lib_id) AS total_count,
CASE WHEN COUNT(*)>=1 THEN SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(if(total is null,0,total) ORDER BY lid ASC SEPARATOR ' '),' ',1),' ',-1) else 0 END AS P1,
CASE WHEN COUNT(*)>=2 THEN SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(if(total is null,0,total) ORDER BY lid ASC SEPARATOR ' '),' ',2),' ',-1) else 0 END AS P2,
CASE WHEN COUNT(*)>=3 THEN SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(if(total is null,0,total) ORDER BY lid ASC SEPARATOR ' '),' ',3),' ',-1) else 0 END AS P3
FROM (
SELECT * FROM view_items WHERE lid='1'
UNION
SELECT * FROM view_items WHERE lid='2'
UNION
SELECT * FROM view_items WHERE lid='3'
) AS AZ GROUP BY class_id
I think the problem is that you say count >= 1, but count >= 3 is also count >= 1, so your code never reaches this line. You have to say:
CASE WHEN COUNT(*) >= 1 AND COUNT(*) < 2...
CASE WHEN COUNT(*) >= 2 AND COUNT(*) < 3...
I don't think GROUP_CONCAT() is the right approach for what you want. Try this:
SELECT vi.class_id, vi.class,
COUNT(DISTINCT vi.lib_id) AS total_count,
CONCAT_WS(',',
MAX(CASE WHEN vi.lid = 1 THEN vi.lid ELSE 0 END),
MAX(CASE WHEN vi.lid = 2 THEN vi.lid ELSE 0 END),
MAX(CASE WHEN vi.lid = 3 THEN vi.lid ELSE 0 END)
) as lids,
SUM(CASE WHEN vi.lid = 1 THEN vi.total ELSE 0 END) as total_1,
SUM(CASE WHEN vi.lid = 2 THEN vi.total ELSE 0 END) as total_2,
SUM(CASE WHEN vi.lid = 3 THEN vi.total ELSE 0 END) as total_3
FROM view_items vi
WHERE vi.lid IN (1, 2, 3)
GROUP BY vi.class_id;
Notes:
Your subquery and UNION are not needed. In MySQL, these can actually hurt performance.
I assume that lid is a number, so I've removed the single quotes.
You can use conditional aggregation for each of the totals that you want. Parsing GROUP_CONCAT() is not the right way to do this.
Your question is about the lids. The CONCAT_WS() does what you want -- concatenating either the value (if it appears) or zero if it does not.