Averaging an average in MySQL - mysql

I have a table
car
id | person_id | mpg
------------------------
4 | 1 | 50
5 | 1 | 15
6 | 2 | 10
7 | 2 | 28
8 | 3 | 33
I need to get an average of each person's mpg and then an average for the group.
person 1 avg = (50 + 15) / 2 = 32.5
person 2 avg = (10 + 28) / 2 = 19
person 3 avg = 33
group average = (32.5 + 19 + 33) / 3 = 28.17
Is there a query that will do what I need?

SELECT person_id, AVG(mpg) from car group by person_id;
If you want to get an average for the group, you should probably do this:
SELECT AVG(mpg) from car;
Unless you really want to average the averages, which seems a bit dubious to me:
SELECT AVG(average) from (SELECT person_id, AVG(mpg) as average from car group by person_id) as per_person;
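To see how the two numbers differ on the sample data, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere), and the derived-table alias per_person is just an illustrative name.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (id INTEGER PRIMARY KEY, person_id INTEGER, mpg REAL)")
conn.executemany(
    "INSERT INTO car (id, person_id, mpg) VALUES (?, ?, ?)",
    [(4, 1, 50), (5, 1, 15), (6, 2, 10), (7, 2, 28), (8, 3, 33)],
)

# Average per person.
per_person = conn.execute(
    "SELECT person_id, AVG(mpg) FROM car GROUP BY person_id ORDER BY person_id"
).fetchall()

# Average over all rows: every car counts equally.
overall_avg = conn.execute("SELECT AVG(mpg) FROM car").fetchone()[0]

# Average of the per-person averages: every person counts equally.
avg_of_avgs = conn.execute(
    "SELECT AVG(average) FROM "
    "(SELECT person_id, AVG(mpg) AS average FROM car GROUP BY person_id) AS per_person"
).fetchone()[0]

print(per_person)
print(overall_avg)
print(avg_of_avgs)
```

The overall average is 136 / 5 = 27.2, while the average of the averages is 84.5 / 3, roughly 28.17. They differ because person 3 has only one car, which carries more weight in the second calculation.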

You cannot solve this in one query; you have to use two queries, or one query and then compute the overall average in your code:
select person_id, avg(mpg) from car group by person_id

SELECT person_id, AVG(mpg) AS mpg_avg FROM car GROUP BY person_id WITH ROLLUP
The WITH ROLLUP modifier adds a row to the result set where person_id is NULL and mpg_avg is the average over all rows of the table, not the average of the per-person averages (MySQL >= 4.1.1):
person_id | mpg_avg
------------------
1 | 32.5
2 | 19.0
3 | 33.0
NULL | 27.2

Related

How do I get results of a MySQL JOIN where records meet a value criteria in joined table?

This may be simple but I can't figure it out...
I have two tables:
tbl_results:
runID | balance |
1 | 3432
2 | 5348
3 | 384
tbl_phases:
runID_fk | pc |
1 | 34
1 | 2
1 | 18
2 | 15
2 | 18
2 | 20
3 | -20
3 | 10
3 | 60
I want to get a recordset of: runID, balance, min(pc), max(pc) only where pc>10 and pc<50 for each runID as a group, excluding runIDs where any associated pc value is outside of value range.
I would want the following results from what's described above:
runID | balance | min_pc | max_pc
2 | 5348 | 15 | 20
... because runID=1&3 have pc values that fall outside the numeric range for pc noted above.
Thanks in advance!
You may apply filters based on your requirements in your having clause. You may try the following.
Query #1
SELECT
r.runID,
MAX(r.balance) as balance,
MIN(p.pc) as min_pc,
MAX(p.pc) as max_pc
FROM
tbl_results r
INNER JOIN
tbl_phases p ON p.runID_fk = r.runID
GROUP BY
r.runID
HAVING
MIN(p.pc)>10 AND MAX(p.pc) < 50;
runID | balance | min_pc | max_pc
2 | 5348 | 15 | 20
Query #2
SELECT
r.runID,
MAX(r.balance) as balance,
MIN(p.pc) as min_pc,
MAX(p.pc) as max_pc
FROM
tbl_results r
INNER JOIN
tbl_phases p ON p.runID_fk = r.runID
GROUP BY
r.runID
HAVING
COUNT(CASE WHEN p.pc <= 10 or p.pc >= 50 THEN 1 END) =0;
runID | balance | min_pc | max_pc
2 | 5348 | 15 | 20
View working demo on DB Fiddle
Updated with comments from Rahul Biswas
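Both HAVING variants can be checked against the sample data with a small sketch. Python's standard-library sqlite3 module is used here as a stand-in for MySQL (an assumption), and the column types are guesses based on the sample rows.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_results (runID INTEGER PRIMARY KEY, balance INTEGER);
CREATE TABLE tbl_phases (runID_fk INTEGER, pc INTEGER);
INSERT INTO tbl_results VALUES (1, 3432), (2, 5348), (3, 384);
INSERT INTO tbl_phases VALUES
  (1, 34), (1, 2), (1, 18), (2, 15), (2, 18), (2, 20), (3, -20), (3, 10), (3, 60);
""")

# Shared query; only the HAVING condition changes between the two variants.
base = """
SELECT r.runID, MAX(r.balance), MIN(p.pc), MAX(p.pc)
FROM tbl_results r
INNER JOIN tbl_phases p ON p.runID_fk = r.runID
GROUP BY r.runID
HAVING {condition}
"""

# Variant 1: the whole group is in range if its extremes are in range.
by_min_max = conn.execute(
    base.format(condition="MIN(p.pc) > 10 AND MAX(p.pc) < 50")
).fetchall()

# Variant 2: count the out-of-range rows and require that there are none.
by_count = conn.execute(
    base.format(condition="COUNT(CASE WHEN p.pc <= 10 OR p.pc >= 50 THEN 1 END) = 0")
).fetchall()

print(by_min_max)
print(by_count)
```

The counting variant is a little more general, since the CASE condition can express rules that are not a simple range.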

MySql Sum different types of expenses from 'expense' table based on value in 'expense type' group by employee

A more generic title for this post would be
MySql Sum different columns in same table based on value of another row, group by yet another row
I have a table of employee expenses:
id | employee_id | expense_cat_id | expense_amount |
1 | 11 | 1 | 100 |
2 | 11 | 1 | 200 |
3 | 12 | 1 | 120 |
4 | 12 | 1 | 140 |
5 | 11 | 2 | 5 |
6 | 12 | 2 | 8 |
and I want to produce a report like this:
Employee Id | Expense Cat 1 Total Amount | Expense Cat 2 Total Amount
11 | 300 | 5
12 | 260 | 8
So initially I thought I could use 2 table aliases for the same table like this:
SELECT
employee_id,
sum(expense_cat_1.expense_amount) as expense_1_total,
sum(expense_cat_2.expense_amount) as expense_2_total
FROM
expenses as expense_cat_1 where expense_cat_1.expense_cat_id=1 ,
expenses as expense_cat_2 where expense_cat_2.expense_cat_id=2
group by employee_id
but this was not correct Sql Syntax, which makes sense to me.
So I thought I could do two joins between the employees table and the expenses table:
SELECT
employees.id as employee_id,
sum(expenses_cat_1.expense_amount) as expense_1_total,
sum(expenses_cat_2.expense_amount) as expense_2_total
FROM employees
join expenses as expenses_cat_1 on employees.id = expenses_cat_1.employee_id and expenses_cat_1.expense_cat_id=1
join expenses as expenses_cat_2 on employees.id = expenses_cat_2.employee_id and expenses_cat_2.expense_cat_id=2
group by employees.id
Which comes close, but is wrong:
employee_id | expense_1_total | expense_2_total
11 | 300 | 10
12 | 260 | 16
as the expense 2 total is doubled! I think this is because each employee has two category 1 rows, so the join pairs the single category 2 row with each of them and it gets summed twice.
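The doubling can be made visible by listing the joined rows before they are summed. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than MySQL, and the one-column employees table is an assumed schema, since the post does not show it.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY);  -- assumed schema
CREATE TABLE expenses (id INTEGER PRIMARY KEY, employee_id INTEGER,
                       expense_cat_id INTEGER, expense_amount INTEGER);
INSERT INTO employees VALUES (11), (12);
INSERT INTO expenses VALUES
  (1, 11, 1, 100), (2, 11, 1, 200), (3, 12, 1, 120),
  (4, 12, 1, 140), (5, 11, 2, 5), (6, 12, 2, 8);
""")

# Same two joins as in the query above, but without GROUP BY / SUM.
joined_rows = conn.execute("""
SELECT e.id, c1.id, c1.expense_amount, c2.id, c2.expense_amount
FROM employees e
JOIN expenses c1 ON e.id = c1.employee_id AND c1.expense_cat_id = 1
JOIN expenses c2 ON e.id = c2.employee_id AND c2.expense_cat_id = 2
ORDER BY e.id, c1.id
""").fetchall()

for row in joined_rows:
    print(row)

# Summing the category 2 column over these rows reproduces the doubled totals.
cat2_totals = {}
for employee_id, _, _, _, cat2_amount in joined_rows:
    cat2_totals[employee_id] = cat2_totals.get(employee_id, 0) + cat2_amount
print(cat2_totals)
```

Each category 2 expense row (ids 5 and 6) appears twice in the joined output, once per category 1 row of the same employee.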
I also tried a sub-query approach:
SELECT (SELECT sum(expense_amount)
FROM expenses
WHERE expense_cat_id = 1) AS sum1 ,
(SELECT sum(expense_amount)
FROM expenses
WHERE expense_cat_id = 2) AS sum2,
employee_id
FROM expenses group by employee_id
but this has the same problem as the join approach - totals for cat 2 are doubled.
How do I make the second join only include the expense_2_total once ???
I have a personal dislike of sql case statements as they seem more of a procedural language construct (and sql is declarative), but am happy to consider their use in this case - but I put the challenge out there for sql experts to solve this elegantly.
You are looking for conditional aggregation:
SELECT employee_id,
sum(case when expense_cat_id = 1 then expense_amount else 0 end) as expense_1_total,
sum(case when expense_cat_id = 2 then expense_amount else 0 end) as expense_2_total
FROM expenses e
GROUP BY employee_id;
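As a quick check, the conditional aggregation can be run against the sample rows. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption); the SQL itself is the query from this answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE expenses (id INTEGER PRIMARY KEY, employee_id INTEGER,
                       expense_cat_id INTEGER, expense_amount INTEGER);
INSERT INTO expenses VALUES
  (1, 11, 1, 100), (2, 11, 1, 200), (3, 12, 1, 120),
  (4, 12, 1, 140), (5, 11, 2, 5), (6, 12, 2, 8);
""")

# One pass over the table; each SUM only counts rows of its own category.
totals = conn.execute("""
SELECT employee_id,
       SUM(CASE WHEN expense_cat_id = 1 THEN expense_amount ELSE 0 END) AS expense_1_total,
       SUM(CASE WHEN expense_cat_id = 2 THEN expense_amount ELSE 0 END) AS expense_2_total
FROM expenses
GROUP BY employee_id
ORDER BY employee_id
""").fetchall()

print(totals)
```

Because the table is read once and never joined to itself, no row can be counted twice.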

select average from average

I have the following table:
rating
--------
| id | account_id | room | kitchen | bathroom |
-----------------------------------------------
| 1 | 1 | 5 | 5 | 5 |
| 2 | 1 | 2 | 4 | 1 |
| 3 | 1 | 5 | 2 | 1 |
-----------------------------------------------
People can rate the room, kitchen and bathroom (from 1-5).
Average rating for ID = 1: 5 (because 15/3 = 5)
Average rating for ID = 2: 2.3333 (because 7/3 = 2.33333)
Average rating for ID = 3: 2.6667 (because 8/3 = 2.66667)
First question
As you can see, the average rating for ID = 2 => 2.3333... and for ID = 3 => 2.6666. How can I make it floor() and ceil()? (when < .5 => floor, when > .5 => ceil), so that the avg rating for ID = 2 becomes 2 (instead of 2.3333) and the avg rating for ID = 3 becomes 3 (instead of 2.6666...)
Second question
I want to select the average rating of the average ratings (so the average rating from all the rows together). So - when floor() and ceil() are used I have 3 average ratings: 5, 2 and 3 => (5 + 2 + 3)/3 = 10/3 = 3.33 => 3. How do I get to the 3?
Thanks in advance!
For the first question, the answer is round():
select round((room + kitchen + bathroom) / 3)
from rating;
For the second, you would just use aggregation:
select avg((room + kitchen + bathroom) / 3)
from rating;
If you want the average of the rounded results:
select round(avg(round((room + kitchen + bathroom) / 3)))
from rating;
However, that seems strange to me.
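A small sketch of the rounding on the sample rows, run on SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption). One difference matters here: SQLite's / performs integer division on integers, so the sketch divides by 3.0, whereas MySQL's / already returns a fractional result.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rating (id INTEGER PRIMARY KEY, account_id INTEGER,
                     room INTEGER, kitchen INTEGER, bathroom INTEGER);
INSERT INTO rating VALUES (1, 1, 5, 5, 5), (2, 1, 2, 4, 1), (3, 1, 5, 2, 1);
""")

# Rounded average per row (3.0 forces fractional division in SQLite).
per_row = conn.execute(
    "SELECT id, ROUND((room + kitchen + bathroom) / 3.0) FROM rating ORDER BY id"
).fetchall()

# Average of the rounded per-row averages, rounded again.
avg_of_rounded = conn.execute(
    "SELECT ROUND(AVG(ROUND((room + kitchen + bathroom) / 3.0))) FROM rating"
).fetchone()[0]

print(per_row)
print(avg_of_rounded)
```

The per-row values come out as 5, 2 and 3, and their average 10/3 rounds to 3, which is the number asked for in the second question.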
Use ROUND function.
Query
SELECT id, ROUND(((room + kitchen + bathroom)/3), 0) as `average`
FROM rating
GROUP BY id;

retrieve value of maximum occurrence in a table

I have a complicated problem. Let me first explain what I am doing right now:
I have a table named feedback in which I am storing grades against course id. The table looks like this:
+-------+-------+-------+-------+-----------+--------------
| id | cid | grade |g_point| workload | easiness
+-------+-------+-------+-------+-----------+--------------
| 1 | 10 | A+ | 1 | 5 | 4
| 2 | 10 | A+ | 1 | 2 | 4
| 3 | 10 | B | 3 | 3 | 3
| 4 | 11 | B+ | 2 | 2 | 3
| 5 | 11 | A+ | 1 | 5 | 4
| 6 | 12 | B | 3 | 3 | 3
| 7 | 11 | B+ | 2 | 7 | 8
| 8 | 11 | A+ | 1 | 1 | 2
g_point has just specific values for the grades, thus I can use these values to show the user courses sorted by grades.
Okay, now first my task is to print out the grade of each course. The grade can be calculated by the maximum occurrence against each course. For example from this table we can see the result of cid = 10 will be A+, because it is present two times there. This is simple. I have already implemented this query which I will write here in the end.
The main problem is the course cid = 11, where two different grades occur equally often. In that situation the client asks me to take the average of workload and easiness for each of these grades, and whichever grade has the greater average should be shown. The average would be computed like this:
(average of all workload values of the grade against the course
+ average of all easiness values of the grade against the course)
/ 2
In this example cid = 11 has four entries, and its two grades occur an equal number of times:
B+ grade average
avg workload = (2 + 7)/2 = 4.5 = x
avg easiness = (3 + 8)/2 = 5.5 = y
answer = (x + y)/2 = 5
A+ grade average
avg workload = (5 + 1)/2 = 3 = x
avg easiness = (4 + 2)/2 = 3 = y
answer = (x + y)/2 = 3
so the grade should be B+.
This is the query which I am running to get the max occurrence grade
SELECT
f3.coursecodeID cid,
f3.grade_point p,
f3.grade g
FROM (
SELECT
coursecodeID,
MAX(mode_qty) mode_qty
FROM (
SELECT
coursecodeID,
COUNT(grade_point) mode_qty
FROM feedback
GROUP BY
coursecodeID, grade_point
) f1
GROUP BY coursecodeID
) f2
INNER JOIN (
SELECT
coursecodeID,
grade_point,
grade,
COUNT(grade_point) mode_qty
FROM feedback
GROUP BY
coursecodeID, grade_point
) f3
ON
f2.coursecodeID = f3.coursecodeID AND
f2.mode_qty = f3.mode_qty
GROUP BY f3.coursecodeID
ORDER BY f3.grade_point
Here is SQL Fiddle.
I added a table Courses with the list of all course IDs, to make the main idea of the query easier to see. Most likely you have it in the real database. If not, you can generate it on the fly from feedback by grouping by cid.
For each cid we need to find the grade. Group feedback by cid, grade to get a list of all grades for the cid. We need to pick only one grade for a cid, so we use LIMIT 1. To determine which grade to pick we order them. First, by occurrence - simple COUNT. Second, by the average score. Finally, if there are several grades that have the same occurrence and the same average score, then pick the grade with the smallest g_point. You can adjust the rules by tweaking the ORDER BY clause.
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
FROM courses
ORDER BY courses.cid
result set
cid CourseGrade
10 A+
11 B+
12 B
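The ordering rules can be tried out on the sample rows with the sketch below. It runs on SQLite through Python's sqlite3 module instead of MySQL (an assumption), builds the course list on the fly with SELECT DISTINCT rather than from a separate courses table, and uses MIN(g_point) as the last tie-breaker so that every ORDER BY term is an aggregate.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback (id INTEGER PRIMARY KEY, cid INTEGER, grade TEXT, "
    "g_point INTEGER, workload INTEGER, easiness INTEGER)"
)
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "A+", 1, 5, 4), (2, 10, "A+", 1, 2, 4), (3, 10, "B", 3, 3, 3),
        (4, 11, "B+", 2, 2, 3), (5, 11, "A+", 1, 5, 4), (6, 12, "B", 3, 3, 3),
        (7, 11, "B+", 2, 7, 8), (8, 11, "A+", 1, 1, 2),
    ],
)

# For each course, pick the single best grade according to the ORDER BY rules.
course_grades = conn.execute("""
SELECT c.cid,
       (SELECT f.grade
        FROM feedback f
        WHERE f.cid = c.cid
        GROUP BY f.grade
        ORDER BY COUNT(*) DESC,                                  -- most frequent first
                 (AVG(f.workload) + AVG(f.easiness)) / 2 DESC,   -- then highest score
                 MIN(f.g_point)                                  -- then smallest g_point
        LIMIT 1) AS course_grade
FROM (SELECT DISTINCT cid FROM feedback) AS c
ORDER BY c.cid
""").fetchall()

print(course_grades)
```

For cid = 11 both grades occur twice, so the second ORDER BY term decides: B+ scores 5 and A+ scores 3.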
UPDATE
MySQL doesn't have lateral joins (LATERAL derived tables were only added in MySQL 8.0.14), so one possible way to get the second column g_point is to repeat the correlated sub-query. SQL Fiddle
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
,(
SELECT feedback.g_point
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGPoint
FROM courses
ORDER BY CourseGPoint
result set
cid CourseGrade CourseGPoint
10 A+ 1
11 B+ 2
12 B 3
Update 2 Added average score into ORDER BY SQL Fiddle
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
,(
SELECT feedback.g_point
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGPoint
,(
SELECT (AVG(workload) + AVG(easiness))/2
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS AvgScore
FROM courses
ORDER BY CourseGPoint, AvgScore DESC
result
cid CourseGrade CourseGPoint AvgScore
10 A+ 1 3.75
11 B+ 2 5
12 B 3 3
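On MySQL 8.0 or later, window functions and common table expressions offer a way to avoid repeating the correlated sub-query three times. The following is a sketch of that alternative, not the method used in this answer. It runs on SQLite (3.25 or newer, for window functions) through Python's sqlite3 module as a stand-in for MySQL, and the names per_grade, ranked, occurrences and rn are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback (id INTEGER PRIMARY KEY, cid INTEGER, grade TEXT, "
    "g_point INTEGER, workload INTEGER, easiness INTEGER)"
)
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "A+", 1, 5, 4), (2, 10, "A+", 1, 2, 4), (3, 10, "B", 3, 3, 3),
        (4, 11, "B+", 2, 2, 3), (5, 11, "A+", 1, 5, 4), (6, 12, "B", 3, 3, 3),
        (7, 11, "B+", 2, 7, 8), (8, 11, "A+", 1, 1, 2),
    ],
)

best_grades = conn.execute("""
WITH per_grade AS (
  -- one row per (course, grade) with its frequency and average score
  SELECT cid, grade, MIN(g_point) AS g_point, COUNT(*) AS occurrences,
         (AVG(workload) + AVG(easiness)) / 2 AS avg_score
  FROM feedback
  GROUP BY cid, grade
),
ranked AS (
  -- rank the grades inside each course using the same ordering rules
  SELECT per_grade.*,
         ROW_NUMBER() OVER (PARTITION BY cid
                            ORDER BY occurrences DESC, avg_score DESC, g_point) AS rn
  FROM per_grade
)
SELECT cid, grade, g_point, avg_score
FROM ranked
WHERE rn = 1
ORDER BY g_point, avg_score DESC
""").fetchall()

print(best_grades)
```

The ordering rules are written once, and adding another output column only means selecting it from per_grade.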
If I understood correctly, you need an inner select to find the averages and an outer select to find the maximum of those averages:
select cid, grade, max(average)/2 from (
select cid, grade, avg(workload + easiness) as average
from feedback
group by cid, grade
) x group by cid, grade
This solution has been tested on your data using SQL Fiddle at this link
If you change the previous query to
select cid, max(average)/2 from (
select cid, grade, avg(workload + easiness) as average
from feedback
group by cid, grade
) x group by cid
You will find the max average for each cid.
As mentioned in the comments, you have to choose which strategy to use if more than one grade meets the max average. For example if you have
+-------+-------+-------+-------+-----------+--------------
| id | cid | grade |g_point| workload | easiness
+-------+-------+-------+-------+-----------+--------------
| 1 | 10 | A+ | 1 | 5 | 4
| 2 | 10 | A+ | 1 | 2 | 4
| 3 | 10 | B | 3 | 3 | 3
| 4 | 11 | B+ | 2 | 2 | 3
| 5 | 11 | A+ | 1 | 5 | 4
| 9 | 11 | C | 1 | 3 | 6
For cid = 11 you will have grades A+ and C both satisfying the maximum average of 4.5.
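The tie can be reproduced with a sketch that returns every grade reaching the maximum average of its course, leaving the choice of tie-break strategy open. It runs on SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption); per_grade, score and best are illustrative names.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback (id INTEGER PRIMARY KEY, cid INTEGER, grade TEXT, "
    "g_point INTEGER, workload INTEGER, easiness INTEGER)"
)
# The six example rows shown above, including the extra C row for cid = 11.
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "A+", 1, 5, 4), (2, 10, "A+", 1, 2, 4), (3, 10, "B", 3, 3, 3),
        (4, 11, "B+", 2, 2, 3), (5, 11, "A+", 1, 5, 4), (9, 11, "C", 1, 3, 6),
    ],
)

tied_grades = conn.execute("""
WITH per_grade AS (
  SELECT cid, grade, AVG(workload + easiness) / 2 AS score
  FROM feedback
  GROUP BY cid, grade
)
SELECT p.cid, p.grade, p.score
FROM per_grade p
JOIN (SELECT cid, MAX(score) AS best FROM per_grade GROUP BY cid) m
  ON m.cid = p.cid AND m.best = p.score   -- keep every grade that hits the maximum
ORDER BY p.cid, p.grade
""").fetchall()

print(tied_grades)
```

Course 11 comes back with two rows, so a further rule (for example the smallest g_point) is still needed to pick one.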

Update with SUM and LIMIT, rolling SUM

I have 2 tables, SVISE and OVERW
Inside OVERW I have some scores with person ids and the date of that score.
E.g
p_id degrees mo_date
5 10.2 2013-10-09
5 9.85 2013-03-10
8 14.75 2013-04-25
8 11.00 2013-02-22
5 5.45 2013-08-11
5 6.2 2013-06-10
SVISE.ofh field must be updated with the sum of the last three records
(for a specific person, ordered by date descending), so for person with id 5, the sum would result from the rows
5 10.2 2013-10-09
5 5.45 2013-08-11
5 6.2 2013-06-10
sum=21.85.
Desired final result on SVISE, based on the values above:
HID OFH START
5 21.85 October, 16 2013 ##(10.2 + 5.45 + 6.2)
5 21.5 September, 07 2013 ##(5.45 + 6.2 + 9.85)
5 0 March, 05 2013 ##(no rows)
8 25.75 October, 14 2013 ##(14.75 + 11)
3 0 October, 14 2013 ##(no rows)
5 0 March, 05 2012 ##(no rows)
OFH was 0 initially.
I can get the total sum for a specific person, but I can't use limit to get the last 3 rows. It gets ignored.
This is the query I use to retrieve the sum of all degrees per person for a given date:
UPDATE SVISE SV
SET
SV.ofh=(SELECT sum(degrees) FROM OVERW WHERE p_id =SV.hid
AND date(mo_date)<date(SV.start)
AND year(mo_date)=year(SV.start))
I cannot just use limit with sum:
UPDATE SVISE SV
SET
SV.ofh=(SELECT sum(degrees) FROM OVERW WHERE p_id =SV.hid
AND date(mo_date)<date(SV.start)
AND year(mo_date)=year(SV.start)
ORDER BY mo_date DESC
LIMIT 3)
This does not work.
I have tried with multi-table updates and with nested queries to achieve this.
Every scenario has known limitations that block me from accomplishing the desired result.
Nested queries can't see the parent table: Unknown column 'SV.hid' in 'where clause'
Multi-table updates can't be used with LIMIT: Incorrect usage of UPDATE and LIMIT
Any solution will do. There is no need to do it in a single query. If anyone wants to try even with an intermediate table.
An SQL fiddle is also available.
Thanks in advance for your help.
--Update--
Here is the solution from Akash: http://sqlfiddle.com/#!2/4cf1a/1
This should work,
UPDATED to have a join on svice
UPDATE
svice SV
JOIN (
SELECT
hid,
start,
sum(degrees) as degrees
FROM
(
SELECT
*,
IF(@prev_row <> unix_timestamp(start)+P_ID, @row_number:=0,NULL),
@prev_row:=unix_timestamp(start)+P_ID,
@row_number:=@row_number+1 as row_number
FROM
(
SELECT
mo_date,
p_id,
hid,
start,
degrees
FROM
OVERW
JOIN svice sv ON ( p_id = hid
AND date(mo_date)<date(SV.start)
AND year(mo_date)=year(SV.start) )
ORDER BY
hid,
start,
mo_date desc
) sub_query1
JOIN ( select @row_number:=0, @prev_row:=0 ) sub_query2
) sub_query
where
row_number <= 3
GROUP BY
hid,
start
) sub_query ON ( sub_query.hid = sv.hid AND sub_query.start = sv.start )
SET
SV.ofh = sub_query.degrees
Note: Check this with your updated data, the test data provided could not yield the results you expected due to the date conditions
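Since the question allows more than one statement, another option is to compute the sums first and apply them afterwards. The sketch below is an alternative under assumptions, not the query from this answer: it runs on SQLite (3.25 or newer) through Python's sqlite3 module, stores dates as ISO text, compares years with substr() where MySQL would use year(), and ranks rows with ROW_NUMBER(), which in MySQL needs version 8.0.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE svise (hid INTEGER, ofh REAL DEFAULT 0, start TEXT);
CREATE TABLE overw (p_id INTEGER, degrees REAL, mo_date TEXT);
INSERT INTO svise (hid, start) VALUES
  (5, '2013-10-16'), (5, '2013-09-07'), (5, '2013-03-05'),
  (8, '2013-10-14'), (3, '2013-10-14'), (5, '2012-03-05');
INSERT INTO overw VALUES
  (5, 10.2, '2013-10-09'), (5, 9.85, '2013-03-10'), (8, 14.75, '2013-04-25'),
  (8, 11.00, '2013-02-22'), (5, 5.45, '2013-08-11'), (5, 6.2, '2013-06-10');
""")

# Step 1: for every svise row, number its matching overw rows from newest to
# oldest, then sum only the first three (0 when there are no matching rows).
sums = conn.execute("""
WITH ranked AS (
  SELECT s.hid, s.start, o.degrees,
         ROW_NUMBER() OVER (PARTITION BY s.hid, s.start
                            ORDER BY o.mo_date DESC) AS rn
  FROM svise s
  JOIN overw o ON o.p_id = s.hid
              AND o.mo_date < s.start
              AND substr(o.mo_date, 1, 4) = substr(s.start, 1, 4)
)
SELECT s.hid, s.start, COALESCE(SUM(r.degrees), 0)
FROM svise s
LEFT JOIN ranked r ON r.hid = s.hid AND r.start = s.start AND r.rn <= 3
GROUP BY s.hid, s.start
""").fetchall()

# Step 2: write the sums back, one UPDATE per svise row.
conn.executemany(
    "UPDATE svise SET ofh = ? WHERE hid = ? AND start = ?",
    [(total, hid, start) for hid, start, total in sums],
)

result = {
    (hid, start): round(ofh, 2)
    for hid, start, ofh in conn.execute("SELECT hid, start, ofh FROM svise")
}
print(result)
```

On the sample data this reproduces the desired values from the question, for example 21.85 for person 5 with start 2013-10-16 and 21.5 with start 2013-09-07.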
Try
UPDATE svice SV
JOIN (SELECT SUM(degrees)sumdeg,p_id FROM(SELECT DISTINCT degrees,p_id FROM OVERW,svice WHERE OVERW.p_id IN (SELECT svice.hid FROM svice)
AND date(mo_date)<date(svice.start)
AND year(mo_date)=year(svice.start)ORDER BY mo_date DESC )deg group by p_id)bbc
ON bbc.p_id=SV.hid
SET
SV.ofh=bbc.sumdeg where p_id =SV.hid
http://sqlfiddle.com/#!2/95b42/42
Getting closer, now it "only" needs a limit in GROUP BY.
Two assumptions:
You can figure out how to turn this into an update, and
A PK exists on (p_id,mo_date)
Then you can do this -
SELECT p_id
, SUM(degrees) ttl
FROM
( SELECT x.*
FROM overw x
JOIN overw y
ON y.p_id = x.p_id
AND y.mo_date >= x.mo_date
GROUP
BY x.p_id
, x.mo_date HAVING COUNT(*) <= 3
) a
GROUP
BY p_id;
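The self-join idea can be checked on the sample rows: a row is kept when at most three rows of the same person are at least as recent as it. The sketch runs on SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption); the grouping columns are qualified with x, and MIN(x.degrees) is used so that every selected column is either grouped or aggregated.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE overw (p_id INTEGER, degrees REAL, mo_date TEXT,
                    PRIMARY KEY (p_id, mo_date));
INSERT INTO overw VALUES
  (5, 10.2, '2013-10-09'), (5, 9.85, '2013-03-10'), (8, 14.75, '2013-04-25'),
  (8, 11.00, '2013-02-22'), (5, 5.45, '2013-08-11'), (5, 6.2, '2013-06-10');
""")

latest_three = conn.execute("""
SELECT p_id, SUM(degrees)
FROM (SELECT x.p_id, x.mo_date, MIN(x.degrees) AS degrees
      FROM overw x
      JOIN overw y ON y.p_id = x.p_id
                  AND y.mo_date >= x.mo_date   -- y: rows at least as recent as x
      GROUP BY x.p_id, x.mo_date
      HAVING COUNT(*) <= 3) AS a               -- keep x if it is in the newest three
GROUP BY p_id
ORDER BY p_id
""").fetchall()

# Round for display; the sums are floating-point values.
totals = {p_id: round(total, 2) for p_id, total in latest_three}
print(totals)
```

This form needs no window functions or user variables, at the cost of a self-join whose size grows with the square of the rows per person.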
Maybe I'm slow, but let's ignore svice for now.
Can you show the correct result and the working for each row below...
+------+---------+------------+--------+
| p_id | degrees | mo_date | result |
+------+---------+------------+--------+
| 5 | 6.20 | 2013-06-10 | ? |
| 5 | 5.45 | 2013-08-11 | ? |
| 5 | 10.20 | 2013-10-09 | 21.85 | <- = 10.2+5.45+6.2 = 21.85
| 8 | 14.75 | 2013-04-25 | ? |
| 5 | 9.85 | 2013-03-10 | ? |
| 8 | 11.00 | 2013-02-22 | ? |
+------+---------+------------+--------+