MySQL Max() with group by and multiple tables - mysql

For my studies I need to get some code working. I have two tables:
Training:
UserID
Date
Uebung
Gewicht
Wiederholungen
Mitglied:
UserID
Name
Vorname
I need to display the maximum power, which is 'Wiederholungen' multiplied by 'Gewicht' from the 'Training' table, for EACH user, together with the date and name.
I know there is a "problem" with MAX() and GROUP BY, but I'm fairly new to MySQL and could only find fixes that involve a single table where every column already exists. I have to join two tables AND create the power column.
I tried a lot, and I think this is my best attempt:
select name, vorname, x.power from
(SELECT mitglied.UserID,max( Wiederholungen*Gewicht) as Power
FROM training join mitglied
where Uebung = 'Beinstrecker'
and training.UserID = mitglied.UserID
group by training.UserID) as x
inner join (training, mitglied)
on (training.UserID = mitglied.UserID)
and x.Power = Power;
I get way too many results. I know the last condition (x.Power = Power) is wrong, but I have no clue how to fix it.

This is actually a fairly typical question here, but I am bad at searching for previous answers, so...
You "start" in a subquery, finding those max values:
SELECT UserID, Uebung, MAX(Gewicht*Wiederholungen) AS Power
FROM training
WHERE Uebung = 'Beinstrecker'
GROUP BY UserID, Uebung
Then, you join that back to the table it came from to find the date(s) those maxes occurred:
( SELECT UserID, Uebung, MAX(Gewicht*Wiederholungen) AS Power
FROM training
WHERE Uebung = 'Beinstrecker'
GROUP BY UserID, Uebung
) AS maxes
INNER JOIN training AS t
ON maxes.UserID = t.UserID
AND maxes.Uebung = t.Uebung
AND maxes.Power = (t.Gewicht*t.Wiederholungen)
Finally, you join to mitglied to get information for the user:
SELECT m.name, m.vorname, t.Date, maxes.Power
FROM ( SELECT UserID, Uebung, MAX(Gewicht*Wiederholungen) AS Power
FROM training
WHERE Uebung = 'Beinstrecker'
GROUP BY UserID, Uebung
) AS maxes
INNER JOIN training AS t
ON maxes.UserID = t.UserID
AND maxes.Uebung = t.Uebung
AND maxes.Power = (t.Gewicht*t.Wiederholungen)
INNER JOIN mitglied AS m ON t.UserID = m.UserID
;
Note: t.Uebung = 'Beinstrecker' could be used as a join condition instead, and might be faster; but as a matter of style I try to avoid redundant literals like that unless there is a worthwhile performance difference.
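To try the three steps end to end, here is a minimal sketch that runs the final query against an in-memory SQLite database from Python. SQLite only stands in for MySQL here, and the sample rows and member names are invented for illustration; the query also selects t.Date, since the question asks for the date.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mitglied (UserID INTEGER PRIMARY KEY, Name TEXT, Vorname TEXT);
CREATE TABLE training (UserID INTEGER, Date TEXT, Uebung TEXT,
                       Gewicht INTEGER, Wiederholungen INTEGER);
INSERT INTO mitglied VALUES (1, 'Meier', 'Anna'), (2, 'Schmidt', 'Ben');
INSERT INTO training VALUES
  (1, '2015-01-05', 'Beinstrecker', 40, 10),
  (1, '2015-01-12', 'Beinstrecker', 50, 10),
  (1, '2015-01-12', 'Bankdruecken', 60, 10),
  (2, '2015-01-07', 'Beinstrecker', 30, 12),
  (2, '2015-01-14', 'Beinstrecker', 35, 8);
""")

query = """
SELECT m.Name, m.Vorname, t.Date, maxes.Power
FROM ( SELECT UserID, Uebung, MAX(Gewicht * Wiederholungen) AS Power
       FROM training
       WHERE Uebung = 'Beinstrecker'
       GROUP BY UserID, Uebung
     ) AS maxes
INNER JOIN training AS t
        ON maxes.UserID = t.UserID
       AND maxes.Uebung = t.Uebung
       AND maxes.Power = t.Gewicht * t.Wiederholungen
INNER JOIN mitglied AS m ON t.UserID = m.UserID
ORDER BY m.Name
"""

# One row per user and per date on which that user's maximum occurred.
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If a user reaches the same maximum on several dates, the join back to training returns one row per date, which matches the "date(s)" wording above.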

Related

MYSQL: Using math in SELECT with alias

I have an extremely complex SQL query that I am needing help with. Essentially, this query will see how many total assignments a student is assigned (total) and how many they have completed (completed) for the course. I need one final column that would give me the percentage of completed assignments, because I want to run a query to select all users who have completed less than 50% of their assignments.
What am I doing wrong? I am getting an error "Unknown column 'completed' in 'field list'"
Is there a better way to execute this? I am open to changing my query.
Query:
SELECT students.usid AS ID, students.firstName, students.lastName,
(
SELECT COUNT(workID) FROM assignGrades
INNER JOIN students ON students.usid = assignGrades.usid
INNER JOIN assignments ON assignments.assID = assignGrades.assID
WHERE
assignGrades.usid = ID AND
assignments.subID = 4 AND
(
assignGrades.submitted IS NOT NULL OR
(assignGrades.score IS NOT NULL AND CASE WHEN assignments.points > 0 THEN assignGrades.score ELSE 1 END > 0)
)
) AS completed,
(
SELECT COUNT(workID) FROM assignGrades
INNER JOIN students ON students.usid = assignGrades.usid
INNER JOIN assignments ON assignments.assID = assignGrades.assID
WHERE
assignGrades.usid = ID AND
assignments.subID = 4 AND
(NOW() - INTERVAL 5 HOUR) > assignments.assigned
) AS total,
(completed/total)*100 AS percentage
FROM students
INNER JOIN profiles ON profiles.usid = students.usid
INNER JOIN classes ON classes.ucid = profiles.ucid
WHERE classes.utid=2 AND percentage < 50
If I cut the (percentage) part in the SELECT statement, the query runs as expected. See below for results.
Information about the tables involved in this query:
assignGrades: Lists the student's score for each assignment.
assignments: List the assignments for each course.
students: Lists student information
classes: Lists class information
profiles: Links a student to a class
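A column alias cannot be referenced elsewhere in the same SELECT list or in the WHERE clause, which is why MySQL reports an unknown column. If you only need to filter on the percentage and don't need to display it, you can use a HAVING clause instead, where MySQL does accept aliases:
SELECT (/* completed subquery */) AS completed, (/* total subquery */) AS total
FROM db
HAVING (completed/total)*100 < 50;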
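Another portable way around the alias limitation is to compute the counts in a derived table and filter in the outer query. The sketch below assumes a heavily simplified, hypothetical schema (only students and assignGrades, with "completed" meaning submitted IS NOT NULL) and uses in-memory SQLite from Python as a stand-in for MySQL.

```python
import sqlite3

# Simplified, hypothetical schema; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (usid INTEGER PRIMARY KEY, firstName TEXT);
CREATE TABLE assignGrades (usid INTEGER, assID INTEGER, submitted TEXT);
INSERT INTO students VALUES (1, 'Ada'), (2, 'Bob');
INSERT INTO assignGrades VALUES
  (1, 10, '2020-01-01'), (1, 11, '2020-01-02'), (1, 12, NULL), (1, 13, '2020-01-04'),
  (2, 10, '2020-01-01'), (2, 11, NULL), (2, 12, NULL), (2, 13, NULL);
""")

query = """
SELECT ID, firstName, completed, total,
       completed * 100.0 / total AS percentage
FROM ( SELECT s.usid AS ID, s.firstName,
              COUNT(g.submitted) AS completed,  -- counts non-NULL values only
              COUNT(*) AS total
       FROM students s
       INNER JOIN assignGrades g ON g.usid = s.usid
       GROUP BY s.usid, s.firstName
     ) AS counts
WHERE completed * 100.0 / total < 50
"""

# The outer query can use 'completed' and 'total' because they are
# real columns of the derived table, not aliases of the same SELECT.
rows = conn.execute(query).fetchall()
print(rows)
```

The multiplication by 100.0 forces floating-point division in SQLite; in MySQL the / operator already returns a non-integer result.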

MySQL Query limiting results by sub table

I'm really struggling with this query and I hope somebody can help.
I am querying across multiple tables to get the dataset that I require. The following query is an anonymised version:
SELECT main_table.id,
sub_table_1.field_1,
main_table.field_1,
main_table.field_2,
main_table.field_3,
main_table.field_4,
main_table.field_5,
main_table.field_6,
main_table.field_7,
sub_table_2.field_1,
sub_table_2.field_2,
sub_table_2.field_3,
sub_table_3.field_1,
sub_table_4.field_1,
sub_table_4.field_2
FROM main_table
INNER JOIN sub_table_4 ON sub_table_4.id = main_table.id
INNER JOIN sub_table_2 ON sub_table_2.id = main_table.id
INNER JOIN sub_table_3 ON sub_table_3.id = main_table.id
INNER JOIN sub_table_1 ON sub_table_1.id = main_table.id
WHERE sub_table_4.field_1 = '' AND sub_table_4.field_2 = '0' AND sub_table_2.field_1 != ''
The query works, the problem I have is sub_table_1 has a revision number (int 11). Currently I get duplicate records with different revision numbers and different versions of sub_table_1.field_1 which is to be expected, but I want to limit the result set to only include results limited by the latest revision number, giving me only the latest sub_table_1_field_1 and I really can not figure it out!
Can anybody lend me a hand?
Many Thanks.
It's always important to remember that a JOIN can be on a subquery as well as a table. You could build a subquery that returns the results you want to see then, once you've got the data you want, join it in the parent query.
It's hard to 'tailor' an answer that's specific to your problem, as it's too obfuscated (as you admit) to know what the data and tables really look like, but as an example:
Say table1 has four fields: id, revision_no, name and stuff. You want to return a distinct list of name values, with their latest version of stuff (which, we'll pretend varies by revision). You could do this in isolation as:
select t.* from table1 t
inner join
(SELECT name, max(revision_no) maxr
FROM table1
GROUP BY name) mx
on mx.name = t.name
and mx.maxr = t.revision_no;
(Note: see fiddle at the end)
That would return each individual name with the latest revision of stuff.
Once you've got that nailed down, you could then swap out
INNER JOIN sub_table_1 ON sub_table_1.id = main_table.id
....with....
INNER JOIN (select t.* from table1 t
inner join
(SELECT name, max(revision_no) maxr
FROM table1
GROUP BY name) mx
on mx.name = t.name
and mx.maxr = t.revision_no) sub_table_1
ON sub_table_1.id = main_table.id
...which would allow a join with a recordset that is more tailored to that which you want to join (again, don't get hung up on the actual query I've used, it's just there to demonstrate the method).
There may well be more elegant ways to achieve this, but it's sometimes good to start with a simple solution that's easier to replicate, then simplify it once you've got the general understanding of the what and why nailed down.
Hope that helps - as I say, it's as specific as I could offer without having an idea of the real data you're using.
(for the sake of reference, here is a fiddle with a working version of the above example query)
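For readers without access to the fiddle, the example query can be tried with this small sketch. It uses in-memory SQLite from Python in place of MySQL, and the rows in table1 are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, revision_no INTEGER,
                     name TEXT, stuff TEXT);
INSERT INTO table1 VALUES
  (1, 1, 'a', 'old a'),
  (2, 2, 'a', 'new a'),
  (3, 1, 'b', 'only b');
""")

query = """
SELECT t.* FROM table1 t
INNER JOIN
  (SELECT name, MAX(revision_no) AS maxr
   FROM table1
   GROUP BY name) mx
ON mx.name = t.name
AND mx.maxr = t.revision_no
ORDER BY t.name
"""

# Only the row with the highest revision_no survives for each name.
rows = conn.execute(query).fetchall()
print(rows)
```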
In your case, where you only need one column from the table, make this a subquery in your select clause instead of a join. You get the latest revision by ordering by revision number descending and limiting the result to one row.
SELECT
main_table.id,
(
select sub_table_1.field_1
from sub_table_1
where sub_table_1.id = main_table.id
order by revision_number desc
limit 1
) as sub_table_1_field_1,
main_table.field_1,
...
FROM main_table
INNER JOIN sub_table_4 ON sub_table_4.id = main_table.id
INNER JOIN sub_table_2 ON sub_table_2.id = main_table.id
INNER JOIN sub_table_3 ON sub_table_3.id = main_table.id
WHERE sub_table_4.field_1 = ''
AND sub_table_4.field_2 = '0'
AND sub_table_2.field_1 != '';
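The correlated-subquery variant can be checked in isolation with a sketch like the following. It keeps only main_table and sub_table_1, uses invented rows, and runs on in-memory SQLite from Python as a stand-in for MySQL.

```python
import sqlite3

# Reduced to the two relevant tables; rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_table (id INTEGER PRIMARY KEY, field_1 TEXT);
CREATE TABLE sub_table_1 (id INTEGER, revision_number INTEGER, field_1 TEXT);
INSERT INTO main_table VALUES (1, 'm1'), (2, 'm2');
INSERT INTO sub_table_1 VALUES
  (1, 1, 'v1'), (1, 2, 'v2'),
  (2, 5, 'w5'), (2, 3, 'w3');
""")

query = """
SELECT main_table.id,
       ( SELECT sub_table_1.field_1
         FROM sub_table_1
         WHERE sub_table_1.id = main_table.id
         ORDER BY revision_number DESC
         LIMIT 1
       ) AS sub_table_1_field_1,
       main_table.field_1
FROM main_table
ORDER BY main_table.id
"""

# Each main_table row gets field_1 from its highest revision only.
rows = conn.execute(query).fetchall()
print(rows)
```

Because the subquery sits in the select list, a main_table row with no matching revision would still be returned, with NULL in that column, whereas the INNER JOIN version would drop it.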

MySQL - retrieving rows ordered by greatest value of a variable series of columns

I am having a quite difficult time with a particular MySQL query.
Here is the structure of 3 tables I need to join and sort rows from:
users
ID username
routes
ID val_1 val_2 val_3 val_4
scores
ID route_id user_id route_level
This query is for a leaderboard feature, and needs to return a list of user IDs ordered by their best score. One user may have several scores, and we need to find the best score.
The twist is the "route_level" part: if the value of this column is for example "2", then we need to go to the corresponding route_id, and find the greatest value between val_1 and val_2. We must go no higher than val_x, x being the value of column scores.route_level.
Also, val_x is not an int (the value is usually something like "6A+", the next bigger value being 6B, then 6B+ etc.), and val_x is not necessarily smaller than val_x+1.
Here is as far as I got, but it doesn't work (I get results all over the place, not ordered at all, at least not in a way I can make any sense of):
SELECT u.*, r.val_1 AS v1, r.val_2 AS v2, r.val_3 AS v3, r.val_4 AS v4
FROM users u
INNER JOIN scores s
ON s.user_id = u.ID
INNER JOIN routes r
ON s.route_id = r.ID
GROUP BY u.ID
ORDER BY GREATEST(v1, v2, v3, v4) DESC
Does any of you have an idea of how I might take this on in a single MySQL query?
Thanks! :-)
Edit: here is a SQLFiddle link
You need to get your aggregate query right in order to get your ordering right. Try this. It will handle the aggregation by user correctly, and pull the maximum val_n score for each user.
SELECT u.ID, u.username,
MAX(r.val_1) AS v1,
MAX(r.val_2) AS v2,
MAX(r.val_3) AS v3,
MAX(r.val_4) AS v4
FROM users u
INNER JOIN scores s ON s.user_id = u.ID
INNER JOIN routes r ON s.route_id = r.ID
GROUP BY u.ID ,u.username
ORDER BY GREATEST( MAX(r.val_1), MAX(r.val_2), MAX(r.val_3), MAX(r.val_4)) DESC
Edit I missed the bit about the route_level ... focused too much on the query in your question, not the text.
But it's not tremendously hard to apply. It needs to be applied row-by-row, at the detail level, BEFORE the aggregation function MAX(). That can be done, I believe, by changing the SELECT part of your query as follows.
-- '' rather than NULL as the fallback: GREATEST() returns NULL if any argument is NULL
SELECT u.ID, u.username,
MAX(GREATEST(
IF(s.route_level>=1, r.val_1, ''),
IF(s.route_level>=2, r.val_2, ''),
IF(s.route_level>=3, r.val_3, ''),
IF(s.route_level>=4, r.val_4, '') )) AS val
FROM users u
INNER JOIN scores s ON s.user_id = u.ID
INNER JOIN routes r ON s.route_id = r.ID
GROUP BY u.ID ,u.username
ORDER BY MAX(GREATEST(
IF(s.route_level>=1, r.val_1, ''),
IF(s.route_level>=2, r.val_2, ''),
IF(s.route_level>=3, r.val_3, ''),
IF(s.route_level>=4, r.val_4, '') )) DESC
I also missed the point about your val_n values being varchar() values rather than integers. That one is harder. When you use MAX() on varchar() values, it employs the collation. All character collations I know of will declare that the value 6B+, for example, is larger than 10B+, because 6 is greater than 1.
I am sorry, I don't know of any elegant SQL magic that can make that work better.
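One workaround that is sometimes used for this kind of ordering problem is a lookup table that assigns each grade a numeric position. The sketch below is only an illustration under assumptions: the tables results and grade_rank, the column pos, and the ordering of the grades are all hypothetical, and in-memory SQLite (driven from Python) stands in for MySQL.

```python
import sqlite3

# Hypothetical tables: 'results' holds grades as text, 'grade_rank'
# maps each grade to an invented numeric position.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (username TEXT, grade TEXT);
CREATE TABLE grade_rank (grade TEXT PRIMARY KEY, pos INTEGER);
INSERT INTO results VALUES ('ann', '6B+'), ('ann', '10B+'), ('bob', '6A+');
INSERT INTO grade_rank VALUES ('6A+', 1), ('6B', 2), ('6B+', 3), ('10B+', 4);
""")

# Plain string comparison: '6B+' beats '10B+' because '6' sorts after '1'.
string_max = conn.execute(
    "SELECT MAX(grade) FROM results WHERE username = 'ann'"
).fetchone()[0]

# Ranking through the lookup table: aggregate on pos, then map back to the grade.
ranked = conn.execute("""
SELECT res.username, g.grade
FROM results res
JOIN grade_rank g ON g.grade = res.grade
JOIN ( SELECT r2.username, MAX(g2.pos) AS best_pos
       FROM results r2
       JOIN grade_rank g2 ON g2.grade = r2.grade
       GROUP BY r2.username
     ) AS best ON best.username = res.username AND best.best_pos = g.pos
ORDER BY g.pos DESC
""").fetchall()

print(string_max)
print(ranked)
```

The cost of this design is that every grade that can occur has to be listed in the lookup table.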
What you have looks really close...
But you need an aggregate function to get the "best score". (Databases other than MySQL would object to your statement, griping about references to non-aggregates not included in the GROUP BY...)
The "twist" about the route_level is a bit tricky. I'd handle that first, to get the "best" score from each row. I'd use expressions like these:
IF(s.route_level>=1,r.val_1,'')
IF(s.route_level>=2,r.val_2,'')
IF(s.route_level>=3,r.val_3,'')
IF(s.route_level>=4,r.val_4,'')
Those expressions conditionally return a value from the particular val_N column, depending on the value of route_level. (The fallback is an empty string rather than NULL, because GREATEST() returns NULL if any of its arguments is NULL.) Since you don't need to return the individual values, you could wrap all of those expressions in that convenient GREATEST() function.
GREATEST( IF(s.route_level>=1,r.val_1,'')
, IF(s.route_level>=2,r.val_2,'')
, IF(s.route_level>=3,r.val_3,'')
, IF(s.route_level>=4,r.val_4,'')
, ...
)
That would get the "best score" from each row. All that remains now is to find the "best score" from the multiple rows returned for a given user. And we can use the MAX aggregate to do that.
So, the only change you need to make to your query would be to replace the expression in the ORDER BY clause. Something like this:
SELECT u.ID
FROM users u
JOIN scores s
ON s.user_id = u.ID
JOIN routes r
ON s.route_id = r.ID
GROUP BY u.ID
ORDER BY MAX( GREATEST( IF(s.route_level>=1,r.val_1,'')
, IF(s.route_level>=2,r.val_2,'')
, IF(s.route_level>=3,r.val_3,'')
, IF(s.route_level>=4,r.val_4,'')
)
) DESC
(Also remove all those val_1, val_2, et al. expressions from the SELECT list. It's indeterminate which row those will be returned from. None of those values may be the "best score". If you want to also return the value of the "best score", then use that same expression from the ORDER BY in the SELECT list.)
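The "best value per row, then MAX per user" idea can be tried with the sketch below. It runs on in-memory SQLite from Python, which has neither IF() nor GREATEST(), so CASE and SQLite's multi-argument max() stand in for them; like GREATEST() in MySQL, that max() returns NULL if any argument is NULL, so the fallback here is an empty string. The sample rows are invented, route_level is read from scores as in the table layout of the question, and the grades are compared as plain strings.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (ID INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE routes (ID INTEGER PRIMARY KEY,
                     val_1 TEXT, val_2 TEXT, val_3 TEXT, val_4 TEXT);
CREATE TABLE scores (ID INTEGER PRIMARY KEY, route_id INTEGER,
                     user_id INTEGER, route_level INTEGER);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
INSERT INTO routes VALUES (1, '6A', '6C', '7A', '5C'),
                          (2, '6B', '5A', '6A', '7B');
INSERT INTO scores VALUES (1, 1, 1, 2), (2, 2, 1, 1), (3, 1, 2, 3);
""")

query = """
SELECT u.ID, u.username,
       MAX( max( CASE WHEN s.route_level >= 1 THEN r.val_1 ELSE '' END
               , CASE WHEN s.route_level >= 2 THEN r.val_2 ELSE '' END
               , CASE WHEN s.route_level >= 3 THEN r.val_3 ELSE '' END
               , CASE WHEN s.route_level >= 4 THEN r.val_4 ELSE '' END
               )
          ) AS best
FROM users u
JOIN scores s ON s.user_id = u.ID
JOIN routes r ON s.route_id = r.ID
GROUP BY u.ID, u.username
ORDER BY best DESC
"""

# Inner max(): best allowed value within one score row.
# Outer MAX(): best of those values across all rows of a user.
rows = conn.execute(query).fetchall()
print(rows)
```

In this sample, bob's only score allows val_1 to val_3 of route 1, so '7A' counts for him, while ann's level-2 score on the same route stops at '6C'.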
Combining the answers from Ollie Jones and spencer7593, I figured the answer to my own question:
SELECT u.ID, u.username, s.route_level,
GREATEST( IF(s.route_level >= 1, MAX(r.val_1), 0)
, IF(s.route_level >= 2, MAX(r.val_2), 0)
, IF(s.route_level >= 3, MAX(r.val_3), 0)
, IF(s.route_level >= 4, MAX(r.val_4), 0)
) AS vmax
FROM users u
INNER JOIN scores s ON s.user_id = u.ID
INNER JOIN routes r ON s.route_id = r.ID
GROUP BY u.ID
ORDER BY vmax DESC
This gives the max value AS vmax, taking into account the route_level value for each row, and orders the results correctly.
Thanks everyone for your participation! :-)

SQL - How to list items which are below the average

I have quite a basic database of 3 tables: "Students", "Tests" and "Scores".
For each test I need to list all students with test scores that are below the average for that test. (If that makes any sense at all)
I have an SQL query which simply prints the average score for each test.
SELECT t.Test_name, AVG(sc.Result) AS Avgscore
FROM Tests t
JOIN Scores sc ON t.id_Tests = sc.Tests_id_Tests
JOIN Students s ON sc.Students_id_Students = s.id_Students
WHERE t.id_Tests = $c
($c is a parameter from a for loop, which is incremented to print out each test as a separate table)
Any help would be appreciated, thanks
Change the select list for whatever columns you want to display, but this will limit the results as you want, for a given testid (replace testXYZ with the actual test you're searching on)
SELECT t.Test_name, s.*, sc.*
FROM Tests t
JOIN Scores sc
ON t.id_Tests = sc.Tests_id_Tests
JOIN Students s
ON sc.Students_id_Students = s.id_Students
WHERE t.id_Tests = 'testXYZ'
and sc.result <
(select avg(x.result)
from scores x
where sc.Tests_id_Tests = x.Tests_id_Tests)
Note: To run this for ALL tests, and have scores limited to those that are below the average for each test, you would just leave that one line out of the where clause and run:
SELECT t.Test_name, s.*, sc.*
FROM Tests t
JOIN Scores sc
ON t.id_Tests = sc.Tests_id_Tests
JOIN Students s
ON sc.Students_id_Students = s.id_Students
WHERE sc.result <
(select avg(x.result)
from scores x
where sc.Tests_id_Tests = x.Tests_id_Tests)
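The all-tests version can be checked against a small data set with the following sketch. It uses in-memory SQLite from Python as a stand-in for MySQL; the Name column of Students and all sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; Students.Name and all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tests (id_Tests INTEGER PRIMARY KEY, Test_name TEXT);
CREATE TABLE Students (id_Students INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Scores (Tests_id_Tests INTEGER, Students_id_Students INTEGER,
                     Result INTEGER);
INSERT INTO Tests VALUES (1, 'Math'), (2, 'Art');
INSERT INTO Students VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Scores VALUES
  (1, 1, 90), (1, 2, 60), (1, 3, 70),
  (2, 1, 50), (2, 2, 80), (2, 3, 80);
""")

query = """
SELECT t.Test_name, s.Name, sc.Result
FROM Tests t
JOIN Scores sc ON t.id_Tests = sc.Tests_id_Tests
JOIN Students s ON sc.Students_id_Students = s.id_Students
WHERE sc.Result <
      ( SELECT AVG(x.Result)
        FROM Scores x
        WHERE x.Tests_id_Tests = sc.Tests_id_Tests )
ORDER BY t.Test_name, s.Name
"""

# The correlated subquery recomputes the average of the test
# that the current score row belongs to.
rows = conn.execute(query).fetchall()
print(rows)
```

With these rows the Math average is about 73.3 and the Art average is 70, so Bob and Cy are listed for Math and Ann for Art.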
In PostgreSQL, for example, you could use a window function like AVG(Score) OVER (PARTITION BY id_Tests) (MySQL 8.0 and later support window functions as well), but in older MySQL versions I suggest using a subquery as follows:
SELECT Scores.*, Students.*, avgsc.Test_name, avgsc.Avgscore
FROM Scores
JOIN Students ON Scores.Students_id_Students = Students.id_Students
JOIN (
SELECT t.id_Tests, t.Test_name, AVG(sc.Result) AS Avgscore
FROM Tests t
JOIN Scores sc ON t.id_Tests = sc.Tests_id_Tests
-- WHERE t.id_Tests = $c
GROUP BY t.id_Tests, t.Test_name
) avgsc ON Scores.Tests_id_Tests = avgsc.id_Tests
WHERE Scores.Result < avgsc.Avgscore
Note that a student can be listed multiple times if they got below average score multiple times -- might or might not be what you want.
I commented out the line filtering the test as I guess it is easier to retrieve all tests at once, but if you insist on filtering on one test on application level then you can filter here by uncommenting it.

optimize Mysql: get latest status of the sale

In the following query, I show the latest status of the sale (by stage, in this case the number 3). The query is based on a subquery in the status history of the sale:
SELECT v.id_sale,
IFNULL((
SELECT (CASE WHEN IFNULL( vec.description, '' ) = ''
THEN ve.name
ELSE vec.description
END)
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
WHERE veh.id_sale = v.id_sale
AND vec.id_stage = 3
ORDER BY veh.id_record DESC
LIMIT 1
), 'x') sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
WHERE 1 =1
AND v.flag =1
AND v.id_quarters =4
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
The query takes 0.0057 sec and returns 1011 records.
Because I have to filter the sales by the name of the state, which would mean repeating the subquery in a WHERE clause, I have decided to rewrite the same query using joins. In this case, I'm using the MAX function to obtain the latest status:
SELECT
v.id_sale,
IFNULL(veh3.State3,'x') AS sale_state_3
FROM t_sale v
INNER JOIN t_quarters sd ON v.id_quarters = sd.id_quarters
LEFT JOIN (
SELECT veh.id_sale,
(CASE WHEN IFNULL(vec.description,'') = ''
THEN ve.name
ELSE vec.description END) AS State3
FROM t_record veh
INNER JOIN (
SELECT id_sale, MAX(id_record) AS max_rating
FROM(
SELECT veh.id_sale, id_record
FROM t_record veh
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign AND vec.id_stage = 3
) m
GROUP BY id_sale
) x ON x.max_rating = veh.id_record
INNER JOIN t_state_campaign vec ON vec.id_state_campaign = veh.id_state_campaign
INNER JOIN t_state ve ON ve.id_state = vec.id_state
) veh3 ON veh3.id_sale = v.id_sale
WHERE v.flag = 1
AND v.id_quarters = 4
This query shows the same results (1011). But the problem is it takes 0.0753 sec
Reviewing the possibilities I have found the factor that makes the difference in the speed of the query:
AND EXISTS (
SELECT '1'
FROM t_record
WHERE id_sale = v.id_sale
LIMIT 1
)
If I remove this clause, both queries take the same time. Why does it work better with the clause? Is there any way to use this clause with the joins? I hope you can help.
EDIT
I will show the results of EXPLAIN for each query respectively:
q1:
q2:
Interesting, so that little statement basically determines if there is a match between t_record.id_sale and t_sale.id_sale.
Why is this making your query run faster? Because WHERE conditions are applied before the subselects in the SELECT list are evaluated, so if there is no record to go with the sale, the subselect is never processed for that row. That saves you some time, which is why it works better.
Is it going to work in your join syntax? I don't really know without having your tables to test against but you can always just apply it to the end and find out. Add the keyword EXPLAIN to the beginning of your query and you will get a plan of execution which will help you optimize things. Probably the best way to get better results in your join syntax is to add some indexes to your tables.
But I ask you, is this even necessary? You have a query returning in less than 8 hundredths of a second. Unless this query is being run thousands of times an hour, it is not really taxing your DB at all, and your time is probably better spent making improvements elsewhere in your application.