Reuse body of a mysql query for both count and rows - mysql

Because I'm working with a framework (Magento), I don't have direct control of the SQL that is actually executed. I can build various parts of the query, but in different contexts it's modified in different ways before it goes to the database.
Here is a simplified example of what I'm working with.
students           enrolments
--------           ------------------
id| name           student_id|  class
--+-----           ----------+-------
 1| paul                    1|biology
 2|james                    1|english
 3|   jo                    2|  maths
                            2|english
                            2| french
                            3|physics
                            3|  maths
A query to show all students who are studying English together with all the courses those students are enrolled on, would be:
SELECT name, GROUP_CONCAT(enrolments.class) AS classes
FROM students LEFT JOIN enrolments ON students.id=enrolments.student_id
WHERE students.id IN ( SELECT e.student_id
                       FROM enrolments AS e
                       WHERE e.class LIKE "english" )
GROUP BY students.id
This will give the expected results
name| classes
----+----------------------
paul|biology, english
james|maths, english, french
Counting the number of students who study English would be trivial, if it weren't for the fact that Magento automatically reuses portions of my first query. For the count, it modifies my original query as follows:
Removes the columns being selected. This would be the name and classes columns.
Adds a count(*) column to the select
Removes any group by clause
After this butchery, my query above becomes
SELECT COUNT(*)
FROM students LEFT JOIN enrolments ON students.id=enrolments.student_id
WHERE students.id IN ( SELECT e.student_id
                       FROM enrolments AS e
                       WHERE e.class LIKE "english" )
Which will not give me the number of students enrolled on the English course as I require. Instead it will give me the combined number of enrolments of all students who are enrolled on the English course.
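The difference between the two numbers can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is runnable), loads the sample data from the tables above, and applies the same rewrite that the framework is described as performing.

```python
import sqlite3

# SQLite stands in for MySQL here; this is an assumption for a runnable sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrolments (student_id INTEGER, class TEXT);
    INSERT INTO students VALUES (1, 'paul'), (2, 'james'), (3, 'jo');
    INSERT INTO enrolments VALUES
        (1, 'biology'), (1, 'english'),
        (2, 'maths'), (2, 'english'), (2, 'french'),
        (3, 'physics'), (3, 'maths');
""")

# The shared "body" (joins and WHERE) that survives the rewrite.
body = """
    FROM students LEFT JOIN enrolments ON students.id = enrolments.student_id
    WHERE students.id IN (SELECT e.student_id FROM enrolments AS e
                          WHERE e.class LIKE 'english')
"""

# Row query: one row per student thanks to GROUP BY.
rows = conn.execute(
    "SELECT name, GROUP_CONCAT(enrolments.class) AS classes"
    + body + "GROUP BY students.id"
).fetchall()

# Count query after the rewrite: columns replaced, GROUP BY removed.
stripped_count = conn.execute("SELECT COUNT(*)" + body).fetchone()[0]

print(len(rows))       # number of students in the row query
print(stripped_count)  # number of joined enrolment rows
```

With the sample data, the row query yields two students while the stripped count sees every joined enrolment row of those two students.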
I'm trying to come up with a query which can be used in both contexts, counting and getting the rows. I get to keep any join clauses and where clauses, and that's about it.

The problem with your original query is the GROUP BY clause. Selecting COUNT(*) while keeping the GROUP BY clause would result in two rows, each with the number of classes for one student:
| COUNT(*) |
|----------|
| 2 |
| 3 |
Removing the GROUP BY clause will just return the number of all rows from the LEFT JOIN:
| COUNT(*) |
|----------|
| 5 |
The only way I see that Magento could solve that problem is to put the original query into a subquery (derived table) and count the rows of the result. But that might end up performing terribly. I would also be fine with an exception complaining that a query with a GROUP BY clause cannot be used for pagination (or something like that). Just returning an unexpected result is probably the worst thing a library can do.
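The derived-table idea can be sketched as follows. This is only an illustration of the technique, not something Magento is known to do; it again uses Python's sqlite3 module in place of MySQL, and the derived-table alias `t` is a made-up name.

```python
import sqlite3

# SQLite stands in for MySQL here; this is an assumption for a runnable sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrolments (student_id INTEGER, class TEXT);
    INSERT INTO students VALUES (1, 'paul'), (2, 'james'), (3, 'jo');
    INSERT INTO enrolments VALUES
        (1, 'biology'), (1, 'english'),
        (2, 'maths'), (2, 'english'), (2, 'french'),
        (3, 'physics'), (3, 'maths');
""")

# The original row query, left completely untouched (GROUP BY included).
original = """
    SELECT name, GROUP_CONCAT(enrolments.class) AS classes
    FROM students LEFT JOIN enrolments ON students.id = enrolments.student_id
    WHERE students.id IN (SELECT e.student_id FROM enrolments AS e
                          WHERE e.class LIKE 'english')
    GROUP BY students.id
"""

# Wrap it in a derived table and count the rows it produces.
wrapped_count = conn.execute(
    f"SELECT COUNT(*) FROM ({original}) AS t"
).fetchone()[0]

print(wrapped_count)  # one per student returned by the original query
```

The count then always matches the number of rows of the original query, at the cost of the database having to evaluate that query in full.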
Well, it just so happens I have a solution. :-)
Use a correlated subquery for GROUP_CONCAT in the SELECT clause. This way you will not need a GROUP BY clause.
SELECT name, (SELECT GROUP_CONCAT(enrolments.class)
              FROM enrolments
              WHERE enrolments.student_id = students.id
             ) AS classes
FROM students
WHERE students.id IN ( SELECT e.student_id
                       FROM enrolments AS e
                       WHERE e.class LIKE "english" )
However, I would rewrite the query to use an INNER JOIN instead of an IN condition:
SELECT s.name, (
    SELECT GROUP_CONCAT(e2.class)
    FROM enrolments e2
    WHERE e2.student_id = s.id
) AS classes
FROM students s
JOIN enrolments e1
  ON e1.student_id = s.id
WHERE e1.class = "english";
Both queries will return the same result as your original one.
| name | classes |
|-------|----------------------|
| paul | biology,english |
| james | maths,english,french |
But they also return the correct count when modified by Magento.
| COUNT(*) |
|----------|
| 2 |
Demo: http://rextester.com/OJRU38109
Additionally, chances are good that it will even perform better, because MySQL's optimizer often creates bad execution plans for queries with JOINs and GROUP BY.
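As a quick check of the claim that one body serves both purposes, here is a minimal sketch using Python's sqlite3 module in place of MySQL (an assumption for runnability). It builds the row query and the count query from the same FROM/WHERE text, using the IN variant. Note that the JOIN variant counts one row per matching enrolment, so it relies on a student not being enrolled in English more than once.

```python
import sqlite3

# SQLite stands in for MySQL here; this is an assumption for a runnable sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrolments (student_id INTEGER, class TEXT);
    INSERT INTO students VALUES (1, 'paul'), (2, 'james'), (3, 'jo');
    INSERT INTO enrolments VALUES
        (1, 'biology'), (1, 'english'),
        (2, 'maths'), (2, 'english'), (2, 'french'),
        (3, 'physics'), (3, 'maths');
""")

# The aggregation lives in the select list as a correlated subquery.
classes_column = """(SELECT GROUP_CONCAT(e2.class) FROM enrolments e2
                     WHERE e2.student_id = students.id) AS classes"""

# Shared body: no join that multiplies rows, no GROUP BY.
body = """
    FROM students
    WHERE students.id IN (SELECT e.student_id FROM enrolments AS e
                          WHERE e.class LIKE 'english')
"""

rows = conn.execute("SELECT name, " + classes_column + body).fetchall()
count = conn.execute("SELECT COUNT(*)" + body).fetchone()[0]

# Sort the class lists because GROUP_CONCAT order is not guaranteed.
classes_by_name = {name: sorted(classes.split(",")) for name, classes in rows}
print(classes_by_name)
print(count)
```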

Related

SQL Query, Getting multiple dates for one entity

I am working on writing a SQL query to produce a table that will look something like this:
Name |Dates Absent|Total Absences
student |10/28/2018 | 2
|10/29/2018 |
I currently have a database with 2 tables (absences, students) that are part of a larger system and contain the needed data.
I have tried the following query
SELECT s.student_id,s.last_name,s.first_name, COUNT(s.student_id) AS 'Total Absences'
FROM `students` s, `absences` a INNER JOIN students ON students.student_id=a.student_id
Which yielded the following results:
student_id | last_name | first_name | Total Absences
1 | student | name | 12464
I want this to use each ID only once and count the times it appears. Is the problem caused by a relationship in the database where one person can be absent on many dates? The ID was left in the SELECT for debugging purposes for now; it will be removed later.
EDIT
I now have the query
SELECT s.last_name, s.first_name,a.date_absence, COUNT(s.student_id) AS 'Total Absences'
FROM `students` s, `absences` a
INNER JOIN students ON students.student_id=a.student_id
GROUP BY s.student_ID
This only displays one of the dates. How can I add all of the dates without redisplaying the student's information?
You can do this with group_concat. It's not quite what you describe, but it's close.
SELECT s.student_id, s.last_name, s.first_name,
       group_concat(a.date_absence) AS 'Dates Absent',
       COUNT(a.id) AS 'Total Absences'
FROM `students` s JOIN `absences` a ON s.student_id = a.student_id
GROUP BY s.student_id
which should yield
student_id | last_name | first_name | Dates Absent | Total Absences
1 | student | name | 10/28/2018,10/29/2018 | 2
It looks like you are almost there with the counting, but you are missing your GROUP BY clause.
If you include aggregate functions, such as COUNT(), but leave off the GROUP BY, the whole intermediate result is treated as one group.
You also seem to have a strange CROSS JOIN going on with your duplicate mention of the students table.
If you want the absence dates in each row, you'll have to use another aggregate function, GROUP_CONCAT().
Something along the lines of:
SELECT s.student_id, /** Include as names could feasibly be duplicated */
CONCAT(s.first_name, ' ', s.last_name) name,
GROUP_CONCAT([DISTINCT] a.date) dates_absent, /** May want distinct here if more than one absence is possible per diem */
COUNT(*) total_absences
FROM students s
JOIN absences a
ON a.student_id = s.student_id
GROUP BY s.student_id[, name] /** name required for SQL standard */
[ORDER BY name [ASC]] /** You'll probably want some kind of ordering */
[] indicate optional inclusions
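Both effects described in this answer (a single group when GROUP BY is missing, and row multiplication from the duplicate students reference) can be reproduced with a small sketch. It uses Python's sqlite3 module in place of MySQL, and the sample rows, the second student and the ISO date format are invented for illustration; the column name date_absence is taken from the question's second query.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY,
                           last_name TEXT, first_name TEXT);
    CREATE TABLE absences (id INTEGER PRIMARY KEY,
                           student_id INTEGER, date_absence TEXT);
    INSERT INTO students VALUES (1, 'student', 'name'), (2, 'other', 'person');
    INSERT INTO absences (student_id, date_absence) VALUES
        (1, '2018-10-28'), (1, '2018-10-29'), (2, '2018-10-30');
""")

# The question's FROM clause: students appears twice, and the first
# mention (s) is never tied to the absences, so it is cross joined.
inflated = conn.execute("""
    SELECT COUNT(s.student_id)
    FROM students s, absences a
    INNER JOIN students ON students.student_id = a.student_id
""").fetchone()[0]

# One students reference, one join condition, one group per student.
per_student = conn.execute("""
    SELECT s.student_id, GROUP_CONCAT(a.date_absence), COUNT(*)
    FROM students s
    JOIN absences a ON a.student_id = s.student_id
    GROUP BY s.student_id
    ORDER BY s.student_id
""").fetchall()

print(inflated)     # every student paired with every absence row
print(per_student)  # one row per student with dates and a count
```

In this data the inflated count is the number of students multiplied by the number of absence rows, which is the same mechanism that would produce a figure like the 12464 in the question.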

Group by with MAX() still selects other row

I'm writing a query where I need to select the student name for the row with the MAX gradelevel_id. However, it still selects the other row with the same student id, even though I have already specified which gradelevel_id to select.
schoolyear_id | student_id | gradelevel_id
          407 |         18 |           307
          409 |         18 |           309
Query:
SELECT
student_mt.student_id,
registration_mt.firstname, registration_mt.middlename, registration_mt.lastname,
MAX(grade.gwa)
FROM schoolyear_student_lt
INNER JOIN gradelevel_mt ON gradelevel_mt.gradelevel_id = schoolyear_student_lt.gradelevel_id
INNER JOIN student_mt ON student_mt.student_id = schoolyear_student_lt.student_id
INNER JOIN registration_mt ON registration_mt.registration_id = student_mt.registration_id
INNER JOIN student_grade ON student_grade.student_id = schoolyear_student_lt.student_id
INNER JOIN grade ON grade.grade_id = student_grade.grade_id
WHERE gradelevel_mt.gradelevel_id = 309
GROUP BY student_mt.student_id;
If I put 307 in my WHERE clause, it still selects the student, whose name I should no longer see in the result.
Output:
student_id | firstname | middlename | lastname | MAX(grade.gwa)
18 Billie Joe Armstrong 88
(This is more of a comment.) I guess you accidentally stumbled on MySQL's quirky GROUP BY behavior.
When using GROUP BY, the SELECT clause should contain only the GROUP BY columns (student_mt.student_id) and aggregate functions (MAX(grade.gwa)). MySQL allows other columns as well, on the assumption that you know what you are doing, but the values it returns for them can be anomalous.
Why not take the approach of getting the student_id (the primary key) together with MAX(grade.gwa) in an inner subquery, and then INNER JOIN the other tables in the outer query to select what you want?
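A minimal sketch of that suggestion follows, using Python's sqlite3 module in place of MySQL. The question's six-table schema is not fully known, so the two tables here (students and grades), their columns and the sample rows are all hypothetical simplifications; only the shape of the query (aggregate in an inner subquery, join for the names outside) is the point.

```python
import sqlite3

# SQLite stands in for MySQL; tables, columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, firstname TEXT);
    CREATE TABLE grades (student_id INTEGER, gradelevel_id INTEGER, gwa INTEGER);
    INSERT INTO students VALUES (18, 'Billie'), (19, 'Mike');
    INSERT INTO grades VALUES (18, 307, 85), (18, 309, 88), (19, 307, 91);
""")

def best_gwa_for_level(gradelevel_id):
    """Aggregate first (only grouped and aggregated columns inside),
    then join to fetch the descriptive columns."""
    return conn.execute("""
        SELECT s.student_id, s.firstname, best.max_gwa
        FROM (SELECT student_id, MAX(gwa) AS max_gwa
              FROM grades
              WHERE gradelevel_id = ?
              GROUP BY student_id) AS best
        INNER JOIN students s ON s.student_id = best.student_id
        ORDER BY s.student_id
    """, (gradelevel_id,)).fetchall()

level_309 = best_gwa_for_level(309)
level_307 = best_gwa_for_level(307)
print(level_309)
print(level_307)
```

Because the filter on gradelevel_id sits next to the aggregate, each result row only reflects grades recorded at that level.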

Problems using INNER JOIN with using Composite Key

I'm quite new to SQL and very new (learning today in fact!) how to use JOINS or in particular INNER JOIN. I have read some guides but haven't seen any helpful information for when one has a table with a composite key.
Tables:
Matches
+-----------+------------+
| (MatchID) | StartTime |
+-----------+------------+
| 548576804 | 1393965361 |
| 548494906 | 1393123251 |
+-----------+------------+
And
MatchDetails
+------------+-----------+--------+
| (PlayerID) | (MatchID) | Result |
+------------+-----------+--------+
| 38440257   | 548576804 | Win    |
| 17164642   | 548494906 | Loss   |
+------------+-----------+--------+
Of the above tables, The MatchID in table Matches is a Foreign Key.
Problem
The columns in parentheses are keys (so the composite key is in the MatchDetails table). I am trying to pull all of the matches played by player 38440257, together with the StartTime from the Matches table. The first join I tried worked; however, it pulled every game, regardless of player:
SELECT matchdetails.MatchID,
matches.StartTime,
matchdetails.Result
FROM matchdetails,
matches
WHERE matchdetails.MatchID = matches.MatchID
ORDER BY matches.StartTime ASC
Now, I am not sure how to add in the point that I want ONLY matches from a particular playerID in the query. Because the following does not work:
SELECT matchdetails.MatchID,
matches.StartTime,
matchdetails.Result
FROM matchdetails,
matches
WHERE matchdetails.MatchID = matches.MatchID,
matchdetails.PlayerID=76561197998705985
ORDER BY matches.StartTime ASC
In addition, the JOIN I am using above, is there an easier way to write it that I am missing? Or am I not writing a Join at all? I followed one of the queries from here, which stated they were equivalent queries. However it feels rather cumbersome to write.
Please let me know if I have neglected any information.
You just need AND between your predicates:
SELECT matchdetails.MatchID,
matches.StartTime,
matchdetails.Result
FROM matchdetails,
matches
WHERE matchdetails.MatchID = matches.MatchID
AND matchdetails.PlayerID=76561197998705985
ORDER BY matches.StartTime ASC;
However, I would highly recommend you move away from the ANSI 89 JOIN syntax and adopt ANSI 92 instead. As the names suggest, the syntax you have used is over 20 years out of date.
SELECT matchdetails.MatchID,
matches.StartTime,
matchdetails.Result
FROM matchdetails
INNER JOIN matches
ON matchdetails.MatchID = matches.MatchID
WHERE matchdetails.PlayerID=76561197998705985
ORDER BY matches.StartTime ASC;
SELECT matchdetails.MatchID,
matches.StartTime,
matchdetails.Result
FROM matchdetails
JOIN matches ON matchdetails.MatchID = matches.MatchID
WHERE
matchdetails.PlayerID=76561197998705985
ORDER BY matches.StartTime ASC
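To see that the comma-style join with AND and the explicit INNER JOIN select the same rows, here is a small sketch using Python's sqlite3 module in place of MySQL (an assumption for runnability). It loads the two sample rows from the question and filters on player 38440257; the composite primary key declaration and the column types are assumptions based on the question's description.

```python
import sqlite3

# SQLite stands in for MySQL; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE matches (MatchID INTEGER PRIMARY KEY, StartTime INTEGER);
    CREATE TABLE matchdetails (
        PlayerID INTEGER,
        MatchID  INTEGER REFERENCES matches (MatchID),
        Result   TEXT,
        PRIMARY KEY (PlayerID, MatchID)  -- the composite key
    );
    INSERT INTO matches VALUES (548576804, 1393965361), (548494906, 1393123251);
    INSERT INTO matchdetails VALUES
        (38440257, 548576804, 'Win'), (17164642, 548494906, 'Loss');
""")

player_id = 38440257

# Implicit (comma) join: join condition and filter combined with AND.
implicit = conn.execute("""
    SELECT matchdetails.MatchID, matches.StartTime, matchdetails.Result
    FROM matchdetails, matches
    WHERE matchdetails.MatchID = matches.MatchID
      AND matchdetails.PlayerID = ?
    ORDER BY matches.StartTime ASC
""", (player_id,)).fetchall()

# Explicit join: condition in ON, filter in WHERE.
explicit = conn.execute("""
    SELECT matchdetails.MatchID, matches.StartTime, matchdetails.Result
    FROM matchdetails
    INNER JOIN matches ON matchdetails.MatchID = matches.MatchID
    WHERE matchdetails.PlayerID = ?
    ORDER BY matches.StartTime ASC
""", (player_id,)).fetchall()

print(implicit)
print(explicit)
```

The composite key does not change how the join is written here, since only MatchID links the two tables; PlayerID is used purely as a filter.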

Using SUM with multiple joins in mysql

I've been looking for a solution to this, there's plenty of similar questions but none have any proper answers that helped me solve the problem.
First up, my questions/problem:
I want to sum and count certain columns in a multiple join query
Is it not possible with multiple joins? Do I have to nest SELECT queries?
Here's a SQL dump of my database with sample data: http://pastie.org/private/vq7qkfer5mwyraudb5dh0a
This is the query I thought would do the trick:
SELECT firstname, lastname, sum(goal.goal), sum(assist.assist), sum(gw.gw), sum(win.win), count(played.idplayer) FROM player
LEFT JOIN goal USING (idplayer)
LEFT JOIN assist USING (idplayer)
LEFT JOIN gw USING (idplayer)
LEFT JOIN win USING (idplayer)
LEFT JOIN played USING (idplayer)
GROUP BY idplayer
What I'd like this to produce is a table where the columns for goal, assist, gw, win and played are a sum/count of every row in that column, like so: (with supplied sample data)
+-----------+----------+------+--------+----+-----+--------+
| firstname | lastname | goal | assist | gw | win | played |
+-----------+----------+------+--------+----+-----+--------+
| Gandalf | The White| 10 | 6 | 1 | 1 | 2 |
| Frodo | Baggins | 16 | 2 | 1 | 2 | 2 |
| Bilbo | Baggins | 7 | 3 | 0 | 0 | 2 |
+-----------+----------+------+--------+----+-----+--------+
So, to reiterate the above questions, is this possible with one query and multiple joins?
If you provide solutions/queries, please explain them! I'm new to proper relational databases and I have never used joins before this project. I'd also appreciate if you avoid aliases unless necessary.
I have run the above query without sum and grouping and I get a set of rows for each column I do a SELECT on, which I suspect is then multiplied or added together, but I was under the impression that grouping and/or doing sum(TABLE.COLUMN) would solve that.
Another thing is that, I think, doing a SELECT DISTINCT or any other DISTINCT operation won't work since that will leave out some ("duplicate") results.
PS. If it matters, my dev machine is a WAMP but release will be on ubuntu/apache/mysql/php.
To understand why you're not getting the answers you expect, take a look at this query:
SELECT * FROM player LEFT JOIN goal USING (idplayer)
As you can see, the rows on the left are duplicated for the matching rows on the right. That procedure is repeated for each join. Here's the raw data for your query:
SELECT * FROM player
LEFT JOIN goal USING (idplayer)
LEFT JOIN assist USING (idplayer)
LEFT JOIN gw USING (idplayer)
LEFT JOIN win USING (idplayer)
LEFT JOIN played USING (idplayer)
Those repeated values are then used for the SUM calculations. The SUMs need to be calculated before the rows are joined:
SELECT firstname, lastname, goals, assists, gws, wins, games_played
FROM player
INNER JOIN
(SELECT idplayer, SUM(goal) AS goals FROM goal GROUP BY idplayer) a
USING (idplayer)
INNER JOIN
(SELECT idplayer, SUM(assist) AS assists FROM assist GROUP BY idplayer) b
USING (idplayer)
INNER JOIN
(SELECT idplayer, SUM(gw) AS gws FROM gw GROUP BY idplayer) c
USING (idplayer)
INNER JOIN
(SELECT idplayer, SUM(win) AS wins FROM win GROUP BY idplayer) d
USING (idplayer)
INNER JOIN
(SELECT idplayer, COUNT(*) AS games_played FROM played GROUP BY idplayer) e
USING (idplayer)
SQLFiddle
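The multiplication effect is easier to see with numbers. The sketch below uses Python's sqlite3 module in place of MySQL and reduces the schema to player, goal and assist; the individual rows are invented, chosen so that the totals for Gandalf and Frodo match the expected output in the question. The joins are written with ON rather than USING, and the derived-table aliases g and a are made-up names.

```python
import sqlite3

# SQLite stands in for MySQL; the individual rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (idplayer INTEGER PRIMARY KEY, firstname TEXT);
    CREATE TABLE goal (idplayer INTEGER, goal INTEGER);
    CREATE TABLE assist (idplayer INTEGER, assist INTEGER);
    INSERT INTO player VALUES (1, 'Gandalf'), (2, 'Frodo');
    INSERT INTO goal VALUES (1, 4), (1, 6), (2, 16);
    INSERT INTO assist VALUES (1, 1), (1, 2), (1, 3), (2, 2);
""")

# Join first, sum afterwards: Gandalf's 2 goal rows meet his 3 assist
# rows, giving 6 joined rows, so every value is summed several times.
naive = conn.execute("""
    SELECT firstname, SUM(goal.goal), SUM(assist.assist)
    FROM player
    LEFT JOIN goal ON goal.idplayer = player.idplayer
    LEFT JOIN assist ON assist.idplayer = player.idplayer
    GROUP BY player.idplayer
    ORDER BY player.idplayer
""").fetchall()

# Sum first, join afterwards: each derived table has one row per player.
pre_aggregated = conn.execute("""
    SELECT firstname, g.goals, a.assists
    FROM player
    INNER JOIN (SELECT idplayer, SUM(goal) AS goals
                FROM goal GROUP BY idplayer) AS g
            ON g.idplayer = player.idplayer
    INNER JOIN (SELECT idplayer, SUM(assist) AS assists
                FROM assist GROUP BY idplayer) AS a
            ON a.idplayer = player.idplayer
    ORDER BY player.idplayer
""").fetchall()

print(naive)
print(pre_aggregated)
```

Frodo's totals come out right either way because he has a single row in each table, which is why this kind of bug can go unnoticed on small test data.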

MySQL Table Joining With AVG()

I have a "ratings" table that contains (as a foreign key) the ID for the thing that it is rating. There are possibly multiple ratings for a thing, or no ratings for a thing.
I want to join tables to see the different ratings for all the different IDs, but right now I'm having trouble viewing things that have no ratings. For example:
mysql> select avg(ratings.rating), thing.id from ratings, things where ratings.thingId = thing.id group by thing.id;
+----------------------+----+
| avg(ratings.rating) | id |
+----------------------+----+
| 6.3333 | 1 |
| 6.0000 | 2 |
+----------------------+----+
Is there any way to modify my select query to also include IDs that have no ratings? I tried modifying the statement to say where ratings.thingId = thing.id or thing.id > 0 but that doesn't seem to help.
Thanks and sorry if it's unclear.
SELECT AVG(ratings.rating),
       things.id
FROM things
LEFT OUTER JOIN ratings
  ON ratings.thingId = things.id
GROUP BY things.id
You're currently performing an INNER JOIN, which eliminates things records with no associated ratings. Instead, an OUTER JOIN...
SELECT AVG(COALESCE(ratings.rating, 0)), things.id
FROM things
LEFT JOIN ratings ON things.id = ratings.thingId
GROUP BY things.id
Will return ALL things, regardless of whether or not they have ratings. Note the use of COALESCE(), which will return the first non-NULL argument - thus things with no ratings will return 0 as their average.
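The three behaviours (inner join, outer join, outer join with COALESCE) can be compared side by side in a small sketch. It uses Python's sqlite3 module in place of MySQL, and the individual ratings are invented so that things 1 and 2 average roughly as in the question while thing 3 has no ratings at all.

```python
import sqlite3

# SQLite stands in for MySQL; the individual ratings are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (id INTEGER PRIMARY KEY);
    CREATE TABLE ratings (thingId INTEGER REFERENCES things (id), rating INTEGER);
    INSERT INTO things VALUES (1), (2), (3);
    INSERT INTO ratings VALUES (1, 6), (1, 7), (1, 6), (2, 6);
""")

# Inner join: thing 3 has no matching ratings row and disappears.
inner = conn.execute("""
    SELECT things.id, AVG(ratings.rating)
    FROM things
    INNER JOIN ratings ON ratings.thingId = things.id
    GROUP BY things.id ORDER BY things.id
""").fetchall()

# Left join: thing 3 is kept, with a NULL (None) average.
left = conn.execute("""
    SELECT things.id, AVG(ratings.rating)
    FROM things
    LEFT JOIN ratings ON ratings.thingId = things.id
    GROUP BY things.id ORDER BY things.id
""").fetchall()

# Left join with COALESCE: the missing rating is treated as 0.
coalesced = conn.execute("""
    SELECT things.id, AVG(COALESCE(ratings.rating, 0))
    FROM things
    LEFT JOIN ratings ON ratings.thingId = things.id
    GROUP BY things.id ORDER BY things.id
""").fetchall()

print(inner)
print(left)
print(coalesced)
```

Whether an unrated thing is better reported as 0 or as NULL depends on the application, since a 0 is indistinguishable from a thing that was genuinely rated 0.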
SELECT p.id, p.title,
       (SELECT ROUND(AVG(pr.rating), 1)
        FROM post_rating pr
        WHERE pr.postid = p.id) AS AVG
FROM posts p