Retrieving aggregate values from multiple tables - mysql

I'm trying to get some aggregate values from different tables, but my problem is that they're being returned incorrectly, i.e. as multiplications of each other.
I've got these tables:
Institutes
----------
ID
name
Conversions
-----------
ID
institute_id
training_id
Trainings
---------
ID
institute_id
And I want to do get the Institute.name, the number of conversions and the number of trainings. I can get the name and one of the aggregate values, but when I try to get both values it returns the wrong number, because it will repeat all conversions for each training (or the other way round).
I'm using this SQL:
SELECT Institute.id
,Institute.name
,Institute.friendly_url
,count(Training.`id`) as totalTrainings
,count(Conversion.`id`) AS totalConversions
FROM institutes Institute
LEFT JOIN trainings Training ON Institute.`id` = Training.`institute_id`
LEFT JOIN conversions Conversion ON Training.`institute_id` = Conversion.`institute_id`
GROUP BY 1,2,3
I need to add more tables later on, using the same structure and also with the sole purpose of retrieving aggregates. I can make it work by using multiple queries, but then I'd run into the problem of not being able to use proper sorting, pagination and problems when I want to filter the data.
Creating a view is a solution I've been thinking about too, but I still need some SQL to create that view with.
I've also tried tinkering with the GROUP BY but to no avail.
Update
I'm now using subqueries in my join, which seems to work. Is there a more optimal solution though?
SELECT Institute.id
,Institute.name
,Institute.friendly_url
,Training.total
,Conversion.total
FROM institutes Institute
LEFT JOIN (SELECT institute_id, count(id) AS total
FROM `trainings`
GROUP BY institute_id) AS Training ON Training.institute_id = Institute.id
LEFT JOIN (SELECT institute_id, count(id) AS total
FROM `conversions`
GROUP BY institute_id) AS Conversion ON Conversion.institute_id = Institute.id

Try that. Doing a separate subqueries is better IMHO. And looks better also, easier to read the query.
SELECT
Institute.id
,Institute.name
,Institute.friendly_url
,(SELECT count(Training.`id`) FROM trainings Training WHERE Institute.`id` = Training.`institute_id` LIMIT 1) as totalTrainings
,(SELECT count(Conversion.`id`) FROM conversions Conversion ON Institute.`id` = Conversion.`institute_id` LIMIT 1) AS totalConversions
FROM institutes Institute

(realize this is two years later)
I believe a DISTINCT within your COUNT clause would fix the problem.
SELECT Institute.id
,Institute.name
,Institute.friendly_url
,count(DISTINCT Training.`id`) as totalTrainings
,count(DISTINCT Conversion.`id`) AS totalConversions
FROM institutes Institute
LEFT JOIN trainings Training ON Institute.`id` = Training.`institute_id`
LEFT JOIN conversions Conversion ON Training.`institute_id` = Conversion.`institute_id`
GROUP BY 1,2,3

Related

MySQL - Cannot reference result of a subquery

Hit a roadblock and hoping someone here is able to help please?
edit: DB<>FIDDLE : https://dbfiddle.uk/?rdbms=mysql_5.5&fiddle=c73f8ec9a60f530fe4ad489dc743f9b9
I have 3 tables:
marks - which uses grades;
target - which also uses grades;
grades - a lookup for grades to points.
What I am trying to do is calculate the total points for grades within the marks table, then calculate the total target points by multiplying the points value for the target by the number of grades within the marks table for a given person (adno).
I'm able to sum and count the points values from the marks table without a problem, but as I've used an inner joint for the marks to grades already I cannot add a further one for target to grades so I've used a subquery.
However when I try to use the result of subquery (EDIT single_target_points , not target_points as I originally posted) in the calculation in the line straight after it I get the error :
[Err] 1054 - Unknown column 'single_target_points' in 'field list'
This is the query I am trying:
SELECT
marks.adno,
Sum(grades.points) AS total_points,
Count(grades.points) AS no_of_subjects,
(SELECT grades.points FROM targets INNER JOIN grades ON targets.grade = grades.grade WHERE targets.adno = marks.adno GROUP BY grades.points) AS single_target_points,
single_target_points*no_of_subjects AS target_points
FROM
marks
INNER JOIN grades ON marks.resultvalue = grades.grade
INNER JOIN targets ON targets.adno = marks.adno
GROUP BY
marks.adno
This will resolve the query for you, but not sure what you are doing is completely accurate. I took your in-line query and made it a subquery including the "adno" (person id) to get all distinct points. Using that and JOINING based on the person like the others.
SELECT
m.adno,
Sum(g.points) AS total_points,
Count(g.points) AS no_of_subjects,
SUM( TP.single_target_points * g.points) AS target_points
FROM
marks m
JOIN grades g
ON m.resultvalue = g.grade
JOIN students s
ON m.adno = s.adno
JOIN targets t
ON m.adno = t.adno
JOIN
( SELECT distinct
t2.adno,
t2.grade,
g2.points single_target_points
FROM
targets t2
JOIN grades g2
ON t2.grade = g2.grade ) TP
on t.adno = TP.adno
and t.grade = TP.grade
GROUP BY
m.adno
Now, that being said, it looks like you are trying to compute a person's GPA (grade point average). If you can EDIT your existing post and provide samples of data (use spaces to align, not tabs) even if fictional names (but not necessary since things here are all ID values and otherwise not private). It would help to see the basis of what computations are going on. If not done correctly, you will get Cartesian results and skew your points due to multiple entries * multiple entries = oversized expected answer values.
Also, I Updated your dbfiddle. You inserted into adno instead of students the sample records, inserted into targets the grades, so those tables were blank. I had to correct that. Then changed grades to integer since doing math (via sum( g.points)). Best to use proper data types vs everything as character.
At least the queries are now working, but does not make sense if this is GPA calculations -- as far as I know and have done for U.S. college transcript purposes.

SQL Temporary Table or Select

I've got a problem with MySQL select statement.
I have a table with different Department and statuses, there are 4 statuses for every department, but for each month there are not always every single status but I would like to show it in the analytics graph that there is '0'.
I have a problem with select statement that it shows only existing statuses ( of course :D ).
Is it possible to create temporary table with all of the Departments , Statuses and amount of statuses as 0, then update it by values from other select?
Select statement and screen how it looks in perfect situation, and how it looks in bad situation :
SELECT utd.Departament,uts.statusDef as statusoforder,Count(uts.statusDef) as Ilosc_Statusow
FROM ur_tasks_details utd
INNER JOIN ur_tasks_status uts on utd.StatusOfOrder = uts.statusNR
WHERE month = 'Sierpien'
GROUP BY uts.statusDef,utd.Departament
Perfect scenario, now bad scenario :
I've tried with "union" statements but i don't know if there is a possibility to take only "the highest value" for every department.
example :
I've also heard about
With CTE tables, but I don't really get how to use it. Would love to get some tips on it!
Thanks for your help.
Use a cross join to generate the rows you want. Then use a left join and aggregation to bring in the data:
select d.Departament, uts.statusDef as statusoforder,
Count(uts.statusDef) as Ilosc_Statusow
from (select distinct utd.Departament
from ur_tasks_details utd
) d cross join
ur_tasks_status uts left join
ur_tasks_details utd
on utd.Departament = d.Departament and
utd.StatusOfOrder = uts.statusNR and
utd.month = 'Sierpien'
group by uts.statusDef, d.Departament;
The first subquery should be your source of all the departments.
I also suspect that month is in the details table, so that should be part of the on clause.

Remove duplicates from LEFT JOIN query

I am using the following JOIN statement:
SELECT *
FROM students2014
JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
ORDER BY students2014.LastName
to retrieve a list of students (students2014) and corresponding notes for each student stored in (notes2014).
Each student has multiple notes within the notes2014 table and each note has an ID that corresponds with each student's unique ID. The above statement is returning a the list of students but duplicating every student that has more than one note. I only want to display the latest note for each student (which is determined by the highest note ID).
Is this possible?
You need another join based on the MAX noteId you got from your select.
Something like this should do it (not tested; next time I'd recommed you to paste a link to http://sqlfiddle.com/ with your table structure and some sample data.
SELECT *
FROM students s
LEFT JOIN (
SELECT MAX(NoteId) max_id, NoteStudent
FROM notes
GROUP BY NoteStudent
) aux ON aux.NoteStudent = s.Student
LEFT JOIN notes n2 ON aux.max_id = n2.NoteId
If I may say so, the fact that a table is called students2014 is a big code smell. You'd be much better off with a students table and a year field, for many reasons (just a couple: you won't need to change your DB structure every year, querying across years is much, much easier, etc, etc). Perhaps you "inherited" this, but I thought I'd mention it.
GROUP the query by studentId and select the MAX of the noteId
Try :
SELECT
students2014.Student,
IFNULL(MAX(NoteId),0)
FROM students2014
LEFT JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
GROUP BY students2014.Student
ORDER BY students2014.LastName

Getting object if count is less then a number

I have 2 simple tables - Firm and Groups. I also have a table FirmGroupsLink for making connections between them (connection is one to many).
Table Firm has attributes - FirmID, FirmName, City
Table Groups has attributes - GroupID, GroupName
Table FirmGroupsLink has attributes - FrmID, GrpID
Now I want to make a query, which will return all those firms, that have less groups then #num, so I write
SELECT FirmID, FirmName, City
FROM (Firm INNER JOIN FirmGroupsLink ON Firm.FirmID =
FirmGroupsLink.FrmID)
HAVING COUNT(FrmID)<#num
But it doesn't run, I try this in Microsoft Access, but it eventually should work for Sybase. Please show me, what I'm doing wrong.
Thank you in advance.
In order to count properly, you need to provide by which group you are couting.
The having clause, and moreover the count can't work if you are not grouping.
Here you are counting by Firm. In fact, because you need to retrieve information about the Firm, you are grouping by FirmId, FirmName and City, so the query should look like this:
SELECT Firm.FirmID, Firm.FirmName, Firm.City
FROM Firm
LEFT OUTER JOIN FirmGroupsLink
ON Firm.FirmID = FirmGroupsLink.FrmID
GROUP BY Firm.FirmID, Firm.FirmName, Firm.City
HAVING COUNT(FrmID) < #num
Note that I replace the INNER JOIN by a LEFT OUTER JOIN, because you might want Firm which doesn't belongs to any groups too.

Using Joins, Group By and Sub Queries, Oh My!

I have a database with a table for details of ponies, another for details of contacts (owners and breeders), and then several other small tables for parameters (colours, counties, area codes, etc.). To give me a list of existing pony profiles, with their various details given, i use the following query:
SELECT *
FROM profiles
INNER JOIN prm_breedgender
ON profiles.ProfileGenderID = prm_breedgender.BreedGenderID
LEFT JOIN contacts
ON profiles.ProfileOwnerID = contacts.ContactID
INNER JOIN prm_breedcolour
ON profiles.ProfileAdultColourID = prm_breedcolour.BreedColourID
ORDER BY profiles.ProfileYearOfBirth ASC $limit
In the above sample, the 'profiles' table is my primary table (holding the Ponies info), 'contacts' is second in importance holding as it does the owner and breeder info. The lesser parameter tables can be identified by their prm_ prefix. The above query works fine, but i want to do more.
The first big issue is that I wish to GROUP the results by gender: Stallions, Mares, Geldings... I used << GROUP BY prm_breedgender.BreedGender >> or << GROUP BY ProfileBreedGenderID >> before my ORDER BY line, but than only returns two results from all my available profiles. I have read up on this, and apparantly need to reorganise my query to accomodate GROUP within my primary SELECT clause. How to do this however, gets me verrrrrrry confused. Step by step help here would be fantabulous.
As a further note on the above - You may have noticed the $limit var at the end of my query. This is for pagination, a feature I want to keep. I shouldn't think that's an issue however.
My secondary issue is more of an organisational one. You can see where I have pulled my Owner information from the contacts table here:
LEFT JOIN contacts
ON profiles.ProfileOwnerID = contacts.ContactID
I could add another stipulation:
AND profiles.ProfileBreederID = contacts.ContactID
with the intention of being able to list a pony's Owner and Breeder, where info on either is available. I'm not sure how to echo out this info though, as $row['ContactName'] could apply in either the capacity of owner OR breeder.
Is this a case of simply running two queries rather than one? Assigning a variable $foo to the first run of the query, then just run another separate query altogether and assign $bar to those results? Or is there a smarter way of doing it all in the one query (e.g. $row['ContactName']First-iteration, $row['ContactName']Second-iteration)? Advice here would be much appreciated.
And That's it! I've tried to be as clear as possible, and do really appreciate any help or advice at all you can give. Thanks in advance.
##########################################################################EDIT
My query currently stands as an amalgam of that provided by Cularis and Symcbean:
SELECT *
FROM (
profiles
INNER JOIN prm_breedgender
ON profiles.ProfileGenderID = prm_breedgender.BreedGenderID
LEFT JOIN contacts AS owners
ON profiles.ProfileOwnerID = owners.ContactID
INNER JOIN prm_breedcolour
ON profiles.ProfileAdultColourID = prm_breedcolour.BreedColourID
)
LEFT JOIN contacts AS breeders
ON profiles.ProfileBreederID = breeders.ContactID
ORDER BY prm_breedgender.BreedGender ASC, profiles.ProfileYearOfBirth ASC $limit
It works insofar as the results are being arranged as I had hoped: i.e. by age and gender. However, I cannot seem to get the alias' to work in relation to the contacts queries (breeder and owner). No error is displayed, and neither are any Owners or Breeders. Any further clarification on this would be hugely appreciated.
P.s. I dropped the alias given to the final LEFT JOIN by Symcbean's example, as I could not get the resulting ORDER BY statement to work for me - my own fault, I'm certain. Nonetheless, it works now although this may be what is causing the issue with the contacts query.
GROUP in SQL terms means using aggregate functions over a group of entries. I guess what you want is order by gender:
ORDER BY prm_breedgender.BreedGender ASC, profiles.ProfileYearOfBirth ASC $limit
This will output all Stallions, etc. next to each other.
To also get the breeders contact, you need to join with the contacts table again, using an alias:
LEFT JOIN contacts AS owners
ON profiles.ProfileOwnerID = owners.ContactID
LEFT JOIN contacts AS breeders
ON profiles.ProfileBreederID = breeders.ContactID
To further expand on what #cularis stated, group by is for aggregations down to the lowest level of "grouping" criteria. For example, and I'm not doing per your specific tables, but you'll see the impact. Say you want to show a page grouped by Breed. Then, a user picks a breed and they can see all entries of that breed.
PonyID ProfileGenderID Breeder
1 1 1
2 1 1
3 2 2
4 3 3
5 1 2
6 1 3
7 2 3
Assuming your Gender table is a lookup where ex:
BreedGenderID Description
1 Stallion
2 Mare
3 Geldings
SELECT *
FROM profiles
INNER JOIN prm_breedgender
ON profiles.ProfileGenderID = prm_breedgender.BreedGenderID
select
BG.Description,
count(*) as CountPerBreed
from
Profiles P
join prm_BreedGender BG
on p.ProfileGenderID = BG.BreedGenderID
group by
BG.Description
order by
BG.Description
would result in something like (counts are only coincidentally sequential)
Description CountPerBreed
Geldings 1
Mare 2
Stallion 4
change the "order by" clause to "order by CountsPerBreed Desc" (for descending) and you would get
Description CountPerBreed
Stallion 4
Mare 2
Geldings 1
To expand, if you wanted the aggregations to be broken down per breeder... It is a best practice to group by all things that are NOT AGGREGATES (such as MIN(), MAX(), AVG(), COUNT(), SUM(), etc)
select
BG.Description,
BR.BreaderName,
count(*) as CountPerBreed
from
Profiles P
join prm_BreedGender BG
on p.ProfileGenderID = BG.BreedGenderID
join Breeders BR
on p.Breeder = BR.BreaderID
group by
BG.Description,
BR.BreaderName
order by
BG.Description
would result in something like (counts are only coincidentally sequential)
Description BreaderName CountPerBreed
Geldings Bill 1
Mare John 1
Mare Sally 1
Stallion George 2
Stallion Tom 1
Stallion Wayne 1
As you can see, the more granularity you provide to the group by, the aggregation per that level is smaller.
Your join conditions otherwise are obviously understood from what you've provided. Hopefully this sample clearly provides what the querying process will do. Your group by does not have to be the same as the final order... its just common to see so someone looking at the results is not trying to guess how the data was organized.
In your sample, you had an order by the birth year. When doing an aggregation, you will never have the specific birth year of a single pony to so order by... UNLESS.... You included the YEAR( ProfileYearOfBirth ) as BirthYear as a column, and included that WITH your group by... Such as having 100 ponies 1 yr old and 37 at 2 yrs old of a given breed.
It would have been helpful if you'd provided details of the table structure and approximate numbers of rows. Also using '*' for a SELECT is a messy practice - and will cause you problems later (see below).
What version of MySQL is this?
apparantly need to reorganise my query to accomodate GROUP within my primary SELECT clause
Not necessarily since v4 (? IIRC), you could just wrap your query in a consolidating select (but move the limit into the outer select:
SELECT ProfileGenderID, COUNT(*)
FROM (
[your query without the LIMIT]
) ilv
GROUP BY ProfileGenderID
LIMIT $limit;
(note you can't ORDER BY ilv.ProfileYearOfBirth since it is not a selected column / group by expression)
How many records/columns do you have in prm_breedgender? Is it just Stallions, Mares, Geldings...? Do you think this list is likely to change? Do you have ponies with multiple genders? I suspect that this domain would be better represented by an enum in the profiles table.
with the intention of being able to list a pony's Owner and Breeder,
Using the code you suggest, you'll only get returned instances where the owner and breeder are the same! You need to add a second instance of the contacts table with a different alias to get them all, e.g.
SELECT *
FROM (
SELECT *
FROM profiles
INNER JOIN prm_breedgender
ON profiles.ProfileGenderID = prm_breedgender.BreedGenderID
LEFT JOIN contacts ownerContact
ON profiles.ProfileOwnerID = ownerContact.ContactID
INNER JOIN prm_breedcolour
ON profiles.ProfileAdultColourID = prm_breedcolour.BreedColourID
) ilv LEFT JOIN contacts breederContact
ON ilv.ProfileBreederID = breederContact.ContactID
ORDER BY ilv.ProfileYearOfBirth ASC $limit