Using two SELECT statements in SQL? - mysql

I have two tables, one is 'points' which contains ID and points. The other table is 'name' and contains ID, Forename, and Surname.
I'm trying to search for the total number of points someone with the forename Anne, and surname Brown, scored.
Would I have to do a join? If so, is this correct?
SELECT Name.Forename, Name.Surname
FROM Name
FULL OUTER JOIN Points
ON Name.ID=Points.ID
ORDER BY Name.Forename;
But then I also have to add the points, so would I have to use:
SELECT SUM (`points`) FROM Points
Then there is also the WHERE statement so that it only searches for the person with this name:
WHERE `Forename`="Anne" OR `Surname`="Brown";
So how does this all come together (based on the assumption that you do something like this)?

SELECT Name.ID, Forename, Surname, SUM(Points)
FROM Name
INNER JOIN Points ON Name.ID = Points.ID
/* Optional WHERE clause:
WHERE Name.ForeName = 'Anne' AND Name.Surname='Brown'
*/
GROUP BY Name.ID, Name.Forename, Name.Surname

So, first, your answer:
select sum(points) as Points
from
Points
inner join Name on Name.ID = Points.ID
where
Name.Forename ='Anne' and Name.SurName='Brown'
Secondly, a FULL JOIN is the wrong tool here since it pulls all rows from both sets, even those without matches (and MySQL does not support FULL OUTER JOIN anyway). If you only want to return rows that match your criteria (A & B), you must use an INNER JOIN.
Thirdly, here is the MySQL reference documentation on SQL statement syntax. Please consider reading up on it and familiarizing yourself at least with the basics like JOINs, aggregation (including GROUP BY and HAVING), WHERE clauses, UNIONs, some of the basic functions provided, and perhaps subqueries. Having a good base in those will get you 99% of the way through most MySQL queries.
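A minimal sketch of the INNER JOIN plus SUM query above, run from Python against an in-memory SQLite database so it can be tried without a MySQL server. The sample rows and the `total_points` helper are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

def total_points(conn, forename, surname):
    """Return the summed points for one person, or None if nobody matches."""
    row = conn.execute(
        """
        SELECT SUM(Points.points)
        FROM Points
        INNER JOIN Name ON Name.ID = Points.ID
        WHERE Name.Forename = ? AND Name.Surname = ?
        """,
        (forename, surname),
    ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical sample data; the real tables are not shown in the question.
    CREATE TABLE Name (ID INTEGER PRIMARY KEY, Forename TEXT, Surname TEXT);
    CREATE TABLE Points (ID INTEGER, points INTEGER);
    INSERT INTO Name VALUES (1, 'Anne', 'Brown'), (2, 'Anne', 'Smith'), (3, 'Bob', 'Brown');
    INSERT INTO Points VALUES (1, 10), (1, 15), (2, 7), (3, 4);
    """
)

anne_brown_total = total_points(conn, "Anne", "Brown")
print(anne_brown_total)  # 25
```

Note the AND in the WHERE clause: with OR, as in the question, the rows for Anne Smith and Bob Brown would be summed in as well.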

You can write it like this with a subquery.
SELECT Name.Forename, Name.Surname, Name.ID,
(SELECT SUM(`points`) FROM Points WHERE Points.ID = Name.ID) AS total_points
FROM Name ORDER BY Name.Forename;
However, I would like to point out that your linking of the tables appears to be incorrect. I cannot be completely sure without seeing the tables, but I imagine the condition should be points.userid = name.id

Related

SQL: Column Must Appear in the GROUP BY Clause Or Be Used in an Aggregate Function

I'm doing what I would have expected to be a fairly straightforward query on a modified version of the imdb database:
select primary_name, release_year, max(rating)
from titles natural join primary_names natural join title_ratings
group by year
having title_category = 'film' and year > 1989;
However, I'm immediately running into
"column must appear in the GROUP BY clause or be used in an aggregate function."
I've tried researching this but have gotten confusing information; some examples I've found for this problem look structurally identical to mine, where others state that you must group every single selected parameter, which defeats the whole purpose of a group as I'm only wanting to select the maximum entry per year.
What am I doing wrong with this query?
Expected result: table with 3 columns which displays the highest-rated movie of each year.
If you want the maximum entry per year, then you should do something like this:
select r.*
from ratings r
where r.rating = (select max(r2.rating) from ratings r2 where r2.year = r.year) and
r.year > 1989;
In other words, group by is the wrong approach to writing this query.
I would also strongly encourage you to forget that natural join exists at all. It is an abomination. It uses the names of common columns for joins. It does not even use properly declared foreign key relationships. In addition, you cannot see what columns are used for the join.
While I am at it, another piece of advice: qualify all column names in queries that have more than one table reference. That is, include the table alias in the column name.
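Here is a small sketch of the correlated-subquery approach, run through Python's sqlite3 module. The single `ratings` table with `title`, `year` and `rating` columns is an assumption (the real schema is not shown in the question), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical single-table layout standing in for the imdb tables.
    CREATE TABLE ratings (title TEXT, year INTEGER, rating REAL);
    INSERT INTO ratings VALUES
        ('Old Film', 1985, 9.0),
        ('Film A', 1990, 7.5),
        ('Film B', 1990, 8.2),
        ('Film C', 1991, 6.9),
        ('Film D', 1991, 6.1);
    """
)

# Keep a row only if its rating equals the best rating of its own year.
top_per_year = conn.execute(
    """
    SELECT r.title, r.year, r.rating
    FROM ratings r
    WHERE r.rating = (SELECT MAX(r2.rating) FROM ratings r2 WHERE r2.year = r.year)
      AND r.year > 1989
    ORDER BY r.year
    """
).fetchall()
print(top_per_year)
```

If two films tie for the best rating in a year, both rows come back, which a plain GROUP BY could not express either.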
If you want to display all the columns, you can use a window function like:
select primary_name, year, max(rating) over (partition by year) as rating
from titles natural join primary_names natural join ratings
where title_type = 'film' and year > 1989;
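A window function on its own labels every row with its year's best rating; to keep only the top film you still need to filter on it. The sketch below shows one way, assuming a database with window-function support (SQLite 3.25+ here, MySQL 8.0+). The flattened `films` table and its rows are hypothetical stand-ins for the joined tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical flattened table standing in for the joined imdb tables.
    CREATE TABLE films (primary_name TEXT, year INTEGER, rating REAL);
    INSERT INTO films VALUES
        ('Film A', 1990, 7.5),
        ('Film B', 1990, 8.2),
        ('Film C', 1991, 6.9);
    """
)

# The inner query attaches each year's maximum to every row;
# the outer query keeps only the rows that reach that maximum.
best_by_window = conn.execute(
    """
    SELECT primary_name, year, rating
    FROM (
        SELECT primary_name, year, rating,
               MAX(rating) OVER (PARTITION BY year) AS year_max
        FROM films
        WHERE year > 1989
    ) AS ranked
    WHERE rating = year_max
    ORDER BY year
    """
).fetchall()
print(best_by_window)
```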

Is there a query which selects columns from only one table which would give incorrect results if grouped by only the primary key of that table?

For example, a bookpub database contains the following tables (pseudocode):
book (key: isbn)
bookauthor (key:author_id, isbn)
author (key: author_id)
The following query returns the number of books by each author:
select lastname, firstname, count(isbn)
from author
join bookauthor using (author_id)
group by lastname, firstname;
However, the following query also produces identical results in MySQL without complaint:
select lastname, firstname, count(isbn)
from author
join bookauthor using (author_id)
group by author_id;
So why shouldn't author_id be used instead of lastname, firstname?
I might add that the formal SQL spec contains the following:
All non-aggregate groups in a SELECT expression list or HAVING expression list must be included in the GROUP BY clause.
Can somebody please interpret this? What is a "non-aggregate group"? Why not just say "columns"? Furthermore, what is an "expression list"? Does an expression in this case always evaluate to a column?
No SQL implementation is 100% true to the ANSI definition. Some things are missing, some things are added, some things are just different.
In MySQL's case, it was chosen to not enforce the restriction you mention:
All non-aggregate groups in a SELECT expression list or HAVING expression list must be included in the GROUP BY clause.
This allows the GROUP BY primary_key syntax that you have noticed, instead of the clunky (and actually slightly more costly) GROUP BY property1, property2, property3, etc. It's clean and elegant.
There are downsides, however; misuse and misunderstanding are rife among web developers because of MySQL, and the flexibility allows bugs to slip through undetected. I recommend avoiding it in most cases as the performance gains are minimal and the potential for bugs can be huge.
An example of a bug that slips through could be:
SELECT
person.name,
address.city
FROM
person
INNER JOIN
address
ON address.person_id = person.id
GROUP BY
person.id
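MySQL will pretty much always allow that code to execute (it is rejected only when the ONLY_FULL_GROUP_BY SQL mode is enabled, which is the default from MySQL 5.7.5 on), even if the address table can have multiple entries per person (I've lived at more than one address).
The code could possibly need to be as follows, but MySQL will not enforce this unless ONLY_FULL_GROUP_BY is enabled: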
SELECT
person.name,
address.move_in_date,
address.city
FROM
person
INNER JOIN
address
ON address.person_id = person.id
GROUP BY
person.id,
address.id
The more joins involved, the more chance the GROUP BY needs to include multiple primary keys, or other fields.
The behavior you get is that MySQL arbitrarily chooses what values to return when the code is ambiguous. It is explicitly non-deterministic. The following code could give the city from one address and the city's population from another address :-/
SELECT
person.name,
address.move_in_date,
address.city,
city.population
FROM
person
INNER JOIN
address
ON address.person_id = person.id
INNER JOIN
city
ON address.city_id = city.id
GROUP BY
person.id
People then try to abuse this with "tricks" like the following...
SELECT
person.name,
address.move_in_date,
address.city,
city.population
FROM
person
INNER JOIN
address
ON address.person_id = person.id
INNER JOIN
city
ON address.city_id = city.id
GROUP BY
person.id
ORDER BY
person.id,
city.population DESC
This happens to cause the MySQL engine to choose the city with the highest population. Useful for finding the most populous city each person has lived in? Well, it's not actually guaranteed to work. It's still arbitrary; if the tables are being written to, or the database is in a distributed environment, or the MySQL code changes, etc, the behavior could change.
But people do it anyway. Because "well, it's always worked for me so far!"...
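For comparison, here is a sketch of a query that states the "most populous city each person has lived in" requirement explicitly instead of relying on the GROUP BY plus ORDER BY trick. It uses a correlated subquery and runs on SQLite via Python. The table layout follows the queries above, but the exact columns and all rows, including the population figures, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema and data modelled on the person/address/city example.
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city (id INTEGER PRIMARY KEY, population INTEGER);
    CREATE TABLE address (id INTEGER PRIMARY KEY, person_id INTEGER,
                          city TEXT, city_id INTEGER);
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO city VALUES (10, 800000), (11, 9000000), (12, 200000);
    INSERT INTO address VALUES
        (100, 1, 'Leeds', 10),
        (101, 1, 'London', 11),
        (102, 2, 'York', 12);
    """
)

# For each person, keep only the address whose city population equals
# the largest population among that person's own addresses.
largest_city = conn.execute(
    """
    SELECT p.name, a.city, c.population
    FROM person p
    INNER JOIN address a ON a.person_id = p.id
    INNER JOIN city c ON a.city_id = c.id
    WHERE c.population = (
        SELECT MAX(c2.population)
        FROM address a2
        INNER JOIN city c2 ON a2.city_id = c2.id
        WHERE a2.person_id = p.id
    )
    ORDER BY p.id
    """
).fetchall()
print(largest_city)
```

Because the city and its population always come from the same address row, the two values cannot be mixed up, whatever order the engine reads the rows in.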
In the GROUP BY clause you list the fields and expressions whose values will partition your result set. For those groups you can calculate aggregate functions like count, sum, etc.
MySQL lets you select non-aggregate expressions or fields that are not present in the GROUP BY clause, but this is non-standard SQL. The result will be non-deterministic if those fields have more than one value for a group.
If you group by the primary key, the result will be deterministic because there's only one row for each key.

MySQL: Count then sort by the count total

I know other posts talk about this, but I haven't been able to apply anything to this situation.
This is what I have so far.
SELECT *
FROM ccParts, ccChild, ccFamily
WHERE parGC = '26' AND
parChild = chiId AND
chiFamily = famId
ORDER BY famName, chiName
What I need to do is see the total number of ccParts with the same ccFamily in the results. Then, sort by the total.
It looks like this is close to what you want:
SELECT f.famId, f.famName, pc.parCount
FROM (
SELECT c.chiFamily AS famId, count(*) AS parCount
FROM
ccParts p
JOIN ccChild c ON p.parChild = c.chiId
WHERE p.parGC ='26'
GROUP BY c.chiFamily
) pc
JOIN ccFamily f ON f.famId = pc.famId
ORDER BY pc.parCount
The inline view (between the parentheses) is the headliner: it does your grouping and counting. Note that you do not need to join table ccFamily there to group by family, as table ccChild already carries the family information. If you don't need the family name (i.e. if its ID were sufficient), then you can stick with the inline view alone, and there ORDER BY count(*). The outer query just associates family name with the results.
Additionally, MySQL provides a non-standard mechanism by which you could combine the outer query with the inline view, but in this case it doesn't help much with either clarity or concision. The query I provided should be accepted by any SQL implementation, and it's to your advantage to learn such syntax and approaches first.
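A runnable sketch of the inline-view query, using Python's sqlite3 module with made-up rows. The column types and the `parId` column are assumptions; the other names are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical data; parId is an assumed surrogate key.
    CREATE TABLE ccFamily (famId INTEGER PRIMARY KEY, famName TEXT);
    CREATE TABLE ccChild (chiId INTEGER PRIMARY KEY, chiFamily INTEGER, chiName TEXT);
    CREATE TABLE ccParts (parId INTEGER PRIMARY KEY, parGC TEXT, parChild INTEGER);
    INSERT INTO ccFamily VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO ccChild VALUES (10, 1, 'a1'), (11, 1, 'a2'), (20, 2, 'b1');
    INSERT INTO ccParts VALUES
        (1, '26', 10), (2, '26', 11), (3, '26', 11),
        (4, '26', 20), (5, '99', 20);
    """
)

# The inline view counts parts per family; the outer query adds the family name.
parts_per_family = conn.execute(
    """
    SELECT f.famId, f.famName, pc.parCount
    FROM (
        SELECT c.chiFamily AS famId, COUNT(*) AS parCount
        FROM ccParts p
        JOIN ccChild c ON p.parChild = c.chiId
        WHERE p.parGC = '26'
        GROUP BY c.chiFamily
    ) pc
    JOIN ccFamily f ON f.famId = pc.famId
    ORDER BY pc.parCount
    """
).fetchall()
print(parts_per_family)
```

Append DESC to the ORDER BY if the largest families should come first.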
In the SELECT, add something like count(ccParts) as count then ORDER BY count instead? Not sure about the structure of your tables so you might need to improvise.

Remove duplicates from LEFT JOIN query

I am using the following JOIN statement:
SELECT *
FROM students2014
JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
ORDER BY students2014.LastName
to retrieve a list of students (students2014) and corresponding notes for each student stored in (notes2014).
Each student has multiple notes within the notes2014 table and each note has an ID that corresponds with each student's unique ID. The above statement is returning the list of students but duplicating every student that has more than one note. I only want to display the latest note for each student (which is determined by the highest note ID).
Is this possible?
You need another join based on the MAX noteId you got from your select.
Something like this should do it (not tested; next time I'd recommend pasting a link to http://sqlfiddle.com/ with your table structure and some sample data).
SELECT *
FROM students s
LEFT JOIN (
SELECT MAX(NoteId) max_id, NoteStudent
FROM notes
GROUP BY NoteStudent
) aux ON aux.NoteStudent = s.Student
LEFT JOIN notes n2 ON aux.max_id = n2.NoteId
If I may say so, the fact that a table is called students2014 is a big code smell. You'd be much better off with a students table and a year field, for many reasons (just a couple: you won't need to change your DB structure every year, querying across years is much, much easier, etc, etc). Perhaps you "inherited" this, but I thought I'd mention it.
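Since the query above is marked as untested, here is a small sketch that exercises the same double LEFT JOIN in SQLite from Python. The `NoteText` column and all rows are hypothetical; the table names follow the answer's shortened `students` and `notes`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical data; NoteText is an assumed column.
    CREATE TABLE students (Student INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE notes (NoteId INTEGER PRIMARY KEY, NoteStudent INTEGER, NoteText TEXT);
    INSERT INTO students VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
    INSERT INTO notes VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'only');
    """
)

# aux finds the highest NoteId per student; n2 fetches that one note.
# LEFT JOINs keep students who have no notes at all.
latest_notes = conn.execute(
    """
    SELECT s.Student, s.LastName, n2.NoteId, n2.NoteText
    FROM students s
    LEFT JOIN (
        SELECT MAX(NoteId) AS max_id, NoteStudent
        FROM notes
        GROUP BY NoteStudent
    ) aux ON aux.NoteStudent = s.Student
    LEFT JOIN notes n2 ON aux.max_id = n2.NoteId
    ORDER BY s.LastName
    """
).fetchall()
print(latest_notes)
```

Each student appears exactly once, and a student without notes gets NULL (None in Python) in the note columns.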
GROUP the query by studentId and select the MAX of the noteId
Try :
SELECT
students2014.Student,
IFNULL(MAX(NoteId),0)
FROM students2014
LEFT JOIN notes2014 ON (students2014.Student = notes2014.NoteStudent)
WHERE students2014.Consultant='$Consultant'
GROUP BY students2014.Student
ORDER BY students2014.LastName

Getting object if count is less than a number

I have 2 simple tables - Firm and Groups. I also have a table FirmGroupsLink for making connections between them (connection is one to many).
Table Firm has attributes - FirmID, FirmName, City
Table Groups has attributes - GroupID, GroupName
Table FirmGroupsLink has attributes - FrmID, GrpID
Now I want to make a query that returns all the firms that have fewer groups than #num, so I wrote
SELECT FirmID, FirmName, City
FROM (Firm INNER JOIN FirmGroupsLink ON Firm.FirmID =
FirmGroupsLink.FrmID)
HAVING COUNT(FrmID)<#num
But it doesn't run. I'm trying this in Microsoft Access, but it should eventually work for Sybase. Please show me what I'm doing wrong.
Thank you in advance.
In order to count properly, you need to specify which group you are counting by.
The HAVING clause, and the COUNT in particular, can't work if you are not grouping.
Here you are counting by Firm. In fact, because you need to retrieve information about the Firm, you are grouping by FirmID, FirmName and City, so the query should look like this:
SELECT Firm.FirmID, Firm.FirmName, Firm.City
FROM Firm
LEFT OUTER JOIN FirmGroupsLink
ON Firm.FirmID = FirmGroupsLink.FrmID
GROUP BY Firm.FirmID, Firm.FirmName, Firm.City
HAVING COUNT(FrmID) < #num
Note that I replaced the INNER JOIN with a LEFT OUTER JOIN, because you might also want firms that don't belong to any group.
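A quick sketch of the grouped query, run on SQLite from Python with the threshold passed as a bound parameter in place of #num. The `firms_with_fewer_groups` helper and the sample rows are made up for illustration, and Access or Sybase syntax may differ in details.

```python
import sqlite3

def firms_with_fewer_groups(conn, num):
    """Return (FirmID, FirmName, City, group_count) for firms with fewer than num groups."""
    return conn.execute(
        """
        SELECT Firm.FirmID, Firm.FirmName, Firm.City,
               COUNT(FirmGroupsLink.FrmID) AS group_count
        FROM Firm
        LEFT OUTER JOIN FirmGroupsLink ON Firm.FirmID = FirmGroupsLink.FrmID
        GROUP BY Firm.FirmID, Firm.FirmName, Firm.City
        HAVING COUNT(FirmGroupsLink.FrmID) < ?
        ORDER BY Firm.FirmID
        """,
        (num,),
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical sample data.
    CREATE TABLE Firm (FirmID INTEGER PRIMARY KEY, FirmName TEXT, City TEXT);
    CREATE TABLE FirmGroupsLink (FrmID INTEGER, GrpID INTEGER);
    INSERT INTO Firm VALUES (1, 'Acme', 'Oslo'), (2, 'Birch', 'Riga'), (3, 'Cedar', 'Lima');
    INSERT INTO FirmGroupsLink VALUES (1, 1), (1, 2), (1, 3), (2, 1);
    """
)

fewer_than_two = firms_with_fewer_groups(conn, 2)
print(fewer_than_two)
```

COUNT(FrmID) skips the NULL produced by the outer join, so Cedar is reported with a count of 0; with an INNER JOIN it would not appear at all.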