I'm trying to LEFT JOIN on the same table multiple times, to get all the values of the specific topics. It works like I thought it would, see: http://sqlfiddle.com/#!9/9cda67/4
However, using the above fiddle, the database returns a single row for each different course. I'd like to group them, using GROUP BY PersonID, but then it would only take the first value (a 6 for Math) and a (null) value for all the other columns. See: http://sqlfiddle.com/#!9/9cda67/5
What do I need to change so that I get a single row per Person, with all the grades filled into their respective columns (when available)?
MySQL allows you to include columns in a SELECT that are not in the GROUP BY. This actually violates the ANSI standard and is not supported by most other databases (although in some cases the ANSI standard does allow it). For those columns, each output row gets a value taken from an indeterminate row of its group.
The solution is aggregation functions:
SELECT p.id AS PersonID, p.name AS PersonName,
max(pc1.grade) AS Math,
max(pc2.grade) AS Chemistry,
max(pc3.grade) AS Physics
FROM Person p LEFT JOIN
Person_Course pc
on p.id = pc.user_id LEFT JOIN
Course c on c.id = pc.course_id LEFT JOIN
Person_Course pc1
on pc1.id = pc.id AND pc1.course_id = 1 LEFT JOIN
Person_Course pc2
on pc2.id = pc.id AND pc2.course_id = 2 LEFT JOIN
Person_Course pc3
on pc3.id = pc.id AND pc3.course_id = 3
GROUP BY PersonID;
You might want group_concat() if people could take the same course multiple times. Also, don't use single quotes for column names. Only use them for string and date constants.
Hardwiring the course ids into the code seems like a bad idea. I would write this more simply using conditional aggregation:
SELECT p.id AS PersonID, p.name AS PersonName,
max(case when c.name = 'Math' then pc.grade end) AS Math,
max(case when c.name = 'Chemistry' then pc.grade end) AS Chemistry,
max(case when c.name = 'Physics' then pc.grade end) AS Physics
FROM Person p LEFT JOIN
Person_Course pc
on p.id = pc.user_id LEFT JOIN
Course c
on c.id = pc.course_id
GROUP BY PersonID;
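To see the conditional-aggregation query produce one row per person, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows (Alice, Bob and their grades) are made up for illustration, the column layout is inferred from the queries above, and SQLite stands in for MySQL, so treat it as an illustration rather than the fiddle's exact data.

```python
import sqlite3

# In-memory database with a hypothetical version of the three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Course (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Person_Course (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        course_id INTEGER,
        grade INTEGER
    );
    INSERT INTO Person VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Course VALUES (1, 'Math'), (2, 'Chemistry'), (3, 'Physics');
    INSERT INTO Person_Course (user_id, course_id, grade) VALUES
        (1, 1, 6), (1, 2, 8), (1, 3, 7), (2, 1, 9);
""")

# One row per person; each CASE keeps only the grade of one course,
# and MAX collapses the group to that single non-NULL value.
pivot_rows = conn.execute("""
    SELECT p.id AS PersonID, p.name AS PersonName,
           MAX(CASE WHEN c.name = 'Math' THEN pc.grade END) AS Math,
           MAX(CASE WHEN c.name = 'Chemistry' THEN pc.grade END) AS Chemistry,
           MAX(CASE WHEN c.name = 'Physics' THEN pc.grade END) AS Physics
    FROM Person p
    LEFT JOIN Person_Course pc ON p.id = pc.user_id
    LEFT JOIN Course c ON c.id = pc.course_id
    GROUP BY p.id, p.name
    ORDER BY p.id
""").fetchall()

for row in pivot_rows:
    print(row)
```

Bob has no Chemistry or Physics rows, so those columns come back as NULL (None in Python) instead of being dropped.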
For this example I have three simple tables (Page, Subs and Followers):
For each page I need to know how many subs and followers it has.
My result is supposed to look like this:
I tried using the COUNT function in combination with a GROUP BY like this:
SELECT p.ID, COUNT(s.UID) AS SubCount, COUNT(f.UID) AS FollowCount
FROM page p, subs s, followers f
WHERE p.ID = s.ID AND p.ID = f.ID AND s.ID = f.ID
GROUP BY p.ID
Obviously this statement returns a wrong result.
My other attempt was using two different SELECT statements and then combining the two subresults into one table.
SELECT p.ID, COUNT(s.UID) AS SubCount FROM page p, subs s WHERE p.ID = s.ID GROUP BY p.ID
and
SELECT p.ID, COUNT(f.UID) AS FollowCount FROM page p, followers f WHERE p.ID = f.ID GROUP BY p.ID
I feel like there has to be a simpler / shorter way of doing it, but I'm too inexperienced to find it.
Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax.
Next, learn what COUNT() does. It counts the number of non-NULL values. So, your expressions are going to return the same value -- because f.UID and s.UID are never NULL (due to the JOIN conditions).
The issue is that the different dimensions are multiplying the amounts. A simple fix is to use COUNT(DISTINCT):
SELECT p.ID, COUNT(DISTINCT s.UID) AS SubCount, COUNT(DISTINCT f.UID) AS FollowCount
FROM page p JOIN
subs s
ON p.ID = s.ID JOIN
followers f
ON s.ID = f.ID
GROUP BY p.ID;
The inner joins are equivalent to the original query. You probably want left joins so you can get counts of zero:
SELECT p.ID, COUNT(DISTINCT s.UID) AS SubCount, COUNT(DISTINCT f.UID) AS FollowCount
FROM page p LEFT JOIN
subs s
ON p.ID = s.ID LEFT JOIN
followers f
ON p.ID = f.ID
GROUP BY p.ID;
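The multiplication effect is easier to see with numbers. The sketch below is a hypothetical reproduction in SQLite via Python (the sample rows are invented, and the ID / UID columns are assumed from the queries above): page 1 has 2 subs and 3 followers, so the join produces 2 x 3 = 6 rows for it, and a plain COUNT reports 6 for both columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (ID INTEGER PRIMARY KEY);
    CREATE TABLE subs (ID INTEGER, UID INTEGER);
    CREATE TABLE followers (ID INTEGER, UID INTEGER);
    INSERT INTO page VALUES (1), (2), (3);
    INSERT INTO subs VALUES (1, 10), (1, 11), (2, 10);
    INSERT INTO followers VALUES (1, 20), (1, 21), (1, 22);
""")

# Same query twice: once with plain COUNT, once with COUNT(DISTINCT ...).
QUERY = """
    SELECT p.ID, COUNT({d} s.UID) AS SubCount, COUNT({d} f.UID) AS FollowCount
    FROM page p
    LEFT JOIN subs s ON p.ID = s.ID
    LEFT JOIN followers f ON p.ID = f.ID
    GROUP BY p.ID
    ORDER BY p.ID
"""
naive_counts = conn.execute(QUERY.format(d="")).fetchall()
distinct_counts = conn.execute(QUERY.format(d="DISTINCT")).fetchall()

print("COUNT:         ", naive_counts)
print("COUNT DISTINCT:", distinct_counts)
```

Page 3 has neither subs nor followers and still appears with zeros, which is what the LEFT JOINs buy you.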
A scalar subquery should work in this case.
SELECT p.ID,
(SELECT COUNT(s1.UID)
FROM subs s1
WHERE s1.ID = p.ID) AS cnt_subs,
(SELECT COUNT(f1.UID)
FROM followers f1
WHERE f1.ID = p.ID) AS cnt_fol
FROM page p
GROUP BY p.ID;
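As a rough check of the scalar-subquery approach, the following sketch runs it in SQLite from Python. The table layout (ID and UID columns, as used in the question's queries) and the sample rows are assumptions for illustration. Because each subquery counts its own table independently, there is no cross-multiplication to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (ID INTEGER PRIMARY KEY);
    CREATE TABLE subs (ID INTEGER, UID INTEGER);
    CREATE TABLE followers (ID INTEGER, UID INTEGER);
    INSERT INTO page VALUES (1), (2), (3);
    INSERT INTO subs VALUES (1, 10), (1, 11), (2, 10);
    INSERT INTO followers VALUES (1, 20), (1, 21), (1, 22);
""")

# Each correlated subquery is evaluated once per page row.
subquery_counts = conn.execute("""
    SELECT p.ID,
           (SELECT COUNT(s1.UID) FROM subs s1 WHERE s1.ID = p.ID) AS cnt_subs,
           (SELECT COUNT(f1.UID) FROM followers f1 WHERE f1.ID = p.ID) AS cnt_fol
    FROM page p
    ORDER BY p.ID
""").fetchall()

print(subquery_counts)
```

Note that this counts rows rather than distinct UIDs, so it only matches the COUNT(DISTINCT) version when a UID cannot appear twice for the same page.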
We maintain a history of Content. For each content we want the most recently updated entry, but with the create time and update time taken from the first entry of that content. The query contains multiple selects and WHERE clauses with many left joins. The dataset is very large, so the query takes more than 60 seconds to execute. Please help us improve it. Query:
select * from (select * from (
SELECT c.*, initCMS.initcreatetime, initCMS.initupdatetime, user.name as partnerName, r.name as rightsName, r1.name as copyRightsName, a.name as agelimitName, ct.type as contenttypename, cat.name as categoryname, lang.name as languagename FROM ContentCMS c
left join ContentCategoryType ct on ct.id = c.contentType
left join User user on c.contentPartnerId = user.id
left join Category cat on cat.id = c.categoryId
left join Language lang on lang.id = c.languageCode
left join CopyRights r on c.rights = r.id
left join CopyRights r1 on c.copyrights = r1.id
left join Age a on c.ageLimit = a.id
left outer join (
SELECT contentId, createTime as initcreatetime, updateTime as initupdatetime from ContentCMS cms where cms.deleted='0'
) as initCMS on initCMS.contentId = c.contentId WHERE c.deleted='0' order by c.id DESC
) as temp group by contentId) as c where c.editedBy='0'
Any help would be highly appreciated. Thank you.
Just a partial evaluation and some suggestions, because your query does not seem to be properly formed.
This left join seems useless:
FROM ContentCMS c
......
left join (
SELECT contentId
, createTime as initcreatetime
, updateTime as initupdatetime
from ContentCMS cms
where cms.deleted='0'
) as initCMS on initCMS.contentId = c.contentId
because it joins ContentCMS to the same table.
The ORDER BY (without LIMIT) in a subquery used in a join is useless, because joining ordered or unordered values produces the same result.
The GROUP BY contentId is strange because there are no aggregation functions, and using GROUP BY without aggregation functions is not standard SQL
and, in the most recent versions of MySQL, is not allowed by default. If you need distinct values or just one row per contentId, you should use DISTINCT or retrieve the values in a deterministic way (GROUP BY without aggregation functions returns arbitrary values for the non-aggregated columns).
As a partial evaluation, your query should be refactored as:
SELECT c.*
-- the self-join is gone, so the init* columns come straight from c
, c.createTime as initcreatetime
, c.updateTime as initupdatetime
, user.name as partnerName
, r.name as rightsName
, r1.name as copyRightsName
, a.name as agelimitName
, ct.type as contenttypename
, cat.name as categoryname
, lang.name as languagename
FROM ContentCMS c
left join ContentCategoryType ct on ct.id = c.contentType
left join User user on c.contentPartnerId = user.id
left join Category cat on cat.id = c.categoryId
left join Language lang on lang.id = c.languageCode
left join CopyRights r on c.rights = r.id
left join CopyRights r1 on c.copyrights = r1.id
-- needed because a.name is selected above
left join Age a on c.ageLimit = a.id
WHERE c.deleted='0'
For the rest, you should explicitly select the columns you actually need and add proper aggregation functions for the others.
Also, the nested subqueries used only to reduce the rows improperly do not help performance. You should also re-evaluate your data modelling and design.
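One deterministic way to get "latest entry per contentId, with the times of the first entry" is to compute the first and latest row ids per contentId in a derived table and join back to the table twice. This is a different technique from the refactoring above, sketched here in SQLite via Python under stated assumptions: id grows with each revision, the title column and the sample rows are hypothetical, and the lookup-table joins are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ContentCMS (
        id INTEGER PRIMARY KEY,
        contentId INTEGER,
        title TEXT,          -- hypothetical payload column
        createTime TEXT,
        updateTime TEXT,
        deleted TEXT,
        editedBy TEXT
    );
    INSERT INTO ContentCMS VALUES
        (1, 100, 'draft',   '2020-01-01', '2020-01-02', '0', '0'),
        (2, 100, 'final',   '2020-02-01', '2020-02-03', '0', '0'),
        (3, 200, 'only',    '2020-03-01', '2020-03-01', '0', '0'),
        (4, 200, 'removed', '2020-04-01', '2020-04-01', '1', '0');
""")

# span: one row per contentId holding the ids of its first and latest
# non-deleted revisions; the two joins fetch those rows explicitly.
latest_with_init_times = conn.execute("""
    SELECT latest_rev.id, latest_rev.contentId, latest_rev.title,
           first_rev.createTime AS initcreatetime,
           first_rev.updateTime AS initupdatetime
    FROM (
        SELECT contentId, MIN(id) AS first_id, MAX(id) AS latest_id
        FROM ContentCMS
        WHERE deleted = '0'
        GROUP BY contentId
    ) AS span
    JOIN ContentCMS AS latest_rev ON latest_rev.id = span.latest_id
    JOIN ContentCMS AS first_rev ON first_rev.id = span.first_id
    WHERE latest_rev.editedBy = '0'
    ORDER BY latest_rev.contentId
""").fetchall()

for row in latest_with_init_times:
    print(row)
```

Every selected column now comes from a specific, named row, so the result does not depend on how the engine picks values for a bare GROUP BY. Whether it is faster on the real dataset depends on indexes (for example on contentId and deleted), which this sketch does not measure.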
I have a very old project that uses MySQL, and I am considering converting part of it to MS Access. I'm running into problems with some of the more complex queries, and wondered if there is a reference that details the differences between Access's SQL and MySQL's. For example, I have the following query:
select P.PersonID, P.FirstName, P.MiddleName, P.LastName,
PR.LastName as MarriedName, P.Born, LocID, PlaceName,
City, County, State, Country
from persons P
left join relatives R on (R.Person = P.PersonID and TookName)
left join persons PR on (PR.PersonID = R.Relative)
left join locations L on (L.Person = P.PersonID and L.FromDate = P.Born)
where not P.Deleted
and (P.FirstName in ('Alan','Albert','Alfred','Allan','Allen','Alvin','Al')
or P.MiddleName in ('Alan','Albert','Alfred','Allan','Allen','Alvin','Al')
or P.Nickname in ('Alan','Albert','Alfred','Allan','Allen','Alvin','Al'))
and (P.LastName = 'Little' or PR.LastName = 'Little')
group by P.PersonID
order by P.Born desc
In Access, I can get as far as the first join:
select P.PersonID, P.FirstName, P.MiddleName, P.LastName,
PR.LastName as MarriedName, P.Born
from persons P
left join relatives R on (R.Person = P.PersonID and TookName)
where not P.Deleted
and P.FirstName in ('Alan','Albert','Alfred','Allan','Allen','Alvin','Al')
If I add the second join, it says: Syntax error (missing operator) in query expression '(R.Person = P.PersonID and TookName) left join persons PR on (PR.PersonID = R.Relative.'
Clicking the Help button very helpfully informs me, The expression you typed is not valid for the reason indicated in the message. Gee thanks!
But I have some other rather complex queries, so beyond solving the problem with this one, I'm looking for something that will explain the differences in general.
EDIT:
So, I changed the query according to the answer linked to:
select P.PersonID, P.FirstName, P.MiddleName, P.LastName,
PR.LastName as MarriedName, P.Born
from (persons P
left join relatives R on R.Person = P.PersonID and TookName=true)
left join persons PR on PR.PersonID = R.Relative
where not P.Deleted
and P.FirstName in ('Alan','Albert','Alfred','Allan','Allen','Alvin','Al')
It tells me JOIN expression not supported, and highlights TookName=true. I also tried it as TookName=1 and just TookName. I tried removing the second JOIN, with the first in parentheses, and it still just tells me JOIN expression not supported.
The Access SQL parser is a huge fan of parentheses.
They are needed in all JOINs with more than two tables:
FROM (a JOIN b ON a.id = b.id) JOIN c on b.id = c.id
so that only two tables / subqueries are joined in one set of parentheses.
And (as I learned today) they are needed around the ON clause if you want to use literal values in it:
FROM (a JOIN b ON (a.id = b.id AND b.foo = True)) JOIN c on b.id = c.id
An extended description is here. Link was found here: https://stackoverflow.com/a/23632282/3820271
The part left join relatives R on (R.Person = P.PersonID and TookName)
does not seem complete (MySQL accepts a bare column as a boolean, but this is not a portable SQL expression):
TookName is not compared with anything.
You may need something like:
left join relatives R on R.Person = P.PersonID and R.TookName = P.TookName
or
left join relatives R on R.Person = P.PersonID and R.TookName = 'FIXED VALUE'
or
left join relatives R on R.Person = P.PersonID and R.TookName is not null
For cross-platform code, on a boolean column you should use
left join relatives R on R.Person = P.PersonID and R.TookName = 1
or, better, WHERE your_column <> 0
So, I am creating a music database.
I am using three tables (files, categories, categories_assignments).
I want to be able to select a file that is in multiple categories (e.g. a song that is both pop and rock).
I have already made the OR variant (included below for reference):
SELECT DISTINCT `files`.`filename` FROM `files`
INNER JOIN `categories_assignments`
ON `files`.`id` = `categories_assignments`.`fileid`
INNER JOIN `categories`
ON `categories_assignments`.`catid` = `categories`.`id`
WHERE `categories`.`name` = 'rock' OR `categories`.`name`='pop';
This is a "set-within-sets" problem -- you are looking for songs that have a set of categories. I like to solve this using group by and having:
SELECT f.filename
FROM files f JOIN
categories_assignments ca
ON f.id = ca.fileid JOIN
categories c
ON ca.catid = c.id
WHERE c.name IN ('rock', 'pop')
GROUP BY f.filename
HAVING COUNT(*) = 2;
Notes:
Table aliases make the query easier to write and to read.
I don't see a need to put backticks around every identifier. That just makes the query harder to read.
You should use IN instead of multiple OR comparisons.
If you are learning SQL, then SELECT DISTINCT is almost never useful. Learn to use GROUP BY first.
Group by the file and take only those groups having both categories:
SELECT f.filename
FROM files f
INNER JOIN categories_assignments ca ON f.id = ca.fileid
INNER JOIN categories c ON ca.catid = c.id
WHERE c.name in ('rock', 'pop')
GROUP BY f.filename
HAVING count(c.name) = 2
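Here is a small, hypothetical run of the GROUP BY / HAVING approach in SQLite via Python, with invented file names and category assignments. It uses COUNT(DISTINCT c.name) instead of COUNT(*), a minor variation that still works if the same category were ever assigned to a file twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY, filename TEXT);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE categories_assignments (fileid INTEGER, catid INTEGER);
    INSERT INTO files VALUES (1, 'a.mp3'), (2, 'b.mp3'), (3, 'c.mp3');
    INSERT INTO categories VALUES (1, 'rock'), (2, 'pop'), (3, 'jazz');
    -- a.mp3: rock + pop, b.mp3: rock only, c.mp3: pop + jazz
    INSERT INTO categories_assignments VALUES
        (1, 1), (1, 2), (2, 1), (3, 2), (3, 3);
""")

wanted = ("rock", "pop")

# Keep only rows in the wanted categories, then require that a file
# matched every one of them.
files_in_all = conn.execute("""
    SELECT f.filename
    FROM files f
    JOIN categories_assignments ca ON f.id = ca.fileid
    JOIN categories c ON ca.catid = c.id
    WHERE c.name IN (?, ?)
    GROUP BY f.filename
    HAVING COUNT(DISTINCT c.name) = ?
    ORDER BY f.filename
""", (*wanted, len(wanted))).fetchall()

print(files_in_all)
```

Only a.mp3 qualifies: b.mp3 matches one of the two categories, and c.mp3 matches pop but its jazz row is filtered out by the WHERE clause.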
SELECT SUM(case when p.status = 2 then p.value end) as 'val_accepted'
FROM
props AS p
INNER JOIN (p_contents AS pc
INNER JOIN contents AS c ON c.id = pc.library_id)
ON p.id = pc.prop_id
WHERE p.account_id = 3
GROUP BY (pc.library_id)
So, here is what's happening:
There are two p_contents rows associated with a prop. Those two p_contents rows have the same library_id, which points to a corresponding content.
So, the SUM of p.value is double what it should be, because there are two p_contents rows that point to the same content.
How do I avoid double-counting p.value in the SUM?
EDIT:
I figured out how to use DISTINCT, but I still need access to the inner columns...
SELECT c.name as 'library_name',
SUM(case when p.status = 2 then p.value end) as 'val_accepted'
FROM
props AS p
INNER JOIN
(
SELECT DISTINCT(pc.library_id), prop_id
FROM p_contents AS pc
INNER JOIN
(
SELECT name, visibility, id, updated_at
FROM contents AS c
) as c
ON c.id = pc.library_id
)as pc
ON p.id = pc.prop_id
WHERE p.account_id = 3
GROUP BY (pc.library_id)
and now I get the error:
Unknown column 'c.name' in 'field list'
Here's one solution. First reduce the set to distinct rows in a derived table, then apply the GROUP BY to that result:
SELECT SUM(case when d.status = 2 then d.value end) as 'val_accepted'
FROM (
SELECT DISTINCT p.id, p.status, p.value, pc.library_id
FROM props p
INNER JOIN p_contents AS pc ON p.id = pc.prop_id
INNER JOIN contents AS c ON c.id = pc.library_id
WHERE p.account_id = 3) AS d
GROUP BY d.library_id
You use DISTINCT(pc.library_id) in your example, as if DISTINCT applies only to the column inside the parentheses. This is a common misconception. DISTINCT applies to all columns of the select-list. DISTINCT is not a function; it's a query modifier.
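To make the double-counting concrete, this sketch reproduces it in SQLite via Python with invented rows: prop 1 (value 100) is linked to library 7 by two p_contents rows, prop 2 (value 50) by one. It also carries c.name through the derived table as a hypothetical library_name column, which is one way to keep access to the inner columns the edit asks about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE props (id INTEGER PRIMARY KEY, account_id INTEGER,
                        status INTEGER, value INTEGER);
    CREATE TABLE contents (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE p_contents (id INTEGER PRIMARY KEY, prop_id INTEGER,
                             library_id INTEGER);
    INSERT INTO props VALUES (1, 3, 2, 100), (2, 3, 2, 50);
    INSERT INTO contents VALUES (7, 'lib');
    -- prop 1 is attached to library 7 twice
    INSERT INTO p_contents (prop_id, library_id) VALUES (1, 7), (1, 7), (2, 7);
""")

# Direct join: prop 1 appears twice, so its value is summed twice.
naive_sum = conn.execute("""
    SELECT SUM(CASE WHEN p.status = 2 THEN p.value END) AS val_accepted
    FROM props p
    JOIN p_contents pc ON p.id = pc.prop_id
    JOIN contents c ON c.id = pc.library_id
    WHERE p.account_id = 3
    GROUP BY pc.library_id
""").fetchall()

# Derived table: DISTINCT applies to the whole select-list, so each
# (prop, library) pair survives once before the SUM runs.
deduped_sum = conn.execute("""
    SELECT d.library_name,
           SUM(CASE WHEN d.status = 2 THEN d.value END) AS val_accepted
    FROM (
        SELECT DISTINCT p.id, p.status, p.value, pc.library_id,
               c.name AS library_name
        FROM props p
        JOIN p_contents pc ON p.id = pc.prop_id
        JOIN contents c ON c.id = pc.library_id
        WHERE p.account_id = 3
    ) AS d
    GROUP BY d.library_id, d.library_name
""").fetchall()

print(naive_sum, deduped_sum)
```

Including p.id in the DISTINCT list matters: without it, two different props with the same status and value under one library would be collapsed into one row and undercounted.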