relational division - mysql

I'm supposed to write a query for this statement:
List the names of customers, and album titles, for cases where the customer has bought the entire album (i.e. all tracks in the album)
I know that I should use division.
Here is my answer but I get some weird syntax errors that I can't resolve.
SELECT
R1.FirstName
,R1.LastName
,R1.Title
FROM (Customer C, Invoice I, InvoiceLine IL, Track T, Album Al) AS R1
WHERE
C.CustomerId=I.CustomerId
AND I.InvoiceId=IL.InvoiceId
AND T.TrackId=IL.TrackId
AND Al.AlbumId=T.AlbumId
AND NOT EXISTS (
SELECT
R2.Title
FROM (Album Al, Track T) AS R2
WHERE
T.AlbumId=Al.AlbumId
AND R2.Title NOT IN (
SELECT R3.Title
FROM (Album Al, Track T) AS R3
WHERE
COUNT(R1.TrackId)=COUNT(R3.TrackId)
)
);
ERROR: misuse of aggregate function COUNT()
You can find the schema for the database here

You cannot alias a table list such as (Album Al, Track T), which is outdated syntax for (Album Al CROSS JOIN Track T). You can alias either a table, e.g. Album Al, or a subquery, e.g. (SELECT * FROM Album CROSS JOIN Track) AS R2.
So first of all you should get your joins straight. I assume you are not being taught those old comma-separated joins, but picked them up from some old book or website? Use proper explicit joins instead.
Then you cannot use WHERE COUNT(R1.TrackId) = COUNT(R3.TrackId). COUNT is an aggregate function and aggregation is done after WHERE.
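The point that aggregation happens after WHERE can be checked on a toy table. This is a minimal sketch that uses Python's built-in sqlite3 module instead of MySQL (an assumption made only to keep the example self-contained); the table contents are invented.

```python
import sqlite3

# In-memory toy database; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE track (trackid INTEGER PRIMARY KEY, albumid INTEGER);
    INSERT INTO track VALUES (1, 10), (2, 10), (3, 20);
""")


def aggregate_in_where_fails():
    """Return True if the engine rejects an aggregate inside WHERE."""
    try:
        conn.execute(
            "SELECT albumid FROM track WHERE COUNT(*) > 1 GROUP BY albumid"
        ).fetchall()
    except sqlite3.OperationalError:
        return True
    return False


def albums_with_multiple_tracks():
    """HAVING filters on the aggregate after GROUP BY has formed the groups."""
    return conn.execute(
        "SELECT albumid, COUNT(*) FROM track "
        "GROUP BY albumid HAVING COUNT(*) > 1"
    ).fetchall()


print(aggregate_in_where_fails())
print(albums_with_multiple_tracks())
```

The same ordering applies in MySQL: row filters go in WHERE, conditions on COUNT or other aggregates go in HAVING.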
As to the query: It's a good idea to compare track counts. So let's do that step by step.
Query to get the track count per album:
select albumid, count(*)
from track
group by albumid;
Query to get the track count per customer and album:
select i.customerid, t.albumid, count(distinct t.trackid)
from track t
join invoiceline il on il.trackid = t.trackid
join invoice i on i.invoiceid = il.invoiceid
group by i.customerid, t.albumid;
Complete query:
select c.firstname, c.lastname, a.title
from
(
select i.customerid, t.albumid, count(distinct t.trackid) as cnt
from track t
join invoiceline il on il.trackid = t.trackid
join invoice i on i.invoiceid = il.invoiceid
group by i.customerid, t.albumid
) bought
join
(
select albumid, count(*) as cnt
from track
group by albumid
) complete on complete.albumid = bought.albumid and complete.cnt = bought.cnt
join customer c on c.customerid = bought.customerid
join album a on a.albumid = bought.albumid;
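To see the count comparison next to the textbook "double NOT EXISTS" form of relational division, here is a small sketch. It assumes Python's built-in sqlite3 module rather than MySQL, and the customers, albums, and purchases are invented; only the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE album (albumid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE track (trackid INTEGER PRIMARY KEY, albumid INTEGER);
    CREATE TABLE customer (customerid INTEGER PRIMARY KEY,
                           firstname TEXT, lastname TEXT);
    CREATE TABLE invoice (invoiceid INTEGER PRIMARY KEY, customerid INTEGER);
    CREATE TABLE invoiceline (invoicelineid INTEGER PRIMARY KEY,
                              invoiceid INTEGER, trackid INTEGER);

    INSERT INTO album VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO track VALUES (1, 1), (2, 1), (3, 2), (4, 2);
    INSERT INTO customer VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO invoice VALUES (1, 1), (2, 1), (3, 2);
    -- Ann buys tracks 1 and 2 (all of Alpha, track 1 twice) and track 3.
    -- Bob buys only track 3.
    INSERT INTO invoiceline VALUES
        (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 3), (5, 3, 3);
""")

# Approach 1: compare distinct bought tracks with the album's track count.
COUNT_QUERY = """
    SELECT c.firstname, c.lastname, a.title
    FROM (SELECT i.customerid, t.albumid, COUNT(DISTINCT t.trackid) AS cnt
          FROM track t
          JOIN invoiceline il ON il.trackid = t.trackid
          JOIN invoice i ON i.invoiceid = il.invoiceid
          GROUP BY i.customerid, t.albumid) bought
    JOIN (SELECT albumid, COUNT(*) AS cnt FROM track GROUP BY albumid) complete
      ON complete.albumid = bought.albumid AND complete.cnt = bought.cnt
    JOIN customer c ON c.customerid = bought.customerid
    JOIN album a ON a.albumid = bought.albumid
    ORDER BY c.customerid, a.albumid
"""

# Approach 2: keep (customer, album) pairs for which no track of the album
# exists that the customer has not bought.
DIVISION_QUERY = """
    SELECT c.firstname, c.lastname, a.title
    FROM customer c
    CROSS JOIN album a
    WHERE EXISTS (SELECT 1 FROM track t WHERE t.albumid = a.albumid)
      AND NOT EXISTS (
            SELECT 1 FROM track t
            WHERE t.albumid = a.albumid
              AND NOT EXISTS (
                    SELECT 1
                    FROM invoice i
                    JOIN invoiceline il ON il.invoiceid = i.invoiceid
                    WHERE i.customerid = c.customerid
                      AND il.trackid = t.trackid))
    ORDER BY c.customerid, a.albumid
"""

by_count = conn.execute(COUNT_QUERY).fetchall()
by_division = conn.execute(DIVISION_QUERY).fetchall()
print(by_count, by_division)
```

COUNT(DISTINCT ...) matters in the first approach: Ann buys track 1 twice, and a plain COUNT(*) would give 3 for an album that only has 2 tracks.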

It seems you are using COUNT in the wrong place.
Use HAVING for conditions on aggregate functions:
SELECT R3.Title
FROM (Album Al, Track T) AS R3
HAVING COUNT(R1.TrackId)=COUNT(R3.TrackId)
But check your aliases, because in some databases an alias is not available inside a subquery.

You should simplify your query. Take a look at this:
SELECT FirstName
, LastName
, Title
FROM (
SELECT C.FirstName
, C.LastName
, A.AlbumID
, A.Title
, COUNT(DISTINCT T.TrackID) as TracksInvoiced
FROM Customer C
INNER JOIN Invoice I
ON I.CustomerId = C.CustomerId
INNER JOIN InvoiceLine IL
ON I.InvoiceId = IL.InvoiceId
INNER JOIN Track T
ON T.TrackID = IL.TrackID
INNER JOIN Album A
ON A.AlbumID = T.AlbumID
GROUP BY C.FirstName, C.LastName, A.AlbumID, A.Title
) C
INNER JOIN (
SELECT AlbumID
, COUNT(TrackID) as TotalTracks
FROM Track
GROUP BY AlbumID
) A
ON C.AlbumID = A.AlbumID
AND TracksInvoiced = TotalTracks
I used two subselects: the first counts invoiced tracks per customer and album, and the second counts the tracks on each album. They are joined only where the two counts are equal.

This one seems to be a little less complicated:
SELECT r.FirstName, r.LastName, r.Title FROM
(
SELECT C.FirstName as FirstName,
C.LastName as LastName,
A.Title as Title,
A.AlbumId as AlbumId,
COUNT(*) as count
FROM Customer C, Invoice I, InvoiceLine IL, Track T, Album A
WHERE C.CustomerId=I.CustomerId
AND I.InvoiceId = IL.InvoiceId
AND T.TrackId = IL.TrackId
AND A.AlbumId = T.AlbumId
GROUP BY C.CustomerId, A.AlbumId
) AS r
WHERE r.count IN
(
SELECT COUNT(*) FROM Track T
WHERE T.AlbumId = r.AlbumId
)
I tested the idea on a simpler case and extended it to your example, so I can't guarantee that you can copy and paste it and have it work immediately...

Related

Subquery left join refer to parent ID

I am trying to make a query to fetch the newest car for each user:
select * from users
left join
(select cars.* from cars
where cars.userid=users.userid
order by cars.year desc limit 1) as cars
on cars.userid=users.userid
It looks like it says Unknown column "users.userid" in where clause
I tried to remove cars.userid=users.userid part, but then it only fetches 1 newest car, and sticks it on to each user.
Is there any way to accomplish what I'm after? thanks!!
For this purpose, I usually use row_number():
select *
from users u left join
(select c.* , row_number() over (partition by c.userid order by c.year desc) as seqnum
from cars c
) c
on c.userid = u.userid and c.seqnum = 1;
One option is to filter the left join with a subquery:
select * -- better enumerate the columns here
from users u
left join cars c
on c.userid = u.userid
and c.year = (select max(c1.year) from cars c1 where c1.userid = c.userid)
For performance, consider an index on cars(userid, year).
Note that this might return multiple cars per user if you have duplicate (userid, year) in cars. It would be better to have a real date rather than just the year.
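The tie behaviour described above is easy to reproduce. This is a minimal sketch that assumes Python's built-in sqlite3 module rather than MySQL; the users, cars, and the carid and model columns are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cars (carid INTEGER PRIMARY KEY, userid INTEGER,
                       model TEXT, year INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    -- ann has two cars from different years, bob has two cars from the
    -- same year, cy has no car at all.
    INSERT INTO cars VALUES
        (1, 1, 'old', 2001), (2, 1, 'new', 2015),
        (3, 2, 'twin-a', 2010), (4, 2, 'twin-b', 2010);
    CREATE INDEX cars_userid_year ON cars (userid, year);
""")

# Left join filtered by a correlated MAX(year) subquery in the ON clause.
newest = conn.execute("""
    SELECT u.userid, c.model, c.year
    FROM users u
    LEFT JOIN cars c
           ON c.userid = u.userid
          AND c.year = (SELECT MAX(c1.year)
                        FROM cars c1
                        WHERE c1.userid = c.userid)
    ORDER BY u.userid, c.carid
""").fetchall()

for row in newest:
    print(row)
```

User 2 comes back twice because both cars share the maximum year, and user 3 is kept with NULLs because the filter sits in the ON clause, not in WHERE.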
There may be better and more efficient ways to query this. Here is my solution:
select users.userid, cars.*
from users
left join (SELECT userid, MAX(year) AS maxYear
FROM cars
GROUP BY userid) as sub on sub.userid = users.userid
left join cars on cars.userid = sub.userid and cars.year = sub.maxYear;

SQL query find supplier with lowest price for each part

I need to find the supplier with the lowest price for each part.
Tables: suppliers(sid, sname, address), parts(pid, pname, colour), catalog(sid, pid, cost)
This works:
SELECT
sname, pid
FROM
(SELECT
*
FROM
suppliers
NATURAL JOIN catalog
NATURAL JOIN (SELECT
pid, MIN(cost) AS min_cost
FROM
catalog
GROUP BY (pid)) AS m
HAVING cost = min_cost) AS n
But when I try to shorten it to the following I get an error that there is an unknown cost in the having clause:
SELECT
sname, pid
FROM
suppliers
NATURAL JOIN
catalog
NATURAL JOIN
(SELECT
pid, MIN(cost) AS min_cost
FROM
catalog
GROUP BY (pid)) AS m
HAVING cost = min_cost
Why can't it find the cost? Isn't the cost in the table because I've joined the subquery to catalog?
EDIT
I changed it to use INNER JOIN instead of NATURAL JOIN as per suggestions, but I'm still getting the same error. New query:
SELECT
s.sname, m.pid
FROM
suppliers s
INNER JOIN
catalog c ON s.sid = c.sid
INNER JOIN
(SELECT
pid, MIN(cost) AS min_cost
FROM
catalog
GROUP BY (pid)) AS m ON c.pid = m.pid
HAVING cost = min_cost
EDIT_2
The problem was not the JOIN but the HAVING, which should actually be WHERE, as shown by bbrumm's answer.
I would suggest a query like this:
SELECT
suppliers.sname,
catalog.pid
FROM suppliers
INNER JOIN catalog ON suppliers.supplier_id = catalog.supplier_id
INNER JOIN
(SELECT
pid, MIN(cost) AS min_cost
FROM catalog
GROUP BY (pid)) AS m
ON catalog.pid = m.pid
WHERE catalog.cost = m.min_cost;
I've made a few assumptions about your column names (e.g. supplier_id) that you may need to change; with the schema in the question, the join condition would be suppliers.sid = catalog.sid. One could argue that cost = min_cost is part of the join condition, so it could go in the ON clause as well. I've also left out table aliases: they are best practice, but not required.
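The WHERE-based version can be tried on a few invented rows. This sketch assumes Python's built-in sqlite3 module rather than MySQL and uses the column names from the question (sid, pid, cost); the suppliers and prices are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (sid INTEGER PRIMARY KEY, sname TEXT, address TEXT);
    CREATE TABLE catalog (sid INTEGER, pid INTEGER, cost INTEGER);
    INSERT INTO suppliers VALUES (1, 'Acme', 'a'), (2, 'Bolt', 'b');
    INSERT INTO catalog VALUES
        (1, 100, 5), (2, 100, 4),
        (1, 200, 7), (2, 200, 9);
""")

# cost = min_cost is a row-level condition (no aggregate is evaluated in
# the outer query), so it belongs in WHERE or in the ON clause.
cheapest = conn.execute("""
    SELECT s.sname, c.pid
    FROM suppliers s
    INNER JOIN catalog c ON c.sid = s.sid
    INNER JOIN (SELECT pid, MIN(cost) AS min_cost
                FROM catalog
                GROUP BY pid) AS m
            ON m.pid = c.pid
    WHERE c.cost = m.min_cost
    ORDER BY c.pid
""").fetchall()

print(cheapest)
```

If two suppliers share the lowest price for a part, both are returned, which is usually what "the supplier with the lowest price" should mean when there is a tie.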

SQL: How do you showcase the percentage of total in a column?

I have to create a Movie database, and one of the queries involves showing a list of movies that a user has "rated," along with the actual rating and the Genre.
Additionally, it must show the percentage of that genre out of all the movies the user has rated. For example:
CREATE TABLE GENRES (
MOVIEID INT NOT NULL,
GENRE VARCHAR(20),
PRIMARY KEY(MOVIEID),
FOREIGN KEY(MOVIEID) REFERENCES MOVIES(ID)
);
Query:
SELECT U.USERID, M.TITLE, U.USERRATING, G.GENRE
FROM USER_RATEDMOVIES U, MOVIES M, GENRES G
WHERE (M.ID = U.MOVIEID) AND (U.USERID LIKE "EXAMPLE") AND (G.MOVIEID = M.ID)
ORDER BY DATE_YEAR, DATE_MONTH, DATE_DAY, DATE_HOUR, DATE_MINUTE, DATE_SECOND;
This returns everything I need EXCEPT the percentage of the total. That's where I get lost. I'm new with SQL and struggle with the COUNT(*) and GROUP BY.
The difficulty here is presenting row-level data alongside aggregate-level data. Consider using aggregating derived tables (nested SELECT statements with GROUP BY in the FROM or JOIN clauses) joined on USERID to your existing query (rewritten with explicit instead of implicit joins):
SELECT U.USERID, M.TITLE, U.USERRATING, G.GENRE,
(gdT.GENRE_COUNT / udT.RATING_COUNT) AS PERCENTAGE
FROM USER_RATEDMOVIES U
INNER JOIN MOVIES M ON M.ID = U.MOVIEID
INNER JOIN GENRES G ON M.ID = G.MOVIEID
INNER JOIN
-- COUNTS BY CORRESPONDING USER AND GENRE
(SELECT subU.USERID, subG.GENRE, Count(*) AS GENRE_COUNT
FROM USER_RATEDMOVIES subU
INNER JOIN MOVIES subM ON subM.ID = subU.MOVIEID
INNER JOIN GENRES subG ON subM.ID = subG.MOVIEID
GROUP BY subU.USERID, subG.GENRE) AS gdT
ON gdT.USERID = U.USERID AND gdT.GENRE = G.GENRE
INNER JOIN
-- COUNTS BY CORRESPONDING USER
(SELECT subU.USERID, Count(*) AS RATING_COUNT
FROM USER_RATEDMOVIES subU
GROUP BY subU.USERID) AS udT
ON udT.USERID = U.USERID
WHERE U.USERID LIKE "EXAMPLE"
ORDER BY DATE_YEAR, DATE_MONTH, DATE_DAY,
DATE_HOUR, DATE_MINUTE, DATE_SECOND;
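The two-derived-table idea can be checked on a handful of invented ratings. This sketch assumes Python's built-in sqlite3 module rather than MySQL, drops the date columns, and multiplies by 100.0 so the ratio reads as a percentage (a choice made for this example, not part of the query above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE genres (movieid INTEGER PRIMARY KEY, genre TEXT);
    CREATE TABLE user_ratedmovies (userid TEXT, movieid INTEGER,
                                   userrating INTEGER);
    INSERT INTO movies VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO genres VALUES
        (1, 'Drama'), (2, 'Drama'), (3, 'Comedy'), (4, 'Drama');
    -- u1 rates three dramas and one comedy; u2 rates a single comedy.
    INSERT INTO user_ratedmovies VALUES
        ('u1', 1, 5), ('u1', 2, 4), ('u1', 3, 3), ('u1', 4, 2), ('u2', 3, 1);
""")

QUERY = """
    SELECT m.title, g.genre,
           100.0 * gd.genre_count / ud.rating_count AS percentage
    FROM user_ratedmovies u
    JOIN movies m ON m.id = u.movieid
    JOIN genres g ON g.movieid = m.id
    -- ratings per user and genre
    JOIN (SELECT su.userid, sg.genre, COUNT(*) AS genre_count
          FROM user_ratedmovies su
          JOIN genres sg ON sg.movieid = su.movieid
          GROUP BY su.userid, sg.genre) gd
      ON gd.userid = u.userid AND gd.genre = g.genre
    -- ratings per user
    JOIN (SELECT userid, COUNT(*) AS rating_count
          FROM user_ratedmovies
          GROUP BY userid) ud
      ON ud.userid = u.userid
    WHERE u.userid = ?
    ORDER BY m.id
"""

percentages = conn.execute(QUERY, ("u1",)).fetchall()
for row in percentages:
    print(row)
```

Each detail row carries the share of its genre among that user's ratings, so the same percentage repeats for every movie of the same genre.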

MySQL select in join clause scanning too many rows

OK guys, the following has been bugging me all day:
I use the query below to select an overview of products and prices including the latest result-price based on field StartTime from another table (tresults). To do this I thought I would need a subselect in the join.
The problem is that the EXPLAIN function is telling me that MySQL is scanning ALL result rows (225000 rows) not using any index.
Is there some way I can speed this up? Preferably by adding a WHERE statement to have mysql look only at the rows with the corresponding pID's.
select p.pID, brandname, description, p.EAN, RetailPrice, LowestPrice, min(price), min(price)/lowestprice-1 as afwijking
from tproducts p
join (
select Max(tresults.StartTime) AS maxstarttime, tresults.pID
from tresults
-- maybe adding a where clause here?
group by tresults.pID
) p_max on (p_max.pID = p.pID)
join tresults res on (res.starttime = p_max.maxstarttime and p.pID = res.pID and res.websiteID = 1)
join tsupplierproducts sp on (sp.pID = p.pID AND supplierID = 1)
join tbrands b on (b.brandID = p.BrandID)
group by p.pID, brandname, description, p.EAN, RetailPrice, LowestPrice
Indexes are on all columns that are part of joins or where clauses.
Any help would be appreciated. Thanks!
From your SQL I assume that you are listing products for one supplier only (supplierID = 1).
Best practice is to apply your known filters at the start of the query to eliminate rows early, then use inner joins to bring in the other, unfiltered tables.
select p.pID, brandname, description, p.EAN, RetailPrice, LowestPrice, min(price), min(price)/lowestprice-1 as afwijking
from
(select p.pID, p.BrandID, p.EAN, Max(t.StartTime) AS maxstarttime
FROM tproducts p INNER JOIN tresults t on supplierID=1 and p.pID=t.pID
group by p.pID, p.BrandID, p.EAN
) p
inner join tresults res on (res.websiteID = 1 and p.pID = res.pID and res.starttime = p.maxstarttime)
inner join tsupplierproducts sp on (sp.pID = p.pID)
inner join tbrands b on (b.brandID = p.BrandID)
group by p.pID, brandname, description, p.EAN, RetailPrice, LowestPrice
In the query above, I eliminate all rows with supplierID != 1 from tproducts before joining tresults.
Let me know whether this SQL helps, and what EXPLAIN reports for it.
:-)

MySQL Join Query (possible two inner joins)

I currently have the following:
Table Town:
id
name
region
Table Supplier:
id
name
town_id
The below query returns the number of suppliers for each town:
SELECT t.id, t.name, count(s.id) as NumSupplier
FROM Town t
INNER JOIN Suppliers s ON s.town_id = t.id
GROUP BY t.id, t.name
I now wish to introduce another table in to the query, Supplier_vehicles. A supplier can have many vehicles:
Table Supplier_vehicles:
id
supplier_id
vehicle_id
Now, the NumSupplier field needs to return the number of suppliers for each town that have any of the given vehicle_id (IN condition):
The following query will simply bring back the suppliers that have any of the given vehicle_id:
SELECT * FROM Supplier s, Supplier_vehicles v WHERE s.id = v.supplier_id AND v.vehicle_id IN (1, 4, 6)
I need to integrate this in to the first query so that it returns the number of suppliers that have any of the given vehicle_id.
SELECT t.id, t.name, count(s.id) as NumSupplier
FROM Town t
INNER JOIN Suppliers s ON s.town_id = t.id
WHERE s.id IN (SELECT sv.supplier_id
FROM supplier_vehicles sv
WHERE sv.vehicle_id IN (1,4,6))
GROUP BY t.id, t.name
Or you could use an INNER JOIN to supplier_vehicles (your supplier join is already INNER, but note this will remove towns with no suppliers with those vehicles) and change COUNT(s.id) to COUNT(DISTINCT s.id).
If I remember correctly, you can put your second query inside the LEFT OUTER JOIN condition.
So for example, you can do something like
...
LEFT OUTER JOIN (SELECT * FROM Supplier s, Supplier_vehicles ......) s ON s.town_id=t.id
In that way you are "integrating" or combining the two queries into one. Let me know if this works.
SELECT t.name, count(DISTINCT v.supplier_id) as NumSupplier
FROM Town t
LEFT OUTER JOIN Suppliers s ON t.id = s.town_id
LEFT OUTER JOIN Supplier_vehicles v ON s.id = v.supplier_id
AND v.vehicle_id IN (1,4,6)
GROUP BY t.name
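Where the vehicle filter goes changes the result of a LEFT OUTER JOIN: a condition on the right-hand table in WHERE discards the NULL-extended rows, while the same condition in ON keeps every town. The sketch below compares the two placements. It assumes Python's built-in sqlite3 module rather than MySQL, and the towns, suppliers, and vehicles are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE town (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT, town_id INTEGER);
    CREATE TABLE supplier_vehicles (id INTEGER PRIMARY KEY,
                                    supplier_id INTEGER, vehicle_id INTEGER);
    INSERT INTO town VALUES (1, 'Aston'), (2, 'Brig'), (3, 'Crew');
    INSERT INTO suppliers VALUES (1, 's1', 1), (2, 's2', 1), (3, 's3', 2);
    -- s1 owns vehicles 1 and 4, s2 owns vehicle 9, s3 owns vehicle 2.
    INSERT INTO supplier_vehicles VALUES
        (1, 1, 1), (2, 1, 4), (3, 2, 9), (4, 3, 2);
""")

# Filter in WHERE: towns without a match vanish, and s1 is counted twice.
filter_in_where = conn.execute("""
    SELECT t.name, COUNT(s.id) AS NumSupplier
    FROM town t
    LEFT OUTER JOIN suppliers s ON s.town_id = t.id
    LEFT OUTER JOIN supplier_vehicles v ON v.supplier_id = s.id
    WHERE v.vehicle_id IN (1, 4, 6)
    GROUP BY t.id, t.name
    ORDER BY t.id
""").fetchall()

# Filter in ON, counting distinct matched suppliers: every town survives.
filter_in_on = conn.execute("""
    SELECT t.name, COUNT(DISTINCT v.supplier_id) AS NumSupplier
    FROM town t
    LEFT OUTER JOIN suppliers s ON s.town_id = t.id
    LEFT OUTER JOIN supplier_vehicles v
           ON v.supplier_id = s.id AND v.vehicle_id IN (1, 4, 6)
    GROUP BY t.id, t.name
    ORDER BY t.id
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```

Counting v.supplier_id instead of s.id matters in the second query, because suppliers without a matching vehicle still produce a row with a non-NULL s.id.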