Having trouble with a nested JOIN - mysql

So I'm trying to combine a number of queries, and I'm running into an issue:
SELECT OFFERS.ID AS ID, OFFERS.NAME as NAME, PROGRAM_ID, OFFER_TYPE, DATE_CREATED, PROGRAMS.NAME as PROGRAM_NAME, CLICKS_IN, CLICKS_OUT, SALES
FROM
(OFFERS INNER JOIN PROGRAMS ON PROGRAMS.ID = OFFERS.PROGRAM_ID)
INNER JOIN
(SELECT COUNT(*) AS CLICKS_IN FROM CLICKS_IN WHERE OFFER = ID)a
INNER JOIN
(SELECT COUNT(*) AS CLICKS_OUT FROM CLICKS_OUT WHERE OFFER = ID)b
INNER JOIN
(SELECT SUM(REVENUE) AS SALES FROM CONVERSIONS WHERE LOCAL_OFFER = ID)c
WHERE OFFER_ACTIVE = 1 AND OFFERS.USER_GROUP = ?
I want the OFFER = ID queries to be using the value from OFFERS.ID AS ID but I'm stumped as to how to get it to accomplish that.

I think your initial parentheses are just confusing matters a bit, and the kind of subselects you are using is more appropriate for subqueries in the SELECT's expression list; try it this way instead:
SELECT OFFERS.ID AS ID, OFFERS.NAME as NAME, PROGRAM_ID, OFFER_TYPE
, DATE_CREATED, PROGRAMS.NAME as PROGRAM_NAME, CLICKS_IN, CLICKS_OUT, SALES
FROM OFFERS
INNER JOIN PROGRAMS ON PROGRAMS.ID = OFFERS.PROGRAM_ID
INNER JOIN
(SELECT OFFER, COUNT(*) AS CLICKS_IN FROM CLICKS_IN GROUP BY OFFER) AS a
ON a.OFFER = OFFERS.ID
INNER JOIN
(SELECT OFFER, COUNT(*) AS CLICKS_OUT FROM CLICKS_OUT GROUP BY OFFER) AS b
ON b.OFFER = OFFERS.ID
INNER JOIN
(SELECT LOCAL_OFFER, SUM(REVENUE) AS SALES FROM CONVERSIONS GROUP BY LOCAL_OFFER) AS c
ON c.LOCAL_OFFER = OFFERS.ID
WHERE OFFER_ACTIVE = 1 AND OFFERS.USER_GROUP = ?
This method, as opposed to correlated subqueries (subqueries that reference outer queries), tends to be more efficient unless the tables involved in the subqueries are really huge, and/or you expect very few results in the end. In such cases, your original subqueries can be moved to the SELECT clause almost "as is":
SELECT o.ID AS ID, o.NAME as NAME, PROGRAM_ID, OFFER_TYPE, DATE_CREATED, PROGRAMS.NAME as PROGRAM_NAME
, (SELECT COUNT(*) FROM CLICKS_IN AS t WHERE t.OFFER = o.ID) AS CLICKS_IN
, (SELECT COUNT(*) FROM CLICKS_OUT AS t WHERE t.OFFER = o.ID) AS CLICKS_OUT
, (SELECT SUM(REVENUE) FROM CONVERSIONS AS t WHERE t.LOCAL_OFFER = o.ID) AS SALES
FROM OFFERS AS o INNER JOIN PROGRAMS ON PROGRAMS.ID = o.PROGRAM_ID
WHERE OFFER_ACTIVE = 1 AND o.USER_GROUP = ?
Sidenote: In queries involving multiple tables it is a good practice to fully qualify any field names used. (For example, I can't be sure where any of these fields come from: PROGRAM_ID, OFFER_TYPE, DATE_CREATED, OFFER_ACTIVE).
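One practical difference between the two forms is worth checking on your own data: the INNER JOINs against grouped derived tables drop any offer that has no rows in one of the joined tables, while the correlated subqueries keep it (COUNT yields 0 and SUM yields NULL). The sketch below is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module rather than MySQL, a reduced set of columns, and made-up toy rows.

```python
import sqlite3

# Toy data (hypothetical): offer 1 has clicks and sales, offer 2 has neither.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OFFERS (ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE CLICKS_IN (OFFER INTEGER);
CREATE TABLE CONVERSIONS (LOCAL_OFFER INTEGER, REVENUE REAL);
INSERT INTO OFFERS VALUES (1, 'alpha'), (2, 'beta');
INSERT INTO CLICKS_IN VALUES (1), (1), (1);
INSERT INTO CONVERSIONS VALUES (1, 10.0), (1, 5.5);
""")

# Form 1: join against pre-aggregated derived tables.
joined = conn.execute("""
SELECT o.ID, a.CLICKS_IN, c.SALES
FROM OFFERS AS o
INNER JOIN (SELECT OFFER, COUNT(*) AS CLICKS_IN
            FROM CLICKS_IN GROUP BY OFFER) AS a ON a.OFFER = o.ID
INNER JOIN (SELECT LOCAL_OFFER, SUM(REVENUE) AS SALES
            FROM CONVERSIONS GROUP BY LOCAL_OFFER) AS c ON c.LOCAL_OFFER = o.ID
ORDER BY o.ID
""").fetchall()

# Form 2: correlated subqueries in the SELECT list.
correlated = conn.execute("""
SELECT o.ID,
       (SELECT COUNT(*) FROM CLICKS_IN AS t WHERE t.OFFER = o.ID) AS CLICKS_IN,
       (SELECT SUM(REVENUE) FROM CONVERSIONS AS t WHERE t.LOCAL_OFFER = o.ID) AS SALES
FROM OFFERS AS o
ORDER BY o.ID
""").fetchall()

print(joined)      # only the offer that has clicks and sales
print(correlated)  # both offers; the second one with 0 clicks and NULL sales
```

If offers without any activity should still be listed in the join form, switching the INNER JOINs on the derived tables to LEFT JOINs is the usual adjustment.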

Related

Minimizing redundancy of MySQL query

I'm having a bit of trouble trying to reduce the redundancy of a query in MySQL. I currently have it working, but it feels like I have too much overhead because it uses a redundant subquery. What I am trying to do is use a DVD rental database to find which store location rented out the most DVDs in each month of 2005.
Here is the working query
SELECT b.month, c.store_id, b.maxRentals
FROM
(SELECT a.month, MAX(a.rentalCount) as maxRentals
FROM
(SELECT MONTH(rental.rental_date) as month, inventory.store_id, count(1) as rentalCount
FROM rental
INNER JOIN inventory
ON rental.inventory_id = inventory.inventory_id
WHERE YEAR(rental.rental_date) = 2005
GROUP BY MONTH(rental.rental_date), inventory.store_id
) a
GROUP BY a.month
) b
INNER JOIN
(SELECT MONTH(rental.rental_date) as month, inventory.store_id, count(1) as rentalCount
FROM rental
INNER JOIN inventory
ON rental.inventory_id = inventory.inventory_id
WHERE YEAR(rental.rental_date) = 2005
GROUP BY MONTH(rental.rental_date), inventory.store_id
) c
ON b.maxRentals = c.rentalCount
GROUP BY b.month;
Notice how the subquery with the alias of "c" is the exact same subquery of alias "a". I'm not sure if there's a way to get rid of this, as I can't inner join on an alias. Am I just stuck with a giant query, or is there something else I can do?
I am 90% certain this query will achieve your intentions:
SELECT MONTH(r.rental_date), i.store_id, COUNT(*)
FROM rental r
LEFT JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE YEAR(r.rental_date) = 2005
GROUP BY MONTH(r.rental_date), i.store_id
Let me know how it goes!
Edit: to answer the question of which store location rented out the most DVDs in each month of 2005:
SELECT x.rental_month, x.store_id, MAX(x.rental_count) FROM (
SELECT MONTH(r.rental_date) AS rental_month, i.store_id AS store_id, COUNT(*) AS rental_count
FROM rental r LEFT JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE YEAR(r.rental_date) = 2005
GROUP BY MONTH(r.rental_date), i.store_id) x
GROUP BY x.rental_month, x.store_id
I was explicit by using aliases everywhere, you could probably omit some. Hopefully this helps...
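Edit: Dirty hack (it relies on MySQL returning the first row of each group from the ordered subquery, which is not guaranteed and is rejected when ONLY_FULL_GROUP_BY is enabled):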
SELECT x.rental_month, x.store_id, MAX(x.rental_count) FROM (
SELECT MONTH(r.rental_date) AS rental_month, i.store_id AS store_id, COUNT(*) AS rental_count
FROM rental r LEFT JOIN inventory i ON r.inventory_id = i.inventory_id
WHERE YEAR(r.rental_date) = 2005
GROUP BY MONTH(r.rental_date), i.store_id
ORDER BY MONTH(r.rental_date) ASC, COUNT(*) DESC) x
GROUP BY x.rental_month
Ref:
http://kristiannielsen.livejournal.com/6745.html
But then does this satisfy you, seeing as you do already have a working query...
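For a version that avoids both the duplicated subquery and the reliance on row order, the per-month, per-store counts can be computed once in a common table expression and then joined to their own per-month maximum. This is a sketch under assumptions, not a drop-in answer: it runs on SQLite through Python's sqlite3 module (so strftime stands in for MONTH() and YEAR()), the toy rows are invented, and in MySQL a CTE requires version 8.0 or later. The names monthly, best, and max_rentals are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (inventory_id INTEGER PRIMARY KEY, store_id INTEGER);
CREATE TABLE rental (rental_id INTEGER PRIMARY KEY,
                     rental_date TEXT, inventory_id INTEGER);
INSERT INTO inventory VALUES (1, 1), (2, 2);
-- May 2005: store 1 leads; June 2005: store 2 leads; the 2006 row is filtered out.
INSERT INTO rental (rental_date, inventory_id) VALUES
  ('2005-05-01', 1), ('2005-05-02', 1), ('2005-05-03', 2),
  ('2005-06-01', 2), ('2005-06-02', 2), ('2005-06-03', 1),
  ('2006-05-01', 1);
""")

rows = conn.execute("""
WITH monthly AS (
  -- counts per month and store, written only once
  SELECT CAST(strftime('%m', r.rental_date) AS INTEGER) AS rental_month,
         i.store_id,
         COUNT(*) AS rental_count
  FROM rental r
  JOIN inventory i ON r.inventory_id = i.inventory_id
  WHERE strftime('%Y', r.rental_date) = '2005'
  GROUP BY rental_month, i.store_id
)
SELECT m.rental_month, m.store_id, m.rental_count
FROM monthly m
JOIN (SELECT rental_month, MAX(rental_count) AS max_rentals
      FROM monthly
      GROUP BY rental_month) best
  ON best.rental_month = m.rental_month   -- match on the month as well as the count
 AND best.max_rentals = m.rental_count
ORDER BY m.rental_month
""").fetchall()

print(rows)
```

Joining on both the month and the count matters: joining on the count alone can pair a month's maximum with a store from a different month that happens to have the same count. Ties within a month produce one row per tied store.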

relational division

I'm supposed to write a query for this statement:
List the names of customers, and album titles, for cases where the customer has bought the entire album (i.e. all tracks in the album)
I know that I should use division.
Here is my answer but I get some weird syntax errors that I can't resolve.
SELECT
R1.FirstName
,R1.LastName
,R1.Title
FROM (Customer C, Invoice I, InvoiceLine IL, Track T, Album Al) AS R1
WHERE
C.CustomerId=I.CustomerId
AND I.InvoiceId=IL.InvoiceId
AND T.TrackId=IL.TrackId
AND Al.AlbumId=T.AlbumId
AND NOT EXISTS (
SELECT
R2.Title
FROM (Album Al, Track T) AS R2
WHERE
T.AlbumId=Al.AlbumId
AND R2.Title NOT IN (
SELECT R3.Title
FROM (Album Al, Track T) AS R3
WHERE
COUNT(R1.TrackId)=COUNT(R3.TrackId)
)
);
ERROR: misuse of aggregate function COUNT()
You can find the schema for the database here
You cannot alias a table list such as (Album Al, Track T), which is outdated syntax for (Album Al CROSS JOIN Track T). You can alias either a table, e.g. Album Al, or a subquery, e.g. (SELECT * FROM Album CROSS JOIN Track) AS R2.
So first of all you should get your joins straight. I assume you are not being taught those old comma-separated joins, but picked them up from some old book or website? Use proper explicit joins instead.
Then you cannot use WHERE COUNT(R1.TrackId) = COUNT(R3.TrackId). COUNT is an aggregate function and aggregation is done after WHERE.
As to the query: It's a good idea to compare track counts. So let's do that step by step.
Query to get the track count per album:
select albumid, count(*)
from track
group by albumid;
Query to get the track count per customer and album:
select i.customerid, t.albumid, count(distinct t.trackid)
from track t
join invoiceline il on il.trackid = t.trackid
join invoice i on i.invoiceid = il.invoiceid
group by i.customerid, t.albumid;
Complete query:
select c.firstname, c.lastname, a.title
from
(
select i.customerid, t.albumid, count(distinct t.trackid) as cnt
from track t
join invoiceline il on il.trackid = t.trackid
join invoice i on i.invoiceid = il.invoiceid
group by i.customerid, t.albumid
) bought
join
(
select albumid, count(*) as cnt
from track
group by albumid
) complete on complete.albumid = bought.albumid and complete.cnt = bought.cnt
join customer c on c.customerid = bought.customerid
join album a on a.albumid = bought.albumid;
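To see the count comparison at work, here is a small runnable check of the complete query above. It is a sketch under assumptions: it uses SQLite through Python's sqlite3 module instead of your database, a cut-down schema with only the columns the query touches, and invented rows (Ann buys every track of the album, one of them twice; Bob buys only one track).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customerid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
CREATE TABLE album (albumid INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE track (trackid INTEGER PRIMARY KEY, albumid INTEGER);
CREATE TABLE invoice (invoiceid INTEGER PRIMARY KEY, customerid INTEGER);
CREATE TABLE invoiceline (invoicelineid INTEGER PRIMARY KEY,
                          invoiceid INTEGER, trackid INTEGER);
INSERT INTO customer VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO album VALUES (10, 'Two Tracks');
INSERT INTO track VALUES (100, 10), (101, 10);
INSERT INTO invoice VALUES (1000, 1), (1001, 1), (1002, 2);
INSERT INTO invoiceline (invoiceid, trackid) VALUES
  (1000, 100), (1001, 101), (1001, 100), (1002, 100);
""")

rows = conn.execute("""
select c.firstname, c.lastname, a.title
from
(
  -- distinct tracks bought per customer and album
  select i.customerid, t.albumid, count(distinct t.trackid) as cnt
  from track t
  join invoiceline il on il.trackid = t.trackid
  join invoice i on i.invoiceid = il.invoiceid
  group by i.customerid, t.albumid
) bought
join
(
  -- total tracks per album
  select albumid, count(*) as cnt
  from track
  group by albumid
) complete on complete.albumid = bought.albumid and complete.cnt = bought.cnt
join customer c on c.customerid = bought.customerid
join album a on a.albumid = bought.albumid
""").fetchall()

print(rows)
```

The count(distinct ...) is what keeps Ann's repeated purchase of the same track from inflating her count past the album's track total.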
It seems you are using COUNT in the wrong place. Use HAVING for conditions on aggregate functions:
SELECT R3.Title
FROM (Album Al, Track T) AS R3
HAVING COUNT(R1.TrackId)=COUNT(R3.TrackId)
But check your aliases, because in some databases an outer alias is not available inside a subquery.
You should simplify your query. Take a look at this:
SELECT FirstName
, LastName
, Title
FROM (
SELECT C.FirstName
, C.LastName
, A.AlbumID
, A.Title
, COUNT(DISTINCT T.TrackID) as TracksInvoiced
FROM Customer C
INNER JOIN Invoice I
ON I.CustomerId = C.CustomerId
INNER JOIN InvoiceLine IL
ON I.InvoiceId = IL.InvoiceId
INNER JOIN Track T
ON T.TrackID = IL.TrackID
INNER JOIN Album A
ON A.AlbumID = T.AlbumID
GROUP BY C.FirstName, C.LastName, A.AlbumID, A.Title
) C
INNER JOIN (
SELECT AlbumID
, COUNT(TrackID) as TotalTracks
FROM Track
GROUP BY AlbumID
) A
ON C.AlbumID = A.AlbumID
AND TracksInvoiced = TotalTracks
I used two subselects. The first counts the invoiced tracks per customer and album; the second counts the total tracks on each album. The join keeps only the rows where the two counts are equal.
This one seems to be a little less complicated:
SELECT r.FirstName, r.LastName, r.Title FROM
(
SELECT C.FirstName as FirstName,
C.LastName as LastName,
A.Title as Title,
A.AlbumId as AlbumId,
COUNT(*) as count
FROM Customer C, Invoice I, InvoiceLine IL, Track T, Album A
WHERE C.CustomerId=I.CustomerId
AND I.InvoiceId = IL.InvoiceId
AND T.TrackId = IL.TrackId
AND A.AlbumId = T.AlbumId
GROUP BY C.CustomerId, A.AlbumId
) AS r
WHERE r.count IN
(
SELECT COUNT(*) FROM Track T
WHERE T.AlbumId = r.AlbumId
)
I tested the idea on a simpler case and extended it to your example, so I can't guarantee that it works immediately if you copy and paste it.

using same where clause on multiple tables before joining them

Is there a better way of joining two tables and using same where clause on both of them? I am doing it as follows:
SELECT a.account_id,
a.account,
b.tax,
b.rate
FROM (SELECT account_id,
account
FROM accounts
WHERE account_id IN (SELECT account_id
FROM account_location
WHERE location = "A")) AS a
LEFT JOIN (SELECT tax,
rate
FROM tax
WHERE tax_id IN (SELECT tax_id
FROM account_tax
WHERE account_id IN (SELECT account_id
FROM account_location
WHERE location = "A"))) AS b
ON a.account_id = b.account_id
I have 4 tables. Accounts, account_location which has list of accounts mapped to locations, tax which has all taxes, account_tax which has mapping of all taxes applicable to each account. The code works fine, but can this be made faster?
If I'm not mistaken, this query should do the same:
select
a.account_id, a.account, t.tax, t.rate
from
accounts a
inner join account_location al
on al.account_id = a.account_id
and al.location = 'A'
left join account_tax at
on at.account_id = a.account_id
left join tax t on t.tax_id = at.tax_id
I think it will be faster due to more efficient joins (MySQL isn't good with subselects, especially in those join conditions). Also, I think it's more readable.
You can use a CTE (MySQL 8.0 and later) to DRY up what you are doing in the repeated subquery. A plain derived table will not do here, because it cannot be referenced from inside sibling subqueries. Then join to the CTE instead of using IN. You can do this for the other subqueries as well:
with UsefulAccounts as
(
select account_id from account_location where location = 'A'
)
select a.account_id, a.account, b.tax, b.rate
from
(
select acc.account_id, acc.account
from accounts acc
inner join UsefulAccounts ua on acc.account_id = ua.account_id
) as a
left join
(
select actx.account_id, t.tax, t.rate
from tax t
inner join account_tax actx
on t.tax_id = actx.tax_id
inner join UsefulAccounts ua on actx.account_id = ua.account_id
) as b
on a.account_id = b.account_id;

Can anyone help me to optimize my query?

SELECT p.ID, Name
FROM Policies p
INNER JOIN ProgramYears py ON p.ProgramYearID = py.id
INNER JOIN (SELECT MemberID, max(EffectiveDate) AS EffectiveDate
FROM Policies
GROUP BY MemberID) TEMP
ON p.memberid = TEMP.MemberID
AND p.EffectiveDate = TEMP.effectivedate
AND p.memberid NOT IN (SELECT MemberID
FROM InvoiceDetail
WHERE ProgramYear = NAME)
NOT EXISTS is usually a better substitute for NOT IN, but your choices largely depend on the data and the structure of your tables and indexes.
Try the query below, but compare its execution plan to that of your current query; what works for one scenario may not work for another.
SELECT p.ID, Name
FROM Policies p
INNER JOIN ProgramYears py ON p.ProgramYearID = py.id
INNER JOIN (SELECT MemberID, max(EffectiveDate) AS EffectiveDate
FROM Policies
GROUP BY MemberID) TEMP
ON p.memberid = TEMP.MemberID
AND p.EffectiveDate = TEMP.effectivedate
WHERE NOT EXISTS
(SELECT MemberID
FROM InvoiceDetail AS ID
WHERE ID.ProgramYear = NAME
AND p.MemberId = ID.MemberId)
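One commonly cited reason for preferring NOT EXISTS is correctness rather than speed: if the subquery behind NOT IN returns a NULL, the NOT IN predicate is never true and the query returns no rows at all. The sketch below shows this under assumptions: it uses SQLite through Python's sqlite3 module rather than your database, reduced versions of the Policies and InvoiceDetail tables, and invented rows (one invoice row has a NULL MemberID).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Policies (ID INTEGER PRIMARY KEY, MemberID INTEGER);
CREATE TABLE InvoiceDetail (MemberID INTEGER);
INSERT INTO Policies VALUES (1, 100), (2, 200);
INSERT INTO InvoiceDetail VALUES (100), (NULL);
""")

# NOT IN: "200 NOT IN (100, NULL)" evaluates to NULL, so policy 2 is filtered out too.
not_in = conn.execute("""
SELECT p.ID
FROM Policies p
WHERE p.MemberID NOT IN (SELECT MemberID FROM InvoiceDetail)
""").fetchall()

# NOT EXISTS: the NULL row simply never matches, so policy 2 is returned.
not_exists = conn.execute("""
SELECT p.ID
FROM Policies p
WHERE NOT EXISTS (SELECT 1 FROM InvoiceDetail d WHERE d.MemberID = p.MemberID)
""").fetchall()

print(not_in)
print(not_exists)
```

If MemberID in InvoiceDetail is declared NOT NULL, the two forms return the same rows and the choice comes down to the execution plan.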
You can try:
SELECT a.ID, a.Name
FROM (SELECT p.ID, Name,
ROW_NUMBER() OVER (PARTITION BY p.memberid ORDER BY p.EffectiveDate DESC) AS rnk
FROM Policies p
INNER JOIN ProgramYears py ON p.ProgramYearID = py.id
WHERE NOT EXISTS (SELECT MemberID
FROM InvoiceDetail AS ID
WHERE ID.ProgramYear = NAME
AND p.MemberId = ID.MemberId)
) a
WHERE a.rnk = 1

MySQL select in join clause scanning too many rows

OK guys, the following has been bugging me all day:
I use the query below to select an overview of products and prices including the latest result-price based on field StartTime from another table (tresults). To do this I thought I would need a subselect in the join.
The problem is that the EXPLAIN function is telling me that MySQL is scanning ALL result rows (225000 rows) not using any index.
Is there some way I can speed this up? Preferably by adding a WHERE statement to have mysql look only at the rows with the corresponding pID's.
select p.pID, brandname, description, p.EAN, RetailPrice, LowestPrice, min(price), min(price)/lowestprice-1 as afwijking
from tproducts p
join (
select Max(tresults.StartTime) AS maxstarttime, tresults.pID
from tresults
-- maybe adding a where clause here?
group by tresults.pID
) p_max on (p_max.pID = p.pID)
join tresults res on (res.starttime = p_max.maxstarttime and p.pID = res.pID and res.websiteID = 1)
join tsupplierproducts sp on (sp.pID = p.pID AND supplierID = 1)
join tbrands b on (b.brandID = p.BrandID)
group by p.pID, brandname, description, p.EAN, RetailPrice, LowestPrice
Indexes are on all columns that are part of joins or where clauses.
Any help would be appreciated. Thanks!
From your SQL I assume that you are listing products for one supplier only (supplierID = 1).
The best practice is to apply your known filters at the beginning of the SQL to eliminate records, then use inner joins to bring in the other, unfiltered tables.
select p.pID, brandname, description, p.EAN, RetailPrice, LowestPrice, min(price), min(price)/lowestprice-1 as afwijking
from
(select p.pID, p.BrandID, p.EAN, Max(t.StartTime) AS maxstarttime
FROM tproducts p INNER JOIN tresults t on supplierID=1 and p.pID=t.pID
group by p.pID, p.BrandID, p.EAN
) p
inner join tresults res on (res.websiteID = 1 and p.pID = res.pID and res.starttime = p.maxstarttime)
inner join tsupplierproducts sp on (sp.pID = p.pID)
inner join tbrands b on (b.brandID = p.BrandID)
group by p.pID, brandname, description, p.EAN, RetailPrice, LowestPrice
In the code above, I eliminate all rows with supplierID != 1 from tproducts before joining tresults.
Let me know whether the SQL above helps, and what EXPLAIN reports for it.
:-)