Best way to reference an outer query / subquery? - mysql

I'm trying to reference a field from the 1st select table in the 3rd select(subquery) table.
However, that field isn't recognized when it goes to that sub-level of a query.
The php code I'm working on uses sql to return part of the sql command (string) that will be used in other places.
I've come up with this example, which shows the kind of nested queries I want to solve.
In here I'm trying to get the name and emails of users that are working at night and have a matching job rank for an available job:
tables -----------> fields
table_users -> [user_id, name, email, rank, ...]
table_users_jobs -> [user_id, job_id, period, ....]
table_jobs -> [job_id, status, rank, ...]
-- sql calling code -> $rank = "t1.rank"; get_users_info_by_rank($rank);
-- maybe using: SET @rank = NULL; SELECT @rank := $rank, t1.name, ...
SELECT t1.name, t1.email
FROM table_users as t1
WHERE t1.user_id IN (
SELECT t2.user_id
FROM table_users_jobs as t2
WHERE t2.period = 'night' AND
t2.job_id IN (
-- available jobs for that rank -> get_job_ranks_sql($rank);
SELECT t3.job_id
FROM table_jobs as t3
-- maybe using: t3.rank = @rank
WHERE t3.rank = t1.rank AND
t3.status = 'available_position')
)
Working a little I guess I could avoid the 3rd level select problem. Nevertheless the point is that I'm trying to reuse sql code like the function that gives me the job_id of the rank that I chose:
function get_job_ranks_sql($rank){
//probably 't3' will be renamed for something more unique
return 'SELECT t3.job_id
FROM table_jobs as t3
WHERE t3.rank = '.$rank.' AND
t3.status = "available_position"';
}
Even though I'm using PHP, I'm trying to keep this generic so it could be used with another language if possible.
The SQL version I'm using is MySQL 5.1.41.
Actually, I think it's possible to do it the way I want by using SQL variables like @rank, but I'm not sure whether that's slower or whether there are better ways to do it.
Thanks in advance for any help :)

So, as one commenter pointed out, I think you would be much better off using JOINs than sub-selects. For example, if I am reading your query/problem correctly, you could write a join query like this:
SELECT t1.name, t1.email, t3.job_id
FROM table_users t1
LEFT JOIN table_users_jobs t2
ON t1.user_id = t2.user_id
LEFT JOIN table_jobs t3
ON t3.job_id = t2.job_id AND t3.rank = t1.rank
WHERE t2.period = 'night'
AND t3.status = 'available_position'
This is a lot more concise, easier to read, and easier on your database. (Because the WHERE clause filters on t2 and t3, the LEFT JOINs behave like inner joins here.) But doing this would prevent you from modularizing your SQL. If that is really important, you might consider storing such queries in a stored procedure. That way, the procedure can return a list of results. Take a look at this tutorial:
http://www.wellho.net/resources/ex.php4?item=s163/stp4
Of course, that doesn't really solve your problem of being able to access variables at the lower levels of a sub select, but it would make your SQL easier to manage, and make it available to other language implementations, as you mentioned might be a need for you.
Something else to consider, in the bigger picture, would be migrating to a PHP framework that provides an ORM layer, where you could make those tables into objects, and then be able to access your data with much greater ease and flexibility (usually). But that is very 'big picture' and might not be suitable for your project requirements. One such framework that I could recommend, however, is CakePHP.
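If reusing SQL fragments matters, a join and a generated subquery can coexist. Here is a minimal sketch in Python, using the standard sqlite3 module as a stand-in for MySQL. The helper names (job_ids_for_rank_sql, night_users_sql, night_users_demo) and the sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3


def job_ids_for_rank_sql(rank_expr: str) -> str:
    """Reusable fragment: job ids open to a given rank.

    rank_expr must be a trusted column reference such as 'u.rank',
    never raw user input, because it is spliced into the SQL text.
    """
    return (
        "SELECT j.job_id FROM table_jobs AS j "
        f"WHERE j.rank = {rank_expr} AND j.status = 'available_position'"
    )


def night_users_sql() -> str:
    # The outer alias u is visible inside the IN subquery, so the
    # fragment can be correlated on u.rank.
    return (
        "SELECT DISTINCT u.name, u.email FROM table_users AS u "
        "JOIN table_users_jobs AS uj ON uj.user_id = u.user_id "
        "WHERE uj.period = 'night' "
        f"AND uj.job_id IN ({job_ids_for_rank_sql('u.rank')}) "
        "ORDER BY u.name"
    )


def night_users_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_users (user_id INTEGER, name TEXT, email TEXT, rank INTEGER);
        CREATE TABLE table_users_jobs (user_id INTEGER, job_id INTEGER, period TEXT);
        CREATE TABLE table_jobs (job_id INTEGER, status TEXT, rank INTEGER);
        INSERT INTO table_users VALUES
            (1, 'Ann', 'ann@example.com', 2),
            (2, 'Bob', 'bob@example.com', 3),
            (3, 'Cy', 'cy@example.com', 2);
        INSERT INTO table_users_jobs VALUES (1, 10, 'night'), (2, 10, 'night'), (3, 10, 'day');
        INSERT INTO table_jobs VALUES (10, 'available_position', 2);
    """)
    return conn.execute(night_users_sql()).fetchall()
```

Only Ann works nights on a job whose rank matches her own, so she is the only row returned. The same string-building idea carries over to PHP, as long as the spliced-in rank expression never comes from user input.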

Related

Don't Know How To Join Using Calculation and a column

I do not know how to join tables based on a calculation. I have to take a substring to get the part of a string I need to match up to a column from another table. I cannot figure out how to join them and really don't know where to start.
I tried everything in my power but I literally took a beginner's class and now have to fend for myself.
Select *
From five9_data.calllog join warbird.user
ON warbird.attr_employee = substring(five9_data.calllog.agent, 4,position('#' in five9_data.calllog.agent)- 4)
Group By warbird.attr_employee
Order warbird.attr_employee
Limit 100
I tried the above in the Select command but figured out it will not work and that I need to use the calculations in the join statement, but have no idea on syntax/formula. A few examples made as simple as possible would be great. I also have issue with the Group By Order by with this.
Shown above.
Often, the join condition would look like:
from t1 join
t2
on t2.agent like concat('%', t1.empid, '%')
Or, you can just use the expression:
from t1 join
t2
on t1.empid = substring(t2.agent, 4, position('#' in t2.agent) - 4)
EDIT:
As for your example code, I would write it as:
Select u.attr_employee, . . . -- aggregation functions go here
From five9_data.calllog cl join
warbird.user u
on u.attr_employee = substring(cl.agent, 4, position('#' in cl.agent) - 4)
Group By u.attr_employee
Order By u.attr_employee
Limit 100;
Here are changes to notice:
Table aliases make the query easier to write and to read.
When using GROUP BY, the only unaggregated columns in the SELECT should be the GROUP BY keys. The rest should be aggregated.
Your problem is that warbird.attr_employee is not defined, because the table name is missing (it would have to be warbird.user.attr_employee). The alias u is much easier to write and to read.
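As a small runnable check of joining on a computed value, here is a sketch in Python with the standard sqlite3 module standing in for the real database. SQLite spells the functions substr and instr instead of substring and position. The table names calllog and users, the agent format 'ID_1234#five9', and the sample rows are assumptions made up for the illustration.

```python
import sqlite3


def calls_per_employee():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE calllog (call_id INTEGER, agent TEXT);
        CREATE TABLE users (attr_employee TEXT, full_name TEXT);
        INSERT INTO calllog VALUES
            (1, 'ID_1234#five9'), (2, 'ID_1234#five9'), (3, 'ID_77#five9');
        INSERT INTO users VALUES ('1234', 'Ann'), ('77', 'Bob'), ('900', 'Cy');
    """)
    # The join key is computed per row: start at character 4 and stop
    # just before the '#'.
    sql = """
        SELECT u.attr_employee, COUNT(*) AS calls
        FROM calllog AS cl
        JOIN users AS u
          ON u.attr_employee = substr(cl.agent, 4, instr(cl.agent, '#') - 4)
        GROUP BY u.attr_employee
        ORDER BY u.attr_employee
        LIMIT 100
    """
    return conn.execute(sql).fetchall()
```

Employee 900 has no calls and drops out of the inner join. The ids are stored as text, so they sort as strings ('1234' before '77').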

how to write this query in correct syntax?

SELECT collegename(SELECT allotement.collegename,dean.id
FROM dean,allotement
WHERE allotement.city=dean.city
&&dean.collegename<>allotement.collegename
&&dean.id<>allotement.id)as t WHERE id=1
SELECT collegename from (
SELECT allotement.collegename, dean.id
FROM dean,allotement WHERE allotement.city=dean.city
and dean.collegename<>allotement.collegename
and dean.id<>allotement.id)
as t WHERE id=1
A few points to note here:
Treat sub-query as a table source from which you are retrieving the data. Thus, you need a from in the first line.
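&& is a nonstandard MySQL synonym for AND (deprecated as of MySQL 8.0.17). Write AND instead so the query stays portable.
The AS keyword before t is optional, but the alias itself is required: MySQL insists that every derived table has its own alias.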
There is a pretty good reference that I generally use for MySQL syntax, which can be a bit confusing given that different SQL databases vary slightly in syntax and available functions.
You can also refer to the official MySQL docs if required.
TRY THIS: You can achieve the same result more simply with a join, without needing a subquery at all:
SELECT a.collegename, d.id
FROM dean AS d
INNER JOIN allotement AS a ON a.city = d.city
AND d.collegename <> a.collegename
AND d.id <> a.id
WHERE d.id = 1
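To see that the derived-table form and the join form select the same college, here is a minimal sketch in Python using the standard sqlite3 module as a stand-in for MySQL. The function name colleges_for_dean and the sample rows are assumptions for illustration only.

```python
import sqlite3


def colleges_for_dean(dean_id: int):
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dean (id INTEGER, city TEXT, collegename TEXT);
        CREATE TABLE allotement (id INTEGER, city TEXT, collegename TEXT);
        INSERT INTO dean VALUES (1, 'Pune', 'A College'), (2, 'Delhi', 'B College');
        INSERT INTO allotement VALUES
            (1, 'Pune', 'A College'), (2, 'Pune', 'C College'), (3, 'Delhi', 'D College');
    """)
    # Form 1: the subquery is a table source, so it sits after FROM
    # and carries the alias t.
    derived = """
        SELECT collegename FROM (
            SELECT a.collegename AS collegename, d.id AS id
            FROM dean AS d, allotement AS a
            WHERE a.city = d.city
              AND d.collegename <> a.collegename
              AND d.id <> a.id
        ) AS t
        WHERE id = ?
    """
    # Form 2: the same conditions written as an explicit join.
    joined = """
        SELECT a.collegename
        FROM dean AS d
        INNER JOIN allotement AS a
                ON a.city = d.city
               AND d.collegename <> a.collegename
               AND d.id <> a.id
        WHERE d.id = ?
    """
    return (conn.execute(derived, (dean_id,)).fetchall(),
            conn.execute(joined, (dean_id,)).fetchall())
```

For dean 1 in this toy data, both forms return only 'C College': the other Pune row has the same college name and the same id as the dean.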

Correlated Subquery in a MySQL CASE Statement

Here is a brief explanation of what I'm trying to accomplish; my query follows below.
There are 4 tables and 1 view which are relevant for this particular query (sorry the names look messy, but they follow a strict convention that would make sense if you saw the full list):
Performances may have many Performers, and those associations are stored in PPerformer. Fans can have favorites, which are stored in Favorite_Performer. The _UpcomingPerformances view contains all the information needed to display a user-friendly list of upcoming performances.
My goal is to select all the data from _UpcomingPerformances, then include one additional column that specifies whether the given Performance has a Performer which the Fan added as their favorite. This involves selecting the list of Performers associated with the Performance, and also the list of Performers who are in Favorite_Performer for that Fan, and intersecting the two arrays to determine if anything is in common.
When I execute the below query, I get the error #1054 - Unknown column 'up.pID' in 'where clause'. I suspect it's somehow related to a misuse of Correlated Subqueries but as far as I can tell what I'm doing should work. It works when I replace up.pID (in the WHERE clause of t2) with a hard-coded number, and yes, pID is an existing column of _UpcomingPerformances.
Thanks for any help you can provide.
SELECT
up.*,
CASE
WHEN EXISTS (
SELECT * FROM (
SELECT RID FROM Favorite_Performer
WHERE FanID = 107
) t1
INNER JOIN
(
SELECT r.ID as RID
FROM PPerformer pr
JOIN Performer r ON r.ID = pr.Performer_ID
WHERE pr.Performance_ID = up.pID
) t2
ON t1.RID = t2.RID
)
THEN "yes"
ELSE "no"
END as pText
FROM
_UpcomingPerformances up
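The problem is scope-related. In MySQL versions before 8.0.14, a derived table (a subquery in the FROM clause, such as t2 here) cannot reference columns of the outer query, so up.pID is not visible inside it. A correlated reference is fine in the WHERE clause of the EXISTS subquery itself, so flatten the two derived tables into plain joins. Try this: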
SELECT
up.*,
CASE
WHEN EXISTS (
SELECT *
FROM Favorite_Performer fp
JOIN Performer r ON fp.RID = r.ID
JOIN PPerformer pr ON r.ID = pr.Performer_ID
WHERE fp.FanID = 107
AND pr.Performance_ID = up.pID
)
THEN 'yes'
ELSE 'no'
END as pText
FROM
_UpcomingPerformances up
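The flattened EXISTS can be tried on toy data. Below is a minimal sketch in Python with the standard sqlite3 module standing in for MySQL. A plain table replaces the _UpcomingPerformances view, the Performer table is left out for brevity, and the function name favorite_flags and the sample rows are assumptions for illustration.

```python
import sqlite3


def favorite_flags(fan_id: int):
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE _UpcomingPerformances (pID INTEGER, title TEXT);
        CREATE TABLE PPerformer (Performance_ID INTEGER, Performer_ID INTEGER);
        CREATE TABLE Favorite_Performer (FanID INTEGER, RID INTEGER);
        INSERT INTO _UpcomingPerformances VALUES (1, 'Opening night'), (2, 'Matinee');
        INSERT INTO PPerformer VALUES (1, 50), (2, 60);
        INSERT INTO Favorite_Performer VALUES (107, 50);
    """)
    # up.pID is referenced directly in the WHERE clause of the EXISTS
    # subquery, one level below the outer query, so it stays in scope.
    sql = """
        SELECT up.pID,
               CASE WHEN EXISTS (
                   SELECT 1
                   FROM Favorite_Performer AS fp
                   JOIN PPerformer AS pr ON pr.Performer_ID = fp.RID
                   WHERE fp.FanID = ?
                     AND pr.Performance_ID = up.pID
               ) THEN 'yes' ELSE 'no' END AS pText
        FROM _UpcomingPerformances AS up
        ORDER BY up.pID
    """
    return conn.execute(sql, (fan_id,)).fetchall()
```

Fan 107 has performer 50 as a favorite, and performer 50 appears only in performance 1, so only that row is flagged 'yes'.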

MYSQL and IN in requests

I am a beginner when it comes to using mysql queries embedded inside other mysql queries using the IN statement.
I currently have this query:
SELECT DISTINCT BorName
FROM Borrower
WHERE BorId IN (
SELECT Borrower.BorId
FROM Loan
WHERE Loan.BcId IN (
SELECT BookCopy.BcId
FROM BookCopy
WHERE BookCopy.BtId In (
SELECT BookTitle.BtId
FROM BookTitle
WHERE BookTitle.PubId In (
SELECT Publisher.PubId
FROM Publisher
WHERE `PubName` = CONVERT( _utf8 'Methuen' USING latin1 ) COLLATE latin1_swedish_ci
)
)
)
);
I am basically trying to find out if a borrower has borrowed a book from the publisher Methuen. I just can't seem to work out what is wrong. I have gone through each individual statement and they all seem to work, just not the overall request with all of the IN statements.
Can anyone spot what is wrong?
As suggested, JOINs are a much cleaner and likely more efficient way to write this query than nested INs:
SELECT DISTINCT b.BorName
FROM
Borrower b
JOIN Loan l ON l.BorId = b.BorId
JOIN BookCopy bc ON bc.BcId = l.BcId
JOIN BookTitle bt ON bt.BtId = bc.BtId
JOIN Publisher p ON p.PubId = bt.PubID
WHERE
p.PubName = CONVERT( _utf8 'Methuen' USING latin1 ) COLLATE latin1_swedish_ci
Additionally, I think there was a problem in your first sub-query:
SELECT Borrower.BorId
FROM Loan
WHERE Loan.BcId IN...
I believe should have been:
SELECT Loan.BorId
FROM Loan
WHERE Loan.BcId IN...
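Replace all the IN by = (this only works if each subquery returns exactly one row; otherwise MySQL fails with "Subquery returns more than 1 row").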
Obviously using joins will be much cleaner.
I am not sure if MySQL supports the WITH clause (it does from version 8.0 onward), but in SQL Server you can use something called a Common Table Expression. It is part of the ANSI SQL standard, so support should be easy to check.
Pseudo:
With
someMadeUpTableAlias AS
(
SELECT...
)
SELECT ...
FROM
OutsideTable AS A
JOIN someMadeUpTableAlias AS B ON (A.<column> = B.<column>)
I have been making a big effort to take advantage of CTEs because they offer readability that subqueries inherently lack. You may want to look at the side effects, though: behind the scenes the database may materialize the CTE as a temporary table, and MySQL may take a performance hit. The easiest way to tell is to clear your query cache and run it both ways.
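For a concrete version of the CTE idea applied to the Methuen question, here is a minimal sketch in Python using the standard sqlite3 module (SQLite also supports WITH). The CTE name methuen_copies, the function name methuen_borrowers, and the sample rows are assumptions made up for the illustration.

```python
import sqlite3


def methuen_borrowers():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Publisher (PubId INTEGER, PubName TEXT);
        CREATE TABLE BookTitle (BtId INTEGER, PubId INTEGER);
        CREATE TABLE BookCopy (BcId INTEGER, BtId INTEGER);
        CREATE TABLE Loan (BorId INTEGER, BcId INTEGER);
        CREATE TABLE Borrower (BorId INTEGER, BorName TEXT);
        INSERT INTO Publisher VALUES (1, 'Methuen'), (2, 'Other Press');
        INSERT INTO BookTitle VALUES (1, 1), (2, 2);
        INSERT INTO BookCopy VALUES (100, 1), (200, 2);
        INSERT INTO Borrower VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO Loan VALUES (1, 100), (1, 100), (2, 200);
    """)
    # The CTE names the "copies published by Methuen" step once, and
    # the main query then reads like a plain two-table join.
    sql = """
        WITH methuen_copies AS (
            SELECT bc.BcId
            FROM BookCopy AS bc
            JOIN BookTitle AS bt ON bt.BtId = bc.BtId
            JOIN Publisher AS p ON p.PubId = bt.PubId
            WHERE p.PubName = 'Methuen'
        )
        SELECT DISTINCT b.BorName
        FROM Borrower AS b
        JOIN Loan AS l ON l.BorId = b.BorId
        JOIN methuen_copies AS mc ON mc.BcId = l.BcId
        ORDER BY b.BorName
    """
    return [row[0] for row in conn.execute(sql)]
```

Ann borrowed the Methuen copy twice, and DISTINCT collapses that to one row. Bob only borrowed from the other publisher.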

Dev Code - Understanding What I Am Seeing

This is where I start by saying I am not a developer and this is not my code. As the DBA, though, it has shown up on my plate from a performance perspective. The execution plan shows clustered index (CI) scans for Table2 aliased as D and Table2 aliased as E. Focusing on Table2 aliased as E: the scan is coming from the subquery in the WHERE clause for E.SEQ_NBR =
I am also seeing far more executions than need be. I know it depends on the exact index structure on the table, but at a high level, is it likely that what I am seeing is a CI scan resulting from the aggregate (MIN) for every match it finds? Basically, is it walking the table for the min SEQ_NBR for each match on EMPLID and the other fields?
If likely, is it more a result of the manner in which it is written (I would think incorporating a CTE with some ROW_NUMBER logic would help) or lack of indexing? I am trying to avoid throwing an index at it "just because". I am getting hung up on that sub query in the where clause.
SELECT
D.EMPLID
,D.JOBCODE
,D.DEPTID
,E.DUR
,SUM(D.TL_QUANTITY) 'YTD_TL_QUANTITY'
FROM
Table1 B
,Table2 D
,Table2 E
WHERE
D.TRC = B.TRC
AND B.TL_ERNCD IN ( #0, #1, #2, #3, #4, #5, #6 )
AND D.EMPLID = E.EMPLID
AND D.EMPL_RCD = E.EMPL_RCD
AND D.DUR < = E.DUR
AND D.DUR > = '1/1/' + CAST(DATEPART(YEAR, E.DUR) AS CHAR)
AND E.SEQ_NBR =
( SELECT
MIN(EX.SEQ_NBR)
FROM
Table2 EX
WHERE
E.EMPLID = EX.EMPLID
AND E.EMPL_RCD = EX.EMPL_RCD
AND E.DUR = EX.DUR
)
AND B.EFFDT = ( SELECT
MAX(B_ED.EFFDT)
FROM
Table1 B_ED
WHERE
B.TRC = B_ED.TRC
AND B_ED.EFFDT < = GETDATE()
)
GROUP BY
D.EMPLID
,D.JOBCODE
,D.DEPTID
,E.DUR
The MIN operation has nothing to do with the CI scan. A MIN or MAX is calculated using a sort. The problem is most likely the number of times the subquery is being executed. It has to loop through the subquery for every record returned in the parent query. A CTE may be helpful here depending on the size of Table2, but I don't think you need to worry about finding a replacement for the MIN() ... at least not yet.
Correlated subqueries are often performance killers. Where you can, remove them and replace them with CTEs and JOINs or derived tables.
Try something like this (not tested)
SELECT
D.EMPLID
,D.JOBCODE
,D.DEPTID
,E.DUR
,SUM(D.TL_QUANTITY) 'YTD_TL_QUANTITY'
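FROM Table1 B
JOIN Table2 D
ON D.TRC = B.TRC
JOIN Table2 E
ON D.EMPLID = E.EMPLID AND D.EMPL_RCD = E.EMPL_RCD AND D.DUR <= E.DUR
-- lowest SEQ_NBR per EMPLID / EMPL_RCD / DUR, computed once
JOIN (SELECT EMPLID, EMPL_RCD, DUR, MIN(SEQ_NBR) AS SEQ_NBR
FROM Table2
GROUP BY EMPLID, EMPL_RCD, DUR) EX
ON E.EMPLID = EX.EMPLID
AND E.EMPL_RCD = EX.EMPL_RCD
AND E.DUR = EX.DUR
AND E.SEQ_NBR = EX.SEQ_NBR
-- latest effective date per TRC that is not in the future
JOIN (SELECT TRC, MAX(EFFDT) AS EFFDT
FROM Table1
WHERE EFFDT <= GETDATE()
GROUP BY TRC) B_ED
ON B.TRC = B_ED.TRC AND B.EFFDT = B_ED.EFFDT
WHERE B.TL_ERNCD IN ( #0, #1, #2, #3, #4, #5, #6 )
AND D.DUR >= '1/1/' + CAST(DATEPART(YEAR, E.DUR) AS CHAR)
GROUP BY
D.EMPLID
,D.JOBCODE
,D.DEPTID
,E.DUR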
As for the implicit join syntax, do not allow anyone to ever do this again. It is a poor programming technique. As a DBA you can say what you will and will not allow in the database. Code review what is coming in and do not pass it until they remove the implicit syntax.
Why is it bad? In the first place, you get accidental cross joins. Further, from a maintenance perspective, you can't tell whether a cross join was accidental (and thus the query incorrect) or on purpose. This means a query with a cross join in it is unmaintainable.
Next, if you have to change some of the joins to outer joins later and do not fix all the implicit ones at the same time, you can get incorrect results (which may not be noticed by an inexperienced developer). In SQL Server 2008 you cannot use the implicit syntax for an outer join, but it shouldn't have been used even as far back as SQL Server 2000, because Books Online (for SQL Server 2000) states that there are cases where it is misinterpreted. In other words, the syntax is unreliable for outer joins. There is never an excuse for using an implicit join: you gain nothing over an explicit join, and it can create more problems.
You need to educate your developers and tell them that this code (which has been obsolete since 1992!) is no longer acceptable.
A quick one: this expression, CAST('1/1/' + CAST(DATEPART(YEAR, E.DUR) AS CHAR) AS DATETIME), is likely causing a table scan on Table2 E, because the function likely has to be evaluated against each row.
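The idea of swapping the correlated MIN subquery for a grouped derived table can be sanity-checked on a toy table. The sketch below uses Python's standard sqlite3 module as a stand-in for SQL Server, so it says nothing about the real execution plan. It only shows that the two forms pick the same rows. The function name first_rows_both_ways and the sample data are assumptions for illustration.

```python
import sqlite3


def first_rows_both_ways():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Table2 (EMPLID TEXT, EMPL_RCD INTEGER, DUR TEXT,
                             SEQ_NBR INTEGER, TL_QUANTITY REAL);
        INSERT INTO Table2 VALUES
            ('E1', 0, '2024-01-05', 1, 8.0),
            ('E1', 0, '2024-01-05', 2, 4.0),
            ('E1', 0, '2024-01-06', 3, 7.0),
            ('E2', 0, '2024-01-05', 5, 6.0);
    """)
    # Original shape: the MIN subquery is evaluated per outer row.
    correlated = """
        SELECT E.EMPLID, E.DUR, E.SEQ_NBR
        FROM Table2 AS E
        WHERE E.SEQ_NBR = (
            SELECT MIN(EX.SEQ_NBR)
            FROM Table2 AS EX
            WHERE EX.EMPLID = E.EMPLID
              AND EX.EMPL_RCD = E.EMPL_RCD
              AND EX.DUR = E.DUR
        )
        ORDER BY E.EMPLID, E.DUR
    """
    # Rewritten shape: compute every group's minimum once, then join.
    joined = """
        SELECT E.EMPLID, E.DUR, E.SEQ_NBR
        FROM Table2 AS E
        JOIN (
            SELECT EMPLID, EMPL_RCD, DUR, MIN(SEQ_NBR) AS SEQ_NBR
            FROM Table2
            GROUP BY EMPLID, EMPL_RCD, DUR
        ) AS EX
          ON EX.EMPLID = E.EMPLID
         AND EX.EMPL_RCD = E.EMPL_RCD
         AND EX.DUR = E.DUR
         AND EX.SEQ_NBR = E.SEQ_NBR
        ORDER BY E.EMPLID, E.DUR
    """
    return conn.execute(correlated).fetchall(), conn.execute(joined).fetchall()
```

Whether the rewrite is actually faster depends on the indexes and the optimizer, so compare the execution plans on the real tables before changing anything.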