MySQL and IN in requests

I am a beginner when it comes to embedding MySQL queries inside other MySQL queries using the IN operator.
I currently have this query:
SELECT DISTINCT BorName
FROM Borrower
WHERE BorId IN (
SELECT Borrower.BorId
FROM Loan
WHERE Loan.BcId IN (
SELECT BookCopy.BcId
FROM BookCopy
WHERE BookCopy.BtId In (
SELECT BookTitle.BtId
FROM BookTitle
WHERE BookTitle.PubId In (
SELECT Publisher.PubId
FROM Publisher
WHERE `PubName` = CONVERT( _utf8 'Methuen' USING latin1 ) COLLATE latin1_swedish_ci
)
)
)
);
I am basically trying to find out whether a borrower has borrowed a book from the publisher Methuen. I just can't seem to work out what is wrong. I have gone through each individual statement and they all seem to work, just not the overall request with all of the IN statements.
Can anyone spot what is wrong?

As suggested, JOINs are a much cleaner, and likely more efficient, way to write this query than nested INs:
SELECT DISTINCT b.BorName
FROM
Borrower b
JOIN Loan l ON l.BorId = b.BorId
JOIN BookCopy bc ON bc.BcId = l.BcId
JOIN BookTitle bt ON bt.BtId = bc.BtId
JOIN Publisher p ON p.PubId = bt.PubID
WHERE
p.PubName = CONVERT( _utf8 'Methuen' USING latin1 ) COLLATE latin1_swedish_ci
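Additionally, I think there was a problem in your first sub-query. Borrower.BorId there refers to the Borrower table of the outer query, so the IN test passes for every borrower whenever the sub-query returns any row at all: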
SELECT Borrower.BorId
FROM Loan
WHERE Loan.BcId IN...
I believe it should have been:
SELECT Loan.BorId
FROM Loan
WHERE Loan.BcId IN...
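The difference between the two sub-queries, and the equivalence of the corrected nested IN with the JOIN version, can be checked on a tiny data set. The following is only a sketch: it uses SQLite through Python's sqlite3 module as a stand-in for MySQL, the sample rows are made up, and the CONVERT/COLLATE clause is dropped because it is MySQL-specific.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Publisher (PubId INTEGER PRIMARY KEY, PubName TEXT);
CREATE TABLE BookTitle (BtId INTEGER PRIMARY KEY, PubId INTEGER);
CREATE TABLE BookCopy  (BcId INTEGER PRIMARY KEY, BtId INTEGER);
CREATE TABLE Borrower  (BorId INTEGER PRIMARY KEY, BorName TEXT);
CREATE TABLE Loan      (BorId INTEGER, BcId INTEGER);

INSERT INTO Publisher VALUES (1, 'Methuen'), (2, 'Penguin');
INSERT INTO BookTitle VALUES (10, 1), (11, 2);
INSERT INTO BookCopy  VALUES (100, 10), (101, 10), (102, 11);
INSERT INTO Borrower  VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
-- Alice borrows two Methuen copies, Bob a Penguin copy, Carol nothing.
INSERT INTO Loan VALUES (1, 100), (1, 101), (2, 102);
""")

# Innermost three levels of the original query: copies of Methuen titles.
METHUEN_COPIES = """
    SELECT BookCopy.BcId FROM BookCopy WHERE BookCopy.BtId IN (
        SELECT BookTitle.BtId FROM BookTitle WHERE BookTitle.PubId IN (
            SELECT Publisher.PubId FROM Publisher WHERE PubName = 'Methuen'))
"""

def borrowers(selected_column):
    """Run the nested-IN query with the given column in the Loan sub-query."""
    sql = f"""
        SELECT DISTINCT BorName FROM Borrower
        WHERE BorId IN (
            SELECT {selected_column} FROM Loan
            WHERE Loan.BcId IN ({METHUEN_COPIES}))
        ORDER BY BorName
    """
    return [name for (name,) in conn.execute(sql)]

buggy_names = borrowers("Borrower.BorId")  # outer reference: matches everyone
fixed_names = borrowers("Loan.BorId")      # the corrected sub-query

join_names = [name for (name,) in conn.execute("""
    SELECT DISTINCT b.BorName
    FROM Borrower b
    JOIN Loan l       ON l.BorId = b.BorId
    JOIN BookCopy bc  ON bc.BcId = l.BcId
    JOIN BookTitle bt ON bt.BtId = bc.BtId
    JOIN Publisher p  ON p.PubId = bt.PubId
    WHERE p.PubName = 'Methuen'
    ORDER BY b.BorName
""")]

print(buggy_names)  # ['Alice', 'Bob', 'Carol']
print(fixed_names)  # ['Alice']
print(join_names)   # ['Alice']
```

With Borrower.BorId the query runs without error but returns every borrower, which is consistent with each individual statement appearing to work while the overall result is wrong.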

Replace each IN with =. Note that this only works when each sub-query returns a single row.
Obviously, using joins will be much cleaner.

I am not sure if MySQL supports the WITH clause (it was added in MySQL 8.0), but in SQL Server you can use something called a Common Table Expression. It is part of the ANSI SQL standard, so support should be easy to check.
Pseudo:
With
someMadeUpTableAlias AS
(
SELECT...
)
SELECT ...
FROM
OutsideTable AS A
JOIN someMadeUpTableAlias AS B ON (A. = B.)
I have been making a large effort to take advantage of CTEs because subqueries inherently lack readability. You may want to look at the side effects: behind the scenes it would be creating a temporary table, and MySQL may have some performance hits associated with that. The easiest way to tell is to clear your query cache and run it both ways.
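To make the pseudo-code above concrete, here is a minimal runnable sketch of a CTE joined to an outside table. It runs on SQLite through Python's sqlite3 module rather than on MySQL or SQL Server, and the CTE name borrowers_with_loans and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database; table contents are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Borrower (BorId INTEGER PRIMARY KEY, BorName TEXT);
CREATE TABLE Loan     (BorId INTEGER, BcId INTEGER);
INSERT INTO Borrower VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Loan     VALUES (1, 100), (1, 101);
""")

# The CTE is named once at the top and then joined like an ordinary table.
cte_rows = conn.execute("""
    WITH borrowers_with_loans AS (
        SELECT DISTINCT BorId FROM Loan
    )
    SELECT b.BorName
    FROM Borrower AS b
    JOIN borrowers_with_loans AS bl ON (b.BorId = bl.BorId)
    ORDER BY b.BorName
""").fetchall()

print(cte_rows)  # [('Alice',)]
```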

Related

Why is my column 'unknown' when using FULL JOIN

I'm learning SQL and wanted to create my own tables to practise on. I found the following site: https://sqltest.net/
I created two tables to practise joins on. LEFT/RIGHT/INNER joins work fine with the SQL statements I create, but when I try to use FULL JOIN the following error appears:
Ouuuu snap '': Unknown column 'wizards.colours' in 'field list'
Is this something I'm doing wrong or a glitch with the website?
CREATE TABLE wizards
(
colours varchar(255),
numbers int,
symbols varchar(255)
);
INSERT INTO wizards
VALUES ('red','49','£'),
('blue','83','$'),
('blue','72','£'),
('purple','24','%'),
('orange','82','$'),
('white','67',NULL),
('blue','17','%'),
('black','12','%'),
('green','97','&'),
('grey','1','%'),
('red','6','£'),
('red','76','%');
CREATE TABLE warriors
(
colours varchar(255),
numbers int,
symbols varchar(255)
);
INSERT INTO warriors
VALUES ('orange','59','£'),
('purple','2','£'),
('white','11','%'),
('blue','78','%'),
('grey','56','$'),
('red','5','%'),
('orange','92',NULL),
('green','50','$'),
('orange','49',NULL),
('red','1','%');
My SQL statement:
SELECT wizards.colours, warriors.numbers, wizards.numbers
FROM wizards
FULL JOIN warriors ON wizards.colours=warriors.colours
ORDER BY wizards.colours;
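MySQL does not support FULL JOIN. It parses FULL as a table alias for wizards, which is why wizards.colours is reported as an unknown column.
You can use a LEFT JOIN and a RIGHT JOIN combined with UNION: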
select * from (
SELECT wizards.colours, warriors.numbers as warriors_numbers, wizards.numbers as wizards_numbers
FROM wizards
LEFT JOIN warriors ON wizards.colours=warriors.colours
UNION
SELECT wizards.colours, warriors.numbers as warriors_numbers, wizards.numbers as wizards_numbers
FROM wizards
RIGHT JOIN warriors ON wizards.colours=warriors.colours
) T
ORDER BY colours;
Online demo at db<>fiddle
MySQL does not support FULL JOIN. But, the LEFT JOIN/RIGHT JOIN hack is not the best way to implement it. For instance, it doesn't handle duplicates correctly.
A better way is:
select c.colours,
wa.numbers, wi.numbers -- whatever you want here
from ((select wi.colours
from wizards wi
) union -- on purpose to remove duplicates
(select wa.colours
from warriors wa
)
) c left join
wizards wi
on wi.colours = c.colours left join
warriors wa
on wa.colours = c.colours;
Even this is not 100% equivalent, because it does not handle NULL values correctly. However, it usually gets the intention right. You can fix the NULL handling by changing the ON conditions to use the NULL-safe comparison, on (wi.colours <=> c.colours).
More importantly, you shouldn't need a FULL JOIN. They are almost never needed in a properly formed database.
In this case, that would mean that you have a colours table with correctly formed foreign key constraints. This is part of entity-relationship modeling and the right way to implement such relationships.
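The union-of-keys approach can be tried out on a cut-down version of the two tables. This is a sketch only: it runs on SQLite through Python's sqlite3 module instead of MySQL, and it uses a handful of made-up rows rather than the full data from the question.

```python
import sqlite3

# In-memory SQLite database with a reduced, invented data set.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wizards  (colours TEXT, numbers INTEGER);
CREATE TABLE warriors (colours TEXT, numbers INTEGER);
INSERT INTO wizards  VALUES ('red', 49), ('blue', 83);
INSERT INTO warriors VALUES ('red', 5), ('green', 50);
""")

# Build the list of all colours first, then LEFT JOIN each table to it.
full_rows = conn.execute("""
    SELECT c.colours, wi.numbers, wa.numbers
    FROM (SELECT colours FROM wizards
          UNION                      -- UNION removes duplicate colours
          SELECT colours FROM warriors) c
    LEFT JOIN wizards  wi ON wi.colours = c.colours
    LEFT JOIN warriors wa ON wa.colours = c.colours
    ORDER BY c.colours
""").fetchall()

for row in full_rows:
    print(row)
# ('blue', 83, None)
# ('green', None, 50)
# ('red', 49, 5)
```

Every colour appears once per matching pair, and a colour present in only one table gets NULL (None in Python) for the other table's column, which is what a FULL JOIN on colours would produce for this data.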

Correlated Subquery in a MySQL CASE Statement

Here is a brief explanation of what I'm trying to accomplish; my query follows below.
There are 4 tables and 1 view which are relevant for this particular query (sorry the names look messy, but they follow a strict convention that would make sense if you saw the full list):
Performances may have many Performers, and those associations are stored in PPerformer. Fans can have favorites, which are stored in Favorite_Performer. The _UpcomingPerformances view contains all the information needed to display a user-friendly list of upcoming performances.
My goal is to select all the data from _UpcomingPerformances, then include one additional column that specifies whether the given Performance has a Performer which the Fan added as their favorite. This involves selecting the list of Performers associated with the Performance, and also the list of Performers who are in Favorite_Performer for that Fan, and intersecting the two arrays to determine if anything is in common.
When I execute the below query, I get the error #1054 - Unknown column 'up.pID' in 'where clause'. I suspect it's somehow related to a misuse of Correlated Subqueries but as far as I can tell what I'm doing should work. It works when I replace up.pID (in the WHERE clause of t2) with a hard-coded number, and yes, pID is an existing column of _UpcomingPerformances.
Thanks for any help you can provide.
SELECT
up.*,
CASE
WHEN EXISTS (
SELECT * FROM (
SELECT RID FROM Favorite_Performer
WHERE FanID = 107
) t1
INNER JOIN
(
SELECT r.ID as RID
FROM PPerformer pr
JOIN Performer r ON r.ID = pr.Performer_ID
WHERE pr.Performance_ID = up.pID
) t2
ON t1.RID = t2.RID
)
THEN "yes"
ELSE "no"
END as pText
FROM
_UpcomingPerformances up
The problem is scope related. t2 is a derived table (a subquery in the FROM clause), and a derived table cannot see the columns of the outer query, so up.pID is not visible inside it. Try this:
SELECT
up.*,
CASE
WHEN EXISTS (
SELECT *
FROM Favorite_Performer fp
JOIN Performer r ON fp.RID = r.ID
JOIN PPerformer pr ON r.ID = pr.Performer_ID
WHERE fp.FanID = 107
AND pr.Performance_ID = up.pID
)
THEN 'yes'
ELSE 'no'
END as pText
FROM
_UpcomingPerformances up
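As a quick check that a correlated reference works when the joins sit directly inside EXISTS, the query can be run against a few sample rows. This sketch assumes SQLite through Python's sqlite3 module as a stand-in for MySQL, models the _UpcomingPerformances view as a plain table with only a pID column, and uses invented data.

```python
import sqlite3

# In-memory SQLite database; the view is modelled as a simple table here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Performer (ID INTEGER PRIMARY KEY);
CREATE TABLE PPerformer (Performance_ID INTEGER, Performer_ID INTEGER);
CREATE TABLE Favorite_Performer (FanID INTEGER, RID INTEGER);
CREATE TABLE _UpcomingPerformances (pID INTEGER PRIMARY KEY);

INSERT INTO Performer VALUES (1), (2), (3);
INSERT INTO _UpcomingPerformances VALUES (10), (11);
-- Performance 10 features performers 1 and 2; performance 11 features 3.
INSERT INTO PPerformer VALUES (10, 1), (10, 2), (11, 3);
-- Fan 107 has marked performer 1 as a favourite.
INSERT INTO Favorite_Performer VALUES (107, 1);
""")

# up.pID is referenced directly in the WHERE clause of the EXISTS subquery,
# not inside a derived table, so the outer row is in scope.
performance_flags = conn.execute("""
    SELECT up.pID,
           CASE
               WHEN EXISTS (
                   SELECT 1
                   FROM Favorite_Performer fp
                   JOIN Performer r   ON fp.RID = r.ID
                   JOIN PPerformer pr ON r.ID = pr.Performer_ID
                   WHERE fp.FanID = ?
                     AND pr.Performance_ID = up.pID
               )
               THEN 'yes'
               ELSE 'no'
           END AS pText
    FROM _UpcomingPerformances up
    ORDER BY up.pID
""", (107,)).fetchall()

print(performance_flags)  # [(10, 'yes'), (11, 'no')]
```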

How can I optimize this raw SQL and perhaps implement it via CodeIgniter?

It's been a while since I've written raw SQL. I was hoping someone could help me optimize this SQL query so that it works across both MySQL and PostgreSQL.
I would also have to implement this via CodeIgniter (2.x) using ActiveRecord, any help/advice?
SELECT *
FROM notaries, contact_notaries
WHERE notaries.id = contact_notaries.notary_id
AND WHERE ( contact_notaries.city LIKE %$criteria%
OR contact_notaries.state LIKE %$criteria
OR contact_notaries.address LIKE %$criteria%)
Thanks!
Each query can have just one WHERE clause (you don't need the second)
It's much better to put the join condition into JOIN rather than WHERE.
Are you sure you really need all the columns from 2 tables (*)?
So I'd refactor it to
SELECT [field_list]
FROM notaries
INNER JOIN contact_notaries ON (notaries.id = contact_notaries.notary_id)
WHERE ( contact_notaries.city LIKE '%$criteria%'
OR contact_notaries.state LIKE '%$criteria'
OR contact_notaries.address LIKE '%$criteria%')
Using a1ex07's query:
SELECT [field_list]
FROM notaries
INNER JOIN contact_notaries ON (notaries.id = contact_notaries.notary_id)
WHERE ( contact_notaries.city LIKE '%$criteria%'
OR contact_notaries.state LIKE '%$criteria'
OR contact_notaries.address LIKE '%$criteria%')
Active record:
$this->db->select(); // Leave empty to select all fields
$this->db->join('contact_notaries', 'notaries.id = contact_notaries.notary_id', 'inner');
// Pass the search term itself; or_like() ORs the conditions together, as in the SQL above
$this->db->like('contact_notaries.city', $criteria);
$this->db->or_like('contact_notaries.state', $criteria, 'before'); // wildcard only before the term
$this->db->or_like('contact_notaries.address', $criteria);
$results = $this->db->get('notaries');
To specify a list of fields you can do $this->db->select('field_1, field_2, ...');.
http://codeigniter.com/user_guide/database/active_record.html

Best way to reference an outer query / subquery?

I'm trying to reference a field from the table in the first SELECT inside the third-level SELECT (a subquery).
However, that field isn't recognized at that sub-level of the query.
The PHP code I'm working on uses SQL to return part of an SQL command (a string) that will be used in other places.
I've come up with this example, which shows the kind of nested queries I want to solve.
Here I'm trying to get the names and emails of users who work at night and have a matching job rank for an available job:
tables -----------> fields
table_users -> [user_id, name, email, rank, ...]
table_users_jobs -> [user_id, job_id, period, ....]
table_jobs -> [job_id, status, rank, ...]
-- sql calling code -> $rank = "t1.rank"; get_users_info_by_rank($rank);
-- maybe using: SET @rank = NULL; SELECT @rank := $rank, t1.name, ...
SELECT t1.name, t1.email
FROM table_users as t1
WHERE t1.user_id IN (
SELECT t2.user_id
FROM table_users_jobs as t2
WHERE t2.period = 'night' AND
t2.job_id IN (
-- available jobs for that rank -> get_job_ranks_sql($rank);
SELECT t3.job_id
FROM table_jobs as t3
-- maybe using: t3.rank = @rank
WHERE t3.rank = t1.rank AND
t3.status = 'available_position')
)
With a little work I guess I could avoid the third-level SELECT problem. Nevertheless, the point is that I'm trying to reuse SQL code, like the function that gives me the job_id values for the rank I chose:
function get_job_ranks_sql($rank){
// 't3' will probably be renamed to something more unique
return 'SELECT t3.job_id
FROM table_jobs as t3
WHERE t3.rank = '.$rank.' AND
t3.status = "available_position")';
}
Even though I'm using PHP, I'm trying to keep it generic so that it could be used with another language if possible.
The SQL version in use is MySQL 5.1.41.
Actually, I think it's possible to do it the way I want by using SQL variables like @rank, but I'm not sure if it's slower or if there are better ways to do it.
Thanks in advance for any help :)
So, as one commenter pointed out, I think you would be much better off using JOINs than sub-selects. For example, if I am reading your query/problem correctly, you could do a join query like this:
SELECT t1.name, t1.email, t3.job_id
FROM table_users t1
LEFT JOIN table_users_jobs t2
ON t1.user_id = t2.user_id
LEFT JOIN table_jobs t3
ON t3.job_id = t2.job_id
AND t3.rank = t1.rank -- match the job's rank to the user's rank, as in the question
WHERE t2.period = 'night'
AND t3.status = 'available_position'
This is a lot more concise, easier to read, and easier on your database. But doing this would prevent you from modularizing your SQL. If that is really important, you might consider storing such queries in a stored procedure. This way, you can actually get the stored procedure to return a list of results. Take a look at this tutorial:
http://www.wellho.net/resources/ex.php4?item=s163/stp4
Of course, that doesn't really solve your problem of being able to access variables at the lower levels of a sub select, but it would make your SQL easier to manage, and make it available to other language implementations, as you mentioned might be a need for you.
Something else to consider, in the bigger picture, would be migrating to a PHP framework that provides an ORM layer, where you could make those tables into objects, and then be able to access your data with much greater ease and flexibility (usually). But that is very 'big picture' and might not be suitable for your project requirements. One such framework that I could recommend, however, is CakePHP.

indexes in mysql SELECT AS or using Views

I'm in over my head with a big MySQL query (MySQL 5.0), and I'm hoping somebody here can help.
Earlier I asked how to get distinct values from a joined query
mysql count only for distinct values in joined query
The response I got worked (using a subquery with join as)
select *
from media m
inner join
( select uid
from users_tbl
limit 0,30) map
on map.uid = m.uid
inner join users_tbl u
on u.uid = m.uid
Unfortunately, my query has grown more unruly, and though I have it running, joining to a derived table is taking too long because there are no indexes available to the derived query.
My query now looks like this:
SELECT mdate.bid, mdate.fid, mdate.date, mdate.time, mdate.title, mdate.name,
mdate.address, mdate.rank, mdate.city, mdate.state, mdate.lat, mdate.`long`,
ext.link,
ext.source, ext.pre, meta, mdate.img
FROM ext
RIGHT OUTER JOIN (
SELECT media.bid,
media.date, media.time, media.title, users.name, users.img, users.rank, media.address,
media.city, media.state, media.lat, media.`long`,
GROUP_CONCAT(tags.tagname SEPARATOR ' | ') AS meta
FROM media
JOIN users ON media.bid = users.bid
LEFT JOIN tags ON users.bid=tags.bid
WHERE `long` BETWEEN -122.52224684058 AND -121.79760915942
AND lat BETWEEN 37.07500915942 AND 37.79964684058
AND date = '2009-02-23'
GROUP BY media.bid, media.date
ORDER BY media.date, users.rank DESC
LIMIT 0, 30
) mdate ON (mdate.bid = ext.bid AND mdate.date = ext.date)
phew!
So, as you can see, if I understand my problem correctly, I have two derived tables without indexes (and I don't deny that I may have screwed up the JOIN statements somehow, but I kept messing with different types, and this ended up giving me the result I wanted).
What's the best way to create a query similar to this which will allow me to take advantage of the indexes?
Dare I say, I actually have one more table to add into the mix at a later date.
Currently, my query is taking 0.8 seconds to complete, but I'm sure if I could take advantage of the indexes, this could be significantly faster.
First, check for indices on ext(bid, date), users(bid) and tags(bid); you should really have them.
It seems, though, that it's LONG and LAT that cause you the most problems. You should try keeping your LONG and LAT as a single coordinate column of type POINT, create a SPATIAL INDEX on this column, and query like this:
WHERE MBRContains(@MySquare, coordinate)
If you can't change your schema for some reason, you can try creating additional indices that include date as the first field:
CREATE INDEX ix_date_long ON media (date, `long`);
CREATE INDEX ix_date_lat ON media (date, lat);
These indices will be more efficient for your query, as you use an exact search on date combined with a range search on the axes.
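The idea of putting the equality column first and the range column second in a composite index can be illustrated with a query plan. This is a rough sketch under loud assumptions: it uses SQLite through Python's sqlite3 module, whose planner is not MySQL's, and a cut-down media table with only the columns involved. On MySQL itself the equivalent check would be running EXPLAIN on the real query.

```python
import sqlite3

# In-memory SQLite database with a reduced, assumed version of the media table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE media (bid INTEGER, date TEXT, "long" REAL, lat REAL);
-- Equality column (date) first, range column ("long") second.
CREATE INDEX ix_date_long ON media (date, "long");
""")

# Ask SQLite how it would execute the date + bounding-box filter.
plan_rows = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT bid FROM media
    WHERE date = '2009-02-23'
      AND "long" BETWEEN -122.52224684058 AND -121.79760915942
      AND lat BETWEEN 37.07500915942 AND 37.79964684058
""").fetchall()

# The human-readable plan description is the last column of each row.
plan_text = " ".join(str(row[-1]) for row in plan_rows)
print(plan_text)
```

If the planner picks the composite index, the printed plan names ix_date_long, meaning the exact match on date and the range on long are both resolved through the index before the lat filter is applied to the remaining rows.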
Starting fresh:
Question - why are you grouping by both media.bid and media.date? Can a bid have records for more than one date?
Here's a simpler version to try:
SELECT
media.bid,
media.fid, -- fid is not in the question's subquery; assumed here to be a media column
media.date,
media.time,
media.title,
users.name,
media.address,
users.rank,
media.city,
media.state,
media.lat,
media.`long`,
ext.link,
ext.source,
ext.pre,
users.img,
( SELECT GROUP_CONCAT(tags.tagname SEPARATOR ' | ')
FROM tags
WHERE ext.bid = tags.bid
GROUP BY tags.bid
) AS meta
FROM
ext
LEFT JOIN
media ON ext.bid = media.bid AND ext.date = media.date
JOIN
users ON ext.bid = users.bid
WHERE
`long` BETWEEN -122.52224684058 AND -121.79760915942
AND lat BETWEEN 37.07500915942 AND 37.79964684058
AND ext.date = '2009-02-23'
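AND users.userid IN
(
-- MySQL rejects LIMIT directly inside an IN subquery, so wrap it in a derived table
SELECT userid FROM (SELECT userid FROM users ORDER BY rank DESC LIMIT 30) top_users
)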
ORDER BY
media.date,
users.rank DESC
LIMIT 0, 30
You might want to compare your performance against using a temp table for each selection, and joining those tables together.
CREATE TEMPORARY TABLE whatever (...);
CREATE TEMPORARY TABLE whatever2 (...);
INSERT INTO whatever SELECT ...;
INSERT INTO whatever2 SELECT ...;
SELECT ... FROM whatever JOIN whatever2 ON ...;
....
DROP TEMPORARY TABLE whatever;
DROP TEMPORARY TABLE whatever2;
If your system has enough memory to hold full tables this might work out much faster. It depends on how big your database is.