SQL : Call information from another table based on a condition - mysql

I have two tables.
The first one, passenger, holds information about airline passengers. It is small, just for the sake of representation.
The second one, flights, stores the flights taken by each passenger.
The idea is that if a passenger has taken more than 5 flights, I need to display that passenger's details.
I started with a query on the "flights" table that counts how many times each passenger id appears in the table:
SELECT passId, COUNT(passId) as numflights FROM flights GROUP BY passId;
Problem
I need to update that query so that, if the number of flights for a passId is > 5, it displays that passenger's details from the "passenger" table.

Use HAVING and IN.
SELECT *
FROM passengers
WHERE passid IN (SELECT passId FROM flights GROUP BY passId HAVING COUNT(passId) > 5)
If the number of rows is large, joining the tables may perform better:
SELECT p1.*
FROM passengers p1
INNER JOIN (SELECT passId FROM flights GROUP BY passId HAVING COUNT(passId) > 5) p2
ON p1.passid = p2.passId

Another method uses a correlated subquery:
select p.*
from passengers p
where (select count(*)
from flights f
where f.passid = p.passid
) > 5;
In particular, this can take advantage of an index on flights(passid). However, it is worth checking if this is actually faster than other alternatives, such as aggregating first and joining.
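As a rough check that the two approaches agree, here is a minimal sketch that runs the IN/HAVING form and the correlated-subquery form against the same data. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the column names other than passId, the index name and the sample rows are all made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE passengers (passId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE flights (flightId INTEGER PRIMARY KEY, passId INTEGER);
    CREATE INDEX idx_flights_passid ON flights (passId);
    INSERT INTO passengers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
""")
# Hypothetical data: Ana has 6 flights, Ben has exactly 5, Cy has none.
conn.executemany("INSERT INTO flights (passId) VALUES (?)", [(1,)] * 6 + [(2,)] * 5)

# Aggregate first, then filter passengers with IN.
in_having = conn.execute("""
    SELECT * FROM passengers
    WHERE passId IN (SELECT passId FROM flights GROUP BY passId HAVING COUNT(*) > 5)
    ORDER BY passId
""").fetchall()

# Count the flights per passenger with a correlated subquery.
correlated = conn.execute("""
    SELECT p.* FROM passengers p
    WHERE (SELECT COUNT(*) FROM flights f WHERE f.passId = p.passId) > 5
    ORDER BY p.passId
""").fetchall()

print(in_having)
print(correlated)
```

Only the passenger with strictly more than 5 flights comes back from either query. Which form is faster on real data still has to be measured on the actual server.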

Related

Creating a SQL view with personal best records

I have the following SQL Database structure:
Users are the registered users. Maps are like circuits or race tracks. When a user finishes a run, a new time record is created containing the userId, the mapId and the time needed to finish the race track.
I wish to create a view that lists every user's personal best on every map.
I tried creating the view like this:
CREATE VIEW map_pb AS
SELECT MID, UID, TID
FROM times
WHERE score IN (SELECT MIN(score) FROM times)
ORDER BY registered
This does not lead to the desired result.
Thank you for your help!
I assume that you have created the 'times' table as in the diagram above, with a 'score' column that measures each record
(MIN(score) is the best record).
You can create a view of the personal best records by joining to a sub-query like this.
CREATE VIEW map_pb AS
SELECT a.MID, a.UID, a.TID
FROM times a
INNER JOIN (
SELECT UID, MID, MIN(score) AS score
FROM times
GROUP BY UID, MID
) b ON a.UID = b.UID AND a.MID = b.MID AND a.score = b.score
-- if you have a 'registered' column in the 'times' table to order the result
ORDER BY a.registered
I hope this works.
You probably need to use a query that will first return the minimum score for each user on each map. Something like this:
SELECT UID,
MID,
MIN(score) AS best_time
FROM times
GROUP BY UID, MID
Note: I used MIN(score) as this is what is shown in your example query, but perhaps it should be MIN(time) instead?
Then just use the subquery JOINed to your other tables to get the output:
SELECT *
FROM (
SELECT UID,
MID,
MIN(score) AS best_time
FROM times
GROUP BY UID, MID
) a
INNER JOIN users u ON u.UID = a.UID
INNER JOIN maps m ON m.MID = a.MID
Of course, replace SELECT * with the columns you actually want.
Note: code untested but does give an idea as to a solution.
Start with a subquery to determine each user's minimum time on each map:
SELECT UID, MID, MIN(time) time
FROM times
GROUP BY UID, MID
Then join that subquery into a main query.
SELECT times.UID, times.MID, times.TID,
mintimes.time
FROM times
JOIN (
SELECT UID, MID, MIN(time) time
FROM times
GROUP BY UID, MID
) mintimes ON times.MID = mintimes.MID
AND times.UID = mintimes.UID
AND times.time = mintimes.time
JOIN maps ON times.MID = maps.MID
JOIN users ON times.UID = users.UID
This query pattern uses GROUP BY with an aggregate to find the extreme (MIN in this case) value for each combination. It then joins that subquery back to the table to find the detail record for each extreme value.
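The group-then-join-back pattern from these answers can be tried end to end with a small sketch. It assumes a times table with TID, UID, MID and score columns as in the question, uses Python's sqlite3 in place of MySQL, and the sample rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE times (TID INTEGER PRIMARY KEY, UID INTEGER, MID INTEGER, score REAL);
    INSERT INTO times (TID, UID, MID, score) VALUES
        (1, 1, 10, 62.5),
        (2, 1, 10, 59.1),
        (3, 1, 20, 80.0),
        (4, 2, 10, 61.0),
        (5, 2, 10, 64.2);

    -- Best score per (user, map), joined back to find the matching detail row.
    CREATE VIEW map_pb AS
    SELECT t.MID, t.UID, t.TID, t.score
    FROM times t
    JOIN (SELECT UID, MID, MIN(score) AS best
          FROM times
          GROUP BY UID, MID) b
      ON b.UID = t.UID AND b.MID = t.MID AND b.best = t.score;
""")

personal_bests = conn.execute(
    "SELECT MID, UID, TID, score FROM map_pb ORDER BY UID, MID"
).fetchall()
print(personal_bests)
```

Grouping by both UID and MID is what makes the result per user and per map. If a user records the same best score twice on one map, both rows appear in the view.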

Grouping in my SQL for a constraint less than 10

I am confused about which table I should use, and whether I should join the tables, when attempting this question:
List total number of hotels in the database that have less than 10 rooms.
Hotel (hotelNo, hotelName, city)
Room (roomNo, hotelNo, type, price)
Booking (hotelNo, guestNo, dateFrom, dateTo, roomNo)
Guest (guestNo, guestName, guestAddress)
I have tried using just one table, Room, in my statement:
SELECT hotelNo
FROM Room
WHERE roomNo < 10
GROUP BY hotelNo;
Would this be correct or should I use something like this?
SELECT h.hotelNo,r.roomNo
FROM Hotel h JOIN Room r ON h.hotelNo= r.hotelNo
WHERE r.roomNo < 10
GROUP BY hotelNo;
Assuming that all hotels have at least one room, you don't need a join. But you do need aggregation:
select count(*)
from (select r.hotelNo
from Room r
group by r.hotelNo
having count(*) < 10
) r;
The subquery returns the hotels that have fewer than 10 rooms (and are in the rooms table, so they have at least one room).
The outer query counts the number of such hotels.
Put the count condition in HAVING so that it is applied to the groups, then count the number of rows returned by that query:
SELECT COUNT(*)
FROM (
SELECT hotelNo
FROM Room
GROUP BY hotelNo
HAVING COUNT(*)<10
) AS TMP;
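The "at least one room" assumption matters: a hotel with no rows in Room never appears in a query on Room alone. The sketch below, using Python's sqlite3 as a stand-in for MySQL and invented sample data, compares the Room-only count with a LEFT JOIN variant that also counts hotels without any rooms. The LEFT JOIN variant is an illustration, not part of the answers above.

```python
import sqlite3

# SQLite stands in for MySQL; hotels and rooms below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Hotel (hotelNo INTEGER PRIMARY KEY, hotelName TEXT, city TEXT);
    CREATE TABLE Room (roomNo INTEGER, hotelNo INTEGER, type TEXT, price REAL);
    INSERT INTO Hotel VALUES (1, 'Grand', 'Oslo'), (2, 'Small', 'Bergen'), (3, 'Empty', 'Molde');
""")
rooms = [(n, 1, "double", 100.0) for n in range(1, 13)]   # hotel 1: 12 rooms
rooms += [(n, 2, "single", 60.0) for n in range(1, 4)]    # hotel 2: 3 rooms, hotel 3: none
conn.executemany("INSERT INTO Room VALUES (?, ?, ?, ?)", rooms)

# Room-only version: hotels without rooms are invisible to it.
room_only = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT hotelNo FROM Room GROUP BY hotelNo HAVING COUNT(*) < 10
    ) AS tmp
""").fetchone()[0]

# LEFT JOIN version: COUNT(r.roomNo) is 0 for a hotel with no rooms.
with_empty = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT h.hotelNo
        FROM Hotel h LEFT JOIN Room r ON r.hotelNo = h.hotelNo
        GROUP BY h.hotelNo
        HAVING COUNT(r.roomNo) < 10
    ) AS tmp
""").fetchone()[0]

print(room_only, with_empty)
```

Counting r.roomNo rather than * in the second query is deliberate: COUNT(*) would count the single NULL-extended row produced for a hotel without rooms.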

MySQL nested selection

I am trying to figure out this question on a practice page online with the following tables:
Question:
For all cases in which the same customer rated the same product
more than once, and in some point in time gave it a lower rating
than before, return the customer name, the name of the product,
and the lowest star rating that was given.
I can't seem to figure out why this isn't correct. Would anyone be able to help?
Here is what I have so far (without sample data):
SELECT
Customer.customer_name,
Product.product_name,
MIN(Rating.rating_stars)
FROM Rating
JOIN Product ON Rating.prod_id = Product.prod_id
JOIN Customer ON Rating.cust_id = Customer.prod_id
GROUP BY Customer.customer_name, Product.product_name
HAVING COUNT(Product.prod_id) > 1
This query will return the minimum rating stars of a product that has been reviewed more than once by the same customer, with any of the newer ratings lower than an older rating:
SELECT
r1.prod_id,
r1.cust_id,
MIN(r1.rating_star) AS min_rating
FROM
rating r1 INNER JOIN rating r2
ON r1.prod_id=r2.prod_id
AND r1.cust_id=r2.cust_id
AND r1.rating_date>r2.rating_date
AND r1.rating_star<r2.rating_star
GROUP BY
r1.prod_id,
r1.cust_id
You can then join this query with the product and customer tables:
SELECT
customer.customer_name,
product.product_name,
m.min_rating
FROM (
SELECT
r1.prod_id,
r1.cust_id,
MIN(r1.rating_star) AS min_rating
FROM
rating r1 INNER JOIN rating r2
ON r1.prod_id=r2.prod_id
AND r1.cust_id=r2.cust_id
AND r1.rating_date>r2.rating_date
AND r1.rating_star<r2.rating_star
GROUP BY
r1.prod_id,
r1.cust_id) m
INNER JOIN customer ON m.cust_id = customer.cust_id
INNER JOIN product ON m.prod_id = product.prod_id
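To see the self-join at work, here is a minimal sketch with Python's sqlite3 standing in for MySQL. The column names follow the query above (rating_star, rating_date, prod_id, cust_id); the schema details and sample rows are assumptions, and dates are stored as ISO strings so that string comparison matches date order.

```python
import sqlite3

# SQLite stands in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE product (prod_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE rating (cust_id INTEGER, prod_id INTEGER,
                         rating_date TEXT, rating_star INTEGER);
    INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO product VALUES (7, 'Kettle');
    INSERT INTO rating VALUES
        (1, 7, '2020-01-01', 5),   -- Ana starts at 5 stars,
        (1, 7, '2020-02-01', 3),   -- drops to 3,
        (1, 7, '2020-03-01', 4),   -- then goes back up to 4
        (2, 7, '2020-01-01', 2),   -- Ben only ever raises his rating
        (2, 7, '2020-02-01', 4);
""")

lowered = conn.execute("""
    SELECT c.customer_name, p.product_name, m.min_rating
    FROM (
        -- r1 is a later, lower rating than r2 for the same customer and product
        SELECT r1.prod_id, r1.cust_id, MIN(r1.rating_star) AS min_rating
        FROM rating r1
        JOIN rating r2
          ON r1.prod_id = r2.prod_id
         AND r1.cust_id = r2.cust_id
         AND r1.rating_date > r2.rating_date
         AND r1.rating_star < r2.rating_star
        GROUP BY r1.prod_id, r1.cust_id
    ) m
    JOIN customer c ON c.cust_id = m.cust_id
    JOIN product p ON p.prod_id = m.prod_id
    ORDER BY c.customer_name
""").fetchall()
print(lowered)
```

Ben is excluded because no later rating of his is lower than an earlier one, which is exactly what the HAVING COUNT(...) > 1 attempt in the question cannot express.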
Just a few points:
You can't specify the tables in the FROM clause the same way you specify attributes in the SELECT. A comma-separated list of tables is accepted, but it combines every row of one table with every row of the other unless you add a condition, so prefer a single table in the FROM and one more for each JOIN you use.
SELECT a, b, c FROM a; <----------fine
SELECT a, b, c FROM a, b; <----legal, but a cross join
SELECT a, b, c FROM a JOIN b; <---fine
When it comes to the tables in the FROM/JOIN, the "AS" keyword is optional when giving them an alias; the table name followed by the alias is enough.
FROM atable a JOIN btable b; <--This assigns alias a to "atable" and b to "btable".
You also have to specify the common attribute that the tables are going to be joined on:
customer JOIN rating ON customer.cust_id = rating.cust_id;
As for the rest, you can probably work out the correct WHERE clauses once you have the syntax down.
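A small sketch of the FROM/JOIN syntax, run through Python's sqlite3 as a stand-in for MySQL with made-up rows. It shows that a comma-separated FROM list is accepted but pairs every row with every row when no condition is given, and that a table alias works with or without AS.

```python
import sqlite3

# SQLite stands in for MySQL; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE rating (cust_id INTEGER, rating_star INTEGER);
    INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO rating VALUES (1, 5), (1, 3), (2, 4);
""")

# Comma-separated tables with no condition: 2 customers x 3 ratings.
cross_rows = conn.execute("SELECT COUNT(*) FROM customer, rating").fetchone()[0]

# Explicit JOIN with an ON condition; one alias uses AS, the other omits it.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM customer AS c
    JOIN rating r ON c.cust_id = r.cust_id
""").fetchone()[0]

print(cross_rows, joined_rows)
```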

MySQL - 3 tables, is this complex join even possible?

I have three tables: users, groups and relation.
Table users with fields: usrID, usrName, usrPass, usrPts
Table groups with fields: grpID, grpName, grpMinPts
Table relation with fields: uID, gID
A user can be placed in a group in two ways:
if he collects the group's minimum number of points (users.usrPts > groups.grpMinPts ORDER BY groups.grpMinPts DESC LIMIT 1)
if his relation to the group is added manually in the relation table (user ID provided as uID and group ID provided as gID in the table named relation)
Can I create one single query to determine, for every user (or one specific user), which group he belongs to, where a manual relation (from the relation table) has higher priority than usrPts compared to grpMinPts? Also, I do not want a user shown twice (once with his real group by points and once with his related group).
Thanks in advance! :) I tried:
SELECT * FROM users LEFT JOIN (relation LEFT JOIN groups ON relation.gID = groups.grpID) ON users.usrID = relation.uID
Using this I managed to extract the specified relations (from the relation table), but I have no idea how to include user points while respecting the priority mentioned above (manual relation first). I know how to do this with a few separate queries in PHP, which is simple, but I am curious: can it be done using one single query?
EDIT TO ADD:
Thanks to the really educational technique using COALESCE that @GordonLinoff provided, I managed to make this query work as I expected. So, here it goes:
SELECT o.usrID, o.usrName, o.usrPass, o.usrPts, t.grpID, t.grpName
FROM (
SELECT u.*, COALESCE(relationgroupid,groupid) AS thegroupid
FROM (
SELECT u.*, (
SELECT grpID
FROM groups g
WHERE u.usrPts > g.grpMinPts
ORDER BY g.grpMinPts DESC
LIMIT 1
) AS groupid, (
SELECT gID
FROM relation r
WHERE r.uID = u.usrID
) AS relationgroupid
FROM users u
)u
)o
JOIN groups t ON t.grpID = o.thegroupid
Also, if you are wondering, like I did, whether this approach is faster or slower than running three queries and processing the results in PHP: it is slightly faster. Executing this query and showing the results on a web page took 14 ms on average. Three simple queries, processing in PHP and showing the results on a web page took 21 ms. The averages are based on 10 runs, and the execution time was nearly constant across them.
Here is an approach that uses correlated subqueries to get each of the values. It then chooses the appropriate one using the precedence rule: if a relation exists, use that one; otherwise use the one from the groups table:
select u.*,
coalesce(relationgroupid, groupid) as thegroupid
from (select u.*,
(select grpid from groups g where u.usrPts > g.grpMinPts order by g.grpMinPts desc limit 1
) as groupid,
(select gID from relation r where r.uID = u.usrID
) as relationgroupid
from users u
) u
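The COALESCE precedence rule can be checked with a short sketch. It uses Python's sqlite3 in place of MySQL, takes the table and column names from the question, and invents the sample users, groups and the single manual relation. The table name groups is quoted because it may be treated as a keyword by some engines, and the LEFT JOIN at the end is an addition of this sketch so that users with no group still appear.

```python
import sqlite3

# SQLite stands in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (usrID INTEGER PRIMARY KEY, usrName TEXT, usrPts INTEGER);
    CREATE TABLE "groups" (grpID INTEGER PRIMARY KEY, grpName TEXT, grpMinPts INTEGER);
    CREATE TABLE relation (uID INTEGER, gID INTEGER);
    INSERT INTO users VALUES (1, 'ann', 120), (2, 'bob', 50), (3, 'cat', 5);
    INSERT INTO "groups" VALUES (1, 'bronze', 10), (2, 'silver', 40), (3, 'gold', 100);
    INSERT INTO relation VALUES (2, 1);   -- bob is manually placed in bronze
""")

memberships = conn.execute("""
    SELECT x.usrName, g.grpName
    FROM (
        SELECT u.*,
               COALESCE(
                   -- a manual relation wins if one exists
                   (SELECT r.gID FROM relation r WHERE r.uID = u.usrID LIMIT 1),
                   -- otherwise take the highest group the points qualify for
                   (SELECT g.grpID FROM "groups" g
                     WHERE u.usrPts > g.grpMinPts
                     ORDER BY g.grpMinPts DESC LIMIT 1)
               ) AS thegroupid
        FROM users u
    ) x
    LEFT JOIN "groups" g ON g.grpID = x.thegroupid
    ORDER BY x.usrID
""").fetchall()
print(memberships)
```

Here bob would qualify for silver by points, but the manual relation puts him in bronze, and each user appears exactly once.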
Try something like this
select user.name, group.name
from group
join relation on relation.gid = group.gid
join user on user.uid = relation.uid
union
select user.name, g1.name
from group g1
join group g2 on g2.minpts > g1.minpts
join user on user.pts between g1.minpts and g2.minpts

MySQL is not using INDEX in subquery

I have these tables and queries as defined in sqlfiddle.
At first my problem was to list each person together with the LEFT JOINed visits row that has the newest year. I solved that using a subquery.
Now my problem is that the subquery is not using the INDEX defined on the visits table. That causes my query to run nearly indefinitely on tables with approximately 15000 rows each.
Here's the query. The goal is to list every person once with his newest (by year) record in visits table.
Unfortunately on large tables it gets real sloooow because it's not using INDEX in subquery.
SELECT *
FROM people
LEFT JOIN (
SELECT *
FROM visits
ORDER BY visits.year DESC
) AS visits
ON people.id = visits.id_people
GROUP BY people.id
Does anyone know how to force MySQL to use INDEX already defined on visits table?
Your query:
SELECT *
FROM people
LEFT JOIN (
SELECT *
FROM visits
ORDER BY visits.year DESC
) AS visits
ON people.id = visits.id_people
GROUP BY people.id;
First, it uses non-standard SQL syntax (items appear in the SELECT list that are not part of the GROUP BY clause, are not aggregate functions and do not depend on the grouping items). This can give indeterminate (semi-random) results.
Second (to avoid the indeterminate results), you have added an ORDER BY inside a subquery, a trick that (non-standard or not) the MySQL documentation nowhere says will work as expected. So it may be working now, but it may stop working in the not so distant future, when you upgrade to MySQL version X (where the optimizer is clever enough to understand that an ORDER BY inside a derived table is redundant and can be eliminated).
Try using this query:
SELECT
p.*, v.*
FROM
people AS p
LEFT JOIN
( SELECT
id_people
, MAX(year) AS year
FROM
visits
GROUP BY
id_people
) AS vm
JOIN
visits AS v
ON v.id_people = vm.id_people
AND v.year = vm.year
ON v.id_people = p.id;
A compound index on (id_people, year) would help efficiency.
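A minimal sketch of the MAX(year)-per-person approach together with the compound index, using Python's sqlite3 as a stand-in for MySQL. The people and visits columns follow the queries above; the index name and the sample rows are assumptions, and the nested join is written here as two chained LEFT JOINs, which gives the same rows for this data.

```python
import sqlite3

# SQLite stands in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visits (id INTEGER PRIMARY KEY, id_people INTEGER,
                         year INTEGER, note TEXT);
    -- Compound index covering both the grouping and the join-back condition.
    CREATE INDEX idx_visits_people_year ON visits (id_people, year);
    INSERT INTO people VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO visits VALUES
        (1, 1, 2010, 'first'),
        (2, 1, 2012, 'latest'),
        (3, 2, 2011, 'only');
""")

latest = conn.execute("""
    SELECT p.name, v.year, v.note
    FROM people AS p
    LEFT JOIN (
        SELECT id_people, MAX(year) AS year
        FROM visits
        GROUP BY id_people
    ) AS vm ON vm.id_people = p.id
    LEFT JOIN visits AS v
           ON v.id_people = vm.id_people AND v.year = vm.year
    ORDER BY p.id
""").fetchall()
print(latest)
```

A person without any visits still appears, with NULLs in the visit columns.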
A different approach. It works fine if you limit the persons to a sensible limit (say 30) first and then join to the visits table:
SELECT
p.*, v.*
FROM
( SELECT *
FROM people
ORDER BY name
LIMIT 30
) AS p
LEFT JOIN
visits AS v
ON v.id_people = p.id
AND v.year =
( SELECT
year
FROM
visits
WHERE
id_people = p.id
ORDER BY
year DESC
LIMIT 1
)
ORDER BY name ;
Why do you have a subquery when all you need is a table name for joining?
It is also not obvious to me why your query has a GROUP BY clause in it. GROUP BY is ordinarily used with aggregate functions like MAX or COUNT, but you don't have those.
How about this? It may solve your problem.
SELECT people.id, people.name, MAX(visits.year) year
FROM people
JOIN visits ON people.id = visits.id_people
GROUP BY people.id, people.name
If you need to show the person, the most recent visit, and the note from the most recent visit, you're going to have to explicitly join the visits table again to the summary query (virtual table) like so.
SELECT a.id, a.name, a.year, v.note
FROM (
SELECT people.id, people.name, MAX(visits.year) year
FROM people
JOIN visits ON people.id = visits.id_people
GROUP BY people.id, people.name
)a
JOIN visits v ON (a.id = v.id_people and a.year = v.year)
Go fiddle: http://www.sqlfiddle.com/#!2/d67fc/20/0
If you need to show something for people that have never had a visit, you should try switching the JOIN items in my statement with LEFT JOIN.
As someone else wrote, an ORDER BY clause in a subquery is not standard, and generates unpredictable results. In your case it baffled the optimizer.
Edit: GROUP BY is a big hammer. Don't use it unless you need it. And, don't use it unless you use an aggregate function in the query.
Notice that if you have more than one row in visits for a person and the most recent year, this query will generate multiple rows for that person, one for each visit in that year. If you want just one row per person, and you DON'T need the note for the visit, then the first query will do the trick. If you have more than one visit for a person in a year, and you only need the latest one, you have to identify which row IS the latest one. Usually it will be the one with the highest ID number, but only you know that for sure. I added another person to your fiddle with that situation. http://www.sqlfiddle.com/#!2/4f644/2/0
This is complicated. But: if your visits.id numbers are automatically assigned and they are always in time order, you can simply report the highest visit id, and be guaranteed that you'll have the latest year. This will be a very efficient query.
SELECT p.id, p.name, v.year, v.note
FROM (
SELECT id_people, max(id) id
FROM visits
GROUP BY id_people
)m
JOIN people p ON (p.id = m.id_people)
JOIN visits v ON (m.id = v.id)
http://www.sqlfiddle.com/#!2/4f644/1/0 But this is not the way your example is set up. So you need another way to disambiguate your latest visit, so you just get one row per person. The only trick we have at our disposal is to use the largest id number.
So, we need to get a list of the visit.id numbers that are the latest ones, by this definition, from your tables. This query does that, with a MAX(year)...GROUP BY(id_people) nested inside a MAX(id)...GROUP BY(id_people) query.
SELECT v.id_people,
MAX(v.id) id
FROM (
SELECT id_people,
MAX(year) year
FROM visits
GROUP BY id_people
)p
JOIN visits v ON (p.id_people = v.id_people AND p.year = v.year)
GROUP BY v.id_people
The overall query (http://www.sqlfiddle.com/#!2/c2da2/1/0) is this.
SELECT p.id, p.name, v.year, v.note
FROM (
SELECT v.id_people,
MAX(v.id) id
FROM (
SELECT id_people,
MAX(year) year
FROM visits
GROUP BY id_people
)p
JOIN visits v ON ( p.id_people = v.id_people
AND p.year = v.year)
GROUP BY v.id_people
)m
JOIN people p ON (m.id_people = p.id)
JOIN visits v ON (m.id = v.id)
Disambiguation in SQL is a tricky business to learn, because it takes some time to wrap your head around the idea that there's no inherent order to rows in a DBMS.
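The tie problem and its fix can be reproduced with a small sketch, again using Python's sqlite3 in place of MySQL and invented rows. Ana has two visits in her latest year, and Ben's highest visit id belongs to an older year, so neither MAX(year) alone nor MAX(id) alone is enough.

```python
import sqlite3

# SQLite stands in for MySQL; all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visits (id INTEGER PRIMARY KEY, id_people INTEGER,
                         year INTEGER, note TEXT);
    INSERT INTO people VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO visits VALUES
        (1, 1, 2010, 'old'),
        (2, 1, 2012, 'spring'),
        (3, 1, 2012, 'autumn'),        -- second visit in Ana's latest year
        (4, 2, 2013, 'recent'),
        (5, 2, 2009, 'entered late');  -- highest id, but not the latest year
""")

# Joining on MAX(year) only: one row per visit in the latest year.
by_year = conn.execute("""
    SELECT p.name, v.year, v.note
    FROM (SELECT id_people, MAX(year) AS year FROM visits GROUP BY id_people) m
    JOIN visits v ON v.id_people = m.id_people AND v.year = m.year
    JOIN people p ON p.id = m.id_people
    ORDER BY p.id, v.id
""").fetchall()

# MAX(year) nested inside MAX(id): exactly one visit per person.
one_per_person = conn.execute("""
    SELECT p.name, v.year, v.note
    FROM (
        SELECT v.id_people, MAX(v.id) AS id
        FROM (SELECT id_people, MAX(year) AS year FROM visits GROUP BY id_people) y
        JOIN visits v ON v.id_people = y.id_people AND v.year = y.year
        GROUP BY v.id_people
    ) m
    JOIN people p ON p.id = m.id_people
    JOIN visits v ON v.id = m.id
    ORDER BY p.id
""").fetchall()

print(by_year)
print(one_per_person)
```

Choosing the largest id within the latest year is only a convention for breaking the tie; any other column that uniquely orders the visits would serve the same purpose.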