SQL performance issue : find a route - mysql

I'm struggling with a performance issue on one of my SQL queries.
I have a train journey that travels through 5 stations, named "A - B - C - D - E".
A passenger books a ticket for only the "B - C - D" part of the ride.
I need to retrieve all the stations my passenger goes through.
What I have stored:
JOURNEY
+----+--------------------+-------------------+-------------------+-----------------+
| id | departure_datetime | arrival_datetime | departure_station | arrival_station |
+----+--------------------+-------------------+-------------------+-----------------+
| 1 | 2018-01-01 06:00 | 2018-01-01 10:00 | A | E |
+----+--------------------+-------------------+-------------------+-----------------+
BOOKING
+----+------------+-------------------+-----------------+
| id | journey_id | departure_station | arrival_station |
+----+------------+-------------------+-----------------+
| 1 | 1 | B | D |
+----+------------+-------------------+-----------------+
LEG
+----+------------+-------------------+-----------------+------------------+------------------+
| id | journey_id | departure_station | arrival_station | departure_time | arrival_time |
+----+------------+-------------------+-----------------+------------------+------------------+
| 1 | 1 | A | B | 2018-01-01 06:00 | 2018-01-01 07:00 |
| 2 | 1 | B | C | 2018-01-01 07:00 | 2018-01-01 08:00 |
| 3 | 1 | C | D | 2018-01-01 08:00 | 2018-01-01 09:00 |
| 4 | 1 | D | E | 2018-01-01 09:00 | 2018-01-01 10:00 |
+----+------------+-------------------+-----------------+------------------+------------------+
The only way I found to retrieve the stations is:
select b.id as booking, l.departure_station, l.arrival_station
from JOURNEY j
inner join BOOKING b on j.id = b.journey_id
inner join LEG dl on (j.id = dl.journey_id and b.departure_station = dl.departure_station)
inner join LEG al on (j.id = al.journey_id and b.arrival_station = al.arrival_station)
inner join LEG l on (j.id = l.journey_id and l.departure_time >= dl.departure_time and l.arrival_time <= al.arrival_time)
where b.id = 1
But my LEG table is huge, and joining it 3 times is very slow. Is there a way to join the LEG table only once to improve performance?
Intended result:
+------------+-------------------+-----------------+
| booking_id | departure_station | arrival_station |
+------------+-------------------+-----------------+
| 1 | B | C |
| 1 | C | D |
+------------+-------------------+-----------------+
I work on MariaDB 10.2, so I have access to window functions, but I'm still not very comfortable with them.
Thanks.
EDIT : create tables :
CREATE TABLE `BOOKING` (
`id` INT(11) NOT NULL,
`journey_id` INT(11) NULL DEFAULT NULL,
`departure_station` VARCHAR(50) NULL DEFAULT NULL,
`arrival_station` VARCHAR(50) NULL DEFAULT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `JOURNEY` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`departure_time` DATETIME NULL DEFAULT NULL,
`arrival_time` DATETIME NULL DEFAULT NULL,
`departure_station` VARCHAR(50) NULL DEFAULT NULL,
`arrival_station` VARCHAR(50) NULL DEFAULT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `LEG` (
`id` INT(11) NOT NULL,
`journey_id` INT(11) NULL DEFAULT NULL,
`departure_station` VARCHAR(50) NULL DEFAULT NULL,
`arrival_station` VARCHAR(50) NULL DEFAULT NULL,
`departure_time` DATETIME NULL DEFAULT NULL,
`arrival_time` DATETIME NULL DEFAULT NULL,
PRIMARY KEY (`id`)
);

I don't like your DB schema.
But in your particular case, since your query already returns what you need, I would just create a few indexes to speed up execution.
In general, there is nothing wrong with joining a table to itself a few times.
http://sqlfiddle.com/#!9/1a467/1
Try adding these 4 indexes:
CREATE INDEX journey ON BOOKING (journey_id);
CREATE INDEX arrival ON LEG (journey_id, arrival_station);
CREATE INDEX departure ON LEG (journey_id, departure_station);
CREATE INDEX d_a_time ON LEG (journey_id, departure_time, arrival_time);
Then run your query again; it should be much faster once it can use the indexes.

I would suggest using a Common Table Expression (CTE), so that the joins only look at the legs of the booked journey:
WITH leg_cte as
(
SELECT l.* FROM leg l
JOIN booking b
ON l.journey_id = b.journey_id
WHERE b.id = 1
)
SELECT
b.id as booking,
l.departure_station,
l.arrival_station
FROM
booking b
JOIN leg_cte dl
ON b.departure_station = dl.departure_station
JOIN leg_cte al
ON b.arrival_station = al.arrival_station
JOIN leg_cte l
ON l.departure_time >= dl.departure_time AND l.arrival_time <= al.arrival_time
WHERE b.id = 1
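The CTE query above can be checked against the sample data from the question. This is only a sketch under assumptions: an in-memory SQLite database (via Python's sqlite3 module) stands in for MariaDB so the example runs on its own, the column types are simplified, and the extra leg for journey 2 as well as the added ORDER BY are illustrative additions that are not in the original post.

```python
import sqlite3

# Assumption: SQLite stands in for MariaDB so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BOOKING (id INTEGER PRIMARY KEY, journey_id INT,
                      departure_station TEXT, arrival_station TEXT);
CREATE TABLE LEG (id INTEGER PRIMARY KEY, journey_id INT,
                  departure_station TEXT, arrival_station TEXT,
                  departure_time TEXT, arrival_time TEXT);
INSERT INTO BOOKING VALUES (1, 1, 'B', 'D');
INSERT INTO LEG VALUES
  (1, 1, 'A', 'B', '2018-01-01 06:00', '2018-01-01 07:00'),
  (2, 1, 'B', 'C', '2018-01-01 07:00', '2018-01-01 08:00'),
  (3, 1, 'C', 'D', '2018-01-01 08:00', '2018-01-01 09:00'),
  (4, 1, 'D', 'E', '2018-01-01 09:00', '2018-01-01 10:00');
-- Hypothetical leg of another journey: the CTE should filter it out.
INSERT INTO LEG VALUES (5, 2, 'B', 'D', '2018-01-01 07:30', '2018-01-01 08:30');
""")

CTE_QUERY = """
WITH leg_cte AS (
    -- Only the legs that belong to the booked journey.
    SELECT l.* FROM LEG l
    JOIN BOOKING b ON l.journey_id = b.journey_id
    WHERE b.id = ?
)
SELECT b.id AS booking, l.departure_station, l.arrival_station
FROM BOOKING b
JOIN leg_cte dl ON b.departure_station = dl.departure_station
JOIN leg_cte al ON b.arrival_station = al.arrival_station
JOIN leg_cte l ON l.departure_time >= dl.departure_time
              AND l.arrival_time <= al.arrival_time
WHERE b.id = ?
ORDER BY l.departure_time
"""

# The booking id is passed twice: once for the CTE, once for the outer query.
booking_legs = conn.execute(CTE_QUERY, (1, 1)).fetchall()
print(booking_legs)
```

On this data the query returns the two legs B to C and C to D for booking 1. Whether the CTE is actually faster than the original three joins on a large LEG table depends on the optimizer and the available indexes, so it is worth comparing the execution plans.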

Try a left join and use REGEXP to filter on departure_station and arrival_station (this relies on the station names being single letters in alphabetical order):
select T3.id booking_id , T1.departure_station,T1.arrival_station
from LEG T1
left join JOURNEY T2 on T1.`journey_id` = T2.`id`
and (T1.`departure_time` >= T2.`departure_datetime` and T1.`arrival_time` <= T2.`arrival_datetime`)
left join BOOKING T3 on T3.`journey_id` = T2.`id`
and T1.departure_station REGEXP (CONCAT('[',T3.departure_station , '-' , T3.arrival_station,']' ))
and T1.arrival_station REGEXP (CONCAT('[',T3.departure_station , '-' , T3.arrival_station,']' ))
where T1.journey_id = 1 and T3.id is not null ;
Result:
| booking_id | departure_station | arrival_station |
|------------|-------------------|-----------------|
| 1 | B | C |
| 1 | C | D |
Test DDL:
CREATE TABLE JOURNEY
(`id` int, `departure_datetime` datetime, `arrival_datetime` datetime, `departure_station` varchar(1), `arrival_station` varchar(1))
;
INSERT INTO JOURNEY
(`id`, `departure_datetime`, `arrival_datetime`, `departure_station`, `arrival_station`)
VALUES
(1, '2018-01-01 06:00:00', '2018-01-01 10:00:00', 'A', 'E')
;
CREATE TABLE BOOKING
(`id` int, `journey_id` int, `departure_station` varchar(1), `arrival_station` varchar(1))
;
INSERT INTO BOOKING
(`id`, `journey_id`, `departure_station`, `arrival_station`)
VALUES
(1, 1, 'B', 'D')
;
CREATE TABLE LEG
(`id` int, `journey_id` int, `departure_station` varchar(1), `arrival_station` varchar(1), `departure_time` datetime, `arrival_time` datetime)
;
INSERT INTO LEG
(`id`, `journey_id`, `departure_station`, `arrival_station`, `departure_time`, `arrival_time`)
VALUES
(1, 1, 'A', 'B', '2018-01-01 06:00:00', '2018-01-01 07:00:00'),
(2, 1, 'B', 'C', '2018-01-01 07:00:00', '2018-01-01 08:00:00'),
(3, 1, 'C', 'D', '2018-01-01 08:00:00', '2018-01-01 09:00:00'),
(4, 1, 'D', 'E', '2018-01-01 09:00:00', '2018-01-01 10:00:00')
;

Related

MySql: how to get the desired result

I've a table like this:
CREATE TABLE `base_build_floor` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`build_no` varchar(64) NOT NULL,
`build_name` varchar(64) DEFAULT NULL,
`floor_no` varchar(64) DEFAULT NULL,
`floor_name` varchar(64) DEFAULT NULL,
PRIMARY KEY (`id`)
)
and insert some data:
INSERT INTO `base_build_floor` VALUES ('41', 'BUILD40210011', 'A', null, null);
INSERT INTO `base_build_floor` VALUES ('42', 'BUILD40210012', 'B', null, null);
INSERT INTO `base_build_floor` VALUES ('43', 'BUILD40210013', 'C', null, null);
INSERT INTO `base_build_floor` VALUES ('44', 'BUILD40210013', 'C', 'FLOOR40210002', 'C1');
INSERT INTO `base_build_floor` VALUES ('45', 'BUILD40210013', 'C', 'FLOOR40210003', 'C2');
INSERT INTO `base_build_floor` VALUES ('46', 'BUILD40210012', 'B', 'FLOOR40210004', 'B1');
The table models buildings and floors: first you create a building, then a building can have zero or more floors. Building A has no floors, building B has one floor named B1, and building C has two floors named C1 and C2. I want to get the result below:
41 BUILD40210011 A null null
44 BUILD40210013 C FLOOR40210002 C1
45 BUILD40210013 C FLOOR40210003 C2
46 BUILD40210012 B FLOOR40210004 B1
That is, if a building has no floors, return the building row itself; if a building has at least one floor, return only its floor rows and not the building row. How do I write this in MySQL? I tried to use a subquery, but it doesn't work.
I tried this:
SELECT
b.*
FROM
base_build_floor b
WHERE
b.floor_no IS NOT NULL
OR (
b.floor_no IS NULL
AND b.build_no NOT IN (
SELECT
GROUP_CONCAT(nostr)
FROM
(
SELECT
concat("'", f.build_no, "'") as nostr
FROM
base_build_floor f
WHERE
f.floor_no IS NOT NULL
GROUP BY
f.build_no
) t
)
)
But this returns all the rows.
With NOT EXISTS:
select t.* from base_build_floor t
where t.floor_no is not null
or not exists (
select 1 from base_build_floor
where build_no = t.build_no and floor_no is not null
)
See the demo.
Results:
| id | build_no | build_name | floor_no | floor_name |
| --- | ------------- | ---------- | ------------- | ---------- |
| 41 | BUILD40210011 | A | | |
| 44 | BUILD40210013 | C | FLOOR40210002 | C1 |
| 45 | BUILD40210013 | C | FLOOR40210003 | C2 |
| 46 | BUILD40210012 | B | FLOOR40210004 | B1 |
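To see why NOT EXISTS works where the GROUP_CONCAT attempt returned every row, here is a small sketch. It assumes an in-memory SQLite database (through Python's sqlite3 module) instead of MySQL, and the second query is a simplified version of the question's attempt, not a verbatim copy. GROUP_CONCAT produces one single string, so NOT IN ends up comparing each build_no with that one long value, which never matches.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE base_build_floor (
    id INTEGER PRIMARY KEY, build_no TEXT NOT NULL, build_name TEXT,
    floor_no TEXT, floor_name TEXT)
""")
conn.executemany("INSERT INTO base_build_floor VALUES (?, ?, ?, ?, ?)", [
    (41, "BUILD40210011", "A", None, None),
    (42, "BUILD40210012", "B", None, None),
    (43, "BUILD40210013", "C", None, None),
    (44, "BUILD40210013", "C", "FLOOR40210002", "C1"),
    (45, "BUILD40210013", "C", "FLOOR40210003", "C2"),
    (46, "BUILD40210012", "B", "FLOOR40210004", "B1"),
])

# Keep floor rows, plus building rows that have no floor row at all.
NOT_EXISTS_QUERY = """
SELECT t.id FROM base_build_floor t
WHERE t.floor_no IS NOT NULL
   OR NOT EXISTS (
        SELECT 1 FROM base_build_floor f
        WHERE f.build_no = t.build_no AND f.floor_no IS NOT NULL)
ORDER BY t.id
"""
kept_ids = [row[0] for row in conn.execute(NOT_EXISTS_QUERY)]

# Simplified version of the failing attempt: the subquery returns ONE row
# holding a comma-separated string, so no build_no is ever "in" it.
GROUP_CONCAT_QUERY = """
SELECT b.id FROM base_build_floor b
WHERE b.floor_no IS NOT NULL
   OR (b.floor_no IS NULL AND b.build_no NOT IN (
        SELECT GROUP_CONCAT(f.build_no) FROM base_build_floor f
        WHERE f.floor_no IS NOT NULL))
ORDER BY b.id
"""
all_ids = [row[0] for row in conn.execute(GROUP_CONCAT_QUERY)]

print(kept_ids)  # building A plus the three floor rows
print(all_ids)   # every row, as described in the question
```

Dropping GROUP_CONCAT and letting NOT IN read the build_no values directly from the subquery would also work here, because build_no is declared NOT NULL.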
This query would be much simpler if you had normalized tables. Ideally, you would have a buildings table with building id, no, and name, and a floors table with building id, floor no, and floor name. Then you could just join the two tables. Since that's not the case, we can basically extract the building and floor sub-tables from the main one and join them like this:
SELECT
b.build_no,
b.build_name,
f.floor_no,
f.floor_name
FROM
(SELECT DISTINCT build_no, build_name
FROM base_build_floor) b
LEFT OUTER JOIN
(SELECT *
FROM base_build_floor
WHERE floor_no IS NOT NULL) f ON b.build_no = f.build_no

Get Users available based on validation of status, id, and time spread across 3 tables

Let's say I want to pull a list of users (from TableA) who:
currently have Status = 2
and
are currently not in TableB,
OR
are in TableB (TableB.UserID), but whose app (TableB.AppID) is more than 1 year old, judged by comparing that app's CreatedDate in TableC with the current date.
Current Schema Setup...
TableA (UserID)
--------------------------------------
id | fName | lName | Status | CreatedDate
1 | John | Doe | 2 | 2017-03-02 06:31:15.482
2 | Marry | Jane | 2 | 2017-05-03 16:43:56.937
3 | William | Thompson | 4 | 2017-06-15 13:12:32.219
4 | Timothy | Limmons | 2 | 2017-09-27 01:52:42.842
TableB
--------------------------------------
id | AppID | UserID | CreatedDate
1 | 2 | 1 | 2019-04-16 23:21:56.099
2 | 3 | 4 | 2019-08-03 04:32:18.472
TableC (AppID)
--------------------------------------
id | Title | CreatedDate
1 | ToDo List | 2017-03-09 22:45:12.907
2 | Magic Marshmellows | 2018-11-14 07:01:04.050
3 | Project Falcon | 2019-07-23 14:22:44.837
The info above should pull the users from TableA with the ids 1 and 2.
Marry has not been paired with an App, and is therefore available.
John is paired with the App Magic Marshmellows, but the project began over 1 year ago, so he is therefore available.
The query should NOT pull the users with the ids 3 and 4.
William has a status of 4 (not 2) and is therefore NOT available.
Timothy is paired with the App Project Falcon, and this app began within a year of the current DateTime (12/15/2019), so he is therefore NOT available.
I need something like...
Select *
FROM
[TableA] a
WHERE
a.Status = 2
IF
TableB.UserID NOT CONTAINS a.id
ELSE IF
TableB.UserID = a.id
AND WHERE
TableB.AppID = TableC.id
AND WHERE
TableC.CreatedDate is less than 1 year old from Current Date
I'm just not sure how to go about using the right syntax for this. Any help would be appreciated.
P.S. If there is a better title for this complicated question, please let me know.
In MySQL you could use a query like this:
CREATE TABLE UserID
(`id` int, `fName` varchar(7), `lName` varchar(8), `Status` int, `CreatedDate` Date)
;
INSERT INTO UserID
(`id`, `fName`, `lName`, `Status`, `CreatedDate`)
VALUES
(1, 'John', 'Doe', 2, '2017-03-02 06:31:15.482'),
(2, 'Marry', 'Jane', 2, '2017-05-03 16:43:56.937'),
(3, 'William', 'Thompson', 4, '2017-06-15 13:12:32.219'),
(4, 'Timothy', 'Limmons', 2, '2017-09-27 01:52:42.842')
;
CREATE TABLE TableB
(`id` int, `AppID` int, `UserID` int, `CreatedDate` Date)
;
INSERT INTO TableB
(`id`, `AppID`, `UserID`, `CreatedDate`)
VALUES
(1, 2, 1, '2019-04-16 23:21:56.099'),
(2, 3, 4, '2019-08-03 04:32:18.472')
;
CREATE TABLE APPID
(`id` int, `Title` varchar(18), `CreatedDate` Date)
;
INSERT INTO APPID
(`id`, `Title`, `CreatedDate`)
VALUES
(1, 'ToDo List', '2017-03-09 22:45:12.907'),
(2, 'Magic Marshmellows', '2018-11-14 07:01:04.050'),
(3, 'Project Falcon', '2019-07-23 14:22:44.837')
;
SELECT u.*
From UserID u LEFT JOIN TableB b ON u.id = b.UserID
LEFT JOIN APPID a ON b.APPID = a.id
WHERE Status = 2
AND (u.id NOT IN (SELECT UserID FROM TableB)
OR (u.id IN (SELECT UserID FROM TableB) AND a.CreatedDate < NOW() - INTERVAL 1 YEAR));
With the current date taken as 2019-12-15, as in the question, this returns:
id | fName | lName | Status | CreatedDate
-: | :------ | :------ | -----: | :----------
1 | John | Doe | 2 | 2017-03-02
2 | Marry | Jane | 2 | 2017-05-03
SELECT * FROM APPID WHERE `CreatedDate` > NOW() - INTERVAL 1 YEAR
id | Title | CreatedDate
-: | :------------- | :----------
3 | Project Falcon | 2019-07-23
db<>fiddle here
SELECT
a.*
FROM
TableA AS a
LEFT JOIN (
SELECT
b.UserID
FROM
TableB AS b
INNER JOIN TableC AS c ON (
b.AppID = c.id
)
WHERE
c.CreatedDate >= DATE_SUB(NOW(), INTERVAL 1 YEAR)
) AS s ON (
a.id = s.UserID
)
WHERE
a.Status = 2
AND s.UserID IS NULL
OR
SELECT
a.*
FROM
TableA AS a
LEFT JOIN TableB AS b ON (
b.UserID = a.id
)
LEFT JOIN TableC AS c ON (
c.id = b.AppID
)
WHERE
a.Status = 2
AND (
b.UserID IS NULL
OR c.CreatedDate < DATE_SUB(NOW(), INTERVAL 1 YEAR)
)
OR
SELECT
a.*
FROM
TableA AS a
WHERE
a.Status = 2
AND NOT EXISTS (
SELECT
*
FROM
TableB AS b
INNER JOIN TableC AS c ON (
b.AppID = c.id
)
WHERE
b.UserID = a.id
AND c.CreatedDate >= DATE_SUB(NOW(), INTERVAL 1 YEAR)
)
=>
id fName lName Status CreatedDate
1 John Doe 2 2017-03-02
2 Marry Jane 2 2017-05-03
CREATE TABLE IF NOT EXISTS `TableA` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`fName` varchar(7) DEFAULT NULL,
`lName` varchar(8) DEFAULT NULL,
`Status` int(11) DEFAULT NULL,
`CreatedDate` date DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `Status` (`Status`,`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8;
DELETE FROM `TableA`;
INSERT INTO `TableA` (`id`, `fName`, `lName`, `Status`, `CreatedDate`) VALUES
(1, 'John', 'Doe', 2, '2017-03-02'),
(2, 'Marry', 'Jane', 2, '2017-05-03'),
(3, 'William', 'Thompson', 4, '2017-06-15'),
(4, 'Timothy', 'Limmons', 2, '2017-09-27');
CREATE TABLE IF NOT EXISTS `TableB` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`AppID` int(11) DEFAULT NULL,
`UserID` int(11) DEFAULT NULL,
`CreatedDate` date DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `AppID` (`AppID`),
KEY `UserID` (`UserID`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
DELETE FROM `TableB`;
INSERT INTO `TableB` (`id`, `AppID`, `UserID`, `CreatedDate`) VALUES
(1, 2, 1, '2019-04-16'),
(2, 3, 4, '2019-08-03');
CREATE TABLE IF NOT EXISTS `TableC` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`Title` varchar(18) DEFAULT NULL,
`CreatedDate` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `CreatedDate` (`CreatedDate`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;
DELETE FROM `TableC`;
INSERT INTO `TableC` (`id`, `Title`, `CreatedDate`) VALUES
(1, 'ToDo List', '2017-03-09 22:45:12'),
(2, 'Magic Marshmellows', '2018-11-14 07:01:04'),
(3, 'Project Falcon', '2019-07-23 14:22:44');
UPDATED: WHERE condition fixed
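Because all of these queries depend on NOW(), their result changes as time passes. The sketch below reruns the NOT EXISTS variant with the reference date passed in as a parameter, so the outcome from the question (12/15/2019) can be reproduced. Assumptions: SQLite via Python's sqlite3 module stands in for MySQL, SQLite's date(:today, '-1 year') replaces DATE_SUB(NOW(), INTERVAL 1 YEAR), and the helper name available_users is made up for this illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (id INTEGER PRIMARY KEY, fName TEXT, lName TEXT,
                     Status INT, CreatedDate TEXT);
CREATE TABLE TableB (id INTEGER PRIMARY KEY, AppID INT, UserID INT,
                     CreatedDate TEXT);
CREATE TABLE TableC (id INTEGER PRIMARY KEY, Title TEXT, CreatedDate TEXT);
INSERT INTO TableA VALUES
  (1, 'John', 'Doe', 2, '2017-03-02'),
  (2, 'Marry', 'Jane', 2, '2017-05-03'),
  (3, 'William', 'Thompson', 4, '2017-06-15'),
  (4, 'Timothy', 'Limmons', 2, '2017-09-27');
INSERT INTO TableB VALUES
  (1, 2, 1, '2019-04-16'),
  (2, 3, 4, '2019-08-03');
INSERT INTO TableC VALUES
  (1, 'ToDo List', '2017-03-09 22:45:12'),
  (2, 'Magic Marshmellows', '2018-11-14 07:01:04'),
  (3, 'Project Falcon', '2019-07-23 14:22:44');
""")

AVAILABLE_USERS = """
SELECT a.id, a.fName
FROM TableA AS a
WHERE a.Status = 2
  AND NOT EXISTS (
      -- A user is busy if any of their apps started within the last year.
      SELECT 1
      FROM TableB AS b
      JOIN TableC AS c ON b.AppID = c.id
      WHERE b.UserID = a.id
        AND c.CreatedDate >= date(:today, '-1 year')
  )
ORDER BY a.id
"""


def available_users(today):
    """Return (id, fName) of available users as of the ISO date `today`."""
    return conn.execute(AVAILABLE_USERS, {"today": today}).fetchall()


print(available_users("2019-12-15"))  # the date used in the question
print(available_users("2021-01-01"))  # later on, Timothy becomes available too
```

Passing the date explicitly also makes the query easy to test, which is harder when NOW() is baked into the SQL.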

Count with join, subquery and group by

I have these tables:
CREATE TABLE `logs` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`visitor_id` INT(11) NOT NULL,
`date_time` DATETIME NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `info` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`id_no` INT(11) NOT NULL,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
);
INSERT INTO `info`
VALUES
(1,20, 'vip'),(2,21, 'customer'),(3,22,'vip')
,(4,23, 'customer'),(5,24, 'vip'),(6,30,'customer')
,(7,31, 'vip'),(8,32,'customer'),(9,33,'vip' ),(10,34, 'vip'),(11,35,'vip');
INSERT INTO `logs`
VALUES
(1,20, '2019-01-01 08:00:00'),(2,21, '2019-01-01 08:05:00'),(3,22,'2019-01-01 08:08:00')
,(4,23, '2019-01-01 08:10:00'),(5,24, '2019-01-01 08:15:00'),(6,30,'2019-01-02 09:00:00')
,(7,31, '2019-01-02 09:10:00'),(8,32,'2019-01-02 09:15:00'),(9,33,'2019-01-02 09:17:00' ),(10,34, '2019-01-02 09:18:00');
This query:
select date(l.date_time) as `date`,
(select count(distinct(l.visitor_id)) from `logs` l join info i on (i.id_no = l.visitor_id) where i.`name` = 'CUSTOMER' and l.visitor_id=i.id_no) as total_customer,
(select count(l.visitor_id) from `logs` l join info i on (i.id_no = l.visitor_id) where i.`name` = 'vip') as total_vip,
count(distinct(l.visitor_id)) as total
from `logs` l
join info i on (i.id_no = l.visitor_id)
where l.date_time between '2019-01-01 00:00:00' and '2019-01-02 23:00:00'
group by date(l.date_time);
has this result:
| date | total_customer | total_vip | total |
-------------------------------------------------------
| 2019-01-01 | 4 | 6 | 5 |
| 2019-01-02 | 4 | 6 | 5 |
my desired result is this:
| date | total_customer | total_vip | total |
-------------------------------------------------------
| 2019-01-01 | 2 | 3 | 5 |
| 2019-01-02 | 2 | 3 | 5 |
May I know what's wrong with my query? I'm using mysql 5.5. Thank you.
You don't need subqueries; you can use conditional aggregation with SUM(CASE ...). The subqueries in your query are not tied to the date of the outer row, so they return the overall counts (4 and 6) on every line.
select date(l.date_time) as date
, sum(case when i.name = 'customer' then 1 else 0 end) as total_customer
, sum(case when i.name = 'vip' then 1 else 0 end) as total_vip
, count(1) as total
from logs l
join info i on (i.id_no = l.visitor_id)
where l.date_time between '2019-01-01 00:00:00' and '2019-01-02 23:00:00'
group by date(l.date_time);
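The SUM(CASE ...) approach can be checked against the sample data from the question. This is a sketch under assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL 5.5, an ORDER BY is added so the output order is fixed, and the column aliases follow the question's desired result. Note that SQLite compares strings case-sensitively by default, unlike MySQL's usual collations, so the literals are written in lowercase exactly as stored.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logs (id INTEGER PRIMARY KEY, visitor_id INT NOT NULL,
                   date_time TEXT NOT NULL);
CREATE TABLE info (id INTEGER PRIMARY KEY, id_no INT NOT NULL,
                   name TEXT NOT NULL);
INSERT INTO info VALUES
  (1,20,'vip'),(2,21,'customer'),(3,22,'vip'),(4,23,'customer'),
  (5,24,'vip'),(6,30,'customer'),(7,31,'vip'),(8,32,'customer'),
  (9,33,'vip'),(10,34,'vip'),(11,35,'vip');
INSERT INTO logs VALUES
  (1,20,'2019-01-01 08:00:00'),(2,21,'2019-01-01 08:05:00'),
  (3,22,'2019-01-01 08:08:00'),(4,23,'2019-01-01 08:10:00'),
  (5,24,'2019-01-01 08:15:00'),(6,30,'2019-01-02 09:00:00'),
  (7,31,'2019-01-02 09:10:00'),(8,32,'2019-01-02 09:15:00'),
  (9,33,'2019-01-02 09:17:00'),(10,34,'2019-01-02 09:18:00');
""")

# One pass over the joined rows; each CASE counts only its own category.
PER_DAY = """
SELECT date(l.date_time) AS day,
       SUM(CASE WHEN i.name = 'customer' THEN 1 ELSE 0 END) AS total_customer,
       SUM(CASE WHEN i.name = 'vip' THEN 1 ELSE 0 END) AS total_vip,
       COUNT(*) AS total
FROM logs l
JOIN info i ON i.id_no = l.visitor_id
WHERE l.date_time BETWEEN '2019-01-01 00:00:00' AND '2019-01-02 23:00:00'
GROUP BY date(l.date_time)
ORDER BY day
"""
per_day = conn.execute(PER_DAY).fetchall()

# What the question's scalar subquery computed: a count over ALL days.
overall_vip = conn.execute("""
SELECT COUNT(l.visitor_id) FROM logs l
JOIN info i ON i.id_no = l.visitor_id
WHERE i.name = 'vip'
""").fetchone()[0]

print(per_day)
print(overall_vip)
```

If a visitor can appear in the logs several times per day and should be counted once, COUNT(DISTINCT CASE WHEN ... THEN l.visitor_id END) is one way to adapt the same idea.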

GROUP BY + HAVING ignore row

Basically, I want to select every race record together with its record holder and best time. I looked up similar queries and found 3 that were faster than the rest.
The problem is that one of them completely ignores the race whose record is held by userid 2.
These are my tables, indexes, and some sample data:
CREATE TABLE `races` (
`raceid` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`raceid`),
UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `users` (
`userid` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`userid`),
UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `race_times` (
`raceid` smallint(5) unsigned NOT NULL,
`userid` mediumint(8) unsigned NOT NULL,
`time` mediumint(8) unsigned NOT NULL,
PRIMARY KEY (`raceid`,`userid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `races` (`raceid`, `name`) VALUES
(1, 'Doherty'),
(3, 'Easter Basin Naval S'),
(5, 'Flint County'),
(6, 'Fort Carson'),
(4, 'Glen Park'),
(2, 'Palomino Creek'),
(7, 'Tierra Robada');
INSERT INTO `users` (`userid`, `name`) VALUES
(1, 'Player 1'),
(2, 'Player 2');
INSERT INTO `race_times` (`raceid`, `userid`, `time`) VALUES
(1, 1, 51637),
(1, 2, 50000),
(2, 1, 148039),
(3, 1, 120516),
(3, 2, 124773),
(4, 1, 101109),
(6, 1, 89092),
(6, 2, 89557),
(7, 1, 77933),
(7, 2, 78038);
So if I run these 2 queries:
SELECT rt1.raceid, r.name, rt1.userid, p.name, rt1.time
FROM race_times rt1
LEFT JOIN users p ON (rt1.userid = p.userid)
JOIN races r ON (r.raceid = rt1.raceid)
WHERE rt1.time = (SELECT MIN(rt2.time) FROM race_times rt2 WHERE rt1.raceid = rt2.raceid)
GROUP BY r.name;
or..
SELECT rt1.*, r.name, p.name
FROM race_times rt1
LEFT JOIN users p ON p.userid = rt1.userid
JOIN races r ON r.raceid = rt1.raceid
WHERE EXISTS (SELECT NULL FROM race_times rt2 WHERE rt2.raceid = rt1.raceid
GROUP BY rt2.raceid HAVING MIN(rt2.time) >= rt1.time);
I receive correct results as shown below:
raceid | name | userid | name | time |
-------+----------------------+--------+----------+--------|
1 | Doherty | 2 | Player 2 | 50000 |
3 | Easter Basin Naval S | 1 | Player 1 | 120516 |
6 | Fort Carson | 1 | Player 1 | 89092 |
4 | Glen Park | 1 | Player 1 | 101109 |
2 | Palomino Creek | 1 | Player 1 | 148039 |
7 | Tierra Robada | 1 | Player 1 | 77933 |
and here is the faulty query:
SELECT rt.raceid, r.name, rt.userid, p.name, rt.time
FROM race_times rt
LEFT JOIN users p ON p.userid = rt.userid
JOIN races r ON r.raceid = rt.raceid
GROUP BY r.name
HAVING rt.time = MIN(rt.time);
and the result is this:
raceid | name | userid | name | time |
-------+----------------------+--------+----------+--------|
3 | Easter Basin Naval S | 1 | Player 1 | 120516 |
6 | Fort Carson | 1 | Player 1 | 89092 |
4 | Glen Park | 1 | Player 1 | 101109 |
2 | Palomino Creek | 1 | Player 1 | 148039 |
7 | Tierra Robada | 1 | Player 1 | 77933 |
As you can see, race "Doherty" (raceid: 1) is owned by "Player 2" (userid: 2) and it is not shown along with the rest of race records (which are all owned by userid 1). What is the problem?
Regards,
HAVING is a post-filter: the query first builds the grouped result and then filters it based on the HAVING condition. The GROUP BY collapses the rows of each group into one, and for non-aggregated columns such as rt.time MySQL (with ONLY_FULL_GROUP_BY disabled) takes the value from an arbitrary row of the group, in practice often the first one. Since player 1 is the first entry for race 1, that's the row the HAVING clause sees. It is then filtered out because its time does not equal the MIN(time) of the group.
This is why the other queries you posted use a sub-query. My personal preference is for the first example, as to me it's slightly easier to read. Performance-wise they should be about the same.
While it's not a bad idea to try and avoid sub queries in the where clause, this is mostly valid when you can accomplish the same result with a JOIN. Other times it's not possible to get the result with a JOIN and a sub query is required.
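As an illustration of the two options mentioned above, the sketch below runs the correlated sub-query from the question next to a JOIN against a derived table of per-race minimums and checks that both return the same record holders. Assumptions: SQLite via Python's sqlite3 module stands in for MySQL, the derived-table variant is an addition that is not in the original post, and the faulty GROUP BY + HAVING query is deliberately not reproduced because SQLite treats bare columns next to MIN() differently from MySQL.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE races (raceid INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE race_times (raceid INT NOT NULL, userid INT NOT NULL,
                         time INT NOT NULL, PRIMARY KEY (raceid, userid));
INSERT INTO races VALUES
  (1,'Doherty'),(3,'Easter Basin Naval S'),(5,'Flint County'),
  (6,'Fort Carson'),(4,'Glen Park'),(2,'Palomino Creek'),(7,'Tierra Robada');
INSERT INTO users VALUES (1,'Player 1'),(2,'Player 2');
INSERT INTO race_times VALUES
  (1,1,51637),(1,2,50000),(2,1,148039),(3,1,120516),(3,2,124773),
  (4,1,101109),(6,1,89092),(6,2,89557),(7,1,77933),(7,2,78038);
""")

# Option 1: correlated sub-query in the WHERE clause (from the question).
SUBQUERY = """
SELECT rt1.raceid, r.name, rt1.userid, p.name, rt1.time
FROM race_times rt1
LEFT JOIN users p ON rt1.userid = p.userid
JOIN races r ON r.raceid = rt1.raceid
WHERE rt1.time = (SELECT MIN(rt2.time) FROM race_times rt2
                  WHERE rt2.raceid = rt1.raceid)
ORDER BY rt1.raceid
"""

# Option 2: JOIN against a derived table holding the best time per race.
DERIVED_JOIN = """
SELECT rt.raceid, r.name, rt.userid, p.name, rt.time
FROM race_times rt
JOIN (SELECT raceid, MIN(time) AS best
      FROM race_times GROUP BY raceid) m
  ON m.raceid = rt.raceid AND m.best = rt.time
JOIN races r ON r.raceid = rt.raceid
LEFT JOIN users p ON p.userid = rt.userid
ORDER BY rt.raceid
"""

best_subquery = conn.execute(SUBQUERY).fetchall()
best_join = conn.execute(DERIVED_JOIN).fetchall()
for row in best_subquery:
    print(row)
```

Both variants keep race 1 with Player 2 as the record holder. If two players share the best time of a race, both queries return both rows.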

Rewriting MySQL query without using GROUP BY

Here is the table information:
Table name is Teaches,
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| ID | varchar(5) | NO | PRI | NULL | |
| course_id | varchar(8) | NO | PRI | NULL | |
| sec_id | varchar(8) | NO | PRI | NULL | |
| semester | varchar(6) | NO | PRI | NULL | |
| year | decimal(4,0) | NO | PRI | NULL | |
+-----------+--------------+------+-----+---------+-------+
The requirement is to find which courses appeared more than once in 2009 (ID is the teacher's id).
Here is my query using GROUP BY:
select course_id
from teaches
where year= 2009
group by course_id
having count(id) >= 2;
How could I write this without using GROUP BY?
You may try this. The subquery pairs each 2009 row with another 2009 row of the same course that has a different teacher ID, so it returns exactly the courses that appear more than once:
SELECT DISTINCT
T.course_id
FROM
teaches T
WHERE
T.course_id IN (
SELECT
T1.course_id
FROM teaches AS T1 INNER JOIN teaches AS T2 ON T1.course_id = T2.course_id
AND T1.`year` = T2.`year`
AND T1.id <> T2.id
WHERE T1.`year` = 2009
);
Test Schema And Data:
DROP TABLE IF EXISTS `teaches`;
CREATE TABLE `teaches` (
`ID` varchar(5) CHARACTER SET utf8 DEFAULT NULL,
`course_id` varchar(8) CHARACTER SET utf8 DEFAULT NULL,
`sec_id` varchar(8) CHARACTER SET utf8 DEFAULT NULL,
`semester` varchar(6) CHARACTER SET utf8 DEFAULT NULL,
`year` decimal(4,0) DEFAULT NULL
);
INSERT INTO `teaches` VALUES ('66', '100', 'B', '11', '2009');
INSERT INTO `teaches` VALUES ('71', '100', 'A', '11', '2009');
INSERT INTO `teaches` VALUES ('64', '102', 'C', '12', '2010');
INSERT INTO `teaches` VALUES ('77', '102', 'B', '22', '2009');
Expected Output:
course_id
100
SQL FIDDLE DEMO
You could use something like this (the window function needs MySQL 8.0+ or MariaDB 10.2+):
select distinct t.course_id
from (select t.*, count(*) over (partition by t.course_id) as cnt from teaches t where year= 2009 ) t
where cnt > 1
Try searching Stack Overflow before posting a question; see:
Select Rows that appear more than once
Something like below should work:
select distinct course_id
from teaches o
where (select count(1) from teaches i where i.course_id = o.course_id and i.`year` = 2009) > 1
For your homework, the SQL below should work.
It follows your logic: if a course has more than one teacher id in 2009, the course appeared more than once.
select DISTINCT T1.course_id
from teaches T1
where T1.course_id in (
select a.course_id
from teaches as a inner join teaches as b
on a.course_id = b.course_id and a.year = b.year and a.id <> b.id
where a.year= 2009 )
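To compare the GROUP BY query from the question with two GROUP BY-free rewrites, the sketch below runs all three on the small test data set posted in this thread. Assumptions: SQLite via Python's sqlite3 module stands in for MySQL, and the column types are simplified. One caveat: the self-join matches rows with different teacher IDs, so it would miss a course that the same teacher teaches in two sections, whereas the correlated count looks at rows and matches the GROUP BY version more closely.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE teaches (ID TEXT, course_id TEXT, sec_id TEXT,
                      semester TEXT, year INTEGER)
""")
conn.executemany("INSERT INTO teaches VALUES (?, ?, ?, ?, ?)", [
    ("66", "100", "B", "11", 2009),
    ("71", "100", "A", "11", 2009),
    ("64", "102", "C", "12", 2010),
    ("77", "102", "B", "22", 2009),
])

# Reference: the GROUP BY / HAVING query from the question.
WITH_GROUP_BY = """
SELECT course_id FROM teaches
WHERE year = 2009
GROUP BY course_id
HAVING COUNT(ID) >= 2
"""

# Rewrite 1: correlated count of the course's 2009 rows.
CORRELATED_COUNT = """
SELECT DISTINCT o.course_id FROM teaches o
WHERE (SELECT COUNT(*) FROM teaches i
       WHERE i.course_id = o.course_id AND i.year = 2009) > 1
"""

# Rewrite 2: self-join that finds a second 2009 row with another teacher.
SELF_JOIN = """
SELECT DISTINCT t1.course_id
FROM teaches t1
JOIN teaches t2 ON t1.course_id = t2.course_id
               AND t1.year = t2.year
               AND t1.ID <> t2.ID
WHERE t1.year = 2009
"""

grouped = conn.execute(WITH_GROUP_BY).fetchall()
counted = conn.execute(CORRELATED_COUNT).fetchall()
joined = conn.execute(SELF_JOIN).fetchall()
print(grouped, counted, joined)
```

On this data all three queries return course 100 only, since course 102 has a single row in 2009.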