Join two tables matching multiple IDs to names - mysql

Fiddle here: http://sqlfiddle.com/#!9/53d3c/2/0
I have two tables, one containing Member Names and their ID Number. Let's call that table Names:
CREATE TABLE Names (
ID int,
Title text
);
INSERT INTO Names
VALUES (11,'Chad'),
(10,'Deb'),
(34,'Steph'),
(13,'Chris'),
(98,'Peter'),
(33,'Daniel'),
(78,'Christine'),
(53,'Yolanda')
;
My second table contains meeting information, where one person is a Coach and another is a Player. Each entry is a separate row (i.e. Meeting_ID 1 has two rows, one for the coach and one for the player). A further column, field_id, identifies whether the row is for a player (1) or a coach (2).
CREATE TABLE Meeting_Data (
Meeting_ID int,
Player_ID int,
Coach_ID int,
field_id int
);
INSERT INTO Meeting_Data
VALUES (1,0,11,2),
(1,10,0,1),
(2,34,0,1),
(2,0,13,2),
(3,98,0,1),
(3,0,33,2),
(4,78,0,1),
(4,0,53,2)
;
What I'm trying to do is create a table that puts each Meeting on one row, together with the ID numbers and Names of the people meeting. When I attempt this, one name column is populated correctly and the other contains only (null) values.
SELECT Meeting_ID,
Max(CASE
WHEN field_id = 1 THEN Player_ID
END) AS Player_ID,
Max(CASE
WHEN field_id = 2 THEN Coach_ID
END) AS Coach_ID,
Player_Names.Title as Player_Names,
Coach_Names.Title as Coach_Names
FROM Meeting_Data
LEFT JOIN Names Player_Names
ON Player_ID = Player_Names.ID
LEFT JOIN Names Coach_Names
ON Coach_ID = Coach_Names.ID
GROUP BY Meeting_ID
Which results in:
| Meeting_ID | Player_ID | Coach_ID | Player_Names | Coach_Names |
|------------|-----------|----------|--------------|-------------|
| 1 | 10 | 11 | Deb | (null) |
| 2 | 34 | 13 | Steph | (null) |
| 3 | 98 | 33 | Peter | (null) |
| 4 | 78 | 53 | Christine | (null) |

How about something like this (http://sqlfiddle.com/#!9/53d3c/52/0):
SELECT Meeting_ID, Player_ID, Coach_ID, Players.Title, Coaches.Title
FROM (
SELECT Meeting_ID,
MAX(Player_ID) as Player_ID,
MAX(Coach_ID) as Coach_ID
FROM Meeting_Data
GROUP BY Meeting_ID
) meeting
LEFT JOIN Names Players ON Players.ID = meeting.Player_ID
LEFT JOIN Names Coaches ON Coaches.ID = meeting.Coach_ID
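A likely reason the original query returns (null): every Meeting_Data row carries only one real ID (the other is 0), so when the joins run before the GROUP BY, each row can match at most one name, and the non-aggregated Title columns are taken from a single row of each group. Aggregating first and joining afterwards avoids that. The sketch below checks the query above against the question's data; it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the Player_Name and Coach_Name aliases are added for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Names (ID int, Title text);
INSERT INTO Names VALUES (11,'Chad'),(10,'Deb'),(34,'Steph'),(13,'Chris'),
                         (98,'Peter'),(33,'Daniel'),(78,'Christine'),(53,'Yolanda');
CREATE TABLE Meeting_Data (Meeting_ID int, Player_ID int, Coach_ID int, field_id int);
INSERT INTO Meeting_Data VALUES (1,0,11,2),(1,10,0,1),(2,34,0,1),(2,0,13,2),
                                (3,98,0,1),(3,0,33,2),(4,78,0,1),(4,0,53,2);
""")

# Collapse each meeting to one row first, then look up both names.
rows = conn.execute("""
SELECT meeting.Meeting_ID, meeting.Player_ID, meeting.Coach_ID,
       Players.Title AS Player_Name, Coaches.Title AS Coach_Name
FROM (SELECT Meeting_ID,
             MAX(Player_ID) AS Player_ID,
             MAX(Coach_ID)  AS Coach_ID
      FROM Meeting_Data
      GROUP BY Meeting_ID) meeting
LEFT JOIN Names Players ON Players.ID = meeting.Player_ID
LEFT JOIN Names Coaches ON Coaches.ID = meeting.Coach_ID
ORDER BY meeting.Meeting_ID
""").fetchall()

for row in rows:
    print(row)
```

Note that MAX() only works here because the unused ID in each row is 0; if the placeholder were NULL it would still work, but a placeholder larger than real IDs would not.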

Related

Select locations where the user doesn't have it bound yet from 3 tables

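I have 3 tables:
User Info
| id | name |
|----|------|
| 1 | bob |
| 2 | jane |
| 3 | tom |
Locations
| id | name |
|----|-------|
| 1 | Test1 |
| 2 | Test2 |
| 3 | Test3 |
| 4 | Test4 |
User Locations
| userID | locationID |
|--------|------------|
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |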
Basically, what I am trying to achieve is to pull the names of the locations that a given user doesn't have bound yet.
In the list above, Bob has two locations bound, Test1 and Test2, but he doesn't have Test3 or Test4 yet. I want the query to return only Test3 and Test4, since those are the only ones Bob doesn't have.
Jane has only Test3 bound and none of the remaining three.
Originally I tried the query below and it somewhat worked. However, every time any user gets a location bound, that location is removed from the list for everyone. I'm not sure how to add the user ID to this so that the result is specific to one user.
SELECT `name` FROM `locations`
WHERE `id` NOT IN (SELECT `locationID` FROM `user_locations`)
Create a Cartesian product of the user_info and locations tables (cross join); an outer join to user_locations then finds the combinations that are as yet unmatched:
select
user_info.ID AS UserID
, locations.ID AS locationID
from user_info
cross join locations
left outer join user_locations on user_info.id = user_locations.userid
and locations.id = user_locations.locationid
where user_locations.userid IS NULL
and user_info.name = 'bob'
SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE user_info(
id INTEGER NOT NULL PRIMARY KEY
,name VARCHAR(4) NOT NULL
);
INSERT INTO user_info(id,name) VALUES (1,'bob');
INSERT INTO user_info(id,name) VALUES (2,'jane');
INSERT INTO user_info(id,name) VALUES (3,'tom');
CREATE TABLE locations(
id INTEGER NOT NULL PRIMARY KEY
,name VARCHAR(5) NOT NULL
);
INSERT INTO locations(id,name) VALUES (1,'Test1');
INSERT INTO locations(id,name) VALUES (2,'Test2');
INSERT INTO locations(id,name) VALUES (3,'Test3');
INSERT INTO locations(id,name) VALUES (4,'Test4');
CREATE TABLE user_locations(
userID INTEGER NOT NULL
,locationID INTEGER NOT NULL
);
INSERT INTO user_locations(userID,locationID) VALUES (1,1);
INSERT INTO user_locations(userID,locationID) VALUES (1,2);
INSERT INTO user_locations(userID,locationID) VALUES (2,3);
Query 1:
select
user_info.ID AS UserID
, locations.ID AS locationID
from user_info
cross join locations
left outer join user_locations on user_info.id = user_locations.userid
and locations.id = user_locations.locationid
where user_locations.userid IS NULL
order by 1,2
Results:
| UserID | locationID |
|--------|------------|
| 1 | 3 |
| 1 | 4 |
| 2 | 1 |
| 2 | 2 |
| 2 | 4 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
| 3 | 4 |
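Since the question asks for location names for one specific user, the same cross join / outer join pattern can be wrapped in a parameterized query. This is a minimal sketch, not the answerer's code: it uses Python's built-in sqlite3 module in place of MySQL, and the helper name unbound_locations is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_info(id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(4) NOT NULL);
INSERT INTO user_info(id,name) VALUES (1,'bob'),(2,'jane'),(3,'tom');
CREATE TABLE locations(id INTEGER NOT NULL PRIMARY KEY, name VARCHAR(5) NOT NULL);
INSERT INTO locations(id,name) VALUES (1,'Test1'),(2,'Test2'),(3,'Test3'),(4,'Test4');
CREATE TABLE user_locations(userID INTEGER NOT NULL, locationID INTEGER NOT NULL);
INSERT INTO user_locations(userID,locationID) VALUES (1,1),(1,2),(2,3);
""")

def unbound_locations(conn, user_name):
    """Return the names of locations not yet bound to the given user (hypothetical helper)."""
    sql = """
    SELECT locations.name
    FROM user_info
    CROSS JOIN locations
    LEFT OUTER JOIN user_locations
           ON user_info.id = user_locations.userID
          AND locations.id = user_locations.locationID
    WHERE user_locations.userID IS NULL   -- no binding row was found
      AND user_info.name = ?              -- restrict to one user
    ORDER BY locations.id
    """
    return [name for (name,) in conn.execute(sql, (user_name,))]

print(unbound_locations(conn, "bob"))
print(unbound_locations(conn, "jane"))
```

Filtering on user_info.id instead of the name would be more robust if names are not unique.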

How to SELECT name, max(value from a table joined on name)?

Summary: I have a forum with a forums table and a posts table. Every post has a unique ID (an auto-incrementing integer), and every post references a forums.id.
I'm trying to issue a SELECT query that retrieves all the forum names, all the forum IDs, and then the highest posts.id associated with that forum.
It's possible that there are no posts in a forum and in that case I want the max-posts-id to be 0.
Forums table:
| ID | Name |
|----|------|
| 1 | Dogs |
| 2 | Food |
| 3 | Work |
Posts table:
| ID | Forum_ID | Author | Text |
|----|----------|--------|------|
| 42 | 1 | Mr. S | foo |
| 43 | 3 | Mr. Y | bar |
| 44 | 1 | Ms. X | baz |
| 45 | 2 | Ms. A | foo |
| 46 | 1 | Mr. M | foo |
| 47 | 3 | Ms. A | bar |
| 48 | 2 | Mr. L | baz |
Desired result:
| Forum_ID | Name | Max_Posts_ID |
|----------|------|--------------|
| 1 | Dogs | 46 |
| 2 | Food | 48 |
| 3 | Work | 47 |
My attempt
SELECT
forums.id AS id,
forums.name AS name,
COALESCE(MAX(SELECT id FROM posts WHERE forums.id = ?), 0)
JOIN
posts ON forums.id = posts.forum_id;
But I don't think I can pass a parameter to my nested SELECT query, I don't think that's the right approach. What should I do instead?
You could use LEFT JOIN and aggregation:
SELECT f.id AS id,
f.name AS name,
COALESCE(MAX(p.id),0) AS Max_Posts_ID
FROM Forums f
LEFT JOIN Posts p
ON f.Id = p.forum_id
GROUP BY f.id, f.name
ORDER BY f.id;
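To see the "no posts means 0" requirement in action, the sketch below runs the LEFT JOIN query on the question's data plus one extra forum with no posts. It is only an illustration: Python's built-in sqlite3 module stands in for the real database, and the fourth forum ('Cats') is a hypothetical row that is not in the question.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forums (id int PRIMARY KEY, name text);
INSERT INTO forums VALUES (1,'Dogs'),(2,'Food'),(3,'Work'),
                          (4,'Cats');  -- hypothetical forum with no posts
CREATE TABLE posts (id int PRIMARY KEY, forum_id int, author text, text text);
INSERT INTO posts VALUES (42,1,'Mr. S','foo'),(43,3,'Mr. Y','bar'),
                         (44,1,'Ms. X','baz'),(45,2,'Ms. A','foo'),
                         (46,1,'Mr. M','foo'),(47,3,'Ms. A','bar'),
                         (48,2,'Mr. L','baz');
""")

# LEFT JOIN keeps forums without posts; MAX over no rows is NULL,
# which COALESCE turns into 0.
rows = conn.execute("""
SELECT f.id AS id,
       f.name AS name,
       COALESCE(MAX(p.id), 0) AS Max_Posts_ID
FROM forums f
LEFT JOIN posts p ON f.id = p.forum_id
GROUP BY f.id, f.name
ORDER BY f.id
""").fetchall()

for row in rows:
    print(row)
```

With an INNER JOIN instead of a LEFT JOIN, the empty forum would disappear from the result rather than showing 0.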
Solution to your problem:
MySQL
SELECT fs.id AS Forum_ID ,
fs.name AS Name,
IFNULL(MAX(ps.ID),0) AS Max_Posts_ID
FROM forums fs
LEFT JOIN posts ps
ON fs.id = ps.forum_id
GROUP BY fs.id,fs.name;
Link To the MySQL Demo:
http://sqlfiddle.com/#!9/a18ab2/1
MSSQL
SELECT fs.id AS Forum_ID ,
fs.name AS Name,
ISNULL(MAX(ps.ID),0) AS Max_Posts_ID
FROM forums fs
LEFT JOIN posts ps
ON fs.id = ps.forum_id
GROUP BY fs.id,fs.name;
Link To the MSSQL Demo:
http://sqlfiddle.com/#!18/a18ab/2
OUTPUT:
Forum_ID Name Max_Posts_ID
1 Dogs 46
2 Food 48
3 Work 47
Let me correct your current attempt using a correlated subquery:
SELECT
id AS id,
name AS name,
(SELECT COALESCE(MAX(ID), 0) FROM Posts where forum_id = f.Id) AS Max_Posts_ID
FROM Forums f
Corrections:
Your outer query had no FROM clause.
The subquery did not correlate Posts.forum_id with the outer query's id column.
Link to Code : http://tpcg.io/pI2HO5
BEGIN TRANSACTION;
/* Create a table called Forums */
CREATE TABLE Forums (Id integer PRIMARY KEY, Name text);
/* Create few records in this table */
INSERT INTO Forums VALUES(1,'Dogs');
INSERT INTO Forums VALUES(2,'Food');
INSERT INTO Forums VALUES(3,'Work');
/* Create a table called Posts */
CREATE TABLE Posts (Id integer PRIMARY KEY, forId integer);
/* Create few records in this table */
INSERT INTO Posts VALUES(42,1);
INSERT INTO Posts VALUES(43,3);
INSERT INTO Posts VALUES(64,1);
INSERT INTO Posts VALUES(45,2);
INSERT INTO Posts VALUES(46,1);
INSERT INTO Posts VALUES(47,3);
INSERT INTO Posts VALUES(48,2);
INSERT INTO Posts VALUES(51,2);
COMMIT;
/* Display the highest post id for each forum (0 if the forum has no posts) */
SELECT f.Id AS Forum_Id, COALESCE(MAX(p.Id), 0) AS Max_Posts_Id, f.Name FROM Forums f LEFT JOIN Posts p ON p.forId = f.Id GROUP BY f.Id, f.Name ORDER BY f.Id;
Output :
1|64|Dogs
2|51|Food
3|47|Work

MySQL Select from column use ^ as delimiter

My question is similar to MySQL Split String and Select with results. Currently I have 2 tables:
student
uid | subject_id | name
1 | 1^2^3^4 | a
2 | 2^3^ | b
3 | 1 | c
subject
uid | subject_name
1 | math
2 | science
3 | languange
4 | sport
The result I expected is:
uid | name | subject_passed
1 | a | math, science, languange, sport
2 | b | science, languange
3 | c | math
I have tried this query:
SELECT
student.uid,
student.name,
group_concat(subject.subject_name) as subjects_passed
from student
join subject on find_in_set(subject.uid,student.subject_id ) > 0
group by student.uid
Which returns the error:
#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use
near 'join subject on find_in_set(subject.uid,student.subject_id ) > 0
group' at line 7
I believe this is because of FIND_IN_SET. According to the documentation, this function expects , as the delimiter. Is there an alternative query I could use?
Why not REPLACE the separator:
SELECT
student.uid,
student.name,
GROUP_CONCAT(subject.subject_name) AS subjects_passed
FROM student
JOIN subject ON FIND_IN_SET(subject.uid, REPLACE(student.subject_id, '^', ',')) > 0
GROUP BY student.uid
SQLFiddle
If you decide to normalize your tables, it is fairly straightforward to create the junction table and generate the data:
-- Sample table structure
CREATE TABLE student_subject (
student_id int NOT NULL,
subject_id int NOT NULL,
PRIMARY KEY (student_id, subject_id)
);
-- Sample query to normalize the student <-> subject relationship
SELECT
student.uid AS student_id,
subject.uid AS subject_id
FROM student
JOIN subject ON FIND_IN_SET(subject.uid, REPLACE(student.subject_id, '^', ',')) > 0
+------------+------------+
| student_id | subject_id |
+------------+------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 2 | 2 |
| 2 | 3 |
| 3 | 1 |
+------------+------------+
You should never store multiple values in one column separated by a delimiter. Instead, normalize the table and create a third table to store the student-to-subject relation.
However, in the current case you may do it as follows:
select
st.uid,
st.name,
group_concat(sb.subject_name) as subject_name
from student st
left join subject sb on find_in_set(sb.uid,replace(st.subject_id,'^',',')) > 0
group by st.uid
Here is how to create the third table and store the relation:
create table student_to_subject (id int primary key auto_increment, stid int, subid int);
insert into student_to_subject(stid,subid) values
(1,1),(1,2),(1,3),(1,4),(2,2),(2,3),(3,1);
Now you can remove the column subject_id from the student table, so the query becomes:
select
st.uid,
st.name,
group_concat(sb.subject_name) as passed_subject
from student st
join student_to_subject sts on sts.stid = st.uid
join subject sb on sb.uid = sts.subid
group by st.uid;
http://www.sqlfiddle.com/#!9/f02df
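The rows of the third table do not have to be typed by hand; they can be derived from the existing '^'-separated column. The following is a rough sketch of such a one-off migration, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a composite primary key on student_to_subject (a simplification of the table above). The splitting is done in application code here rather than in SQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (uid int PRIMARY KEY, subject_id text, name text);
INSERT INTO student VALUES (1,'1^2^3^4','a'),(2,'2^3^','b'),(3,'1','c');
CREATE TABLE subject (uid int PRIMARY KEY, subject_name text);
INSERT INTO subject VALUES (1,'math'),(2,'science'),(3,'languange'),(4,'sport');
CREATE TABLE student_to_subject (stid int, subid int, PRIMARY KEY (stid, subid));
""")

# Split each packed string on '^' and skip empty pieces (e.g. the trailing '^' in '2^3^').
pairs = []
for stid, packed in conn.execute("SELECT uid, subject_id FROM student ORDER BY uid").fetchall():
    for piece in packed.split("^"):
        if piece:
            pairs.append((stid, int(piece)))
conn.executemany("INSERT INTO student_to_subject (stid, subid) VALUES (?, ?)", pairs)

# The normalized query no longer needs FIND_IN_SET or REPLACE.
rows = conn.execute("""
SELECT st.uid, st.name, group_concat(sb.subject_name) AS passed_subject
FROM student st
JOIN student_to_subject sts ON sts.stid = st.uid
JOIN subject sb ON sb.uid = sts.subid
GROUP BY st.uid, st.name
ORDER BY st.uid
""").fetchall()

for row in rows:
    print(row)
```

The order of names inside each concatenated string is not guaranteed unless the database is told how to order them.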
I think you can replace ^ with , before calling find_in_set:
SELECT
student.uid,
student.name,
group_concat(subject.subject_name) as subjects_passed
from student
join subject on find_in_set(subject.uid, replace(student.subject_id,'^',',') ) > 0
group by student.uid
But of course, storing values in such a format is very bad database design.

SQL Self Join with null values

I have all employees (both managers and regular employees) in one table called Employee. The table looks as follows:
Table
+-------+------------+---------+---------+------------+
|emp_id | name | dept_id | salary | manager_id |
+=======+============+=========+=========+============+
| 1 | Sally | 1 | 20000 | null |
| 2 | Ajit | 2 | 20000 | 1 |
| 3 | Rahul | 1 | 20000 | 1 |
| 4 | uday | 1 | 20000 | null |
| 5 | john | 1 | 20000 | null |
| 6 | netaji | 2 | 20000 | 2 |
| 7 | prakriti | 3 | 1111 | 3 |
| 8 | sachin | 3 | 1111 | 3 |
| 9 | santosh | 1 | 1111 | 2 |
| 10 | Ravi | 1 | 1111 | 2 |
+-------+------------+---------+---------+------------+
Both managers and employees belong to the same table. manager_id refers to the emp_id of the employee's manager.
I want to write a query that counts the number of employees belonging to each manager. Even if a certain manager doesn't have any employees under her or him, the count should show as 0.
Result should be as follows,
Expected Output
+------+----------+
|Count | Manager |
+======+==========+
| 2 | Sally |
| 3 | Ajit |
| 2 | Rahul |
| 0 | Uday |
| 0 | John |
+------+----------+
You need to do left self-join on the table. The left join will ensure that there is a row for every manager even if there are no employees under them. You need to use the COUNT() aggregate on a field from the employee side of the join that will be NULL if the manager has no employees. COUNT() doesn't actually count NULLs so this should give you zeroes where you want them.
The WHERE clause in this query defines managers by looking if their manager_id is NULL or if there are any matches in the joined table which means there are people that have them set as their manager.
SELECT mgr.name, COUNT(emp.emp_id) AS employee_count
FROM Employee AS mgr
LEFT JOIN Employee AS emp ON emp.manager_id=mgr.emp_id
WHERE mgr.manager_id IS NULL OR emp.emp_id IS NOT NULL
GROUP BY mgr.name
The correct solution likely involves fixing the schema, as any approach will fail for a "sub-manager" who is managed (and thus has a manager_id) but does not currently manage anybody.
Anyway, if the above limitation is acceptable, then people are managers if either:
They have a NULL manager_id (as stated in a comment), or
They currently manage other employees
Then this query (example sqlfiddle) can be used:
SELECT m.name as Manager, COUNT(e.emp_id) as `Count`
FROM employee m
LEFT JOIN employee e
ON m.emp_id = e.manager_id
GROUP BY m.emp_id, m.name, m.manager_id
HAVING `Count` > 0 OR m.manager_id IS NULL
Notes/explanation:
The LEFT [OUTER] join is important here; otherwise managers who did not manage anybody would not be found. The filtering is then applied via the HAVING clause on the grouped result.
The COUNT is applied to a particular column, instead of *; when done so, NULL values in that column are not counted. In this case that means that employees (m) without a match (e) are not automatically selected by the COUNT condition in the HAVING. (The LEFT JOIN leaves in the left-side records, even when there is no join-match - all the right-side columns are NULL in this case.)
The GROUP BY contains all the grouping fields, even if they appear redundant. This allows the manager_id field to be used in the HAVING, for instance. (The group on ID was done in case two managers ever have the same name, or it is to be selected in the output clause.)
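The behaviour described in these notes can be checked against the question's table. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption), with the question's emp_id column name; the HAVING clause repeats the aggregate expression instead of using the alias to stay portable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id int PRIMARY KEY, name text, dept_id int,
                       salary int, manager_id int);
INSERT INTO employee VALUES
  (1,'Sally',1,20000,NULL),(2,'Ajit',2,20000,1),(3,'Rahul',1,20000,1),
  (4,'uday',1,20000,NULL),(5,'john',1,20000,NULL),(6,'netaji',2,20000,2),
  (7,'prakriti',3,1111,3),(8,'sachin',3,1111,3),(9,'santosh',1,1111,2),
  (10,'Ravi',1,1111,2);
""")

# m is the manager side, e is the employee side of the self-join.
# COUNT(e.emp_id) ignores the NULLs produced for managers with no match.
manager_counts = conn.execute("""
SELECT m.name AS Manager, COUNT(e.emp_id) AS employee_count
FROM employee m
LEFT JOIN employee e ON m.emp_id = e.manager_id
GROUP BY m.emp_id, m.name, m.manager_id
HAVING COUNT(e.emp_id) > 0 OR m.manager_id IS NULL
ORDER BY m.emp_id
""").fetchall()

for name, count in manager_counts:
    print(count, name)
```

Employees such as netaji, who have a manager and manage nobody, are filtered out by the HAVING clause, which is exactly the "sub-manager" limitation mentioned above.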
Here is a solution: make a self-join on the Employee table.
SELECT e1.manager_id, e2.name, COUNT(1) AS cnt
FROM Employee e1 JOIN Employee e2 ON e1.manager_id = e2.emp_id
GROUP BY e1.manager_id, e2.name
UNION ALL
SELECT e3.emp_id, e3.name, 0 AS cnt
FROM Employee e3
WHERE e3.manager_id IS NULL
AND e3.emp_id NOT IN ( SELECT e1.manager_id
FROM Employee e1
JOIN
Employee e2
ON e1.manager_id = e2.emp_id
GROUP BY e1.manager_id, e2.name)
Maybe that helps:
select t1.name, count(*) -- all managers with emps
from t t1
join t t2
on t1.emp_id = t2.manager_id
group by t1.name
union all
select t1.name, 0 -- all managers without emps
from t t1
left join t t2
on t1.emp_id = t2.manager_id
where t1.manager_id is null
and t2.emp_id is null
Try the query below:
select (select count(*) from employees b where b.manager_id = a.emp_id) as Count, a.Name as manager from employees a where a.emp_id in (select distinct c.manager_id from employees c)
Query
CREATE TABLE employee(emp_id varchar(5) NOT NULL,
emp_name varchar(20) NULL,
dt_of_join date NULL,
emp_supv varchar(5) NULL,
CONSTRAINT emp_id PRIMARY KEY(emp_id) ,
CONSTRAINT emp_supv FOREIGN KEY(emp_supv)
REFERENCES employee(emp_id));
You need to do a LEFT OUTER JOIN, like this:
SELECT movies.title,sequelies.title AS sequel_title
FROM movies
LEFT OUTER JOIN movies sequelies
ON movies.sequel_id = sequelies.id ;

Improve SQL query performance

I have three tables where I store actual person data (person), teams (team) and entries (athlete). The schema of the three tables is:
In each team there might be two or more athletes.
I'm trying to create a query to produce the most frequent pairs, meaning people who play in teams of two. I came up with the following query:
SELECT p1.surname, p1.name, p2.surname, p2.name, COUNT(*) AS freq
FROM person p1, athlete a1, person p2, athlete a2
WHERE
p1.id = a1.person_id AND
p2.id = a2.person_id AND
a1.team_id = a2.team_id AND
a1.team_id IN
( SELECT team.id
FROM team, athlete
WHERE team.id = athlete.team_id
GROUP BY team.id
HAVING COUNT(*) = 2 )
GROUP BY p1.id
ORDER BY freq DESC
Obviously this is a resource consuming query. Is there a way to improve it?
SELECT team.id
FROM team, athlete
WHERE team.id = athlete.team_id
GROUP BY team.id
HAVING COUNT(*) = 2
Performance Tip 1: You only need the athlete table here.
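As a rough illustration of this tip, the subquery can be written against athlete alone. The sketch below compares both forms; it uses Python's built-in sqlite3 module instead of MySQL, and both the column layout (inferred from the question's query) and the sample rows are assumptions, since the original schema is not shown.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE team (id int PRIMARY KEY);
INSERT INTO team VALUES (1),(2),(3);
CREATE TABLE athlete (id int PRIMARY KEY, person_id int, team_id int);
-- hypothetical data: teams 1 and 3 have two athletes, team 2 has three
INSERT INTO athlete VALUES (1,1,1),(2,2,1),
                           (3,1,2),(4,2,2),(5,3,2),
                           (6,3,3),(7,4,3);
""")

# Original form: joins team only to read team.id back out.
with_join = conn.execute("""
SELECT team.id
FROM team, athlete
WHERE team.id = athlete.team_id
GROUP BY team.id
HAVING COUNT(*) = 2
ORDER BY team.id
""").fetchall()

# Simplified form: athlete.team_id already holds the same value.
athlete_only = conn.execute("""
SELECT team_id
FROM athlete
GROUP BY team_id
HAVING COUNT(*) = 2
ORDER BY team_id
""").fetchall()

print(with_join, athlete_only)
```

The two forms agree as long as every athlete.team_id refers to an existing team, which a foreign key constraint would guarantee.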
You might consider the following approach which uses triggers to maintain counters in your team and person tables so you can easily find out which teams have 2 or more athletes and which persons are in 2 or more teams.
(note: I've removed the surrogate id key from your athlete table in favour of a composite key which will better enforce data integrity. I've also renamed athlete to team_athlete)
drop table if exists person;
create table person
(
person_id int unsigned not null auto_increment primary key,
name varchar(255) not null,
team_count smallint unsigned not null default 0
)
engine=innodb;
drop table if exists team;
create table team
(
team_id int unsigned not null auto_increment primary key,
name varchar(255) not null,
athlete_count smallint unsigned not null default 0,
key (athlete_count)
)
engine=innodb;
drop table if exists team_athlete;
create table team_athlete
(
team_id int unsigned not null,
person_id int unsigned not null,
primary key (team_id, person_id), -- note clustered composite PK
key person(person_id) -- added index
)
engine=innodb;
delimiter #
create trigger team_athlete_after_ins_trig after insert on team_athlete
for each row
begin
update team set athlete_count = athlete_count+1 where team_id = new.team_id;
update person set team_count = team_count+1 where person_id = new.person_id;
end#
delimiter ;
insert into person (name) values ('p1'),('p2'),('p3'),('p4'),('p5');
insert into team (name) values ('t1'),('t2'),('t3'),('t4');
insert into team_athlete (team_id, person_id) values
(1,1),(1,2),(1,3),
(2,3),(2,4),
(3,1),(3,5);
select * from team_athlete;
select * from person;
select * from team;
select * from team where athlete_count >= 2;
select * from person where team_count >= 2;
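The counter maintenance can be tried out without a MySQL server. The sketch below re-creates the idea with Python's built-in sqlite3 module (an assumption; SQLite trigger syntax differs slightly from MySQL's and needs no delimiter change). The answer above only defines an insert trigger; the delete trigger here is an added assumption about how removals might be handled, and the trigger names are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY AUTOINCREMENT,
                     name TEXT NOT NULL, team_count INTEGER NOT NULL DEFAULT 0);
CREATE TABLE team (team_id INTEGER PRIMARY KEY AUTOINCREMENT,
                   name TEXT NOT NULL, athlete_count INTEGER NOT NULL DEFAULT 0);
CREATE TABLE team_athlete (team_id INTEGER NOT NULL, person_id INTEGER NOT NULL,
                           PRIMARY KEY (team_id, person_id));

-- keep both counters in step when a membership row is added
CREATE TRIGGER team_athlete_after_ins AFTER INSERT ON team_athlete
BEGIN
  UPDATE team SET athlete_count = athlete_count + 1 WHERE team_id = NEW.team_id;
  UPDATE person SET team_count = team_count + 1 WHERE person_id = NEW.person_id;
END;

-- added assumption: undo the increments when a membership row is removed
CREATE TRIGGER team_athlete_after_del AFTER DELETE ON team_athlete
BEGIN
  UPDATE team SET athlete_count = athlete_count - 1 WHERE team_id = OLD.team_id;
  UPDATE person SET team_count = team_count - 1 WHERE person_id = OLD.person_id;
END;

INSERT INTO person (name) VALUES ('p1'),('p2'),('p3'),('p4'),('p5');
INSERT INTO team (name) VALUES ('t1'),('t2'),('t3'),('t4');
INSERT INTO team_athlete (team_id, person_id) VALUES
  (1,1),(1,2),(1,3),(2,3),(2,4),(3,1),(3,5);
""")

team_counts = dict(conn.execute("SELECT team_id, athlete_count FROM team"))
person_counts = dict(conn.execute("SELECT person_id, team_count FROM person"))
print(team_counts, person_counts)

# Removing person 3 from team 1 should decrement both counters.
conn.execute("DELETE FROM team_athlete WHERE team_id = 1 AND person_id = 3")
after_delete = conn.execute(
    "SELECT athlete_count FROM team WHERE team_id = 1").fetchone()[0]
```

Without a delete trigger (or an equivalent application-side update), the counters would drift as soon as someone leaves a team.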
EDIT
Added the following, as I initially misunderstood the question:
Create a view which only includes teams of 2 persons.
drop view if exists teams_with_2_players_view;
create view teams_with_2_players_view as
select
t.team_id,
ta.person_id,
p.name as person_name
from
team t
inner join team_athlete ta on t.team_id = ta.team_id
inner join person p on ta.person_id = p.person_id
where
t.athlete_count = 2;
Now use the view to find the most frequently occurring person pairs.
select
p1.person_id as p1_person_id,
p1.person_name as p1_person_name,
p2.person_id as p2_person_id,
p2.person_name as p2_person_name,
count(*) as counter
from
teams_with_2_players_view p1
inner join teams_with_2_players_view p2 on
p2.team_id = p1.team_id and p2.person_id > p1.person_id
group by
p1.person_id, p2.person_id
order by
counter desc;
Hope this helps :)
EDIT 2 checking performance
select count(*) as counter from person;
+---------+
| counter |
+---------+
| 10000 |
+---------+
1 row in set (0.00 sec)
select count(*) as counter from team;
+---------+
| counter |
+---------+
| 450000 |
+---------+
1 row in set (0.08 sec)
select count(*) as counter from team where athlete_count = 2;
+---------+
| counter |
+---------+
| 112644 |
+---------+
1 row in set (0.03 sec)
select count(*) as counter from team_athlete;
+---------+
| counter |
+---------+
| 1124772 |
+---------+
1 row in set (0.21 sec)
explain
select
p1.person_id as p1_person_id,
p1.person_name as p1_person_name,
p2.person_id as p2_person_id,
p2.person_name as p2_person_name,
count(*) as counter
from
teams_with_2_players_view p1
inner join teams_with_2_players_view p2 on
p2.team_id = p1.team_id and p2.person_id > p1.person_id
group by
p1.person_id, p2.person_id
order by
counter desc
limit 10;
+----+-------------+-------+--------+---------------------+-------------+---------+---------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------------+-------------+---------+---------------------+-------+----------------------------------------------+
| 1 | SIMPLE | t | ref | PRIMARY,t_count_idx | t_count_idx | 2 | const | 86588 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | t | eq_ref | PRIMARY,t_count_idx | PRIMARY | 4 | foo_db.t.team_id | 1 | Using where |
| 1 | SIMPLE | ta | ref | PRIMARY,person | PRIMARY | 4 | foo_db.t.team_id | 1 | Using index |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | foo_db.ta.person_id | 1 | |
| 1 | SIMPLE | ta | ref | PRIMARY,person | PRIMARY | 4 | foo_db.t.team_id | 1 | Using where; Using index |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | foo_db.ta.person_id | 1 | |
+----+-------------+-------+--------+---------------------+-------------+---------+---------------------+-------+----------------------------------------------+
6 rows in set (0.00 sec)
select
p1.person_id as p1_person_id,
p1.person_name as p1_person_name,
p2.person_id as p2_person_id,
p2.person_name as p2_person_name,
count(*) as counter
from
teams_with_2_players_view p1
inner join teams_with_2_players_view p2 on
p2.team_id = p1.team_id and p2.person_id > p1.person_id
group by
p1.person_id, p2.person_id
order by
counter desc
limit 10;
+--------------+----------------+--------------+----------------+---------+
| p1_person_id | p1_person_name | p2_person_id | p2_person_name | counter |
+--------------+----------------+--------------+----------------+---------+
| 221 | person 221 | 739 | person 739 | 5 |
| 129 | person 129 | 249 | person 249 | 5 |
| 874 | person 874 | 877 | person 877 | 4 |
| 717 | person 717 | 949 | person 949 | 4 |
| 395 | person 395 | 976 | person 976 | 4 |
| 415 | person 415 | 828 | person 828 | 4 |
| 287 | person 287 | 470 | person 470 | 4 |
| 455 | person 455 | 860 | person 860 | 4 |
| 13 | person 13 | 29 | person 29 | 4 |
| 1 | person 1 | 743 | person 743 | 4 |
+--------------+----------------+--------------+----------------+---------+
10 rows in set (2.02 sec)
Should there be an additional constraint a1.person_id != a2.person_id, to avoid creating a pair with the same player? This may not affect the final ordering of the results but will affect the accuracy of the count.
If possible you can add a column called athlete_count (with an index) in the team table which can be updated whenever a player is added or removed to a team and this can avoid the subquery which needs to go through the entire athlete table for finding the two player teams.
UPDATE1:
Also, if I am understanding the original query correctly, when you group by p1.id you only get the number of times a player played in a two-player team, not the count for the pair itself. You may have to GROUP BY p1.id, p2.id.
REVISION BASED on EXACTLY TWO PER TEAM
The innermost pre-aggregate keeps only teams with exactly two people and uses MIN() and MAX() to reduce each such team to a single row holding person A and person B. This way the two person IDs are always stored in low-high order, so the same pair compares equal across teams. Then I can count teams per (Mate1, Mate2) pair across all teams and join directly to get their names.
SELECT STRAIGHT_JOIN
p1.surname,
p1.name,
p2.surname,
p2.name,
TeamAggregates.CommonTeams
from
( select PreQueryTeams.Mate1,
PreQueryTeams.Mate2,
count(*) CommonTeams
from
( SELECT team_id,
min( person_id ) mate1,
max( person_id ) mate2
FROM
athlete
group by
team_id
having count(*) = 2 ) PreQueryTeams
group by
PreQueryTeams.Mate1,
PreQueryTeams.Mate2 ) TeamAggregates,
person p1,
person p2
where
TeamAggregates.Mate1 = p1.id
and TeamAggregates.Mate2 = p2.id
order by
TeamAggregates.CommonTeams DESC
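The MIN()/MAX() idea can be sketched on a small data set. The example below is only an illustration: it uses Python's built-in sqlite3 module instead of MySQL, drops the MySQL-specific STRAIGHT_JOIN hint, sorts the most frequent pair first, and runs on hypothetical rows (the person and athlete columns are inferred from the question's query).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id int PRIMARY KEY, surname text, name text);
INSERT INTO person VALUES (1,'Adams','Ann'),(2,'Baker','Bob'),
                          (3,'Clark','Cy'),(4,'Davis','Di');
CREATE TABLE athlete (id int PRIMARY KEY, person_id int, team_id int);
-- hypothetical data: teams 1 and 2 are the same pair, team 3 has three people
INSERT INTO athlete VALUES (1,1,1),(2,2,1),
                           (3,2,2),(4,1,2),
                           (5,1,3),(6,2,3),(7,3,3),
                           (8,3,4),(9,4,4);
""")

pair_counts = conn.execute("""
SELECT p1.surname, p1.name, p2.surname, p2.name, TeamAggregates.CommonTeams
FROM (SELECT PreQueryTeams.mate1, PreQueryTeams.mate2, COUNT(*) AS CommonTeams
      FROM (SELECT team_id,
                   MIN(person_id) AS mate1,   -- lower ID of the pair
                   MAX(person_id) AS mate2    -- higher ID of the pair
            FROM athlete
            GROUP BY team_id
            HAVING COUNT(*) = 2) PreQueryTeams
      GROUP BY PreQueryTeams.mate1, PreQueryTeams.mate2) TeamAggregates
JOIN person p1 ON TeamAggregates.mate1 = p1.id
JOIN person p2 ON TeamAggregates.mate2 = p2.id
ORDER BY TeamAggregates.CommonTeams DESC, TeamAggregates.mate1
""").fetchall()

for row in pair_counts:
    print(row)
```

Team 3 is ignored because it has three members, so the pair (1, 2) is counted twice rather than three times.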
ORIGINAL ANSWER FOR TEAMS WITH ANY NUMBER OF TEAMMATES
I would do it as follows. The inner prequery first joins all possible combinations of people on each individual team; requiring person1 < person2 prevents a person from being paired with themselves and also prevents the reversed duplicate of each pair. For example, given:
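| athlete | person | team |
|---------|--------|------|
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 1 |
| 4 | 4 | 1 |
| 5 | 1 | 2 |
| 6 | 3 | 2 |
| 7 | 4 | 2 |
| 8 | 1 | 3 |
| 9 | 4 | 3 |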
So, from team 1 you would get person pairs of
1,2 1,3 1,4 2,3 2,4 3,4
and NOT get reversed duplicates such as
2,1 3,1 4,1 3,2 4,2 4,3
nor same person
1,1 2,2 3,3 4,4
Then from team 2, you would have pairs of
1,3 1,4 3,4
Finally in team 3 the single pair of
1,4
Thus teammates 1,4 have occurred in 3 common teams.
SELECT STRAIGHT_JOIN
p1.surname,
p1.name,
p2.surname,
p2.name,
PreQuery.CommonTeams
from
( select
a1.Person_ID Person_ID1,
a2.Person_ID Person_ID2,
count(*) CommonTeams
from
athlete a1,
athlete a2
where
a1.Team_ID = a2.Team_ID
and a1.Person_ID < a2.Person_ID
group by
1, 2
having CommonTeams > 1 ) PreQuery,
person p1,
person p2
where
PreQuery.Person_ID1 = p1.id
and PreQuery.Person_ID2 = p2.id
order by
PreQuery.CommonTeams DESC
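The pair counts from the walkthrough above can be reproduced with the inner prequery alone. This is a minimal sketch on the nine sample athlete rows, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption); the join to person is left out because the sample has no names.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athlete (id int PRIMARY KEY, person_id int, team_id int);
INSERT INTO athlete VALUES (1,1,1),(2,2,1),(3,3,1),(4,4,1),
                           (5,1,2),(6,3,2),(7,4,2),
                           (8,1,3),(9,4,3);
""")

# Self-join on team; a1.person_id < a2.person_id removes (x, x) pairs
# and keeps only one orientation of each pair.
common_teams = conn.execute("""
SELECT a1.person_id AS Person_ID1,
       a2.person_id AS Person_ID2,
       COUNT(*) AS CommonTeams
FROM athlete a1
JOIN athlete a2
  ON a1.team_id = a2.team_id
 AND a1.person_id < a2.person_id
GROUP BY a1.person_id, a2.person_id
HAVING COUNT(*) > 1
ORDER BY CommonTeams DESC, Person_ID1, Person_ID2
""").fetchall()

for row in common_teams:
    print(row)
```

Pairs that share only one team, such as (1, 2), are dropped by the HAVING clause; removing it would list every pair.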
Here are some tips to improve SQL SELECT query performance (these apply to SQL Server):
Use SET NOCOUNT ON; it helps to decrease network traffic and thus improves performance.
Use fully qualified procedure names (e.g. database.schema.objectname).
Use sp_executesql instead of EXECUTE for dynamic queries.
Don't use SELECT *; use SELECT column1, column2, ... for IF EXISTS or SELECT operations.
Avoid naming user stored procedures like sp_procedureName, because if a stored procedure name starts with sp_, SQL Server first searches the master database, which can slow down the query.