I am trying to create a full outer join in MySQL. I found several answers to the basic question, and I'm using "union" to make it work. However, I could not get the syntax right without resorting to a few temporary tables. I tried to write the query without them, but I was never able to get the results to include the entries with a null partner_id.
Here is a reduced set of the data, already filtered by meeting_id:
+-----+---------+--------+------------+------------+
| pid | first | gender | meeting_id | partner_id |
+-----+---------+--------+------------+------------+
| 2 | Vicki | F | 74 | NULL |
| 54 | Fazal | M | 74 | 4 |
| 4 | Lisa | F | 74 | 54 |
| 10 | Rod | M | 74 | 57 |
| 57 | Kellee | F | 74 | 10 |
| 11 | Jake | M | 74 | 55 |
| 55 | Rosa | F | 74 | 11 |
| 47 | Ralph | M | 74 | 46 |
| 46 | Holly | F | 74 | 47 |
| 40 | Wes | M | 74 | 12 |
| 12 | Lori | F | 74 | 40 |
| 5 | Richard | M | 74 | 6 |
| 6 | Rita | F | 74 | 5 |
| 15 | John | M | 74 | 16 |
| 16 | Corie | F | 74 | 15 |
+-----+---------+--------+------------+------------+
My original query looked like this:
set @mtg=74;
select
a.pid,
concat(a.first, ' ', a.last) as guy,
a.issub as guysub,
b.pid,
concat(b.first, ' ', b.last) as gal,
b.issub as galsub,
b.partner_id
from
scheduled_players a
left outer join
scheduled_players b
on a.partner_id = b.pid
where
a.gender = 'M' and a.meeting_id = @mtg and b.meeting_id = @mtg
union
select
a.pid,
concat(a.first, ' ', a.last) as guy,
a.issub as guysub,
b.pid,
concat(b.first, ' ', b.last) as gal,
b.issub as galsub,
b.partner_id
from
scheduled_players a
left outer join
scheduled_players b
on b.partner_id = a.pid
where
a.gender = 'M' and a.meeting_id = @mtg and b.meeting_id = @mtg
;
That query did not return the single entry with a null partner_id. I read a number of answers on StackOverflow and it seemed as if the where clause could cause the outer join to revert to an inner join. In my case, I did not see how this could happen, but to test this, I decided to create temporary tables to contain the 'where' clause elements. I needed to create 2 temporary tables for each of the 'guys' and 'gals', since I had the tables 2 times in the query. The results are here:
set @mtg=74;
create temporary table if not exists
meeting_guys as select * from scheduled_players
where meeting_id = @mtg and gender='M';
create temporary table if not exists
meeting_gals as select * from scheduled_players
where meeting_id = @mtg and gender='F';
create temporary table if not exists
meeting_guys2 as select * from scheduled_players
where meeting_id = @mtg and gender='M';
create temporary table if not exists
meeting_gals2 as select * from scheduled_players
where meeting_id = @mtg and gender='F';
select
a.pid,
concat(a.first, ' ', a.last) as guy,
a.issub as guysub,
b.pid,
concat(b.first, ' ', b.last) as gal,
b.issub as galsub,
b.partner_id
from
meeting_guys a
left outer join
meeting_gals b
on a.partner_id = b.pid
union
select
a.pid,
concat(a.first, ' ', a.last) as guy,
a.issub as guysub,
b.pid,
concat(b.first, ' ', b.last) as gal,
b.issub as galsub,
b.partner_id
from
meeting_guys2 a
right outer join
meeting_gals2 b
on b.partner_id = a.pid
;
It turned out this worked, and I received the results I was expecting (I removed the last names since these are real people):
+------+---------+--------+------+--------+--------+------------+
| pid | guy | guysub | pid | gal | galsub | partner_id |
+------+---------+--------+------+--------+--------+------------+
| 54 | Fazal | 0 | 4 | Lisa | 0 | 54 |
| 10 | Rod | 0 | 57 | Kellee | 0 | 10 |
| 11 | Jake | 0 | 55 | Rosa | 0 | 11 |
| 47 | Ralph | 0 | 46 | Holly | 0 | 47 |
| 40 | Wes | 0 | 12 | Lori | 0 | 40 |
| 5 | Richard | 0 | 6 | Rita | 0 | 5 |
| 15 | John | 0 | 16 | Corie | 0 | 15 |
| NULL | NULL | NULL | 2 | Vicki | 0 | NULL |
+------+---------+--------+------+--------+--------+------------+
I was able to get the results I was looking for, but I don't understand why the previous query did not work. Fortunately, I have a working solution, but I'd really like to find out if there is a better, more optimal way.
First, note that this is untested, so you may need to tweak it, but you sound more than capable of fixing the odd error. If you need me to clarify why I did something, or to fix something, just say the word.
To explain why your first attempt unexpectedly eliminated the null records: you are right that your where clause is doing it. For the left join, instead of a.meeting_id = @mtg and b.meeting_id = @mtg you would use a.meeting_id = @mtg and (b.meeting_id = @mtg or b.meeting_id is null). For the right join you would check for the null in the left table instead.
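To make the where-clause point concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (that substitution is my assumption; the outer-join behaviour shown is standard SQL). It loads three rows from the question's sample data and left-joins the gals to their partners, once with the plain filter on b and once with the "or b.meeting_id is null" variant.

```python
import sqlite3

# SQLite stands in for MySQL here; the rows are three of the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scheduled_players (
        pid INTEGER, first TEXT, gender TEXT, meeting_id INTEGER, partner_id INTEGER
    );
    INSERT INTO scheduled_players VALUES
        (2,  'Vicki', 'F', 74, NULL),
        (54, 'Fazal', 'M', 74, 4),
        (4,  'Lisa',  'F', 74, 54);
""")

# Filter on b in WHERE: Vicki's NULL-extended row fails b.meeting_id = 74 and is dropped.
where_rows = conn.execute("""
    SELECT a.pid, a.first, b.pid, b.first
    FROM scheduled_players a
    LEFT OUTER JOIN scheduled_players b ON a.partner_id = b.pid
    WHERE a.gender = 'F' AND a.meeting_id = :mtg AND b.meeting_id = :mtg
    ORDER BY a.pid
""", {"mtg": 74}).fetchall()

# Allowing b.meeting_id to be NULL keeps the row that has no partner.
null_ok_rows = conn.execute("""
    SELECT a.pid, a.first, b.pid, b.first
    FROM scheduled_players a
    LEFT OUTER JOIN scheduled_players b ON a.partner_id = b.pid
    WHERE a.gender = 'F' AND a.meeting_id = :mtg
      AND (b.meeting_id = :mtg OR b.meeting_id IS NULL)
    ORDER BY a.pid
""", {"mtg": 74}).fetchall()

print(where_rows)    # only Lisa and her partner
print(null_ok_rows)  # Vicki with NULLs, then Lisa and her partner
```

The first result has lost Vicki, which is the "outer join reverts to an inner join" effect; the second keeps her with NULLs on the partner side.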
As for an alternate solution, I have used a temp table to limit the result set to just the matching meeting_id's early (for performance) in case your table is large, and then I filter for M/F in the derived tables.
Hope it helps you.
set @mtg=74;
create temporary table if not exists
meeting as
select
pid,
concat(first, ' ', last) as full_name,
issub,
partner_id,
meeting_id,
gender
from scheduled_players
where meeting_id = @mtg;
select
M.pid,
M.full_name as guy,
M.issub as guysub,
F.pid,
F.full_name as gal,
F.issub as galsub,
F.partner_id
from
(select * from meeting where gender = 'M') M
left outer join (select * from meeting where gender = 'F') F
on M.partner_id = F.pid
UNION
select
M.pid,
M.full_name as guy,
M.issub as guysub,
F.pid,
F.full_name as gal,
F.issub as galsub,
F.partner_id
from
(select * from meeting where gender = 'M') M
right outer join (select * from meeting where gender = 'F') F
on F.partner_id = M.pid
EDIT
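If performance isn't an issue, it may be simpler to forget the temp table altogether and query scheduled_players directly in the derived tables. That also sidesteps MySQL's restriction that a TEMPORARY table cannot be referred to more than once in the same query, which the query above would likely run into. The two derived tables become:
select sp.*, concat(sp.first, ' ', sp.last) as full_name from scheduled_players sp where sp.gender = 'M' and sp.meeting_id = @mtg
select sp.*, concat(sp.first, ' ', sp.last) as full_name from scheduled_players sp where sp.gender = 'F' and sp.meeting_id = @mtg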
You could also create a single temp table and then insert and update that in separate queries.
Whatever works for you at the end of the day.
Related
I got working code from three queries but I would like to combine them into one or two. Basically I am checking if a provided phone number exists in table contacts or leads as well as if it exists as a secondary number in customfieldsvalues (not all leads have a customfield value though). I am using a CRM system based on CodeIgniter.
What I want to do (non-correct/hypothetical query):
SELECT * FROM contacts OR leads WHERE phonenumber = replace(X, '-', '')
OR leads.id = customvaluefields.relid AND cfields.fieldid = 41 AND cfields.value = X
Tables
table : contacts
+-------+----------------+----------------+
| id | firstname | phonenumber |
+-------+----------------+----------------+
| 1 | John | 214-444-1234 |
| 2 | Mary | 555-111-1234 |
+-------+----------------+----------------+
table : leads
+-------+-----------+---------------------+
| id | name | phonenumber |
+-------+-----------+---------------------+
| 1 | John | 214-444-1234 |
| 2 | Mary | 555-111-1234 |
+-------+-----------+---------------------+
table : customvaluefields
+-------+-----------+-------------+-----------+
| id | relid | fieldid | value |
+-------+-----------+-------------+-----------+
| 1 | 1 | 41 | 222333444 |
| 2 | 1 | 20 | Management|
| 3 | 2 | 41 | 333444555 |
+-------+-----------+-------------+-----------+
If I understand what you are trying to do, maybe UNION ALL would work. This is something to get you started:
SELECT C.id, C.firstname, C.phonenumber
FROM contacts C
JOIN customvaluefields CVF
ON C.id = CVF.relid AND
CVF.fieldid = 41
AND REPLACE(C.phonenumber,'-','') = CVF.value
UNION ALL
SELECT L.id, L.name, L.phonenumber
FROM leads L
JOIN customvaluefields CVF
ON L.id = CVF.relid AND
CVF.fieldid = 41
AND REPLACE(L.phonenumber,'-','') = CVF.value
I'm joining the contacts and leads tables to customvaluefields in each query, with the filter conditions in each ON clause, and then combining the two with UNION ALL. I'm sure it's not 100% correct for what you need, but it should get you headed toward a solution. Here is more information: https://dev.mysql.com/doc/refman/8.0/en/union.html
I have a query that fetches some groups in Moodle:
select
g.id groupid,
g.name groupname,
count(distinct gm.userid)
from prefix_groups g
left join prefix_groups_members gm on gm.groupid = g.id
group by g.id
output
| groupid | groupname | count(distinct gm.userid) |
|---------|-----------|---------------------------|
| 1 | 20NEW-4A | 6 |
| 2 | 18PAR-5F | 3 |
| 3 | 20BER-6G | 2 |
| 4 | 50NEV-6G | 6 |
| 5 | 34HOG-5Q | 77 |
| 6 | 10BAT-GG | 5 |
| etc. | etc. | etc. |
I want to add a column called location, which lists the location of the group (as per the groupname standard in this platform, e.g. BER = Berlin). I have over 100 of these locations to filter on. I know I can throw all of them in a case statement and call it a very long day (such as below), but I want to do it the most efficient way possible. In this instance, I cannot create a temporary table to do this. Any ideas?
select
g.id groupid,
g.name groupname,
case when substring(g.name, 3, 3) = 'BER'
then 'Berlin'
when substring(g.name, 3, 3) = 'NEW'
then 'Newcastle'
-- etc.
end location,
count(distinct gm.userid)
from prefix_groups g
left join prefix_groups_members gm on gm.groupid = g.id
group by g.id
output:
| groupid | groupname | location | count(distinct gm.userid) |
|---------|-----------|-----------|---------------------------|
| 1 | 20NEW-4A | Newcastle | 6 |
| 2 | 18PAR-5F | Paris | 3 |
| 3 | 20BER-6G | Berlin | 2 |
| 4 | 50NEV-6G | Neverland | 6 |
| 5 | 34HOG-5Q | Hogwarts | 77 |
| 6 | 10BAT-GG | Bath | 5 |
| etc. | etc. | etc. | etc. |
CREATE TABLE locations ( code CHAR(3), location VARCHAR(255) );
INSERT INTO locations VALUES ('BER', 'Berlin'), ('NEW', 'Newcastle'), ... ;
and then
select
g.id groupid,
g.name groupname,
locations.location,
count(distinct gm.userid)
from prefix_groups g
left join prefix_groups_members gm on gm.groupid = g.id
LEFT JOIN locations ON substring(g.name, 3, 3) = locations.code
group by g.id, g.name, locations.location;
To improve performance, you can add a generated column to prefix_groups and join on it, which avoids a function call in the join condition.
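For anyone who wants to try the lookup-table join without a MySQL server at hand, here is a small sketch using Python's built-in sqlite3 module as a stand-in (my assumption, not part of the original setup). SQLite's substr() plays the role of substring(). Group 7 with the unknown code XYZ and all the membership rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; group 7 and the member rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prefix_groups (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE prefix_groups_members (groupid INTEGER, userid INTEGER);
    CREATE TABLE locations (code TEXT PRIMARY KEY, location TEXT);
    INSERT INTO prefix_groups VALUES (1, '20NEW-4A'), (3, '20BER-6G'), (7, '11XYZ-1A');
    INSERT INTO prefix_groups_members VALUES (1, 10), (1, 11), (3, 10);
    INSERT INTO locations VALUES ('BER', 'Berlin'), ('NEW', 'Newcastle');
""")

# Characters 3-5 of the group name are the location code looked up in locations.
rows = conn.execute("""
    SELECT g.id, g.name, l.location, COUNT(DISTINCT gm.userid)
    FROM prefix_groups g
    LEFT JOIN prefix_groups_members gm ON gm.groupid = g.id
    LEFT JOIN locations l ON substr(g.name, 3, 3) = l.code
    GROUP BY g.id, g.name, l.location
    ORDER BY g.id
""").fetchall()

for row in rows:
    print(row)
```

Because both joins are LEFT joins, a group whose code is missing from locations still appears, with a NULL location and a member count of 0.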
"In this instance, I cannot create a temporary table"
If so, then to avoid a long CASE you can use an expression like:
ELT(FIND_IN_SET(SUBSTRING(g.name, 3, 3), 'BER,NEW,...'), 'Berlin', 'Newcastle', ...) AS location
I'm not sure that this will be more efficient, but it will certainly be shorter.
PS. You can also use the other form of the CASE expression:
case substring(g.name, 3, 3) when 'BER' then 'Berlin'
when 'NEW' then 'Newcastle'
-- etc.
end location,
Blog table:
| bid | btitle |
| 29 | ...... |
| 38 | ...... |
likes table:
| lid | bid |
| 1 | 29 |
| 2 | 29 |
| 3 | 29 |
| 4 | 38 |
| 5 | 38 |
comment table
| commid | bid |
| 1 | 29 |
| 2 | 29 |
| 3 | 38 |
I had tried the following query but that will not work for me:
SELECT blog.bid,blog.btitle,COUNT(likes.lid) AS likecnt,COUNT(comment.comid) AS commentcnt FROM blog,likes,comment WHERE blog.bid=likes.bid AND blog.bid=comment.bid GROUP BY blog.bid
I want output like:
| bid | btitle | likecnt | commentcnt |
| 29 | ...... | 3 | 2 |
| 38 | ...... | 2 | 1 |
You can do left join with separate aggregation :
select b.bid, b.btitle,
coalesce(l.likecnt, 0) as likecnt,
coalesce(c.commentcnt, 0) as commentcnt
from blog b left join
(select l.bid, count(*) as likecnt
from likes l
group by l.bid
) l
on l.bid = b.bid left join
(select c.bid, count(*) as commentcnt
from comment c
group by c.bid
) c
on c.bid = b.bid;
If you want only matching bids, then use INNER JOIN instead of LEFT JOIN and remove COALESCE().
Under many circumstances, correlated subqueries may be the fastest solution:
select b.bid, b.btitle,
(select count(*) from likes l where l.bid = b.bid) as num_likes,
(select count(*) from comment c where c.bid = b.bid) as num_comments
from blog b;
When is this a win performance-wise? First, you want indexes on likes(bid) and comment(bid). With those indexes, it might be the fastest approach for your query.
It is particularly good if you have a where clause filtering the blogs in the outer query, because it only has to do the counts for the blogs in the result set.
Use proper joins and count DISTINCT values because multiple joins increase the number of returned rows:
SELECT b.bid, b.btitle,
COUNT(DISTINCT l.lid) AS likecnt,
COUNT(DISTINCT c.comid) AS commentcnt
FROM blog b
LEFT JOIN likes l ON b.bid = l.bid
LEFT JOIN comment c ON b.bid = c.bid
GROUP BY b.bid, b.btitle
I use LEFT joins just in case there are no comments or likes for a post.
Results:
| bid | btitle | likecnt | commentcnt |
| --- | ------ | ------- | ---------- |
| 29 | ...... | 3 | 2 |
| 38 | ...... | 2 | 1 |
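To see why DISTINCT matters here, the following sketch runs the same two-join query with and without it. It uses Python's built-in sqlite3 module as a stand-in for MySQL (my assumption), the sample rows from the question, and the column name commid as shown in the comment table listing.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (bid INTEGER PRIMARY KEY);
    CREATE TABLE likes (lid INTEGER PRIMARY KEY, bid INTEGER);
    CREATE TABLE comment (commid INTEGER PRIMARY KEY, bid INTEGER);
    INSERT INTO blog VALUES (29), (38);
    INSERT INTO likes VALUES (1, 29), (2, 29), (3, 29), (4, 38), (5, 38);
    INSERT INTO comment VALUES (1, 29), (2, 29), (3, 38);
""")

# The same query is run twice; {d} is either empty or the DISTINCT keyword.
query = """
    SELECT b.bid, COUNT({d} l.lid), COUNT({d} c.commid)
    FROM blog b
    LEFT JOIN likes l ON b.bid = l.bid
    LEFT JOIN comment c ON b.bid = c.bid
    GROUP BY b.bid
    ORDER BY b.bid
"""
plain = conn.execute(query.format(d="")).fetchall()
distinct = conn.execute(query.format(d="DISTINCT")).fetchall()

print(plain)     # counts inflated by the likes x comments row multiplication
print(distinct)  # counts per blog as expected
```

For blog 29 the two joins produce 3 x 2 = 6 rows, so the plain counts both come out as 6, while the DISTINCT counts give 3 likes and 2 comments.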
I have four tables: clients, persons, client_functions and functions.
I wrote this query:
SELECT
P.number,
P.first_name,
GROUP_CONCAT(F.description) AS Functions
FROM clients AS C
LEFT JOIN persons AS P ON P.id=C.id
LEFT JOIN client_functions as CF ON CF.client_id=C.id
LEFT JOIN functions AS F ON F.id=CF.function_id
WHERE P.person_type = 'client' AND P.company_id = 3
GROUP BY
P.number,
P.first_name
In my GROUP_CONCAT() I only want to include F.description if CF.archived = 0. Does anybody have an idea how I can put a condition on the GROUP_CONCAT?
Current query results in:
--------------------------------------------
| 93 | Jan Lochtenberg | PV,PV,PV,PV |
| 94 | Chris van Eijk | VP-I,VP,PV |
| 95 | Gertrude Irene | VP-I,PV,PV,PV |
| 96 | Wiekert Jager | VP-I,PV |
| 97 | Antonius Kode | VP,VP-I,VP |
| 98 | HansLelie | PV,PV,PV |
---------------------------------------------
But I only want to see the active functions:
--------------------------------------------
| 93 | Jan Lochtenberg | PV |
| 94 | Chris van Eijk | VP-I,VP,PV |
| 95 | Gertrude Irene | VP-I,PV |
| 96 | Wiekert Jager | VP-I,PV |
| 97 | Antonius Kode | VP,VP-I,VP |
| 98 | HansLelie | PV |
---------------------------------------------
Your where is undoing some of your left joins. In fact, you don't need the clients table at all. Then you can put the filtering condition on functions in the ON clause:
SELECT P.number, P.first_name, P.last_name,
GROUP_CONCAT(F.description) AS Functions
FROM persons P LEFT JOIN
client_functions CF
ON CF.client_id = P.id LEFT JOIN
functions F
ON F.id = CF.function_id AND CF.archived = 0
WHERE P.person_type = 'client' AND P.company_id = 3
GROUP BY P.number, P.first_name, P.last_name;
In my GROUP_CONCAT() i only want to group F.description if CF.archived = 0
Translated to SQL:
GROUP_CONCAT(IF(CF.archived = 0, F.description, NULL))
The GROUP_CONCAT() function ignores NULL values. However, it returns NULL if there is no non-NULL value to work with.
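Here is a small sketch of the conditional GROUP_CONCAT idea, run through Python's built-in sqlite3 module as a stand-in for MySQL (my assumption). SQLite has no IF() function, so the sketch uses the equivalent CASE WHEN ... THEN ... END, which yields NULL when the condition is false. The three persons and their function rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; all rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (id INTEGER PRIMARY KEY, number INTEGER);
    CREATE TABLE functions (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE client_functions (client_id INTEGER, function_id INTEGER, archived INTEGER);
    INSERT INTO persons VALUES (1, 93), (2, 94), (3, 98);
    INSERT INTO functions VALUES (1, 'PV'), (2, 'VP');
    INSERT INTO client_functions VALUES
        (1, 1, 1), (1, 1, 0),   -- 93: one archived PV, one active PV
        (2, 2, 0), (2, 1, 0),   -- 94: two active functions
        (3, 1, 1);              -- 98: only an archived function
""")

# Archived rows contribute NULL, which GROUP_CONCAT skips.
rows = conn.execute("""
    SELECT P.number,
           GROUP_CONCAT(CASE WHEN CF.archived = 0 THEN F.description END)
    FROM persons P
    LEFT JOIN client_functions CF ON CF.client_id = P.id
    LEFT JOIN functions F ON F.id = CF.function_id
    GROUP BY P.number
""").fetchall()
active = dict(rows)

print(active)
```

Person 93 ends up with a single PV, and person 98, who has only an archived function, gets NULL (None in Python) rather than an empty string.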
In this example, I have a listing of users (main_data), a pass list (pass_list) and a corresponding priority to each pass code type (pass_code). The query I am constructing is looking for a list of users and the corresponding pass code type with the lowest priority. The query below works but it just seems like there may be a faster way to construct it I am missing. SQL Fiddle: http://sqlfiddle.com/#!2/2ec8d/2/0 or see below for table details.
SELECT md.first_name, md.last_name, pl.*
FROM main_data md
JOIN pass_list pl on pl.main_data_id = md.id
AND
pl.id =
(
SELECT pl2.id
FROM pass_list pl2
JOIN pass_code pc2 on pl2.pass_code_type = pc2.type
WHERE pl2.main_data_id = md.id
ORDER BY pc2.priority
LIMIT 1
)
Results:
+------------+-----------+----+--------------+----------------+
| first_name | last_name | id | main_data_id | pass_code_type |
+------------+-----------+----+--------------+----------------+
| Bob | Smith | 1 | 1 | S |
| Mary | Vance | 8 | 2 | M |
| Margret | Cough | 5 | 3 | H |
| Mark | Johnson | 9 | 4 | H |
| Tim | Allen | 13 | 5 | M |
+------------+-----------+----+--------------+----------------+
users (main_data)
+----+------------+-----------+
| id | first_name | last_name |
+----+------------+-----------+
| 1 | Bob | Smith |
| 2 | Mary | Vance |
| 3 | Margret | Cough |
| 4 | Mark | Johnson |
| 5 | Tim | Allen |
+----+------------+-----------+
pass list (pass_list)
+----+--------------+----------------+
| id | main_data_id | pass_code_type |
+----+--------------+----------------+
| 1 | 1 | S |
| 3 | 2 | E |
| 4 | 2 | H |
| 5 | 3 | H |
| 7 | 4 | E |
| 8 | 2 | M |
| 9 | 4 | H |
| 10 | 4 | H |
| 11 | 5 | S |
| 12 | 3 | S |
| 13 | 5 | M |
| 14 | 1 | E |
+----+--------------+----------------+
Table which specifies priority (pass_code)
+----+------+----------+
| id | type | priority |
+----+------+----------+
| 1 | M | 1 |
| 2 | H | 2 |
| 3 | S | 3 |
| 4 | E | 4 |
+----+------+----------+
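Due to MySQL's unique extension to its GROUP BY, it's simple:
SELECT * FROM
(SELECT md.first_name, md.last_name, pl.*
FROM main_data md
JOIN pass_list pl on pl.main_data_id = md.id
JOIN pass_code pc on pc.type = pl.pass_code_type
ORDER BY pc.priority) x
GROUP BY x.main_data_id
This returns only one row for each unique value of main_data_id. The idea is that the inner query orders the rows before the group by is applied, so the first row encountered in each group is the one you want. Note that this only runs with ONLY_FULL_GROUP_BY disabled, and the MySQL documentation says the server is free to choose any value from each group for nonaggregated columns, so the result is not guaranteed.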
A version that will get the details as required, and should also work across different flavours of SQL
SELECT md.first_name, md.last_name, MinId, pl.main_data_id, pl.pass_code_type
FROM main_data md
INNER JOIN pass_list pl
ON md.id = pl.main_data_id
INNER JOIN pass_code pc
ON pl.pass_code_type = pc.type
INNER JOIN
(
SELECT pl.main_data_id, pl.pass_code_type, Sub0.MinPriority, MIN(pl.id) AS MinId
FROM pass_list pl
INNER JOIN pass_code pc
ON pl.pass_code_type = pc.type
INNER JOIN
(
SELECT main_data_id, MIN(priority) AS MinPriority
FROM pass_list a
INNER JOIN pass_code b
ON a.pass_code_type = b.type
GROUP BY main_data_id
) Sub0
ON pl.main_data_id = Sub0.main_data_id
AND pc.priority = Sub0.MinPriority
GROUP BY pl.main_data_id, pl.pass_code_type, Sub0.MinPriority
) Sub1
ON pl.main_data_id = Sub1.main_data_id
AND pl.id = Sub1.MinId
AND pc.priority = Sub1.MinPriority
ORDER BY pl.main_data_id
This does not rely on the flexibility of MySQL's GROUP BY functionality.
I'm not familiar with the special behavior of MySQL's group by, but my solution for these types of problems is to express the condition directly: keep a row only where there doesn't exist another row for the same user with a lower priority. This is standard SQL, so it should work on any DB.
select distinct u.id, u.first_name, u.last_name, pl.pass_code_type, pc.id, pc.priority
from main_data u
inner join pass_list pl on pl.main_data_id = u.id
inner join pass_code pc on pc.type = pl.pass_code_type
where not exists (select 1
from pass_list pl2
inner join pass_code pc2 on pc2.type = pl2.pass_code_type
where pl2.main_data_id = u.id and pc2.priority < pc.priority);
How well this performs is going to depend on having the proper indexes (assuming that main_data and pass_list are somewhat large). In this case, indexes on the primary keys (which should be created automatically) and the foreign keys should be sufficient. There may be other queries that are faster; I would start by comparing this to your query.
Also, I had to add distinct because you have duplicate rows in pass_list (id 9 & 10), but if you ensure that duplicates can't exist (unique index on main_data_id, pass_code_type) then you will save some time by removing the distinct which forces a final sort of the result set. This savings would be more noticeable the larger the result set is.
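If you want to check the NOT EXISTS approach against the sample data, here is a sketch that loads the three tables from the question into Python's built-in sqlite3 module (used as a stand-in for MySQL, which is my assumption) and runs the query with and without DISTINCT. The select list is trimmed to three columns to keep the output short.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_data (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE pass_list (id INTEGER PRIMARY KEY, main_data_id INTEGER, pass_code_type TEXT);
    CREATE TABLE pass_code (id INTEGER PRIMARY KEY, type TEXT, priority INTEGER);
    INSERT INTO main_data VALUES
        (1, 'Bob', 'Smith'), (2, 'Mary', 'Vance'), (3, 'Margret', 'Cough'),
        (4, 'Mark', 'Johnson'), (5, 'Tim', 'Allen');
    INSERT INTO pass_list VALUES
        (1, 1, 'S'), (3, 2, 'E'), (4, 2, 'H'), (5, 3, 'H'), (7, 4, 'E'), (8, 2, 'M'),
        (9, 4, 'H'), (10, 4, 'H'), (11, 5, 'S'), (12, 3, 'S'), (13, 5, 'M'), (14, 1, 'E');
    INSERT INTO pass_code VALUES (1, 'M', 1), (2, 'H', 2), (3, 'S', 3), (4, 'E', 4);
""")

# Keep a pass only if the same user has no pass with a lower priority number.
query = """
    SELECT {distinct} u.id, u.first_name, pl.pass_code_type
    FROM main_data u
    INNER JOIN pass_list pl ON pl.main_data_id = u.id
    INNER JOIN pass_code pc ON pc.type = pl.pass_code_type
    WHERE NOT EXISTS (SELECT 1
                      FROM pass_list pl2
                      INNER JOIN pass_code pc2 ON pc2.type = pl2.pass_code_type
                      WHERE pl2.main_data_id = u.id AND pc2.priority < pc.priority)
    ORDER BY u.id
"""
with_distinct = conn.execute(query.format(distinct="DISTINCT")).fetchall()
without_distinct = conn.execute(query.format(distinct="")).fetchall()

print(with_distinct)
print(len(without_distinct))  # one extra row from Mark's duplicate H passes
```

With DISTINCT there is one row per user; without it Mark appears twice, because pass_list ids 9 and 10 are both type H for him.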