Write a big sql query or handle it through code? - mysql

I have 2 tables built in this way:
Trips
- id
- organization_id REQUIRED
- collaboration_organization_id OPTIONAL
...other useless fields...
Organizations
- id
- name REQUIRED
...other useless fields...
Now I have been asked to create this type of report:
I want the sum of all trips for each organization, considering that if
they have a collaboration_organization_id it should count as 0.5;
obviously the organization in collaboration_organization_id gets +0.5
too.
So whenever a trip has both organization_id AND collaboration_organization_id set, that trip counts as 0.5 for both organizations. If only organization_id is set, it counts as 1.
Now my question has two parts:
1.
Is it a good idea to "solve" the problem entirely in SQL?
I already know how to solve it through code; my current idea is "select all trips (only those 3 fields) and start counting in Ruby". Please consider that I'm using Ruby on Rails, so that could still be a good reason to say "no, because it will only work on MySQL".
2.
If the answer to point 1 is YES, I have no idea how to count 0.5 for each trip where required, because COUNT is a "throw-in-and-do-it" function.

I'm not familiar with ruby on rails, but this is how you can do this with MySQL.
Sample data:
CREATE TABLE Trips(
id int not null primary key,
organization_id int not null,
collaboration_organization_id int null
);
INSERT INTO Trips (id,organization_id,collaboration_organization_id)
VALUES
(1,1,5),
(2,1,1),
(3,1,2),
(4,11,1),
(5,1,null),
(6,2,null),
(7,10,null),
(8,6,2),
(9,1,3),
(10,1,4);
MySQL Query:
SELECT organization_id,
sum(CASE WHEN collaboration_organization_id IS null THEN 1 ELSE 0.5 End) AS number
FROM Trips
GROUP BY organization_id;
Try it out via: http://www.sqlfiddle.com/#!2/1b01d/107
EDIT: adding collaboration organization
Sample data:
CREATE TABLE Trips(
id int not null primary key,
organization_id int not null,
collaboration_organization_id int null
);
INSERT INTO Trips (id,organization_id,collaboration_organization_id)
VALUES
(1,1,5),
(2,1,1),
(3,1,2),
(4,11,1),
(5,1,null),
(6,2,null),
(7,10,null),
(8,6,2),
(9,1,3),
(10,1,4);
CREATE TABLE Organizations(
id int auto_increment primary key,
name varchar(30)
);
INSERT INTO Organizations (name)
VALUES
("Org1"),
("Org2"),
("Org3"),
("Org4"),
("Org5"),
("Org6"),
("Org7"),
("Org8"),
("Org9"),
("Org10"),
("Org11"),
("Org12"),
("Org13"),
("Org14"),
("Org15"),
("Org16");
MySQL query:
-- The WHERE clause only repeated the ON condition, which made the LEFT JOIN
-- behave like an INNER JOIN; it is written as an INNER JOIN here instead.
SELECT O.id, O.name,
sum(CASE WHEN T.collaboration_organization_id IS NULL THEN 1 ELSE 0.5 END) AS number
FROM Organizations AS O INNER JOIN Trips AS T
ON T.organization_id = O.id OR T.collaboration_organization_id = O.id
GROUP BY O.id, O.name;
http://www.sqlfiddle.com/#!2/ee557/15
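For comparison, here is a minimal sketch of another way to express the same weighting, using UNION ALL instead of an OR join. It is not the query from the fiddle above. It runs the sample Trips data through Python's built-in sqlite3 module, which is only a stand-in for MySQL (an assumption made so the example is self-contained); the alias `parts` and the variable names are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here (an assumption made for portability).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Trips(
    id INTEGER PRIMARY KEY,
    organization_id INTEGER NOT NULL,
    collaboration_organization_id INTEGER NULL)""")
conn.executemany(
    "INSERT INTO Trips VALUES (?, ?, ?)",
    [(1, 1, 5), (2, 1, 1), (3, 1, 2), (4, 11, 1), (5, 1, None),
     (6, 2, None), (7, 10, None), (8, 6, 2), (9, 1, 3), (10, 1, 4)],
)

# Each trip yields one row for its organization (weight 1 or 0.5) and,
# when a collaborator is set, a second row worth 0.5 for the collaborator.
QUERY = """
SELECT org_id, SUM(weight) AS number
FROM (
    SELECT organization_id AS org_id,
           CASE WHEN collaboration_organization_id IS NULL
                THEN 1.0 ELSE 0.5 END AS weight
    FROM Trips
    UNION ALL
    SELECT collaboration_organization_id AS org_id, 0.5 AS weight
    FROM Trips
    WHERE collaboration_organization_id IS NOT NULL
) AS parts
GROUP BY org_id
ORDER BY org_id
"""
totals = dict(conn.execute(QUERY).fetchall())
print(totals)
```

With this variant the weights add up to 10.0, the number of trips. One difference from the OR join: trip 2, where organization 1 is listed as its own collaborator, contributes 0.5 twice here but only once in the OR join, so which behaviour is wanted for such rows is a design decision.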

Related

Associated Name spiderweb

Say for instance I have the following entries in my table:
ID - 1
Name - Daryl
ID - 2
Name - Terry
ID - 3
Name - Dave
ID - 4
Name - Mitch
I eventually wish to search my table(s) for one specific name, but show all associated names. For instance,
Searching Daryl will return Terry, Dave & Daryl.
Searching Terry will return Dave, Daryl & Terry
Searching Mitch will only return Mitch.
The current table housing the names is as follows:
--
-- Table structure for table `members`
--
CREATE TABLE `members` (
`ID` int(255) NOT NULL,
`GuildID` int(255) NOT NULL,
`ToonName` varchar(255) NOT NULL,
`AddedOn` date NOT NULL,
`AddedByID` int(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
--
-- Dumping data for table `members`
--
INSERT INTO `members` (`ID`, `GuildID`, `ToonName`, `AddedOn`, `AddedByID`) VALUES
(1, 1, 'Daryl', '2020-01-17', 5),
(2, 1, 'Terry', '2020-01-17', 5),
(3, 1, 'Mitch', '2020-01-17', 5),
(4, 1, 'Dave', '2020-01-17', 5);
--
For reference, GuildID will be a default search criterion based on the searcher's login details. With a spiderweb like this, how would I go about creating another table (or another column) to build a combined search structure based on the search criteria?
I was thinking something along the lines of:
CREATE TABLE `Associated`(
`ID` INT(255) NOT NULL,
`MainID` INT(255) NOT NULL,
`SecondaryID` INT(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `Associated` (`ID`, `MainID`, `SecondaryID`) VALUES
(1, 1, 2), -- Daryl associated with Terry
(2, 1, 4); -- Daryl associated with Dave
But I feel this will make an overcomplicated structure with a lot of redundant entries. Is there a more effective way to create a unified search?
The whole idea is that each name is individual, so certain entries can be put under Daryl or Terry acting alone. But one search should bring together all associated names: search one name, then pull together the total entries based on the aliases.
You can try this:
Select IFNULL(m.ToonName , members.ToonName) as ToonName
from members
LEFT JOIN Associated on Associated.MainID = members.ID
LEFT JOIN members as m on m.ID = Associated.SecondaryID
Where members.ToonName = "Mitch"
If "Mitch" has an entry in the Associated table, the query returns the associated name; when there is no associated ID, it returns the name from the members table.
And if you run it with "Daryl", it returns two results:
Select IFNULL(m.ToonName , members.ToonName) as ToonName
from members
LEFT JOIN Associated on Associated.MainID = members.ID
LEFT JOIN members as m on m.ID = Associated.SecondaryID
Where members.ToonName = "Daryl"
In case you want all the names in a single column, you can use GROUP_CONCAT, as @flash suggested in another answer.
You can directly get the data from the following SQL statement.
For individual rows:
SELECT `members`.ToonName FROM `associated` JOIN `members` ON associated.SecondaryID = members.ID WHERE `associated`.MainID = (SELECT ID FROM `members` WHERE ToonName = 'Daryl');
# Output: **ToonName**
Terry
Dave
Grouped into one row:
-- You can also combine all rows into one comma-separated value with the following statement
SELECT GROUP_CONCAT(`members`.ToonName) FROM `associated` JOIN `members` ON associated.SecondaryID = members.ID WHERE `associated`.MainID = (SELECT ID FROM `members` WHERE ToonName = 'Daryl');
# Output: **ToonName**
Terry,Dave
Plan A
Add a column to each member holding the number (or name) of the one group he/she belongs to. Terry, Dave & Daryl would get one value; Mitch would get a different value. Index the column for efficient lookup of related names.
Plan B
Implement a graph, like you suggested. Some tips: get rid of ID; instead have PRIMARY KEY(MainID, SecondaryID). There is an issue to resolve: this design implies a "directedness" of the relationships: Terry --> Dave, but not necessarily Dave --> Terry. If you want to force it to be symmetric, then either insert two rows (one per direction) or insert the two IDs in a canonical order and check both directions when querying.
Also, you need to "walk the graph". This is best done with a Recursive CTE. For that feature, you need MySQL 8.0 or MariaDB 10.2.
Plan C
Without the directedness of B, you run into more difficult issues. One is "cluster analysis". Another is messy paths and loops in the 'graph'. Let's avoid these.
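The "walk the graph" step from Plan B can be sketched with a recursive CTE. The example below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL 8.0 / MariaDB 10.2 (an assumption), treats every Associated row as an undirected link, and the CTE name `web` and helper `associated_names` are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (ID INTEGER PRIMARY KEY, ToonName TEXT NOT NULL);
CREATE TABLE Associated (
    MainID INTEGER NOT NULL,
    SecondaryID INTEGER NOT NULL,
    PRIMARY KEY (MainID, SecondaryID)
);
INSERT INTO members VALUES (1, 'Daryl'), (2, 'Terry'), (3, 'Mitch'), (4, 'Dave');
INSERT INTO Associated VALUES (1, 2), (1, 4);
""")

# Start from the searched member, then repeatedly add whoever sits on the
# other end of any link touching an already collected member. UNION (not
# UNION ALL) drops duplicates, so the walk stops even if links form a loop.
WALK = """
WITH RECURSIVE web(id) AS (
    SELECT ID FROM members WHERE ToonName = ?
    UNION
    SELECT CASE WHEN a.MainID = web.id THEN a.SecondaryID ELSE a.MainID END
    FROM Associated AS a
    JOIN web ON web.id IN (a.MainID, a.SecondaryID)
)
SELECT m.ToonName FROM members AS m JOIN web ON web.id = m.ID
ORDER BY m.ToonName
"""

def associated_names(name):
    """Return the searched name plus every name reachable through links."""
    return [row[0] for row in conn.execute(WALK, (name,))]

print(associated_names("Terry"))
print(associated_names("Mitch"))
```

Searching Terry reaches Daryl through the (1, 2) row and then Dave through (1, 4), even though no row links Terry and Dave directly; Mitch has no links and comes back alone.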
Short answer: yes, you need a second table, Associated, and it will not make the structure overly complicated.
Below is the query to get the required result:
SELECT ID, ToonName,
(
SELECT GROUP_CONCAT(ToonName) FROM Associated
JOIN members child ON SecondaryID = child.ID
WHERE MainID = parent.ID
)
FROM members parent
You can also use a join, but I think a subquery is better in this case.
NOTE: your tables need some optimization, such as removing the ID field from the Associated table, adding indexes, etc.

MySQL lookup based on round(1 + rand() * x) produces NULL and multiple results

I'm trying to select first names from a lookup table at random in MySQL to build a test dataset. I have a table with 200 first names, genders and a row id going from 1 to 200. Something like this:
id firstname gender
1 Aaron m
2 Adam m
3 Alan m
etc...
I'm selecting from this table using a random generator with the following query:
SELECT id, firstname FROM firstname WHERE id = round(1 + (rand() * 199));
I am expecting the random number to match exactly one id from the lookup table, thus producing a single result like
id firstname
43 Jason
Running the code again and again instead gives me a selection of
single rows (as above)
or multiple rows like
id firstname
29 Ethan
147 Jean
or no results (just NULL in both fields).
If I run the random generator on its own, it will always generate a number between 1 and 200. As you can see below, the id field is INT, and the query behaves the same way if I cast the result as SIGNED. I have also tried to use FLOOR instead of ROUND, just to see if that worked any differently - alas, no.
Can anyone tell me why this anomaly occurs? What am I missing?
Here is some code to create the first 20 rows of the original table for testing purposes:
-- First Name --
drop table if exists firstname;
CREATE TABLE firstname (
id INT NOT NULL,
firstname VARCHAR(20) NOT NULL,
gender VARCHAR(1) NOT NULL,
PRIMARY KEY (id),
UNIQUE (firstname)
);
INSERT INTO firstname
(id,firstname,gender)
VALUES
(1,"Aaron","m"),
(2,"Adam","m"),
(3,"Alan","m"),
(4,"Albert","m"),
(5,"Alexander","m"),
(6,"Andrew","m"),
(7,"Anthony","m"),
(8,"Arthur","m"),
(9,"Austin","m"),
(10,"Benjamin","m"),
(11,"Billy","m"),
(12,"Bobby","m"),
(13,"Brandon","m"),
(14,"Brian","m"),
(15,"Bruce","m"),
(16,"Bryan","m"),
(17,"Carl","m"),
(18,"Charles","m"),
(19,"Christian","m"),
(20,"Christopher","m");
Since RAND() is not deterministic, the WHERE condition is evaluated once for each row, with a fresh random number each time. Thus each row has a chance of roughly 1/199 of being selected, independently of the others, so a run can return zero, one, or several rows. You can use a subquery in the FROM clause (a derived table) instead to generate exactly one random number:
SELECT f.id, f.firstname
FROM firstname f
JOIN (SELECT floor(rand()*200)+1 as rnd) r ON r.rnd = f.id
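The per-row evaluation can be imitated outside the database. The following is a rough simulation in plain Python, not MySQL itself: the function names are made up, and Python's `round` stands in for MySQL's ROUND (they differ only on exact .5 ties, which does not matter here).

```python
import random

IDS = range(1, 201)  # ids 1..200, as in the lookup table

def per_row_filter(rng):
    # Mimics WHERE id = ROUND(1 + RAND() * 199): a fresh random
    # number is drawn for every row that is checked.
    return [i for i in IDS if i == round(1 + rng.random() * 199)]

def single_draw_filter(rng):
    # Mimics the derived-table version: one draw, then a lookup.
    rnd = int(rng.random() * 200) + 1
    return [i for i in IDS if i == rnd]

rng = random.Random(42)
per_row_sizes = {len(per_row_filter(rng)) for _ in range(2000)}
single_sizes = {len(single_draw_filter(rng)) for _ in range(2000)}
print(sorted(per_row_sizes))  # several different result sizes, including 0
print(sorted(single_sizes))   # always exactly one row
```

Over many runs the per-row version produces empty results, single rows, and multi-row results, which matches the behaviour described in the question, while the single-draw version always matches exactly one id.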

SQL - Column in field list is ambiguous

I have two tables, BOOKING and WORKER. Basically there is a table for workers and a table that keeps track of what each worker has to do in a time frame (a booking). I'm trying to check whether a worker is available for a job, so I query the bookings to check whether the requested time has available workers between the start and end dates. However, I get stuck on the next part, which is returning the list of workers that do have that time available. I read that I could join the tables based on a shared column, so I tried doing an inner join on the WORKER_NAME column, but when I do this I get an ambiguity error. This leads me to believe I misunderstood the concept. Does anyone understand what I'm trying to do and know how to do it, or know why I get the error below? Thanks!
CREATE TABLE WORKER (
ID INT NOT NULL AUTO_INCREMENT,
WORKER_NAME varchar(80) NOT NULL,
WORKER_CODE INT,
WORKER_WAGE INT,
PRIMARY KEY (ID)
)
CREATE TABLE BOOKING (
ID INT NOT NULL AUTO_INCREMENT,
WORKER_NAME varchar(80) NOT NULL,
START DATE NOT NULL,
END DATE NOT NULL,
PRIMARY KEY (ID)
)
query
SELECT *
FROM WORKERS
INNER JOIN BOOKING
ON WORKER_NAME = WORKER_NAME
WHERE (START NOT BETWEEN '2010-10-01' AND '2010-10-10')
ORDER BY ID
#1052 - Column 'WORKER_NAME' in on clause is ambiguous
In your query, the column "worker_name" exists in both tables; in this case, you must reference the table name as part of the column identifier. The same applies to ID in the ORDER BY clause, and note that the table is named WORKER in the schema above.
SELECT *
FROM WORKER
INNER JOIN BOOKING
ON WORKER.WORKER_NAME = BOOKING.WORKER_NAME
WHERE (START NOT BETWEEN '2010-10-01' AND '2010-10-10')
ORDER BY WORKER.ID
In your query, the WORKER_NAME and ID columns exist in both tables: WORKER_NAME retains the same meaning, while ID is re-purposed. You must therefore either specify that you are using WORKER_NAME as the join search condition or 'project away' (rename or omit) the duplicate ID column.
Because the ID columns are AUTO_INCREMENT, I assume (hope!) they have no business meaning. Therefore, they could both be omitted, allowing a natural join that will cause duplicate columns to be 'projected away'. This is one of those situations where one wishes SQL had a WORKER ( ALL BUT ( ID ) ) type syntax; instead, one is required to do it longhand. It might be easier in the long run to opt for a consistent naming convention and rename the columns to WORKER_ID and BOOKING_ID respectively.
You would also need to identify a business key to order on e.g. ( START, WORKER_NAME ):
SELECT *
FROM
( SELECT WORKER_NAME, WORKER_CODE, WORKER_WAGE FROM WORKER ) AS W
NATURAL JOIN
( SELECT WORKER_NAME, START, END FROM BOOKING ) AS B
WHERE ( START NOT BETWEEN '2010-10-01' AND '2010-10-10' )
ORDER BY START, WORKER_NAME;
This is good, but it's returning the start and end times as well. I just want the WORKER rows. I can't take the start and end out, because then SQL doesn't recognize the WHERE clause.
Two approaches spring to mind: push the where clause to the subquery:
SELECT *
FROM
( SELECT WORKER_NAME, WORKER_CODE, WORKER_WAGE FROM WORKER ) AS W
NATURAL JOIN
( SELECT WORKER_NAME, START, END
FROM BOOKING
WHERE START NOT BETWEEN '2010-10-01' AND '2010-10-10' ) AS B
ORDER BY START, WORKER_NAME;
Alternatively, replace SELECT * with a list of columns you want to SELECT:
SELECT WORKER_NAME, WORKER_CODE, WORKER_WAGE
FROM
( SELECT WORKER_NAME, WORKER_CODE, WORKER_WAGE FROM WORKER ) AS W
NATURAL JOIN
( SELECT WORKER_NAME, START, END FROM BOOKING ) AS B
WHERE START NOT BETWEEN '2010-10-01' AND '2010-10-10'
ORDER BY START, WORKER_NAME;
This error occurs when you reference a field that exists in more than one of the joined tables, so you must qualify it with a table name or alias. For instance, in the example below I write cod.coordinator so that the DBMS knows which coordinator column I want:
SELECT project__number, surname, firstname, cod.coordinator FROM coordinators AS co JOIN hub_applicants AS ap ON co.project__number = ap.project_id JOIN coordinator_duties AS cod ON co.coordinator = cod.email
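Going back to the original goal of listing available workers, one possible reading is "workers with no booking that overlaps the requested window". The sketch below assumes that reading; it is not taken from any of the answers. It uses Python's built-in sqlite3 module as a stand-in for MySQL, the three workers and two bookings are hypothetical, and START and END are double-quoted because they are SQL keywords (MySQL would normally use backticks). Every column is qualified with its table alias, which also avoids the ambiguity error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WORKER (ID INTEGER PRIMARY KEY, WORKER_NAME TEXT NOT NULL);
CREATE TABLE BOOKING (
    ID INTEGER PRIMARY KEY,
    WORKER_NAME TEXT NOT NULL,
    "START" DATE NOT NULL,
    "END" DATE NOT NULL
);
INSERT INTO WORKER (WORKER_NAME) VALUES ('Ann'), ('Bob'), ('Cy');
-- Hypothetical bookings: Ann is busy inside the window, Bob before it.
INSERT INTO BOOKING (WORKER_NAME, "START", "END") VALUES
    ('Ann', '2010-10-03', '2010-10-05'),
    ('Bob', '2010-09-01', '2010-09-10');
""")

# A booking overlaps the window when it starts on or before the window's
# end and ends on or after the window's start.
AVAILABLE = """
SELECT W.ID, W.WORKER_NAME
FROM WORKER AS W
WHERE NOT EXISTS (
    SELECT 1 FROM BOOKING AS B
    WHERE B.WORKER_NAME = W.WORKER_NAME
      AND B."START" <= :window_end
      AND B."END" >= :window_start
)
ORDER BY W.ID
"""
rows = conn.execute(
    AVAILABLE, {"window_start": "2010-10-01", "window_end": "2010-10-10"}
).fetchall()
print(rows)  # [(2, 'Bob'), (3, 'Cy')]
```

Using NOT EXISTS rather than a join means a worker with no bookings at all (Cy here) is still returned, and only WORKER columns appear in the result.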

field in subquery based on age of row instead of "group by"

I can't seem to get this query right. I have tables like this (simplified):
person: PersonID, ...other stuff...
contact: ContactID, PersonID, ContactDate, ContactTypeID, Description
I want to get a list of all the people who had a contact of a certain type (or types) but none of another type(s) that occurred later. An easy-to-understand example: Checking for records of gifts received without having sent a thank-you card afterward. There might have been other previous thank-you cards sent (pertaining to other gifts), but if the most recent occurrence of a Gift Received (we'll say that's ContactTypeID=12) was not followed by a Thank You Sent (ContactTypeID=11), the PersonID should be in the result set. Another example: A mailing list would be made up of everyone who has opted in (12) without having opted out (11) more recently.
My attempt at a query is this:
SELECT person.PersonID FROM person
INNER JOIN (SELECT PersonID,ContactTypeID,MAX(ContactDate) FROM contact
WHERE ContactTypeID IN (12,11) GROUP BY PersonID) AS seq
ON person.PersonID=seq.PersonID
WHERE seq.ContactTypeID IN (12)
It seems that the ContactTypeID returned in the subquery is for the last record entered in the table, regardless of which record has the max date. But I can't figure out how to fix it. Sorry if this has been asked before (almost everything has!), but I don't know what terms to search for.
Wow. A system to check who has been good and sent thank yous. I think I would be in your list...
Anyway. Give this a go. The idea is to build two derived tables: the first with personId and the time of the most recently received gift, and the second with personId and the time of the most recently sent thanks. Join them together using a left outer join to ensure that people who have never sent a thank you are included, and then add a comparison between the most recent gift time and the most recent thanks time to find impolite people:
select g.personId,
g.mostRecentGiftReceivedTime,
t.mostRecentThankYouTime
from
(
select p.personId,
max(ContactDate) as mostRecentGiftReceivedTime
from person p inner join contact c on p.personId = c.personId
where c.ContactTypeId = 12
group by p.personId
) g
left outer join
(
select p.personId,
max(ContactDate) as mostRecentThankYouTime
from person p inner join contact c on p.personId = c.personId
where c.ContactTypeId = 11
group by p.personId
) t on g.personId = t.personId
where t.mostRecentThankYouTime is null
or t.mostRecentThankYouTime < g.mostRecentGiftReceivedTime;
Here is the test data I used:
create table person (PersonID int unsigned not null primary key);
create table contact (
ContactID int unsigned not null primary key,
PersonID int unsigned not null,
ContactDate datetime not null,
ContactTypeId int unsigned not null,
Description varchar(50) default null
);
insert into person values (1);
insert into person values (2);
insert into person values (3);
insert into person values (4);
insert into contact values (1,1,'2013-05-01',12,'Person 1 Got a present');
insert into contact values (2,1,'2013-05-03',11,'Person 1 said "Thanks"');
insert into contact values (3,1,'2013-05-05',12,'Person 1 got another present. Lucky person 1.');
insert into contact values (4,2,'2013-05-01',11,'Person 2 said "Thanks". Not sure what for.');
insert into contact values (5,2,'2013-05-08',12,'Person 2 got a present.');
insert into contact values (6,3,'2013-04-25',12,'Person 3 Got a present');
insert into contact values (7,3,'2013-04-30',11,'Person 3 said "Thanks"');
insert into contact values (8,3,'2013-05-02',12,'Person 3 got another present. Lucky person 3.');
insert into contact values (9,3,'2013-05-05',11,'Person 3 said "Thanks" again.');
insert into contact values (10,4,'2013-04-30',12,'Person 4 got his first present');
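To check the result against the test data, here is a condensed run of the same idea using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption). The join to person and the Description column are dropped for brevity, and the aliases `lastGift` and `lastThanks` are shortened names chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contact (PersonID INT, ContactDate TEXT, ContactTypeID INT)"
)
conn.executemany("INSERT INTO contact VALUES (?, ?, ?)", [
    (1, "2013-05-01", 12), (1, "2013-05-03", 11), (1, "2013-05-05", 12),
    (2, "2013-05-01", 11), (2, "2013-05-08", 12),
    (3, "2013-04-25", 12), (3, "2013-04-30", 11),
    (3, "2013-05-02", 12), (3, "2013-05-05", 11),
    (4, "2013-04-30", 12),
])

# Latest gift (type 12) per person, left-joined to latest thanks (type 11);
# keep people with no thanks at all or whose last thanks predates the gift.
QUERY = """
SELECT g.PersonID
FROM (SELECT PersonID, MAX(ContactDate) AS lastGift
      FROM contact WHERE ContactTypeID = 12 GROUP BY PersonID) AS g
LEFT JOIN (SELECT PersonID, MAX(ContactDate) AS lastThanks
           FROM contact WHERE ContactTypeID = 11 GROUP BY PersonID) AS t
       ON t.PersonID = g.PersonID
WHERE t.lastThanks IS NULL OR t.lastThanks < g.lastGift
ORDER BY g.PersonID
"""
unthanked = [row[0] for row in conn.execute(QUERY)]
print(unthanked)  # [1, 2, 4]
```

Persons 1 and 2 received a gift after their last thank-you, person 4 never sent one, and person 3 is excluded because the last thank-you (2013-05-05) follows the last gift (2013-05-02).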

optimising and scaling mysql structure + queries for large mailing groups

So I have a system that stores contacts and allows them to be put into groups. These groups can be defined by criteria (everyone with surname 'smith'), or by explicitly adding / excluding people.
The problem I am having is that when I list the mailing groups, I need to count how many contacts are in each one. This number can change as contacts are added to / removed from the contacts table. With small groups / numbers of contacts it is fine; however, with around 50k contacts it runs into problems.
An example query I use for this is as follows:
SELECT COUNT(c_id) FROM contacts, mgroups
LEFT JOIN mgroups_explicit ON mg_id = me_mg_id
WHERE mgroups.site_id = '10'
AND mg_id = '20'
AND me_c_id = c_id
AND contacts.site_id = '10'
OR (contacts.site_id = '10' AND ( c_tags LIKE '%tag1%')) AND c_id NOT IN
( SELECT mex_c_id FROM mgroups_exclude WHERE c_id = mex_c_id ) GROUP BY c_id
The criteria table does not feature in this query, as the problem presents itself when large groups are created explicitly, rather than with a criteria. This is required as criteria based groups grow or shrink on the fly as you modify your contacts, where as explicit is generally set in stone. So in this case, if you explicitly add 20k contacts to a group, it adds 20k rows to the table marked with that mg_id as a foreign key.
This basically takes ages / times out / gets the wrong number / generally doesn't work very well. I either need to figure out a more efficient query, or figure out a better way to store everything.
Any ideas?
The 5 main tables that make up the database
contacts - where the actual contacts reside
Field Type Null Default Comments
c_id int(8) No
site_id int(6) No
c_email varchar(500) No
c_source varchar(255) No
c_subscribed tinyint(1) No 0
c_special tinyint(1) No 0
c_domain text No
c_title varchar(12) No
c_name varchar(128) No
c_surname varchar(128) No
c_company varchar(128) No
c_jtitle text No
c_ad1 text No
c_ad2 text No
c_ad3 text No
c_county varchar(64) No
c_city varchar(128) No
c_postcode varchar(32) No
c_lat varchar(100) No
c_lng varchar(100) No
c_country varchar(64) No
c_tel varchar(20) No
c_mob varchar(20) No
c_dob date No
c_registered datetime No
c_updated datetime No
c_twitter varchar(255) No
c_facebook varchar(255) No
c_tags text No
c_special_1 text No
c_special_2 text No
c_special_3 text No
c_special_4 text No
c_special_5 text No
c_special_6 text No
c_special_7 text No
c_special_8 text No
mgroups - basic mailing group info
Field Type Null Default Comments
mg_id int(8) No
site_id int(6) No
mg_name varchar(255) No
mg_created datetime No
mgroups_criteria - criteria for said mailing groups
Field Type Null Default Comments
mc_id int(8) No
site_id int(6) No
mc_mg_id int(8) No
mc_criteria text No
mgroups_exclude - anyone to exclude from criteria
Field Type Null Default Comments
mex_id int(8) No
site_id int(6) No
mex_c_id int(8) No
mex_mg_id int(8) No
mgroups_explicit - anyone to explicitly add without the use of criteria
Field Type Null Default Comments
me_id int(8) No
site_id int(6) No
me_c_id int(8) No
me_mg_id int(8) No
And the indexes / explain of the query. I must admit, indexes are not my strong point; any improvements?
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY mgroups ALL PRIMARY,mg_id NULL NULL NULL 9 Using temporary; Using filesort
1 PRIMARY mgroups_explicit ref me_mg_id me_mg_id 4 engine_4.mgroups.mg_id 8750
1 PRIMARY contacts ALL PRIMARY,c_id NULL NULL NULL 86012 Using where; Using join buffer
2 DEPENDENT SUBQUERY NULL NULL NULL NULL NULL NULL NULL Impossible WHERE noticed after reading const table...
I don't see any indexes in the schema above, you do have indexes don't you?
run an explain on the query
EXPLAIN
SELECT COUNT(c_id) FROM
contacts, mgroups LEFT JOIN mgroups_explicit ON mg_id = me_mg_id
WHERE
mgroups.site_id = '10'
AND mg_id = '20'
AND me_c_id = c_id
AND contacts.site_id = '10'
OR (contacts.site_id = '10'
AND ( c_tags LIKE '%tag1%'))
AND c_id NOT IN (SELECT mex_c_id FROM mgroups_exclude WHERE c_id = mex_c_id ) GROUP BY c_id
That will tell you about what indexes are being used how many records it has to sort through etc..
DC
Right, so I got this answered elsewhere (huge thanks to Hambut_Bulge), so for the sake of it being useful to anyone else, here's the solution:
First off, you're mixing old-style and new (ANSI) style joins in the same query. This is considered a bad idea in SQL circles. By old style I mean a query with a join written along these lines:
SELECT a.column_name, b.column2
FROM table1 a, second_table b
WHERE a.id_key = b.fid_key
AND b.some_other_criteria = 'Y';
In the newer ANSI style we'd rewrite the above to this:
SELECT a.column_name, b.column2
FROM table1 a INNER JOIN second_table b ON a.id_key = b.fid_key
WHERE b.some_other_criteria = 'Y';
It's neater, and it's easier to see which bits are join conditions and which are WHERE clauses. It's also best to get into the habit of using ANSI style, as old-style support may (at some point) be discontinued.
Also try and be consistent in your use of dot notation and/or aliases. Again it makes big queries easier to read.
Back to your problem query: I began by starting to convert it into ANSI style and straight away noticed that you don't have a join condition between contacts and mgroups. This means that the optimizer will create a cross join (also called a Cartesian product), which is probably not what you want. A cross join pairs every row in the contacts table with every row in the mgroups table. So if you have 50,000 rows in contacts and 20,000 rows in mgroups, you're going to get a joined result set containing 1,000,000,000 rows!
The other thing that is going to slow this query drastically is the subquery on mgroups_exclude. A correlated subquery like this one may be executed once for each row in the outer query, e.g.:
SELECT a.column1
FROM table1 a
WHERE a.id_key NOT IN ( SELECT b.fid_key FROM table2 b WHERE a.id_key = b.fid_key);
Assume that table1 has 2,000,000 rows and table2 has 500,000. If, for each and every row in the outer query (table1), the database has to do a full scan of the inner table, then to get a result it will have read 1,000,000,000,000 rows, when we may only be interested in 1,000! This is the worst case, where the subquery cannot make use of an index.
To get around this we can use a left join (also called a left outer join) on the two tables.
SELECT a.column1
FROM table1 a LEFT JOIN table2 b ON a.id_key = b.fid_key
WHERE b.fid_key IS NULL;
An outer join does not require each record in the joined tables to have a matching record. So in the example above we'd get all the records from table1 even if there is no match in table2. For non-matched records the database returns NULL, and we can test for that in the WHERE clause. Now the optimizer can scan the indexes on the two tables' key fields (assuming there are any), resulting in a much faster query.
So, to wrap up, I'd rewrite your original query thus:
SELECT COUNT( a.c_id )
FROM contacts a
INNER JOIN mgroups b ON a.c_id = b.mg_id
LEFT JOIN mgroups_explicit c ON b.mg_id = c.me_mg_id
LEFT JOIN mgroups_exclude d ON a.c_id = d.mex_c_id
WHERE b.mg_id = '20'
AND a.site_id = '10'
AND a.c_tags LIKE '%tag1%'
AND d.mex_c_id IS NULL
GROUP BY c_id;
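One caveat about the rewrite above: the condition a.c_id = b.mg_id appears to compare a contact id with a group id. As a hedged alternative, the sketch below assumes the intent of the original query is "contacts on the site that are explicitly in the group or match the tag, minus contacts excluded from that group". That reading is an assumption, as are the sample rows; Python's built-in sqlite3 module stands in for MySQL, and only the columns needed for the count are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (c_id INTEGER PRIMARY KEY, site_id INT, c_tags TEXT);
CREATE TABLE mgroups_explicit (me_c_id INT, me_mg_id INT);
CREATE TABLE mgroups_exclude (mex_c_id INT, mex_mg_id INT);
-- Hypothetical rows: contacts 1-5 on site 10, contact 6 on another site.
INSERT INTO contacts VALUES
    (1, 10, 'tag1'), (2, 10, ''), (3, 10, 'tag1,tag2'),
    (4, 10, 'other'), (5, 10, 'tag1'), (6, 11, 'tag1');
INSERT INTO mgroups_explicit VALUES (2, 20), (5, 20);
INSERT INTO mgroups_exclude VALUES (3, 20);
""")

# Both LEFT JOINs are scoped to the group in the ON clause, so a NULL on the
# right-hand side means "not explicitly added" / "not excluded" for that group.
COUNT_QUERY = """
SELECT COUNT(DISTINCT a.c_id)
FROM contacts AS a
LEFT JOIN mgroups_explicit AS c
       ON c.me_c_id = a.c_id AND c.me_mg_id = :group_id
LEFT JOIN mgroups_exclude AS d
       ON d.mex_c_id = a.c_id AND d.mex_mg_id = :group_id
WHERE a.site_id = :site_id
  AND (c.me_c_id IS NOT NULL OR a.c_tags LIKE :tag_pattern)
  AND d.mex_c_id IS NULL
"""
params = {"group_id": 20, "site_id": 10, "tag_pattern": "%tag1%"}
group_size = conn.execute(COUNT_QUERY, params).fetchone()[0]
print(group_size)  # 3
```

Contacts 1, 2 and 5 are counted: 1 matches the tag, 2 was added explicitly, and 5 qualifies both ways but is counted once. Contact 3 is excluded, 4 matches nothing, and 6 belongs to another site. In MySQL, indexes on (me_mg_id, me_c_id) and (mex_mg_id, mex_c_id) would likely help these joins, though the leading-wildcard LIKE on c_tags still cannot use an ordinary index.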