MySQL find entries matching multiple pivot rows - mysql

I have a schema with 3 regular tables and 2 link tables like so:
tutorial
id
exam
id
user
id
exam_user
user
exam
tutorial_user
user
tutorial
Where the link tables determine which users have attended which exam/tutorial.
Starting with an exam ID, I'm trying to find tutorials whose members have ALL attended the exam.
For example:
exam_user
+----+-------+----------+
| id | exam | user |
+----+-------+----------+
| 23 | 8 | 1 |
| 24 | 8 | 5 |
| 25 | 8 | 8 |
| 26 | 8 | 11 |
| 27 | 8 | 12 |
+----+-------+----------+
tutorial_user
+----+----------+----------+
| id | tutorial | user |
+----+----------+-----------
| 56 | 2 | 1 |
| 57 | 2 | 5 |
| 58 | 2 | 8 |
| 59 | 2 | 11 |
+----+----------+----------+
+----+----------+----------+
| id | tutorial | user |
+----+----------+-----------
| 60 | 3 | 1 |
| 61 | 3 | 5 |
| 62 | 3 | 8 |
| 63 | 3 | 15 |
+----+----------+----------+
Tutorial 2 would match because all entries are in exam 8, but tutorial 3 would not because user 11 is not present.
Is there any (relatively) efficient way to achieve the above?

How about you try something like this:
select * from tutorial_user where tutorial not in
(
select tutorial from
(
(
select ex.exam, ex.user as ex_user, tut.tutorial as tutorial,
tut.user as tut_user from exam_user ex
inner join tutorial_user tut on
ex.User = tut.User
)
union all
(
select null, null, tutorial, user as tut_user from tutorial_user
)
) src
group by tutorial, tut_user
having count(tut_user) = 1
)
Now the first query should give you a list that looks like this (essentially a combination of exams and tutorials for every user):
+----+-------+----------+-------------+---------+
| id | exam | user | Tutorial | User |
+----+-------+----------+-------------+---------+
| 23 | 8 | 1 | 2 | 1 |
| 23 | 8 | 1 | 3 | 1 |
| 24 | 8 | 5 | 2 | 5 |
| 24 | 8 | 5 | 3 | 5 |
...
Next, by doing a union all with the next query, you'll end up with something like this:
+-----+-------+----------+-------------+---------+
| id | exam | user | Tutorial | User |
+-----+-------+----------+-------------+---------+
| 23 | 8 | 1 | 2 | 1 |
| 23 | 8 | 1 | 3 | 1 |
| 24 | 8 | 5 | 2 | 5 |
| 24 | 8 | 5 | 3 | 5 |
| null| null | null | 2 | 1 |
| null| null | null | 2 | 5 |
| null| null | null | 2 | 8 |
...
Basically, what this allows us to do is a group by statement that's going to highlight users in the tutorial table and not in the exam table. The said highlight is going to be with the having() clause. If the count is 1, then you have someone in the tutorial table that's not in the exam table. All you have to do now is select the tutorials from the tutorial table that are NOT part of what you found and you should be good.

Related

Joining two MySQL tables and then performing basic filtering on one of them

I have two tables, they look roughly like below:
Categories:
+----+-----------+--------+----------------+
| id | entity_id | set_id | type |
+----+-----------+--------+----------------+
| 1 | 49 | 1 | signup |
| 2 | 57 | 1 | signup |
| 3 | 65 | 1 | signup |
| 4 | 69 | 1 | recommendation |
| 5 | 73 | 1 | signup |
| 6 | 77 | 1 | recommendation |
| 7 | 23 | 2 | comment |
| 8 | 45 | 2 | recommendation |
| 9 | 56 | 2 | signup |
| 10| 76 | 2 | signup |
+----+-----------+--------+----------------+
Steps:
+----+--------+----------+--------+
| id | set_id | start_id | end_id |
+----+--------+----------+--------+
| 1 | 1 | 49 | 57 |
| 2 | 1 | 57 | 65 |
| 3 | 1 | 65 | 69 |
| 4 | 1 | 69 | 73 |
| 5 | 1 | 77 | 57 |
| 6 | 2 | 23 | 45 |
| 7 | 2 | 45 | 56 |
| 8 | 2 | 56 | 76 |
+----+--------+----------+--------+
I need to select the rows of a given set_id from categories, but only if their entity_id is found in steps in the start_id column but NOT the end_id column.
So from this example, these are the two records I would expect to be returned:
+----+-----------+--------+----------------+
| id | entity_id | set_id | type |
+----+-----------+--------+----------------+
| 1 | 49 | 1 | signup |
| 6 | 77 | 1 | recommendation |
+----+--------+----------+-----------------+
I know how to perform a basic join on set_id, but I'm not sure how to search all the records of that set_id from steps and exclude those that have an entity_id that appears in end_id
Something like:
SELECT * FROM categories JOIN steps ON categories.set_id = steps.set_id WHERE ....categories.entity_id is present in steps.start_id but not in steps.end_id??
How should this logic be properly formed?
Exists logic works well here and is also straightforward:
SELECT c.*
FROM categories c
WHERE
EXISTS (SELECT 1 FROM steps s WHERE s.set_id = c.set_id AND c.entity_id = s.start_id)
AND
NOT EXISTS (SELECT 1 FROM steps s WHERE s.set_id = c.set_id AND c.entity_id = s.end_id);
The above query, read in plain English, says to find all category records such that there exists a matching record in steps where the entity_id matches the start_id, but there does not also exist a record in steps where the entity_id appears in end_id.

How to get only the second duplicated record in laravel 5.5?

Let's say i have a user table like this :
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 3 | Joseph | joseph410#mail.com | 21 |
| 4 | George | gge.48#mail.com | 28 |
| 5 | Joseph | jh.city89#mail.com | 24 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 8 | Joseph | psa.jos#mail.com | 34 |
| 9 | Joseph | joseph.la#mail.com | 28 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+---------+------------+------+
In the actual, the database consists of more data and some of the data is duplicated, with more than two records. But the point is i want to get only the first and the second row of the duplicated rows that contains the name of "Joseph", How can i achieve this ? My code so far...
User::withTrashed()->groupBy('name')->havingRaw('count("name") >= 1')->get();
With that code the result will retrieve :
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 3 | Joseph | joseph410#mail.com | 21 |
| 4 | George | gge.48#mail.com | 28 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+---------+------------+------+
And i use this code to try to get the second duplicated row :
User::withTrashed()->groupBy('name')->havingRaw('count("name") >= 2')->get();
The result is still same as the mentioned above :
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 3 | Joseph | joseph410#mail.com | 21 |
| 4 | George | gge.48#mail.com | 28 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+---------+------------+------+
I want the result is to get record that have the id "5" with name "Joseph" like this :
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 1 | John | john.doe1#mail.com | 24 |
| 2 | Josh | josh99#mail.com | 29 |
| 4 | George | gge.48#mail.com | 28 |
| 5 | Joseph | jh.city89#mail.com | 24 |
| 6 | Kim | kimsd#mail.com | 32 |
| 7 | Bob | bob.s#mail.com | 38 |
| 10 | Jonathan | jonhan#mail.com | 22 |
+----+-----------+---------+------------+------+
But it seems only the first duplicate row is retrieved and i can't get the second duplicated row, can anybody give me suggestion ?
Let's start from your query
User::withTrashed()->groupBy('name')->havingRaw('count("name") >= 1')->get();
This will show all groups of rows whose count equals to 1 ore more. and this is the description of DISTINCT.
If you want to get only duplicate records you should get groups whose count is LARGER than 1.
The other thing to notice here is that a non-aggrigated column will be chosen randomly. because when you get a name and it's count, for example if you select name,count(name), email (email is not in the group by clause - not aggregated), and 4 rows have the same name. so you'll see:
+--------+-------------+-------+
| Name | Count(Name) | Email |
+--------+-------------+-------+
| Joseph | 4 | X |
+--------+-------------+-------+
what do you expect instead of X? which one of the 4 emails? actually, in SQLServer it's forbidden to select a non-aggrigated column and other databases will just give you a random one of the counted 3.
see this answer for more details it's explained very well: Do all columns in a SELECT list have to appear in a GROUP BY clause
So, we'll use having count(name) > 1 and select only the aggregated column name
DB::from('users')->select('name')->groupBy('name')->havingRaw('count("name") > 1')->get();
This should give you (didn't test it) this:
+--------+-------------+
| name | Count(name) |
+--------+-------------+
| Joseph | 4 |
+--------+-------------+
This will give you all names who have 2 or more instances. you can determine the number of duplicates in the having clause. for example having count(name) = 3 will give you all names which have exactly 3 duplicates.
So how to get the second duplicate? I have a question for that:
What is the first (original) duplicate? is it the one with the oldest created_at or the oldest updated_at ? or maybe some other condition?. because of that you should make another query with order by clause to give you the duplicates in the order most convenient to you. for example:
select * from `users` where `name` in (select `name` from users group by `name` having count(`name`) > 1) order by `id` asc
which will give:
+----+-----------+----------------------+------+
| ID | Name | Email | Age |
+----+-----------+----------------------+------+
| 3 | Joseph | joseph410#mail.com | 21 |
| 5 | Joseph | jh.city89#mail.com | 24 |
| 8 | Joseph | psa.jos#mail.com | 34 |
| 9 | Joseph | joseph.la#mail.com | 28 |
+----+-----------+---------+------------+------+

Comparing all data to the data of specifically selected user ID?

I have a mysql table that holds data for team games.
Objective:
Count the number of times other SquadID's have have shared the same Team value as SquadID=21
// Selections table
+--------+---------+------+
| GameID | SquadID | Team |
+--------+---------+------+
| 1 | 5 | A |
| 1 | 7 | B |
| 1 | 11 | A |
| 1 | 21 | A |
| 2 | 5 | A |
| 2 | 7 | B |
| 2 | 11 | A |
| 2 | 21 | A |
| 3 | 5 | A |
| 3 | 7 | B |
| 3 | 11 | A |
| 3 | 21 | A |
| 4 | 5 | A |
| 4 | 11 | B |
| 4 | 21 | A |
| 5 | 5 | A |
| 5 | 11 | B |
| 5 | 21 | A |
| 6 | 5 | A |
| 6 | 11 | B |
| 6 | 21 | A |
+--------+---------+------+
// Desired Result
+---------+----------+
| SquadID | TeamMate |
+---------+----------+
| 5 | 6 |
| 7 | 0 |
| 11 | 3 |
| 21 | 6 |
+----------+---------+
I've attempted to use a subquery specifying the specific player I wish to compare with and because this subquery has multiple rows, I've used in instead of =.
// Current Query
SELECT
SquadID,
COUNT(Team IN (SELECT Team FROM selections WHERE SquadID=21) AND GameID IN (SELECT GameID FROM selections WHERE SquadID=21)) AS TeamMate
FROM
selections
GROUP BY
SquadID;
The result I'm getting is the number of Games a user has played rather than the number of games a user has been on the same team as SquadID=21
// Current Result
+---------+----------+
| SquadID | TeamMate |
+---------+----------+
| 5 | 6 |
| 7 | 3 |
| 11 | 6 |
| 21 | 6 |
+---------+----------+
What am I missing?
// DESCRIBE selections;
+---------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------+------+-----+---------+-------+
| GameID | int(11) | NO | PRI | 0 | |
| SquadID | int(4) | NO | PRI | NULL | |
| Team | char(1) | NO | | NULL | |
| TeamID | int(11) | NO | | 1 | |
+---------+---------+------+-----+---------+-------+
General rule is to avoid nested selects and look for a better way of logically arranging joins. Lets look at a cross join:
From selections s1
inner join selects s2 on s1.gameid = s2.gameid and s1.team = s2.team
This will produce a cross joined list of each squadID that participated with another squadID (IE: they were in the same game and on same team). We are only interested in the times where the squad participated with squad 21, so add a where clause:
where s2.squadid = 21
Then it's simply choosing the field/count you want:
select s1.squad, count(1) as teammate
any aggregate needs a group by
group by s1.squad
Combine it together and give a go. Oddly, this will produce a list where squad 21 will be showing as playing on it's own team all 6 times. Adding a where clause can eliminate this
where s1.squadid <> s2.squadid
SELECT SquadID, count(t1.team) as TeamMate
FROM selections as t1 join
(select distinct team, gameid from selections where SquadID=21) as t2
on t1.Team=t2.Team and t1.Gameid=t2.Gameid
GROUP BY SquadID

Mysql query to convert table from long format to wide format

I have a table called ContactAttrbiutes which contains a list of each contacts' attributes. The kind of data stored for these contacts include: Title, Forename, Surname telephone number etc.
Current Table
+-------------+-----------+------------------------------+
| attributeId | ContactId | AttributeValue |
+-------------+-----------+------------------------------+
| 1 | 5 | Lady |
| 2 | 5 | Elizabeth |
| 3 | 5 | E |
| 4 | 5 | Anson |
| 5 | 5 | |
| 6 | 5 | |
| 7 | 5 | |
| 8 | 5 | |
| 10 | 5 | 0207 72776 |
| 11 | 5 | |
| 12 | 5 | 0207 22996 |
| 13 | 5 | 0207 72761 |
| 14 | 5 | |
| 15 | 5 | |
| 60 | 5 | Lloyds |
| 61 | 5 | |
| 1 | 10 | Mr |
| 2 | 10 | John |
| 3 | 10 | J C |
| 4 | 10 | Beveridge |
| 5 | 10 | Esq QC |
| 6 | 10 | Retired |
| 7 | 10 | |
| 8 | 10 | |
| 10 | 10 | 0207 930 |
| 11 | 10 | |
| 12 | 10 | |
| 13 | 10 | 0207 930 |
| 14 | 10 | |
| 15 | 10 | |
| 60 | 10 | |
| 61 | 10 | |
+-------------+-----------+------------------------------+
However I would like to run a query to create a table that looks like...
New Table
+-----------+----------------------+-------------------------+-----------------------+------------------------+
| ContactId | AttributeValue_Title | AttributeValue_ForeName |AttributeValue_Initial | AttributeValue_Surname |
+-----------+----------------------+-------------------------+-----------------------+------------------------+
| 5 | Lady | Elizabeth | E | Anson |
+-----------+----------------------+-------------------------+-----------------------+------------------------+
| 10 | Mr | John | J C | Beveridge |
+-----------+----------------------+-------------------------+-----------------------+------------------------+
I am sure there is a very simple answer but I have spent hours looking. Can anyone help?
The above is only a small extract of my table, I have 750,000 contacts. In addition I would like the final table to have more columns than I have described above but they will come from different Attributes with the existing table.
Thank you very much in advance.
try this
SELECT ContactId ,
max(CASE when attributeId = 1 then AttributeValue end) as AttributeValue_Title ,
max(CASE when attributeId = 2 then AttributeValue end )as AttributeValue_ForeName ,
max(CASE when attributeId = 3 then AttributeValue end )as AttributeValue_Initial ,
max(CASE when attributeId = 4 then AttributeValue end) as AttributeValue_Surname
from Table1
group by ContactId
DEMO HERE
if you want to make your result more longer for other attributeId then just add a case statment as in the code.
SELECT
t_title.AttributeValue AS title,
t_name.AttributeValue AS name,
...
FROM the_table AS t_title
JOIN the_table AS t_firstname USING(contact_id)
JOIN ...
WHERE
t_title.attributeId = 1 AND
t_firstname.attributeId = 2 AND
...
EAV "model" is an antipattern in most cases. Are you really going to have a variable number of attributes? If yes, then no-SQL solution might be more appropriate than a relational database.

join with a group by?

i have a table called rc_language_type_table with:
id language
1 english
2 Xhosa
3 afrikaans
etc
then i have a table rc_language_type_assoc_table with:
profile_id | language_type_id |
+------------+------------------+
| 3 | 1 |
| 13 | 1 |
| 15 | 1 |
| 16 | 1 |
where i have profiles and each profile is connected to a language id in a 1 to many
so then i did:
select *,count(*) from rc_language_type_assoc_table group by language_type_id;
+------------+------------------+----------+
| profile_id | language_type_id | count(*) |
+------------+------------------+----------+
| 3 | 1 | 96 |
| 3 | 2 | 19 |
| 3 | 3 | 18 |
| 64 | 4 | 51 |
| 94 | 5 | 10 |
| 37 | 6 | 26 |
| 3 | 7 | 21 |
| 3 | 8 | 4 |
| 3 | 9 | 6 |
| 88 | 10 | 4 |
| 3 | 11 | 3 |
+------------+------------------+----------+
what i want now is: instead having the language_type_id i want to display the actual language...how would i do this please???
i tried:
select *, count(*)
from rc_language_type_assoc_table, rc_language_type_table
group by language_type_id
where rc_language_type_assoc_table.language_type_id = rc_language_type_table.id;
but i get a syntax error...
please help??
thank you
GROUP BY should be "after" the WHERE statement and not before
select *, count(*)
from rc_language_type_assoc_table, rc_language_type_table
where rc_language_type_assoc_table.language_type_id = rc_language_type_table.id
group by language_type_id ;