MySQL: join across five tables

I have a MySQL database with this setup (omitting fields not relevant to this question):
users
  id #primary key
user_group_teachers
  id #primary key
  teacher_id #foreign key to users.id
  user_group_id #foreign key to user_groups.id
user_groups
  id #primary key
user_group_members
  id #primary key
  pupil_id #foreign key to pupils.id
  user_group_id #foreign key to user_groups.id
pupils
  id #primary key
I have a collection of user ids in an array, called "user_ids".
For each of those user ids, I want to collect the pupil ids associated with that user via the
user -> user_group_teachers -> user_groups -> user_group_members -> pupils
association, i.e. some kind of join across the tables.
So, I'd like to get some kind of result where the rows look like
[1, [6,7,8,9]]
where 1 is the teacher id, and [6,7,8,9] are the ids of pupils. I'd only like each pupil id to appear once in the second list.
Can anyone tell me how to do this in as few queries as possible (or, more broadly, as efficiently as possible)? I will usually have between 1,000 and 10,000 ids in user_ids.
I'm doing this in a Ruby script, so I can store the results as variables (arrays or hashes) in between queries, if that makes things simpler.
Thanks! max
EDIT for Lyhan
Lyhan - thanks, but your solution doesn't seem to work. For example, in the first row of the results, using your method, I have
| user_id | group_concat(pupils.id separator ",")
| 1 | 2292
But if I get the associated pupil ids in a slower, step-by-step way, then I get different results:
select group_concat(user_group_teachers.user_group_id separator ",")
from user_group_teachers
where user_group_teachers.teacher_id = 1
group by user_group_teachers.teacher_id;
I get
| group_concat(user_group_teachers.user_group_id separator ",")
| 12,1033,2117,2280,2281
Plugging these values (user_group ids) into another query:
select group_concat(user_group_members.pupil_id separator ",")
from user_group_members
where user_group_members.user_group_id in (12,1033,2117,2280,2281)
group by user_group_members.user_group_id;
I get
| group_concat(user_group_members.pupil_id separator ",")
| 47106,47107
Thanks for the group_concat method btw, that's handy :)

I made a couple comments above that are important to the solution for this, but I think you could start with these two queries to see if it gets you far enough along to get what you need.
To get ordered lists for a teacher for pupils across all groups, you could do this:
select distinct t.teacher_id, m.pupil_id
from user_groups g
inner join user_group_teachers t
on t.user_group_id = g.id
inner join user_group_members m
on m.user_group_id = g.id
order by t.teacher_id, m.pupil_id
To get ordered lists for a teacher for pupils with the relationship to the group intact, you could do this:
select g.id, t.teacher_id, m.pupil_id
from user_groups g
inner join user_group_teachers t
on t.user_group_id = g.id
inner join user_group_members m
on m.user_group_id = g.id
order by g.id, t.teacher_id, m.pupil_id
You would have to walk these result sets and transform them into the nested arrays, but it is the data you wanted.
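Walking the flat (teacher_id, pupil_id) result set into the nested form from the question could be sketched roughly as below. Python stands in here for the Ruby script mentioned in the question, and `rows` is made-up sample data shaped like the output of the first query, not real results. The grouping relies on the rows already being ordered by teacher_id, which the ORDER BY in the query provides.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: distinct and ordered by (teacher_id, pupil_id),
# the shape the first query above is expected to return.
rows = [(1, 6), (1, 7), (1, 8), (1, 9), (2, 6), (2, 10)]


def nest(rows):
    """Collapse ordered (teacher_id, pupil_id) rows into [teacher_id, [pupil_ids]] pairs."""
    return [
        [teacher_id, [pupil_id for _, pupil_id in group]]
        for teacher_id, group in groupby(rows, key=itemgetter(0))
    ]


nested = nest(rows)
print(nested)
```

Because `groupby` only merges adjacent rows, an unordered result set would need sorting first.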
Update: If the data set is too large or you do not want to walk a single result set, then you could do this to emulate the results of the first query above and build your sub-arrays from the query result sets:
/* Use this query to drive the batch */
select distinct t.teacher_id
from user_group_teachers t
order by t.teacher_id
/* Inside a loop based on first query result, pull out the array of pupils for a teacher */
select distinct m.pupil_id
from user_group_members m
inner join user_groups g
on g.id = m.user_group_id
inner join user_group_teachers t
on t.user_group_id = g.id
where t.teacher_id = /* parameter */
order by m.pupil_id

This is what i came up with:
select pupil_group_teachers.teacher_id, group_concat(pupil_group_members.pupil_id separator ',')
from pupil_group_teachers join pupil_groups on pupil_group_teachers.pupil_group_id = pupil_groups.id
join pupil_group_members on pupil_group_members.pupil_group_id = pupil_groups.id
group by pupil_group_teachers.teacher_id;
It seems to work, and is really fast. Lyhan (who has since deleted his answer) and David Fleeman both helped me figure it out. Cheers guys.
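A variation on this that also keeps each pupil id to one appearance per teacher and restricts the result to the ids in user_ids might look like the sketch below. It is only an illustration under assumptions: SQLite (Python's built-in sqlite3) stands in for MySQL, the table names follow the schema in the question rather than the pupil_group_* names above, the sample rows are invented, and the join through user_groups is skipped on the assumption that every user_group_id refers to an existing group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_group_teachers (id INTEGER PRIMARY KEY, teacher_id INTEGER, user_group_id INTEGER);
    CREATE TABLE user_group_members  (id INTEGER PRIMARY KEY, pupil_id INTEGER, user_group_id INTEGER);
    -- invented sample data: teacher 1 teaches groups 10 and 11, pupil 7 is in both
    INSERT INTO user_group_teachers (teacher_id, user_group_id) VALUES (1, 10), (1, 11), (2, 11), (3, 12);
    INSERT INTO user_group_members (pupil_id, user_group_id) VALUES (6, 10), (7, 10), (7, 11), (8, 11), (9, 12);
""")

user_ids = [1, 2]  # stand-in for the array from the question
placeholders = ",".join("?" for _ in user_ids)
sql = f"""
    SELECT t.teacher_id, GROUP_CONCAT(DISTINCT m.pupil_id)
    FROM user_group_teachers t
    JOIN user_group_members m ON m.user_group_id = t.user_group_id
    WHERE t.teacher_id IN ({placeholders})
    GROUP BY t.teacher_id
    ORDER BY t.teacher_id
"""
# Split each comma-separated list back into sorted integers.
result = {
    teacher_id: sorted(int(p) for p in pupil_csv.split(","))
    for teacher_id, pupil_csv in conn.execute(sql, user_ids)
}
print(result)
```

One thing to watch in MySQL itself: GROUP_CONCAT output is truncated at group_concat_max_len (1024 by default), so teachers with many pupils may need that setting raised, or the flat DISTINCT query instead.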

Related

LEFT JOIN but with WHERE criteria, rows getting lost

I have a simple database with three tables. In the database I have a table for users of my system, a table for applications to a competition, and an intermediary table that allows me to track which users have selected which applications to view.
Table 1 = users (user_id, username, first, last, etc...)
Table 2 = applications (application_id, company_name, url, etc...)
Table 3 = picks (pick_id, user_id, application_id, picked)
I am trying to write an SQL query that will show all the applications that have been submitted and if any individual application has been selected by a user will show that it has been "picked" (1=picked, 0=not picked).
So for user_id = 1 I'd like to see:
Column Names (application_id, company_name, picked)
1, Foo, 1
2, Bar, 1
3, Alpha, Null
4, Beta, Null
I tried it with the following query:
SELECT applications.application_id, applications.company_name, picks.picked
FROM applications
LEFT JOIN picks ON applications.application_id = picks.application_id
ORDER BY applications.application_id ASC
Which is returning this:
1, Foo, 1
1, Foo, 1
2, Bar, null
3, Alpha, null
4, Beta, null
I have a second user (user_id = 2) that also picked application 1 ("Foo") which I know is returning the second row.
Then I tried to limit the scope by specifying user_id = 1 here:
SELECT applications.application_id, applications.company_name, picks.picked
FROM applications
LEFT JOIN picks ON applications.application_id = picks.application_id
WHERE user_id = 1
ORDER BY applications.application_id ASC
Now I'm only getting:
1, Foo, 1
Any suggestions on how I can get what I'm looking for? Again, ideally for a single user I'd like to see:
Column Names (application_id, company_name, picked)
1, Foo, 1
2, Bar, 1
3, Alpha, Null
4, Beta, Null
You have a so-called join table in your database schema. In your case it's called picks. This allows you to create a many-to-many relationship between your users and applications.
To use that join table correctly you need to join all three tables. These queries are easier to write if you use table aliases (applications AS a, etc.)
SELECT a.application_id, a.company_name, p.picked, u.user_id, u.username
FROM applications AS a
LEFT JOIN picks AS p ON a.application_id = p.application_id
LEFT JOIN users AS u ON p.user_id = u.user_id
ORDER BY a.application_id, u.user_id
This will give you a list of all applications with the users who have picked them. If no users are related to an application, the LEFT JOIN operations will retain the application row and you'll see NULL values for columns from the picks and users tables.
Now, if you add a WHERE p.something = something or u.something = something clause to this query in an attempt to narrow down the presentation, it has the effect of converting the LEFT JOIN clauses into INNER JOIN clauses. That is, you won't retain the applications rows that don't have matching rows in the other tables.
If you want to retain those unmatched rows in your result set, put the condition in the first ON clause instead of the WHERE clause, like so.
SELECT a.application_id, a.company_name, p.picked, u.user_id, u.username
FROM applications AS a
LEFT JOIN picks AS p ON a.application_id = p.application_id AND p.user_id = 1
LEFT JOIN users AS u ON p.user_id = u.user_id
ORDER BY a.application_id, u.user_id
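The difference between filtering in WHERE and filtering in ON can be seen in a small experiment. The sketch below uses SQLite (Python's built-in sqlite3) as a stand-in for MySQL, with invented rows modelled on the question's example; the join to users is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE applications (application_id INTEGER PRIMARY KEY, company_name TEXT);
    CREATE TABLE picks (pick_id INTEGER PRIMARY KEY, user_id INTEGER, application_id INTEGER, picked INTEGER);
    INSERT INTO applications VALUES (1, 'Foo'), (2, 'Bar'), (3, 'Alpha'), (4, 'Beta');
    -- invented picks: user 1 picked Foo and Bar, user 2 picked Foo
    INSERT INTO picks (user_id, application_id, picked) VALUES (1, 1, 1), (2, 1, 1), (1, 2, 1);
""")

base = """
    SELECT a.application_id, a.company_name, p.picked
    FROM applications AS a
    LEFT JOIN picks AS p ON a.application_id = p.application_id {on_extra}
    {where}
    ORDER BY a.application_id
"""
# Condition in WHERE: unmatched applications are filtered out after the join.
filtered_in_where = conn.execute(
    base.format(on_extra="", where="WHERE p.user_id = 1")
).fetchall()
# Condition in ON: unmatched applications survive with NULL (None) for picked.
filtered_in_on = conn.execute(
    base.format(on_extra="AND p.user_id = 1", where="")
).fetchall()

print(filtered_in_where)
print(filtered_in_on)
```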
Edit: Many join tables like your picks table are set up with a composite primary key, in your example (application_id, user_id). That ensures just one row per possible relationship between the tables being joined. In your case you have the potential for multiple such rows.
To use only the most recent of those rows (the one with the highest pick_id) takes a little more work. You need a subquery (virtual table) to extract it, and to retrieve the appropriate value of picked so your query works. So now things get interesting.
SELECT MAX(pick_id) AS pick_id,
application_id, user_id
FROM picks
GROUP BY application_id, user_id
retrieves the unique relationship pair. That is good. But next we have to fetch the picked column detail value from those rows. That takes another join, using the MAX value of pick_id, like so
SELECT q.application_id, q.user_id, r.picked
FROM (
SELECT MAX(pick_id) AS pick_id,
application_id, user_id
FROM picks
GROUP BY application_id, user_id
) AS q
JOIN picks AS r ON q.pick_id = r.pick_id
So, we need to substitute this little virtual table (subquery) in place of the picks AS p table in the original query. That looks like this.
SELECT a.application_id, a.company_name, p.picked, u.user_id, u.username
FROM applications AS a
LEFT JOIN (
SELECT q.application_id, q.user_id, r.picked
FROM (
SELECT MAX(pick_id) AS pick_id,
application_id, user_id
FROM picks
GROUP BY application_id, user_id
) AS q
JOIN picks AS r ON q.pick_id = r.pick_id
) AS p ON a.application_id = p.application_id AND p.user_id = 1
LEFT JOIN users AS u ON p.user_id = u.user_id
ORDER BY a.application_id, u.user_id
Some developers prefer to create VIEW objects for subqueries like the one here, rather than creating a club sandwich of a query like this one. It's not called Structured Query Language on a foolish whim, eh? These subqueries sometimes can be elements of a structure.

MySQL - 3 tables, is this complex join even possible?

I have three tables: users, groups and relation.
Table users with fields: usrID, usrName, usrPass, usrPts
Table groups with fields: grpID, grpName, grpMinPts
Table relation with fields: uID, gID
A user can be placed in a group in two ways:
if he collects the group's minimum number of points (users.usrPts > groups.grpMinPts ORDER BY groups.grpMinPts DESC LIMIT 1)
if his relation to the group is manually added to the relation table (user ID stored as uID and group ID stored as gID)
Can I write one single query that determines, for every user (or for one specific user), which group he belongs to, where a manual relation (from the relation table) takes priority over the usrPts/grpMinPts comparison? Also, I do not want a user to be shown twice (once with his group by points and once with his related group)...
Thanks in advance! :) I tried:
SELECT * FROM users LEFT JOIN (relation LEFT JOIN groups ON (relation.gID = groups.grpID)) ON users.usrID = relation.uID
Using this I managed to extract specified relations (from relation table), but, I have no idea how to include user points, respecting above mentioned priority (specified first). I know how to do this in a few separated queries in php, that is simple, but I am curious, can it be done using one single query?
EDIT TO ADD:
Thanks to the really educational technique using coalesce that @GordonLinoff provided, I managed to make this query work as I expected. So, here it goes:
SELECT o.usrID, o.usrName, o.usrPass, o.usrPts, t.grpID, t.grpName
FROM (
SELECT u.*, COALESCE(relationgroupid,groupid) AS thegroupid
FROM (
SELECT u.*, (
SELECT grpID
FROM groups g
WHERE u.usrPts > g.grpMinPts
ORDER BY g.grpMinPts DESC
LIMIT 1
) AS groupid, (
SELECT gID
FROM relation r
WHERE r.uID = u.usrID
) AS relationgroupid
FROM users u
)u
)o
JOIN groups t ON t.grpID = o.thegroupid
Also, if you are wondering, as I did, whether this approach is faster or slower than running three queries and processing the results in PHP: it is slightly faster. Executing this query and showing the results on a webpage took 14 ms on average. Three simple queries, processing in PHP and showing the results on a webpage took 21 ms. Both averages are over 10 runs, and the execution time was nearly constant from run to run.
Here is an approach that uses correlated subqueries to get each of the values. It then chooses the appropriate one using the precedence rule that if the relations exist use that one, otherwise use the one from the groups table:
select u.*,
coalesce(relationgroupid, groupid) as thegroupid
from (select u.*,
(select grpID from groups g where u.usrPts > g.grpMinPts order by g.grpMinPts desc limit 1
) as groupid,
(select gID from relation r where r.uID = u.usrID
) as relationgroupid
from users u
) u
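The precedence rule can be tried out on a tiny data set. This is a sketch under assumptions, not the answerer's exact query: SQLite (Python's built-in sqlite3) stands in for MySQL, the rows are invented, and a LIMIT 1 is added to the relation subquery as a guard in case a user has more than one manual relation. The table name is backtick-quoted because GROUPS is a reserved word in MySQL 8.0.2 and later; SQLite accepts the same quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (usrID INTEGER PRIMARY KEY, usrName TEXT, usrPts INTEGER);
    CREATE TABLE `groups` (grpID INTEGER PRIMARY KEY, grpName TEXT, grpMinPts INTEGER);
    CREATE TABLE relation (uID INTEGER, gID INTEGER);
    -- invented sample data
    INSERT INTO users VALUES (1, 'ann', 50), (2, 'bob', 500), (3, 'cy', 150);
    INSERT INTO `groups` VALUES (1, 'newbie', 0), (2, 'regular', 100), (3, 'staff', 100000);
    INSERT INTO relation VALUES (1, 3);  -- ann is manually placed in 'staff'
""")

sql = """
    SELECT u.usrID, u.usrName,
           COALESCE(
               -- a manual relation wins when one exists
               (SELECT r.gID FROM relation r WHERE r.uID = u.usrID LIMIT 1),
               -- otherwise the highest group whose minimum the user exceeds
               (SELECT g.grpID FROM `groups` g
                 WHERE u.usrPts > g.grpMinPts
                 ORDER BY g.grpMinPts DESC LIMIT 1)
           ) AS thegroupid
    FROM users u
    ORDER BY u.usrID
"""
rows = conn.execute(sql).fetchall()
print(rows)
```

Each user appears exactly once, because both lookups are scalar subqueries on the users row rather than joins.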
Try something like this
select user.name, group.name
from group
join relation on relation.gid = group.gid
join user on user.uid = relation.uid
union
select user.name, g1.name
from group g1
join group g2 on g2.minpts > g1.minpts
join user on user.pts between g1.minpts and g2.minpts

MySQL query for multiple tables being secondary tables multiple items?

I have a query where I currently get information from 2 tables like this:
SELECT g.name
FROM site_access b
JOIN groups g
ON b.group_id = g.id
WHERE b.site_id = 1
ORDER BY g.status ASC
Now I want to add another table to this query, but that table would return more than one row per group. Is that possible at all?
All I could manage was to pull one row from that table. The field I want is a string field, and it is fine to join the results with a separator, as long as all the matches can be pulled together in this query.
If you need more information about the tables, feel free to ask. I didn't think it would be needed, as this is mostly an example of how to pull multiple rows from a join/select query.
UPDATE of what the above query would result:
Admin
Member
Banned
Now with my 3rd table each access have commands they are allowed to use so this 3rd table would list what commands each one has access to, example:
Admin - add, del, announce
Member - find
Banned - none
UPDATE2:
site_access
  site_id
  group_id
groups
  id
  name
  status
groups_commands
  group_id
  command_id
commands
  id
  name
SELECT g.name, GROUP_CONCAT(c.name) AS commands
FROM site_access b
JOIN groups g
ON b.group_id = g.id
JOIN groups_commands gc
ON g.id = gc.group_id
JOIN commands c
ON gc.command_id = c.id
WHERE b.site_id = 1
GROUP BY g.id, g.name, g.status
ORDER BY g.status ASC

MySQL joins and COUNT(*) from another table

I have two tables: groups and group_members.
The groups table contains all the information for each group, such as its ID, title, description, etc.
The group_members table lists all the members who are a part of each group, like this:
group_id | user_id
1 | 100
2 | 23
2 | 100
9 | 601
Basically, I want to list THREE groups on a page, and I only want to list groups which have MORE than four members. Inside the <?php while ?> loop, I then want to list four members who are a part of that group. I'm having no trouble listing the groups, and listing the members in another internal loop; I just cannot refine the groups so that ONLY those with more than 4 members show.
Does anybody know how to do this? I'm sure it's with MySQL joins.
MySQL has the HAVING clause for this kind of task.
Your query would look like this:
SELECT g.group_id, COUNT(m.user_id) AS members
FROM groups AS g
LEFT JOIN group_members AS m USING(group_id)
GROUP BY g.group_id
HAVING members > 4
Example for when the key columns have different names:
SELECT g.id, COUNT(m.user_id) AS members
FROM groups AS g
LEFT JOIN group_members AS m ON g.id = m.group_id
GROUP BY g.id
HAVING members > 4
Also, make sure that you set indexes inside your database schema for keys you are using in JOINS as it can affect your site performance.
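The whole flow from the question (three groups with more than four members, then four members of a group) might be sketched as follows. Assumptions: SQLite (Python's built-in sqlite3) stands in for MySQL and PHP, the `title` column and all rows are invented, and the groups table is backtick-quoted because GROUPS is a reserved word in MySQL 8.0.2 and later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE `groups` (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE group_members (group_id INTEGER, user_id INTEGER);
    INSERT INTO `groups` VALUES (1, 'chess'), (2, 'hiking'), (9, 'quiet');
""")
conn.executemany(
    "INSERT INTO group_members VALUES (?, ?)",
    [(1, u) for u in range(100, 105)]      # group 1: five members
    + [(2, u) for u in range(200, 204)]    # group 2: exactly four members
    + [(9, 601)],                          # group 9: one member
)

# Up to three groups with MORE than four members.
big_groups = conn.execute("""
    SELECT g.id, g.title, COUNT(m.user_id) AS members
    FROM `groups` AS g
    JOIN group_members AS m ON m.group_id = g.id
    GROUP BY g.id, g.title
    HAVING COUNT(m.user_id) > 4
    ORDER BY g.id
    LIMIT 3
""").fetchall()

# Inner loop step: four members of the first qualifying group.
first_group_id = big_groups[0][0]
four_members = conn.execute(
    "SELECT user_id FROM group_members WHERE group_id = ? ORDER BY user_id LIMIT 4",
    (first_group_id,),
).fetchall()

print(big_groups, four_members)
```

An inner JOIN is enough here, since a group with no members can never pass the HAVING filter anyway.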
SELECT DISTINCT groups.id,
(SELECT COUNT(*) FROM group_members
WHERE group_members.group_id = groups.id) AS memberCount
FROM groups
Your groups_main table has a key column named id. I believe you can only use the USING syntax for the join if the groups_fans table has a key column with the same name, which it probably does not. So instead, try this:
LEFT JOIN groups_fans AS m ON m.group_id = g.id
Or replace group_id with whatever the appropriate column name is in the groups_fans table.
Maybe I am off the mark here and not understanding the OP but why are you joining tables?
If the group_members table has a column named "group_id", you can just run a query on that table to get a count of the members grouped by group_id.
SELECT group_id, COUNT(*) as membercount
FROM group_members
GROUP BY group_id
HAVING membercount > 4
This should have the least overhead simply because you are avoiding a join, but it should still give you what you wanted.
If you want the group details (description, etc.), add a join from the group_members table back to the groups table to retrieve them; that should give you the quickest result.

Using GROUP_CONCAT on subquery in MySQL

I have a MySQL query in which I want to include a list of ID's from another table. On the website, people are able to add certain items, and people can then add those items to their favourites. I basically want to get the list of ID's of people who have favourited that item (this is a bit simplified, but this is what it boils down to).
Basically, I do something like this:
SELECT *,
GROUP_CONCAT((SELECT userid FROM favourites WHERE itemid = items.id) SEPARATOR ',') AS idlist
FROM items
WHERE id = $someid
This way, I would be able to show who favourited some item, by splitting the idlist later on to an array in PHP further on in my code, however I am getting the following MySQL error:
1242 - Subquery returns more than 1 row
I thought that was kind of the point of using GROUP_CONCAT instead of, for example, CONCAT? Am I going about this the wrong way?
Ok, thanks for the answers so far, that seems to work. However, there is a catch. Items are also considered to be a favourite if it was added by that user. So I would need an additional check to check if creator = userid. Can someone help me come up with a smart (and hopefully efficient) way to do this?
Thank you!
Edit: I just tried to do this:
SELECT [...] LEFT JOIN favourites ON (userid = itemid OR creator = userid)
And idlist is empty. Note that if I use INNER JOIN instead of LEFT JOIN I get an empty result. Even though I am sure there are rows that meet the ON requirement.
OP almost got it right. GROUP_CONCAT should be wrapping the columns in the subquery and not the complete subquery (I'm dismissing the separator because comma is the default):
SELECT i.*,
(SELECT GROUP_CONCAT(userid) FROM favourites f WHERE f.itemid = i.id) AS idlist
FROM items i
WHERE i.id = $someid
This will yield the desired result and also means that the accepted answer is partially wrong, because you can access outer scope variables in a subquery.
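A quick way to check that the subquery can see the outer row is sketched below, using SQLite (Python's built-in sqlite3) as a stand-in for MySQL with invented rows. The second query is one possible way to also count the creator as a favouriting user, as the question asks; it assumes items has a `creator` column holding a user id and is an illustration, not the answerer's method. A bound parameter replaces the interpolated $someid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, creator INTEGER);
    CREATE TABLE favourites (userid INTEGER, itemid INTEGER);
    -- invented sample data
    INSERT INTO items VALUES (1, 'lamp', 7), (2, 'desk', 8);
    INSERT INTO favourites VALUES (3, 1), (4, 1), (5, 2);
""")


def split_ids(csv):
    """Turn '3,4' into [3, 4]; a NULL (None) list becomes []."""
    return sorted(int(x) for x in csv.split(",")) if csv else []


some_id = 1
# GROUP_CONCAT inside the correlated subquery, referring to the outer i.id.
name, idlist = conn.execute("""
    SELECT i.name,
           (SELECT GROUP_CONCAT(f.userid) FROM favourites f WHERE f.itemid = i.id) AS idlist
    FROM items i
    WHERE i.id = ?
""", (some_id,)).fetchone()
fans = split_ids(idlist)

# Variant: treat the creator as a favouriting user via UNION (which also dedupes).
(with_creator_csv,) = conn.execute("""
    SELECT (SELECT GROUP_CONCAT(x.userid)
            FROM (SELECT userid, itemid FROM favourites
                  UNION
                  SELECT creator, id FROM items) x
            WHERE x.itemid = i.id)
    FROM items i
    WHERE i.id = ?
""", (some_id,)).fetchone()
fans_and_creator = split_ids(with_creator_csv)

print(name, fans, fans_and_creator)
```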
You can't access variables in the outer scope in such queries (can't use items.id there). You should rather try something like
SELECT
items.name,
items.color,
GROUP_CONCAT(favourites.userid) as idlist
FROM
items
INNER JOIN favourites ON items.id = favourites.itemid
WHERE
items.id = $someid
GROUP BY
items.name,
items.color;
Expand the list of fields as needed (name, color...).
I think you may have the "userid = itemid" wrong, shouldn't it be like this:
SELECT ITEMS.id,GROUP_CONCAT(FAVOURITES.UserId) AS IdList
FROM FAVOURITES
INNER JOIN ITEMS ON (ITEMS.Id = FAVOURITES.ItemId OR FAVOURITES.UserId = ITEMS.Creator)
WHERE ITEMS.Id = $someid
GROUP BY ITEMS.ID
The purpose of GROUP_CONCAT is correct but the subquery is unnecessary and causing the problem. Try this instead:
SELECT ITEMS.id,GROUP_CONCAT(FAVOURITES.UserId)
FROM FAVOURITES INNER JOIN ITEMS ON ITEMS.Id = FAVOURITES.ItemId
WHERE ITEMS.Id = $someid
GROUP BY ITEMS.ID
Yes, soulmerge's solution is ok. But I needed a query where I had to collect data from more child tables, for example:
main table: sessions (presentation sessions) (uid, name, ..)
1st child table: events with key session_id (uid, session_uid, date, time_start, time_end)
2nd child table: accessories_needed (laptop, projector, microphones, etc.) with key session_id (uid, session_uid, accessory_name)
3rd child table: session_presenters (presenter persons) with key session_id (uid, session_uid, presenter_name, address...)
Every session has several rows in the child tables (more time schedules, more accessories).
And I needed to collect them into one result per session, displayed in one row (some of the columns):
session_id | session_name | date | time_start | time_end | accessories | presenters
My solution (after many hours of experiments):
SELECT sessions.uid, sessions.name
,(SELECT GROUP_CONCAT( `events`.date ORDER BY `events`.date SEPARATOR '</li><li>')
FROM `events`
WHERE `events`.session_id = sessions.uid) AS date
,(SELECT GROUP_CONCAT( `events`.time_start ORDER BY `events`.date SEPARATOR '</li><li>')
FROM `events`
WHERE `events`.session_id = sessions.uid) AS time_start
,(SELECT GROUP_CONCAT( `events`.time_end ORDER BY `events`.date SEPARATOR '</li><li>')
FROM `events`
WHERE `events`.session_id = sessions.uid) AS time_end
,(SELECT GROUP_CONCAT( accessories.name ORDER BY accessories.name SEPARATOR '</li><li>')
FROM accessories
WHERE accessories.session_id = sessions.uid) AS accessories
,(SELECT GROUP_CONCAT( presenters.name ORDER BY presenters.name SEPARATOR '</li><li>')
FROM presenters
WHERE presenters.session_id = sessions.uid) AS presenters
FROM sessions
So no JOIN or GROUP BY needed.
Another useful thing to display data friendly (when "echoing" them):
you can wrap the events.date, time_start, time_end, etc in "<UL><LI> ... </LI></UL>" so the "<LI></LI>" used as separator in the query will separate the results in list items.
I hope this helps someone. Cheers!