I am trying to join multiple rows of information into a single row, but the values seem to multiply whenever one of the joins returns more than one row.
My table structure is as follows:
news
id | title | public
------------------------
1 | Test | 0
news_groups_map
id | news_id | members_group_id
------------------------------------
1 | 1 | 5
2 | 1 | 6
members_groups_map
id | member_id | group_id
------------------------------
1 | 750 | 5
2 | 750 | 6
The query I've got so far is:
SELECT
n.title,
n.public,
CAST(GROUP_CONCAT(ngm.members_group_id) AS CHAR(1000)) AS news_groups,
CAST(GROUP_CONCAT(member_groups.group_id) AS CHAR(1000)) AS user_groups
FROM news n
LEFT JOIN news_groups_map ngm ON n.id = ngm.news_id
JOIN (
SELECT group_id
FROM members_groups_map
WHERE member_id = 750
) member_groups
WHERE n.public = 0
GROUP BY n.id
However, the result is as follows:
title | public | news_groups | user_groups
-------------------------------------------------
Test | 0 | 5,6,5,6 | 6,6,5,5
As you can see, news_groups and user_groups are duplicated, so if a news article is in 3 groups, user_groups is multiplied as well and shows something like 5,6,6,6,5,5.
How can I group those groups, so that they are only displayed once?
The ultimate goal here is to compare news_groups and user_groups: if at least one group matches (meaning the user has enough permissions), the query should return a boolean true, and false otherwise. I don't know how to do that either, but I thought I should sort out the grouping first, because as the number of groups grows, a lot of duplicate data will be selected unnecessarily.
Thanks!
The simplest method is to use DISTINCT:
SELECT n.title, n.public,
GROUP_CONCAT(DISTINCT ngm.members_group_id) AS news_groups,
GROUP_CONCAT(DISTINCT mg.group_id) AS user_groups
FROM news n LEFT JOIN
news_groups_map ngm
ON n.id = ngm.news_id CROSS JOIN
(SELECT group_id
FROM members_groups_map
WHERE member_id = 750
) mg
WHERE n.public = 0
GROUP BY n.id;
This query doesn't actually make sense. First, the subquery is not needed:
SELECT n.title, n.public,
GROUP_CONCAT(DISTINCT ngm.members_group_id) AS news_groups,
GROUP_CONCAT(DISTINCT mg.group_id) AS user_groups
FROM news n LEFT JOIN
news_groups_map ngm
ON n.id = ngm.news_id CROSS JOIN
members_groups_map mg
ON member_id = 750
WHERE n.public = 0
GROUP BY n.id;
Second, the CROSS JOIN (or equivalently, JOIN without an ON clause) doesn't make sense. Normally, I would expect a join condition to one of the other tables.
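For the ultimate goal stated in the question (a true/false flag for whether the member shares at least one group with the news item), one option is to skip GROUP_CONCAT and use a correlated EXISTS with a real join condition between the two map tables. This is only a sketch: it runs the SQL through Python's sqlite3 module as a stand-in for MySQL, assumes both group rows belong to news item 1, and the second news row ('Other') and group 7 are made-up data for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, public INTEGER);
CREATE TABLE news_groups_map (id INTEGER PRIMARY KEY, news_id INTEGER, members_group_id INTEGER);
CREATE TABLE members_groups_map (id INTEGER PRIMARY KEY, member_id INTEGER, group_id INTEGER);
-- 'Other' and group 7 are hypothetical rows added to show a non-matching case.
INSERT INTO news VALUES (1, 'Test', 0), (2, 'Other', 0);
INSERT INTO news_groups_map VALUES (1, 1, 5), (2, 1, 6), (3, 2, 7);
INSERT INTO members_groups_map VALUES (1, 750, 5), (2, 750, 6);
""")

# EXISTS yields 1 when the member belongs to at least one of the
# news item's groups, and 0 otherwise. No row multiplication occurs
# because the subquery never adds rows to the outer result.
rows = conn.execute("""
SELECT n.title,
       EXISTS (SELECT 1
               FROM news_groups_map ngm
               JOIN members_groups_map mg
                 ON mg.group_id = ngm.members_group_id
               WHERE ngm.news_id = n.id
                 AND mg.member_id = ?) AS has_access
FROM news n
WHERE n.public = 0
ORDER BY n.id
""", (750,)).fetchall()

print(rows)
```

The same EXISTS subquery should work unchanged in MySQL, where it also evaluates to 1 or 0.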
Use DISTINCT in the GROUP_CONCAT:
...
CAST(GROUP_CONCAT(DISTINCT ngm.members_group_id) AS CHAR(1000)) AS news_groups,
CAST(GROUP_CONCAT(DISTINCT member_groups.group_id) AS CHAR(1000)) AS user_groups
...
I have a situation where I want to join multiple SQL tables and get back one row per record in the base table as well as GROUP_CONCAT the other table data together with |. Unfortunately, with the query method I'm currently using, I'm getting back undesired multiplicity in the GROUP_CONCAT data and I don't know how to solve it.
I have the following basic DB structure:
things
id | name
1 | Some Thing
2 | Some Other Thing
items
id | name
1 | Blob
2 | Starfish
3 | Wrench
4 | Stereo
users
id | name
1 | Alice
2 | Bill
3 | Charlie
4 | Daisy
things_items
thing_id | item_id
1 | 1
1 | 2
2 | 3
2 | 4
things_users
thing_id | user_id
1 | 1
1 | 2
1 | 3
2 | 4
And I would ideally like to write a query that gets back the following for the Some Thing row in the things table:
Some Thing | Blob|Starfish | Alice|Bill|Charlie
However, what I'm getting back is the following:
Some Thing | Blob|Blob|Blob|Starfish|Starfish|Starfish | Alice|Alice|Bill|Bill|Charlie|Charlie
And this is the query I'm using:
SELECT things.name,
GROUP_CONCAT(items.name SEPARATOR '|'),
GROUP_CONCAT(users.name SEPARATOR '|')
FROM things
JOIN things_items ON things.id = things_items.thing_id
JOIN items ON things_items.item_id = items.id
JOIN things_users ON things.id = things_users.thing_id
JOIN users ON things_users.user_id = users.id
GROUP BY things.id;
How should I change the query to get the data back the way I'd like to and avoid the multiplying of the GROUP_CONCAT data? Thank you.
You are concatenating along two separate dimensions. The simplest solution is DISTINCT:
SELECT t.name,
GROUP_CONCAT(DISTINCT i.name SEPARATOR '|'),
GROUP_CONCAT(DISTINCT u.name SEPARATOR '|')
FROM things t JOIN
things_items ti
ON t.id = ti.thing_id JOIN
items i
ON ti.item_id = i.id JOIN
things_users tu
ON t.id = tu.thing_id JOIN
users u
ON tu.user_id = u.id
GROUP BY t.id;
Note the above filters out things that have either no items or no users.
The above will work fine if there are a handful of items and users for each thing. As the numbers grow, the performance gets worse because it generates a Cartesian product for each thing.
That can be solved by aggregating before joining:
SELECT t.name, i.items, u.users
FROM things t JOIN
(SELECT ti.thing_id, GROUP_CONCAT(i.name SEPARATOR '|') as items
FROM things_items ti JOIN
items i
ON ti.item_id = i.id
GROUP BY ti.thing_id
) i
ON t.id = i.thing_id JOIN
(SELECT tu.thing_id, GROUP_CONCAT(u.name SEPARATOR '|') as users
FROM things_users tu JOIN
users u
ON tu.user_id = u.id
GROUP BY tu.thing_id
) u
ON t.id = u.thing_id;
You can replace the outer JOINs with LEFT JOINs if you want all things, even those with no items or users.
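Here is a small runnable sketch of the aggregate-before-join idea, using the sample data from the question. It uses Python's sqlite3 module as a stand-in for MySQL, so the separator is passed as the second argument of group_concat instead of with SEPARATOR; the derived-table aliases agg_i and agg_u are names I made up. SQLite doesn't guarantee the order of concatenated values, so the lists are sorted before printing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE things (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE things_items (thing_id INTEGER, item_id INTEGER);
CREATE TABLE things_users (thing_id INTEGER, user_id INTEGER);
INSERT INTO things VALUES (1, 'Some Thing'), (2, 'Some Other Thing');
INSERT INTO items VALUES (1, 'Blob'), (2, 'Starfish'), (3, 'Wrench'), (4, 'Stereo');
INSERT INTO users VALUES (1, 'Alice'), (2, 'Bill'), (3, 'Charlie'), (4, 'Daisy');
INSERT INTO things_items VALUES (1, 1), (1, 2), (2, 3), (2, 4);
INSERT INTO things_users VALUES (1, 1), (1, 2), (1, 3), (2, 4);
""")

# The plain join pairs every item of a thing with every user of that
# thing: 2 items x 3 users = 6 rows for thing 1.
joined_rows = conn.execute("""
SELECT COUNT(*)
FROM things t
JOIN things_items ti ON ti.thing_id = t.id
JOIN things_users tu ON tu.thing_id = t.id
WHERE t.id = 1
""").fetchone()[0]

# Aggregating each dimension in its own subquery first means the outer
# join sees at most one row per thing from each side.
name, items, users = conn.execute("""
SELECT t.name, agg_i.items, agg_u.users
FROM things t
JOIN (SELECT ti.thing_id, group_concat(i.name, '|') AS items
      FROM things_items ti
      JOIN items i ON i.id = ti.item_id
      GROUP BY ti.thing_id) agg_i ON agg_i.thing_id = t.id
JOIN (SELECT tu.thing_id, group_concat(u.name, '|') AS users
      FROM things_users tu
      JOIN users u ON u.id = tu.user_id
      GROUP BY tu.thing_id) agg_u ON agg_u.thing_id = t.id
WHERE t.id = 1
""").fetchone()

print(joined_rows, name, sorted(items.split("|")), sorted(users.split("|")))
```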
I have the following tables:
jobs:
-------------------------------------------------------
| id | title | slug |
-------------------------------------------------------
employments:
-------------------------------------------------------
| id | job_type|
-------------------------------------------------------
applications:
-------------------------------------------------------
| id | job_opening_id| application_state_id|
-------------------------------------------------------
application_states
-------------------------------------------------------
| id | name|
-------------------------------------------------------
I want to create a query that counts the applications for each job by application_state_id, with output like this:
----------------------------------------------------------------------------
| j.title| j.slug| e.job_type | candidates | hired
----------------------------------------------------------------------------
This is the query that I have at the moment:
SELECT
j.title,
j.slug,
e.job_type,
count(a1.application_state_id) as candidates,
count(a2.application_state_id) as hired
FROM
jobs AS j
INNER JOIN employments AS e ON j.employment_id = e.id
LEFT JOIN applications AS a1 ON a1.job_opening_id = j.id
LEFT JOIN application_states AS ast ON ast.id = a1.application_state_id
LEFT JOIN applications AS a2 ON a2.job_opening_id = j.id AND a2.application_state_id = 1
GROUP BY
a1.application_state_id,
a2.application_state_id,
j.id,
j.title,
j.slug
I thought I could create 2 joins and set the application_state_id on the second one, but all that does is count records twice.
What do I need to change in this query? I hope someone can help me.
You did not provide sample data, but as I see from your code you are joining the applications table twice: once to get the total number of candidates and once to get the number of hired candidates.
Because both joins match on the same job, every candidate row is paired with every hired row, which inflates both counts.
I think you can drop the 2nd join and use conditional counting to get the number of hired candidates.
Also:
every non-aggregated column in the SELECT list should appear in the GROUP BY clause (so group by e.job_type rather than by the application_state_id columns),
and I don't see why you need to join to the application_states table.
Try this:
SELECT
j.title,
j.slug,
e.job_type,
count(a.application_state_id) as candidates,
sum(case when a.application_state_id = 1 then 1 else 0 end) as hired
FROM
jobs AS j INNER JOIN employments AS e ON j.employment_id = e.id
LEFT JOIN applications AS a ON a.job_opening_id = j.id
GROUP BY
j.title,
j.slug,
e.job_type
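To see the difference between the double join and conditional counting, here is a sketch with made-up data (two jobs, three applications, state 1 meaning hired). It runs through Python's sqlite3 module as a stand-in for MySQL and leaves out the employments table to keep it short; the job titles and rows are assumptions, not data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE applications (id INTEGER PRIMARY KEY,
                           job_opening_id INTEGER,
                           application_state_id INTEGER);
-- Hypothetical data: job 1 has 3 applications, 2 of them hired (state 1);
-- job 2 has no applications at all.
INSERT INTO jobs VALUES (1, 'Dev'), (2, 'Ops');
INSERT INTO applications VALUES (1, 1, 1), (2, 1, 1), (3, 1, 2);
""")

# Joining applications twice pairs each of the 3 candidate rows with
# each of the 2 hired rows, so both counts come out as 6.
double_join = conn.execute("""
SELECT j.title,
       COUNT(a1.application_state_id) AS candidates,
       COUNT(a2.application_state_id) AS hired
FROM jobs j
LEFT JOIN applications a1 ON a1.job_opening_id = j.id
LEFT JOIN applications a2 ON a2.job_opening_id = j.id
                         AND a2.application_state_id = 1
GROUP BY j.id, j.title
ORDER BY j.id
""").fetchall()

# A single join with a CASE inside SUM counts each application once.
conditional = conn.execute("""
SELECT j.title,
       COUNT(a.application_state_id) AS candidates,
       SUM(CASE WHEN a.application_state_id = 1 THEN 1 ELSE 0 END) AS hired
FROM jobs j
LEFT JOIN applications a ON a.job_opening_id = j.id
GROUP BY j.id, j.title
ORDER BY j.id
""").fetchall()

print(double_join)
print(conditional)
```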
I have three tables, movies, category and relationship, as shown below.
movies--
-----------------------
id|name|duration|
1 |x |5 mins |
2 |y |10 mins |
----------------------
category--
-----------------------
id|type |value |
1 |genre |action |
2 |language|english |
3 |genre |thriller |
4 |language|spanish |
------------------------
relationship--
id| movie_id|category_id|
1 |1 | 2 |
2 |1 | 3 |
------------------------------
I want a query that fetches both the genre and the language for a movie in a single query.
Below is the expected output.
name|duration|language|genre |
x |5 mins |english |thriller|
--------------------------------
In short, I want to use the type column twice.
Please help.
You need a MySQL pivot query for that, i.e. one that turns row values into columns. The following query will produce what you want:
SELECT
m.name,
m.duration,
MAX(IF(c.type = 'language', c.value, NULL)) AS language,
MAX(IF(c.type = 'genre', c.value, NULL)) AS genre
FROM movies AS m
INNER JOIN relationship AS r ON m.id = r.movie_id
INNER JOIN category AS c ON r.category_id = c.id
WHERE m.name = 'x'
GROUP BY m.id, m.name, m.duration;
That will produce:
| name | duration | language | genre |
| x | 5 mins | english | thriller |
Step 1: Join all three tables together. This gives you all the category information for each movie.
Step 2: Select what you want from the combined result.
Step 3: Use two subqueries to pick out the language and the genre.
Step 4: Add a LIMIT 1 to avoid redundant rows.
The final query might be something like this:
WITH t AS (
SELECT m.name, m.duration, c.type, c.value
FROM movies AS m
JOIN relationship AS r ON m.id = r.movie_id
JOIN category AS c ON r.category_id = c.id
)
SELECT name, duration,
(SELECT value FROM t WHERE type = 'language' AND name = 'x' LIMIT 1) AS language,
(SELECT value FROM t WHERE type = 'genre' AND name = 'x' LIMIT 1) AS genre
FROM t
WHERE name = 'x'
LIMIT 1;
Note:
Replace the name = 'x' conditions with your own WHERE clause.
The WITH clause (a common table expression) requires MySQL 8.0 or later. I have not tested this query, so adjust it as needed.
One method joins relationship and category twice, once for each type:
select m.*, cl.value as language, cg.value as genre
from movies m join
relationship rl
on rl.movie_id = m.id join
category cl
on cl.id = rl.category_id and cl.type = 'language' join
relationship rg
on rg.movie_id = m.id join
category cg
on cg.id = rg.category_id and cg.type = 'genre';
Note that movies typically have only one language, but they can have multiple genres. If this is the case you will get a separate row for each genre.
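To illustrate the point about multiple genres, here is a sketch that joins relationship and category once per type and gives movie x a second genre. It uses Python's sqlite3 module as a stand-in for MySQL; the extra relationship row (movie 1, category 1) is an assumption added for the demonstration and is not in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (id INTEGER PRIMARY KEY, name TEXT, duration TEXT);
CREATE TABLE category (id INTEGER PRIMARY KEY, type TEXT, value TEXT);
CREATE TABLE relationship (id INTEGER PRIMARY KEY, movie_id INTEGER, category_id INTEGER);
INSERT INTO movies VALUES (1, 'x', '5 mins'), (2, 'y', '10 mins');
INSERT INTO category VALUES (1, 'genre', 'action'), (2, 'language', 'english'),
                            (3, 'genre', 'thriller'), (4, 'language', 'spanish');
-- Row 3 is hypothetical: it gives movie x a second genre.
INSERT INTO relationship VALUES (1, 1, 2), (2, 1, 3), (3, 1, 1);
""")

# One relationship/category pair per type. Inner joins mean a movie
# missing either a language or a genre is left out of the result.
movie_rows = conn.execute("""
SELECT m.name, m.duration, cl.value AS language, cg.value AS genre
FROM movies m
JOIN relationship rl ON rl.movie_id = m.id
JOIN category cl ON cl.id = rl.category_id AND cl.type = 'language'
JOIN relationship rg ON rg.movie_id = m.id
JOIN category cg ON cg.id = rg.category_id AND cg.type = 'genre'
ORDER BY cg.value
""").fetchall()

print(movie_rows)
```

With one language and two genres, movie x comes back as two rows, one per genre. If a single row per movie is needed, the genres would have to be aggregated, for example with GROUP_CONCAT.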
For transaction listing I need to provide the following columns:
log_out.timestamp
items.description
log_out.qty
category.name
storage.name
log_out.dnr ( Representing the users id )
Table structure from log_out looks like this:
| id | timestamp | storageid | itemid | qty | categoryid | dnr |
| | | | | | | |
| 1 | ........ | 2 | 23 | 3 | 999 | 123 |
As one could guess, I only store the corresponding IDs from other tables in this table. Note: log_out.id is the primary key in this table.
To get the corresponding strings, ints or whatever back, I tried two queries.
Approach 1
SELECT i.description, c.name, s.name as sname, l.*
FROM items i, categories c, storages s, log_out l
WHERE l.itemid = i.id AND l.storageid = s.id AND l.categoryid = c.id
ORDER BY l.id DESC
Approach 2
SELECT log_out.id, items.description, storages.name, categories.name AS cat, timestamp, dnr, qty
FROM log_out
INNER JOIN items ON log_out.itemid = items.id
INNER JOIN storages ON log_out.storageid = storages.id
INNER JOIN categories ON log_out.categoryid = categories.id
ORDER BY log_out.id DESC
They both work fine on my development machine, which has approx. 99 dummy transactions stored in log_out. The DB on the main server has 1100+ transactions stored in the table, and that's where the trouble begins. No matter which of these two approaches I run on the main machine, it always returns 0 rows without any error *sigh*.
At first I thought it was because the main machine uses MariaDB instead of MySQL. But after I imported the remote's log_out table onto my dev machine, it does the same as the main machine: it returns 0 rows without an error.
Do you have any idea what's going on?
If the table has the data then it probably has something to do with JOIN and related records in corresponding tables. I would start with log_out table and incrementally add the other tables in the JOIN, e.g.:
SELECT *
FROM log_out;
SELECT *
FROM log_out
INNER JOIN items ON log_out.itemid = items.id;
SELECT *
FROM log_out
INNER JOIN items ON log_out.itemid = items.id
INNER JOIN storages ON log_out.storageid = storages.id;
SELECT *
FROM log_out
INNER JOIN items ON log_out.itemid = items.id
INNER JOIN storages ON log_out.storageid = storages.id
INNER JOIN categories ON log_out.categoryid = categories.id;
I would execute the queries one by one and see which one is the first to return 0 records. The join added in that query is the one with the data discrepancy.
Your queries look fine to me, which makes me think that it is probably something unexpected in the data. Most likely the ids in your joins are not maintained properly (do all of them have a foreign key constraint?). I would dig around in the data with queries like SELECT COUNT(*) FROM items WHERE id IN (SELECT itemid FROM log_out), and see whether the results make sense. Sorry I can't offer more advice, but I would be interested in hearing whether the problem is in the data itself.
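One way to dig around in the data is to count, for each lookup table, the log_out rows whose id has no match there (a LEFT JOIN with an IS NULL filter). The sketch below runs through Python's sqlite3 module as a stand-in for MySQL/MariaDB. All rows are made up and the timestamp column is left out; I'm assuming here that categoryid 999 has no row in categories, which is just one possible cause and not something the question confirms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE storages (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE log_out (id INTEGER PRIMARY KEY, storageid INTEGER, itemid INTEGER,
                      qty INTEGER, categoryid INTEGER, dnr INTEGER);
-- Hypothetical data: categories has no row with id 999.
INSERT INTO items VALUES (23, 'Widget');
INSERT INTO storages VALUES (2, 'Main');
INSERT INTO categories VALUES (1, 'Tools');
INSERT INTO log_out VALUES (1, 2, 23, 3, 999, 123), (2, 2, 23, 1, 999, 124);
""")

# The full inner join returns nothing, without any error.
joined = conn.execute("""
SELECT COUNT(*)
FROM log_out l
JOIN items i ON i.id = l.itemid
JOIN storages s ON s.id = l.storageid
JOIN categories c ON c.id = l.categoryid
""").fetchone()[0]

# For each lookup table, count log_out rows with no matching id.
# Table and column names come from this fixed dict, not from user input.
checks = {"items": "itemid", "storages": "storageid", "categories": "categoryid"}
orphans = {}
for table, column in checks.items():
    sql = f"""
    SELECT COUNT(*)
    FROM log_out l
    LEFT JOIN {table} x ON x.id = l.{column}
    WHERE x.id IS NULL
    """
    orphans[table] = conn.execute(sql).fetchone()[0]

print(joined, orphans)
```

Any table with a non-zero orphan count is the join that drops rows; if one of them equals the total number of log_out rows, the inner join returns nothing.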
Hey, I'm trying to select a row from one table that has two matching entries in another.
The structure is as follows:
----------------- ---------------------
| messagegroups | | user_messagegroup |
| | | |
| - id | | - id |
| - status | | - user_id |
| | | - messagegroup_id |
----------------- | |
---------------------
There are two rows in user_messagegroup with the ids of two users, both with the same messagegroup_id.
I would like to select the messagegroup that contains both of these users.
I can't figure it out, so I would appreciate some help ;)
The specification you provide isn't very clear.
You say "with the ids of two users"... if we take that to mean you have two user_id values you want to supply in the query, then here is one way to find the messagegroups that contain these two specific users:
SELECT g.id
, g.status
FROM messagegroups g
JOIN ( SELECT u.messagegroup_id
FROM user_messagegroup u
WHERE u.user_id IN (42, 11)
GROUP BY u.messagegroup_id
HAVING COUNT(DISTINCT u.user_id) = 2
) c
ON c.messagegroup_id = g.id
The returned messagegroups could also contain other users, besides the two that were specified.
If you want to return messagegroups that contain ONLY these two users, and no other users:
SELECT g.id
, g.status
FROM messagegroups g
JOIN ( SELECT u.messagegroup_id
FROM user_messagegroup u
WHERE u.user_id IS NOT NULL
GROUP BY u.messagegroup_id
HAVING COUNT(DISTINCT IF(u.user_id IN (42,11),u.user_id,NULL)) = 2
AND COUNT(DISTINCT u.user_id) = 2
) c
ON c.messagegroup_id = g.id
For improved performance, you'll want suitable indexes on the tables, and it may be possible to rewrite these to eliminate the inline view.
Also, if you only need the messagegroup_id value, you could get that from just the inline view query, without the need for the outer query and the join operation to the messagegroups table.
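As a quick check of both variants, here is a sketch that runs just the inline view part against made-up data, using Python's sqlite3 module as a stand-in for MySQL. SQLite has no IF() in older versions, so the second query uses an equivalent CASE expression; the group memberships (including user 7) are assumptions for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_messagegroup (id INTEGER PRIMARY KEY,
                                user_id INTEGER,
                                messagegroup_id INTEGER);
-- Hypothetical data: group 1 = {42, 11}, group 2 = {42, 11, 7}, group 3 = {42}.
INSERT INTO user_messagegroup (user_id, messagegroup_id) VALUES
  (42, 1), (11, 1),
  (42, 2), (11, 2), (7, 2),
  (42, 3);
""")

wanted = (42, 11)

# Groups that contain both users (other members are allowed).
containing = [row[0] for row in conn.execute("""
SELECT messagegroup_id
FROM user_messagegroup
WHERE user_id IN (?, ?)
GROUP BY messagegroup_id
HAVING COUNT(DISTINCT user_id) = 2
ORDER BY messagegroup_id
""", wanted)]

# Groups that contain exactly these two users and nobody else.
exactly = [row[0] for row in conn.execute("""
SELECT messagegroup_id
FROM user_messagegroup
GROUP BY messagegroup_id
HAVING COUNT(DISTINCT CASE WHEN user_id IN (?, ?) THEN user_id END) = 2
   AND COUNT(DISTINCT user_id) = 2
ORDER BY messagegroup_id
""", wanted)]

print(containing, exactly)
```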