How to "remove duplicates" from a UNION query - mysql

I have two tables in MySQL: One called gtfsws_users which contains users for a system I'm developing and another called gtfsws_repository_users which contains roles for these users.
gtfsws_users has these fields: email, password, name, is_admin and enabled.
gtfsws_repository_users has: user_email, repository_id and role.
The role is an integer that defines privileges over a GTFS repository (public transportation data, not relevant for my problem).
One important thing is that every administrator account (that is, every user that has the is_admin flag set to 1 in gtfsws_users) has full access to all repositories.
Otherwise, only users registered in gtfsws_repository_users have access to a specific repository defined there. One user can have access to multiple repositories.
What I'm trying to do is to get all users with access to a specific repository (it doesn't matter which type of role the user has, I just want to know if they can access the repository or not). So I'm writing this SQL statement for getting them:
(
SELECT DISTINCT
gtfsws_users.email AS email,
gtfsws_users.name AS name,
gtfsws_users.is_admin AS is_admin,
gtfsws_users.enabled AS enabled,
gtfsws_repository_users.role AS role
FROM
gtfsws_users
INNER JOIN
gtfsws_repository_users
ON
gtfsws_users.email = gtfsws_repository_users.user_email
WHERE
gtfsws_repository_users.repository_id = '2'
)
UNION
(
SELECT
email,
name,
is_admin,
enabled,
null AS role
FROM
gtfsws_users
WHERE
is_admin = 1
)
Now, this works fine for users with access to different repositories. It also returns all administrators.
The problem arises when an administrator is also registered in gtfsws_repository_users, because that user then appears twice.
For example, if I have this in gtfsws_users:
('test@test.com', '*****', 'Real name', 1, 1)
And the same user is registered in gtfsws_repository_users like this:
('test@test.com', 2, 10)
When I do the SELECT in MySQL (using the UNION to add all administrators) I get:
('test@test.com', 'Real name', 1, 1, 10)
('test@test.com', 'Real name', 1, 1, NULL)
What I need is to filter that result to remove the duplicate, that is, to get only:
('test@test.com', 'Real name', 1, 1, NULL)
Yes, getting NULL as the role (since it will be ignored as the user is an administrator).
Does anybody have a clue on how to achieve that?
Thanks a lot.
EDIT: Ok, thanks to Katrin's suggestion, I'm getting some progress. I do get one row, but it's the one with the role number defined. Any way to preserve the one with the NULL role instead of the defined one?

Since aggregate functions ignore null values, you can convert null to a number that can be extracted as min or max.
Assuming all your roles are greater than 0:
SELECT email, name, is_admin, enabled, nullif(min(coalesce(role, 0)), 0) as role
from
((
SELECT DISTINCT
gtfsws_users.email AS email,
gtfsws_users.name AS name,
gtfsws_users.is_admin AS is_admin,
gtfsws_users.enabled AS enabled,
gtfsws_repository_users.role AS role
FROM
gtfsws_users
INNER JOIN
gtfsws_repository_users
ON
gtfsws_users.email = gtfsws_repository_users.user_email
WHERE
gtfsws_repository_users.repository_id = '2'
)
UNION
(
SELECT
email,
name,
is_admin,
enabled,
null AS role
FROM
gtfsws_users
WHERE
is_admin = 1
)) as Q
GROUP BY email, name, is_admin, enabled
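As a quick check of the COALESCE/MIN/NULLIF approach, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite stands in for MySQL, and the two sample users and their role values are made up for illustration. It assumes, as the answer does, that every real role is greater than 0.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gtfsws_users (email TEXT, password TEXT, name TEXT, is_admin INTEGER, enabled INTEGER);
CREATE TABLE gtfsws_repository_users (user_email TEXT, repository_id INTEGER, role INTEGER);
-- Hypothetical sample data: one admin who is also registered, one regular user.
INSERT INTO gtfsws_users VALUES
  ('admin@example.com', 'x', 'Admin', 1, 1),
  ('user@example.com', 'x', 'User', 0, 1);
INSERT INTO gtfsws_repository_users VALUES
  ('admin@example.com', 2, 10),
  ('user@example.com', 2, 5);
""")

rows = conn.execute("""
SELECT email, name, is_admin, enabled,
       -- Map NULL to 0 so MIN can see it, then map 0 back to NULL.
       NULLIF(MIN(COALESCE(role, 0)), 0) AS role
FROM (
    SELECT u.email AS email, u.name AS name, u.is_admin AS is_admin,
           u.enabled AS enabled, r.role AS role
    FROM gtfsws_users u
    INNER JOIN gtfsws_repository_users r ON u.email = r.user_email
    WHERE r.repository_id = 2
    UNION
    SELECT email, name, is_admin, enabled, NULL AS role
    FROM gtfsws_users
    WHERE is_admin = 1
) AS q
GROUP BY email, name, is_admin, enabled
ORDER BY email
""").fetchall()

for row in rows:
    print(row)
```

The admin collapses to a single row with a NULL role, while the regular user keeps the stored role. If 0 could ever be a valid role, a different sentinel (for example -1) would be needed.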

Related

MySQL Query - pass argument into same query

I would like to combine all issues that have the same ID and show who is affected, without listing every affected host on its own line for the same issue.
SELECT
plugin_id as 'Plugin id',
cve as CVE,
cvss3_base_score as cvss3_base_score,
risk as Risk,
COUNT(DISTINCT(Host)) as 'Affected Unique Hosts',
Name,
plugin_family as 'Plugin Family',
last_seen as 'Last Seen',
vulnerability_state as 'Vulnerability State'
from tenable_network
GROUP BY plugin_id
ORDER BY plugin_id DESC
Fetch Assets Query
============
SELECT
role_id,
Host,
Name,
Synopsis,
Port,
Protocol,
OS,
FQDN
FROM tenable_network
WHERE plugin_id='90317'
GROUP BY Host
As you can see, I can group all roles that have plugin_id 90317 (issue)
Is there a simple way to pass the argument within the query?
Thanks

Finding friends of a user

I have a user table with these fields: id, firstname, name.
I have a table called friend with these fields: invite_id, friend_sender (id of a user), friend_receiver (id of a user), validity (boolean).
I'm filling the friend table with
1, 1, 2, 0;
2, 3, 1, 1;
3, 1, 5, 1;
Let's imagine I'm user 1, and I want to find all my friends. I can be the one who sent the friend invitation (sender), or the one who received it (receiver). When the receiver accepts the invitation, the validity of the relation is set to 1. So, for example, I'm not friends with user 2 because he hasn't accepted.
The result I should get from running the query for user 1 is:
3, firstnameofuser3, nameofuser3
5, firstnameofuser5, nameofuser5
I tried several SQL approaches, such as a double JOIN and aliasing the table to avoid the "same table twice" problem, but I couldn't figure it out.
I've found some posts about it, but they cover more complex cases, like this one:
Finding mutual friend sql
Thank you in advance for your help.
I know there are already answers, but mine is unique AND I have a fiddle! ;)
SELECT
id,
firstname,
name
FROM
user
WHERE id IN
(
SELECT
CASE WHEN friend_sender = 1 THEN friend_receiver ELSE friend_sender END
FROM friend
WHERE
(friend_sender = 1 OR friend_receiver = 1)
AND
validity = 1
)
Fiddle: http://sqlfiddle.com/#!9/d8f55a/1
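For readers without access to the fiddle, the CASE-based lookup can be reproduced locally. This is a minimal sketch using Python's sqlite3 module in place of MySQL. The friend rows are the ones from the question; the user names ('f1', 'n1', ...) and the helper friends_of are placeholders invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, firstname TEXT, name TEXT);
CREATE TABLE friend (invite_id INTEGER PRIMARY KEY, friend_sender INTEGER,
                     friend_receiver INTEGER, validity INTEGER);
-- Placeholder names; the friend rows match the question.
INSERT INTO user VALUES (1,'f1','n1'),(2,'f2','n2'),(3,'f3','n3'),(5,'f5','n5');
INSERT INTO friend VALUES (1,1,2,0),(2,3,1,1),(3,1,5,1);
""")

def friends_of(user_id):
    """Return (id, firstname, name) for every accepted friend of user_id."""
    sql = """
    SELECT id, firstname, name
    FROM user
    WHERE id IN (
        -- Pick whichever side of the relation is NOT the user we asked about.
        SELECT CASE WHEN friend_sender = ? THEN friend_receiver ELSE friend_sender END
        FROM friend
        WHERE (friend_sender = ? OR friend_receiver = ?) AND validity = 1
    )
    ORDER BY id
    """
    return conn.execute(sql, (user_id, user_id, user_id)).fetchall()

friends = friends_of(1)
print(friends)
```

Binding the user ID as a parameter instead of hard-coding 1 keeps the same query usable for any user. User 2 gets an empty list here because the only invitation involving them was never accepted.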
Try this:
SELECT u.*
FROM user u
WHERE u.id IN (
SELECT f.friend_sender
FROM friend f
WHERE f.friend_receiver = 2 AND f.validity = 1 -- 2 is the hard-coded ID for Jin Jey
UNION
SELECT f.friend_receiver
FROM friend f
WHERE f.friend_sender = 2 AND f.validity = 1)
I used UNION because it lets you query two sets of data and merge them into one.
I hard-coded the ID (2) because your request asks for all friends of Jin Jey.
You can try the below query
select
id,
firstname,
name
from
user inner join friend
on
(friend_sender=1 or friend_receiver=1 )and validity=1
and
user.id=
(case
when friend_sender=1 then friend_receiver
else friend_sender
end)

Django querysets: Excluding NULL values across multiple joins

I'm trying to avoid using extra() here, but haven't found a way to get the results I want using Django's other queryset methods.
My model relationships are as follows:
Model: Enrollment
FK to Course
FK to User
FK to Mentor (can be NULL)
Model: Course
FK to CourseType
In a single query: given a User, I'm trying to get all of the CourseTypes they have access to. A User has access to a CourseType if they have an Enrollment with both a Course of that CourseType AND an existing Mentor.
This user has 2 Enrollments: one in a Course for CourseType ID 6, and the other for a Course for CourseType ID 7, but her enrollment for CourseType ID 7 does not have a mentor, so she does not have access to CourseType ID 7.
user = User.objects.get(pk=123)
This works fine: Get all of the CourseTypes that the user has enrollments for, but don't (yet) query for the mentor requirement:
In [28]: CourseType.objects.filter(course__enrollment__user=user).values('pk')
Out[28]: [{'pk': 6L}, {'pk': 7L}]
This does not give me the result I want: Excluding enrollments with NULL mentor values. I want it to return only ID 6 since that is the only enrollment with a mentor, but it returns an empty queryset:
In [29]: CourseType.objects.filter(course__enrollment__user=user).exclude(course__enrollment__mentor=None).values('pk')
Out[29]: []
Here's the generated SQL for the last queryset that isn't returning what I want it to:
SELECT `courses_coursetype`.`id`
FROM `courses_coursetype`
INNER JOIN `courses_course` ON ( `courses_coursetype`.`id` = `courses_course`.`course_type_id` )
INNER JOIN `store_enrollment` ON ( `courses_course`.`id` = `store_enrollment`.`course_id` )
WHERE (`store_enrollment`.`user_id` = 3877
AND NOT (`courses_coursetype`.`id` IN (
SELECT U0.`id` AS `id`
FROM `courses_coursetype` U0
LEFT OUTER JOIN `courses_course` U1 ON ( U0.`id` = U1.`course_type_id` )
LEFT OUTER JOIN `store_enrollment` U2 ON ( U1.`id` = U2.`course_id` )
WHERE U2.`mentor_id` IS NULL)))
The problem, it seems, is that in implementing the exclude(), Django is creating a subquery which is excluding more rows than I want excluded.
To get the desired results, I had to use extra() to explicitly exclude NULL Mentor values in the WHERE clause:
In [36]: CourseType.objects.filter(course__enrollment__user=user).extra(where=['store_enrollment.mentor_id IS NOT NULL']).values('pk')
Out[36]: [{'pk': 6L}]
Is there a way to get this result without using extra()? If not, should I file a ticket with Django per the docs? I looked at the existing tickets and searched for this issue but unfortunately came up short.
I'm using Django 1.7.10 with MySQL.
Thanks!
Try using isnull.
CourseType.objects.filter(
course__enrollment__user=user,
course__enrollment__mentor__isnull=False,
).values('pk')
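To illustrate why the two forms behave differently, here is a rough sketch of the two SQL shapes, run against an in-memory SQLite database from Python rather than through Django. The table names are shortened, and the sample rows (including a second user with an unmentored enrollment in course type 6) are assumptions chosen to reproduce the empty result. The first query mimics the NOT IN subquery from the generated SQL in the question; the second applies IS NOT NULL to the same joined enrollment row, which is how conditions placed in a single filter() call are expected to behave.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE coursetype (id INTEGER PRIMARY KEY);
CREATE TABLE course (id INTEGER PRIMARY KEY, course_type_id INTEGER);
CREATE TABLE enrollment (id INTEGER PRIMARY KEY, course_id INTEGER,
                         user_id INTEGER, mentor_id INTEGER);
INSERT INTO coursetype VALUES (6), (7);
INSERT INTO course VALUES (60, 6), (70, 7);
-- User 1: mentored in type 6, unmentored in type 7.
-- User 2 (hypothetical): unmentored in type 6.
INSERT INTO enrollment VALUES (1, 60, 1, 900), (2, 70, 1, NULL), (3, 60, 2, NULL);
""")

base = """
SELECT DISTINCT ct.id
FROM coursetype ct
INNER JOIN course c ON ct.id = c.course_type_id
INNER JOIN enrollment e ON c.id = e.course_id
WHERE e.user_id = 1
"""

# Shape of the exclude() query: drops every course type that has ANY
# unmentored enrollment, regardless of which user it belongs to.
exclude_style = conn.execute(base + """
AND ct.id NOT IN (
    SELECT u0.id
    FROM coursetype u0
    LEFT OUTER JOIN course u1 ON u0.id = u1.course_type_id
    LEFT OUTER JOIN enrollment u2 ON u1.id = u2.course_id
    WHERE u2.mentor_id IS NULL
)
""").fetchall()

# Shape of the single filter() call: the mentor test applies to the same
# enrollment row that matched the user.
filter_style = conn.execute(base + "AND e.mentor_id IS NOT NULL").fetchall()

print(exclude_style, filter_style)
```

With this data the subquery form returns nothing, because another user's unmentored enrollment is enough to rule out course type 6, while the same-row form returns only course type 6.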
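Instead of exclude() you can build complex queries using Q(), or in your case ~Q(). Both conditions must hold, so they are combined with & rather than |:
from django.db.models import Q
filter_q = Q(course__enrollment__user=user) & ~Q(course__enrollment__mentor=None)
CourseType.objects.filter(filter_q).values('pk')
This might lead to a different SQL statement, although a negated Q on a multi-valued relation may be compiled to the same subquery as exclude().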
See docs:
https://docs.djangoproject.com/en/3.2/topics/db/queries/#complex-lookups-with-q-objects

SQL Query Duplicate Result

I am doing a project using MySQL 5. The requirement is the following:
Give the user names, device types, OS version and fruit involved in picks, where users had the
same device type, were running iOS 4 or 4.1, and picked the same fruit
The relevant tables are as follows:
User: {uID: INT, name: VARCHAR(45), deviceOS: VARCHAR(45), deviceType: VARCHAR(45)}
Pick: {uID: INT, ts: TIMESTAMP, fruit: VARCHAR(45)}
(Primary keys in bold. uID in Pick is a foreign key of uID in User.)
I am doing the following query:
SELECT DISTINCT NAME1, OS1, DEV1, NAME2, OS2, DEV2, P1.fruit FROM Pick AS P1, Pick AS P2,
(SELECT U1.uID AS User1, U1.name AS NAME1, U1.deviceOS AS OS1, U1.deviceType AS DEV1,
U2.uID AS User2, U2.name AS NAME2, U2.deviceOS AS OS2, U2.deviceType AS DEV2
FROM User AS U1, User AS U2
WHERE (U1.uID != U2.uID) AND
(U1.deviceType = U2.deviceType) AND
(U1.deviceOS = "4" OR U1.deviceOS = "4.1") AND
(U2.deviceOS = "4" OR U2.deviceOS = "4.1")) AS PartialResult
WHERE (P1.uID = PartialResult.User1) AND
(P2.uID = PartialResult.User2) AND
(P1.fruit = P2.fruit)
This returns the following result, but as you see, it is in some way "duplicated":
I have tried solving this using GROUP BY fruit, but it does not return the correct result in the general case. LIMIT 1 would not work in the general case either. So after numerous hours trying to figure this out, I must ask:
Is there a way to prevent this kind of duplication in the general case?
Thank you!
Instead of U1.uID != U2.uID, write U1.uID > U2.uID.
The problem you're encountering is that every single row is going to be duplicated, a--b and b--a. You need some way of specifying that you only want one or the other, but the question is, which one? Do you have a preference whether Priscilla is listed before Marcia, or vice versa?
If there is no preference, then you can just make up some arbitrary rule that will only allow one or the other to go through. For example, you can compare names and only grab rows where the first name is lexicographically before the second (see last line):
SELECT DISTINCT NAME1, OS1, DEV1, NAME2, OS2, DEV2, P1.fruit FROM Pick AS P1, Pick AS P2,
(SELECT U1.uID AS User1, U1.name AS NAME1, U1.deviceOS AS OS1, U1.deviceType AS DEV1,
U2.uID AS User2, U2.name AS NAME2, U2.deviceOS AS OS2, U2.deviceType AS DEV2
FROM User AS U1, User AS U2
WHERE (U1.uID != U2.uID) AND
(U1.deviceType = U2.deviceType) AND
(U1.deviceOS = "4" OR U1.deviceOS = "4.1") AND
(U2.deviceOS = "4" OR U2.deviceOS = "4.1")) AS PartialResult
WHERE (P1.uID = PartialResult.User1) AND
(P2.uID = PartialResult.User2) AND
(P1.fruit = P2.fruit) AND
(STRCMP(NAME1, NAME2) < 0)
Of course, you can implement any rule you want that picks one or the other. @igelkott's answer solves the problem the same way, by requiring person 1's uID to be higher than person 2's uID, which is very reasonable (and faster than doing string compares).
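The effect of the ordering rule can be seen on a tiny dataset. The sketch below uses Python's sqlite3 module instead of MySQL and rewrites the query with explicit JOINs for brevity. The uIDs, device values and picks are invented, and the helper pairs is hypothetical. Its pair_rule argument is spliced into the SQL text, so it must only ever be a trusted constant, never user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (uID INTEGER PRIMARY KEY, name TEXT, deviceOS TEXT, deviceType TEXT);
CREATE TABLE Pick (uID INTEGER, ts TEXT, fruit TEXT);
-- Invented sample rows: two matching iOS 4.x users and one on an older OS.
INSERT INTO User VALUES (1,'Marcia','4','iPhone'),
                        (2,'Priscilla','4.1','iPhone'),
                        (3,'Zed','3','iPhone');
INSERT INTO Pick VALUES (1,'t1','apple'),(2,'t2','apple'),(3,'t3','apple');
""")

def pairs(pair_rule):
    """Run the pairing query with the given (trusted) rule relating U1 and U2."""
    sql = f"""
    SELECT DISTINCT U1.name, U2.name, P1.fruit
    FROM User AS U1
    JOIN User AS U2 ON U1.deviceType = U2.deviceType AND {pair_rule}
    JOIN Pick AS P1 ON P1.uID = U1.uID
    JOIN Pick AS P2 ON P2.uID = U2.uID AND P1.fruit = P2.fruit
    WHERE U1.deviceOS IN ('4', '4.1') AND U2.deviceOS IN ('4', '4.1')
    ORDER BY U1.name
    """
    return conn.execute(sql).fetchall()

both_orders = pairs("U1.uID != U2.uID")  # each pair appears as a--b and b--a
one_order = pairs("U1.uID > U2.uID")     # only one orientation survives

print(both_orders)
print(one_order)
```

Comparing integer keys also has the advantage that the rule never treats two different users with the same name as equal, which a name-based comparison could.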

Multiple mysql joins, how to combine these 2 select statements

I have 2 select statements I would like to combine into one, though I only need one field from the second statement (the data field from user_info_data). The fields I need are firstname, lastname, email, course fullname, role, and the data field where fieldid = '15'. The first select statement gives me everything but the data field, and the second gives me everything but the course. I tried writing the second select statement the same way as the Role field, but MySQL complains that it returns more than one row. If I just use the course name without the fieldid = '15' part, it brings up over 100k records (each user shows up in each course, with all their data).
Fields for tables:
user(id,auth,confirmed,policyagreed,username,password,idnumber,firstname,lastname,email,phone etc..)
user_info_data(id,userid,fieldid,data)
role(id,name,shortname,description,sortorder)
role_assignments(id,roleid,contextid,userid...)
context(id,contextlevel,instanceid,path,depth)
First statement:
SELECT user.firstname AS Firstname, user.lastname AS Lastname, user.email AS Email, course.fullname AS Course, role.name AS Role
FROM user AS user, course AS course,role,role_assignments AS asg
INNER JOIN context AS context ON asg.contextid=context.id
WHERE context.contextlevel = 50
AND role.id=asg.roleid
AND user.id=asg.userid
AND context.instanceid=course.id
Output of first statement:
Firstname Lastname Email Course Role
John Doe john.doe@email.com Course-Name Student
Second statement:
SELECT user.firstname AS 'First Name', user.lastname AS 'Last Name', user.email AS 'Email', user_info_data.data AS 'IBCLC Certified'
FROM user, user_info_data
WHERE user.id = user_info_data.userid
AND fieldid = '15'
Output of second statement:
Firstname Lastname Email IBCLC Certified
John Doe john.doe@email.com Yes
Desired Output:
FirstName,LastName,Email,IBCLC Certified,Course,Role
Another select statement I tried brings up 9,494 records. Right now the data field where fieldid is 15 holds a list of possible choices; could that be why?
SELECT user.firstname AS Firstname, user.lastname AS Lastname, user.email AS Email, userdata.data, course.fullname AS Course, role.name AS Role
FROM user AS user, course AS course, user_info_data AS userdata, role,role_assignments AS asg
INNER JOIN context AS context ON asg.contextid=context.id
WHERE context.contextlevel = 50
AND userdata.fieldid = 15
AND role.id=asg.roleid
AND user.id=asg.userid
AND context.instanceid=course.id
I added user_info_data to your first request like this:
SELECT user.firstname AS Firstname,
user.lastname AS Lastname,
user.email AS Email,
course.fullname AS Course,
role.name AS Role,
ibclcCert.data AS 'IBCLC Certified'
FROM user,
course,
role,
role_assignments AS asg,
context,
user_info_data AS ibclcCert
WHERE context.contextlevel = 50
AND role.id=asg.roleid
AND user.id=asg.userid
AND context.instanceid=course.id
AND asg.contextid=context.id
AND ibclcCert.userid = user.id
AND ibclcCert.fieldid = '15'
I renamed the user_info_data table reference to something denoting the actual field, ibclcCert in this case. The alias is there in case you one day want to access more than one data field. When you do, you'd include the table multiple times, once for every field you need. See also this answer about how to deal with such data formats.
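A likely reason the earlier attempt returned thousands of rows is that it never links user_info_data to user, so every user is paired with every fieldid 15 row. The sketch below reproduces that on a cut-down schema with Python's sqlite3 module standing in for MySQL, and then shows the "one alias per field" idea from the answer. The sample users and the second field (fieldid 16, aliased region) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, firstname TEXT);
CREATE TABLE user_info_data (id INTEGER PRIMARY KEY, userid INTEGER,
                             fieldid INTEGER, data TEXT);
INSERT INTO user VALUES (1,'John'),(2,'Jane'),(3,'Ann');
-- fieldid 15 is the certification flag; fieldid 16 is a made-up second field.
INSERT INTO user_info_data VALUES
  (1,1,15,'Yes'),(2,2,15,'No'),(3,3,15,'Yes'),
  (4,1,16,'Ohio'),(5,2,16,'Utah'),(6,3,16,'Iowa');
""")

# No condition ties d.userid to user.id, so this is a cross join:
# 3 users x 3 fieldid-15 rows.
unlinked = conn.execute("""
SELECT user.firstname, d.data
FROM user, user_info_data AS d
WHERE d.fieldid = 15
""").fetchall()

# One alias of user_info_data per field, each linked back to the user.
linked = conn.execute("""
SELECT user.firstname, cert.data, region.data
FROM user, user_info_data AS cert, user_info_data AS region
WHERE cert.userid = user.id AND cert.fieldid = 15
  AND region.userid = user.id AND region.fieldid = 16
ORDER BY user.id
""").fetchall()

print(len(unlinked), linked)
```

With the userid conditions in place, each user comes back exactly once with both field values side by side. Users who lack one of the fields would drop out of this inner-join form; a LEFT JOIN per field would keep them.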