Very beginner question, but I haven't been able to come up with an answer after reading various help resources.
I have a table group_affiliations which is a joining table between the tables users and groups. Relevant columns: Id, user_id, group_id. I am doing a data cleanup where users were assigned a group_id based on a location which used to be a 3 character abbreviation of a city but has since gone to spelling out full city (ex: a group_id for CHA was previously assigned and now a group_id for Charlotte). Most users currently have both group_ids associated with their user_id but some still only have the old group_id and were never assigned the new one.
What is the most efficient way of finding which ids are in this result set:
select user_id from group_affiliations where group_id=OldId;
and not in this result set:
select user_id from group_affiliations where group_id=NewId;
SELECT ga.user_id
FROM group_affiliations ga
WHERE ga.group_id = OldId
AND NOT EXISTS (
SELECT 1
FROM group_affiliations ga2
WHERE ga2.user_id = ga.user_id
AND ga2.group_id = NewId)
How about using a JOIN?
SELECT g1.user_id
FROM group_affiliations g1
LEFT JOIN group_affiliations g2
ON g2.user_id = g1.user_id
AND g2.group_id = NewId
WHERE g1.group_id = OldId
AND g2.user_id IS NULL
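One way to sanity-check a query like this is to run it against a handful of hand-made rows. The sketch below does that with Python's built-in sqlite3 module rather than MySQL; the sample rows and the ids 1 (old group) and 2 (new group) are made up for illustration, and the SQL itself should carry over to MySQL unchanged.

```python
import sqlite3

# Hypothetical ids: 1 stands in for the old "CHA" group, 2 for "Charlotte".
OLD_ID, NEW_ID = 1, 2

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE group_affiliations ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, group_id INTEGER)"
)
conn.executemany(
    "INSERT INTO group_affiliations (user_id, group_id) VALUES (?, ?)",
    [(10, 1), (10, 2), (11, 1), (12, 2), (13, 1), (13, 3)],
)

# Anti-join: users with a row for the old group and no row for the new group.
missing_new = [
    row[0]
    for row in conn.execute(
        """
        SELECT ga.user_id
        FROM group_affiliations ga
        WHERE ga.group_id = ?
          AND NOT EXISTS (
              SELECT 1
              FROM group_affiliations ga2
              WHERE ga2.user_id = ga.user_id
                AND ga2.group_id = ?
          )
        ORDER BY ga.user_id
        """,
        (OLD_ID, NEW_ID),
    )
]
print(missing_new)  # [11, 13]
```

User 10 has both groups and user 12 only has the new one, so neither is reported. An index on (user_id, group_id) is probably what keeps the inner lookup cheap on a large table.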
Related
I have a table called followers and I want to be able to find the current user's followers, display that list, and then compare those values against another user ID to see whether that user is following them or not, all from that same table.
This is the table I'm using currently to get the list of followers for a specific user:
followers
-------
followId - primary key, the unique id for the follow relationship
userId - the user that is following someone
orgId - that someone that the user is following
I tried using a union query, but I would rather not, for performance reasons (the table may contain a high volume of records) and because it isn't scalable (I think).
The expected output should be the list of orgIds that the user I am checking against is following, plus another column that shows whether my user (the userId that I provide) is following that orgId value (i.e. a following column).
Hmmm, if I understand correctly, you have two users and you want to know which orgs followed by the first are also followed by the second:
select f.orgid,
(exists (select 1
from followers f2
where f2.userId = $seconduserid and
f2.orgid = f.orgid
)
) as seconduserflag
from followers f
where f.userId = $firstuserid
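Here is a small runnable sketch of that query, using Python's sqlite3 module instead of MySQL. The follower rows and the user ids 1 and 2 are invented for the example, and named parameters stand in for $firstuserid and $seconduserid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE followers ("
    "followId INTEGER PRIMARY KEY, userId INTEGER, orgId INTEGER)"
)
# Made-up data: user 1 follows orgs 100-102, user 2 follows 101 and 103.
conn.executemany(
    "INSERT INTO followers (userId, orgId) VALUES (?, ?)",
    [(1, 100), (1, 101), (1, 102), (2, 101), (2, 103)],
)

# One row per org followed by the first user, with a 0/1 flag telling
# whether the second user follows the same org.
rows = conn.execute(
    """
    SELECT f.orgId,
           EXISTS (SELECT 1
                   FROM followers f2
                   WHERE f2.userId = :second
                     AND f2.orgId = f.orgId) AS seconduserflag
    FROM followers f
    WHERE f.userId = :first
    ORDER BY f.orgId
    """,
    {"first": 1, "second": 2},
).fetchall()
print(rows)  # [(100, 0), (101, 1), (102, 0)]
```

Org 103 does not appear because only the first user's orgs drive the result; the second user only contributes the flag.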
I have a cross-reference table that supplies the many-to-many relationship between users and user group tables. It contains two relevant columns: group_id and user_id (surprise, surprise!). When a user wants to create a new group, I want to first check if that set of users already exists as a group.
Essentially I would define the problem as "Given a set of user ids, find any set of rows that match the set of user ids and all share the same group id".
Edit:
I'm looking for the exact set of users; I'm not interested in seeing groups in the result set that include those users in addition to other users.
Sample Data
I have the hunch that a subquery is the way to go, but I can't figure out how to arrange it. Any help would be greatly appreciated!
Is this what you want?
select groupid
from usergroups ug
where userid in ($user1, $user2, . . . , $usern)
group by groupid
having count(*) = <n>;
This returns all groups that have the supplied list of users.
If you want the exact set, then:
select groupid
from usergroups ug
group by groupid
having count(*) = <n> and sum( userid in ($user1, $user2, . . . , $usern) ) = <n>;
This assumes that groups don't have the same user twice (it is not hard to adjust for that, but the condition becomes more complicated).
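To see the difference between "contains these users" and "is exactly these users", here is a sketch against a tiny made-up usergroups table, run through Python's sqlite3 module rather than MySQL. The exact-set version also compares the group size with n, so a group holding only some of the listed users is not reported. The sample groups and the wanted list are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usergroups (groupid INTEGER, userid INTEGER)")
# Made-up groups: 1 = {1, 2, 3}, 2 = {1, 2}, 3 = {1}.
conn.executemany(
    "INSERT INTO usergroups VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 1)],
)

wanted = [1, 2]
n = len(wanted)
marks = ", ".join("?" for _ in wanted)  # one placeholder per user id

# Groups that contain every wanted user (possibly along with others).
containing = [r[0] for r in conn.execute(
    f"SELECT groupid FROM usergroups WHERE userid IN ({marks}) "
    "GROUP BY groupid HAVING COUNT(*) = ? ORDER BY groupid",
    (*wanted, n),
)]

# Groups whose membership is exactly the wanted set: n rows in total,
# and all n of them are wanted users.
exact = [r[0] for r in conn.execute(
    "SELECT groupid FROM usergroups GROUP BY groupid "
    f"HAVING COUNT(*) = ? AND SUM(userid IN ({marks})) = ? ORDER BY groupid",
    (n, *wanted, n),
)]

print(containing)  # [1, 2]
print(exact)       # [2]
```

Like the queries above, this relies on a user appearing at most once per group.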
I have created a table as a union of two SELECT statements, say FRIENDS_AND_NEIGHBORS, and I want to remove the repetitions of the two, except that they do not coincide in all the fields.
Simplifying my case, I have a table called FRIENDS (that has pairs of users, and its link id) and a table for USER that includes ZIP_CODE, from which I get the NEIGHBORS section. I'm also fixing a reference user with USER_ID = @usr, and ZIP_CODE = @zip. Then I do the following.
CREATE TABLE FRIENDS_AND_NEIGHBORS
(SELECT
USER_ID AS FRND_ID, # Choose neighbors by zipcode.
ZIP_CODE AS FRND_ZIP, #
0 AS FRND_LINK # The reference for friendship comes later.
FROM USER
WHERE ZIP_CODE = @zip)
UNION
(SELECT
frd.FRIEND_ID AS FRND_ID,
usr.ZIP_CODE AS FRND_ZIP,
frd.LINK AS FRND_LINK
FROM FRIENDS frd
JOIN USER usr
ON frd.FRIEND_ID = usr.USER_ID
WHERE frd.USER_ID = @usr);
Then I may be counting some neighbors/friends twice, but they still differ in the FRND_LINK column, as I gave it a zero because I couldn't join the two.
I want to remove the corresponding neighbor row that has 0, when it has been counted as a friend.
Thank you for your help.
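One possible approach, sketched here rather than offered as the definitive fix: keep both halves with UNION ALL, then collapse to one row per user and take the largest FRND_LINK, so a real link id wins over the placeholder 0. This assumes link ids are always positive. The sketch runs on Python's sqlite3 module with made-up rows; the table and column names follow the question, and the named parameters stand in for the reference user's id and zip code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE USER (USER_ID INTEGER PRIMARY KEY, ZIP_CODE TEXT);
    CREATE TABLE FRIENDS (LINK INTEGER PRIMARY KEY, USER_ID INTEGER, FRIEND_ID INTEGER);
    -- Made-up data: users 1-3 share a zip code, user 4 lives elsewhere.
    INSERT INTO USER VALUES (1, '11111'), (2, '11111'), (3, '11111'), (4, '22222');
    -- User 1 is friends with user 2 (a neighbor) and user 4 (not a neighbor).
    INSERT INTO FRIENDS VALUES (50, 1, 2), (51, 1, 4);
""")

rows = conn.execute(
    """
    SELECT FRND_ID, FRND_ZIP, MAX(FRND_LINK) AS FRND_LINK
    FROM (
        SELECT USER_ID AS FRND_ID, ZIP_CODE AS FRND_ZIP, 0 AS FRND_LINK
        FROM USER
        WHERE ZIP_CODE = :zip
        UNION ALL
        SELECT frd.FRIEND_ID, usr.ZIP_CODE, frd.LINK
        FROM FRIENDS frd
        JOIN USER usr ON frd.FRIEND_ID = usr.USER_ID
        WHERE frd.USER_ID = :usr
    ) AS combined
    GROUP BY FRND_ID, FRND_ZIP
    ORDER BY FRND_ID
    """,
    {"usr": 1, "zip": "11111"},
).fetchall()
print(rows)
```

User 2 shows up once, with link 50 instead of 0. As in the original statement, the reference user is also returned as their own neighbor unless the zip-code branch filters them out.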
I am working on a MYSQL database which has the following three columns: emails, name, surname.
What I need to do is deduplicate the emails. I know I can use a query such as one of these (they only select rows; they do not delete anything):
select distinct emails, name, surname from emails;
or
select emails, name, surname from emails group by emails having count(*) >= 2;
However, I also need to make sure that when a duplicate email address is found, the row kept is the one that has a name and/or surname value.
For example:
|id | emails      | name | surname |
|1  | bob@bob.com | bob  | paulson |
|2  | bob@bob.com |      |         |
In this case I would like to keep the first result and delete the second.
I have been looking into using 'case' or 'if' statements but am not experienced with using those. I tried expanding the above functions with those statements but to no avail.
Could anyone point me in the right direction?
PS: The first column in the table is an auto-incremented id value, in case that helps
UPDATE 1: So far @Bohemian's answer below is working great, but it fails in one case: a duplicate email address where one row has a name but no surname and the next row has no name but has a surname. It will keep both records. All that needs to change is that one of these two records gets deleted, no matter which.
UPDATE 2: @Bohemian's answer is great, but after more testing I've found that it has a fundamental flaw: it works only when one of the duplicate email rows has data in both the name and surname fields (like the first entry in the table above). If there are duplicates of an email but none of the rows have both the name and surname fields filled in, then all those rows will be ignored and not deduplicated.
The last step for this query would be to work out how to delete the duplicates that don't meet the current necessary conditions. If one row has just name and the other just surname, it really doesn't matter which gets deleted as the email is the important thing to keep.
You could use this DELETE query, which is generic and can be easily adapted to support more fields:
DELETE tablename.*
FROM
tablename LEFT JOIN (
SELECT MIN(id) min_id
FROM
tablename t INNER JOIN (
SELECT
emails, MAX((name IS NOT NULL) + (surname IS NOT NULL)) max_non_nulls
FROM
tablename
GROUP BY
emails) m
ON t.emails=m.emails
AND ((t.name IS NOT NULL) + (t.surname IS NOT NULL))=m.max_non_nulls
GROUP BY
t.emails) ids
ON tablename.id=ids.min_id
WHERE
ids.min_id IS NULL
This query returns the maximum number of non null fields, for every email:
SELECT
emails,
MAX((name IS NOT NULL) + (surname IS NOT NULL)) max_non_nulls
FROM
tablename
GROUP BY
emails
I'm then joining this query with tablename, to get the minimum ID for every email that has the maximum number of non null fields:
SELECT MIN(id) min_id
FROM
tablename t INNER JOIN (
SELECT
emails, MAX((name IS NOT NULL) + (surname IS NOT NULL)) max_non_nulls
FROM
tablename
GROUP BY
emails) m
ON t.emails=m.emails
AND ((t.name IS NOT NULL) + (t.surname IS NOT NULL))=m.max_non_nulls
GROUP BY
t.emails
and then I'm deleting all rows that have an ID that is not returned by this query.
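The same idea can be tried on a few sample rows. The sketch below uses Python's sqlite3 module, which has no multi-table DELETE, so the LEFT JOIN ... IS NULL form is swapped for an equivalent id NOT IN (...) filter around the same inner query. The rows are made up; missing values are stored as NULL, which is what the IS NOT NULL arithmetic relies on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename ("
    "id INTEGER PRIMARY KEY, emails TEXT, name TEXT, surname TEXT)"
)
conn.executemany(
    "INSERT INTO tablename (emails, name, surname) VALUES (?, ?, ?)",
    [
        ("bob@bob.com", "bob", "paulson"),  # id 1: both fields filled
        ("bob@bob.com", None, None),        # id 2
        ("amy@amy.com", "amy", None),       # id 3: name only
        ("amy@amy.com", None, "smith"),     # id 4: surname only
        ("joe@joe.com", None, None),        # id 5
        ("joe@joe.com", None, None),        # id 6
    ],
)

# Keep, per email, the lowest id among the rows with the most non-null fields.
conn.execute(
    """
    DELETE FROM tablename
    WHERE id NOT IN (
        SELECT MIN(t.id)
        FROM tablename t
        JOIN (
            SELECT emails,
                   MAX((name IS NOT NULL) + (surname IS NOT NULL)) AS max_non_nulls
            FROM tablename
            GROUP BY emails
        ) m
          ON t.emails = m.emails
         AND (t.name IS NOT NULL) + (t.surname IS NOT NULL) = m.max_non_nulls
        GROUP BY t.emails
    )
    """
)
kept_ids = [r[0] for r in conn.execute("SELECT id FROM tablename ORDER BY id")]
print(kept_ids)  # [1, 3, 5]
```

This covers both cases raised in the question's updates: the name-only versus surname-only pair keeps one row, and duplicates with no name data at all still collapse to a single row. If blanks are stored as empty strings instead of NULL, the IS NOT NULL tests would need to become comparisons against ''.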
This is easy with mysql's multiple-table delete syntax:
delete b
from mytable a
join mytable b
on a.email = b.email
and a.id != b.id
where a.name is not null
and a.surname is not null
Delete records with a duplicate email:
delete
from duplicate_email where id in(
select id from (
select id, email from duplicate_email group by email having count(id) > 1) as id
)
There is one caveat: each run deletes only one row per duplicated email, so a single run fully cleans up only emails that appear exactly twice. If an email appears three or more times, repeat the query until it reports zero rows deleted.
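A variant that avoids repeated runs is to keep the lowest id per email and delete everything else in one statement. This is a different technique from the query above, shown as a sketch on Python's sqlite3 module with made-up rows. In MySQL the inner SELECT would most likely still need to be wrapped in a derived table, as the query above does, because the subquery reads the table being deleted from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE duplicate_email (id INTEGER PRIMARY KEY, email TEXT)")
# Made-up data: one email three times, one twice, one once.
conn.executemany(
    "INSERT INTO duplicate_email (email) VALUES (?)",
    [("a@x.com",), ("a@x.com",), ("a@x.com",),
     ("b@x.com",), ("b@x.com",),
     ("c@x.com",)],
)

# Keep the lowest id for each email and delete every other row in one pass.
conn.execute(
    """
    DELETE FROM duplicate_email
    WHERE id NOT IN (
        SELECT MIN(id) FROM duplicate_email GROUP BY email
    )
    """
)
remaining = conn.execute(
    "SELECT id, email FROM duplicate_email ORDER BY id"
).fetchall()
print(remaining)  # [(1, 'a@x.com'), (4, 'b@x.com'), (6, 'c@x.com')]
```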
I have a table with the following columns:
subid - id of the resource
authorid - id of the author
ordering - order of author within citation
The table belongs to an application where users can submit resources and cite multiple authors. Authors can cite primary and secondary authors in their submissions and usually do.
There is one case where a user (call him user 111) submitted all entries listing himself as the primary and the actual author as secondary. Unfortunately that person has left the project so it has fallen to me to fix this (I have to do it purely in sql).
I am trying to figure out how to build a query to do the following:
Find all entries
where the subid value shows up more than once in the table
where at least one of the authorid values is 111
where the ordering for 111 is greater than the ordering for any users that are not 111
& update them so
the not(111) author has ordering of '0'
and the 111 author has ordering '1'
Try this solution:
UPDATE tbl a
INNER JOIN
(
SELECT subid
FROM tbl
GROUP BY subid
HAVING COUNT(*) > 1 AND SUM(authorid = 111) > 0
) b ON a.subid = b.subid
SET a.ordering = (a.authorid = 111)
Replace tbl with your actual table name.
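To try the logic on sample rows before touching real data, here is a sketch on Python's sqlite3 module. SQLite has no UPDATE ... JOIN syntax, so the join is swapped for a subid IN (...) filter around the same grouped subquery. The rows are made up and the column names follow the question (subid, authorid, ordering).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (subid INTEGER, authorid INTEGER, ordering INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, 111, 0), (1, 222, 1),  # 111 listed first: should be swapped
        (2, 111, 0),               # 111 is the only author: left alone
        (3, 333, 0), (3, 444, 1),  # 111 not involved: left alone
    ],
)

# For submissions with several authors including 111, set ordering to 1
# for author 111 and 0 for everyone else (the comparison yields 0 or 1).
conn.execute(
    """
    UPDATE tbl
    SET ordering = (authorid = 111)
    WHERE subid IN (
        SELECT subid
        FROM tbl
        GROUP BY subid
        HAVING COUNT(*) > 1 AND SUM(authorid = 111) > 0
    )
    """
)
result = conn.execute(
    "SELECT subid, authorid, ordering FROM tbl ORDER BY subid, authorid"
).fetchall()
print(result)
```

Note that this does not look at the existing ordering values, and if a submission has more than one author besides 111, all of them end up with ordering 0.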