Trying to find duplicates and who made them


I'm working on a legacy system that allowed multiple entries to be inserted with the same email. The people table contains entries with the same name and email, and also entries with a different name but an already used email (e.g. the user didn't know the person's email address, or didn't ask for it, and chose to enter a fake one).
A person can subscribe to a user multiple times, on a yearly basis.
I was asked for a report of which users entered the most of these entries.
Let's say I have 3 tables:
| people | subscriptions | users |
| ------ | ------------- | ----- |
| id     | id            | id    |
| name   | personId      | name  |
| email  | userId        |       |
|        | subYear       |       |
I found all duplicate emails and their occurrences using this query:
SELECT users.name, people.email, COUNT(subscriptions.id) nSub
FROM people
INNER JOIN (SELECT email, COUNT(id) occurrences
            FROM people
            WHERE email IS NOT NULL AND email != ""
            GROUP BY email
            HAVING occurrences > 1) duplicates
    ON people.email = duplicates.email
JOIN subscriptions ON people.id = subscriptions.personId
JOIN users ON users.id = subscriptions.userId
GROUP BY users.name, people.email;
But now I'm stuck when I have to integrate users: the query gives incorrect results or seems to get stuck in a loop.
I'm sure I'm getting the grouping wrong, but I got lost.
The result I'm trying to achieve is something like this (based on the data provided in the fiddle):
|users.name| people.email | occurrences |
|----------|-------------------------|-------------|
| User1 | example#example.com | 1 |
| User2 | example#example.com | 2 |
| User2 | fake#email.com | 3 |
| User3 | fake#email.com | 1 |
Any suggestion you can give me is welcome. Thanks in advance.
UPDATE: Sorry for the sloppiness, I created a fiddle
sql-fiddle
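A minimal sketch of one possible reading of the report, run through Python's sqlite3 module as a stand-in for MySQL. The sample rows are hypothetical (they are not the fiddle's data), and it assumes that "occurrences" means the number of distinct people rows with a duplicated email that a user is linked to through subscriptions; nSub counts the subscriptions themselves, as in the query above.

```python
import sqlite3

# Hypothetical sample data (NOT the data from the fiddle).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, personId INTEGER,
                                userId INTEGER, subYear INTEGER);
    INSERT INTO people VALUES
        (1, 'Ann',  'example#example.com'),
        (2, 'Ann',  'example#example.com'),
        (3, 'Bob',  'unique#example.com'),
        (4, 'Carl', 'example#example.com');
    INSERT INTO users VALUES (1, 'User1'), (2, 'User2');
    INSERT INTO subscriptions VALUES
        (1, 1, 1, 2020), (2, 2, 2, 2020), (3, 2, 2, 2021),
        (4, 3, 1, 2020), (5, 4, 2, 2021);
""")

# Keep only people whose email appears more than once, then group the
# subscriptions that reference them by user and email.
rows = conn.execute("""
    SELECT u.name, p.email,
           COUNT(DISTINCT p.id) AS occurrences,  -- duplicate people rows
           COUNT(s.id)          AS nSub          -- subscriptions to them
    FROM people p
    JOIN (SELECT email
          FROM people
          WHERE email IS NOT NULL AND email <> ''
          GROUP BY email
          HAVING COUNT(id) > 1) d ON d.email = p.email
    JOIN subscriptions s ON s.personId = p.id
    JOIN users u ON u.id = s.userId
    GROUP BY u.name, p.email
    ORDER BY u.name, p.email
""").fetchall()

for row in rows:
    print(row)
```

Because a person can subscribe to the same user in several years, COUNT(s.id) and COUNT(DISTINCT p.id) can differ; which of the two the report needs depends on what "entries" means here.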

Related

MySQL - Recursive - getting email addresses from 2 different tables and columns

I have a first table called emails with a list of all the emails of my colleagues
| email |
| ----------------------- |
| saramaia#email.com |
| miguelferreira#email.com |
| joaosilva#email.com |
| joanamaia#email.com |
I have a second table called aliases, with a list of all the secondary emails/aliases my colleagues are using
| alias1 | alias2 |
| ------------------------ | ------------------- |
| joanamaia#email.com | maiajoana#email.com |
| maiajoana#email.com | maia#email.com |
| miguelferreira#email.com | miguel#email.com |
| maia#email.com | joana#email.com |
| joanamaia#email.com | jomaia#email.com |
| joana#email.com | jmaia#email.com |
I can see that the users joanamaia#email.com and miguelferreira#email.com are using aliases. But let's focus on the user joanamaia#email.com.
I need to get a list of all the email addresses the user joanamaia#email.com is using. The difficult part is that I need to get a list with the main email address plus all the intersections where the first email and consecutive ones are being used by this user. The end result should look like this
| emails |
| ------------------- |
| joanamaia#email.com |
| jomaia#email.com |
| maiajoana#email.com |
| maia#email.com |
| joana#email.com |
| jmaia#email.com |
If I do WHERE email='joanamaia#email.com' it should look like this, but I also need the same result if I do
WHERE email='jmaia#email.com'
I've spent several days testing queries and I don't seem to have a solution for this (I've been using right joins, full outer joins and unions, but no luck so far). Is there a good way to do this?

You can use a recursive CTE to walk the graph and get the full list of interconnected aliases. Care needs to be taken to handle cycles; that requires the query to use UNION instead of the traditional UNION ALL to separate the anchor and recursive member of the CTE.
The query can take the form:
with recursive
n as (
  select 'joanamaia#email.com' as email
  union
  select case when a.alias1 = n.email then a.alias2 else a.alias1 end
  from n
  join aliases a on (a.alias1 = n.email or a.alias2 = n.email)
                and a.alias1 <> a.alias2
)
select * from n;
Result:
email
-------------------
joanamaia#email.com
maiajoana#email.com
jomaia#email.com
maia#email.com
joana#email.com
jmaia#email.com
See running example at DB Fiddle.
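As a rough way to check the claim that the starting address does not matter, here is a sketch that runs the same recursive CTE through Python's sqlite3 module (SQLite is used here only as a stand-in for MySQL, and the helper name linked_emails is made up for this example). The alias rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE aliases (alias1 TEXT, alias2 TEXT)")
conn.executemany(
    "INSERT INTO aliases VALUES (?, ?)",
    [
        ("joanamaia#email.com", "maiajoana#email.com"),
        ("maiajoana#email.com", "maia#email.com"),
        ("miguelferreira#email.com", "miguel#email.com"),
        ("maia#email.com", "joana#email.com"),
        ("joanamaia#email.com", "jomaia#email.com"),
        ("joana#email.com", "jmaia#email.com"),
    ],
)

# UNION (not UNION ALL) drops rows that were already produced, which is
# what stops the walk from looping forever over the undirected edges.
QUERY = """
    WITH RECURSIVE n(email) AS (
        SELECT ?
        UNION
        SELECT CASE WHEN a.alias1 = n.email THEN a.alias2 ELSE a.alias1 END
        FROM n
        JOIN aliases a ON (a.alias1 = n.email OR a.alias2 = n.email)
                      AND a.alias1 <> a.alias2
    )
    SELECT email FROM n
"""


def linked_emails(start):
    """Return every address reachable from `start` through the alias pairs."""
    return {row[0] for row in conn.execute(QUERY, (start,))}


print(sorted(linked_emails("joanamaia#email.com")))
print(linked_emails("jmaia#email.com") == linked_emails("joanamaia#email.com"))
```

The result is collected into a set because the order in which the rows come back is not something the query guarantees.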

Relational MySQL Query

I have a table called users:
+----+---------+--------+----------------+----------------+----------+
| ID | Name | Zip | Email | Phone | Username |
+----+---------+--------+----------------+----------------+----------+
| 0 | Jill | 33333 | jill#aol.com | (123)123-1245 | idjill |
| 1 | Jack | 11111 | jack#aol.com | (123)111-1111 | idjack |
| 2 | Bob | 66666 | bob#aol.com | (123)222-2222 | idbob |
| 3 | jMarie | 12345 | jill#aol.com | (123)123-1245 | none |
+----+---------+--------+----------------+----------------+----------+
If I run SELECT * FROM users WHERE Phone='(123)123-1245', it returns both ID 0 and ID 3.
What I would like to do is select a user but also return any other users that have the same phone or email, but not zip code. So for example, if I run SELECT * FROM users WHERE Username='idjill', I'd like it to return users 0 and 3, because they both have the same phone number.
How can I do that? Thanks. If anyone has a better idea for a title to this post, please share. My first post, sorry.
Edit: I think I need to clarify my question a bit. So I have this select query right here:
SELECT * FROM users WHERE Username = 'idjill' OR Email = 'idjill'
That correctly returns ID 0, but I would like it to return IDs 0 and 3, because the phone and the email match (I am using the same input to search both username and email).
How can I expand on this?

Use an INNER JOIN like below:
SELECT DISTINCT a.*
FROM users a
INNER JOIN (SELECT * FROM users WHERE Username='idjill') b
    ON (a.Phone=b.Phone OR a.Email=b.Email) AND a.Zip<>b.Zip;
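A small sketch of the self-join idea, run with Python's sqlite3 module as a stand-in for MySQL and loaded with the rows from the question. It compares the join with and without the Zip condition: since the searched row always has the same Zip as itself, the Zip<>Zip version cannot return that row, so whether to keep the condition depends on how "but not zip code" is meant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (ID INTEGER PRIMARY KEY, Name TEXT, Zip TEXT,"
    " Email TEXT, Phone TEXT, Username TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?, ?)",
    [
        (0, "Jill", "33333", "jill#aol.com", "(123)123-1245", "idjill"),
        (1, "Jack", "11111", "jack#aol.com", "(123)111-1111", "idjack"),
        (2, "Bob", "66666", "bob#aol.com", "(123)222-2222", "idbob"),
        (3, "jMarie", "12345", "jill#aol.com", "(123)123-1245", "none"),
    ],
)

# b is the searched user; a is every row sharing b's phone or email.
same_contact = conn.execute("""
    SELECT DISTINCT a.ID
    FROM users a
    JOIN users b ON a.Phone = b.Phone OR a.Email = b.Email
    WHERE b.Username = 'idjill'
    ORDER BY a.ID
""").fetchall()

# Same join, but additionally requiring a different zip code.
different_zip = conn.execute("""
    SELECT DISTINCT a.ID
    FROM users a
    JOIN users b ON (a.Phone = b.Phone OR a.Email = b.Email)
                AND a.Zip <> b.Zip
    WHERE b.Username = 'idjill'
    ORDER BY a.ID
""").fetchall()

print(same_contact)   # rows sharing phone or email with idjill
print(different_zip)  # the searched row itself is filtered out here
```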

A nested query can be used:
Select *
from users
where Phone = (select Phone from users where Username = "idjill");

You can use a nested query like this:
SELECT *
FROM users
WHERE Phone = (SELECT Phone
               FROM users
               WHERE Name='Jill')
   OR Email = (SELECT Email
               FROM users
               WHERE Name='Jill');

join two tables in mysql and get records

I have two tables, "contacts" and "users". The data is stored as comma-separated values. I need the distinct values from the "contacts" column of the "contacts" table, and I need to join them with the "users" table to get the matching records.
Contacts Table
--------------------------
id | user_Id | contacts
--------------------------
1 | 2147483647 | 90123456789,90123456789,90123456789,90123456789
2 | 2147483647 | 90123456789,90123456789,90123456789,90123456789
3 | 919444894154 | 90123456789,90123456789,90123456789,90123456789
Users Table
-----------------------------
id | username | email | phone
-----------------------------
1 | bhavan | bhavanram93#gmail.com | 90123456789
2 | bhavan | bhavanram93#gmail.com | 90123456789
3 | prince | prince#gmail.com | 1234567980
4 | bhavan | bhavanram93#gmail.com | 90123456789
5 | hello | hello#gmail.com | 1234567890
6 | bhavan | bhavanram93#gmail.com | 90123456789

Your Contacts table shouldn't be constructed this way.
Since you want one Users table containing all the data about a user, and one Contacts table containing the links between different users, you should use this kind of table structure:
Contacts table
id | user_id | contact_id
-------------------------
1 | 1 | 2
2 | 1 | 3
3 | 2 | 3
That'll allow you to do something like :
SELECT *
FROM Users
JOIN Contacts ON (Users.id = Contacts.contact_id)
WHERE Contacts.user_id = 1
This will return all the data of the contacts of user 1.
Your current structure is a huge ongoing mess; it's the opposite of flexible.

You should restructure your db to a normalized format, as Steve suggests.
But if you can't:
SELECT *
FROM Users
JOIN Contacts
    ON CONCAT(',', Contacts.contacts, ',') LIKE
       CONCAT('%,', Users.phone, ',%')
WHERE Contacts.user_id = 1
The idea is that you wrap your contacts list in commas, so that it has the form
, <numbers> ,
,90123456789,90123456789,90123456789,90123456789,
and then try to match it against the pattern
%,90123456789,%
Note that this approach can't use any index, so it will perform badly with many rows. If you are in the order of 1k-10k rows it may be OK. With more than that, you need to consider restructuring your db.
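To illustrate why the commas are added on both sides, here is a sketch using Python's sqlite3 module as a stand-in for MySQL. SQLite's || operator is used in place of CONCAT, and the rows (including the short phone number 12345) are hypothetical, chosen so that a plain substring match gives a false positive.

```python
import sqlite3

# Hypothetical sample data, not the rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, phone TEXT);
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, user_id INTEGER,
                           contacts TEXT);
    INSERT INTO users VALUES
        (1, 'bhavan', '90123456789'),
        (2, 'prince', '1234567980'),
        (3, 'short',  '12345');
    INSERT INTO contacts VALUES (1, 1, '90123456789,1234567980');
""")

# Delimited match: ',<list>,' LIKE '%,<phone>,%' only matches whole entries.
delimited = conn.execute("""
    SELECT u.username
    FROM users u
    JOIN contacts c
      ON (',' || c.contacts || ',') LIKE ('%,' || u.phone || ',%')
    WHERE c.user_id = 1
    ORDER BY u.id
""").fetchall()

# Naive substring match: '12345' is found inside the longer numbers.
naive = conn.execute("""
    SELECT u.username
    FROM users u
    JOIN contacts c ON c.contacts LIKE ('%' || u.phone || '%')
    WHERE c.user_id = 1
    ORDER BY u.id
""").fetchall()

print(delimited)
print(naive)
```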

Data Between Two Tables

Excuse any novice gibberish I may use to explain my conundrum, but hopefully someone here will be able to look past that and provide me with an answer to get me unstuck.
SESSIONS
+--------+---------+----------+
| id | appID | userID |
+--------+---------+----------+
| 1 | 1 | 96 |
+--------+---------+----------+
| 2 | 2 | 97 |
+--------+---------+----------+
| 3 | 1 | 98 |
+--------+---------+----------+
USERS
+--------+---------+
| id | name |
+--------+---------+
| 96 | Bob |
+--------+---------+
| 97 | Tom |
+--------+---------+
| 98 | Beth |
+--------+---------+
For each session in the Sessions table that has an appID of 1, I want to get the user's name from the Users table. The Sessions userID column is linked with the Users table's id column.
So my desired result would be:
["Bob", "Beth"]
Any suggestions/help?

Try this:
SELECT USERS.name FROM USERS INNER JOIN SESSIONS ON USERS.id = SESSIONS.userID WHERE SESSIONS.appID = 1
I would read up on http://blog.codinghorror.com/a-visual-explanation-of-sql-joins/ for how all the joins work.
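A quick sketch that runs this join on the tables from the question, using Python's sqlite3 module as a stand-in for whatever database is actually in use. The ORDER BY on the session id is an addition made here only so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE USERS (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE SESSIONS (id INTEGER PRIMARY KEY, appID INTEGER,
                           userID INTEGER);
    INSERT INTO USERS VALUES (96, 'Bob'), (97, 'Tom'), (98, 'Beth');
    INSERT INTO SESSIONS VALUES (1, 1, 96), (2, 2, 97), (3, 1, 98);
""")

# Join each session to its user, keeping only sessions for appID 1.
names = [
    row[0]
    for row in conn.execute("""
        SELECT USERS.name
        FROM USERS
        INNER JOIN SESSIONS ON USERS.id = SESSIONS.userID
        WHERE SESSIONS.appID = 1
        ORDER BY SESSIONS.id
    """)
]

print(names)
```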

It looks like you forgot to post your code.
As an explanation, though: it seems like you can just select the userID from the sessions table and then simply join the users table. Then add a WHERE clause to select all users that are attached to that ID.
Hope it helps.
If you post your code I can probably help you out more, and if this doesn't seem quite right, let me know and I'll help however I can.

You need to join the tables (http://www.tutorialspoint.com/postgresql/postgresql_using_joins.htm) and then match the rows using the equality operator, filtering on the appID you want.
SELECT USERS.name FROM USERS, SESSIONS WHERE SESSIONS.userID = USERS.id AND SESSIONS.appID = 1;

How to write a proper If...Else Statement with JOIN in MySQL?

I'm quite a beginner in MySQL; I only know the basic statements. However, now it's time for me to get into some more difficult, but worthwhile, stuff.
I actually have 3 tables in MySQL, here is the representation:
users:
user_id | name | country
---------------------------
1 | Joseph | US
2 | Kennedy | US
3 | Dale | UK
admins:
admin_id | name | country
----------------------------
1 | David | UK
2 | Ryan | US
3 | Paul | UK
notes:
id | n_id | note | comment | country | type | manager
----------------------------------------------------------------
1 | 3 | This is the 1st note | First | US | admin | 2
2 | 2 | This is the 2nd note | Second | US | user | 1
3 | 2 | This is the 3rd note | Third | UK | user | 2
Now I would like to execute something like this SQL (I'm going to type not real commands here, because I'm not really familiar with all of the SQL expressions):
IF notes.type = admin
THEN
SELECT
notes.note,
notes.comment,
notes.country,
admins.name,
admins.country
FROM notes, admins
WHERE notes.n_id = admin.admin_id
ELSEIF notes.type = 'user'
SELECT
notes.note,
notes.comment,
notes.country,
users.name,
users.country
FROM notes, users
WHERE notes.n_id = users.user_id
I hope you understand what I would like to achieve here. I could do this easily with several SQL statements, but I would like to try a single query that doesn't use as many resources.
Edit 1:
I would like to get all of the notes, find out which user group submitted each one, and then attach the submitter's name to it. I mean, if an admin submitted the note, then SQL should look up the ID in the Admins table (as per the type value), but if a user submitted the note, it should get the name from the Users table.
The result should look something similar to this:
result:
------
id | note | comment | country | name
--------------------------------------------------------
1 | This is the 1st note | First | US | Paul
2 | This is the 2nd note | Second | US | Kennedy
3 | This is the 3rd note | Third | UK | Kennedy
Edit 2:
I actually forgot to mention that all of these should be listed for a manager. So a 'manager ID' should be added to the notes, and the query should list all of the notes where the manager is, for example, 2.

Here is a method that you can do in one query:
SELECT n.note, n.comment, n.country,
       coalesce(a.name, u.name) as name,
       coalesce(a.country, u.country) as country
FROM notes n
LEFT JOIN admins a
    ON n.n_id = a.admin_id AND n.type = 'admin'
LEFT JOIN users u
    ON n.n_id = u.user_id AND n.type = 'user';
This uses left join to bring the records together from both tables. It then chooses the matching record for the select.
To select a particular manager, remove the semicolon and add:
where n.manager = 2;
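The LEFT JOIN plus COALESCE approach can be tried out on the tables from the question with a short sketch like the following, which uses Python's sqlite3 module as a stand-in for MySQL. Selecting n.id and ordering by it are additions made here so the output is easy to compare with the expected result in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE admins (admin_id INTEGER PRIMARY KEY, name TEXT,
                         country TEXT);
    CREATE TABLE notes (id INTEGER PRIMARY KEY, n_id INTEGER, note TEXT,
                        comment TEXT, country TEXT, type TEXT,
                        manager INTEGER);
    INSERT INTO users VALUES (1, 'Joseph', 'US'), (2, 'Kennedy', 'US'),
                             (3, 'Dale', 'UK');
    INSERT INTO admins VALUES (1, 'David', 'UK'), (2, 'Ryan', 'US'),
                              (3, 'Paul', 'UK');
    INSERT INTO notes VALUES
        (1, 3, 'This is the 1st note', 'First',  'US', 'admin', 2),
        (2, 2, 'This is the 2nd note', 'Second', 'US', 'user',  1),
        (3, 2, 'This is the 3rd note', 'Third',  'UK', 'user',  2);
""")

# Each note matches at most one of the two joins (because of the type
# check), so COALESCE picks whichever name is not NULL.
BASE = """
    SELECT n.id, n.note, COALESCE(a.name, u.name) AS name
    FROM notes n
    LEFT JOIN admins a ON n.n_id = a.admin_id AND n.type = 'admin'
    LEFT JOIN users  u ON n.n_id = u.user_id  AND n.type = 'user'
"""

all_notes = conn.execute(BASE + " ORDER BY n.id").fetchall()
manager_2 = conn.execute(
    BASE + " WHERE n.manager = ? ORDER BY n.id", (2,)
).fetchall()

print(all_notes)
print(manager_2)
```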

If you expect admins and users in one result, you have several options. The simplest way is to use a union select like this, where each branch filters on notes.type so that a note is only matched against the table it belongs to:
SELECT
    notes.note,
    notes.comment,
    notes.country,
    admins.name,
    admins.country
FROM
    notes JOIN admins ON notes.n_id = admins.admin_id
WHERE
    notes.type = 'admin' AND notes.manager = 2
UNION ALL
SELECT
    notes.note,
    notes.comment,
    notes.country,
    users.name,
    users.country
FROM
    notes JOIN users ON notes.n_id = users.user_id
WHERE
    notes.type = 'user' AND notes.manager = 2
