Dropping all duplicate rows in MySQL 5.7.9?

I want to drop all rows in a MySQL table that have a duplicate, using GROUP BY. My table has the fields name, date, position, and email, and looks like
+----------+---------------+----------+--------------------+
| M | 1976-10-03 | 1 | m#gmail |
| R | 1982-03-26 | 2 | r#gmail.com |
| C | 1987-09-03 | 3 | c#gmail.com |
| M | 1976-10-03 | 1 | m#gmail |
+----------+---------------+----------+--------------------+
I want to get
+----------+---------------+----------+--------------------+
| R | 1982-03-26 | 2 | r#gmail.com |
| C | 1987-09-03 | 3 | c#gmail.com |
+----------+---------------+----------+--------------------+
My attempt (from the answers to similar questions)
DELETE FROM ts1 WHERE * IN (SELECT * FROM ts1 GROUP BY * HAVING COUNT(*)>1);
Where are the errors? I understand I'm using too many *, but I want to avoid naming all the columns because there are too many in my actual table. Notice that I want to check for duplicates over the entire row.

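You can't use GROUP BY *; you have to name the column to group on, for example GROUP BY name. MySQL also refuses to let a DELETE read from its own target table in a subquery (error 1093), so wrap the grouped query in a derived table:
DELETE FROM ts1 WHERE name IN (SELECT name FROM (SELECT name FROM ts1 GROUP BY name HAVING COUNT(*)>1) AS dup);
Note that this assumes that users have unique names.
So you may actually want to check their emails instead:
DELETE FROM ts1 WHERE email IN (SELECT email FROM (SELECT email FROM ts1 GROUP BY email HAVING COUNT(*)>1) AS dup);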
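To compare more than one column, as the question asks, a row constructor can be used with IN. The following is only a sketch under the assumption that the table has exactly the four columns shown in the question; MySQL has no shorthand for "all columns" here, so they still have to be listed. The derived table alias `dup` is a name chosen for illustration.

```sql
-- Sketch: delete every row whose (name, date, position, email)
-- combination occurs more than once in ts1.
-- The inner grouped query is wrapped in a derived table (dup) because
-- MySQL rejects a DELETE whose subquery reads the target table directly.
DELETE FROM ts1
WHERE (`name`, `date`, `position`, `email`) IN (
    SELECT `name`, `date`, `position`, `email`
    FROM (
        SELECT `name`, `date`, `position`, `email`
        FROM ts1
        GROUP BY `name`, `date`, `position`, `email`
        HAVING COUNT(*) > 1
    ) AS dup
);
```

Rows that contain NULL in any of the compared columns will not match, because a comparison with NULL is never true; such rows would need separate handling.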

Related

Selecting values from second column alongside the values from first column in the same row

I am trying to get values matching the value from the second column. For example, I want to know who is the sender for Bill Gates by only using IDs.
I have two tables,
*users* table
| user_ID | Full_name |
| -------- | -------------- |
| 1 | Steve Jobs |
| 2 | Bill Gates |
| 3 | Elon Musk |
*relationships* table (both columns are foreign keys)
| user_sender | user_receiver |
| ------------ | -------------- |
| 1 | 2 |
| 3 | 1 |
| 3 | 2 |
I want to select, based on the "user_receiver" column, the matching values in the "user_sender" column.
For example, I want to know who the user_sender is for user_receiver 2.
OUTPUT:
| | |
| ------------ | -------------- |
| 1 | 2 |
| 3 | 2 |
You need to join the tables and select the rows you want.
You can access all columns of both tables by addressing them through their aliases:
SELECT u.user_ID , u.Full_name,r.user_receiver
FROM users u JOIN
relationships r ON u.user_ID = r.user_sender
WHERE r.user_receiver = 2
If you want to look up by name, join relationships to users twice, once for the sender and once for the receiver:
SELECT
rel.user_sender
, rel.user_receiver
-- , sender.Full_name AS sender_name
-- , receiver.Full_name AS receiver_name
FROM relationships AS rel
JOIN users AS sender ON sender.user_ID = rel.user_sender
JOIN users AS receiver ON receiver.user_ID = rel.user_receiver
WHERE receiver.Full_name = 'Bill Gates'
If you already know the user_receiver number and only want the IDs:
SELECT *
FROM relationships
WHERE user_receiver = 2

Update column based on duplicate columns

I have 2 tables:
account
+----+---------------------+
| Id | Email |
+----+---------------------+
| 1 | "test#example.com" |
| 2 | "test2#example.com" |
| 3 | "test3#example.com" |
| 4 | "test#example.com" |
+----+---------------------+
character
+----+-----------+
| Id | AccountId |
+----+-----------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 3 |
| 5 | 4 |
+----+-----------+
character.AccountId is a FK to account.Id. Both Id columns are PK's to their respective tables.
I need to update the character table such that the new AccountId matches a row in account with the lowest account's Id but with the same Email as the currently set AccountId.
For example, in the mock data presented above all accounts have unique emails except account Id 1 and 4, they both share test#example.com as email.
This means that after the update, the rows in the character table should stay the same except for the row with Id = 5, this row has an AccountId = 4 and this account shares an email with an account that has a lower account Id, namely Id 1. So the result output should be :
+----+-----------+
| Id | AccountId |
+----+-----------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 3 |
| 5 | 1 |
+----+-----------+
After the operation.
I've got this to work with a TRIGGER BEFORE INSERT on character to check if there are lower accountIds but can't get this to work with a simple UPDATE:
UPDATE `character` SET `AccountId` = (SELECT MIN(Id) FROM `account` WHERE ?);
I've thought of maybe making a temporary table to store the account Id and the email with a GROUP BY email but this also fails. It being MySQL I can't use MERGE either.
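You can use a window function (available from MySQL 8.0) to get the minimum id per email. Then just join that in. Note that character is a reserved word, so the table name needs backticks:
UPDATE `character` c JOIN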
(SELECT a.*,
MIN(id) OVER (PARTITION BY a.email) as min_id
FROM account a
) a
ON c.accountId = a.id
SET c.accountId = a.min_id
WHERE c.accountId <> a.min_id;
This is fairly straightforward without window functions, too; you want to find accounts with the same email and lower ids (the a2 join) and make sure you've got the lowest (the a3 join):
update `character` c
join account a on a.Id=c.AccountId
join account a2 on a2.Email=a.Email and a2.Id < a.Id
left join account a3 on a3.Email=a2.Email and a3.Id < a2.Id
set c.AccountId=a2.Id
where a3.Id is null;
(As usual, the left join ... where ... is null could be replaced with a where not exists (...), if you prefer the query to show the intent more clearly instead of the probable query plan more clearly.)
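For reference, the NOT EXISTS form mentioned above might look like the following. This is a sketch that uses the table and column names from the question (`character`.AccountId, account.Id, account.Email); the aliases a, a2 and a3 are kept from the join version.

```sql
-- Sketch: same update, with the "a2 is the lowest Id for this email"
-- check written as NOT EXISTS instead of LEFT JOIN ... IS NULL.
UPDATE `character` c
JOIN account a  ON a.Id = c.AccountId
JOIN account a2 ON a2.Email = a.Email AND a2.Id < a.Id
SET c.AccountId = a2.Id
WHERE NOT EXISTS (
    -- no account with the same email and an even lower Id
    SELECT 1
    FROM account a3
    WHERE a3.Email = a2.Email
      AND a3.Id < a2.Id
);
```

With the sample data from the question, only the character row with Id 5 has a matching a2 (account 1), so it should be the only row that changes.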

MySQL - Select values and remove duplicates by table name

I have two tables which have the same structure but different names (the first table stores default values, the second stores values saved by the user).
I select these values using union all:
SELECT * FROM `table_default` UNION ALL SELECT * FROM `table_saved`
Structure of table_default:
| ID | SOME_VAL |
| 1 | def_val |
| 2 | def_val |
| 3 | def_val |
Structure of table_saved:
| ID | SOME_VAL |
| 1 | test |
| 3 | text |
And now, when I use this query:
SELECT * FROM `table_default` UNION ALL SELECT * FROM `table_saved`
I got:
| ID | SOME_VAL |
| 1 | def_val |
| 2 | def_val |
| 3 | def_val |
| 1 | test |
| 3 | text |
But I want to get unique values by ID. table_saved is more important, so when the select returns duplicates I always want to drop the record from table_default.
So finally I want to get:
| ID | SOME_VAL |
| 2 | def_val | --> from TABLE_DEFAULT because this ID does not exist in table_saved
| 1 | test | --> from TABLE_SAVED
| 3 | text | --> from TABLE_SAVED
I can't use GROUP BY id because I can't control which record is removed (sometimes GROUP BY drops the duplicate from table_default, but sometimes it drops the one from table_saved).
Is it possible to remove duplicates (something like GROUP BY) based on the table a row comes from? Or maybe somebody has another idea. Please help.
Thanks.
If I understand correctly, you want to always retain all records from table_saved, plus records from table_default having IDs not appearing in table_saved. One approach is to use a left join to find the unique records from table_default. Then union that with all records from table_saved.
SELECT t1.ID, t1.SOME_VAL
FROM table_default t1
LEFT JOIN table_saved t2
ON t1.ID = t2.ID
WHERE t2.ID IS NULL
UNION ALL
SELECT ID, SOME_VAL
FROM table_saved;
If a default value is always present you could use a LEFT JOIN and COALESCE:
SELECT d.ID, COALESCE(s.SOME_VAL, d.SOME_VAL) AS SOME_VAL
FROM table_default d
LEFT JOIN table_saved s USING(ID)

MySql use default value if searched doesn't exist

I'm working with MySQL
I have an Actions_Table which holds an action and a number of users.
I also have a Timing_Table which maps each action to the time it takes.
I can match up the action in the Actions table to the Timing table, but I want it to use a default time if there is no exact match, e.g. with these tables:
Actions_Table
------------------------------
| Action | No Ids |
------------------------------
|Delete ID | 5 |
|Install App1 | 1 |
|Create ID | 1 |
|Rename ID | 2 |
------------------------------
Timing_Table
-------------------
|Action |Time |
-------------------
|Delete ID | 100 |
|Install App1| 200 |
|Create ID | 50 |
|Default | 60 |
--------------------
As there is nothing listed for "Rename ID" in the Timing_Table, I want it to use the time value for 'Default' instead, so I will have something like this.
-------------------------------------
| Action | No Ids | Total Time|
-------------------------------------
|Delete ID | 5 | 500 |
|Install App1 | 1 | 200 |
|Create ID | 1 | 50 |
|Rename ID | 2 | 120 | <== value was calculated from Default
-------------------------------------
The basic code
Select a.Action, a.`No Ids`, (a.`No Ids` * b.time) as `TotalTime`
From Actions_Table a, Timing_Table b
Where a.Action = b.Action
However, that won't fall back to Default for actions without a match.
What you want is a left join and a cross join:
Select a.Action, a.`No Ids`,
coalesce(b.time, def.time) as ImputedBTime,
(a.`No Ids` * coalesce(b.time, def.time)) as `TotalTime`
From Actions_Table a left join
Timing_Table b
on a.Action = b.Action cross join
(select t.* from Timing_Table t where t.action = 'Default') def
Simple rule: Never use commas in the FROM clause. Always use explicit JOIN syntax. You should learn the different types of joins.
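To try the left join plus cross join approach against the sample data from the question, a small script along these lines could be used. The column types are assumptions (the question does not state them), and the table names follow the question's table listings.

```sql
-- Assumed schema: types are guesses, names come from the question.
CREATE TABLE Actions_Table (`Action` VARCHAR(50), `No Ids` INT);
CREATE TABLE Timing_Table  (`Action` VARCHAR(50), `Time` INT);

INSERT INTO Actions_Table VALUES
    ('Delete ID', 5), ('Install App1', 1), ('Create ID', 1), ('Rename ID', 2);
INSERT INTO Timing_Table VALUES
    ('Delete ID', 100), ('Install App1', 200), ('Create ID', 50), ('Default', 60);

-- b supplies the exact match (NULL when there is none);
-- def is the single 'Default' row, attached to every action.
SELECT a.`Action`,
       a.`No Ids`,
       a.`No Ids` * COALESCE(b.`Time`, def.`Time`) AS `Total Time`
FROM Actions_Table a
LEFT JOIN Timing_Table b ON a.`Action` = b.`Action`
CROSS JOIN (SELECT t.`Time`
            FROM Timing_Table t
            WHERE t.`Action` = 'Default') def;
```

With this data the totals come out as 500, 200, 50 and 120, the last one computed from the Default time. One caveat of the cross join: if Timing_Table has no 'Default' row, the derived table is empty and the whole query returns no rows.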

Selecting a column from a table in MySQL twice

I have table a which stores the user id and the ids of his origin and destination. On table b I have the location id and the specific name of the place. What I'm trying to do is join the tables but the name column from table b will have to be used twice since I'm trying to get 2 addresses. I'm trying to read up on MySQL but just keep doing it wrong. Any help would be appreciated.
table a
------------------------
| uid | to | from |
------------------------
| 1 | 1 | 2 |
------------------------
table b
---------------
| lid | name |
---------------
| 1 | one |
---------------
| 2 | two |
---------------
/what I'm trying to achieve/
------------------------------------------
|a.uid | a.to | b.name | a.from | b.name |
------------------------------------------
| 1 | 1 | one | 2 | two |
------------------------------------------
You will have to join table b twice, each time under a different alias (b1, b2) using AS:
select *
from a join b as b1 on a.to = b1.lid
join b as b2 on a.from = b2.lid
so the result would be
--------------------------------------------
|a.uid | a.to | b1.name | a.from | b2.name |
--------------------------------------------
| 1 | 1 | one | 2 | two |
--------------------------------------------
But you probably also want to avoid the name clash between the two name columns (for example if you read the result from PHP), so rename the columns as well:
select a.*, b1.name as toName, b2.name as fromName
... (rest of the query as above)
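Written out in full, the aliased version might look like this. It is a sketch based on the tables in the question; toName and fromName are just illustrative alias names, and the backticks around to and from are added as a precaution because both are reserved words.

```sql
-- Join b twice: b1 resolves a.to, b2 resolves a.from.
SELECT a.uid,
       a.`to`,
       b1.name AS toName,
       a.`from`,
       b2.name AS fromName
FROM a
JOIN b AS b1 ON a.`to`   = b1.lid
JOIN b AS b2 ON a.`from` = b2.lid;
```

Each output column now has a distinct name, so a client that returns rows as associative arrays will not overwrite one name with the other.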
If you only ever need the name twice, just join table b twice.
But it looks like you could have any number of locations between a.from and a.to, and in that case I would suggest doing this in two or more queries:
one to get the row from a, and then one to get all rows in b that are between a.from and a.to.