Update column based on duplicate columns - mysql

I have 2 tables:
account
+----+---------------------+
| Id | Email |
+----+---------------------+
| 1 | "test#example.com" |
| 2 | "test2#example.com" |
| 3 | "test3#example.com" |
| 4 | "test#example.com" |
+----+---------------------+
character
+----+-----------+
| Id | AccountId |
+----+-----------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 3 |
| 5 | 4 |
+----+-----------+
character.AccountId is a FK to account.Id. Both Id columns are PK's to their respective tables.
I need to update the character table so that each AccountId points to the account with the lowest Id among all accounts that share the same Email as the currently set AccountId.
For example, in the mock data presented above, all accounts have unique emails except accounts 1 and 4, which both have test#example.com as their email.
This means that after the update, the rows in the character table should stay the same except for the row with Id = 5. That row has AccountId = 4, and this account shares an email with an account that has a lower Id, namely Id 1. So the resulting table should be:
+----+-----------+
| Id | AccountId |
+----+-----------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 3 |
| 5 | 1 |
+----+-----------+
After the operation.
I've got this to work with a TRIGGER BEFORE INSERT on character to check if there are lower accountIds but can't get this to work with a simple UPDATE:
UPDATE `character` SET `AccountId` = (SELECT MIN(Id) FROM `account` WHERE ?);
I've thought of maybe making a temporary table to store the account Id and the email with a GROUP BY email but this also fails. It being MySQL I can't use MERGE either.

In MySQL 8.0 or later, you can use a window function to get the minimum id per email. Then just join that in (note that character is a reserved word, so it needs backticks):
UPDATE `character` c JOIN
     (SELECT a.*,
             MIN(a.id) OVER (PARTITION BY a.email) AS min_id
      FROM account a
     ) a
     ON c.accountId = a.id
    SET c.accountId = a.min_id
    WHERE c.accountId <> a.min_id;
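To check the expected result against the mock data from the question, here is a minimal sketch that runs on an in-memory SQLite database through Python's sqlite3 module, used here only as a stand-in for MySQL. Because SQLite's UPDATE syntax differs from MySQL's, it uses a correlated subquery rather than the window function and join shown above; the function name reassign_to_lowest_account is made up for illustration.

```python
import sqlite3


def reassign_to_lowest_account(conn):
    """Point each character at the lowest account Id that shares its account's Email."""
    conn.execute(
        """
        UPDATE "character"
        SET AccountId = (
            SELECT MIN(a2.Id)
            FROM account a1
            JOIN account a2 ON a2.Email = a1.Email
            WHERE a1.Id = "character".AccountId
        )
        """
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (Id INTEGER PRIMARY KEY, Email TEXT);
    CREATE TABLE "character" (
        Id INTEGER PRIMARY KEY,
        AccountId INTEGER REFERENCES account(Id)
    );
    INSERT INTO account VALUES
        (1, 'test#example.com'), (2, 'test2#example.com'),
        (3, 'test3#example.com'), (4, 'test#example.com');
    INSERT INTO "character" VALUES (1, 1), (2, 1), (3, 2), (4, 3), (5, 4);
    """
)

reassign_to_lowest_account(conn)
rows = conn.execute('SELECT Id, AccountId FROM "character" ORDER BY Id').fetchall()
print(rows)
```

Only the character with Id = 5 changes, which matches the table the question asks for.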

This is fairly straightforward without window functions, too; you want to find accounts with the same email and lower ids (the a2 join) and make sure you've got the lowest (the a3 join):
update `character` c
join account a on a.Id=c.AccountId
join account a2 on a2.Email=a.Email and a2.Id < a.Id
left join account a3 on a3.Email=a2.Email and a3.Id < a2.Id
set c.AccountId=a2.Id
where a3.Id is null;
(As usual, the left join ... where ... is null could be replaced with a where not exists (...), if you prefer the query to show the intent more clearly instead of the probable query plan more clearly.)

Related

Mysql, Multiple Left Join - 3 Tables

I have some issues with multiple joins in MySQL
I have 3 tables:
cms_data_company
cms_data_company_categories
cms_datasrc_category
Sample records from: cms_data_company:
+----+-----------+------------+
| id | name | address |
+----+-----------+------------+
| 1 | Name1 | Samplestr1 |
+----+-----------+------------+
| 2 | Name2 | Samplestr2 |
+----+-----------+------------+
| 3 | Name3 | Samplestr3 |
+----+-----------+------------+
Sample records from cms_data_company_categories (it contains the company_id and category_id fields; note that there can be several records for one company_id):
+----+-----------+------------+
| id | company_id| category_id|
+----+-----------+------------+
| 1 | 2 | 14 |
+----+-----------+------------+
| 2 | 2 | 11 |
+----+-----------+------------+
| 3 | 1 | 15 |
+----+-----------+------------+
Sample records from cms_datasrc_category (the catch here is that I need only the rows where datasrc = 1 AND parent = 0):
+----+-----------+------------+-----------+
| id | datasrc | parent |name |
+----+-----------+------------+-----------+
| 1 | 1 | 0 |category1 |
+----+-----------+------------+-----------+
| 2 | 2 | 0 |category2 |
+----+-----------+------------+-----------+
| 3 | 3 | 5 |category3 |
+----+-----------+------------+-----------+
What I would like to receive is:
All fields from cms_data_company and the name field from cms_datasrc_category
I need to join them as follows:
Match id from cms_data_company with company_id from cms_data_company_categories
Then match category_id from cms_data_company_categories with id from cms_datasrc_category (only those records where datasrc=1 and parent=0)
Return name as a new column along with all fields from cms_data_company
It may be messy, but my MySQL statement is as follows:
select * ,cms_datasrc_category.name_en
from cms_data_company
LEFT JOIN cms_data_company_categories
ON cms_data_company.id = cms_data_company_categories.company_id
LEFT JOIN cms_datasrc_category
ON cms_data_company_categories.category_id = cms_datasrc_category.id
WHERE cms_datasrc_category.datasrc = 1 AND cms_datasrc_category.parent = 0
It seems to work, but it returns only the records from cms_data_company for which the query finds a match. I would like to change my statement so that it shows NULLs when there are no matching rows.
Is that because WHERE applies to the whole query?
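When you use left joins, conditions on all but the first table should be in the on clauses. A condition in the where clause is evaluated after the join, so it rejects the rows where the joined columns are NULL and effectively turns the left join into an inner join:
SELECT *, cdc.name_en
FROM cms_data_company dc LEFT JOIN
     cms_data_company_categories c
     ON dc.id = c.company_id LEFT JOIN
     cms_datasrc_category cdc
     ON c.category_id = cdc.id AND
        cdc.datasrc = 1 AND cdc.parent = 0;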
Notes:
Table aliases make a query easier to write and to read.
You should use table aliases for all column references, when your query has more than one table.
The select * already selects all columns from all tables. There is no need to include another column. Or, better yet, list the columns you really need.
The filtering conditions are on the last table, so they are now in the on clause.
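The difference between filtering in WHERE and filtering in ON can be seen in a small sketch. It uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL, and it selects the name column from the sample table. The category_id values (1, 2, 3) are an assumption chosen so that the link rows actually match the sample categories; the sample data in the question uses ids that do not match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cms_data_company (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE cms_data_company_categories (
        id INTEGER PRIMARY KEY, company_id INTEGER, category_id INTEGER);
    CREATE TABLE cms_datasrc_category (
        id INTEGER PRIMARY KEY, datasrc INTEGER, parent INTEGER, name TEXT);
    INSERT INTO cms_data_company VALUES
        (1, 'Name1', 'Samplestr1'), (2, 'Name2', 'Samplestr2'), (3, 'Name3', 'Samplestr3');
    -- Assumed category ids so that the joins find matches.
    INSERT INTO cms_data_company_categories VALUES (1, 2, 1), (2, 2, 2), (3, 1, 3);
    INSERT INTO cms_datasrc_category VALUES
        (1, 1, 0, 'category1'), (2, 2, 0, 'category2'), (3, 3, 5, 'category3');
    """
)

JOINS = """
    FROM cms_data_company dc
    LEFT JOIN cms_data_company_categories c ON dc.id = c.company_id
    LEFT JOIN cms_datasrc_category cdc ON c.category_id = cdc.id
"""

# Filter in WHERE: rows without a matching category are discarded.
where_rows = conn.execute(
    "SELECT dc.id, cdc.name" + JOINS +
    " WHERE cdc.datasrc = 1 AND cdc.parent = 0 ORDER BY dc.id, c.id"
).fetchall()

# Filter in ON: every company is kept, with NULL where nothing matches.
on_rows = conn.execute(
    "SELECT dc.id, cdc.name" + JOINS +
    " AND cdc.datasrc = 1 AND cdc.parent = 0 ORDER BY dc.id, c.id"
).fetchall()

print(where_rows)
print(on_rows)
```

With the filter in WHERE only company 2 survives; with the filter in ON, companies 1 and 3 appear with a NULL category name. Company 2 appears twice because it has two link rows, one of which fails the filter.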

How to use JOIN instead of comma?

I have this query:
INSERT INTO Votes (id_post,id_user)
SELECT ?,?
FROM Posts p, Users u
WHERE p.id_user = :id_author
AND u.id = $_SESSION['id']
AND u.active = 1
limit 1;
Now I want to use JOIN instead of ,. But there isn't any common column between those two tables. So what should I write in ON clause?
What I'm trying to do:
I have three tables:
// Posts
+----+----------+---------------+-----------+
| id | title | content | id_author |
+----+----------+---------------+-----------+
| 1 | title1 | content1 | 1234 |
| 2 | title2 | content2 | 5678 |
+----+----------+---------------+-----------+
// ^ the id of post's author
// Users
+----+--------+--------+
| id | name | active |
+----+--------+--------+
| 1 | jack | 1 |
| 2 | peter | 0 |
| 3 | John | 1 |
+----+--------+--------+
// Votes
+----+---------+---------+
| id | id_post | id_user |
+----+---------+---------+
| 1 | 32 | 1234 |
| 2 | 634 | 5678 |
| 3 | 352 | 1234 |
+----+---------+---------+
// ^ the id of current user
Now I need to check two conditions before inserting a new vote into Votes table:
Is the id of the author the same as what I pass as id_author? Posts.id_user = :id_author (I know I could do that with a FK, but I don't want to)
Is the account of the current user active? Users.active = 1
To sum up: I'm trying to prevent inactive users (active = 0) from voting. For example, if Stack Overflow bans you, then you can no longer vote on posts, because you (the current user) are banned. So I'm pretty sure $_SESSION['id'] should be used in the query to identify the current user.
I suggest using exists instead of join:
INSERT INTO Votes (id_post, id_user)
SELECT id_post, id_user FROM (SELECT ? id_post, ? id_user) a
WHERE EXISTS (
SELECT 1 FROM Users
WHERE id = ?
AND active = 1
) AND EXISTS (
SELECT 1 FROM posts
WHERE id_user = :id_author
)
You already have a join here! This is an implicit join.
"INNER JOIN and , (comma) are semantically equivalent in the absence of a join condition: both produce a Cartesian product between the specified tables (that is, each and every row in the first table is joined to each and every row in the second table)."
So there isn't a need for you to 'introduce' a join here.
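As a rough illustration of the EXISTS approach, the sketch below runs a guarded insert against an in-memory SQLite database through Python's sqlite3 module (a stand-in for MySQL and PDO). Several things here are assumptions: the function name cast_vote, the use of named parameters throughout, the id_author column name taken from the Posts table shown in the question, and the extra check that the post id matches.

```python
import sqlite3


def cast_vote(conn, post_id, author_id, user_id):
    """Insert a vote only if the user is active and the post belongs to author_id."""
    conn.execute(
        """
        INSERT INTO Votes (id_post, id_user)
        SELECT :post_id, :user_id
        WHERE EXISTS (SELECT 1 FROM Users WHERE id = :user_id AND active = 1)
          AND EXISTS (SELECT 1 FROM Posts
                      WHERE id = :post_id AND id_author = :author_id)
        """,
        {"post_id": post_id, "author_id": author_id, "user_id": user_id},
    )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Posts (id INTEGER PRIMARY KEY, title TEXT, content TEXT, id_author INTEGER);
    CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT, active INTEGER);
    CREATE TABLE Votes (id INTEGER PRIMARY KEY, id_post INTEGER, id_user INTEGER);
    INSERT INTO Posts VALUES (1, 'title1', 'content1', 1234), (2, 'title2', 'content2', 5678);
    INSERT INTO Users VALUES (1, 'jack', 1), (2, 'peter', 0), (3, 'John', 1);
    """
)

cast_vote(conn, post_id=1, author_id=1234, user_id=1)  # active user, correct author
cast_vote(conn, post_id=1, author_id=1234, user_id=2)  # inactive user: no row inserted
cast_vote(conn, post_id=2, author_id=1234, user_id=3)  # wrong author: no row inserted

votes = conn.execute("SELECT id_post, id_user FROM Votes ORDER BY id").fetchall()
print(votes)
```

Only the first call inserts a row, because the other two fail one of the EXISTS checks and the SELECT then produces no rows to insert.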

Dropping all duplicate rows in mySQL 5.7.9?

I want to drop all rows in a MySQL table that have a duplicate, using GROUP BY. My table has the fields name, date, position, email and looks like
+----------+---------------+----------+--------------------+
| M | 1976-10-03 | 1 | m#gmail |
| R | 1982-03-26 | 2 | r#gmail.com |
| C | 1987-09-03 | 3 | c#gmail.com |
| M | 1976-10-03 | 1 | m#gmail |
+----------+---------------+----------+--------------------+
I want to get
+----------+---------------+----------+--------------------+
| R | 1982-03-26 | 2 | r#gmail.com |
| C | 1987-09-03 | 3 | c#gmail.com |
+----------+---------------+----------+--------------------+
My attempt (from the answers to similar questions)
DELETE FROM ts1 WHERE * IN (SELECT * FROM ts1 GROUP BY * HAVING COUNT(*)>1);
Where are the errors? I understand I'm using too many * but I want to avoid naming all columns because they are too many in my actual table. Notice that I want to check for duplicates over the entire row.
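You can't use GROUP BY *; you have to name the columns, for example GROUP BY name. MySQL also rejects a DELETE whose subquery selects directly from the table being deleted from (error 1093), so wrap the subquery in a derived table:
DELETE FROM ts1 WHERE name IN (SELECT name FROM (SELECT name FROM ts1 GROUP BY name HAVING COUNT(*)>1) AS dup);
Note that this assumes that users have unique names.
So you may actually want to check their emails instead:
DELETE FROM ts1 WHERE email IN (SELECT email FROM (SELECT email FROM ts1 GROUP BY email HAVING COUNT(*)>1) AS dup);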
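Since the question asks for duplicates over the entire row without typing every column, one possible approach is to read the column list from the database and build the statement from it. The sketch below does this against an in-memory SQLite database through Python's sqlite3 module; PRAGMA table_info is SQLite-specific, and on MySQL the column names would have to come from somewhere else, such as information_schema.columns. It is an illustration under those assumptions, not a drop-in MySQL solution, and rows containing NULL values would not be matched by the row comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ts1 (name TEXT, date TEXT, position INTEGER, email TEXT);
    INSERT INTO ts1 VALUES
        ('M', '1976-10-03', 1, 'm#gmail'),
        ('R', '1982-03-26', 2, 'r#gmail.com'),
        ('C', '1987-09-03', 3, 'c#gmail.com'),
        ('M', '1976-10-03', 1, 'm#gmail');
    """
)

# Read the column names instead of writing them by hand (SQLite-specific).
columns = [row[1] for row in conn.execute("PRAGMA table_info(ts1)")]
col_list = ", ".join(f'"{c}"' for c in columns)

# Delete every row whose full set of column values occurs more than once.
conn.execute(
    f"""
    DELETE FROM ts1
    WHERE ({col_list}) IN (
        SELECT {col_list} FROM ts1
        GROUP BY {col_list}
        HAVING COUNT(*) > 1
    )
    """
)

remaining = conn.execute("SELECT * FROM ts1 ORDER BY position").fetchall()
print(remaining)
```

Both copies of the M row are removed, leaving only the R and C rows, as in the desired output.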

Match data against one column in mysql

Here is the sqlFiddle
I want to filter the users by the entities they have selected. For example, if I filter on the entities with ids "1" and "3", I expect to get only the users who have both of these entities.
The number of selected entities can vary.
The query I am using is
SELECT user_id from user_entities where entity_id IN(1,3)
but, for obvious reasons, it matches these rows:
+----+-----------+---------+--------+
| ID | ENTITY_ID | USER_ID | STATUS |
+----+-----------+---------+--------+
| 1 | 1 | 3 | 1 |
| 2 | 3 | 3 | 1 |
| 7 | 1 | 2 | 1 |
| 29 | 3 | 1 | 1 |
+----+-----------+---------+--------+
If I apply DISTINCT to it, it gives me user ids 1, 2 and 3, but I only want user 3, as this is the only user who has both entities.
What can be modified to get the exact results?
You could join the table to itself specifying both IDs as part of the join condition:
SELECT e1.user_id
FROM user_entities e1
INNER JOIN user_entities e2
ON e1.user_id = e2.user_id AND
e1.entity_id = 1 AND
e2.entity_id = 3;
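The self join needs one extra join per entity, and the question says the number of entities can vary. A common alternative is to group by user and require that the number of distinct matched entities equals the number of requested ones. The sketch below tries this on an in-memory SQLite database through Python's sqlite3 module (a stand-in for MySQL), using the four rows shown in the question; the function name users_with_all_entities is hypothetical.

```python
import sqlite3


def users_with_all_entities(conn, entity_ids):
    """Return the ids of users that have every entity in entity_ids."""
    wanted = sorted(set(entity_ids))
    if not wanted:
        return []  # an empty IN () list is not valid SQL in MySQL
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT user_id
        FROM user_entities
        WHERE entity_id IN ({placeholders})
        GROUP BY user_id
        HAVING COUNT(DISTINCT entity_id) = ?
        ORDER BY user_id
    """
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_entities (
        id INTEGER PRIMARY KEY, entity_id INTEGER, user_id INTEGER, status INTEGER);
    INSERT INTO user_entities VALUES
        (1, 1, 3, 1), (2, 3, 3, 1), (7, 1, 2, 1), (29, 3, 1, 1);
    """
)

both = users_with_all_entities(conn, [1, 3])
only_first = users_with_all_entities(conn, [1])
print(both, only_first)
```

COUNT(DISTINCT entity_id) guards against a user having the same entity recorded twice, which would otherwise inflate the count.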

Delete duplicate rows and add the deleted rows values to one that remain

I have a table like :
id | name | profit | cost
---------------------------------------
1 | aaa | 4 | 2
2 | aaa | 4 | 3
3 | aaa | 4 | 2
4 | bbb | 4 | 1
I want to delete the duplicate rows from this table (duplicates by name),
but before deleting them I want to add the values of the deleted rows to the row that remains.
So in this case the table should look like this after running the queries:
id | name | profit | cost
---------------------------------------
1 | aaa | 12 | 7
4 | bbb | 4 | 1
Is it possible to do this in MySQL without creating another table and copying the data? This is a big table (1 million rows, and growing every day).
First, update the row with min(id) for each name so that it holds the totals for that name:
UPDATE T a JOIN
(
SELECT min(ID) as minID,name,SUM(profit) as SP,SUM(cost) as SC
FROM T GROUP BY name
) b
ON a.id = b.minID
SET a.profit = b.sp,a.cost=b.sc;
Then delete every row except the one with min(id) for each name. Run both statements in the same transaction so that no rows are inserted between them:
DELETE T
FROM T
LEFT JOIN
(
SELECT min(ID) minid ,name FROM T GROUP BY name
) b
ON T.id = b.minid
WHERE b.minid is NULL;
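The two steps can be checked against the sample table with a short sketch. It runs on an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL, so the statements use correlated subqueries and NOT IN rather than MySQL's multi-table UPDATE and DELETE syntax. The function name merge_duplicates is made up for illustration.

```python
import sqlite3


def merge_duplicates(conn):
    """Sum profit and cost into the lowest-id row per name, then drop the other rows."""
    with conn:  # both statements commit together or roll back together
        conn.execute(
            """
            UPDATE T
            SET profit = (SELECT SUM(d.profit) FROM T d WHERE d.name = T.name),
                cost   = (SELECT SUM(d.cost)   FROM T d WHERE d.name = T.name)
            WHERE id IN (SELECT MIN(id) FROM T GROUP BY name)
            """
        )
        conn.execute(
            "DELETE FROM T WHERE id NOT IN (SELECT MIN(id) FROM T GROUP BY name)"
        )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE T (id INTEGER PRIMARY KEY, name TEXT, profit INTEGER, cost INTEGER);
    INSERT INTO T VALUES
        (1, 'aaa', 4, 2), (2, 'aaa', 4, 3), (3, 'aaa', 4, 2), (4, 'bbb', 4, 1);
    """
)

merge_duplicates(conn)
merged = conn.execute("SELECT id, name, profit, cost FROM T ORDER BY id").fetchall()
print(merged)
```

The update has to run before the delete, because the totals are computed from the rows that the second statement removes.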