I want to remove all duplicates where the combination of first name and last name is the same.
Table users:
mysql> select * from users;
+----+------------+-----------+
| id | LastName | FirstName |
+----+------------+-----------+
| 1 | Kowalski | Jan |
| 2 | Malinowski | Marian |
| 3 | Malinowski | Marian |
| 4 | Kowalski | Jan |
| 5 | Malinowski | Marian |
| 6 | Malinowski | Marian |
+----+------------+-----------+
I've created a script:
set @x = 1;
set @previous_name = '';
DELETE FROM users where id IN (SELECT id from (
select id, @previous_name,IF (CONCAT(FirstName, LastName) = @previous_name, @x:= @x + 1, IF(@previous_name:=CONCAT(FirstName, LastName), @x, IF(@x:=1, @x, @x))) as occurance
from users order by CONCAT(FirstName, LastName)
) AS occurance_table where occurance_table.occurance > 1);
but MySQL returns an error:
ERROR 1292 (22007): Truncated incorrect DOUBLE value: 'JanKowalski'
I found a few similar questions, but the solutions there were to remove the word "and" from the syntax.
I want to prepare the database for adding a unique constraint on these two columns, so I want to clear the table of duplicates.
What is the best way to achieve this?
I tried the query mentioned in the answer section.
I believe it does not work. Instead, I have modified the query so that it works:
DELETE FROM users
WHERE id NOT IN
(
SELECT MIN(a.id)
FROM (SELECT * FROM users) a
GROUP BY a.LastName, a.FirstName
)
Please do correct me if I am wrong, @juergen.
There is no need for a script. A single query is enough:
delete u1
from users u1
left join
(
select min(id) as min_id
from users
group by LastName, FirstName
) u2 on u1.id = u2.min_id
where u2.min_id is null
The subselect gets the lowest user id for each unique combination of last name and first name. Left joining to that, every row without a match (u2.min_id is null) is a duplicate and gets deleted.
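For anyone who wants to try the idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. SQLite does not support MySQL's multi-table `DELETE u1 FROM ... JOIN` syntax, so this uses the `NOT IN (SELECT MIN(id) ...)` form instead. The index name `uq_users_name` is my own invention, not something from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT)"
)
conn.executemany(
    "INSERT INTO users (id, LastName, FirstName) VALUES (?, ?, ?)",
    [
        (1, "Kowalski", "Jan"),
        (2, "Malinowski", "Marian"),
        (3, "Malinowski", "Marian"),
        (4, "Kowalski", "Jan"),
        (5, "Malinowski", "Marian"),
        (6, "Malinowski", "Marian"),
    ],
)

# Keep the lowest id of each (LastName, FirstName) group, delete the rest.
conn.execute(
    """
    DELETE FROM users
    WHERE id NOT IN (
        SELECT MIN(id) FROM users GROUP BY LastName, FirstName
    )
    """
)

# With the duplicates gone, the unique constraint can be created.
# "uq_users_name" is a hypothetical index name.
conn.execute("CREATE UNIQUE INDEX uq_users_name ON users (LastName, FirstName)")

remaining = conn.execute(
    "SELECT id, LastName, FirstName FROM users ORDER BY id"
).fetchall()
print(remaining)
```

Only ids 1 and 2 should survive. In MySQL itself the inner select has to be wrapped in a derived table (as in the query above), because MySQL refuses to select from the table it is deleting from in a plain subquery.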
Related
I have a table like this.
I want to get, for each tablename, the row that has the minimum responsetime.
I have tried this query:
select tablename,
index1,
index2,
min(responsetime)
from tableconf
group by tablename
order by responsetime asc
but it doesn't give what I want.
The output that I want is:
+------------------+------------------+--------+--------------+
| tablename | index1 | index2 | responsetime |
+------------------+------------------+--------+--------------+
| salesorderheader | TotalDue | NULL | 6.1555 |
| salesterritory | Name | NULL | 11.66667 |
| store | BusinessEntityId | Name | 3.6222 |
| previous | previous | NULL | 5.03333 |
| NONE | NONE | NULL | 5.6 |
+------------------+------------------+--------+--------------+
What query should I use to get the output that I want?
Select the minimum response time per table name. Use an IN clause on these to get the rows:
select *
from tableconf
where (tablename, responsetime) in
(
select tablename, min(responsetime)
from tableconf
group by tablename
);
(Edited from previous answer)
I don't know whether all SQL dialects accept a comma-separated tuple such as (tablename, responsetime) in a WHERE clause. Another option, building on the currently highest-voted answer, uses a join:
select *
from tableconf t
inner join (
select tablename, min(responsetime) min_rt
from tableconf t2
group by tablename
) t3 on t.tablename = t3.tablename and t.responsetime = t3.min_rt
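A small runnable sketch of the join variant, using Python's sqlite3 as a stand-in for MySQL. The question's source table is not shown, so the sample rows here are partly invented: the two fastest rows come from the desired output, the slower ones are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tableconf "
    "(tablename TEXT, index1 TEXT, index2 TEXT, responsetime REAL)"
)
conn.executemany(
    "INSERT INTO tableconf VALUES (?, ?, ?, ?)",
    [
        ("store", "BusinessEntityId", "Name", 3.6222),
        ("store", "Name", None, 7.5),             # invented slower row
        ("salesterritory", "Name", None, 11.66667),
        ("salesterritory", "Group", None, 12.0),  # invented slower row
    ],
)

# Join every row to the per-table minimum; only the fastest rows match.
rows = conn.execute(
    """
    SELECT t.tablename, t.index1, t.index2, t.responsetime
    FROM tableconf AS t
    INNER JOIN (
        SELECT tablename, MIN(responsetime) AS min_rt
        FROM tableconf
        GROUP BY tablename
    ) AS fastest
      ON t.tablename = fastest.tablename
     AND t.responsetime = fastest.min_rt
    ORDER BY t.tablename
    """
).fetchall()

for row in rows:
    print(row)
```

Note that if two rows of the same table tie for the minimum response time, both are returned.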
I have a MySQL table which has three columns:
Userid | Email                | Points
---------------------------------------------------------
1      | jdoe@company.com     | 20
2      | jdoe%40company.com   | 25
3      | rwhite@company.com   | 14
4      | rwhite%40company.com | 10
What I want to do is delete the duplicate emails and merge their points. I want my table to look like this:
Userid | Email                | Points
---------------------------------------------------------
1      | jdoe@company.com     | 45
3      | rwhite@company.com   | 24
What would my query look like to return my desired table?
Does anyone know how to do this?
Thanks in advance!
Are you looking for something like this?
SELECT MIN(userid) userid, email, SUM(points) points
FROM
(
SELECT userid, REPLACE(email, '%40', '@') email, points
FROM table1
) q
GROUP BY email
Output:
| USERID | EMAIL              | POINTS |
|--------|--------------------|--------|
| 1      | jdoe@company.com   | 45     |
| 3      | rwhite@company.com | 24     |
Here is SQLFiddle demo
Now if you want to deduplicate your table in-place you can do
-- Fix emails
UPDATE table1
SET email = REPLACE(email, '%40', '@')
WHERE email LIKE '%\%40%';
-- Merge points for duplicate records
UPDATE table1 t JOIN
(
SELECT email, SUM(points) points
FROM table1
GROUP BY email
HAVING COUNT(*) > 1
) q ON t.email = q.email
SET t.points = q.points;
-- Delete all duplicate records except ones with lowest `userid`
DELETE t
FROM table1 t JOIN
(
SELECT MIN(userid) userid, email
FROM table1
GROUP BY email
HAVING COUNT(*) > 1
) q ON t.email = q.email
WHERE t.userid <> q.userid;
Here is SQLFiddle demo
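The three in-place steps can also be tried locally. This is only a sketch with Python's sqlite3 standing in for MySQL. SQLite has no `UPDATE ... JOIN`, so the merge step is written as a correlated subquery, and it is restricted to the row that will be kept in each group so that every sum is computed from untouched rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (userid INTEGER PRIMARY KEY, email TEXT, points INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "jdoe@company.com", 20),
        (2, "jdoe%40company.com", 25),
        (3, "rwhite@company.com", 14),
        (4, "rwhite%40company.com", 10),
    ],
)

with conn:  # one transaction: commit on success, roll back on error
    # 1. Fix emails: turn the URL-encoded %40 back into @.
    conn.execute("UPDATE table1 SET email = REPLACE(email, '%40', '@')")

    # 2. Merge points into the row with the lowest userid of each email.
    conn.execute(
        """
        UPDATE table1
        SET points = (
            SELECT SUM(d.points) FROM table1 AS d WHERE d.email = table1.email
        )
        WHERE userid IN (SELECT MIN(userid) FROM table1 GROUP BY email)
        """
    )

    # 3. Delete every row except the lowest userid per email.
    conn.execute(
        """
        DELETE FROM table1
        WHERE userid NOT IN (SELECT MIN(userid) FROM table1 GROUP BY email)
        """
    )

merged = conn.execute(
    "SELECT userid, email, points FROM table1 ORDER BY userid"
).fetchall()
print(merged)
```

Running all three statements in one transaction means a failure halfway through does not leave merged points next to undeleted duplicates.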
Use this query assuming you want to match email as is without any modification
SELECT MIN(userid) AS userid, SUM(points) AS points, email FROM table_name GROUP BY email
I am trying to get some rows from the same table. It's a user table: user has user_id and user_parent_id.
I need to get the user_id row and user_parent_id row. I have coded something like this:
SELECT user.user_fname, user.user_lname
FROM users as user
INNER JOIN users AS parent
ON parent.user_parent_id = user.user_id
WHERE user.user_id = $_GET[id]
But it doesn't show the results. I want to display the user record and its parent record.
I think the problem is in your JOIN condition.
SELECT user.user_fname,
user.user_lname,
parent.user_fname,
parent.user_lname
FROM users AS user
JOIN users AS parent
ON parent.user_id = user.user_parent_id
WHERE user.user_id = $_GET[id]
Edit:
You should probably use LEFT JOIN if there are users with no parents.
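To see the difference the LEFT JOIN makes, here is a minimal sketch using Python's sqlite3 as a stand-in for MySQL. The two sample users and the helper name `user_with_parent` are invented for illustration. The id is passed as a bound parameter rather than pasted into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "user_id INTEGER PRIMARY KEY, user_parent_id INTEGER, "
    "user_fname TEXT, user_lname TEXT)"
)
# Invented sample data: user 1 has no parent, user 2 is a child of user 1.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        (1, None, "Anna", "Nowak"),
        (2, 1, "Piotr", "Nowak"),
    ],
)


def user_with_parent(conn, user_id):
    """Return (user first, user last, parent first, parent last) or None."""
    return conn.execute(
        """
        SELECT u.user_fname, u.user_lname, p.user_fname, p.user_lname
        FROM users AS u
        LEFT JOIN users AS p ON p.user_id = u.user_parent_id
        WHERE u.user_id = ?
        """,
        (user_id,),
    ).fetchone()


print(user_with_parent(conn, 2))  # user together with its parent
print(user_with_parent(conn, 1))  # no parent: parent columns are NULL/None
```

With an INNER JOIN the second call would return no row at all, because user 1 has no parent to join to.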
You can also use UNION, like this:
SELECT user_fname ,
user_lname
FROM users
WHERE user_id = $_GET[id]
UNION
SELECT user_fname ,
user_lname
FROM users
WHERE user_id = (SELECT user_parent_id FROM users WHERE user_id = $_GET[id])
Perhaps this should be the select (if I understand the question correctly)
select user.user_fname, user.user_lname, parent.user_fname, parent.user_lname
... As before
Your query should work fine, but you have to use the alias parent to show the values of the parent table like this:
select
CONCAT(user.user_fname, ' ', user.user_lname) AS 'User Name',
CONCAT(parent.user_fname, ' ', parent.user_lname) AS 'Parent Name'
from users as user
inner join users as parent on parent.user_parent_id = user.user_id
where user.user_id = $_GET[id];
I don't know how the table is created but try this...
SELECT users1.user_id, users2.user_parent_id
FROM users AS users1
INNER JOIN users AS users2
ON users1.id = users2.id
WHERE users1.user_id = users2.user_parent_id
Let's try to answer this question with a simple scenario using three MySQL tables: datetable, colortable and jointable.
First, see the values of table datetable, whose primary key is the column dateid:
mysql> select * from datetable;
+--------+------------+
| dateid | datevalue |
+--------+------------+
| 101 | 2015-01-01 |
| 102 | 2015-05-01 |
| 103 | 2016-01-01 |
+--------+------------+
3 rows in set (0.00 sec)
Now move on to the values of our second table, colortable, whose primary key is the column colorid:
mysql> select * from colortable;
+---------+------------+
| colorid | colorvalue |
+---------+------------+
| 11 | blue |
| 12 | yellow |
+---------+------------+
2 rows in set (0.00 sec)
Our third and final table, jointable, has no primary key, and its values are:
mysql> select * from jointable;
+--------+---------+
| dateid | colorid |
+--------+---------+
| 101 | 11 |
| 102 | 12 |
| 101 | 12 |
+--------+---------+
3 rows in set (0.00 sec)
Now the task is to find the dateids that have both color values, blue and yellow.
So our query is:
mysql> SELECT t1.dateid FROM jointable AS t1 INNER JOIN jointable t2
-> ON t1.dateid = t2.dateid
-> WHERE
-> (t1.colorid IN (SELECT colorid FROM colortable WHERE colorvalue = 'blue'))
-> AND
-> (t2.colorid IN (SELECT colorid FROM colortable WHERE colorvalue = 'yellow'));
+--------+
| dateid |
+--------+
| 101 |
+--------+
1 row in set (0.00 sec)
I hope this helps.
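The self join above needs one extra copy of jointable per color. As a hedged alternative, the same result can be had with GROUP BY and HAVING, which extends to any number of colors. The sketch below uses Python's sqlite3 as a stand-in for MySQL. The function name `dates_with_all_colors` is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE datetable (dateid INTEGER PRIMARY KEY, datevalue TEXT);
    CREATE TABLE colortable (colorid INTEGER PRIMARY KEY, colorvalue TEXT);
    CREATE TABLE jointable (dateid INTEGER, colorid INTEGER);
    INSERT INTO datetable VALUES
        (101, '2015-01-01'), (102, '2015-05-01'), (103, '2016-01-01');
    INSERT INTO colortable VALUES (11, 'blue'), (12, 'yellow');
    INSERT INTO jointable VALUES (101, 11), (102, 12), (101, 12);
    """
)


def dates_with_all_colors(conn, colors):
    """Return the dateids linked to every color in `colors`."""
    wanted = sorted(set(colors))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT j.dateid
        FROM jointable AS j
        JOIN colortable AS c ON c.colorid = j.colorid
        WHERE c.colorvalue IN ({placeholders})
        GROUP BY j.dateid
        HAVING COUNT(DISTINCT c.colorvalue) = ?
        ORDER BY j.dateid
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


print(dates_with_all_colors(conn, ["blue", "yellow"]))
print(dates_with_all_colors(conn, ["yellow"]))
```

Only the placeholder list is built with string formatting. The color values themselves are always passed as bound parameters.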
Firstly, pardon the incredibly vague/long question, I'm really not sure how to summarise my query without the full explanation.
Ok, I have a single MySQL table with the format like so
some_table
user_id
some_key
some_value
If you imagine that, for each user, there are multiple rows, for example:
1 | skill | html
1 | skill | php
1 | foo | bar
2 | skill | html
3 | skill | php
4 | foo | bar
If I want to find all the users who have listed HTML as a skill I can simply do:
SELECT user_id
FROM some_table
WHERE some_key = 'skill' AND some_value='html'
GROUP BY user_id
Easy enough. This would give me user IDs 1 and 2.
If I want to find all users who have listed HTML or PHP as a skill then I can do:
SELECT user_id
FROM some_table
WHERE (some_key = 'skill' AND some_value='html') OR (some_key = 'skill' AND some_value='php')
GROUP BY user_id
This would give me user IDs 1, 2 and 3.
Now, what I'm struggling to work out is how I can query the same table but this time say "give me all the users who have listed both HTML and PHP as a skill", i.e: just user ID 1.
Any advice, guidance or links to docs massively appreciated.
Thanks.
Here's one way:
SELECT user_id
FROM some_table
WHERE user_id IN (SELECT user_id FROM some_table where (some_key = 'skill' AND some_value='html'))
AND user_id IN (SELECT user_id FROM some_table where (some_key = 'skill' AND some_value='php'))
You need to use a nested query (or a self join, which is a different technique).
I set up the following table.
+-------+----------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| type | char(10) | YES | | NULL | |
| value | char(10) | YES | | NULL | |
+-------+----------+------+-----+---------+-------+
inserted the following values
+------+-------+-------+
| id | type | value |
+------+-------+-------+
| 1 | skill | html |
| 1 | skill | php |
| 2 | skill | html |
| 3 | skill | php |
| 2 | skill | php |
+------+-------+-------+
ran this query
select id
from test
where type = 'skill'
and value = 'html'
and id in (
select id
from test
where type = 'skill'
and value = 'php');
and got
+------+
| id |
+------+
| 1 |
| 2 |
+------+
a self join would be as follows
select e1.id
from test e1, test e2
where e1.id = e2.id
and e2.type = 'skill'
and e2.value = 'html'
and e1.type = 'skill'
and e1.value = 'php'
;
and produces the same result.
So there you have two ways to try in your code.
I don't know if this is valid for mysql, but should be (works for other db engines):
SELECT php.user_id
FROM some_table php, some_table html
WHERE php.user_id = html.user_id
AND php.some_key = 'skill'
AND html.some_key = 'skill'
AND php.some_value = 'php'
AND html.some_value = 'html';
An alternative, using a HAVING clause:
SELECT user_id, count(*)
FROM some_table
WHERE some_key = 'skill'
AND some_value in ('php','html')
GROUP BY user_id
HAVING count(*) = 2;
And a third option is to use inner selects. A slight variation on David's approach:
SELECT user_id FROM some_table
WHERE
some_key = 'skill' AND
some_value = 'html' AND
user_id IN (
SELECT user_id FROM some_table
WHERE
some_key = 'skill' AND
some_value = 'php' AND
user_id IN (
SELECT user_id FROM some_table
WHERE
some_key = 'skill' AND
some_value = 'js' -- AND user_id IN ... for next level, etc.
)
);
The idea is that you can "pipe" the inner selects: for each new property, you add a new inner select inside the innermost one.
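One caveat about the HAVING variant: `count(*) = 2` assumes a user never has the same skill stored twice. Here is a small sketch of that pitfall and a possible fix with `COUNT(DISTINCT some_value)`, using Python's sqlite3 as a stand-in for MySQL. The second html row for user 2 is invented to trigger the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table (user_id INTEGER, some_key TEXT, some_value TEXT)"
)
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?, ?)",
    [
        (1, "skill", "html"),
        (1, "skill", "php"),
        (1, "foo", "bar"),
        (2, "skill", "html"),
        (2, "skill", "html"),  # invented duplicate row
        (3, "skill", "php"),
        (4, "foo", "bar"),
    ],
)

base = """
    SELECT user_id
    FROM some_table
    WHERE some_key = 'skill'
      AND some_value IN ('php', 'html')
    GROUP BY user_id
    HAVING {condition}
    ORDER BY user_id
"""

# Counts rows, so user 2 (html stored twice) is wrongly included.
naive = [r[0] for r in conn.execute(base.format(condition="COUNT(*) = 2"))]

# Counts distinct skills, so only users with both html and php qualify.
strict = [
    r[0]
    for r in conn.execute(base.format(condition="COUNT(DISTINCT some_value) = 2"))
]

print(naive, strict)
```

If the table has a unique key on (user_id, some_key, some_value), the two conditions behave the same and `count(*)` is fine.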
I have the following table where, if more than one row contains the same 'user_badge_name' and 'user_email', they are considered duplicates.
user_id | user_name | user_badge_name | user_email
--------------------------------------------------
234     | Kylie     | ky001           | kylie@test.com
235     | Francois  | FR007           | france@test.com
236     | Maria     | MA300           | Marie@test.com
237     | Francine  | FR007           | france@test.com
I need to display the user_id and user_name of those rows where 'user_badge_name' and 'user_email' are duplicated.
I tried the following SQL, but it does not return all user_ids, only the first one:
SELECT user_id, username , COUNT(user_badge_name) AS user_badge_name_Count FROM user GROUP BY user_badge_name HAVING user_badge_name_Count > 1
Any suggestion is most appreciated
select a.user_id, a.user_name
from user as a
inner join
(SELECT user_badge_name, user_email
FROM user
GROUP BY user_badge_name, user_email
HAVING count(*)>1
) as dups
on a.user_badge_name=dups.user_badge_name and a.user_email=dups.user_email
order by a.user_badge_name, a.user_email
If you want to see all of the user ids in the same row, then you can use GROUP_CONCAT:
SELECT GROUP_CONCAT(user_id) AS user_ids, GROUP_CONCAT(user_name) AS usernames, COUNT(*) AS user_badge_name_Count FROM user GROUP BY user_badge_name, user_email HAVING user_badge_name_Count > 1
That will give you something like this:
user_ids | usernames | user_badge_name_Count
-----------------------------------------------
235,237 | Francois,Francine | 2
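Both variants (one row per duplicate, and one row per duplicate group) can be tried with this minimal sketch, which uses Python's sqlite3 as a stand-in for MySQL. SQLite also provides `GROUP_CONCAT`, but it makes no promise about the order of the concatenated ids, so the code does not rely on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user ("
    "user_id INTEGER PRIMARY KEY, user_name TEXT, "
    "user_badge_name TEXT, user_email TEXT)"
)
conn.executemany(
    "INSERT INTO user VALUES (?, ?, ?, ?)",
    [
        (234, "Kylie", "ky001", "kylie@test.com"),
        (235, "Francois", "FR007", "france@test.com"),
        (236, "Maria", "MA300", "Marie@test.com"),
        (237, "Francine", "FR007", "france@test.com"),
    ],
)

# One output row per duplicated user.
dup_rows = conn.execute(
    """
    SELECT a.user_id, a.user_name
    FROM user AS a
    INNER JOIN (
        SELECT user_badge_name, user_email
        FROM user
        GROUP BY user_badge_name, user_email
        HAVING COUNT(*) > 1
    ) AS dups
      ON a.user_badge_name = dups.user_badge_name
     AND a.user_email = dups.user_email
    ORDER BY a.user_id
    """
).fetchall()

# One output row per duplicate group, ids collected into a single string.
grouped = conn.execute(
    """
    SELECT user_badge_name, user_email,
           GROUP_CONCAT(user_id) AS user_ids,
           COUNT(*) AS n
    FROM user
    GROUP BY user_badge_name, user_email
    HAVING COUNT(*) > 1
    """
).fetchall()

print(dup_rows)
print(grouped)
```

Grouping by both user_badge_name and user_email matches the question's definition of a duplicate. Grouping by the badge name alone would also flag users who share a badge but have different emails.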