Remove duplicate rows on a many-to-many table (MySQL)

I have a many-to-many table that looks like this:
Customers
+----+---------+
| id | name    |
+----+---------+
| 1  | john    |
| 1  | john    |
| 1  | james   |
| 2  | george  |
| 2  | michael |
+----+---------+
What I want is to remove the duplicate rows with the same name.

Unfortunately, you have no way to distinguish one row from another. So, the easiest way to do this is the temporary table approach:
create table temp as
select distinct id, name
from customers;
truncate table customers;
insert into customers(id, name)
select id, name
from temp;
drop table temp;

Take a look at GROUP BY and the aggregate functions.
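To make the GROUP BY hint concrete, here is a minimal sketch against the customers table from the question. It only reports which (id, name) pairs occur more than once and how often; it does not delete anything. The alias copies is just an illustrative name.

```sql
-- List every (id, name) pair that appears more than once.
SELECT id, name, COUNT(*) AS copies
FROM customers
GROUP BY id, name
HAVING COUNT(*) > 1;

-- The de-duplicated row set (same result as SELECT DISTINCT id, name).
SELECT id, name
FROM customers
GROUP BY id, name;
```

With the sample data, the first query should report only the (1, john) pair, since that is the only row that appears twice.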

A small variation on @Gordon Linoff's answer: avoid the "INSERT INTO" by doing a "RENAME TABLE" instead, and write the queries so that they work on any table.
Solution-1: By using a temp table
CREATE TABLE table_name_clean AS SELECT DISTINCT
*
FROM
table_name;
DROP TABLE table_name;
RENAME TABLE table_name_clean TO table_name;
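Solution-2: Adding a UNIQUE INDEX (recommended, as it will also prevent duplicate entries from being created in your table later). Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this statement only works on older versions.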
ALTER IGNORE TABLE table_name ADD UNIQUE INDEX u_id (id,name);
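On MySQL versions without ALTER IGNORE, roughly the same effect can be had with INSERT IGNORE into a copy that already carries the unique index. This is a sketch, not a drop-in: table_name_clean and table_name_old are hypothetical names, and it assumes the table has id and name columns that can be indexed as they are.

```sql
-- Empty copy with the same structure as the original.
CREATE TABLE table_name_clean LIKE table_name;

-- The unique index rejects any second (id, name) pair.
ALTER TABLE table_name_clean ADD UNIQUE INDEX u_id (id, name);

-- IGNORE silently skips rows that would violate the unique index.
INSERT IGNORE INTO table_name_clean SELECT * FROM table_name;

-- Swap the tables, then drop the old one.
RENAME TABLE table_name TO table_name_old,
             table_name_clean TO table_name;
DROP TABLE table_name_old;
```

Keep in mind that a unique index treats NULL values as distinct, so rows with NULL in id or name would not be collapsed this way.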

Related

Replace current table data with result from querying the same table

I have a table that does not have any index or primary key in my MySQL database. I cannot change the schema of the table (it is not "my" table). As the table stores data that arrives in intervals, there can be (are) duplicates.
For example:
+--------------+--------------+--------+----------+----------+---------+
| first_seen | last_seen | type | name | hitcnt | data |
+--------------+--------------+--------+----------+----------+---------+
| 15:12:02 | 16:02:32 | 5 | foo | 3 | difank |
+--------------+--------------+--------+----------+----------+---------+
| 19:52:23 | 22:06:20 | 5 | foo | 4 | difank |
+--------------+--------------+--------+----------+----------+---------+
Now I would like to "reduce" this to:
+--------------+--------------+--------+----------+----------+---------+
| first_seen | last_seen | type | name | hitcnt | data |
+--------------+--------------+--------+----------+----------+---------+
| 15:12:02 | 22:06:20 | 5 | foo | 7 | difank |
+--------------+--------------+--------+----------+----------+---------+
And I would like to do this "in situ" (i.e. in place) if possible.
Using GROUP BY, MIN(), MAX(), etc. I can write a query that returns exactly what I want to end up with:
SELECT
MIN(first_seen),
MAX(last_seen),
type,
name,
SUM(hitcnt) as hit,
data
FROM <table>
GROUP BY type, name, data
ORDER BY hit desc, type;
The question is: how can I replace the existing data (efficiently) with the result of that query?
Do I have to use a temporary table (i.e. move the data to a temporary table, truncate the existing table and SELECT INTO from the temporary table)?
Can I do this in a transaction (to prevent data loss if something goes wrong)?
Are there other (better?) options than a temporary table?
TRUNCATE TABLE table_name;
INSERT INTO table_name (column1.....)
SELECT
MIN(first_seen),
MAX(last_seen),
type,
name,
SUM(hitcnt) as hit,
data
FROM <table>
GROUP BY type, name, data
ORDER BY hit desc, type;
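Make sure the number of columns in the INSERT and SELECT statements matches. Note that <table> in the SELECT has to be a copy of the data (e.g. the temporary table), because table_name itself is empty after the TRUNCATE.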
I ended up using the following approach:
CREATE TABLE IF NOT EXISTS tmp_table AS (SELECT ...);
then move the "new" table into place using ALTER TABLE ... RENAME ...;
followed by a DROP TABLE ...; for the old/original table.
That seems to work.
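For reference, the build-then-swap approach could look roughly like the sketch below. The table name events and the helper names events_new and events_old are made up for illustration; the column names are the ones from the question. On the transaction question: TRUNCATE TABLE and other DDL statements cause an implicit commit in MySQL, so they cannot be rolled back. A single RENAME TABLE that renames several tables is atomic, which is why the swap is done in one statement.

```sql
-- New table with the same structure as the original (schema stays unchanged).
CREATE TABLE events_new LIKE events;

-- Fill it with the aggregated rows.
INSERT INTO events_new (first_seen, last_seen, type, name, hitcnt, data)
SELECT MIN(first_seen), MAX(last_seen), type, name, SUM(hitcnt), data
FROM events
GROUP BY type, name, data;

-- Swap both names in one atomic statement; the old data is kept until the DROP.
RENAME TABLE events TO events_old,
             events_new TO events;

-- Only once the result has been checked:
DROP TABLE events_old;
```

Rows written to the original table between the INSERT and the RENAME would not make it into the new table, so this assumes the loader can be paused, or that a small gap is acceptable.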

Deleting almost duplicate rows in MySQL?

I have seen a few different answers for this question, but none really hit exactly what I needed to do in MySQL.
I did find a thread for MS SQL that is exactly what I need to do here, but nothing in MySQL.
Data Example
+--------+----------+--------+
| Col1 | Col2 | UniqueID |
+--------+----------+--------+
| Peaches| Outdoor | 1 |
| Peaches| Outdoor | 2 |
| Apples | Indoor | 3 |
| Apples | Indoor | 4 |
+--------+----------+--------+
Desired Output
+--------+----------+--------+
| Col1 | Col2 | UniqueID |
+--------+----------+--------+
| Peaches| Outdoor | 1 |
| Apples | Indoor | 3 |
+--------+----------+--------+
Your way is OK. You only forgot the KEYWORD TABLE
CREATE TABLE NewTable AS SELECT Col1,Col2 ,MAX(col3) FROM t GROUP BY Col1,col2
but the structure can differ from the original table.
Do it this way:
CREATE TABLE NewTable like t;
then add a unique key:
ALTER TABLE NewTable ADD UNIQUE KEY (Col1,col2);
and now copy the old data into the new table with ON DUPLICATE KEY UPDATE:
INSERT INTO NewTable
SELECT *
from t
ON DUPLICATE KEY UPDATE Col3=GREATEST(Col3,VALUES(Col3));
This copies every row; for duplicates, the larger Col3 value is kept.
I'm going to post the answer to the answer provided above so it's clear... it is just one simple query:
CREATE TABLE NewTable AS SELECT Col1,Col2 ,MAX(col3) FROM t GROUP BY Col1,col2
Just querying max was the trick...so simple.
Thank you!
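If the duplicates should be removed in place rather than copied to a new table, a self-join DELETE is another option. The sketch below assumes the table is called t (as in the answer above) and uses the UniqueID column from the example data. It keeps the row with the lowest UniqueID in each (Col1, Col2) group, which matches the desired output shown in the question.

```sql
-- Delete every row for which another row with the same Col1/Col2
-- and a smaller UniqueID exists.
DELETE t1
FROM t AS t1
JOIN t AS t2
  ON t1.Col1 = t2.Col1
 AND t1.Col2 = t2.Col2
 AND t1.UniqueID > t2.UniqueID;
```

Rows with NULL in Col1 or Col2 never match the join condition, so they are left alone.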

Handle dynamic missed columns in MySQL

I have a small question about MySQL. While loading data from one table into another, I ran into an issue.
first table: emp
id | name | sal | deptno | loc | referby
1 | abc | 100 | 10 | hyd | xyz
2 | mnc | 200 | 20 |chen | pqr
second table:emprefers
id | name | sal | deptno | loc | referby
Now I want to load the emp table data into the emprefers table. I wrote a query like
insert into emprefers select * from emp
After I ran the query, the data was loaded into the emprefers table like below:
id | name | sal |deptno | loc |referby
1 | abc | 100 | 10 | hyd | xyz
2 | mnc | 200 | 20 | chen | pqr
Now I ran the same query a second time. It has failed. The reason is the name column is deleted from the emp table.
I edited the query like:
insert into emprefers select id,'null'as name,sal,deptno,loc,referby from emp
After I ran the edited query again, now records are loading into the emprefers table and the data looks like:
id | name | sal |deptno | loc |referby
1 | null | 100 |10 | hyd | xyz
2 | null | 200 |20 |chen | pqr
Every time before loading the emprefers table I truncate the emprefers table data. And the emprefers table structure never changed.
Again, a third time I ran the same query again. The query has failed, the reason was that the sal and deptno columns were missing in the emp table.
I don't want to edit the query again, reason is we don't know which columns are/get deleted from the emp table.
This time we want to solve the issue for good.
We want to load the data into the second table like this: if a column is available in the emp table, load its data; otherwise pass null or empty values for that column.
Please tell me how to write a query that checks whether a column exists and, if it does, retrieves that column, and otherwise assigns null values for it.
Rather than changing the existing query and truncating the table, it might be a better idea to drop the whole table, recreate it with the structure of the current emp table, and then insert the data into it. That way they'll always be the same.
DROP TABLE IF EXISTS emprefers;
CREATE TABLE emprefers LIKE emp;
INSERT INTO emprefers SELECT * FROM emp;
This statement will create the table on the fly.
CREATE TABLE databasename.emprefers SELECT * FROM databasename.emp;
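If the emprefers structure really has to stay fixed, the column list can be built at run time from information_schema and executed as a prepared statement. This is only a sketch: it assumes both tables live in the current database, that the emprefers columns accept NULL, and that the generated list fits within group_concat_max_len. The names @select_list, @sql and stmt are illustrative.

```sql
-- For each emprefers column: use the emp column if it still exists,
-- otherwise substitute NULL under the same name.
SELECT GROUP_CONCAT(
         IF(c.column_name IS NULL,
            CONCAT('NULL AS `', t.column_name, '`'),
            CONCAT('`', t.column_name, '`'))
         ORDER BY t.ordinal_position
         SEPARATOR ', ')
  INTO @select_list
  FROM information_schema.columns AS t
  LEFT JOIN information_schema.columns AS c
         ON c.table_schema = t.table_schema
        AND c.table_name   = 'emp'
        AND c.column_name  = t.column_name
 WHERE t.table_schema = DATABASE()
   AND t.table_name   = 'emprefers';

-- Build and run the INSERT with the generated column list.
SET @sql = CONCAT('INSERT INTO emprefers SELECT ', @select_list, ' FROM emp');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```

Because the list is ordered by the column position in emprefers, the SELECT lines up with the target table no matter which columns have disappeared from emp.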

SQL query to remove multiple duplicate entries from table

I have a table containing the following entries:
Id | Accno | Name | Hash
----+----------+-----------+---------
1 | 11 | ABC | 01110
2 | 11 | ABC |
3 | 22 | PQT |
4 | 33 | XYZ | 03330
5 | 44 | LMN | 04440
6 | 33 | XYZ |
I need an SQL query that removes duplicate entries from the table and keeps at least one entry whose hash value is present. Entries that are not duplicates should also remain in the table.
I think you guys overcomplicate things a lot. This should work just dandy:
DELETE FROM
YourTable
WHERE Hash IS NULL
AND Accno IN
(
SELECT Accno
FROM YourTable
GROUP BY Accno
HAVING COUNT(Name) > 1
)
;
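One caveat if this is run on MySQL: a DELETE whose subquery reads from the table being deleted from is rejected with error 1093 ("You can't specify target table ... for update in FROM clause"). Wrapping the subquery in a derived table is the usual workaround. The sketch below is the same query with that wrapper; the alias dup is an arbitrary name, and it still assumes that the missing hashes are stored as NULL and not as empty strings.

```sql
DELETE FROM YourTable
WHERE Hash IS NULL
  AND Accno IN (
        -- The extra derived table makes MySQL materialize the list of
        -- duplicated account numbers before the DELETE starts.
        SELECT Accno
        FROM (
              SELECT Accno
              FROM YourTable
              GROUP BY Accno
              HAVING COUNT(*) > 1
             ) AS dup
      );
```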
Probably the easiest way to do it is to create a new table and copy non duplicate entries.
create table table_name2 as select distinct * from table_name1;
drop table table_name1;
rename table table_name2 to table_name1;
Something like this.
Create table temp2 as SELECT *
FROM temp where id in (select id from temp group by accno having count(accno)>=1 and hash<>'');
drop table old_table;
rename table temp2 to old_table;
Check SQL Fiddle

How to create VIEWs so they alter too when base table is ALTERed?

Is there a way to link a table and a view, so that when I alter the base table, the view is altered too? Something like this:
CREATE TABLE t (qty INT, price INT);
INSERT INTO t VALUES(3, 50);
CREATE VIEW v AS SELECT * FROM t;
SELECT * FROM v;
+------+-------+
| qty | price |
+------+-------+
| 3 | 50 |
+------+-------+
ALTER TABLE t ADD COLUMN comm INT;
SELECT * FROM t;
+------+-------+------+
| qty | price | comm |
+------+-------+------+
| 3 | 50 | NULL |
+------+-------+------+
SELECT * FROM v;
+------+-------+
| qty | price |
+------+-------+
| 3 | 50 |
+------+-------+
The last two SELECTs should return the same result.
PS. I am aware that MySQL says:
The view definition is “frozen” at creation time, so changes to the underlying tables afterward do not affect the view definition.
And creating a trigger is also not possible, because trigger events do not include ALTER TABLE.
You need to recreate the view when you alter the table, as stated in the manual:
The view definition is “frozen” at
creation time, so changes to the
underlying tables afterward do not
affect the view definition. For
example, if a view is defined as
SELECT * on a table, new columns added
to the table later do not become part
of the view.
Either drop and recreate the view, or use ALTER VIEW as well.
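Using the t and v names from the question, a minimal sketch of the recreate step looks like this. It has to be run after every ALTER TABLE on the base table.

```sql
-- Re-expand SELECT * against the current columns of t.
CREATE OR REPLACE VIEW v AS SELECT * FROM t;

-- The view now exposes the comm column as well.
SELECT * FROM v;
```

CREATE OR REPLACE VIEW avoids a separate DROP VIEW and so leaves no moment in which the view does not exist.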