Deleting duplicate rows, no unique keys - relational table - mysql

I have a relational table that connects two other tables based on their IDs. There can be duplicates for both columns - but there CANNOT be the same row twice. I handle the checking on the code side.
How do I remove duplicate rows (see below):
select * from people:
a | b
1 2
1 3
1 3
1 7
2 3
2 5
2 5
2 9
I want the result to be:
a | b
1 2
1 3
1 7
2 3
2 5
2 9

This should work on MySQL 5.6 and earlier (the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4):
ALTER IGNORE TABLE people ADD UNIQUE (a,b);
If you don't want to add an index, then this should work:
DROP TABLE IF EXISTS people_old;
DROP TABLE IF EXISTS people_new;
CREATE TABLE people_new LIKE people;
INSERT INTO people_new SELECT DISTINCT * FROM people;
RENAME TABLE people TO people_old, people_new TO people;

"There can be duplicates for both columns - but there CANNOT be the same row twice"
That's a constraint on the table that you have not implemented. The constraint is a unique index on (a,b). If you had the index you would not have duplicates.
IMHO your best approach is to add the unique index to the table, using a temporary table to first remove the duplicates:
Copy people to people_temp
Delete all rows from people
Add the unique index to people
Copy the unique (a,b) pairs from people_temp back to people
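The steps above could be written roughly as follows. This is only a sketch: it assumes the question's table people with columns a and b, and the names people_temp and uq_people_a_b are placeholders that do not come from the original answer.

```sql
-- 1. Copy the current contents into a scratch table (placeholder name).
CREATE TABLE people_temp AS SELECT * FROM people;

-- 2. Empty the original table.
DELETE FROM people;

-- 3. Add the unique index so the same (a, b) pair cannot be stored twice.
ALTER TABLE people ADD UNIQUE INDEX uq_people_a_b (a, b);

-- 4. Copy back one row per distinct (a, b) pair.
INSERT INTO people (a, b)
SELECT DISTINCT a, b FROM people_temp;

-- Optional clean-up once the result has been checked.
DROP TABLE people_temp;
```

Once the index is in place, any later attempt to insert an existing (a, b) pair fails with a duplicate-key error instead of silently creating a duplicate row.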

This is how you can delete duplicate rows. I'll show my own example, and you'll need to adapt it to your table. I have an actor_2 table with an actor_id column, and I want to delete the rows with a repeated first_name.
mysql> select actor_id, first_name from actor_2;
+----------+-------------+
| actor_id | first_name |
+----------+-------------+
| 1 | PENELOPE |
| 2 | NICK |
| 3 | ED |
....
| 199 | JULIA |
| 200 | THORA |
+----------+-------------+
200 rows in set (0.00 sec)
-Now I use a user variable called @a to return the ID when a row has the same first_name as the previous row in sorted order (i.e. it is a repeat), and NULL when it does not.
mysql> select if(first_name=@a,actor_id,null) as first_names,@a:=first_name from actor_2 order by first_name;
+---------------+----------------+
| first_names | @a:=first_name |
+---------------+----------------+
| NULL | ADAM |
| 71 | ADAM |
| NULL | AL |
| NULL | ALAN |
| NULL | ALBERT |
| 125 | ALBERT |
| NULL | ALEC |
| NULL | ANGELA |
| 144 | ANGELA |
...
| NULL | WILL |
| NULL | WILLIAM |
| NULL | WOODY |
| 28 | WOODY |
| NULL | ZERO |
+---------------+----------------+
200 rows in set (0.00 sec)
-Now we can select only the IDs of the duplicates:
mysql> select first_names from (select if(first_name=@a,actor_id,null) as first_names,@a:=first_name from actor_2 order by first_name) as t1;
+-------------+
| first_names |
+-------------+
| NULL |
| 71 |
| NULL |
...
| 28 |
| NULL |
+-------------+
200 rows in set (0.00 sec)
-The final step: let's DELETE!
mysql> delete from actor_2 where actor_id in (select first_names from (select if(first_name=@a,actor_id,null) as first_names,@a:=first_name from actor_2 order by first_name) as t1);
Query OK, 72 rows affected (0.01 sec)
-Now let's check our table:
mysql> select count(*) from actor_2 group by first_name;
+----------+
| count(*) |
+----------+
| 1 |
| 1 |
| 1 |
...
| 1 |
+----------+
128 rows in set (0.00 sec)
It works. If you have any questions, write back.

Related

What is the order of assignment in mysql?

set @rownum := 0;
explain select actor_id, first_name, @rownum as rownum
from actor
where @rownum <= 1
order by first_name, LEAST(0, @rownum:=@rownum + 1);
For anybody else looking at this: the query returns 2 rows with 2 row numbers. For example, given
+----------+-------+
| sudentid | fname |
+----------+-------+
| 101 | NULL |
| 103 | NULL |
| 112 | NULL |
+----------+-------+
3 rows in set (0.00 sec)
and the query produces
+----------+-------+--------+
| sudentid | fname | rownum |
+----------+-------+--------+
| 101 | NULL | 1 |
| 103 | NULL | 2 |
+----------+-------+--------+
2 rows in set (0.00 sec)
with an explain plan of
+------+-------------+---------+------+---------------+------+---------+------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+---------+------+---------------+------+---------+------+------+----------------------------------------------+
| 1 | SIMPLE | student | ALL | NULL | NULL | NULL | NULL | 8 | Using where; Using temporary; Using filesort |
+------+-------------+---------+------+---------------+------+---------+------+------+----------------------------------------------+
1 row in set (0.00 sec)
The explain plan is in line with my expectation that ORDER BY is pretty much last (https://blog.jooq.org/2016/12/09/a-beginners-guide-to-the-true-order-of-sql-operations/),
but what appears to be happening is that, because the ORDER BY increments a variable that is also used in the SELECT, the WHERE clause is invoked again in the ACTUAL order of events. This is not reflected in the explain plan (which I think reflects the LOGICAL order of events).
I don't know what the OP is trying to do here (or if the result is incorrect) but this is NOT the way row numbers would usually be assigned prior to version 8.
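For comparison, here is a sketch of how row numbers are more commonly assigned. It reuses the student table from the output above (columns sudentid and fname); the alias init is only a placeholder. The first statement is the widely used pre-8.0 user-variable idiom, and the second uses the ROW_NUMBER() window function available from MySQL 8.0.

```sql
-- Pre-8.0 idiom: initialise the counter in a derived table and increment it
-- once per row in the select list. MySQL does not guarantee the order in
-- which expressions involving user variables are evaluated, so the numbering
-- is not guaranteed to follow the ORDER BY in every case.
SELECT s.sudentid, s.fname, (@rownum := @rownum + 1) AS rownum
FROM student AS s
CROSS JOIN (SELECT @rownum := 0) AS init
ORDER BY s.sudentid;

-- MySQL 8.0 and later: a window function gives a well-defined numbering.
SELECT sudentid, fname,
       ROW_NUMBER() OVER (ORDER BY sudentid) AS rownum
FROM student;
```

Where it is available, the window-function form is usually the safer choice because it does not rely on evaluation order.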

Auto numerate MySQL Query

Is there a MySQL query that updates a table and sets numbers starting from 1?
For example, if the table "item" has 100000 rows, the query would update the first row to set id = 1, the next to 2, then 3, 4, etc.
Try:
ALTER TABLE item MODIFY id INT PRIMARY KEY NOT NULL AUTO_INCREMENT
This works if your item table already has an id column, but it will raise an error if that column contains duplicate values.
Add an auto-increment column (INT) and it should do what you want.
Is there already a primary key in the table? If not, create one with AUTO_INCREMENT.
This would do the job:
ALTER TABLE `your_table`
ADD COLUMN `id_primary` int NOT NULL AUTO_INCREMENT FIRST ,
ADD PRIMARY KEY (`id_primary`);
Note: Changing an existing column to primary key AUTO_INCREMENT would raise error if that column contains duplicate values.
Given this
MariaDB [sandbox]> select * from posts;
+------+--------+----------+
| id | userid | category |
+------+--------+----------+
| NULL | 1 | a |
| NULL | 1 | b |
| NULL | 1 | c |
| NULL | 2 | a |
| NULL | 1 | a |
+------+--------+----------+
5 rows in set (0.00 sec)
This code
use sandbox;
update posts p,(select @rn:=0) rn
set id=(@rn:=@rn+1)
where 1 = 1;
Results in
MariaDB [sandbox]> select * from posts;
+------+--------+----------+
| id | userid | category |
+------+--------+----------+
| 1 | 1 | a |
| 2 | 1 | b |
| 3 | 1 | c |
| 4 | 2 | a |
| 5 | 1 | a |
+------+--------+----------+
5 rows in set (0.00 sec)

A query with those fields, that are not required to get from huge fields within a table

I want a query that displays fields from a table on a web page. The table has a huge number of fields, and I want a query that excludes a few of them and displays the data of all the remaining fields.
For example, I have a table with 50 fields:
+------+------+------+------+      +-------+
| Col1 | Col2 | Col3 | Col4 | ---- | Col50 |
+------+------+------+------+      +-------+
|      |      |      |      | ---- |       |
|      |      |      |      | ---- |       |
+------+------+------+------+      +-------+
But I want to display only 48 fields on that page. Is there a query that excludes the 2 fields that are not required (Col49 and Col50) and shows the remaining data? That is, instead of writing SELECT Col1, Col2, Col3, Col4,...Col48 FROM table; is there an alternative way of writing it, such as SELECT *-(Col49,Col50) FROM table;
The best way to solve this is to use a view: create a view with just the columns you need and retrieve data from it.
Example:
mysql> SELECT * FROM calls;
+----+------------+---------+
| id | date | user_id |
+----+------------+---------+
| 1 | 2016-06-22 | 1 |
| 2 | 2016-06-22 | NULL |
| 3 | 2016-06-22 | NULL |
| 4 | 2016-06-23 | 2 |
| 5 | 2016-06-23 | 1 |
| 6 | 2016-06-23 | 1 |
| 7 | 2016-06-23 | NULL |
+----+------------+---------+
7 rows in set (0.06 sec)
mysql> CREATE VIEW C_VIEW AS
-> SELECT id,date from calls;
Query OK, 0 rows affected (0.20 sec)
mysql> select * from C_VIEW;
+----+------------+
| id | date |
+----+------------+
| 1 | 2016-06-22 |
| 2 | 2016-06-22 |
| 3 | 2016-06-22 |
| 4 | 2016-06-23 |
| 5 | 2016-06-23 |
| 6 | 2016-06-23 |
| 7 | 2016-06-23 |
+----+------------+
7 rows in set (0.00 sec)
Instead of writing select * from ..., list the column names in the query: select col1, col2, ..., col48 from ...
This answer is based on your question as asked. Add more details to your question to get a clearer answer.
You need to pass column names to get data for only those columns.
e.g.
table_name = dummy_tab
col_1 | col_2 | col_3 | col_4
--------------------------------
1 | John | 20 | abc#def
--------------------------------
2 | Doe | 21 | def#xyz
...
You can do:
SELECT col_1, col_2 FROM dummy_tab;
This will give:
col_1 | col_2
---------------
1 | John
---------------
2 | Doe
It is a bit of a pain, but you can reduce effort and errors by querying information_schema and excluding the fields you don't want, by name or by position. For example:
CREATE TABLE `dates` (
`id` INT(11) NULL DEFAULT NULL,
`dte` DATE NULL DEFAULT NULL,
`CalMonth` INT(11) NULL DEFAULT NULL,
`CalMonthDescLong` VARCHAR(10) NULL DEFAULT NULL,
`CalMonthDescShort` VARCHAR(10) NULL DEFAULT NULL,
`calQtr` INT(11) NULL DEFAULT NULL
)
COLLATE='latin1_swedish_ci'
ENGINE=InnoDB
;
use information_schema;
select concat('`',REPLACE(t.NAME,'/dates','.'), '`',replace(T.NAME,'sandbox/',''),c.NAME,'`',',')
from INNODB_SYS_TABLES t
join INNODB_SYS_COLUMNS c on c.TABLE_ID = t.TABLE_ID
where t.NAME like ('sandbox%')
and t.name like ('%dates')
and pos not in(1,4)
Best run from command line with the output piped to a text file.
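A possibly simpler variant of the same idea uses the standard information_schema.COLUMNS table rather than the InnoDB-specific tables. This sketch assumes the dates table above lives in a schema named sandbox; the two excluded column names are picked arbitrarily for illustration.

```sql
-- Build the text of a SELECT statement that lists every column of
-- sandbox.dates except the ones named in the NOT IN list.
SELECT CONCAT(
         'SELECT ',
         GROUP_CONCAT(CONCAT('`', COLUMN_NAME, '`')
                      ORDER BY ORDINAL_POSITION
                      SEPARATOR ', '),
         ' FROM `dates`;'
       ) AS generated_query
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'sandbox'
  AND TABLE_NAME = 'dates'
  AND COLUMN_NAME NOT IN ('CalMonthDescShort', 'calQtr');
```

The generated text can then be pasted into the application or into a view definition. For tables with many columns, the group_concat_max_len setting (1024 by default) may need to be raised so that the list is not truncated.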

MySQL: insert data into an existing table. How to automatically give the data a new id if it already exists

I had two tables in 2 different databases, and their structure is identical. I want to import the data of one table into the other, but the id column is auto-increment. As a result, the same id values exist in both tables while the row contents differ.
How do I insert the content of table 1 into table 2 and automatically change each id to a value that doesn't exist yet?
Because the table contains around 1000 rows, I can't manually change the numbers or declare each individual row.
Something like ON DUPLICATE 'id' AUTO INCREMENT 'id'?
this could be the way
Hitesh> desc test;
+-------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+----------------+
| name | varchar(200) | YES | | NULL | |
| id | int(11) | NO | PRI | NULL | auto_increment |
+-------+--------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
Hitesh> desc test_new;
+-------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+----------------+
| name | varchar(200) | YES | | NULL | |
| id | int(11) | NO | PRI | NULL | auto_increment |
+-------+--------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
Hitesh> insert into test_new(name) select name from test;
Query OK, 9 rows affected (0.03 sec)
Records: 9 Duplicates: 0 Warnings: 0
Hitesh> select * from test_new;
+-------------------------+----+
| name | id |
+-------------------------+----+
| i am the boss | 1 |
| You will get soon | 2 |
| Happy birthday bro | 3 |
| the beautiful girl | 4 |
| oyee its sunday | 5 |
| cat and dog in a park | 6 |
| dog and cat are playing | 7 |
| cat | 8 |
| dog | 9 |
+-------------------------+----+
9 rows in set (0.00 sec)
INSERT INTO new_db.new_tbl SELECT * FROM old_db.old_tbl;
The above will not generate new ids for new_tbl.
Let me explain a little further, assuming both tables have id as an auto-increment column.
Override the auto increment
insert into B select * from A;
If you insert a value into the id column of the new table (B), i.e. if you select all columns, this overrides the auto increment for the new table.
Activate the auto increment
insert into B (col1, col2) select col1, col2 from A;
insert into B select 0, col1, col2 from A;
If you want to activate the auto increment on the new table (B), you cannot pass ids to the INSERT statement, so you need to either skip the id (choose the columns you want to migrate, without the id column) or send DEFAULT/NULL/0 for the id.

How to prevent Duplicate records from my table Insert ignore does not work here

mysql> select * from emp;
+-----+---------+------+------+------+
| eno | ename | dno | mgr | sal |
+-----+---------+------+------+------+
| 1 | rama | 1 | NULL | 2000 |
| 2 | kri | 1 | 1 | 3000 |
| 4 | kri | 1 | 2 | 3000 |
| 5 | bu | 1 | 2 | 2000 |
| 6 | bu | 1 | 1 | 2500 |
| 7 | raa | 2 | NULL | 2500 |
| 8 | rrr | 2 | 7 | 2500 |
| 9 | sita | 2 | 7 | 1500 |
| 10 | dlksdgj | 2 | 2 | 2000 |
| 11 | dlksdgj | 2 | 2 | 2000 |
| 12 | dlksdgj | 2 | 2 | 2000 |
| 13 | dlksdgj | 2 | 2 | 2000 |
| 14 | dlksdgj | 2 | 2 | 2000 |
+-----+---------+------+------+------+
Here is my table. I want to eliminate, or prevent the insertion of, duplicate records. Because the eno field is auto-increment, an entire row is never duplicated, but the records (the remaining columns) are. How can I prevent these duplicate records from being inserted?
I tried using INSERT IGNORE and ON DUPLICATE KEY UPDATE (I think I have not used them properly).
This is how I used them:
mysql> insert into emp(ename,dno,mgr,sal) values('dlksdgj',2,2,2000);
Query OK, 1 row affected (0.03 sec)
mysql> insert ignore into emp(ename,dno,mgr,sal) values('dlksdgj',2,2,2000);
Query OK, 1 row affected (0.03 sec)
mysql> insert into emp(ename,dno,mgr,sal) values('dlksdgj',2,2,2000) ON DUPLICATE KEY UPDATE eno=eno;
Query OK, 1 row affected (0.03 sec)
mysql> insert into emp(ename,dno,mgr,sal) values('dlksdgj',2,2,2000) ON DUPLICATE KEY UPDATE eno=eno;
Query OK, 1 row affected (0.04 sec)
mysql> desc emp;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| eno | int(11) | NO | PRI | NULL | auto_increment |
| ename | varchar(50) | YES | | NULL | |
| dno | int(11) | YES | | NULL | |
| mgr | int(11) | YES | MUL | NULL | |
| sal | int(11) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
Alter the table by adding a UNIQUE constraint:
ALTER TABLE employee ADD CONSTRAINT emp_unique UNIQUE (ename,dno,mgr,sal)
but this only works if the table employee contains no duplicate rows.
If duplicate records already exist, try adding IGNORE (this works on MySQL 5.6 and earlier; the IGNORE clause of ALTER TABLE was removed in MySQL 5.7.4):
ALTER IGNORE TABLE employee ADD CONSTRAINT emp_unique UNIQUE (ename,dno,mgr,sal)
UPDATE 1
Something went wrong, I guess. You only need to add a unique constraint on column ename, since eno will always be unique due to AUTO_INCREMENT.
In order to add the unique constraint, you need to do some cleanup on your table.
The queries below delete the duplicate records (keeping the row with the lowest eno for each ename) and alter the table by adding a unique constraint on column ename.
DELETE a
FROM Employee a
LEFT JOIN
(
SELECT ename, MIN(eno) minEno
FROM Employee
GROUP BY ename
) b ON a.eno = b.minEno
WHERE b.minEno IS NULL;
ALTER TABLE employee ADD CONSTRAINT emp_unique UNIQUE (ename);
Create a UNIQUE constraint on the columns where you think the duplication exists, like:
ALTER TABLE MYTABLE ADD CONSTRAINT constraint1 UNIQUE(column1, column2, column3)
This will work regardless of whether you clean up your table first (i.e. you can stop inserting duplicates immediately and clean up on separate schedule) and without having to add any unique constraints or altering table in any other way:
INSERT INTO
emp (ename, dno, mgr, sal)
SELECT
e.ename, 2, 2, 2000
FROM
(SELECT 'dlksdgj' AS ename) e
LEFT JOIN emp ON e.ename = emp.ename
WHERE
emp.ename IS NULL
The above query assumes you want to use ename as a "unique" field, but in the same way you could define any other fields or their combinations as unique for the purposes of this INSERT.
It works because it's an INSERT ... SELECT format where the SELECT part only produces a row (i.e. something to insert) if its left joined emp does not already have that value. Naturally, if you wanted to change which field(s) defined this "uniqueness" you would modify the SELECT and the LEFT JOIN accordingly.
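As an illustration of changing which fields define "uniqueness", here is a sketch that treats the combination (ename, dno, mgr, sal) as the key. It is an assumption-laden variant, not the answer's own query: it uses NOT EXISTS instead of the LEFT JOIN, and the alias v is a placeholder. The null-safe comparison operator <=> is used so that rows with a NULL mgr are still recognised as matching.

```sql
-- Insert the candidate row only if no existing row has the same
-- (ename, dno, mgr, sal) combination.
INSERT INTO emp (ename, dno, mgr, sal)
SELECT v.ename, v.dno, v.mgr, v.sal
FROM (SELECT 'dlksdgj' AS ename, 2 AS dno, 2 AS mgr, 2000 AS sal) AS v
WHERE NOT EXISTS (
    SELECT 1
    FROM emp AS e
    WHERE e.ename <=> v.ename
      AND e.dno   <=> v.dno
      AND e.mgr   <=> v.mgr
      AND e.sal   <=> v.sal
);
```

Like the original query, this check is not protected against two sessions inserting the same values at the same moment; a unique index remains the only way to have the database enforce the rule.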