I'm trying to remove duplicates from a table where FieldA, FieldB and FieldC are identical. I want to keep the record where FieldD is NOT NULL.
I generally remove duplicates (and prevent future ones) like so:
CREATE TABLE newtable LIKE oldtable;
INSERT newtable SELECT * FROM oldtable group by FieldA,FieldB,FieldC;
Drop Table oldtable;
Alter Table newtable RENAME oldtable;
CREATE Unique INDEX UniqueIndex ON oldtable (FieldA,FieldB,FieldC)
However, I am unclear how to modify this to take the NOT NULL FieldD into account. It occurs to me I could use Max(Char_Length(FieldD)), but that simply seems to return the max value for each group, not the record with the max value.
For now I did the following, though it is not (IMHO) a perfect solution:
Update Table1 as T1
Inner Join Table1 as T2
On T1.FieldA=T2.FieldA
And T1.FieldB=T2.FieldB
And T1.FieldC=T2.FieldC
Set T1.FieldD=T2.FieldD
Where T1.FieldD is NULL and T2.FieldD is NOT NULL
This allowed me to standardize FieldD to a single non-null value and then I was able to easily remove dupes using the sequence I posted above:
CREATE TABLE newtable LIKE oldtable;
INSERT newtable SELECT * FROM oldtable group by FieldA,FieldB,FieldC;
Drop Table oldtable;
Alter Table newtable RENAME oldtable;
CREATE Unique INDEX UniqueIndex ON oldtable (FieldA,FieldB,FieldC)
In an ideal world I'd have figured out an update query to remove the dupes per the question but this intermediary step worked fine for now.
Leaving the question open and unsolved in case someone has a more direct solution to my question.
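A more direct route may be a delete join that removes only the NULL rows which have a non-NULL twin. This is a minimal sketch, not a tested solution: it assumes the table is called oldtable as in the sequence above, and that FieldA, FieldB and FieldC never contain NULL (the = comparison does not match NULL keys). Back up the table first.

```sql
-- Sketch only: assumes FieldA, FieldB and FieldC are NOT NULL columns.
-- Delete each row whose FieldD is NULL when another row with the same
-- (FieldA, FieldB, FieldC) has a non-NULL FieldD.
DELETE t1
FROM oldtable AS t1
INNER JOIN oldtable AS t2
        ON t2.FieldA = t1.FieldA
       AND t2.FieldB = t1.FieldB
       AND t2.FieldC = t1.FieldC
WHERE t1.FieldD IS NULL
  AND t2.FieldD IS NOT NULL;
```

Groups in which every FieldD is NULL, or in which several rows have a non-NULL FieldD, still contain duplicates after this statement, so the copy-and-rename sequence is still needed to finish the job and add the unique index.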
I have a table called leads with duplicate records
Leads:
*account_id
*campaign_id
For every account_id that appears more than once, I want to remove the record where campaign_id equals "51".
For example, if account_id = 1991 appears two times in the table, then remove the one with campaign_id = "51" and keep the other one.
You could use a delete join:
DELETE t1
FROM yourTable t1
INNER JOIN yourTable t2
ON t2.account_id = t1.account_id AND
t2.campaign_id <> 51
WHERE
t1.campaign_id = 51;
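Before running the delete, it can help to preview which rows it would remove. The query below is a sketch using the same placeholder table name (yourTable) and the same conditions as the delete join, rewritten with EXISTS so each candidate row is listed once:

```sql
-- Preview: rows with campaign_id = 51 whose account_id also appears
-- with some other campaign_id. These are the rows the DELETE targets.
SELECT t1.*
FROM yourTable AS t1
WHERE t1.campaign_id = 51
  AND EXISTS (
        SELECT 1
        FROM yourTable AS t2
        WHERE t2.account_id = t1.account_id
          AND t2.campaign_id <> 51
      );
```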
There's no problem deleting from a table provided that:
You use the correct syntax.
You have done a backup of the table BEFORE you do any deleting.
However, I would suggest a different method:
Create a new table based on the existing table:
CREATE TABLE mytable_new LIKE mytable;
Add unique constraint (or PRIMARY KEY) on column(s) you don't want to have duplicates:
ALTER TABLE mytable_new ADD UNIQUE(column1,[column2]);
Note: if you want to identify a combination of two (or more) columns as unique, place all the column names in the UNIQUE() separated by comma. Maybe in your case, the constraint would be UNIQUE(account_id, campaign_id).
Insert data from original table to new table:
INSERT IGNORE INTO mytable_new SELECT * FROM mytable;
Note: the IGNORE will insert only non-duplicate values that match with the UNIQUE() constraint. If you have an app that runs a MySQL INSERT query to the table, you have to update the query by adding IGNORE.
Check data consistency and once you're satisfied, rename both tables:
RENAME TABLE mytable TO mytable_old;
RENAME TABLE mytable_new TO mytable;
The best thing about this is that if you see anything wrong with the new table, you still have the original table.
Renaming the tables takes less than a second; the likely issue is that the INSERT IGNORE might take a while if you have a lot of data.
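Put together, the steps above might look like the sketch below. It assumes the table is named leads as in the question and that the duplicates are exact repeats of (account_id, campaign_id), as the note suggests; leads_new and leads_old are hypothetical names. The two renames are combined into a single RENAME TABLE statement, which MySQL performs atomically, so no other session sees a moment where leads does not exist.

```sql
-- 1. Empty copy of the table structure (leads_new is a hypothetical name).
CREATE TABLE leads_new LIKE leads;

-- 2. Unique constraint on the combination that must not repeat (assumed).
ALTER TABLE leads_new ADD UNIQUE (account_id, campaign_id);

-- 3. Copy the data; rows that would violate the constraint are skipped.
INSERT IGNORE INTO leads_new SELECT * FROM leads;

-- 4. Sanity check: compare row counts before swapping.
SELECT (SELECT COUNT(*) FROM leads)     AS rows_before,
       (SELECT COUNT(*) FROM leads_new) AS rows_after;

-- 5. Swap both tables in one atomic statement; leads_old keeps the original.
RENAME TABLE leads TO leads_old, leads_new TO leads;
```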
I have created a db in MySQL which has a lot of tables. I want a value saved in one table to be automatically saved in another table too.
For example, when I write something to table1.lastname, I want it to be stored in table2.lastname as well.
What is this called, and how can I do it with phpMyAdmin?
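CREATE TABLE new_table_name LIKE old_table_name;
Then create an AFTER INSERT trigger on old_table_name,
like this:
CREATE TRIGGER `AFTER_INSERT` AFTER INSERT ON `old_table_name` FOR EACH ROW BEGIN
insert into new_table_name (column_names) values (column_values) ;
END
First you must create a table to store the data.
Then you must create a trigger on the table you want to watch, so that it catches the insert event and writes the data into the table created earlier. Inside the trigger, the values of the inserted row are available as NEW.column_name. The trigger has to be defined on the source table: MySQL does not let a trigger insert into the table it is defined on.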
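For the table1.lastname / table2.lastname case from the question, a minimal sketch could look like this. The trigger name table1_after_insert is made up, and it assumes table2 has no other required columns. Because the trigger body is a single statement, no BEGIN ... END block or DELIMITER change is needed.

```sql
-- Copy each new lastname from table1 into table2 as it is inserted.
-- NEW refers to the row that was just inserted into table1.
CREATE TRIGGER table1_after_insert
AFTER INSERT ON table1
FOR EACH ROW
  INSERT INTO table2 (lastname) VALUES (NEW.lastname);
```

This only covers inserts. Keeping the two tables in sync on updates would need an AFTER UPDATE trigger as well, plus some key that tells you which table2 row corresponds to which table1 row.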
This will do what you want:
INSERT INTO table2 (lastname)
SELECT lastname
FROM table1
This copies all rows from table1. If you want to copy only a subset of table1, add a WHERE clause at the end.
I hope this helps.
If the table doesn't exist, you can create one with the same schema like so:
CREATE TABLE table2 LIKE table1;
Then, to copy the data over:
INSERT INTO table2 SELECT * FROM table1
Or, if the tables have different structures, you can also:
INSERT INTO table2 (`col1`,`col2`) SELECT `col1`,`col2` FROM table1;
EDIT: to constrain this..
INSERT INTO table2 (`col1_`,`col2_`) SELECT `col1`,`col2` FROM
table1 WHERE `foo`=1
I have two tables, location and locationdata. I want to query data from both tables using a join and store the result in a new table (locationCreatedNew) which does not already exist in MySQL. Can I do this in MySQL?
SELECT location.id,locationdata.name INTO locationCreatedNew FROM
location RIGHT JOIN locationdata ON
location.id=locationdata.location_location_id;
The sample code in your question is SQL Server syntax; the counterpart in MySQL is something like:
CREATE TABLE locationCreatedNew
SELECT * FROM location RIGHT JOIN locationdata
ON location.id=locationdata.location_location_id;
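One thing to watch for: if both tables contain a column with the same name (id, for instance), SELECT * makes CREATE TABLE ... SELECT fail with a duplicate column name error. A sketch that avoids this by listing the two columns from the question explicitly (the aliases location_id and location_name are made up for illustration):

```sql
-- Name and alias the columns explicitly so the new table gets
-- unique, predictable column names.
CREATE TABLE locationCreatedNew AS
SELECT location.id       AS location_id,
       locationdata.name AS location_name
FROM location
RIGHT JOIN locationdata
        ON location.id = locationdata.location_location_id;
```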
Reference: CREATE TABLE ... SELECT
For CREATE TABLE ... SELECT, the destination table does not preserve information about whether columns in the selected-from table are generated columns. The SELECT part of the statement cannot assign values to generated columns in the destination table.
Some conversion of data types might occur. For example, the AUTO_INCREMENT attribute is not preserved, and VARCHAR columns can become CHAR columns. Retained attributes are NULL (or NOT NULL) and, for those columns that have them, CHARACTER SET, COLLATION, COMMENT, and the DEFAULT clause.
When creating a table with CREATE TABLE ... SELECT, make sure to alias any function calls or expressions in the query. If you do not, the CREATE statement might fail or result in undesirable column names.
CREATE TABLE newTbl
SELECT tbl1.clm, COUNT(tbl2.tbl1_id) AS number_of_recs_tbl2
FROM tbl1 LEFT JOIN tbl2 ON tbl1.id = tbl2.tbl1_id
GROUP BY tbl1.id;
NOTE: newTbl is the name of the new table you want to create. You can use SELECT * FROM othertable which is the query that returns the data the table should be created from.
You can also explicitly specify the data type for a column in the created table:
CREATE TABLE foo (a TINYINT NOT NULL) SELECT b+1 AS a FROM bar;
For CREATE TABLE ... SELECT, if IF NOT EXISTS is given and the target table exists, nothing is inserted into the destination table, and the statement is not logged.
To ensure that the binary log can be used to re-create the original tables, MySQL does not permit concurrent inserts during CREATE TABLE ... SELECT.
You cannot use FOR UPDATE as part of the SELECT in a statement such as CREATE TABLE new_table SELECT ... FROM old_table .... If you attempt to do so, the statement fails.
Please check the reference for more. Hope this helps you.
Use a query like the one below.
create table new_tbl as
select t1.col1, t1.col2, t2.col3 from old_tbl t1, old_tbl t2
where condition;
I have been looking for a way to duplicate a row, and insert it back to the table, but with a different id value (whose type is auto increment).
I could do this by specifying every column manually, but as there are many columns, and as the columns can be added or removed in the future, I want to use some easy query to do this without having to specify every column name.
Try this:
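CREATE TEMPORARY TABLE temp_table SELECT * FROM source_table WHERE ...;
ALTER TABLE temp_table DROP COLUMN column_with_auto_increment;
-- NULL lets MySQL generate a new id; this assumes the auto-increment column is the first column of source_table
INSERT INTO source_table SELECT NULL, temp_table.* FROM temp_table;
DROP TEMPORARY TABLE temp_table;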
Try
INSERT INTO new_table (attr1, attr2, attr3) SELECT oldatr1, oldatr2, oldatr3 FROM old_table WHERE <the filter you want>
It also may work if new_table and old_table are the same.
I want to use the following command to add a column to an existing table:
ALTER TABLE foo ADD COLUMN bar AFTER COLUMN old_column;
Can this option take substantially longer than the same command without the AFTER COLUMN option, as follows?
ALTER TABLE foo ADD COLUMN bar;
Will the first command use a greater amount of tmp table space during execution to perform the action?
Context: I have a very large table (think over a billion rows) and I want to add an additional column using the AFTER COLUMN option, but I don't want to be penalized too much.
Here's what I would do:
CREATE TABLE newtable LIKE oldtable;
ALTER TABLE newtable ADD COLUMN columnname INT(10) UNSIGNED NOT NULL DEFAULT 0;
I don't know the type of your column, so this example uses INT. This is also where you decide WHERE the new column goes. By default it is added at the end, unless you specify the AFTER keyword. If you use AFTER, the SELECT below must list the columns in the same order as the new table; otherwise the new column's value goes last.
INSERT INTO newtable SELECT field1, field2, field3 /*etc...*/, 0 AS columnname FROM oldtable;
OR, if you added it between columns:
# eg: ALTER TABLE newtable ADD COLUMN columnname INT(10) UNSIGNED NULL AFTER field2;
INSERT INTO newtable SELECT field1, field2, 0 AS columnname, field3 /*etc...*/ FROM oldtable;
You can add a WHERE clause if you want to copy the rows in batches.
Once all the records are there:
DROP TABLE oldtable;
RENAME TABLE newtable to oldtable;
Create another table and alter the new table (like Book Of Zeus did).
Running ALTER TABLE newtable DISABLE KEYS before the insert and ALTER TABLE newtable ENABLE KEYS after it can make the insert faster, as shown below. Note that DISABLE KEYS only affects nonunique indexes on MyISAM tables.
CREATE TABLE newtable ....;
ALTER TABLE newtable ....;
ALTER TABLE newtable DISABLE KEYS;
INSERT INTO newtable ....;
ALTER TABLE newtable ENABLE KEYS;
DROP TABLE oldtable;
While the other answers are useful as examples of the syntax required to add columns to a table, the answer to the actual question was provided by N.B.:
You'd get more CPU usage since records would have to be shifted.
From the memory usage point of view - it'd be the same with AFTER COLUMN option and without it.
In most cases, a tmp table is created. There are MySQL engines that support hot schema changes (TokuDB being one) that don't create the tmp table and waste tons of resources.
However, if you're doing this with MyISAM or InnoDB - I'd say that "AFTER COLUMN" option will take slightly more time due to record shifting.
– N.B.
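On newer MySQL versions it may be possible to avoid the table copy entirely. As far as I know, InnoDB in MySQL 8.0.12 and later can add a column with ALGORITHM=INSTANT, and from 8.0.29 the column can be placed at any position rather than only at the end. The sketch below assumes such a version and uses INT as a stand-in column type, since the question does not give one:

```sql
-- Ask explicitly for an instant (metadata-only) column add.
-- INT NULL is an assumed type; substitute the real column definition.
ALTER TABLE foo
  ADD COLUMN bar INT NULL AFTER old_column,
  ALGORITHM=INSTANT;
```

Requesting the algorithm explicitly is the safer choice on a billion-row table: if the server cannot perform the change instantly, the statement fails with an error instead of silently falling back to a full rebuild.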