I want make sql query which will insert values from one table to another table by checking where condition on 1st table.
I have to check is that row present previously in 1st table or not. If not present then add otherwise don't add.
There is query "insert into select from" pattern in sql.
I have tried following query. But it inserts many duplicates.
INSERT INTO
company_location (company_id, country_id, city_id)
SELECT
ci.company_id, hq_location, hq_city
FROM
company_info ci, company_location cl
WHERE
ci.company_id <> cl.company_id
AND cl.country_id <> ci.hq_location
AND cl.city_id <> ci.hq_city;
Duplicate avoiding means that tuple (company_id, country_id, city_id) shouldn't added again. And I have to add from more 4 tables into these table.
Also I require query for removing duplicates from company_location. i.e. combination of (company_id, country_id, city_id) should exist only single time. Keep only one tuple and remove other rows.
I hope this untested Script helps! It inserts every combination just once.
INSERT INTO company_location
(company_id,country_id,city_id)
SELECT distinct ci.company_id,
ci.hq_location,
ci.hq_city
FROM company_info ci
WHERE ci.company_id NOT IN
(SELECT cl1.company_id FROM company_location cl1
WHERE cl1.country_id = ci.hq_location
AND cl1.city_id = ci.hq_city
AND cl1.company_id = ci.company_id)
INSERT INGORE works.
If you want a column (or column set) to be unique, put a UNIQUE constraint on your table. If yu no have UNIQUE CONSTRAINT, therefore, by definition, the table cannot contain any undesirable duplicates, since not putting a UNIQUE constraint means duplicates are desirable.
Add UNIQUE( company_id,country_id,city_id )(or maybe it's your primary key for that table)
use INSERT IGNORE
You can also rewrite your query correctly. The query does not do what you think it does, and you cannot do what you want by using the old join syntax from the 18th century.
SELECT * FROM t1, t2, t3
Is a CROSS JOIN, this means it takes all possible combinations of rows from table t1,t2,t3. Usually the WHERE contains some "t1.id=t2.id" conditions to restrict it and turn it into an INNER JOIN, but "<>" conditions do not do this...
You need a proper LEFT JOIN :
INSERT INTO company_location (company_id,country_id,city_id)
SELECT ci.company_id, hq_location, hq_city
FROM company_info ci,
LEFT JOIN company_location cl ON (
ci.company_id = cl.company_id
AND cl.country_id = ci.hq_location
AND cl.city_id = ci.hq_city
)
WHERE cl.company_id IS NULL
Here the answer to your second Question; Query to delete duplicate entries:
Please be careful with the statements they are not tested.
Solution 1:
This solution only works, if you have a row-Id in your table.
DELETE FROM company_location
WHERE id NOT IN
(SELECT MAX(cl1.id)
FROM company_location cl1
WHERE cl1.company_id = company_location.company_id
AND cl1.country_id = company_location.country_id
AND cl1.city_id = company_location.city_id)
Solution 2:
This works without row_id. It writes all data into a Temporary table. Deletes the content on the first table. And inserts every tupel just once.
To that solution: Be careful if you have defined constraints on that table!
CREATE TEMPORARY TABLE tmp_company_location
(
company_id bigint
,country_id bigint
,city_id bigint
);
INSERT INTO tmp_company_location
(company_id,country_id,city_id)
SELECT DISTINCT
company_id
,country_id
,city_id
FROM company_location WHERE 1;
DELETE FROM company_location;
INSERT INTO company_location
SELECT DISTINCT
company_id
,country_id
,city_id
FROM tmp_company_location;
use INSERT IGNORE INTO
from Mysql Docs
Specify IGNORE to ignore rows that would cause duplicate-key violations.
Related
I have a table called leads with duplicate records
Leads:
*account_id
*campaign_id
I want to remove all the duplicate account_id where campaign_id equal to "51"
For example, if account_id = 1991 appears two times in the table then remove the one with campaign_id = "51" and keep the other one.
You could use a delete join:
DELETE t1
FROM yourTable t1
INNER JOIN yourTable t2
ON t2.account_id = t1.account_id AND
t2.campaign_id <> 51
WHERE
t1.campaign_id = 51;
There's no problem to delete from a table provided that:
You use the correct syntax.
You have done a backup of the table BEFORE you do any deleting.
However, I would suggest a different method:
Create a new table based on the existing table:
CREATE TABLE mytable_new LIKE mytable;
Add unique constraint (or PRIMARY KEY) on column(s) you don't want to have duplicates:
ALTER TABLE mytable_new ADD UNIQUE(column1,[column2]);
Note: if you want to identify a combination of two (or more) columns as unique, place all the column names in the UNIQUE() separated by comma. Maybe in your case, the constraint would be UNIQUE(account_id, campaign_id).
Insert data from original table to new table:
INSERT IGNORE INTO mytable_new SELECT * FROM mytable;
Note: the IGNORE will insert only non-duplicate values that match with the UNIQUE() constraint. If you have an app that runs a MySQL INSERT query to the table, you have to update the query by adding IGNORE.
Check data consistency and once you're satisfied, rename both tables:
RENAME TABLE mytable TO mytable_old;
RENAME TABLE mytable_new TO mytable;
The best thing about this is that in case that if you see anything wrong with the new table, you still have the original table.
Changing the name of the tables only take less than a second, the probable issue here is that it might take a while to do the INSERT IGNORE if you have a large data.
Demo fiddle
DELETE t1
FROM yourTable t1
INNER JOIN yourTable t2
ON t2.account_id = t1.account_id AND
t2.campaign_id <> 51
WHERE
t1.campaign_id = 51;
i am using MySql Workbench version 6.3.9 with mySql 5.6.35.
i have the following tables:
EQUIPMENT
eID | caochID | eName
COACH
coachID | coachName
SQLfiddle prepared http://sqlfiddle.com/#!9/e333d/1
eID is a primary key. there are multiple coachID's in different equipment, so there will be duplicate coachIDs with different equipment, but the eID will be unique as it is a primary key.
REQUIRED
i need to insert a row in the equipment table, if it does not already exist. If it exists, do nothing.
various posts online have pointed me towards two options:
a) INSERT...ON DUPLICATE KEY UPDATE...
b)INSERT...WHERE NOT EXISTS
PROBLEM i have problems with both of these solutions. for the first solution (ON DUPLICATE KEY UPDATE) the query inserts the row as required but does not update the existing row. instead it creates a new entry. for the second solution (WHERE NOT EXISTS) i get an error : SYNTAX ERROR: 'WHERE' (WHERE) is not a valid input at this position.
the sql query doesnt need to make any joins. i listed both tables so that you can see how they are related. the insert query i need will only insert for the equipment table.
You can insert by using a tmp table and ensuring that the same record is not existing from current table. Add limit 1 to ensure only one record is inserted. Below query will not insert since 1 and small ball exists.
INSERT INTO `Equipment` (`c_id`, `eName`)
SELECT * FROM (SELECT '1', 'small ball') tmp
WHERE NOT EXISTS (
SELECT c_id FROM Equipment WHERE `c_id`='1' and `eName` = 'small ball'
) LIMIT 1;
NOT EXISTS
insert into table2 (....) --- all if not columns ... destination
select ....
from table1 t1 --- source of data to check
where not exists (
select 1
from table2 t2
where t2.col = t1.col --- match source and destination table making sure table1 data is not in table2
)
Let's say I have two columns member_id, email in one table users. I'm trying to add a new row if no similar data is found with below statement:
INSERT INTO users(member_id, email)
VALUES (1,'k#live.com')
WHERE NOT EXISTS (SELECT * FROM users WHERE member_id=1 AND email='k#live.com');
However, it's not working. #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE EXISTS
Please shed some light. Thanks.
Assuming you have a unique constraint on member_id, email or a combination of both, I believe you will be better served with an INSERT IGNORE, if the record doesn't exist, it will be inserted.
INSERT IGNORE INTO users(member_id, email)
values (1, 'k#live.com');
If there is no unique constraint, use this technique here
INSERT INTO users(member_id, email)
SELECT 1,'k#live.com'
FROM dual
WHERE NOT EXISTS
(SELECT * FROM users WHERE member_id=1 AND email='k#live.com');
Dual is used in the dummy select rather than users in order to limit the rows inserted to 1.
There cannot be a WHERE clause in an INSERT ... VALUES ... statement.
The normal pattern for avoiding duplicates is to add UNIQUE constraint(s).
If you want to avoid adding any duplicate "member_id" values, and you also want to avoid adding any duplicate "email" values, then
CREATE UNIQUE INDEX mytab_UX1 ON mytab (member_id);
CREATE UNIQUE INDEX mytab_UX2 ON mytab (email);
Whenever an INSERT or UPDATE attempts to create a duplicate value, a duplicate key exception (error) will be thrown. MySQL provides the IGNORE keyword which will suppress the error, and allow the statement to complete successfully, but without introducing any duplicates.
Given an empty table, the first statement would insert a row, the second and third statements would not.
INSERT IGNORE INTO mytab (member_id, email) VALUES (1,'k#live.com');
INSERT IGNORE INTO mytab (member_id, email) VALUES (2,'k#live.com');
INSERT IGNORE INTO mytab (member_id, email) VALUES (1,'aaa#bbb.com');
If you want to restrict just the combination of the two columns to being unique, that is you would allow the 2nd and 3rd statements to insert a row, then you'd add a UNIQUE constraint on the combination of the two columns, rather than two separate unique indexes as above.
CREATE UNIQUE INDEX mytab_UX1 on mytab (member_id, email);
Aside from that convention, say you don't have a unique constraint, but you only want to modify the behavior of the single insert statement, then you can use a SELECT statement to return the values you want to insert, and then you can add a WHERE clause to the SELECT.
To avoid adding any duplicate member_id or duplicate email, then something like this would accomplish that:
INSERT INTO mytab (member_id, email)
SELECT s.member_id, s.email
FROM (SELECT 1 AS member_id, 'k#live.com' AS email) s
WHERE NOT EXISTS (SELECT 1 FROM mytab d WHERE d.member_id = s.member_id)
AND NOT EXISTS (SELECT 1 FROM mytab e WHERE e.email = s.email)
For best performance with a large table, you're going to want at least two indexes, one with a leading column of member_id, and one with a leading column of email. The NOT EXISTS subqueries can make use of an index to quickly locate a "matching" row, rather than scanning every row in the table.)
Again, if it's just the combination of the two columns you want to be unique, you'd use a single NOT EXISTS subquery, as in your original query.
Alternatively, you could use an anti-join pattern as an equivalent to the NOT EXISTS subquery.
INSERT INTO mytab (member_id, email)
SELECT s.member_id, s.email
FROM (SELECT 2 AS member_id, 'k#live.com' AS email) s
LEFT
JOIN mytab d
ON d.member_id = s.member_id
LEFT
JOIN mytab e
ON e.email = s.email
WHERE d.member_id IS NULL
AND e.email IS NULL
UPDATE AggregatedData SET datenum="734152.979166667",
Timestamp="2010-01-14 23:30:00.000" WHERE datenum="734152.979166667";
It works if the datenum exists, but I want to insert this data as a new row if the datenum does not exist.
UPDATE
the datenum is unique but that's not the primary key
Jai is correct that you should use INSERT ... ON DUPLICATE KEY UPDATE.
Note that you do not need to include datenum in the update clause since it's the unique key, so it should not change. You do need to include all of the other columns from your table. You can use the VALUES() function to make sure the proper values are used when updating the other columns.
Here is your update re-written using the proper INSERT ... ON DUPLICATE KEY UPDATE syntax for MySQL:
INSERT INTO AggregatedData (datenum,Timestamp)
VALUES ("734152.979166667","2010-01-14 23:30:00.000")
ON DUPLICATE KEY UPDATE
Timestamp=VALUES(Timestamp)
Try using this:
If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that would cause a duplicate value in a UNIQUE index orPRIMARY KEY, MySQL performs an [UPDATE`](http://dev.mysql.com/doc/refman/5.7/en/update.html) of the old row...
The ON DUPLICATE KEY UPDATE clause can contain multiple column assignments, separated by commas.
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the CLIENT_FOUND_ROWS flag to mysql_real_connect() when connecting to mysqld, the affected-rows value is 1 (not 0) if an existing row is set to its current values...
This is not too bad, but we could actually combine everything into one query. I found different solutions on the internet. The simplest, but MySQL only solution is this:
INSERT INTO wp_postmeta (post_id, meta_key)
SELECT
?id,
‘page_title’
FROM
DUAL
WHERE
NOT EXISTS (
SELECT
meta_id
FROM
wp_postmeta
WHERE
post_id = ?id
AND meta_key = ‘page_title’
);
UPDATE
wp_postmeta
SET
meta_value = ?page_title
WHERE
post_id = ?id
AND meta_key = ‘page_title’;
Link to documentation.
I had a situation where I needed to update or insert on a table according to two fields (both foreign keys) on which I couldn't set a UNIQUE constraint (so INSERT ... ON DUPLICATE KEY UPDATE won't work). Here's what I ended up using:
replace into last_recogs (id, hasher_id, hash_id, last_recog)
select l.* from
(select id, hasher_id, hash_id, [new_value] from last_recogs
where hasher_id in (select id from hashers where name=[hasher_name])
and hash_id in (select id from hashes where name=[hash_name])
union
select 0, m.id, h.id, [new_value]
from hashers m cross join hashes h
where m.name=[hasher_name]
and h.name=[hash_name]) l
limit 1;
This example is cribbed from one of my databases, with the input parameters (two names and a number) replaced with [hasher_name], [hash_name], and [new_value]. The nested SELECT...LIMIT 1 pulls the first of either the existing record or a new record (last_recogs.id is an autoincrement primary key) and uses that as the value input into the REPLACE INTO.
Using the answer from this question: Need MySQL INSERT - SELECT query for tables with millions of records
new_table
* date
* record_id (pk)
* data_field
INSERT INTO new_table (date,record_id,data_field)
SELECT date, record_id, data_field FROM old_table
ON DUPLICATE KEY UPDATE date=old_table.data, data_field=old_table.data_field;
I need this to work with a group by and join.. so to edit:
INSERT INTO new_table (date,record_id,data_field,value)
SELECT date, record_id, data_field, SUM(other_table.value) as value FROM old_table JOIN other_table USING(record_id) GROUP BY record_id
ON DUPLICATE KEY UPDATE date=old_table.data, data_field=old_table.data_field, value = value;
I can't seem to get the value updated. If I specify old_table.value I get a not defined in field list error.
Per the docs at http://dev.mysql.com/doc/refman/5.0/en/insert-select.html
In the values part of ON DUPLICATE KEY UPDATE, you can refer to columns in other tables, as long as you do not use GROUP BY in the SELECT part. One side effect is that you must qualify nonunique column names in the values part.
So, you cannot use the select query because it has a group by statement. You need to use this trick instead. Basically, this creates a derived table for you to query from. It may not be incredibly efficient, but it works.
INSERT INTO new_table (date,record_id,data_field,value)
SELECT date, record_id, data_field, value
FROM (
SELECT date, record_id, data_field, SUM(other_table.value) as value
FROM old_table
JOIN other_table
USING(record_id)
GROUP BY record_id
) real_query
ON DUPLICATE KEY
UPDATE date=real_query.date, data_field=real_query.data_field, value = real_query.value;
While searching around some more, I found a related question: "MySQL ON DUPLICATE KEY UPDATE with nullable column in unique key".
The answer is that VALUES() can be used to refer to column "value" in the select sub-query.