Using the answer from this question: Need MySQL INSERT - SELECT query for tables with millions of records
new_table
* date
* record_id (pk)
* data_field
INSERT INTO new_table (date,record_id,data_field)
SELECT date, record_id, data_field FROM old_table
ON DUPLICATE KEY UPDATE date=old_table.date, data_field=old_table.data_field;
I need this to work with a GROUP BY and a JOIN, so I edited it to:
INSERT INTO new_table (date,record_id,data_field,value)
SELECT date, record_id, data_field, SUM(other_table.value) as value FROM old_table JOIN other_table USING(record_id) GROUP BY record_id
ON DUPLICATE KEY UPDATE date=old_table.date, data_field=old_table.data_field, value = value;
I can't seem to get the value updated. If I specify old_table.value, I get a "not defined in field list" error.
Per the docs at http://dev.mysql.com/doc/refman/5.0/en/insert-select.html
In the values part of ON DUPLICATE KEY UPDATE, you can refer to columns in other tables, as long as you do not use GROUP BY in the SELECT part. One side effect is that you must qualify nonunique column names in the values part.
So, you cannot use the select query because it has a group by statement. You need to use this trick instead. Basically, this creates a derived table for you to query from. It may not be incredibly efficient, but it works.
INSERT INTO new_table (date,record_id,data_field,value)
SELECT date, record_id, data_field, value
FROM (
SELECT date, record_id, data_field, SUM(other_table.value) as value
FROM old_table
JOIN other_table
USING(record_id)
GROUP BY record_id
) real_query
ON DUPLICATE KEY
UPDATE date=real_query.date, data_field=real_query.data_field, value = real_query.value;
While searching around some more, I found a related question: "MySQL ON DUPLICATE KEY UPDATE with nullable column in unique key".
The answer is that VALUES() can be used to refer to column "value" in the select sub-query.
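To watch the aggregate-then-upsert pattern run end to end, here is a minimal sketch that uses Python's built-in sqlite3 module instead of a MySQL server, so the syntax is an analog and not the MySQL statement itself. SQLite (3.24 or newer is assumed) writes the upsert as ON CONFLICT ... DO UPDATE and exposes the incoming row as excluded.column, which plays the role that VALUES(column) plays in the answer above. The sample rows and the names build_demo and upsert_totals are made up for illustration.

```python
import sqlite3

def build_demo():
    """Create the three tables from the question with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE old_table (date TEXT, record_id INTEGER PRIMARY KEY, data_field TEXT);
        CREATE TABLE other_table (record_id INTEGER, value INTEGER);
        CREATE TABLE new_table (date TEXT, record_id INTEGER PRIMARY KEY,
                                data_field TEXT, value INTEGER);
        INSERT INTO old_table VALUES ('2015-03-01', 1, 'a'), ('2015-03-02', 2, 'b');
        INSERT INTO other_table VALUES (1, 10), (1, 5), (2, 7);
        """
    )
    return conn

def upsert_totals(conn):
    """Insert one aggregated row per record_id, or refresh it if it already exists."""
    conn.execute(
        """
        INSERT INTO new_table (date, record_id, data_field, value)
        SELECT o.date, o.record_id, o.data_field, SUM(t.value)
          FROM old_table o
          JOIN other_table t ON t.record_id = o.record_id
         WHERE 1  -- explicit WHERE keeps ON CONFLICT from being read as a join condition
         GROUP BY o.record_id
        ON CONFLICT(record_id) DO UPDATE SET
            date = excluded.date,
            data_field = excluded.data_field,
            value = excluded.value  -- excluded.* is the row that could not be inserted
        """
    )
```

Calling upsert_totals(conn) a second time after other_table has changed overwrites value with the new sum instead of raising a duplicate-key error.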
UPDATE AggregatedData SET datenum="734152.979166667",
Timestamp="2010-01-14 23:30:00.000" WHERE datenum="734152.979166667";
It works if the datenum exists, but I want to insert this data as a new row if the datenum does not exist.
UPDATE
the datenum is unique but that's not the primary key
Jai is correct that you should use INSERT ... ON DUPLICATE KEY UPDATE.
Note that you do not need to include datenum in the update clause since it's the unique key, so it should not change. You do need to include all of the other columns from your table. You can use the VALUES() function to make sure the proper values are used when updating the other columns.
Here is your update re-written using the proper INSERT ... ON DUPLICATE KEY UPDATE syntax for MySQL:
INSERT INTO AggregatedData (datenum,Timestamp)
VALUES ("734152.979166667","2010-01-14 23:30:00.000")
ON DUPLICATE KEY UPDATE
Timestamp=VALUES(Timestamp)
Try using this:
If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, MySQL performs an UPDATE (http://dev.mysql.com/doc/refman/5.7/en/update.html) of the old row...
The ON DUPLICATE KEY UPDATE clause can contain multiple column assignments, separated by commas.
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values. If you specify the CLIENT_FOUND_ROWS flag to mysql_real_connect() when connecting to mysqld, the affected-rows value is 1 (not 0) if an existing row is set to its current values...
This is not too bad, but we could actually combine everything into one query. I found different solutions on the internet. The simplest, but MySQL only solution is this:
INSERT INTO wp_postmeta (post_id, meta_key)
SELECT
?id,
'page_title'
FROM
DUAL
WHERE
NOT EXISTS (
SELECT
meta_id
FROM
wp_postmeta
WHERE
post_id = ?id
AND meta_key = 'page_title'
);
UPDATE
wp_postmeta
SET
meta_value = ?page_title
WHERE
post_id = ?id
AND meta_key = 'page_title';
I had a situation where I needed to update or insert on a table according to two fields (both foreign keys) on which I couldn't set a UNIQUE constraint (so INSERT ... ON DUPLICATE KEY UPDATE won't work). Here's what I ended up using:
replace into last_recogs (id, hasher_id, hash_id, last_recog)
select l.* from
(select id, hasher_id, hash_id, [new_value] from last_recogs
where hasher_id in (select id from hashers where name=[hasher_name])
and hash_id in (select id from hashes where name=[hash_name])
union
select 0, m.id, h.id, [new_value]
from hashers m cross join hashes h
where m.name=[hasher_name]
and h.name=[hash_name]) l
limit 1;
This example is cribbed from one of my databases, with the input parameters (two names and a number) replaced with [hasher_name], [hash_name], and [new_value]. The nested SELECT...LIMIT 1 pulls the first of either the existing record or a new record (last_recogs.id is an autoincrement primary key) and uses that as the value input into the REPLACE INTO.
I'm reading about conditional updates on duplicate key based on IF statements - e.g., MySQL Conditional Insert on Duplicate.
I'm trying to do something similar, but within an insert from a select:
INSERT IGNORE INTO data1 (id, date, quantity)
SELECT id, date, quantity
FROM other_table
WHERE date = '2015-03-01'
AND id=123
ON DUPLICATE KEY UPDATE
quantity = IF(quantity IS NULL, VALUES(quantity), quantity)
However, this generates an error:
#1052 - Column 'quantity' in field list is ambiguous
I can't quite figure out how to tell MySQL which 'quantity' field is which in order to resolve the ambiguity problem. Adding aliases to each table doesn't seem to help (calling data1 'd' throws a different error).
Anyone have experience with this?
You should qualify the references to the quantity field that belongs to table data1 in the ON DUPLICATE KEY UPDATE part of the query:
INSERT INTO data1 (id, date, quantity)
SELECT id, date, quantity
FROM other_table
WHERE date = '2015-03-01'
AND id=123
ON DUPLICATE KEY UPDATE
quantity = IF(data1.quantity IS NULL, VALUES(quantity), data1.quantity)
A shorter way to write this IF() expression is to use function IFNULL() or COALESCE():
ON DUPLICATE KEY UPDATE
quantity = IFNULL(data1.quantity, VALUES(quantity))
or
ON DUPLICATE KEY UPDATE
quantity = COALESCE(data1.quantity, VALUES(quantity))
Also, there is no need to use IGNORE. The duplicate-key errors that IGNORE converts to warnings no longer occur, because the ON DUPLICATE KEY UPDATE clause handles them.
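The "fill it in only when the stored value is NULL" behaviour can be checked with a small sketch. It uses Python's built-in sqlite3 module, not MySQL, so the syntax is an analog: SQLite (3.24 or newer is assumed) spells the clause ON CONFLICT ... DO UPDATE, and excluded.quantity stands in for MySQL's VALUES(quantity). The composite primary key on (id, date) and the function names are assumptions made for the example.

```python
import sqlite3

def make_data1():
    """Create an empty data1 table; the (id, date) primary key is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data1 (id INTEGER, date TEXT, quantity INTEGER, "
        "PRIMARY KEY (id, date))"
    )
    return conn

def fill_missing_quantity(conn, row_id, date, quantity):
    """Insert the row, or set quantity only if the stored quantity is NULL."""
    conn.execute(
        """
        INSERT INTO data1 (id, date, quantity)
        VALUES (?, ?, ?)
        ON CONFLICT(id, date) DO UPDATE SET
            -- data1.quantity is the stored value, excluded.quantity the incoming one
            quantity = COALESCE(data1.quantity, excluded.quantity)
        """,
        (row_id, date, quantity),
    )

def stored_quantity(conn, row_id, date):
    """Return the quantity currently stored for (row_id, date)."""
    row = conn.execute(
        "SELECT quantity FROM data1 WHERE id = ? AND date = ?", (row_id, date)
    ).fetchone()
    return row[0]
```

Qualifying the stored column with the table name is the same disambiguation the answer above applies to data1.quantity.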
MySQL doesn't know which quantity column you are referring to, since it is present in both the data1 and other_table tables.
You have to qualify it like this: other_table.quantity
Your query would change like this:
INSERT IGNORE INTO data1 (id, date, quantity)
SELECT id, date, quantity
FROM other_table
WHERE date = '2015-03-01'
AND id=123
ON DUPLICATE KEY UPDATE
data1.quantity = IF(other_table.quantity IS NULL,
VALUES(other_table.quantity), other_table.quantity)
Let's say I have two columns member_id, email in one table users. I'm trying to add a new row if no similar data is found with below statement:
INSERT INTO users(member_id, email)
VALUES (1,'k#live.com')
WHERE NOT EXISTS (SELECT * FROM users WHERE member_id=1 AND email='k#live.com');
However, it's not working. #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE EXISTS
Please shed some light. Thanks.
Assuming you have a unique constraint on member_id, email, or a combination of both, I believe you will be better served by INSERT IGNORE: if the record doesn't exist it will be inserted, and if it does the insert is skipped.
INSERT IGNORE INTO users(member_id, email)
values (1, 'k#live.com');
If there is no unique constraint, use this technique:
INSERT INTO users(member_id, email)
SELECT 1,'k#live.com'
FROM dual
WHERE NOT EXISTS
(SELECT * FROM users WHERE member_id=1 AND email='k#live.com');
Dual is used in the dummy select rather than users in order to limit the rows inserted to 1.
There cannot be a WHERE clause in an INSERT ... VALUES ... statement.
The normal pattern for avoiding duplicates is to add UNIQUE constraint(s).
If you want to avoid adding any duplicate "member_id" values, and you also want to avoid adding any duplicate "email" values, then
CREATE UNIQUE INDEX mytab_UX1 ON mytab (member_id);
CREATE UNIQUE INDEX mytab_UX2 ON mytab (email);
Whenever an INSERT or UPDATE attempts to create a duplicate value, a duplicate key exception (error) will be thrown. MySQL provides the IGNORE keyword which will suppress the error, and allow the statement to complete successfully, but without introducing any duplicates.
Given an empty table, the first statement would insert a row, the second and third statements would not.
INSERT IGNORE INTO mytab (member_id, email) VALUES (1,'k#live.com');
INSERT IGNORE INTO mytab (member_id, email) VALUES (2,'k#live.com');
INSERT IGNORE INTO mytab (member_id, email) VALUES (1,'aaa#bbb.com');
If you want to restrict just the combination of the two columns to being unique, that is you would allow the 2nd and 3rd statements to insert a row, then you'd add a UNIQUE constraint on the combination of the two columns, rather than two separate unique indexes as above.
CREATE UNIQUE INDEX mytab_UX1 on mytab (member_id, email);
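The difference between two single-column unique indexes and one combined index can be checked with a short sketch. It uses Python's built-in sqlite3 module rather than MySQL, and SQLite spells MySQL's INSERT IGNORE as INSERT OR IGNORE; apart from that the index definitions and the three rows are the ones from the text. The helper name count_rows_after_inserts is made up for the example.

```python
import sqlite3

# The three rows from the INSERT IGNORE statements above.
ROWS = [(1, "k#live.com"), (2, "k#live.com"), (1, "aaa#bbb.com")]

def count_rows_after_inserts(index_statements):
    """Create mytab with the given unique indexes, try all three inserts, return the row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytab (member_id INTEGER, email TEXT)")
    for statement in index_statements:
        conn.execute(statement)
    for row in ROWS:
        # INSERT OR IGNORE is SQLite's counterpart of MySQL's INSERT IGNORE.
        conn.execute("INSERT OR IGNORE INTO mytab (member_id, email) VALUES (?, ?)", row)
    count = conn.execute("SELECT COUNT(*) FROM mytab").fetchone()[0]
    conn.close()
    return count

# Two separate indexes: only the first row survives.
separate = count_rows_after_inserts([
    "CREATE UNIQUE INDEX mytab_UX1 ON mytab (member_id)",
    "CREATE UNIQUE INDEX mytab_UX2 ON mytab (email)",
])
# One combined index: all three (member_id, email) pairs are distinct.
combined = count_rows_after_inserts([
    "CREATE UNIQUE INDEX mytab_UX1 ON mytab (member_id, email)",
])
print(separate, combined)
```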
Aside from that convention, say you don't have a unique constraint, but you only want to modify the behavior of the single insert statement, then you can use a SELECT statement to return the values you want to insert, and then you can add a WHERE clause to the SELECT.
To avoid adding any duplicate member_id or duplicate email, something like this would work:
INSERT INTO mytab (member_id, email)
SELECT s.member_id, s.email
FROM (SELECT 1 AS member_id, 'k#live.com' AS email) s
WHERE NOT EXISTS (SELECT 1 FROM mytab d WHERE d.member_id = s.member_id)
AND NOT EXISTS (SELECT 1 FROM mytab e WHERE e.email = s.email)
(For best performance with a large table, you're going to want at least two indexes, one with a leading column of member_id, and one with a leading column of email. The NOT EXISTS subqueries can make use of an index to quickly locate a "matching" row, rather than scanning every row in the table.)
Again, if it's just the combination of the two columns you want to be unique, you'd use a single NOT EXISTS subquery, as in your original query.
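The NOT EXISTS form above can be tried without a MySQL server. The sketch below wraps the same statement in a small helper using Python's built-in sqlite3 module, which accepts this SELECT unchanged; the helper names make_mytab and insert_if_absent, and the use of bound parameters in place of the literals, are assumptions for the example.

```python
import sqlite3

def make_mytab():
    """Create an empty mytab with no unique constraints at all."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytab (member_id INTEGER, email TEXT)")
    return conn

def insert_if_absent(conn, member_id, email):
    """Insert the pair unless member_id or email is already present; return True if a row was added."""
    before = conn.total_changes
    conn.execute(
        """
        INSERT INTO mytab (member_id, email)
        SELECT s.member_id, s.email
          FROM (SELECT ? AS member_id, ? AS email) s
         WHERE NOT EXISTS (SELECT 1 FROM mytab d WHERE d.member_id = s.member_id)
           AND NOT EXISTS (SELECT 1 FROM mytab e WHERE e.email = s.email)
        """,
        (member_id, email),
    )
    # total_changes grows by one only when the SELECT produced a row to insert.
    return conn.total_changes - before == 1
```

Note that without a unique constraint this check is not safe against two sessions inserting the same values at the same moment; it only changes the behaviour of the single statement.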
Alternatively, you could use an anti-join pattern as an equivalent to the NOT EXISTS subquery.
INSERT INTO mytab (member_id, email)
SELECT s.member_id, s.email
FROM (SELECT 2 AS member_id, 'k#live.com' AS email) s
LEFT
JOIN mytab d
ON d.member_id = s.member_id
LEFT
JOIN mytab e
ON e.email = s.email
WHERE d.member_id IS NULL
AND e.email IS NULL
I want to write an SQL query that inserts values from one table into another, checking a WHERE condition on the first table.
I have to check whether the row is already present in the first table. If it is not present, add it; otherwise don't add it.
SQL has the "insert into ... select ... from" pattern for this.
I tried the following query, but it inserts many duplicates.
INSERT INTO
company_location (company_id, country_id, city_id)
SELECT
ci.company_id, hq_location, hq_city
FROM
company_info ci, company_location cl
WHERE
ci.company_id <> cl.company_id
AND cl.country_id <> ci.hq_location
AND cl.city_id <> ci.hq_city;
Avoiding duplicates means that a tuple (company_id, country_id, city_id) shouldn't be added again. I also have to insert rows from 4 more tables into this table.
I also need a query for removing duplicates from company_location, i.e. each combination of (company_id, country_id, city_id) should exist only once. Keep only one tuple and remove the other rows.
I hope this untested Script helps! It inserts every combination just once.
INSERT INTO company_location
(company_id,country_id,city_id)
SELECT distinct ci.company_id,
ci.hq_location,
ci.hq_city
FROM company_info ci
WHERE ci.company_id NOT IN
(SELECT cl1.company_id FROM company_location cl1
WHERE cl1.country_id = ci.hq_location
AND cl1.city_id = ci.hq_city
AND cl1.company_id = ci.company_id)
INSERT IGNORE works.
If you want a column (or column set) to be unique, put a UNIQUE constraint on your table. If you have no UNIQUE constraint then, by definition, the table cannot contain any undesirable duplicates, since omitting a UNIQUE constraint means duplicates are desirable.
Add UNIQUE (company_id, country_id, city_id) (or maybe it's your primary key for that table)
Use INSERT IGNORE
You can also rewrite your query correctly. The query does not do what you think it does, and you cannot do what you want by using the old join syntax from the 18th century.
SELECT * FROM t1, t2, t3
is a CROSS JOIN: it takes all possible combinations of rows from tables t1, t2, and t3. Usually the WHERE clause contains some "t1.id=t2.id" conditions to restrict it and turn it into an INNER JOIN, but "<>" conditions do not do this...
You need a proper LEFT JOIN :
INSERT INTO company_location (company_id,country_id,city_id)
SELECT ci.company_id, hq_location, hq_city
FROM company_info ci
LEFT JOIN company_location cl ON (
ci.company_id = cl.company_id
AND cl.country_id = ci.hq_location
AND cl.city_id = ci.hq_city
)
WHERE cl.company_id IS NULL
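As a quick check that the LEFT JOIN ... IS NULL form only adds missing rows, here is a minimal sketch using Python's built-in sqlite3 module (not MySQL; this SELECT is accepted unchanged by SQLite). The sample rows and the function names are made up for illustration, and the column types are assumptions.

```python
import sqlite3

def make_company_tables():
    """Create both tables with made-up rows; company 1 already has its location."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE company_info (company_id INTEGER, hq_location INTEGER, hq_city INTEGER);
        CREATE TABLE company_location (company_id INTEGER, country_id INTEGER, city_id INTEGER);
        INSERT INTO company_info VALUES (1, 10, 100), (2, 20, 200);
        INSERT INTO company_location VALUES (1, 10, 100);
        """
    )
    return conn

def copy_missing_locations(conn):
    """Copy headquarters rows that company_location lacks; return how many were added."""
    before = conn.total_changes
    conn.execute(
        """
        INSERT INTO company_location (company_id, country_id, city_id)
        SELECT ci.company_id, ci.hq_location, ci.hq_city
          FROM company_info ci
          LEFT JOIN company_location cl
            ON cl.company_id = ci.company_id
           AND cl.country_id = ci.hq_location
           AND cl.city_id = ci.hq_city
         WHERE cl.company_id IS NULL  -- keep only rows with no match in company_location
        """
    )
    return conn.total_changes - before
```

Running copy_missing_locations twice in a row adds the missing row the first time and nothing the second time. If company_info itself can contain repeated rows, adding DISTINCT to the SELECT would be needed as well.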
Here is the answer to your second question, a query to delete duplicate entries:
Please be careful with these statements; they are not tested.
Solution 1:
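This solution only works if you have a row id column in your table (called id here). MySQL does not allow a DELETE to select from the table it is deleting from in a subquery (error 1093), so this uses a self-join: it deletes every row for which a duplicate with a higher id exists, which keeps the row with the highest id in each group.
DELETE cl1
FROM company_location cl1
JOIN company_location cl2
ON cl2.company_id = cl1.company_id
AND cl2.country_id = cl1.country_id
AND cl2.city_id = cl1.city_id
AND cl2.id > cl1.id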
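The keep-the-highest-id rule of Solution 1 can be checked on a handful of rows. This sketch uses Python's built-in sqlite3 module, not MySQL, and groups inside the subquery instead of correlating it, so treat it as a check of the rule rather than a statement to paste into MySQL. The sample rows and function names are made up for illustration.

```python
import sqlite3

def make_locations_with_duplicates():
    """Create company_location with an autoincrementing id and some repeated tuples."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE company_location (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            company_id INTEGER, country_id INTEGER, city_id INTEGER);
        INSERT INTO company_location (company_id, country_id, city_id) VALUES
            (1, 10, 100), (1, 10, 100), (2, 20, 200), (1, 10, 100), (2, 20, 201);
        """
    )
    return conn

def delete_duplicates(conn):
    """Keep the row with the highest id in each (company_id, country_id, city_id) group."""
    conn.execute(
        """
        DELETE FROM company_location
         WHERE id NOT IN (
               SELECT MAX(id)
                 FROM company_location
                GROUP BY company_id, country_id, city_id)
        """
    )

def remaining_ids(conn):
    """Return the ids still present, in ascending order."""
    return [row[0] for row in conn.execute("SELECT id FROM company_location ORDER BY id")]
```

With the five sample rows, ids 1 and 2 are removed because id 4 holds the same tuple, and ids 3, 4 and 5 remain.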
Solution 2:
This works without a row id. It writes all distinct rows into a temporary table, deletes the contents of the original table, and then inserts every tuple back just once.
Be careful with this solution if you have defined constraints on that table!
CREATE TEMPORARY TABLE tmp_company_location
(
company_id bigint
,country_id bigint
,city_id bigint
);
INSERT INTO tmp_company_location
(company_id,country_id,city_id)
SELECT DISTINCT
company_id
,country_id
,city_id
FROM company_location WHERE 1;
DELETE FROM company_location;
INSERT INTO company_location
(company_id,country_id,city_id)
SELECT DISTINCT
company_id
,country_id
,city_id
FROM tmp_company_location;
Use INSERT IGNORE INTO.
From the MySQL docs:
Specify IGNORE to ignore rows that would cause duplicate-key violations.