MySQL WHERE NOT EXISTS error

Let's say I have two columns, member_id and email, in a table named users. I'm trying to add a new row only if no matching row is found, using the statement below:
INSERT INTO users(member_id, email)
VALUES (1,'k@live.com')
WHERE NOT EXISTS (SELECT * FROM users WHERE member_id=1 AND email='k@live.com');
However, it's not working. #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE EXISTS
Please shed some light. Thanks.

Assuming you have a unique constraint on member_id, on email, or on the combination of both, I believe you will be better served by INSERT IGNORE: if the record doesn't exist, it is inserted; if it does, the row is skipped with a warning instead of an error.
INSERT IGNORE INTO users(member_id, email)
values (1, 'k@live.com');
If there is no unique constraint, use this technique:
INSERT INTO users(member_id, email)
SELECT 1,'k@live.com'
FROM dual
WHERE NOT EXISTS
(SELECT * FROM users WHERE member_id=1 AND email='k@live.com');
Dual is used in the dummy select rather than users in order to limit the rows inserted to 1.

There cannot be a WHERE clause in an INSERT ... VALUES ... statement.
The normal pattern for avoiding duplicates is to add UNIQUE constraint(s).
If you want to avoid adding any duplicate "member_id" values, and you also want to avoid adding any duplicate "email" values, then
CREATE UNIQUE INDEX mytab_UX1 ON mytab (member_id);
CREATE UNIQUE INDEX mytab_UX2 ON mytab (email);
Whenever an INSERT or UPDATE attempts to create a duplicate value, a duplicate key exception (error) will be thrown. MySQL provides the IGNORE keyword which will suppress the error, and allow the statement to complete successfully, but without introducing any duplicates.
Given an empty table, the first statement would insert a row, the second and third statements would not.
INSERT IGNORE INTO mytab (member_id, email) VALUES (1,'k@live.com');
INSERT IGNORE INTO mytab (member_id, email) VALUES (2,'k@live.com');
INSERT IGNORE INTO mytab (member_id, email) VALUES (1,'aaa@bbb.com');
If you want to restrict just the combination of the two columns to being unique, that is you would allow the 2nd and 3rd statements to insert a row, then you'd add a UNIQUE constraint on the combination of the two columns, rather than two separate unique indexes as above.
CREATE UNIQUE INDEX mytab_UX1 on mytab (member_id, email);
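As a rough illustration of how the combined index changes the outcome, assume mytab is empty and carries only the composite unique index above (not the two single-column indexes). Under that assumption, the same three statements plus a repeat of the first would behave like this:

```sql
-- Assumes mytab is empty and has only UNIQUE (member_id, email).
INSERT IGNORE INTO mytab (member_id, email) VALUES (1, 'k@live.com');   -- inserted
INSERT IGNORE INTO mytab (member_id, email) VALUES (2, 'k@live.com');   -- inserted: the pair (2, 'k@live.com') is new
INSERT IGNORE INTO mytab (member_id, email) VALUES (1, 'aaa@bbb.com');  -- inserted: the pair (1, 'aaa@bbb.com') is new
INSERT IGNORE INTO mytab (member_id, email) VALUES (1, 'k@live.com');   -- skipped: this exact pair already exists
```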
Apart from that convention, if you don't have a unique constraint and only want to change the behavior of a single INSERT statement, you can use a SELECT statement to return the values you want to insert and add a WHERE clause to that SELECT.
To avoid adding any duplicate member_id or duplicate email, something like this would work:
INSERT INTO mytab (member_id, email)
SELECT s.member_id, s.email
FROM (SELECT 1 AS member_id, 'k@live.com' AS email) s
WHERE NOT EXISTS (SELECT 1 FROM mytab d WHERE d.member_id = s.member_id)
AND NOT EXISTS (SELECT 1 FROM mytab e WHERE e.email = s.email)
For best performance with a large table, you're going to want at least two indexes, one with a leading column of member_id, and one with a leading column of email. The NOT EXISTS subqueries can make use of an index to quickly locate a "matching" row, rather than scanning every row in the table.
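A minimal sketch of such indexes, for the case where no unique constraint is wanted. The index names are hypothetical; plain (non-unique) indexes are enough for the NOT EXISTS lookups:

```sql
-- Index names are made up for illustration; any unused name works.
CREATE INDEX mytab_IX1 ON mytab (member_id);
CREATE INDEX mytab_IX2 ON mytab (email);
```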
Again, if it's just the combination of the two columns you want to be unique, you'd use a single NOT EXISTS subquery, as in your original query.
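For the combination case, the single-subquery form might look like the sketch below, written against the same mytab table used in this answer:

```sql
INSERT INTO mytab (member_id, email)
SELECT s.member_id, s.email
  FROM (SELECT 1 AS member_id, 'k@live.com' AS email) s
 WHERE NOT EXISTS (SELECT 1
                     FROM mytab d
                    WHERE d.member_id = s.member_id
                      AND d.email     = s.email);  -- both columns must match to block the insert
```

Without a unique constraint this only guards a single statement; two sessions running it at the same time could still both insert the row.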
Alternatively, you could use an anti-join pattern as an equivalent to the NOT EXISTS subquery.
INSERT INTO mytab (member_id, email)
SELECT s.member_id, s.email
FROM (SELECT 2 AS member_id, 'k@live.com' AS email) s
LEFT
JOIN mytab d
ON d.member_id = s.member_id
LEFT
JOIN mytab e
ON e.email = s.email
WHERE d.member_id IS NULL
AND e.email IS NULL

Related

skipping duplicate values in mysql while inserting to a target table

Hi, I am trying to insert data into another table, and I would like to skip duplicate records in the target table. I have used the following MySQL query.
insert into adggtnz.`reg02_maininfo`(farmermobile,farmername,farmergender,origin)
select * from (SELECT mobile_no,name,sex,'EADD' FROM EADD.farmer)
as tmp where not exists (select farmermobile from adggeth.`reg02_maininfo` where farmermobile = tmp.mobile_no)
The problem is that when there is a duplicate, the query does not run to completion. How can I avoid the following error?
16:09:03 insert into adggtnz.`reg02_maininfo`(farmermobile,farmername,farmergender,origin) select * from (SELECT mobile_no,name,sex,'EADD' FROM EADD.farmer) as tmp where not exists (select farmermobile from adggeth.`reg02_maininfo` where farmermobile = tmp.mobile_no) Error Code: 1062. Duplicate entry '0724961552' for key 'PRIMARY' 0.828 sec
Please help me modify my query
If you want to avoid duplicate entries, you never EVER query first to see if a record exists. You place a unique constraint and use INSERT IGNORE or INSERT INTO ... ON DUPLICATE KEY UPDATE.
The problem with the first approach is that you can (and will) get false positives: another session can insert the same row between your check and your insert.
In your particular case, the fix is quite easy. You need to add IGNORE after INSERT. That will skip the record if duplicate and continue onto the next one.
INSERT IGNORE INTO adggtnz.`reg02_maininfo`(farmermobile,farmername,farmergender,origin)
SELECT mobile_no, name, sex, 'EADD' FROM EADD.farmer
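The ON DUPLICATE KEY UPDATE alternative mentioned above could be sketched as follows. It assumes farmermobile is the primary key (the error message in the question suggests so) and that, for an existing farmer, refreshing the name and gender is the desired behavior; that second part is an assumption, not something the question states.

```sql
INSERT INTO adggtnz.`reg02_maininfo` (farmermobile, farmername, farmergender, origin)
SELECT mobile_no, name, sex, 'EADD' FROM EADD.farmer
-- On a key collision, overwrite the stored values with the incoming ones
-- instead of raising error 1062.
ON DUPLICATE KEY UPDATE
    farmername   = VALUES(farmername),
    farmergender = VALUES(farmergender);
```

Use INSERT IGNORE instead if existing rows should be left untouched.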
Alternatively, write the SELECT so that it first checks whether the farmer's mobile number already exists in reg02_maininfo, and insert only the rows that pass that check.
insert into adggtnz.`reg02_maininfo`(farmermobile,farmername,farmergender,origin)
SELECT mobile_no,name,sex,'EADD' FROM EADD.farmer where mobile_no not in
(select farmermobile from adggeth.`reg02_maininfo`)
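Note that the queries in this thread insert into adggtnz.`reg02_maininfo` but check for existing rows in adggeth.`reg02_maininfo`. If that difference is unintentional (an assumption, since the schemas are not described), a version that checks the target table itself might look like this:

```sql
INSERT INTO adggtnz.`reg02_maininfo` (farmermobile, farmername, farmergender, origin)
SELECT f.mobile_no, f.name, f.sex, 'EADD'
  FROM EADD.farmer f
 WHERE NOT EXISTS (SELECT 1
                     FROM adggtnz.`reg02_maininfo` t   -- same table that is being inserted into
                    WHERE t.farmermobile = f.mobile_no);
```

This can still fail with error 1062 if EADD.farmer itself contains the same mobile_no more than once, which is one more reason to prefer INSERT IGNORE here.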

Auto-incrementing row ID

I'm trying to update values already stored in a table, and I've implemented an auto-incrementing primary key column so that I can reference specific rows by number (as recommended here).
Using...
ALTER TABLE taxipassengers ADD COLUMN rid INT NOT NULL AUTO_INCREMENT PRIMARY KEY
The problem I'm running into though is that now I'm getting Column count doesn't match value count at row 1 when I insert the same data as before. It's like it wants me to give it a value for the PK. If I delete the column with the PK, the error goes away, and I'm back to square one.
Am I missing something?
EDIT: Here's the insert statement
INSERT INTO taxipassengers SELECT a.post_date, b.vendor_name, c.lastName, d.firstName, null as taxiGroup
FROM (select ID,post_date from wp_posts where post_type = 'shop_order') a,
(SELECT order_id,vendor_name FROM wp_wcpv_commissions) b,
(SELECT post_id,meta_value as lastName FROM wp_postmeta where meta_key ='_billing_last_name') c,
(SELECT post_id,meta_value as firstName FROM wp_postmeta WHERE meta_key ='_billing_first_name') d
WHERE a.ID = b.order_id and b.order_id=c.post_id and c.post_id = d.post_id;
Mind you, the insert statement worked before implementing the PK column, and it still works if I remove the PK column.
Possibly you are using this syntax to insert rows
INSERT INTO mytable VALUES (1, 'abc', 'def');
INSERT syntax from MySQL manual
The columns for which the statement provides values can be specified as follows:
If you do not specify a list of column names for INSERT ... VALUES or INSERT ... SELECT, values for every column in the table must be provided by the VALUES list or the SELECT statement. If you do not know the order of the columns in the table, use DESCRIBE tbl_name to find out.
You must account for the new column in your INSERT query. For an auto-increment column, NULL can be inserted to generate a new value. Your column will be added last by default (if you don't use AFTER in ALTER TABLE).
To add a column at a specific position within a table row, use FIRST or AFTER col_name. The default is to add the column last. You can also use FIRST and AFTER in CHANGE or MODIFY operations to reorder columns within a table.
So, now your INSERT must look like this:
INSERT INTO mytable VALUES (1, 'abc', 'def', NULL); -- use NULL for autoincrement
INSERT INTO mytable (col1, col2, col3) VALUES (1, 'abc', 'def'); -- or add column names
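Applied to the INSERT ... SELECT from the question, the column-list variant might look like the sketch below. The column names of taxipassengers are not shown in the question, so the names in the INSERT list are assumptions; DESCRIBE taxipassengers shows the real ones. The rid column is left out so that AUTO_INCREMENT fills it in.

```sql
-- Column names in the INSERT list are assumed; verify with DESCRIBE taxipassengers.
INSERT INTO taxipassengers (post_date, vendor_name, lastName, firstName, taxiGroup)
SELECT a.post_date, b.vendor_name, c.meta_value, d.meta_value, NULL
  FROM wp_posts a
  JOIN wp_wcpv_commissions b ON b.order_id = a.ID
  JOIN wp_postmeta c ON c.post_id = a.ID AND c.meta_key = '_billing_last_name'
  JOIN wp_postmeta d ON d.post_id = a.ID AND d.meta_key = '_billing_first_name'
 WHERE a.post_type = 'shop_order';
```

The explicit JOINs express the same conditions as the comma-separated subqueries in the original statement.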
First, check whether your database allows a mutable PK or prefers a stable PK.
Please read through the articles below; I am sure you will see what's going wrong.
http://rogersaccessblog.blogspot.in/2008/12/what-is-primary-key.html
First, the value in the primary key cannot be duplicated
Can we update primary key values of a table?

insert into select using different column in duplicate key update

INSERT INTO options (owner, name, value, modified)
SELECT owner, name, value, modified, @draft:=draft FROM
(
...
) `options`
ON DUPLICATE KEY UPDATE value=VALUES(value), modified=@draft
The above fails with a column count doesn't match value count error.
Is there a way I can SELECT a column into @draft without it being included in the inserted values, but so that it's usable in the ON DUPLICATE KEY UPDATE clause?
As stated in the manual:
In the values part of ON DUPLICATE KEY UPDATE, you can refer to columns in other tables, as long as you do not use GROUP BY in the SELECT part. One side effect is that you must qualify nonunique column names in the values part.
Therefore, you could do:
INSERT INTO options (owner, name, value, modified)
SELECT owner, name, value, modified FROM ( ... ) options2
ON DUPLICATE KEY UPDATE value=VALUES(value), modified=options2.draft
See it on sqlfiddle.

Insert into... select from.... query with where condition

I want to write an SQL query that inserts values from one table into another, subject to a condition on the first table.
I have to check whether the row is already present in the table. If it is not present, add it; otherwise, don't.
There is an "insert into ... select from" pattern in SQL.
I have tried the following query, but it inserts many duplicates.
INSERT INTO
company_location (company_id, country_id, city_id)
SELECT
ci.company_id, hq_location, hq_city
FROM
company_info ci, company_location cl
WHERE
ci.company_id <> cl.company_id
AND cl.country_id <> ci.hq_location
AND cl.city_id <> ci.hq_city;
Avoiding duplicates means that a tuple (company_id, country_id, city_id) shouldn't be added again. I also have to add rows from 4 more tables into this table.
I also need a query for removing duplicates from company_location, i.e. each combination of (company_id, country_id, city_id) should exist only once. Keep one tuple and remove the other rows.
I hope this untested Script helps! It inserts every combination just once.
INSERT INTO company_location
(company_id,country_id,city_id)
SELECT distinct ci.company_id,
ci.hq_location,
ci.hq_city
FROM company_info ci
WHERE ci.company_id NOT IN
(SELECT cl1.company_id FROM company_location cl1
WHERE cl1.country_id = ci.hq_location
AND cl1.city_id = ci.hq_city
AND cl1.company_id = ci.company_id)
INSERT IGNORE works.
If you want a column (or column set) to be unique, put a UNIQUE constraint on your table. If you have no UNIQUE constraint then, by definition, the table cannot contain any undesirable duplicates, since not putting a UNIQUE constraint means duplicates are desirable.
Add UNIQUE (company_id, country_id, city_id) (or maybe it's your primary key for that table).
Use INSERT IGNORE.
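Those two steps could be sketched as follows. The constraint name is hypothetical, and the ALTER TABLE is expected to fail while duplicate rows are still present, so existing duplicates need to be cleaned up first.

```sql
-- Step 1: enforce uniqueness of the triple (constraint name is made up).
ALTER TABLE company_location
  ADD CONSTRAINT company_location_UX1 UNIQUE (company_id, country_id, city_id);

-- Step 2: insert everything; rows that would violate the constraint are skipped.
INSERT IGNORE INTO company_location (company_id, country_id, city_id)
SELECT ci.company_id, ci.hq_location, ci.hq_city
  FROM company_info ci;
```

With the constraint in place, the same INSERT IGNORE pattern can be repeated for each of the other source tables.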
You can also rewrite your query correctly. The query does not do what you think it does, and you cannot do what you want by using the old join syntax from the 18th century.
SELECT * FROM t1, t2, t3
Is a CROSS JOIN, this means it takes all possible combinations of rows from table t1,t2,t3. Usually the WHERE contains some "t1.id=t2.id" conditions to restrict it and turn it into an INNER JOIN, but "<>" conditions do not do this...
You need a proper LEFT JOIN :
INSERT INTO company_location (company_id,country_id,city_id)
SELECT ci.company_id, hq_location, hq_city
FROM company_info ci
LEFT JOIN company_location cl ON (
ci.company_id = cl.company_id
AND cl.country_id = ci.hq_location
AND cl.city_id = ci.hq_city
)
WHERE cl.company_id IS NULL
Here is the answer to your second question, a query to delete duplicate entries.
Please be careful with these statements; they are not tested.
Solution 1:
This solution only works if you have a row id in your table.
DELETE FROM company_location
WHERE id NOT IN
(SELECT MAX(cl1.id)
FROM company_location cl1
WHERE cl1.company_id = company_location.company_id
AND cl1.country_id = company_location.country_id
AND cl1.city_id = company_location.city_id)
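One caveat: MySQL generally rejects a DELETE whose subquery selects from the table being deleted from (error 1093, "You can't specify target table ... for update in FROM clause"), so the statement above may not run as written. A sketch of an equivalent multi-table DELETE with a self-join, under the same assumption that the table has a unique row id column named id:

```sql
-- Deletes every row for which a row with the same triple and a higher id exists,
-- so only the row with the highest id per (company_id, country_id, city_id) survives.
DELETE cl
  FROM company_location cl
  JOIN company_location newer
    ON newer.company_id = cl.company_id
   AND newer.country_id = cl.country_id
   AND newer.city_id    = cl.city_id
   AND newer.id         > cl.id;
```

Running the join as a SELECT first, on a copy of the data, is a cheap way to check which rows would be removed.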
Solution 2:
This works without a row id. It writes all data into a temporary table, deletes the content of the first table, and inserts every tuple just once.
A caution for this solution: be careful if you have defined constraints on that table!
CREATE TEMPORARY TABLE tmp_company_location
(
company_id bigint
,country_id bigint
,city_id bigint
);
INSERT INTO tmp_company_location
(company_id,country_id,city_id)
SELECT DISTINCT
company_id
,country_id
,city_id
FROM company_location WHERE 1;
DELETE FROM company_location;
INSERT INTO company_location (company_id, country_id, city_id)
SELECT DISTINCT
company_id
,country_id
,city_id
FROM tmp_company_location;
Use INSERT IGNORE INTO.
From the MySQL docs:
Specify IGNORE to ignore rows that would cause duplicate-key violations.

Mysql on duplicate key update + sub query

Using the answer from this question: Need MySQL INSERT - SELECT query for tables with millions of records
new_table
* date
* record_id (pk)
* data_field
INSERT INTO new_table (date,record_id,data_field)
SELECT date, record_id, data_field FROM old_table
ON DUPLICATE KEY UPDATE date=old_table.date, data_field=old_table.data_field;
I need this to work with a GROUP BY and a JOIN, so I edited it to:
INSERT INTO new_table (date,record_id,data_field,value)
SELECT date, record_id, data_field, SUM(other_table.value) as value FROM old_table JOIN other_table USING(record_id) GROUP BY record_id
ON DUPLICATE KEY UPDATE date=old_table.date, data_field=old_table.data_field, value = value;
I can't seem to get the value updated. If I specify old_table.value I get a not defined in field list error.
Per the docs at http://dev.mysql.com/doc/refman/5.0/en/insert-select.html
In the values part of ON DUPLICATE KEY UPDATE, you can refer to columns in other tables, as long as you do not use GROUP BY in the SELECT part. One side effect is that you must qualify nonunique column names in the values part.
So, you cannot use the select query because it has a group by statement. You need to use this trick instead. Basically, this creates a derived table for you to query from. It may not be incredibly efficient, but it works.
INSERT INTO new_table (date,record_id,data_field,value)
SELECT date, record_id, data_field, value
FROM (
SELECT date, record_id, data_field, SUM(other_table.value) as value
FROM old_table
JOIN other_table
USING(record_id)
GROUP BY record_id
) real_query
ON DUPLICATE KEY
UPDATE date=real_query.date, data_field=real_query.data_field, value = real_query.value;
While searching around some more, I found a related question: "MySQL ON DUPLICATE KEY UPDATE with nullable column in unique key".
The answer is that VALUES() can be used to refer to column "value" in the select sub-query.
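As a sketch of the VALUES() approach just mentioned, the grouped query might be written as follows. It assumes record_id uniquely identifies a row in old_table (otherwise date and data_field are ambiguous under GROUP BY record_id and may be rejected, depending on sql_mode). Recent MySQL 8.0 releases mark this use of VALUES() as deprecated, so the derived-table form shown earlier may age better.

```sql
INSERT INTO new_table (date, record_id, data_field, value)
SELECT old_table.date, record_id, old_table.data_field, SUM(other_table.value)
  FROM old_table
  JOIN other_table USING (record_id)
 GROUP BY record_id
-- VALUES(col) refers to the value this INSERT would have stored in col,
-- so no table or derived-table qualifier is needed.
ON DUPLICATE KEY UPDATE
    date       = VALUES(date),
    data_field = VALUES(data_field),
    value      = VALUES(value);
```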