INSERT if related row does not exist - mysql

INSERT IGNORE doesn't work because there won't actually be a key conflict.
This is for a progress queue. For one application I don't want two rows with status S (for "started"). However other applications should be able to - in particular if I want to force two work items to happen at once, I can. So that's why it's not a database constraint.
So it's a little tricky to say in SQL "insert where does not exist." This works:
insert into user_queue_google
(user_id, status, start_timestamp)
(select 221, 'S', NOW() FROM filter_type
WHERE 221 not in (select user_id from user_queue_google where status = 'S') limit 1);
The problem is filter_type is a completely unrelated table that I just know is small and happens to never be empty. If it were empty for some reason, this would fail.
Can I avoid this very horrible hack without resorting to stored programs or IF/THEN logic in my SQL program?

Use the DUAL dummy table:
insert into user_queue_google (user_id, status, start_timestamp)
select 221, 'S', NOW()
from dual
WHERE not exists
(
select user_id
from user_queue_google
where status = 'S'
and user_id = 221
)
From the MySQL manual: "You are permitted to specify DUAL as a dummy table name in situations where no tables are referenced."
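The effect of the NOT EXISTS guard can be checked without a MySQL server. The sketch below is only an illustration, assuming SQLite (through Python's sqlite3 module) as a stand-in: SQLite has no DUAL table and allows a SELECT with a WHERE clause and no FROM, and datetime('now') replaces NOW(). The helper name start_job is hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_queue_google (user_id INTEGER, status TEXT, start_timestamp TEXT)"
)

def start_job(user_id):
    """Insert a 'started' row unless this user already has one; return rows inserted."""
    cur = conn.execute(
        """
        INSERT INTO user_queue_google (user_id, status, start_timestamp)
        SELECT ?, 'S', datetime('now')
        WHERE NOT EXISTS (
            SELECT 1 FROM user_queue_google
            WHERE status = 'S' AND user_id = ?
        )
        """,
        (user_id, user_id),
    )
    return cur.rowcount

first = start_job(221)   # no 'S' row yet, so one row is inserted
second = start_job(221)  # an 'S' row already exists, so nothing is inserted
other = start_job(222)   # a different user is unaffected
total = conn.execute("SELECT COUNT(*) FROM user_queue_google").fetchone()[0]
print(first, second, other, total)
```

Note that the check and the insert are a single statement, but two sessions running it at the same moment can still both pass the check unless the table is locked or a unique key backs it up.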

You need a UNIQUE index for INSERT IGNORE to work. What you could do is add an extra column that defaults to something harmless but can be changed to make the key unique (FORCE is a reserved word in MySQL, so the column name needs backticks):
CREATE UNIQUE INDEX index_block ON user_queue_google (user_id, `force`);
Define `force` as a column with DEFAULT 0. If you want to force two jobs to run at the same time, set it to something else to avoid a conflict:
UPDATE user_queue_google SET status='S', `force`=UNIX_TIMESTAMP() WHERE user_id=221;
That'll free up a new slot for inserting new jobs.
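A rough way to see the slot being blocked and then freed, again assuming SQLite as a stand-in for MySQL: SQLite spells the statement INSERT OR IGNORE rather than INSERT IGNORE, and the fixed number written into the force column is just a placeholder for UNIX_TIMESTAMP(). The enqueue helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_queue_google (
    user_id INTEGER,
    status TEXT,
    "force" INTEGER NOT NULL DEFAULT 0
);
CREATE UNIQUE INDEX index_block ON user_queue_google (user_id, "force");
""")

def enqueue(user_id):
    """Try to add a started job; return 1 if a row was inserted, 0 if it was ignored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO user_queue_google (user_id, status) VALUES (?, 'S')",
        (user_id,),
    )
    return cur.rowcount

first = enqueue(221)    # slot (221, 0) is free
blocked = enqueue(221)  # slot (221, 0) is taken, so the insert is ignored
# Move the existing row out of the default slot (placeholder for UNIX_TIMESTAMP()).
conn.execute('UPDATE user_queue_google SET "force" = 1700000000 WHERE user_id = 221')
forced = enqueue(221)   # slot (221, 0) is free again
total = conn.execute("SELECT COUNT(*) FROM user_queue_google").fetchone()[0]
print(first, blocked, forced, total)
```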

This should work:
insert into user_queue_google
(user_id, status, start_timestamp)
select
A.user_id,
A.status,
A.start_timestamp
FROM
(SELECT
221 user_id,
'S' status,
NOW() start_timestamp) AS A
LEFT JOIN user_queue_google ON A.user_id = user_queue_google.user_id AND A.status = user_queue_google.status
WHERE
user_queue_google.user_id IS NULL;
See attached fiddle Insert if row doesn't exist

Related

MySQL UPSERT not recognizing alias

I have a query that performs an UPSERT (Insert - but if exists, update).
MySQL complains it isn't valid, here is the query:
insert into
mytable (user_id, num_products_observed, num_purchased_percent)
(select
A.user_id,
B.total 'num_products_observed',
case
when A.purchased is null then 0
else A.purchased/B.total
end 'num_purchased_percent'
from
(select user_id, count(prod_observed) 'total' from products where user_id = ? ) B
left join (select user_id, count(prod_purch) 'purchased' from products_purchased) A on B.user_id = A.user_id
) newsum -- <--- ISSUE IS HERE
ON DUPLICATE KEY UPDATE
num_products_observed = newsum.num_products_observed,
num_purchased_percent = newsum.num_purchased_percent
I hope this makes sense to you. The issue is at the line which reads ) newsum. MySQL complains about the alias I'm giving the derived table. user_id is unique in this table (mytable).
It IS possible that B.total is null, in which case everything in newsum is null - which is fine, then I don't want to insert or update anything (or an update with user_id and zeros for all would be fine too).
Any thoughts on what I'm doing wrong? Thanks
Have you tried the VALUES() function instead of the alias? From the MySQL manual (dev.mysql.com):
"You can use the VALUES(col_name) function in the UPDATE clause to refer to column values from the INSERT portion of the INSERT ... ON DUPLICATE KEY UPDATE statement."

insert when a particular record not found

I have a situation where I need to insert only if a record does not exist. Normally I would use two queries with a condition, like this:
SELECT FROM TABLE ->
IF RECORD NOT FOUND THEN -> INSERT INTO TABLE
ELSE -> DO NOTHING
I feel my solution is not a good one. How can I achieve the same thing with just a single query? For example:
SELECT * from user where status='A' AND name='Lewis'
IF RECORD NOT FOUND THEN
INSERT INTO user(status,name) VALUES('F','Lewis');
You mean something like:
BEGIN
IF NOT EXISTS (SELECT * from user where status='A' AND name='Lewis')
BEGIN
INSERT INTO user (status, name)
VALUES('F','Lewis')
END
END
This works in SQL Server, and should be possible in MySQL as well.
Edit:
Apparently it's not working in MySQL (just tested). However, you could use INSERT IGNORE:
INSERT IGNORE INTO user2 (status, name)
VALUES('F','Lewis');
Note that that would only work if you have a unique or primary key.
Another way, with the same unique or primary key in place, is to drop IGNORE and use a no-op ON DUPLICATE KEY UPDATE:
INSERT INTO user2 (status, name)
VALUES('F','Lewis')
ON DUPLICATE KEY UPDATE status=status;
This avoids other errors being ignored as well; only the duplicate key case is suppressed.

MySQL WHERE NOT EXISTS error

Let's say I have two columns member_id, email in one table users. I'm trying to add a new row if no similar data is found with below statement:
INSERT INTO users(member_id, email)
VALUES (1,'k#live.com')
WHERE NOT EXISTS (SELECT * FROM users WHERE member_id=1 AND email='k#live.com');
However, it's not working. #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'WHERE EXISTS
Please shed some light. Thanks.
Assuming you have a unique constraint on member_id, email or a combination of both, I believe you will be better served by INSERT IGNORE: if the record doesn't exist it is inserted, and if it does the statement is skipped.
INSERT IGNORE INTO users(member_id, email)
values (1, 'k#live.com');
If there is no unique constraint, use this technique:
INSERT INTO users(member_id, email)
SELECT 1,'k#live.com'
FROM dual
WHERE NOT EXISTS
(SELECT * FROM users WHERE member_id=1 AND email='k#live.com');
Dual is used in the dummy select rather than users in order to limit the rows inserted to 1.
There cannot be a WHERE clause in an INSERT ... VALUES ... statement.
The normal pattern for avoiding duplicates is to add UNIQUE constraint(s).
If you want to avoid adding any duplicate "member_id" values, and you also want to avoid adding any duplicate "email" values, then
CREATE UNIQUE INDEX mytab_UX1 ON mytab (member_id);
CREATE UNIQUE INDEX mytab_UX2 ON mytab (email);
Whenever an INSERT or UPDATE attempts to create a duplicate value, a duplicate key exception (error) will be thrown. MySQL provides the IGNORE keyword which will suppress the error, and allow the statement to complete successfully, but without introducing any duplicates.
Given an empty table, the first statement would insert a row, the second and third statements would not.
INSERT IGNORE INTO mytab (member_id, email) VALUES (1,'k#live.com');
INSERT IGNORE INTO mytab (member_id, email) VALUES (2,'k#live.com');
INSERT IGNORE INTO mytab (member_id, email) VALUES (1,'aaa#bbb.com');
If you want to restrict just the combination of the two columns to being unique, that is you would allow the 2nd and 3rd statements to insert a row, then you'd add a UNIQUE constraint on the combination of the two columns, rather than two separate unique indexes as above.
CREATE UNIQUE INDEX mytab_UX1 on mytab (member_id, email);
Aside from that convention, say you don't have a unique constraint, but you only want to modify the behavior of the single insert statement, then you can use a SELECT statement to return the values you want to insert, and then you can add a WHERE clause to the SELECT.
To avoid adding any duplicate member_id or duplicate email, then something like this would accomplish that:
INSERT INTO mytab (member_id, email)
SELECT s.member_id, s.email
FROM (SELECT 1 AS member_id, 'k#live.com' AS email) s
WHERE NOT EXISTS (SELECT 1 FROM mytab d WHERE d.member_id = s.member_id)
AND NOT EXISTS (SELECT 1 FROM mytab e WHERE e.email = s.email)
For best performance with a large table, you're going to want at least two indexes, one with a leading column of member_id, and one with a leading column of email. The NOT EXISTS subqueries can make use of an index to quickly locate a "matching" row, rather than scanning every row in the table.
Again, if it's just the combination of the two columns you want to be unique, you'd use a single NOT EXISTS subquery, as in your original query.
Alternatively, you could use an anti-join pattern as an equivalent to the NOT EXISTS subquery.
INSERT INTO mytab (member_id, email)
SELECT s.member_id, s.email
FROM (SELECT 2 AS member_id, 'k#live.com' AS email) s
LEFT
JOIN mytab d
ON d.member_id = s.member_id
LEFT
JOIN mytab e
ON e.email = s.email
WHERE d.member_id IS NULL
AND e.email IS NULL
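To see how the anti-join treats the three sample rows from earlier in this answer, here is a small sketch. It assumes SQLite (via Python's sqlite3) purely as a runnable stand-in for MySQL, with no unique indexes on mytab; the try_insert helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytab (member_id INTEGER, email TEXT)")

# Same anti-join as above, with the literal values replaced by parameters.
ANTI_JOIN_INSERT = """
INSERT INTO mytab (member_id, email)
SELECT s.member_id, s.email
FROM (SELECT ? AS member_id, ? AS email) s
LEFT JOIN mytab d ON d.member_id = s.member_id
LEFT JOIN mytab e ON e.email = s.email
WHERE d.member_id IS NULL
  AND e.email IS NULL
"""

def try_insert(member_id, email):
    """Return the number of rows the guarded insert actually added."""
    return conn.execute(ANTI_JOIN_INSERT, (member_id, email)).rowcount

results = [
    try_insert(1, "k#live.com"),   # table is empty, row goes in
    try_insert(2, "k#live.com"),   # email already present, skipped
    try_insert(1, "aaa#bbb.com"),  # member_id already present, skipped
]
rows = conn.execute("SELECT member_id, email FROM mytab").fetchall()
print(results, rows)
```

This matches the behaviour described for the two separate unique indexes: only the first statement adds a row.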

Update with Subquery never completes

I'm currently working on a project with a MySQL Db of more than 8 million rows. I have been provided with a part of it to test some queries on it. It has around 20 columns out of which 5 are of use to me. Namely: First_Name, Last_Name, Address_Line1, Address_Line2, Address_Line3, RefundID
I have to create a unique but random RefundID for each row; that is not the problem. The problem is creating the same RefundID for those rows whose First_Name, Last_Name, Address_Line1, Address_Line2 and Address_Line3 are the same.
This is my first real work related to MySQL with such large row count. So far I have created these queries:
-- Creating Temporary Table --
CREATE temporary table tempT (SELECT tt.First_Name, count(tt.Address_Line1) as
a1, count(tt.Address_Line2) as a2, count(tt.Address_Line3) as a3, tt.RefundID
FROM `tempTable` tt GROUP BY First_Name HAVING a1 >= 2 AND a2 >= 2 AND a3 >= 2);
-- Updating Rows with First_Name from tempT --
UPDATE `tempTable` SET RefundID = FLOOR(RAND()*POW(10,11))
WHERE First_Name IN (SELECT First_Name FROM tempT WHERE First_Name is not NULL);
This update query keeps on running but never ends, tempT has more than 30K rows. This query will then be run on the main DB with more than 800K rows.
Can someone help me out with this?
Regards
The solutions that seem obvious to me....
Don't use a random value - use a hash (MD5() takes a single string, so concatenate the columns first):
UPDATE yourtable
SET refundid = MD5(CONCAT_WS('|', 'some static salt', First_Name
, Last_Name, Address_Line1, Address_Line2, Address_Line3))
The problem is that if you are using an integer value for the refundId then there's a good chance of getting a collision (hint: CONV(SUBSTR(MD5(...),1,16),16,10) gives a 64-bit value, which fits an unsigned BIGINT). But you didn't say what the type of the field was, nor how strict the 'unique' requirement was. It does carry out the update in a single pass though.
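The idea behind the hash is that identical name and address fields always yield the same id, with no lookup table. A minimal sketch in Python, assuming a '|' separator and treating missing values as empty strings (both are choices made for this illustration, so the numbers will not necessarily match what MySQL computes); the function name refund_id is hypothetical:

```python
import hashlib

def refund_id(first_name, last_name, line1, line2, line3, salt="some static salt"):
    """Derive a repeatable 64-bit id from the name and address fields."""
    # Join with a separator so ("ab", "c") and ("a", "bc") hash differently.
    # None is rendered as an empty string, a choice made for this sketch.
    parts = [salt, first_name, last_name, line1, line2, line3]
    text = "|".join("" if p is None else p for p in parts)
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    # First 16 hex digits = 64 bits, the same slice as
    # CONV(SUBSTR(MD5(...),1,16),16,10) in the hint.
    return int(digest[:16], 16)

a = refund_id("Ann", "Lee", "1 Main St", None, None)
b = refund_id("Ann", "Lee", "1 Main St", None, None)
c = refund_id("Ann", "Lee", "2 Main St", None, None)
print(a == b, a == c)
```

Truncating the hash to 64 bits makes collisions between different people very unlikely at this row count, though not impossible, which is why the strictness of the 'unique' requirement matters.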
An alternate approach, which creates a densely packed sequence of numbers, is to create a temporary table with the unique values from the original table and a random value. Order by the random value and set a monotonically increasing refundId - then use this as a lookup table or update the original table:
-- distinct name/address combinations, each with a random sort key
CREATE TEMPORARY TABLE temptable (refundId BIGINT) AS
SELECT d.*, RAND() AS randomvalue
FROM (SELECT DISTINCT First_Name
, Last_Name, Address_Line1, Address_Line2, Address_Line3
FROM yourtable) AS d;
SET @counter = -1;
-- number the rows in random order
UPDATE temptable t SET t.refundId = (@counter := @counter + 1)
ORDER BY t.randomvalue;
There are other solutions too - but the more efficient ones rely on having multiple copies of the data and/or using a procedural language.
Try using the following:
UPDATE `tempTable` x SET RefundID = FLOOR(RAND()*POW(10,11))
WHERE exists (SELECT 1 FROM tempT y WHERE First_Name is not NULL and x.First_Name=y.First_Name);
In MySQL, it is often more efficient to use join with update than to filter through the where clause using a subquery. The following might perform better:
UPDATE `tempTable` join
(SELECT distinct First_Name
FROM tempT
WHERE First_Name is not NULL
) fn
on `tempTable`.First_Name = fn.First_Name
SET RefundID = FLOOR(RAND()*POW(10,11));

Insert into... select from.... query with where condition

I want to write an SQL query that inserts values from one table into another, subject to a WHERE condition.
I have to check whether the row is already present in the target table. If it is not present, add it; otherwise don't.
SQL has the "INSERT INTO ... SELECT ... FROM" pattern for this.
I have tried the following query, but it inserts many duplicates.
INSERT INTO
company_location (company_id, country_id, city_id)
SELECT
ci.company_id, hq_location, hq_city
FROM
company_info ci, company_location cl
WHERE
ci.company_id <> cl.company_id
AND cl.country_id <> ci.hq_location
AND cl.city_id <> ci.hq_city;
Avoiding duplicates means that a tuple (company_id, country_id, city_id) shouldn't be added again. I also have to add rows from four more tables into this table.
I also need a query for removing duplicates from company_location, i.e. each combination of (company_id, country_id, city_id) should exist only once. Keep one tuple and remove the other rows.
I hope this untested script helps! It inserts every combination just once.
INSERT INTO company_location
(company_id,country_id,city_id)
SELECT distinct ci.company_id,
ci.hq_location,
ci.hq_city
FROM company_info ci
WHERE ci.company_id NOT IN
(SELECT cl1.company_id FROM company_location cl1
WHERE cl1.country_id = ci.hq_location
AND cl1.city_id = ci.hq_city
AND cl1.company_id = ci.company_id)
INSERT IGNORE works.
If you want a column (or column set) to be unique, put a UNIQUE constraint on your table. If you have no UNIQUE constraint then, by definition, the table cannot contain any undesirable duplicates, since omitting the constraint means duplicates are acceptable.
Add UNIQUE (company_id, country_id, city_id) (or maybe it's your primary key for that table)
Use INSERT IGNORE
You can also rewrite your query correctly. The query does not do what you think it does, and you cannot do what you want by using the old join syntax from the 18th century.
SELECT * FROM t1, t2, t3
is a CROSS JOIN: it takes all possible combinations of rows from tables t1, t2 and t3. Usually the WHERE clause contains some "t1.id=t2.id" conditions to restrict it and turn it into an INNER JOIN, but "<>" conditions do not do this...
You need a proper LEFT JOIN :
INSERT INTO company_location (company_id,country_id,city_id)
SELECT ci.company_id, hq_location, hq_city
FROM company_info ci
LEFT JOIN company_location cl ON (
ci.company_id = cl.company_id
AND cl.country_id = ci.hq_location
AND cl.city_id = ci.hq_city
)
WHERE cl.company_id IS NULL
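The difference between the two SELECTs is easy to see on a few rows. The sketch below assumes SQLite (via Python's sqlite3) as a stand-in for MySQL and uses made-up sample data: three companies, one of which already has its location stored, plus one unrelated location row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company_info (company_id INTEGER, hq_location INTEGER, hq_city INTEGER);
CREATE TABLE company_location (company_id INTEGER, country_id INTEGER, city_id INTEGER);
INSERT INTO company_info VALUES (1, 10, 100), (2, 20, 200), (3, 30, 300);
INSERT INTO company_location VALUES (1, 10, 100), (9, 90, 900);
""")

# The question's SELECT: a cross join filtered only by "<>" conditions.
cross_rows = conn.execute("""
SELECT ci.company_id, hq_location, hq_city
FROM company_info ci, company_location cl
WHERE ci.company_id <> cl.company_id
  AND cl.country_id <> ci.hq_location
  AND cl.city_id <> ci.hq_city
""").fetchall()

# The LEFT JOIN version: keep only companies with no matching location row.
anti_rows = conn.execute("""
SELECT ci.company_id, ci.hq_location, ci.hq_city
FROM company_info ci
LEFT JOIN company_location cl
  ON ci.company_id = cl.company_id
 AND cl.country_id = ci.hq_location
 AND cl.city_id = ci.hq_city
WHERE cl.company_id IS NULL
ORDER BY ci.company_id
""").fetchall()

print(len(cross_rows))
print(anti_rows)
```

With this data the cross join returns one row per (company, non-matching location) pair, including company 1 even though its location is already stored, while the LEFT JOIN returns each missing company once.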
Here is the answer to your second question, a query to delete duplicate entries.
Please be careful with these statements; they are not tested.
Solution 1:
This solution only works if you have a row id in your table. MySQL does not allow a DELETE to select from its own target table in a subquery (error 1093), so it is written as a self-join that removes every row for which a duplicate with a higher id exists:
DELETE cl
FROM company_location cl
JOIN company_location keep
ON keep.company_id = cl.company_id
AND keep.country_id = cl.country_id
AND keep.city_id = cl.city_id
AND keep.id > cl.id
Solution 2:
This works without a row id. It writes all the data into a temporary table, deletes the contents of the original table, and inserts every tuple just once.
Be careful with this solution if you have defined constraints on that table!
CREATE TEMPORARY TABLE tmp_company_location
(
company_id bigint
,country_id bigint
,city_id bigint
);
INSERT INTO tmp_company_location
(company_id,country_id,city_id)
SELECT DISTINCT
company_id
,country_id
,city_id
FROM company_location WHERE 1;
DELETE FROM company_location;
INSERT INTO company_location
SELECT DISTINCT
company_id
,country_id
,city_id
FROM tmp_company_location;
Use INSERT IGNORE INTO.
From the MySQL docs:
"Specify IGNORE to ignore rows that would cause duplicate-key violations."