Given:
Table A (~400k rows): columns guid, missing
Table B (~150k rows): column guid
Both tables share guids, but I need to mark in A every guid that is missing from B. Both guid columns are indexed.
Query:
update table_a
left join table_b b on table_a.guid = b.guid
set missing = true
where b.guid is null;
This query works but took 4.5 hours on my machine. Is there any way to make it run faster?
UPD:
All three answers below gave me some tips to think on.
The following query ran for 8 seconds.
update table_a a
set missing = true
where a.guid not in (
select a.guid
from table_b b,
table_a a
where b.guid = a.guid
);
It's much faster; it only tests whether the row exists.
UPDATE table_a ta
SET missing = true
WHERE NOT EXISTS ( SELECT 1 from table_b tb WHERE ta.guid = tb.guid );
Does table_b have an index starting with guid? (A PRIMARY KEY is an INDEX.)
Do you need to run this query frequently? Let's get rid of it after this initial update. In the future, whenever you modify table_b, reach over and update table_a. TRIGGERs might be a good way to do this. A DELETE TRIGGER could set missing=1; an INSERT TRIGGER would do the reverse, and so on.
If, instead, you choose to run the UPDATE repeatedly, see this for how to chunk the action, etc.: http://mysql.rjweb.org/doc.php/deletebig#deleting_in_chunks
Another approach is to check for whether the row is "missing" by a LEFT JOIN in the SELECT, not by having the column.
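A minimal sketch of the two ideas above, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, as are the sample guids; table and column names follow the question). It runs the NOT EXISTS update, then shows the column-free alternative that derives the flag with a LEFT JOIN at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (guid TEXT PRIMARY KEY, missing INTEGER DEFAULT 0);
    CREATE TABLE table_b (guid TEXT PRIMARY KEY);
    INSERT INTO table_a (guid) VALUES ('g1'), ('g2'), ('g3'), ('g4');
    INSERT INTO table_b (guid) VALUES ('g1'), ('g3');
""")

# Anti-join via NOT EXISTS: flag rows of table_a with no match in table_b.
conn.execute("""
    UPDATE table_a
    SET missing = 1
    WHERE NOT EXISTS (SELECT 1 FROM table_b tb WHERE tb.guid = table_a.guid)
""")
flagged = [g for (g,) in conn.execute(
    "SELECT guid FROM table_a WHERE missing = 1 ORDER BY guid")]

# Alternative: no stored column, derive the flag with a LEFT JOIN when reading.
derived = [g for (g,) in conn.execute("""
    SELECT a.guid
    FROM table_a a
    LEFT JOIN table_b b ON b.guid = a.guid
    WHERE b.guid IS NULL
    ORDER BY a.guid
""")]
print(flagged, derived)
```

Both lists contain the same guids, so the stored column is only worth keeping if reading it is much more frequent than changing table_b.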
Using TRIGGERs would be something like this:
DELIMITER //
CREATE TRIGGER del BEFORE DELETE ON table_b
FOR EACH ROW
BEGIN
UPDATE table_a
SET missing = true
WHERE guid = OLD.guid; -- or maybe test id??
END;
//
DELIMITER ;
And one for INSERT. And one for UPDATE, if you might change guid. The UPDATE trigger would need two commands against table_a: one for the old guid (a la DELETE), one for the new (a la INSERT).
These would add a small burden when table_b is modified, but probably not enough to worry about.
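As a rough illustration of the DELETE plus INSERT pair, here is a sketch run through Python's sqlite3 module (an assumption; SQLite's trigger syntax is close to MySQL's but needs no DELIMITER, and the trigger names are made up). The UPDATE-of-guid trigger is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (guid TEXT PRIMARY KEY, missing INTEGER NOT NULL DEFAULT 1);
    CREATE TABLE table_b (guid TEXT PRIMARY KEY);

    -- A guid arriving in table_b is no longer missing.
    CREATE TRIGGER table_b_ins AFTER INSERT ON table_b
    FOR EACH ROW
    BEGIN
        UPDATE table_a SET missing = 0 WHERE guid = NEW.guid;
    END;

    -- A guid leaving table_b becomes missing again.
    CREATE TRIGGER table_b_del AFTER DELETE ON table_b
    FOR EACH ROW
    BEGIN
        UPDATE table_a SET missing = 1 WHERE guid = OLD.guid;
    END;

    INSERT INTO table_a (guid) VALUES ('g1'), ('g2');
""")

def missing_flags(connection):
    """Return {guid: missing} for every row of table_a."""
    return dict(connection.execute("SELECT guid, missing FROM table_a"))

conn.execute("INSERT INTO table_b (guid) VALUES ('g1')")
after_insert = missing_flags(conn)
conn.execute("DELETE FROM table_b WHERE guid = 'g1'")
after_delete = missing_flags(conn)
print(after_insert, after_delete)
```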
Try:
update table_a
set missing = true
where guid not in (select guid from table_b)
Related
I want to use a trigger to automatically update another table but I'm having some problems with it.
DELIMITER $$
DROP TRIGGER IF EXISTS `trigger1` $$
CREATE TRIGGER `trigger1`
AFTER UPDATE ON `table1` FOR EACH ROW
BEGIN
UPDATE `table4`
inner join (SELECT o.`Name`,
o.Date,
(o.`Availability` * (c.Rate)) total
FROM `table2` o
LEFT JOIN `table1` r
ON o.`Name` = r.`Name`
AND o.Date = r.Date
LEFT JOIN (SELECT table3.`Name`,Choke, Rate FROM table3
left join `table1` as w
on table3.`Name` = w.`Name`
and table3.Choke = w.`Size`
where w.`Name` = table3.`Name`
and table3.Date <= w.Date
ORDER BY table3.Date DESC
LIMIT 1) c
ON c.`Name` = o.`Name`)x
set `Contribution` = x.total
where (`table4`.Date) = x.Date and `table4`.`Name` = x.`Name`;
END $$
DELIMITER ;
I would like to use the date from table1 row (that is the table which triggers the trigger) in my left join named c. As it stands c.Rate gives the same value every time because it uses the default table1.
If the row being updated has a date of '2022-01-13', then on this line:
and table3.Date <= w.Date
I want w.Date to be '2022-01-13'. As it stands I can't get that, and every c.Rate has the same value.
Thanks.
The lack of consistent indentation and capitalisation makes your query almost impossible to read. Instead of obfuscating what is going on by using table1, table2, table3 & table4 you would be better off using the real table names, as it will make more sense to anyone trying to read it.
Your current update query makes little sense with the repeated left joins back to the originating table1 but it is hard to be sure given the lack of supporting information in your question. Your first left join to table1, aliased as r, does not get used anywhere. Your second left join to table1, aliased as w, is then referenced in the where clause which turns it into an inner join.
I suggest you update your question with the CREATE TABLE statements and some sample data to show the values before and after executing your update and the trigger update. Your current update query is definitely not the most efficient way of achieving your goal.
I don't really understand your question, but it seems that all you are asking is how to use the value from the table1 row being updated. In which case the answer is simply -
and w.Date = NEW.Date
where NEW references the post-update version of the table1 row.
From 25.3.1 Trigger Syntax and Examples -
Within the trigger body, the OLD and NEW keywords enable you to access
columns in the rows affected by a trigger. OLD and NEW are MySQL
extensions to triggers; they are not case-sensitive.
In an INSERT trigger, only NEW.col_name can be used; there is no old
row. In a DELETE trigger, only OLD.col_name can be used; there is no
new row. In an UPDATE trigger, you can use OLD.col_name to refer to
the columns of a row before it is updated and NEW.col_name to refer to
the columns of the row after it is updated.
I have a table t with columns id (primary key), a, b, c, d. Assume that the columns id, a, b and c are already populated. I want to set column d = md5(concat(b,c)). The issue is that this table contains millions of records, while there are only a few thousand unique combinations of b and c. I want to save the time required for computing the md5 of the same values repeatedly. Is there a way to update multiple rows of this table with the same value without computing the md5 again, something like this:
update t set d=md5(concat(b,c)) group by b,c;
This does not work, because GROUP BY is not allowed in an UPDATE statement.
One method is a join:
update t join
(select b, c, md5(concat(b, c)) as val
from t
group by b, c
) tt
on t.b = tt.b and t.c = tt.c
set t.d = tt.val;
However, it is quite possible that any working with the data would take longer than the md5() function, so doing the update directly could be feasible.
EDIT:
Actually, updating the entire table is likely to take time, just for the updates and logging. I would suggest that you create another table entirely for the b/c/d values and join in the values when you need them.
Create a temp table:
CREATE TEMPORARY TABLE IF NOT EXISTS tmpTable
AS (SELECT b, c, md5(concat(b, c)) as d FROM t group by b, c)
Update initial table:
UPDATE t orig
JOIN tmpTable tmp ON orig.b = tmp.b AND orig.c = tmp.c
SET orig.d = tmp.d
Drop the temp table:
DROP TABLE tmpTable
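The three steps above can be tried end to end with the sketch below. It uses Python's sqlite3 module rather than MySQL (an assumption), so md5() is registered as a user function, concat() becomes `||`, and the join-update is written as a correlated subquery; the table name tmp_hashes and the sample rows are made up.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no built-in md5(), so register one (MySQL provides MD5() natively).
conn.create_function("md5", 1, lambda s: hashlib.md5(s.encode()).hexdigest())

conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, b TEXT, c TEXT, d TEXT);
    INSERT INTO t (b, c) VALUES ('x', '1'), ('x', '1'), ('y', '2'), ('x', '1');

    -- One row per distinct (b, c) pair, with its hash.
    CREATE TEMPORARY TABLE tmp_hashes AS
        SELECT b, c, md5(b || c) AS d
        FROM (SELECT DISTINCT b, c FROM t);

    -- Copy the precomputed hash back onto every matching row.
    UPDATE t
    SET d = (SELECT h.d FROM tmp_hashes h WHERE h.b = t.b AND h.c = t.c);

    DROP TABLE tmp_hashes;
""")
rows = conn.execute("SELECT b, c, d FROM t ORDER BY id").fetchall()
print(rows)
```

On a table with millions of rows, an index on tmp_hashes (b, c) would probably be needed for the copy-back step to stay fast.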
I must build a MySQL query with a large number of conditions (about 150,000).
For the moment the query is:
UPDATE table SET activated=NULL
WHERE (
id=XXXX
OR id=YYYY
OR id=ZZZZ
OR id=...
...
)
AND activated IS NOT NULL
Do you know a better way to do that, please?
If you're talking about thousands of items, an IN clause probably isn't going to work. In that case you would want to insert the items into a temporary table, then join with it for the update, like so:
UPDATE table tb
JOIN temptable ids ON ids.id = tb.id
SET tb.activated = NULL
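A small sketch of that temporary-table approach, again with Python's sqlite3 module standing in for MySQL (an assumption). The table name items, the id list, and the dates are hypothetical, and the join is expressed as IN (SELECT ...) because SQLite's UPDATE has no JOIN clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, activated TEXT);
    INSERT INTO items (id, activated) VALUES
        (1, '2020-01-01'), (2, '2020-01-02'), (3, NULL), (4, '2020-01-04');
    CREATE TEMPORARY TABLE temp_ids (id INTEGER PRIMARY KEY);
""")

ids_to_clear = [1, 3, 4]  # hypothetical; in the question this is ~150,000 ids

# Bulk-load the ids instead of building a giant OR / IN list.
conn.executemany("INSERT INTO temp_ids (id) VALUES (?)",
                 [(i,) for i in ids_to_clear])

cur = conn.execute("""
    UPDATE items
    SET activated = NULL
    WHERE activated IS NOT NULL
      AND id IN (SELECT id FROM temp_ids)
""")
changed = cur.rowcount  # rows actually modified
remaining = [i for (i,) in conn.execute(
    "SELECT id FROM items WHERE activated IS NOT NULL")]
print(changed, remaining)
```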
UPDATE table
SET activated = NULL
WHERE id in ('XXXX', 'YYYY', 'zzzz')
AND activated IS NOT NULL
In DB2, I need to do a SELECT FROM UPDATE, to put an update + select in a single transaction.
But I need to make sure to update only one record per transaction.
I'm familiar with the LIMIT clause of MySQL's UPDATE, which
"places a limit on the number of rows that can be updated".
I looked for something similar in DB2's UPDATE reference but without success.
How can something similar be achieved in DB2?
Edit: In my scenario, I have to deliver 1000 coupon codes upon request. I just need to select (any) one that has not been given out yet.
The question uses some ambiguous terminology that makes it unclear what needs to be accomplished. Fortunately, DB2 offers robust support for a variety of SQL patterns.
To limit the number of rows that are modified by an UPDATE:
UPDATE
( SELECT t.column1 FROM someschema.sometable t WHERE ... FETCH FIRST ROW ONLY
)
SET column1 = 'newvalue';
The UPDATE statement never sees the base table, just the expression that filters it, so you can control which rows are updated.
To INSERT a limited number of new rows:
INSERT INTO mktg.offered_coupons( cust_id, coupon_id, offered_on, expires_on )
SELECT c.cust_id, 1234, CURRENT TIMESTAMP, CURRENT TIMESTAMP + 30 DAYS
FROM mktg.customers c
LEFT OUTER JOIN mktg.offered_coupons o
ON o.cust_id = c.cust_id
WHERE ....
AND o.cust_id IS NULL
FETCH FIRST 1000 ROWS ONLY;
This is how DB2 supports SELECT from an UPDATE, INSERT, or DELETE statement:
SELECT column1 FROM NEW TABLE (
UPDATE ( SELECT column1 FROM someschema.sometable
WHERE ... FETCH FIRST ROW ONLY
)
SET column1 = 'newvalue'
) AS x;
The SELECT will return data from only the modified rows.
You have two options. As noted by A Horse With No Name, you can use the primary key of the table to ensure that one row is updated at a time.
The alternative, if you're using a programming language and have control over cursors, is to use a cursor with the 'FOR UPDATE' option (though that is probably optional; IIRC, cursors are 'FOR UPDATE' by default when the underlying SELECT allows it), and then use an UPDATE statement with the WHERE CURRENT OF <cursor-name> clause. This will update the one row currently addressed by the cursor. The details of the syntax vary with the language you're using, but the raw SQL looks like:
DECLARE cursor_name CURSOR FOR
SELECT *
FROM SomeTable
WHERE PKCol1 = ? AND PKCol2 = ?
FOR UPDATE;
UPDATE SomeTable
SET ...
WHERE CURRENT OF cursor_name;
If you can't write DECLARE in your host language, you have to do manual bashing to find the equivalent mechanism.
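The first option (pinning the UPDATE to one row through the primary key) might look roughly like the sketch below. It is written against Python's sqlite3 module, not DB2 (an assumption; DB2 would use FETCH FIRST 1 ROW ONLY instead of LIMIT 1), and the coupons table with its given column is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coupons (id INTEGER PRIMARY KEY, code TEXT, given INTEGER DEFAULT 0);
    INSERT INTO coupons (code) VALUES ('AAA'), ('BBB'), ('CCC');
""")

def take_one_coupon(connection):
    """Hand out one unused coupon, or None when none are left."""
    with connection:  # commits on success, rolls back on exception
        row = connection.execute(
            "SELECT id, code FROM coupons WHERE given = 0 LIMIT 1").fetchone()
        if row is None:
            return None
        coupon_id, code = row
        # The primary key pins the UPDATE to exactly one row; the extra
        # "given = 0" guard makes it a no-op if someone else took it first.
        cur = connection.execute(
            "UPDATE coupons SET given = 1 WHERE id = ? AND given = 0",
            (coupon_id,))
        return code if cur.rowcount == 1 else None

handed_out = [take_one_coupon(conn) for _ in range(4)]
print(handed_out)
```

With three coupons in the table, the fourth call returns None; which coupon comes out first is not specified, since the SELECT has no ORDER BY.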
I want to do all these updates in one statement.
update table set ts=ts_1 where id=1
update table set ts=ts_2 where id=2
...
update table set ts=ts_n where id=n
Is it possible?
Use this:
UPDATE `table` SET `ts`=CONCAT('ts_', `id`);
Yes, you can, but that would require a table (if only virtual/temporary) where you'd store the id + ts value pairs, and then run an UPDATE with the FROM syntax.
Assume tmpList is a table with an id and a ts_value column, filled with the (id, ts value) pairs you wish to apply:
UPDATE table, tmpList
SET table.ts = tmpList.ts_value
WHERE table.id = tmpList.id
-- AND table.id IN (1, 2, 3, .. n)
-- above "AND" is only needed if somehow you wish to limit it, i.e.
-- if tmpList has more ids than you wish to update
A possibly table-less (but similar) approach would involve a CASE statement, as in:
UPDATE table
SET ts = CASE id
WHEN 1 THEN 'ts_1'
WHEN 2 THEN 'ts_2'
-- ..
WHEN n THEN 'ts_n'
END
WHERE id in (1, 2, ... n) -- here this is necessary I believe
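For a feel of how such a CASE statement could be generated from code, here is a hedged sketch using Python's sqlite3 module (an assumption; the table name items and the id/ts pairs are hypothetical). It also shows why the WHERE clause matters: without it, rows not listed in the CASE would have ts set to NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, ts TEXT);
    INSERT INTO items (id, ts) VALUES (1, 'old'), (2, 'old'), (3, 'old');
""")

new_values = {1: "ts_1", 2: "ts_2"}  # hypothetical id -> ts pairs

# One "WHEN ? THEN ?" per pair; values are bound, never pasted into the SQL.
when_clauses = " ".join("WHEN ? THEN ?" for _ in new_values)
in_list = ", ".join("?" for _ in new_values)
sql = (f"UPDATE items SET ts = CASE id {when_clauses} END "
       f"WHERE id IN ({in_list})")

# Parameters in statement order: id, ts, id, ts, ... then the ids again.
params = [value for pair in new_values.items() for value in pair]
params += list(new_values)

conn.execute(sql, params)
result = conn.execute("SELECT id, ts FROM items ORDER BY id").fetchall()
print(result)
```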
Well, without knowing what data, I'm not sure whether the answer is yes or no.
It certainly is possible to update multiple rows at once:
update table1 set field1='value' where field2='bar'
This will update every row in the table whose field2 value is 'bar'.
update table1 set field1='value' where field2 in (1, 2, 3, 4)
This will update every row in the table whose field2 value is 1, 2, 3 or 4.
update table1 set field1='value' where field2 > 5
This will update every row in the table whose field2 value is greater than 5.
update table1 set field1=concat('value', id)
This will update every row in the table, setting the field1 value to 'value' plus the value of that row's id field.
You could do it with a case statement, but it wouldn't be pretty:
UPDATE table
SET ts = CASE id WHEN 1 THEN ts_1 WHEN 2 THEN ts_2 ... WHEN n THEN ts_n END
I think that you should expand the context of the problem. Why do you want/need all the updates to be done in one statement? What benefit does that give you? Perhaps there's another way to get that benefit.
Presumably you are interacting with sql via some code, so certainly you can simply make sure that the three updates all happen atomically by creating a function that performs all three of the updates.
e.g. pseudocode:
function update_all_three(val){
// all the updates in one function
}
The difference between a single function update and some kind of update that performs multiple updates at once is probably not a very useful distinction.
Generate the statements:
select concat('update table set ts = ts_', id, ' where id = ', id, '; ')
from table
or generate the case conditions, then connect it to your update statement:
select concat('when ', id, ' then ts_', id) from table
You can use INSERT ... ON DUPLICATE KEY UPDATE. See this question: Multiple Updates in MySQL
ts_1, ts_2, ts_3, etc. are different fields on the same table? There's no way to do that with a single statement.