SQL MERGE statement to update data - sql-server-2008

I've got a table named energydata.
It has just three columns:
(webmeterID, DateTime, kWh)
I have a new set of updated data in a table temp_energydata.
The DateTime and the webmeterID stay the same. But the kWh values need updating from temp_energydata table.
How do I write the T-SQL for this the correct way?

Assuming you want an actual SQL Server MERGE statement:
MERGE INTO dbo.energydata WITH (HOLDLOCK) AS target
USING dbo.temp_energydata AS source
ON target.webmeterID = source.webmeterID
AND target.DateTime = source.DateTime
WHEN MATCHED THEN
UPDATE SET target.kWh = source.kWh
WHEN NOT MATCHED BY TARGET THEN
INSERT (webmeterID, DateTime, kWh)
VALUES (source.webmeterID, source.DateTime, source.kWh);
If you also want to delete records in the target that aren't in the source:
MERGE INTO dbo.energydata WITH (HOLDLOCK) AS target
USING dbo.temp_energydata AS source
ON target.webmeterID = source.webmeterID
AND target.DateTime = source.DateTime
WHEN MATCHED THEN
UPDATE SET target.kWh = source.kWh
WHEN NOT MATCHED BY TARGET THEN
INSERT (webmeterID, DateTime, kWh)
VALUES (source.webmeterID, source.DateTime, source.kWh)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
Because this has become a bit more popular, I feel like I should expand this answer a bit with some caveats to be aware of.
First, there are several blogs which report concurrency issues with the MERGE statement in older versions of SQL Server. I do not know if this issue has ever been addressed in later editions. Either way, this can largely be worked around by specifying the HOLDLOCK or SERIALIZABLE lock hint:
MERGE INTO dbo.energydata WITH (HOLDLOCK) AS target
[...]
You can also accomplish the same thing with more restrictive transaction isolation levels.
There are several other known issues with MERGE. (Note that since Microsoft nuked Connect and didn't link issues in the old system to issues in the new system, these older issues are hard to track down. Thanks, Microsoft!) From what I can tell, most of them are not common problems or can be worked around with the same locking hints as above, but I haven't tested them.
As it is, even though I've never had any problems with the MERGE statement myself, I always use the WITH (HOLDLOCK) hint now, and I prefer to use the statement only in the most straightforward of cases.

I often use Bacon Bits' great answer, as I just cannot memorize the syntax.
But I usually add a CTE to make the DELETE part more useful, because very often you will want to apply the merge to only a part of the target table.
WITH target as (
SELECT * FROM dbo.energydata WHERE DateTime > GETDATE()
)
MERGE INTO target WITH (HOLDLOCK)
USING dbo.temp_energydata AS source
ON target.webmeterID = source.webmeterID
AND target.DateTime = source.DateTime
WHEN MATCHED THEN
UPDATE SET target.kWh = source.kWh
WHEN NOT MATCHED BY TARGET THEN
INSERT (webmeterID, DateTime, kWh)
VALUES (source.webmeterID, source.DateTime, source.kWh)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;

If you just need to update your records in energydata based on data in temp_energydata, assuming that temp_energydata doesn't contain any new records, then try this:
UPDATE e SET e.kWh = t.kWh
FROM energydata e INNER JOIN
temp_energydata t ON e.webmeterID = t.webmeterID AND
e.DateTime = t.DateTime
Here is a working SQL Fiddle.
But if temp_energydata contains new records and you need to insert them into energydata, preferably with one statement, then you should definitely go with the answer that Bacon Bits gave.
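For readers who want to try the update-only case without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not T-SQL: it uses a correlated subquery instead of UPDATE ... FROM so that it runs on older SQLite versions too, and the sample rows and the apply_updates helper are made up for illustration.

```python
import sqlite3


def apply_updates(conn):
    """Copy kWh from temp_energydata into the matching energydata rows.

    Rows are matched on (webmeterID, DateTime). The EXISTS guard leaves
    rows without a match untouched instead of setting kWh to NULL.
    """
    cur = conn.execute(
        """
        UPDATE energydata
        SET kWh = (SELECT t.kWh FROM temp_energydata AS t
                   WHERE t.webmeterID = energydata.webmeterID
                     AND t.DateTime = energydata.DateTime)
        WHERE EXISTS (SELECT 1 FROM temp_energydata AS t
                      WHERE t.webmeterID = energydata.webmeterID
                        AND t.DateTime = energydata.DateTime)
        """
    )
    return cur.rowcount


# Hypothetical sample data, only to make the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE energydata (webmeterID INTEGER, DateTime TEXT, kWh REAL);
CREATE TABLE temp_energydata (webmeterID INTEGER, DateTime TEXT, kWh REAL);
INSERT INTO energydata VALUES (1, '2013-01-01 00:00', 1.0);
INSERT INTO energydata VALUES (1, '2013-01-01 01:00', 2.0);
INSERT INTO energydata VALUES (2, '2013-01-01 00:00', 3.0);
INSERT INTO temp_energydata VALUES (1, '2013-01-01 00:00', 10.0);
INSERT INTO temp_energydata VALUES (2, '2013-01-01 00:00', 30.0);
""")

updated = apply_updates(conn)
rows = conn.execute(
    "SELECT webmeterID, DateTime, kWh FROM energydata "
    "ORDER BY webmeterID, DateTime"
).fetchall()
```

Only the two rows that have a counterpart in temp_energydata change; the third keeps its old kWh value.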

UPDATE ed
SET ed.kWh = ted.kWh
FROM energydata ed
INNER JOIN temp_energydata ted ON ted.webmeterID = ed.webmeterID
AND ted.DateTime = ed.DateTime

UPDATE energydata SET kWh = temp.kWh
FROM temp_energydata AS temp
WHERE energydata.webmeterID = temp.webmeterID AND energydata.DateTime = temp.DateTime

The correct way is (MySQL syntax; T-SQL would need the UPDATE ... SET ... FROM ... JOIN form instead):
UPDATE test1
INNER JOIN test2 ON (test1.id = test2.id)
SET test1.data = test2.data

Related

SQL - is it possible to call variables even if it's different table?

So I have a table
- Members - stores the parents
- Child - stores the children
I'm new to SQL and my code is not working as expected, but you might be able to understand what I'm trying to accomplish here.
set @variable1 = (select idMembers from members where firstname like '%James%')
set @variable2 = (select FirstName, lastname, relationship from child where idMembers = @variable)
print @variable2
I recommend using a JOIN to extract values from multiple related tables. In your case, however, you are assigning multiple column values to @variable2, which may be the problem.
BTW, you may want to read this topic to see the difference between SET and SELECT.
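As a rough illustration of the JOIN suggestion, here is a small sketch using Python's sqlite3 module rather than SQL Server. The table and column names come from the question; the sample rows and the children_of helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows, just enough to exercise the join.
conn.executescript("""
CREATE TABLE members (idMembers INTEGER PRIMARY KEY, firstname TEXT);
CREATE TABLE child (idMembers INTEGER, FirstName TEXT, lastname TEXT,
                    relationship TEXT);
INSERT INTO members VALUES (1, 'James');
INSERT INTO members VALUES (2, 'Maria');
INSERT INTO child VALUES (1, 'Tom', 'Smith', 'son');
INSERT INTO child VALUES (2, 'Ana', 'Lopez', 'daughter');
""")


def children_of(conn, name_fragment):
    """Return the children of every member whose first name contains the fragment."""
    return conn.execute(
        """
        SELECT c.FirstName, c.lastname, c.relationship
        FROM members AS m
        JOIN child AS c ON c.idMembers = m.idMembers
        WHERE m.firstname LIKE ?
        ORDER BY c.FirstName
        """,
        (f"%{name_fragment}%",),
    ).fetchall()


james_children = children_of(conn, "James")
```

One query returns all three columns for every matching child, so no intermediate variable has to hold more than one value.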

Mysql duplicate row deletion with Perl DBI across two tables

This one is a pretty good one IMO, and I have not seen a close example on SO or Google, so here you go. I need to do the following within a Perl application I am building. Unfortunately it cannot be done directly in MySQL and will require DBI. In a nutshell, I need to take Database1.tableA and locate every record with the column 'status' matching 'started'. This I can do, as it is fairly easy (I'm not very good with DBI yet, but I have read the docs); where I am having issues is what I have to do next.
my $started_query = "SELECT primary_ip FROM queue WHERE status='started'";
my $started = $dbh->prepare($started_query);
$started->execute();
while ( my @started = $started->fetchrow_array() ) {
# Where I am hoping to have the following occur so it can go by row
# for only rows with the status 'started'
}
So for each record in the @started array (which really only contains one value per iteration of the while loop), I need to see if it exists in Database2.tableA. If it does exist in the other database (Database2.tableA), I need to delete it from Database1.tableA; if it does NOT exist there, I need to update the record in the current database (Database1.tableA).
Basically replicating the below semi-valid MySQL syntax.
DELETE FROM tableA WHERE primary_ip IN (SELECT primary_ip FROM db2.tablea) OR UPDATE tableA SET status = 'error'
I am limited to DBI to connect to the two databases, and the logic is escaping me currently. I could run the queries against both databases, store the results in @arrays, and then do a comparison, but that seems redundant: I think it should be possible within the while ( my @started = $started->fetchrow_array() ) loop, which would save on runtime and resources. I am also not familiar enough with passing variables between DBI instances, and since the @started array will always contain the column value I need to query for and delete, I would like to take full advantage of having that defined and passed to the DBI objects.
I am going to be working on this thing all night and already ran through a couple pots of coffee so your helping me understand this logic is greatly appreciated.
You'll be better off with fetchrow_hashref, which returns a hashref of key/value pairs, where the keys are the column names, rather than coding based on columns showing up at ordinal positions in the array.
You need an additional database handle to do the lookups and updates because you've got a live statement handle on the first one. Something like this:
my $dbh2 = DBI->connect(...same credentials...);
...
while(my $row = $started->fetchrow_hashref)
{
if(my $found = $dbh2->selectrow_hashref("SELECT * FROM db2.t2 WHERE primary_ip = ?",undef,$row->{primary_ip}))
{
$dbh2->do("DELETE FROM db1.t1 WHERE primary_ip = ?",undef,$found->{primary_ip});
}
else
{
$dbh2->do("UPDATE db1.t1 SET status = 'error' WHERE primary_ip = ?",undef,$row->{primary_ip});
}
}
Technically, I don't "need" to fetch the row from db2.t2 into my $found, since you're apparently only testing for existence, and there are other ways to do that. Using it here is a bit of insurance against doing something other than you intended: it will be undef if we somehow get some bad logic going, and that should keep us from making wrong changes.
But approaching a relational database with loop iterations is rarely the best tactic.
This "could" be done directly in MySQL with just a couple of queries.
First, the updates, where t1.status = 'started' and t2.primary_ip has no matching value for t1.primary_ip:
UPDATE db1.t1 a LEFT JOIN db2.t2 b ON b.primary_ip = a.primary_ip
SET a.status = 'error'
WHERE b.primary_ip IS NULL AND a.status = 'started';
If you are thinking "but b.primary_ip is never null" ... well, it is null in a left join where there are no matching rows.
Then deleting the rows from t1 can also be accomplished with a join. A multi-table DELETE removes rows only from the tables whose aliases are listed between DELETE and FROM. Again, we're calling t1 by the alias "a" and t2 by the alias "b".
DELETE a
FROM db1.t1 a JOIN db2.t2 b ON a.primary_ip = b.primary_ip
WHERE a.status = 'started';
This removes every row from t1 ("a") where status = 'started' AND where a matching row exists in t2.
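The two set-based statements can be tried out with a small sketch in Python's sqlite3 module. Several things here are assumptions made for the demo: both tables live in one in-memory database (instead of db1 and db2), the sample rows are invented, and because SQLite does not accept MySQL's UPDATE ... JOIN or DELETE ... JOIN syntax, the same logic is written with NOT EXISTS and EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: t1 is the queue, t2 is the table in the other database.
conn.executescript("""
CREATE TABLE t1 (primary_ip TEXT, status TEXT);
CREATE TABLE t2 (primary_ip TEXT);
INSERT INTO t1 VALUES ('10.0.0.1', 'started');
INSERT INTO t1 VALUES ('10.0.0.2', 'started');
INSERT INTO t1 VALUES ('10.0.0.3', 'done');
INSERT INTO t2 VALUES ('10.0.0.1');
INSERT INTO t2 VALUES ('10.0.0.3');
""")


def reconcile(conn):
    """Mark unmatched 'started' rows as 'error', then delete matched ones."""
    with conn:  # commits both statements together, rolls back on an exception
        marked = conn.execute(
            """
            UPDATE t1 SET status = 'error'
            WHERE status = 'started'
              AND NOT EXISTS (SELECT 1 FROM t2
                              WHERE t2.primary_ip = t1.primary_ip)
            """
        ).rowcount
        deleted = conn.execute(
            """
            DELETE FROM t1
            WHERE status = 'started'
              AND EXISTS (SELECT 1 FROM t2
                          WHERE t2.primary_ip = t1.primary_ip)
            """
        ).rowcount
    return marked, deleted


marked, deleted = reconcile(conn)
remaining = conn.execute(
    "SELECT primary_ip, status FROM t1 ORDER BY primary_ip"
).fetchall()
```

Running the update before the delete matters less than it might seem, since the two WHERE clauses select disjoint sets of rows; the row with status 'done' is left alone by both.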

Update multiple mysql rows with 1 query?

I am porting a client's DB to a new one with different post titles and row IDs, but he wants to keep the hits from the old website.
He has over 500 articles in the new DB, and updating one is not an issue with this query:
UPDATE blog_posts
SET hits=8523 WHERE title LIKE '%slim charger%' AND category = 2
But how would I go about doing this for all 500 articles with one query? I already have an export query from the old DB with post title and hits, so we could find the new ones more easily:
INSERT INTO `news_items` (`title`, `hits`) VALUES
('Slim charger- your new friend', 8523 )...
The only reference in both tables is the product name within the title; everything else is different (id, full title, ...).
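Make a temporary table old_posts for the old data, then update with a join. An INNER JOIN is used here so that posts without a matching old title keep their current hits instead of being set to NULL:
UPDATE new_posts INNER JOIN old_posts ON new_posts.title = old_posts.title SET new_posts.hits = old_posts.hits;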
Unfortunately that's not how it works, you will have to write a script/program that does a loop.
articles cursor;
selection articlesTable%rowtype;
WHILE(FETCH(cursor into selection)%hasNext)
Insert into newTable selection;
END WHILE
How you bridge it is up to you, but that's the basic pseudo code/PLSQL.
The APIs for selecting from one DB and putting into another vary by DBMS, so you will need a common intermediate format. Basically, take the record from the first DB, stick it into a struct in the programming language of your choice, and perform an insert with those struct values using the APIs for the other DBMS.
I'm not 100% sure that you can update multiple records at once, but I think what you want to do is use a loop in combination with the update query.
However, if you have 2 tables with absolutely no relationship or common identifiers between them, you are kind of in a hard place. The hard place in this instance would mean you have to do them all manually :(
The last possible idea to save you is that the id's might be different, but they might still have the same order. If that is the case you can still loop through the old table and update the number table as I described above.
You can build a procedure that'll do it for you:
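CREATE PROCEDURE insert_news_items()
BEGIN
-- Loop flag and fetch targets; the column types are assumed, adjust to match blog_posts.
DECLARE done INT DEFAULT FALSE;
DECLARE v_title VARCHAR(255);
DECLARE v_hits INT;
DECLARE news_items_cur CURSOR FOR
SELECT title, hits
FROM blog_posts
WHERE title LIKE '%slim charger%' AND category = 2;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
OPEN news_items_cur;
read_loop: LOOP
FETCH news_items_cur
INTO v_title, v_hits;
-- Check the flag right after FETCH so the last row is not inserted twice.
IF done THEN
LEAVE read_loop;
END IF;
INSERT INTO `news_items` (`title`, `hits`) VALUES (v_title, v_hits);
END LOOP;
CLOSE news_items_cur;
END;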

How to test an SQL Update statement before running it?

In some cases, running an UPDATE statement in production can save the day. However a borked update can be worse than the initial problem.
Short of using a test database, what are options to tell what an update statement will do before running it?
What about Transactions? They have the ROLLBACK-Feature.
See https://dev.mysql.com/doc/refman/5.0/en/commit.html
For example:
START TRANSACTION;
SELECT * FROM nicetable WHERE somthing=1;
UPDATE nicetable SET nicefield='VALUE' WHERE somthing=1;
SELECT * FROM nicetable WHERE somthing=1; #check
COMMIT;
# or if you want to reset changes
ROLLBACK;
SELECT * FROM nicetable WHERE somthing=1; #should be the old value
Answer to the question from @rickozoe below:
In general these lines will not be executed at once. In PHP, for example, you would write something like this (perhaps a little bit cleaner, but I wanted to answer quickly ;-) ):
$MysqlConnection->query('START TRANSACTION;');
$erg = $MysqlConnection->query("UPDATE MyGuests SET lastname='Doe' WHERE id=2;");
if($erg)
$MysqlConnection->query('COMMIT;');
else
$MysqlConnection->query('ROLLBACK;');
Another way would be to use MySQL variables (see https://dev.mysql.com/doc/refman/5.7/en/user-variables.html and https://stackoverflow.com/a/18499823/1416909):
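# do some stuff that should be conditionally rolled back later on
UPDATE MyGuests SET lastname='Doe' WHERE id=2;
SET @v1 := ROW_COUNT(); # rows changed by the UPDATE above
# note: IF ... THEN is only valid inside a stored program
IF (@v1 < 1) THEN
ROLLBACK;
ELSE
COMMIT;
END IF;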
But I would suggest using the language wrappers available in your favorite programming language.
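As one example of such a wrapper, here is a minimal sketch in Python with the built-in sqlite3 module (SQLite, not MySQL). It runs the UPDATE inside a transaction and commits only if the number of affected rows matches what was expected. The update_if_expected helper, the expected_rows parameter, and the sample rows are assumptions for the demo; the MyGuests table name is taken from the snippet above.

```python
import sqlite3


def update_if_expected(conn, lastname, guest_id, expected_rows=1):
    """Run the UPDATE and keep it only if it touched the expected number of rows."""
    cur = conn.execute(
        "UPDATE MyGuests SET lastname = ? WHERE id = ?", (lastname, guest_id)
    )
    if cur.rowcount == expected_rows:
        conn.commit()
        return True
    conn.rollback()  # undo the change, the table is back to its old state
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyGuests (id INTEGER, lastname TEXT)")
# Hypothetical rows; id 2 is deliberately duplicated.
conn.executemany(
    "INSERT INTO MyGuests VALUES (?, ?)",
    [(1, "Smith"), (2, "Jones"), (2, "Brown")],
)
conn.commit()

ok_single = update_if_expected(conn, "Doe", 1)
ok_duplicate = update_if_expected(conn, "Doe", 2)
lastnames = [
    row[0] for row in conn.execute("SELECT lastname FROM MyGuests ORDER BY rowid")
]
```

The second call hits two rows instead of one, so it is rolled back and both duplicated rows keep their original last names.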
In addition to using a transaction as Imad has said (which should be mandatory anyway), you can also sanity-check which rows are affected by running a SELECT with the same WHERE clause as the UPDATE.
So if your UPDATE is
UPDATE foo
SET bar = 42
WHERE col1 = 1
AND col2 = 'foobar';
The following will show you which rows will be updated:
SELECT *
FROM foo
WHERE col1 = 1
AND col2 = 'foobar';
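One way to make sure the preview and the real statement cannot drift apart is to build both from a single WHERE clause. The sketch below does that with Python's sqlite3 module; the foo table and its columns follow the example above, while the sample rows and the WHERE_CLAUSE and PARAMS names are made up.

```python
import sqlite3

# One shared definition of the filter, used by both statements.
WHERE_CLAUSE = "col1 = ? AND col2 = ?"
PARAMS = (1, "foobar")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (col1 INTEGER, col2 TEXT, bar INTEGER)")
# Hypothetical rows; only the first one matches the filter.
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [(1, "foobar", 0), (1, "other", 0), (2, "foobar", 0)],
)
conn.commit()

# Step 1: look at the rows the UPDATE would touch.
preview = conn.execute(
    f"SELECT * FROM foo WHERE {WHERE_CLAUSE}", PARAMS
).fetchall()

# Step 2: run the UPDATE with exactly the same filter.
changed = conn.execute(
    f"UPDATE foo SET bar = 42 WHERE {WHERE_CLAUSE}", PARAMS
).rowcount
conn.commit()
```

If the number of previewed rows and the reported row count disagree, something changed between the two steps and the update deserves a second look.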
Set Autocommit to OFF.
In MySQL, set autocommit=0; sets the autocommit off for the current session.
You execute your statement, see what it has changed, and then rollback if it's wrong or commit if it's what you expected!
The benefit of using transactions instead of running select query is that you can check the resulting set easily.
For testing update, hash # is your friend.
If you have an update statement like:
UPDATE
wp_history
SET history_by="admin"
WHERE
history_ip LIKE '123%'
You hash UPDATE and SET out for testing, then hash them back in:
SELECT * FROM
#UPDATE
wp_history
#SET history_by="admin"
WHERE
history_ip LIKE '123%'
It works for simple statements.
An additional, practically mandatory measure is to make a copy (backup duplicate) whenever you run an update on a production table. phpMyAdmin > Operations > Copy: table_yearmonthday. It just takes a few seconds for tables <=100M.
I've seen many borked prod data situations that could have been avoided by typing the WHERE clause first! Sometimes a WHERE 1 = 0 can help with putting a working statement together safely too. And looking at an estimated execution plan, which will estimate rows affected, can be useful. Beyond that, run it in a transaction that you roll back, as others have said.
You can also use WHERE FALSE for MySQL, but keep in mind other DBMSes like SQL Server won't accept that.
One more option is to ask MySQL for the query plan. This tells you two things:
Whether there are any syntax errors in the query, if so the query plan command itself will fail
How MySQL is planning to execute the query, e.g. what indexes it will use
In MySQL the query plan command is EXPLAIN, and DESCRIBE is a synonym for it, so you would do:
describe update ...;
Make a SELECT of it.
For example, if you have
UPDATE users SET id=0 WHERE name='jan'
convert it to
SELECT * FROM users WHERE name='jan'
In these cases that you want to test, it's a good idea to focus on only current column values and soon-to-be-updated column values.
Please take a look at the following code that I've written to update WHMCS prices:
# UPDATE tblinvoiceitems AS ii
SELECT ### JUST
ii.amount AS old_value, ### FOR
h.amount AS new_value ### TESTING
FROM tblinvoiceitems AS ii ### PURPOSES.
JOIN tblhosting AS h ON ii.relid = h.id
JOIN tblinvoices AS i ON ii.invoiceid = i.id
WHERE ii.amount <> h.amount ### Show only updatable rows
# SET ii.amount = h.amount
This way we clearly compare already existing values versus new values.
Just run an EXPLAIN query. So just write the word EXPLAIN before your query and it will give you info about how it would execute your update - finding rows, etc. But it won't execute it. However it will let you know if there are any syntax errors. So just use an explain!
EXPLAIN UPDATE ... SET ...
Run a SELECT query on the same table with all the WHERE conditions you are applying in the UPDATE query.

Does replace into have a where clause?

I'm writing an application and I'm using MySQL as DBMS, we are downloading property offers and there were some performance issues. The old architecture looked like this:
A property is updated. If the number of affected rows is not 1, then the update is not considered successful; otherwise the update query solves our problem.
If the update was not successful and the number of affected rows is more than 1, we have duplicates and we delete all of them. After deleting the duplicates (if needed), an insert happens if the update was not successful. This architecture was working well, but there were some speed issues, because properties are deleted if they have not been updated for 15 days.
Theoretically the main problem is deleting properties, because some properties are alive for months and the indexes are very far from each other (we are talking about 500,000+ properties).
Our host told me to use replace into instead of deleting properties and all deprecated properties should be considered as DEAD. I've done this, but problems started to occur because of syntax error and I couldn't find anywhere an example of replace into with a where clause (I'd like to replace a DEAD property with the new property instead of deleting the old property and insert a new to assure optimization). My query looked like this:
replace into table_name(column1, ..., columnn) values(value1, ..., valuen) where ID = idValue
Of course, I've calculated idValue and handled everything but I had a syntax error. I would like to know if I'm wrong and there is a where clause for replace into.
I've found an alternative solution, which is even better than replace into (using simply an update query) because deletes are happening behind the curtains if I use replace into, but I would like to know if I'm wrong when I say that replace into doesn't have a where clause. For more reference, see this link:
http://dev.mysql.com/doc/refman/5.0/en/replace.html
Thank you for your answers in advance,
Lajos Árpád
I can see that you have solved your problem, but to answer your original question:
REPLACE INTO does not have a WHERE clause.
The REPLACE INTO syntax works exactly like INSERT INTO, except that any old rows with the same primary or unique key are automatically deleted before the new row is inserted.
This means that instead of a WHERE clause, you should add the primary key to the values being replaced to limit your update.
REPLACE INTO myTable (
myPrimaryKey,
myColumn1,
myColumn2
) VALUES (
100,
'value1',
'value2'
);
...will provide the same result as...
UPDATE myTable
SET myColumn1 = 'value1', myColumn2 = 'value2'
WHERE myPrimaryKey = 100;
...or more exactly:
DELETE FROM myTable WHERE myPrimaryKey = 100;
INSERT INTO myTable(
myPrimaryKey,
myColumn1,
myColumn2
) VALUES (
100,
'value1',
'value2'
);
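The difference between the UPDATE form and the DELETE plus INSERT form shows up as soon as the table has a column that the statement does not mention. The sketch below demonstrates this with Python's sqlite3 module; SQLite's REPLACE INTO is used as a stand-in for MySQL's, and the notes column and the sample rows are hypothetical additions to the myTable example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE myTable (
    myPrimaryKey INTEGER PRIMARY KEY,
    myColumn1 TEXT,
    myColumn2 TEXT,
    notes TEXT DEFAULT 'none'  -- hypothetical extra column
)""")
conn.execute("INSERT INTO myTable VALUES (100, 'old1', 'old2', 'keep me')")
conn.execute("INSERT INTO myTable VALUES (101, 'old1', 'old2', 'keep me')")

# Row 100: REPLACE deletes the old row and inserts a fresh one,
# so the unlisted notes column falls back to its default.
conn.execute(
    "REPLACE INTO myTable (myPrimaryKey, myColumn1, myColumn2) "
    "VALUES (100, 'value1', 'value2')"
)

# Row 101: UPDATE changes only the listed columns and keeps notes.
conn.execute(
    "UPDATE myTable SET myColumn1 = 'value1', myColumn2 = 'value2' "
    "WHERE myPrimaryKey = 101"
)

replaced = conn.execute(
    "SELECT * FROM myTable WHERE myPrimaryKey = 100"
).fetchone()
updated = conn.execute(
    "SELECT * FROM myTable WHERE myPrimaryKey = 101"
).fetchone()
```

So the two forms give the same result only when every column of the row is listed in the REPLACE statement.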
In your documentation link, they show three alternate forms of the replace command. Even though elided, the only one that can accept a where clause is the third form with the trailing select.
replace seems like overkill relative to update if I am understanding your task properly.