I'm running a large number of:
INSERT ... ON DUPLICATE KEY UPDATE
queries and I want to find out the number of rows affected, ideally the number updated and the number inserted.
At the moment I'm using ROW_COUNT(), but for the above SQL it returns 2 if the row is updated and 1 if it is inserted.
Is there a way to get these counts from a MySQL function?
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted and 2 if an existing row is updated. From this you can determine how many rows were inserted and how many were updated.
I'm using Java/MyBatis with MySQL in my project. I need to insert multiple rows into a table, ignoring any rows that duplicate a UNIQUE index, and I also want to know which rows were ignored. How can I do this? It seems that INSERT IGNORE INTO cannot tell me which rows were ignored.
I cannot help you out with a solution on how to do that while inserting. But depending on when you need to know the rows that are ignored you could either:
Invert your SELECT so that you get all the duplicates before inserting into the new table.
Deduct the rows in the sink table from the rows in the source table(s) after the insert.
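The first option can be sketched as plain set arithmetic on the unique key. This is only an illustration: `split_candidates` is a hypothetical helper, and `existing` stands in for the result of a SELECT on the UNIQUE column that you would run before the insert. It also assumes nothing else writes to the table between that SELECT and the insert.

```python
def split_candidates(candidate_keys, existing_keys):
    """Partition candidate keys into (to_insert, ignored).

    A key counts as ignored if it is already in the table or if it repeats
    an earlier key in the same batch, which is what a UNIQUE index would
    reject.
    """
    seen = set(existing_keys)
    to_insert, ignored = [], []
    for key in candidate_keys:
        if key in seen:
            ignored.append(key)
        else:
            seen.add(key)
            to_insert.append(key)
    return to_insert, ignored


# Hypothetical data: keys already in the table and keys in the new batch.
existing = {"a", "b"}
batch = ["a", "c", "c", "d"]
to_insert, ignored = split_candidates(batch, existing)
print(to_insert)  # ['c', 'd']
print(ignored)    # ['a', 'c']
```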
I have a very large table with a primary key of BINARY(20).
The table has around 17 million rows. Every hour a cron job tries to insert as many as 50,000 new entries into this table with the ON DUPLICATE KEY UPDATE syntax.
Each insert in the cron job has 1,000 values (multi-row insert). How can I get the number of rows inserted into the table from this query? I cannot do a row count before and after, as there are around 17 million rows and the query is too expensive.
The MySQL manual says that the affected-rows value is 1 for an inserted row and 2 for an updated row, meaning that for my 1,000-row INSERT ... ON DUPLICATE KEY UPDATE query the affected-rows value could range from 1,000 to 2,000, but I have no way of telling how many records were inserted from this number.
How can I overcome this?
Thanks
The number of inserts would be 2000 minus the number of affected rows. More generally:
(numberOfValuesInInsert * 2) - mysql_affected_rows()
EDIT:
As tomas points out, the MySQL docs actually say:
With ON DUPLICATE KEY UPDATE, the affected-rows value per row is 1 if the row is inserted as a new row, 2 if an existing row is updated, and 0 if an existing row is set to its current values.
[emphasis mine]
Consequently, if setting an existing row to the same values is a possibility, it's impossible to tell how many rows were updated vs. inserted, since two inserts would be indistinguishable from one update with different values + one update with the same values.
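To make the ambiguity concrete, here is a small sketch of the per-row arithmetic quoted from the docs. The function name and its parameters are made up for illustration; only the 1/2/0 weights come from the quoted text.

```python
def affected_rows(inserted, changed, unchanged):
    """Total affected-rows value: 1 per insert, 2 per changed row, 0 per no-op."""
    return 1 * inserted + 2 * changed + 0 * unchanged


# Two different outcomes for a two-row statement...
two_inserts = affected_rows(inserted=2, changed=0, unchanged=0)
one_changed_one_unchanged = affected_rows(inserted=0, changed=1, unchanged=1)

# ...give the same total, so the total alone cannot tell them apart.
print(two_inserts, one_changed_one_unchanged)  # 2 2
```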
When your job does an INSERT of 1000 rows, some are pure inserts and some are updates, because you have ON DUPLICATE KEY UPDATE.
Thus you get the first equation:
(1) Inserts + Updates = number of rows in the INSERT (in this case 1000)
Take a simple example where mysql_affected_rows returns 1350.
Since each insert contributes 1 and each update contributes 2 to mysql_affected_rows, you get the second equation:
(2) Inserts + 2 * Updates = mysql_affected_rows (in this case 1350)
Subtracting (1) from (2) gives:
(3) Updates = mysql_affected_rows - number of rows in the INSERT
Updates = 1350 - 1000 (in this example)
Updates = 350
Substituting the value of Updates into equation (1) gives:
Inserts = 650
Thus, to get the number of updates, you only need equation (3) directly.
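The same algebra as a minimal sketch. `split_counts` is a hypothetical helper, and it is only valid under the assumption made above that every duplicate really changes the row, so that each update contributes exactly 2.

```python
def split_counts(attempted, affected):
    """Return (inserts, updates) from the rows sent and the affected-rows value.

    Assumes each row contributes 1 (insert) or 2 (update), never 0.
    """
    updates = affected - attempted   # equation (3)
    inserts = attempted - updates    # substitute back into equation (1)
    if updates < 0 or inserts < 0:
        raise ValueError("counts are inconsistent with the 1-or-2 assumption")
    return inserts, updates


print(split_counts(1000, 1350))  # (650, 350)
```

If some updates set a row to its current values, the affected-rows value can fall below the number of rows sent, which is the case the ValueError guards against.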
I have a MySQL query that performs batch INSERTs and uses ON DUPLICATE KEY UPDATE to update a row in case there's a unique key duplicate.
INSERT INTO table1
(col1,col2,col3)
VALUES
(val1,val2,val3),
(val4,val5,val6),
(val7,val8,val9),
...
(valn,valx,valz)
ON DUPLICATE KEY UPDATE
col3 = VALUES(col3);
In other words, new rows are inserted unless there's a duplicate unique key, in which case col3 is updated.
When the query is finished, I would like to know how many rows were INSERTED as well as how many rows were UPDATED. Is this possible?
No, there's no definitive way to tell from the rows_affected count. There are some corner cases where we can tell: if rows_affected is exactly twice the number of rows we attempted to insert, we know they were all updates. If the rows_affected count is zero, we know that no rows were inserted. If the rows_affected count is one, we know that one row was inserted. But aside from that, there are a lot of permutations.
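A short sketch that enumerates the permutations may help here. It assumes each row in the statement contributes 1 (inserted), 2 (updated with a real change) or 0 (updated to its current values); `possible_outcomes` is a hypothetical name used only for this illustration.

```python
def possible_outcomes(attempted, affected):
    """List every (inserted, changed, unchanged) split that matches the totals."""
    outcomes = []
    for changed in range(attempted + 1):
        inserted = affected - 2 * changed
        unchanged = attempted - changed - inserted
        if inserted >= 0 and unchanged >= 0:
            outcomes.append((inserted, changed, unchanged))
    return outcomes


# The three corner cases from the text, for a three-row statement:
print(possible_outcomes(3, 6))  # [(0, 3, 0)]
print(possible_outcomes(3, 0))  # [(0, 0, 3)]
print(possible_outcomes(3, 1))  # [(1, 0, 2)]
# Anything in between is ambiguous:
print(possible_outcomes(3, 3))  # [(3, 0, 0), (1, 1, 1)]
```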
It might be possible to craft BEFORE INSERT and BEFORE UPDATE triggers to increment user-defined variables. If we initialize the user-defined variables immediately before the INSERT ... ON DUPLICATE KEY UPDATE statement, we could use those variables to determine how many rows we attempted to insert, and how many of those rows caused a duplicate key exception. (MySQL doesn't increment the rows_affected count for an UPDATE action that causes no actual update to the row.)
EDIT
If you have a guarantee that an UPDATE action will cause an actual change to the row... if you are changing the value of at least one column on each row, for every row that is changed... and if you have a count of the actual number of rows you are attempting to insert, then you could determine from the rows_affected count how many rows were inserted, and how many rows were updated.
The INSERT ... ON DUPLICATE KEY can cause the same row to be inserted and updated, and/or cause the same row to be updated multiple times.
Did you want a count of the number of "update" operations, including updates to the same row, or did you want a count of the number of rows in the table that got updated?
This expands on hakkikonu's answer, so read that first or this will make no sense (if it does at all).
I agree with spencer7593's comments about such things as "concurrency-killing (CK) operations with locks", and about the need to fix the formula for determining the update count in hakkikonu's answer.
I see no way of getting accurate insert and update counts without CK. Throwing in AFTER triggers certainly doesn't solve it without CK while also being accurate.
Suppose the nullable table1.blabla column exists only for use with batches against table1, regardless of the frequency of such batches. If a batch is not running against table1, blabla is guaranteed to be null, even if the column is not dropped. It is obvious how.
In that case I believe you can get insert and update counts accurately. Here is how, based on your INSERT statement:
table1 is write-locked exclusively for the batch code. Let's assume the MyISAM storage engine; hey, why not, we are making assumptions here.
The blabla column holds null, 'inserted', or 'updated' depending on your statement (barely different from what hakkikonu suggested).
You have your counts.
Concerning what spencer wrote in his answer about updates and/or inserts happening more than once for a given row with your question's INSERT statement, I don't see it that way, unless your batch data contains duplicate keys, in which case what does accuracy matter anyway?
Either the row is there to begin with or it is not, based on whatever triggered the ON DUPLICATE KEY. If it triggered, it is an update; if it didn't, it is an insert. Someone correct me.
Then, at the end, either ALTER TABLE ... DROP blabla is performed or blabla is updated to null, and the lock is released.
So I guess it comes down to how important the update and insert counts are, the size of the table, and the frequency of batches.
Add a new column, something like blabla, and give it NULL as its default value.
I assume you will use this only once.
Then
ON DUPLICATE KEY UPDATE
col3 = VALUES(col3),
blabla = 'up' ;
SELECT COUNT(*) AS allrows FROM table1; # returns the total row count
SELECT COUNT(*) AS updrows FROM table1 WHERE blabla = 'up'; # returns the update count
SELECT COUNT(*) AS insrows FROM table1 WHERE blabla IS NULL; # returns rows not updated: inserts plus untouched existing rows
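The marker-column idea can be tried out end to end with a small runnable sketch. Note the substitution: this uses SQLite (through Python's sqlite3 module) and its ON CONFLICT ... DO UPDATE clause as a stand-in for MySQL's ON DUPLICATE KEY UPDATE, so it illustrates the technique rather than MySQL's exact behaviour, and it needs SQLite 3.24 or newer. The 'inserted' and 'updated' marker values follow the variant described in the other answer, and the table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (col1 INTEGER PRIMARY KEY, col3 TEXT, blabla TEXT)"
)
# Two rows exist before the batch; blabla is NULL for both.
conn.executemany(
    "INSERT INTO table1 (col1, col3) VALUES (?, ?)", [(1, "old"), (2, "old")]
)

# The batch hits existing key 2 and adds new keys 3 and 4.
batch = [(2, "new"), (3, "new"), (4, "new")]
conn.executemany(
    "INSERT INTO table1 (col1, col3, blabla) VALUES (?, ?, 'inserted') "
    "ON CONFLICT(col1) DO UPDATE SET col3 = excluded.col3, blabla = 'updated'",
    batch,
)

# Count rows per marker value; NULL means the batch never touched the row.
counts = dict(
    conn.execute(
        "SELECT COALESCE(blabla, 'untouched'), COUNT(*) FROM table1 "
        "GROUP BY COALESCE(blabla, 'untouched')"
    )
)
print(sorted(counts.items()))
# [('inserted', 2), ('untouched', 1), ('updated', 1)]
```

Marking inserted rows explicitly is what separates them from pre-existing rows the batch never touched; with only an 'up' marker, both groups stay NULL.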
I have a MySQL table on which I created AFTER and BEFORE triggers for insertion. Each trigger updates 2 rows. So once I insert a row into the table, 5 rows are changed altogether, even though the response from the DB is "1 row affected". I need a way to find the total number of rows that were changed, in this case 5.
The problem seems to be that MySQL does not count rows that are inserted or updated inside a trigger. The best solution might be to count those rows yourself, store the value in a variable, read it after the outer query, and add the result of ROW_COUNT() to it.
I think it reports one row affected because you are adding one row to the table. MySQL's reply only reports the rows added to that table, not the rows affected by triggers.