Getting duplicate count when executing INSERT IGNORE via JDBC - mysql

Is it possible to get the duplicate count when executing a MySQL "INSERT IGNORE" statement via JDBC?
For example, when I execute an INSERT IGNORE statement on the mysql command line and there are duplicates, I get something like
Query OK, 0 rows affected (0.02 sec)
Records: 1 Duplicates: 1 Warnings: 0
Note where it says "Duplicates: 1", indicating that there were duplicates that were ignored.
Is it possible to get the same information when executing the query via JDBC?
Thanks.

I believe you can retrieve this by issuing SHOW WARNINGS after your insert.
http://dev.mysql.com/doc/refman/5.0/en/show-warnings.html

You can use the ROW_COUNT() and FOUND_ROWS() functions to determine the number of inserted rows and duplicates when doing INSERT IGNORE.
For example:
SELECT ROW_COUNT(), FOUND_ROWS()
INTO myRowCount, myFoundRows;
myInsertedRows = myRowCount;
myDuplicateRows = myFoundRows - myInsertedRows;
COMMIT;

You can get the affected-rows value via JDBC: executeUpdate() returns it. Subtract it from the number of rows you attempted to insert to get the number of duplicates.
Bear in mind that rows affected also includes rows that are indirectly affected by the query. So, if there are any triggers that affect other rows, rows affected will include those rows as well.
See: http://docs.oracle.com/javase/6/docs/api/java/sql/PreparedStatement.html#executeUpdate()
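The subtraction described in this answer can be sketched as follows. This is a minimal Python illustration rather than JDBC code; the function name and parameters are hypothetical, and it assumes the affected-rows value counts only the rows the statement itself inserted.

```python
def duplicates_ignored(rows_attempted: int, rows_affected: int) -> int:
    """Return how many rows an INSERT IGNORE skipped.

    rows_attempted: number of rows the statement tried to insert.
    rows_affected:  the update count reported by the driver
                    (for JDBC, the value returned by executeUpdate()).
    """
    if rows_affected < 0 or rows_affected > rows_attempted:
        raise ValueError("affected-rows count is outside the expected range")
    return rows_attempted - rows_affected


# The console example from the question: 1 record, 0 rows affected.
print(duplicates_ignored(1, 0))
```

The range check is there because a count larger than the number of attempted rows would mean the assumption above does not hold for that statement.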

Related

on duplicate key update id=last_insert_id(id) - did the INSERT actually happen?

I need to INSERT a new row into my TABLE (with unique field 'A'); if it already exists (duplicate field, insert failed), just return the ID of the existing one.
This code works well:
insert into TABLE set A=1 on duplicate key update id=last_insert_id(id)
But now I have another problem: how do I know if the returning ID belongs to a new (inserted) row or it's just an old one?
Yes, I can do "SELECT id WHERE A=1" beforehand, but it would overcomplicate the program code, require two steps, and just looks ugly. Besides, in future I may want to remove some UNIQUE indexes, then I'll have to rewrite the program as well to change all the 'where' checks. Maybe there is a better solution?
[solved, see my answer]
Found this solution. It works in console, but doesn't work in my program (must be a bug in the client, idk) - so probably it will work fine for everyone (except me, sigh)
Just check the 'affected rows count' - it will be 1 for the new record and 0 for the old one
mysql> INSERT INTO EMAIL set addr="test" ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(id);
Query OK, 1 row affected (0.01 sec)<----- 1 = INSERTed
mysql> select last_insert_id(); //returns 1
mysql> INSERT INTO EMAIL set addr="test" ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(id);
Query OK, 0 rows affected (0.00 sec) <----- 0 = OLD
mysql> select last_insert_id(); //returns 1
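UPD: it didn't work for me because my program kept sending the 'CLIENT_FOUND_ROWS' flag when connecting to mysql. With that flag set, the server counts matched rows rather than changed rows, so the old record also reported 1. Removed it, now everything is fine!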
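For a single-row INSERT ... ON DUPLICATE KEY UPDATE, the MySQL manual documents a per-row affected-rows value of 1 for a newly inserted row, 2 for an existing row that was updated, and 0 for an existing row set to its current values; with the CLIENT_FOUND_ROWS flag the last case reports 1 instead of 0. A minimal Python sketch of that decision follows. The function name and the returned labels are hypothetical.

```python
def classify_upsert(affected_rows: int, client_found_rows: bool = False) -> str:
    """Interpret the affected-rows value of a single-row
    INSERT ... ON DUPLICATE KEY UPDATE statement."""
    if affected_rows == 2:
        # An existing row was found and its values actually changed.
        return "updated"
    if affected_rows == 1:
        # With CLIENT_FOUND_ROWS, 1 can mean either a new row or an
        # existing row set to its current values, so it is ambiguous.
        return "ambiguous" if client_found_rows else "inserted"
    if affected_rows == 0 and not client_found_rows:
        # An existing row was found and left as it was.
        return "unchanged"
    raise ValueError(f"unexpected affected-rows value: {affected_rows}")


print(classify_upsert(1))                          # new row
print(classify_upsert(0))                          # old row
print(classify_upsert(1, client_found_rows=True))  # cannot tell
```

This is why the check in the answer only works when the connection does not set CLIENT_FOUND_ROWS.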

Getting the "Records" and "Duplicates" counts of INSERT ... SELECT ... ON DUPLICATE KEY UPDATE

The INSERT ... SELECT... ON DUPLICATE KEY UPDATE returns as affected-rows a number derived from (inserted count) + (updated count) * 2, and this is well documented in multiple places.
However in the output of the MySQL Command-Line Tool, I've noticed this extra info:
> INSERT INTO ...
-> SELECT ... FROM ...
-> ON DUPLICATE KEY UPDATE ...
-> ;
Query OK, 97 rows affected (0.03 sec)
Records: 2425 Duplicates: 28 Warnings: 0
Namely, the numbers Records: and Duplicates:.
Analysis shows:
The 97 rows affected is affected-rows (a.k.a. ROW_COUNT()).
Records: 2425 is the number of rows fetched by the SELECT part.
Duplicates: 28 is the number of rows actually changed by the ON DUPLICATE KEY UPDATE part.
Consequently:
affected-rows - Duplicates * 2 is the number of rows actually inserted (here 97 - 28 * 2 = 41).
Records - affected-rows + Duplicates is the number of rows duplicated but not changed, i.e. values were set to the same value (here 2425 - 97 + 28 = 2356).
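The arithmetic can be checked with a short Python sketch. It assumes the interpretation given above (Duplicates counts rows actually changed, and each changed row adds 2 to affected-rows), and it computes the unchanged count as Records minus inserted minus changed. The names `UpsertBreakdown` and `breakdown` are hypothetical.

```python
from typing import NamedTuple


class UpsertBreakdown(NamedTuple):
    inserted: int   # rows newly inserted
    changed: int    # existing rows whose values actually changed
    unchanged: int  # existing rows set to the values they already had


def breakdown(records: int, duplicates: int, affected_rows: int) -> UpsertBreakdown:
    """Split the counters reported for INSERT ... SELECT ... ON DUPLICATE
    KEY UPDATE into inserted / changed / unchanged rows."""
    inserted = affected_rows - 2 * duplicates
    changed = duplicates
    unchanged = records - inserted - changed
    if inserted < 0 or unchanged < 0:
        raise ValueError("counters are inconsistent with the assumed formula")
    return UpsertBreakdown(inserted, changed, unchanged)


# Numbers from the console output in the question.
print(breakdown(records=2425, duplicates=28, affected_rows=97))
```

The three parts always sum to Records, which is a quick sanity check on the counters.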
Which brings us to the question: How does one obtain these numbers Records and Duplicates in a program? (I'm using MySQL Connector/J if that helps answer the question.)
Possibly for Records:, issuing a SELECT on FOUND_ROWS() directly after the INSERT ... SELECT ... ON DUPLICATE KEY UPDATE is one way.
I have no idea where Duplicates: comes from.
The C API does not provide direct access to these values (or the underlying information to calculate these values) as numbers, as it does with mysql_affected_rows().
You have however access to that message using mysql_info():
mysql_info()
const char *mysql_info(MYSQL *mysql)
Description
Retrieves a string providing information about the most recently executed statement, but only for the statements listed here. For other statements, mysql_info() returns NULL. The format of the string varies depending on the type of statement, as described here. The numbers are illustrative only; the string contains values appropriate for the statement.
INSERT INTO ... SELECT ...
String format: Records: 100 Duplicates: 0 Warnings: 0
[...]
UPDATE
String format: Rows matched: 40 Changed: 40 Warnings: 0
Return Values
A character string representing additional information about the most recently executed statement. NULL if no information is available for the statement.
You can/have to parse these (query dependent) strings if you need access to those values in detail. The mysql client simply displays this message as it is.
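Parsing such a string could look roughly like this. It is a minimal Python sketch that works on the text only and assumes the "Name: number" layout shown in the quoted documentation; `parse_info` is a hypothetical helper, not part of any MySQL client library.

```python
import re
from typing import Dict, Optional

# One "Name: 123" pair; names may contain spaces, e.g. "Rows matched".
_INFO_FIELD = re.compile(r"([A-Za-z ]+?):\s*(\d+)")


def parse_info(info: Optional[str]) -> Dict[str, int]:
    """Turn an info string such as 'Records: 100 Duplicates: 0 Warnings: 0'
    into a dict of counters. Returns an empty dict when info is None,
    which is what mysql_info() yields for unsupported statements."""
    if info is None:
        return {}
    return {name.strip(): int(value) for name, value in _INFO_FIELD.findall(info)}


print(parse_info("Records: 100 Duplicates: 0 Warnings: 0"))
print(parse_info("Rows matched: 40 Changed: 40 Warnings: 0"))
```

Because the format depends on the statement type, callers should look up keys defensively, for example with `dict.get`.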
Unfortunately, not every API, including MySQL Connector/J, implements or relays this feature, so those detailed values do not seem to be accessible there.

Fetching last insert id shows wrong number

I have a table with three records. One field is auto_increment. The IDs are 1, 2, 3.
When I run query
SELECT LAST_INSERT_ID() FROM myTable
The result is 5, even though the last ID is 3. Why?
And which is better to use: LAST_INSERT_ID() or SELECT MAX(ID) FROM myTable?
The LAST_INSERT_ID() function only returns the most recent autoincremented id value for the most recent INSERT operation, to any table, on your MySQL connection.
If you haven't just done an INSERT it returns an unpredictable value. If you do several INSERT queries in a row it returns the id of the most recent one. The ids of the previous ones are lost.
If you use it within a MySQL transaction, the row you just inserted won't be visible to another connection until you commit the transaction. So, it may seem like there's no row matching the returned LAST_INSERT_ID() value if you're stepping through code to debug it.
You don't have to use it within a transaction, because it is a connection-specific value. If you have two connections (two MySQL client programs) inserting stuff, they each have their own distinct value of LAST_INSERT_ID() for the INSERT operations they are doing.
Edit: If you are trying to create a parent-child relationship, for example names and email addresses, you might try this kind of sequence of MySQL statements.
INSERT INTO user (name) VALUES ('Josef');
SET @userId := LAST_INSERT_ID();
INSERT INTO email (user_id, email) VALUES (@userId, 'josef@example.com');
INSERT INTO email (user_id, email) VALUES (@userId, 'josef@josefsdomain.me');
This uses LAST_INSERT_ID() to get the autoincremented ID from the user row after you insert it. It then makes a copy of that id in @userId, and uses it twice, to insert two rows in the child table. By using more INSERT INTO email requests, you could insert an arbitrary number of child rows for a single parent row.
Pro tip: SELECT MAX(id) FROM table is a bad, bad way to figure out the ID of the most recently inserted row. It's vulnerable to race conditions. So it will work fine until you start scaling up your application, then it will start returning the wrong values at random. That will ruin your weekends.
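The race can be illustrated with a toy in-memory model. This is plain Python, not MySQL: `ToyTable` and `ToyConnection` are hypothetical classes that only mimic the idea of a shared auto-increment counter and a per-connection "last insert id".

```python
class ToyConnection:
    """Stands in for one client connection with its own last insert id."""

    def __init__(self) -> None:
        self.last_insert_id = 0


class ToyTable:
    """Stands in for a table with an auto-increment id column."""

    def __init__(self) -> None:
        self.ids = []

    def insert(self, conn: ToyConnection) -> int:
        new_id = len(self.ids) + 1
        self.ids.append(new_id)
        # The id is remembered on the inserting connection only.
        conn.last_insert_id = new_id
        return new_id

    def max_id(self) -> int:
        return max(self.ids)


table = ToyTable()
conn_a, conn_b = ToyConnection(), ToyConnection()

table.insert(conn_a)  # connection A inserts its row
table.insert(conn_b)  # connection B inserts before A reads the id back

print(conn_a.last_insert_id)  # A's own row
print(table.max_id())         # B's row, which is the wrong answer for A
```

In the model, the per-connection value stays correct for A no matter how many other connections insert in between, while the table-wide maximum does not.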
last_insert_id() has no relation to specific tables. Within the same connection, all tables share the same value.
Below is a demo for it.
Demo:
mysql> create table t1(c1 int primary key auto_increment);
Query OK, 0 rows affected (0.11 sec)
mysql> create table t2(c1 int primary key auto_increment);
Query OK, 0 rows affected (0.06 sec)
mysql> insert into t1 values(null);
Query OK, 1 row affected (0.01 sec)
mysql> insert into t2 values(4);
Query OK, 1 row affected (0.00 sec)
mysql> insert into t2 values(null);
Query OK, 1 row affected (0.02 sec)
mysql> select last_insert_id() from t1;
+------------------+
| last_insert_id() |
+------------------+
| 5 |
+------------------+
1 row in set (0.00 sec)
I don't think this function does what you think it does. It returns the last id inserted on the current connection.
Compare that to SELECT MAX(ID), which selects the highest ID irrespective of connection. Be careful not to get them mixed up or you will get unexpected results.
As for why it is showing 5, it's probably because it's the last id to be inserted. I believe that this value will remain even if the record is removed; perhaps someone could confirm this.
Table-level triggers can come to the rescue here, e.g. a BEFORE INSERT trigger.
Maybe you should close the database connection and reconnect to get fresh data.

Does UPDATE overwrite values if they are identical?

I only want to update a value in my database if it is different. Reading through the Oracle docs on UPDATE, it says...
...the UPDATE statement updates columns of existing rows in the named table with new values.
Since it doesn't say it won't overwrite identical values, should I take this statement literally? Does this mean MySQL does some sort of Boolean matching check for me?
No, MySQL won't overwrite identical values.
Let's say we insert some data:
insert into foo(id,val1,val2,val3) values (0,1,2,3);
Query OK, 1 row affected (0.00 sec)
If you update it with the same values:
update foo set id=0, val1=1, val2=2, val3=3 where id=0;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0
Note the server's response: 0 rows affected.
An SQL query would update even identical values by effectively substituting them. In any case, you can structure your SQL so that it does not substitute identical values. (I also think the latter would take more time than the normal procedure and may make no difference to the final result.)

MySQL row_count() function always returning 0

I want to show the result of rows affected after update, insert, or delete in mysql. I have put
DELETE FROM A WHERE ID='1';
SELECT ROW_COUNT();
I put ROW_COUNT() as the last statement, but the result shown is 0.
If you want to know the number of rows affected by a DELETE query in phpMyAdmin, running the query there will show you the result.
As @Flash Thunder said, phpMyAdmin does not allow multiple queries to be sent at once.
If you want to see the affected rows, you can also write a PHP script that executes your SQL query and returns the number of affected rows.
Just to be clear.
phpMyAdmin is written in PHP, and PHP does not allow multiple queries to be sent at once. If you send two queries separately, the second query runs on a new connection, so it has no access to the previous query's information. That's why SELECT ROW_COUNT(); returns 0.
But by default phpMyAdmin shows the affected-rows count in the information displayed after a query. It probably uses the mysql(i)_affected_rows() function.
FOUND_ROWS() returns the number of tables in the database when there was no previous query.
mysql> use hunting;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select found_rows();
+--------------+
| found_rows() |
+--------------+
| 24 |
+--------------+
1 row in set (0.00 sec)
mysql> show tables;
(...)
24 rows in set (0.00 sec)