Normally I can insert a row into a MySQL table and get the last_insert_id back. Now, though, I want to bulk insert many rows into the table and get back an array of IDs. Does anyone know how I can do this?
There are some similar questions, but they are not exactly the same. I don't want to insert the new ID to any temporary table; I just want to get back the array of IDs.
Can I retrieve the lastInsertId from a bulk insert?
Mysql mulitple row insert-select statement with last_insert_id()
Old thread but just looked into this, so here goes: if you are using InnoDB on a recent version of MySQL, you can get the list of IDs using LAST_INSERT_ID() and ROW_COUNT().
InnoDB guarantees sequential numbers for AUTO_INCREMENT when doing bulk inserts, provided innodb_autoinc_lock_mode is set to 0 (traditional) or 1 (consecutive).
Consequently you can get the first ID from LAST_INSERT_ID() and the last by adding ROW_COUNT()-1.
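The arithmetic can be sketched client-side as below. This is only an illustration: the helper name `ids_from_bulk_insert` and its `step` parameter are my own, not part of MySQL. It assumes the IDs of the bulk insert really are consecutive (the lock modes mentioned above), and `step` stands in for a non-default auto_increment_increment.

```python
def ids_from_bulk_insert(first_id, row_count, step=1):
    """Rebuild the generated IDs of one multi-row INSERT.

    first_id  -- value of LAST_INSERT_ID() (the FIRST row of the statement)
    row_count -- value of ROW_COUNT() read right after the INSERT
    step      -- assumed auto_increment_increment (hypothetical parameter)
    """
    if row_count <= 0:
        # ROW_COUNT() is 0 or -1 when nothing was inserted
        return []
    return [first_id + i * step for i in range(row_count)]


print(ids_from_bulk_insert(10, 3))  # -> [10, 11, 12]
```

The last ID is the final element of the list, i.e. first ID + ROW_COUNT() - 1 when the step is 1.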
The only way I can think of is to store a unique identifier (a GUID) with each set of rows you insert, then select the row IDs by that identifier, e.g.:
INSERT INTO t1
(SELECT col1,col2,col3,'3aee88e2-a981-1027-a396-84f02afe7c70' FROM a_very_large_table);
COMMIT;
SELECT id FROM t1
WHERE guid='3aee88e2-a981-1027-a396-84f02afe7c70';
You could also generate the guid in the database by using uuid()
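A minimal sketch of the batch-tag idea follows. To stay runnable without a server it uses Python's built-in sqlite3 rather than MySQL, so the connection, the table definition and the helper name `insert_batch_and_get_ids` are all assumptions made for illustration; the pattern (tag every row of the batch, then select by the tag) is the same.

```python
import sqlite3
import uuid


def insert_batch_and_get_ids(conn, rows):
    """Insert rows tagged with a fresh batch identifier and return their ids."""
    # Generated client-side here; MySQL's UUID() would be the server-side analogue.
    batch_id = str(uuid.uuid4())
    conn.executemany(
        "INSERT INTO t1 (col1, col2, guid) VALUES (?, ?, ?)",
        [(c1, c2, batch_id) for c1, c2 in rows],
    )
    conn.commit()
    cur = conn.execute(
        "SELECT id FROM t1 WHERE guid = ? ORDER BY id", (batch_id,)
    )
    return [row[0] for row in cur.fetchall()]


# In-memory stand-in for the real table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t1 (id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " col1 TEXT, col2 TEXT, guid TEXT)"
)
first = insert_batch_and_get_ids(conn, [("a", "b"), ("c", "d")])
second = insert_batch_and_get_ids(conn, [("e", "f")])
print(first, second)  # -> [1, 2] [3]
```

Because the lookup goes by the tag rather than by an ID range, it does not depend on the IDs being consecutive. An index on the guid column would probably be wanted on a large table.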
Let's assume we have a table called temptable with two columns, uid and col1, where uid is an auto-increment field. Doing something like the below returns all the inserted IDs in a result set, which you can loop through to get your IDs. I realize that this is an old post and that this solution might not work in every case, but it might for some, which is why I'm replying.
# lock the table
lock tables temptable write;
#bulk insert the rows;
insert into temptable(col1) values(1),(2),(3),(4);
# get the ID of the first inserted row; after a bulk insert,
# last_insert_id() should give the value of the first row from the bulk op.
set @first_id = last_insert_id();
# now select the auto-increment values greater than or equal to the first row's.
# Remember that since you hold a write lock on the table, other sessions
# can't write to it, so this result set should have all the inserted IDs.
select uid from temptable where uid >= @first_id;
#now that you are done don't forget to unlock the table.
unlock tables;
It's worth noting that @Dag Sondre Hansen's answer can also be used when innodb_autoinc_lock_mode is set to 2, simply by locking the table before the insert.
LOCK TABLE my_table WRITE;
INSERT INTO my_table (col_a, col_b, col_c) VALUES (1,2,3), (4,5,6), (7,8,9);
SET @row_count = ROW_COUNT();
SET @last_insert_id = LAST_INSERT_ID();
UNLOCK TABLES;
SELECT id FROM my_table WHERE id >= @last_insert_id AND id <= @last_insert_id + (@row_count - 1);
Here's a fiddle demonstrating: https://www.db-fiddle.com/f/ahXAhosYkkRmwqR9Y4mAsr/0
I wouldn't count on the auto-increment value increasing by 1 per row. That assumption causes huge problems if your DB uses master-master replication, where the increment is changed to avoid duplicate auto_increment values: AI will go up by 2 instead of 1, and by 3 with one more master. So relying on AUTO_INCREMENT going up by 1 can kill your project.
I see only a couple of good options for doing this.
This SQL snippet has no problems with multiple masters and gives good results as long as you need only the inserted records. Without a transaction, concurrent requests can catch records inserted by others.
START TRANSACTION;
SELECT max(id) INTO @maxLastId FROM `main_table`;
INSERT INTO `main_table` (`value`) VALUES ('first'), ('second') ON DUPLICATE KEY UPDATE `value` = VALUES(`value`);
SELECT `id` FROM `main_table` WHERE id > @maxLastId OR @maxLastId IS NULL;
COMMIT;
If you also need the records updated by ON DUPLICATE KEY UPDATE, you will need to refactor the database a bit, and the SQL will look like the following (safe both with and without transactions, within one connection).
#START TRANSACTION
INSERT INTO bulk_inserts VALUES (null);
SET @blukTransactionId = LAST_INSERT_ID();
SELECT @blukTransactionId, LAST_INSERT_ID();
INSERT INTO `main_table` (`value`, `transaction_id`) VALUES ('first', @blukTransactionId), ('second', @blukTransactionId) ON DUPLICATE KEY UPDATE `value` = VALUES(`value`), `transaction_id` = VALUES(`transaction_id`);
SELECT @blukTransactionId, LAST_INSERT_ID();
SELECT id FROM `main_table` WHERE `transaction_id` = @blukTransactionId;
#COMMIT
Both approaches are transaction-safe. The first shows you only the inserted records; the second gives you all records, including updated ones.
Those options also work with INSERT IGNORE ...
This thread is old but all these solutions did not help me so I came up with my own.
First, count how many rows you want to insert
let's say we need to add 5 rows:
LOCK TABLE tbl WRITE;
SELECT `AUTO_INCREMENT` FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'my_db' AND TABLE_NAME = 'tbl';
Then use the AUTO_INCREMENT value just selected in the next query:
ALTER TABLE tbl AUTO_INCREMENT = {AUTO_INCREMENT}+5;
UNLOCK TABLES;
Finally do your inserts
Use the reserved autoincrement range to insert with id.
Warning: this solution requires elevated access level to the tables. But usually bulk inserts are run by crons and importer scripts and what not that may use special access anyway. You would not use this for just a few inserts.
This may leave unused id's if you use ON DUPLICATE KEY UPDATE.
I think you will have to either handle the transaction id in your application, or the item id in your application in order to do this flawlessly.
One way to do this which could work, assuming that all your inserts succeed (!), is the following :
You can then get the inserted id's with a loop for the number of affected rows, starting with lastid (which is the first inserted id of the bulk insert).
I checked, and it works perfectly. Just be careful: HeidiSQL, for example, will not return the correct value for ROW_COUNT(), probably because the GUI runs extra statements we didn't ask for. From either the command line or PHP mysqli, however, it is perfectly correct.
START TRANSACTION;
BEGIN;
INSERT into test (b) VALUES ('1'),('2'),('3');
SELECT LAST_INSERT_ID() AS lastid,ROW_COUNT() AS rowcount;
COMMIT;
In PHP it looks like this (local_sqle is a straight call to mysqli_query, local_sqlec is a call to mysqli_query + convert resultset to PHP array) :
local_sqle("START TRANSACTION;
BEGIN;
INSERT into test (b) VALUES ('1'),('2'),('3');");
$r=local_sqlec("SELECT LAST_INSERT_ID() AS lastid,ROW_COUNT() AS rowcount;");
local_sqle("
COMMIT;");
$i=0;
echo "last id =".($r[0]['lastid'])."<br>";
echo "Row count =".($r[0]['rowcount'])."<br>";
while($i<$r[0]['rowcount']){
echo "inserted id =".($r[0]['lastid']+$i)."<br>";
$i++;
}
The queries are separated because otherwise my own functions wouldn't give me the result. If you do this with standard functions, you can put everything back into one statement and then retrieve the result you need (it should be result number 2, assuming you use an extension which handles more than one result set per query).
For anyone using Java with JDBC, this is possible. I get the IDs back from a batch insert like this:
PreparedStatement insertBatch = null;
Connection connection = ....;
for (Event event : events) {
    if (insertBatch == null) {
        // prepare once, asking the driver to return the generated keys
        insertBatch = connection.prepareStatement("insert into `event` (game, `type`, actor, target, arg1, arg2, arg3, created) " +
                "values (?, ?, ?, ?, ?, ?, ?, ?)", Statement.RETURN_GENERATED_KEYS);
    }
    insertBatch.setObject(1, event.game);
    insertBatch.setString(2, event.type);
    insertBatch.setObject(3, event.actor);
    insertBatch.setObject(4, event.target);
    insertBatch.setString(5, event.arg1);
    insertBatch.setObject(6, event.arg2);
    insertBatch.setObject(7, event.arg3);
    insertBatch.setTimestamp(8, new Timestamp(event.created.getTime()));
    insertBatch.addBatch();
}
if (insertBatch != null) {
    insertBatch.executeBatch();
    ResultSet generatedKeys = insertBatch.getGeneratedKeys();
    // keys come back in insertion order, one per batched row
    for (Event event : events) {
        if (generatedKeys == null || !generatedKeys.next()) {
            logger.warn("Unable to retrieve all generated keys");
            break; // nothing left to read; don't call getLong() on an exhausted result set
        }
        event.id = generatedKeys.getLong(1);
    }
    logger.debug("events inserted");
}
Source: "Using MySQL I can do it with JDBC this way:" - Plap - https://groups.google.com/g/jdbi/c/ZDqnfhK758g?pli=1
I actually had to add rewriteBatchedStatements=true to my JDBC URL; otherwise the individual inserts show up in the MySQL "general query log" as separate rows. With 7000 rows inserted, I got 2m11s for regular inserts, 46s for batched inserts without the rewrite option, and 1.1s with it. Also, it does not block other people's inserts (I tested that). When I inserted 200k rows, it grouped them into about 36k per line, i.e. insert into abc(..) values(..),(..),(..)....
I am actually using JdbcTemplate, so the way to access the PreparedStatement is:
ArrayList<Long> generatedIds = (ArrayList<Long>) jdbcTemplate.execute(
    new PreparedStatementCreator() {
        @Override
        public PreparedStatement createPreparedStatement(Connection connection) throws SQLException {
            return connection.prepareStatement(insertSql, Statement.RETURN_GENERATED_KEYS);
        }
    },
    new PreparedStatementCallback<Object>() {
        @Override
        public Object doInPreparedStatement(PreparedStatement ps) throws SQLException, DataAccessException {
            // see above answer for setting the row data
            ...
            ps.executeBatch();
            ResultSet resultSet = ps.getGeneratedKeys();
            ArrayList<Long> ids = new ArrayList<>();
            while (resultSet.next()) {
                ids.add(resultSet.getLong(1));
            }
            return ids;
        }
    }
);
$query = "INSERT INTO TABLE (ID,NAME,EMAIL) VALUES (NULL,VALUE1, VALUE2)";
$idArray = array();
foreach($array as $key) {
mysql_query($query);
array_push($idArray, mysql_insert_id());
}
print_r($idArray);
I'm trying to get the inserted data from the last multi insert query so I can verify the written data.
I use pdo to execute the query.
$sql="insert into tableName (col1,col2) values (val1,val2),(val4,val5),(val7,val8) ON DUPLICATE KEY UPDATE Col1=VALUES(Col1),Col2=VALUES(Col2)";
$stmt = $dbh->prepare($sql);
$stmt->execute();
and then I run the lastInsertId() to get the last id of my autoIncrement column.
$lastId=$dbh->lastInsertId();
and the rowCount() function to get the number of inserted rows (don't care about the duplicates)
$numberOfNewRows=$stmt->rowCount();
now I want to get the data from the previous insert query
$limitRangeStart=$lastId-$numberOfNewRows;
$sql="select * from previusTable limit $limitRangeStart , $lastId ";
$stmt = $dbh->prepare($sql);
$stmt->execute();
So my question is: will the last query ALWAYS return the data that the previous query inserted, given that the multi-insert method was used?
Is there any possibility that another insert query running at the same time will "break up" the rows of the previous multi-insert?
I am doing a bulk insert of 0.5 million tokens into the database with an INSERT IGNORE statement. Among those 0.5 million tokens there can be duplicates,
so there is no guarantee that all of the tokens are actually inserted.
After the insertion I want to know how many tokens were inserted. Some people suggest using affected_rows to get the count of inserted (affected) rows, but affected_rows doesn't give the output of the current SQL statement; it gives the output of the last SQL statement.
Please tell me the best way to get the count of inserted rows with an INSERT IGNORE statement.
Put select row_count(); just after the insert statement to get the number of rows inserted.
eg:
insert ignore into tbl(col1) values (1),(2); select row_count();
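To see the counting behaviour without a MySQL server, here is a rough analogue using Python's built-in sqlite3. Note the assumptions: SQLite spells the statement INSERT OR IGNORE rather than INSERT IGNORE, the `tokens` table is invented for the demo, and `cursor.rowcount` plays the role that ROW_COUNT() / affected_rows play in MySQL.

```python
import sqlite3

# In-memory stand-in table with a uniqueness constraint (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tokens (token TEXT PRIMARY KEY)")
conn.execute("INSERT INTO tokens (token) VALUES ('a')")

# One multi-row insert containing a pre-existing token and an in-batch duplicate.
cur = conn.execute(
    "INSERT OR IGNORE INTO tokens (token) VALUES ('a'), ('b'), ('c'), ('b')"
)
inserted = cur.rowcount  # rows actually written by this statement
total = conn.execute("SELECT COUNT(*) FROM tokens").fetchone()[0]
print(inserted, total)
```

The count has to be read immediately after the insert, on the same connection, before any other statement runs; that matches the advice above to put the row-count query directly after the INSERT.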
A single SQL INSERT IGNORE would work with affected_rows. I'm not sure, though, how that would turn out performance-wise, since it's 0.5 million rows to enter.
Anyhow, here's a solution I tried that works with 4 values in a single INSERT.
<?php
$mysqli = new mysqli('127.0.0.1','root','','test');
if (mysqli_connect_errno()) {
printf("Connect failed: %s\n", mysqli_connect_error());
exit();
}
$sql = "INSERT IGNORE INTO test1 (Name, Attribute, Val) VALUES ('ai', 'blue', '1j'),('ai1', 'white', '2j'),('ai2', 'black', '3j'),('ai1', 'green', '4j')";
$insert = $mysqli->query($sql);
printf("%d\n", $mysqli->affected_rows);
?>
I created a new MySQL table and I need the first field to be an index and a key.
I'm not sure I got the terminology right but I simply need that field to automatically increment by 1 with each insert.
So I defined that field as an index and gave it the auto_increment attribute.
Now I try to insert the first row like this:
$wpdb->insert('wp_branches', array(user_id=>$user_id, branchName=>$bname));
The index/key field branchId is missing from this query because I'm counting on the db to automatically give it the value 1 since it's the first insert, and then increment it with every additional insert.
For some reason the row isn't being inserted and db is left empty.
What am I doing wrong?
try like so:
$sql = $wpdb->prepare(
"INSERT INTO `wp_branches` (`user_id`,`branchName`) values (%d,%s)",
$user_id, $bname);
$wpdb->query($sql);
This will protect you against "injections" too.
Or, your original call modified so that the array keys are quoted:
$wpdb->insert('wp_branches', array('user_id' => $user_id, 'branchName' => $bname));
If I insert multiple records with a loop that executes a single record insert, the last insert id returned is, as expected, the last one. But if I do a multiple records insert statement:
INSERT INTO people (name,age)
VALUES ('William',25), ('Bart',15), ('Mary',12);
Let's say the three above are the first records inserted in the table. After the insert statement I expected the last insert id to return 3, but it returned 1. The first insert id for the statement in question.
So can someone please confirm if this is the normal behavior of LAST_INSERT_ID() in the context of multiple records INSERT statements. So I can base my code on it.
Yes. This behavior of last_insert_id() is documented in the MySQL docs:
Important
If you insert multiple rows using a single INSERT statement, LAST_INSERT_ID() returns the value generated for the first inserted row only. The reason for this is to make it possible to reproduce easily the same INSERT statement against some other server.
This behavior is mentioned on the man page for MySQL. It's in the comments but is not challenged, so I'm guessing it's the expected behavior.
I think it's possible if your table has a unique auto-increment column (ID) and you don't require the IDs to be returned by MySQL itself. It would cost you 3 more DB requests and some processing, and would require these steps:
Get "Before MAX(ID)" right before your insert:
SELECT MAX(id) AS before_max_id FROM table_name
Make multiple INSERT ... VALUES () query with your data and keep them:
INSERT INTO table_name
(col1, col2)
VALUES
("value1-1" , "value1-2"),
("value2-1" , "value2-2"),
("value3-1" , "value3-2")
ON DUPLICATE KEY UPDATE
Get "After MAX(ID)" right after your insert:
SELECT MAX(id) AS after_max_id FROM table_name
Get the records with IDs above "Before MAX(ID)" and up to and including "After MAX(ID)":
SELECT * FROM table_name WHERE id>$before_max_id AND id<=$after_max_id
Check the retrieved data against the data you inserted to match them up, and remove any records that were not inserted by you. The remaining records have your IDs:
foreach ($after_collection as $after_item) {
    foreach ($input_collection as $input_item) {
        if ( $after_item->compare_content($input_item) ) {
            $intersection_array[] = $after_item;
        }
    }
}
This is just how an ordinary person would solve it in the real world, with parts of the code. Thanks to auto-increment, you get the smallest possible number of records to check against, so it will not take a lot of processing. This is not final "copy & paste" code: e.g. you have to write your own compare_content() function according to your needs.
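The matching step could look roughly like the following Python sketch. Everything here is illustrative: `match_inserted_rows` and its `key_columns` argument are hypothetical names, rows are modelled as plain dictionaries, and comparing on a fixed list of columns is just one possible stand-in for compare_content(). It uses a set lookup instead of the nested loop, which matters only when the collections are large.

```python
def match_inserted_rows(fetched_rows, input_rows, key_columns):
    """Return the ids of fetched rows whose content matches a row we inserted.

    fetched_rows -- rows selected from the before/after MAX(id) window
    input_rows   -- the rows this session tried to insert (no ids yet)
    key_columns  -- columns compared to decide that two rows are "the same"
    """
    wanted = {tuple(row[c] for c in key_columns) for row in input_rows}
    return [
        row["id"]
        for row in fetched_rows
        if tuple(row[c] for c in key_columns) in wanted
    ]


fetched = [
    {"id": 11, "col1": "value1-1", "col2": "value1-2"},
    {"id": 12, "col1": "other", "col2": "session"},  # inserted by someone else
    {"id": 13, "col1": "value2-1", "col2": "value2-2"},
]
mine = [
    {"col1": "value1-1", "col2": "value1-2"},
    {"col1": "value2-1", "col2": "value2-2"},
]
print(match_inserted_rows(fetched, mine, ["col1", "col2"]))  # -> [11, 13]
```

If another session inserts a row with identical content inside the same window, content comparison alone cannot tell the two apart.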