Prevent Auto Increment Skip On Duplicate Key Update - mysql

I have a table with
(ID INT AUTO_INCREMENT PRIMARY KEY,
tag VARCHAR UNIQUE)
I want to insert multiple tags at once, like this:
INSERT INTO tags (tag) VALUES ("java"), ("php"), ("python");
If I execute this and "java" is already in the table, I get a duplicate-key error and "php" and "python" are not added.
If I do it like this:
INSERT INTO tags (tag) VALUES ("java"), ("php"), ("python")
ON DUPLICATE KEY UPDATE tag = VALUES(tag)
the rows get added without an error, but two values are skipped in the ID column.
Example: I have java with ID = 1 and I run the query. Then php will be 3 and python 4. Is there a way to execute this query without skipping IDs?
I don't want big gaps between them. I also tried INSERT IGNORE.
Thank you!

See "SQL #1" in http://mysql.rjweb.org/doc.php/staging_table#normalization . It is more complex but avoids 'burning' ids. It has the potential drawback of needing the tags in another table. A snippet from that link:
# This should not be in the main transaction, and it should be
# done with autocommit = ON
# In fact, it could lead to strange errors if this were part
# of the main transaction and it ROLLBACKed.
INSERT IGNORE INTO Hosts (host_name)
SELECT DISTINCT s.host_name
FROM Staging AS s
LEFT JOIN Hosts AS n ON n.host_name = s.host_name
WHERE n.host_id IS NULL;
By isolating this as its own transaction, we get it finished in a hurry, thereby minimizing blocking. By saying IGNORE, we don't care if other threads are 'simultaneously' inserting the same host_names. (If you don't have another thread doing such INSERTs, you can toss the IGNORE.)
(Then it goes on to talk about IODKU.)
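Applied to the tags table from the question, the same pattern might look like this (a sketch; the staging-table name and VARCHAR length are assumptions, not from the link):
-- Stage the incoming tags, then insert only the genuinely new ones,
-- so no auto-increment values are reserved for tags that already exist.
CREATE TEMPORARY TABLE tag_staging (tag VARCHAR(64) NOT NULL);
INSERT INTO tag_staging (tag) VALUES ('java'), ('php'), ('python');
INSERT IGNORE INTO tags (tag)
SELECT DISTINCT s.tag
FROM tag_staging AS s
LEFT JOIN tags AS t ON t.tag = s.tag
WHERE t.ID IS NULL;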

The InnoDB engine's main feature is support for ACID transactions. What I'd point out, and it is not really a "problem", is that the engine "reserves" the id before it knows whether the row is a duplicate or not.
Here is one workaround, but it depends on your table; if it is very large you should run some tests first, because the AUTO_INCREMENT counter is what keeps the ids in order.
Some examples:
INSERT INTO tags (tag) VALUES ('java'), ('php'), ('python')
ON DUPLICATE KEY UPDATE tag = VALUES(tag), ID = LAST_INSERT_ID(ID);
SELECT LAST_INSERT_ID();
ALTER TABLE tags AUTO_INCREMENT = 1;
Note: I added ID = LAST_INSERT_ID(ID) because with that clause, whether the row is inserted or updated, a following SELECT LAST_INSERT_ID() returns the id of the affected row.
Keep in mind that every INSERT INTO still advances the AUTO_INCREMENT counter; the ALTER TABLE reset above only closes the gap at the end of the table.

Related

Normalize data before loading to database or use database?

I have some data which I want to add to an existing MySQL database. The new data may contain entries that are already saved in the DB. Since some of my columns are unique, I get, as expected, an ER_DUP_ENTRY error.
Bulk Insert
Let's say I want to use the following statement to save "A", "B" and "C" in a column names of table mytable, and "A" is already saved there.
insert into mytable (names) values ("A"), ("B"), ("C");
Is there a way to directly use bulk insert to save "B" and "C" while ignoring "A"? Or do I have to build an insert statement for every new row? This leads to another question:
Normalize Data
Should I ensure there are no duplicate entries before the actual insert statement? In my case I would need to select the data from the database, eliminate duplicates, and then perform the insert shown above. Or is that a task which is supposed to be done by the database?
If you have UNIQUE constraints that are blocking import, you have a few ways you can work around that:
INSERT IGNORE INTO mytable ...
If any individual rows violate a UNIQUE constraint, they are skipped. Other rows are inserted.
REPLACE INTO mytable ...
If any rows violate a UNIQUE constraint, DELETE the existing row, then INSERT the new row. Keep in mind side-effects of doing this, like if you have foreign keys that cascade on delete referencing the deleted row. Or if the INSERT generates a new auto-increment id.
INSERT INTO mytable ... ON DUPLICATE KEY UPDATE ...
More flexibility. This does not delete the original row, but allows you to set new values for any columns you choose on a case by case basis. See also my answer to "INSERT IGNORE" vs "INSERT ... ON DUPLICATE KEY UPDATE"
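To make the trade-offs concrete, here is a sketch of all three options against a toy table (the DDL is invented for illustration):
-- A toy table with a UNIQUE constraint and an auto-increment id.
CREATE TABLE mytable (
  id INT AUTO_INCREMENT PRIMARY KEY,
  names VARCHAR(50) NOT NULL UNIQUE
);
INSERT INTO mytable (names) VALUES ('A');
-- 1. Skip conflicting rows: 'A' is skipped, 'B' and 'C' are inserted.
INSERT IGNORE INTO mytable (names) VALUES ('A'), ('B'), ('C');
-- 2. Delete-then-insert: the existing 'A' row is deleted and re-created
--    with a new auto-increment id.
REPLACE INTO mytable (names) VALUES ('A'), ('B'), ('C');
-- 3. Keep the existing row and update chosen columns on conflict.
INSERT INTO mytable (names) VALUES ('A'), ('B'), ('C')
ON DUPLICATE KEY UPDATE names = VALUES(names);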
If you want to use bulk-loading with mysqlimport or the SQL statement equivalent LOAD DATA INFILE, there are options that match the INSERT IGNORE or REPLACE solutions, but not the INSERT...ON DUPLICATE KEY UPDATE solution.
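For example, the bulk-load equivalents look roughly like this (the file path is a placeholder):
-- IGNORE and REPLACE here behave like INSERT IGNORE and REPLACE INTO.
LOAD DATA INFILE '/tmp/names.csv' IGNORE INTO TABLE mytable (names);
LOAD DATA INFILE '/tmp/names.csv' REPLACE INTO TABLE mytable (names);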
Read docs for more information:
https://dev.mysql.com/doc/refman/8.0/en/insert.html
https://dev.mysql.com/doc/refman/8.0/en/replace.html
https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html
https://dev.mysql.com/doc/refman/8.0/en/mysqlimport.html
https://dev.mysql.com/doc/refman/8.0/en/load-data.html
In some situations, I like to do this:
1. LOAD DATA into a temp table
2. Clean up the data
3. Normalize as needed. (2 SQLs per column that needs normalizing -- details)
4. Augment Summary table(s) (INSERT .. ON DUPLICATE KEY .. SELECT x, y, count(*), sum(z), .. GROUP BY x,y); see the sketch after this list.
5. Copy clean data from temp table to real table(s) ("Fact" table). (INSERT [IGNORE] .. SELECT [DISTINCT] .. or IODKU with SELECT.)
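A minimal sketch of step 4, with invented table and column names; the pattern is one IODKU statement fed by a GROUP BY over the staging data (Summary needs a unique key on (x, y) for the ON DUPLICATE branch to fire):
INSERT INTO Summary (x, y, cnt, total)
SELECT x, y, COUNT(*), SUM(z)
FROM Staging
GROUP BY x, y
ON DUPLICATE KEY UPDATE
  cnt   = cnt   + VALUES(cnt),
  total = total + VALUES(total);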
More on Normalizing:
I do it outside any transactions. There are multiple reasons why this is better.
At worst (as a result of other failures), I occasionally throw an unused entry in the normalization table. No big deal.
No burning of AUTO_INCREMENT ids (except in edge cases).
Very fast.
Since REPLACE is a DELETE plus INSERT it is almost guaranteed to be worse than IODKU. However, both burn ids when the rows exist.
If at all possible, do not "loop" through the rows; instead find SQL statements to handle them all at once.
Depending on the details, de-dup in step 2 (if lots of dups) or in step 5 (dups are uncommon).
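For completeness, the second of the "2 SQLs per column" from step 3 is a join that pulls the generated ids back into the staging table. Column names follow the earlier Hosts snippet; treat this as a sketch of the pattern, not the exact SQL from the link:
-- After the INSERT IGNORE shown earlier, fetch the ids (new and old).
UPDATE Staging AS s
JOIN Hosts AS n ON n.host_name = s.host_name
SET s.host_id = n.host_id;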

re-inserting a table record and updating an auto increment primary index

I'm running MariaDB 5.5.56.
I'm looking to copy an entire row in a database, change one column, then insert the entire row back into the original database (I don't want to have to specify the individual fields because there's a lot of them). The problem I'm running into is how to deal with an auto-increment/primary key column.
example:
create temporary table t_ownership like ownership;
insert into t_ownership (select * from ownership where name='x' LIMIT 1);
update t_ownership set id='something else';
insert into ownership (select * from t_ownership);
I have a column "recno" that is an auto-increment that will create a collision in the database when I try to re-insert the slightly changed record back into the original table.
Something like this seems to work but doesn't result in an insert:
insert into ownership (select * from t_ownership) ON DUPLICATE KEY UPDATE recno=LAST_INSERT_ID(ownership.recno);
The above statement executes without error but does not add a row to table ownership.
So I think I'm close but not quite there...
What would be the best way to do this? I'd like to avoid doing an insert where I manually specify field/values. I just need to regenerate a new A.I. recno column on the insert.
NULL values inserted into auto-incremented fields end up just getting the next auto-increment value, behaving equivalent to INSERTing without specifying the field; so you should be able to update the source (temp copy) to have NULL for that field.
However, one potential issue that could present itself in scenarios like yours is that the CREATE TEMPORARY TABLE ... LIKE could result in a table that would not allow you to set such fields to NULL; this would require you to either ALTER the temporary table, or create it in a more explicit manner. Either way, it now makes code/queries that do not specify columns even more reliant on knowing columns.
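A sketch of that route against the poster's tables, assuming recno is the auto-increment column and is an INT (the ALTER may or may not be needed, depending on how the temp table was created):
CREATE TEMPORARY TABLE t_ownership LIKE ownership;
-- Drop NOT NULL / auto-increment on the copy so recno can be NULLed.
ALTER TABLE t_ownership MODIFY recno INT NULL;
INSERT INTO t_ownership SELECT * FROM ownership WHERE name = 'x' LIMIT 1;
UPDATE t_ownership SET recno = NULL;  -- NULL means "assign a new value" on insert
INSERT INTO ownership SELECT * FROM t_ownership;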
Personally, I would take this route in the first place.
INSERT INTO theTable([list all but the auto-inc column])
SELECT [list all but the auto-inc column, with any replacements or modifications desired]
FROM ...[original query]...
It accomplishes the task in one query, makes the queries more self documenting, and only at the cost of a little typing (most of which a decent database browser, or query builder, will do for you).
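For instance, with column names invented purely for illustration:
-- recno is omitted from both lists, so it receives the next
-- auto-increment value on insert.
INSERT INTO ownership (name, address, acquired)
SELECT 'new name', address, acquired
FROM ownership
WHERE name = 'x'
LIMIT 1;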
The only argument really in favor of your current approach is that the table involved can be changed without necessarily breaking your queries; but that raises the question of whether it would be better for such table changes to break the queries, forcing them to be re-examined. If it is not an issue, it is a minor revision; but the alternative is queries that continue to be valid yet have the potential to cause unexpected behavior by copying information they were never intended to.

MySQL incrementing Primary Key on INSERT INTO .. ON DUPLICATE KEY UPDATE [duplicate]

Note: I'm new to databases and PHP
I have an order column that is set to auto increment and unique.
In my PHP script, I am using AJAX to get new data, but the problem is that the order column skips numbers and ends up substantially higher, forcing me to manually update the numbers after the data is inserted. In this case I would end up changing 782 to 38.
$SQL = "INSERT IGNORE INTO `read`(`title`,`url`) VALUES\n ".implode( "\n,",array_reverse( $sql_values ) );
How can I get it to increment +1?
The default auto_increment behavior in MySQL 5.1 and later will "lose" auto-increment values if the INSERT fails. That is, it increments by 1 each time, but doesn't undo an increment if the INSERT fails. It's uncommon to lose ~750 values but not impossible (I consulted for a site that was skipping 1500 for every INSERT that succeeded).
You can change innodb_autoinc_lock_mode=0 to use MySQL 5.0 behavior and avoid losing values in some cases. See http://dev.mysql.com/doc/refman/5.1/en/innodb-auto-increment-handling.html for more details.
Another thing to check is the value of the auto_increment_increment config variable. It's 1 by default, but you may have changed this. Again, very uncommon to set it to something higher than 1 or 2, but possible.
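You can check both settings like this (note that innodb_autoinc_lock_mode is read-only at runtime; it has to be set in the server configuration and takes effect after a restart):
-- Inspect the current values of both variables.
SHOW VARIABLES LIKE 'innodb_autoinc_lock_mode';
SHOW VARIABLES LIKE 'auto_increment_increment';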
I agree with other commenters, autoinc columns are intended to be unique, but not necessarily consecutive. You probably shouldn't worry about it so much unless you're advancing the autoinc value so rapidly that you could run out of the range of an INT (this has happened to me).
How exactly did you fix it skipping 1500 for every insert?
The cause of the INSERT failing was that there was another column with a UNIQUE constraint on it, and the INSERT was trying to insert duplicate values in that column. Read the manual page I linked to for details on why this matters.
The fix was to do a SELECT first to check for existence of the value before attempting to INSERT it. This goes against common wisdom, which is to just try the INSERT and handle any duplicate key exception. But in this case, the side-effect of the failed INSERT caused an auto-inc value to be lost. Doing a SELECT first eliminated almost all such exceptions.
But you also have to handle a possible exception, even if you SELECT first. You still have a race condition.
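In other words, something like this, using the read table from the question and assuming the UNIQUE constraint is on url (the duplicate-key handling still has to stay in place because of that race):
-- Check first; this avoids burning an auto-inc value in the common case.
SELECT 1 FROM `read` WHERE url = ? LIMIT 1;
-- Only if no row came back:
INSERT INTO `read` (title, url) VALUES (?, ?);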
You're right! innodb_autoinc_lock_mode=0 worked like a charm.
In your case, I would want to know why so many inserts are failing. I suspect that like many SQL developers, you aren't checking for success status after you do your INSERTs in your AJAX handler, so you never know that so many of them are failing.
They're probably still failing, you just aren't losing auto-inc id's as a side effect. You should really diagnose why so many fails occur. You could be either generating incomplete data, or running many more transactions than necessary.
After you change 782 to 38 you can reset the auto-increment with ALTER TABLE mytable AUTO_INCREMENT = 39. This way you continue at 39.
However, you should check why your gap is so high and change your design accordingly. Changing the auto-increment should not be "default" behaviour.
I know the question has been answered already, but if you have deleted rows in the table before, MySQL remembers the used IDs/numbers, because auto-increment values are meant to be unique, and will not reissue them. To make the counter continue from the current maximum, look up MAX(`order`) + 1 and set the counter to it (ALTER TABLE ... AUTO_INCREMENT requires a literal value, not a subquery):
SELECT MAX(`order`) + 1 FROM TableName;
ALTER TABLE TableName AUTO_INCREMENT = 39;  -- plug in the value from the SELECT (39 here as an example)
auto increment doesn't care if you delete some rows - every time you insert a row, the value is incremented.
If you want a numbering without gaps, don't use auto increment and do it yourself. You could use something like this to achieve that when inserting:
INSERT INTO tablename SET
`order` = (SELECT MAX(`order`) + 1 FROM (SELECT * FROM tablename) t),
...
and if you delete a row, you have to rearrange the order column manually.
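That rearranging can itself be a single statement; a sketch using the classic user-variable trick (works on the MySQL 5.x versions discussed here):
-- Renumber `order` as 1, 2, 3, ... preserving its current ordering.
SET @n := 0;
UPDATE tablename SET `order` = (@n := @n + 1) ORDER BY `order`;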

MySQL DUPLICATE KEY UPDATE fails to update due to a NOT NULL field which is already set

I have a MySQL DB which is using strict mode, so I need to fill in all NOT NULL values when I insert a row. The API I'm creating uses just the ON DUPLICATE KEY UPDATE functionality to do both inserts and updates.
The client application complains if any NOT NULL attribute is missing from an insert, which is expected.
Basic example (id is the primary key and there are two NOT NULL fields, aaa and xxx):
INSERT INTO tablename (aaa, xxx, id ) VALUES ( "value", "value", 1)
ON DUPLICATE KEY UPDATE aaa=VALUES(aaa), xxx=VALUES(xxx)
All good so far. Once it is inserted, the system would allow doing updates. Nevertheless, I get the following error when updating only one of the fields.
INSERT INTO tablename (aaa, id ) VALUES ( "newValue", 1)
ON DUPLICATE KEY UPDATE aaa=VALUES(aaa)
java.sql.SQLException: Field 'xxx' doesn't have a default value
This Exception is a lie, as the row is already inserted and the xxx attribute has "value" as its value. I would expect the statement above to be equivalent to:
UPDATE tablename SET aaa="newValue" WHERE id=1
I would be glad if someone can shed some light about this issue.
Edit:
I can run the SQL query in phpMyAdmin successfully to update just one field, so I am afraid this is not a SQL problem but a driver problem with JDBC. That may not have a solution then.
@Marc B: Your insight is probably true and would indicate what I just described. That would mean there is a bug in JDBC, as it should not do that check when the insert is of the ON DUPLICATE type, since there may be a value for the row after all. I can't provide real table data, but I believe everything explained above is quite clear.
@ruakh: It does not fail to insert, nor am I expecting delayed validation. One requirement I have is to do both inserts and updates with the same query, as the servlet does not know whether the row exists or not. The Java API service only fails to update a row whose NOT NULL fields were already filled when the insert was done. The exception is a lie because the field DOES have a value, as it was filled when the row was inserted.
This is a typical case of DRY / SRP fail; in an attempt to not duplicate code you've created a function that violates the single responsibility principle.
The semantics of an INSERT statement is that you expect no conflicting rows; the ON DUPLICATE KEY UPDATE option is merely there to avoid handling the conflict inside your code, requiring another separate query. This is quite different from an UPDATE statement, where you would expect at least one matching row to be present.
Imagine that MySQL only checked the columns when an INSERT doesn't conflict: if for some reason a row was just removed from the database, your code that expects to perform an update would have to deal with an exception it doesn't expect. Given the difference in statement behaviour, it's good practice to separate your insert and update logic.
Theory aside, MySQL puts together an execution plan when a query is run; in the case of an INSERT statement it has to assume that it might succeed when attempted, because that's the most optimal strategy. It prevents having to check indices etc. only to find out later that a column is missing.
This is by design and not a bug in JDBC.
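A sketch of the separation this answer recommends, reusing the question's table: let the caller state its intent instead of funnelling everything through one do-everything query.
-- When creating the row, supply every NOT NULL column:
INSERT INTO tablename (id, aaa, xxx) VALUES (1, 'value', 'value');
-- When changing a single field on an existing row:
UPDATE tablename SET aaa = 'newValue' WHERE id = 1;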

Verify a query is going to work before executing another query in reverse order

Ok, I have an update function with a weird twist. Due to the nature of the structure, I run a delete query then insert query, rather than an actual "Update" query. They are specifically run in that order so that the new items inserted are not deleted. Essentially, items are deleted by an attribute id that matches in the insert query. Since the attribute is not a primary index, "ON DUPLICATE KEY UPDATE" is not working.
So here's the dilemma. During development and testing, the delete query runs without fail, but if I'm screwing around with the input for the INSERT query and it fails, then the data has been deleted without being reinserted. That means regenerating new test data, and even worse, if it fails in production the user will lose everything they were working on.
So, I know MySQL validates a query before it is actually run, so is it possible to make sure the INSERT query validates before running the DELETE query?
<cfquery name="delete" datasource="DSOURCE">
DELETE FROM table
WHERE colorid = 12
</cfquery>
<!--- check this query first before running delete --->
<cfquery name="insert" datasource="DSOURCE">
INSERT INTO table (Name, ColorID)
VALUES ("tom", 12)
</cfquery>
You have 2 problems.
Since the attribute is not a primary index, "ON DUPLICATE KEY UPDATE" is not working.
Attribute doesn't have to be PRIMARY KEY. It's sufficient if it's defined as UNIQUE KEY, which you can do without penalties.
And number two: if you want to execute a series of queries in sequence, with all of them succeeding or none taking effect, the term is transaction. Either all succeed or nothing happens. Look up MySQL transactions to get a better overview of how to use them.
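Both suggestions sketched against the question's queries (table and column names as posted; the UNIQUE key is the addition):
-- One-time: make colorid a unique key so ON DUPLICATE KEY UPDATE can fire.
ALTER TABLE `table` ADD UNIQUE KEY (colorid);
-- Or keep the delete+insert, but make it atomic: a failed INSERT
-- rolls the DELETE back instead of leaving the data gone.
START TRANSACTION;
DELETE FROM `table` WHERE colorid = 12;
INSERT INTO `table` (Name, ColorID) VALUES ('tom', 12);
COMMIT;  -- issue ROLLBACK instead if the INSERT fails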
Since you use WHERE colorid = 12 as your delete criterion, colorid must be a unique key. This gives you two ways of approaching this with a single query:
UPDATE table SET Name = "tom"
WHERE colorid = 12
or
REPLACE INTO table (Name, ColorID)
VALUES ("tom", 12)