I have an invoices table which stores a single record for each invoice, with the id column (int AUTO_INCREMENT) being the primary key, but also the invoice reference number.
Now, unfortunately, I've had to manually migrate some invoices generated on an old system which have a five digit id, instead of the four digit one which the current system uses.
However, even when I reset the AUTO_INCREMENT through PhpMyAdmin (Table Operations) back to the next four digit id, it still inserts a five digit one: the highest id currently in the table plus one.
From searching around, it would seem that I actually need to change the insert_id as well as the AUTO_INCREMENT? I've tried to execute ALTER TABLE invoices SET insert_id=8125 as well as ALTER TABLE invoices insert_id=8125 but neither of these commands seems to be valid.
Can anyone explain the correct way to reset the AUTO_INCREMENT so that it inserts records with ids from 8125 onwards and then, when it gets to 10962, skips over the four records I've manually added and continues with sequential ids from 10966 onwards? If it won't skip over 10962 - 10965, this doesn't really matter, as the company doesn't generate that many invoices each year, so this would only occur in a later year and hopefully not cause a problem.
I would really appreciate any help with this sticky situation I've found myself in! Many Thanks
First thing I'll suggest is to ditch PHPMyAdmin because it's one of the worst "applications" ever made to be used to work with MySQL. Get a proper GUI. My favourite is SQLYog.
Now on to the problem. Never, ever tamper with the primary key, don't try to "reset" it as you said or to update columns that have an integer generated by the database. As for why, the topic is broad and can be discussed in another question, just never, ever touch the primary key once you've set it up.
Second thing is that someone was deleting invoice records, hence the auto-increment is now at 10k+ rather than at 8k+. It's not a bad thing, but if you need sequential values for your invoices (such that there can't be a gap between invoices 1 and 5) then use an extra field called sequence_id or invoice_ref and use triggers to calculate that number. Don't rely on the auto_increment feature to reuse numbers that have been lost through DELETE operations.
Alternatively, what you can do is export the database you've been using, find the CREATE TABLE definition for the invoices table, and find the line where it says "AUTO_INCREMENT = [some number]" and delete that statement. Import into your new database and the auto_increment will continue from the latest invoice. You could do the same by using ALTER TABLE however it's safer to re-import.
Quick question:
I have a sports league database with a list of games (let's say 40 or so). Each game is auto-assigned an ID number as the primary key when importing the entire schedule from a spreadsheet. The games are then displayed on the web page in descending order thanks to this invisible (to the user) primary key. Here's an example: League Schedule
Works great. The only problem is that sometimes the games are rescheduled and moved to a later date or a new game is added and has to be inserted into an already existing schedule. To this point, I've had to manually edit each affected row's ID (using PhpMyAdmin) to account for the changes and this can be quite tedious and time consuming.
What I'd really like to do is set the table to readjust primary key values on the fly. Meaning, if I inserted a brand new game into the fifth row of the table, all games thereafter would automatically be readjusted (ID 5 would become 6, ID 6 would become 7, and so on).
Is there a way to set-up the table to do this, or a particular SQL command I can use to accomplish it just the same? Apologies if this has already been asked many times in different ways. Any and all feedback is appreciated.
You should not use your PRIMARY KEY for that. Add a special column like sort with a regular INDEX, not UNIQUE. It does not have to be INT either; you can use real numbers. This way you will always be able to insert a new row between any two rows of your schedule.
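As a rough sketch of the sort column idea, assuming a hypothetical games table and using Python's built-in sqlite3 module as a stand-in for MySQL (the SQL itself is plain enough to carry over), a new game can be slotted between two existing ones by giving it the midpoint of its neighbours' sort values. The helper name insert_between is made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; table, column and helper names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (id INTEGER PRIMARY KEY, name TEXT, sort REAL)")
conn.execute("CREATE INDEX idx_games_sort ON games (sort)")  # regular, non-unique index
conn.executemany(
    "INSERT INTO games (name, sort) VALUES (?, ?)",
    [("Game A", 1.0), ("Game B", 2.0), ("Game C", 3.0)],
)


def insert_between(conn, name, before_sort, after_sort):
    """Insert a row whose sort value is the midpoint of its two neighbours."""
    midpoint = (before_sort + after_sort) / 2
    conn.execute("INSERT INTO games (name, sort) VALUES (?, ?)", (name, midpoint))


# Slot a rescheduled game between Game A (1.0) and Game B (2.0).
insert_between(conn, "Rescheduled game", 1.0, 2.0)

# Display order comes from the sort column; primary keys are never touched.
schedule = [row[0] for row in conn.execute("SELECT name FROM games ORDER BY sort")]
print(schedule)
```

The existing rows keep their ids, so anything that already refers to a game by id still points at the same game. Repeated midpoint inserts in the same spot will eventually run out of floating point precision, at which point the sort values would need to be respaced.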
No, auto-increment is required to be unique, but it is not required to be in any particular order or even contiguous. The fact that auto-increment is monotonically increasing is only by coincidence of its implementation. Don't rely on the values being in chronological order.
Trying to adjust the values is not only manual and awkward, but it risks race conditions, or else would require locking a lot of rows. What if you insert a row with id 5, but your table has 1 billion rows greater than id 5?
There's also a risk in renumbering primary key columns, because any user who got an email telling them that they need to go to game 42 may end up going to the wrong game.
If you need to view the rows in a particular order (e.g. chronological), then use a DATE column for that, not an auto-increment column.
I have a table in MySQL using InnoDB and a column is there with the name "id".
So my problem is that whenever I delete the last row from the table and then insert a new value, the new value gets inserted after the deleted id.
I mean, suppose my last id is 32 and I delete it; if I then insert a new row, the id column auto-increments to 33. So the serial format is broken, i.e. id = 30, 31, 33 and no 32.
So please help me out to assign the id 32 instead of 33 whenever I insert after deleting the last row.
Short answer: No.
Why?
It's unnecessary work. It doesn't matter if there are gaps in the serial number.
If you don't want that, don't use auto_increment.
Don't worry, you won't run out of numbers if your column is of type int or even bigint, I promise.
There are reasons why MySQL doesn't automatically decrease the auto-increment value when you delete a row. Those reasons are:
- danger of broken data integrity (imagine multiple users performing deletes or inserts: duplicate entries may occur, or worse)
- errors may occur when you use master-slave replication or transactions
- and so on ...
I highly recommend you don't waste time on this! It's really, really error prone.
You have two major misunderstandings about how a relational database works:
1) There is no such thing as the "last row" in a relational database.
2) The ID (assuming that is your primary key) has no meaning whatsoever. It doesn't matter if the new row is assigned the 33, 35354 or 236532652632. It's just a value to uniquely identify that row.
Do not rely on consecutive values in your primary key column.
And do not try the max(id)+1 approach. It will simply not work in a system with more than one transaction.
You should stop fighting this. Even using SELECT max(id) will not fix this properly when using a transactional database engine like InnoDB.
Why, you might ask? Imagine that you have 2 transactions, A and B, that started almost at the same time, both doing an INSERT. First, transaction A needs a new row id, and it takes it from the invisible sequence associated with this table (known as the AUTO_INCREMENT value), say 21. Transaction B then takes the next successive value (say 22). So far so good.
But, what if transaction A rolls back? Value 21 cannot be reused, and 22 is already committed. And what if there were 10 such transactions?
And max(id) can assign the same value to both A and B, so that is not valid either.
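The max(id) collision can be reproduced in a few lines. This is only a simulation under stated assumptions: a single SQLite connection (Python's sqlite3 module, standing in for MySQL) plays both transactions, and the interleaving is forced by reading twice before writing. The table t and helper next_id are hypothetical names.

```python
import sqlite3

# One in-memory SQLite connection plays both "transactions"; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, note TEXT)")
conn.executemany("INSERT INTO t (id, note) VALUES (?, ?)", [(30, "x"), (31, "y")])


def next_id(conn):
    """The max(id) + 1 approach the answers warn against."""
    return conn.execute("SELECT COALESCE(MAX(id), 0) + 1 FROM t").fetchone()[0]


# A and B both read the current maximum before either has written anything.
id_a = next_id(conn)
id_b = next_id(conn)

conn.execute("INSERT INTO t (id, note) VALUES (?, ?)", (id_a, "from A"))
try:
    conn.execute("INSERT INTO t (id, note) VALUES (?, ?)", (id_b, "from B"))
    collided = False
except sqlite3.IntegrityError:
    # B computed the same id as A, so the primary key constraint rejects it.
    collided = True

print(id_a, id_b, collided)
```

Both reads return the same candidate id, so the second insert is rejected as a duplicate key. Avoiding that with max(id) would mean serialising every insert behind a lock, which is the cost the auto-increment counter is designed to avoid.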
I suppose you mean "Whenever I delete the last row from the table", don't you?
Anyway, this is how auto-increment works. It's made to keep data relations correct. If another table uses the id of a record that has been deleted, it's more correct to get an error than to get a different record when querying that id.
Anyway here you can see how to get the first free id in a field.
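One common way to find the first free id, which is not necessarily the method the answer pointed to, is a self-join that looks for the lowest id with no successor. The sketch below assumes a hypothetical table t and runs the query through Python's sqlite3 module as a stand-in for MySQL; the SELECT itself is standard SQL.

```python
import sqlite3

# SQLite stands in for MySQL; the table name t is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t (id) VALUES (?)", [(30,), (31,), (33,)])

# Find the smallest id whose successor (id + 1) does not exist, then add one.
first_free = conn.execute(
    """
    SELECT MIN(a.id) + 1
    FROM t AS a
    LEFT JOIN t AS b ON b.id = a.id + 1
    WHERE b.id IS NULL
    """
).fetchone()[0]

print(first_free)
```

With ids 30, 31 and 33 this yields 32. It only sees gaps above the smallest existing id, and on a table without gaps it returns max(id) + 1, so it inherits the same concurrency problem as the max(id) approach if two sessions run it at once.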
I know full well this should never happen. Ever. However, I started working at a company recently that hasn't had the greatest database design or input validation and this situation has come up.
There is a table which we'll call 'jobs'*. Jobs has a primary key, 'ID'. The job with the ID of 1 has loads of data associated with it; however, stupidly someone has duplicated that job as id 2 (this has happened around 500 times so far). All of the information for both needs to be merged as id 1 (or 2, it doesn't matter).
The columns ARE linked by Foreign Key with UPDATE: CASCADE and DELETE: RESTRICT. They are not all called jobs_id.
Is my only (seemingly sensible) option here to:
1) Change id 1 to something I can guarantee is not used (2,147,483,647)
2) Temporarily remove the Foreign Key DELETE: RESTRICT
3) Delete the entry with id 1
4) Update id 2 to 2,147,483,647 (to link it with all the other entries)
5) Change id 2,147,483,647 to id 2
6) Reinstate DELETE: RESTRICT
As none of the code actually performs a delete (the restriction is there just as a fail-safe (someone editing direct in DB)), and the update: cascade is left in, data shouldn't get out of sync. This does seem messy though.
This will be wrapped in a transaction.
I could write something to iterate through each table (~180) and each column to find certain names / conditions, then update from 1 to 2, but that would need maintenance when a new table / column came along.
As this has happened a lot, and I don't see a re-write to prevent it happening any time soon, the 'solution' (sticking plaster) needs to be semi-automatic.
* Not the table's real name. His (or her) identity has been disguised so he (or she) doesn't get bullied.
Appreciate any input.
Assuming that you know how to identify the duplicated records, why not create a new table with the same structure (maybe without the FKs), then loop through the original while copying values to the new table? When you hit a duplicate, fix the value when writing to the new table. Then drop the original and rename the temp table to the original name.
This will clean up the table but if processes are still making the duplicated entries you could use a unique key to limit the damage going forward.
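The other route raised in the question, updating every referencing column, does not have to be maintained by hand if the list of columns is read from the database's own foreign key metadata. The sketch below is an illustration under assumptions, not a drop-in fix: it uses Python's sqlite3 module and SQLite's PRAGMA foreign_key_list, whereas on MySQL the same list would come from information_schema.KEY_COLUMN_USAGE. The notes and hours tables, their columns, and the helper names are invented for the example.

```python
import sqlite3

# SQLite stands in for MySQL; all table, column and helper names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE notes (id INTEGER PRIMARY KEY,
                    jobs_id INTEGER REFERENCES jobs(id) ON UPDATE CASCADE ON DELETE RESTRICT,
                    body TEXT);
CREATE TABLE hours (id INTEGER PRIMARY KEY,
                    job_ref INTEGER REFERENCES jobs(id) ON UPDATE CASCADE ON DELETE RESTRICT,
                    qty REAL);
INSERT INTO jobs VALUES (1, 'Fit kitchen'), (2, 'Fit kitchen');
INSERT INTO notes VALUES (1, 1, 'quote sent'), (2, 2, 'deposit paid');
INSERT INTO hours VALUES (1, 2, 3.5);
""")


def referencing_columns(conn, parent):
    """Return (table, column) pairs whose foreign key points at `parent`."""
    tables = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
    refs = []
    for table in tables:
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            # Row layout: id, seq, referenced table, from column, to column, ...
            if fk[2] == parent:
                refs.append((table, fk[3]))
    return refs


def merge_job(conn, keep_id, duplicate_id):
    """Repoint every child row at keep_id, then delete the duplicate parent."""
    with conn:  # commits on success, rolls back if any statement fails
        for table, column in referencing_columns(conn, "jobs"):
            conn.execute(
                f'UPDATE "{table}" SET "{column}" = ? WHERE "{column}" = ?',
                (keep_id, duplicate_id),
            )
        conn.execute("DELETE FROM jobs WHERE id = ?", (duplicate_id,))


merge_job(conn, keep_id=1, duplicate_id=2)
print(conn.execute("SELECT id FROM jobs").fetchall())
```

Because the children are repointed before the parent is deleted, DELETE: RESTRICT can stay in place and no placeholder id is needed. The sketch only moves child rows; reconciling differing values on the two jobs rows themselves would still need its own rules.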
I am considering designing a relational DB schema for a DB that never actually deletes anything (sets a deleted flag or something).
1) What metadata columns are typically used to accommodate such an architecture? Obviously a boolean flag for IsDeleted can be set. Or maybe just a timestamp in a Deleted column works better, or possibly both. I'm not sure which method will cause me more problems in the long run.
2) How are updates typically handled in such architectures? If you mark the old value as deleted and insert a new one, you will run into PK unique constraint issues (e.g. if you have PK column id, then the new row must have the same id as the one you just marked as invalid, or else all of your foreign keys in other tables for that id will be rendered useless).
If your goal is auditing, I'd create a shadow table for each table you have. Add some triggers that get fired on update and delete and insert a copy of the row into the shadow table.
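A minimal sketch of the shadow table idea, assuming a hypothetical answers table and using Python's sqlite3 module in place of MySQL (MySQL's CREATE TRIGGER syntax differs slightly, e.g. it needs FOR EACH ROW). The action and changed_at columns are illustrative choices, not something the answer prescribes.

```python
import sqlite3

# SQLite stands in for MySQL; table, column and trigger names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE answers (id INTEGER PRIMARY KEY, body TEXT);

-- Shadow table: same columns without constraints, plus audit metadata.
CREATE TABLE answers_history (
    id INTEGER,
    body TEXT,
    action TEXT,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);

-- Copy the old version of the row whenever it is updated or deleted.
CREATE TRIGGER answers_after_update AFTER UPDATE ON answers
BEGIN
    INSERT INTO answers_history (id, body, action) VALUES (OLD.id, OLD.body, 'update');
END;

CREATE TRIGGER answers_after_delete AFTER DELETE ON answers
BEGIN
    INSERT INTO answers_history (id, body, action) VALUES (OLD.id, OLD.body, 'delete');
END;
""")

conn.execute("INSERT INTO answers (id, body) VALUES (1, 'first draft')")
conn.execute("UPDATE answers SET body = 'second draft' WHERE id = 1")
conn.execute("DELETE FROM answers WHERE id = 1")

history = conn.execute(
    "SELECT id, body, action FROM answers_history ORDER BY rowid"
).fetchall()
print(history)
```

The live table stays small and keeps its normal primary key, while the history table accumulates one row per change, so the application's ordinary queries need no IsDeleted filter.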
Here are some additional questions that you'll also want to consider:
- How often do deletes occur? What's your performance budget like? This can affect your choices. The answer to your design will be different depending on whether a user is deleting a single row (say, an answer on a Q&A site) or records are being deleted on an hourly basis from a feed.
- How are you going to expose the deleted records in your system? Is it only for administrative purposes, or can any user see deleted records? This makes a difference because you'll probably need to come up with a filtering mechanism depending on the user.
- How will foreign key constraints work? Can one table reference another table where there's a deleted record?
- When you add or alter existing tables, what happens to the deleted records?
Typically, the systems that care a lot about audit use tables as Steve Prentice mentioned. Such a table often has every field from the original table with all the constraints turned off. It will often have an action field to track updates vs deletes, and include a date/timestamp of the change along with the user.
For an example see the PostHistory Table at https://data.stackexchange.com/stackoverflow/query/new
I think what you're looking for here is typically referred to as "knowledge dating".
In this case, your primary key would be your regular key plus the knowledge start date.
Your end date might either be null for a current record or an "end of time" sentinel.
On an update, you'd typically set the end date of the current record to "now" and insert a new record that starts at the same "now" with the new values.
On a "delete", you'd just set the end date to "now".
I've done that.
2.a) A version number solves the unique constraint issue somewhat, although that's really just relaxing the uniqueness, isn't it?
2.b) you can also archive the old versions into another table.
I have a primary key column in my table set to auto-increment. However, when I delete the row that has the highest primary key id (let's say 11), the next insertion gets the key 12, not 11 (though logically it could use 11, as there is no entry associated with the key 11). How can I make this happen?
Are you really sure you want this? An auto-increment column will guarantee a unique number, and that's enough. You could update the next auto-increment value, I guess (I'd have to look up how that works), but I don't think you should want that.
If you need to control the numbers in a column, you should do so manually.
Nevertheless, you can change the auto-increment number like so:
ALTER TABLE tbl AUTO_INCREMENT = 100;
(from: http://dev.mysql.com/doc/refman/5.0/en/example-auto-increment.html )
Another remark: If you have numbers one to ten, and you remove 5, you cannot easily do this. You can hardly make the next auto_increment 5 because 6 is already there.
So again, while you can do something dirty for your example, it's really hard to do this in a real environment. Maybe start a new question with description of your situation, and ask for advice how to approach that problem without the auto_increment tricks :)
MySQL doesn't have that feature out of the box; you'll need to code it in your application. One problem you'll have is that if 2 transactions want to get an id, one of them will get a duplicate id error. Of course, this is better avoided.
All the DB engines lack this "feature", as it is not good for concurrency.