I am new to MySQL but learning fast. I am following a tutorial and all is well until I add a couple of records. I am doing this tutorial in Workbench 6.1.
This is what the tutorial asks me to do:
After creating a very simple table with emp_no, first_name and last_name, where emp_no is the PK, we insert three records:
no fn ln
--------------------
1 Daniel Lamarche
2 Paul Smith
3 Bobz Youruncle
Then the tutorial asks us to UPDATE the third record to:
5 Alan Youruncle
All is well. Then it asks us to confirm that LAST_INSERT_ID() is still equal to 3.
The table now looks like the following:
no fn ln
--------------------
1 Daniel Lamarche
2 Paul Smith
5 Alan Youruncle
Here is where I have a problem that the tutorial does not address, because it stops there. Adventurous as I am, I wonder what will happen if I add three more records. Since LAST_INSERT_ID() = 3, I ask myself whether emp_no will take 3, 4, then 6.
So when I insert three records with:
INSERT INTO employees (first_name, last_name)
VALUES ('Paul', 'Lalo'), ('Claude', 'Baker'), ('Alan', 'Brown');
I get the error: Error Code: 1062. Duplicate entry '5' for key 'PRIMARY'.
Now I understand perfectly well why there is an error. Can anyone help me understand how to deal with this? Is there a way to insert new records and skip the value that collides?
Now I also understand that maybe it is not good practice to do this that way or whatever. But let's pretend that this is a real life situation and not just a fun tutorial for beginners like me.
Just in case someone wants to see it, the tutorial is at: http://www.mysqltutorial.org/mysql-sequence/
Thanks,
Daniel
In most real-life situations you would never update or insert an auto_increment primary key value; you would just update or insert the other column values. It is only there as a pointer to the row.
When the tutorial asks you to UPDATE the third record to:
5 Alan Youruncle
It is only highlighting a point about the behaviour of LAST_INSERT_ID(), but should point out that this is not an UPDATE that you should generally run.
If you want to completely change a row, you would generally do a delete followed by an insert.
If you must, you can change the current auto_increment value on a table to one higher than the current maximum. This only becomes necessary if you have done something unusual, however:
ALTER TABLE employees AUTO_INCREMENT=6;
Basically, when you perform an INSERT, MySQL updates the table's AUTO_INCREMENT counter. When you perform an UPDATE on an auto-increment column, MySQL won't update the table's AUTO_INCREMENT.
Therefore,
INSERT INTO users(id, name) VALUES(20, 'John'); will update the AUTO_INCREMENT to 21, so the next insert will get ID 21.
But if you perform an update, UPDATE users SET id = 40 WHERE id = 20, the AUTO_INCREMENT will still be 21, not 41, and the next insert will still get ID 21. If you keep inserting, eventually you'll hit ID 40 again and it will raise a duplicate primary key error.
Also, FWIW, AUTO_INCREMENT updates are calculated and performed after inserts and not before.
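To see the whole sequence end to end, here is a minimal sketch (the users table is hypothetical, as above):

-- hypothetical table, not from the tutorial
CREATE TABLE users (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(50));

INSERT INTO users (id, name) VALUES (20, 'John');  -- counter becomes 21
UPDATE users SET id = 40 WHERE id = 20;            -- counter stays at 21

-- inserts now resume at 21 and will eventually collide with id 40;
-- the fix is to bump the counter past the highest existing id:
ALTER TABLE users AUTO_INCREMENT = 41;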
I've been asked if I can keep track of changes to the records in a MySQL database, so that when a field has been changed, the old value, the new value, and the date of the change are all available. Is there a feature or common technique to do this?
If so, I was thinking of doing something like this: create a table called changes. It would contain the same fields as the master table, prefixed with old and new, but only for those fields which were actually changed, plus a TIMESTAMP. It would be indexed with an ID. This way, a SELECT report could be run to show the history of each record. Is this a good method? Thanks!
Here's a straightforward way to do this:
First, create a history table for each data table you want to track (example query below). This table will have an entry for each insert, update, and delete query performed on each row in the data table.
The structure of the history table will be the same as the data table it tracks, except for three additional columns: a column to store the operation that occurred (let's call it 'action'), the date and time of the operation, and a column to store a sequence number ('revision'), which increments per operation and is grouped by the primary key column of the data table.
To get this sequencing behaviour, a two-column (composite) index is created on the primary key column and the revision column. Note that you can only do sequencing in this fashion if the engine used by the history table is MyISAM (see 'MyISAM Notes' on this page).
The history table is fairly easy to create. In the ALTER TABLE query below (and in the trigger queries below that), replace 'primary_key_column' with the actual name of that column in your data table.
CREATE TABLE MyDB.data_history LIKE MyDB.data;
ALTER TABLE MyDB.data_history MODIFY COLUMN primary_key_column int(11) NOT NULL,
DROP PRIMARY KEY, ENGINE = MyISAM, ADD action VARCHAR(8) DEFAULT 'insert' FIRST,
ADD revision INT(6) NOT NULL AUTO_INCREMENT AFTER action,
ADD dt_datetime DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP AFTER revision,
ADD PRIMARY KEY (primary_key_column, revision);
And then you create the triggers:
DROP TRIGGER IF EXISTS MyDB.data__ai;
DROP TRIGGER IF EXISTS MyDB.data__au;
DROP TRIGGER IF EXISTS MyDB.data__bd;
CREATE TRIGGER MyDB.data__ai AFTER INSERT ON MyDB.data FOR EACH ROW
INSERT INTO MyDB.data_history SELECT 'insert', NULL, NOW(), d.*
FROM MyDB.data AS d WHERE d.primary_key_column = NEW.primary_key_column;
CREATE TRIGGER MyDB.data__au AFTER UPDATE ON MyDB.data FOR EACH ROW
INSERT INTO MyDB.data_history SELECT 'update', NULL, NOW(), d.*
FROM MyDB.data AS d WHERE d.primary_key_column = NEW.primary_key_column;
CREATE TRIGGER MyDB.data__bd BEFORE DELETE ON MyDB.data FOR EACH ROW
INSERT INTO MyDB.data_history SELECT 'delete', NULL, NOW(), d.*
FROM MyDB.data AS d WHERE d.primary_key_column = OLD.primary_key_column;
And you're done. Now, all the inserts, updates and deletes in MyDB.data will be recorded in MyDB.data_history, giving you a history table like this (minus the contrived 'data columns' column):
ID revision action data columns..
1 1 'insert' .... initial entry for row where ID = 1
1 2 'update' .... changes made to row where ID = 1
2 1 'insert' .... initial entry, ID = 2
3 1 'insert' .... initial entry, ID = 3
1 3 'update' .... more changes made to row where ID = 1
3 2 'update' .... changes made to row where ID = 3
2 2 'delete' .... deletion of row where ID = 2
To display the changes for a given column or columns from update to update, you'll need to join the history table to itself on the primary key and sequence columns. You could create a view for this purpose, for example:
CREATE VIEW data_history_changes AS
SELECT t2.dt_datetime, t2.action, t1.primary_key_column as 'row id',
IF(t1.a_column = t2.a_column, t1.a_column, CONCAT(t1.a_column, " to ", t2.a_column)) as a_column
FROM MyDB.data_history as t1 INNER join MyDB.data_history as t2 on t1.primary_key_column = t2.primary_key_column
WHERE (t1.revision = 1 AND t2.revision = 1) OR t2.revision = t1.revision+1
ORDER BY t1.primary_key_column ASC, t2.revision ASC
Edit:
Oh wow, people like my history table thing from 6 years ago :P
My implementation of it is still humming along, getting bigger and more unwieldy, I would assume. I wrote views and pretty nice UI to look at the history in this database, but I don't think it was ever used much. So it goes.
To address some comments in no particular order:
I did my own implementation in PHP that was a little more involved, and it avoided some of the problems described in the comments (most significantly, having indexes transferred over: if you carry unique indexes over to the history table, things will break. There are solutions for this in the comments). Following this post to the letter could be an adventure, depending on how established your database is.
If the relationship between the primary key and the revision column seems off, it usually means the composite key is borked somehow. On a few rare occasions this happened and I was at a loss as to the cause.
I found this solution to be pretty performant, using triggers as it does. Also, MyISAM is fast at inserts, which is all the triggers do. You can improve this further with smart indexing (or lack of...). Inserting a single row into a MyISAM table with a primary key shouldn't be an operation you need to optimize, really, unless you have significant issues going on elsewhere. In the entire time I was running the MySQL database this history table implementation was on, it was never the cause of any of the (many) performance problems that came up.
If you're getting repeated inserts, check your software layer for INSERT IGNORE-type queries. Hrmm, I can't remember now, but I think there are issues with this scheme and transactions which ultimately fail after running multiple DML actions. Something to be aware of, at least.
It's important that the fields in the history table and the data table match up. Or, rather, that your data table doesn't have MORE columns than the history table. Otherwise, insert/update/delete queries on the data table will fail: the inserts into the history table will reference columns that don't exist there (due to d.* in the trigger queries) and the trigger will fail. It would be awesome if MySQL had something like schema triggers, where you could alter the history table whenever columns are added to the data table. Does MySQL have that now? I do React these days :P
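For example, if a column is added to the data table, a matching column has to be appended to the history table (new_column is a placeholder name; it must go at the end of both tables so the column order behind d.* still lines up):

ALTER TABLE MyDB.data ADD COLUMN new_column VARCHAR(255);
ALTER TABLE MyDB.data_history ADD COLUMN new_column VARCHAR(255);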
It's subtle.
If the business requirement is "I want to audit the changes to the data - who did what and when?", you can usually use audit tables (as per the trigger example Keethanjan posted). I'm not a huge fan of triggers, but it has the great benefit of being relatively painless to implement - your existing code doesn't need to know about the triggers and audit stuff.
If the business requirement is "show me what the state of the data was on a given date in the past", it means that the aspect of change over time has entered your solution. Whilst you can, just about, reconstruct the state of the database just by looking at audit tables, it's hard and error prone, and for any complicated database logic, it becomes unwieldy. For instance, if the business wants to know "find the addresses of the letters we should have sent to customers who had outstanding, unpaid invoices on the first day of the month", you likely have to trawl half a dozen audit tables.
Instead, you can bake the concept of change over time into your schema design (this is the second option Keethanjan suggests). This is a change to your application, definitely at the business logic and persistence level, so it's not trivial.
For example, if you have a table like this:
CUSTOMER
---------
CUSTOMER_ID PK
CUSTOMER_NAME
CUSTOMER_ADDRESS
and you wanted to keep track over time, you would amend it as follows:
CUSTOMER
------------
CUSTOMER_ID PK
CUSTOMER_VALID_FROM PK
CUSTOMER_VALID_UNTIL PK
CUSTOMER_STATUS
CUSTOMER_USER
CUSTOMER_NAME
CUSTOMER_ADDRESS
Every time you want to change a customer record, instead of updating the record, you set the VALID_UNTIL on the current record to NOW(), and insert a new record with a VALID_FROM of now and a null VALID_UNTIL. You set CUSTOMER_USER to the login ID of the current user (if you need to keep that). If the customer needs to be deleted, you use the CUSTOMER_STATUS flag to indicate this; you never delete records from this table.
That way, you can always find what the status of the customer table was for a given date - what was the address? Have they changed name? By joining to other tables with similar valid_from and valid_until dates, you can reconstruct the entire picture historically. To find the current status, you search for records with a null VALID_UNTIL date.
It's unwieldy (strictly speaking, you don't need the valid_from, but it makes the queries a little easier). It complicates your design and your database access. But it makes reconstructing the world a lot easier.
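As a sketch of the pattern (the values and the user 'jsmith' are invented for illustration; note that if VALID_UNTIL is part of the primary key it cannot be NULL, so some designs use a far-future date such as '9999-12-31' instead):

-- close the current record
UPDATE customer
SET customer_valid_until = NOW()
WHERE customer_id = 42 AND customer_valid_until IS NULL;

-- insert the new version
INSERT INTO customer (customer_id, customer_valid_from, customer_valid_until,
                      customer_status, customer_user, customer_name, customer_address)
VALUES (42, NOW(), NULL, 'active', 'jsmith', 'Jane Doe', '12 New Street');

-- state of customer 42 as of a given date
SELECT * FROM customer
WHERE customer_id = 42
  AND customer_valid_from <= '2015-06-01'
  AND (customer_valid_until IS NULL OR customer_valid_until > '2015-06-01');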
You could create triggers to solve this. Here is a tutorial to do so (archived link).
Setting constraints and rules in the database is better than writing
special code to handle the same task since it will prevent another
developer from writing a different query that bypasses all of the
special code and could leave your database with poor data integrity.
For a long time I was copying info to another table using a script
since MySQL didn’t support triggers at the time. I have now found this
trigger to be more effective at keeping track of everything.
This trigger will copy an old value to a history table if it is changed
when someone edits a row. Editor ID and last mod are stored in the
original table every time someone edits that row; the time corresponds
to when it was changed to its current form.
DELIMITER $$

DROP TRIGGER IF EXISTS history_trigger $$

CREATE TRIGGER history_trigger
BEFORE UPDATE ON clients
FOR EACH ROW
BEGIN
    -- Record the old first_name when it changes, as described above.
    -- Note: != never matches when either side is NULL; use NOT (a <=> b) for nullable columns.
    IF OLD.first_name != NEW.first_name THEN
        INSERT INTO history_clients (client_id, col, value, user_id, edit_time)
        VALUES (NEW.client_id, 'first_name', OLD.first_name, NEW.editor_id, NEW.last_mod);
    END IF;

    -- Record the old last_name when it changes
    IF OLD.last_name != NEW.last_name THEN
        INSERT INTO history_clients (client_id, col, value, user_id, edit_time)
        VALUES (NEW.client_id, 'last_name', OLD.last_name, NEW.editor_id, NEW.last_mod);
    END IF;
END $$

DELIMITER ;
Another solution would be to keep a Revision field and update it on save. You could decide that the max is the newest revision, or that 0 is the most recent row. That's up to you.
Here is how we solved it.
A Users table looked like this:
Users
-------------------------------------------------
id | name | address | phone | email | created_on | updated_on
Then the business requirement changed and we needed to check all previous addresses and phone numbers a user ever had.
new schema looks like this
Users (the data that won't change over time)
-------------
id | name
UserData (the data that can change over time and needs to be tracked)
-------------------------------------------------
id | id_user | revision | city | address | phone | email | created_on
1 | 1 | 0 | NY | lake st | 9809 | #long | 2015-10-24 10:24:20
2 | 1 | 2 | Tokyo| lake st | 9809 | #long | 2015-10-24 10:24:20
3 | 1 | 3 | Sdny | lake st | 9809 | #long | 2015-10-24 10:24:20
4 | 2 | 0 | Ankr | lake st | 9809 | #long | 2015-10-24 10:24:20
5 | 2 | 1 | Lond | lake st | 9809 | #long | 2015-10-24 10:24:20
To find the current address of any user, we query UserData ordered by revision DESC with LIMIT 1.
To get the addresses a user had during a certain period of time, we can use created_on BETWEEN date1 AND date2.
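In SQL, the two lookups described above come out roughly like this (using the example tables):

-- current address of user 1
SELECT * FROM UserData
WHERE id_user = 1
ORDER BY revision DESC
LIMIT 1;

-- addresses user 1 had during a given period
SELECT * FROM UserData
WHERE id_user = 1
  AND created_on BETWEEN '2015-01-01' AND '2015-12-31';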
MariaDB has supported System Versioning since 10.3; it is the standard SQL feature that does exactly what you want: it stores the history of table records and provides access to it via SELECT queries. MariaDB is an open-development fork of MySQL. You can find more on its System Versioning via this link:
https://mariadb.com/kb/en/library/system-versioned-tables/
Why not simply use the binary log files? If replication is set up on the MySQL server and the binlog format is set to ROW, all the changes can be captured.
A good Python library called noplay can be used for this.
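For example, once the server logs in ROW format, the recorded changes can be inspected from SQL (the log file name will vary per server):

SET GLOBAL binlog_format = 'ROW';  -- requires appropriate privileges
SHOW BINLOG EVENTS IN 'mysql-bin.000001' LIMIT 10;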
Just my 2 cents. I would create a solution which records exactly what changed, very similar to transient's solution.
My ChangesTable would simply be:
DateTime | WhoChanged | TableName | Action | ID | FieldName | OldValue
1) When an entire row is changed in the main table, lots of entries will go into this table, but that is very unlikely, so not a big problem (people usually change only one thing).
2) OldValue (and NewValue, if you want) has to be some sort of epic "anytype", since it could be any data; there might be a way to do this with RAW types, or by just using JSON strings to convert in and out.
Minimum data usage, stores everything you need and can be used for all tables at once. I'm researching this myself right now, but this might end up being the way I go.
For Create and Delete, just the row ID, no fields needed. On delete a flag on the main table (active?) would be good.
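A hypothetical DDL sketch of such a table (the names and types are my own guesses at one workable shape, not a prescription):

CREATE TABLE changes (
    change_time DATETIME NOT NULL,
    who_changed VARCHAR(64) NOT NULL,
    table_name VARCHAR(64) NOT NULL,
    action ENUM('insert', 'update', 'delete') NOT NULL,
    row_id INT NOT NULL,
    field_name VARCHAR(64) NULL,  -- NULL for create/delete, per the note above
    old_value TEXT NULL           -- stringified "anytype"; JSON is another option
);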
The direct way of doing this is to create triggers on the tables: set some conditions or mapping methods, and when an update or delete occurs, it will insert into the 'change' table automatically.
But the biggest problem is when we have lots of columns and lots of tables: we would have to type every column name of every table, which is obviously a waste of time.
To handle this more elegantly, we can create procedures or functions to retrieve the column names.
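For instance, the column names can be pulled from INFORMATION_SCHEMA and used to generate the trigger bodies ('MyDB' and 'data' stand in for your schema and table):

SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'MyDB' AND TABLE_NAME = 'data'
ORDER BY ORDINAL_POSITION;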
We can also simply use a third-party tool to do this. Here is a Java program I wrote:
Mysql Tracker
In MariaDB 10.5+ this is as easy to set up as
CREATE TABLE t (x INT) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME;
Past history can then be queried by doing
SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP '2016-10-09 08:07:06';
There is currently no counterpart for this in MySQL.
See the documentation for more info. If you're on an older version of MariaDB, the documentation has an alternate syntax that has been available since MariaDB 10.3.4.
I've seen a lot of discussion regarding this. I'm just seeking your suggestions. Basically, what I'm using is PHP and MySQL. I have a users table which goes:
users
------------------------------
uid(pk) | username | password
------------------------------
12 | user1 | hashedpw
------------------------------
and another table which stores updates by the user
updates
--------------------------------------------
uid | date | content
--------------------------------------------
12 | 2011-11-17 08:21:01 | updated profile
12 | 2011-11-17 11:42:01 | created group
--------------------------------------------
The user's profile page will show the 5 most recent updates of a user. The questions are:
For the updates table, would it be possible to set uid and date together as a composite primary key, with uid referencing uid from users?
OR would it be better to just create another column in updates which auto-increments and is used as the primary key (while uid is an FK to uid in users)?
Your idea (under 1.) rests on the assumption that a user can never do two "updates" within one second. That is very poor design. You never know what functions you will implement in the future, and chances are that some day one click will lead to two actions, and therefore two lines in this table.
I say "updates" quoted because I see this more as a logging table. And who knows what you may want to log somewhere in the future.
As for unusual primary keys: don't do it, it almost always comes right back in your face and you have to do a lot of work to add a proper autoincremented key afterwards.
It depends on the requirement but a third possibility is that you could make the key (uid, date, content). You could still add a surrogate key as well but in that case you would presumably want to implement both keys - a composite and a surrogate - not just one. Don't make the mistake of thinking you have to make an either/or choice.
Whether it is useful to add the surrogate or not depends on how it's being used; don't add a surrogate unless or until you need it. In any case, I would assume uid to be a foreign key referencing the users table.
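A sketch of that combined design, with a surrogate key plus a supporting composite index (the index name and column sizes are my own illustration, not from the question):

CREATE TABLE updates (
    update_id INT AUTO_INCREMENT PRIMARY KEY,  -- surrogate key
    uid INT NOT NULL,
    date DATETIME NOT NULL,
    content VARCHAR(255) NOT NULL,
    KEY idx_user_date (uid, date),             -- supports the "5 most recent" lookup
    FOREIGN KEY (uid) REFERENCES users (uid)
);

-- the profile page query: 5 most recent updates for a user
SELECT date, content FROM updates WHERE uid = 12 ORDER BY date DESC LIMIT 5;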
Hello, is it possible to reuse a deleted auto-incremented primary key in my database? For example:
I have
Name_ID
1
2
3
4
If I delete primary key 4 and then insert again, the primary key of the row I inserted should be 4.
So with Name_IDs 1 2 3 4 5:
I delete primary key 5 (leaving 1 2 3 4).
When I add a row, the primary key should be 5 again, not 6. Thanks!
Auto generated fields always have gaps in these cases.
What if you have an audit or history table that stores the rows with ID = 4 and ID = 5, and then you delete them again? How do you differentiate the rows?
In your example, you've only deleted the last row. What if you delete ID = 1? Then what?
That is, they are just internal numbers unique to that table (and any associated tables like audit ones): no external meaning should be attached
As with other comments and answers here, I would not recommend this, especially if the data in the auto-increment column is referenced externally. But you can set the next auto-increment number to a specific value via an ALTER TABLE query:
ALTER TABLE T_YourTable AUTO_INCREMENT=4
You could also drop the column and then re-add the column with the same attributes (this could be expensive if you have a lot of rows).
Why?
It's only intended to be a unique identifier.
You'll also get gaps with database clusters and whenever you roll back an insert transaction which overlaps a committed transaction, not just when you delete data.
A mechanism to fill-in-the-gaps would be complex, slow and difficult to maintain - and it's not needed.
I have a table with an auto_increment field, and sometimes rows get deleted, so auto_increment leaves gaps. Is there any way to avoid this, or if not, at the very least, how do I write an SQL query that:
Alters the auto_increment value to be max(current value) + 1
Returns the new auto_increment value?
I know how to write parts 1 and 2, but can I put them in the same query?
If that is not possible:
How do I "select" (return) the auto_increment value or auto_increment value + 1?
Renumbering will cause confusion. Existing reports will refer to record 99, yet if the system renumbers, it may renumber that record to 98; now all reports (and populated UIs) are wrong. Once you allocate a unique ID, it has to stay fixed.
Using ID fields for anything other than simple unique numbering is going to be problematic. Having a requirement for "no gaps" is simply inconsistent with the requirement to be able to delete. Perhaps you could mark records as deleted rather than delete them. Then there are truly no gaps. Say you are producing numbered invoices: you would have a zero value cancelled invoice with that number rather than delete it.
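For example, instead of deleting a cancelled invoice, you might flag it (the column names here are hypothetical):

-- soft delete: invoice number 99 stays allocated, so no gap appears
UPDATE invoices
SET status = 'cancelled', total = 0
WHERE invoice_id = 99;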
There is a way to manually insert the id even in an autoinc table. All you would have to do is identify the missing id.
However, don't do this. It can be very dangerous if your database is relational. It is possible that the deleted id was used elsewhere. When removed, it would not present much of an issue, perhaps it would orphan a record. If replaced, it would present a huge issue because the wrong relation would be present.
Consider that I have a table of cars and a table of people
car
carid
ownerid
name
person
personid
name
And that there is some simple data
car
1 1 Van
2 1 Truck
3 2 Car
4 3 Ferrari
5 4 Pinto
person
1 Mike
2 Joe
3 John
4 Steve
and now I delete person John.
person
1 Mike
2 Joe
4 Steve
If I added a new person, Jim, into the table, and he got an id which filled the gap, then he would end up getting id 3
1 Mike
2 Joe
3 Jim
4 Steve
and by relation, would be the owner of the Ferrari.
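For completeness, the manual insert mentioned at the top is nothing more than an INSERT with an explicit id; a sketch using the example tables:

-- one way to find the first gap after an existing id
SELECT MIN(p1.personid) + 1 AS missing_id
FROM person AS p1
LEFT JOIN person AS p2 ON p2.personid = p1.personid + 1
WHERE p2.personid IS NULL;

-- fill the gap explicitly; this is exactly what makes Jim "inherit" the Ferrari
INSERT INTO person (personid, name) VALUES (3, 'Jim');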
I generally agree with the wise people on this page (and on duplicate questions) advising against reusing auto-incremented IDs. It is good advice, but I don't think it's up to us to decide the rights or wrongs of asking the question; let's assume the developer knows what they want to do and why.
The answer is, as mentioned by Travis J, you can reuse an auto-increment id by including the id column in an insert statement and assigning the specific value you want.
Here is a point to put a spanner in the works: MySQL itself (at least 5.6 InnoDB) will reuse an auto-increment ID in the following circumstance:
delete any number of rows with the highest auto-increment ids
Stop and start MySQL
insert a new row
The inserted row will have an id calculated as max(id)+1, it does not continue from the id that was deleted.
As djna said in his/her answer, it's not good practice to alter database tables in such a way, and there is also no need to if you have chosen the right schema and data types. That said, regarding this part of your question:
I have a table with an auto_increment field and sometimes rows get deleted so auto_increment leaves gaps. Is there any way to avoid this?
If your table has too many gaps in its auto-increment column, probably as a result of many test INSERT queries,
and if you want to prevent inflated id values by removing the gaps,
and also if the id column is just a counter and has no relation to any other column in your database,
then this may be the thing you (or any other person looking for such a thing) are looking for:
SOLUTION
Remove the original id column.
Add it again with AUTO_INCREMENT on (see the sketch below).
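A sketch of that, assuming the column is named id (note this renumbers every row, so only do it when nothing else references the ids):

ALTER TABLE `table_name` DROP COLUMN `id`;
ALTER TABLE `table_name`
  ADD COLUMN `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;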
But if you just want to reset the auto_increment to the first available value:
ALTER TABLE `table_name` AUTO_INCREMENT=1
Not sure if this will help, but in SQL Server you can reseed identity fields, and there's an ALTER TABLE statement in MySQL to achieve the same thing. E.g. to set the id to continue at 59446:
ALTER TABLE table_name AUTO_INCREMENT = 59446;
I'm thinking you should be able to combine a query that gets the largest value of the auto_increment field with an ALTER TABLE to update it as needed.
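ALTER TABLE won't accept a subquery, so combining the two steps takes a prepared statement; a sketch, assuming the column is named id:

SELECT IFNULL(MAX(id) + 1, 1) INTO @next FROM table_name;
SET @sql = CONCAT('ALTER TABLE table_name AUTO_INCREMENT = ', @next);
PREPARE st FROM @sql;
EXECUTE st;
DEALLOCATE PREPARE st;

-- read the current counter back
SELECT AUTO_INCREMENT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'table_name';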