I have a fairly large table with about 250k rows. It has an auto incremented ID column that is really sort of useless. I can't just get rid of the column without rewriting too much of the app, but the ID is never used as a foreign key or anything else (except simply as an identifier when you want to delete a row, I guess).
The majority of the data gets deleted and rewritten at least a few times a day (don't ask! it's not important, though I realize it's poor design!), though the total count of the rows stays fairly uniform. What this means is that each day the auto-increment value increases by a quarter million or so.
My question is this: in several years' time, the ID column will get too large for the INT value. Is there a way to "reset" the ID, like an OPTIMIZE or something, or should I just plan on doing a SELECT INTO a temp table and truncating the original table, resetting the ID to 0?
Thanks
If the id is a signed INT you can have 2^31 - 1 (2,147,483,647) values; if it is an unsigned INT that doubles to 4,294,967,295, so 250,000 a day is nothing to worry about. If you want more, use an unsigned BIGINT (18,446,744,073,709,551,615). :P
To reset the auto-increment position:
ALTER TABLE `table` AUTO_INCREMENT = 1;
Either change the datatype of ID to BIGINT and adjust your program accordingly, or if you're clearing everything out when you delete data you can use TRUNCATE TABLE TABLENAME which will reset the sequence.
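A minimal sketch of both options, assuming a hypothetical table named mytable whose AUTO_INCREMENT primary key is called id:
-- Option 1: widen the column (and adjust the application accordingly)
ALTER TABLE mytable MODIFY id BIGINT NOT NULL AUTO_INCREMENT;
-- Option 2: if the table really is emptied completely, TRUNCATE also resets the counter
TRUNCATE TABLE mytable;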
Easiest and fastest :) Just drop the ID column, set AUTO_INCREMENT=1, and add it back :)
ALTER TABLE yourtable DROP id_field;
ALTER TABLE yourtable AUTO_INCREMENT=1;
ALTER TABLE yourtable ADD id_field INT NOT NULL AUTO_INCREMENT FIRST, ADD PRIMARY KEY (id_field);
Related
I have a table "data" which holds around 100,000,000 records.
I have added a new column to it "batch_id" (Integer).
On the application layer, I'm updating the batch_id in batches of 10,000 records for each of the 100,000,000 records (the batch_id is always the same for 10k).
I'm doing something like this (application layer pseudo code):
loop {
    $batch_id = $batch_id + 1;
    mysql.query("UPDATE data SET batch_id='$batch_id' WHERE batch_id IS NULL LIMIT 10000");
}
I have an index on the batch_id column.
In the beginning, this update statement took ~30 seconds. I'm now halfway through the Table and it's getting slower and slower. At the moment the same statement takes around 10 minutes(!). It reached a point where this is no longer feasible as it would take over a month to update the whole table at the current speed.
What could I do to speed it up, and why is MySQL getting slower towards the end of the table?
Could an index on the primary key help?
Is the primary key automatically indexed in MySQL? The answer is yes.
So instead, an index on batch_id will help.
The problem is that without an index the engine does a full table scan. At first it is easy to find 10k rows with NULL values, but as more and more records are updated the engine has to scan much further to find the remaining NULLs.
But it would be easier to create batch_id as an auto-numeric column.
OTHER OPTION: Create a new table, then add the index and replace the old table.
CREATE TABLE newTable AS
SELECT IF(@newID := @newID + 1,
          @newID DIV 10000,
          @newID DIV 10000) AS batch_id,
       <other fields>
FROM YourTable
CROSS JOIN (SELECT @newID := 0) AS v;
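Continuing that sketch, adding the index and swapping the tables could look like this (the table names follow the example above; the index name is made up):
ALTER TABLE newTable ADD INDEX idx_batch_id (batch_id);
RENAME TABLE YourTable TO YourTable_old, newTable TO YourTable;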
Insert auto increment primary key to existing table
Do you have a monotonically increasing id in the table? And all rows for a "batch" have 'consecutive' ids? Then don't add batch_id to the table, instead, create another table Batches with one row per batch: (batch_id (PK), id_start, id_end, start_time, end_time, etc).
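A sketch of such a Batches table; the column types here are assumptions:
CREATE TABLE Batches (
  batch_id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  id_start   BIGINT UNSIGNED NOT NULL,
  id_end     BIGINT UNSIGNED NOT NULL,
  start_time DATETIME,
  end_time   DATETIME
);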
If you stick to exact chunks of 10K, then don't even materialize batch_id. Instead, compute it from id DIV 10000 whenever you need it.
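Purely as an illustration of computing it on the fly:
SELECT id, id DIV 10000 AS batch_id FROM data;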
If you want to discuss this further, please provide SHOW CREATE TABLE for the existing table, and explain what you will be doing with the "batches".
To answer your question about "slow near the end": It is having to scan farther and farther in the table to find the NULLs. You would be better to walk through the table once, fiddling with each 10K chunk as you go. Do this using the PRIMARY KEY, whatever it is. (That is, even if it is not AUTO_INCREMENT.) More Details.
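A minimal sketch of that idea, assuming an integer primary key named id; the application would advance @start by 10000 and @batch by 1 on each pass:
SET @start = 1, @batch = 1;
UPDATE data
SET batch_id = @batch
WHERE id >= @start AND id < @start + 10000;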
I have a table with a bunch of rows whose ID's are numbers with value less than 20,000,000. The table structure looks like this:
CREATE TABLE records(
id int(11) not null AUTO_INCREMENT,
... more data columns ...
) ENGINE=InnoDB AUTO_INCREMENT=16432352 DEFAULT CHARSET=utf8;
A system that is out of my control inserts rows in this table and the database inserts those records with a generated ID.
But I need to insert records in this table with very big IDs (starting at 50,000,000). Also, it's important to note that the uncontrolled system inserts few records, such that the records I'm going to insert will never collide with the records of the uncontrolled system.
Making some tests, I realized that when I insert a record with a very big ID, the AUTO_INCREMENT value jumps to that very big ID. For example:
First, I check the initial_auto_increment value:
SHOW TABLE STATUS FROM `my-database` LIKE 'records';
... the auto_increment value looks like this:
# Name, ... , Auto_increment, ...
'record', ... , '16432352', ...
Next, I insert the record with a very big ID.
INSERT INTO records (id, ...) VALUES(679456755, ...);
Then, checking again the auto_increment value:
SHOW TABLE STATUS FROM `my-database` LIKE 'records';
... the final result looks like this:
# Name, ... , Auto_increment, ...
'record', ... , '679456756', ...
My question is: how can I temporarily disable the AUTO_INCREMENT feature in such a way that my records with very big IDs don't mess with the AUTO_INCREMENT value of the table?
PS. I'm using MariaDB 10.
Edit: I changed the numbers, but the question is the same.
MySQL and MariaDB actually enforce the restriction AUTO_INCREMENT > MAX(id)
See ALTER TABLE Syntax
You cannot reset the counter to a value less than or equal to the value that is currently in use. For both InnoDB and MyISAM, if the value is less than or equal to the maximum value currently in the AUTO_INCREMENT column, the value is reset to the current maximum AUTO_INCREMENT column value plus one.
You can use ALTER TABLE to set the AUTO_INCREMENT to any value higher than MAX(id) if you would like to store higher values, however you cannot set it to a lower value than one of the rows currently in the table.
If you need to create rows in a "gap", with lower IDs than the AUTO_INCREMENT value, you would need to explicitly specify the id value in your INSERT. But if a process beyond your control is inserting rows and not specifying the IDs then they are always going to obtain IDs higher than everything else currently in the table.
The only thing I can suggest, if you are able to adjust what IDs are used for what, is that you reserve low IDs for your purposes (so use, say, 1 to 10,000 instead of 50,000,000 to 50,009,999), set the AUTO_INCREMENT to 10,001 and then let the outside process use the higher IDs - this would work just fine provided you don't run out of space.
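A minimal sketch of that scheme, assuming it is set up before the higher IDs exist (otherwise the ALTER is silently raised to MAX(id) + 1, as quoted above); payload stands in for your real columns:
ALTER TABLE records AUTO_INCREMENT = 10001;              -- the outside process continues from here
INSERT INTO records (id, payload) VALUES (1, 'example'); -- your rows use the reserved low range explicitly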
For a longer term solution, consider switching to UUIDs - though you would need to modify the process that is outside your control for this.
You can set the AUTO_INCREMENT to any value you please:
ALTER TABLE records AUTO_INCREMENT = ?
Though I'd strongly recommend against burying records at high ID numbers. Usually lower is better, or just mixing them in with regular records. Being obsessive about these things leads to conflict later on when your assumptions end up being mistaken.
I have a table with around 10k rows which I've imported. The ID is a significant column to my application, and it has to be ordered. Currently, I got something like: 1,2,3,4,5....5789,9275,9276.....
It jumped from 5789 to 9275. Is there any way I can reset the auto-increment but also make it apply to the table? Meaning it would give them IDs all over again, from 1 to 10k.
Thanks!
ALTER TABLE <tablename> AUTO_INCREMENT=<new_value>;
Of course you need to fix the high IDs and all references to them manually.
However, why do you care? Does it really matter if there's a hole in the IDs? If yes, you might want to use a separate column that's always set to MAX(col) + 1 instead of an AUTO_INCREMENT column.
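A minimal sketch of that MAX(col) + 1 alternative, with hypothetical table and column names (note it is not safe under concurrent inserts without locking):
INSERT INTO mytable (seq, payload)
SELECT COALESCE(MAX(seq), 0) + 1, 'example' FROM mytable;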
You can certainly reset the auto_increment value to be whatever you want by simply issuing this query:
ALTER TABLE <tbl> AUTO_INCREMENT = <n>;
where tbl is your table name and n is the value to start it at. However, if you have existing IDs in that table already, I believe it will simply set the next inserted item's ID to max(id) + 1 of the ID column.
I've got a mysql table where each row has its own sequence number in a "sequence" column. However, when a row gets deleted, it leaves a gap. So...
1
2
3
4
...becomes...
1
2
4
Is there a neat way to "reset" the sequencing, so it becomes consecutive again in one SQL query?
Incidentally, I'm sure there is a technical term for this process. Anyone?
UPDATED: The "sequence" column is not a primary key. It is only used for determining the order that records are displayed within the app.
If the field is your primary key...
...then, as stated elsewhere on this question, you shouldn't be changing IDs. The IDs are already unique and you neither need nor want to re-use them.
Now, that said...
Otherwise...
It's quite possible that you have a different field (that is, as well as the PK) for some application-defined ordering. As long as this ordering isn't inherent in some other field (e.g. if it's user-defined), then there is nothing wrong with this.
You could recreate the table using a (temporary) auto_increment field and then remove the auto_increment afterwards.
I'd be tempted to UPDATE in ascending order and apply an incrementing variable.
SET @i = 0;
UPDATE `table`
SET `myOrderCol` = @i:=@i+1
ORDER BY `myOrderCol` ASC;
(Query not tested.)
It does seem quite wasteful to do this every time you delete items, but unfortunately with this manual ordering approach there's not a whole lot you can do about that if you want to maintain the integrity of the column.
You could possibly reduce the load, such that after deleting the entry with myOrderCol equal to, say, 5:
SET @i = 4;
UPDATE `table`
SET `myOrderCol` = @i:=@i+1
WHERE `myOrderCol` > 5
ORDER BY `myOrderCol` ASC;
(Query not tested.)
This will "shuffle" all the following values down by one.
I'd say don't bother. Reassigning sequential values is a relatively expensive operation and if the column value is for ordering purpose only there is no good reason to do that. The only concern you might have is if for example your column is UNSIGNED INT and you suspect that in the lifetime of your application you might have more than 4,294,967,296 rows (including deleted rows) and go out of range, even if that is your concern you can do the reassigning as a one time task 10 years later when that happens.
This is a question that I often read here and in other forums. As already written by zerkms, this is a false problem. Moreover, if your table is related to other ones you'll lose the relations.
Just for learning purposes, a simple way is to store your data in a temporary table, truncate the original one (this resets auto_increment) and then repopulate it.
Silly example:
create table seq (
id int not null auto_increment primary key,
col char(1)
) engine = myisam;
insert into seq (col) values ('a'),('b'),('c'),('d');
delete from seq where id = 3;
create temporary table tmp select col from seq order by id;
truncate seq;
insert into seq (col) select * from tmp;
but it's totally useless. ;)
If this is your PK then you shouldn't change it. PKs should be (mostly) unchanging columns. If you were to change them then you would need to change them not only in that table but also in any foreign keys where they exist.
If you do need a sequential sequence then ask yourself why. In a table there is no inherent or guaranteed order (even in the PK, although it may turn out that way because of how most RDBMSs store and retrieve the data). That's why we have the ORDER BY clause in SQL. If you want to be able to generate sequential numbers based on something else (time added into the database, etc.) then consider generating that either in your query or with your front end.
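A minimal sketch of generating the number in the query rather than storing it, with placeholder table and column names, using the same user-variable trick seen elsewhere on this page:
SELECT (@rn := @rn + 1) AS seq, t.*
FROM (SELECT * FROM `mytable` ORDER BY `created_at`) AS t
CROSS JOIN (SELECT @rn := 0) AS init;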
Assuming that this is an ID field, you can do this when you insert:
INSERT INTO yourTable (ID)
SELECT MIN(ID)
FROM yourTable
WHERE ID > 1
As others have mentioned I don't recommend doing this. It will hold a table lock while the next ID is evaluated.
I have a table with many rows, but they are out of order. I'm using the field "id" as the primary key. I also have a "date" field, which is a datetime field.
How could I reindex the table so that the entries are ID'd in chronological order according to the date field?
How about something like a simple query using a variable:
SET @ROW = 0;
UPDATE `tbl_example` SET `id` = @ROW := @ROW+1 ORDER BY `fld_date` ASC;
This will renumber your rows like 1,2,3,4,5...etc. by your date.
The way I would do it is to create a new table with an auto-increment index and just select all of your old table into it, ordering by date. You can then remove your old table.
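A minimal sketch of that approach, reusing the tbl_example and fld_date names from the answer above; payload stands in for the remaining columns:
CREATE TABLE tbl_example_new LIKE tbl_example;
INSERT INTO tbl_example_new (fld_date, payload)   -- id is omitted so it is regenerated in date order
SELECT fld_date, payload FROM tbl_example ORDER BY fld_date;
RENAME TABLE tbl_example TO tbl_example_old, tbl_example_new TO tbl_example;
-- DROP TABLE tbl_example_old;  -- once the copy is verified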
Why do you want the sequence of IDs to correlate with the dates? It sounds like you want to do ORDER BY id and have the rows come back in date order. If you want rows in date order, just use ORDER BY date instead.
Values in an autoincrement ID column should be treated as arbitrary. Relying on your IDs being in date order is a bad idea.
The following SQL snippet should do what you want.
ALTER TABLE test_table ADD COLUMN id2 int unsigned not null;
SET @a:=0;
UPDATE test_table SET id2=@a:=@a+1 ORDER BY `date`;
ALTER TABLE test_table DROP id;
ALTER TABLE test_table CHANGE id2 id int UNSIGNED NOT NULL AUTO_INCREMENT,
ADD PRIMARY KEY (id);
Keep in mind that you can never guarantee the order of an auto-incremented column once you start inserting and removing data, so you shouldn't be relying on any order except that which you specify using ORDER BY in your queries. This is an expensive operation that you are doing, as it requires indexes to be completely re-created, so I wouldn't suggest doing it often.
You can use ALTER TABLE t ORDER BY col;
The allowed syntax of ORDER BY is as in SELECT statements.
I had to do something similar. The best way to do it was the following (you can run it in one SQL query if you want, but bear in mind that this is a slow and very resource-consuming operation):
BE SURE TO MAKE A BACKUP OF YOUR TABLE, INCLUDING STRUCTURE AND DATA BEFORE STARTING THIS QUERY!
ALTER TABLE your_table ADD COLUMN temp_id INT UNSIGNED NOT NULL;
SET @a:=0;
UPDATE your_table SET temp_id=@a:=@a+1 ORDER BY `date` ASC;
ALTER TABLE your_table DROP id;
ALTER TABLE your_table CHANGE temp_id id INT UNSIGNED NOT NULL AUTO_INCREMENT, ADD PRIMARY KEY (id);
ALTER TABLE your_table MODIFY COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT FIRST;
Just don't forget to change "your_table" with the name of your table, and the ORDER BY columns.
Here is what you are doing, step by step:
First you add a new column named "temp_id" (make sure it's not a name you're already using);
Next you set a temporary variable to 0 (or to one less than whatever you want your IDs to start from);
Then you update your table, row by row in the ORDER BY order, setting your new "temp_id" column to the incremented variable (you can do something funky here; for example, if you want your IDs to always be even, you can use @a+2);
Next you drop (remove) your old ID column;
Then you rename the temp_id column to id and define it as an unsigned integer with auto increment, which is the primary key of your table.
Because id is now the newly added temp_id column, it is located at the end of your table structure. The last query moves it back so that it is the first column.
If you are using something like phpMyAdmin this could be achieved by:
going to the table (left side list of db's and tables), then
from the options in the upper bar select 'SQL'. Follow the advice by @Ryun, then go to 'Operations' (from the upper bar),
look for 'TABLE OPTIONS', leave everything except 'AUTO_INCREMENT' unchanged,
set the 'AUTO_INCREMENT' value to 1 and press go at the bottom of the form.
What will this do, in all?
It will set the id column in each row from 1 to {count}.
Then it will reset the auto-increment counter of the table so that your next inserted row gets the row count + 1 (and not the old counter + 1).
@Wyzard made reference to just ordering the rows by date when you retrieve them from the table (and not renumbering), since, indeed, the primary key should be treated as arbitrary (except to any foreign keys, and perhaps the consuming platform, but that is another matter).