I have a table with 6 million records and an auto-increment ID primary key. Because of various operations over the last several weeks, the lowest ID is now about 2 million. Updates and other queries take a long time, and I'm wondering whether having IDs that range from 2 million to 8 million, rather than from 1 to 6 million, could be responsible. Anecdotally, selects and updates over a range such as ID>1000000 and ID<1001000 seem slower than over ID>1 and ID<1000.
Is it worth removing the existing PK and adding a new one starting at 1? I know I can do
ALTER TABLE tablename AUTO_INCREMENT = 1;
but I cannot do this here, with 6 million existing records that already have auto-increment IDs.
Clearly I could just try it and test, but for various reasons, including the time it would take given the size of the table, its indexes, etc., I'd prefer to ask first in case anyone knows the answer definitively.
Update:
For now I did the following:
CREATE TABLE table_new LIKE table;
To dupe the table with indexes and all
Then:
ALTER TABLE table_new AUTO_INCREMENT = 1;
So the empty duped table re-sets count to 1
Then I inserted from the original table to the duped table:
INSERT INTO table_new (FieldA, FieldB, FieldC)
SELECT FieldA, FieldB, FieldC FROM table;
This inserts all the records without the ID field, so a new auto-increment ID is assigned to each inserted row, starting at 1 as the reset specified. And finally, of course:
RENAME TABLE table TO table_old;
RENAME TABLE table_new TO table;
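The copy-and-rename steps above can be sketched end to end. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL (the SQL syntax differs slightly), and the table name items and its columns are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY AUTOINCREMENT, field_a TEXT, field_b TEXT)"
)
# Simulate a table whose IDs no longer start at 1.
conn.executemany(
    "INSERT INTO items (id, field_a, field_b) VALUES (?, ?, ?)",
    [(2000000 + i, f"a{i}", f"b{i}") for i in range(5)],
)

# Step 1: empty copy of the structure (MySQL: CREATE TABLE items_new LIKE items).
conn.execute(
    "CREATE TABLE items_new (id INTEGER PRIMARY KEY AUTOINCREMENT, field_a TEXT, field_b TEXT)"
)
# Step 2: copy every column except the ID, so fresh IDs are assigned from 1.
conn.execute(
    "INSERT INTO items_new (field_a, field_b) "
    "SELECT field_a, field_b FROM items ORDER BY id"
)
# Step 3: swap the tables (MySQL can do both renames in one RENAME TABLE statement).
conn.execute("ALTER TABLE items RENAME TO items_old")
conn.execute("ALTER TABLE items_new RENAME TO items")
conn.commit()

new_ids = [row[0] for row in conn.execute("SELECT id FROM items ORDER BY id")]
old_count = conn.execute("SELECT COUNT(*) FROM items_old").fetchone()[0]
print(new_ids, old_count)
```

Note that any other table that stores the old ID values would still point at the old numbers after such a swap, so this only fits a table whose IDs are not referenced elsewhere.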
If a MySQL table contains, say, 1 million records, how can I add a column at any position with no expected downtime?
MySQL's ALTER TABLE performance can become very frustrating with very large tables. An ALTER statement typically makes a new temporary table, copies the records from your existing table into it even if the data wouldn't strictly need to be copied, and then replaces the old table with the new one.
Suppose you have a table with one million records. If you add 3 columns with three separate ALTER statements, the table is copied 3 times, which means copying 3 million records.
A faster way of adding columns is to create your own new table, then select all of the rows from the existing table into it. You can create the structure from the existing table, then modify the structure however you’d like, then select in the data. Make sure that you select the information into the new table in the same order as the fields are defined.
1. CREATE TABLE new_table LIKE table;
2. INSERT INTO new_table SELECT * FROM table; (list the columns explicitly once new_table has extra columns)
3. RENAME TABLE table TO old_table, new_table TO table;
If you have foreign key constraints you can handle these foreign keys using
SET FOREIGN_KEY_CHECKS = 0;
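As a rough sketch of the create, copy and rename sequence, the following uses Python's built-in sqlite3 module as a stand-in for MySQL. The people table and the new email column are made-up names for illustration, and the copy lists its columns explicitly because the new table has one more column than the old one.

```python
import sqlite3

# SQLite stands in for MySQL; "people" and "email" are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO people (id, name) VALUES (?, ?)", [(1, "ann"), (2, "bob")])

# New structure with the extra column placed between the existing ones.
conn.execute("CREATE TABLE people_new (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
# Copy by column name; SELECT * would fail because the column counts differ.
conn.execute("INSERT INTO people_new (id, name) SELECT id, name FROM people")
# Swap the tables once the copy has finished.
conn.execute("ALTER TABLE people RENAME TO people_old")
conn.execute("ALTER TABLE people_new RENAME TO people")
conn.commit()

columns = [row[1] for row in conn.execute("PRAGMA table_info(people)")]
rows = conn.execute("SELECT id, email, name FROM people ORDER BY id").fetchall()
print(columns, rows)
```

Rows written to the old table while the copy is running are not carried over, so writes have to be paused or replayed for the swap to be lossless.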
I'm having a problem implementing auto-increment logic in my app. Say I insert a 'group' and MySQL gives it the value 10 for its Id; the next ones would be 11, 12 and so forth.
But once a record (assume it's Id 12) gets deleted, the next new item is 12 again, so there may be a conflict.
Is it possible to make the auto increment never repeat the same integer? I want every Id to be unique: once it's deleted, it should never come back.
InnoDB really does have this "feature" (or bug): the current auto_increment value is NOT stored in the tablespace. As soon as you restart the MySQL server, the auto_increment value is recomputed from the highest value currently in the table, so it can collide with values that were deleted. (This describes MySQL versions before 8.0; from 8.0 on, the counter is persisted across restarts.)
The solution to this is really ugly. You could create a table holding the highest value used so far for each table, in the form
tablename maxvalue
tableA 375
tableB 12
and, if you manage the MySQL server, you could write a post-startup script that restores the auto_increment values from it. After every delete from such a table, an AFTER DELETE trigger would check whether the deleted row held the max value. That is a bit easier with newer versions of MySQL, since table information is stored in INFORMATION_SCHEMA rather than calculated on every select (which means reading INFORMATION_SCHEMA does not fire heavy, blocking queries as often).
You only have to update maxvalue if the deleted row held that max value.
It is a bit easier to update maxvalue on every insert of a row, if that does not slow down the system.
In some cases you have just one table with critical references, and that table has an index, so you can retrieve maxvalue from that table.
All in all that is a big problem with InnoDB, and writing a lot of Triggers just for this single unsaved number auto_increment is really not nice.
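A minimal sketch of the "update maxvalue on every insert" variant follows. It runs on Python's built-in sqlite3 module rather than MySQL, and the names user_groups, id_high_water and insert_group are hypothetical. Instead of a post-startup script, this version reads the stored high-water mark directly when inserting, which is a simplification of what is described above.

```python
import sqlite3

# SQLite stands in for MySQL; all table, trigger and function names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_groups (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE id_high_water (tablename TEXT PRIMARY KEY, maxvalue INTEGER NOT NULL);
INSERT INTO id_high_water (tablename, maxvalue) VALUES ('user_groups', 0);

-- Record the highest ID ever inserted; deletes never lower it.
CREATE TRIGGER user_groups_track_max AFTER INSERT ON user_groups
BEGIN
    UPDATE id_high_water
       SET maxvalue = MAX(maxvalue, NEW.id)
     WHERE tablename = 'user_groups';
END;
""")

def insert_group(name):
    """Insert with an explicit ID one past the high-water mark, so IDs never repeat."""
    (maxvalue,) = conn.execute(
        "SELECT maxvalue FROM id_high_water WHERE tablename = 'user_groups'"
    ).fetchone()
    new_id = maxvalue + 1
    conn.execute("INSERT INTO user_groups (id, name) VALUES (?, ?)", (new_id, name))
    return new_id

first = insert_group("a")
second = insert_group("b")
# Delete the row that currently holds the highest ID.
conn.execute("DELETE FROM user_groups WHERE id = ?", (second,))
third = insert_group("c")
conn.commit()
print(first, second, third)
```

On a real multi-connection server the read and the insert would need to run in one transaction with the counter row locked, otherwise two sessions could pick the same value.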
I think you have not set id as the primary key with auto increment.
I have a table with about 35 million rows. Each row has about 35 integer values and one time value (last updated).
The table has two indexes
primary - uses two integer values from the table columns
Secondary - uses the 1st integer from the primary + another integer value.
I would like to delete old records (about 20 million of them) according to the date field.
What is the fastest way:
1. Delete as is, according to the date field?
2. Create another index on the date field and then delete by date?
There will be one time deletion of large portion of the data and then incremental weekly deletion of much smaller parts.
Is there another way to do it more efficiently?
It might be quicker to create a new table containing the rows you want to keep, drop the old table, and then rename the new table.
For weekly deletions an index on date field would speed things up.
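Both suggestions can be sketched together: rebuild the table from the rows to keep for the one-time purge, then index the date column for the smaller weekly deletes. The example runs on Python's built-in sqlite3 module as a stand-in for MySQL; the readings table, its columns and the cutoff dates are invented for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; table, columns and dates are hypothetical.
conn = sqlite3.connect(":memory:")
schema = "(a INTEGER, b INTEGER, last_updated TEXT, PRIMARY KEY (a, b))"
conn.execute(f"CREATE TABLE readings {schema}")
conn.executemany(
    "INSERT INTO readings (a, b, last_updated) VALUES (?, ?, ?)",
    [(1, 1, "2010-01-05"), (1, 2, "2010-06-20"), (2, 1, "2011-02-01"), (2, 2, "2011-03-15")],
)

# One-time purge: copy only the rows to keep, then swap the tables.
cutoff = "2011-01-01"
conn.execute(f"CREATE TABLE readings_new {schema}")
conn.execute(
    "INSERT INTO readings_new (a, b, last_updated) "
    "SELECT a, b, last_updated FROM readings WHERE last_updated >= ?",
    (cutoff,),
)
conn.execute("DROP TABLE readings")
conn.execute("ALTER TABLE readings_new RENAME TO readings")

# Weekly purge: an index on the date column lets the delete find old rows directly.
conn.execute("CREATE INDEX idx_readings_last_updated ON readings (last_updated)")
weekly_cutoff = "2011-03-01"
deleted = conn.execute(
    "DELETE FROM readings WHERE last_updated < ?", (weekly_cutoff,)
).rowcount
conn.commit()

remaining = conn.execute("SELECT a, b FROM readings ORDER BY a, b").fetchall()
print(deleted, remaining)
```

Creating the date index after the bulk copy, rather than before it, avoids maintaining that index while the kept rows are being inserted.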
The fastest way (but not the easiest), I think, is to keep your records segmented into multiple tables based on date, e.g. one per week, and then have a union table over all those tables for the regular queries across the whole thing (so your queries would be unaltered). Each week you would create a new table and redefine the union table.
When you wish to drop old records, you simply recreate the union table so that it leaves out the old tables, and then drop the tables that were left out (remember to truncate before you drop, depending on your filesystem). This is probably the fastest way to get there with MySQL.
A mess to manage though :)
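To make the weekly-table idea concrete, here is a small sketch using Python's built-in sqlite3 module. It uses a UNION ALL view as the "union table", which is an assumption on my part; in MySQL the MERGE storage engine over MyISAM tables is one way to get a similar effect. The events_ table prefix and the helper names are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; a view plays the role of the union table.
conn = sqlite3.connect(":memory:")

def create_week_table(week):
    """Create one table per week, e.g. events_2011w01 (hypothetical naming)."""
    conn.execute(f"CREATE TABLE events_{week} (id INTEGER, payload TEXT)")

def rebuild_union_view(weeks):
    """Redefine the view so queries against 'events' see only the listed weeks."""
    conn.execute("DROP VIEW IF EXISTS events")
    selects = " UNION ALL ".join(f"SELECT id, payload FROM events_{w}" for w in weeks)
    conn.execute(f"CREATE VIEW events AS {selects}")

weeks = ["2011w01", "2011w02", "2011w03"]
for i, week in enumerate(weeks):
    create_week_table(week)
    conn.execute(f"INSERT INTO events_{week} (id, payload) VALUES (?, ?)", (i, f"row from {week}"))
rebuild_union_view(weeks)
count_before = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

# Retire the oldest week: redefine the view first, then drop the table it no longer uses.
oldest = weeks.pop(0)
rebuild_union_view(weeks)
conn.execute(f"DROP TABLE events_{oldest}")
conn.commit()
count_after = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count_before, count_after)
```

Dropping a whole table removes a week of rows without a row-by-row delete, which is where the speed comes from.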
I have a MyISAM table in MySQL which consists of two fields (f1 integer unsigned, f2 integer unsigned) and contains 320 million rows. I have an index on f2. Every week I insert about 150,000 rows into this table. I would like to know what is the frequency with which I need to run "analyze" and "optimize" on this table (as it would probably take a long time and block in the meantime)? I do not do any deletes or update statements, but just insert new rows every week. Also, I am not using this table in any joins so, based on this information, are "analyze" and "optimize" really required?
Thanks in advance,
Tim
ANALYZE TABLE analyzes and stores the key distribution for a table; OPTIMIZE TABLE reorganizes the table's physical storage.
If you never...ever... delete or update the data in your table, only insert new ones, you won't need analyze or optimize.
I have two tables, each one has a primary ID column as key. I want the two tables to share one increasing key counter.
For example, when the two tables are empty, and counter = 1. When record A is about to be inserted to table 1, its ID will be 1 and the counter will be increased to 2. When record B is about to be inserted to table 2, its ID will be 2 and the counter will be increased to 3. When record C is about to be inserted to table 1 again, its ID will be 3 and so on.
I am using PHP as the outside language. Now I have two options:
1. Keep the counter in the database as a single-row, single-column table. But every time I add things to table 1 or 2, I need to update this counter table.
2. Keep the counter as a global variable in PHP. But then I need to initialize the counter from the maximum key of the two tables when Apache starts, which I have no idea how to do.
Any suggestion for this?
The background is, I want to display a mix of records from the two tables in either ASC or DESC order of the creation time of the records. Furthermore, the records will be displayed in page-style, say, 50 records per page. Records are only added to the database rather than being removed. Following my above implementation, I can just perform a "select ... where key between 1 and 50" from two tables and merge the select datasets together, sort the 50 records according to IDs and display them.
Is there any other idea of implementing this requirement?
Thank you very much
Well, you will gain next to nothing with this setup; if you just keep the datetime of the insert you can easily do
SELECT * FROM
(
SELECT columnA, columnB, inserttime
FROM table1
UNION ALL
SELECT columnA, columnB, inserttime
FROM table2
) AS merged
ORDER BY inserttime
LIMIT 0, 50
And it will perform decently.
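Here is a small runnable sketch of that merged, paged query. It uses Python's built-in sqlite3 module in place of MySQL and PHP, and the sample rows and the fetch_page helper are made up for the example.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (columnA TEXT, columnB TEXT, inserttime TEXT);
CREATE TABLE table2 (columnA TEXT, columnB TEXT, inserttime TEXT);
INSERT INTO table1 VALUES ('t1-first', 'x', '2011-01-01 10:00:00'),
                          ('t1-second', 'x', '2011-01-01 12:00:00');
INSERT INTO table2 VALUES ('t2-first', 'y', '2011-01-01 11:00:00');
""")

def fetch_page(page, page_size=50):
    """Return one page of rows from both tables, ordered by insert time.

    Pages are numbered from 0, so page 0 starts at offset 0.
    """
    offset = page * page_size
    return conn.execute(
        """
        SELECT columnA, columnB, inserttime FROM (
            SELECT columnA, columnB, inserttime FROM table1
            UNION ALL
            SELECT columnA, columnB, inserttime FROM table2
        ) AS merged
        ORDER BY inserttime
        LIMIT ? OFFSET ?
        """,
        (page_size, offset),
    ).fetchall()

first_page = [row[0] for row in fetch_page(0)]
print(first_page)
```

For descending order, the same query with ORDER BY inserttime DESC gives the newest records first.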
Alternatively (if you are chasing the last drop of performance), the fact that you are merging the results can be a sign that the tables themselves should be merged (why have two tables at all if you always merge the results?).
Or model it as a supertype/subtype ("SQL subclass") design: one table maintains the IDs and other common attributes, and the other two reference the common ID sequence through a foreign key.
If you need the creation time, won't it be easier to add a timestamp field to your tables and sort by that field?
I believe using IDs as a proxy for creation time is bad practice.
If you really must do this, there is a way. Create a one-row, one-column table to hold the last-used row number, and set it to zero. On each of your two data tables, create a BEFORE INSERT trigger (MySQL only lets a trigger assign to the NEW row before the insert) that reads that table, increments it, and sets the new row's ID to that value. I can't remember the exact syntax because I haven't created a trigger for years; see here http://dev.mysql.com/doc/refman/5.0/en/triggers.html
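The counter-table idea can also be sketched from application code rather than a trigger. The version below is only an illustration: it uses Python's built-in sqlite3 module instead of PHP and MySQL, and the id_counter table and insert_with_shared_id function are hypothetical names. The counter update and the insert share one transaction, so a failed insert does not consume a number.

```python
import sqlite3

# SQLite stands in for MySQL; id_counter and the helper below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE id_counter (last_id INTEGER NOT NULL);
INSERT INTO id_counter (last_id) VALUES (0);
CREATE TABLE table1 (id INTEGER PRIMARY KEY, body TEXT);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, body TEXT);
""")

def insert_with_shared_id(table, body):
    """Take the next value from the shared counter and insert a row with it."""
    if table not in ("table1", "table2"):
        raise ValueError(f"unexpected table name: {table}")
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("UPDATE id_counter SET last_id = last_id + 1")
        (new_id,) = conn.execute("SELECT last_id FROM id_counter").fetchone()
        conn.execute(f"INSERT INTO {table} (id, body) VALUES (?, ?)", (new_id, body))
    return new_id

ids = [
    insert_with_shared_id("table1", "record A"),
    insert_with_shared_id("table2", "record B"),
    insert_with_shared_id("table1", "record C"),
]
print(ids)
```

Keeping the counter in the database, rather than in a PHP global, means it survives web server restarts and is shared by every process that inserts.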