How to add a new column to a huge MySQL database table?

I have a table in a MySQL database with about 10 million records/rows. I want to add a new column to the table, but a simple ALTER query doesn't seem to work well for me.
This is what I have tried,
ALTER TABLE contacts ADD processed INT(11);
I waited for about 5 hours, but nothing happened. Is there any way to insert a new column in such a huge table?
Hope I am clear with my question. Any help would be appreciated.

If it's production:
You should use pt-online-schema-change from the Percona Toolkit.
pt-online-schema-change emulates the way that MySQL alters tables internally, but it works on a copy of the table you wish to alter. This means that the original table is not locked, and clients may continue to read and change data in it.
pt-online-schema-change works by creating an empty copy of the table to alter, modifying it as desired, and then copying rows from the original table into the new table. When the copy is complete, it moves away the original table and replaces it with the new one. By default, it also drops the original table.
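A typical invocation looks something like this (the database name is a placeholder for yours; use --dry-run instead of --execute to preview the changes first):
pt-online-schema-change --alter "ADD COLUMN processed INT" \
D=databasename,t=contacts --execute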
Or use oak-online-alter-table, which is part of the openark kit:
oak-online-alter-table allows for non blocking ALTER TABLE operations, table rebuilds and creating a table's ghost.
Altering the table this way is slower, but these tools don't lock the table.
If it's not production and downtime is okay, try this approach:
CREATE TABLE contacts_tmp LIKE contacts;
ALTER TABLE contacts_tmp ADD COLUMN processed INT UNSIGNED NOT NULL;
-- list the original columns explicitly so they line up with SELECT *
INSERT INTO contacts_tmp (contact_table_fields) SELECT * FROM contacts;
-- rename the original away first, then swap the new table in (one atomic statement)
RENAME TABLE contacts TO contacts_old, contacts_tmp TO contacts;
DROP TABLE contacts_old;

Related

What is the best and safest method to update/change the data type of a column in a MySQL table that has ~5.5 million rows (TINYINT to SMALLINT)?

Similar questions have been asked, but I have had issues in the past by using
ALTER TABLE tablename MODIFY columnname SMALLINT
I had a server crash and had to recover my table when I ran this the last time. Is it safe to use this command when there is that much data in the table? What if there are other queries that may be running on the table in parallel? Should I copy the table and run the query on the new table? Should I copy the column and move the data to the new column?
Please let me know if there are any best or "safest" practices when doing this.
Also, I know this depends on a lot of factors, but does anyone know how long the query should take on an InnoDB table with ~5.5 million rows (rough estimate)? The column in question is a TINYINT and has data in it. I want to upgrade to a SMALLINT to handle larger values.
Thanks!
On a slow disk, and with lots of columns in the table, it could take hours to finish.
The ALTER is "safe" because, traditionally, it does the following:
1. Lock the table.
2. Create a similar table, but with SMALLINT instead of TINYINT.
3. Copy all the rows over to the new table.
4. Rename the tables and drop the old one.
5. Unlock.
Step 3 is the slow part. The only vulnerability is in step 4, which is very fast.
A server crash during steps 1-3 should have left the old table intact, but possibly left behind a partially created tmp table named something like #sql....
Percona's pt-online-schema-change has the advantage of being virtually lockless.
This cannot be easily answered.
It depends on things like
Does the table have its own file, or does it share a tablespace with other tables?
How big is the table in terms of bytes?
etc.
It can take anywhere from a few minutes to, indeed, several hours, and it can involve copying the entire contents of the table, so you may need quite a lot of free disk space.
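To get a rough idea of how much data would be copied (and how much free disk space you need), you can check the table's size and the tablespace setting, for example:
-- approximate on-disk size of the table, in MB ('your_database' is a placeholder)
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024) AS size_mb
FROM information_schema.tables
WHERE table_schema = 'your_database' AND table_name = 'tablename';
-- does each InnoDB table get its own file?
SHOW VARIABLES LIKE 'innodb_file_per_table';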
You can add a new SMALLINT column to the table:
ALTER TABLE tablename ADD columnname_new SMALLINT AFTER columnname;
then copy the data from the old column to the new one:
UPDATE tablename SET columnname_new = columnname WHERE columnname_new IS NULL LIMIT 100000;
Repeat the above until all rows have been copied, then you can drop the old column:
ALTER TABLE tablename DROP COLUMN columnname;
and finally rename the new column:
ALTER TABLE tablename CHANGE columnname_new columnname SMALLINT;
Doing the copy in batches of 100,000 rows keeps each statement short, so long locks are less likely to cause issues.
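If you want to script the repetition, a minimal sketch as a stored procedure might look like this (tablename/columnname are the same placeholders as above; it assumes the old column is NOT NULL, otherwise rows with NULL values would keep the loop from ever terminating):
DELIMITER //
CREATE PROCEDURE copy_column_in_batches()
BEGIN
  REPEAT
    -- copy the next batch of rows that haven't been copied yet
    UPDATE tablename SET columnname_new = columnname
    WHERE columnname_new IS NULL
    LIMIT 100000;
  UNTIL ROW_COUNT() = 0 END REPEAT;
END //
DELIMITER ;
CALL copy_column_in_batches();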
I would add a new column and change the application code to check whether a value exists in the new column, reading it if it does and falling back to the old column otherwise. Writes should always go to the new column. At this point you can migrate the data at will, copying values from the old column into the new column wherever the new column doesn't have one yet.
Once all of the data has been migrated you can drop the old column.

When is the btree created when adding an index to a MySQL 5.5 table?

I have quite a big table in MySQL 5.5, ~200M rows, and I want to add an index to one of the columns in this table (btree type). The column is of type integer and contains a wide distribution of integers.
My question is when is the btree computed?
When I execute the simple create index query:
ALTER TABLE bigtable ADD INDEX (column3);
It returns immediately. Is the computing of the btree happening in the background? I can't imagine that MySQL is that fast at creating a btree of ~200M values with a wide distribution of integers.
Short answer: Yes.
Long Answer: A look at the MySQL documentation for ALTER TABLE reveals the following:
In most cases, ALTER TABLE makes a temporary copy of the original table. MySQL waits for other operations that are modifying the table, then proceeds. It incorporates the alteration into the copy, deletes the original table, and renames the new one. While ALTER TABLE is executing, the original table is readable by other sessions (with the exception noted shortly). Updates and writes to the table that begin after the ALTER TABLE operation begins are stalled until the new table is ready, then are automatically redirected to the new table without any failed updates. The temporary copy of the original table is created in the database directory of the new table. This can differ from the database directory of the original table for ALTER TABLE operations that rename the table to a different database.
So, when you create your index, the index is being created on a temporary copy of the table, which is then imported in place of the now dropped original table when it completes.
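If you want to confirm that the copy is really running, you can watch the session's state while the ALTER executes; during the copy phase it typically shows a state like "copy to tmp table":
SHOW PROCESSLIST;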

How do I efficiently change a MySQL table structure on a table with millions of entries?

I have a MySQL database that is up to about 17 GB in size and has 38 million entries. At the moment, I need to both increase the size of one column (varchar 40 to varchar 80) and add more columns.
Many of the fields are indexed, including the one that I need to change. It is part of a unique pair that is necessary for the applications to work. In attempting to just make the change yesterday, the query ran for almost four hours without finishing, at which point I decided to cut the outage short and just bring the service back up.
What is the most efficient way to make changes to something of this size?
Many of these entries are also old; if there is a good way to shard off the old entries while still keeping them available, that might help by bringing the table down to a much more manageable size.
You have some choices.
In any case you should take a backup before you do this stuff.
One possibility is to take your service offline and do it in place, as you have tried. If you do that you should disable key checks and constraints.
ALTER TABLE bigtable DISABLE KEYS;
SET FOREIGN_KEY_CHECKS=0;
ALTER TABLE (whatever);
ALTER TABLE (whatever else);
...
SET FOREIGN_KEY_CHECKS=1;
ALTER TABLE bigtable ENABLE KEYS;
This will allow the ALTER TABLE operation to go faster. It will regenerate the indexes all at once when you do ENABLE KEYS.
Another possibility is to create a new table with the new schema you want, then disable the keys on the new table, then do as @Bader suggested below and insert the contents of the old table.
After your new table is built, re-enable the keys on it, rename the old table to something like "old_bigtable", and then rename the new table to "bigtable".
It's possible that you can keep your service online while you're populating the new table, but rows that change after they have been copied will be missed, so that might work poorly.
A third possibility is to dump your giant table (to a flat file) and then load it into a new table with the new layout. That is pretty much like the second possibility, except that you get a table backup for free. You can make this go pretty fast with SELECT ... INTO OUTFILE and LOAD DATA INFILE. You'll need access to your server machine's file system to do this.
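A minimal sketch of that third approach (the file path and table names are placeholders; the file is written by the server, so the path must be writable by mysqld, and any new columns in the new layout should come after the existing ones so LOAD DATA can fill them with defaults):
SELECT * INTO OUTFILE '/tmp/bigtable.txt' FROM bigtable;
-- create bigtable_new with the new layout, then:
LOAD DATA INFILE '/tmp/bigtable.txt' INTO TABLE bigtable_new;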
In all cases, disable, then re-enable, the constraints and keys to get things to go fast.
Create a new table with the new structure you want, under a different name, for example NewTable.
Then insert data into this new table from the old table using the following query:
INSERT INTO NewTable (field1, field2, etc...) SELECT field1, field2, ... FROM OldTable
After this is done, you can drop the old table and rename the new table to the original name
DROP TABLE `OldTable`;
RENAME TABLE `NewTable` TO `OldTable`;
I have tried this approach on a very large table and it's much much faster than altering the table.
With MySQL 5.1, and again with 5.5, certain ALTER statements were enhanced to modify the structure in place without rewriting the entire table (see http://dev.mysql.com/doc/refman/5.5/en/alter-table.html and search for "in-place"). Availability varies by the type of change and the storage engine in use; the InnoDB Plugin benefits most. For your specific changes, though, the entire table would still be rewritten.
When we encounter these issues, we typically try to leverage replica databases. As long as you are adding and not removing you can run your DDL against the replica first and then schedule a brief outage for promoting the replica to the master role. If you happen to be on RDS this is even one of their suggested uses for their replica instances http://aws.amazon.com/about-aws/whats-new/2012/10/11/amazon-rds-mysql-rr-promotion/.
Some other alternatives include:
Selecting a subset of records into a new table with the desired structure (use INTO OUTFILE to avoid a table lock). Once complete, you can schedule a maintenance window and REPLACE INTO or UPDATE any records that have changed in the origin table since the initial data copy. Once the update is complete, a RENAME TABLE... of both tables wraps the changes up; see the sketch after this list.
Using a tool like Percona's pt-online-schema-change: http://www.percona.com/doc/percona-toolkit/2.1/pt-online-schema-change.html. This tool works with triggers, so if you already have triggers on the tables you want to change, it may not fit your needs.
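A rough sketch of the first alternative's catch-up and swap (the table names and the updated_at change-tracking column are hypothetical; it assumes both tables have compatible column lists and a primary key for REPLACE to match on):
-- re-copy rows that changed since the initial data copy
REPLACE INTO newtable SELECT * FROM bigtable
WHERE updated_at >= '2012-10-11 00:00:00';  -- time of the initial copy
-- atomically swap the tables
RENAME TABLE bigtable TO bigtable_old, newtable TO bigtable;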

Change column name without recreating the MySQL table

Is there a way to rename a column on an InnoDB table without a major alter?
The table is pretty big and I want to avoid major downtime.
Renaming a column (with ALTER TABLE ... CHANGE COLUMN) unfortunately requires MySQL to run a full table copy.
Check out pt-online-schema-change. This helps you to make many types of ALTER changes to a table without locking the whole table for the duration of the ALTER. You can continue to read and write the original table while it's copying the data into the new table. Changes are captured and applied to the new table through triggers.
Example:
pt-online-schema-change h=localhost,D=databasename,t=tablename \
--alter 'CHANGE COLUMN oldname newname NUMERIC(9,2) NOT NULL' --execute
Update: MySQL 5.6 can do some types of ALTER operations without rebuilding the table, and changing the name of a column is one of those supported as an online change. See http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html for an overview of which types of alterations do or don't support this.
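On 5.6 you can also ask the server to fail fast if the change cannot in fact be done online, for example (names from the example above; the column definition must stay identical, otherwise the rename degrades into a full table rebuild):
ALTER TABLE tablename
CHANGE COLUMN oldname newname NUMERIC(9,2) NOT NULL,
ALGORITHM=INPLACE, LOCK=NONE;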
If there aren't any constraints on it, you can alter it without hassle, as far as I know. If there are, you'll have to remove the constraints first, alter the column, and add the constraints back.
Altering a table with many rows can take a long time (though if the columns involved are not indexed, it may be trivial).
If you specifically want to avoid using the ALTER TABLE syntax created specifically for that purpose, you can always create a table with almost the exact same structure (but different name) and copy all the data into it, like so:
CREATE TABLE `your_table2` ...;
-- (using the query from SHOW CREATE TABLE `your_table`,
-- but modified with your new column changes)
LOCK TABLES `your_table` WRITE, `your_table2` WRITE;
INSERT INTO `your_table2` SELECT * FROM `your_table`;
-- RENAME TABLE refuses to run while tables are locked, so unlock first;
-- the rename itself is atomic, but a write could sneak in between the two statements
UNLOCK TABLES;
RENAME TABLE `your_table` TO `your_table_old`, `your_table2` TO `your_table`;
For some ALTER TABLE queries, the above can be quite a bit faster. However, for a simple column name change, it could be trivial. I might try creating an identical table and performing the change on it in order to see how much time you're actually looking at.
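One way to do that measurement (names and the column definition are hypothetical; timing a sample only gives a rough, roughly linear estimate):
CREATE TABLE `your_table_test` LIKE `your_table`;
-- copy a sample of rows to time against
INSERT INTO `your_table_test` SELECT * FROM `your_table` LIMIT 1000000;
ALTER TABLE `your_table_test` CHANGE `oldname` `newname` VARCHAR(80) NOT NULL;
DROP TABLE `your_table_test`;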

create an index without locking the DB

I have a table with 10+ million rows. I need to create an index on a single column, however, the index takes so long to create that I get locks against the table.
It may be important to note that the index is being created as part of a 'rake db:migrate' step... I'm not averse to creating the index manually if that will work.
UPDATE: I suppose I should have mentioned that this a write often table.
The MySQL NDBCLUSTER engine can create indexes online without blocking writes to the table. However, the most widely used engine, InnoDB, does not support this feature (as of 5.5). Another free and open-source database, Postgres, supports CREATE INDEX CONCURRENTLY.
You can prevent the blockage with something like this (pseudo-code; the indexed column name is a placeholder):
CREATE TABLE temp LIKE my_table;
-- point the logger at temp so new writes go there for now
ALTER TABLE my_table ADD INDEX new_index (your_column);
INSERT INTO my_table SELECT * FROM temp;
-- point the logger back at my_table
DROP TABLE temp;
Where "logger" is whatever adds rows/updates to your table in regular use (e.g., a PHP script). This sets up a temporary table to collect writes while the main one is being altered.
Try to make sure that the index is created before the records are inserted. That way, the index will also be filled during the population of the table. Although that will take longer, at least it will be ready to go when the rake task is done.
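A minimal sketch of that idea (the table and column names are the placeholders used above; the id column stands in for the rest of your schema):
CREATE TABLE my_table (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- hypothetical
  your_column INT,
  INDEX (your_column)  -- maintained as rows are inserted, no separate build step
);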