Table traversing with multiple operations in ALTER TABLE - mysql

Some databases, like MySQL [1] and PostgreSQL [2], support bundling of certain compatible ALTER TABLE statements (as non-standard SQL).
For example we can have:
ALTER TABLE `my_table`
DROP COLUMN `column_1`,
DROP COLUMN `column_2`,
...
or
ALTER TABLE `my_table`
MODIFY `column_1` ... ,
MODIFY `column_2` ... ,
instead of having individual statements:
ALTER TABLE `my_table` DROP COLUMN `column_1`;
ALTER TABLE `my_table` DROP COLUMN `column_2`;
or
ALTER TABLE `my_table` MODIFY `column_1` ... ;
ALTER TABLE `my_table` MODIFY `column_2` ... ;
etc
For comparison of the same feature, PostgreSQL [2], which also implements this, will perform all operations in a single scan:
The main reason for providing the option to specify multiple changes in a single ALTER TABLE is that multiple table scans or rewrites can thereby be combined into a single pass over the table.
Although for DROP COLUMN specifically, it will often not even need to do that:
The DROP COLUMN form does not physically remove the column, but simply makes it invisible to SQL operations...
Questions:
Would the multi-column statement result in traversing all the rows just once and performing all changes needed?
How does MySQL actually perform DROP COLUMN? Does it also "hide" the columns first, or does it delete the data straight away?
Assumptions:
Using InnoDB
No indexes/complex defaults are involved in any of the columns we want to change/drop (so basically changes that would not require a temporary table when run as individual alter statements)
References:
[1] MySQL ALTER TABLE docs
[2] PostgreSQL ALTER TABLE docs

MySQL's InnoDB:
(This does not really answer the questions, but it provides a little more insight into the bigger question of ALTER.)
If any of the alters needs to copy the table over, you are probably better off putting all alters into the same statement. Changing the PRIMARY KEY, for example, requires rebuilding the data that is clustered with the PK.
Some alters can be achieved by simply altering the schema; these are virtually instantaneous, and could be done via separate alter statements. Adding an option to ENUM was implemented long ago.
Some alters need some form of scan, but can do it "in the background". DROP INDEX can be done by quickly "hiding" it, then freeing up the BTree in the background.
I have left out a grey area in which you batch 'simple' alters. One would hope that ALTER is smart enough to simply go through them quickly, rather than deciding to copy the table over.
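As a rough illustration of putting all alters into the same statement, here is a minimal Python sketch that joins several ALTER TABLE clauses into one MySQL statement. The helper name `bundle_alter` is hypothetical, and the table and column names are the placeholders from the question; it only builds the SQL text and does not run it.

```python
def bundle_alter(table, clauses):
    """Combine several ALTER TABLE clauses into one MySQL statement.

    `clauses` are fragments such as "DROP COLUMN `column_1`".
    Hypothetical helper: it only formats text, it does not validate the SQL.
    """
    if not clauses:
        raise ValueError("at least one clause is required")
    body = ",\n  ".join(clauses)
    return f"ALTER TABLE `{table}`\n  {body};"


stmt = bundle_alter(
    "my_table",
    ["DROP COLUMN `column_1`", "DROP COLUMN `column_2`"],
)
print(stmt)
```

If any one of the clauses forces a table rebuild, sending them together this way gives the server the chance to do that rebuild once instead of once per statement.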

I got some useful feedback but decided to respond to my own question to provide a more concrete set of answers.
Would the multi-column statement result in traversing all the rows just once and performing all changes needed?
Yes, if the alter statement results in rebuilding the table then it only needs to do it once.*
* This answer comes from my own testing and other, mostly anecdotal, evidence (including @Uueerdo's in this post). It would be useful to have some official docs for this...
How does MySQL actually perform DROP COLUMN? Does it also "hide" the columns first, or does it delete the data straight away?
MySQL will rebuild the table in place (rather than create a copy or just change metadata) for most column operations. Each specific case can be found in the Online DDL docs for InnoDB.
A few operations like renaming a column or setting a default value will just alter metadata, so they don't require a table rebuild.
However, dropping a column DOES require a full table rebuild.
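One way to check how a particular server treats a given operation is to append an explicit ALGORITHM (and optionally LOCK) clause to the ALTER TABLE statement: MySQL then reports an error instead of silently falling back to a more expensive method when the requested algorithm is not supported for that operation. The sketch below only builds such a statement; `require_algorithm` is a hypothetical helper name, and which algorithms are accepted (for example INSTANT, added in MySQL 8.0) depends on the server version.

```python
ALGORITHMS = {"DEFAULT", "INSTANT", "INPLACE", "COPY"}
LOCKS = {"DEFAULT", "NONE", "SHARED", "EXCLUSIVE"}


def require_algorithm(alter_sql, algorithm, lock=None):
    """Append ALGORITHM=... (and optionally LOCK=...) to an ALTER TABLE statement.

    Hypothetical helper: it only edits the SQL text. Whether the server
    accepts the combination depends on the operation and the MySQL version.
    """
    algorithm = algorithm.upper()
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm}")
    # Drop a trailing semicolon so the extra clauses can be appended.
    parts = [alter_sql.strip().rstrip(";").rstrip(), f"ALGORITHM={algorithm}"]
    if lock is not None:
        lock = lock.upper()
        if lock not in LOCKS:
            raise ValueError(f"unknown lock mode: {lock}")
        parts.append(f"LOCK={lock}")
    return ", ".join(parts) + ";"


checked = require_algorithm(
    "ALTER TABLE `my_table` DROP COLUMN `column_1`;", "inplace", lock="none"
)
print(checked)
```

Running the generated statement against a copy of the table is a cheap way to find out whether the operation is metadata-only, in-place, or needs a full copy on your version.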

Related

MySql - How to insert and on duplicate key update without explicitly specifying all non key columns

I have a table which was created as a select * from a view (and then added a PK).
I want to periodically update the table with all the data from the view.
I thought the best option is to do this using: INSERT INTO table_a SELECT * FROM view_a ON DUPLICATE KEY UPDATE non_key_col_1 = VALUES(non_key_col_1), non_key_col_2 = VALUES(non_key_col_2), .... ;
Since there are quite a lot of columns, and they might change in the future (in which case I can re-create the table, but I would rather not have to edit the periodic insert), I was wondering if there is a way to avoid explicitly specifying all the columns?
There is no such syntax in MySQL, unfortunately. You'll have to list all the columns one by one.
You could use a trigger on the insert operation: if the primary key already exists, update the row, otherwise insert it. But that will definitely hurt performance with large amounts of data.
One thing I can think of is to get the column names from INFORMATION_SCHEMA.COLUMNS and use those to dynamically compose your query in your app.
SELECT * FROM information_schema.columns WHERE table_name = 'view_a';
Now you have the columns no matter if the view changes.
Do the same for the table and you have the column differences.
Use those differences to run ALTER TABLE statements or drop it and recreate it all together.
Of course, this is probably even more laborious than dropping and recreating the table manually.
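Composing the query dynamically from the column list, as suggested above, could look roughly like the following Python sketch. The function name `build_upsert` and the example columns (`id`, `name`, `price`) are made up for illustration; in practice the column names would come from the information_schema.columns query (its column_name field, ordered by ordinal_position), and the key columns would be excluded from the update list.

```python
def build_upsert(table, source, columns, key_columns):
    """Build INSERT ... SELECT ... ON DUPLICATE KEY UPDATE for all non-key columns.

    Hypothetical helper: `columns` is the full column list (e.g. read from
    information_schema.columns) and `key_columns` are the primary-key columns.
    """
    keys = set(key_columns)
    updates = [c for c in columns if c not in keys]
    if not updates:
        raise ValueError("no non-key columns to update")
    col_list = ", ".join(f"`{c}`" for c in columns)
    assignments = ",\n  ".join(f"`{c}` = VALUES(`{c}`)" for c in updates)
    return (
        f"INSERT INTO `{table}` ({col_list})\n"
        f"SELECT {col_list} FROM `{source}`\n"
        f"ON DUPLICATE KEY UPDATE\n  {assignments};"
    )


# Example with made-up column names.
sql = build_upsert("table_a", "view_a", ["id", "name", "price"], ["id"])
print(sql)
```

Listing the columns explicitly in both the INSERT and the SELECT also protects against the table and the view having their columns in a different order.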

MySQL MERGE Storage Engine - DROP & ALTER

I need to add and remove tables in the UNION=() list of a MERGE table. The MySQL docs give two options:
DROP the MERGE table and re-create it.
Use ALTER TABLE tbl_name UNION=(...) to change the list of underlying tables.
The only "DROP" I'm aware of is DROP TABLE tablename; Are these instructions suggesting that I drop the MRG_MyISAM table, then recreate it with an empty UNION=() field? To then be followed by an ALTER TABLE tbl_name UNION=(...) with all the tables I need to have connected?
If possible, could you post an example of the commands?
Thanks
Oh boy, am I late here. But this page is in the top Google search results for "alter table tbl_name union=(...)", so I guess it needs an answer.
So here's the answer.
To change the list of underlying tables for a MERGE table, you only need to execute this statement:
alter table tbl_name union=(`t1`,`t2`,`t3`);
where t1, t2, t3 is the list of tables you want to have in the union.
Alternatively, you can drop the MERGE table and recreate it with a new list of underlying tables.
A DROP statement executed on a MERGE table only deletes the MERGE table itself and does not affect the underlying tables.
But altering it should be sufficient. You also don't need to recreate it with an empty union; if you ever do recreate it, just use the list of tables that you want to have in it.
For more, please refer to documentation:
https://dev.mysql.com/doc/refman/5.7/en/merge-storage-engine.html
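Because ALTER TABLE ... UNION=(...) replaces the whole list rather than adding to it or removing from it, adding or deleting a single table means re-sending the complete new list. A minimal Python sketch of that bookkeeping follows; `alter_union` is a hypothetical helper name, and `t4` is a made-up table added to the `t1`, `t2`, `t3` example above.

```python
def alter_union(merge_table, current, add=(), remove=()):
    """Build ALTER TABLE ... UNION=(...) with the full, updated table list.

    Hypothetical helper: `current` is the existing list of underlying tables;
    tables in `remove` are dropped from it and tables in `add` are appended.
    """
    to_remove = set(remove)
    tables = [t for t in current if t not in to_remove]
    for t in add:
        if t not in tables:  # avoid listing a table twice
            tables.append(t)
    union = ",".join(f"`{t}`" for t in tables)
    return f"ALTER TABLE `{merge_table}` UNION=({union});"


stmt = alter_union("tbl_name", ["t1", "t2", "t3"], add=["t4"], remove=["t2"])
print(stmt)
```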

Retrieve CREATE TABLE code of an already existing table?

Is there a way to do this?
In case the DBMS command history got cleaned or, in my case, when many ALTER TABLE were used in the course of time.
I'm using MySQL.
Yes, it is as simple as
SHOW CREATE TABLE yourtable;
The output reflects the table's current definition, so the effects of all subsequent ALTER TABLE statements are included. You cannot retrieve the table's original state.
Here is the relevant documentation

Altering the data type of a column in a HUGE table. Performance issues

I want to run this on my table:
ALTER TABLE table_name MODIFY col_name VARCHAR(255)
But my table is huge: it has more than 65M (65 million) rows. When I execute this command, it takes nearly 50 minutes. Is there any better way to alter the table?
Well, you need
ALTER TABLE table_name CHANGE col_name new_name VARCHAR(255)
But, you are right, it takes a while to make the change. There really isn't any faster way to change the table in MySQL.
Is your concern downtime during the change? If so, here's a possible approach: Copy the table to a new one, then change the column name on the copy, then rename the copy.
You probably have figured out that routinely changing column names in tables in a production system is not a good idea.
Another option is to use Percona Toolkit's pt-online-schema-change:
https://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html
You can deal with schema change without downtime using Oak.
oak-online-alter-table copies the schema of the original table, applies your changes, and then copies the data. CRUD operations can still be performed in the meantime, because oak puts triggers on the original table so that no data is lost during the operation.
Please refer to the other question, where the author of oak gives a detailed explanation of this mechanism and also suggests other tools.

Change column name without recreating the MySQL table

Is there a way to rename a column on an InnoDB table without a major alter?
The table is pretty big and I want to avoid major downtime.
Renaming a column (with ALTER TABLE ... CHANGE COLUMN) unfortunately requires MySQL to run a full table copy.
Check out pt-online-schema-change. This helps you to make many types of ALTER changes to a table without locking the whole table for the duration of the ALTER. You can continue to read and write the original table while it's copying the data into the new table. Changes are captured and applied to the new table through triggers.
Example:
pt-online-schema-change h=localhost,D=databasename,t=tablename \
--alter 'CHANGE COLUMN oldname newname NUMERIC(9,2) NOT NULL'
Update: MySQL 5.6 can do some types of ALTER operations without rebuilding the table, and changing the name of a column is one of those supported as an online change. See http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html for an overview of which types of alterations do or don't support this.
If there aren't any constraints on it, you can alter it without any hassle, as far as I know. If there are, you'll have to remove the constraints first, alter the column, and then add the constraints back.
Altering a table with many rows can take a long time (though if the columns involved are not indexed, it may be trivial).
If you specifically want to avoid using the ALTER TABLE syntax created specifically for that purpose, you can always create a table with almost the exact same structure (but different name) and copy all the data into it, like so:
CREATE TABLE `your_table2` ...;
-- (using the query from SHOW CREATE TABLE `your_table`,
-- but modified with your new column changes)
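-- While LOCK TABLES is in effect the session can only use locked tables,
-- so the target table has to be locked as well.
LOCK TABLES `your_table` WRITE, `your_table2` WRITE;
INSERT INTO `your_table2` SELECT * FROM `your_table`;
-- Before MySQL 8.0.13, RENAME TABLE fails while tables are locked;
-- on older servers, run UNLOCK TABLES before the rename instead.
RENAME TABLE `your_table` TO `your_table_old`, `your_table2` TO `your_table`;
UNLOCK TABLES;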
For some ALTER TABLE queries, the above can be quite a bit faster. However, for a simple column name change, the ALTER itself could be trivial. I would try creating an identical table and performing the change on it to see how much time you're actually looking at.