Process of adding an index to production db - mysql

Let's say I have a table with 1M rows and I need to add an index on it. What would be the process of doing so? Am I able to do ALTER TABLE table ADD INDEX column (column) directly on production? Or do I need to take other things into account to add the index.
How does one normally go about adding an index on a live production db?

You should first try adding the same index on your test database under similar load conditions and check that it doesn't cause problems. It's possible that by creating the index you lock the table for some time and cause other queries to fail.
One million rows is a large table, but it's not huge. You will probably find that adding the index completes reasonably quickly. Unless you have real-time constraints, it's unlikely to cause serious issues. It's definitely worth testing first, though.

Personally I'd wait until things were quiet, but you don't have to take the database offline. Users might see a delay, but it shouldn't time out. (If it does, something else is wrong.)
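Note that with InnoDB on MySQL 5.6 and later, adding a secondary index is normally done in place and doesn't make the DBMS copy the whole table (as adding a column might). Older versions and other storage engines such as MyISAM may rebuild the table.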
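As a rough sketch of what the statement might look like, assuming InnoDB on MySQL 5.6 or later: the table `orders`, the column `customer_id`, and the index name `idx_customer_id` are placeholders, not taken from the question. Asking explicitly for an in-place build with no lock makes MySQL return an error if it can't honor the request, rather than quietly falling back to a more disruptive method.

```sql
-- Hypothetical names; assumes InnoDB on MySQL 5.6+.
ALTER TABLE orders
  ADD INDEX idx_customer_id (customer_id),
  ALGORITHM=INPLACE,
  LOCK=NONE;

-- Confirm the index exists afterwards.
SHOW INDEX FROM orders;
```

Running the same statement against a copy of the table on a test server first gives a rough idea of how long the build takes.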

Related

What is Visible INDEX and Invisible INDEX in MySql

I'm confused about visible and invisible indexes in MySQL. Which is better?
I have database tables with about 1M rows. The previous development team didn't add indexes to the table columns. Now we need to add indexes to some columns to speed up our SELECT queries.
While indexing the columns I got confused by the visible and invisible options.
Could anyone explain which is the better choice? I tried to learn from online tutorials but couldn't get a clear answer.
A question comes up while developing a database: "Do we really need all these indexes?"
Indexes cost some overhead to keep in sync with data. If the index is needed to optimize some queries, it's generally worth keeping the index. But if the index is not used by any query, it might be time to drop that index. But how can you be sure it's not needed?
Why not test by dropping the index and recreating it if it turns out the index is needed? Because tables get really, really large, and adding the index back could take a long time. In one case, one of the developers I supported dropped an index he thought was not needed. It turned out it was important. But adding the index back took four weeks because the table was so huge, and the server was running hard just keeping up with the other queries. In the meantime, the query that needed that index was running extremely poorly.
It would have been preferable to somehow make MySQL temporarily pretend the index does not exist, and then flip a switch and make the index visible again when they decided it was important.
https://dev.mysql.com/doc/refman/8.0/en/invisible-indexes.html says:
MySQL supports invisible indexes; that is, indexes that are not used by the optimizer.
Invisible indexes make it possible to test the effect of removing an index on query performance, without making a destructive change that must be undone should the index turn out to be required. Dropping and re-adding an index can be expensive for a large table, whereas making it invisible and visible are fast, in-place operations.
An invisible index is an index that exists, and is still kept in sync with data as you run INSERT/UPDATE/DELETE statements. But the optimizer treats it as if the index is not there.
An index hint in a query can suppress use of any index, and that's the way we had to test during MySQL 5.x. But that requires finding all your SQL queries that would use the index, and changing application code. That's especially inconvenient if your queries are generated by layers of ORM code.
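A minimal sketch of that workflow on MySQL 8.0 might look like the following. The table `orders`, the column `customer_id`, and the index `idx_customer_id` are hypothetical names used only for illustration.

```sql
-- Hide the index from the optimizer; it is still maintained on writes.
ALTER TABLE orders ALTER INDEX idx_customer_id INVISIBLE;

-- Check how the query plan changes without the index.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Make it usable again if it turns out to be needed.
ALTER TABLE orders ALTER INDEX idx_customer_id VISIBLE;

-- Per-query alternative that also works on MySQL 5.x: an index hint.
SELECT * FROM orders IGNORE INDEX (idx_customer_id) WHERE customer_id = 42;
```

The hint only affects the one query it is written into, which is why the invisible-index switch is more convenient when queries are generated by an ORM.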
In simple words
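A visible index is a normal, active index that the optimizer can use.
An invisible index still exists and is still maintained, but the optimizer ignores it. It has not been removed, and it can be made visible again at any time.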

How can I rebuild indexes on multiple tables every day at 3 AM in an efficient way?

I run the command below on multiple tables. Is this the right way to rebuild indexes, or is there a better way to do it every day at a specified time using an event?
OPTIMIZE TABLE table1, table2;
My second question: if another process (INSERT, DELETE, UPDATE) runs on the same table during the index rebuild, what happens to that process?
Is the behavior the same for both MariaDB and MySQL?
I work with both of those DBMSs, which is why I need to know the actual behavior in this scenario.
Thanks In Advance.
If you are using ENGINE=InnoDB (on MySQL or MariaDB), there is "never" any need to rebuild indexes or do OPTIMIZE TABLE.
Sure, either will do some "defragmentation", but, because of the way BTrees work, they become fragmented promptly. And, a fragmented BTree is only slightly slower.
Read and write operations are interfered with by anything that will rebuild an index -- another argument against periodic rebuilding.
About the only useful time to use OPTIMIZE is after you have DELETEd most of a table. In that situation, I have a list of better ways to do the big delete.
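If you want a rough idea of whether a table has much reclaimable space after a large DELETE, one option is to look at `information_schema.TABLES`. This is only a sketch: the schema name `mydb` is a placeholder, and `DATA_FREE` is an approximate figure whose meaning depends on the storage engine and tablespace setup.

```sql
-- Hypothetical schema name; sizes are reported in bytes and are approximate.
SELECT TABLE_NAME, ENGINE, DATA_LENGTH, INDEX_LENGTH, DATA_FREE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mydb'
ORDER BY DATA_FREE DESC;
```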

Table repair or creating and moving to a new one?

I was altering a table to add an index to a column, and that caused my entire site to bog down, so I interrupted the operation.
What are the effects of an interruption when altering a table index in my table/database?
Will running a table repair fix any issues from that action?
What about copying all the content over to a new table, with the proper index I want to set, and then renaming it? Is that more efficient than a table repair?
There are a few points you should consider before taking any action.
Why did adding an index to your table cause the website to go down?
If you added the index properly and without any issue, then the problem may lie with the queries. Maybe your queries cannot make use of the index.
Point 3 seems meaningless if point 2 is valid.
ALTER TABLE lets you do almost everything you could do by creating a new table.
Before altering a table, it is recommended to take a backup so that you don't need to repair the table.
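For reference, the copy-and-rename approach the question asks about could be sketched roughly as below. All names (`big_table`, `some_column`, `idx_some_column`) are hypothetical, and the sketch assumes writes to the original table are paused during the copy, because rows written after the copy starts would otherwise be missed.

```sql
-- Hypothetical names: big_table is the live table, some_column the column to index.
CREATE TABLE big_table_new LIKE big_table;

-- Add the index while the new table is still empty.
ALTER TABLE big_table_new ADD INDEX idx_some_column (some_column);

-- Copy the data; rows written to big_table after this starts are not copied.
INSERT INTO big_table_new SELECT * FROM big_table;

-- Swap the tables in one atomic statement, keeping the old one as a fallback.
RENAME TABLE big_table TO big_table_old, big_table_new TO big_table;
```

The copy still has to read and write every row, so it is not necessarily lighter on the server than the ALTER TABLE was.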

re-indexing in mysql

I have a table which already contains an index in MySQL. I added some rows to the table, do I need to re-index the table somehow or does MySQL do this for me automatically?
This is done automatically. It is also why we sometimes don't want to create indexes: updating parts of the indexes on every insert has a small but nonzero performance overhead.
If you define an index in MySQL then it will always reflect the current state of the database unless you have deliberately disabled indexing. As soon as indexing is re-enabled, the index will be brought up to date. Usually indexing is only disabled during large insertions for performance reasons.
There is a cost associated with each index on your table. While a good index can speed up retrieval times immensely, every index you define slows insertion by a small amount. The insertion costs grow slowly with the size of the database. This is why you should only define indexes you absolutely need if you are going to be working on large sets of data.
If you want to see what indexes are defined, you can use SHOW CREATE TABLE to have a look at a particular table.
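The statements involved might look like the sketch below, using a hypothetical table named `orders`. Note that DISABLE KEYS is an assumption about what "disabled indexing" refers to here; it applies to non-unique indexes on MyISAM tables, and InnoDB does not support it.

```sql
-- Show the full table definition, including all indexes.
SHOW CREATE TABLE orders;

-- List the indexes, one row per indexed column.
SHOW INDEX FROM orders;

-- MyISAM only: pause maintenance of non-unique indexes during a bulk load.
ALTER TABLE orders DISABLE KEYS;
-- ... run the large INSERT or LOAD DATA here ...
ALTER TABLE orders ENABLE KEYS;
```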
No, you don't need to rebuild the index.
Inserting a record automatically updates the existing index.

mysql speed, table index and select/update/insert

We have a MySQL table with more than 7,000,000 (yes, seven million) rows.
We are constantly running many SELECT / INSERT / UPDATE queries every 5 seconds.
Is it a good idea to create a MySQL index for that table? Will there be any bad consequences, such as data corruption or losing the MySQL service?
Little info:
MySQL version 5.1.56
Server CentOS
Table engines are MyISAM
MySQL CPU load between 200% - 400% always
In general, indexes will improve the speed of SELECT operations and will slow down INSERT/UPDATE/DELETE operations, as both the base table and the indexes must be modified when a change occurs.
It is very difficult to say. I would expect that building the index itself might take some time, but after that you should see some improvement. As @Joe and @Patrick said, it might hurt your modification time, but selecting will be faster.
Of course, there are other ways of improving insert and update performance. For example, you could batch updates if it is not important for changes to be visible immediately.
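One simple form of batching is a multi-row INSERT. The sketch below uses a hypothetical `page_views` table; the idea is that the application collects rows for a short while and sends them together.

```sql
-- Hypothetical table and columns.
-- One statement and one round trip instead of three separate INSERTs.
INSERT INTO page_views (page_id, viewed_at)
VALUES (1, '2024-01-01 10:00:00'),
       (2, '2024-01-01 10:00:01'),
       (3, '2024-01-01 10:00:02');
```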
The indexes will help dramatically with selects, especially if they match up well with the commonly filtered fields and you have a good, simple primary key. They will help with both query time and processing cycles.
The drawbacks show up if you very often update or delete these records, especially the indexed fields. Even then, though, it is often worth it.
How much reporting (SELECT statements) you do versus updating hugely affects (or should!) both your initial design and your later adjustments once your db is in the wild. Since you already have what you have, testing will give you the answers you need. If you really do a lot of select queries and a lot of updating, your solution might be to copy data out now and then to a reporting table. Then you can index like crazy with no ill effects.
You have actually asked a large question, and you should study up on this more. The general things I've mentioned above hold for almost all relational dbs, but there are also behaviors specific to each database (MySQL in your case), mainly in how it decides when and where to use indexes.
If you are looking for performance, indexes are the way to go. Indexes speed up your queries. If you have 7 million records, your queries are probably taking many seconds, possibly a minute, depending on your memory size.
Generally speaking, I would create indexes that match the most frequent SELECT statements. Everyone talks about the negative impact of indexes on table size and speed, but I would disregard those impacts unless you have a table where 95% of the work is inserts and updates. Even then, if those inserts happen at night and you query during the day, go and create those indexes. Your daytime users will appreciate it.
What is the actual time impact on an insert or update statement if there is an additional index, 0.001 seconds maybe? If the index saves you many seconds per query, I'd say the additional time required to update the index is well worth it.
The only time I ever had an issue with creating an index (it actually broke the program logic) was when we added a primary key to a table that was previously created (by someone else) without a primary key and the program was expecting that the SELECT statement returns the records in the sequence they were created. Creating the primary key changed that, the records when selecting without any WHERE clause were returned in a different sequence.
This is obviously a wrong design in the first place. Nevertheless, if you have an older program and you encounter tables without a primary key, I suggest looking at the code that reads that table before adding a primary key, just in case.
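The safer pattern is to state the required order explicitly rather than rely on the order rows happen to come back in. A minimal sketch, with a hypothetical `events` table and columns:

```sql
-- Hypothetical names. Without ORDER BY, row order is not guaranteed.
SELECT id, created_at, payload
FROM events
ORDER BY created_at, id;
```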
One last thought about creating indexes: the choice of fields and the order in which they appear in the index both affect how well the index performs.
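To illustrate why column order matters, here is a small sketch with a hypothetical `orders` table. MySQL can generally use a multi-column index only when the query filters on its leading column or columns.

```sql
-- Hypothetical table and columns.
ALTER TABLE orders ADD INDEX idx_status_created (status, created_at);

-- Can use the index: the filter includes the leading column, status.
EXPLAIN SELECT * FROM orders
WHERE status = 'shipped' AND created_at >= '2024-01-01';

-- Generally cannot use it for the lookup: the leading column is skipped.
EXPLAIN SELECT * FROM orders
WHERE created_at >= '2024-01-01';
```

Comparing the `key` column in the two EXPLAIN outputs shows whether the index was chosen in each case.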
I had the same kind of problem that you describe.
I made a few changes, and one query went from 11 seconds to a few milliseconds:
1- Upgraded to MariaDB 10.1
2- Changed all my databases to the Aria engine
3- Reduced my.cnf to the strict minimum
4- Upgraded to PHP 7.1 (but this one had little impact)
5- On CentOS: ran "yum update" in the terminal or via SSH (to keep everything up to date)
Notes:
1- MariaDB is an open-source fork of MySQL
2- The Aria engine is the evolution of MyISAM
3- my.cnf usually has too many changes that affect performance
Here is an example:
[mysqld]
performance-schema=1
general_log=0
slow_query_log=0
max_allowed_packet=268435456
Removing all extra options from my.cnf tells MySQL to use its default values.
In MySQL 5 (5.1, 5.5, 5.6...), when I did that, I only noticed a small difference.
But in MariaDB, a small my.cnf like this made a big difference.
Through all of those changes, the server hardware remained the same.
Hope it can help you