How to apply an index to a MySQL / MyISAM table without locking? - mysql

I have a production table with a very critical column (a date) that is missing an index. Are there any ways to apply that index without user impact?
The table currently gets around 5-10 inserts every second, so full table locking is out; redirecting those inserts to an alternative table / database, even temporarily, has also been ruled out (for corporate-politics reasons). Any other ways?

As far as I know this is not possible with MyISAM. With 5-10 INSERTs per second you should consider InnoDB anyway, unless you hardly read from the table.
Are you using replication, preferably in a master-master setup? (You should!) If so, you could run CREATE INDEX on the standby server, switch roles, do the same on the other node, and then switch back. Be sure to disable replication of the statement temporarily (when using master-master) to avoid replicating the CREATE INDEX to the active node.
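A minimal sketch of that, with made-up table and index names; setting sql_log_bin = 0 for the session keeps the statement out of the binary log, so the CREATE INDEX is not replicated back to the active node in a master-master setup:
SET SESSION sql_log_bin = 0;  -- affects only this session, requires SUPER
CREATE INDEX idx_event_date ON critical_table (event_date);
SET SESSION sql_log_bin = 1;
Then switch application traffic over to this node and repeat the same steps on the former active node.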
Depending on whether you use that table primarily for archiving logs or similar, you might as well look into the ARCHIVE storage engine.

Related

Can I INSERT into table while UPDATING multiple different rows with MariaDB or MySQL?

I am creating a custom analytics system and currently in the database designing process. I'm planning to use MariaDB with the InnoDB engine to be able to handle big loads.
The data I'm expecting could be around 500k clicks/day. I will need to insert these rows into the database, which means that I'll have around 5.8 inserts/sec on average. However, at the same time, I want to record if someone visited a page associated with that click. (basically to record funnels)
So what I'm planning to do is to create additional columns and search for the ID of the specific row then update that column with the exact time of the visit.
My first question: is this generally a recommended approach to designing the database? If not, how else should I design it?
My only concern is that while rows are being updated the table will be locked and inserts can't run, slowing down the user experience.
My second question: is this something I should worry about, that the table gets locked while updating, and thus slowing down inserts? Does it hurt performance?
InnoDB doesn't lock the table for inserts while you're performing an update, so your users won't experience any weird hanging.
It's an MVCC-compliant engine, designed to handle concurrent access to the underlying tables.
You can control the engine's behavior by choosing an appropriate isolation level; however, the default (REPEATABLE READ) is excellent and does the job more than well.
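If you ever do want to change it, something like this works per session (the variable that reports it is tx_isolation on older servers and transaction_isolation on newer ones):
SHOW VARIABLES LIKE '%isolation%';  -- check the current setting
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;  -- change it for this session only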
If a table is being modified by multiple users (not users that connect to your site but connections established towards MySQL via a scripting language or some other service) and there's many inserts/updates/deletes - MySQL can throw an error saying a deadlock occurred.
A deadlock isn't fatal: it means that more than one transaction tried to lock the same resource at the same time (for example, two threads updating the same row), and only one of them was allowed to proceed. The losing transaction gets rolled back, and the error is an indication that you should simply retry the query.
I'm suggesting that you handle these scenarios (catching the deadlock error and retrying) in the language of your choice, since they do come up when MySQL is under heavier I/O.
~6 inserts a second isn't a lot; just make sure you're allowing MySQL access to sufficient system resources. For InnoDB, check the value of innodb_buffer_pool_size, or google a bit to see what it is and how to use it to make your database run fast.
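A quick way to check it (the value itself is normally set in my.cnf and needs a server restart on older versions); a common rule of thumb is to give the buffer pool a large share of RAM on a dedicated database host:
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';  -- reported in bytes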
Good luck!
At a mere ~6 inserts per second, there won't be much of a problem.
I do, however, suggest vertical partitioning for "Likes", "Upvotes", "Clicks", and similar things. These tend to have a lot of UPDATEs of random single rows, and may interfere with other activity.
That is, have a separate table with (perhaps) just 2 columns:
The id of the item being Liked/Clicked/etc.
A counter.
It is simple enough (and fast enough) to JOIN via that id when you want to display info including the counter.
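A sketch of what that could look like; all table and column names here are made up:
CREATE TABLE item_clicks (
    item_id INT UNSIGNED NOT NULL PRIMARY KEY,
    clicks  INT UNSIGNED NOT NULL DEFAULT 0
) ENGINE=InnoDB;
-- bump the counter for one item (insert on first click, update afterwards)
INSERT INTO item_clicks (item_id, clicks) VALUES (42, 1)
    ON DUPLICATE KEY UPDATE clicks = clicks + 1;
-- join back when displaying the item together with its counter
SELECT i.*, c.clicks
FROM items AS i
LEFT JOIN item_clicks AS c ON c.item_id = i.id
WHERE i.id = 42;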
As already pointed out, the row is locked, not the table.

Converting a big MyISAM to InnoDB

I'm trying to convert a 10-million-row MySQL MyISAM table to InnoDB.
I tried ALTER TABLE, but that made my server hang, so I killed the MySQL process manually. What is the recommended way to do this?
Options I've thought about:
1. Making a new table which is InnoDB and inserting parts of the data each time.
2. Dumping the table into a text file and then loading it back with LOAD DATA INFILE
3. Trying again and just keeping the server unresponsive until it finishes (I tried for 2 hours, and since it's a production server I'd prefer to keep it running)
4. Duplicating the table, removing its indexes, converting it, and then adding the indexes back
Changing the engine of a table requires a rewrite of the table, and that's why the table is unavailable for so long. Removing the indexes, converting, and then adding the indexes back may speed up the initial conversion, but adding an index takes a read lock on the table, so the net effect will be the same.

Making a new table and transferring the data is the way to go. Usually this is done in two parts: first copy the records, then replay any changes that were made while the records were being copied. If you can afford to disable inserts/updates on the table while leaving reads enabled, this is not a problem. If not, there are several possible solutions. One of them is to use Facebook's online schema change tool. Another option is to have the application write to both tables while the records are migrated, then switch over to the new table only. This depends on the application code, and the crucial part is handling unique keys / duplicates, since in the old table you may be updating a record while in the new one you need to insert it (here the transaction isolation level may also play a crucial role; lower it as much as you can).

The "classic" way is to use replication, which, as far as I know, is also done in two parts: start replication, recording the master's position, import a dump of the database on the second server, then start it as a slave to catch up with the changes.
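A minimal sketch of the copy-in-chunks approach (option 1), assuming an integer primary key named id; all names here are made up:
CREATE TABLE big_table_new LIKE big_table;
ALTER TABLE big_table_new ENGINE=InnoDB;  -- the new table is empty, so this is quick
-- copy in PK ranges so no single statement holds locks for long
INSERT INTO big_table_new SELECT * FROM big_table WHERE id BETWEEN 1 AND 100000;
INSERT INTO big_table_new SELECT * FROM big_table WHERE id BETWEEN 100001 AND 200000;
-- ... repeat until caught up, replay rows written during the copy, then swap atomically
RENAME TABLE big_table TO big_table_old, big_table_new TO big_table;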
Have you tried ordering your data by the PK first? E.g.:
ALTER TABLE tablename ORDER BY PK_column;
should speed up the conversion.

Can you index tables differently on Master and Slave (MySQL)

Is it possible to set up different indexing on a read-only slave from what is on the master? Basically, this seems to make sense given the different requirements of the two systems, but I want to make sure it will work and not cause any problems.
I believe so. After replication is working, you can drop the indexes on the slave and create the ones you want, and that should do it. Since MySQL replicates statements and not data (at least by default), as long as the SQL needed to insert into, update, or select from the table doesn't have to change, it shouldn't notice.
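For example, run only on the slave (index and table names are hypothetical):
DROP INDEX idx_not_needed_for_reporting ON orders;
CREATE INDEX idx_reporting_date ON orders (created_at);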
Now there are obvious downsides to this. If you create a unique key on the slave that isn't on the master, data could be inserted on the master that can't be inserted on the slave. If an update relies on an index that exists only on the master, it may run fast there but cause a table scan on the slave (since the slave doesn't have whatever index was handy).
And if any DDL changes ever happen on the master (such as altering an index), they will be passed to the slave and the new index will be created there as well, even though you don't want it.
For sure. I do it all the time. Issues I've run into:
Referencing indexes via FORCE/USE/IGNORE INDEX in SELECTS will error out
Referencing indexes in ALTER statements on the master can break replication
Adds another step to promoting a slave to be the master in case of emergency
If you're using statement-based replication (the norm) and you're playing around with UNIQUE indexes, any INSERT ... ON DUPLICATE KEY UPDATE, INSERT IGNORE or REPLACE statements will cause extreme data drift / divergence
Performance differences (both good and bad)
Sure, I think it's even a common practice to replicate InnoDB tables into MyISAM tables on the slave to be able to add full-text indexes.

Ensure `INSERT`s are concurrent for a specific MyISAM table?

I have a MyISAM table that basically contains a log. A cluster of machines does single-record INSERTs on this table at a rate of 50 per second tops, but the same table is also SELECTed from by a web application, and indexed to accommodate for this. There are no UPDATEs or DELETEs, though.
So from what I've gathered, I should be using concurrent inserts. (Right?) MyISAM will normally do this for me without any extra work. (Is this correct?)
But what I can't find is a way to guarantee that a given INSERT is processed concurrently. I know that I can set the global variable concurrent_insert to 2, but I'd rather not set this globally.
So my questions are:
Is there some way I'm missing to guarantee a concurrent insert?
If not, is there a command I can use to see whether a table meets the concurrent insert requirements? (I believe just knowing whether a table has holes should be enough?) Because I will also settle for being able to just monitor the table.
And I'm also curious, is there some other database system that can handle this kind of load better? I'm totally okay with a NoSQL solution, if that happens to be the case. As long as I can talk to it from Ruby and C.
Why don't you want to set concurrent_insert=2 globally? That would give you what you want.
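If you do go that route, a rough sketch (the table name is made up); for MyISAM, Data_free in SHOW TABLE STATUS is a reasonable proxy for holes left by DELETEs:
SET GLOBAL concurrent_insert = 2;  -- always insert at the end of the data file
SHOW GLOBAL VARIABLES LIKE 'concurrent_insert';
SHOW TABLE STATUS LIKE 'request_log';  -- Data_free > 0 hints at holes
OPTIMIZE TABLE request_log;  -- defragments the table and removes the holes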
Another option you may want to consider for this type of MyISAM table is INSERT DELAYED:
http://dev.mysql.com/doc/refman/5.1/en/insert-delayed.html
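A hypothetical example (INSERT DELAYED queues the row and returns immediately; it only works with MyISAM-style engines and was removed in later MySQL versions):
INSERT DELAYED INTO request_log (message, created_at) VALUES ('GET /index', NOW());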

MySQL 24x7 - InnoDB ALTER TABLE blocks (TABLE LOCK)

We are trying to minimize (maintenance) downtime of our MySQL-based application.
It seems that InnoDB Hot Backup will give us the ability to do regular backups without stopping the server; master/slave replication will give us failover capability (losing a few seconds of data due to replication lag is not great, but not a showstopper either).
So much for backups and unexpected downtime. Now on to expected downtime:
As far as I understand from reading the online documentation and books, an ALTER TABLE on an InnoDB table requires a TABLE LOCK, thus blocking all reads and writes to that table. Effectively this means downtime for the application. Some large tables may take hours to alter.
Are there any known workarounds for this? The perfect workaround would of course be a non-blocking ALTER TABLE, but anything that makes ALTER TABLE faster is also interesting.
Thanks in advance!
PS: commercial (non-free) tools would be OK too; free solutions are of course also welcome.
Since you have replication set up, it is normally possible to do some trickery: run the ALTER TABLE on the slave, let the slave catch up once it is done, swap roles, and then run the ALTER on the former master. This doesn't work for all ALTER TABLE commands, but it can handle the majority of them.
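A rough sketch of that sequence on the slave, with a made-up column addition (pausing replication during the ALTER is optional, but keeps things tidy):
STOP SLAVE;
ALTER TABLE orders ADD COLUMN shipped_at DATETIME NULL;
START SLAVE;  -- let the slave catch up, then promote it and repeat the ALTER on the former master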
There is also a third-party tool (linked here), but I'm not sure how commonly it is used, how well it works, etc.
The best workaround would be to not alter your tables.
The only time a schema change should be required is if you're adding functionality, or somehow forgot an index.
If you're adding functionality, you'll likely have downtime anyway, to stage your production server.
If you forgot an index, then the database is likely slow anyway, so your users shouldn't mind downtime to fix the performance issue. You should run all your queries through EXPLAIN to make sure you have the proper indexes declared already.
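For instance (query and names are made up), look for type: ALL or an empty possible_keys column in the output, which indicate a full table scan:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;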
If you're afraid that you'll be altering tables frequently you might want to re-examine your schema.