MySQL MyISAM data loss possibilities? - mysql

Many sites and script still use MySQL instead of PostgreSQL. I have a couple low-priority blogs and such that I don't want to migrate to another database so I'm using MySQL.
Here's the problem, their on a low-memory VPS. This means I can't enable InnoDB since it uses about 80MB of memory just to be loaded. So I have to risk running MyISAM.
With that in mind, what kind of data loss am I looking at with MyISAM? If there was a power-outage as someone was saving a blog post, would I just lose that post, or the whole database?
On these low-end-boxes I'm fine with losing some recent comments or a blog post as long as the whole database isn't lost.

MyISAM isn't ACID compliant and therefore lacks durability. It really depends on what costs more...memory to utilise InnoDB or downtime. MyISAM is certainly a viable option but what does your application require from the database layer? Using MyISAM can make life harder due to it's limitations but in certain scenarios MyISAM can be fine. Using only logical mysqldump backups will interrupt your service due to their locking nature. If you're utilising binary logging you can back these up to give you incremental backups that could be replayed to aid recovery should something corrupt in the MyISAM tables.

You might find the following MySQL Performance article of interest:
For me it is not only about table locks. Table locks is only one of MyISAM limitations you need to consider using it in production. Especially if you’re comming from “traditional” databases you’re likely to be shocked by MyISAM behavior (and default MySQL behavior due to this) – it will be corrupted by unproper shutdown, it will fail with partial statement execution if certain errors are discovered etc...
http://www.mysqlperformanceblog.com/2006/06/17/using-myisam-in-production/

The MySQL manual points out the types of events that can corrupt your table and there is an article explaining how to use myisamchk to repair tables. You can even issue a query to fix it.
REPAIR TABLE table;
However, there is no information about whether some types of crashes might be "unfix-able". That is the type of data loss that I can't allow even if I'm doing backups.

With a server crash your auto increment primary key can get corrupted, so your blog post IDs can jump from 122, 123, 75912371234, 75912371235 (where the server crashed after 123). I've seen it happen and it's not pretty.

You could always get another host on the same VLAN that is slaved to your database as a backup, this would reduce the risk considerably. I believe the only other options you have are:
Get more RAM for your server or kill of some services
See if your host has shared database hosting of any kind on the VLAN you can use for a small fee.
Make regular backups and be prepared for the worst.

In my humble opinion, there is no kind of data loss with MyISAM.
The risk of data loss from a power outage is due to the power outage, not the database storage mechanism.

Related

Should I switch to InnoDB for my tables

I have an PHP-based API that runs on shared hosting and uses MySQL. I've been doing reading on InnoDB vs MyISAM and wanted to paste some specific things about my API's database to make sure it makes sense to move on to InnoDB. MyISAM was set by default for these tables, so I didn't deliberately pick that database engine.
My queries are a little more writes than reads (70% writes I'd say). Reads/lookups are always by a "foreign key" (userid) (I understand MyISAM doesn't have these constraints) but might be good to know if I move since I could take advantage of that.
I don't do full text searches
My data is important to me, and I recently learned MyISAM has a risk of losing data? A few times in the past I've lost some data and just assumed it was my user's fault in how they interacted with the API. Perhaps not? I am confused about how anyone would be ok with losing data and thus choosing MyISAM so perhaps I don't understand MyISAM enough.
I'm on a shared host and they confirmed I don't have access to change settings in my.cnf, change buffers, threading, concurrency settings, etc.
I will probably switch to DigitalOcean or AWS in the future
My hosting company uses MySQL Version is 14.14 Distribution: 5.6.34
Based on these factors, my instinct is to switch all my tables to InnoDB and at least see if there are problems. If I hit an issue, I can just run the same statement but swap InnoDB with MyISAM to revert back.
Thanks so much.
Short answer: YES! MyISAM was the original format of MySQL, but many years ago InnoDB has been preferred for many reasons. On high-level picture, your app will better perform as InnoDB has a better lock management.
You can find here a longer answer to your question Should I change my DB from MyISAM to InnoDB? (AWS notification) and the following 2 articles covering migration from MyISAM to InnoDB:
https://dba.stackexchange.com/questions/167842/can-converting-myisam-to-innodb-cause-problems
https://kinsta.com/knowledgebase/convert-myisam-to-innodb/

MySQL changing large table to InnoDB

I have a MySQL server running on CentOS which houses a large (>12GB) DB. I have been advised to move to InnoDB for performance reasons as we are experiencing lockups where the application that relies on the DB becomes unresponsive when the server is busy.
I have been reading around and can see that the ALTER command that changes the table to InnoDB is likely to take a long time and hammer the server in the process. As far as I can see, the only change required is to use the following command:
ALTER TABLE t ENGINE=InnoDB
I have run this on a test server and it seems to complete fine, taking about 26 minutes on the largest of the tables that needs to be converted.
Having never run this on a production system I am interested to know the following:
What changes are recommended to be made to the MySQL config to take advantage of additional performance of InnoDB tables? The server currently has 3GB assigned to InnoDB cache - was thinking of increasing this to 15GB once the additional RAM is installed.
Is there anything else I should do to the server with this change?
I would really recommend using either Percona MySQL or MariaDB. Both have tools that will help you get the most out of InnoDB, as well as some tools to help you diagnose and optimize your database further (for example, Percona's Online Schema Change tool could be used to alter your tables without downtime).
As far as optimization of InnoDB, I think most would agree that innodb_buffer_pool_size is one of the most important parameters to tune (and typically people set it around 70-80% of total available memory, but that's not a magic number). It's not the only important config variable, though, and there's really no magic run_really_fast setting. You should also pay attention to innodb_buffer_pool_instances (and there's a good discussion about this topic on https://dba.stackexchange.com/questions/194/how-do-you-tune-mysql-for-a-heavy-innodb-workload)
Also, you should definitely check out the tips offered in the MySQL documentation itself (http://dev.mysql.com/doc/refman/5.6/en/optimizing-innodb.html). It's also a good idea to pay attention to your InnoDB hit ratio (Rolado over at DBA Stackexchange has a great answer on this topic, eg, https://dba.stackexchange.com/questions/65341/innodb-buffer-pool-hit-rate) and analyze your slow query logs carefully. Towards that later end, I would definitely recommend taking a look at Percona again. Their slow query analyzer is top notch and can really give you a leg up when it comes to optimizing SQL performance.

Life without transactions (MyISAM)

My site runs on a VDS-server. I've just found out that my MySQL server doesn't support InnoDB engine, therefore I can't use database transactions in my application.
It makes me think, that some people might never use transactions. Is this the case? If so, how does one manage to coordinate related operations on different tables in MyISAM?
Otherwise, is there a way to install InnoDB on a MySQL server which is run on a VDS?
Thanks!
If you need transactions, then you need transactions and MyISAM isn't going to cut the mustard.
Some applications won't need transactions. For example; an application that never runs more than one related SQL statement at a time and has no need to rollback multiple SQL statements. Another example is an application that uses MySQL as a simple Key-Value Store. There are many use cases that don't require database level transaction support.
It's hard to answer the second part of your question without knowing more details about your VDS. Who is you hosting provider? Do you have shell access and permissions to change my.cnf? If not, then you probably won't be able to enable InnoDB. If you do, then here is a another SO answer that details how to enable InnoDB on MySQL: How to enable INNODB in mysql
You can often either enable the engine, install the InnoDB components manually, or simply re-install a version of MySQL that includes that engine by default. MyISAM is the crazypants database, stupidly fast but also unreliable and prone to complete destruction if your system isn't shut down properly.
Running a mission critical application on MyISAM is an extremely bad idea. Where you need MyISAM tables for performance reasons they should always be disposable, easily re-built from another more reliable source of data.

Locking DB w/ Large Reads (Ruby-on-Rails/Heroku)

Currently I have a Web API running on Heroku that is constantly writing information we're collecting from other data sources (currently theres about half a GB of data and it's growing very quickly). We're looking to add a reporting system on top of the current database that we can use to extract useful information out of the DB. The problem is that when we're running reports we're locking the DB and any other sites communicating with the DB are timing out. Does anyone have any solutions on how to solve this type of issue? Amazon RDS seems to have some interesting stuff with database replication but I don't know if that will solve my problems.
Any advice would be greatly appreciated.
Thanks
Be sure you are running innodb tables and not the old isam or myisam tables - innodb has row level locks which is much more scalable.
Make sure that you have indexes defined on all your joining/foreign keys... if you do joins without indexes it will grind. Also make sure you have indexes where appropriate for data that you search or sort on (as long as it is diverse data, not boolean or a small number of values)
Replication is another good idea, as you could target the reports at the secondary server in read-only mode, and it will just catch up once it unlocks. half a GB of data should not really be locking it up yet, so I'd look at the indexes and innodb first.
One solution to this is to have a replica of the database, so that your normal traffic goes to the master database, while long-running queries execute on the slave. I'm not sure how much control you get over the database on Heroku though, they may not support replication.
However, have you considered that the Heroku setup may be the problem here? A 500 MB database shouldn't really have performance issues unless you're performing really complex queries.
If you're happy using MySQL instead of Postgres, Engine Yard supports database replication (although generally it may not be as easy to use as Heroku).

How can MyISAM tables be used more safely?

I like InnoDB's safety, consistency, and self-checking.
But I need MyISAM's speed and light weight.
How can I make MyISAM less prone to corruption due to crashes, bad data, etc.? It takes forever to go through a check (either CHECK TABLE or myisamchk).
I'm not asking for transactional security -- that's what InnoDB is for. But I do want a database I can restart quickly rather than hours (or days!) later.
UPDATE: I'm not asking how to load data into tables faster. I've beat my head against that already, and determined that using the MyISAM tables for my LOAD DATA is simply much faster. What I'm after now is mitigating the risks of using MyISAM tables. That is, reducing chances of damage, increasing speed of recovery.
MyISAM's supposed speed benefits can actually go away pretty quickly - the fact that it lacks row-level locking means small updates can cause large amounts of data to be locked, and queries to block. Because of that, I'm skeptical of claimed MyISAM speed benefits: start doing several UPDATEs, and the queries per second will tank.
I think you're better off asking "How can applications backed with InnoDB be made faster?" and the answer then deals with caching data, perhaps at the object level, in lightweight caches - there is a cost for ACID, and for, say, web applications, it's not really needed.
If UPDATEs are rare (if they aren't, MyISAM isn't a good choice) then you can even use the MySQL query cache.
memcached (http://www.danga.com/memcached/) is a very popular option for object caching. Depending on your application you have other options as well (HTTP caches, etc.)
The performance advantages of MyISAM are actually pretty minimal in some cases; you need to benchmark your own application MyISAM vs InnoDB. Using the InnoDB transactional engine exclusively gives other benefits too.
In my testing InnoDB will use up typically about 150% more disc space than MyISAM- this is because of its block structure and lack of index compression.
If you can afford it, just use InnoDB instead.
As far as answering your actual question goes: If you partition your table into multiple MyISAM tables, the amount of repair needed in a crash will be much less; if your data are large, this might be a good idea anyway for other reasons.
in normal practice, you shouldn't get corruption. if you are getting corruption, you need to look at things like bad memory, bad hard drive, bad drive controller, or possibly a mysql bug.
if you want to side-step all that, you could set up a replication slave. when the master dies, stop the replication on the slave and make it your new master. the clear the data off your old master and set it up as a slave. user down-time will be limited to the amount of time it takes to detect that the master died and bring the slave up.
this has the added benefit of being a good way to achieve a zero-downtime backup: shut down the slave process and back up the slave.
While I agree with the innodb comments, I will give a solution to your MyISAM problem.
A good way to prevent corruption and increasing speed would be to use MERGE tables
You can use 2 or more MyISAM files. One is usually for backup'd old data that isn't used that often and the other is newer data. Then you will have 2 FRM (the MyISAM table files) on your harddisk and one will be protected. Usually you compress the old MyISAM tables and then they will defiantly not be corrupted, since they become read-only.
This technique is usually used to speed up big MyISAM tables, but you can apply it here as well.
Hope that helped your question. While I realize it didn't really help crash-proof MyISAM, it does give quite a bit of protection.
Are you married to MySQL? Postgres is ACID-compliant (like innoDB) and (when well-tuned) nearly as speedy as MyISAM.
Your comment:
No, the major problem is the amazingly
disk-intensive initial import of data
into the table. MyISAM time: 12
minutes. InnoDB time: 3+ hrs. After my
initial load, UPDATEs are non-existent
and INSERTs are rare. No known
solution to InnoDB's disappointing
load operation.
suggests dropping constraints and indexes, then enabling / rebuilding them after the load may significantly speed it up- I assume you tried that? Did that improve things?
This really depends a lot on how your use of the tables. If they are write heavy, then you may want to consider removing indexes, which will speed up the recovery time. If they are read heavy, you may want to consider using replication which will serialise all writes to your tables, minimising the recovery time for your read copy after a crash.
Once thing you could do is write to an InnoDB copy of the table, and then replicate to a MyISAM copy. The performance benefits of MyISAM are mostly read-oriented anyway.
Using replication of course, you will have lag time between reads and writes
Get a good UPS, with decent power conditioning. Run on stable and redundant hardware.
I don't trust MyISAM tables to ever survive a crash during a write, so I think your best bet is on reducing the occurrence of crashes (and writes).