MyISAM or InnoDB for social networking application - mysql

I am working on a college project for a social networking web application. I am done with the schema design and I am using PHP+MySQL. Right now I am testing the application and the tables are MyISAM. But I got to know that MyISAM provides only table-level locking, not row-level locking, so I am confused about whether I need to switch to InnoDB. I am expecting an 8:3 ratio of selects to updates, and 30 updates per second is my threshold limit. I am relying on a shared hosting server. So please help me with it. Love to hear from a Database EXPERT...

I would recommend the InnoDB engine, because it performs better for OLTP workloads.
MyISAM is fast too, but only when queries are non-concurrent (or low-concurrency).
You can also take advantage of foreign key integrity.
Hope this helps.

Related

Should I switch to InnoDB for my tables

I have a PHP-based API that runs on shared hosting and uses MySQL. I've been reading about InnoDB vs MyISAM and wanted to paste some specific things about my API's database to make sure it makes sense to move to InnoDB. MyISAM was set by default for these tables, so I didn't deliberately pick that database engine.
My queries are a little more writes than reads (70% writes I'd say). Reads/lookups are always by a "foreign key" (userid) (I understand MyISAM doesn't have these constraints) but might be good to know if I move since I could take advantage of that.
I don't do full text searches
My data is important to me, and I recently learned MyISAM has a risk of losing data? A few times in the past I've lost some data and just assumed it was my user's fault in how they interacted with the API. Perhaps not? I am confused about how anyone would be ok with losing data and thus choosing MyISAM so perhaps I don't understand MyISAM enough.
I'm on a shared host and they confirmed I don't have access to change settings in my.cnf, change buffers, threading, concurrency settings, etc.
I will probably switch to DigitalOcean or AWS in the future
My hosting company runs MySQL 5.6.34 (the client reports Ver 14.14 Distrib 5.6.34)
Based on these factors, my instinct is to switch all my tables to InnoDB and at least see if there are problems. If I hit an issue, I can just run the same statement but swap InnoDB with MyISAM to revert back.
Thanks so much.
Short answer: YES! MyISAM was MySQL's default storage engine until version 5.5, but InnoDB has been preferred for many years, for many reasons. At a high level, your app will perform better because InnoDB has better lock management (row-level rather than table-level locks).
You can find a longer answer to your question in "Should I change my DB from MyISAM to InnoDB? (AWS notification)", and the following two articles cover migration from MyISAM to InnoDB:
https://dba.stackexchange.com/questions/167842/can-converting-myisam-to-innodb-cause-problems
https://kinsta.com/knowledgebase/convert-myisam-to-innodb/
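The plan in the question (convert everything, and swap the engine name to revert) can be sketched roughly as below. This is only a minimal sketch, not a migration tool: the helper name `engine_switch_statements` and the table names are made up for illustration, and it only builds the `ALTER TABLE ... ENGINE=...` statements as strings so you can review them before running them yourself. Take a backup first.

```python
import re

# Assumption: table names are plain identifiers. Anything else is rejected
# rather than escaped, to keep the sketch short.
_IDENT = re.compile(r"^[A-Za-z0-9_]+$")


def engine_switch_statements(tables, engine="InnoDB"):
    """Return one ALTER TABLE ... ENGINE=... statement per table name."""
    if engine not in ("InnoDB", "MyISAM"):
        raise ValueError(f"unsupported engine: {engine}")
    statements = []
    for name in tables:
        if not _IDENT.match(name):
            raise ValueError(f"unexpected table name: {name!r}")
        statements.append(f"ALTER TABLE `{name}` ENGINE={engine};")
    return statements


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    for stmt in engine_switch_statements(["users", "posts"]):
        print(stmt)
```

Passing `engine="MyISAM"` produces the statements for reverting. Note that each conversion rebuilds the table, so it can take a while on large tables.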

MySQL MyISAM data loss possibilities?

Many sites and scripts still use MySQL instead of PostgreSQL. I have a couple of low-priority blogs and such that I don't want to migrate to another database, so I'm using MySQL.
Here's the problem: they're on a low-memory VPS. This means I can't enable InnoDB, since it uses about 80MB of memory just to be loaded. So I have to risk running MyISAM.
With that in mind, what kind of data loss am I looking at with MyISAM? If there was a power-outage as someone was saving a blog post, would I just lose that post, or the whole database?
On these low-end-boxes I'm fine with losing some recent comments or a blog post as long as the whole database isn't lost.
MyISAM isn't ACID compliant and therefore lacks durability. It really depends on what costs more: the memory needed to run InnoDB, or downtime. MyISAM is certainly a viable option, but what does your application require from the database layer? Using MyISAM can make life harder due to its limitations, but in certain scenarios MyISAM can be fine. Using only logical mysqldump backups will interrupt your service due to their locking nature. If you're using binary logging, you can back up the binary logs to get incremental backups that can be replayed to aid recovery should something in the MyISAM tables become corrupted.
You might find the following MySQL Performance article of interest:
For me it is not only about table locks. Table locks is only one of MyISAM limitations you need to consider using it in production. Especially if you’re comming from “traditional” databases you’re likely to be shocked by MyISAM behavior (and default MySQL behavior due to this) – it will be corrupted by unproper shutdown, it will fail with partial statement execution if certain errors are discovered etc...
http://www.mysqlperformanceblog.com/2006/06/17/using-myisam-in-production/
The MySQL manual points out the types of events that can corrupt your table and there is an article explaining how to use myisamchk to repair tables. You can even issue a query to fix it.
REPAIR TABLE tbl_name;
However, there is no information about whether some types of crashes might be "unfix-able". That is the type of data loss that I can't allow even if I'm doing backups.
With a server crash your auto increment primary key can get corrupted, so your blog post IDs can jump from 122, 123, 75912371234, 75912371235 (where the server crashed after 123). I've seen it happen and it's not pretty.
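If you want to check for that kind of jump after a crash, something like the following could work. It's just a rough sketch: the function name `find_id_jumps` and the `max_gap` threshold are my own assumptions, and it expects you to have already pulled the ID column into a list.

```python
def find_id_jumps(ids, max_gap=1000):
    """Return (previous, current) pairs where consecutive IDs differ by more than max_gap."""
    ordered = sorted(ids)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


if __name__ == "__main__":
    # The sequence from the example above: the crash happened after 123.
    print(find_id_jumps([122, 123, 75912371234, 75912371235]))
```

Small gaps are normal (deleted rows, failed inserts), which is why the threshold is there; pick a value that fits your own data.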
You could always get another host on the same VLAN that is slaved to your database as a backup, this would reduce the risk considerably. I believe the only other options you have are:
Get more RAM for your server or kill off some services
See if your host has shared database hosting of any kind on the VLAN you can use for a small fee.
Make regular backups and be prepared for the worst.
In my humble opinion, there is no kind of data loss with MyISAM.
The risk of data loss from a power outage is due to the power outage, not the database storage mechanism.

Is innodb required/recommended for Spring Security?

I have a Grails app that uses the Spring-Security-Core plugin, which integrates Spring Security 3 into my app. I believe that Spring/Hibernate will do some transactional operations under the hood. If that is the case, would it be better to use MySQL's InnoDB engine instead of the default MyISAM engine? Or are the operations independent of the underlying database?
Thanks in advance!
There's nothing particularly transactional about how the plugin works. It only reads - the primary database access will be loading a user and the user's assigned roles. You will want to use transactions when updating the user, assigning roles, etc. but that has nothing to do with security, it's just the right thing to do.
As the others said, there's very little reason to use MyISAM except in specialized use cases that are probably better suited for a NoSQL database. InnoDB is very fast and has excellent transaction support.
InnoDB enforces referential integrity; MyISAM does not.
Looks like MyISAM does not support transactions/rollback:
http://dev.mysql.com/doc/refman/5.0/en/ansi-diff-transactions.html
So if a transaction manager is required, better to go with InnoDB.
Actually, I think the InnoDB engine is the wise choice. The main reason is its durability and data integrity support; MyISAM is more "fragile". The only reason to use MyISAM now is very heavy insert activity, and that is not your case (I won't go too deep into that; it is more complex and not connected with the question).

Locking DB w/ Large Reads (Ruby-on-Rails/Heroku)

Currently I have a Web API running on Heroku that is constantly writing information we're collecting from other data sources (currently there's about half a GB of data and it's growing very quickly). We're looking to add a reporting system on top of the current database that we can use to extract useful information out of the DB. The problem is that when we're running reports we're locking the DB, and any other sites communicating with the DB are timing out. Does anyone have any solutions for this type of issue? Amazon RDS seems to have some interesting stuff with database replication, but I don't know if that will solve my problems.
Any advice would be greatly appreciated.
Thanks
Be sure you are running InnoDB tables and not the old ISAM or MyISAM tables - InnoDB has row-level locks, which are much more scalable.
Make sure that you have indexes defined on all your joining/foreign keys... if you do joins without indexes it will grind. Also make sure you have indexes where appropriate for data that you search or sort on (as long as it is diverse data, not boolean or a small number of values)
Replication is another good idea, as you could point the reports at the secondary server in read-only mode, and it will just catch up once it unlocks. Half a GB of data should not really be locking it up yet, so I'd look at the indexes and InnoDB first.
One solution to this is to have a replica of the database, so that your normal traffic goes to the master database, while long-running queries execute on the slave. I'm not sure how much control you get over the database on Heroku though, they may not support replication.
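On the application side, the split could look roughly like this. It's only a sketch of the idea: `choose_target` is a made-up helper, the labels "primary" and "replica" stand in for whatever connections you actually have, and it simply looks at how the statement starts instead of parsing the SQL.

```python
def choose_target(sql):
    """Return "replica" for plain SELECT statements and "primary" for everything else."""
    normalized = sql.lstrip().lower()
    # SELECT ... FOR UPDATE takes locks, so it stays on the primary.
    is_read = normalized.startswith("select") and "for update" not in normalized
    return "replica" if is_read else "primary"


if __name__ == "__main__":
    # Hypothetical statements, for illustration only.
    print(choose_target("SELECT COUNT(*) FROM events"))
    print(choose_target("INSERT INTO events (name) VALUES ('signup')"))
```

Keep in mind that the replica can lag behind the master, so reports may be slightly out of date.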
However, have you considered that the Heroku setup may be the problem here? A 500 MB database shouldn't really have performance issues unless you're performing really complex queries.
If you're happy using MySQL instead of Postgres, Engine Yard supports database replication (although generally it may not be as easy to use as Heroku).

Is MySQL appropriate for a read-heavy database with 3.5m+ rows? If so, which engine?

My experience with databases is with fairly small web applications, but now I'm working with a dataset of voter information for an entire state. There are approximately 3.5m voters and I will need to do quite a bit of reporting on them based on their address, voting history, age, etc. The web application itself will be written with Django, so I have a few choices of database including MySQL and PostgreSQL.
In the past I've almost exclusively used MySQL since it was so easily available. I realize that 3.5m rows in a table isn't really all that much, but it's the largest dataset I've personally worked with, so I'm out of my personal comfort zone. Also, this project isn't a quickie throw-away application though, so I want to make sure I choose the best database for the job and not just the one I'm most comfortable with.
If MySQL is an appropriate tool for the job I would also like to know if it makes sense to use InnoDB or MyISAM. I understand the basic differences between the two, but some sources say to use MyISAM for speed but InnoDB if you want a "real" database, while others say all modern uses of MySQL should use InnoDB.
Thanks!
I've run DBs far bigger than this on MySQL; you should be fine. Just tune your indexes carefully.
InnoDB supports better locking semantics, so if there will be occasional or frequent writes (or if you want better data integrity), I'd suggest starting there, and then benchmarking myisam later if you can't hit your performance targets.
MyISAM only makes sense if you need speed so badly that you're willing to accept many data-integrity downsides to achieve it. You can end up with database corruption on any unclean shutdown, there are no foreign keys and no transactions; it's really limited. And since 3.5 million rows on modern hardware is a trivial data set (unless your rows are huge), you're certainly not at the point where you're forced to optimize for performance instead of reliability because there's no other way to hit your performance goals. That's the only situation where you should have to put up with MyISAM.
As for whether to choose PostgreSQL instead, you won't really see a big performance difference between the two on an app this small. If you're familiar with MySQL already, you could certainly justify just using it again to keep your learning curve down.
I don't like MySQL because there are so many ways you can get bad data into the database, where PostgreSQL is intolerant of that behavior (see Comparing Speed and Reliability); the bad MyISAM behavior is just a subset of the concerns there. Given how fractured the MySQL community is now and the uncertainties about what Oracle is going to do with it, you might want to consider taking a look at PostgreSQL just so you have some more options here in the future. There's a lot less drama around the always free, BSD-licensed PostgreSQL lately, and while smaller, at least the whole development community for it is pushing in the same direction.
Since it's a read-heavy table, I will recommend using MyISAM table type.
If you do not use foreign keys, you can avoid the bugs like this and that.
Backing up or copying the table to another server is as simple as copying the frm, MYI and MYD files.
If you need to compute reports and complex aggregates, be aware that Postgres' query optimizer is rather smart and ingenious, whereas the MySQL "optimizer" is quite simple and dumb.
On a big join the difference can be huge.
The only advantage MySQL has is that it can hit the indexes without hitting the tables.
You should load your dataset into both databases and experiment with the bigger queries you intend to run. It is better to spend a few days experimenting than to be stuck with the wrong choice.