Storage engine for high volume of selects - mysql

Background
I am creating an API utilizing the Bible where I would like to eliminate as much of the database bottleneck as possible. My data is fairly de-normalised to eliminate most unnecessary joins.
Information
Seeing as the text of the Bible doesn't change, I will be doing hardly any INSERT statements. The only time I will insert data is when I add a new translation, which will happen periodically, but I don't care about the speed here.
I will, however, be doing tons of SELECT statements.
I do not need any transactional, ACID-compliant features. My primary concern is speed.
The Question
What would the ideal MySQL storage engine be to fit these conditions?
I am aware of the basics of each engine (my guess would be that MyISAM is ideal), so I am looking for an answer that can be backed up with statistics or further reasoning demonstrating a deep knowledge of some of these engines.
Although using NoSQL may be better than an RDBMS, that is not the information I'm looking for.

The Bible is small in terms of file size and, as you said, doesn't change.
For the best read performance, consider the MEMORY engine. Its limitation is that you can't use TEXT / BLOB columns, but provided your data is split into chunks of at most 65,533 characters you will be fine.
http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html
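The chunking mentioned above could be sketched like this in Python. It is only an illustration: `split_into_chunks` is a made-up helper, and it counts characters, whereas MySQL's VARCHAR limit is a byte limit shared by the whole row, so with a multi-byte character set you would need a smaller `max_chars`.

```python
def split_into_chunks(text, max_chars=65533):
    """Split text into consecutive pieces of at most max_chars characters.

    Hypothetical helper; the default comes from the figure quoted above.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    # Step through the text in strides of max_chars; the last piece may be shorter.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Joining the chunks back together in order gives the original text, so each chunk can be stored as its own row with a sequence number.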
Using MEMORY also means that if power is lost or the server is restarted, all data is lost, so periodically writing the data to disk will be useful, and on restart you will need to populate the table again.
You will need more RAM for this method than for the others, though, as all tables are stored in RAM.
From the question in the comments.
The docs say
To populate a MEMORY table when the MySQL server starts, you can use
the --init-file option. For example, you can put statements such as
INSERT INTO ... SELECT or LOAD DATA INFILE into this file to load the
table from a persistent data source. See Section 5.1.3, “Server
Command Options”, and Section 13.2.6, “LOAD DATA INFILE Syntax”.
http://dev.mysql.com/doc/refman/5.5/en/memory-storage-engine.html#idp82809968
http://dev.mysql.com/doc/refman/5.5/en/server-options.html#option_mysqld_init-file
Again, you will need to keep this file up to date with any changes (you can use mysqldump to maintain it).

InnoDB with good indexes, maybe even good partitions.
InnoDB is designed to perform better with multi-threaded clients (more clients reading at the same time), whereas MyISAM is not built for that.
If the server is correctly configured, InnoDB will really blast away MyISAM on performance.

Related

How do I make a MySQL database run completely in memory?

I noticed that my database server supports the Memory database engine. I want to make a database that I have already created with InnoDB run completely in memory for performance.
How do I do that? I explored PHPMyAdmin, and I can't find a "change engine" functionality.
Assuming you understand the consequences of using the MEMORY engine as mentioned in comments, and here, as well as some others you'll find by searching about (no transaction safety, locking issues, etc) - you can proceed as follows:
MEMORY tables are stored differently than InnoDB, so you'll need to use an export/import strategy. First dump each table separately to a file using SELECT * FROM tablename INTO OUTFILE 'table_filename'. Create the MEMORY database and recreate the tables you'll be using with this syntax: CREATE TABLE tablename (...) ENGINE = MEMORY;. You can then import your data using LOAD DATA INFILE 'table_filename' INTO TABLE tablename for each table.
It is also possible to place the MySQL data directory in a tmpfs, thus speeding up the database write and read calls. It might not be the most efficient way to do this, but sometimes you can't just change the storage engine.
Here is my fstab entry for my MySQL data directory
none /opt/mysql/server-5.6/data tmpfs defaults,size=1000M,uid=999,gid=1000,mode=0700 0 0
You may also want to take a look at the innodb_flush_log_at_trx_commit=2 setting. Maybe this will speed up your MySQL sufficiently.
innodb_flush_log_at_trx_commit changes MySQL's disk flush behaviour. When set to 2, the log is still written at each transaction commit but is only flushed to disk about once per second. With the default setting (1), every commit forces a flush to disk and thus causes more IO load.
Memory Engine is not the solution you're looking for. You lose everything that you went to a database for in the first place (i.e. ACID).
Here are some better alternatives:
Don't use joins - very few large apps do this (e.g. Google, Flickr, Netflix), because it sucks for large sets of joins.
A LEFT [OUTER] JOIN can be faster than an equivalent subquery because
the server might be able to optimize it better—a fact that is not
specific to MySQL Server alone.
-The MySQL Manual
Make sure the columns you're querying against have indexes. Use EXPLAIN to confirm they are being used.
Use and increase your Query_Cache and memory space for your indexes to get them in memory and store frequent lookups.
Denormalize your schema, especially for simple joins (e.g. get fooId from barMap).
The last point is key. I used to love joins, but then had to run joins on a few tables with 100M+ rows. No good. You're better off inserting the data you're joining against into the target table (if it's not too much) and querying against indexed columns; you'll get your result in a few ms.
I hope those help.
If your database is small enough (or if you add enough memory), your database will effectively run in memory, since your data will be cached after the first request.
Changing the database table definitions to use the memory engine is probably more complicated than you need.
If you have enough memory to load the tables into memory with the MEMORY engine, you have enough to tune the innodb settings to cache everything anyway.
"How do I do that? I explored PHPMyAdmin, and I can't find a "change engine" functionality."
In direct response to this part of your question, you can issue an ALTER TABLE tbl engine=InnoDB; and it'll recreate the table in the proper engine.
In place of the Memory storage engine, one can consider MySQL Cluster. It is said to give similar performance but to support disk-backed operation for durability. I've not tried it, but it looks promising (and has been in development for a number of years).
You can find the official MySQL Cluster documentation here.
Additional thoughts :
Ramdisk - setting the temp drive MySQL uses as a RAM disk, very easy to set up.
memcache - memcache server is easy to set up, use it to store the results of your queries for X amount of time.
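To make the caching idea concrete, here is a minimal Python sketch of storing query results for a fixed time. It is an in-process stand-in, not a memcached client: the `QueryCache` class, its method names and the injectable `clock` are all assumptions made for illustration.

```python
import time


class QueryCache:
    """In-process stand-in for a memcached-style cache with expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so expiry can be tested without sleeping
        self._store = {}     # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() on a miss or after expiry."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = compute()    # e.g. a function that runs the real query
        self._store[key] = (now + self.ttl, value)
        return value
```

The key would typically be the query text plus its parameters, and `compute` the function that actually hits the database. With a real memcached server the dictionary is replaced by network calls, but the get-then-fill pattern stays the same.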

Best Table Engine for massively updated MySQL table. MyISAM or HEAP?

I am creating an application which will store a (semi) real-time feed of a few different scales around a certain location. The weights of each scale will be put in a table with only as many rows as scales. The scale app feeds the MySQL database a new weight every second, which a PHP web app reads every 3 seconds. That doesn't seem like enough traffic to hit the hard drive much, and the difference may well be negligible, but I'm wondering whether it would be more efficient, or make more sense, to use a Memory/HEAP table instead of a normal MyISAM table.
With anything from 100's to 1000's of concurrent read/write requests (think typical OLTP usage), InnoDB will outperform MyISAM hands down.
It's not about other people's observations, it's not about transactional/acid support, it's about the architecture of innodb which is far superior to that of the legacy myisam engine.
For example, innodb supports clustered primary key indexes http://dev.mysql.com/doc/refman/5.0/en/innodb-index-types.html.
Additionally, innodb has row level locking which is far more performant under concurrent load than myisam table level locking.
I could keep going, but someone's already provided a really good summary of why InnoDB is a better choice for OLTP: http://tag1consulting.com/MySQL_Engines_MyISAM_vs_InnoDB
Well, if you're expecting a large amount of data, I think you almost have to go MyISAM. You'll likely run out of memory if you store it all in a memory table. Not to mention that you'll lose all of your data upon power loss with a HEAP engine (Keep in mind, you may want that depending on your use case)...
I know that this question is getting dated and you've probably made a very good solution by now but I just wanted to point out to anyone who may be reading this that perhaps a relational database isn't the best way to solve this problem. To me this clearly looks like a case where a flat file database is the ideal solution. You could have saved yourself a ton of overhead by just writing these values out to a binary file and then use simple mathematical operations to select rows and fields.
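A rough Python sketch of that flat-file idea follows. The record layout (one 8-byte float weight plus one 4-byte timestamp per scale) and all function names are assumptions chosen for illustration, not something given in the question.

```python
import os
import struct

# Hypothetical record layout: little-endian double (weight) followed by an
# unsigned 32-bit Unix timestamp, with no padding between fields.
RECORD = struct.Struct("<dI")


def create_file(path, scale_count):
    """Pre-allocate a zero-filled file holding one record per scale."""
    with open(path, "wb") as f:
        f.write(b"\x00" * (RECORD.size * scale_count))


def write_weight(path, scale_index, weight, timestamp):
    """Overwrite the record of one scale in place."""
    with open(path, "r+b") as f:
        f.seek(scale_index * RECORD.size)   # row offset = index * record size
        f.write(RECORD.pack(weight, timestamp))


def read_weight(path, scale_index):
    """Return (weight, timestamp) for one scale."""
    with open(path, "rb") as f:
        f.seek(scale_index * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))


def record_count(path):
    """Number of records in the file, derived from its size."""
    return os.path.getsize(path) // RECORD.size
```

The "simple mathematical operation" is the multiplication in `seek`: because every record has the same size, row N always starts at byte N times the record size. Note that this sketch does no locking, so a reader could in principle see a half-written record if it reads while the writer is mid-update.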

Converting MyISAM to InnoDB. Beneficial? Consequences?

We're running a social networking site that logs every member's action (including visiting other member's pages); this involves a lot of writes to the db. These actions are stored in a MyISAM table and since something is starting to tax the CPU, my first thought was that it's the table locking of MyISAM that is causing this stress on the CPU.
There are only reads and writes, no updates to this table. I think the balance between reads and writes is about 50/50 for this table, would InnoDB therefore be a better option?
If I want to change the table to InnoDB and we don't use foreign key constraints, transactions or fulltext indexes - do I need to worry about anything?
Notwithstanding any benefits / drawbacks of its use, which are discussed in other threads ( MyISAM versus InnoDB ), migration is a nontrivial process.
Consider
Functionally testing all components which talk to the database if possible - different engines have different semantics
Running as much performance testing as you can - some things may improve, others may be much worse. A well-known example is SELECT COUNT(*) on a large table.
Checking that all your code will handle deadlocks gracefully - you can get them without explicit use of transactions
Estimating how much space you'll use after converting - test this in a non-production environment.
You will doubtless need to change things in a large software platform; this is ok, but seeing as you (hopefully) have a lot of auto-test coverage, change should be acceptable.
PS: If "Something is starting to tax the CPU", then you should a) Find out what, in a non-production environment, b) Try various options to reduce it, in a non-production environment. You should not blindly start doing major things like changing database engines when you haven't fully analysed the problem.
All performance testing should be done in a non-production environment, with production-like data and on production-grade hardware. Otherwise it is difficult to interpret results correctly.
With regards to other potential migration problems:
1) Space - InnoDB tables often require more disk space, though the Barracuda file format in newer versions of InnoDB has narrowed the difference. You can get a sense for this by converting a recent backup of the tables and comparing the size. Use "show table status" to compare the data length.
2) Full text search - only on MyISAM (InnoDB gained FULLTEXT indexes in MySQL 5.6)
3) GIS/Spatial datatypes - only on MyISAM (InnoDB gained SPATIAL indexes in MySQL 5.7)
On performance, as the other answers and the referenced answer indicate, it depends on your workload. MyISAM is much faster for full table scans. InnoDB tends to be much faster for highly concurrent access. InnoDB can also be much faster if your lookups are based on the primary key.
Another performance issue is that MyISAM can always keep a row count, since it only does table level locking. So, if you're frequently trying to get the row count for a very large table, it may be much slower with InnoDB. Search the Internet if you need a workaround for this, as I've seen several proposed.
Depending on the size of the table(s), you may also need to update your MySQL config file. At the very least, you may want to shift bytes from key_buffer to innodb_buffer_pool_size. You won't get a fair comparison if you leave the database as being optimized for MyISAM. Read up on all the innodb_* configuration properties.
I think it's quite possible that switching to InnoDB would improve performance, but in my experience, you can't really be sure until you try it. If I were you, I would set up a test environment on the same server, convert to InnoDB and run a benchmark.
From my experience, MyISAM tables are only useful for text indexing where you need good performance with searches on big text, but you still don't need a full fledged search engine like Solr or ElasticSearch.
If you want to switch to InnoDB but want to keep indexing your text in a MyISAM table, I suggest you take a look at this: http://blog.lavoie.sl/2013/05/converting-myisam-to-innodb-keeping-fulltext.html
Also: InnoDB supports live atomic backups using innobackupex from Percona. This is a godsend when dealing with production servers.

Will a MySQL table with 20,000,000 records be fast with concurrent access?

I ran a lookup test against an indexed MySQL table containing 20,000,000 records, and according to my results, it takes 0.004 seconds to retrieve a record given an id--even when joining against another table containing 4,000 records. This was on a 3GHz dual-core machine, with only one user (me) accessing the database. Writes were also fast, as this table took under ten minutes to create all 20,000,000 records.
Assuming my test was accurate, can I expect performance to be as snappy on a production server, with, say, 200 users concurrently reading from and writing to this table?
I assume InnoDB would be best?
That depends on the storage engine you're going to use and what's the read/write ratio.
InnoDB will be better if there are a lot of writes. If it's mostly reads with only an occasional write, MyISAM might be faster. MyISAM uses table-level locking, so it locks up the whole table whenever you need to update. InnoDB uses row-level locking, so you can have concurrent updates on different rows.
InnoDB is definitely safer, so I'd stick with it anyhow.
BTW, remember that right now RAM is very cheap, so buy a lot.
Depends on any number of factors:
Server hardware (Especially RAM)
Server configuration
Data size
Number of indexes and index size
Storage engine
Writer/reader ratio
I wouldn't expect it to scale that well. More importantly, this kind of thing is too important to speculate about. Benchmark it and see for yourself.
Regarding storage engine, I wouldn't dare to use anything but InnoDB for a table of that size that is both read and written to. If you run any write query that isn't a primitive insert or single row update you'll end up locking the table using MyISAM, which yields terrible performance as a result.
There's no reason that MySql couldn't handle that kind of load without any significant issues. There are a number of other variables involved though (otherwise, it's a 'how long is a piece of string' question). Personally, I've had a number of tables in various databases that are well beyond that range.
How large is each record (on average)
How much RAM does the database server have - and how much is allocated to the various configurations of Mysql/InnoDB.
A default configuration may only allow for a default 8MB buffer between disk and client (which might work fine for a single user) - but trying to fit a 6GB+ database through that is doomed to failure. That problem was real btw - and was causing several crashes a day of a database/website till I was brought in to trouble-shoot it.
If you are likely to do a great deal more with that database, I'd recommend getting someone with a little more experience, or at least doing what you can to be able to give it some optimisations. Reading 'High Performance MySQL, 2nd Edition' is a good start, as is looking at some tools like Maatkit.
As long as your schema design and DAL are constructed well enough, you understand query optimization inside out, can adjust all the server configuration settings at a professional level, and have "enough" hardware properly configured, yes (except for sufficiently pathological cases).
Same answer for both engines.
You should probably perform a load test to verify, but as long as the index was created properly (meaning indexes are optimized to your query statements), the SELECT queries should perform at an acceptable speed (the INSERTS and/or UPDATES may be more of a speed issue though depending on how many indexes you have, and how large the indexes get).

MySQL transaction support with mixed tables

It seems like I will be needing transactions with MySQL, and I have no idea how I should manage transactions in MySQL with mixed InnoDB/MyISAM tables. It all seems like a huge mess.
You might ask why I would ever want to mix the tables together... the answer is PERFORMANCE. As many developers have noticed, InnoDB tables generally have bad performance, but in return give a higher isolation level etc...
Does anyone have any advice regarding this issue?
I think you are overrating the performance difference between MyISAM and InnoDB. MyISAM is faster in data warehousing situations (such as full table scan reporting, etc..), but InnoDB can actually be faster in many cases with normal OLTP queries.
InnoDB is harder to tune since it has more knobs, but a properly tuned InnoDB system can often have higher throughput than MyISAM due to better locking and better I/O patterns.
Given that you can't have transactions in MyISAM tables, I am not sure what the actual problem is. Any data you need transactions for must be in an InnoDB table and you manage the transactions using whatever access library you are using or with manual SQL commands.
There are definite performance benefits of using exactly one engine.
A server tuned for one engine won't be tuned for the other - both require that you allocate a substantial amount of RAM to its exclusive use - therefore, you can't give them both an optimal amount.
Say you have 8G of ram on your (obviously 64-bit, but still relatively small) database server, you might want to assign about 3/4 of it to your innodb page cache. Alternatively, if you're using MyISAM, you may want about half of it to be your key_buffer. You can't do both.
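The arithmetic behind that rule of thumb can be written out as a small Python sketch. The function name is invented for illustration, and the fractions are simply the ones quoted above, not tuning advice; `innodb_buffer_pool_size` and `key_buffer_size` are the MySQL variables the two allocations would go into.

```python
GIB = 1024 ** 3


def suggest_memory_settings(total_ram_bytes, engine):
    """Return the main cache setting for one engine, using the fractions above."""
    if engine == "innodb":
        # About 3/4 of RAM for the InnoDB buffer pool (data and index pages).
        return {"innodb_buffer_pool_size": total_ram_bytes * 3 // 4}
    if engine == "myisam":
        # About half of RAM for the MyISAM key buffer (index blocks only).
        return {"key_buffer_size": total_ram_bytes // 2}
    raise ValueError(f"unknown engine: {engine!r}")
```

For an 8 GiB server this gives 6 GiB for InnoDB or 4 GiB for MyISAM. Asking for both would need 10 GiB, which is the point being made: the two allocations cannot coexist on that machine.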
Pick an engine and use it exclusively. There are ways of getting around performance problems - most of them aren't easy though (i.e. they require redesigning your data structure or your application).
The short answer is that there is no transaction support in MyISAM. If you start a transaction, add or modify data in some InnoDB tables, add or modify data in a MyISAM table, and then have to roll back, your MyISAM change cannot be undone. To support mixed engines like that, your application has to know that changes to whatever data is stored in MyISAM happen "outside" of the transaction.
If you need transactions for some processes, then isolate the data that must be transactionable and put all that data in InnoDB.
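The rollback behaviour described above can be mimicked with a toy Python model. This is not MySQL and uses no database driver; `ToyConnection` and its two dictionaries are invented purely to show which writes survive a rollback.

```python
class ToyConnection:
    """Toy model of one connection writing to both kinds of table.

    Only mimics the rollback behaviour described above; not real MySQL.
    """

    def __init__(self):
        self.transactional = {}       # stands in for InnoDB tables
        self.non_transactional = {}   # stands in for MyISAM tables
        self._snapshot = None

    def begin(self):
        # Only the transactional data is remembered for a possible rollback.
        self._snapshot = dict(self.transactional)

    def commit(self):
        self._snapshot = None

    def rollback(self):
        if self._snapshot is None:
            raise RuntimeError("no transaction in progress")
        # Transactional data is restored; non-transactional writes stay as they are.
        self.transactional = self._snapshot
        self._snapshot = None
```

If a transaction writes an order row to `transactional` and an audit row to `non_transactional` and then rolls back, the order disappears while the audit row remains, leaving the two inconsistent. That is the situation the application has to be written to expect when engines are mixed.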