Will making a table InnoDB and partitioning it improve performance in MySQL?

I have a MyISAM table with a composite unique key on 2 columns and 90 million rows. We are now facing memory and load issues, and after reading around the web I am planning to partition the table and convert it to InnoDB for better performance. But I have the following concerns:
Converting to InnoDB will involve a lot of downtime. Is it possible to minimize the downtime?
Most of the SELECT queries are on one particular column of the key, on which I am planning to use HASH partitioning. How much will that affect queries on the other key column?
Will these changes improve performance to the extent claimed in theory? Is there a better solution for such cases? Any suggestion or experience would be helpful.
My queries are simple, like:
SELECT * FROM Table WHERE Col1 = "Value"
SELECT * FROM Table WHERE Col1 = "Value" AND Col2 IN (V1, V2, V3)
Inserts are very frequent.

InnoDB will probably help some. Conversion to InnoDB comes with some issues, as I state in My conversion blog.
Partitioning, per se, buys no performance gain. My partitioning blog lists 4 cases where you can, with design changes, gain performance.
Regardless of the Engine, your two queries will both benefit from
INDEX(col1, col2)
No form of partitioning will help. HASH partitioning is especially useless.
Conversion to InnoDB will take a lot of downtime, unless pt-online-schema-change will work for your case. Research it.
Also read my answers and comments on
Can i set up Mysql to auto-partition?
for more specifics.
It may be that adding that index is the main performance gain. But you have to do a lengthy ALTER to get it. MyISAM does not have ALGORITHM=INPLACE.
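For concreteness, a sketch of what adding that index could look like, using placeholder table and column names taken from the question; pt-online-schema-change can apply the same change with far less blocking (this is only an illustration, not the answerer's exact commands):
-- Hypothetical names; adjust to your schema.
ALTER TABLE MyTable ADD INDEX idx_col1_col2 (Col1, Col2);
-- Roughly the same change applied online via Percona Toolkit:
-- pt-online-schema-change --alter "ADD INDEX idx_col1_col2 (Col1, Col2)" D=mydb,t=MyTable --execute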

InnoDB (speaking purely about performance here) makes sense mainly when there are a lot of inserts and updates to your table, because it uses row-level locking rather than locking the whole table.
If most of the queries on your table are SELECTs, then MyISAM will be faster.
Advice: in my.cnf, set key_buffer_size to about 25% of your free RAM.
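For example, on a machine with roughly 16 GB of free RAM that advice would translate to something like the following in my.cnf (the number is purely illustrative; size it to your own server):
[mysqld]
# hypothetical value: about 25% of free RAM on a 16 GB machine
key_buffer_size = 4G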

If inserts on your database are very frequent, you will likely gain performance by switching to InnoDB, which won't lock entire tables for inserts, allowing other clients to select data concurrently.
Regarding question #1, if you are worried about downtime, I'd suggest you find a parallel dump/load solution for migrating your data to InnoDB. If you simply run an ALTER statement on your tables, that is a single-threaded operation and will be much slower.
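To make the trade-off concrete, here is a rough sketch with hypothetical table and database names; the ALTER is the simple-but-blocking route, while the dump/reload route can be split up and parallelized:
-- Simple but single-threaded: rebuilds the whole table in one statement.
ALTER TABLE big_table ENGINE=InnoDB;
-- Dump-and-reload sketch (data only; recreate the table as InnoDB first):
-- mysqldump --no-create-info mydb big_table > big_table_data.sql
-- mysql mydb < big_table_innodb_schema.sql
-- mysql mydb < big_table_data.sql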
Regarding #2, you'd have to post a schema along with your partitioning strategy and the queries you are worried about.

MyISAM performance

In MySQL (5.7 onwards), for change tracking of a table, this approach is very simple to implement.
But it needs the versions table to be MyISAM, which does table-level locking.
Would this approach work well for production systems where multiple inserts/updates are happening every second?
Does anyone have any real production-system experience with this approach?
Each table in the DB (InnoDB) has a versions table (MyISAM).
My system has the following load:
* Approx 500 reads/sec on each table due to various joins.
* And 50 writes/sec to various tables which have triggers to the versions table.
Would the versions table (MyISAM) become a bottleneck for performance?
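To make the setup concrete, here is a minimal sketch of the kind of versions table and trigger the question describes; all names and columns are hypothetical:
-- Hypothetical main table (InnoDB) and its versions table (MyISAM).
CREATE TABLE customer (
  id   INT PRIMARY KEY,
  name VARCHAR(100)
) ENGINE=InnoDB;
CREATE TABLE customer_versions (
  version_id INT AUTO_INCREMENT PRIMARY KEY,
  id         INT,
  name       VARCHAR(100),
  changed_at DATETIME
) ENGINE=MyISAM;
-- Trigger that copies every change into the versions table.
CREATE TRIGGER customer_after_update
AFTER UPDATE ON customer
FOR EACH ROW
  INSERT INTO customer_versions (id, name, changed_at)
  VALUES (NEW.id, NEW.name, NOW());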
When a MyISAM table has AUTO_INCREMENT (and a certain mode set), and no other UNIQUE keys, it will append to the table "without a lock". So, I don't think the 50 writes/sec will be an issue.
MariaDB will probably continue to include MyISAM long after Oracle jettisons it. Oracle's intent is to make InnoDB so good that there will be no need for MyISAM, and they are likely to succeed.
Secondary indexes on the versions tables may become a bottleneck. In this area, I think InnoDB's "change buffer" does a better job than MyISAM's "do it now".

Will switching to the MyISAM engine help to improve the speed of read operations?

I currently have a few tables with the InnoDB engine. 10-20 connections are constantly inserting data into those tables. I use a MySQL RDS instance on AWS. The metrics show about 300 write IOPS (counts/second). However, INSERT operations lock the table, and if someone wants to run a query like SELECT COUNT(*) FROM table; it can literally take a few hours the first time, before MySQL caches the result.
I'm not a DBA and my knowledge of databases is very limited. So the question is: if I switch to the MyISAM engine, will it improve the time of READ operations?
SELECT COUNT(*) without WHERE is a bad query for InnoDB, as it does not cache the row count the way MyISAM does. So if you have an issue with this particular query, you have to cache the count somewhere - in a stats table, for example.
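A minimal sketch of such a stats table, with made-up names; the counter can be maintained by the application, by triggers, or refreshed periodically:
-- Hypothetical cached-count table.
CREATE TABLE table_row_counts (
  table_name VARCHAR(64) PRIMARY KEY,
  row_count  BIGINT UNSIGNED NOT NULL
) ENGINE=InnoDB;
-- Bump the counter whenever the application inserts into `events`.
UPDATE table_row_counts SET row_count = row_count + 1 WHERE table_name = 'events';
-- Cheap replacement for SELECT COUNT(*) FROM events;
SELECT row_count FROM table_row_counts WHERE table_name = 'events';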
After you remove this specific type of query, you can talk about InnoDB vs MyISAM read performance. Generally, writes do not block reads in InnoDB - it uses MVCC for this. InnoDB performance, however, depends heavily on how much RAM you have set aside for the buffer pool.
InnoDB and MyISAM are very different in how they store data. You can always optimize for one of them, and knowing the differences helps when designing your application. Generally you can get read performance from InnoDB tables as good as from MyISAM - you just can't rely on COUNT(*) without a WHERE clause, and you should always have a suitable index for your WHERE clauses, since a table scan in InnoDB will be slower than in MyISAM.
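As an illustration of the buffer-pool point, on a dedicated database server most of the RAM would typically be given to InnoDB, e.g. (the figure is only a placeholder; on RDS this is set through a parameter group rather than my.cnf):
[mysqld]
# hypothetical sizing for a dedicated 16 GB server
innodb_buffer_pool_size = 12G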
I think you should stick with your current setup. InnoDB is not supposed to lock the table when inserting rows, since it uses MVCC. On the other hand, MyISAM locks the entire table when new rows are inserted.
So, if you have many writes, you should stick with InnoDB.
InnoDB is a better overall engine in general. There are some benchmarks out there that put read operations in MyISAM a little ahead of InnoDB. However, if your site is big enough to notice this performance difference, you should be on InnoDB anyway because of all the other efficiencies. InnoDB alone wins because of row-level locking instead of MyISAM's table-level locking when backing up your database.

MySQL: InnoDB or MyISAM engine ::: which is better for a lot of SELECTs?

I have a huge file (~26 MB) with around 200 columns & 30,000 records. I want to import it into a database (InnoDB engine). I won't ever be updating or deleting records, although I will be querying a lot of records from the table with high complexity in the WHERE clause. Which table engine should I prefer for faster query response? Will it really make a lot of difference?
PS: All my other tables use InnoDB.
Also, how can I avoid manually creating a table with 200 columns and specifying the datatype for each of them? Most of the columns are float and a few are varchar and date.
Usually the answer to "which is faster, ISAM or innodb" would be ISAM
But for best performance with a table which has very few updates you might want to have a look at Infobright's columnar db (which is integrated into mysql).
However with only 30k rows you'll not see a significant difference between innodb, isam and infobright.
OTOH, you really should have a long hard look at whether you really need 200 columns in a single table. I suspect that's not the case - and the schema is far more important in determining performance than the storage engine.
When dealing with large amounts of data, InnoDB fares better than MyISAM:
http://www.mysqlperformanceblog.com/2007/01/08/innodb-vs-myisam-vs-falcon-benchmarks-part-1/
and
http://www.cftopper.com/index.cfm?blogpostid=84
James Day, a MySQL Support Engineer and Wikipedia engineer, recommends that people use InnoDB all the time unless for some reason it becomes apparent that you need MyISAM:
"I'd go with InnoDB until it's been proved that it's unsuitable. The first reason is reliability. Get a crash with MyISAM and you have the unreliable and slow, related to table size, table repair process. Same thing with InnoDB and you instead get the fixed time, fast and reliable log apply/rollback process. As the data set gets bigger, this matters more and more, as it does if you want to do things like sleep instead of being woken up in the middle of the night to fix a crashed table.
For reliability and performance, we use InnoDB for almost everything at Wikipedia - we just can't afford the downtime implied by MyISAM use and check table for 400GB of data when we get a crash."

Does this case call for InnoDB or MyISAM?

I'm doing a search on a table (a few inner joins) which takes at most 5 seconds to run (7.5 million rows). It's a MyISAM table and I'm not using full-text indexing on it, as I found no difference in speed between MATCH ... AGAINST and a normal LIKE statement in this case, from what I can see.
I'm now "suffering" from locked tables and queries running for several minutes before they complete because of it.
Would it benefit me at all to try and switch the engine to InnoDB? Or does that only help if I need to insert or update rows... not just select them? This whole table-locking thing is busy grinding my balls...
InnoDB supports row-level locking instead of table-level locking... so that should alleviate your problem (although I'm not sure it will remove it entirely).
Your best bet would be to use a dedicated search system (like Sphinx, Lucene, or Solr)
The difference between row-level and table-level locking is only important for insert and update queries. If you mostly do selects (so the inserts/updates do not happen often enough to lock the table), the difference will not be all that much (even though in recent benchmarks InnoDB seems to be outperforming MyISAM).
Another thing you could think about is reorganising your data structure, perhaps adding a lookup table with 'tags' or 'keywords', or implementing a more efficient full-text engine as suggested by webdestroya.
Last but not least, I'm also surprised that you got similar results with FULL TEXT vs LIKE. This can happen if the fields you're searching are not really wide, in which case maybe a standard B-TREE index with an = search would be enough?
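For reference, a small sketch of what the full-text route looks like, with hypothetical table and column names (in the MySQL versions of that era, FULLTEXT indexes were a MyISAM feature):
-- Hypothetical table and column.
ALTER TABLE articles ADD FULLTEXT INDEX ft_body (body);
-- LIKE with a leading wildcard cannot use an ordinary B-TREE index:
SELECT id FROM articles WHERE body LIKE '%replication%';
-- MATCH ... AGAINST can use the FULLTEXT index:
SELECT id FROM articles WHERE MATCH(body) AGAINST('replication');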

Will partitions improve MySQL INSERT speed?

I'm doing a lot of INSERTs via LOAD DATA INFILE on MySQL 5.0. After many inserts, say a few hundred million rows (InnoDB, PK + a non-unique index, 64-bit Linux, 4GB RAM, RAID 1), the inserts slow down considerably and appear IO-bound. Are partitions in MySQL 5.1 likely to improve performance if the data flows into separate partitions?
The previous answer is erroneous in its assumption that this will decrease performance. Quite the contrary.
Here's a lengthy, but informative article and the why and how to do partitioning in MySQL:
http://dev.mysql.com/tech-resources/articles/partitioning.html
Partitioning is typically used, as was mentioned, to group like data together. That way, when you decide to archive off or flat-out destroy a partition, your tables do not become fragmented. This, however, does not hurt performance; it can actually increase it. It is not just deletions that cause fragmentation: updates and inserts can also do that. By partitioning the data, you are instructing the RDBMS about the criteria (indexes) by which the data should be manipulated and queried.
Edit: SiLent SoNG is correct. DISABLE / ENABLE KEYS only works for MyISAM, not InnoDB. I never knew that, but I went and read the docs. http://dev.mysql.com/doc/refman/5.1/en/alter-table.html#id1101502.
Updating the indexes may be what's slowing it down. You can disable indexes while you're doing your load and turn them back on afterwards, so they are generated once for the whole table.
ALTER TABLE foo DISABLE KEYS;
LOAD DATA INFILE ... ;
ALTER TABLE foo ENABLE KEYS;
This will cause the indexes to all be updated in one go instead of per-row. This also leads to more balanced BTREE indexes.
No improvement on MySQL 5.6
"MySQL can apply partition pruning to SELECT, DELETE, and UPDATE statements. INSERT statements currently cannot be pruned."
http://dev.mysql.com/doc/refman/5.6/en/partitioning-pruning.html
If the columns the INSERT checks (primary keys, for instance) are indexed, then this will only decrease the speed: MySQL will additionally have to decide which partition each row goes to.
All queries are only improved by adding indexes. Partitioning is useful when you have tons of very old data (e.g. year<2000) which is rarely used: then it'll be nice to create a partition for that data.
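For instance, a sketch of partitioning off old, rarely-touched rows by year; the names are hypothetical, and note that in MySQL the partitioning column must be part of every unique key, including the primary key:
-- Hypothetical table where historical rows live in their own partition.
CREATE TABLE events (
  id         INT NOT NULL,
  created_at DATE NOT NULL,
  payload    VARCHAR(255),
  PRIMARY KEY (id, created_at)
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(created_at)) (
  PARTITION p_old VALUES LESS THAN (2000),
  PARTITION p_cur VALUES LESS THAN MAXVALUE
);
-- Dropping the old data later is a cheap metadata operation:
-- ALTER TABLE events DROP PARTITION p_old;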
Cheers!