How is mysql different from oracle performance-wise? - mysql

I've started a new job where I'm working with MySQL instead of Oracle. What are some things that I might have to "unlearn" from using Oracle? What are some things that might make Oracle SQL go faster, but might be bad under MySQL (and vice-versa)?
In particular, is it better for MySQL code to commit less frequently (as is the case for Oracle)?

Storage Engines are the key. MyISAM and InnoDB will perform very differently, for example (not least because MyISAM is non-transactional and can therefore skip a lot of consistency logic).
So what is faster will often depend on the storage engine being used.

As you've started a new job, the application you're maintaining likely has a different architecture. In optimisation, architecture is much more important than micro-optimisations, so that is likely to make more difference.
Sure of course, the techniques for optimisation will be different, but that would be the case with any product.
However, MySQL (InnoDB) and Oracle operate in a fundamentally similar way - small transactions are generally going to perform less well than the same number of operations in large transactions if you have durability enabled, as each transaction requires a fdatasync (possibly slightly less if group commit is enabled and works).

Related

Optimizing mysql/postgresql for create and update

As far as I know most of the RDBMS packages are built keeping in mind 99% of the queries will be select queries. However, I am in a situation where we have at least 50 % of the queries as create/update queries. Since we also need persistence, we can not go for NoSQL solutions. Essentially, whenever there is an update it should be immediately stored permanently. So, I was wondering if the performance with MySQL will be hampered because of that. Our current MySQL engine is InnoDb. Is any other MySQL engine more preferable? I plan to use Amazon RDS so my focus is MySQL; but just out of curiousity I would like to know if postgresql can help in this.
N.B. - Just to give an idea of the scale, we are talking about create/update queries on tables with at least a million entries within a couple of months of going into production.
If your working set fits in memory, your inserts and updates will tend to be quite fast. Partitioning can help here, as others have mentioned. Most NoSQL solutions have persistence so you shouldn't exclude them outright. Cassandra has a storage model specifically tuned for writes and might be worth a look.
If you go with MySQL, there are tuning parameters to trade some durability for insert performance, and various other hardware and software settings:
https://serverfault.com/questions/118504/how-to-improve-mysql-insert-and-update-performance
You can probably expect around 100 inserts / sec using full durability on standard disks. If that's not going to cut it, setup benchmarks and start tweaking parameters or get ready for some re-architecting. Benchmark testing is important using realistic amounts of data in your tables. It's much better to find a problem now than to discover it 6 months down the road when your tables start to fill in. Synthetic data is fine, just make sure the indexed fields are distributed similarly.
Having as few as possible indexes increases speed of inserts and updates, because all indexes have to get updated when inserting/updating rows to the tables.
But of course, keep in mind that some indexes might increase your updates as weel.

Mixing MyISAM and InnoDB engines for database design

One of my friends, who is a DBA, commented that mixing MyISAM and InnoDB is fairly common among the DBA community, while designing schema in MySQL.
My question is, if this is true, then how good it is? Does it have any effect in the maintainability, scalability etc?
There are generally very few good reasons remaining to use MyISAM. Newer versions of MySQL (5.5+) have extended InnoDB to support all the features that were previously only available on MyISAM (such as fulltext and geospatial indexing), and InnoDB performance is usually considerably better than MyISAM when configured properly.
Unless you are working with an older version of MySQL, or if you have a very good reason for doing so, I'd recommend just using InnoDB throughout any new database design.
IMHO, this is one of the terrible things about MySQL: that it makes you choose between speed and full-text indices (MyISAM) and referential integrity and transactions (InnoDB). If you can, I highly recommend switching to PostgreSQL: in addition to a number of other advantages, you get speed, full-text indices, transactions, and referential integrity in one storage engine. (I no longer use MySQL for new projects at all.)
If you must stick with MySQL, I recommend using InnoDB on all tables unless you have a particular reason not to.
On a practical note.
I have done it a number of times. Notably on one system where queries were locking a whole table as myisam and we converted a number of the critical tables to innodb so they locked at the row level. This removed some bottlenecks from the process and i was with the company for another 18 months after this change with no problems from that solution. The application being supported made fairly intensive use of the database as well so any inadequacies tended to come to light quite quickly.

Fast at selects/joining MySQL storage engines?

I have some very large databases (some up to 150M rows) I'm working with & after initially inserting the data there isn't much INSERT's going on; just a lot of SELECT's & usage of JOINS.
I've been messing around with InfoBright a lot (the community version) & whilst I believe it is a good engine, I personally have been having some problems with it getting it to run like it should (fast).
So I was wondering if anyone else could recommend any other fast free storage engine for MySQL?
I'm just now checking out tokudb; is there anything else out there to check out as well?
You should look at InfiniDB too. http://infinidb.org/ (one of the fastest)
There are a lot of considerations you need to make before benchmarking any engine. Hardware stuff like multicore processors, memory, configuration. Design stuff related to your schema etc etc. and how all this impacts the engine performance.
Do check this blog out for how they do benchmarking of engines (it names other engine types) - http://www.mysqlperformanceblog.com/2010/01/07/star-schema-bechmark-infobright-infinidb-and-luciddb/
Note that this comparison is for a star schema design. If a columnar db engine doesn't suit your requirements, you can look into XtraDB , which is an extended version of InnoDB (not the fastest, but is ACID compliant).
ps - Always track the properties (important to you) of each engine - like referential integrity checks, ACID compliance etc. Sometimes these limitations can be bigger deal breakers as compared to a 10% increase in query performance
Have you looked at Sphinx at all? While it is a search engine, it also supports query-less searches, which is similar to standard SELECT queries with indexes. I found it to be a huge help when dealing with large datasets. It's very fast, and is used heavily in high-traffic forums who are up in the millions (or hundreds of millions) of posts arena.
There is also a plugin for MySQL called SphinxSE which allows it to act as a MySQL storage engine which makes integration very easy to set up. You build your indexes by supplying the indexer program a query, and then once it's all set up, you can query it as if it was a normal table.
http://sphinxsearch.com/docs/2.0.1/sphinxse-overview.html (note, I haven't used it much since pre 1.0)
Besides taking into consideration which DBMS you use, you should also focus on optimizing your tables, indices and queries.
Whenever you have multiple joins, join first on the most selective relation and then on the less selective.
Analyze your query execution plans.
Create indices on columns that are hit often in your QEPs.
Brett -
When using Infobright, you get the best performance gains by:
1) Utilizing the Knowledge Grid as much as possible
2) reducing joins
3) creating 'lookup'
Since the Knowledge-Grid is in-memory, you can kill off a lot of query time just by adding additional filters. Also, consider using a nested select instead of a join. By doing so, you can use an already-created knowledge node (instead of generating a pack-to-pack node on the fly).
If you have some queries that you think should be faster, post them, and I can help with potentially modifying the query to make it run faster.
Cheers,
Jeff

Converting MyISAM to InnoDB. Beneficial? Consequences?

We're running a social networking site that logs every member's action (including visiting other member's pages); this involves a lot of writes to the db. These actions are stored in a MyISAM table and since something is starting to tax the CPU, my first thought was that it's the table locking of MyISAM that is causing this stress on the CPU.
There are only reads and writes, no updates to this table. I think the balance between reads and writes is about 50/50 for this table, would InnoDB therefore be a better option?
If I want to change the table to InnoDB and we don't use foreign key constraints, transactions or fulltext indexes - do I need to worry about anything?
Notwithstanding any benefits / drawbacks of its use, which are discussed in other threads ( MyISAM versus InnoDB ), migration is a nontrivial process.
Consider
Functionally testing all components which talk to the database if possible - difference engines have different semantics
Running as much performance testing as you can - some things may improve, others may be much worse. A well-known example is SELECT COUNT(*) on a large table.
Checking that all your code will handle deadlocks gracefully - you can get them without explicit use of transactions
Estimate how much space usage you'll get by converting - test this in a non-production environment.
You will doubtless need to change things in a large software platform; this is ok, but seeing as you (hopefully) have a lot of auto-test coverage, change should be acceptable.
PS: If "Something is starting to tax the CPU", then you should a) Find out what, in a non-production environment, b) Try various options to reduce it, in a non-production environment. You should not blindly start doing major things like changing database engines when you haven't fully analysed the problem.
All performance testing should be done in a non-production environment, with production-like data and on production-grade hardware. Otherwise it is difficult to interpret results correctly.
With regards to other potential migration problems:
1) Space - InnoDB tables often require more disk space, though the Barracuda file format for new versions of InnoDB have narrowed the difference. You can get a sense for this by converting a recent backup of the tables and comparing the size. Use "show table status" to compare the data length.
2) Full text search - only on MyISAM
3) GIS/Spatial datatypes - only on MyISAM
On performance, as the other answers and the referenced answer indicate, it depends on your workload. MyISAM is much faster for full table scans. InnoDB tends to be much faster for highly concurrent access. InnoDB can also be much faster if your lookups are based on the primary key.
Another performance issue is that MyISAM can always keep a row count, since it only does table level locking. So, if you're frequently trying to get the row count for a very large table, it may be much slower with InnoDB. Search the Internet if you need a workaround for this, as I've seen several proposed.
Depending on the size of the table(s), you may also need to update your MySQL config file. At the very least, you may want to shift bytes from key_buffer to innodb_buffer_pool_size. You won't get a fair comparison if you leave the database as being optimized for MyISAM. Read up on all the innodb_* configuration properties.
I think it's quite possible that switching to InnoDB would improve performance, but In my experience, you can't really be sure until you try it. If I were you, I would set up a test environment on the same server, convert to InnoDB and run a benchmark.
From my experience, MyISAM tables are only useful for text indexing where you need good performance with searches on big text, but you still don't need a full fledged search engine like Solr or ElasticSearch.
If you want to switch to InnoDB but want to keep indexing your text in a MyISAM table, I suggest you take a look at this: http://blog.lavoie.sl/2013/05/converting-myisam-to-innodb-keeping-fulltext.html
Also: InnoDB supports live atomic backups using innobackupex from Percona. This is godsent when dealing with production servers.

MySQL transaction support with mixed tables

It seems like I will be needing transaction with MySQL and I have no idea how should I manage transactions in Mysql with mixed InnoDB/MyISAM tables, It all seems like a huge mess.
You might ask why would I ever want to mix the tables together... the anwer is PERFORMANCE. as many developers have noticed, InnoDB tables generally have bad performance, but in return give higher isolation level etc...
does anyone have any advice regarding this issue?
I think you are overrating the performance difference between MyISAM and InnoDB. MyISAM is faster in data warehousing situations (such as full table scan reporting, etc..), but InnoDB can actually be faster in many cases with normal OLTP queries.
InnoDB is harder to tune since it has more knobs, but a properly tuned InnoDB system can often have higher throughput than MyISAM due to better locking and better I/O patterns.
Given that you can't have transactions in MyISAM tables, I am not sure what the actual problem is. Any data you need transactions for must be in an InnoDB table and you manage the transactions using whatever access library you are using or with manual SQL commands.
There are definite performance benefits of using exactly one engine.
A server tuned for one engine won't be tuned for the other - both require that you allocate a substantial amount of RAM to its exclusive use - therefore, you can't give them both an optimal amount.
Say you have 8G of ram on your (obviously 64-bit, but still relatively small) database server, you might want to assign about 3/4 of it to your innodb page cache. Alternatively, if you're using MyISAM, you may want about half of it to be your key_buffer. You can't do both.
Pick an engine and use it exclusively. There are ways of getting around performance problems - most of them aren't easy though (i.e. they require redesigning your data structure or your application).
The short answer is that there is no transaction support in MyISAM. If you start a transaction, add or modify data in some InnoDB tables, add or modify data in a MyISAM table, and then you have to rollback, your MyISAM change cannot be removed. To support mixed engines like that, your application has to know that changes to whatever data is stored MyISAM happens "outside" of the transaction.
If you need transactions for some processes, then isolate the data that must be transactionable and put all that data in InnoDB.