My MySQL database contains multiple MyISAM tables, with each table containing millions of rows. There is a heavy insert load on the database, so I cannot issue SELECTs on that live database. Instead, I create a replica of the database for queries and conduct analysis on that.
For the analysis, I need to issue multiple parallel queries. The queries are independent (i.e., the results of the queries are not combined together), but they operate on the same tables most of the time. As far as I know, the entire MyISAM table is locked for each query, which means parallel independent queries would be slow. Ideally, I would prefer an engine that supports "NO LOCKING". I am assuming MySQL doesn't have such an engine, so should I use InnoDB? I might be missing a lot of things here. Please suggest the right path to take.
Thanks
MyISAM read locks are compatible, so the SELECT queries won't lock each other.
If your analysis queries on the replica database don't write, only read, then it's OK to use MyISAM.
You could stick to MyISAM and use INSERT DELAYED:
When a client uses INSERT DELAYED, it gets an okay from the server at once, and the row is queued to be inserted when the table is not in use by any other thread.
Another major benefit of using INSERT DELAYED is that inserts from many clients are bundled together and written in one block. This is much faster than performing many separate inserts.
Related
I am using MySQL and I would like to know if I make multiple select statements simultaneously in order to get information from the information schema, how are these queries handled? Could this cause some potential database malfunction?
Since you are using the MyISAM storage engine and are worried about concurrent SELECT statements:
READs (SELECT) can happen concurrently as long as there is no WRITE (INSERT, UPDATE, DELETE or ALTER TABLE). I.e., you can have either one writer or several readers.
Otherwise the operations are queued and executed as soon as possible.
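There is a special case: concurrent inserts. If a MyISAM table has no free blocks in the middle of its data file, an INSERT can append rows while SELECTs are running.
Note: if you are wondering about the choice between the two main MySQL storage engines, MyISAM and InnoDB, InnoDB is usually a good choice; please read this SO question.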
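To make the "one writer or several readers" rule concrete, here is a small toy model in Python. It is only a sketch: the function name `schedule` is made up for illustration, requests are served strictly in arrival order, and the concurrent-insert special case is ignored, so it is not a description of how the MySQL scheduler actually prioritizes requests.

```python
def schedule(requests):
    """Group lock requests into batches that may run at the same time.

    Simplified table-lock rule: any number of READs can share the table,
    while a WRITE needs the table to itself. Requests are taken in
    arrival order (an assumption made for this illustration).
    """
    batches = []
    for kind in requests:
        if kind not in ("READ", "WRITE"):
            raise ValueError("unknown request kind: %r" % kind)
        # A READ may join the current batch only if that batch is all READs.
        if kind == "READ" and batches and batches[-1][0] == "READ":
            batches[-1].append(kind)
        else:
            # A WRITE always starts a batch of its own.
            batches.append([kind])
    return batches


batches = schedule(["READ", "READ", "WRITE", "READ", "WRITE", "WRITE"])
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

Consecutive SELECTs end up in the same batch, which is why a read-only workload does not serialize on MyISAM.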
I have two databases that are identical except that one has about 500,000 entries (distributed over several tables) while the other is empty.
If I run my program against the empty database, execution takes around 10 minutes, while against the database with the 500k entries it takes around 40 minutes. I have now deleted some of the entries (about 250k) and that sped up the execution by around 10 minutes. The strange thing is that these tables were not heavily queried (just some very simple inserts), so I wonder how this can have such an effect on the execution.
Also, all the SQL statements that I run (and I run a lot of them) are rather simple (no complicated joins, mainly inserts), so I wonder why some tables with 250k entries can have such an effect on performance. Any ideas what could be the reason?
The following could be the cause, but to find the actual reason you should profile your queries.
Although you think you are making simple inserts, an insert is not a simple operation from the database's perspective. For every row you insert, the following may need to be checked or updated:
Indexes
Constraints
Referential integrity (PK-FK), among other things. Each of these looks cheap, but they add up when the volume is high.
Check the volume of queries. As you may know, an insert is an exclusive operation, i.e., it locks the table while it writes, so a high number of inserts means more time spent holding and waiting for locks.
To avoid this, you can try chaining or bulk operations:
Is bulk update faster than single update in db2?
Data distribution also plays an important role. If you are accessing heavily loaded tables, then parsing, accessing, and fetching data from them also takes time (it doesn't matter for a single query, but it really hurts for a large volume of similar queries). Try to minimize that by tuning your queries.
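As a rough illustration of the bulk-operation suggestion, the sketch below builds one multi-row INSERT statement instead of many single-row ones. The helper `build_multirow_insert` and the table and column names are hypothetical, and the `%s` placeholder is an assumption that matches the parameter style of the common Python MySQL drivers; adjust it for your driver.

```python
def build_multirow_insert(table, columns, rows, placeholder="%s"):
    """Return (sql, params) for a single INSERT that carries many rows.

    Sending one statement with N rows means one round trip and one
    lock acquisition instead of N of each.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row %r does not match columns %r" % (row, columns))
    # One "(%s, %s, ...)" group per row.
    row_sql = "(" + ", ".join([placeholder] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([row_sql] * len(rows))
    )
    # Flatten the rows so the values line up with the placeholders.
    params = [value for row in rows for value in row]
    return sql, params


sql, params = build_multirow_insert(
    "events", ["id", "name"], [(1, "a"), (2, "b")]
)
print(sql)
print(params)
```

The resulting pair can be passed to a driver's `cursor.execute(sql, params)`. Keep the batches to a moderate size so that a single statement stays under the server's packet limit.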
A database already has 25-30 tables, all of them MyISAM. Most of these tables are related to each other, meaning a lot of queries use joins on IDs to retrieve data.
One of the tables contains 7-10 million records, and it becomes slow if I want to perform a search, an update, or even a retrieval of all the data. I have proposed a solution to my boss: converting the tables to InnoDB might give better performance.
I also explained the benefits of InnoDB:
Since we join multiple tables on keys anyway and they are related, it would be better to use foreign keys and have a truly relational database, which avoids orphan rows. I found around 10-15k orphan rows in one of the big tables and had to remove them manually.
Support for transactions: we perform big updates from time to time, and if one of them fails partway we have to replace the entire table with the backed-up one and run the update again to make sure that all queries were executed. With InnoDB we can roll back the changes from query 1 if query 2 fails.
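The rollback behaviour described in the second point can be sketched as follows. Note the assumption: this uses Python's built-in `sqlite3` module as a stand-in so the example runs without a server; it is not MySQL or InnoDB, and the table `prices` and the helper `run_updates_atomically` are made up for illustration. The point is only the all-or-nothing pattern.

```python
import sqlite3


def run_updates_atomically(conn, statements):
    """Run all statements in one transaction; undo everything on failure."""
    try:
        # "with conn" commits if the block succeeds and rolls back if
        # an exception escapes it.
        with conn:
            for sql in statements:
                conn.execute(sql)
        return True
    except sqlite3.Error:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)"
)
conn.execute("INSERT INTO prices VALUES (1, 100)")
conn.commit()

ok = run_updates_atomically(conn, [
    "UPDATE prices SET amount = 200 WHERE id = 1",   # query 1 succeeds
    "UPDATE prices SET amount = NULL WHERE id = 1",  # query 2 violates NOT NULL
])
amount = conn.execute("SELECT amount FROM prices WHERE id = 1").fetchone()[0]
print(ok, amount)
```

Because query 2 fails, the change made by query 1 is undone as well and the row keeps its original value, with no need to restore the table from a backup.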
Now the response I got from my boss is that I need to prove that InnoDB will run faster than MyISAM. My question is, won't the above two things improve the speed of the application itself by eliminating orphan rows?
In general is MyISAM faster than InnoDB?
Note: using MySQL 5.5
You should also mention to your boss what is probably the biggest benefit of InnoDB for large tables with a mixed read/write load: you get row-level locking rather than table-level locking. This can be a great performance benefit for the application in cases where you see a lot of waits for table locks to be released.
Of course, the best way to convince your boss is to prove it. Make copies of your large table and place them on a test database: one version of the data in MyISAM and one in InnoDB. Then run load tests against both with a load mix that approximates your current read/write activity, and find out for yourself which is better.
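One possible way to prepare the two copies is sketched below. The helper `engine_copy_statements` and the table name `orders` are hypothetical; the function only generates the statements (it does not connect to anything), and they are meant to be run on the test database, not on production.

```python
import re

# Accept only plain identifiers, since the names are pasted into SQL text.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def engine_copy_statements(source_table, engine):
    """Return SQL that clones source_table into a copy using `engine`."""
    for name in (source_table, engine):
        if not _IDENTIFIER.match(name):
            raise ValueError("unexpected identifier: %r" % name)
    copy = "{}_{}".format(source_table, engine.lower())
    return [
        # Same columns and indexes as the original, but empty.
        "CREATE TABLE `{}` LIKE `{}`".format(copy, source_table),
        # Switch the engine while the copy is still empty.
        "ALTER TABLE `{}` ENGINE={}".format(copy, engine),
        # Fill the copy with the original rows.
        "INSERT INTO `{}` SELECT * FROM `{}`".format(copy, source_table),
    ]


for engine in ("MyISAM", "InnoDB"):
    for statement in engine_copy_statements("orders", engine):
        print(statement + ";")
```

Running the same load script against `orders_myisam` and `orders_innodb` then gives a like-for-like comparison.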
Updated for your comment that you are on 5.5: with 5.5 it is a no-brainer to use InnoDB. The MyISAM engine has seen basically no improvement over the last several years, and development effort has gone into InnoDB. InnoDB is THE MySQL engine of choice going forward.
I have a parallel process with about 64 children that each need to insert data into a landing table. I am currently using a MySQL MyISAM engine, and I disable keys before and after inserts.
However, this seems to be a huge bottleneck in my process. I believe MySQL is table locking for each insert and so processes are constantly waiting to write.
The inserts are independent and there is no danger of conflicting inserts. This also does not need transactions or anything of that nature.
Is there a different engine, or ways to improve the insert/write performance of MySQL?
I have thought about instantiating a table for each process, but this would make the code more complex, and that is not really my style....
Any suggestions would be greatly appreciated.
Thanks!
As documented under INSERT DELAYED Syntax:
The DELAYED option for the INSERT statement is a MySQL extension to standard SQL that is very useful if you have clients that cannot or need not wait for the INSERT to complete.
[ deletia ]
Another major benefit of using INSERT DELAYED is that inserts from many clients are bundled together and written in one block. This is much faster than performing many separate inserts.
MyISAM does indeed lock the table when inserting, updating or deleting. InnoDB supports transactions and row-level locks.
You can also look into LOAD DATA INFILE which is faster for bulk inserts.
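A minimal sketch of the LOAD DATA INFILE route: each worker writes its rows to a CSV file and then issues one load statement for the whole file. The helper names, the table `landing` and its columns are assumptions for illustration; the example uses the LOCAL variant (the file is read on the client side), which has to be permitted by both server and client, and it assumes simple values without backslashes.

```python
import csv
import os
import tempfile


def write_rows_csv(path, rows):
    """Write rows as comma-separated lines terminated by a bare newline."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle, lineterminator="\n").writerows(rows)


def load_data_statement(path, table, columns):
    """Build a LOAD DATA statement matching the CSV layout written above."""
    return (
        "LOAD DATA LOCAL INFILE '{path}' INTO TABLE {table} "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n' ({cols})"
    ).format(path=path, table=table, cols=", ".join(columns))


with tempfile.TemporaryDirectory() as directory:
    csv_path = os.path.join(directory, "rows.csv")
    write_rows_csv(csv_path, [(1, "a"), (2, "b")])
    with open(csv_path, newline="") as handle:
        written = handle.read()
    statement = load_data_statement(csv_path, "landing", ["id", "payload"])

print(written)
print(statement)
```

With 64 workers, each one would load its own file, so the table is locked once per file rather than once per row.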
We have an update process which currently takes over an hour and means that our DB is unusable during this period.
If I set up replication, would this solve the problem, or would the replicated DB suffer from exactly the same problem, with the tables locked during the update?
Is it possible to have the replicated DB prioritize reading over updating?
Thanks,
D
I suspect that with replication you're just going to be duplicating the issue (unless most of the time is spent in CPU and only results in a couple of records being updated).
Without knowing a lot more about the schema, the distribution and size of the data, and the update process, it's impossible to say how best to resolve the problem. But you might get some mileage out of using InnoDB instead of MyISAM and making sure that the update is implemented as a number of discrete steps (e.g. using stored procedures) rather than a single DML statement.
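To illustrate the "discrete steps" idea, here is a sketch that splits one big update into primary-key ranges, so each statement touches a bounded number of rows and other queries can run in between. It assumes the table has a numeric `id` column; the helper `id_ranges` and the `accounts` UPDATE are hypothetical stand-ins for the real update.

```python
def id_ranges(min_id, max_id, batch_size):
    """Yield inclusive (start, end) id ranges covering min_id..max_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield start, end
        start = end + 1


# Hypothetical update, issued once per range instead of once for the table.
UPDATE_SQL = "UPDATE accounts SET balance = balance * 1.01 WHERE id BETWEEN %s AND %s"

for start, end in id_ranges(1, 10, 4):
    print(UPDATE_SQL % (start, end))
```

Whether this actually shortens the total run time depends on the update itself; what it does is limit how long any single statement holds its locks.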
MySQL gives you the ability to run queries delayed. Example: "INSERT DELAYED INTO...". This causes the query to be executed only when MySQL has time to handle it.
Based on your input, it sounds like you are using MyISAM tables. MyISAM only supports table-wide locking, which means that a single update will lock the whole table until the query has completed. InnoDB, on the other hand, uses row locking, which does not cause SELECT queries to wait (hang) for updates to complete.
So you have the best chances of a better sysadmin life if you change to InnoDB :)
When it comes to replication, it is pretty normal to separate updates and selects onto two different MySQL servers, and that does tend to work very well. But if you are using MyISAM tables and do a lot of updates, the locking issue itself will still be there.
So my 2 cents: first get rid of MyISAM, then consider replication or a better-scaled MySQL server if the problem still exists. (The key to good performance in MySQL is to have at least as much physical RAM as the combined size of all indexes across all databases.)