How should I set up these tables for searching? - mysql

My PHP site is an online store with about 5k products. Products belong to a vendor, a category, and possibly a subcategory. Each of those items has a name and the products have descriptions.
The search queries we've set up work wonderfully, but tend to run pretty slow. They range between 0.20s and 30s (yes 30 seconds). We've optimized like crazy and I'm starting to think we're out of room to improve on that front, so we're caching them and that's making life a lot easier.
But when they do run they still kill the server, apparently because of all the table locking that comes with MyISAM.
So on to my question: Is there a way for us to use InnoDB (row-level locking) and still maintain FULLTEXT? Should we move our DB offsite and use a service like DB2? Is there some other search engine type software we should use instead?
Any help is greatly appreciated :)

InnoDB does have full-text indexing now: http://blogs.innodb.com/wp/2011/07/innodb-full-text-search-tutorial/
but since it's basically brand new, most MySQL installs will not support it yet.
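For reference, a minimal sketch of what that looks like on a build that does support it; the table and column names here are hypothetical:

    CREATE TABLE posts (
        id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        title VARCHAR(255) NOT NULL,
        body TEXT,
        FULLTEXT KEY ft_title_body (title, body)
    ) ENGINE=InnoDB;

    SELECT id, title
    FROM posts
    WHERE MATCH(title, body) AGAINST ('blue widget');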
The standard workaround is to have a 'mirror' MyISAM table that contains copies of the searchable data with a fulltext index. You then join the original InnoDB table against the MyISAM copies, with fulltext searches on the MyISAM fields and the remaining regular WHERE clauses on the InnoDB columns.
With appropriate triggers on the InnoDB table, there's no reason that the MyISAM copies would get stale/incorrect, or you could simply rebuild them on a scheduled basis so that you've got a staleness window that matches the rebuild interval.
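A minimal sketch of that setup, with hypothetical table and column names (the UPDATE and DELETE triggers follow the same pattern as the INSERT one):

    CREATE TABLE products (
        id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        vendor_id INT UNSIGNED NOT NULL,
        name VARCHAR(255) NOT NULL,
        description TEXT
    ) ENGINE=InnoDB;

    -- Mirror of just the searchable columns, with the fulltext index
    CREATE TABLE products_search (
        product_id INT UNSIGNED NOT NULL PRIMARY KEY,
        name VARCHAR(255) NOT NULL,
        description TEXT,
        FULLTEXT KEY ft_name_desc (name, description)
    ) ENGINE=MyISAM;

    -- Keep the mirror in sync as rows are inserted
    CREATE TRIGGER products_ai AFTER INSERT ON products
    FOR EACH ROW
        INSERT INTO products_search (product_id, name, description)
        VALUES (NEW.id, NEW.name, NEW.description);

    -- Fulltext-search the MyISAM mirror, join back to InnoDB for the rest
    SELECT p.*
    FROM products_search s
    JOIN products p ON p.id = s.product_id
    WHERE MATCH(s.name, s.description) AGAINST ('blue widget')
      AND p.vendor_id = 42;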

Related

Move existing tables to InnoDB from MyISAM and which one is faster?

The database has 25-30 tables, all MyISAM. Most of them are related to each other, meaning a lot of queries join on IDs to retrieve data.
One of the tables contains 7-10 million records, and it is slow whenever I perform a search, an update, or even a retrieval of all the data. I proposed a solution to my boss: converting the tables to InnoDB might give better performance.
I also explained the benefits of InnoDB:
Since we join multiple tables on keys anyway and they are related, it would be better to use foreign keys and have a properly relational database, which avoids orphan rows. I found around 10-15k orphan rows in one of the big tables and had to remove them manually.
Support for transactions: we perform big updates from time to time, and if one of them fails partway through we have to replace the entire table with the backed-up one and run the update again to make sure every query executed. With InnoDB we can simply roll back the changes from query 1 if query 2 fails (a sketch follows below).
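With hypothetical table names, the two points boil down to something like this:

    -- A foreign key stops orphan rows from ever appearing
    ALTER TABLE order_items
      ADD CONSTRAINT fk_order_items_orders
      FOREIGN KEY (order_id) REFERENCES orders (id)
      ON DELETE CASCADE;

    -- And a multi-statement update becomes all-or-nothing
    START TRANSACTION;
    UPDATE orders SET status = 'archived' WHERE placed_at < '2011-01-01';
    UPDATE order_items oi
      JOIN orders o ON o.id = oi.order_id
      SET oi.archived = 1
      WHERE o.status = 'archived';
    COMMIT;  -- or ROLLBACK; and neither statement is applied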
The response I got from my boss is that I need to prove that InnoDB will run faster than MyISAM. My question is: won't the two points above improve the speed of the application by themselves, e.g. by eliminating orphan rows?
In general is MyISAM faster than InnoDB?
Note: using MySQL 5.5
You should also mention to your boss what is probably the biggest benefit you get from InnoDB on large tables with a mixed read/write load: row-level locking rather than table-level locking. This can be a great performance benefit for the application in cases where you see a lot of waits for table locks to be released.
Of course the best way to convince your boss is to prove it. Make copies of your large table and place them in a testing database, one version in MyISAM and one in InnoDB. Then run load testing against them with a load mix that approximates your current DB read/write activity, and find out for yourself which is better.
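A sketch of setting up that comparison, with a hypothetical table name (mysqlslap, which ships with MySQL, can then replay a representative query mix against each copy):

    -- CREATE TABLE ... LIKE keeps the original (MyISAM) engine
    CREATE TABLE big_table_myisam LIKE big_table;
    INSERT INTO big_table_myisam SELECT * FROM big_table;

    CREATE TABLE big_table_innodb LIKE big_table;
    ALTER TABLE big_table_innodb ENGINE=InnoDB;
    INSERT INTO big_table_innodb SELECT * FROM big_table;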
Updated for your comment that you are on 5.5: with 5.5 it is a no-brainer to use InnoDB. The MyISAM engine has seen basically no improvement over the last several years, while development effort has gone into InnoDB. InnoDB is THE MySQL engine of choice going forward.

MyISAM vs InnoDB for BI / batch query performance (ie, _NOT_ OLTP)

Sure, for a transactional database InnoDB is a slam dunk. MyISAM doesn't support transactions or row-level locking.
But what if I want to do big nasty batch queries that touch hundreds of millions of rows?
Are there areas where MyISAM has relative advantage over InnoDB??
eg, one (minor) one that I know of ... "select count(*) from my_table;" MyISAM knows the answer to this instantly, whereas InnoDB may take a minute or more to make up its mind. (MyISAM stores the exact row count in its table metadata; InnoDB has to actually count the rows.)
--- Dave
MyISAM scales better with very large datasets. InnoDB outperforms MyISAM in many situations, until it can no longer keep its indexes in memory; then performance drops drastically.
MyISAM also supports MERGE tables, which is sort of a "poor man's" sharding. You can add/remove very large sets of data instantaneously. For example, if you have 1 table per business quarter, you can create a merge table of the last 4 quarters, or a specific year, or any range you want. Rather than exporting, deleting and importing to shift data around, you can just redeclare the underlying MERGE table contents. No code change required since the name of the table doesn't change.
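A sketch with hypothetical quarterly tables; all underlying tables must have identical structure:

    CREATE TABLE sales_2011q4 (
        id INT UNSIGNED NOT NULL,
        sold_at DATE NOT NULL,
        amount DECIMAL(10,2) NOT NULL,
        KEY (sold_at)
    ) ENGINE=MyISAM;
    CREATE TABLE sales_2012q1 LIKE sales_2011q4;

    CREATE TABLE sales_recent (
        id INT UNSIGNED NOT NULL,
        sold_at DATE NOT NULL,
        amount DECIMAL(10,2) NOT NULL,
        KEY (sold_at)
    ) ENGINE=MERGE UNION=(sales_2011q4, sales_2012q1) INSERT_METHOD=LAST;

    -- Shift the window without copying any data (assuming sales_2012q2 exists)
    ALTER TABLE sales_recent UNION=(sales_2012q1, sales_2012q2);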
MyISAM is also better suited for logging, where you are only ever appending to a table. As with MERGE tables, you can easily swap a table out (to rotate "logs") and/or copy it.
You can copy the files associated with a MyISAM table to another computer, drop them into the MySQL data directory, and MySQL will automatically make the table available. You can't do that with InnoDB; you need to export/import.
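A sketch of that file copy, with a hypothetical table name and paths; the shell step is shown as a comment:

    -- Quiesce writes and flush the table's files to disk first
    FLUSH TABLES WITH READ LOCK;
    -- From the shell, copy the three files that make up the table, e.g.:
    --   scp /var/lib/mysql/mydb/sales_2011q4.frm \
    --       /var/lib/mysql/mydb/sales_2011q4.MYD \
    --       /var/lib/mysql/mydb/sales_2011q4.MYI  other:/var/lib/mysql/mydb/
    UNLOCK TABLES;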
These are all specific cases, but I've taken advantage of each one a number of times.
Of course, with replication, you could use both: a table can be InnoDB on the master and MyISAM on the slave. The structure has to be the same, but not the table type. That way you get the best of both. The BLACKHOLE table type works this way.
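A sketch of converting the copy on the slave (hypothetical table name; assumes statement-based replication, so the master's InnoDB writes keep applying to the MyISAM copy):

    STOP SLAVE;
    ALTER TABLE page_views ENGINE=MyISAM;
    START SLAVE;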
Here's a great article comparing various performance points http://www.mysqlperformanceblog.com/2007/01/08/innodb-vs-myisam-vs-falcon-benchmarks-part-1/ - you'll have to evaluate this from quite a few angles, including how you intend to write your queries and what your schema looks like. It's simply not a black and white question.
According to this article, as of v5.6, InnoDB has been developed to the point where it is better in all scenarios. The author is probably a bit biased, but it clearly outlines which tech is seen as the future direction of the platform.

MySQL: InnoDB or MyISAM Engine ::: which is better for lot of selects?

I have a huge file (~26 MB) with around 200 columns and 30,000 records. I want to import it into a database (InnoDB engine). I won't ever be updating or deleting records, although I will be querying a lot of records from the table with highly complex WHERE clauses. Which table engine should I prefer for faster query response? Will it really make a lot of difference?
PS: All my other tables use InnoDB.
Also, how can I avoid manually creating a table with 200 columns and specifying the datatype for each of them? Most of the columns are float, and a few are varchar and date.
Usually the answer to "which is faster, MyISAM or InnoDB?" would be MyISAM.
But for best performance with a table that sees very few updates, you might want to have a look at Infobright's columnar DB (which is built on MySQL).
However, with only 30k rows you'll not see a significant difference between InnoDB, MyISAM, and Infobright.
OTOH, you really should have a long hard look at whether you really need 200 columns in a single table. I suspect that's not the case - and the schema is far more important in determining performance than the storage engine.
When dealing with large amounts of data, InnoDB fares better than MyISAM:
http://www.mysqlperformanceblog.com/2007/01/08/innodb-vs-myisam-vs-falcon-benchmarks-part-1/
and
http://www.cftopper.com/index.cfm?blogpostid=84
James Day, a MySQL support engineer and Wikipedia engineer, recommends that people use InnoDB all the time unless it becomes apparent for some reason that you need MyISAM:
"I'd go with InnoDB until it's been proved that it's unsuitable. The first reason is reliability. Get a crash with MyISAM and you have the unreliable and slow, related to table size, table repair process. Same thing with InnoDB and you instead get the fixed time, fast and reliable log apply/rollback process. As the data set gets bigger, this matters more and more, as it does if you want to do things like sleep instead of being woken up in the middle of the night to fix a crashed table.
For reliability and performance, we use InnoDB for almost everything at Wikipedia - we just can't afford the downtime implied by MyISAM use and check table for 400GB of data when we get a crash."

Does this case call for InnoDB or MyISAM?

I'm doing a search on a table (with a few inner joins) which takes at most 5 seconds to run (7.5 million rows). It's a MyISAM table, and I'm not using full-text indexing on it, as I found no difference in speed between MATCH ... AGAINST and a normal LIKE statement in this case.
I'm now "suffering" from locked tables and queries running for several minutes before they complete because of it.
Would it benefit me at all to try and switch the engine to InnoDB? Or does that only help if I need to insert or update rows... not just select them? This whole table-locking thing is busy grinding my balls...
InnoDB supports row-level locking instead of table-level locking... so that should alleviate your problem (although I'm not sure it will remove it entirely).
Your best bet would be to use a dedicated search system (like Sphinx, Lucene, or Solr).
The difference between row-level and table-level locking only matters for insert and update queries. If you mostly do selects (so inserts/updates don't happen often enough to lock the table), the difference will not be all that much (even though in recent benchmarks InnoDB seems to be outperforming MyISAM).
Another thing you could consider is reorganising your data structure, perhaps adding a lookup table of 'tags' or 'keywords', or implementing a more efficient full-text engine as webdestroya suggested.
Last but not least, I'm also surprised that you got similar results with FULLTEXT vs LIKE. This could happen if the fields you're searching are not very wide, in which case maybe a standard B-TREE index with an = search would be enough.
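For example, a sketch with a hypothetical table, assuming the searched value is a short, exact term:

    ALTER TABLE articles ADD INDEX idx_title (title);

    -- An equality (or prefix) search can use the B-TREE index...
    SELECT id FROM articles WHERE title = 'blue widgets';
    SELECT id FROM articles WHERE title LIKE 'blue%';
    -- ...whereas LIKE '%widgets%' cannot, and scans the whole table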

Mysql FULLTEXT index, search locks table

Consider this scenario: my database table has 300,000 rows and a fulltext index. Whenever a search is done, it locks the table and doesn't allow anyone else to log in to the portal.
Any advice on how to get things sorted out here would be really appreciated.
Does logging on perform a write to the table, e.g. a 'last visit' time?
If so, you may expect behaviour like this, because MyISAM writes take a lock over the entire table. Usually this is avoided by not using noddy MyISAM and going to InnoDB instead, which has row-level locking (amongst other desirable database features).
The problem, of course, is that you only get fulltext search with MyISAM.
So you'll need to split your tables up. If you can keep the read-heavy and fulltext stuff in a different table to the stuff that needs writing (but linked using the same primary key), you can probably make it so that the two operations don't affect each other.
Better, migrate the bulk of the table to InnoDB, leaving only a fulltext field in MyISAM. Everything except fulltext searches can then steer clear of the MyISAM table, and use only the InnoDB table which exhibits much better locking performance. Personally, I now tend to store everything in the InnoDB table, including the text, and store a second copy of the text in the MyISAM table purely for fulltext searchbait purposes; this simplifies queries and code and brings the advantages of InnoDB's consistency to the text content, and I also use it to process the searchbait to get stemming and other features MySQL's fulltext doesn't normally support. But it does mean you have to spend a lot more space on storage.
You can also improve matters by cutting down the number of writes. For example, if it is a 'last visit' timestamp you're writing, you can skip the write unless, say, a minute has passed since the previous value, on the basis that no-one needs to know the exact second someone last accessed the site.
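A sketch of that throttling, with hypothetical table and column names:

    -- Only touch the row if the stored timestamp is at least a minute old
    UPDATE users
    SET last_visit = NOW()
    WHERE id = 123
      AND (last_visit IS NULL OR last_visit < NOW() - INTERVAL 1 MINUTE);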
If you use an external search engine, or a MySQL search integration like Lucene or Sphinx, it should be able to read and index without locking the table. These systems store a local copy of the indexed records, so they don't have to read the table very often and never need to write to it.