An alternative to MySQL fulltext search - mysql

I read that MySQL fulltext search can cause table locking. That means nobody can insert into or update the table while it is being searched.
I also read that search servers such as Lucene and Sphinx can do this without table locking, and even faster, but they require a lot of configuration and are hard to implement.
Is there any other way to get fulltext search, or something like it, without using a separate search service? I don't want to configure another server besides MySQL.

Create an extra table that is used only for FULLTEXT searches. In your code you have to ensure that all data and actions (create, update, delete) are properly replicated to this table. This solution is also handy if your data tables run on another engine, e.g. InnoDB.

Apache Lucene doesn't need much configuration and isn't hard to implement. Moreover, it's one of the most popular fulltext search engines, and it allows users to run very precise queries, like "to be or not to be", j?hn d?e, func*, etc.
I already did some database indexing with Lucene, so if you could be a bit more precise about which fields of which tables you wanna index, I can give you pieces of code which should do the trick.

I vote for Sphinx search anyway. One of its APIs is close to MySQL's, and it is easy to install and configure. It is not as universal as Apache Lucene, but it is quick and has been very helpful in my projects.

Related

Mysql Search - InnoDB and transactions vs MyISAM for FULLTEXT search

I'm currently doing research on the best ways to provide an advanced search for my php project.
I have narrowed it down to using FULLTEXT search rather than using LIKE to find matching strings in a table row. However, to do this it appears I need to sacrifice using the InnoDB engine which will make me lose the ACIDity of transactions and table relationships.
Is it really worth using the MyISAM MySQL engine, or are there better ways of providing search functionality?
Any pointers would be appreciated!
It really depends on the application... Using MyISAM for anything that needs referential integrity is an instant fail. At the same time, its text search isn't all that efficient.
Basically, there are two ways to go. If you find you don't need true referential integrity, consider a NoSQL datastore. MongoDB is a great document store database.
If, on the other hand, you really need referential integrity, but also need fast, indexed full-text searching, you might do better to use Sphinx or Apache Solr to create an indexed cache for full-text search.
Either way, I consider MyISAM to be a legacy datastore. I wouldn't use it on a new project. YMMV.
MyISAM has several drawbacks: it lacks transaction support, and its table-level locks make it very slow under a heavy mixed read and write load. Another inconvenience of MyISAM tables is that they are not crash safe, so you can lose data in case of an unexpected shutdown or power loss on the server. However, MyISAM is very fast on some queries.
Regarding full-text search, I would suggest using InnoDB plus an external search engine like Lucene or Sphinx, so you benefit from both a safe, reliable storage engine and fast full-text queries.
For a quick start with InnoDB and Sphinx you can refer to http://astellar.com/2011/12/replacing-mysql-full-text-search-with-sphinx/
MySQL 5.6 supports FULLTEXT indexes with InnoDB (released Feb 2013). See:
http://dev.mysql.com/doc/refman/5.6/en/fulltext-search.html

Any third party search engines (fulltext search and so on) work fine with InnoDB tables?

I know that InnoDB tables do not support fulltext searches yet. So I thought of using a third party search engine like Solr, Xapian or Whoosh. Do those third party tools work equivalently fine with InnoDB tables as they work with MyISAM tables? I need to find e.g. spelling suggestions and similar strings...
You could use Solr/Lucene to do the fulltext search over your DB data. Since my MySQL DB is too big for a fast fulltext search, I decided to combine MySQL and Solr/Lucene.
It's important to know that Solr/Lucene is not a MySQL plugin, so you cannot search the fulltext index with ordinary MySQL SQL statements. A fulltext search initiated by the application should first send the request to the third-party fulltext index (Solr), which returns the primary keys of the matching documents. The second step is to run an SQL statement against your MySQL InnoDB tables with a WHERE clause on the primary keys from the Solr result set.
That solution works very well in my case, and it is much, much faster (and better) than a typical MySQL MyISAM fulltext search.
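The two steps might look roughly like this. It is only a sketch under assumptions: search_index is a hypothetical stub standing in for the real request to Solr, the docs table is made up, and sqlite3 stands in for MySQL so the code runs without a server.

```python
import sqlite3

def search_index(query):
    """Hypothetical stand-in for the request to the external index (e.g. Solr).

    Returns primary keys of matching documents, best match first.
    """
    fake_index = {"locking": [3, 1]}
    return fake_index.get(query.lower(), [])

def fetch_rows(conn, ids):
    """Step 2: load the matching rows from the database by primary key."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, title FROM docs WHERE id IN ({placeholders})", ids
    ).fetchall()
    # IN (...) does not preserve order, so restore the ranking from step 1.
    by_id = {row[0]: row for row in rows}
    return [by_id[i] for i in ids if i in by_id]

def search(conn, query):
    # Step 1: ask the index for keys; step 2: fetch those rows.
    return fetch_rows(conn, search_index(query))
```

Reordering by the returned key list matters because the search engine's relevance ranking would otherwise be lost in the SQL step.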
As an alternative, you can store the data in Solr in addition to indexing it. In that case Solr can return the full text itself, so you don't need to fetch the data from the database as in the two-step approach above.
Do those third party tools work equivalently fine with InnoDB tables as they work with MyISAM tables?
Absolutely. Solr has a DataImportHandler. There you define an SQL statement that fetches the data you want to index in Solr, for example: select * from MyTable;
But keep in mind: right now (as far as I know) there is no MySQL Solr plugin available. The cooperation between Solr and MySQL has to be handled by the application.
Third-party fulltext search engines typically copy data returned by a MySQL query, and use it to populate their search index. There's no difference between MyISAM and InnoDB data sources in this respect.
I gave a presentation Practical Full-Text Search in MySQL a few years ago. You might find it interesting.
Sphinx maintains its own index and simply pulls data from MySQL periodically by issuing a query.
It is not even aware of the underlying table structure and as long as the query runs and returns the results, it's OK for Sphinx.
Other third party engines work in a similar way.
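To illustrate why the storage engine does not matter, here is a toy indexer. It is not how Sphinx or any real engine is implemented; build_index, lookup and the posts table in the usage example are hypothetical, and sqlite3 stands in for MySQL. The indexer sees only the rows a query returns, never the table structure behind them.

```python
import re
import sqlite3
from collections import defaultdict

def build_index(conn, sql):
    """Run `sql` (it must return id and text columns) and build a
    word -> set-of-ids inverted index from the result set alone."""
    index = defaultdict(set)
    for doc_id, text in conn.execute(sql):
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index

def lookup(index, word):
    """Return the ids of all documents containing `word`."""
    return sorted(index.get(word.lower(), ()))
```

Usage, assuming a posts table with id and body columns:

```python
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO posts (body) VALUES ('Fulltext search with Sphinx')")
index = build_index(conn, "SELECT id, body FROM posts")
print(lookup(index, "sphinx"))
```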

MySQL index versus Sphinx index

I have a question: is it better to use the index provided by the MySQL database itself, or the indexing provided by Sphinx?
It depends.
Sphinx is the better index by every standard but one: it's faster, has a more advanced syntax, is smaller, is more scalable, and doesn't involve MyISAM.
MySQL is more maintainable, simpler to install, and does not add another tier to your application; if it's good enough, you might as well use it.
Sphinx gives you a full-text search option. You could call it a mini search engine embedded in your app, and without doubt it is better than a MySQL index.
If you just want to index auto-incremented integer columns, there seems to be no point in adding Sphinx to your app. Still, database size matters.
Check out some of the previously asked questions to get a better idea of what suits your needs.
https://stackoverflow.com/questions/tagged/sphinx?sort=votes

pitfalls with mixing storage engines in mysql with django?

I'm running a django system over mysql in amazon's cloud, and the database default is innodb. But now I want to put a fulltext index on a couple of tables for searching, which evidently requires myisam.
The obvious solution is to just tell mysql to ALTER TABLE to myisam, but are there going to be any issues with that?
One that comes to mind is that I'll have to remember to do that any time I build a new version of the database, which should theoretically be rare, but there doesn't seem to be a way to tell django to please set the storage engine at the table level. I guess I could write a migration (we use south).
Any other things I might be missing? What could possibly go wrong?
Will the application notice? Probably not.
Will it cause problems? Only when things go wrong. MyISAM is not a transactional storage engine. If you change the data in a MyISAM table while inside of a transaction, then have to roll back changes, the changes in that table won't be rolled back. It's been a while since I tried to break it horribly, but I'm willing to wager that MySQL won't even issue a warning when this happens. This will lead to data consistency issues.
You should seriously consider external search software instead of a fulltext index, like ElasticSearch (integrates at the application level), or Sphinx (integrates at the MySQL level, though if you're using RDS instead of MySQL directly, I don't think you'll be able to use it).
The following may be of help:
Use a MyISAM fulltext table to index back into your InnoDB tables, for example:
Build your system using innodb:...
Any way to achieve fulltext-like search on InnoDB

How do you identify unused indexes in a MySQL database?

I have recently completely rewritten a large project. In doing so, I have consolidated a great number of random MySQL queries. I do remember that over the course of developing the previous codebase, I created indexes on a whim, and I'm sure there are a great number that aren't used anymore.
Is there a way to monitor MySQL's index usage to determine which indexes are being used, and which ones are not?
I don't think this information is available in a stock MySQL installation.
Percona makes tools and patches for MySQL to collect index usage data.
See:
User Statistics (and Index Statistics)
How expensive is USER_STATISTICS?
pt-index-usage
See also:
New INDEX_STATISTICS table in the information_schema
check-unused-keys: A tool to interact with INDEX_STATISTICS
New table_io_waits_summary_by_index_usage table in performance_schema in MySQL 5.6
You may also be interested in a Java app called MySQLIndexAnalyzer, which helps to find redundant indexes. But this tool doesn't have any idea which indexes are unused.
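For the performance_schema table mentioned in the list above, a query along these lines can list indexes that have recorded no I/O. This is a sketch, not a definitive tool: find_unused_indexes is a hypothetical helper that expects any DB-API cursor connected to a MySQL 5.6+ server with performance_schema enabled. The counters only cover the period since the server last started, so an index used by a rare job may still show up.

```python
# Indexes with COUNT_STAR = 0 have seen no reads or writes through the
# index since the counters were last reset (e.g. at server start).
UNUSED_INDEX_SQL = """
SELECT object_schema, object_name, index_name
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
  AND index_name <> 'PRIMARY'
  AND count_star = 0
  AND object_schema NOT IN ('mysql', 'performance_schema')
ORDER BY object_schema, object_name, index_name
"""

def find_unused_indexes(cursor):
    """Return 'schema.table.index' strings for indexes with no recorded use.

    `cursor` is assumed to be a DB-API cursor on a MySQL connection.
    """
    cursor.execute(UNUSED_INDEX_SQL)
    return [f"{schema}.{table}.{index}" for schema, table, index in cursor.fetchall()]
```

Treat the result as a list of candidates to review over a representative period of traffic, not as indexes that are safe to drop immediately.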