I have a question: is it better to use the indexes provided by MySQL itself, or the indexing provided by Sphinx?
It depends.
Sphinx is the better index by all standards but one: it's faster, has a more advanced query syntax, is smaller, is more scalable, and doesn't involve MyISAM.
MySQL is more maintainable, simpler to install, and does not add another tier to your application. If it's good enough, you might as well use it.
Sphinx gives you full-text search. You can think of it as a mini search engine embedded in your app, and for full-text search it is without doubt better than a MySQL index.
If you just want to index auto-incremented integer columns, there seems to be no point in adding Sphinx to your app. Still, database size matters.
Check out some of the previously asked questions to get a better idea of what suits your needs.
https://stackoverflow.com/questions/tagged/sphinx?sort=votes
I am planning to use Sphinx with MySQL for my current project.
I will use MyISAM as the database engine, since this database is going to be read-only, with 10-25 million records.
So I would like to know:
Does using UNION or JOINs in a query cause performance issues in Sphinx?
I am about to design the database, and if unions/joins are going to cause slower performance, I can go for a design optimized for Sphinx.
Maybe something like creating one big table with all fields and data, and then creating separate indexes in Sphinx depending on the data to be searched.
Please guide me in the right direction.
Thanks for your time.
Sphinx can't do joins anyway. It can do unions, simply by searching multiple indexes at once.
Or do you mean the query used to build the Sphinx index (i.e. in sql_query)? The indexer only runs those queries to build the indexes in the first place.
Since you say the data is read-only, and hence there are no updates, the indexes should never need rebuilding, so it doesn't really matter how slow those queries are.
In general a Sphinx index performs very similarly regardless of how many fields it has, so you shouldn't need to split it into different indexes. Just have one multi-purpose index (if that's possible).
You can, however, shard the index into pieces and distribute them across multiple servers if performance becomes an issue.
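To make the "search multiple indexes at once" idea concrete, here is a minimal sketch in Python. It is not Sphinx's API: TinyIndex, search_all, the modulo-2 sharding rule and the term-count scoring are all made up for illustration. It only shows the general shape of querying several shards and merging their hits.

```python
from collections import defaultdict
import heapq


class TinyIndex:
    """Toy in-memory inverted index standing in for one index shard."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term count}

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term][doc_id] = self.postings[term].get(doc_id, 0) + 1

    def search(self, query):
        """Return (score, doc_id) pairs; score is the summed term count."""
        scores = defaultdict(int)
        for term in query.lower().split():
            for doc_id, count in self.postings.get(term, {}).items():
                scores[doc_id] += count
        return [(score, doc_id) for doc_id, score in scores.items()]


def search_all(shards, query, limit=10):
    """Query every shard and merge the hits, best score first."""
    hits = []
    for shard in shards:
        hits.extend(shard.search(query))
    # Highest score first; ties broken by lowest doc_id for stable output.
    return heapq.nsmallest(limit, hits, key=lambda hit: (-hit[0], hit[1]))


# Hypothetical data: documents are assigned to shards by doc_id modulo 2.
shards = [TinyIndex(), TinyIndex()]
docs = {
    1: "sphinx full text search",
    2: "mysql fulltext index",
    3: "sphinx index sharding across servers",
    4: "sphinx sphinx distributed index",
}
for doc_id, text in docs.items():
    shards[doc_id % 2].add(doc_id, text)

results = search_all(shards, "sphinx index")
```

In a real deployment each shard would live on its own server and the merge step would be done by the search daemon rather than by your application code.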
I read that MySQL full-text search can cause table locking, meaning people can't insert into or update the table while it's being searched.
I read that there are search servers (Lucene and Sphinx) that can do it without table locking and even faster, but they require a lot of configuration and are hard to implement.
Is there any other way to do full-text search, or something like it, without using a search service? I don't want to configure one more server besides MySQL.
Create an extra table which will be used only to perform FULLTEXT searches. In your code you have to ensure that all data and actions (create, update, delete) are properly replicated to this table. This solution is also handy if your data tables use a different engine, e.g. InnoDB.
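A rough sketch of that pattern, assuming Python with the standard-library sqlite3 module as a stand-in for MySQL so it runs anywhere. The table names (articles, articles_search) and the helper functions are hypothetical, and LIKE is only a placeholder for the MATCH ... AGAINST query you would run on the MySQL side. The point is just that every create, update and delete touches both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
    -- Search-only copy; in MySQL this would be the table with the FULLTEXT index.
    CREATE TABLE articles_search (id INTEGER PRIMARY KEY, content TEXT);
""")


def create_article(article_id, title, body):
    with conn:  # one transaction here: both tables change or neither does
        conn.execute("INSERT INTO articles VALUES (?, ?, ?)", (article_id, title, body))
        conn.execute("INSERT INTO articles_search VALUES (?, ?)",
                     (article_id, title + " " + body))


def update_article(article_id, title, body):
    with conn:
        conn.execute("UPDATE articles SET title = ?, body = ? WHERE id = ?",
                     (title, body, article_id))
        conn.execute("UPDATE articles_search SET content = ? WHERE id = ?",
                     (title + " " + body, article_id))


def delete_article(article_id):
    with conn:
        conn.execute("DELETE FROM articles WHERE id = ?", (article_id,))
        conn.execute("DELETE FROM articles_search WHERE id = ?", (article_id,))


def search(term):
    # LIKE is only a stand-in for MATCH ... AGAINST on the MySQL side.
    rows = conn.execute(
        "SELECT id FROM articles_search WHERE content LIKE ? ORDER BY id",
        ("%" + term + "%",),
    )
    return [row[0] for row in rows]


create_article(1, "Sphinx setup", "Installing the indexer")
create_article(2, "MySQL locking", "MyISAM uses table-level locks")
update_article(1, "Sphinx setup", "Installing the indexer and searchd")
delete_article(2)
```

Note that if the search table is MyISAM and the data table is InnoDB, the two writes are not covered by a single transaction in MySQL, so the code has to cope with one succeeding and the other failing.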
Apache Lucene doesn't need much configuration and isn't hard to implement. Moreover, it's one of the most popular full-text search engines, and allows users to run very precise queries, like "to be or not to be", j?hn d?e, func*, etc.
I have already done some database indexing with Lucene, so if you could be a bit more precise about which fields of which tables you want to index, I can give you pieces of code which should do the trick.
I vote for Sphinx (sphinxsearch) anyway. One of its APIs is close to MySQL's, and it's easy to install and configure. It's not as universal as Apache Lucene, but it is quick and has been very helpful in my projects.
I just use the following code right now.
SELECT terms FROM searches WHERE MATCH(q) AGAINST('search term') LIMIT 20;
The table is MyISAM, 90 MB. terms has a FULLTEXT index and is varchar(255). There are 1,000,000 rows in the table.
I wonder if there is any solution that is more resource-friendly than full-text search in MySQL, especially in terms of memory.
By solution, I mean anything: other types of databases, table structures, etc.
And if the solution would be adaptable to a standard VPS or hosting in general, it would be extremely super duper perfect!
Thanks for your time!
I would check out Apache Solr. You can continue to use your MySQL database, have the Solr server index that column and use the Solr server to later do full-text searches on that column. There are even hosted solutions, see WebSolr.
I'm currently doing research on the best ways to provide an advanced search for my PHP project.
I have narrowed it down to using FULLTEXT search rather than using LIKE to find matching strings in a table row. However, to do this it appears I need to give up the InnoDB engine, which would cost me ACID transactions and table relationships.
Is it really worth using the MyISAM engine, or are there better ways of providing search functionality?
Any pointers would be appreciated!
It really depends on the application... Using MyISAM for anything that needs referential integrity is an instant fail. At the same time, its text search isn't all that efficient.
Basically, there are two ways to go. If you find you don't need true referential integrity, consider a NoSQL datastore. MongoDB is a great document store database.
If, on the other hand, you really need referential integrity, but also need fast, indexed full-text searching, you might do better to use Sphinx or Apache Solr to create an indexed cache for full-text search.
Either way, I consider MyISAM to be a legacy datastore. I wouldn't use it on a new project. YMMV.
MyISAM has several drawbacks: no transaction support, and table-level locks which make it very slow under heavy mixed read+write load. Another inconvenience of MyISAM tables is that they are not crash-safe, so you can lose some data in case of an unexpected shutdown or power loss on the server. However, MyISAM is very fast on some queries.
Regarding full-text search, I would suggest using InnoDB plus an external search engine like Lucene or Sphinx, so you can benefit from both a safe and reliable storage engine and fast full-text queries.
For quick start with InnoDB and Sphinx you can refer to http://astellar.com/2011/12/replacing-mysql-full-text-search-with-sphinx/
MySQL 5.6 supports FULLTEXT indexes with InnoDB (released Feb 2013). See:
http://dev.mysql.com/doc/refman/5.6/en/fulltext-search.html
I have some very large databases (some up to 150M rows) I'm working with, and after initially inserting the data there aren't many INSERTs going on; just a lot of SELECTs and JOINs.
I've been messing around with InfoBright a lot (the community version), and whilst I believe it is a good engine, I personally have been having some problems getting it to run like it should (fast).
So I was wondering if anyone else could recommend any other fast free storage engine for MySQL?
I'm just now checking out tokudb; is there anything else out there to check out as well?
You should look at InfiniDB too. http://infinidb.org/ (one of the fastest)
There are a lot of considerations you need to make before benchmarking any engine: hardware factors like multicore processors, memory and configuration, design factors related to your schema, and how all of this impacts engine performance.
Do check this blog out for how they do benchmarking of engines (it names other engine types) - http://www.mysqlperformanceblog.com/2010/01/07/star-schema-bechmark-infobright-infinidb-and-luciddb/
Note that this comparison is for a star schema design. If a columnar db engine doesn't suit your requirements, you can look into XtraDB, which is an extended version of InnoDB (not the fastest, but ACID compliant).
ps - Always track the properties (important to you) of each engine, like referential integrity checks, ACID compliance etc. Sometimes these limitations can be bigger deal breakers than a 10% increase in query performance.
Have you looked at Sphinx at all? While it is a search engine, it also supports query-less searches, which are similar to standard SELECT queries with indexes. I found it to be a huge help when dealing with large datasets. It's very fast, and is used heavily by high-traffic forums with millions (or hundreds of millions) of posts.
There is also a plugin for MySQL called SphinxSE, which allows it to act as a MySQL storage engine and makes integration very easy to set up. You build your indexes by supplying the indexer program a query, and then once it's all set up, you can query it as if it were a normal table.
http://sphinxsearch.com/docs/2.0.1/sphinxse-overview.html (note, I haven't used it much since pre 1.0)
Besides taking into consideration which DBMS you use, you should also focus on optimizing your tables, indices and queries.
Whenever you have multiple joins, join first on the most selective relation and then on the less selective.
Analyze your query execution plans.
Create indices on columns that are hit often in your QEPs.
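As a small illustration of "analyze the plan, then index what it hits", here is a sketch using Python's built-in sqlite3 module. SQLite's EXPLAIN QUERY PLAN stands in for MySQL's EXPLAIN here, and the posts table, its columns and the index name are invented for the example. The exact plan wording varies between SQLite versions, so the code just prints whatever the engine reports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO posts (author_id, title) VALUES (?, ?)",
    [(i % 100, "post %d" % i) for i in range(1000)],
)

QUERY = "SELECT title FROM posts WHERE author_id = ?"


def plan(sql, params):
    """Return the human-readable 'detail' column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index on author_id the engine has to scan the whole table.
plan_before = plan(QUERY, (42,))

# Index the column the plan showed being filtered on, then look again.
conn.execute("CREATE INDEX idx_posts_author ON posts (author_id)")
plan_after = plan(QUERY, (42,))

print(plan_before)
print(plan_after)
```

The same loop applies in MySQL: run EXPLAIN on the slow query, look for full scans on columns used in WHERE or JOIN conditions, add an index, and check that the plan actually changed.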
Brett -
When using Infobright, you get the best performance gains by:
1) Utilizing the Knowledge Grid as much as possible
2) reducing joins
3) creating 'lookup'
Since the Knowledge-Grid is in-memory, you can kill off a lot of query time just by adding additional filters. Also, consider using a nested select instead of a join. By doing so, you can use an already-created knowledge node (instead of generating a pack-to-pack node on the fly).
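For anyone unsure what "a nested select instead of a join" looks like, here is a tiny sketch. It runs on Python's built-in sqlite3 module rather than Infobright, with made-up customers and orders tables, so it only shows that the two query shapes return the same rows. It says nothing about Knowledge Grid behavior or speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US'), (3, 'EU');
    INSERT INTO orders VALUES (10, 1, 50), (11, 2, 70), (12, 3, 20), (13, 1, 5);
""")

# Join form: filter orders through the customers table.
join_sql = """
    SELECT o.id FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE c.region = 'EU' ORDER BY o.id
"""

# Nested-select form: the same filter expressed as an IN subquery.
nested_sql = """
    SELECT id FROM orders
    WHERE customer_id IN (SELECT id FROM customers WHERE region = 'EU')
    ORDER BY id
"""

join_ids = [row[0] for row in conn.execute(join_sql)]
nested_ids = [row[0] for row in conn.execute(nested_sql)]
```

The rewrite is safe here because customers.id is unique, so the join cannot duplicate order rows. Whether it is actually faster has to be measured on the real engine and data.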
If you have some queries that you think should be faster, post them, and I can help with potentially modifying the query to make it run faster.
Cheers,
Jeff