Is it bad practice to create a mirrored table (MyISAM) of the records in an InnoDB table for the purposes of doing full-text searches? I figure this way I'm just searching a copy of the data and if anything happens to that data it's not as big of a deal because it can always be re-created. But, it just feels awkward.
(MyISAM is the only engine that supports full-text searching, but I need to use the foreign key constraints offered by InnoDB)
Should I avoid this?
First of all, have you considered using a good search indexer, for example Lucene (http://lucene.apache.org/java/docs/)? It will speed up searches a lot, as it builds its own index tables.

If you definitely want to use the built-in MySQL full-text search, you could cut the MyISAM table down so that it contains just the primary key and the text data you want to search, then retrieve the proper data from the normal InnoDB tables once you know the key. That would avoid duplicating the other data in the table.
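A minimal sketch of that layout, with invented table and column names:

-- Authoritative data stays in InnoDB, keeping the foreign keys.
CREATE TABLE articles (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255) NOT NULL,
    body  TEXT NOT NULL
) ENGINE=InnoDB;

-- Search copy: just the primary key and the searchable text.
CREATE TABLE articles_search (
    article_id INT UNSIGNED NOT NULL PRIMARY KEY,
    body       TEXT NOT NULL,
    FULLTEXT KEY ft_body (body)
) ENGINE=MyISAM;

-- Find matching keys in the MyISAM copy, then fetch the real rows from InnoDB.
SELECT a.*
FROM articles_search s
JOIN articles a ON a.id = s.article_id
WHERE MATCH(s.body) AGAINST ('search terms');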
A related blog post says:
If you’re running Innodb Plugin on Percona Server with XtraDB you get benefit of a great new feature – ability to build indexes by sort instead of via insertion
However, I could not find any more info on this. I'd like the ability to reorganize how a table is laid out physically, similar to PostgreSQL's CLUSTER command or MyISAM's "ALTER TABLE ... ORDER BY". For example, the table "posts" has millions of rows in random insertion order, and most queries use "where userid = ". I want the rows belonging to one user to sit physically close together on disk, so that common queries need little IO. Is this possible with XtraDB?
Clarification concerning the blog post
The feature you are basically looking at is fast index creation. This feature speeds up the creation of secondary indexes on InnoDB tables, but it is only used in very specific cases. For example, it is not used during OPTIMIZE TABLE, which can therefore be sped up dramatically by dropping the indexes first, running OPTIMIZE TABLE, and then recreating the indexes with fast index creation (this is what the post you linked is about).
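As a sketch of that manual pattern, with invented table and index names:

-- Drop the secondary indexes first.
ALTER TABLE posts DROP INDEX idx_userid;

-- Rebuild the table; only the clustered (primary key) index is built by insertion.
OPTIMIZE TABLE posts;

-- Recreate the secondary index; it can now be built by sort (fast index creation).
ALTER TABLE posts ADD INDEX idx_userid (userid);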
Percona Server later added some automation for the cases that can be improved by applying the feature manually as above: a system variable named expand_fast_index_creation. If it is activated, the server uses fast index creation not only in the very specific default cases, but in all cases where it might help, such as OPTIMIZE TABLE, the problem mentioned in the linked blog article.
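Assuming a Percona Server version that ships this variable, using it looks like this:

-- Percona Server only; the variable does not exist in stock MySQL.
SET SESSION expand_fast_index_creation = ON;
OPTIMIZE TABLE posts; -- secondary indexes are now rebuilt by sort automatically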
Concerning your question
Your actual question was whether it is possible to store InnoDB tables in a custom order, to speed up specific kinds of queries by exploiting locality on disk.
This is not possible. InnoDB rows are stored in pages ordered by the clustered index (which is essentially the primary key). The rows and pages may end up in a chaotic order, in which case you can run OPTIMIZE TABLE on the InnoDB table. That command actually recreates the table in primary key order, which gathers rows that are adjacent in primary key order onto the same or neighboring pages.
That is all you can force InnoDB to do. You can read the manual about the clustered index, another page in the manual that gives a definitive answer that this is not possible ("ORDER BY does not make sense for InnoDB tables because InnoDB always orders table rows according to the clustered index.") and the same question on dba.stackexchange, whose answers might interest you.
I have a DB table that is MyISAM, used for fulltext searching. I also have a table that is InnoDB. I have a column in my MyISAM table that I want to match with a column in my InnoDB table. Can that be done? I can't seem to work it out!
http://dev.mysql.com/doc/refman/5.0/en/innodb-foreign-key-constraints.html
Foreign key definitions are subject to the following conditions:
Both tables must be InnoDB tables and they must not be TEMPORARY tables.
So I'm afraid you won't be able to achieve what you want.
I would recommend altering your DB architecture so that you have one set of tables designed for data integrity and writing (all InnoDB), and a second set designed for search - possibly on a different box, and possibly not even using MySQL, but maybe a search server like Solr or Sphinx, which should outperform a fulltext MySQL table. You could then populate your search DB periodically from your write DB.
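If the search copy does stay in MySQL, the periodic refresh can be as simple as this sketch (invented names; a real setup would refresh incrementally rather than in full):

-- Rebuild the MyISAM search copy in bulk from the authoritative InnoDB table.
TRUNCATE TABLE search_articles;
INSERT INTO search_articles (article_id, body)
SELECT id, body FROM articles;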
I was trying to implement full-text search for a simple engine, and I have realized I cannot do it with InnoDB.

Now I'm wondering whether I can convert only that table to MyISAM and keep working with the other ones at the same time. This table has a foreign key to another table, which is in turn related to many others.
Thanks.
I'm really lost on which type of database engine I should pick for my table.
+-----------------------+
| id | userid | content |
+-----------------------+
Imagine this table. userid holds user IDs that are stored in another table. Also, some other tables use the id field of this table. Therefore, I thought that setting id as the primary key and userid as a foreign key would speed up joins. However, if I make my table InnoDB in order to set foreign keys, then I cannot conduct a FULLTEXT search on content (which is a TEXT field).
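Concretely, the conflict looks like this (a users table is assumed; this applies to MySQL 5.5 and earlier, where FULLTEXT is MyISAM-only):

CREATE TABLE posts (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    userid  INT UNSIGNED NOT NULL,
    content TEXT NOT NULL,
    FOREIGN KEY (userid) REFERENCES users (id)
) ENGINE=InnoDB;

-- Fails with "The used table type doesn't support FULLTEXT indexes":
ALTER TABLE posts ADD FULLTEXT KEY ft_content (content);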
So basically, if I switch back to MyISAM to use the FULLTEXT searches, will I have problems when joining, say, 3-4 tables of hundreds of millions of rows?
PS: If there is another viable way to create the tables so that they handle both joins and fulltext searches, please tell me, and I can change the table structure as well.
Take a look at the answer for this question: Fulltext Search with InnoDB
In short, MyISAM locks an entire table when you write to it, so that will be bad for performance when you have a lot of writes to the table. The solution is to go for InnoDB tables for the referential integrity, and use a dedicated search engine for the indexing/searching of the content (for example Lucene).
InnoDB scales better than MyISAM. If you're talking about hundreds of millions of rows, then go for InnoDB and adopt a search engine. AFAIK, FULLTEXT becomes really slow after a certain point. Therefore, go for InnoDB + a search engine of your choice.
Consider this scenario: my database table has 300,000 rows and has a fulltext index. Whenever a search is done, it locks the database and doesn't allow anyone else to log in to the portal.

Any advice on how to get things sorted out here will be really appreciated.
Does logging on perform a write to the table, e.g. a 'last visit' time?
If so, you can expect behaviour something like this, because MyISAM writes take a lock over the entire table. Usually this is avoided by not using noddy MyISAM and going to InnoDB instead, which has row-level locking (amongst other desirable database features).
The problem, of course, is that you only get fulltext search with MyISAM.
So you'll need to split your tables up. If you can keep the read-heavy and fulltext stuff in a different table to the stuff that needs writing (but linked using the same primary key), you can probably make it so that the two operations don't affect each other.
Better, migrate the bulk of the table to InnoDB, leaving only a fulltext field in MyISAM. Everything except fulltext searches can then steer clear of the MyISAM table and use only the InnoDB table, which has much better locking behaviour.

Personally, I now tend to store everything in the InnoDB table, including the text, and keep a second copy of the text in the MyISAM table purely for fulltext searchbait purposes. This simplifies queries and code, and brings the advantages of InnoDB's consistency to the text content; I also use it to process the searchbait to get stemming and other features MySQL's fulltext doesn't normally support. But it does mean you have to spend a lot more space on storage.
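One way to keep such a MyISAM searchbait copy in step with the InnoDB table is a pair of triggers; a minimal sketch with hypothetical names and no stemming step:

-- posts (InnoDB) holds the authoritative text; posts_search (MyISAM) holds the searchbait.
-- Note the MyISAM copy is not transactional, so a crash can leave it stale.
CREATE TRIGGER posts_search_insert AFTER INSERT ON posts
FOR EACH ROW
    INSERT INTO posts_search (post_id, searchbait) VALUES (NEW.id, NEW.body);

CREATE TRIGGER posts_search_update AFTER UPDATE ON posts
FOR EACH ROW
    UPDATE posts_search SET searchbait = NEW.body WHERE post_id = NEW.id;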
You can also improve matters by cutting down the number of writes. For example, if it is a 'last visit' timestamp you're writing, you can avoid writing it unless, say, a minute has passed between the previous time and now, on the basis that no-one needs to know the exact second someone last accessed the site.
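A sketch of that thresholding (column names and the one-minute threshold are invented; better still, do the check in the application so that no statement touches the table at all):

-- Only write last_visit when the stored value is at least a minute old.
UPDATE users
SET last_visit = NOW()
WHERE id = 42
  AND (last_visit IS NULL OR last_visit < NOW() - INTERVAL 1 MINUTE);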
If you use an external search engine, or MySQL search plug-ins for Lucene or Sphinx, they should be able to read and index without locking the table. They store a local version of the indexed records, so they don't have to read the table very often, and never need to write to it.