Adding a FULLTEXT index on existing db - mysql

I have a MySQL table that has 165,716 records (and counting). The table is 233.3 MB in size. Now I want to add a FULLTEXT index to a column in that table. Is that possible, or is it going to be a problem?
Kind regards

It's possible if the table uses the MyISAM engine, or InnoDB on MySQL 5.6 and later, but the initial indexing is likely to take a long time to complete.
[edit] I misread the size - I thought it was 2.3GB, not 233MB! If that's the case, the indexing shouldn't take that long.
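For illustration, adding the index to an existing table might look like the sketch below. The table `articles`, its columns `id` and `body`, and the index name `ft_body` are hypothetical stand-ins for the asker's schema, which the question does not show.

```sql
-- Hypothetical names: table `articles`, columns `id` and `body`, index `ft_body`.

-- Check which storage engine the table uses.
SELECT ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'articles';

-- Add the FULLTEXT index to the existing table.
-- This reads every row once, so run time grows with table size.
ALTER TABLE articles ADD FULLTEXT INDEX ft_body (body);

-- Query through the new index.
SELECT id, body
FROM articles
WHERE MATCH(body) AGAINST('mysql' IN NATURAL LANGUAGE MODE);
```

Trying the ALTER TABLE on a copy of the table first gives a realistic estimate of how long the indexing takes.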

Related

innodb_ft_min_token_size = 1 performance implications

If I change innodb_ft_min_token_size to 1 from the default of 3, will this cause a lot more disk usage? Are there any performance issues with search?
I want full-text search to match words that are only 1 character long.
Also, once I make this change, how do I rebuild the index? Will this put a lot of load on the server?
There are not that many 1- and 2- letter words, so the space change may not be that great.
Modifying innodb_ft_min_token_size, innodb_ft_max_token_size, or ngram_token_size [in my.cnf] requires restarting the server.
To rebuild FULLTEXT indexes for an InnoDB table, use ALTER TABLE with the DROP INDEX and ADD INDEX options to drop and re-create each index.
-- https://dev.mysql.com/doc/refman/8.0/en/fulltext-fine-tuning.html
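The drop-and-re-create step quoted above could be sketched roughly as follows. The table `articles`, column `body`, and index name `ft_body` are assumptions made for the example; substitute the real names.

```sql
-- Hypothetical names: table `articles`, column `body`, index `ft_body`.

-- After editing my.cnf and restarting, confirm the value the server is using.
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Drop and re-create the FULLTEXT index so it is rebuilt with the new token size.
ALTER TABLE articles DROP INDEX ft_body;
ALTER TABLE articles ADD FULLTEXT INDEX ft_body (body);
```

Between the two ALTER statements the table has no FULLTEXT index, so MATCH ... AGAINST queries on that column will fail until the second statement finishes.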
The "Scope" of innodb_ft_min_token_size is "Global". That is, it applies to all InnoDB FT indexes.
-- https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_ft_min_token_size
Recreating the index will read the entire table and rebuild the FT index, which will "lock" the table at some level for some period of time. The time to rebuild will be roughly proportional to the size of the table. And it will consume a bunch of extra disk space until it is finished. (The table and all the indexes will be copied over and at least the FT index will be rebuilt.)
If you have a thousand rows, no big deal. If you have a billion rows, you will need a long "downtime".
After changing innodb_ft_min_token_size, I would be afraid to do a short wildcard test like
AGAINST('a*' IN BOOLEAN MODE)
If you have a test server, simply try it.
I noticed that the documentation recommends a value of 1 for Chinese, etc.

re-indexing in mysql

I have a table which already contains an index in MySQL. I added some rows to the table, do I need to re-index the table somehow or does MySQL do this for me automatically?
This is done automatically. It is also why we sometimes avoid creating indexes: updating parts of an index on every insert has a small but nonzero performance overhead.
If you define an index in MySQL then it will always reflect the current state of the database unless you have deliberately disabled indexing. As soon as indexing is re-enabled, the index will be brought up to date. Usually indexing is only disabled during large insertions for performance reasons.
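One way indexing gets deliberately disabled is ALTER TABLE ... DISABLE KEYS. The sketch below assumes a hypothetical MyISAM table `import_log` with a `message` column. As far as I know, the statement only affects non-unique indexes on MyISAM tables and has no effect on InnoDB.

```sql
-- Hypothetical MyISAM table `import_log` with a `message` column.

-- Stop maintaining non-unique indexes during the bulk insert.
ALTER TABLE import_log DISABLE KEYS;

INSERT INTO import_log (message) VALUES ('first row'), ('second row');

-- Rebuild the disabled indexes in one pass; they are up to date afterwards.
ALTER TABLE import_log ENABLE KEYS;
```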
There is a cost associated with each index on your table. While a good index can speed up retrieval times immensely, every index you define slows insertion by a small amount. The insertion costs grow slowly with the size of the database. This is why you should only define indexes you absolutely need if you are going to be working on large sets of data.
If you want to see what indexes are defined, you can use SHOW CREATE TABLE to have a look at a particular table.
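A minimal example, using a hypothetical table name `my_table`:

```sql
-- Hypothetical table name: `my_table`.

-- Full table definition, including every index.
SHOW CREATE TABLE my_table;

-- One row per indexed column, with index name, column order, and cardinality.
SHOW INDEX FROM my_table;
```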
No, you don't need to rebuild the index.
Inserting records updates the existing index automatically.

I am inserting a lot of records into a large table; should I remove the indices until I am finished?

I am importing a lot of data (something like 75 million inserts) into a MySQL database with a few different tables.
I have indexes on a lot of columns. Should I remove them while I do the inserts and just add them back after it is done? Will that have a significant impact on performance?
I get the feeling the import has slowed down now that I have imported a few hundred thousand records, and I suspect the indexes might be the cause.
Would any more information be useful?
There is no yes/no answer for that.
The most important point is this: if the table is in use while you are importing, do not disable the indices. Just imagine a few simple queries falling back to full table scans after you have inserted 74 million records.
Closely related: can you make sure the table is not needed after it has been filled but before the indices have been rebuilt?
If you can do the insert on a completely "cold" table, I'd drop the indices and rebuild them later.
Yes, certainly. You should remove the indexes before importing a large number of records. Otherwise the import will not only take a long time, it will also leave the existing indexes heavily fragmented, and you will have to rebuild them anyway to restore optimal performance.
If you remove the indexes before the import, the import will be faster. Create the indexes again afterwards: they will be built fresh, with no fragmentation, and search performance will be better as well.
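The drop, import, re-create sequence might look like this. The table `import_target`, its columns, and the index names are all hypothetical, and the sketch assumes nothing else queries the table during the import.

```sql
-- Hypothetical names: table `import_target`, columns `status` and `created_at`,
-- indexes `idx_status` and `idx_created`.

-- 1. Drop the secondary indexes (keep the primary key).
ALTER TABLE import_target
  DROP INDEX idx_status,
  DROP INDEX idx_created;

-- 2. Run the bulk import here (multi-row INSERTs, LOAD DATA, etc.).

-- 3. Re-create the indexes in a single statement.
ALTER TABLE import_target
  ADD INDEX idx_status (status),
  ADD INDEX idx_created (created_at);
```

Combining the ADD INDEX clauses into one ALTER TABLE keeps the re-indexing to a single statement instead of one per index.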

Different ways to create index in mysql?

If I use
CREATE INDEX status_index ON eligible_users (status)
or
CREATE INDEX status ON eligible_users (status)
is it the same thing, with no difference?
Also, if I create a lot of indexes, will it actually help with queries or slow them down?
Both statements you wrote do exactly the same thing; the only difference is the name of the index.
As for usefulness, it depends on the database setup and usage. Indexes are useful to speed up queries, but they have to be maintained on every INSERT/UPDATE, so it depends. There are a lot of resources available online about this wide topic.
An index can make or break a query. The execution time for certain queries can go from minutes to fractions of a second just by adding the correct indexes. In case you need to improve a query you can always prepend EXPLAIN to it, to see what MySQL's execution plan is: it will show what indexes the query uses (if any) and will help you troubleshoot some bottlenecks.
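As a small illustration, using the eligible_users table from the question (the value 'active' is made up for the example):

```sql
-- 'active' is a made-up status value for illustration.
EXPLAIN SELECT * FROM eligible_users WHERE status = 'active';
```

In the output, the `key` column names the index MySQL chose (NULL if none), and a `type` of ALL means a full table scan.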
As said, an index is useful but is not free. It has to be kept up to date, so every time you insert or modify data in an indexed field, then the index must be updated too.
Generally in cases where you have a lot of reads and (relatively) few writes, indexes help a lot. But unnecessary indexes can degrade performance instead of improving it.
The short syntax for creating a single-column index on column col of table tbl is:
CREATE INDEX index_name ON tbl (col)
Full details available in the MySQL Manual.
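To make the "different ways" concrete, here is a sketch of three common forms. The first two are alternatives for an existing table (running both would fail with a duplicate key name). The third uses a hypothetical table `eligible_users_copy` to show an index declared at creation time.

```sql
-- Option 1: CREATE INDEX on an existing table.
CREATE INDEX status_index ON eligible_users (status);

-- Option 2: the equivalent ALTER TABLE form (use instead of option 1, not in addition).
-- ALTER TABLE eligible_users ADD INDEX status_index (status);

-- Option 3: declare the index when creating a table.
-- `eligible_users_copy` and its columns are hypothetical.
CREATE TABLE eligible_users_copy (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  status VARCHAR(20) NOT NULL,
  INDEX status_index (status)
);
```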

Mysql - Index Performances

Are there any performance issues if you create an index with multiple columns, or should you create one index per column?
There's nothing inherently wrong with a multi-column index -- it depends completely on how you're going to query the data. If you have an index on colA+colB, it will help for queries like where colA='value' or where colA='value' and colB='value', but it's not going to help for queries like where colB='value' alone.
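The leftmost-prefix behaviour described above can be checked with EXPLAIN. The table `t`, its columns, and the index name `idx_a_b` are hypothetical, and on a very small table the optimizer may pick a different plan than it would on real data.

```sql
-- Hypothetical table and index names.
CREATE TABLE t (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  colA VARCHAR(50) NOT NULL,
  colB VARCHAR(50) NOT NULL,
  payload TEXT,
  INDEX idx_a_b (colA, colB)
);

-- Can use idx_a_b: colA is the leftmost column of the index.
EXPLAIN SELECT * FROM t WHERE colA = 'value';

-- Can use idx_a_b on both columns.
EXPLAIN SELECT * FROM t WHERE colA = 'value' AND colB = 'value';

-- Cannot use idx_a_b for a lookup: colB is not a leftmost prefix.
EXPLAIN SELECT * FROM t WHERE colB = 'value';
```

If queries on colB alone are common, a separate index on colB is the usual remedy.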
Advantages of MySQL Indexes
Generally speaking, indexing a MySQL database gives you three advantages:
Query optimization: Indexes make search queries much faster.
Uniqueness: Indexes like primary key index and unique index help to avoid duplicate row data.
Text searching: Since MySQL version 3.23.23, full-text indexes let users optimize searching against even large amounts of text located in any field indexed as such.
Disadvantages of MySQL indexes
When an index is created on the column(s), MySQL also creates a separate file that is sorted, and contains only the field(s) you're interested in sorting on.
Firstly, indexes take up disk space. Usually the space usage isn’t significant, but if you create an index on every column in every possible combination, the index file will grow much more quickly than the data file. When the table is large, the index file could reach the operating system’s maximum file size.
Secondly, indexes slow down write queries such as INSERT, UPDATE and DELETE. MySQL has to internally maintain the “pointers” to the inserted rows in the actual data file, so there is a performance price to pay: every time a record is changed, the indexes must be updated. However, you may be able to write your queries in a way that does not cause very noticeable performance degradation.