Fulltext and composite indexes and how they affect the query - mysql

Just say I had a query as below..
SELECT
name,category,address,city,state
FROM
table
WHERE
MATCH(name,subcategory,category,tag1) AGAINST('education')
AND
city='Oakland'
AND
state='CA'
LIMIT
0, 10;
..and I had a fulltext index as name,subcategory,category,tag1 and a composite index as city,state; is this good enough for this query? Just wondering if something extra is needed when mixing additional AND's when making use of the fulltext index with the MATCH/AGAINST.
Edit: What I am trying to understand is, what happens with the additional columns that are within the query but are not indexed in the chosen index (the fulltext index), the above example being city and state. How does MySQL now find the matching rows for these since it can't use two indexes (or can it?) - so, basically, I'm trying to understand how MySQL goes about finding the data optimally for the columns NOT in the chosen fulltext index and if there is anything I can or should do to optimize the query.

If I understand your question, you know that MATCH ... AGAINST uses your FULLTEXT index, and you're wondering how MySQL goes about applying the rest of the WHERE clause (i.e., does it do a tablescan or an indexed lookup).
Here's what I'm assuming about your table: it has a PRIMARY KEY on some id column and the FULLTEXT index.
So first off, MySQL will never use the FULLTEXT index for the city/state WHERE clause. Why? Because FULLTEXT indexes only apply with MATCH AGAINST. See here in the paragraph after the first set of bullets (not the Table of Contents bullets).
EDIT: In your case, assuming your table doesn't only have like 10 rows, MySQL will apply the FULLTEXT index for your MATCH AGAINST, then do a tablescan on those results to apply the city/state WHERE.
So what if you add a BTREE index onto city and state?
CREATE INDEX city__state ON table (city(10),state(2)) USING BTREE;
Well, MySQL can only use one index for this query since it's a simple select. It will either use the FULLTEXT or the BTREE. Note that when I say one index, I mean one index definition, not one column in a multi-part index. Anyway, this raises the question: which one does it use?
That depends on the table analysis. MySQL will attempt to estimate (based on table stats from the last ANALYZE TABLE or OPTIMIZE TABLE) which index will prune the most records. If the city/state WHERE gets you down to 10 records while the MATCH AGAINST only gets you down to 100, then MySQL will use the city__state index first for the city/state WHERE and then do a tablescan for the MATCH AGAINST.
On the other hand, if the MATCH AGAINST gets you down to 10 records while the city/state WHERE gets you down to only 1000, then MySQL will apply the FULLTEXT index first and tablescan for city and state.
The bottom line is the cardinality of your index. Essentially, how unique are the values that will go into your index? If every record in your table has city set to Oakland, then it's not a very unique key and so having city = 'Oakland' doesn't really reduce the number of records all that much for you. In that case, we say your city__state index has a low cardinality.
Likewise, if 90% of the words in your FULLTEXT index are "John", then that doesn't really help you much either, for the exact same reasons.
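To look at the cardinality MySQL has recorded for each index, something along these lines may help. It is only a sketch: `listings` is a made-up name standing in for the table in the question, while ANALYZE TABLE and SHOW INDEX are standard MySQL statements.

```sql
-- `listings` is a hypothetical name standing in for the question's table.
-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE listings;

-- The Cardinality column is MySQL's estimate of the number of distinct
-- values in each index.
SHOW INDEX FROM listings;

-- Total row count, for comparison with the Cardinality estimates.
SELECT COUNT(*) AS total_rows FROM listings;
```

A Cardinality close to total_rows suggests the index is selective; a value that is tiny compared to total_rows suggests it will prune very little.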
If you can afford the space and the UPDATE/DELETE/INSERT overhead, I would recommend adding the BTREE index and letting MySQL decide which index he wants to use. In my experience, he usually does a very good job of picking the right one.
I hope that answers your question.
EDIT: On a side note, make sure you pick the right prefix length for your BTREE index (in my example I picked the first 10 characters of city). This has a big impact on cardinality: if you picked city(1), you'd get a much lower cardinality than if you did city(10).
EDIT2: MySQL's query plan (estimation) for which index prunes the most records is what you see in EXPLAIN.
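As a rough illustration, this is what running EXPLAIN on the query from the question could look like. The table name `listings` is an assumption (the question just calls it "table"), and the output will depend on your data and MySQL version.

```sql
-- `listings` is a hypothetical stand-in for the table in the question.
EXPLAIN
SELECT name, category, address, city, state
FROM listings
WHERE MATCH(name, subcategory, category, tag1) AGAINST('education')
  AND city = 'Oakland'
  AND state = 'CA'
LIMIT 0, 10;

-- Things to look at in the output:
--   possible_keys : indexes the optimizer considered
--   key           : the index it actually picked
--   type          : "fulltext" when the FULLTEXT index drives the lookup
--   Extra         : "Using where" when the remaining conditions are
--                   checked row by row after the index lookup
```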

I think you can easily determine which index gets used by using EXPLAIN on your query. Please check the accepted answer for this question, which provides some good resources on how to interpret the output of EXPLAIN.
"How does MySQL now find the matching rows for these since it can't use two indexes"
Yes it can: Can MySQL use multiple indexes for a single query? Also, you should read the documentation: How MySQL Uses Indexes

I had a similar task some time ago, and I noticed that MySQL can use either the FULLTEXT index or any other index/indexes in one query, but not both; I wasn't able to mix FULLTEXT with any other index. Any selection with fulltext search will work in this way:
select subset using FULLTEXT search
select records matching other criteria from that subset 'Using where'
So you can use either fulltext index or any other index (I wasn't able to use both indexes by FORCE INDEX or anything else).
I suggest trying with both using fulltext and using other index (i.e. on City and State columns) and compare the results - they may vary depending on actual content in your database.
In my case I have discovered that forcing regular (non-fulltext) index in such query produced better performance (since I had very large number of rows, about 300 000, and non-fulltext criteria matched about 1000 of them).
I was using MySQL 5.5.24
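The comparison described above could be sketched like this. The names `listings` and `city__state` are assumptions borrowed from the question and the first answer, and whether the hint is honored next to MATCH ... AGAINST may depend on your MySQL version and storage engine, so check each variant with EXPLAIN and time both.

```sql
-- Hypothetical names: table `listings`, BTREE index `city__state` on (city, state).

-- Variant 1: no hint, the optimizer chooses (typically the FULLTEXT index).
SELECT name, category, address, city, state
FROM listings
WHERE MATCH(name, subcategory, category, tag1) AGAINST('education')
  AND city = 'Oakland'
  AND state = 'CA'
LIMIT 0, 10;

-- Variant 2: ask for the regular index; the intent is that rows are found
-- by city/state first and the MATCH condition is applied as a filter.
SELECT name, category, address, city, state
FROM listings FORCE INDEX (city__state)
WHERE MATCH(name, subcategory, category, tag1) AGAINST('education')
  AND city = 'Oakland'
  AND state = 'CA'
LIMIT 0, 10;
```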

Related

Other Indexes not working with full text index in mysql [duplicate]

I would like to make a system which allows searching user messages by a specific user.
Assume the following table:
create table messages(
user_id int,
message nvarchar(500));
So what kind of index should I use here if I want to search for all messages from user 1 containing the word 'foo'?
Simple, non-unique index on user_id
It will filter only the specific user's messages and then do a full scan for the specific word.
FULLTEXT index on message
This will find all matching messages from all users and then filter by ID, which seems very inefficient with a large number of users.
Compound index on both user_id and message
So a full-text index tree is created for each user separately, so they can be searched individually. During a query the system filters messages by ID and then performs the text search on the remaining rows in the index.
A.F.A.I.K. the last one is impossible. So I assume I should use the 1st option; will it perform better with a few thousand users?
And if each user has ~100 messages, a full iteration won't cost many resources?
Perhaps I can include the username in the message and use BOOLEAN full-text search mode, but I think it would be slower than using an indexed user_id.
@Alden Quimby's answer is correct as far as it goes, but there is more to the story, because MySQL will only try to choose the optimal index, and its ability to make that determination is limited because of the way fulltext indexes interact with the optimizer.
What actually happens is this:
If the specified user_id exists in either 0 or 1 matching rows in the table, the optimizer will realize this and will choose user_id as the index for that query. Fast execution.
Otherwise, the optimizer will choose the fulltext index, filtering every row matched by the fulltext index to eliminate rows not containing a user_id that matches the WHERE clause. Not quite as fast.
So it's not truly the "optimum" path. It's more like fulltext, with a nice optimization to avoid the fulltext search under the one condition that we know we have almost nothing of interest in the table.
The reason this breaks down is that a fulltext index doesn't give any meaningful statistics back to the optimizer. It just says "yeah, I think that query should probably only require me to check 1 row" ... which, of course, pleases the optimizer greatly, so the fulltext index wins the bid for lowest cost, unless the index with the integer value also comes in comparably low or lower.
Still, that doesn't mean I wouldn't try it this way first.
There's another option, which would work best with fulltext queries IN BOOLEAN MODE and that is to create another column which you would populate with something like CONCAT('user_id_',user_id) or something similar, and then declare a 2-column fulltext index.
filter_string VARCHAR(48) # populated with CONCAT('user_id_',user_id);
....
FULLTEXT KEY (message,filter_string)
Then specify everything in the query.
SELECT ...
WHERE user_id = 500 AND
MATCH (message,filter_string) AGAINST ('+kittens +puppies +user_id_500' IN BOOLEAN MODE);
Now, the fulltext index will be responsible for matching only those rows where kittens, puppies, and "user_id_500" appears in the combined fulltext index of the two columns, but you'd still want to have the integer filter there too to make sure the final results are constrained in spite of any random appearance of "user_id_500" in the message.
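The fragments above could be fleshed out roughly as follows. This is only a sketch against the messages table from the question: the index name `ft_message_filter` is made up, and it assumes the application (or a trigger) keeps filter_string filled in for new rows.

```sql
-- Helper column from the answer above. All columns in one FULLTEXT index
-- must share a character set and collation, so it is declared NVARCHAR
-- here to match `message` in the question's table.
ALTER TABLE messages
  ADD COLUMN filter_string NVARCHAR(48) NULL;

-- Backfill existing rows with a token that is unique per user.
UPDATE messages
SET filter_string = CONCAT('user_id_', user_id);

-- Two-column FULLTEXT index; `ft_message_filter` is a hypothetical name.
ALTER TABLE messages
  ADD FULLTEXT KEY ft_message_filter (message, filter_string);
```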
You should add a fulltext index on message and a regular index on user_id, and use the query:
SELECT *
FROM messages
WHERE MATCH(message) AGAINST(@search_query)
AND user_id = @user_id;
You're right that you can't do option 3. But rather than trying to pick between 1 and 2, let MySQL do the work for you. MySQL will only use one of the two indexes, and will do a linear scan to complete the second filter, but it will estimate the effectiveness of each index and choose the optimal one.
Note: only do this if you can afford the overhead of two indexes (slower insert/update/delete). Also, if you know that each user will only have a few messages, then yes it might make sense to use a simple index and do a regex in the application layer or something like that.
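For reference, the two indexes suggested here could be created like this. The index names and the literal search values are placeholders of mine, not something from the question.

```sql
-- Index names are hypothetical.
ALTER TABLE messages ADD FULLTEXT INDEX ft_message (message);
ALTER TABLE messages ADD INDEX idx_user_id (user_id);

-- Check which of the two the optimizer picks for a given search.
EXPLAIN
SELECT *
FROM messages
WHERE MATCH(message) AGAINST('kittens')
  AND user_id = 500;
```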
Turn on the "Optimizer trace" and look for "considered_execution_plans". I contend that the Optimizer will always pick the FULLTEXT index, even when some other index might be better. This may be because it is quite costly when the MATCH is not pre-computed as when the FT index is built.
More on Optimizer Trace: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#optimizer_trace (Earlier in that doc are my tips on FULLTEXT.)
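A minimal way to capture such a trace, assuming MySQL 5.6 or later and the messages table from the question (the search values are placeholders):

```sql
-- Enable tracing for the current session only.
SET optimizer_trace = 'enabled=on';

-- Run the statement you want to inspect.
SELECT *
FROM messages
WHERE MATCH(message) AGAINST('kittens')
  AND user_id = 500;

-- The TRACE column holds a JSON document; search it for
-- "considered_execution_plans".
SELECT TRACE FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;

-- Turn tracing off again.
SET optimizer_trace = 'enabled=off';
```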

Improve Mysql Select Query Performance [duplicate]

I've been using indexes on my MySQL databases for a while now but never properly learnt about them. Generally I put an index on any fields that I will be searching or selecting using a WHERE clause but sometimes it doesn't seem so black and white.
What are the best practices for MySQL indexes?
Example situations/dilemmas:
If a table has six columns and all of them are searchable, should I index all of them or none of them?
What are the negative performance impacts of indexing?
If I have a VARCHAR 2500 column which is searchable from parts of my site, should I index it?
You should definitely spend some time reading up on indexing, there's a lot written about it, and it's important to understand what's going on.
Broadly speaking, an index imposes an ordering on the rows of a table.
For simplicity's sake, imagine a table is just a big CSV file. Whenever a row is inserted, it's inserted at the end. So the "natural" ordering of the table is just the order in which rows were inserted.
Imagine you've got that CSV file loaded up in a very rudimentary spreadsheet application. All this spreadsheet does is display the data, and numbers the rows in sequential order.
Now imagine that you need to find all the rows that have some value "M" in the third column. Given what you have available, you have only one option. You scan the table checking the value of the third column for each row. If you've got a lot of rows, this method (a "table scan") can take a long time!
Now imagine that in addition to this table, you've got an index. This particular index is the index of values in the third column. The index lists all of the values from the third column, in some meaningful order (say, alphabetically) and for each of them, provides a list of row numbers where that value appears.
Now you have a good strategy for finding all the rows where the value of the third column is "M". For instance, you can perform a binary search! Whereas the table scan requires you to look at N rows (where N is the number of rows), the binary search only requires that you look at about log N index entries, in the very worst case. Wow, that's sure a lot easier!
Of course, if you have this index, and you're adding rows to the table (at the end, since that's how our conceptual table works), you need to update the index each and every time. So you do a little more work while you're writing new rows, but you save a ton of time when you're searching for something.
So, in general, indexing creates a tradeoff between read efficiency and write efficiency. With no indexes, inserts can be very fast -- the database engine just adds a row to the table. As you add indexes, the engine must update each index while performing the insert.
On the other hand, reads become a lot faster.
Hopefully that covers your first two questions (as others have answered -- you need to find the right balance).
Your third scenario is a little more complicated. If you're using LIKE, indexing engines will typically help with your read speed up to the first "%". In other words, if you're SELECTing WHERE column LIKE 'foo%bar%', the database will use the index to find all the rows where column starts with "foo", and then need to scan that intermediate rowset to find the subset that contains "bar". SELECT ... WHERE column LIKE '%bar%' can't use the index. I hope you can see why.
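You can see the difference for yourself with EXPLAIN. The table below is purely hypothetical, and on a tiny or empty table the optimizer may report something different from what the comments suggest.

```sql
-- Hypothetical table and index names.
CREATE TABLE articles (
  id    INT AUTO_INCREMENT PRIMARY KEY,
  title VARCHAR(100) NOT NULL,
  INDEX idx_title (title)
);

-- The prefix before the first % can narrow the index range, so on a
-- populated table expect a range lookup on idx_title.
EXPLAIN SELECT id FROM articles WHERE title LIKE 'foo%bar%';

-- A leading wildcard gives the index nothing to seek on, so every entry
-- (of the index or the table) has to be examined.
EXPLAIN SELECT id FROM articles WHERE title LIKE '%bar%';
```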
Finally, you need to start thinking about indexes on more than one column. The concept is the same, and behaves similarly to the LIKE stuff -- essentially, if you have an index on (a,b,c), the engine will continue using the index from left to right as best it can. So a search on column a might use the (a,b,c) index, as would one on (a,b). However, the engine would need to do a full table scan if you were searching WHERE b=5 AND c=1.
Hopefully this helps shed a little light, but I must reiterate that you're best off spending a few hours digging around for good articles that explain these things in depth. It's also a good idea to read your particular database server's documentation. The way indices are implemented and used by query planners can vary pretty widely.
Check out presentations like More Mastering the Art of Indexing.
Update 12/2012: I have posted a new presentation of mine: How to Design Indexes, Really. I presented this in October 2012 at ZendCon in Santa Clara, and in December 2012 at Percona Live London.
Designing the best indexes is a process that has to match the queries you run in your app.
It's hard to recommend any general-purpose rules about which columns are best to index, or whether you should index all columns, no columns, which indexes should span multiple columns, etc. It depends on the queries you need to run.
Yes, there is some overhead so you shouldn't create indexes needlessly. But you should create the indexes that give benefit to the queries you need to run quickly. The overhead of an index is usually far outweighed by its benefit.
For a column that is VARCHAR(2500), you probably want to use a FULLTEXT index or a prefix index:
CREATE INDEX i ON SomeTable(longVarchar(100));
Note that a conventional index can't help if you're searching for words that may be in the middle of that long varchar. For that, use a fulltext index.
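A sketch of the fulltext alternative, reusing the SomeTable and longVarchar names from the example above (the index name and the search word are made up). Keep in mind that a fulltext search matches whole words rather than arbitrary substrings, and that InnoDB tables need MySQL 5.6 or later for this.

```sql
-- `ft_long` is a hypothetical index name.
CREATE FULLTEXT INDEX ft_long ON SomeTable (longVarchar);

-- Finds rows whose longVarchar contains the word 'needle' anywhere.
SELECT *
FROM SomeTable
WHERE MATCH(longVarchar) AGAINST('needle');
```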
I won't repeat some of the good advice in other answers, but will add:
Compound Indices
You can create compound indices - an index that includes multiple columns. MySQL can use these from left to right. So if you have:
Table A
Id
Name
Category
Age
Description
if you have a compound index that includes Name/Category/Age in that order, these WHERE clauses would use the index:
WHERE Name='Eric' and Category='A'
WHERE Name='Eric' and Category='A' and Age > 18
but
WHERE Category='A' and Age > 18
would not use that index because everything has to be used from left to right.
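To try the left-to-right rule out, a throwaway table like the following might do. The column types and the index name are my assumptions; only the column names come from the example above.

```sql
-- Column types are guesses, for illustration only.
CREATE TABLE A (
  Id          INT AUTO_INCREMENT PRIMARY KEY,
  Name        VARCHAR(50),
  Category    CHAR(1),
  Age         INT,
  Description TEXT,
  INDEX idx_name_category_age (Name, Category, Age)
);

-- Both of these start at the leftmost column, so the compound index
-- should show up in the key column of the EXPLAIN output.
EXPLAIN SELECT * FROM A WHERE Name = 'Eric' AND Category = 'A';
EXPLAIN SELECT * FROM A WHERE Name = 'Eric' AND Category = 'A' AND Age > 18;

-- Name is missing here, so the index cannot be used for the lookup.
EXPLAIN SELECT * FROM A WHERE Category = 'A' AND Age > 18;
```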
Explain
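Use EXPLAIN (or EXPLAIN EXTENDED on older versions; the EXTENDED keyword was deprecated in MySQL 5.7 and removed in 8.0) to understand what indices are available to MySQL and which one it actually selects. MySQL will usually use only ONE index per table in a query.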
EXPLAIN EXTENDED SELECT * from Table WHERE Something='ABC'
Slow Query Log
Turn on the slow query log to see which queries are running slow.
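One way to switch it on at runtime, assuming you have the privileges to set global variables; the one-second threshold is just an example value.

```sql
-- Enable the slow query log without restarting the server.
SET GLOBAL slow_query_log = 'ON';

-- Log statements that take longer than 1 second (example threshold).
-- A global change applies to connections opened afterwards.
SET GLOBAL long_query_time = 1;

-- Find out where the log file is written.
SHOW VARIABLES LIKE 'slow_query_log_file';
```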
Wide Columns
If you have a wide column where MOST of the distinction happens in the first several characters, you can use only the first N characters in your index. Example: We have a ReferenceNumber column defined as varchar(255) but 97% of the cases, the reference number is 10 characters or less. I changed the index to only look at the first 10 characters and improved performance quite a bit.
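Before settling on a prefix length you can measure how much distinctness each candidate length keeps. A rough sketch, where `Orders` is a made-up table name and ReferenceNumber is the column from the example above:

```sql
-- `Orders` is a hypothetical table name.
-- Values closer to full_selectivity mean the prefix loses less.
SELECT
  COUNT(DISTINCT ReferenceNumber) / COUNT(*)           AS full_selectivity,
  COUNT(DISTINCT LEFT(ReferenceNumber, 5)) / COUNT(*)  AS prefix_5,
  COUNT(DISTINCT LEFT(ReferenceNumber, 10)) / COUNT(*) AS prefix_10
FROM Orders;

-- Index only the first 10 characters once that looks good enough.
CREATE INDEX idx_refnum_prefix ON Orders (ReferenceNumber(10));
```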
If a table has six columns and all of them are searchable, should i index all of them or none of them
Are you searching on a field by field basis or are some searches using multiple fields?
Which fields are most being searched on?
What are the field types? (Index works better on INTs than on VARCHARs for example)
Have you tried using EXPLAIN on the queries that are being run?
What are the negative performance impacts of indexing
UPDATEs and INSERTs will be slower. There are also the extra storage space requirements, but that's usually unimportant these days.
If I have a VARCHAR 2500 column which is searchable from parts of my site, should I index it
No, unless it's UNIQUE (which means it's already indexed) or you only search for exact matches on that field (not using LIKE or MySQL's fulltext search).
Generally I put an index on any fields that I will be searching or selecting using a WHERE clause
I'd normally index the fields that are the most queried, and then INTs/BOOLEANs/ENUMs rather than fields that are VARCHARs. Don't forget, often you need to create an index on combined fields, rather than an index on an individual field. Use EXPLAIN, and check the slow log.
Load Data Efficiently: Indexes speed up retrievals but slow down inserts and deletes, as well as updates of values in indexed columns. That is, indexes slow down most operations that involve writing. This occurs because writing a row requires writing not only the data row, it requires changes to any indexes as well. The more indexes a table has, the more changes need to be made, and the greater the average performance degradation. Most tables receive many reads and few writes, but for a table with a high percentage of writes, the cost of index updating might be significant.
Avoid Indexes: If you don’t need a particular index to help queries perform better, don’t create it.
Disk Space: An index takes up disk space, and multiple indexes take up correspondingly more space. This might cause you to reach a table size limit more quickly than if there are no indexes. Avoid indexes wherever possible.
Takeaway: Don't over index
In general, indices help speed up database searches, at the cost of extra disk space and slower INSERT / UPDATE / DELETE queries. Use EXPLAIN and read the results to find out when MySQL uses your indices.
If a table has six columns and all of them are searchable, should i index all of them or none of them?
Indexing all six columns isn't always the best practice.
(a) Are you going to use any of those columns when searching for specific information?
(b) What is the selectivity of those columns (how many distinct values are there stored, in comparison to the total amount of records on the table)?
MySQL uses a cost-based optimizer, which tries to find the "cheapest" path when performing a query. And fields with low selectivity aren't good candidates.
What are the negative performance impacts of indexing?
Already answered: extra disk space, lower performance during insert - update - delete.
If i have a VARCHAR 2500 column which is searchable from parts of my site, should i index it?
Try the FULLTEXT Index.
1/2) Indexes speed up certain select operations but they slow down other operations like insert, update and deletes. It can be a fine balance.
3) use a full text index or perhaps sphinx

MySQL Index is NULL but there are available Keys

I have the following problem when running a mysql query:
The query is very slow, and when I use EXPLAIN the key is NULL even though possible_keys are available and the order is correct. I also tried adding independent indexes on each column, but key was still NULL.
You can see table, index and mysql explain here: https://snag.gy/vcChl6.jpg
The optimizer likely has just decided that there is no reason to use the index.
Since you are using SELECT *, if it used the index it would then have to take the primary key from the index and go back to look up all the necessary data from the clustered index. That is referred to as a double lookup, and is generally bad for performance. As there are so few records in this table, the optimizer likely decided that it can easily do a full table scan instead and get your result faster.
In short, this is expected behavior.
If you want to SELECT just some columns, add them to the t1 index and then just SELECT only the columns you need, with that given WHERE clause. It should use the index then. As your table grows in size, it may start using the index as well, once it estimates that the double lookup is cheaper than the full table scan.
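Since the actual table definition is only in the screenshot, here is a purely hypothetical sketch of that idea. The table name `translations` and the `title` column (assumed to be a short VARCHAR, not TEXT) are invented; only id_project and id_lang come from the question.

```sql
-- Hypothetical table `translations` with a short VARCHAR column `title`.
-- The index holds every column the query needs, so no second lookup
-- into the clustered index is required.
ALTER TABLE translations
  ADD INDEX idx_project_lang_title (id_project, id_lang, title);

EXPLAIN
SELECT id_project, id_lang, title
FROM translations
WHERE id_project = 1
  AND id_lang = 2;

-- "Using index" in the Extra column indicates the query was answered
-- from the index alone.
```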
A guess: Most rows are of that 'project' and that 'lang'.
The Optimizer does not understand that fact, so it takes the index that is obviously the best:
(id_project, id_lang)
This one would be equally good: (id_lang, id_project).
No fair... The EXPLAIN mentions indexes named id_project and id_lang (not useful), but the list of indexes shows a composite index t1(id_project, id_lang) (useful).
Then, as Willem suggests, it has to bounce between the index and the table. Normally (that is, when it has adequate statistics), the Optimizer will say "Oh, more than ~20% of the table is being referenced; let's ignore any index."
Things you can do:
Get rid of that index.
Change * to a list of just the columns you need. In particular, if you avoid the 3 TEXT columns, two optimizations kick in. Alternatively, any that will never be longer than 255 characters can be changed to VARCHAR(255).
Use some other filtering, ordering, limiting, etc. If this is a web application, do you really want to get ~534 rows?

Understanding Indexes in MySQL

I am trying to understand indexes in MySQL. I know that an index created in a table can speed up executing queries and it can slow down the inserting and updating of rows.
When creating an index, I used this query on a table called authors that contains (AuthorNum, AuthorFName, AuthorLName, ...)
Create index Index_1 on Authors ([What to put here]);
I know I have to put a column name, but which one?
Do I have to put the name of the column that will be compared in the WHERE clause when a user queries the table, or what?
The Anatomy of an Index
An index is a distinct data structure within a database and a form of data redundancy. Its primary purpose is to provide an ordered representation of the indexed data through a logical ordering that is independent of the physical ordering. This is done using a doubly linked list and a tree structure known as the balanced search tree (B-tree). B-trees are useful because they keep data sorted and allow searches, access, insertions, and deletions in logarithmic time. Because of the doubly linked list, the engine can move backwards or forwards along the index as needed for various queries. Inserts stay simple since only the pointers between the different pieces of data have to be rearranged. Databases use these doubly linked lists to connect the leaf nodes (usually of a B+ tree or B-tree), each of which is stored in a page, and to establish the logical ordering between the leaf nodes. Operations like UPDATE or INSERT become slower because they are actually two write operations in the filesystem (one for the table data and one for the index data).
Defining an Optimal Index With WHERE
To define an optimal index you must not only understand how indexes work, but you must also understand how the application queries the data. E.g., you must know the column combinations that appear in the WHERE clause.
A common restriction with queries on LAST_NAME and FIRST_NAME columns deals with case sensitivity. For example, instead of doing an exact search like Hotinger we would prefer to match all results such as HoTingEr and so on. This is very easy to do in a WHERE clause: we just say WHERE UPPER(LAST_NAME) = UPPER('Hotinger')
However, if we define an index of LAST_NAME and query, it will actually run a full table scan because the query is not on LAST_NAME but on UPPER(LAST_NAME). From the database's perspective, this is completely different. So, in this case you should define the index on UPPER(LAST_NAME) instead.
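In MySQL specifically, an index on an expression like this is only possible from version 8.0.13 on, using a functional key part. A minimal sketch, assuming a hypothetical `employees` table with a last_name column:

```sql
-- Hypothetical table `employees`; requires MySQL 8.0.13 or later.
-- Note the extra pair of parentheses around the expression.
CREATE INDEX idx_upper_last_name ON employees ((UPPER(last_name)));

-- The expression in the query has to match the indexed expression.
-- Confirm with EXPLAIN that idx_upper_last_name is actually chosen.
EXPLAIN
SELECT *
FROM employees
WHERE UPPER(last_name) = UPPER('Hotinger');
```

Also worth knowing: MySQL's default collations (the ones ending in _ci) already compare strings case-insensitively, so on such a column a plain index on last_name and WHERE last_name = 'Hotinger' would match HoTingEr too.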
Indexes do not necessarily have to be for one column. For example, if the primary key is a composite key (consisting of multiple columns) it will create a concatenated index, also known as a combined index. Note that the column order of the concatenated index has a significant impact on its usability and scalability, so it must be chosen carefully. Basically, the leading columns of the index should be the ones your WHERE clauses filter on: an index on (A, B) can serve a condition on A alone, but not one on B alone. The order in which the conditions are written inside the WHERE clause does not matter.
Defining an Optimal Index With LIKE
The position of the wildcard characters makes a huge difference. LIKE clauses only use the characters before the wildcard during tree traversal; the rest do not narrow the scanned index range. The more selective the prefix of the LIKE clause, the narrower the scanned index range becomes. This makes the index lookup faster. As a tip, avoid LIKE clauses which lead with wildcards, like "%OTINGER%". For full-text searches, MySQL offers the MATCH and AGAINST keywords. Starting with MySQL 5.6, InnoDB tables can have full-text indexes as well (MyISAM supported them earlier). Look at Full-Text Search Functions from MySQL for more in-depth discussion on indexing these results.
Yes, generally you need an index on the column or columns that you compare in the WHERE clause of your queries to speed up queries.
If you search by AuthorFName, then you create an index on that column. If they search by AuthorLName, then you create an index on that column.
In this case though, maybe what you should be looking at is a FULLTEXT index. That would allow users to enter fuzzy queries, which would return a number of results ordered by relevance.
From the MySQL Manual:
Indexes are used to find rows with specific column values quickly.
Without an index, MySQL must begin with the first row and then read
through the entire table to find the relevant rows. The larger the
table, the more this costs. If the table has an index for the columns
in question, MySQL can quickly determine the position to seek to in
the middle of the data file without having to look at all the data. If
a table has 1,000 rows, this is at least 100 times faster than reading
sequentially. If you need to access most of the rows, it is faster to
read sequentially, because this minimizes disk seeks.
An index usually means a B-Tree. Understand the structure of the B-Tree and you'll understand what an index can and cannot do.
In your particular case:
WHERE AuthorLName = 'something' and WHERE AuthorLName LIKE 'something%' can be sped-up by an index on {AuthorLName}.
WHERE AuthorLName = 'something' AND AuthorFName = 'something else' can be sped-up by a composite index on {AuthorLName, AuthorFName} or {AuthorFName, AuthorLName}.
WHERE AuthorLName = 'something' OR AuthorFName = 'something else' (which doesn't make much sense, but is here as an example) can be sped-up by having two indexes: on {AuthorLName} and on {AuthorFName}.
WHERE AuthorLName LIKE '%something' cannot be sped-up by a B-Tree index (consider full-text indexing).
Etc...
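Put into DDL, the cases above might look like this. The index names are made up, and the optimizer is free to ignore any of these indexes if it estimates a scan to be cheaper.

```sql
-- Serves AuthorLName = ..., AuthorLName LIKE 'something%', and
-- AuthorLName = ... AND AuthorFName = ... (leftmost-prefix rule).
CREATE INDEX idx_lname_fname ON Authors (AuthorLName, AuthorFName);

-- Gives the OR query a second index to combine with the first one.
CREATE INDEX idx_fname ON Authors (AuthorFName);

-- With both indexes in place MySQL may combine them (type index_merge
-- in the EXPLAIN output) for the OR condition.
EXPLAIN
SELECT *
FROM Authors
WHERE AuthorLName = 'something' OR AuthorFName = 'something else';
```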
See Use The Index, Luke! for a much more thorough treatment of the subject than possible in a simple SO post.
Limited length index:
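When using text columns or very large varchar columns you won't be able to create an index over the entire length of the text/varchar; there are limits on how long an index key can be. For InnoDB the limit is 767 bytes with the older row formats and 3072 bytes with the DYNAMIC or COMPRESSED row formats, and multi-byte character sets reduce the number of characters that fit.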
In such a case you specify the length in the index declaration.
CREATE INDEX `my_limited_length_index` ON `my_table`(`long_text_content`(512));
-- please notice the use of the numeric length of the index after the column name
Processed value index (available in PostgreSQL; MySQL only added functional key parts in 8.0.13, where the expression needs an extra pair of parentheses):
Indexes are not exclusively built from one column, some may be built from multiple columns and other may be built from just some of the info a column has. For example if you have a full datetime column but you know you're only going to filter records by date you can build an index based on the datetime column but only containing date info.
-- `my_table` has a `created` column of type timestamp
CREATE INDEX `my_date_created` ON `my_table`(DATE(`created`));
-- please notice the use of the DATE function which extracts only
-- the date from the `created` timestamp
An index should span the columns you are going to use in the WHERE clause.
To better understand, here is an example:
SELECT * FROM Authors WHERE AuthorNum > 10 AND AuthorLName LIKE 'A%';
SELECT * FROM Authors WHERE AuthorLName LIKE 'Be%';
If you often run the queries shown above, you are highly advised to have two indexes:
Create index AuthNum_AuthLName_Index on Authors (AuthorNum, AuthorLName);
Create index AuthLName_Index on Authors (AuthorLName);
The key thing to remember: an index should have the same combination of columns used in your WHERE clauses.
