MySQL: unexpected behaviour 'in boolean mode'

I use the following call for getting information from a database:
select *
from submissions
where
match( description ) against ('+snowboard' in boolean mode )
and (!disabled or disabled='n')
order by datelisted desc limit 30
Which means everything with 'snowboard' in the description is found. Now here's the problem:
select *
from submissions
where
match( description ) against ('+snowboard +mp4' in boolean mode )
and (!disabled or disabled='n')
order by datelisted desc limit 30
will ignore the +mp4 for some reason and return the same as the first query
select *
from submissions
where
match( description ) against ('+mp4' in boolean mode )
doesn't return anything, so basically it appears it's ignored in the search
Does anybody know how to work around this behavior?

MySQL's boolean mode will only match words that reach a certain minimum length, and mp4 is too short. You'd have to recompile MySQL to change the threshold.
EDIT: It turns out this can now be set via the config file; see http://dev.mysql.com/doc/refman/5.0/en/fulltext-fine-tuning.html for further reference.

Your problem is the minimum word length, which by default is 4 characters, so the 3-character word mp4 is never indexed.
Try the same with +snowboard +scooter and you will see that it works: the second term is no longer ignored, so the result differs from the first query. (Supposing you don't have scooters in your database, of course.)
See Fine-Tuning MySQL Full-Text Search on how to change it:
The minimum and maximum lengths of words to be indexed are defined by the ft_min_word_len and ft_max_word_len system variables. (See Section 5.1.4, “Server System Variables”.) The default minimum value is four characters; the default maximum is version dependent. If you change either value, you must rebuild your FULLTEXT indexes. For example, if you want three-character words to be searchable, you can set the ft_min_word_len variable by putting the following lines in an option file:
[mysqld]
ft_min_word_len=3
Then restart the server and rebuild your FULLTEXT indexes. Note particularly the remarks regarding myisamchk in the instructions following this list.
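As a rough sketch of the check-and-rebuild step described above, assuming the submissions table from the question uses the MyISAM engine (the question does not say so, but ft_min_word_len only applies to MyISAM full-text indexes):

```sql
-- Show the minimum indexed word length currently in effect.
-- After editing the option file and restarting, this should report the new value.
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Rebuild the FULLTEXT indexes so shorter words such as "mp4" get indexed.
-- Assumption: submissions is a MyISAM table; QUICK rebuilds only the index file.
REPAIR TABLE submissions QUICK;
```

If the first statement still reports 4 after the restart, the option file being edited is probably not the one the server reads.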

Related

Cannot change InnoDB full text minimum word length

I have a MySQL 5.7.31 InnoDB table with full text index enabled...
if I search for a longer word, I get results:
SELECT * FROM my_table WHERE match(my_title) against('landscape in' IN BOOLEAN MODE)
if I search full text for short word (e.g in), I get no results
SELECT * FROM my_table WHERE match(my_title) against('in' IN BOOLEAN MODE)
the data is there, I can find it with like %% query:
SELECT * FROM my_table WHERE my_title LIKE '%in%'
I set these two in /etc/my.cnf (I understand one is for InnoDB and one for MyISAM) and restarted MySQL, but I still cannot run the short full-text query above.
ft_min_word_len=1
innodb_ft_min_token_size=1
Edit:
If I have a value like landscape in Paris, then I get data for against('+landscape +Paris' IN BOOLEAN MODE) but NOT for against('+landscape +in +Paris' IN BOOLEAN MODE)
Is "in" a reserved word, maybe?
"in" is probably in the stopword list. Change the stopword list setting (for InnoDB this is a stopword table rather than a file).
After changing the min-len or the stoplist, you must rebuild the Fulltext index(es). (Restarting MySQL alone is not sufficient.)
An alternative I used in one situation: I added + only to the long words. For example, against('+landscape in +Paris' IN BOOLEAN MODE) would probably achieve your goal without changing either the min-len or the stopword list.
(Yes, there are several 'differences' between MyISAM and InnoDB. I have not found a definitive list in the docs. Here's my attempt at such a list: http://mysql.rjweb.org/doc.php/myisam2innodb#fulltext )
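The InnoDB side of this can be sketched as follows. The names my_db, my_stopwords and ft_my_title are hypothetical placeholders (the question gives neither the schema name nor the index name), so treat this as an outline rather than a drop-in fix:

```sql
-- Minimum token length for InnoDB full-text indexes (the default is 3).
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Check whether "in" is on the built-in InnoDB stopword list.
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD WHERE value = 'in';

-- Hypothetical replacement stopword table that does not contain "in".
-- InnoDB expects a single VARCHAR column named `value`.
CREATE TABLE my_db.my_stopwords (value VARCHAR(30)) ENGINE = InnoDB;
INSERT INTO my_db.my_stopwords (value) VALUES ('the');
SET GLOBAL innodb_ft_server_stopword_table = 'my_db/my_stopwords';

-- Rebuild the index so the new settings take effect.
-- ft_my_title is an assumed index name; look up the real one with SHOW INDEX FROM my_table.
ALTER TABLE my_table DROP INDEX ft_my_title;
ALTER TABLE my_table ADD FULLTEXT INDEX ft_my_title (my_title);
```

Only the min-len change needs a server restart, since innodb_ft_min_token_size is read at startup. The stopword table setting can be changed at runtime, but neither change affects searches until the index is rebuilt.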

*Actual* exact MySQL fulltext search

So I'm having some difficulty creating exact searches in MySQL fulltext.
In my database, I'm trying to find jobs with a specific keyword in their title.
So I might try
WHERE MATCH(jobTitle) AGAINST ('"fs sales"' IN BOOLEAN MODE)
However, this finds matches on "sales", not "fs sales"
How can I ensure that "fs sales" matches EXACTLY on "fs sales" and not "sales"?
Table is InnoDB for reference.
"fs" is probably excluded from the search as too short.
Check the value of innodb_ft_min_token_size and manual: https://dev.mysql.com/doc/refman/5.6/en/fulltext-fine-tuning.html
You have to rebuild the index after changing that variable.
Your query should work. My guess, though is that you did not change the minimum word length, so "fs" was never indexed. See here for information on this.
Other possibilities are that there are other characters in the text, perhaps characters you do not see.
You might try this
select t.*
from (select . . .
WHERE MATCH(jobTitle) AGAINST ('+fs +sales' IN BOOLEAN MODE)
) t
where jobTitle like '%fs sales%';
This only does the like on the returned set from the match.
However, my best guess is that innodb_ft_min_token_size is set to its default value of 3, so "fs" is not being indexed.
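A variant of the same idea without the subquery might look like the sketch below. The table name jobs is an assumption (the question does not name the table), and only the indexable word sales is passed to the full-text search, on the assumption that "fs" is below the minimum token size:

```sql
-- Use the full-text index to narrow down candidates on the indexable word,
-- then enforce the exact phrase with LIKE on that reduced set.
-- Assumption: the table is called jobs and has a FULLTEXT index on jobTitle.
SELECT *
FROM jobs
WHERE MATCH(jobTitle) AGAINST ('+sales' IN BOOLEAN MODE)
  AND jobTitle LIKE '%fs sales%';
```

This avoids touching server settings, at the cost of a LIKE scan over every row that matches "sales".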
You can do it like this:
select col1, col2 from table_name where text_column like '%fs sales%'
This will return all the records that contain fs sales.

Why is Mysql match boolean mode not finding "Knows"

I have these two queries
SELECT * FROM `foo` WHERE MATCH(`title`) AGAINST('knows' in boolean mode )
SELECT * FROM `foo` WHERE MATCH(`title`) AGAINST('woman' in boolean mode )
and in the table I have a row with title = "a woman knows"
the second query finds that row, but the first doesn't!
I have experimented with different alternatives - for example, if the title contains "a woman knots" then querying for a match against "knots" works
I am mystified, so any help you can provide would be welcome.
"knows" is a stopword and will not be indexed (and therefore ignored in all searches).
You can load your own list of stopwords with the ft_stopword_file server option.
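To see the stopword effect in isolation, a small reproduction along these lines may help. It assumes a MyISAM table with the default stopword list, and ft_demo is a throwaway name made up for this sketch:

```sql
-- Throwaway MyISAM table with a FULLTEXT index on title.
CREATE TABLE ft_demo (
  title VARCHAR(100),
  FULLTEXT (title)
) ENGINE = MyISAM;

INSERT INTO ft_demo (title) VALUES ('a woman knows'), ('a woman knots');

-- "knows" is on the default MyISAM stopword list, so it is not indexed.
SELECT title FROM ft_demo WHERE MATCH(title) AGAINST ('knows' IN BOOLEAN MODE);

-- "knots" is an ordinary five-letter word and should be indexed normally.
SELECT title FROM ft_demo WHERE MATCH(title) AGAINST ('knots' IN BOOLEAN MODE);

-- Shows which stopword file, if any, the server is using.
SHOW VARIABLES LIKE 'ft_stopword_file';

DROP TABLE ft_demo;
```

If the two SELECTs behave differently on your server, the stopword list is the likely cause rather than anything in your own data.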

MySQL full text index underscore

I have a problem with MySQL's full text index, it treats underscore as part of a word (why? dunno).
This is the string I have in my table, VA_-_Some_Album
And this is the query for it:
SELECT
*
FROM
`mytable`
WHERE
MATCH (`name`) AGAINST ('+Some* +Album*' IN BOOLEAN MODE)
ORDER BY `sdate` DESC
LIMIT 3
MySQL returns an empty set for this query, unless I change it to +*Some* since the underscore is part of the word (_Some instead of Some). This is not good for me, since when adding the extra asterisk (*) the plus sign stops functioning and I don't get the "AND" done.
I tried to change the charset definition, and rebuild the full-text index but nothing.
Any ideas? changing the way the string is stored is not up to me.
Thank you!
I'm not entirely clear on your question, but take a look here:
http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html
There they explain the operators available in boolean mode (+, -, *, <, >, etc.).

MySQL fulltext search isn't matching as expected

I have a pretty simple query that doesn't seem to be giving me the results I'd like. I'm trying to allow the user to search for a restaurant by its name, address, or city using a fulltext search.
here is my query:
SELECT ESTAB_NAME, ESTAB_ADDRESS, ESTAB_CITY
FROM restaurant_restaurants rr
WHERE
MATCH (rr.ESTAB_NAME, rr.ESTAB_ADDRESS, rr.ESTAB_CITY)
AGAINST ('*new* *hua*' IN BOOLEAN MODE)
LIMIT 0, 500
New Hua is a restaurant that exists within the table, but this search does not find it. However, when I do a search for 'ting ho' I get the results I would expect.
Does anyone have any idea what is going on?
I'm using a MyISAM storage engine on MySQL version 5.0.41
Most likely, the full-text index settings have set a minimum word length of 4 - I believe this is the default. You'll need to change these settings, even for BOOLEAN MODE (as per http://dev.mysql.com/doc/refman/5.1/en/fulltext-boolean.html). Take a look at http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html for the settings to change.
I think Michael is right, but you probably also want to remove the * characters unless they are actually in the text you're searching for. MATCH AGAINST doesn't require a "match all" type of parameter.
My guess: "new" is a MySQL default stop word. See Michael Madsen's second link to see how to change the stop word list and regain the restaurant.
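Putting the three answers together, a cleaned-up version of the query might look like the sketch below. It only has a chance of matching once ft_min_word_len has been lowered to 3, "new" is no longer treated as a stopword, and the FULLTEXT index has been rebuilt. The leading + signs are an addition here, on the assumption that both words should be required:

```sql
-- Same search without the surrounding asterisks.
-- Assumptions: ft_min_word_len = 3, a stopword list without "new",
-- and a rebuilt FULLTEXT index on the three columns.
SELECT ESTAB_NAME, ESTAB_ADDRESS, ESTAB_CITY
FROM restaurant_restaurants rr
WHERE MATCH (rr.ESTAB_NAME, rr.ESTAB_ADDRESS, rr.ESTAB_CITY)
      AGAINST ('+new +hua' IN BOOLEAN MODE)
LIMIT 0, 500;
```

Dropping the + signs gives the original "either word" behaviour, with rows matching both words ranked higher.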