Cannot change InnoDB full text minimum word length - mysql

I have a MySQL 5.7.31 InnoDB table with full text index enabled...
If I search for a longer word, I get results:
SELECT * FROM my_table WHERE match(my_title) against('landscape in' IN BOOLEAN MODE)
If I run a full-text search for a short word (e.g. in), I get no results:
SELECT * FROM my_table WHERE match(my_title) against('in' IN BOOLEAN MODE)
The data is there; I can find it with a LIKE query:
SELECT * FROM my_table WHERE my_title LIKE '%in%'
I set these two options in /etc/my.cnf (I understand one is for InnoDB and one for MyISAM) and restarted MySQL, but I still cannot run the short full-text query above.
ft_min_word_len=1
innodb_ft_min_token_size=1
Edit:
If I have a value like landscape in Paris, then I get results for against('+landscape +Paris' IN BOOLEAN MODE) but NOT for against('+landscape +in +Paris' IN BOOLEAN MODE).
Is "in" a reserved word, maybe?

"in" is probably in the "stop list". Change the specification of the stoplist file.
After changing the minimum token size or the stopword list, you must rebuild the FULLTEXT index(es). (A restart is not needed for a stopword-table change, which is a dynamic setting; innodb_ft_min_token_size, however, is read only at server startup.)
An alternative I used in one situation: I added + only to the long words. For example, against('+landscape in +Paris' IN BOOLEAN MODE) would probably achieve your goal without changing either the minimum length or the stopword list.
(Yes, there are several 'differences' between MyISAM and InnoDB. I have not found a definitive list in the docs. Here's my attempt at such a list: http://mysql.rjweb.org/doc.php/myisam2innodb#fulltext )
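As a rough sketch of the steps above, using the table and column from the question (my_table, my_title): the database name my_db, the stopword table my_stopwords, its sample contents, and the index name ft_my_title are hypothetical placeholders, so substitute your own.

```sql
-- Check the values the server is actually running with.
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';
SHOW VARIABLES LIKE 'innodb_ft_server_stopword_table';

-- List the built-in InnoDB stopwords ("in" is one of them).
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;

-- A custom stopword table must be InnoDB with a single VARCHAR column named value.
CREATE TABLE my_stopwords (value VARCHAR(30)) ENGINE = InnoDB;
-- Sample contents only (an assumption): list the words you DO want ignored.
INSERT INTO my_stopwords (value) VALUES ('the'), ('and');

-- Point the server at it, in db_name/table_name form (hypothetical names).
SET GLOBAL innodb_ft_server_stopword_table = 'my_db/my_stopwords';

-- Rebuild the full-text index so the new settings take effect.
ALTER TABLE my_table DROP INDEX ft_my_title;
ALTER TABLE my_table ADD FULLTEXT INDEX ft_my_title (my_title);
```

The stopword list is bound to an index when the index is created, which is why the rebuild is needed even though the variable itself can be changed at runtime.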

Related

MyIsam fulltext search against multiple %term%

I have a field called filepath that I'm trying to search. Here is an example path:
/mnt/qfs-X/Asset_Management/XG_Marketing_/Episodic-SG_1001_1233.jpg
I would like to be able to search the following and get a match:
search = "qf episodic sg_1001 JPG"
How would I do this with a fulltext search in mysql/myisam? What I have now is:
SELECT * FROM x_files2 WHERE MATCH(path)
AGAINST('qf episodic sg_1001 JPG' in boolean mode)
But it is returning way too many results (it seems to return rows where any of the terms is found, instead of only rows where all are found).
Put + in front of each 'word':
AGAINST('+qf* +episodic +sg_1001* +JPG' in boolean mode)
Do you have the min-word-length set to 2? If not, there could be other troubles.
The + avoids "too many".
Consider switching to InnoDB, now that it has FULLTEXT.
You may have to abandon use of FULLTEXT and switch to LIKE:
WHERE path LIKE '%qf%episodic%sg_1001%JPG%'
If performance is an issue, consider something like
WHERE MATCH(path) AGAINST('...' IN BOOLEAN MODE) -- using some of the words
AND path LIKE '...' -- as above
The MATCH will run first, whittling down the number of possible rows considerably, then the LIKE takes care of details.
Note that middles of words cannot be used in AGAINST. Those could be left out, relying on LIKE to take care of them.
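Put together for the example path in the question, the combined approach might look like the sketch below. The choice of which terms go into AGAINST is an assumption for illustration, and it only works if ft_min_word_len allows words as short as those used (jpg has three characters).

```sql
SELECT *
FROM x_files2
WHERE
  -- Index-backed filter: every term is required; * allows a word prefix.
  MATCH(path) AGAINST('+qf* +episodic +sg_1001* +jpg' IN BOOLEAN MODE)
  -- Fine filter on the surviving rows: terms must appear in this order.
  -- \_ matches a literal underscore (a bare _ matches any single character).
  AND path LIKE '%qf%episodic%sg\_1001%jpg%';
```

A term that is only the middle of a word would be left out of the AGAINST string and kept only in the LIKE pattern.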

match against not making sense

This is my filter text:
Oliver used book
If I search for 'Oliver' it works, if I search for 'book' it works but if I search for 'used' it does not work.
Heater white fan HEOP1322
Heater -> works : white -> works : fan -> does not work : HEOP -> does not work : HEOP1322 -> works.
My query is like this:
SELECT * FROM table WHERE MATCH(filter) AGAINST ('fan' IN BOOLEAN MODE)
SELECT * FROM table WHERE MATCH(filter) AGAINST ('HEOP' IN BOOLEAN MODE)
SELECT * FROM table WHERE MATCH(filter) AGAINST ('used' IN BOOLEAN MODE)
Why d'hell does the word used not work and the word book works? They have the same length.
I also tried the suggestions in "Mysql search for string and number using MATCH() AGAINST()", without success.
Edit: Solved by following the instructions in this question:
XAMPP MySQL - Setting ft_min_word_len
"used" is one of the default MySQL full text stopwords: https://dev.mysql.com/doc/refman/5.1/en/fulltext-stopwords.html. Stopwords are words which are ignored because they are too frequent in the (English) language and would not positively contribute to the result of a full text search. If you're only querying for single words, a LIKE %..% query may be more suited than a full-blown full text search.

Prepending an * (asterisk) to a Fulltext Search in MySQL

I understand that the asterisk is a wildcard that can be appended to the end of fulltext search words, but what if my searched keyword is a suffix? For example, I want to be able to search for "ames" and have a result that contains the name "james" returned. Here is my current query which does not work because you cannot prepend asterisks to fulltext searches.
SELECT * FROM table WHERE MATCH(name, about, address) AGAINST ("*$key*" IN BOOLEAN MODE)
I would simply switch to using LIKE, but it would be way too slow for the size of my database.
What you could do is add another column with its own full-text index, holding the reversed string of the column you are searching. You then reverse the search key as well and search the reversed column with it. Here is what the query would look like:
SELECT * FROM table WHERE MATCH(column1) AGAINST ("$key*" IN BOOLEAN MODE) OR MATCH(reversedColumn1) AGAINST ("$reversedkey*" IN BOOLEAN MODE)
The first condition:
MATCH(column1) AGAINST ("$key*" IN BOOLEAN MODE)
Example:
column1==>James $key*==>ames*
This searches for words that start with ames ==> no match.
The second condition:
MATCH(reversedColumn1) AGAINST ("$reversedkey*" IN BOOLEAN MODE)
Example:
reversedColumn1==>semaJ $reversedkey*==>sema*
This searches for words that start with sema in the reversed column, i.e. words that end with ames in the original ==> we have a match.
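A minimal sketch of how the reversed column could be set up, assuming a hypothetical table people with a name column; the column type, index names, and table name are all placeholders, not taken from the question.

```sql
-- Add the mirror column and fill it with the reversed text.
ALTER TABLE people ADD COLUMN name_reversed VARCHAR(255);
UPDATE people SET name_reversed = REVERSE(name);

-- Each MATCH() below needs a full-text index on exactly its column list.
ALTER TABLE people ADD FULLTEXT INDEX ft_name (name);
ALTER TABLE people ADD FULLTEXT INDEX ft_name_reversed (name_reversed);

-- Search key "ames": the application passes it as-is for the first
-- condition and reversed ("sema") for the second.
SELECT *
FROM people
WHERE MATCH(name) AGAINST('ames*' IN BOOLEAN MODE)
   OR MATCH(name_reversed) AGAINST('sema*' IN BOOLEAN MODE);
```

The mirror column has to be kept in sync on every insert and update (in the application or with triggers), and this still only covers prefixes and suffixes of words, not arbitrary middles.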
This might not be a bad idea if your text is short.
Can't be done due to limitation of MySQL. Values are indexed left-to-right, not right-to-left. You'll have to stick with LIKE if you want wildcards prepended to search string.

Why is Mysql match boolean mode not finding "Knows"

I have these two queries
SELECT * FROM `foo` WHERE MATCH(`title`) AGAINST('knows' in boolean mode )
SELECT * FROM `foo` WHERE MATCH(`title`) AGAINST('woman' in boolean mode )
and in the table I have a row with title = "a woman knows"
the second query finds that row, but the first doesn't!
I have experimented with different alternatives - for example, if the title contains "a woman knots" then querying for a match against "knots" works
I am mystified, so any help you can provide would be welcome.
"knows" is a stopword and will not be indexed (and therefore ignored in all searches).
You can load your own list of stopwords with the ft_stopword_file server option.

MySQL: unexpected behaviour 'in boolean mode'

I use the following call for getting information from a database:
select *
from submissions
where
match( description ) against ('+snowboard' in boolean mode )
and (!disabled or disabled='n')
order by datelisted desc limit 30
This means everything with 'snowboard' in the description is found. Now here's the problem:
select *
from submissions
where
match( description ) against ('+snowboard +mp4' in boolean mode )
and (!disabled or disabled='n')
order by datelisted desc limit 30
will ignore the +mp4 for some reason and return the same as the first query
select *
from submissions
where
match( description ) against ('+mp4' in boolean mode )
doesn't return anything, so basically it appears it's ignored in the search
Does anybody know how to work around this behavior?
MySQL's boolean mode will only match words that are at least a certain length, and mp4 is too short. You'd have to recompile MySQL to change the threshold.
EDIT: It turns out this can now be set via the config file; see http://dev.mysql.com/doc/refman/5.0/en/fulltext-fine-tuning.html for further reference.
Your problem is the minimum word length, which by default is 4 characters; mp4 has only 3.
Try the same with +snowboard +scooter and you will see that it works. (Supposing you don't have scooters in your database, of course).
See Fine-Tuning MySQL Full-Text Search on how to change it:
The minimum and maximum lengths of words to be indexed are defined by the ft_min_word_len and ft_max_word_len system variables. (See Section 5.1.4, “Server System Variables”.) The default minimum value is four characters; the default maximum is version dependent. If you change either value, you must rebuild your FULLTEXT indexes. For example, if you want three-character words to be searchable, you can set the ft_min_word_len variable by putting the following lines in an option file:
[mysqld]
ft_min_word_len=3
Then restart the server and rebuild your FULLTEXT indexes. Note particularly the remarks regarding myisamchk in the instructions following this list.
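After the restart, the verification and rebuild could look roughly like this. It assumes submissions is a MyISAM table, which is what ft_min_word_len applies to; for an InnoDB table the index would instead be dropped and re-created.

```sql
-- Confirm the running server picked up the new value from the option file.
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Rebuild the MyISAM full-text indexes so three-letter words get indexed.
REPAIR TABLE submissions QUICK;

-- The short term should now take part in the match.
SELECT *
FROM submissions
WHERE MATCH(description) AGAINST('+snowboard +mp4' IN BOOLEAN MODE)
ORDER BY datelisted DESC
LIMIT 30;
```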