MySQL full text search by relevancy with wildcard - mysql

I need to search products and sort them by relevance. For that I tried this MySQL query:
SELECT *, MATCH(`SubProductName`) AGAINST ('+app*' IN BOOLEAN MODE) AS
relevance FROM `tblsubproducts1` WHERE MATCH(SubProductName) AGAINST
('+app*' IN BOOLEAN MODE) ORDER BY relevance DESC
That query returns, for example, Apple Thunderbolt, Apple TV, ... as results, which is right.
But when I try '+usb*', it doesn't return any rows, even though the database contains a row with SubProductName USB-C Charge Cable, which I can find by matching against '+cable*'.
To clarify, I want the search to work with partial words like 'app' for apple, which is why I added *. What confuses me is why it doesn't always seem to work. Is it the - in USB-C or ...?

If you are using MyISAM, then the default minimum word length for full-text indexing is 4 (the ft_min_word_len setting). (This is documented here.)
In other words, "usb" is not even in the index. You need to change this parameter and rebuild the index.
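To make the effect of the minimum word length concrete, here is a rough Python model of it. It is only a sketch: the function names are hypothetical, the tokenizer is a simplification (split on anything that is not a letter, digit or underscore) and not MySQL's actual parser, and min_len=4 stands in for the MyISAM default mentioned above.

```python
import re

def indexed_tokens(text, min_len=4):
    """Simplified model: split on non-word characters, lowercase,
    and drop tokens shorter than the minimum word length."""
    tokens = re.findall(r"[a-z0-9_]+", text.lower())
    return [t for t in tokens if len(t) >= min_len]

def prefix_matches(prefix, text, min_len=4):
    """Model of '+prefix*': true if some indexed token starts with the prefix."""
    return any(t.startswith(prefix.lower()) for t in indexed_tokens(text, min_len))

print(indexed_tokens("USB-C Charge Cable"))                    # ['charge', 'cable']
print(prefix_matches("usb", "USB-C Charge Cable"))             # False
print(prefix_matches("app", "Apple Thunderbolt"))              # True
print(prefix_matches("usb", "USB-C Charge Cable", min_len=3))  # True
```

In this model 'app*' works because "apple" is long enough to be indexed, while 'usb*' has nothing to match until the minimum length is lowered to 3.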

Related

match against doesn't work with the word "when"

When desc contains the string: zoom when wifi dies for 1 second
Query 1:
SELECT * FROM `pics` WHERE MATCH(title, desc, owntags, usertags) AGAINST('+zoom* +wifi*' IN BOOLEAN MODE)
No problem, I get the row!
Query 2:
SELECT * FROM `pics` WHERE MATCH(title, desc, owntags, usertags) AGAINST('+zoom* +when*' IN BOOLEAN MODE)
No results! So it seems "when" is treated as an SQL command.
How can I solve this?
You need to learn some basics about full-text search. One very important concept is stop words. These are words that are not included in the full-text index because they are so common or add little meaning (at least from the perspective of the person who created the stop word list; a famous problem involves the band The Who).
The word 'when' is a common stop word and a default stop word in MySQL (see here and here). So, it is not being indexed.
You will need to recreate your full text indexes, either removing all stop words or using your own custom list.
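As an illustration of why the second query finds nothing, here is a small Python sketch of an index that skips stop words. It is a simplified model, not MySQL's implementation: the names are hypothetical, the stop word set contains only "when" (the real default list is much longer), and a minimum word length of 4 is assumed.

```python
import re

# Hypothetical one-word stop list; MySQL's default list is much longer.
STOPWORDS = {"when"}

def indexed_tokens(text, min_len=4, stopwords=STOPWORDS):
    """Simplified model: tokens that are too short or are stop words
    never reach the index."""
    tokens = re.findall(r"[a-z0-9_]+", text.lower())
    return [t for t in tokens if len(t) >= min_len and t not in stopwords]

def all_prefixes_match(prefixes, text):
    """Model of '+a* +b*': every prefix must match some indexed token."""
    tokens = indexed_tokens(text)
    return all(any(t.startswith(p) for t in tokens) for p in prefixes)

desc = "zoom when wifi dies for 1 second"
print(indexed_tokens(desc))                        # ['zoom', 'wifi', 'dies', 'second']
print(all_prefixes_match(["zoom", "wifi"], desc))  # True
print(all_prefixes_match(["zoom", "when"], desc))  # False
```

Passing an empty set as stopwords in this model corresponds to rebuilding the index without a stop word list.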

How to improve mysql NATURAL LANGUAGE MODE search query?

This is my query
SELECT * FROM myTable WHERE MATCH (name) AGAINST ("Apple M1" IN NATURAL LANGUAGE MODE)
If I search for Apple M1, I get Orange M1 first, and only in third position or lower do I get Apple M-1, which is the value I stored and which I assumed would be first!
My question is: is there a way to fine-tune the MySQL search?
The best way to improve a MySQL Natural Language Mode search is to use Boolean Full-Text Searches instead. It will do the same as a Natural Language Mode search, but you can use additional modifiers to fine-tune your results, e.g. with
> <
These two operators are used to change a word's contribution to the relevance value that is assigned to a row. The > operator increases the contribution and the < operator decreases it.
There is one minor difference: a Boolean mode search does not automatically order by relevance, so you have to order the results yourself.
SELECT * FROM myTable
WHERE MATCH (name) AGAINST (">Apple M1" IN BOOLEAN MODE)
ORDER BY MATCH (name) AGAINST (">Apple M1" IN BOOLEAN MODE) desc
And a remark: neither version of full-text search will find M-1 if you match against M1 (even with a minimum word length setting of 2). They only look for exact (usually case-insensitive) word matches, not for similar words (unless you use *). They "just" weigh the combination of (exact) words by some algorithm and, if you use them, the modifiers.
Update: some additional clarification in response to the comments:
If you match against Apple M1, it returns rows that contain (case-insensitive) Apple or M1 in any order, e.g. M1 apple, Apple M4, Apple M-1 and Orange M1. It will not find Apples M4 or Orange M-1, because they do not contain exactly those words. (Likewise, like '%M-1%' wouldn't find Apple M1 either.) If you like, you can match against Apple* to find Apple and Apples, but the wildcard is only allowed at the end of the word; *Apple* is not possible, so you would have to use like '%Apple%' for that.
These rows are then ordered by the scoring algorithm, which basically scores words that are less common in your texts higher than very common words. If you add >Apple, it gives Apple a higher value. The score is just a number; you can add it to your select, e.g. select ..., MATCH (name) AGAINST (">Apple M1" IN BOOLEAN MODE) as score, to get a feeling for it.
There are some other things to consider:
Only words that have a minimum length are added to the index. That length is given by innodb_ft_min_token_size for InnoDB or ft_min_word_len for MyISAM. So you should set it to e.g. 2 to include M1 (otherwise this word will not have any effect in your search; since you found Orange M1 in your example, I assume it is set correctly).
The - character is usually treated as a word separator. So M-1 in your text will be split into two words, M and 1 (which may or may not be included depending on your minimum word length setting, so maybe set it to 1). You can change that behaviour by adding - to the character set (see Fine-Tuning MySQL Full-Text Search, the part beginning with Modify a character set file), but then a search for blue and/or green will no longer find blue-green.
The full-text search uses stopwords. These words are not included in your index. The list includes a and i, so even with a minimum word length of 1 you would not find them. You can edit that list.
Some ideas about your potential M1/M-1 problem. To adjust these to your exact requirements, you would have to add more information about your searches and data (and that might be another question), but here are some ideas:
You can replace user input that contains - by including both versions in your search query: once with -, but enclosed in "", and once without. So if the user enters Apple M-1, you would create a search for Apple M1 "M-1" (that would work with or without a modified character set, but without a new character set, your minimum word length has to be 1). If the user enters M1, you should detect that and replace it with M1 "M-1" too.
Another alternative would be to save an additional column with clean, hyphenless words and add that column to the full text index and then match (name, clean_name) against ("M1" ....
And you can of course combine like and match, e.g. if you detect a product number in your input, you can use something like where match(...) against(...) or product_id like 'M%1%', or where match(...) against(...) or product_id = 'M-1' or product_id = 'M1' or even where match(...) against(...) or name like '%M%1%', but the latter would probably be a lot slower and contain a lot of noise. And it might not score correctly, but at least it will be in the resultset.
But as I said, that would depend on your data and your requirements.
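The first idea above (searching for both the hyphenated and the hyphenless form) could be sketched in Python like this. Everything here is an assumption for illustration: the function name is hypothetical, and the rule "letters followed by digits is a model number" is a guess you would have to adapt to your own data. The sketch also does not strip other Boolean-mode operators from the user input.

```python
import re

# Assumed heuristics, adapt to your own product names:
HYPHENATED = re.compile(r"^[A-Za-z0-9]+(?:-[A-Za-z0-9]+)+$")  # e.g. M-1
MODEL = re.compile(r"^([A-Za-z]+)(\d+)$")                      # e.g. M1

def expand_hyphenated(user_input):
    """Build a Boolean-mode search string containing both the hyphenless
    word and the quoted hyphenated phrase for each model-like term."""
    terms = []
    for word in user_input.split():
        model = MODEL.match(word)
        if HYPHENATED.match(word):
            terms.append(word.replace("-", ""))   # M-1 -> M1
            terms.append('"%s"' % word)           # "M-1" as a phrase
        elif model:
            terms.append(word)                    # M1
            terms.append('"%s-%s"' % model.groups())  # "M-1"
        else:
            terms.append(word)
    return " ".join(terms)

print(expand_hyphenated("Apple M-1"))  # Apple M1 "M-1"
print(expand_hyphenated("Apple M1"))   # Apple M1 "M-1"
```

The resulting string would then be bound as the parameter of AGAINST (... IN BOOLEAN MODE), rather than concatenated into the SQL text.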

match against not making sense

This is my filter text:
Oliver used book
If I search for 'Oliver' it works, if I search for 'book' it works but if I search for 'used' it does not work.
Heater white fan HEOP1322
Heater -> works : white -> works : fan -> does not work : HEOP -> does not work : HEOP1322 -> works.
My query is like this:
SELECT * FROM table WHERE MATCH(filter) AGAINST ('fan' IN BOOLEAN MODE)
SELECT * FROM table WHERE MATCH(filter) AGAINST ('HEOP' IN BOOLEAN MODE)
SELECT * FROM table WHERE MATCH(filter) AGAINST ('used' IN BOOLEAN MODE)
Why d'hell does the word used not work and the word book works? They have the same length.
I also tried the suggestions in "Mysql search for string and number using MATCH() AGAINST()" without success.
Edit: Solved, follow these instructions:
XAMPP MySQL - Setting ft_min_word_len
"used" is one of the default MySQL full text stopwords: https://dev.mysql.com/doc/refman/5.1/en/fulltext-stopwords.html. Stopwords are words which are ignored because they are too frequent in the (English) language and would not positively contribute to the result of a full text search. If you're only querying for single words, a LIKE %..% query may be more suited than a full-blown full text search.

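The three failing searches in this question fail for three different reasons, which the following Python sketch tries to tell apart. It is a simplified model with hypothetical names: MIN_LEN = 4 assumes the MyISAM default (the question's edit points at ft_min_word_len), and the stop word set contains only "used", not the full default list.

```python
import re

MIN_LEN = 4           # assumed MyISAM default minimum word length
STOPWORDS = {"used"}  # illustrative subset of the default stopword list

def explain(term, text):
    """Say why a single-word search (without *) does or does not
    match, in a simplified model of the full-text index."""
    term = term.lower()
    words = re.findall(r"[a-z0-9_]+", text.lower())
    if term not in words:
        return "no whole-word match"
    if len(term) < MIN_LEN:
        return "too short to be indexed"
    if term in STOPWORDS:
        return "stopword"
    return "match"

row = "Heater white fan HEOP1322"
print(explain("fan", row))                  # too short to be indexed
print(explain("HEOP", row))                 # no whole-word match
print(explain("HEOP1322", row))             # match
print(explain("used", "Oliver used book"))  # stopword
print(explain("book", "Oliver used book"))  # match
```

In this model HEOP only fails because it is not a complete word in the row; a search for HEOP* would be a prefix search instead.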
Full text search order by closest match

SELECT user_id, user_name.fullname, live, likes,
MATCH (fullname, email, live) AGAINST (:search_I IN BOOLEAN MODE) AS relevance
FROM profile LEFT JOIN user_name ON user_id=user_id
WHERE MATCH (fullname, email, live) AGAINST (:search_II IN BOOLEAN MODE)
ORDER BY relevance DESC
bindValue(':search_I', $search...);
bindValue(':search_II', $search...);//PDO can't use same one twice
I have a query that uses full-text search, and I need the closest match to be ordered on top.
However, this query is not working: it doesn't order anything.
As a test, I searched for 123#hotmail.com.
There are 2 rows in my db, abc#hotmail.com & 123#hotmail.com.
It returns both rows but doesn't put the closest match (123#hotmail.com) on top.
Does anyone know where the problem is?
By default, MySQL full-text search on MyISAM has a minimum word length of 4, so words of three characters or fewer are not indexed (see here).
So your example '123#hotmail.com' matches only on 'hotmail', and the two rows are equivalent.
You can change the default (and rebuild the index). But, I'd suggest that you do testing with 'abcd#hotmail.com' instead.
EDIT:
The definition of a word is buried a bit in the documentation:
The MySQL FULLTEXT implementation regards any sequence of true word
characters (letters, digits, and underscores) as a word. That sequence
may also contain apostrophes (“'”), but not more than one in a row.
This means that aaa'bbb is regarded as one word, but aaa''bbb is
regarded as two words. Apostrophes at the beginning or the end of a
word are stripped by the FULLTEXT parser; 'aaa'bbb' would be parsed as
aaa'bbb.
Because of the where clause, you can see that there is a match to both email addresses. That match would have to be on 'hotmail'. The 'com' and email name get chopped off because of the default minimum word length.
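A short Python sketch of that reasoning, under stated assumptions: the function name is hypothetical, the word definition is reduced to letters, digits and underscores (apostrophes are ignored here), and the minimum word length is taken to be 4.

```python
import re

def searchable_tokens(text, min_len=4):
    """Simplified model: words are runs of letters, digits and underscores;
    anything shorter than min_len is not indexed."""
    return {t for t in re.findall(r"[a-z0-9_]+", text.lower()) if len(t) >= min_len}

def shared_terms(query, row):
    """Terms of the query that can actually contribute to the row's relevance."""
    return sorted(searchable_tokens(query) & searchable_tokens(row))

rows = ["abc#hotmail.com", "123#hotmail.com"]
for row in rows:
    print(row, shared_terms("123#hotmail.com", row))
# abc#hotmail.com ['hotmail']
# 123#hotmail.com ['hotmail']

print(shared_terms("abcd#hotmail.com", "abcd#hotmail.com"))  # ['abcd', 'hotmail']
print(shared_terms("abcd#hotmail.com", "abc#hotmail.com"))   # ['hotmail']
```

Both rows share exactly the same single term with the query, so the model gives them no reason to rank differently; with a four-character name the closer row gains a second matching term.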

Issue with Singular Words and MySQL Fulltext Searching

I've set up a full-text search on the title and description columns of my blog articles table in MySQL. The SQL that I use to search the table is as follows:
SELECT title,description,publish_date FROM table WHERE MATCH(title,description) AGAINST('cats','dogs') ORDER BY publish_date DESC LIMIT 100
This works (for 'dogs' and 'cats'), but when I use the singular ('dog' or 'cat') then I find no results. Not sure why this is going on, I've tried different variations like "+dog, +cat" and tried including IN BOOLEAN MODE as well ... Nothing works. And Yes I am sure that there are other words in the description column that are "dog" and "cat" as well as their plural versions.
How can I get singular words to work with MySQL?
The default minimum word length for full-text searches is 4 characters.
You'll need to change that in the server configuration. See here for some info on how to do it.
Why don't you try something like this:
SELECT title,description,publish_date, MATCH(title,description) AGAINST('search') AS score FROM table WHERE MATCH(title,description) AGAINST('search') ORDER BY score DESC LIMIT 100;
Maybe this will help, but it will not work properly with a single word.