Issue with Singular Words and MySQL Fulltext Searching - mysql

I've set up a fulltext search on the title and description columns of my blog articles table in MySQL. The SQL that I use to search the table is as follows:
SELECT title,description,publish_date FROM table WHERE MATCH(title,description) AGAINST('cats dogs') ORDER BY publish_date DESC LIMIT 100
This works for 'dogs' and 'cats', but when I use the singular ('dog' or 'cat') I get no results. I'm not sure why. I've tried different variations like "+dog, +cat" and tried including IN BOOLEAN MODE as well, but nothing works. And yes, I am sure that the description column contains "dog" and "cat" as well as their plural versions.
How can I get singular words to work with MySQL?

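The default minimum word length for full-text searches is 4 characters (this is the ft_min_word_len setting for MyISAM tables), so 'dog' and 'cat' are never indexed.
You'll need to change that in the server configuration and then rebuild the index. The fine-tuning page of the MySQL manual (http://dev.mysql.com/doc/refman/5.5/en/fulltext-fine-tuning.html) explains how to do it.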
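A rough sketch of what that change looks like in practice, assuming a MyISAM table. The table name blog_articles is hypothetical (the question only shows a placeholder), and the value 3 is just an example:

```sql
-- Check the current minimum indexed word length.
SHOW VARIABLES LIKE 'ft_min_word_len';           -- MyISAM setting, default 4
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';  -- InnoDB setting, default 3

-- Neither variable can be changed at runtime. Set it in the server's
-- option file and restart the server, for example:
--   [mysqld]
--   ft_min_word_len = 3

-- After the restart, rebuild the FULLTEXT index so that short words are
-- indexed. For a MyISAM table (hypothetical name blog_articles):
REPAIR TABLE blog_articles QUICK;
```

For an InnoDB table the FULLTEXT index has to be dropped and re-added instead of repaired.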

Why don't you try something like this:
SELECT title,description,publish_date, MATCH(title,description) AGAINST('search') AS score FROM table WHERE MATCH(title,description) AGAINST('search') ORDER BY score DESC LIMIT 100;
Maybe this will help, but it will not work properly with a single word.

Related

MATCH AGAINST in MySQL doesn't work

I have a problem with FULLTEXT search in MySQL.
I wrote this query:
SELECT searchTag, MATCH (searchTag) AGAINST ('after party') as score FROM post WHERE MATCH (searchTag) AGAINST ('after party') ORDER BY score DESC
The result:
1. we,like,to,party 3.6987853050231934
2. f,w,g,party 3.6987853050231934
3. after,party,tooka 3.657205581665039
Why does number 3 have a lower score when it contains both of the words I searched for?
"after" is a stop word, so it is ignored by a FULLTEXT MATCH query.
Basically, the word "after" is so common in the English language that including it in a query is semantically meaningless.
Think of it this way: imagine a query against the word "a". There are so many sentences which use the word "a", that a match against them really won't provide you with anything useful.
In this post, all of the sentences reference the word "a".

MySQL full text search by relevancy with wildcard

I need to search products and sort them by relevancy. For that I tried this MySQL query:
SELECT *, MATCH(`SubProductName`) AGAINST ('+app*' IN BOOLEAN MODE) AS relevance
FROM `tblsubproducts1`
WHERE MATCH(SubProductName) AGAINST ('+app*' IN BOOLEAN MODE)
ORDER BY relevance DESC
That query returns, for example, Apple Thunderbolt, Apple TV and so on, which is right.
But when I try '+usb*' it doesn't return any rows, even though the database contains a row with SubProductName USB-C Charge Cable, which I can find by matching against '+cable*'.
To clarify, I want the search to work with partial words like 'app' for apple, which is why I added the *. What confuses me is why it doesn't always seem to work. Is it the - in USB-C, or something else?
If you are using MyISAM, then the minimum word length for full text indexing is 4. (This is documented in the MySQL manual.)
In other words, "usb" is not even in the index. You need to change this parameter and re-build the index.

mysql boolean mode fulltext search with wildcards and literals

I'm pretty new to MySQL full-text searches and I ran into this problem today:
My company table has a record with "e-magazine AG" in the name column. I have a full-text index on the name column.
When I execute this query the record is not found:
SELECT id, name FROM company WHERE MATCH(name) AGAINST('+"e-magazi"*' IN BOOLEAN MODE);
I need to work with quotes because of the dash and to use the wildcard because I implement a "search as you type" functionality.
When I search for the whole term "e-magazine AG", the record is found.
Any ideas what I'm doing wrong here? I read about adding the dash to the list of word characters (config update needed) but I'm searching for a way to do this programmatically.
This clause
MATCH(name) AGAINST('+"e-magazi"*' IN BOOLEAN MODE);
will search for "e" AND NOT "magazi"; i.e. the - inside "e-magazi" will be interpreted as a NOT operator even though it is inside quotation marks.
For this reason it will not work as expected.
A solution is to apply an extra HAVING clause with a LIKE.
I know HAVING is slow, but it will only be applied to the results of the match, so not too many rows should be involved.
I suggest something like:
SELECT id, name
FROM company
WHERE MATCH(name) AGAINST('magazine' IN BOOLEAN MODE)
HAVING name LIKE '%e-magazi%';
MySQL fulltext treats the word e-magazine in a text as a phrase rather than a single word, so it is split into the two words e and magazine. When it builds the search index, it does not add e to the index because of ft_min_word_len (default is 4 characters).
The same length limit applies to the search query. That is why a search for e-magazine returns exactly the same results as a-magazine: the a and the - are ignored entirely.
Now you want to find the exact phrase e-magazine. You use quotes for that, which is the correct way to find phrases, but MySQL does not support operators on phrases, only on words:
https://dev.mysql.com/doc/refman/5.7/en/fulltext-boolean.html
With this modifier, certain characters have special meaning at the beginning or end of words in the search string
Some people would suggest to use the following query:
SELECT id, name
FROM company
WHERE MATCH(name) AGAINST('e-magazi*' IN BOOLEAN MODE)
HAVING name LIKE 'e-magazi%';
As I said, MySQL ignores the e- and searches for the wildcard word magazi*. After those results are obtained, HAVING additionally filters them for e-magazi*, including the e-. That way you will find the phrase e-magazine AG. Of course, HAVING is only needed if the search phrase contains the wildcard operator, and you should never add quotes yourself: that operator is for your users, not for you.
Note: As long as you do not surround the search phrase with %, it will only find fields that start with that word. You do not want to surround it, because it would find bee-magazine as well. So you may need an additional OR name LIKE '% e-magazi%' OR name LIKE '%\ne-magazi%' in the HAVING clause to make it usable inside longer texts.
Trick
In the end, though, I prefer a trick that means HAVING isn't needed at all:
When you add texts to your database table, also store them in a separate fulltext-indexed column, replacing words like up-to-date with up-to-date uptodate.
If a user searches for up-to-date, replace it in the query with uptodate.
That way you can still find specific in user-specific, but up-to-date is found as well (and not only date).
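A minimal sketch of that trick, under some assumptions: the texts table and its text column mirror the query used further down, while the search_text column, the index name and the sample row are made up for illustration. The hyphen-stripping itself would be done by the application before the INSERT and before the search.

```sql
-- Hypothetical layout: the original text plus a copy prepared for searching.
CREATE TABLE texts (
  id INT AUTO_INCREMENT PRIMARY KEY,
  `text` TEXT,
  search_text TEXT,
  FULLTEXT KEY ft_search_text (search_text)
);

-- The application appends the de-hyphenated form of each hyphenated word.
INSERT INTO texts (`text`, search_text)
VALUES ('Keep your software up-to-date',
        'Keep your software up-to-date uptodate');

-- A user search for "up-to-date" is rewritten to "uptodate" before querying.
SELECT id, `text`
FROM texts
WHERE MATCH(search_text) AGAINST('+uptodate' IN BOOLEAN MODE);
```

The original column stays untouched for display, and only the prepared copy is indexed.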
Bonus
If a user searches for -well-known huge ports, MySQL treats that as: must not include well, may include known and huge. Of course you could solve that with another extra query variant as well, but with the trick above you remove the hyphen, so the search query simply looks like this:
SELECT id
FROM texts
WHERE MATCH(text) AGAINST('-wellknown huge ports' IN BOOLEAN MODE)

Against not returning score

Is there something wrong with this query? It works sometimes and sometimes not. For example, with the word 'seven' it returns a score of 0, even though I know the word appears in the body of at least 29 rows.
With other words it works fine, but not with this one. Does anyone know why, or have a different solution for sorting by relevance?
SELECT *,
( (MATCH(articles.name) AGAINST('seven'))*5 +
(MATCH(articles.subtitle) AGAINST('seven'))*3 +
(MATCH(articles.body) AGAINST('seven'))) AS search_score
FROM articles
LEFT JOIN matches ON articles.match=matches.id
ORDER BY search_score DESC
EDIT: I noticed that 'seven' is a stop word. Is there another way to do this?
Add COALESCE(value,0) around each score.
Problem
If the word is too common, i.e. occurs in 50% or more of the rows, MySQL treats it like a stop word and will not match against it (this threshold applies to natural-language searches on MyISAM tables).
Then there's the stop-word list (which you've already noticed)
See: http://dev.mysql.com/doc/refman/5.5/en/fulltext-stopwords.html
Solution
The answer to "where to edit mysql fulltext stopword lists?" tells you how to override or replace the default stop word list.
Here's the link to the MySQL docs page: http://dev.mysql.com/doc/refman/5.5/en/fulltext-fine-tuning.html
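As a rough sketch of what overriding the list involves for a MyISAM table (the file path below is a placeholder, not a real location, and I'm assuming the FULLTEXT indexes live on the articles table from the question):

```sql
-- See which stop word list the server uses.
-- A value of (built-in) means the compiled-in default list.
SHOW VARIABLES LIKE 'ft_stopword_file';

-- The variable cannot be changed at runtime. Set it in the server's
-- option file and restart, for example:
--   [mysqld]
--   ft_stopword_file = /path/to/custom_stopwords.txt
-- Setting it to an empty string ('') disables stop word filtering.

-- After the restart, rebuild the FULLTEXT indexes so the change takes effect.
REPAIR TABLE articles QUICK;
```

Once 'seven' is no longer in the list and the index is rebuilt, the MATCH expressions should return non-zero scores for it.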

MySQL fulltext search isn't matching as expected

I have a pretty simple query that doesn't seem to be giving me the results I'd like. I'm trying to allow the user to search for a restaurant by its name, address, or city using a fulltext search.
here is my query:
SELECT ESTAB_NAME, ESTAB_ADDRESS, ESTAB_CITY
FROM restaurant_restaurants rr
WHERE
MATCH (rr.ESTAB_NAME, rr.ESTAB_ADDRESS, rr.ESTAB_CITY)
AGAINST ('*new* *hua*' IN BOOLEAN MODE)
LIMIT 0, 500
New Hua is a restaurant that exists in the table, but this search does not find it. However, when I search for 'ting ho' I get the results I would expect.
Does anyone have any idea what is going on?
I'm using a MyISAM storage engine on MySQL version 5.0.41
Most likely, the full-text index settings have set a minimum word length of 4 - I believe this is the default. You'll need to change these settings, even for BOOLEAN MODE (as per http://dev.mysql.com/doc/refman/5.1/en/fulltext-boolean.html). Take a look at http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html for the settings to change.
I think Michael is right, but also, you probably want to remove the * characters unless they are actually in the name you're searching for. MATCH AGAINST doesn't require a "match all" type of parameter.
My guess: "new" is a MySQL default stop word. See Michael Madsen's second link to see how to change the stop word list and get the restaurant back.