MySql Match() Against in Boolean mode

I can't figure out why this is not working.
Here is a query that works:
SELECT
id,
title
FROM
`pages`
WHERE MATCH (title) AGAINST ('Visual*' IN BOOLEAN MODE)
ORDER BY title
LIMIT 10;
and here is my pages table:
id   title
===============================
1    About Us
2    Visual Data
but this one does not return any records:
SELECT
id,
title
FROM
`pages`
WHERE MATCH (title) AGAINST ('About*' IN BOOLEAN MODE)
ORDER BY title
LIMIT 10;
Here is the SQL Fiddle: http://sqlfiddle.com/#!9/d264f2/2

Several concepts matter when using full-text search, and the documentation has more details.
One key concept is what defines a word. That is not important here, but MySQL lets you specify the delimiters.
Another key concept is that only some words are indexed. Two common reasons why a word is not indexed are:
- It is too short (or too long, but that is unusual).
- It is in the stopword list.
The stopword list usually contains "filler" words such as "the", "otherwise", and, as you might guess, "about". Because "about" is never put into the index, 'About*' has nothing to match, while "Visual" is indexed normally.
You can override the stopword list by providing your own list (or an empty one). You then need to rebuild the full-text index.
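The two indexing rules above can be modeled in a few lines. This is only a rough sketch, not MySQL's actual tokenizer: the function name `is_indexed` and the tiny stopword set are made up for illustration, and the default minimum length of 3 assumes InnoDB (`innodb_ft_min_token_size`); MyISAM's `ft_min_word_len` defaults to 4.

```python
def is_indexed(word, stopwords, min_len=3, max_len=84):
    """Rough model: a word is indexed if its length is within bounds
    and it is not a stopword (comparison is case-insensitive)."""
    w = word.lower()
    return min_len <= len(w) <= max_len and w not in stopwords


# Illustrative subset only; the real lists are longer and engine-specific.
stopwords = {"about", "the"}

for title in ("About Us", "Visual Data"):
    kept = [w for w in title.split() if is_indexed(w, stopwords)]
    print(title, "->", kept)
```

Under these assumptions no word of "About Us" is indexed ("about" is a stopword and "Us" is too short), which is consistent with the second query returning no rows.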

Related

Boost MySQL query with lot LIKE conditions

I have a table with ~25K rows. The user can provide keywords and negative keywords to filter the rows.
The query slows down when the user adds a lot of keywords and/or negative keywords.
The query looks like this:
SELECT id, title, description
FROM entities
WHERE
(
title LIKE '%keyword_1%' OR description LIKE '%keyword_1%'
OR title LIKE '%keyword_2%' OR description LIKE '%keyword_2%'
OR title LIKE '%keyword_3%' OR description LIKE '%keyword_3%'
)
AND
(
title NOT LIKE '%negative_keyword_1%' OR description NOT LIKE '%negative_keyword_1%'
OR title NOT LIKE '%negative_keyword_2%' OR description NOT LIKE '%negative_keyword_2%'
OR title NOT LIKE '%negative_keyword_3%' OR description NOT LIKE '%negative_keyword_3%'
)
For example, a query with 9 keywords and 130 negative keywords takes ~7 seconds.
Is there a better way to filter those rows without LIKE? Maybe the whole logic is wrong.
I tried MATCH() AGAINST(), but it is slower than LIKE for some reason.
In a case like yours, it might help to filter the positive matches in the database and then test them for the presence of a negative keyword in memory. So try something like:
SELECT id, title, description
FROM entities
WHERE
(
title LIKE '%keyword_1%' OR description LIKE '%keyword_1%'
OR title LIKE '%keyword_2%' OR description LIKE '%keyword_2%'
OR title LIKE '%keyword_3%' OR description LIKE '%keyword_3%'
)
This gives you results which match positive keywords. You can then filter out the matches in memory which contain negative keywords.
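As a minimal sketch of that in-memory step, assuming the rows come back from the driver as dictionaries with `title` and `description` keys, and assuming a row should be dropped as soon as any negative keyword appears in either column (the function name `exclude_negative` is made up for illustration):

```python
def exclude_negative(rows, negative_keywords):
    """Keep only rows where no negative keyword occurs in title or description.

    Matching is a case-insensitive substring test, roughly what
    LIKE '%keyword%' does under a case-insensitive collation.
    """
    lowered = [kw.lower() for kw in negative_keywords if kw]
    kept = []
    for row in rows:
        # Check each column separately so a keyword cannot match
        # across the boundary between title and description.
        fields = ((row.get("title") or "").lower(),
                  (row.get("description") or "").lower())
        if not any(kw in field for kw in lowered for field in fields):
            kept.append(row)
    return kept


rows = [
    {"id": 1, "title": "Red shoes", "description": "cheap"},
    {"id": 2, "title": "Blue hat", "description": "Premium wool"},
]
print(exclude_negative(rows, ["cheap"]))
```

With 25K rows at most and the positive keywords already applied in SQL, a loop like this only has to scan a small result set.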
Also give an index a chance to be used with LIKE. A pattern with a leading wildcard ('%keyword_1%') cannot use an index, while a prefix pattern such as LIKE 'keyword_1%' can. This reduces your search options from CONTAINS to STARTS WITH, but it might be sufficient for your case.
Fulltext search
Another option is to use MySQL fulltext search by defining a fulltext index on your title and description columns.
CREATE FULLTEXT INDEX idx_title ON entities(title);
CREATE FULLTEXT INDEX idx_description ON entities(description);
Or you can merge these two columns into a single column for search purposes. Then you need only one fulltext index.
The query syntax is then:
MATCH (title) AGAINST ('keyword_1')
instead of
title LIKE '%keyword_1%'
For this search I would also recommend filtering only the positive matches in the database and then filtering out, in memory, the matches that contain negative keywords.

MySQL Fulltext search query matching ALL words still returns partial matches

I'm having the identical issue that this poster had, however the accepted answer didn't resolve my issue. Basically I'm trying to match my "title" column with ALL of the words in a fulltext search query, yet it's still returning partial matches. I recently transferred my MySQL database tables to a new web host, and my fulltext search isn't behaving as it was on my old server. I'm assuming there might be a setting difference, but I can't seem to locate it. Fulltext is enabled, my ft_min_word_len is set at 3, and yet the following MySQL query is still garnering partial matches:
SELECT title, MATCH (title) AGAINST ("more pink") AS relevance
FROM discography
WHERE MATCH (title) AGAINST ("+more +pink" IN BOOLEAN MODE)
ORDER BY relevance DESC
The above code returns the below set, the first 7 titles are:
Under The Pink & More Pink
Under The Pink Tour All Pass
Under The Pink Tour Guest Pass
Under The Pink Tour Aftershow Pass
Under The Pink Tour After Show Pass
Under The Pink
Under The Pink
How can I omit the partial matches? Is there something I'm missing? The results are even worse if I put the SELECT statement in Boolean mode, since that sets the relevance into a binary 1 or 0:
SELECT title, MATCH (title) AGAINST ("+more +pink" IN BOOLEAN MODE) AS relevance
FROM discography
WHERE MATCH (title) AGAINST ("+more +pink" IN BOOLEAN MODE)
ORDER BY relevance DESC
First 7 titles are:
Under The Pink
Under The Pink
Under The Pink
Under The Pink
Under The Pink
Under The Pink
Under The Pink & More Pink
Despite using the + operator, it doesn't seem to be narrowing my results. Any help would be welcome, many thanks in advance.
Well, I feel silly now. My table uses MyISAM, and according to the documentation, "more" is on the stopwords list. So that's why that search is picking up on partial matches. Thanks everyone for the help.
EDIT
If anyone is curious how to work around a stopwords list on shared hosting when programming your own search engine for your website, I recommend a technique similar to the one I used to get around my "ft_min_word_len" setting.
Create a separate search column that stores a duplicate of all the values in the column or columns you want to search via Fulltext. Create an include file that stores all the stopwords listed for your database type in an array. Before saving the values into your dedicated search column, loop through each word in the column values and check whether it exists in the stopwords array. If it does, append a character to the end of the stopword (I chose "z").
When a search is triggered, run the search terms through the same stopwords array and append the same character ("z" in this case) to any term that is a stopword. After making those alterations to the search terms, you can search your dedicated search column without fear of your stopwords being ignored. Of course, I don't use my search column for any display purposes, only for searching.
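The marking step could look roughly like the sketch below. It is an illustration, not the poster's actual code (which is not shown): the name `mark_stopwords` is hypothetical and `STOPWORDS` holds only a small illustrative subset, whereas a real include file would contain the full list for the storage engine in use.

```python
import re

STOPWORDS = {"more", "some", "about"}  # illustrative subset, not the full list
SUFFIX = "z"                           # the character the poster chose


def mark_stopwords(text, stopwords=STOPWORDS, suffix=SUFFIX):
    """Append `suffix` to every word of `text` that is a stopword."""
    def repl(match):
        word = match.group(0)
        return word + suffix if word.lower() in stopwords else word
    return re.sub(r"\w+", repl, text)


# Apply the same function when filling the search column and when
# preparing the user's search terms, so both sides agree.
print(mark_stopwords("Under The Pink & More Pink"))
print(mark_stopwords("more pink"))
```

One thing to watch for is that a marked word must not itself turn into a stopword or collide with a real word in the data, so the suffix should be chosen with the actual list in mind.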

Common words not showing up in FULLTEXT search results

I am using full-text search on a website I am building to order the results of a user's search query by relevance. This works great, with one problem: when a search term appears in more than 50% of the rows in my table, that term is ignored. I am looking for a way to NOT ignore words that appear in more than 50% of the rows of a table.
For example, in my "items" table, it may look something like this:
item_name
---------
poster 1
poster 2
poster 3
poster 4
another item
If a user searches for "poster", the query returns 0 results because it appears too many times in the table. How can I stop this from happening?
I've tried using IN BOOLEAN MODE, but that takes away the functionality I need (which is ordering by relevance).
Here's an example of my SQL:
SELECT item_title
FROM items
WHERE MATCH(item_title, tags, item_category) AGAINST('poster')
You have to recompile MySQL to change this. From the documentation on Fine-Tuning MySQL Full-Text Search:
The 50% threshold for natural language searches is determined by the particular weighting scheme chosen. To disable it, look for the following line in storage/myisam/ftdefs.h:
#define GWS_IN_USE GWS_PROB
Change that line to this:
#define GWS_IN_USE GWS_FREQ
Then recompile MySQL. There is no need to rebuild the indexes in this case.

Mysql not ranking results, fulltext

I have set up a database and enabled fulltext search. Some entries in the database include the word 'test' and one has 'test some more'. I used the query below to search the database:
SELECT keywords, title FROM database WHERE Match(keywords) Against ('more test')
I was expecting it to rank the entry that had 'test some more' above the one that just had 'test' in it.
Am I doing something wrong?
Thanks.
You never specify that the results must be ordered, you simply put an additional constraint on the result that there must be a match.
You can solve this with the following query:
SELECT keywords, title,
MATCH (keywords) AGAINST ('more test') AS relevance
FROM `database`
WHERE MATCH (keywords) AGAINST ('more test')
ORDER BY relevance DESC
And as @GordonLinoff mentions, you probably need to disable the stopword filter. For MyISAM tables, that means setting the ft_stopword_file system variable to the empty string ('') at server startup (note the original value so you can restore it) and then rebuilding the FULLTEXT index.
By default, MySQL removes certain words from the text search. These are called "stop words" and they are not in the index. You can read about them here. The words "more" and "some" are examples of stop words.
You can provide your own list of stop words, but you will have to recreate the index.

MySql Match() Against() return all rows containing a word

I currently use the following query to get search results, and it works OK.
select name from table where MATCH(name) AGAINST(:searchTerm IN BOOLEAN MODE)
The thing is, if a row contains the word MariaDb and a user searches for Maria, it returns no results. How can I write a search query that returns all rows containing words that start with Maria?
Users can also search for multiple words, like Computer software.
You need to use the boolean-mode operators described here:
http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
Change your search term according to the operators you need.
For your example, to match MariaDb when the user searches for Maria, append the * wildcard so that the search term becomes maria*:
select name from table where MATCH(name) AGAINST('maria*' IN BOOLEAN MODE)
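Since the search term comes from the user and may contain several words, the wildcard has to be appended to each word before the term is bound to :searchTerm. A minimal sketch, assuming plain word characters are all you want to keep from the input (the name `to_prefix_query` and the `require_all` flag are made up for illustration):

```python
import re


def to_prefix_query(user_input, require_all=False):
    """Turn free text into a boolean-mode term with a trailing * on each word.

    Anything that is not a word character (including +, -, quotes and
    other boolean operators typed by the user) is discarded.
    """
    words = re.findall(r"\w+", user_input)
    prefix = "+" if require_all else ""
    return " ".join(f"{prefix}{word}*" for word in words)


print(to_prefix_query("Maria"))
print(to_prefix_query("Computer software"))
print(to_prefix_query("Computer software", require_all=True))
```

The resulting string is still passed as a bound parameter, so the SQL statement itself stays unchanged. Without the + prefix a row matches if it contains any of the words; with it, every word must be present.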