I have set up a database and enabled full-text search. Some entries in the database include the word 'test', and one has 'test some more'. I used the query below to search the database:
SELECT keywords, title FROM database WHERE Match(keywords) Against ('more test')
I was expecting it to rank the entry that had 'test some more' above the one that just had 'test' in it.
Am I doing something wrong?
Thanks.
You never specify that the results must be ordered; you only add the constraint that a row must match.
You can solve this with the following query:
SELECT keywords, title,
MATCH (keywords) AGAINST ('more test') AS relevance
FROM database
WHERE MATCH (keywords) AGAINST ('more test')
ORDER BY relevance DESC
And as @GordonLinoff mentions, you probably need to disable the stopword filter (for MyISAM tables, by setting the ft_stopword_file system variable to the empty string, ''; remember the original value so you can restore it).
By default, MySQL removes certain words from the text search. These are called "stop words" and they are not in the index. You can read about them in the MySQL documentation on full-text stopwords. The words "more" and "some" are examples of stop words.
You can provide your own list of stop words, but you will have to recreate the index.
I'm having the identical issue that this poster had, however the accepted answer didn't resolve my issue. Basically I'm trying to match my "title" column with ALL of the words in a fulltext search query, yet it's still returning partial matches. I recently transferred my MySQL database tables to a new web host, and my fulltext search isn't behaving as it was on my old server. I'm assuming there might be a setting difference, but I can't seem to locate it. Fulltext is enabled, my ft_min_word_len is set at 3, and yet the following MySQL query is still garnering partial matches:
SELECT title, MATCH (title) AGAINST ("more pink") AS relevance
FROM discography
WHERE MATCH (title) AGAINST ("+more +pink" IN BOOLEAN MODE)
ORDER BY relevance DESC
The above query returns the following set; the first 7 titles are:
Under The Pink & More Pink
Under The Pink Tour All Pass
Under The Pink Tour Guest Pass
Under The Pink Tour Aftershow Pass
Under The Pink Tour After Show Pass
Under The Pink
Under The Pink
How can I omit the partial matches? Is there something I'm missing? The results are even worse if I put the SELECT statement in Boolean mode, since that sets the relevance into a binary 1 or 0:
SELECT title, MATCH (title) AGAINST ("+more +pink" IN BOOLEAN MODE) AS relevance
FROM discography
WHERE MATCH (title) AGAINST ("+more +pink" IN BOOLEAN MODE)
ORDER BY relevance DESC
First 7 titles are:
Under The Pink
Under The Pink
Under The Pink
Under The Pink
Under The Pink
Under The Pink
Under The Pink & More Pink
Despite using the + operator, it doesn't seem to be narrowing my results. Any help would be welcome, many thanks in advance.
Well, I feel silly now. My table uses MyISAM, and according to the documentation, "more" is on the stopwords list. So that's why that search is picking up on partial matches. Thanks everyone for the help.
EDIT
If anyone is curious how to "go around" a stopwords list on shared hosting when programming your own search engine on your website, I recommend a technique similar to the one I used to get around my "ft_min_word_len" setting. Create a separate search column that stores a duplicate of all the values in the column or columns you wish to search via Fulltext. Create an include file that stores all the stopwords listed for your database type in an array. Before saving the values into your dedicated search column, loop through each individual word in your column values and check whether it exists in the stopwords array from the include file. If a word is a stopword, add a character to the end of it (I chose "z"). Then, when a search is triggered, loop the search terms through the same stopwords array. If any search word is in the stopwords array, add the same character you appended in your search column ("z" in this case) to the end of it. After looping through the array and making the necessary alterations to the search terms, you may search your dedicated search column without fear of your stopwords being ignored. Of course, I don't use my search column for any display purposes, only searching.
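The masking step described above could be sketched roughly as follows. The post does not say which language it used, so this is only an illustration in Python: the `STOPWORDS` set is a short, hypothetical stand-in for the server's real list, and `mask_stopwords` is a made-up helper name. The same function would be applied both when filling the dedicated search column and when preparing the user's search terms.

```python
# Hypothetical, abbreviated stopword list; a real one would mirror
# the full list used by your storage engine.
STOPWORDS = {"more", "some", "about", "the"}

# The marker character appended to stopwords ("z" is the post's choice).
SUFFIX = "z"


def mask_stopwords(text: str) -> str:
    """Append SUFFIX to every whitespace-separated word that is a stopword.

    The comparison is case-insensitive. Punctuation attached to a word is
    not stripped, which is a simplification of this sketch.
    """
    masked = []
    for word in text.split():
        if word.lower() in STOPWORDS:
            masked.append(word + SUFFIX)
        else:
            masked.append(word)
    return " ".join(masked)


# Value stored in the dedicated search column:
search_column_value = mask_stopwords("Under The Pink & More Pink")

# Search terms after the same treatment:
search_terms = mask_stopwords("more pink")
```

Because the column and the search terms are transformed identically, a search for "more pink" ends up looking for "morez pink", and "morez" is no longer filtered out as a stopword.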
I'm not able to figure out why this is not working.
Here is my query which works:
SELECT
id,
title
FROM
`pages`
WHERE MATCH (title) AGAINST ('Visual*' IN BOOLEAN MODE)
ORDER BY title
LIMIT 10;
and here is my pages table:
id title
===============================
1 About Us
2 Visual Data
but this one does not return any records:
SELECT
id,
title
FROM
`pages`
WHERE MATCH (title) AGAINST ('About*' IN BOOLEAN MODE)
ORDER BY title
LIMIT 10;
here is SQL Fiddle : http://sqlfiddle.com/#!9/d264f2/2
There are several important concepts when using full text search -- and the documentation has more details.
One key concept is what defines a word. That is not important here, but MySQL lets you specify the delimiters.
Another key concept is that only some words are indexed. Two common reasons why words are not indexed are:
They are too short (or I suppose too long, but that is unusual).
They are in the stop words list.
The words in the stop words list are usually "filler" words -- such as "the", "otherwise", . . . and you might guess "about".
You can override the stop words list. You will need to provide another stop words list (or none at all). And then rebuild the index.
I have the following table structure:
keyword keyword_type
---------------------------------
Membership Renew
Membership New
Membership Lost
Membership Damage
Both columns are indexed for full-text search, and I have set ft_min_word_len = 3 in the MySQL configuration file so that words shorter than 4 characters can be searched (SHOW VARIABLES confirms that the variable is set correctly). The problem I'm facing is that whenever I execute the following query, Membership Renew comes first; Membership New should come first, since I have ordered by relevancy.
SELECT *, MATCH(keyword_name, keyword_type)
AGAINST ('+membership new' IN BOOLEAN MODE) AS relevancy
FROM table_keywords
WHERE MATCH(keyword_name, keyword_type)
AGAINST ('+membership new' IN BOOLEAN MODE) ORDER BY relevancy DESC
As instructed in other posts I tried
REPAIR TABLE table_keywords QUICK
This didn't work either. Any help?
Thanks
Sharmila
My experiments with this show that all the samples in your table throw back the same relevancy value with your query.
One of the facts about FULLTEXT search is its dependence on human disambiguation. It's often hard to make the very first row of the result set be the perfect row.
You could try putting in this ORDER BY clause, to show the shortest results first. It might do the trick; it did for me.
ORDER BY relevancy DESC, LENGTH(keyword)+LENGTH(keyword_type) ASC
Here's a SQL fiddle. http://sqlfiddle.com/#!2/ab9e5/2/0
I will simplify my problem in order to explain it.
I have a table which contains text messages posted by users and another table which contains keywords.
I want to display, for each user, the number of times keywords are found in text messages.
I don't want the result to display a keyword if it's not found in text messages.
I want it to be case INSENSITIVE. All keywords are lowercase, but messages can contain both lowercase and uppercase characters.
Because I'm not sure that my explanation is clear enough, here comes the SQLFiddle.
http://sqlfiddle.com/#!2/c402a
Hope someone can help me.
I found what I was looking for. It wasn't easy for me but here is my query :
SELECT t_msg.msg_usr,
t_list.list_word,
count(t_list.list_word),
t_msg.msg_text
FROM t_msg
INNER JOIN t_list
ON LOWER(t_msg.msg_text) LIKE CONCAT("%", t_list.list_word, "%")
GROUP BY t_msg.msg_usr, t_list.list_word;
The SQLFiddle is there : http://sqlfiddle.com/#!2/ba052/8
The recommendation would be to not try to solve this with a query. It is possible to write a query that does it, but such a query will scan the messages table for each keyword separately and produce a count (or a row that you can group by). It won't scale, and it won't be reliable as a language-aware search.
Here is what you might want to do:
Create a table to map (user_id, keyword_id) to a count of this keyword in messages of this user. Let's call it t_keyword_count.
Each time you receive a message, before you save the message into the database, search it for all the keywords you care about (using whatever good text search libraries that account for misspellings, etc.). You should know the (user_id) for this message.
You will, at that point, be ready to add the message to the database, and will have an array of (keyword_id) with keywords that this message will have.
In a transaction, insert the message into the t_msg table, and run update/insert for (user_id,keyword_id) to have value=value+1 (or +n, if you need to count the same keyword more than once in the same message) for the t_keyword_count table.
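The steps above might look something like the sketch below. It is only an illustration under several assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the real server, it stores the keyword text directly instead of a keyword_id, it matches keywords by plain whitespace splitting rather than a proper text search library, and the column names `user_id`, `keyword`, and `cnt` as well as the function names are made up for the example.

```python
import sqlite3
from collections import Counter


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the message table and the per-user keyword counter table."""
    conn.executescript(
        """
        CREATE TABLE t_msg (
            msg_id   INTEGER PRIMARY KEY,
            msg_usr  TEXT NOT NULL,
            msg_text TEXT NOT NULL
        );
        CREATE TABLE t_keyword_count (
            user_id TEXT NOT NULL,
            keyword TEXT NOT NULL,
            cnt     INTEGER NOT NULL,
            PRIMARY KEY (user_id, keyword)
        );
        """
    )


def save_message(conn, user_id, text, keywords):
    """Store a message and bump the keyword counters in one transaction."""
    # Case-insensitive count of each word in the message.
    words = Counter(text.lower().split())
    hits = {kw: words[kw] for kw in keywords if words[kw] > 0}

    with conn:  # commits on success, rolls back if anything raises
        conn.execute(
            "INSERT INTO t_msg (msg_usr, msg_text) VALUES (?, ?)",
            (user_id, text),
        )
        for kw, n in hits.items():
            # Try to increment an existing counter first...
            cur = conn.execute(
                "UPDATE t_keyword_count SET cnt = cnt + ? "
                "WHERE user_id = ? AND keyword = ?",
                (n, user_id, kw),
            )
            # ...and create the row if this pair has not been seen yet.
            if cur.rowcount == 0:
                conn.execute(
                    "INSERT INTO t_keyword_count (user_id, keyword, cnt) "
                    "VALUES (?, ?, ?)",
                    (user_id, kw, n),
                )
```

Reporting then becomes a plain lookup in t_keyword_count, and keywords that never appear for a user simply have no row.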
If you are trying to solve the problem of having to do the above on existing data, you can do this manually, just to build up that t_keyword_count table first (depends on how many keywords you have in total, but even if there are a lot, this can be scripted). But you should change (or mirror) the t_msg.msg_text field to be a field suitable for text search, and use SQL text search functionality to find the keywords.
I currently use the following query to get search results, and it works ok.
select name from table where MATCH(name) AGAINST(:searchTerm IN BOOLEAN MODE)
The thing is, if a row contains the word MariaDb and a user searches for Maria, the query returns no results. How can I write a search query that returns all rows containing words that start with Maria?
Users can also search for multiple words like Computer software
You need to use the operators as you can see in this link:
http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
So, change your search term according to the operators that you need.
Your example would be:
To match MariaDB with the word Maria, your search term should be: maria*
select name from table where MATCH(name) AGAINST('maria*' IN BOOLEAN MODE)
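Since the question binds the term to a :searchTerm placeholder and mentions multi-word searches, the term has to be built in application code. A minimal sketch in Python, assuming you want a prefix match on every word the user typed; `to_prefix_query` and its `require_all` flag are hypothetical names, not part of any library. Boolean-mode operators typed by the user are dropped so that they cannot change the meaning of the query.

```python
import re


def to_prefix_query(user_input: str, require_all: bool = False) -> str:
    """Turn free text into a BOOLEAN MODE term with a trailing * per word.

    Only runs of word characters are kept, so operators such as + - * "
    in the input are discarded. With require_all=True every word is
    prefixed with +, meaning all words must be present in a row.
    """
    words = re.findall(r"\w+", user_input)
    prefix = "+" if require_all else ""
    return " ".join(f"{prefix}{word}*" for word in words)


# Value to bind to :searchTerm for a single word:
single = to_prefix_query("Maria")

# Value to bind for a multi-word search where every word must match:
multi = to_prefix_query("Computer software", require_all=True)
```

The resulting string is then passed as the bound parameter, so the SQL statement itself stays unchanged.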