Irrelevant results in MySQL full text search

I have a MySQL table set up for full text search across both the title and the content (body) columns.
I'm trying to bring the most relevant results to the top but I get a lot of garbage.
I have 3 full text indexes, one for the title, one for the body and one for both the title and the body so I can execute the following query:
SELECT id, url, title, body, earliestCapture, responseYear, urlScore,
MATCH (title) AGAINST ("jurassic park" IN BOOLEAN MODE) AS titleScore,
MATCH (body) AGAINST ("jurassic park" IN BOOLEAN MODE) AS bodyScore,
(SELECT (titleScore * 100 + bodyScore)) AS finalscore
FROM Entries
WHERE MATCH (title,body) AGAINST ("jurassic park" IN BOOLEAN MODE)
ORDER BY finalScore DESC LIMIT 0,1000;
I'm trying to multiply the score of the title by 100 to bring instances where the term is in the title to the top.
This does help, but if the body has the word park repeated many times even without the word Jurassic appearing a single time, that row is propelled to the top of the search results.
A great example of that is when I search for "intel pentium". There are a few rows whose bodies use the word intel in the sense of intelligence/information rather than the company name. That word is repeated hundreds of times, and even though the word pentium never appears, those pages are always at the top.
I'm getting really annoyed by this. Does anyone know how to improve the search results?
Thank you!

You have to add a + to both search terms so that only rows containing both words are returned; see the manual.
SELECT id, url, title, body, earliestCapture, responseYear, urlScore,
MATCH (title) AGAINST ("+jurassic +park" IN BOOLEAN MODE) AS titleScore,
MATCH (body) AGAINST ("+jurassic +park" IN BOOLEAN MODE) AS bodyScore,
(SELECT (titleScore * 100 + bodyScore)) AS finalscore
FROM Entries
WHERE MATCH (title,body) AGAINST ("+jurassic +park" IN BOOLEAN MODE)
ORDER BY finalScore DESC LIMIT 0,1000;
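A further variation, offered as a sketch rather than a tested fix: in boolean mode a double-quoted string is matched as a literal phrase, so requiring the phrase also rules out rows where the two words appear far apart. It reuses the table and columns from the question and assumes the same three FULLTEXT indexes exist.

```sql
-- Sketch: require the exact phrase "jurassic park" instead of two separate words.
-- Assumes FULLTEXT indexes on (title), (body) and (title, body), as in the question.
SELECT id, url, title, body,
       MATCH (title) AGAINST ('+"jurassic park"' IN BOOLEAN MODE) AS titleScore,
       MATCH (body)  AGAINST ('+"jurassic park"' IN BOOLEAN MODE) AS bodyScore,
       (SELECT (titleScore * 100 + bodyScore)) AS finalScore
FROM Entries
WHERE MATCH (title, body) AGAINST ('+"jurassic park"' IN BOOLEAN MODE)
ORDER BY finalScore DESC
LIMIT 0, 1000;
```

Whether the phrase form or the two-word form gives better results depends on the data, so it is worth trying both.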

Related

MySQL fulltext with score by columns

I created an index like this:
ALTER TABLE `blog_posts`
ADD FULLTEXT `title_description_content` (`title`, `description`, `content`);
I can search with:
SELECT * FROM `blog_posts`
WHERE MATCH(title, content, description)
AGAINST("lorem ipsum" IN NATURAL LANGUAGE MODE)
LIMIT 12
But I want to score by column. For example, that the title column is worth 3 points, that the description is worth 2 and the content is worth 1. So that words found in the title have a higher score.
I know this is possible because I already did it once, but I lost the source and I couldn't find any example on Google.
Thanks!
You can use multiple indexes for your problem: add a separate FULLTEXT index on each column and score each column on its own.
SELECT *,
MATCH (title) AGAINST ("lorem ipsum" IN NATURAL LANGUAGE MODE) AS rel1,
MATCH (description) AGAINST ("lorem ipsum" IN NATURAL LANGUAGE MODE) AS rel2,
MATCH (content) AGAINST ("lorem ipsum" IN NATURAL LANGUAGE MODE) AS rel3
FROM `blog_posts`
WHERE MATCH(title, content, description) AGAINST("lorem ipsum" IN NATURAL LANGUAGE MODE)
ORDER BY (rel1 * 3)+(rel2 * 2)+(rel3 * 1) DESC
LIMIT 12
The weights can be fine-tuned.
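For the per-column MATCH() calls to work, each column needs its own FULLTEXT index in addition to the combined one. A minimal sketch of that setup follows; the index names are made up for illustration, and the statements are kept separate because InnoDB builds only one FULLTEXT index per ALTER TABLE.

```sql
-- Sketch: one FULLTEXT index per column so each column can be scored separately.
-- Index names (ft_title, ft_description, ft_content) are hypothetical.
ALTER TABLE `blog_posts` ADD FULLTEXT `ft_title` (`title`);
ALTER TABLE `blog_posts` ADD FULLTEXT `ft_description` (`description`);
ALTER TABLE `blog_posts` ADD FULLTEXT `ft_content` (`content`);
```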

FULLTEXT search with multiple exact phrases and exclusions

I'm trying to create a fulltext query that matches ANY of multiple exact phrases while excluding others. In my test query, I want to select any record that has EITHER (or both) of the exact phrases 'brown cow' OR 'green cat' AND NOT 'silver rhino'. I have set up test records with combinations of these three phrases, and the query should return 3 records if I get it right.
Query 1
SELECT * FROM jos_sea_messages
WHERE ((Match(body,subject) Against('"+green cat"' IN BOOLEAN MODE) OR Match(body,subject) Against('"+brown cow"' IN BOOLEAN MODE))
AND ( Match(body,subject) Against('"-silver rhino"' IN BOOLEAN MODE)))
Returns 2 records, one of them with 'silver rhino', so not what I want
Query 2
SELECT * FROM jos_sea_messages
WHERE ((Match(body,subject) Against('"+green cat" "-silver rhino"' IN BOOLEAN MODE) OR Match(body,subject) Against('"+brown cow" "-silver rhino"' IN BOOLEAN MODE)))
Returns all records with any of the phrases, including 'silver rhino', so still not right
Query 3
SELECT * FROM jos_sea_messages
WHERE (Match(body,subject) Against('"+green cat" "+brown cow" "-silver rhino"' IN BOOLEAN MODE))
Returns a whole lot of rows, some of which I don't think have any of the exact phrases?
What is the proper syntax for finding records that have either (or both) exact phrases 'brown cow' and 'green cat' but must not contain 'silver rhino'?
Thanks in advance.
I figured it out. Here is my query:
SELECT * FROM jos_sea_messages WHERE (Match(body,subject) Against('"green cat" "brown cow" -"silver rhino"' IN BOOLEAN MODE))
The + aren't required because that would mean that both phrases were needed, and the - goes outside the double quotes.
Hope it helps someone else. I'm sure this type of requirement isn't unique.
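To make the operator placement concrete, here is a short sketch against the same table. It appears that operators placed inside the double quotes are treated as part of the phrase text and not as operators, which would explain why the earlier attempts still returned 'silver rhino' rows.

```sql
-- Either phrase is enough, and "silver rhino" must be absent.
SELECT * FROM jos_sea_messages
WHERE MATCH (body, subject)
      AGAINST ('"green cat" "brown cow" -"silver rhino"' IN BOOLEAN MODE);

-- Both phrases required: the + goes before the opening quote, not inside it.
SELECT * FROM jos_sea_messages
WHERE MATCH (body, subject)
      AGAINST ('+"green cat" +"brown cow" -"silver rhino"' IN BOOLEAN MODE);
```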

query that returns rows that do not contain a word in MySql

I am trying to write a query that selects all the rows that do not contain a specific word. I have a FULLTEXT index on this column. I tried the following, but it does not work:
SELECT *
FROM products
WHERE MATCH(title) AGAINST(' -Dolo' IN BOOLEAN MODE)
So how can I perform this search?
If I have understood you correctly, you want to find all the rows in the table that do not contain the word 'Dolo'.
You can use the NOT operator for that.
SELECT *
FROM products
WHERE NOT MATCH(title) AGAINST('Dolo');
You can also use it like this (because the OP asked: "if the whole word is "dolorem", would this query work?"):
SELECT title as Title
, MATCH(title) AGAINST('Dolo*' IN BOOLEAN MODE) as Score
FROM products
WHERE MATCH(title) AGAINST('Dolo*' IN BOOLEAN MODE) = 0;
The * is the truncation (wildcard) operator: 'Dolo*' matches any word that begins with 'Dolo'.
Other signs are described here: https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html
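As for why the original attempt returns nothing: the boolean-mode documentation linked above states that the - operator only excludes rows that are otherwise matched by other search terms, so a search made up solely of negated terms returns an empty result. A minimal sketch contrasting the two forms on the products table from the question:

```sql
-- Returns an empty set: "-" can only remove rows that another term has matched.
SELECT * FROM products
WHERE MATCH (title) AGAINST ('-Dolo' IN BOOLEAN MODE);

-- Sketch of the inverse: match the word, then negate the whole predicate.
SELECT * FROM products
WHERE NOT MATCH (title) AGAINST ('+Dolo' IN BOOLEAN MODE);
```

The negated form probably cannot use the FULLTEXT index to narrow the rows, so it may be slow on a large table.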

MySQL Match Against In Boolean Mode returns nothing on middle word

I've got a problem using Match Against in my MySQL database, and I'm hoping someone can help.
These are examples of the data in my database:
id   name
1    really bitter chocolate
2    soft cheese
When I run this query:
SELECT * FROM food WHERE (name) LIKE "%bitter%"
This brings back the first result:
1 really bitter chocolate
However, it's part of a much larger query, and when I run the Match Against code, I don't get anything returned from either of these queries:
SELECT * FROM food WHERE MATCH (name) AGAINST ("bitter")
SELECT * FROM food WHERE MATCH (name) AGAINST ("bitter", IN BOOLEAN MODE)
I have full text searches turned on, and it works when I search the start of the name:
SELECT * FROM food WHERE MATCH (name) AGAINST ("really")
SELECT * FROM food WHERE MATCH (name) AGAINST ("really", IN BOOLEAN MODE)
Both of which return:
1 really bitter chocolate
I've read through this for solutions: http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
And I've looked here: mysql WHERE MATCH AGAINST
Can someone please see where I am going wrong or point me in the right direction?
Thanks!
EDIT
OK, as per Woot4Moo's great answer below, I've changed my code to remove the comma, which shouldn't have been there. I've also added the + and put the term in single quotes, but still no luck.
My current query now looks like this:
SELECT * FROM food WHERE MATCH (name) AGAINST ('+bitter' IN BOOLEAN MODE)
But it's returning no results in query browser, and not returning any errors or warnings.
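If these are the only two rows in your table, then the searched string is in 50% of the records and it will be ignored. (This 50% threshold applies to natural language searches on MyISAM tables; boolean mode searches do not use it.)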
SELECT * FROM food WHERE MATCH (name) AGAINST ("bitter", IN BOOLEAN MODE)
looks like it should be:
SELECT * FROM food WHERE MATCH (name) AGAINST ('+bitter' IN BOOLEAN MODE)
Notice how mine has the plus sign + and NO comma.
I'm basing it on the example from the manual:
MySQL can perform boolean full-text searches using the IN BOOLEAN
MODE modifier. With this modifier, certain characters have special
meaning at the beginning or end of words in the search string. In the
following query, the + and - operators indicate that a word is
required to be present or absent, respectively, for a match to occur.
Thus, the query retrieves all the rows that contain the word “MySQL”
but that do not contain the word “YourSQL”:
mysql> SELECT * FROM articles WHERE MATCH (title,body)
-> AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
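When a full-text query silently returns nothing, it can also help to look at the server settings that decide which words get indexed at all. The statements below are only a diagnostic sketch; which variable matters depends on the table's storage engine, and the InnoDB variable will simply return no row on servers that predate InnoDB full-text support.

```sql
-- Minimum indexed word length for MyISAM FULLTEXT indexes (default 4).
SHOW VARIABLES LIKE 'ft_min_word_len';

-- Minimum indexed word length for InnoDB FULLTEXT indexes (default 3).
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';

-- Shows the storage engine and which columns the FULLTEXT index covers.
SHOW CREATE TABLE food;
```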

match against, sort by relevance

My Query
$query = selectQuery('SELECT * FROM page_info WHERE MATCH(content)
AGAINST("reactie klanten" WITH QUERY EXPANSION)');
How can I sort by relevance? I get back about 10 rows, but the row that actually contains the words 'reactie' and 'klanten' is somewhere in the middle. The rest of the results are apparently considered relevant to those words somehow.
MySQL documentation about 'WITH QUERY EXPANSION'
EDIT
I changed my query to:
SELECT *, MATCH(content) AGAINST("reactie klanten") AS relevance FROM page_info WHERE MATCH(content) AGAINST("reactie klanten" WITH QUERY EXPANSION) ORDER BY relevance DESC
This seems promising, because now that row is at the top of the list.
Still, how are the others related?
Example:
$text = "This page is being tested by our tester";
How is $text relevant to reactie klanten?
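On the open question: according to the MySQL documentation, WITH QUERY EXPANSION runs the search twice, and the second pass adds words taken from the most relevant rows of the first pass to the search phrase. Rows that share vocabulary with the best hits can therefore show up without containing either original word. If only direct matches are wanted, one option (a sketch on the same table, not a tested fix) is to leave the expansion out of the WHERE clause as well:

```sql
-- Sketch: filter and rank with the plain natural-language search only,
-- so rows pulled in by the expanded second pass are not returned.
SELECT *, MATCH (content) AGAINST ('reactie klanten') AS relevance
FROM page_info
WHERE MATCH (content) AGAINST ('reactie klanten')
ORDER BY relevance DESC;
```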