I have a table with names of movies, and I want to be able to search for a movie in that table. But I want to be able to search for part of the title, and still return a result. For example, if there is a record with the name "The quantum of solace", then I want to be able to do a search for "quantum solace", or even "007: quantum solace" and I want to find that record. Is there a way to do this?
EDIT
And how do I sort according to the matches? That is, the row that matches the most, should be returned first.
Use a MySQL Full Text Search in boolean mode.
With this approach, a search for '007: quantum solace' still returns the row, because at least one of the search words matches the column. You can then order the results by relevance.
-- The alias is "relevance" because RANK is a reserved word in MySQL 8.0
SELECT *, MATCH(title) AGAINST ('quantum solace' IN BOOLEAN MODE) AS relevance
FROM films
WHERE MATCH(title) AGAINST ('quantum solace' IN BOOLEAN MODE)
ORDER BY relevance DESC
Have a look at the full text search capabilities of MySQL. Once you set up a full text index, you can do queries like this one:
SELECT *
FROM movies
WHERE MATCH(title) AGAINST ('quantum solace' IN BOOLEAN MODE)
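For completeness, here is a minimal sketch of setting up the full-text index that the query relies on. It assumes the movies table and title column from the query above; the index name ft_title is arbitrary. Note that words shorter than the minimum token length are not indexed (the default is 4 characters for MyISAM and 3 for InnoDB), so a term like '007' may be ignored depending on the storage engine and settings.

```sql
-- Add a FULLTEXT index on the column that MATCH() will search.
-- ft_title is an arbitrary, illustrative index name.
ALTER TABLE movies ADD FULLTEXT INDEX ft_title (title);

-- Check the minimum indexed word length for each engine.
SHOW VARIABLES LIKE 'ft_min_word_len';           -- MyISAM
SHOW VARIABLES LIKE 'innodb_ft_min_token_size';  -- InnoDB
```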
Related
I want to find all rows that match a full-text search for one pair of columns but also do not match the same text in another column.
Both of these seem to work:
SELECT * FROM docs WHERE MATCH(title, descript) AGAINST ('energy' IN BOOLEAN MODE) AND NOT MATCH(categories) AGAINST ('energy' IN BOOLEAN MODE);
Or using a subquery:
SELECT * FROM docs WHERE MATCH(title, descript) AGAINST ('energy' IN BOOLEAN MODE) AND id NOT IN (SELECT id FROM docs where MATCH(categories) AGAINST ('energy' IN BOOLEAN MODE));
The docs table has the relevant full-text indexes set up.
Any reason to prefer one over the other?
On the (small) database I'm using they are both very fast, too fast to measure reliably.
Thanks for any suggestions.
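One way to investigate rather than guess, sketched with the two queries from the question: prefix each with EXPLAIN and compare the plans MySQL reports. Which plan comes out better will depend on the MySQL version and the data, so treat this as a way to check, not as an answer.

```sql
-- Plan for the NOT MATCH version
EXPLAIN
SELECT * FROM docs
WHERE MATCH(title, descript) AGAINST ('energy' IN BOOLEAN MODE)
  AND NOT MATCH(categories) AGAINST ('energy' IN BOOLEAN MODE);

-- Plan for the subquery version
EXPLAIN
SELECT * FROM docs
WHERE MATCH(title, descript) AGAINST ('energy' IN BOOLEAN MODE)
  AND id NOT IN (
        SELECT id FROM docs
        WHERE MATCH(categories) AGAINST ('energy' IN BOOLEAN MODE)
      );
```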
I've been looking for resources that explain exactly how this query sorts the retrieved rows by relevance, and haven't been able to find any.
Hopefully one of you can explain the logic of it to me?
SELECT *, MATCH(body, subject) AGAINST ('words' IN BOOLEAN MODE) AS relevance
FROM `messages`
WHERE MATCH(body, subject) AGAINST ('words' IN BOOLEAN MODE)
ORDER BY relevance DESC
In this case, I know that the first half of this query searches the messages.body and messages.subject columns for the search term "words". It then returns those results (regardless of the Boolean operators) in what is essentially a "random order" (ordered by what is found first, then second, and so on).
What I don't understand, however, is how MySQL interprets the WHERE clause and the rest of the query. How does repeating the first half of code reorder the results by relevance?
For example, an ORDER BY clause that sorts a users.user_id column in descending numerical order MAKES SENSE to me, because each row has a clear order (e.g. 3, 2, 1, and so on).
But how does (going back to the original query) MySQL interpret these "word" results (words, obviously not having any values/numbers/clear-order) and sort them according to relevance?
Is it because the Boolean Full-text Search gives hidden numerical values to these search terms? Like if the AGAINST clause read:
AGAINST ('+apple -macintosh ~microsoft >windows' IN BOOLEAN MODE)
Like "apple" gets a value of 100, "macintosh" a value of -100, "microsoft" a value of 20, and "windows" a value of 40 (to reflect the Operator Effects)?
I know that this is oversimplifying the process (especially when considering if a column contains more than one of these search terms), but that is the best I got.
What I basically need, is a layman-terms explanation of the WHERE clause's (the 2nd half of query code's) effect on the query results as a whole.
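A small sketch that may help here, using the messages table from the question. MATCH ... AGAINST is an ordinary expression that evaluates to a number for each row, so it can be selected, filtered on, and sorted like any other numeric column. Running it without the WHERE clause makes that number visible for every row, including the rows that do not match.

```sql
-- No WHERE clause: every row is returned, each with its own relevance number.
-- Rows that do not match the search get a relevance of 0.
SELECT subject,
       MATCH(body, subject) AGAINST ('words' IN BOOLEAN MODE) AS relevance
FROM messages
ORDER BY relevance DESC;

-- With the WHERE clause: the same expression is used as a filter.
-- MySQL treats a nonzero number as true, so only rows with relevance > 0 remain.
SELECT subject,
       MATCH(body, subject) AGAINST ('words' IN BOOLEAN MODE) AS relevance
FROM messages
WHERE MATCH(body, subject) AGAINST ('words' IN BOOLEAN MODE)
ORDER BY relevance DESC;
```

So the two halves do different jobs: the expression in the SELECT list produces the number to sort by, and the same expression in WHERE drops the rows whose number is 0.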
How can I do a MySQL search which will match partial words but also provide accurate relevancy sorting?
SELECT name, MATCH(name) AGAINST ('math*' IN BOOLEAN MODE) AS relevance
FROM subjects
WHERE MATCH(name) AGAINST ('math*' IN BOOLEAN MODE)
The problem with boolean mode is the relevancy always returns 1, so the sorting of results isn't very good. For example, if I put a limit of 5 on the search results the ones returned don't seem to be the most relevant sometimes.
If I search in natural language mode, my understanding is that the relevancy score is useful but I can't match partial words.
Is there a way to perform a query which fulfils all of these criteria:
Can match partial words
Results are returned with accurate relevancy
Is efficient
The best I've got so far is:
SELECT name
FROM subjects
WHERE name LIKE 'mat%'
UNION ALL
SELECT name
FROM subjects
WHERE name LIKE '%mat%' AND name NOT LIKE 'mat%'
But I would prefer not to be using LIKE.
The new InnoDB full-text search feature in MySQL 5.6 helps in this case.
I use the following query:
SELECT MATCH(column) AGAINST('(word1* word2*) ("word1 word1")' IN BOOLEAN MODE) score, id, column
FROM table
HAVING score > 0
ORDER BY score DESC
LIMIT 10;
where ( ) groups words into a subexpression. The first group behaves like a LIKE 'word%' prefix match; the second looks for an exact phrase. The score is returned as a float.
I obtained a good solution in this (somewhat) duplicate question a year later:
MySQL - How to get search results with accurate relevance
I'm running two queries that should give me the same result.
First query:
SELECT count( id ) AS cv FROM table_name WHERE field_name LIKE '%êêê01, word02, word03%'
Second query:
SELECT count( id ) AS cv FROM table_name WHERE match(field_name) against('êêê01, word02, word03')
But the first returns more rows than the second. Could someone tell me why?
I'm using a FULLTEXT index on this field.
Thanks.
I did some quick research, and the following quote should answer your question:
One problem with MATCH on MySQL is that it seems to only match against whole words so a search for 'bla' won't match a column with a value of 'blah'.
It's also described in the documentation for MATCH:
By default, the MATCH() function performs a natural language search for a string against a text collection. A collection is a set of one or more columns included in a FULLTEXT index. The search string is given as the argument to AGAINST(). For each row in the table, MATCH() returns a relevance value; that is, a similarity measure between the search string and the text in that row in the columns named in the MATCH() list.
Meanwhile, LIKE is more "powerful" because it compares individual characters:
Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator:
This explains why LIKE returns more results than MATCH.
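To make the difference concrete, here is a small sketch. The notes table, its columns, and the sample rows are hypothetical and exist only for illustration; it assumes InnoDB with default full-text settings.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TABLE notes (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  body TEXT,
  FULLTEXT KEY ft_body (body)
) ENGINE=InnoDB;

INSERT INTO notes (body) VALUES
  ('quantum of solace'),
  ('quant fund report');

-- LIKE compares characters, so 'quant' inside 'quantum' counts as a match.
-- Both rows qualify.
SELECT id, body FROM notes WHERE body LIKE '%quant%';

-- MATCH compares whole indexed words, so only the row that contains
-- the word 'quant' itself is expected to qualify.
SELECT id, body FROM notes WHERE MATCH(body) AGAINST ('quant');

-- In boolean mode, a trailing * asks for a prefix match,
-- which brings 'quantum' back in.
SELECT id, body FROM notes WHERE MATCH(body) AGAINST ('quant*' IN BOOLEAN MODE);
```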
I have a text field that I'm searching against using an array of keywords and right now, I'm either searching for all of the keywords or any of the keywords.
My question is: is there a way to pull results with a minimum number of keywords?
For example, I'm searching for 6 keywords, but I only need 50% of them to match, so I want the fulltext search to only return results that have matched at least 3 of the keywords.
Is this even possible?
Maybe by using a FullText Modifier?
When you do a fulltext search using the IN BOOLEAN MODE modifier in your select statement, it can display the number of matches in the search.
Example:
SELECT id, MATCH (text) AGAINST ('MySQL Fulltext' IN BOOLEAN MODE) AS matches
FROM table_name
HAVING matches > 2;
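Keep in mind that the boolean-mode score is not guaranteed to be a plain count of matched keywords; on InnoDB it is a weighted relevance value returned as a float. A more explicit sketch, assuming the same table_name and text column as above and six hypothetical keywords, counts each keyword separately:

```sql
-- Each "MATCH(...) > 0" comparison evaluates to 1 (true) or 0 (false) in MySQL,
-- so adding them up gives the number of distinct keywords found in the row.
-- The keywords alpha ... foxtrot are placeholders.
SELECT id,
       (MATCH(`text`) AGAINST ('alpha'   IN BOOLEAN MODE) > 0)
     + (MATCH(`text`) AGAINST ('bravo'   IN BOOLEAN MODE) > 0)
     + (MATCH(`text`) AGAINST ('charlie' IN BOOLEAN MODE) > 0)
     + (MATCH(`text`) AGAINST ('delta'   IN BOOLEAN MODE) > 0)
     + (MATCH(`text`) AGAINST ('echo'    IN BOOLEAN MODE) > 0)
     + (MATCH(`text`) AGAINST ('foxtrot' IN BOOLEAN MODE) > 0) AS matched_keywords
FROM table_name
-- Pre-filter to rows that contain at least one keyword.
WHERE MATCH(`text`) AGAINST ('alpha bravo charlie delta echo foxtrot' IN BOOLEAN MODE)
-- Keep rows with at least 3 of the 6 keywords (50%).
HAVING matched_keywords >= 3;
```

The query gets longer as keywords are added, so it is usually generated by the application code from the keyword array.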