I have a table with articles and a table with categories. Each category has a number of keywords and I want to use those keywords to determine if an article belongs to a certain category.
I'm using the query below:
SELECT
    path,
    title,
    description,
    keywords
FROM
    (
        SELECT
            path,
            keywords,
            (select title from article where id = 164016) as title,
            (select description from article where id = 164016) as description
        FROM
            categories c
    ) as x
WHERE
    MATCH (title, description) AGAINST ('my keywords' IN BOOLEAN MODE)
For some reason this query is not working because of incorrect parameters to MATCH, but I can't figure out what the problem is.
Use boolean fulltext search to enable exact matching on an expression. However, there are some limitations to such searches, as described by the documentation linked above:
A phrase that is enclosed within double quote (") characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and performs a search in the FULLTEXT index for the words. Nonword characters need not be matched exactly: Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase".
If the phrase contains no words that are in the index, the result is
empty. The words might not be in the index because of a combination of
factors: if they do not exist in the text, are stopwords, or are
shorter than the minimum length of indexed words.
The above also means that if you have a stop word or a word shorter than the minimum length in the search expression, then MySQL will not return any matches.
SELECT path, keywords, MATCH (c.keywords) AGAINST ('"Here is some text"' IN BOOLEAN MODE) as relevance
FROM categories c
WHERE MATCH (c.keywords) AGAINST ('"Here is some text"' IN BOOLEAN MODE)
If you want completely exact matches, then you cannot use fulltext search. You either need to use the like operator or the = operator.
UPDATE
title in this case is a calculated field which does not have a fulltext index.
against() takes a string to search for, and an optional modifier that indicates what type of search to perform. The search string must be a string value that is constant during query evaluation. This rules out, for example, a table column because that can differ for each row.
(source)
You have a field name in the against() function, which is not allowed.
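A minimal sketch of one way around both restrictions, assuming a FULLTEXT index can be added to the article table itself (the index name ft_article_title_description is hypothetical). MATCH() then runs against indexed base columns instead of fields computed in a derived table, and AGAINST() receives a constant string:

```sql
-- Assumption: article is a MyISAM table, or an InnoDB table on MySQL 5.6+.
-- The index name is made up for this example.
ALTER TABLE article
  ADD FULLTEXT INDEX ft_article_title_description (title, description);

-- Check a single article against one keyword string.
-- The column list in MATCH() must be the same as the indexed column list.
SELECT a.id, a.title, a.description
FROM article a
WHERE a.id = 164016
  AND MATCH (a.title, a.description)
      AGAINST ('my keywords' IN BOOLEAN MODE);
```

The keyword string would have to be supplied per category by the application, since a column such as c.keywords cannot be passed to AGAINST().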
I have a table in which I created a FULLTEXT index on a column called item_desc.
Let's say the table contains three records in which the item_desc column includes "Sodium Chloride", as follows:
Solution Sodium Chloride standard
5425 Sodium Chloride 100u
QtySodium Chloride solution
I have the following MATCH ... AGAINST query, which is supposed to return rows by exactly matching the phrase, but it returns only the first two rows for Sodium Chloride and ignores the phrase when it is concatenated with another word, as in QtySodium Chloride.
SELECT * FROM tblhugedata WHERE MATCH(Item_desc) AGAINST('"*Sodium Chloride*"' IN BOOLEAN MODE);
The following LIKE query returns the expected results, but I want to use only the FULLTEXT index.
SELECT * FROM tblhugedata WHERE Item_desc like '%SODIUM CHLORIDE%';
Is there any way to get such results using MATCH ... AGAINST?
Remove the asterisks. FULLTEXT does not allow for leading wildcards. That is, there is no way to get MATCH to match QtySodium against Sodium.
I would consider "QtySodium" to be "garbage in" and complain to the provider of the data.
Here is a kludge that will work in some cases:
WHERE MATCH(Item_desc) AGAINST('Sodium Chloride' IN BOOLEAN MODE)
AND Item_desc LIKE '%SODIUM CHLORIDE%'
That way, it will efficiently filter down to rows that have either "Sodium" or "Chloride", then check such rows for exactly the substring "Sodium Chloride". That will match your 3 examples, but perhaps not some other examples.
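For contrast with the leading-wildcard restriction, boolean mode does support a trailing asterisk as a truncation operator. A small sketch, reusing the table and column from the question; the prefixes Sodi and Chlor are only illustrative:

```sql
-- Matches rows containing a word that BEGINS with "Sodi" and a word that
-- BEGINS with "Chlor", e.g. "Sodium" and "Chloride".
-- It still cannot find "QtySodium", because the prefix is anchored at the
-- start of each indexed word.
SELECT *
FROM tblhugedata
WHERE MATCH(Item_desc) AGAINST('+Sodi* +Chlor*' IN BOOLEAN MODE);
```

So a trailing wildcard helps with suffix variations, while words glued onto the front of the term still need the LIKE fallback.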
SELECT * FROM tblhugedata WHERE MATCH(Item_desc) AGAINST( 'Sodium Chloride' IN NATURAL LANGUAGE MODE);
InnoDB full-text search does not support the use of multiple operators on a single search word.
I have a MySQL table with a list of bad words in it (bad_words), and I want to scan a text field (public_message) for the number of bad words in that field. There are about 1100 entries in bad_words
I've tried contains, but that only looks at one word.
Something like this:
SELECT public_post_id, count(word)
FROM public_posts
WHERE public_message CONTAINS (SELECT word FROM bad_words)
I know this syntax is wrong, but that's the gist of what I'm trying to achieve.
The final output should be a number of bad words in each public_message. I'm not concerned with which words at this point, just if there are any, and how many.
You can do this:
SELECT p.public_post_id, COUNT(*)
FROM public_posts p JOIN bad_words b
ON p.public_message LIKE CONCAT('%', b.word, '%')
GROUP BY p.public_post_id
But it will have incredibly bad performance. It will have to do a number of searches equal to the number of rows in public_posts times the 1,100 words in bad_words.
MySQL has a feature for fulltext indexing, but it won't work for your case.
https://dev.mysql.com/doc/refman/8.0/en/fulltext-restrictions.html says:
The argument to AGAINST() must be a string value that is constant during query evaluation. This rules out, for example, a table column because that can differ for each row.
In other words, you can't do this:
SELECT ...
FROM public_posts p JOIN bad_words b
ON MATCH(p.public_message) AGAINST(b.word) -- ERROR!
You could search for one word at a time, but then it would require 1,100 queries.
Or you could do it with a fulltext index by listing many words in the AGAINST expression:
SELECT ...
FROM public_posts p
WHERE MATCH(p.public_message) AGAINST('word1 word2 word3 word4 word5...' IN BOOLEAN MODE)
The many words could be a list you generate by querying the bad_words table.
But this doesn't tell you the count of matches per word, only that the post contained at least one matching word.
Also, I'm not sure if there's a length limit, or if you can make a string of all 1,100 bad words.
I don't know of any other fulltext search implementation that would handle this better.
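One way the word list could be generated inside MySQL itself, as a rough sketch. It assumes a FULLTEXT index on public_posts.public_message, and that none of the bad words contain boolean-mode operator characters such as +, -, * or double quotes. The names @bad_words and find_posts are hypothetical:

```sql
-- GROUP_CONCAT() truncates its result at group_concat_max_len
-- (1024 by default), so raise it for this session first.
SET SESSION group_concat_max_len = 1000000;

-- Build one space-separated string from all rows of bad_words.
SELECT GROUP_CONCAT(word SEPARATOR ' ')
INTO @bad_words
FROM bad_words;

-- Pass the string as a statement parameter, so AGAINST() still sees
-- a value that is constant while the query runs.
PREPARE find_posts FROM
  'SELECT public_post_id
   FROM public_posts
   WHERE MATCH(public_message) AGAINST(? IN BOOLEAN MODE)';
EXECUTE find_posts USING @bad_words;
DEALLOCATE PREPARE find_posts;
```

This only flags posts that contain at least one listed word. It does not produce a per-post count, and words that are stopwords or shorter than the minimum indexed length would be missed.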
Well, I'm running 2 queries that should show me the same result.
First query:
SELECT count( id ) AS cv FROM table_name WHERE field_name LIKE '%êêê01, word02, word03%'
Second query:
SELECT count( id ) AS cv FROM table_name WHERE match(field_name) against('êêê01, word02, word03')
But the first returns more rows than the second. Could someone tell me why?
I'm using fulltext index on this field,
Thanks.
I did some quick research, and the following quote should answer your question:
One problem with MATCH on MySQL is that it seems to only match against whole words so a search for 'bla' won't match a column with a value of 'blah'.
It's also described in the documentation for MATCH():
By default, the MATCH() function performs a natural language search for a string against a text collection. A collection is a set of one or more columns included in a FULLTEXT index. The search string is given as the argument to AGAINST(). For each row in the table, MATCH() returns a relevance value; that is, a similarity measure between the search string and the text in that row in the columns named in the MATCH() list.
Meanwhile, LIKE is more "powerful" because it compares individual characters:
Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator:
This explains why LIKE returns more results than MATCH().
I have to implement fulltext search in multiple columns with result weighting based on relevance of certain columns / fields.
All the solutions I've come across seem to use single-column indexes for calculating relevance and one multiple-column index for the WHERE clause. See: https://stackoverflow.com/a/600915/168719 or https://stackoverflow.com/a/6305108/168719
Here's my query then:
SELECT MATCH(name) AGAINST (text) as relevance_name,
MATCH(description) AGAINST(text) as relevance_description,
MATCH(description_long) AGAINST (text) as relevance_description_long
FROM products WHERE
And I'm facing the choice between:
a)
MATCH(name, description, description_long) AGAINST (text) > 0
b)
MATCH(name) AGAINST (text) > 0
OR MATCH(description) AGAINST (text) > 0
OR MATCH(description_long) AGAINST (text) > 0
After which the sorting clause comes.
ORDER BY (relevance_name * 2 +
relevance_description * 3 +
relevance_description_long * 4) / 9
The question is: what is the advantage of a (apparently the preferred method) over b?
a requires creating another fulltext index (across all searchable columns), which obviously takes more disk space.
What are the advantages? Is it a matter of performance? Or search quality?
The manual, in section 12.9.1, Natural Language Full-Text Searches, tells us:
For each row in the table, MATCH() returns a relevance value; that is, a similarity measure between the search string and the text in that row in the columns named in the MATCH() list.
Therefore, MATCH() will return different values for MATCH(c1, c2, c3) and MATCH(c1) + MATCH(c2) + MATCH(c3). A similar difference arises when separate MATCH() expressions are combined with the OR operator.
Relevance is computed based on the number of words in the row, the number of unique words in that row, the total number of words in the collection, and the number of documents (rows) that contain a particular word.
You should use approach B, because it uses the same per-column MATCH() expressions as the relevance columns in your query.
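Put together, approach B might look like the sketch below. It assumes a separate single-column FULLTEXT index on each of name, description and description_long. The id column and the literal 'search text' are placeholders that do not appear in the question:

```sql
SELECT id,  -- hypothetical key column
       MATCH(name)             AGAINST ('search text') AS relevance_name,
       MATCH(description)      AGAINST ('search text') AS relevance_description,
       MATCH(description_long) AGAINST ('search text') AS relevance_description_long
FROM products
WHERE MATCH(name)             AGAINST ('search text') > 0
   OR MATCH(description)      AGAINST ('search text') > 0
   OR MATCH(description_long) AGAINST ('search text') > 0
-- Weighted average of the per-column scores, best matches first.
ORDER BY (relevance_name * 2 +
          relevance_description * 3 +
          relevance_description_long * 4) / 9 DESC;
```

The filter and the ranking are built from the same three expressions here, so a row passes the WHERE clause exactly when at least one of its relevance columns is non-zero.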
I have a text field that I'm searching against using an array of keywords and right now, I'm either searching for all of the keywords or any of the keywords.
My question is: is there a way to pull results with a minimum number of keywords?
For example, I'm searching for 6 keywords, but I only need 50% of them to match, so I want the fulltext search to only return results that have matched at least 3 of the keywords.
Is this even possible?
Maybe by using a FullText Modifier?
When you do a fulltext search using the IN BOOLEAN MODE modifier in your select statement, it can display the number of matches in the search.
Example:
SELECT id, MATCH (text) AGAINST ('MySQL Fulltext' IN BOOLEAN MODE) AS matches
FROM table_name
HAVING matches > 2;
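As far as I know, whether the boolean-mode relevance value equals the number of matched keywords depends on the storage engine, so it may be safer to count the keywords explicitly. A sketch under that assumption, with six placeholder keywords (alpha through foxtrot are hypothetical) and a threshold of three:

```sql
SELECT id,
       -- Each comparison evaluates to 1 when the keyword is found, else 0,
       -- so the sum is the number of distinct keywords matched.
       (MATCH (text) AGAINST ('alpha'   IN BOOLEAN MODE) > 0)
     + (MATCH (text) AGAINST ('bravo'   IN BOOLEAN MODE) > 0)
     + (MATCH (text) AGAINST ('charlie' IN BOOLEAN MODE) > 0)
     + (MATCH (text) AGAINST ('delta'   IN BOOLEAN MODE) > 0)
     + (MATCH (text) AGAINST ('echo'    IN BOOLEAN MODE) > 0)
     + (MATCH (text) AGAINST ('foxtrot' IN BOOLEAN MODE) > 0) AS matched_keywords
FROM table_name
-- Narrow the rows to those containing at least one keyword first.
WHERE MATCH (text)
      AGAINST ('alpha bravo charlie delta echo foxtrot' IN BOOLEAN MODE)
HAVING matched_keywords >= 3;
```

The threshold in HAVING can be changed to any minimum, for example half the number of keywords supplied.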