MySQL Fulltext Search: One-to-many Relationships - mysql

I'm attempting to implement a search function on two tables with a one-to-many relationship. Think of it as a post with multiple tags. Each tag has its own row in the tag table.
I'd like to retrieve a post if all of the search terms can be found in either a) the post text, b) the post tags or c) both.
Let's say I've created my tables like this:
CREATE TABLE post (
id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
text VARCHAR(100) NOT NULL
);
CREATE TABLE tag (
id MEDIUMINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(30) NOT NULL,
post MEDIUMINT NOT NULL
);
And I create indexes like this:
CREATE FULLTEXT INDEX post_idx ON post(text);
CREATE FULLTEXT INDEX tag_idx ON tag(name);
If my search query were "TermA TermB" and I wanted to search just in the post text, I'd formulate my SQL query like this:
SELECT * FROM post WHERE MATCH(text) AGAINST('+TermA +TermB' IN BOOLEAN MODE);
Is there a way to add tags into the mix? My previous attempt was this:
SELECT * FROM post
RIGHT JOIN tag ON tag.post = post.id
WHERE MATCH(post.text) AGAINST('TermA TermB' IN BOOLEAN MODE)
OR MATCH(tag.name) AGAINST('TermA TermB' IN BOOLEAN MODE);
The problem is, this is only an any words query and not an all words query. By this I mean, I'd like to retrieve the post if TermA is in the text and TermB is in the tags.
What am I missing here? Is this even possible using a fulltext search? Is there a better way to approach this?

Try this one:
SELECT post.*
FROM post
INNER JOIN (SELECT post, GROUP_CONCAT(name SEPARATOR ' ') tags FROM tag GROUP BY post) tag ON post.id=tag.post
WHERE MATCH(post.text) AGAINST('+TermA +TermB' IN BOOLEAN MODE)
OR MATCH(tags) AGAINST('+TermA +TermB' IN BOOLEAN MODE)
This might also return results that match in either the content or the tags, but it didn't work in MySQL 5.1:
SELECT post.*, GROUP_CONCAT(tag.name SEPARATOR ' ') tags
FROM post
LEFT JOIN tag ON post.id=tag.post
GROUP BY post.id
HAVING MATCH(post.text,tags) AGAINST('+TermA +TermB' IN BOOLEAN MODE)
so I rewrote it as:
SELECT post.*, tags
FROM post
LEFT JOIN (SELECT post, GROUP_CONCAT(tag.name SEPARATOR ' ') tags FROM tag GROUP BY post) tags ON post.id=tags.post
WHERE MATCH(post.text, tags) AGAINST('+TermA +TermB' IN BOOLEAN MODE)

This is possible, but I'm guessing that in your Tags table, you have one row for each tag per post. So one row containing the tag 'TermA' for post 1 and another record with the tag 'TermB', right?
The all words query (with +) only returns rows where the searched field contains all the specified words. For the tags table, that is never the case.
One possible solution would be to store all tags in a single field in the posts table itself. Then it would be easy to do advanced matching on the tags as well.
Another possibility is to change the condition for tags altogether. That is, use an all query for the text and an any query for the tags. To do that, you'll have to modify the search query yourself, which can fortunately be as easy as removing the plusses from the query.
You can also query for an exact match, like this:
SELECT * FROM post p
WHERE
MATCH(p.text) AGAINST('TermA TermB' IN BOOLEAN MODE)
AND
/* Number of matching tags .. */
(SELECT COUNT(*) FROM tag t
WHERE
t.post = p.id
AND t.name IN ('TermA', 'TermB'))
= /* .. must be .. */
2 /* .. number of searched tags */
In this query, I count the number of matching tags. In this case I want it to be exactly 2, meaning that both tags match (provided that tags are unique per post). You could also check for >= 1 to see if any tags match.
But as you can see, this also requires parsing of the search string. You will have to remove the plusses (or even check their existence to understand whether you want 'any' or 'all'). And you will have to split it as well to get the number of searched words, and get the separate words themselves.
All in all, adding all tags to a 'tags' field in post is the easiest way. Not ideal from a normalisation point of view, but that is manageable, I think.
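The parsing step described above could be done in application code. Here is a minimal Python sketch, assuming the search string is a space-separated list of words; the function name `parse_search` and its `require_all` flag are made up for illustration. It drops boolean-mode operators, returns the separate words (for the IN list and the count), and rebuilds the expression with or without plus signs.

```python
import re


def parse_search(search, require_all=True):
    """Split a search string into words and rebuild a boolean-mode expression.

    Hypothetical helper: only runs of word characters are kept, so
    operators such as + - * " ( ) typed by the user are dropped.
    """
    terms = re.findall(r"\w+", search)
    prefix = "+" if require_all else ""
    boolean_expr = " ".join(prefix + term for term in terms)
    return terms, boolean_expr


terms, expr = parse_search("+TermA +TermB")
print(terms)       # ['TermA', 'TermB']
print(expr)        # +TermA +TermB
print(len(terms))  # 2, the value to compare the tag count against
```

The list of terms would be bound as parameters of the IN (...) list, and its length replaces the hard-coded 2.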

You can search on both text and tags.
SELECT *
FROM post
WHERE MATCH(text,tags) AGAINST('+TermA +TermB' IN BOOLEAN MODE)
To get this to work you'll need to make a FULLTEXT index for both columns together.
CREATE FULLTEXT INDEX keywords ON post(text,tags)
In Boolean search mode this should do what you want.

Related

MySQL match against sub select

I have a query on a full text column 'abstract':
SELECT title
FROM citation
WHERE MATCH(abstract)
AGAINST ('+orange' IN BOOLEAN MODE)
My question: I want to generate the AGAINST clause using results from a sub select, eg (pseudocode):
SELECT title
FROM citation
WHERE MATCH(abstract)
AGAINST (
CONCAT("'","+",
SELECT names
FROM fruits
WHERE type = "citrus"),"' "
), "IN BOOLEAN MODE"
)
Is this possible?
The argument to AGAINST needs to be a constant.
Take a look at the MySQL documentation on full-text search for details.
Your query is more likely to work with LIKE; however, using it makes the query a lot slower.
Or you could do something like
SET @finding_element = (SELECT c1.full_name FROM customer c1 WHERE c1.id = 2);
SELECT c.full_name FROM customer c WHERE MATCH(c.full_name) AGAINST(@finding_element);
(The query I have used is just for reference.)
Happy Learning
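Another way to satisfy the "constant" requirement is to run the sub select first and build the search string in application code, then bind it as a parameter. A rough Python sketch, assuming the names have already been fetched from the fruits table; `citation_query` is a hypothetical helper, and the `%s` placeholder is the style used by Python MySQL drivers such as PyMySQL and mysql-connector-python.

```python
import re


def citation_query(names, require_all=False):
    """Return (sql, params) for a boolean-mode search over the given names.

    Hypothetical helper: `names` stands in for the rows returned by
    SELECT names FROM fruits WHERE type = 'citrus'.
    """
    words = []
    for name in names:
        # Multi-word names are split into separate search words.
        words.extend(re.findall(r"\w+", name))
    prefix = "+" if require_all else ""
    against = " ".join(prefix + word for word in words)
    sql = (
        "SELECT title FROM citation "
        "WHERE MATCH(abstract) AGAINST (%s IN BOOLEAN MODE)"
    )
    return sql, (against,)


sql, params = citation_query(["orange", "lemon"])
print(params)  # ('orange lemon',)
```

The pair would then be passed to the driver, for example cursor.execute(sql, params). The bound value is fixed for the whole statement, which is what AGAINST requires.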

Is it possible to use MATCH AGAINST with COALESCE?

Is it possible to match COALESCE(x,y) from two different tables against a string?
Here is my request (not working...)
SELECT COALESCE(title_translations.title,collection.title)
LEFT JOIN title_translations ON title_translations.ref_collection=collection.id
WHERE MATCH(COALESCE(title_translations.title,collection.title)) AGAINST("string")
The request works properly if I try to match only collection.title, but it doesn't work with both.
https://dev.mysql.com/doc/refman/8.0/en/fulltext-search.html says:
MATCH() takes a comma-separated list that names the columns to be searched.
By trying to use COALESCE(), you're passing a string expression to MATCH(), not column identifiers.
That won't work.
Re your comment:
MATCH(title_translations.title,collection.title) wouldn't work anyway because the columns you list must belong to a single fulltext index, and each index belongs to one table. You can't list columns from different tables. Also you must list all the columns defined, if you defined a multi-column fulltext index.
I assume in your case, you have one fulltext index defined for the single column title in each table.
You will need this:
WHERE MATCH(title_translations.title) AGAINST('string')
OR MATCH(collection.title) AGAINST('string')
You must do this in two matching terms, so you must repeat the AGAINST in each term.
But I'm not sure if that does what you intend. It isn't clear from your original question.
Re your clarification:
If the title exists in title_translations, string needs to be matched against title_translations.title and only against it. If the title doesn't exist in title_translations, string needs to be matched against collection.title
This is what I came up with:
SELECT x.title FROM
(
SELECT IF(t.title IS NOT NULL,
IF(MATCH(t.title) AGAINST('string') > 0, t.title, NULL),
IF(MATCH(c.title) AGAINST('string') > 0, c.title, NULL)
) AS title
FROM collection AS c
LEFT OUTER JOIN title_translations AS t
ON t.ref_collection = c.id
) AS x
WHERE x.title IS NOT NULL

having with match and group_concat in mysql

I'm trying to write a MySQL query which looks for a string in an aggregation of fields.
The following query finds all the concatenations where "io sono" is present:
SELECT chapter, GROUP_CONCAT(text_search) AS aggregated_chapters
FROM bible_it_cei_2008
GROUP BY chapter
HAVING aggregated_chapters LIKE '%io sono%';
However, trying to use MATCH... AGAINST instead of LIKE:
SELECT chapter, GROUP_CONCAT(text_search) AS aggregated_chapters
FROM bible_it_cei_2008
GROUP BY chapter
HAVING MATCH ( aggregated_chapters ) AGAINST ( '+"io sono"' IN BOOLEAN MODE);
returns the error:
#1210 - Incorrect arguments to MATCH
Isn't there any way to use MATCH AGAINST with GROUP_CONCAT?
Isn't there any way to use MATCH AGAINST with GROUP_CONCAT?
No. That's not the way FULLTEXT search works in MySQL.
If your table contains the columns chapter and text_search, and you hope to find the values of chapter matching text search, you want something like this.
SELECT chapter,
MATCH(text_search) AGAINST ('+"io sono"' IN BOOLEAN MODE) AS score
FROM bible_it_cei_2008
To get this to work you'll need to create an appropriate FULLTEXT index.

MySQL wildcard Like query with multiple words

I have a mysql query as follows.
$query="SELECT name,activity FROM appid
where result > 5 AND name LIKE :term ORDER BY name ASC LIMIT 0,40";
$result = $pdo->prepare($query);
$result->bindvalue(':term','%'.$_GET["q"].'%',PDO::PARAM_STR);
$result->execute();
What I want to do is this.
I have an entry like this that I want to find
'News & Weather'
However when i type
'news weather'
it of course will not find it. How can I type that and still retrieve that entry?
Regular expressions can do the trick:
select *
from appid
where name rlike 'news|weather' -- Matches 'news' or 'weather'
Another example:
select *
from appid
where name rlike 'news.*weather' -- Matches 'news' and 'weather'
-- with any characters in-between or none at all
-- (ordered)
Just one more:
select *
from appid
where name rlike '^news.{1,}weather$' -- Matches any text that starts with 'news'
-- and has at least one character before
-- ending with 'weather'
Regular expressions can be used to create very complicated filters on text fields. See the MySQL documentation on regular expressions for more information.
If you can add a full-text index to your table, Full-text search might be the better way to go with this. Specifically, a boolean Full-Text search:
select *
from appid
where match(name) against ('+news +weather' in boolean mode)
I believe the only possible way is through code:
Option A: Replace the spaces in your query parameter with '%' in code, but that of course will make the multiple words ordered
Option B: Split your parameter on spaces and dynamically construct your query with as many LIKEs as needed, adding additional ":termN" parameters for each one.
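Option B could look roughly like the following. This is only a sketch of the string-building logic, written in Python rather than PHP; `build_like_query` is a hypothetical helper, and the `:termN` placeholders mirror the named-parameter style of the PDO code in the question.

```python
def build_like_query(q):
    """Build one LIKE condition per word, each with its own named parameter."""
    conditions = ["result > 5"]
    params = {}
    for i, term in enumerate(q.split()):
        key = f"term{i}"
        conditions.append(f"name LIKE :{key}")
        params[key] = f"%{term}%"
    sql = (
        "SELECT name, activity FROM appid WHERE "
        + " AND ".join(conditions)
        + " ORDER BY name ASC LIMIT 0, 40"
    )
    return sql, params


sql, params = build_like_query("news weather")
print(sql)
# SELECT name, activity FROM appid WHERE result > 5 AND name LIKE :term0 AND name LIKE :term1 ORDER BY name ASC LIMIT 0, 40
print(params)
# {'term0': '%news%', 'term1': '%weather%'}
```

Because every word gets its own LIKE, the words may appear in any order, so 'news weather' would find 'News & Weather' under a case-insensitive collation.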

Ordering mysql result by number of regexp matches

I've the following query. It selects all posts where the title contains the words green, blue or red.
SELECT id, title FROM post WHERE title REGEXP '(green|blue|red)'
I would like to sort the results so that the title with the most matches (all three words), and thus the most relevant one, is listed first. Is this possible in this scenario and, if so, how would I go about it?
Thanks
You must split the regex. Either to different conditions or different queries:
-- UNION ALL keeps one row per matching word, so COUNT(*) is the number of matches
SELECT COUNT(*) AS count, results.id, results.title FROM (
SELECT id, title FROM `post` WHERE `title` LIKE "%blue%"
UNION ALL SELECT id, title FROM `post` WHERE `title` LIKE "%red%"
UNION ALL SELECT id, title FROM `post` WHERE `title` LIKE "%green%"
) AS results GROUP BY results.id, results.title ORDER BY count DESC;
Note: I used LIKE instead of REGEXP, because once you split the condition you won't need REGEXP anymore, according to your example. LIKE is a bit faster than a regex, but if your pattern is more complex, you can always switch back.
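A different way to get the same ordering, not the approach of the answer above, is to add up one LIKE condition per word, since MySQL evaluates a comparison to 1 or 0. The Python sketch below only builds such a query; `build_ranked_query` is a hypothetical helper, and the `%s` placeholders assume a driver such as PyMySQL or mysql-connector-python.

```python
def build_ranked_query(words):
    """Return (sql, params) that ranks posts by how many words match the title."""
    score_parts = []
    params = []
    for word in words:
        # Each comparison contributes 1 when the title contains the word.
        score_parts.append("(title LIKE %s)")
        params.append(f"%{word}%")
    score = " + ".join(score_parts)
    sql = (
        f"SELECT id, title, {score} AS matches "
        "FROM post HAVING matches > 0 ORDER BY matches DESC"
    )
    return sql, tuple(params)


sql, params = build_ranked_query(["green", "blue", "red"])
print(params)  # ('%green%', '%blue%', '%red%')
```

This scans the table once instead of once per word, at the cost of building the expression in code.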