How to use the result of GROUP_CONCAT in an RLIKE query? - mysql

Given a table keywords that contains the words to search for, and another table titles whose title column is to be searched, I tried to GROUP_CONCAT all the words in keywords and feed the result ('w1|w2|w3|w4') into an RLIKE query as follows:
select title from titles where title rlike
(select group_concat(distinct word separator '|') from keywords) as keyword;
But the statement is rejected as a syntax error. How can I fix it (assuming full-text search is unavailable)?

That seems like a very odd thing to do. Why not just use exists?
select t.title
from titles t
where exists (select 1
from keywords kw
where t.title rlike kw.word
);
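For completeness, the original statement can probably be made to parse just by dropping the trailing alias, since a scalar subquery used as an operand of RLIKE cannot take `as keyword`. This is a minimal sketch, assuming the words contain no regex metacharacters and that the concatenated pattern fits within group_concat_max_len (1024 bytes by default, beyond which GROUP_CONCAT truncates its result):

```sql
-- Optional: raise the limit for this session if the keyword list is long.
-- The value 1000000 is an arbitrary example, not a recommendation.
SET SESSION group_concat_max_len = 1000000;

-- Same query as in the question, minus the "as keyword" alias.
SELECT title
FROM titles
WHERE title RLIKE (
    SELECT GROUP_CONCAT(DISTINCT word SEPARATOR '|')
    FROM keywords
);
```

If keywords is empty the subquery yields NULL, so the comparison is NULL and no rows come back.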

Related

MySQL: Limit the number of characters in LIKE clause?

I'm using this query in my autocomplete feature:
SELECT description FROM my_table WHERE description LIKE "%keyword%"
But this query returns the entire content of the field which is sometimes too long.
Is it possible to limit the number of characters before and after "keyword" ?
I suggest using MySQL's REGEXP operator here. For example, to accept only rows with at most 10 characters before and after keyword, you could use the query below. Note that this filters rows; it does not shorten the text that is returned.
SELECT description
FROM my_table
WHERE description REGEXP '^.{0,10}keyword.{0,10}$';
If you intend to match keyword as a standalone word, you may want to surround it with word boundaries in the regex pattern. The \\b form works on MySQL 8.0 and later (ICU regex); older versions use [[:<:]] and [[:>:]] instead:
SELECT description
FROM my_table
WHERE description REGEXP '^.{0,10}\\bkeyword\\b.{0,10}$';
To show, for example, 5 characters before and after your word, you can use RIGHT, LEFT and SUBSTRING_INDEX:
select description,
       concat(RIGHT(SUBSTRING_INDEX(description, 'keyword', 1), 5),
              'keyword',
              LEFT(SUBSTRING_INDEX(description, 'keyword', -1), 5)) as snippet
from my_table
where description like "%keyword%";
Check it here : https://dbfiddle.uk/MZcVJgEL
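The same snippet idea can also be sketched with LOCATE and SUBSTRING, which always cuts around the first occurrence of the word. This is only an illustration under the question's names (my_table, description) with a context width of 5 assumed; it is not taken from either answer:

```sql
SELECT description,
       SUBSTRING(
           description,
           -- start up to 5 characters before the first match, never before position 1
           GREATEST(LOCATE('keyword', description) - 5, 1),
           -- characters kept before the match + the match itself + 5 after it
           LEAST(LOCATE('keyword', description) - 1, 5)
               + CHAR_LENGTH('keyword') + 5
       ) AS snippet
FROM my_table
WHERE description LIKE '%keyword%';
```

SUBSTRING simply stops at the end of the string, so a match near the end of description needs no special handling.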

How to search for the number of times that keywords will appear in several text between two tables

I have 2 tables with the following columns.
articles: id (PK), article (longtext), date (date)
keywords: id (PK), keyword (varchar)
For the moment I can only search for a hard-coded keyword and display the texts where the word appears the most times:
SELECT * , MATCH (article) AGAINST ("keyword*" IN BOOLEAN MODE) AS relevance
FROM `articles`
WHERE MATCH (article) AGAINST ("keyword*" IN BOOLEAN MODE) ORDER BY relevance DESC LIMIT 10
How can I count the number of times each keyword (table: keywords) appears in each text (table: articles)?
I have tried the following (I do not know if it is even possible), but I get "Invalid argument at AGAINST":
SELECT keyword
FROM keywords
CROSS JOIN articles
WHERE MATCH (keywords.keyword)
AGAINST (articles.article IN NATURAL LANGUAGE MODE)
EDIT FOR Gordon Linoff :
Table: Keywords
Id   Keyword
1    first
2    second
3    text
4    keyword
-
Table: Articles
Id   Article
1    the first text
2    the second text
3    text text text
-
Desired results:
Keyword   score
text      5
first     1
second    1
keyword   0
Storing keywords in delimited strings is just the wrong way to store them. You should have a junction/association table with one row per keyword in each article.
That said, sometimes we are stuck with other people's really, really, really bad data models. If this is your data model, you should spend your effort fixing it rather than using it.
But, you can do this, with some string manipulations:
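-- Occurrences of each keyword = (characters removed by REPLACE) / (keyword length).
-- coalesce turns the NULL of an unmatched keyword into 0.
select mc.motcle,
       coalesce(sum( (length(d.articles) -
                      length(replace(d.articles, mc.motcle, ''))
                     ) / length(mc.motcle)
                   ), 0) as cnt
from test_motcle mc left join
     articles d
     on find_in_set(mc.motcle, replace(d.articles, ' ', ',')) > 0
group by mc.motcle
order by cnt desc;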
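-- Alternative: count the articles that contain each keyword as a whole word
-- (this counts matching articles, not occurrences within an article).
select kw.motcle, count(d.articles)
from test_motcle kw left join
articles d
on concat(' ', d.articles, ' ') like concat('% ', kw.motcle, ' %')
group by kw.motcle
ORDER BY count(d.articles) DESC
LIMIT 10;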
:)
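The answers above use their own table names (test_motcle, motcle). Against the schema given in the question (articles.article, keywords.keyword), the same replace-and-divide counting trick might be sketched as follows. It counts substrings, so "text" inside "context" would also be counted, and REPLACE is case-sensitive while LIKE usually is not under the default collation; both are limitations of this sketch rather than anything stated in the thread:

```sql
SELECT k.keyword,
       -- (characters removed by REPLACE) / (keyword length) = occurrences in one article;
       -- COALESCE turns the NULL of a keyword with no matching article into 0
       COALESCE(
           SUM((CHAR_LENGTH(a.article)
                - CHAR_LENGTH(REPLACE(a.article, k.keyword, '')))
               / CHAR_LENGTH(k.keyword)),
           0) AS score
FROM keywords k
LEFT JOIN articles a
       ON a.article LIKE CONCAT('%', k.keyword, '%')
GROUP BY k.keyword
ORDER BY score DESC;
```

With the sample rows from the question this should give the desired scores of 5 for text, 1 each for first and second, and 0 for keyword, although MySQL may display them as decimals.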

mysql multiple OR NOT LIKES

I have a wordpress plugin that essentially creates a mysql query and returns the results to wordpress.
It is user driven and so can end up in large queries with multiple NOT LIKEs which results in a very slow query.
Any suggestions that I could use to improve:
SELECT field1,field2,field3,field4
from datatable
WHERE (title NOT LIKE '%word%' AND title NOT LIKE '%word2%'
AND title NOT LIKE '%word3%' AND title NOT LIKE '%word4%'
AND title NOT LIKE '%word5%' AND title NOT LIKE '%word6%'
AND title NOT LIKE '%word7%' AND title NOT LIKE '%word8%'
AND title NOT LIKE '%word9%')
AND MATCH (title) AGAINST ("\"brandname\" " IN BOOLEAN MODE)
ORDER BY total ASC LIMIT 0,60
The customer is adding a lot of negative keywords to the wordpress plugin which results in larger queries than the one above.
This is most easily done with REGEXP. For multiple words, use a group like (one|two|three)
SELECT
field1,
field2,
field3,
field4
from datatable
WHERE
title NOT REGEXP '(word1|word2|word3|word4|word5...|word9)'
AND MATCH (title) AGAINST ("\"brandname\" " IN BOOLEAN MODE)
ORDER BY total ASC
LIMIT 0,60
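You can use a single REGEXP operation to compare all the patterns at once.
If the excluded words really do follow a pattern such as word, word2, ... word9, your query will be something like:
SELECT field1,field2,field3,field4
FROM datatable
WHERE title NOT REGEXP 'word[0-9]?'
AND MATCH(title) AGAINST ("\"brandname\" " IN BOOLEAN MODE)
ORDER BY total ASC LIMIT 0,60
The pattern is left unanchored so that, like '%word%', it matches anywhere in the title.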
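Since the query already goes through a full-text index, another option that might be worth trying is to push the negative keywords into the boolean-mode search itself with the - operator. This is only a sketch: word1 to word3 stand in for the customer's negative keywords, and full-text search matches whole indexed words (subject to the minimum token size and the stopword list), so it is not equivalent to NOT LIKE '%word%', which also matches inside longer words:

```sql
SELECT field1, field2, field3, field4
FROM datatable
-- +"brandname" : the phrase must be present
-- -wordN       : rows containing that word are excluded
WHERE MATCH (title) AGAINST ('+"brandname" -word1 -word2 -word3' IN BOOLEAN MODE)
ORDER BY total ASC
LIMIT 0, 60;
```

Whether this is acceptable depends on whether the negative keywords are meant as whole words or as substrings.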

MySQL Regexp search keyword

I use this query
SELECT keyword
FROM files
UNION SELECT keyword
FROM search
WHERE keyword
REGEXP "/(honda)|(jazz)|(manual)/"
AND keyword != "honda jazz manual"
ORDER BY keyword ASC
LIMIT 0 , 10
but I got this result
Big bang theory reference
I want to ask you guys how to use REGEXP to search for a keyword.
Please try the following:
SELECT keyword
FROM
(SELECT keyword
FROM files
UNION SELECT keyword
FROM search) allkeywords
WHERE keyword REGEXP '(honda|jazz|manual)'
AND keyword != 'honda jazz manual'
ORDER BY keyword ASC
LIMIT 0 , 10
See http://sqlfiddle.com/#!2/9341ff/5
Explanation:
(1) The unioned query needed making into a subquery to allow the WHERE clause to affect all of it.
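(2) The REGEXP syntax was slightly wrong: MySQL patterns take no / delimiters (the slashes were being matched literally), and a single pair of parentheses round the whole OR'd expression is enough, rather than one pair per item.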

MySQL Search Refinement (replace long regex with subquery)

I have a MySQL query
select query from HR_Health_Logs where query REGEXP 'CPU|MAC|PC|abacus|calculator|laptop|mainframe|microcomputer|minicomputer|machine';
Except that the regex is much longer, and contains many synonyms and misspellings.
I need to cut this short and have a table with all the synonyms and misspellings, so that I can avoid this very long query. So I'm looking for something like
select query from HR_Health_Logs where query REGEXP '**HAVE A TABLE WITH ALL MY SYNONYMS AND MISSPELLINGS SEARCHED HERE**';
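How about a subquery? Note that ANY only works with comparison operators such as = or <, so it cannot be combined with REGEXP directly; an EXISTS subquery gives the same effect:
select query from HR_Health_Logs l where exists (select 1 from misspelled m where m.correct = 'masturbate' and l.query REGEXP m.spell);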
SELECT l.query
FROM HR_Health_Logs l
JOIN synonym s ON l.query = s.synonym;
SELECT query
FROM HR_Health_Logs
WHERE query IN (
SELECT synonym AS query
FROM synonyms_table
WHERE word = 'masturbation'
UNION
SELECT misspelling AS query
FROM misspellings_table
WHERE word = 'masturbation'
)
Assuming your synonyms and misspellings are in two separate tables. Otherwise you'll only use one of the subqueries and drop the UNION.