Highlight matched words MySQL FULLTEXT index - mysql

I have a problem with a MySQL FULLTEXT index.
There is a table with words like:
ID WORDS
1 gjfjdeur,rjjghfje,rioogr
2 fkjtifdfe,lerkr
3 kfrkriti
4 frlerkti,tykitriero,frorodfl,rfjkrjr
...
N fjfjtiu,frkrker,fkdkri
There is a FULLTEXT index on the WORDS column.
The query looks like:
SELECT *
FROM `table`
WHERE match(`words`) against ('gjfjdeur,rioogr,tykitriero')
It returns:
ID WORDS
1 gjfjdeur,rjjghfje,rioogr (because it contains gjfjdeur and rioogr)
4 frlerkti,tykitriero,frorodfl,rfjkrjr (because it contains tykitriero)
Is it possible to rewrite the query to add a column that contains the found/matched words? Something like:
ID WORDS FOUND
1 gjfjdeur,rjjghfje,rioogr gjfjdeur,rioogr
4 frlerkti,tykitriero,frorodfl,rfjkrjr tykitriero
Maybe something like:
SELECT *, some_function_to_select_matched_range(`words`) as found
FROM `table`
WHERE match(`words`) against ('gjfjdeur,rioogr,tykitriero')
In reality, table.words holds hundreds of words per row, so running preg_match_all in PHP over each returned row isn't a good way to extract the matched words.
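For comparison, here is a minimal client-side sketch of the desired FOUND column, written in Python rather than PHP. It assumes the words are separated by commas and compared exactly; FULLTEXT matching is case-insensitive and applies stopword and minimum-length rules, so this is only an approximation. The helper name found_words and the sample rows are hypothetical.

```python
def found_words(words_csv: str, query_csv: str) -> str:
    """Return the query words that occur in a comma-separated WORDS value."""
    present = set(words_csv.split(","))  # built once per row, O(1) lookups
    # Keep the order of the query terms so the output is deterministic.
    return ",".join(w for w in query_csv.split(",") if w in present)


# Hypothetical rows, as returned by the MATCH ... AGAINST query.
rows = [
    (1, "gjfjdeur,rjjghfje,rioogr"),
    (4, "frlerkti,tykitriero,frorodfl,rfjkrjr"),
]
query = "gjfjdeur,rioogr,tykitriero"
result = [(row_id, words, found_words(words, query)) for row_id, words in rows]
```

A set lookup per query term avoids scanning hundreds of words with a regular expression for every row.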

Related

How to use REGEXP in mysql for matching words from a text

I have a MySQL query like:
SELECT name FROM table_name WHERE name LIKE '%custom%' limit 10
It returns 2 rows from my custom table.
I also want to get records that contain any of these substrings of the text: c cu cus cust usto stom tom om m.
I tried the query below:
SELECT name FROM table_name WHERE name like '%custom%' OR name REGEXP 'c|cu|cus|cust|usto|stom|tom|om|m' limit 10
This query returns 7 records, but they do not include the 2 records that the first query returns.
How can I get those as well? Or is there another way to get this result in MySQL?
EDIT: I also want to order the results of the second query by the number of matching substrings.
Try this:
SELECT name FROM table_name WHERE name REGEXP 'custom' limit 10;
There is no need to combine LIKE with REGEXP, but REGEXP is slower than LIKE, so REGEXP queries will be slow if your table has many records.
Try this:
SELECT name FROM table_name WHERE name REGEXP 'custom|c|cu|cus|cust|usto|stom|tom|om|m' limit 10
Here custom is combined with the rest of the patterns, so a single REGEXP covers all of them.
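To illustrate what the combined alternation matches, and the ordering by number of matching substrings asked for in the EDIT, here is a rough Python sketch. The sample names and the helper match_count are hypothetical, and the comparison is assumed to be case-insensitive.

```python
import re

SUBSTRINGS = ["custom", "c", "cu", "cus", "cust", "usto", "stom", "tom", "om", "m"]
# Same alternation as the REGEXP pattern; re.escape guards against metacharacters.
PATTERN = re.compile("|".join(re.escape(s) for s in SUBSTRINGS), re.IGNORECASE)


def match_count(name: str) -> int:
    """Number of listed substrings that occur somewhere in name."""
    lowered = name.lower()
    return sum(1 for s in SUBSTRINGS if s in lowered)


# Hypothetical sample data.
names = ["my custom table", "camera", "stomach", "xyz"]
matched = [n for n in names if PATTERN.search(n)]
ranked = sorted(matched, key=match_count, reverse=True)
```

Because the pattern is unanchored, any name containing a single c or m already matches, which is why the alternation returns many more rows than LIKE '%custom%'.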
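You need to add word boundaries, which in MySQL are [[:<:]] for the start of a word and [[:>:]] for the end of a word (MySQL 8.0.4 and later use the ICU regex library, where \b marks a word boundary instead):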
SELECT name
FROM table_name
WHERE name REGEXP '[[:<:]](c|cu|cus|cust|usto|stom|tom|om|m)[[:>:]]'
limit 10
Note the brackets around the alternation.
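The effect of the grouping can be checked outside the database. The sketch below uses Python's re module, where \b stands in for [[:<:]] and [[:>:]]; the sample names are hypothetical.

```python
import re

# Python's \b plays the role of [[:<:]] and [[:>:]] here.
WHOLE_WORD = re.compile(r"\b(?:c|cu|cus|cust|usto|stom|tom|om|m)\b", re.IGNORECASE)
# Without the group, \b binds only to the first and last alternatives.
UNGROUPED = re.compile(r"\bc|cu|cus|cust|usto|stom|tom|om|m\b", re.IGNORECASE)

names = ["tom sawyer", "customer", "vitamin c"]  # hypothetical sample data
whole_word_hits = [n for n in names if WHOLE_WORD.search(n)]
ungrouped_hits = [n for n in names if UNGROUPED.search(n)]
```

With the group, customer is rejected because none of the alternatives stands alone as a word; without it, the leading c is enough to match.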

How to use regexp

I have a 'tags' table with columns (id, link). Possible values of link:
id link
1 index
2 index/index
3 index/.*
When I get index, I need to select id 1. In general:
index -> 1
index/index -> 2
index/test -> 3
I use something like this:
SELECT * FROM tags WHERE 'index/test' REGEXP link LIMIT 1
But it returns id 1; if I remove the LIMIT, the second row is id 3. I need only the full match, that is, only id 3.
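With REGEXP, the usual form puts the column first and the pattern second. If I understand you correctly, you want to get id 3 whenever the search term is neither index nor index/index. So your regex could be something like
SELECT * FROM tags WHERE link REGEXP '[^index/index|^index]$'
Note that the square brackets make this a character class: the pattern matches any link whose last character is not one of the listed characters, which in the sample data is only index/.*.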
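A different way to get a full match is to anchor each stored pattern at both ends. The Python sketch below shows the idea with re.fullmatch; the TAGS list and the find_tag helper are hypothetical, and picking the lowest id when several patterns match is an assumption. In MySQL the same idea might be written as 'index/test' REGEXP CONCAT('^', link, '$') with ORDER BY id LIMIT 1, which is untested here.

```python
import re

# Rows of the hypothetical tags table: (id, link), where link is a regex.
TAGS = [(1, "index"), (2, "index/index"), (3, "index/.*")]


def find_tag(path: str):
    """Return the id of the first link pattern that matches the whole path."""
    for tag_id, link in sorted(TAGS):  # lowest id wins when several match
        if re.fullmatch(link, path):
            return tag_id
    return None
```

Here index/index matches both id 2 and id 3, so the ordering decides which row is returned.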

Query data using first 3 and last 3 characters from a ten character word

I have a data set of about 10K alphanumeric words, each 10 characters long. I need to match these using the first 3 characters and the last 3 characters.
Example: BGP12BR2010
In this case, I should use only BGP and 010 and see if there are any entries in my database. I have used
LEFT(replace(term_id,' ',''),3)||RIGHT(replace(term_id,' ',''),3)
Is there any other way to get this done?
You can also use LIKE:
SELECT * FROM yourTable WHERE term_id LIKE 'BGP%010';
This matches strings of any length, not only 10 characters. To fix the length, use underscores, each of which matches exactly one character:
SELECT * FROM yourTable WHERE term_id LIKE 'BGP____010';
A better way is to add two persistent virtual columns whose values the server computes; you can then index them for better performance and avoid a full table scan. (PERSISTENT is MariaDB syntax; MySQL 5.7 and later use STORED.)
Add the persistent virtual columns:
ALTER TABLE yourTable
ADD COLUMN first3 VARCHAR(5) AS (SUBSTRING(term_id,1,3)) PERSISTENT,
ADD COLUMN last3 VARCHAR(5) AS (SUBSTRING(term_id,-3,3)) PERSISTENT;
Now you can query them:
SELECT * FROM yourTable WHERE first3 IN ('BGP','YXZ','XXX') AND last3 = '010';
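The idea behind the precomputed columns, deriving the key once per row and then looking it up directly, can be sketched in Python as follows. The term ids are hypothetical 10-character values, and the dictionary stands in for an index on the two derived columns.

```python
from collections import defaultdict

# Hypothetical 10-character term ids.
term_ids = ["BGP12BR010", "BGP99ZZ010", "XYZ12BR010", "BGP12BR011"]

# Precompute the key once per row, like an indexed generated column.
index = defaultdict(list)
for term_id in term_ids:
    cleaned = term_id.replace(" ", "")
    index[(cleaned[:3], cleaned[-3:])].append(term_id)

# A lookup is now a single dictionary access instead of a scan.
hits = index.get(("BGP", "010"), [])
```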
I would do it like this:
SELECT * FROM yourtable
WHERE LENGTH(yourcolumn) = 10
AND yourcolumn LIKE 'BGP%010';
To get all the values starting with 3 letters and ending with 3 digits, use
select *
from t
where val regexp '^[a-z]{3}.+[0-9]{3}$'
To extract them, if they follow the above pattern,
select val, substring(val,1,3) as first3, substring(val,-3,3) last3,
-- concatenate them if required
concat(substring(val,1,3), substring(val,-3,3)) concatenated_string
from t
where val regexp '^[a-z]{3}.+[0-9]{3}$'
If the column has to be exactly 10 characters long, fix the length of the middle part: change the regexp to '^[a-z]{3}.{numcharactersrequired}[0-9]{3}$', which for 10 characters is '^[a-z]{3}.{4}[0-9]{3}$'.
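The fixed-length pattern can be tried out in Python before running it against the table. This sketch assumes case-insensitive matching (as with a case-insensitive collation in MySQL); the helper first3_last3 and the sample values are hypothetical.

```python
import re

# 3 letters, any 4 characters, 3 digits: 10 characters in total.
TEN_CHARS = re.compile(r"^[a-z]{3}.{4}[0-9]{3}$", re.IGNORECASE)


def first3_last3(val: str):
    """Return (first3, last3, concatenated) or None if val does not fit."""
    if not TEN_CHARS.match(val):
        return None
    return val[:3], val[-3:], val[:3] + val[-3:]
```

Note that the example value BGP12BR2010 from the question has 11 characters, so it is rejected by the 10-character pattern.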

Mysql REGEXP with . and numbers only

Consider a table category in a database, with column typeis. The datatype is varchar with values
typeis
------
2.5.1
12
1.1.1
11
letters12
.........
I want to write a query that only returns records with "." and numbers from 0-9
For example
2.5.1
1.1.1
So far, I have
select typeis from category where typeis
not in
(select typeis from category where typeis REGEXP '[^0-9 \.]+')
and typeis in
(select typeis from category where typeis REGEXP '^[0-9]+[\.]')
which seems to work. The problem is that it takes over 3 seconds for just 1500 records. I would like to make it simpler and faster with a single REGEXP instead of nested selects.
Try: ^[0-9]+\.[0-9]+(\.[0-9]+)*$
This matches values that start with digits, contain at least one dot followed by digits, and end with digits, with any number of further dot-and-digits groups.
This is pretty easy & powerful:
^([0-9]+\.*)+
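As a quick check, both suggested patterns can be run against the sample values in Python. In this sketch the first pattern carries an end anchor, which is an assumption made here so that trailing characters are rejected; the second is used exactly as written.

```python
import re

# First suggestion, with an end anchor added (an assumption of this sketch).
DOTTED = re.compile(r"^[0-9]+\.[0-9]+(\.[0-9]+)*$")
# Second suggestion, exactly as written.
LOOSE = re.compile(r"^([0-9]+\.*)+")

values = ["2.5.1", "12", "1.1.1", "11", "letters12"]
dotted_hits = [v for v in values if DOTTED.search(v)]
loose_hits = [v for v in values if LOOSE.search(v)]
```

The looser pattern also accepts plain numbers such as 12 and 11, which the question wants to exclude.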
The slow query could be caused by a missing index. Try indexing the typeis column, over its full length if possible. For example, for a varchar(255) column create an index of length 255, like:
create index index_name on table_name (column_name(length))

MySQL Search / compare Keywords in table 1 and Table 2

I have a MySQL table 1 with queries (free text) and a table 2 with synonyms and misspellings stored as CSV (comma-separated values). I want to test whether any word of a query in table 1 matches a synonym or misspelling in table 2, and then select the matching rows.
Example:
table 1 row: "i am sick of HIV AIDS , what can i do?"
table 2: HIV,AIDS,Cancer,TB,Chicken Pox ......
This row would be selected because at least one word in table 1 matches a synonym in table 2.
On a MyISAM table:
SELECT *
FROM table1 com, table2 syn
WHERE MATCH (com.body) AGAINST(syn.list IN BOOLEAN MODE);
This will work even if you don't have a FULLTEXT index on com.body, but with a FULLTEXT index it will be much faster.
If you wrap the entries of your synonym lists in double quotes, like this:
"HIV", "AIDS", "chicken pox", "swine flu"
only the whole phrases will be matched, not their individual words.
select strings.text
from table1 strings
where exists (
select 1
from table2 sm
where instr(strings.text, sm.word) <> 0
)
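The substring test in the last query can be sketched in Python to see how it behaves on the example. The function rows_with_synonym and the second sample text are hypothetical; the sketch assumes the CSV list is split on commas and compared case-insensitively.

```python
def rows_with_synonym(texts, synonym_csv):
    """Return the texts that contain at least one entry of the CSV list."""
    # Assumption: entries are separated by commas; blanks are dropped.
    words = [w.strip().lower() for w in synonym_csv.split(",") if w.strip()]
    # Substring containment, like INSTR(...) <> 0, compared case-insensitively.
    return [t for t in texts if any(w in t.lower() for w in words)]


texts = [
    "i am sick of HIV AIDS , what can i do?",
    "how do i bake bread?",
]
selected = rows_with_synonym(texts, "HIV,AIDS,Cancer,TB,Chicken Pox")
```

Because this is plain substring containment, a short entry such as TB would also match inside a longer word like football; word-boundary matching would be needed to avoid that.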