Use REGEXP in MySQL to match keywords for search engine in random order - mysql

I'm trying to use a regular expression to match a user entered search string to a title of an entry in my MySQL database.
For example, I have the following rows in a table in my database:
id | title
---+--------------------------------------------------------------
1 | IM2 - Article 3 Funky Business
2 | IM2 - Article 4 There's no Business That's not Show Business
3 | IM2 - There's no Business That's not Show Business
4 | CO4 - Life's a business
When a user searches for "IM Article Business", the following query will be executed (spaces are replaced by "(.*)" using str_replace):
SELECT * FROM mytable WHERE title REGEXP 'IM(.*)Article(.*)Business'
This will return the first 2 rows.
Now, I want it to show the same results when a user uses the same words, but in another order, for example: "Business IM Article". The results MUST contain all words entered, only the order of how the words are entered shouldn't matter.
I couldn't figure out how to do it in any way and hoped regular expressions would be the answer. I've never used them before, so does anybody know how to do this?
Thanks,
Pascal

This isn't something regular expressions are great at. Fortunately, it's something SQL is pretty good at. (I'm not going to use MySQL's REGEXP keyword, which I didn't even know existed, and will instead use the SQL-standard LIKE operator with "%" wildcards.)
select * from mytable where title like '%IM%' and title like '%Article%' and title like '%Business%'
Now title has to contain all three strings, but no order is specified, which is exactly what you want.
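On the application side, the query can be built from whatever the user typed. Here is a minimal sketch in Python rather than the PHP used in the question. The function names `escape_like` and `build_all_words_query` are made up for illustration, and the `%s` placeholders assume a MySQL driver that uses that parameter style; the table and column names come from the question.

```python
def escape_like(word):
    """Escape LIKE wildcards so user input is matched literally (backslash is MySQL's default LIKE escape)."""
    return word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def build_all_words_query(search, table="mytable", column="title"):
    """Return (sql, params) requiring every word of `search` to appear in `column`, in any order."""
    words = search.split()
    if not words:
        raise ValueError("search string contains no words")
    # One LIKE condition per word; AND makes every word mandatory.
    where = " AND ".join(f"{column} LIKE %s" for _ in words)
    # Wrap each escaped word in % so it matches anywhere in the title.
    params = ["%" + escape_like(w) + "%" for w in words]
    return f"SELECT * FROM {table} WHERE {where}", params


sql, params = build_all_words_query("Business IM Article")
print(sql)
print(params)
```

Passing the words as parameters instead of pasting them into the SQL string also avoids quoting problems when a search term contains an apostrophe.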

Related

MySQL String Comparison with Wildcards

I wrote a query where a user can input a string and get the data related to that string back from the database.
For example, a user will input Apple even though the full name is Apple Inc.
The code would be laid out as so...
and Description like '%Apple%'
The problem with this is, it will return Snapple along with Apple.
Aside from removing the first "%" wildcard and making the user type more, how can I limit the results to just Apple?
Use a regular expression:
WHERE Description RLIKE '[[:<:]]apple[[:>:]]'
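[[:<:]] matches the beginning of a word, and [[:>:]] matches the end of a word. (These markers belong to the regex library used by older MySQL versions. MySQL 8.0.4 and later use the ICU library, which does not support them, so write the word boundary as \\b inside the SQL string there instead.)
See the documentation for all the regexp operators supported by MySQL.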
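To see the word-boundary idea outside the database, here is a small sketch in Python, whose `\b` plays roughly the role of the boundary markers above. The helper name `contains_word` and the sample strings other than "Apple Inc." are assumptions for illustration only.

```python
import re


def contains_word(text, word):
    """True if `word` occurs in `text` as a whole word, ignoring case."""
    # re.escape keeps characters like "." in the search term literal.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(contains_word("Apple Inc.", "apple"))        # whole word is present
print(contains_word("Snapple Beverage", "apple"))  # only part of a longer word
```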
Firstly - string comparison with wild cards (especially leading wild cards) doesn't really scale using "like". You might want to look at full-text searching instead. This basically gives you "google-like" text searching capabilities.
To answer your question, in most cases, "Apple" is a better match than "Snapple" for the term "apple". So, you could include the concept of "match quality" in the search - something like:
select *, 10 as MatchQuality
from table
where description like 'Apple'
union
select *, 5 as MatchQuality
from table
where description like 'Apple%'
union
select *, 1 as MatchQuality
from table
where description like '%Apple%'
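One thing to be aware of: because the MatchQuality value differs per branch, UNION will not merge the copies, so a row whose description is exactly "Apple" comes back three times (once per tier). A rough sketch of the same tiering done in application code, keeping only the best tier per row and sorting best first, could look like this. The function names are hypothetical, and the lowercase comparison assumes a case-insensitive collation like the MySQL default.

```python
def match_quality(description, term):
    """Score a description: 10 = exact, 5 = starts with term, 1 = contains term, 0 = no match."""
    d, t = description.lower(), term.lower()
    if d == t:
        return 10
    if d.startswith(t):
        return 5
    if t in d:
        return 1
    return 0


def rank(descriptions, term):
    """Return matching descriptions, best match quality first."""
    scored = [(match_quality(d, term), d) for d in descriptions]
    matches = [pair for pair in scored if pair[0] > 0]
    # sorted() is stable, so equal scores keep their input order.
    return [d for _, d in sorted(matches, key=lambda pair: -pair[0])]


print(rank(["Snapple", "Apple Inc.", "Apple", "Banana"], "apple"))
```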

Search XML feed's description against a keyword from database

I'm working on a project where I use XML feeds to get input. I have to filter the items whose title or description matches specific keywords. If an item contains "smart phone" in its title or description, I have to add that item to the database under the category "Smart phone".
The query I use here is
$title = $item->title;
$desc = $item->description;
SELECT cid FROM tbl_keyword WHERE MATCH(keyword) AGAINST ('".$title." ".$desc."' IN BOOLEAN MODE);
The query returns a value, but it also picks up other rows from the database, such as smart watch and smart toys.
I want to know how to do a search that respects the space inside a keyword.
The query has to match the exact keyword.
The table looks like this:
id | cid | keyword
---+-----+------------
1 | 6 | smart phone
2 | 6 | iphone
3 | 7 | smart watch
When I get a title such as "Smart phones are not essential", the query should return only cid 6.
How can I implement this?
As far as I know, you can't just search the whole sentence ("Smart phones are not essential") against the database and get that exact result.
There are two ways to do this:
1. (Recommended) Break the sentence at the spaces ("Smart", "phones", "are", "not", "essential") and then run the following query:
select * from tbl_keyword where keyword like "%smart%" or keyword like "%phones%" or keyword like "%are%" or keyword like "%not%" or keyword like "%essential%"
This query returns a list of candidate entries from the database. You then need to compare those results with your sentence in your programming language.
2. This approach gets the entry directly from the database (but in the worst case it may rule out some important entries).
Break the sentence down into single words: ("Smart", "phones", "are", "not", "essential")
Then break it into two-word combinations: ("Smart phones", "phones are", "are not", "not essential", "essential Smart", "Smart are", "Smart not", "phones not")
Use both sets to retrieve entries from the database (this just narrows the filter):
select * from tbl_keyword where (keyword like "%smart%" or keyword like "%phone%" or keyword like "%are%" or keyword like "%not%" or keyword like "%essential%") and (keyword like "%smart phones%" or keyword like "%phones are%" or keyword like "%are not%" or keyword like "%not essential%" or keyword like "%essential smart%")
Hope this helps.
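The word and word-pair splitting from the second approach can be sketched as follows. This is Python rather than PHP, the function names are made up for illustration, the `%s` placeholders assume a driver with that parameter style, and the sketch only generates adjacent word pairs, not the wrap-around or non-adjacent combinations listed above.

```python
def word_ngrams(sentence, n):
    """Return all runs of n adjacent words from the sentence, lowercased."""
    words = sentence.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def build_keyword_filter(sentence, column="keyword"):
    """Return (where_clause, params): any single word AND any adjacent word pair must match."""
    singles = word_ngrams(sentence, 1)
    pairs = word_ngrams(sentence, 2)
    if not singles:
        raise ValueError("sentence contains no words")
    where = "(" + " OR ".join(f"{column} LIKE %s" for _ in singles) + ")"
    if pairs:  # a one-word sentence has no pairs to narrow with
        where += " AND (" + " OR ".join(f"{column} LIKE %s" for _ in pairs) + ")"
    params = [f"%{gram}%" for gram in singles + pairs]
    return where, params


where, params = build_keyword_filter("Smart phones are not essential")
print(where)
print(params)
```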

Mysql search, match words if similar

I have a database with a table called sections. It has a field called head with a full-text index, and the table holds 3 entries, each a string. Two entries contain the word "motorcycle" and one contains "motorcycles". I can't seem to find a way to return all 3 when the term "motorcycles" is searched.
I have tried
SELECT * FROM sections
WHERE MATCH (head) AGAINST ('Motorcycles')
but it only returns the plural entry. I have also tried:
SELECT * FROM sections
WHERE head like '%motorcycles%'
but that also only returns the plural entry. Is there a way to return all three rows based on "motorcycles"?
Have you tried boolean mode?
where match (head) against ('+Motorcycle*' in boolean mode)
More information is here.
Your where clause has an extra "s":
SELECT * FROM sections WHERE head like '%motorcycle%'
Assuming your question is more general than the specific motorcycle example you've given... I'm not aware of a way to relax the constraints directly in the SQL (without a stored procedure to preprocess the input). I'd suggest preprocessing your input with a regex to remove or replace the characters that make the word plural. Then use LIKE, in the way that you have shown, on the singular version of the word.
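A deliberately naive sketch of that preprocessing step, in Python: the three suffix rules below are my own assumption and cover only common English plurals (irregular forms such as "mice" are not handled, and some words will be over-trimmed), and the function names are hypothetical.

```python
import re


def naive_singular(word):
    """Strip a common English plural suffix; a rough heuristic, not a real stemmer."""
    w = word.lower()
    if re.search(r"[^aeiou]ies$", w):
        return w[:-3] + "y"   # "batteries" -> "battery"
    if re.search(r"(ch|sh|s|x|z)es$", w):
        return w[:-2]         # "boxes" -> "box"
    if re.search(r"[^s]s$", w):
        return w[:-1]         # "motorcycles" -> "motorcycle"
    return w                  # already singular, or ends in "ss"


def like_pattern(term):
    """Build the LIKE pattern for the singular form, to be passed as a query parameter."""
    return "%" + naive_singular(term) + "%"


print(like_pattern("Motorcycles"))
```

Because the singular form is a prefix of the plural here, a LIKE on the result matches both "motorcycle" and "motorcycles" rows.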
If I have understood your question correctly, I think you want something like this:
if (SELECT count(1) FROM sections WHERE head like '%motorcycles%')>1
begin
select * FROM sections
WHERE head like '%motorcycle%'
end

MySQL search within the last 5 characters in a column?

My user table has a column "name" which contains information like this:
Joe Lee
Angela White
I want to search for either first name or last name efficiently. First name is easy, I can do
SELECT * FROM user WHERE name LIKE "ABC%"
But for last name, if I do
SELECT * FROM user WHERE name LIKE "%ABC"
That would be extremely slow.
So I am thinking about counting the characters of the input, for example, "ABC" has 3 characters, and if I can search only the last three characters in name column, that would be great. So I want something like
SELECT * FROM user WHERE substring(name, end-3, end) LIKE "ABC%"
Is there anything in MySQL that can do this?
Thanks so much!
PS. I cannot do fulltext because our search engine doesn't support that.
The reason that
WHERE name LIKE '%ith'
is a slow way to look for 'John Smith' by last name is the same reason that
WHERE Right(name, Char_Length(name) - InStr(name, ' ')) LIKE 'smi%'
or any other expression on the column is slow. It defeats the use of the index for quick lookup and leaves the MySQL server doing a full table scan or full index scan.
If you were using Oracle (that is, if you worked for a formerly wealthy employer) you could use function indexes. As it is you have to add some extra columns or some other helping data to accelerate your search.
Your smartest move is to split your first and last names into separate columns. Several other people have pointed out good reasons for doing that.
If you can't do that you could try creating an extra column which contains the name string reversed, and create an index on that column. That column will have, for example, 'John Smith' stored as 'htimS nhoJ'. Then you can search as follows.
WHERE nameReversed LIKE CONCAT(REVERSE('ith'),'%')
This search will use the index and be decently fast. I've had good success with it.
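Why the reversed column helps can be shown outside the database. In this rough Python sketch a sorted list stands in for the index on nameReversed, and a binary search stands in for the index range scan. The function names and the sample name "Ann Goldsmith" are assumptions for illustration.

```python
import bisect


def build_reversed_index(names):
    """Return (keys, values): reversed lowercase names in sorted order, plus the original names."""
    pairs = sorted((name.lower()[::-1], name) for name in names)
    return [key for key, _ in pairs], [value for _, value in pairs]


def find_by_suffix(index, suffix):
    """Find names ending in `suffix` by doing a prefix lookup on the reversed keys."""
    keys, values = index
    prefix = suffix.lower()[::-1]              # 'ith' -> 'hti', like REVERSE('ith')
    start = bisect.bisect_left(keys, prefix)   # jump straight to the first candidate
    matches = []
    for i in range(start, len(keys)):
        if not keys[i].startswith(prefix):
            break                              # sorted order: no later key can match
        matches.append(values[i])
    return matches


index = build_reversed_index(["John Smith", "Joe Lee", "Angela White", "Ann Goldsmith"])
print(find_by_suffix(index, "ith"))
```

A suffix search on the original string becomes a prefix search on the reversed one, and prefix searches are the kind a sorted index can answer without scanning everything.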
You're close. In MySQL you should be able to use InStr(str, substr), Right(str, len) and Char_Length(str) to do the following:
SELECT * FROM user WHERE Right(name, Char_Length(name) - InStr(name, " ")) LIKE "ABC%"
InStr(name, " ") returns the position of the first space character (you may have to play with the " " syntax). Subtracting that position from the length of the name gives the number of characters after the space, and Right() extracts exactly those characters, i.e. the last name (basically; problems arise when you have multiple names, multiple spaces etc). LIKE "ABC%" then searches for a last name starting with ABC.
You cannot use a fixed length as you suggest, because last names longer or shorter than 3 characters would not be matched properly.
However, as Zane said, it's much better practice to use separate fields.
If it is a MyISAM table, you may use full-text search to do the same.
You can use the REGEXP operator:
SELECT * FROM user WHERE name REGEXP "ABC$"
http://dev.mysql.com/doc/refman/5.1/en/regexp.html

mysql query to match sentence against keywords in a field

I have a mysql table with a list of keywords such as:
id | keywords
---+--------------------------------
1 | apple, oranges, pears
2 | peaches, pineapples, tangerines
I'm trying to figure out how to query this table using an input string of:
John liked to eat apples
Is there a mysql query type that can query a field with a sentence and return results (in my example, record #1)?
One way to do it could be to convert apple, oranges, pears to apple|oranges|pears and use RLIKE (ie regular expression) to match against it.
For example, 'John liked to eat apples' matches the regex 'apple|oranges|pears'.
First, to convert 'apple, oranges, pears' to the regex form, replace all ', ' by '|' using REPLACE. Then use RLIKE to select the keyword entries that match:
SELECT *
FROM keywords_table
WHERE 'John liked to eat apples' RLIKE REPLACE(keywords,', ','|');
However, this does depend on your comma separation being consistent (i.e. if there is one row that looks like apples,oranges this won't work, as the REPLACE replaces a comma followed by a space, as per your example rows).
I also don't think it'll scale up very well.
And, if you have a sentence like 'John liked to eat pineapples', it would match both of the rows above (as it does have 'apple' in it). You could then try to add word boundaries to the regex (i.e. WHERE $sentence RLIKE '[[:<:]](apple|oranges|pears)[[:>:]]'), but this would screw up matching when you have plurals ('apples' wouldn't match '[wordboundary]apple[wordboundary]').
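The behaviour described above (substring false positives without boundaries, missed plurals with them) can be reproduced with a quick sketch in Python. The function names are hypothetical, Python's `\b` stands in for the MySQL word-boundary markers, and splitting on commas with trimming sidesteps the inconsistent-spacing issue.

```python
import re


def keywords_to_regex(keywords, whole_words=False):
    """Turn 'apple, oranges, pears' into the alternation apple|oranges|pears."""
    # Split on commas and trim, so "a,b" and "a, b" behave the same.
    terms = [re.escape(k.strip()) for k in keywords.split(",") if k.strip()]
    alternation = "|".join(terms)
    if whole_words:
        alternation = r"\b(?:" + alternation + r")\b"
    return re.compile(alternation, re.IGNORECASE)


def matching_rows(rows, sentence, whole_words=False):
    """Return the ids of rows whose keyword regex matches somewhere in the sentence."""
    return [row_id for row_id, keywords in rows
            if keywords_to_regex(keywords, whole_words).search(sentence)]


rows = [(1, "apple, oranges, pears"), (2, "peaches, pineapples, tangerines")]
print(matching_rows(rows, "John liked to eat apples"))
print(matching_rows(rows, "John liked to eat pineapples"))
print(matching_rows(rows, "John liked to eat apples", whole_words=True))
```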
Hopefully this isn't more abstract than what you need, but it may be a good way of doing it.
I haven't tested this, but I think it would work. If you can use PHP, you can use str_replace to turn each space into the glue between LIKE conditions such as keywords LIKE '%apple%':
$sentence = "John liked to eat apples";
// Replace every space with the text that closes one pattern and opens the next.
$sqlversion = str_replace(" ", "%' OR keywords LIKE '%", $sentence);
$finalsql = "%" . $sqlversion . "%";
$finalsql now contains:
%John%' OR keywords LIKE '%liked%' OR keywords LIKE '%to%' OR keywords LIKE '%eat%' OR keywords LIKE '%apples%
Then just combine it with your SQL statement, adding the opening and closing quotes:
$sql = "SELECT *
FROM keywords_table
WHERE keywords LIKE '" . $finalsql . "'";
Storing comma delimited data is... less than ideal.
If you broke up the string "John liked to eat apples" into individual words, you could use the FIND_IN_SET function:
WHERE FIND_IN_SET('apple', t.keywords) > 0
The performance wouldn't be great - this operation is better suited to Full Text Search.
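A sketch of how the per-word conditions might be generated, in Python. The function name is hypothetical, and the `%s` placeholders assume a driver with that parameter style. Two caveats are baked in as assumptions: FIND_IN_SET compares whole list elements, so the sample data's space after each comma is stripped with REPLACE first, and the word "apples" from the sentence will not equal the stored element "apple" unless the words are singularized beforehand.

```python
import re


def build_find_in_set_query(sentence, table="keywords_table", column="keywords"):
    """Return (sql, params) with one FIND_IN_SET condition per distinct word of the sentence."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    words = list(dict.fromkeys(words))  # drop repeats, keep first-seen order
    if not words:
        raise ValueError("sentence contains no words")
    # REPLACE removes the space after each comma so elements compare exactly.
    condition = f"FIND_IN_SET(%s, REPLACE({column}, ', ', ',')) > 0"
    where = " OR ".join(condition for _ in words)
    return f"SELECT * FROM {table} WHERE {where}", words


sql, params = build_find_in_set_query("John liked to eat apples")
print(sql)
print(params)
```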
I'm not aware of any direct solution to that type of query. But Full Text Search is a possibility. If you have a full-text index on the field of interest then a search with OR between each word in the sentence (although I think the OR operator is implied) would find that record ... but it might also find more than you want too.
I really don't think what you are looking for is completely possible but you can look into Full Text Search or SOUNDEX. SOUNDEX, for example, can do something like:
WHERE SOUNDEX(sentence) = SOUNDEX('%'+keywords+'%');
I have never tried it in this context but you should and let me know how it works out.