What method should I use in MySQL to match two or more strings against a column?
I have a column, string, which contains keywords such as apple, orange, and lemon. I need to find rows that contain both apple and orange using a regex. My query currently looks something like this:
where string regexp '(apple|orange)' and fruit = 1
The query above breaks that rule: if a row with fruit 1 contains only apple, it should not be included in the result, because fruit 1 has no other row that contains orange.
If you want to succeed when (and only when) string contains both "apple" and "orange", then the best way is to have a FULLTEXT(string) index and use:
WHERE MATCH(string) AGAINST("+apple +orange" IN BOOLEAN MODE)
This will also match "ORANGE colored apples" and a few other variants. Fulltext has some caveats, such as dealing only with "words" and not dealing with short words. But if the restrictions are OK, this will be much faster than LIKE or REGEXP.
If Fulltext will not work, then something like this is best:
WHERE string LIKE '%apple%'
AND string LIKE '%orange%'
Or it can be done with REGEXP:
WHERE string REGEXP '(apple.*orange)|(orange.*apple)'
If you need to obey word boundaries and/or allow plurals, then add that to your specification; these suggested solutions may need changing. For example, changing .* to .+ would reject "appleorange" while still allowing "apple/orange".
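To make the difference between these variants concrete, here is a small Python sketch that mirrors the logic outside the database. Python's re module stands in for MySQL's REGEXP, the comparison is case-sensitive (MySQL's usually is not, depending on collation), and the sample strings are made up for illustration, so treat it as a model of the matching logic rather than of MySQL itself.

```python
import re

# Sample values for the `string` column (made up for illustration).
rows = ["apple pie", "orange juice", "apple and orange", "orange/apple", "appleorange"]

def contains_both(text):
    # Rough equivalent of: string LIKE '%apple%' AND string LIKE '%orange%'
    return "apple" in text and "orange" in text

# The two-branch pattern: either order, anything (or nothing) in between.
either_order = re.compile(r"(apple.*orange)|(orange.*apple)")
# Same pattern with .+ instead of .*: at least one character must separate them.
separated = re.compile(r"(apple.+orange)|(orange.+apple)")
# The original attempt: succeeds when EITHER word is present.
either_word = re.compile(r"(apple|orange)")

both = [r for r in rows if contains_both(r)]
by_regex = [r for r in rows if either_order.search(r)]
by_separated = [r for r in rows if separated.search(r)]
by_either = [r for r in rows if either_word.search(r)]

print(both)          # rows containing both words
print(by_separated)  # same, minus "appleorange"
print(by_either)     # every row: this is why '(apple|orange)' is too loose
```

The alternation '(apple|orange)' accepts all five rows, while the two-branch pattern and the pair of substring tests agree with each other.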
Related
I need to search for words that contain certain number prefixes.
Example:
0119
0129
0139
0149
But there are other prefixes too: 0155859, 0128889
etc.
If I search for 0%9, I get results I don't want: it also includes the 0155859 and 0128889 ones.
I need to search for and list ONLY the ones that have 0119, etc.
How do I do it?
0XX9 (where XX is any characters that match, so 0119, 0129, etc. % matches all other characters until a 9 appears, and I don't want that.)
I'm still working on my English, so correct me if I didn't express myself clearly!
In a LIKE pattern, the _ character matches any single character. So you can do:
WHERE word LIKE '0__9%'
This matches a word that begins with 0, then any two characters, then 9, then anything after that.
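If it helps to see why _ behaves differently from %, here is a rough Python sketch that translates a LIKE pattern into a regular expression and tries it on the sample values. The like_to_regex and like helpers are hypothetical names made up for this illustration, and the translation ignores LIKE's escape character and collation rules.

```python
import re

def like_to_regex(pattern):
    # Translate a LIKE pattern: _ is exactly one character, % is any run
    # of characters (including none); everything else is literal.
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")
        elif ch == "%":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)

def like(value, pattern):
    # LIKE has to match the whole value, hence fullmatch.
    return like_to_regex(pattern).fullmatch(value) is not None

words = ["0119", "0129", "0155859", "0128889", "0119555"]
loose = [w for w in words if like(w, "0%9")]     # the pattern from the question
strict = [w for w in words if like(w, "0__9%")]  # 0, two characters, 9, then anything
exact = [w for w in words if like(w, "0__9")]    # exactly four characters

print(loose)
print(strict)
print(exact)
```

Dropping the trailing % (the exact list) keeps only the four-character values, which may be closer to what was asked for if nothing is supposed to follow the 9.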
My gut feeling at seeing your question was to consider using REGEXP, which is MySQL's regex matching operator. Try the following query:
SELECT *
FROM yourTable
WHERE word REGEXP '0[0-9][0-9]9'
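The pattern used would match any word containing a zero, followed by any two digits, followed by a 9. Because the pattern is not anchored, it can match anywhere in the word; use '^0[0-9][0-9]9' if it must match at the start.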
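Whether the pattern is anchored changes which words qualify. A quick Python sketch of that, using Python's re module as a stand-in for REGEXP and a sample list that adds two made-up values (550129 and 0119555) to the ones from the question:

```python
import re

words = ["0119", "0129", "0155859", "0128889", "550129", "0119555"]

# Unanchored, like REGEXP '0[0-9][0-9]9': the pattern may match anywhere.
anywhere = re.compile(r"0[0-9][0-9]9")
# Anchored at the start: the word must begin with 0, two digits, then 9.
prefix = re.compile(r"^0[0-9][0-9]9")
# Anchored at both ends: the word must be exactly those four characters.
whole = re.compile(r"^0[0-9][0-9]9$")

found_anywhere = [w for w in words if anywhere.search(w)]
found_prefix = [w for w in words if prefix.search(w)]
found_whole = [w for w in words if whole.search(w)]

print(found_anywhere)  # includes 550129, where the match sits in the middle
print(found_prefix)
print(found_whole)
```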
I don't want rows to be returned where the LIKE matches only part of a word. I am splitting strings on whitespace and then generating a query that will find a match, but it's returning matches for partial words. Here is an example:
SELECT ID from VideoGames WHERE Title Like "%GI%" AND Title Like "%JOE%"
Returns a match where title = "Yu-Gi-Oh! Power of Chaos: Joey the Passion".
I know that matching only full words won't completely resolve the issue, but it will hugely increase accuracy. What can I do to return what I want rather than this?
You can use RLIKE, the regular-expression version of LIKE, to get more flexibility with your matching.
SELECT ID from VideoGames
WHERE Title RLIKE "[[:<:]]GI[[:>:]]" AND Title RLIKE "[[:<:]]JOE[[:>:]]"
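The [[:<:]] and [[:>:]] markers are word boundaries marking the start and end of a word respectively. You could build a single regex rather than using AND, but I have made this match your original question. Note that MySQL 8.0.4 and later use the ICU regex library, which does not support these two markers; there, use \b instead (written as \\b inside a string literal).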
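Here is a small Python sketch of the same idea, in case you want to experiment outside the database. Python's \b plays the role of the word-boundary markers, and the second and third titles are made up for illustration.

```python
import re

titles = [
    "Yu-Gi-Oh! Power of Chaos: Joey the Passion",
    "G.I. Joe: The Rise of Cobra",
    "GI Joe",
]

def has_substring(text, word):
    # Rough equivalent of Title LIKE '%word%' under a case-insensitive collation.
    return word.lower() in text.lower()

def has_word(text, word):
    # \b marks a word boundary; re.escape keeps the search term literal.
    return re.search(r"\b" + re.escape(word) + r"\b", text, re.IGNORECASE) is not None

terms = ["GI", "JOE"]
partial = [t for t in titles if all(has_substring(t, w) for w in terms)]
whole = [t for t in titles if all(has_word(t, w) for w in terms)]

print(partial)
print(whole)
```

One thing this shows: "Gi" in "Yu-Gi-Oh" still counts as a whole word, because the hyphens are not word characters. The Yu-Gi-Oh title is rejected only because "JOE" is not a whole word inside "Joey". Punctuated spellings such as "G.I." are missed by both approaches.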
I wrote a query where a user can input a string and get the data related to that string back from the database.
For example, a user will input Apple even though the full name is Apple Inc.
The code would be laid out as so...
and Description like '%Apple%'
The problem with this is, it will return Snapple along with Apple.
Aside from removing the first "%" wildcard and making the user type more, how can I limit the results to just Apple?
Use a regular expression:
WHERE Description RLIKE '[[:<:]]apple[[:>:]]'
[[:<:]] matches the beginning of a word, [[:>:]] matches the end of a word.
See the documentation for all the regexp operators supported by MySQL.
Firstly - string comparison with wild cards (especially leading wild cards) doesn't really scale using "like". You might want to look at full-text searching instead. This basically gives you "google-like" text searching capabilities.
To answer your question, in most cases, "Apple" is a better match than "Snapple" for the term "apple". So, you could include the concept of "match quality" in the search - something like:
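-- Exact match ranks highest, then prefix match, then a match anywhere.
-- A row that satisfies several branches appears once per branch.
select *, 10 as MatchQuality
from yourTable
where description like 'Apple'
union
select *, 5 as MatchQuality
from yourTable
where description like 'Apple%'
union
select *, 1 as MatchQuality
from yourTable
where description like '%Apple%'
order by MatchQuality desc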
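The scoring idea can also be sketched in a few lines of Python, which may be easier to play with than the query. Here match_quality is a hypothetical helper, the descriptions are made up, and unlike the UNION each row gets exactly one score (its best one).

```python
def match_quality(description, term):
    # Higher score means a closer match; 0 means no match at all.
    d, t = description.lower(), term.lower()
    if d == t:
        return 10  # corresponds to: like 'Apple'   (exact)
    if d.startswith(t):
        return 5   # corresponds to: like 'Apple%'  (prefix)
    if t in d:
        return 1   # corresponds to: like '%Apple%' (anywhere)
    return 0

descriptions = ["Snapple", "Apple Inc.", "Apple", "Pear"]
scored = [(d, match_quality(d, "apple")) for d in descriptions]
# Drop non-matches and put the best matches first.
ranked = sorted((p for p in scored if p[1] > 0), key=lambda p: p[1], reverse=True)

print(ranked)
```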
I have a mysql table with a list of keywords such as:
id | keywords
---+--------------------------------
1 | apple, oranges, pears
2 | peaches, pineapples, tangerines
I'm trying to figure out how to query this table using an input string of:
John liked to eat apples
Is there a mysql query type that can query a field with a sentence and return results (in my example, record #1)?
One way to do it could be to convert apple, oranges, pears to apple|oranges|pears and use RLIKE (i.e. a regular expression) to match against it.
For example, 'John liked to eat apples' matches the regex 'apple|oranges|pears'.
First, to convert 'apple, oranges, pears' to the regex form, replace all ', ' by '|' using REPLACE. Then use RLIKE to select the keyword entries that match:
SELECT *
FROM keywords_table
WHERE 'John liked to eat apples' RLIKE REPLACE(keywords,', ','|');
However, this does depend on your comma separation being consistent: if one row looks like apples,oranges this won't work, because the REPLACE replaces a comma followed by a space (as in your example rows).
I also don't think it'll scale up very well.
And, if you have a sentence like 'John liked to eat pineapples', it would match both of the rows above (as it does have 'apple' in it). You could then try to add word boundaries to the regex (i.e. WHERE $sentence RLIKE '[[:<:]](apple|oranges|pears)[[:>:]]'), but this would screw up matching when you have plurals ('apples' wouldn't match '[wordboundary]apple[wordboundary]').
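To see these trade-offs side by side, here is a Python sketch of the same approach. It is only a model: Python's re and \b stand in for RLIKE and the word-boundary markers, to_pattern and matching_ids are hypothetical helper names, and splitting on commas with trimming replaces the REPLACE call so that inconsistent spacing does not matter.

```python
import re

keyword_rows = {
    1: "apple, oranges, pears",
    2: "peaches, pineapples, tangerines",
}

def to_pattern(keywords, whole_words=False):
    # Split on commas and trim, so 'a,b' and 'a, b' behave the same.
    words = [re.escape(w.strip()) for w in keywords.split(",") if w.strip()]
    alternation = "|".join(words)
    if whole_words:
        return re.compile(r"\b(?:" + alternation + r")\b")
    return re.compile(alternation)

def matching_ids(sentence, whole_words=False):
    return [rid for rid, kw in keyword_rows.items()
            if to_pattern(kw, whole_words).search(sentence)]

plain_apples = matching_ids("John liked to eat apples")
plain_pineapples = matching_ids("John liked to eat pineapples")
bounded_apples = matching_ids("John liked to eat apples", whole_words=True)
bounded_pineapples = matching_ids("John liked to eat pineapples", whole_words=True)

print(plain_apples, plain_pineapples)      # substring matching over-matches
print(bounded_apples, bounded_pineapples)  # boundaries fix that but lose the plural
```

Without boundaries, "pineapples" drags in row 1; with boundaries, "apples" no longer finds the keyword "apple" at all.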
Hopefully this isn't more abstract than what you need, but it may be a good way of doing it.
I haven't tested this, but I think it would work. If you can use PHP, you can use str_replace to turn the spaces between the words into a chain of Keyword LIKE '%word%' conditions:
$sentence = "John liked to eat apples";
// Escape $sentence for SQL first (or use a prepared statement) if it comes from user input.
$sqlversion = str_replace(" ", "%' OR Keyword like '%", $sentence);
$finalsql = "'%" . $sqlversion . "%'";
$finalsql will then contain:
'%John%' OR Keyword like '%liked%' OR Keyword like '%to%' OR Keyword like '%eat%' OR Keyword like '%apples%'
Then just combine it with your SQL statement:
$sql = "SELECT *
FROM keywords_table
WHERE Keyword like " . $finalsql;
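If you would rather not splice the sentence into the SQL text, the same one-LIKE-per-word query can be built with placeholders. Below is a minimal Python sketch of that, assuming an in-memory SQLite database purely as a runnable stand-in for MySQL (MySQL drivers typically use a different placeholder style); build_query is a hypothetical helper and the table mirrors the example rows.

```python
import sqlite3

def build_query(sentence):
    # One "Keyword LIKE ?" clause per word; the values travel separately
    # as parameters, so nothing from the sentence ends up in the SQL text.
    words = sentence.split()
    if not words:
        raise ValueError("sentence must contain at least one word")
    where = " OR ".join("Keyword LIKE ?" for _ in words)
    params = ["%" + w + "%" for w in words]
    return "SELECT id FROM keywords_table WHERE " + where + " ORDER BY id", params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywords_table (id INTEGER, Keyword TEXT)")
conn.executemany(
    "INSERT INTO keywords_table VALUES (?, ?)",
    [(1, "apple, oranges, pears"), (2, "peaches, pineapples, tangerines")],
)

sql, params = build_query("John liked oranges")
ids = [row[0] for row in conn.execute(sql, params)]

sql_apples, params_apples = build_query("John liked to eat apples")
ids_apples = [row[0] for row in conn.execute(sql_apples, params_apples)]

print(ids)
print(ids_apples)
```

Note what happens with the sentence from the question: '%apples%' is a substring of "pineapples" but not of "apple, oranges, pears", so this approach returns row 2 rather than row 1.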
Storing comma delimited data is... less than ideal.
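If you broke up the string "John liked to eat apples" into individual words, you could use the FIND_IN_SET function on each word:
WHERE FIND_IN_SET('apple', t.keywords) > 0
The performance wouldn't be great - this operation is better suited to Full Text Search. Also note that FIND_IN_SET compares whole comma-separated elements, so with 'apple, oranges, pears' the space after each comma would keep 'oranges' and 'pears' from matching.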
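For anyone unsure what FIND_IN_SET actually compares, here is a rough Python imitation of its core behaviour. The find_in_set function is a hypothetical stand-in written for this illustration; it ignores NULL handling and collation, and compares case-sensitively.

```python
def find_in_set(needle, str_list):
    # Split on commas only, compare whole elements, and return the
    # 1-based position of the first match, or 0 when there is none.
    if str_list == "":
        return 0
    for position, item in enumerate(str_list.split(","), start=1):
        if item == needle:
            return position
    return 0

spaced = "apple, oranges, pears"   # as stored in the question
compact = "apple,oranges,pears"    # same keywords without spaces

first = find_in_set("apple", spaced)
second_spaced = find_in_set("oranges", spaced)    # element is " oranges", with a space
second_compact = find_in_set("oranges", compact)
plural = find_in_set("apples", compact)           # no partial or plural matching

print(first, second_spaced, second_compact, plural)
```

So even with clean data, the word "apples" from the sentence would not find the keyword "apple"; the input words would need some normalisation first.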
I'm not aware of any direct solution to that type of query. But Full Text Search is a possibility. If you have a full-text index on the field of interest then a search with OR between each word in the sentence (although I think the OR operator is implied) would find that record ... but it might also find more than you want too.
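I really don't think what you are looking for is completely possible, but you can look into Full Text Search or SOUNDEX. SOUNDEX compares strings by how they sound and ignores non-alphabetic characters, so LIKE-style % wildcards have no effect inside it (and + does not concatenate strings in MySQL). It can do something like:
WHERE SOUNDEX(sentence) = SOUNDEX(keywords);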
I have never tried it in this context but you should and let me know how it works out.
I have a table with rows of strings.
I'd like to search for those strings that consist of only two words.
I tried a few ways with [[:space:]] etc., but MySQL was returning three- and four-word strings as well.
Try this:
select * from yourTable WHERE field REGEXP('^[[:alnum:]]+[[:blank:]]+[[:alnum:]]+$');
More details at this link:
http://dev.mysql.com/doc/refman/5.1/en/regexp.html
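^\w+\s\w+$ should do well. (The \w and \s shorthands need MySQL 8.0.4 or later, whose ICU regex engine supports them; inside a string literal each backslash must be doubled, as in '^\\w+\\s\\w+$'. Older versions need POSIX classes such as [[:alnum:]] and [[:space:]].)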
Note: what I have been seeing more and more lately is that almost nobody uses the ^ and $ anchors.
They are absolutely needed if you want to tell whether a string starts or ends with something, or if you want to match the string exactly, word for word, as you do. Unanchored patterns like the one you used (I assume it was something like \w[[:space:]]\w) match within the string, which means they also match if the condition is true anywhere within the string!
Keep that in mind and regex will serve you well :)
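A quick way to convince yourself of the difference is to run both versions over a few strings. The sketch below uses Python's re module rather than MySQL, and the sample strings are made up.

```python
import re

strings = ["hello world", "one two three", "single", "a  b"]

# Unanchored: succeeds if two adjacent words occur anywhere in the string.
unanchored = re.compile(r"\w+\s\w+")
# Anchored: the whole string must be word, one whitespace character, word.
anchored = re.compile(r"^\w+\s\w+$")

loose = [s for s in strings if unanchored.search(s)]
exact = [s for s in strings if anchored.search(s)]

print(loose)  # the three-word string slips through
print(exact)
```

The last sample has two spaces between the words and fails both patterns; use \s+ instead of \s if runs of whitespace should be allowed.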
REGEXP ('^[a-z0-9]+[[:space:]][a-z0-9]+$')