MySQL String Comparison with Wildcards - mysql

I wrote a query where a user can input a string and get the data related to that string back from the database.
For example, a user will input Apple even though the full name is Apple Inc.
The relevant part of the query looks like this:
and Description like '%Apple%'
The problem with this is, it will return Snapple along with Apple.
Aside from removing the first "%" wildcard and making the user type more, how can I limit the results to just Apple?

Use a regular expression:
WHERE Description RLIKE '[[:<:]]apple[[:>:]]'
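[[:<:]] matches the beginning of a word and [[:>:]] matches the end of a word. Note that MySQL 8.0.4 and later use the ICU regex library, which does not support these markers; use \\b there instead (for example, RLIKE '\\bapple\\b').
See the documentation for all the regexp operators supported by MySQL.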
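To see why word boundaries separate Apple from Snapple, here is a minimal Python sketch. It uses Python's re module, whose \b stands in for MySQL's word-boundary markers; the function name matches_whole_word is made up for illustration and is not part of MySQL.

```python
import re

def matches_whole_word(description: str, word: str) -> bool:
    """Return True if `word` occurs in `description` as a whole word."""
    # \b is a word boundary; re.escape keeps user input from being
    # interpreted as regex syntax.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, description, flags=re.IGNORECASE) is not None

print(matches_whole_word("Apple Inc.", "apple"))      # whole word present
print(matches_whole_word("Snapple Drinks", "apple"))  # only a substring
```

The first call finds "Apple" as a standalone word; the second does not, because "apple" is preceded by other letters inside "Snapple".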

First, string comparison with wildcards (especially leading wildcards) doesn't scale well with LIKE, because a leading wildcard prevents the use of an index. You might want to look at full-text searching instead, which gives you "Google-like" text searching capabilities.
To answer your question, in most cases, "Apple" is a better match than "Snapple" for the term "apple". So, you could include the concept of "match quality" in the search - something like:
select *, 10 as MatchQuality
from table
where description like 'Apple'
union
select *, 5 as MatchQuality
from table
where description like 'Apple%'
union
select *, 1 as MatchQuality
from table
where description like '%Apple%'
order by MatchQuality desc
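The "match quality" idea can also be sketched outside SQL. The Python below is only an illustration with made-up names (match_quality, ranked); it scores each description with the same three tiers and keeps only the best tier per row, whereas the UNION above returns a row once for every tier it matches.

```python
def match_quality(description: str, term: str) -> int:
    """Score a description: 10 = exact, 5 = prefix, 1 = substring, 0 = no match."""
    d, t = description.lower(), term.lower()
    if d == t:
        return 10
    if d.startswith(t):
        return 5
    if t in d:
        return 1
    return 0

def ranked(descriptions, term):
    """Return (quality, description) pairs for matching rows, best match first."""
    scored = [(match_quality(d, term), d) for d in descriptions]
    hits = [(q, d) for q, d in scored if q > 0]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)

print(ranked(["Snapple", "Apple Inc.", "Apple", "Banana"], "apple"))
```

Here "Apple" ranks first, "Apple Inc." second and "Snapple" last, while "Banana" is dropped.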

Related

Make MySQL LIKE Return Full Word matches Only

I don't want rows to be returned where the LIKE matches a partial word. I am splitting strings on whitespace and then generating a query that will find a match, but it's returning matches for partial words. Here is an example:
SELECT ID from VideoGames WHERE Title Like "%GI%" AND Title Like "%JOE%"
Returns a match where title = "Yu-Gi-Oh! Power of Chaos: Joey the Passion".
I know that matching only full words won't completely resolve the issue, but it will hugely increase accuracy. What can I do to return what I want rather than this?
You can use RLIKE, the regular expression version of LIKE, to get more flexibility with your matching.
SELECT ID from VideoGames
WHERE Title RLIKE "[[:<:]]GI[[:>:]]" AND Title RLIKE "[[:<:]]JOE[[:>:]]"
The [[:<:]] and [[:>:]] markers are word boundaries marking the start and end of a word respectively. You could build a single regex rather than the AND, but I have made this match your original question.
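Since the question mentions generating the query from a string split on whitespace, here is a rough Python sketch of that step. The function name whole_word_clause is hypothetical, and the %s placeholder style is an assumption about the database driver in use; tokens that are not purely alphanumeric are dropped so that no regex escaping is needed, which is a simplification.

```python
def whole_word_clause(search: str, column: str = "Title"):
    """Build an 'AND'-joined RLIKE clause plus its parameters from a search string."""
    # Keep only alphanumeric tokens so nothing has to be escaped (simplification).
    words = [w for w in search.split() if w.isalnum()]
    # Legacy MySQL word-boundary markers, as used in the answer above.
    patterns = ["[[:<:]]" + w + "[[:>:]]" for w in words]
    clause = " AND ".join(f"{column} RLIKE %s" for _ in words)
    return clause, patterns

clause, params = whole_word_clause("GI JOE")
print(clause)
print(params)
```

Passing the patterns as parameters rather than pasting them into the SQL string keeps user input out of the statement text.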

Mysql SELECT query on name ignoring the first word if it is "the", "a", "an" etc

I've been trying (without success) to construct a MySQL query which will select a group of records with a "title" field starting with a single alphabetical character but ignoring the first word if it's "The", "An" or "A". I've found plenty of examples that do this for the ORDER BY part of the query, but it's the initial WHERE part that I need to do it for, as the order is irrelevant if the correct records haven't been found. Using
WHERE title LIKE "R%"
will just give me titles that have this as the very first letter (e.g. "Robin Hood") but won't match "The Red House". I think I need some kind of REGEX, but I can't seem to get it to work.
So for example, given the following movie titles,
Road House
The Return of the King
Mamma Mia
Argo
Titanic
A River Runs Through it
Selecting movie titles that start with "R" would return the following:
The Return of the King
A River Runs Through it
Road House
(other fields omitted)
The easiest way is to programmatically expand the query to something like
SELECT
...
WHERE
title LIKE 'R%'
OR title LIKE 'The R%'
OR title LIKE 'A R%'
OR title LIKE 'An R%'
...
This should perform better than a REGEX, as it will be able to use an index, which a REGEX never will.
BTW: the canonical way to do this is to store the article in a separate field.
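"Programmatically expand the query" could look roughly like the Python below. This is only a sketch: title_prefix_clause and ARTICLES are made-up names, and the %s placeholders assume a driver that uses that parameter style.

```python
ARTICLES = ("The", "A", "An")

def title_prefix_clause(letter: str, column: str = "title"):
    """Build the OR-chained LIKE clause and its patterns for one starting letter."""
    # One pattern for the bare letter, plus one per leading article.
    patterns = [f"{letter}%"] + [f"{article} {letter}%" for article in ARTICLES]
    clause = " OR ".join(f"{column} LIKE %s" for _ in patterns)
    return f"({clause})", patterns

clause, params = title_prefix_clause("R")
print(clause)
print(params)
```

Every pattern has a fixed prefix and only a trailing wildcard, which is what lets an index on the title column be used.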
This regex should suit your needs:
WHERE title REGEXP '^(The |An? )?R.*$'
But as @EugenRieck noticed, since you probably have an index on the title column, you would be better off using the WHERE ... OR ... clauses.
To add to the above suggestions (both of which I agree with), for the sort you would probably need to use a CASE to derive a field for the ORDER BY clause.
SELECT somefield,
CASE
WHEN title LIKE 'R%' THEN title
WHEN title LIKE 'The R%' THEN SUBSTRING(title FROM 5)
WHEN title LIKE 'A R%' THEN SUBSTRING(title FROM 3)
WHEN title LIKE 'An R%' THEN SUBSTRING(title FROM 4)
ELSE title
END AS SortTitle
FROM sometable
ORDER BY SortTitle
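The same derived sort key can be sketched in Python to check the expected ordering. Note one difference from the CASE above: this hypothetical sort_title helper strips a leading article whatever letter follows it, not just before "R".

```python
ARTICLES = ("the ", "an ", "a ")

def sort_title(title: str) -> str:
    """Return the title without a leading 'The ', 'An ' or 'A ' for sorting."""
    lowered = title.lower()
    for article in ARTICLES:
        if lowered.startswith(article):
            return title[len(article):]
    return title

titles = ["Road House", "The Return of the King", "A River Runs Through it"]
print(sorted(titles, key=sort_title))
```

Sorting on the derived key puts "The Return of the King" first, then "A River Runs Through it", then "Road House".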

MySQL search within the last 5 characters in a column?

My user table has a column "name" which contains information like this:
Joe Lee
Angela White
I want to search for either first name or last name efficiently. First name is easy, I can do
SELECT * FROM user WHERE name LIKE "ABC%"
But for last name, if I do
SELECT * FROM user WHERE name LIKE "%ABC"
That would be extremely slow.
So I am thinking about counting the characters of the input, for example, "ABC" has 3 characters, and if I can search only the last three characters in name column, that would be great. So I want something like
SELECT * FROM user WHERE substring(name, end-3, end) LIKE "ABC%"
Is there anything in MySQL that can do this?
Thanks so much!
PS. I cannot do fulltext because our search engine doesn't support that.
The reason that
WHERE name LIKE '%ith'
is a slow way to look for 'John Smith' by last name is the same reason that
WHERE Right(name, CHAR_LENGTH(name) - InStr(name, ' ')) LIKE 'smi%'
or any other expression on the column is slow. It defeats the use of the index for quick lookup and leaves the MySQL server doing a full table scan or full index scan.
If you were using Oracle (that is, if you worked for a formerly wealthy employer) you could use function indexes. As it is you have to add some extra columns or some other helping data to accelerate your search.
Your smartest move is to split your first and last names into separate columns. Several other people have pointed out good reasons for doing that.
If you can't do that you could try creating an extra column which contains the name string reversed, and create an index on that column. That column will have, for example, 'John Smith' stored as 'htimS nhoJ'. Then you can search as follows.
WHERE nameReversed LIKE CONCAT(REVERSE('ith'),'%')
This search will use the index and be decently fast. I've had good success with it.
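The trick is that a suffix search on the original string becomes a prefix search on the reversed string. A small Python sketch of the string handling (reversed_prefix_pattern is a made-up helper name, and the search text is assumed to contain no % or _ characters):

```python
def reversed_prefix_pattern(suffix: str) -> str:
    """Turn a suffix search ('...ith') into a prefix pattern for the reversed column."""
    return suffix[::-1] + "%"

stored = "John Smith"[::-1]               # what the nameReversed column would hold
pattern = reversed_prefix_pattern("ith")  # what gets passed to LIKE

print(stored)
print(pattern)
```

Because the resulting pattern has no leading wildcard, the index on the reversed column can be used for the lookup.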
You're close. In MySQL you should be able to use InStr(str, substr) and Right(str, len) to do the following:
SELECT * FROM user WHERE Right(name, CHAR_LENGTH(name) - InStr(name, " ")) LIKE "ABC%"
InStr(name, " ") returns the position of the first space character (you may have to play with the " " syntax). Because Right() takes a length rather than a position, that position is subtracted from CHAR_LENGTH(name) to get the number of characters after the space, so Right() returns only the last name (basically; problems arise when you have multiple names, multiple spaces etc). LIKE "ABC%" would then search for a last name starting with ABC.
You cannot use a fixed index as names that are more than 3 or less than 3 characters long would not return properly as you suggest.
However, as Zane said, it's a much better practice to use separate fields.
If it is a MyISAM table, you may use full-text search to do the same.
You can use the REGEXP operator:
SELECT * FROM user WHERE name REGEXP "ABC$"
http://dev.mysql.com/doc/refman/5.1/en/regexp.html

mysql query to match sentence against keywords in a field

I have a mysql table with a list of keywords such as:
id | keywords
---+--------------------------------
1 | apple, oranges, pears
2 | peaches, pineapples, tangerines
I'm trying to figure out how to query this table using an input string of:
John liked to eat apples
Is there a mysql query type that can query a field with a sentence and return results (in my example, record #1)?
One way to do it could be to convert apple, oranges, pears to apple|oranges|pears and use RLIKE (ie regular expression) to match against it.
For example, 'John liked to eat apples' matches the regex 'apple|oranges|pears'.
First, to convert 'apple, oranges, pears' to the regex form, replace all ', ' by '|' using REPLACE. Then use RLIKE to select the keyword entries that match:
SELECT *
FROM keywords_table
WHERE 'John liked to eat apples' RLIKE REPLACE(keywords,', ','|');
However, this does depend on your comma separation being consistent: if one row looks like apples,oranges this won't work, because REPLACE only replaces a comma followed by a space (as in your example rows).
I also don't think it'll scale up very well.
And, if you have a sentence like 'John liked to eat pineapples', it would match both of the rows above (as it does have 'apple' in it). You could then try to add word boundaries to the regex (i.e. WHERE $sentence RLIKE '[[:<:]](apple|oranges|pears)[[:>:]]'), but this would screw up matching when you have plurals ('apples' wouldn't match '[wordboundary]apple[wordboundary]').
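The trade-off described above is easy to reproduce. The following Python sketch uses Python's re module as a stand-in for RLIKE, with \b in place of MySQL's word-boundary markers; keywords_to_regex and sentence_matches are hypothetical helper names. Splitting on commas and trimming also sidesteps the inconsistent-separator problem that REPLACE has.

```python
import re

def keywords_to_regex(keywords: str, whole_words: bool = False) -> str:
    """Convert 'apple, oranges, pears' into the alternation 'apple|oranges|pears'."""
    parts = [re.escape(k.strip()) for k in keywords.split(",") if k.strip()]
    alternation = "|".join(parts)
    # Optionally require word boundaries around the whole alternation.
    return rf"\b({alternation})\b" if whole_words else alternation

def sentence_matches(sentence: str, keywords: str, whole_words: bool = False) -> bool:
    """Return True if any keyword is found in the sentence."""
    pattern = keywords_to_regex(keywords, whole_words)
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None

row1 = "apple, oranges, pears"
print(sentence_matches("John liked to eat apples", row1))                    # substring hit
print(sentence_matches("John liked to eat pineapples", row1))                # false positive
print(sentence_matches("John liked to eat apples", row1, whole_words=True))  # plural is missed
```

Without boundaries "pineapples" matches row 1 by accident; with boundaries the plural "apples" no longer matches "apple".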
Hopefully this isn't more abstract than what you need, but it may be a good way of doing it.
I haven't tested this but I think it would work. If you can use PHP you can use str_replace to turn the spaces between words into a chain of Keyword LIKE '%word%' conditions:
$sentence = "John liked to eat apples";
// Escape $sentence (or use a prepared statement) before using real user input.
$sqlversion = str_replace(" ", "%' OR Keyword like '%", $sentence);
// Wrap the whole thing in quotes so the first and last patterns are valid SQL strings.
$finalsql = "'%" . $sqlversion . "%'";
Echoing $finalsql gives:
'%John%' OR Keyword like '%liked%' OR Keyword like '%to%' OR Keyword like '%eat%' OR Keyword like '%apples%'
Then just combine it with your SQL statement:
$sql = "SELECT *
FROM keywords_table
WHERE Keyword like " . $finalsql;
Storing comma delimited data is... less than ideal.
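If you broke up the string "John liked to eat apples" into individual words, you could use the FIND_IN_SET function on each word. FIND_IN_SET compares against whole list elements, so the space after each comma has to be removed first:
WHERE FIND_IN_SET('apple', REPLACE(t.keywords, ', ', ',')) > 0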
The performance wouldn't be great - this operation is better suited to Full Text Search.
I'm not aware of any direct solution to that type of query. But Full Text Search is a possibility. If you have a full-text index on the field of interest then a search with OR between each word in the sentence (although I think the OR operator is implied) would find that record ... but it might also find more than you want too.
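I really don't think what you are looking for is completely possible, but you can look into Full Text Search or SOUNDEX. SOUNDEX compares how two strings sound, so it would have to be applied to one word and one keyword at a time (wildcards have no meaning to it, and + does not concatenate strings in MySQL). With word and keyword standing for a single word from the sentence and a single stored keyword, that would be something like:
WHERE SOUNDEX(word) = SOUNDEX(keyword);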
I have never tried it in this context but you should and let me know how it works out.

Use REGEXP in MySQL to match keywords for search engine in random order

I'm trying to use a regular expression to match a user entered search string to a title of an entry in my MySQL database.
For example, I have the following rows in a table in my database:
id | title
---+-------------------------------------------------------------
1 | IM2 - Article 3 Funky Business
2 | IM2 - Article 4 There's no Business That's not Show Business
3 | IM2 - There's no Business That's not Show Business
4 | CO4 - Life's a business
When a user searches for "IM Article Business", the following query will be executed (spaces are replaced by "(.*)" using str_replace):
SELECT * FROM mytable WHERE title REGEXP 'IM(.*)Article(.*)Business'
This will return the first 2 rows.
Now, I want it to show the same results when a user uses the same words, but in another order, for example: "Business IM Article". The results MUST contain all words entered, only the order of how the words are entered shouldn't matter.
I couldn't figure out how to do it in any way and hoped regular expressions would be the answer. I've never used them before, so does anybody know how to do this?
Thanks,
Pascal
This isn't something regular expressions are great at. Fortunately, it's something SQL is pretty good at. (I'm not going to use MySQL's REGEXP keyword, which I didn't even know existed, and will instead use the standard SQL LIKE operator with "%" wildcards.)
select * from mytable where title like '%IM%' and title like '%Article%' and title like '%Business%'
Now title has to contain all three strings, but you haven't specified an order. Exactly what you want.
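To check that the order of the words really doesn't matter, here is a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (so the placeholder is ? and the LIKE behaviour is SQLite's, which is case-insensitive for ASCII), and search_all_words is a made-up function name.

```python
import sqlite3

def search_all_words(conn, search: str):
    """Return ids of rows whose title contains every word of `search`, in any order."""
    words = search.split()
    if not words:
        return []
    # One LIKE condition per word, all joined with AND.
    where = " AND ".join("title LIKE ?" for _ in words)
    params = [f"%{w}%" for w in words]
    rows = conn.execute(f"SELECT id FROM mytable WHERE {where} ORDER BY id", params)
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO mytable (id, title) VALUES (?, ?)",
    [
        (1, "IM2 - Article 3 Funky Business"),
        (2, "IM2 - Article 4 There's no Business That's not Show Business"),
        (3, "IM2 - There's no Business That's not Show Business"),
        (4, "CO4 - Life's a business"),
    ],
)

print(search_all_words(conn, "IM Article Business"))
print(search_all_words(conn, "Business IM Article"))
```

Both searches return rows 1 and 2, since each word is tested independently of the others.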