Removing single quotes from comparison in select statement - mysql

I have a table where a field can have single quotes, but I need to be able to search by that field without single quotes. For example, if the search query is "Johns favorite", I need to be able to find a row where that field contains "John's favorite". I was looking into regex for it, but that seems to return a 0 or 1 when used in a select statement, if I'm understanding it correctly.

Take a look at:
http://www.artfulsoftware.com/infotree/queries.php#552
This will give you the Levenshtein distance between two strings. For example, you can check whether the distance is less than 3, which means fewer than 3 single-character edits are required to make the two strings equal.
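To make the distance idea concrete, here is a minimal sketch of the textbook dynamic-programming Levenshtein calculation in Python. It is not the stored function from the linked page, and the function name is just an illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    # previous[j] is the distance between the already processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


distance = levenshtein("Johns favorite", "John's favorite")
print(distance)
```

The two example strings differ only by the inserted apostrophe, so a threshold such as "distance less than 3" would accept them as a match.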

Try using REPLACE:
SELECT
IF(
REPLACE("John's favorite","'","") = "Johns favorite" ,
"found",
"not found"
)
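It's not optimal (applying REPLACE to a column means an ordinary index on that column can't be used for the comparison), but it should do the job.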
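The same REPLACE call can go in a WHERE clause against the column itself. Below is a rough sketch using Python's sqlite3 with an in-memory database as a stand-in for MySQL (both accept the three-argument REPLACE used here). The table name items and the column title are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO items (title) VALUES (?)",
    [("John's favorite",), ("Johns favorite",), ("Jane's favorite",)],
)


def search_ignoring_quotes(term: str) -> list[str]:
    # Strip single quotes from both the column and the search term before comparing.
    # In SQL, '''' is a string literal holding one single quote.
    rows = conn.execute(
        "SELECT title FROM items "
        "WHERE REPLACE(title, '''', '') = REPLACE(?, '''', '') "
        "ORDER BY id",
        (term,),
    )
    return [title for (title,) in rows]


matches = search_ignoring_quotes("Johns favorite")
print(matches)
```

Passing the search term as a bound parameter also avoids having to escape quotes in the term by hand.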

Related

How do I create a SELECT conditional in MySQL where the conditional is the character length of the LIKE match?

I am working on a search function, where the matches are weighted based on certain conditions. One of the conditions I want to add weight to is matches where the character length of the query string in a LIKE match is longer than 4.
This is roughly what I want the query to look like. %s is meant to represent the actual match found by LIKE, but I don't think it does. I'm wondering if there is a special variable in MySQL that represents the precise character match found by LIKE.
SELECT help.*,
IF(CHAR_LENGTH(%s) > 4, 2, 0) w
FROM help
WHERE (
(title LIKE '%this%' OR title LIKE '%testy%' OR title LIKE '%test%') OR
(content LIKE '%this%' OR content LIKE '%testy%' OR content LIKE '%test%')
) LIMIT 1000
Edit: in the PHP I could split the search string array into two arrays based on the character length of the elements, run two separate queries that return different values for 'w', and then combine the results. I'd rather not do that, though, as it seems awkward, messy, and slow.
Check out FULLTEXT as another way to discover rows. It will be faster, but won't address your question.
This probably has the effect you want.
SELECT ....
IF ( (title LIKE '%testy%' OR
content LIKE '%testy%'), 2, 0)
....
Note that the "match" in your LIKEs includes the %, so it is the entire length of the string. I don't think that is what you wanted.
REGEXP "(this|testy|that)" will match either 4 or 5 characters (in this example). It may be possible to do something with REGEXP_REPLACE to replace that with the empty string, then see how much it shrank.
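Since the length of each search token is already known before the query runs, the weight can be decided while building the SQL. This is only a sketch of that idea in Python with sqlite3 standing in for MySQL. The sample rows are invented, and CASE is used in place of IF() because both databases accept it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE help (id INTEGER PRIMARY KEY, title TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO help (title, content) VALUES (?, ?)",
    [
        ("A testy reply", "nothing else"),
        ("About this", "a short note"),
        ("Unrelated", "no hits here"),
    ],
)


def weighted_search(tokens: list[str]) -> list[tuple[str, int]]:
    if not tokens:
        return []
    where_terms, where_params = [], []
    long_terms, long_params = [], []
    for token in tokens:
        pattern = f"%{token}%"
        where_terms.append("(title LIKE ? OR content LIKE ?)")
        where_params += [pattern, pattern]
        if len(token) > 4:
            # Only tokens longer than 4 characters contribute to the weight.
            long_terms.append("title LIKE ? OR content LIKE ?")
            long_params += [pattern, pattern]
    weight_sql = (
        f"CASE WHEN {' OR '.join(long_terms)} THEN 2 ELSE 0 END" if long_terms else "0"
    )
    sql = (
        f"SELECT title, {weight_sql} AS w FROM help "
        f"WHERE {' OR '.join(where_terms)} ORDER BY id LIMIT 1000"
    )
    # The weight placeholders appear first in the SQL text, so their values go first.
    return conn.execute(sql, long_params + where_params).fetchall()


results = weighted_search(["this", "testy", "test"])
print(results)
```

Only the row matched by the five-character token gets w = 2. Rows matched solely by shorter tokens come back with w = 0.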
I think the answer to my question is that what I wanted to do isn't possible. There is no special variable in MySQL representing the core character match in a WHERE conditional where LIKE is the operator. The match is the contents of the returned data row.
To reach my objective, I took the original dynamic list of search tokens, iterated through that list, and performed a search on each token, with the SQL tailored to the conditions that matched each token.
As I did this I built an array of the search results, using the id for the database row as the index for the array. This allowed me to perform calculations with the array elements, while avoiding duplicates.
I'm not posting the PHP code because the original question was about the SQL.
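For anyone wanting a picture of the bookkeeping, here is a hypothetical sketch in Python (not the PHP mentioned above). It assumes the per-token searches have already run and returned row ids, and that tokens longer than 4 characters are worth 2 points. Both the function name and the weights are illustrative.

```python
def combine_token_results(results_per_token: dict[str, list[int]]) -> dict[int, int]:
    """Map row id -> accumulated weight, given token -> matching row ids."""
    weights: dict[int, int] = {}
    for token, row_ids in results_per_token.items():
        token_weight = 2 if len(token) > 4 else 0
        for row_id in row_ids:
            # Keying by row id means a row found by several tokens is stored once.
            weights[row_id] = weights.get(row_id, 0) + token_weight
    return weights


combined = combine_token_results({"this": [1, 2], "testy": [2, 3], "test": [2, 3]})
print(combined)
```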

Using REGEX to alter field data in a mysql query

I have two databases, both containing phone numbers. I need to find all instances of duplicate phone numbers, but the formats of database 1 vary wildly from the format of database 2.
I'd like to strip out all non-digit characters and just compare the two 10-digit strings to determine if it's a duplicate, something like:
SELECT b.phone as barPhone, sp.phone as SPPhone FROM bars b JOIN single_platform_bars sp ON sp.phone.REGEX = b.phone.REGEX
Is such a thing even possible in a mysql query? If so, how do I go about accomplishing this?
EDIT: Looks like it is, in fact, a thing you can do! Hooray! The following query returned exactly what I needed:
SELECT b.phone, b.id, sp.phone, sp.id
FROM bars b JOIN single_platform_bars sp ON REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(b.phone,' ',''),'-',''),'(',''),')',''),'.','') = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(sp.phone,' ',''),'-',''),'(',''),')',''),'.','')
MySQL versions before 8.0 don't support returning the "match" of a regular expression. There, the REGEXP operator only returns 1 or 0, depending on whether an expression matched a regular expression test or not. (MySQL 8.0 added REGEXP_REPLACE and REGEXP_SUBSTR.)
You can use the REPLACE function to replace a specific character, and you can nest those. But it would be unwieldy for all "non-digit" characters. If you want to remove spaces, dashes, open and close parens e.g.
REPLACE(REPLACE(REPLACE(REPLACE(sp.phone,' ',''),'-',''),'(',''),')','')
One approach is to create a user-defined function that returns just the digits from a string. But if you don't want to create a user-defined function...
This can be done in native MySQL. This approach is a bit unwieldy, but it is workable for strings of "reasonable" length.
SELECT CONCAT(IF(SUBSTR(sp.phone,1,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,1,1),'')
,IF(SUBSTR(sp.phone,2,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,2,1),'')
,IF(SUBSTR(sp.phone,3,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,3,1),'')
,IF(SUBSTR(sp.phone,4,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,4,1),'')
,IF(SUBSTR(sp.phone,5,1) REGEXP '^[0-9]$',SUBSTR(sp.phone,5,1),'')
) AS phone_digits
FROM single_platform_bars sp
To unpack that a bit... we extract a single character from the first position in the string, check if it's a digit, if it is a digit, we return the character, otherwise we return an empty string. We repeat this for the second, third, etc. characters in the string. We concatenate all of the returned characters and empty strings back into a single string.
Obviously, the expression above is checking only the first five characters of the string, you would need to extend this, basically adding a line for each position you want to check...
And unwieldy expressions like this can be included in a predicate (in a WHERE clause). (I've just shown it in the SELECT list for convenience.)
Older MySQL versions (before 8.0, which added REGEXP_REPLACE) don't support such string operations natively. There you will either need to use a UDF, or else create a stored function that iterates over a string parameter, concatenating to its return value every digit that it encounters.
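As a sketch of the "function that keeps only the digits" idea, the example below registers a Python function with an in-memory SQLite database and joins on its result. SQLite is only a stand-in here. In MySQL the same role would be played by a stored function, or on 8.0 and later by REGEXP_REPLACE(phone, '[^0-9]', ''). The function name digits_only and the sample phone numbers are invented.

```python
import re
import sqlite3


def digits_only(value):
    """Return only the ASCII digits of value; NULL (None) stays NULL."""
    if value is None:
        return None
    return re.sub(r"[^0-9]", "", value)


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the same (hypothetical) name.
conn.create_function("digits_only", 1, digits_only)
conn.execute("CREATE TABLE bars (id INTEGER PRIMARY KEY, phone TEXT)")
conn.execute("CREATE TABLE single_platform_bars (id INTEGER PRIMARY KEY, phone TEXT)")
conn.executemany(
    "INSERT INTO bars (phone) VALUES (?)",
    [("(212) 555-0100",), ("212.555.0199",)],
)
conn.executemany(
    "INSERT INTO single_platform_bars (phone) VALUES (?)",
    [("212-555-0100",), ("646 555 0123",)],
)

duplicates = conn.execute(
    "SELECT b.phone, sp.phone FROM bars b "
    "JOIN single_platform_bars sp ON digits_only(sp.phone) = digits_only(b.phone)"
).fetchall()
print(duplicates)
```

On large tables it may be worth storing the normalized digits in their own indexed column instead of computing them during the join.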

How to write MySQL query where A contains ( "a" or "b" )

I must use this format where A operand B. A is the field; I want B to be either "text 1" or "text 2", so if A has data like "text 1 texttext" or "texttext 2" , the query will have result.
But how do I write this? Does MySQL support something like
where A contains ('text 1' OR 'text 2')?
Two options:
Use the LIKE keyword, along with percent signs in the string
select * from table where field like '%a%' or field like '%b%'.
(note: If your search string contains percent signs, you'll need to escape them)
If you're looking for a more complex combination of strings than you've specified in your example, you could use regular expressions (regex):
See the MySQL manual for more on how to use them: http://dev.mysql.com/doc/refman/5.1/en/regexp.html
Of these, using LIKE is the most usual solution -- it's standard SQL, and in common use. Regex is less commonly used but much more powerful.
Note that whichever option you go with, you need to be aware of possible performance implications. Searching for sub-strings like this will mean that the query will have to scan the entire table. If you have a large table, this could make for a very slow query, and no amount of indexing is going to help.
If this is an issue for you, and you're going to need to search for the same things over and over, you may prefer to do something like adding a flag field to the table which specifies that the string field contains the relevant sub-strings. If you keep this flag field up-to-date when you insert or update a record, you can simply query the flag when you want to search. The flag can be indexed, which would make your query much quicker. Whether it's worth the effort is up to you; it depends on how bad the performance is using LIKE.
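A minimal sketch of the flag-field idea, using Python's sqlite3 in memory as a stand-in for MySQL. The table name my_table, the column has_keyword, and the keyword list are assumptions for illustration. The flag is computed in application code at insert time here, though a trigger could do the same job.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "id INTEGER PRIMARY KEY, a TEXT, has_keyword INTEGER NOT NULL DEFAULT 0)"
)
# The flag column is indexable, unlike a LIKE '%...%' search.
conn.execute("CREATE INDEX idx_my_table_has_keyword ON my_table (has_keyword)")

KEYWORDS = ("text 1", "text 2")


def insert_row(a: str) -> None:
    # Work out the flag once, when the row is written.
    # Note: Python's `in` is case-sensitive, unlike LIKE under many MySQL collations.
    flag = int(any(keyword in a for keyword in KEYWORDS))
    conn.execute("INSERT INTO my_table (a, has_keyword) VALUES (?, ?)", (a, flag))


for value in ("text 1 texttext", "texttext 2", "nothing relevant"):
    insert_row(value)

flagged = [
    a for (a,) in conn.execute("SELECT a FROM my_table WHERE has_keyword = 1 ORDER BY id")
]
print(flagged)
```

Any code path that updates column a would need to recompute the flag as well, otherwise the two drift apart.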
You can write your query like so:
SELECT * FROM MyTable WHERE (A LIKE '%text1%' OR A LIKE '%text2%')
The % is a wildcard, meaning that it searches for all rows where column A contains either text1 or text2
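Because % (and _) are wildcards, a search term that itself contains one of them has to be escaped before it is wrapped in %...%. Here is a hedged sketch in Python with sqlite3 standing in for MySQL. The helper escape_like and the sample rows are made up. SQLite needs the explicit ESCAPE clause shown, whereas MySQL already treats backslash as the default LIKE escape character.

```python
import sqlite3


def escape_like(term: str, escape_char: str = "\\") -> str:
    """Escape the LIKE wildcards % and _ (and the escape character itself)."""
    return (
        term.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (a TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?)", [("50% off",), ("500 off",)])


def contains(term: str) -> list[str]:
    pattern = f"%{escape_like(term)}%"
    rows = conn.execute(
        # The SQL text sent to SQLite reads: ... LIKE ? ESCAPE '\' ...
        "SELECT a FROM my_table WHERE a LIKE ? ESCAPE '\\' ORDER BY a",
        (pattern,),
    )
    return [a for (a,) in rows]


literal_percent = contains("50%")
print(literal_percent)
```

Without the escaping step, the pattern for "50%" would be %50%%, which also matches "500 off".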
Most of the time I've used the LIKE option, and it works just fine.
I'd just like to share one of my latest experiences, where I used the INSTR function. Regardless of the reasons that made me consider this option, what's important here is that the usage is similar:
instr(A, 'text 1') > 0 or instr(A, 'text 2') > 0
Another option could be:
(instr(A, 'text 1') + instr(A, 'text 2')) > 0
I'd go with the A LIKE '%text1%' OR A LIKE '%text2%' option, but if that doesn't suit you, I hope this alternative helps.
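I use this for searching motorcycle tire sizes.
For example:
Data = "Tire cycle size 70 / 90 - 16"
I can search with "70 90 16":
// Split on whitespace and common separators; the hyphen is escaped so it is not read as a range.
$searchTerms = preg_split("/[\s,\-\/?!]+/", $itemName);
$searchTermBits = [];
foreach ($searchTerms as $term) {
    $term = trim($term);
    // Compare against '' because empty("0") is true and would drop a term of "0".
    if ($term !== '') {
        // NOTE: $term is interpolated directly; escape it or use a prepared statement in real code.
        $searchTermBits[] = "name LIKE '%$term%'";
    }
}
$query = "SELECT * FROM item WHERE " . implode(' AND ', $searchTermBits);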
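The same split-and-AND approach can be written with bound parameters, so the search terms never get pasted into the SQL string. The sketch below is in Python with sqlite3 standing in for MySQL, not the PHP above. The function name build_item_query and the second sample row are invented.

```python
import re
import sqlite3


def build_item_query(item_name: str) -> tuple[str, list[str]]:
    # Split on whitespace and the separators , - / ? ! and drop empty pieces.
    terms = [t for t in re.split(r"[\s,\-/?!]+", item_name) if t]
    if not terms:
        return "SELECT name FROM item", []
    where = " AND ".join("name LIKE ?" for _ in terms)
    return f"SELECT name FROM item WHERE {where}", [f"%{t}%" for t in terms]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT)")
conn.executemany(
    "INSERT INTO item VALUES (?)",
    [("Tire cycle size 70 / 90 - 16",), ("Tire cycle size 80 / 90 - 17",)],
)

sql, params = build_item_query("70 90 16")
found = [name for (name,) in conn.execute(sql, params)]
print(found)
```

Every term must appear somewhere in the name, in any order, which is why "70 90 16" finds the row even though the stored value has slashes and a dash between the numbers.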

MySQL query - select postcode matches

I need to make a selection based on the first 2 characters of a field, so for example
SELECT * from table WHERE postcode LIKE 'rh%'
But this would select any record that contains those 2 characters at any point in the "postcode" field, right? I need a query that matches only on the first 2 characters. Any pointers?
Thanks
Your query is correct. It searches for postcodes starting with "rh".
In contrast, if you wanted to search for postcodes containing the string "rh" anywhere in the field, you would write:
SELECT * from table WHERE postcode LIKE '%rh%'
Edit:
To answer your comment, you can use either or both % and _ for relatively simple searches. As you have noticed already, % matches any number of characters whereas _ matches a single character.
So, in order to match postcodes starting with "RHx " (where x is any character) your query would be:
SELECT * from table WHERE postcode LIKE 'RH_ %'
(mind the space after _). For more complex search patterns, you need to read about regular expressions.
Further reading:
http://dev.mysql.com/doc/refman/5.1/en/pattern-matching.html
http://dev.mysql.com/doc/refman/5.1/en/regexp.html
LIKE '%rh%' will return all rows with 'rh' anywhere
LIKE 'rh%' will return all rows with 'rh' at the beginning
LIKE '%rh' will return all rows with 'rh' at the end.
If you want to get only first two characters 'rh', use MySQL SUBSTR() function
http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_substr
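A quick way to convince yourself how the position of % changes the result is to try each pattern against a handful of values. This is a sketch using Python's sqlite3 in memory as a stand-in for MySQL. The table name addresses and the postcodes are invented. SQLite's LIKE ignores ASCII case by default, which resembles MySQL's behaviour under a case-insensitive collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addresses (postcode TEXT)")
conn.executemany(
    "INSERT INTO addresses VALUES (?)",
    [("RH1 2AB",), ("RH10 9XY",), ("BRH 1AA",), ("AB1 2RH",)],
)


def postcodes_like(pattern: str) -> list[str]:
    rows = conn.execute(
        "SELECT postcode FROM addresses WHERE postcode LIKE ? ORDER BY postcode",
        (pattern,),
    )
    return [postcode for (postcode,) in rows]


starts_with = postcodes_like("rh%")      # 'rh' at the beginning only
anywhere = postcodes_like("%rh%")        # 'rh' at any position
ends_with = postcodes_like("%rh")        # 'rh' at the end only
one_char_then_space = postcodes_like("RH_ %")  # RH, exactly one character, a space

print(starts_with, anywhere, ends_with, one_char_then_space)
```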
Dave, your way seems correct to me (and works on my test data). Using a leading % as well will match anywhere in the string which obviously isn't desirable when dealing with postcodes.

MySql Not Like Regexp?

I'm trying to find rows where the first character is not a digit. I have this:
SELECT DISTINCT(action) FROM actions
WHERE qkey = 140 AND action NOT REGEXP '^[:digit:]$';
But, I'm not sure how to make sure it checks just the first character...
First there is a slight error in your query. It should be:
NOT REGEXP '^[[:digit:]]'
Note the double square brackets. You could also rewrite it as the following to avoid also matching the empty string:
REGEXP '^[^[:digit:]]'
Also note that using REGEXP prevents an index from being used and will result in a table scan or index scan. If you want a more efficient query you should try to rewrite the query without using REGEXP if it is possible:
SELECT DISTINCT(action) FROM actions
WHERE qkey = 140 AND action < '0'
UNION ALL
SELECT DISTINCT(action) FROM actions
WHERE qkey = 140 AND action >= ':'
Then add an index on (qkey, action). It's not as pleasant to read, but it should give better performance. If you only have a small number of actions for each qkey then it probably won't give any noticeable performance increase, so you can stick with the simpler query.
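To see how the three predicates relate, here is a small Python sketch that mirrors them with the re module and plain string comparison. It is an approximation: Python compares by code point, while MySQL's result for < and >= depends on the column's collation. The sample values are made up.

```python
import re

STARTS_WITH_DIGIT = re.compile(r"^[0-9]")
STARTS_WITH_NON_DIGIT = re.compile(r"^[^0-9]")


def not_starting_with_digit(value: str) -> bool:
    # Mirrors NOT REGEXP '^[[:digit:]]': also true for the empty string.
    return STARTS_WITH_DIGIT.search(value) is None


def starting_with_non_digit(value: str) -> bool:
    # Mirrors REGEXP '^[^[:digit:]]': requires at least one character.
    return STARTS_WITH_NON_DIGIT.search(value) is not None


def range_check(value: str) -> bool:
    # Mirrors action < '0' OR action >= ':' (':' is the character right after '9' in ASCII).
    return value < "0" or value >= ":"


samples = ["", "login", "9lives", "7", ":colon", "/slash"]
for sample in samples:
    print(repr(sample), not_starting_with_digit(sample),
          starting_with_non_digit(sample), range_check(sample))
```

In this sketch the range check agrees with the NOT REGEXP form on every sample, including the empty string, while the negated character class is the one variant that rejects the empty string.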
Your current regex is anchored at both ends, so it only matches values that are exactly one character long rather than checking just the first character. Remove the $ from the end of it, since that means "end of value". It'll then only check the first character unless you tell it to check more.
^[[:digit:]] will work (the [:digit:] class needs its own pair of brackets inside the bracket expression); that means "start of the value, followed by one digit".