/*Usage example: This function takes S. O. L. I. D. as input and returns SOLID. And similarly removes single quotes, hyphens and slashes from input*/
CREATE DEFINER=`root`@`localhost` FUNCTION `SanitiseNameForSearch`(Name nvarchar(100)) RETURNS varchar(100) CHARSET utf8
BEGIN
RETURN REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(Name, ' ', ''), '.', ''), '''', ''), '-', ''), '/', '');
END
I use this function in a procedure, applying it to both the search input and the column. It works fine, but it is definitely not scalable.
CREATE DEFINER=`root`@`localhost` PROCEDURE `Search`(SearchFilter nvarchar(20))
BEGIN
SET @SearchFilter = `SanitiseNameForSearch`(SearchFilter);
SELECT t.TermId, t.Name
FROM Terminology AS t
WHERE `SanitiseNameForSearch`(Name) Like @SearchFilter
ORDER BY length(Name) asc
LIMIT 5;
END;
Is it ideal to implement this functionality via function or add a separate column/table that holds the column values after the function is applied i.e. hold precalculated value of SanitiseNameForSearch(Name) so that it can be indexed?
LIKE with a leading wildcard ('%...') cannot use an index; it will do a table scan. That is the main reason the query is "non scalable". The function calls are insignificant in comparison.
So, what to do? Look at FULLTEXT to see if you can use it. It will probably require changing expectations of the users -- it looks for whole words, not arbitrary substrings. But it is much faster and scalable.
Using FULLTEXT would obviate the need for your Sanitize function.
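If arbitrary substring matching has to stay, the precalculated value asked about in the question could also be produced in application code whenever a row is written. A minimal sketch in Python, mirroring the characters that SanitiseNameForSearch strips; the name sanitise_name_for_search is hypothetical and not part of the original schema:

```python
# Characters removed by the SanitiseNameForSearch function in the question:
# space, period, single quote, hyphen and slash.
_STRIP_TABLE = str.maketrans("", "", " .'-/")


def sanitise_name_for_search(name: str) -> str:
    """Return `name` with the search-irrelevant characters removed."""
    return name.translate(_STRIP_TABLE)


if __name__ == "__main__":
    # Value that would be stored in the extra, indexable column.
    print(sanitise_name_for_search("S. O. L. I. D."))
```

An index on such a column helps prefix searches (LIKE 'abc%'); a pattern with a leading wildcard still forces a scan.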
Related
I am working on a legacy system a client has. Phone numbers are stored in a multitude of ways. Ex:
514-879-9989
514.989.2289
5147899287
The client wants to be able to search the database by phone number.
How could this be achieved without normalizing the data stored in the database? Is this possible?
I am wondering if it is possible to have a query that looks like:
SELECT * FROM table WHERE phonenumber LIKE '%input%'
but that takes into account only the numerical characters in the db?
$sql = "SELECT * FROM tab
WHERE replace(replace(phone, '.', ''), '-', '') like '%". $input ."%'"
Yes, you can add more REPLACE calls according to the values in your table, as mentioned by @spencer7593, e.g.:
$sql = "SELECT * FROM tab
WHERE replace(replace(replace(replace(replace(replace(phone, '.', ''), '-', ''), '+', ''), '(', ''), ')', ''), ' ', '') like '%". $input ."%'"
but I would prefer to clean up the data before the query.
Given the constraints (no "normalized" digits-only value available, and no option of adding a column to hold one), this is the approach I would take.
I would take the user input for the search and add wildcards in strategic locations. For example, if the user provides search input of 3155551212, then I'd run a query that has a predicate equivalent to this:
phonenumber LIKE '%315%555%1212%'
But if I'm not guaranteed that the provided search digits will be a full three digit area code, a three digit exchange (central office) code, and a four digit line number, for a broader search, I'd add wild cards between all of the provided digits, e.g.
phonenumber LIKE '%3%1%5%5%5%5%1%2%1%2%'
This latter approach is less than ideal, because it could potentially return matches that aren't intended, especially if the user provides fewer than ten digits. For example, consider a phonenumber value:
'+1 (315) 555-7172 ext. 123'
As a demonstration (the first expression returns 1, an unintended match; the second returns 0):
SELECT '+1 (315) 555-7172 ext. 123' LIKE '%3%1%5%5%5%5%1%2%1%2%'
, '+1 (315) 555-7172 ext. 123' LIKE '%315%555%1212%'
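The two wildcard strategies can be sketched in application code as follows. This is only an illustration in Python: the helper names (digits_only, grouped_pattern, spread_pattern, like_match) are made up for this sketch, and like_match merely imitates LIKE for patterns that contain nothing but digits and '%'.

```python
import re
from fnmatch import fnmatchcase


def digits_only(text: str) -> str:
    """Drop every character that is not an ASCII digit."""
    return re.sub(r"[^0-9]", "", text)


def grouped_pattern(search: str) -> str:
    """Ten digits -> '%AAA%EEE%LLLL%' (area code, exchange, line number)."""
    d = digits_only(search)
    if len(d) != 10:
        raise ValueError("expected exactly ten digits")
    return f"%{d[:3]}%{d[3:6]}%{d[6:]}%"


def spread_pattern(search: str) -> str:
    """Put a wildcard between all digits: '315' -> '%3%1%5%'."""
    d = digits_only(search)
    return "%" + "%".join(d) + "%" if d else "%"


def like_match(value: str, pattern: str) -> bool:
    """Rough stand-in for SQL LIKE, valid only for digit-and-% patterns."""
    return fnmatchcase(value, pattern.replace("%", "*"))


if __name__ == "__main__":
    stored = "+1 (315) 555-7172 ext. 123"
    print(like_match(stored, spread_pattern("3155551212")))   # broad pattern
    print(like_match(stored, grouped_pattern("3155551212")))  # grouped pattern
```

In a real query the generated pattern would be passed as a bound parameter rather than concatenated into the SQL string.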
Before MySQL 8.0, which added REGEXP_REPLACE, there's no built-in string function in MySQL that will extract the digit characters from a string.
If you want a function that does that, e.g.
SELECT only_digits_from('+1 (315) 555-7172 ext. 123')
to return
13155557172123
You'd have to create a stored function that does that. I wouldn't attempt doing it inline in the SQL statement, that would require an atrociously long and ugly expression.
This is a piece of code I frequently use to clean up database columns. I have modified it to fit your purpose.
UPDATE Table SET Column =
replace(
replace(
replace(Column, '-', ''),
'.', ''),
' ', '')
WHERE Column IS NOT NULL
I'm not a database guy, but I have mixed-up data in a MySQL database that I inherited.
Some phone numbers are formatted (512) 555-1212 (call it dirty)
Others 5125551212 (call it clean)
I need a SQL statement that says
UPDATE table_name
SET Phone = 'clean' (some sort of cleaning code - regex?)
WHERE Phone = 'dirty'
Unfortunately there's no regex replace/update in MySQL before 8.0. If it's just parentheses and dashes and spaces then some nested REPLACE calls will do the trick:
UPDATE table_name
SET Phone = REPLACE(REPLACE(REPLACE(REPLACE(Phone, '-', ''), ')', ''), '(', ''), ' ', '')
To my knowledge you can't use a regexp to replace data during an UPDATE; you can only match with one in a SELECT statement.
Your best bet is to use a scripting language that you're familiar with to read the table and change it that way: retrieve all the entries, use a string replace with a simple regexp such as [^\d]* to remove the unwanted characters, then update the table with the new value.
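A minimal sketch of that read, clean and write-back loop in Python. The contacts(id, phone) table is hypothetical, and the standard-library sqlite3 module is used only as a stand-in so the sketch runs without a server; with a MySQL driver the connection object and the placeholder style (typically %s instead of ?) would differ.

```python
import re
import sqlite3


def clean_phone(raw: str) -> str:
    """Keep only the ASCII digits of a phone number."""
    return re.sub(r"[^0-9]", "", raw)


def clean_all_phones(conn: sqlite3.Connection) -> int:
    """Rewrite every dirty phone value in place; return how many rows changed."""
    rows = conn.execute(
        "SELECT id, phone FROM contacts WHERE phone IS NOT NULL"
    ).fetchall()
    # Only rows whose cleaned value differs from the stored one need an UPDATE.
    updates = [
        (clean_phone(phone), row_id)
        for row_id, phone in rows
        if clean_phone(phone) != phone
    ]
    conn.executemany("UPDATE contacts SET phone = ? WHERE id = ?", updates)
    conn.commit()
    return len(updates)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, phone TEXT)")
    conn.executemany(
        "INSERT INTO contacts (id, phone) VALUES (?, ?)",
        [(1, "(512) 555-1212"), (2, "5125551212")],
    )
    print(clean_all_phones(conn), "row(s) cleaned")
```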
Also, see this answer:
How to do a regular expression replace in MySQL?
I have a column named phone_number in a database table. Right now the numbers stored in the table are formatted like +91-852-9689568. I want to reformat them so that only the digits remain.
How can I do it in MySQL? I have tried using functions like REGEXP, but I get an error saying the function does not exist. And I don't want to use multiple REPLACE calls.
One option is to use MySQL's SUBSTRING (as long as the format doesn't change):
SELECT concat(SUBSTRING(pNo,2,2), SUBSTRING(pNo,5,3), SUBSTRING(pNo,9,7));
If you want to format via projection only, use SELECT. You will only need to use REPLACE twice, and there is no problem with that.
SELECT REPLACE(REPLACE(columnName, '-', ''), '+', '')
FROM tableName
Otherwise, if you want to update the value permanently, use UPDATE:
UPDATE tableName
SET columnName = REPLACE(REPLACE(columnName, '-', ''), '+', '')
MySQL (before 8.0) does not have a built-in function for pattern matching and replace.
You'll be better off fetching the whole string back to your application, and then using a more flexible string-manipulation function on it. For instance, preg_replace() in PHP.
Try the following and comment please.
Select dbo.Regex('\d+',pNo);
Select dbo.Regex('[0-9]+',pNo);
Reference on Rubular.
MySQL is not like Oracle, so you may just use a user-defined function to get the numbers. This could get you going.
I need to do the following and I'm struggling with the syntax:
I have a table called 'mytable' and a column called 'mycolumn' (string).
In mycolumn the value is a constructed value - for example, one of the values is: 'first:10:second:18:third:31'. The values in mycolumn are all using the same pattern, just the ids/numbers are different.
I need to change the value of 18 (in this particular case) to a value from another table's key. The end result for this column value should be 'first:10:second:22:third:31' because I replaced 18 with 22. I got the 22 from another table using the 18 as a lookup value.
So ideally I would have the following:
UPDATE mytable
SET mycolumn = [some regex function to find the number between 'second:' and ":third" -
let's call that oldkey - and replace it with other id from another table -
(select otherid from tableb where id = oldkey)].
I know MySQL has a REPLACE function, but that doesn't get me far enough.
You can create your own function. I am scared of REGEX so I use SUBSTRING and SUBSTRING_INDEX.
CREATE FUNCTION SPLIT_STRING(str VARCHAR(255), delim VARCHAR(12), pos INT)
RETURNS VARCHAR(255)
RETURN REPLACE(SUBSTRING(SUBSTRING_INDEX(str, delim, pos),
LENGTH(SUBSTRING_INDEX(str, delim, pos-1)) + 1),
delim, '');
SPLIT_STRING('first:10:second:18:third:31', ':', 4)
returns 18
Based on this answer:
Equivalent of explode() to work with strings in MySQL
The problem with MySQL (at least before 8.0, which added REGEXP_REPLACE) is that its REGEX flavor is very limited and does not support back references or regex replace, which pretty much makes it impossible to replace the value the way you want with MySQL alone.
I know it means taking a speed hit, but you may want to consider selecting the row you want by its id (or however you select it), modifying the value with PHP or whatever language you have interfacing with MySQL, and putting it back with an UPDATE query.
Generally speaking, REGEX in programming languages is much more powerful.
If you keep those queries slim and quick, you shouldn't take too big of a speed hit (probably negligible).
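A minimal sketch of the "modify the value in the application" step, written in Python here (any language with a full regex engine would do). The function name replace_second_id is made up for this sketch, and the lookup dict stands in for the `select otherid from tableb where id = oldkey` query from the question:

```python
import re

# Captures the number that sits between 'second:' and ':third'.
_SECOND_ID = re.compile(r"(second:)(\d+)(:third)")


def replace_second_id(value: str, lookup: dict) -> str:
    """Swap the id after 'second:' for its counterpart in `lookup`.

    Ids that are missing from `lookup` are left unchanged.
    """
    def swap(match: re.Match) -> str:
        old_key = int(match.group(2))
        new_key = lookup.get(old_key, old_key)
        return f"{match.group(1)}{new_key}{match.group(3)}"

    return _SECOND_ID.sub(swap, value, count=1)


if __name__ == "__main__":
    # 18 -> 22 is the mapping used as the example in the question.
    print(replace_second_id("first:10:second:18:third:31", {18: 22}))
```

The returned string would then be written back with an ordinary parameterised UPDATE.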
Also, here is documentation on what MySQL's REGEX CAN do. http://dev.mysql.com/doc/refman/5.1/en/regexp.html
Cheers
EDIT:
To be honest, eggyal's comment makes a whole lot more sense for your situation (simple int values). Just break them up into columns; there's no reason to store and access them like that at all, IMO.
You want something like this, where it matches the group:
WHERE mycolumn REGEXP 'second:([0-9]*):third'
However, MySQL before 8.0 doesn't have a regex replace function, so you would have to use a user-defined function:
REGEXP_REPLACE(text, pattern, replace [,position [,occurence [,return_end [,mode]]]])
User-defined function is available here:
http://www.mysqludf.org/lib_mysqludf_preg/
Suppose I have the following comma-delimited column value in MySQL: foo,bar,baz,bar,foo2
What is the best way to replace whatever is in the 4th position (in this case bar) of this string with barAAA (so that we change foo,bar,baz,bar,foo2 to foo,bar,baz,barAAA,foo2)? Note that bar occurs both in position 2 as well as position 4.
I know that I can use SUBSTRING_INDEX() in MySQL to get the value of whatever is in position 4, but have not been able to figure out how to replace the value in position 4 with a new value.
I need to do this without creating a UDF or stored function, via using only the standard string functions in MySQL (http://dev.mysql.com/doc/refman/5.5/en/string-functions.html).
Hmm... maybe this?
SELECT @before := CONCAT(SUBSTRING_INDEX(`columnname`,',',3),','),
@len := LENGTH(SUBSTRING_INDEX(`columnname`,',',4))+1
FROM `tablename` WHERE ...;
SELECT CONCAT(@before,'newstring',SUBSTRING(`columnname`,@len)) AS `result`
FROM `tablename` WHERE ...;
Replace things as needed, but that should just about do it.
EDIT: Merged into one query:
SELECT
CONCAT(
SUBSTRING_INDEX(`columnname`,',',3),
',newstring,',
SUBSTRING(`columnname`, LENGTH(SUBSTRING_INDEX(`columnname`,',',4))+2)
) as `result`
FROM `tablename` WHERE ...;
The +2 skips the comma after the fourth field, because ',newstring,' already supplies it.
You first split your problem in two parts:
locate the comma and split the string in values separated by comma.
update the table with same string and some substring appended.
For the first part I would suggest you take a look here
And for the second part you should take a look here
One more thing: there is no shortcut to any problem, and you should not run from it. Take it as a challenge and learn while you search for the answer. Take guidance from here, then try to do more research on your own.
Try this:
UPDATE yourtable
SET
categories =
TRIM(BOTH ',' FROM
REPLACE(
REPLACE(CONCAT(',',REPLACE(categories, ',', ',,'), ','),',2,', ''), ',,', ',')
)
WHERE
FIND_IN_SET('2', categories)
Taken from here: The best way to remove value from SET field?