Cleaning out a field of phone numbers in MySQL

I'm not a database guy, but I have mixed-up data in a MySQL database that I inherited.
Some phone numbers are formatted (512) 555-1212 (call it dirty)
Others 5125551212 (call it clean)
I need a SQL statement that says
UPDATE table_name
SET Phone = 'clean' (some sort of cleaning code - regex?)
WHERE Phone='Dirty'

Unfortunately there's no regex replace/update in MySQL before version 8.0 (8.0 and later provide REGEXP_REPLACE()). If it's just parentheses, dashes, and spaces, then some nested REPLACE calls will do the trick:
UPDATE table_name
SET Phone = REPLACE(REPLACE(REPLACE(REPLACE(Phone, '-', ''), ')', ''), '(', ''), ' ', '')

To my knowledge you can't run a regexp to replace data during an UPDATE (at least before MySQL 8.0); older versions only support regexp matching, for example in a SELECT.
Your best bet is to use a scripting language that you're familiar with: read all the entries from the table, use a string replace with a simple regexp such as [^\d]* to remove the unwanted characters, and then update the table with the new value.
Also, see this answer:
How to do a regular expression replace in MySQL?
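The read, clean, and write-back loop described in this answer could be sketched in Python roughly as follows. This is only an illustration under assumptions: the standard-library sqlite3 module stands in for a MySQL connection so the example runs on its own, and the table name contacts, the id column, and the helper clean_phone_column are all hypothetical. With a real MySQL driver, the connection setup and the parameter placeholder style would differ.

```python
import re
import sqlite3

NON_DIGITS = re.compile(r"[^0-9]")  # anything that is not an ASCII digit


def clean_phone_column(conn):
    """Strip non-digit characters from every Phone value; return rows changed."""
    changed = 0
    rows = conn.execute("SELECT id, Phone FROM contacts").fetchall()
    for row_id, phone in rows:
        if phone is None:
            continue  # nothing to clean
        cleaned = NON_DIGITS.sub("", phone)
        if cleaned != phone:
            # Only touch rows that are actually "dirty".
            conn.execute(
                "UPDATE contacts SET Phone = ? WHERE id = ?", (cleaned, row_id)
            )
            changed += 1
    conn.commit()
    return changed


# Demo data: an in-memory SQLite table stands in for the real MySQL table
# (assumption; table and column names are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, Phone TEXT)")
conn.executemany(
    "INSERT INTO contacts (Phone) VALUES (?)",
    [("(512) 555-1212",), ("5125551212",), (None,)],
)
changed = clean_phone_column(conn)
phones = [r[0] for r in conn.execute("SELECT Phone FROM contacts ORDER BY id")]
```

Here only the first row is rewritten; the already clean value and the NULL are left alone.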

Related

SQL Query that checks only numeric characters in the database

I am working on a legacy system a client has. Phone numbers are stored in a multitude of ways. Ex:
514-879-9989
514.989.2289
5147899287
The client wants to be able to search the database by phone number.
How could this be achieved without normalizing the data stored in the database? Is this possible?
I am wondering if it is possible to have a query that looks like:
SELECT * FROM table WHERE phonenumber LIKE '%input%'
but that takes into account only the numerical characters in the db?
$sql = "SELECT * FROM tab
WHERE replace(replace(phone, '.', ''), '-', '') like '%". $input ."%'";
Yes, you can add more REPLACE calls according to the values in your table, as mentioned by @spencer7593, e.g.:
$sql = "SELECT * FROM tab
WHERE replace(replace(replace(replace(replace(replace(phone, '.', ''), '-', ''), '+', ''), '(', ''), ')', ''), ' ', '') like '%". $input ."%'";
but I would prefer to clean up the data before the query.
The approach I would take with this (i.e. not having a "normalized" value with only digits available, and a restriction of not adding an additional column with the normalized value...)
I would take the user input for the search and add wildcards in strategic locations. For example, if the user provides a search input of 3155551212, then I'd run a query that has a predicate equivalent to this:
phonenumber LIKE '%315%555%1212%'
But if I'm not guaranteed that the provided search digits will be a full three digit area code, a three digit exchange (central office) code, and a four digit line number, for a broader search, I'd add wild cards between all of the provided digits, e.g.
phonenumber LIKE '%3%1%5%5%5%5%1%2%1%2%'
This latter approach is less than ideal, because it could potentially return more matches than are intended, especially if the user provides fewer than ten digits. For example, consider a phonenumber value:
'+1 (315) 555-7172 ext. 123'
As a demonstration:
SELECT '+1 (315) 555-7172 ext. 123' LIKE '%3%1%5%5%5%5%1%2%1%2%'
, '+1 (315) 555-7172 ext. 123' LIKE '%315%555%1212%'
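As a rough sketch of how the two patterns above might be built from the user's search input, the following Python helpers produce both forms and check them against the example value. The helper names are hypothetical, and SQLite (via the standard-library sqlite3 module) is used only as a stand-in to evaluate LIKE so that the example is self-contained; for these patterns its % wildcard behaves the same way.

```python
import sqlite3


def grouped_pattern(digits):
    """Ten digits -> '%AAA%EEE%LLLL%' (area code, exchange, line number)."""
    if len(digits) != 10 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected exactly ten digits")
    return "%" + "%".join([digits[:3], digits[3:6], digits[6:]]) + "%"


def per_digit_pattern(digits):
    """Put a wildcard before, between, and after every provided digit."""
    return "%" + "%".join(digits) + "%"


def like(value, pattern):
    """Evaluate value LIKE pattern; SQLite stands in for MySQL (assumption)."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
        return bool(row[0])
    finally:
        conn.close()


stored = "+1 (315) 555-7172 ext. 123"
broad = per_digit_pattern("3155551212")
narrow = grouped_pattern("3155551212")
broad_hit = like(stored, broad)    # True: the broad pattern over-matches
narrow_hit = like(stored, narrow)  # False: no '1212' run in the stored value
```

The broad pattern matches the unrelated stored number because the trailing 1, 2, 1, 2 are picked up from "7172" and the extension, which is the over-matching risk described above.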
There's no builtin string function in MySQL (before 8.0, which added REGEXP_REPLACE()) that will extract the digit characters from a string.
If you want a function that does that, e.g.
SELECT only_digits_from('+1 (315) 555-7172 ext. 123')
to return
13155557172123
You'd have to create a stored function that does that. I wouldn't attempt doing it inline in the SQL statement; that would require an atrociously long and ugly expression.
This is a piece of code I frequently use to clean up database columns. I have modified it to fit your purpose.
UPDATE Table SET Column =
replace(
replace(
replace(Column, '-', ''),
'.', ''),
' ', '')
WHERE Column IS NOT NULL

Remove all special character from column in MySQL

I'm working with MySQL and have a column named phone-number. I have tried to use a regex on it without success.
How can I remove all special characters from this column?
phone-number
'8-903-400-65-38'
'+79265682388'
'8.10492E+15'
'8-913-469-38-35'
'+79882856253'
'+79110987703'
'+7 (495) 989-21-16'
'8142 77-55-51'
'+79378299427'
Can anyone please help me with this issue? I don't want to lose this list of contact numbers.
Thanks in advance
The canned answer here is just to chain together a series of calls to MySQL's REPLACE function:
SELECT REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(`phone-number`, '-', ''), '+', ''), '.', ''), '(', ''), ')', ''), ' ', '')
FROM yourTable
This would remove the following characters from the phone-number column:
- + . ( ) and space
The nicer solution would be to use a regular expression to do the replacement, but alas, MySQL before 8.0 does not have any built-in support for regex replace (8.0 and later provide REGEXP_REPLACE()).
If you don't want a fugly REPLACE chain, then you could write some dynamic MySQL code which iterates over a set of characters which you define, and does a number of updates to the table. Here is what one such update would look like:
UPDATE yourTable
SET `phone-number` = REPLACE(`phone-number`, '+', '')
You could perform one update for each character and handle it this way.
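One way to avoid typing the chain by hand is to generate the SQL from a list of characters. The Python sketch below does that for both variants discussed here: a single nested REPLACE expression, and one UPDATE per character. The helper names are hypothetical, and the table and column names are interpolated as given, so they should only come from trusted code, never from user input.

```python
def sql_quote(text):
    """Quote a string literal for MySQL: escape backslashes, double quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


def nested_replace(column_expr, chars):
    """Wrap column_expr in one REPLACE(..., ch, '') call per character."""
    expr = column_expr
    for ch in chars:
        expr = "REPLACE({}, {}, '')".format(expr, sql_quote(ch))
    return expr


def update_statements(table, column_expr, chars):
    """Build one UPDATE statement per character to remove."""
    return [
        "UPDATE {t} SET {c} = REPLACE({c}, {q}, '')".format(
            t=table, c=column_expr, q=sql_quote(ch)
        )
        for ch in chars
    ]


unwanted = ["-", "+", ".", "(", ")", " "]
column = "`phone-number`"  # backticks because of the hyphen in the name
expr = nested_replace(column, unwanted)
stmts = update_statements("yourTable", column, unwanted)
```

The generated expr is the same six-level chain as the hand-written SELECT above, and each entry in stmts is an UPDATE of the form shown for '+'.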

MySql : query to format a specific column in database table

I have a column named phone_number in a database table. Right now the numbers stored in the table are formatted like +91-852-9689568. I want to reformat them so that they contain only digits.
How can I do it in MySQL? I have tried using functions like REGEXP, but it displays an error saying the function does not exist. And I don't want to use multiple REPLACE calls.
One option is to use MySQL's SUBSTRING (as long as the format doesn't change):
SELECT concat(SUBSTRING(pNo,2,2), SUBSTRING(pNo,5,3), SUBSTRING(pNo,9,7)) FROM tableName;
If you want to format via projection only, use SELECT; you will only need to use REPLACE twice, which is no problem.
SELECT REPLACE(REPLACE(columnName, '-', ''), '+', '')
FROM tableName
otherwise, if you want to update the value permanently, use UPDATE
UPDATE tableName
SET columnName = REPLACE(REPLACE(columnName, '-', ''), '+', '')
MySQL (before 8.0) does not have a builtin function for pattern matching and replace.
You'll be better off fetching the whole string back to your application and then using a more flexible string-manipulation function on it, for instance preg_replace() in PHP.
Try the following and comment, please.
Select dbo.Regex('\d+',pNo);
Select dbo.Regex('[0-9]+',pNo);
Reference on Rubular.
MySQL is not like Oracle, so you may just use a user-defined function to get the numbers (dbo.Regex above is not built in and assumes such a function exists). This could get you going.

How to remove multiple unwanted signs (e.g. &, %, #) from database records

I am having trouble removing multiple unwanted signs from a MySQL database because it is too large to fix by hand.
Is there any script available to remove those signs?
Sample record with multiple signs:
[]No. 11, Persiaran Bukit [] Satu&[]Taman #Sri %Nibong
Use the REPLACE function.
UPDATE tablename
SET ColName = REPLACE(REPLACE(REPLACE(Colname, '&', ''), '#', ''), '%', '')
What I suggest here is to prepare a table that contains all the possible unwanted characters and then process them in one SELECT query using the REPLACE function.
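If the cleanup is done in application code instead, the same "list of unwanted characters" idea can be expressed with a deletion table. The Python sketch below is only an illustration: UNWANTED holds just the three signs named in the question, and strip_unwanted is a hypothetical helper name.

```python
# Characters to delete; extend this string as more unwanted signs turn up.
UNWANTED = "&#%"

# str.maketrans with a third argument maps each listed character to None,
# so translate() drops them.
DELETE_TABLE = str.maketrans("", "", UNWANTED)


def strip_unwanted(text):
    """Return text with every character in UNWANTED removed."""
    return text.translate(DELETE_TABLE)


sample = "[]No. 11, Persiaran Bukit [] Satu&[]Taman #Sri %Nibong"
cleaned = strip_unwanted(sample)
```

With only those three signs listed, the square brackets in the sample are left in place; adding "[]" to UNWANTED would remove them as well.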

Format Phone Numbers in MySQL

I have a bunch of phone numbers in a DB that are formatted as such: (999) 123-3456.
I need them to look like 123-123-1234
Is there any sort of regex or something I can do in MySQL to quickly format all these phone numbers?
Also, frustratingly, some are NOT formatted like that, so I couldn't just apply this to an entire column.
Thanks!
A quick solution would be to run these two queries:
UPDATE table_name set PhoneCol = REPLACE(PhoneCol, '(', '');
UPDATE table_name set PhoneCol = REPLACE(PhoneCol, ') ', '-');
Just write a small PHP script that loops through all the values and updates them. Making that change is pretty simple in PHP. Then just run an UPDATE on the row to overwrite the value.
Maybe a two-pass solution:
1. Strip out all non-numeric characters (and spaces).
2. Insert the formatting characters '(', ')', ' ', and '-' into the correct spots (or better yet, leave them off and format only during SELECT in your reports).
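The two passes could look roughly like this in Python. This is a sketch, not a definitive implementation: format_us_phone is a hypothetical helper, it targets the 123-123-1234 layout asked for in the question, and it assumes that values without exactly ten digits should be left alone for manual review.

```python
import re


def format_us_phone(raw):
    """Return 'AAA-EEE-LLLL' for a value with exactly ten digits, else None."""
    # Pass 1: keep only ASCII digits.
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) != 10:
        return None  # irregular value; do not guess at its structure
    # Pass 2: insert the dashes at fixed positions.
    return "{}-{}-{}".format(digits[:3], digits[3:6], digits[6:])


examples = ["(999) 123-3456", "1234567890", "123-3456"]
formatted = [format_us_phone(p) for p in examples]
```

Because the digits are extracted first, the same function handles values that were already partly formatted as well as those with no punctuation at all.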
I had a similar problem, made worse by the fact that some phones had the format with the dashes and others did not. This was the command that helped me update the numbers that did not have the hyphens.
Phone before the command: 1234567890
Phone after command: 123-456-7890
The phone field is called phone_number and is a VARCHAR
The command I used is:
UPDATE database.table
SET phone_number = concat(SUBSTRING(phone_number,1,3) , '-' , SUBSTRING(phone_number,4,3) , '-' , SUBSTRING(phone_number,7,4))
WHERE LOCATE('-', phone_number) = 0;
I think your command could be like this:
UPDATE database.table
SET phone_number = concat(SUBSTRING(phone_number,2,3) , '-' , SUBSTRING(phone_number,7,8));
I would remove the WHERE clause under the assumption that all phones would be formatted with the (). Also, the second string of characters would start from position 7 because there appears to be a space after the parentheses.