I need a better way to remove non-numeric characters from a string.
I have phone numbers like these:
(888) 488-6655
888-555-8888
blah blah blah
I can return a clean string with nested REPLACE calls, but I am looking for a better way, perhaps a regular-expression function, to remove any non-numeric character: space, slash, backslash, quote, and so on.
This is my current query:
SELECT
a.account_id,
REPLACE(REPLACE(REPLACE(REPLACE(t.phone_number, '-', ''), ' ', ''), ')', ''),'(','') AS contact_number,
IFNULL(t.ext, '') AS extention,
CASE WHEN EXISTS (SELECT number_id FROM contact_numbers WHERE main_number = 1 AND account_id = a.account_id) THEN 0 ELSE 1 END AS main_number,
'2' AS created_by
FROM cvsnumbers t
INNER JOIN accounts a ON a.company_code = t.company_code
WHERE REPLACE(REPLACE(REPLACE(REPLACE(t.phone_number, '-', ''), ' ', ''), ')', ''),'(','') NOT IN(SELECT contact_number FROM contact_numbers WHERE account_id = a.account_id)
AND LENGTH(REPLACE(REPLACE(REPLACE(REPLACE(t.phone_number, '-', ''), ' ', ''), ')', ''),'(','') ) = 10
How can I change my query to use a regex to remove non-numeric characters?
Thanks
This is a brute force approach.
The idea is to create a numbers table, which will index each character in the phone number. Keep the character if it is a digit, then group the kept characters back together. Here is how it would work:
select t.phone_number,
group_concat(SUBSTRING(t.phone_number, n.n, 1) order by n.n separator ''
) as NumbersOnly
from cvsnumbers t cross join
(select 1 as n union all select 2 union all select 3
) n
where SUBSTRING(t.phone_number, n.n, 1) between '0' and '9'
group by t.phone_number;
This example only looks at the first 3 characters of the number. You would expand the subquery for n to cover the maximum length of a phone number.
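As a rough sketch of what the expanded version could look like, the query below builds the numbers 1 to 20 from two small digit subqueries and runs against two literal phone numbers from the question instead of the real cvsnumbers table. The aliases d1, d2 and the limit of 20 characters are assumptions made for illustration only.

```sql
-- Keep only the digit characters of each phone number.
-- The derived table "n" yields n = 1..20, one row per character position.
SELECT t.phone_number,
       GROUP_CONCAT(SUBSTRING(t.phone_number, n.n, 1)
                    ORDER BY n.n SEPARATOR '') AS NumbersOnly
FROM (
    SELECT '(888) 488-6655' AS phone_number
    UNION ALL SELECT '888-555-8888'
) AS t
CROSS JOIN (
    SELECT d1.d + 10 * d2.d + 1 AS n
    FROM (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
          UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
          UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS d1
    CROSS JOIN (SELECT 0 AS d UNION ALL SELECT 1) AS d2
) AS n
WHERE SUBSTRING(t.phone_number, n.n, 1) BETWEEN '0' AND '9'
GROUP BY t.phone_number;
-- (888) 488-6655 -> 8884886655
-- 888-555-8888   -> 8885558888
```

A value with no digits at all (such as "blah blah blah") produces no row, because every character position is filtered out by the WHERE clause.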
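I don't know the MySQL regex flavour, but I would give this a go (REPLACE itself only matches literal strings, so this needs REGEXP_REPLACE, available in MySQL 8.0+ and MariaDB 10.0.5+):
REGEXP_REPLACE(t.phone_number, '[^\\d]+', '')
[^\d]+ means: 'Match anything that is not a digit, one or more times'.
The backslash is doubled because MySQL treats a single backslash in a string literal as an escape character; '[^0-9]+' avoids the issue.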
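For reference, here is a minimal sketch of the regex approach, assuming the server is MySQL 8.0 or later (where REGEXP_REPLACE exists; plain REPLACE only handles literal strings). It uses the sample values from the question as literals, and the alias digits_only is made up for the example.

```sql
-- Strip every run of non-digit characters from each sample value.
SELECT t.phone_number,
       REGEXP_REPLACE(t.phone_number, '[^0-9]+', '') AS digits_only
FROM (
    SELECT '(888) 488-6655' AS phone_number
    UNION ALL SELECT '888-555-8888'
    UNION ALL SELECT 'blah blah blah'
) AS t;
-- (888) 488-6655 -> 8884886655
-- 888-555-8888   -> 8885558888
-- blah blah blah -> '' (empty string)
```

In the original query, the same expression could stand in for each nested REPLACE chain, with the existing LENGTH(...) = 10 check filtering out rows such as the last one.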
I store values in the database in this format: 1 000.00.
I want to add two values of this format, for example 1 000.00 + 4 000.00, like this:
CAST(SUM(1 000.00 + 4 000.00) AS DECIMAL (15 , 2 )) AS Pago
The returned value is 5, but it should be 5 000.00.
Sum the two values and then add the space back:
-- MySQL
SELECT INSERT((SUM(CAST(REPLACE('1 000.00', ' ', '') AS decimal(15, 2)) + CAST(REPLACE('4 000.00', ' ', '') AS decimal(15, 2)))), 2, 0, ' ') sum_total
Please check this URL: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=b8ab1e7c4c1d2ab596fbc1fe58754cf8
To sum the values when they are stored in a table:
SELECT INSERT((SUM(CAST(REPLACE(apago, ' ', '') as decimal(15, 2)))), 2, 0, ' ') pago FROM test;
Please check this URL: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=65fa6af1bedd4a1acce01dc0ea700dd6
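Note that INSERT(..., 2, 0, ' ') puts the space at a fixed position, so it only suits totals with a four-digit integer part. A possible alternative, sketched below on literal sample values, is to let FORMAT add thousands separators (a comma in the default en_US locale) and then swap the comma for a space. The column name apago follows the answer above; the third sample value is invented to show a longer total.

```sql
-- Remove the spaces, sum as DECIMAL, then re-insert spaces as thousands separators.
SELECT REPLACE(
           FORMAT(SUM(CAST(REPLACE(v.apago, ' ', '') AS DECIMAL(15, 2))), 2),
           ',', ' ') AS pago
FROM (
    SELECT '1 000.00' AS apago
    UNION ALL SELECT '4 000.00'
    UNION ALL SELECT '12 500.50'
) AS v;
-- 17 500.50
```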
I resolved it this way. First I removed the space in the string like this:
SELECT Id, Total, Metodo, REPLACE(Pago, ' ', '') AS Pago
FROM dados.Orcamento
Then I summed the cleaned values:
SELECT B.Id, B.Total, B.Metodo, CAST(SUM(B.Pago) AS DECIMAL (15 , 2 )) AS Pago
FROM
(SELECT Id, Total, Metodo, REPLACE(Pago, ' ', '') AS Pago
FROM dados.Orcamento) AS B GROUP BY B.Id, B.Metodo, B.Total ORDER BY B.Id ASC LIMIT 1
This returned 5000.00.
I have a column as below
Products
jeans,oil
jeans,shampoo
I want to split the strings into separate rows in the same column using SQL and count each value. The result I want is:
Products count
jeans 2
oil 1
shampoo 1
Could you please guide me in getting this result?
Thank you
You are storing CSV data in your SQL table, which is not a good thing. But it looks like you are trying to move away from that, which is a good thing. Here is one option using a union with SUBSTRING_INDEX (it assumes each row holds exactly two comma-separated values):
SELECT Products, COUNT(*) AS count
FROM
(
SELECT SUBSTRING_INDEX(Products, ',', 1) AS Products FROM yourTable
UNION ALL
SELECT SUBSTRING_INDEX(Products, ',', -1) FROM yourTable
) t
GROUP BY Products
ORDER BY
count DESC, Products;
Demo
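If a row can hold one, two, or three items, the two-branch union no longer fits. One possible variation, sketched here on literal data, joins against a small numbers table and picks the n-th item of each row. The third sample row ('soap'), the aliases product and cnt, and the upper bound of three items per row are assumptions for illustration.

```sql
-- One output row per (input row, item position), then count per product.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.Products, ',', n.n), ',', -1) AS product,
       COUNT(*) AS cnt
FROM (
    SELECT 'jeans,oil' AS Products
    UNION ALL SELECT 'jeans,shampoo'
    UNION ALL SELECT 'soap'
) AS t
JOIN (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3) AS n
  -- number of items in a row = number of commas + 1
  ON n.n <= 1 + CHAR_LENGTH(t.Products) - CHAR_LENGTH(REPLACE(t.Products, ',', ''))
GROUP BY product
ORDER BY cnt DESC, product;
-- jeans   2
-- oil     1
-- shampoo 1
-- soap    1
```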
First, split the data into two columns (SQL Server syntax), like this:
SELECT CASE
WHEN Products LIKE '%,%' THEN LEFT(Products, Charindex(',', Products) - 1)
ELSE Products
END,
CASE
WHEN Products LIKE '%,%' THEN RIGHT(Products, Charindex(',', Reverse(Products)) - 1)
END
FROM YourTable
Then union the two halves from the same table. The final code looks like this (my table uses a space as the delimiter):
select count(abc), abc from
(
SELECT CASE
WHEN PA_NAME LIKE '% %' THEN LEFT(PA_NAME, Charindex(' ', PA_NAME) - 1)
ELSE PA_NAME
END [abc]
FROM phparty
union all
SELECT CASE
WHEN PA_NAME LIKE '% %' THEN RIGHT(PA_NAME, Charindex(' ', Reverse(PA_NAME)) -1)
END [abc]
FROM phparty
) t group by abc
Here you can replace PA_NAME with your column name and phparty with your table name.
I have the following subquery:
SELECT SUBSTRING(
(SELECT GROUP_CONCAT(DISTINCT PT.Factura SEPARATOR '|')
FROM darwin.vt_partidas PT
WHERE PT.Pedimento = P.ID)
,1,30) AS 'Resultado'
FROM darwin.vt_pedimentos P WHERE P.ID=130
I need to concatenate all results, separated with |, until I reach 130 characters. My problem is that the last result may not fit completely. For example,
when I take the first 30 characters the last result is cut off, and I get:
result1|result2|result3|result
and I want this:
result1|result2|result3
(If a result doesn't fit, remove that result entirely.)
Thank you guys
Try this
Updated your GROUP_CONCAT and added another step to remove irrelevant data exceeding the 30 max length
SELECT
@str:= left(GROUP_CONCAT( DISTINCT PT.Factura SEPARATOR '|'), 30)
FROM
vt_pedimentos P
INNER JOIN
vt_partidas PT
ON PT.Pedimento = P.ID
WHERE
P.ID = 130;
-- to check whether the last or truncated text exists in the table otherwise remove
select
@str:= left(@str,
(
length(@str) - length(reverse(left(reverse(@str), locate('|', reverse(@str)) - 1)))
)
- 1)
FROM
vt_pedimentos P
where
NOT EXISTS
(
select
1
from
vt_partidas PT
where
PT.Factura = reverse(left(reverse(@str), locate('|', reverse(@str)) - 1))
)
and P.ID = 130;
Further enhancement: combine this into one SQL statement.
String manipulation is not a forte of SQL expressions.
But something like this should do it:
SELECT
IF(CHAR_LENGTH( GROUP_CONCAT(DISTINCT PT.Factura SEPARATOR '|') ) < 130
, GROUP_CONCAT(DISTINCT PT.Factura SEPARATOR '|')
, SUBSTRING_INDEX(
SUBSTR( GROUP_CONCAT(DISTINCT PT.Factura SEPARATOR '|') ,1,130)
, '|'
, CHAR_LENGTH( SUBSTR( GROUP_CONCAT(DISTINCT PT.Factura SEPARATOR '|') ,1,130) )
-CHAR_LENGTH(REPLACE(SUBSTR( GROUP_CONCAT(DISTINCT PT.Factura SEPARATOR '|') ,1,130),'|',''))
)
)
That's fairly complicated. It will be easier to decipher if we replace the GROUP_CONCAT expression with a placeholder. Let's have res represent GROUP_CONCAT(DISTINCT PT.Factura SEPARATOR '|') expression.
SELECT
IF(CHAR_LENGTH( res ) < 130
, res
, SUBSTRING_INDEX(
SUBSTR( res ,1,130)
, '|'
, CHAR_LENGTH( SUBSTR( res ,1,130) )
-CHAR_LENGTH(REPLACE(SUBSTR( res ,1,130),'|',''))
)
)
Still ugly, but better. Let's break that down.
If the number of characters in res is less than 130, we're done. Just return res.
Otherwise, we need to trim res to 130 characters; we can use the SUBSTR function to do that.
Now, we want to remove the last | and the characters that follow it. To do that, we count the | separator characters, which tells us which one is the last.
(We can count the separator characters by replacing them all with an empty string, taking the length of the result, and subtracting that from the length of the original string. The difference is the total length of the removed separators, which equals their count because each separator is a single character.)
Then we can use that difference in a SUBSTRING_INDEX function to return all of the characters before the last separator.
It's not a pretty solution. But it does implement an algorithm that satisfies the specification.
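To see the separator-counting step in isolation, here is a small sketch that applies it to a literal string rather than to the GROUP_CONCAT result. The sample value and the 30-character limit come from the question; the derived-table alias t and the column name s are only illustrative.

```sql
-- s is the truncated string: 'result1|result2|result3|result'
-- It contains 3 separators, so SUBSTRING_INDEX(s, '|', 3) keeps
-- everything before the third (last) separator.
SELECT t.s AS truncated,
       CHAR_LENGTH(t.s) - CHAR_LENGTH(REPLACE(t.s, '|', '')) AS separator_count,
       SUBSTRING_INDEX(
           t.s, '|',
           CHAR_LENGTH(t.s) - CHAR_LENGTH(REPLACE(t.s, '|', ''))
       ) AS trimmed
FROM (SELECT SUBSTR('result1|result2|result3|result4', 1, 30) AS s) AS t;
-- truncated:       result1|result2|result3|result
-- separator_count: 3
-- trimmed:         result1|result2|result3
```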
I would like to have a MySQL query like this:
select <second word in text> word, count(*) from table group by word;
All the regex examples for MySQL test whether the text matches the expression, but none extract the matching text. Is there such a syntax?
The following is a proposed solution for the OP's specific problem (extracting the 2nd word of a string), but it should be noted that, as mc0e's answer states, actually extracting regex matches is not supported out-of-the-box in MySQL. If you really need this, then your choices are basically to 1) do it in post-processing on the client, or 2) install a MySQL extension to support it.
BenWells has it almost correct. Working from his code, here's a slightly adjusted version:
SUBSTRING(
sentence,
LOCATE(' ', sentence) + CHAR_LENGTH(' '),
LOCATE(' ', sentence,
( LOCATE(' ', sentence) + 1 ) ) - ( LOCATE(' ', sentence) + CHAR_LENGTH(' ') )
)
As a working example, I used:
SELECT SUBSTRING(
sentence,
LOCATE(' ', sentence) + CHAR_LENGTH(' '),
LOCATE(' ', sentence,
( LOCATE(' ', sentence) + 1 ) ) - ( LOCATE(' ', sentence) + CHAR_LENGTH(' ') )
) as string
FROM (SELECT 'THIS IS A TEST' AS sentence) temp
This successfully extracts the word IS.
Shorter option to extract the second word in a sentence:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('THIS IS A TEST', ' ', 2), ' ', -1) as FoundText
MySQL docs for SUBSTRING_INDEX
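Plugged into the grouped query the question asks for, the nested SUBSTRING_INDEX might look like the sketch below. The three sample sentences and the aliases txt, word, and occurrences are invented for illustration; in practice the derived table would be the real table.

```sql
-- Group rows by their second space-separated word.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.txt, ' ', 2), ' ', -1) AS word,
       COUNT(*) AS occurrences
FROM (
    SELECT 'THIS IS A TEST' AS txt
    UNION ALL SELECT 'THAT IS ALL'
    UNION ALL SELECT 'ONE TWO'
) AS t
GROUP BY word
ORDER BY occurrences DESC, word;
-- IS   2
-- TWO  1
```

One caveat: for a value with no space at all, this expression returns the whole value (the first word), so such rows may need to be filtered out separately.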
According to the MySQL documentation (http://dev.mysql.com/), the SUBSTRING function takes a start position and then a length, so surely the expression for the second word would be:
SUBSTRING(sentence,LOCATE(' ',sentence),(LOCATE(' ',sentence,LOCATE(' ',sentence)+1)-LOCATE(' ',sentence)))
No, there isn't a syntax for extracting text using regular expressions. You have to use the ordinary string manipulation functions.
Alternatively select the entire value from the database (or the first n characters if you are worried about too much data transfer) and then use a regular expression on the client.
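For readers on newer servers: MySQL 8.0 and later ship a built-in REGEXP_SUBSTR function, which the answers here predate. Assuming such a server, a sketch of the second-word query could look like this; the sample rows and the aliases txt, word, and occurrences are made up for the example.

```sql
-- REGEXP_SUBSTR(expr, pattern, pos, occurrence):
-- start at position 1 and return the 2nd run of non-space characters.
SELECT REGEXP_SUBSTR(t.txt, '[^ ]+', 1, 2) AS word,
       COUNT(*) AS occurrences
FROM (
    SELECT 'THIS IS A TEST' AS txt
    UNION ALL SELECT 'THAT IS ALL'
    UNION ALL SELECT 'ONE'
) AS t
GROUP BY word;
-- IS    2
-- NULL  1   (the single-word row has no second match)
```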
As others have said, MySQL does not provide regex tools for extracting substrings. That's not to say you can't have them, though, if you're prepared to extend MySQL with user-defined functions:
https://github.com/mysqludf/lib_mysqludf_preg
That may not be much help if you want to distribute your software, since it adds an obstacle to installation, but for an in-house solution it may be appropriate.
I used Brendan Bullen's answer as a starting point for a similar issue I had, which was to retrieve the value of a specific field in a JSON string. However, as I commented on his answer, it is not entirely accurate. If your left boundary isn't just a space like in the original question, then the discrepancy increases.
Corrected solution:
SUBSTRING(
sentence,
LOCATE(' ', sentence) + 1,
LOCATE(' ', sentence, (LOCATE(' ', sentence) + 1)) - LOCATE(' ', sentence) - 1
)
The two differences are the +1 in the SUBSTRING index parameter and the -1 in the length parameter.
For a more general solution to "find the first occurrence of a string between two provided boundaries":
SUBSTRING(
haystack,
LOCATE('<leftBoundary>', haystack) + CHAR_LENGTH('<leftBoundary>'),
LOCATE(
'<rightBoundary>',
haystack,
LOCATE('<leftBoundary>', haystack) + CHAR_LENGTH('<leftBoundary>')
)
- (LOCATE('<leftBoundary>', haystack) + CHAR_LENGTH('<leftBoundary>'))
)
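As a concrete instance of the boundary template, the sketch below pulls one field out of a small JSON-like string. The sample string, the left boundary '"name":"', and the right boundary '"' are assumptions chosen for illustration. For real JSON data, dedicated JSON functions are likely a safer choice where the server provides them.

```sql
-- Left boundary starts at position 9 and is 8 characters long,
-- so the value starts at position 17; the next '"' is at position 23.
SELECT SUBSTRING(
           h.haystack,
           LOCATE('"name":"', h.haystack) + CHAR_LENGTH('"name":"'),
           LOCATE('"', h.haystack,
                  LOCATE('"name":"', h.haystack) + CHAR_LENGTH('"name":"'))
           - (LOCATE('"name":"', h.haystack) + CHAR_LENGTH('"name":"'))
       ) AS field_value
FROM (SELECT '{"id":7,"name":"widget","qty":3}' AS haystack) AS h;
-- widget
```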
I don't think such a thing is possible. You can use SUBSTRING function to extract the part you want.
My home-grown regular expression replace function can be used for this.
Demo
See this DB-Fiddle demo, which returns the second word ("I") from a famous sonnet and the number of occurrences of it (1).
SQL
Assuming MySQL 8 or later is being used (to allow use of a Common Table Expression), the following will return the second word and the number of occurrences of it:
WITH cte AS (
SELECT digits.idx,
SUBSTRING_INDEX(SUBSTRING_INDEX(words, '~', digits.idx + 1), '~', -1) word
FROM
(SELECT reg_replace(UPPER(txt),
'[^''’a-zA-Z-]+',
'~',
TRUE,
1,
0) AS words
FROM tbl) delimited
INNER JOIN
(SELECT @row := @row + 1 as idx FROM
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t1,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t2,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t3,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t4,
(SELECT @row := -1) t5) digits
ON LENGTH(REPLACE(words, '~' , '')) <= LENGTH(words) - digits.idx)
SELECT c.word,
subq.occurrences
FROM cte c
LEFT JOIN (
SELECT word,
COUNT(*) AS occurrences
FROM cte
GROUP BY word
) subq
ON c.word = subq.word
WHERE idx = 1; /* idx is zero-based so 1 here gets the second word */
Explanation
A few tricks are used in the SQL above and some attribution is needed. Firstly, the regular expression replacer is used to replace every continuous block of non-word characters with a single tilde (~) character. Note: A different character could be chosen instead if there is any possibility of a tilde appearing in the text.
The technique from this answer is then used for transforming a string with delimited values into separate row values. It's combined with the clever technique from this answer for generating a table consisting of a sequence of incrementing numbers: 0 to 9,999 in this case (four cross-joined sets of ten rows).
The field's value is:
"- DE-HEB 20% - DTopTen 1.2%"
SELECT ....
SUBSTRING_INDEX(SUBSTRING_INDEX(DesctosAplicados, 'DE-HEB ', -1), '-', 1) AS `DE-HEB`,
SUBSTRING_INDEX(SUBSTRING_INDEX(DesctosAplicados, 'DTopTen ', -1), '-', 1) AS DTopTen
FROM TABLA
The result is:
DE-HEB DTopTen
20% 1.2%
I have the following query:
SELECT BUSINESS_NAME, 'KEYWORD', REPLACE(BUSINESS_NAME, ' ', '-')
FROM clearindia.business b
LEFT OUTER JOIN `clearindia`.`keywords_master` km ON km.KEYWORD_TEXT = b.BUSINESS_NAME
WHERE km.KEYWORD_TEXT IS NULL
AND b.business_name='Dey Radio Service'
GROUP BY BUSINESS_NAME
It gives me the following results:
# BUSINESS_NAME, KEYWORD, REPLACE(BUSINESS_NAME, ' ', '-')
'Dey Radio service', 'KEYWORD', 'Dey-Radio service'
REPLACE(BUSINESS_NAME, ' ', '-') is not working correctly and does not replace the second space with a '-'. Why is that?
Please note: BUSINESS_NAME has a collation of utf_unicode_ci.