Mysql search & replace with random characters - mysql

I have in text some URLS src="https://example.com/public/images/someimage.jpg?itok=WDGFySy"
I need remove in every url this garbage token ?itok=WDGFySy, all tokens obviously are random :).
I try do it directly in database like this:
UPDATE wp_posts SET post_content = REPLACE(post_content, 'itok=[[:xdigit:]]{8}', '') WHERE post_content LIKE 'itok=[[:xdigit:]]{8}
';
But i cant find any of this tokens like this. LIKE this [a-fA-F0-9]{8} also wont help. Any advice? Thank you for any suggestions.

You can only use regex if you have MySQL 8.0.
However, if your links are in separate fields, it is possible to create and use SP to clean them up:
DELIMITER $$
CREATE FUNCTION CLEANUP(
aString VARCHAR(255)
, aName VARCHAR(15)
)
RETURNS VARCHAR(255)
BEGIN
SET #u = SUBSTRING_INDEX(aString, '?', 1);
SET #q = SUBSTRING_INDEX(aString, '?', -1);
IF #q = aString THEN
RETURN aString; -- no '?' char found
ELSE
SET #f = LOCATE(CONCAT(aName, '='), #q);
SET #query = IF(#f > 1, SUBSTR(#q, 1, #f - 1), '');
IF #f > 0 THEN
SET #t = LOCATE('&', #q, #f + LENGTH(aName) + 1);
IF #t > 0 THEN
SET #query = CONCAT(#query, SUBSTR(#q, #t + 1));
END IF;
END IF;
IF #query = '' THEN
RETURN #u;
ELSE
RETURN CONCAT(#u, '?', TRIM('&' FROM #query));
END IF;
END IF;
END
$$
DELIMITER ;
Then:
SELECT
CLEANUP('https://example.com/public/images/someimage.jpg?itok=WDGFySy&b=2', 'itok')
, CLEANUP('https://example.com/public/images/someimage.jpg?itok=WDGFySy', 'itok')
, CLEANUP('https://example.com/public/images/someimage.jpg?a=1&itok=WDGFySy', 'itok')
, CLEANUP('https://example.com/public/images/someimage.jpg?a=1&itok=WDGFySy&b=2', 'itok')
;

Related

sql - select column but remove non alphanumeric in result [duplicate]

I'm working on a routine that compares strings, but for better efficiency I need to remove all characters that are not letters or numbers.
I'm using multiple REPLACE functions now, but maybe there is a faster and nicer solution ?
Using MySQL 8.0 or higher
Courtesy of michal.jakubeczy's answer below, replacing by Regex is now supported by MySQL:
UPDATE {table} SET {column} = REGEXP_REPLACE({column}, '[^0-9a-zA-Z ]', '')
Using MySQL 5.7 or lower
Regex isn't supported here. I had to create my own function called alphanum which stripped the chars for me:
DROP FUNCTION IF EXISTS alphanum;
DELIMITER |
CREATE FUNCTION alphanum( str CHAR(255) ) RETURNS CHAR(255) DETERMINISTIC
BEGIN
DECLARE i, len SMALLINT DEFAULT 1;
DECLARE ret CHAR(255) DEFAULT '';
DECLARE c CHAR(1);
IF str IS NOT NULL THEN
SET len = CHAR_LENGTH( str );
REPEAT
BEGIN
SET c = MID( str, i, 1 );
IF c REGEXP '[[:alnum:]]' THEN
SET ret=CONCAT(ret,c);
END IF;
SET i = i + 1;
END;
UNTIL i > len END REPEAT;
ELSE
SET ret='';
END IF;
RETURN ret;
END |
DELIMITER ;
Now I can do:
select 'This works finally!', alphanum('This works finally!');
and I get:
+---------------------+---------------------------------+
| This works finally! | alphanum('This works finally!') |
+---------------------+---------------------------------+
| This works finally! | Thisworksfinally |
+---------------------+---------------------------------+
1 row in set (0.00 sec)
Hurray!
From a performance point of view,
(and on the assumption that you read more than you write)
I think the best way would be to pre calculate and store a stripped version of the column,
This way you do the transform less.
You can then put an index on the new column and get the database to do the work for you.
SELECT teststring REGEXP '[[:alnum:]]+';
SELECT * FROM testtable WHERE test REGEXP '[[:alnum:]]+';
See: http://dev.mysql.com/doc/refman/5.1/en/regexp.html
Scroll down to the section that says: [:character_class:]
If you want to manipulate strings the fastest way will be to use a str_udf, see:
https://github.com/hholzgra/mysql-udf-regexp
Since MySQL 8.0 you can use regular expression to remove non alphanumeric characters from a string. There is method REGEXP_REPLACE
Here is the code to remove non-alphanumeric characters:
UPDATE {table} SET {column} = REGEXP_REPLACE({column}, '[^0-9a-zA-Z ]', '')
Straight and battletested solution for latin and cyrillic characters:
DELIMITER //
CREATE FUNCTION `remove_non_numeric_and_letters`(input TEXT)
RETURNS TEXT
BEGIN
DECLARE output TEXT DEFAULT '';
DECLARE iterator INT DEFAULT 1;
WHILE iterator < (LENGTH(input) + 1) DO
IF SUBSTRING(input, iterator, 1) IN
('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'А', 'Б', 'В', 'Г', 'Д', 'Е', 'Ж', 'З', 'И', 'Й', 'К', 'Л', 'М', 'Н', 'О', 'П', 'Р', 'С', 'Т', 'У', 'Ф', 'Х', 'Ц', 'Ч', 'Ш', 'Щ', 'Ъ', 'Ы', 'Ь', 'Э', 'Ю', 'Я', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я')
THEN
SET output = CONCAT(output, SUBSTRING(input, iterator, 1));
END IF;
SET iterator = iterator + 1;
END WHILE;
RETURN output;
END //
DELIMITER ;
Usage:
-- outputs "hello12356"
SELECT remove_non_numeric_and_letters('hello - 12356-привет ""]')
The fastest way I was able to find (and using ) is with convert().
from Doc. CONVERT() with USING is used to convert data between different character sets.
Example:
convert(string USING ascii)
In your case the right character set will be self defined
NOTE from Doc. The USING form of CONVERT() is available as of 4.1.0.
Based on the answer by Ryan Shillington, modified to work with strings longer than 255 characters and preserving spaces from the original string.
FYI there is lower(str) in the end.
I used this to compare strings:
DROP FUNCTION IF EXISTS spacealphanum;
DELIMITER $$
CREATE FUNCTION `spacealphanum`( str TEXT ) RETURNS TEXT CHARSET utf8
BEGIN
DECLARE i, len SMALLINT DEFAULT 1;
DECLARE ret TEXT DEFAULT '';
DECLARE c CHAR(1);
SET len = CHAR_LENGTH( str );
REPEAT
BEGIN
SET c = MID( str, i, 1 );
IF c REGEXP '[[:alnum:]]' THEN
SET ret=CONCAT(ret,c);
ELSEIF c = ' ' THEN
SET ret=CONCAT(ret," ");
END IF;
SET i = i + 1;
END;
UNTIL i > len END REPEAT;
SET ret = lower(ret);
RETURN ret;
END $$
DELIMITER ;
Be careful, characters like ’ or » are considered as alpha by MySQL.
It better to use something like :
IF c BETWEEN 'a' AND 'z' OR c BETWEEN 'A' AND 'Z' OR c BETWEEN '0' AND
'9' OR c = '-' THEN
I have written this UDF. However, it only trims special characters at the beginning of the string. It also converts the string to lower case. You can update this function if desired.
DELIMITER //
DROP FUNCTION IF EXISTS DELETE_DOUBLE_SPACES//
CREATE FUNCTION DELETE_DOUBLE_SPACES ( title VARCHAR(250) )
RETURNS VARCHAR(250) DETERMINISTIC
BEGIN
DECLARE result VARCHAR(250);
SET result = REPLACE( title, ' ', ' ' );
WHILE (result <> title) DO
SET title = result;
SET result = REPLACE( title, ' ', ' ' );
END WHILE;
RETURN result;
END//
DROP FUNCTION IF EXISTS LFILTER//
CREATE FUNCTION LFILTER ( title VARCHAR(250) )
RETURNS VARCHAR(250) DETERMINISTIC
BEGIN
WHILE (1=1) DO
IF( ASCII(title) BETWEEN ASCII('a') AND ASCII('z')
OR ASCII(title) BETWEEN ASCII('A') AND ASCII('Z')
OR ASCII(title) BETWEEN ASCII('0') AND ASCII('9')
) THEN
SET title = LOWER( title );
SET title = REPLACE(
REPLACE(
REPLACE(
title,
CHAR(10), ' '
),
CHAR(13), ' '
) ,
CHAR(9), ' '
);
SET title = DELETE_DOUBLE_SPACES( title );
RETURN title;
ELSE
SET title = SUBSTRING( title, 2 );
END IF;
END WHILE;
END//
DELIMITER ;
SELECT LFILTER(' !##$%^&*()_+1a b');
Also, you could use regular expressions but this requires installing a MySql extension.
This can be done with a regular expression replacer function I posted in another answer and have blogged about here. It may not be the most efficient solution possible and might look overkill for the job in hand - but like a Swiss army knife, it may come in useful for other reasons.
It can be seen in action removing all non-alphanumeric characters in this Rextester online demo.
SQL (excluding the function code for brevity):
SELECT txt,
reg_replace(txt,
'[^a-zA-Z0-9]+',
'',
TRUE,
0,
0
) AS `reg_replaced`
FROM test;
I had a similar problem with trying to match last names in our database that were slightly different. For example, sometimes people entered the same person's name as "McDonald" and also as "Mc Donald", or "St John" and "St. John".
Instead of trying to convert the Mysql data, I solved the problem by creating a function (in PHP) that would take a string and create an alpha-only regular expression:
function alpha_only_regex($str) {
$alpha_only = str_split(preg_replace('/[^A-Z]/i', '', $str));
return '^[^a-zA-Z]*'.implode('[^a-zA-Z]*', $alpha_only).'[^a-zA-Z]*$';
}
Now I can search the database with a query like this:
$lastname_regex = alpha_only_regex($lastname);
$query = "SELECT * FROM my_table WHERE lastname REGEXP '$lastname_regex';
So far, the only alternative approach less complicated than the other answers here is to determine the full set of special characters of the column, i.e. all the special characters that are in use in that column at the moment, and then do a sequential replace of all those characters, e.g.
update pages set slug = lower(replace(replace(replace(replace(name, ' ', ''), '-', ''), '.', ''), '&', '')); # replacing just space, -, ., & only
.
This is only advisable on a known set of data, otherwise it's
trivial for some special characters to slip past with a
blacklist approach instead of a whitelist approach.
Obviously, the simplest way is to pre-validate the data outside of sql due to the lack of robust built-in whitelisting (e.g. via a regex replace).
I needed to get only alphabetic characters of a string in a procedure, and did:
SET #source = "whatever you want";
SET #target = '';
SET #i = 1;
SET #len = LENGTH(#source);
WHILE #i <= #len DO
SET #char = SUBSTRING(#source, #i, 1);
IF ((ORD(#char) >= 65 && ORD(#char) <= 90) || (ORD(#char) >= 97 && ORD(#char) <= 122)) THEN
SET #target = CONCAT(#target, #char);
END IF;
SET #i = #i + 1;
END WHILE;
Needed to replace non-alphanumeric characters rather than remove non-alphanumeric characters so I have created this based on Ryan Shillington's alphanum. Works for strings up to 255 characters in length
DROP FUNCTION IF EXISTS alphanumreplace;
DELIMITER |
CREATE FUNCTION alphanumreplace( str CHAR(255), d CHAR(32) ) RETURNS CHAR(255)
BEGIN
DECLARE i, len SMALLINT DEFAULT 1;
DECLARE ret CHAR(32) DEFAULT '';
DECLARE c CHAR(1);
SET len = CHAR_LENGTH( str );
REPEAT
BEGIN
SET c = MID( str, i, 1 );
IF c REGEXP '[[:alnum:]]' THEN SET ret=CONCAT(ret,c);
ELSE SET ret=CONCAT(ret,d);
END IF;
SET i = i + 1;
END;
UNTIL i > len END REPEAT;
RETURN ret;
END |
DELIMITER ;
Example:
select 'hello world!',alphanum('hello world!'),alphanumreplace('hello world!','-');
+--------------+--------------------------+-------------------------------------+
| hello world! | alphanum('hello world!') | alphanumreplace('hello world!','-') |
+--------------+--------------------------+-------------------------------------+
| hello world! | helloworld | hello-world- |
+--------------+--------------------------+-------------------------------------+
You'll need to add the alphanum function seperately if you want that, I just have it here for the example.
I tried a few solutions but at the end used replace. My data set is part numbers and I fairly know what to expect. But just for sanity, I used PHP to build the long query:
$dirty = array(' ', '-', '.', ',', ':', '?', '/', '!', '&', '#');
$query = 'part_no';
foreach ($dirty as $dirt) {
$query = "replace($query,'$dirt','')";
}
echo $query;
This outputs something I used to get a headache from:
replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(part_no,' ',''),'-',''),'.',''),',',''),':',''),'?',''),'/',''),'!',''),'&',''),'#','')
if you are using php then....
try{
$con = new PDO ("mysql:host=localhost;dbname=dbasename","root","");
}
catch(PDOException $e){
echo "error".$e-getMessage();
}
$select = $con->prepare("SELECT * FROM table");
$select->setFetchMode(PDO::FETCH_ASSOC);
$select->execute();
while($data=$select->fetch()){
$id = $data['id'];
$column = $data['column'];
$column = preg_replace("/[^a-zA-Z0-9]+/", " ", $column); //remove all special characters
$update = $con->prepare("UPDATE table SET column=:column WHERE id='$id'");
$update->bindParam(':column', $column );
$update->execute();
// echo $column."<br>";
}
the alphanum function (self answered) have a bug, but I don't know why.
For text "cas synt ls 75W140 1L" return "cassyntls75W1401", "L" from the end is missing some how.
Now I use
delimiter //
DROP FUNCTION IF EXISTS alphanum //
CREATE FUNCTION alphanum(prm_strInput varchar(255))
RETURNS VARCHAR(255)
DETERMINISTIC
BEGIN
DECLARE i INT DEFAULT 1;
DECLARE v_char VARCHAR(1);
DECLARE v_parseStr VARCHAR(255) DEFAULT ' ';
WHILE (i <= LENGTH(prm_strInput) ) DO
SET v_char = SUBSTR(prm_strInput,i,1);
IF v_char REGEXP '^[A-Za-z0-9]+$' THEN
SET v_parseStr = CONCAT(v_parseStr,v_char);
END IF;
SET i = i + 1;
END WHILE;
RETURN trim(v_parseStr);
END
//
(found on google)
Probably a silly suggestion compared to others:
if(!preg_match("/^[a-zA-Z0-9]$/",$string)){
$sortedString=preg_replace("/^[a-zA-Z0-9]+$/","",$string);
}

How to get only Digits from String in mysql?

I have some string output which contain alphanumeric value. I want to get only Digits from that string. how can I fetch this by query? which MySql function can I Use?
My query is like :
select DISTINCT SUBSTRING(referrerURL,71,6)
from hotshotsdblog1.annonymoustracking
where advertiserid = 10
limit 10;
Output :
100683
101313
19924&
9072&h
12368&
5888&h
10308&
100664
1&hash
101104
And I Want output like :
100683
101313
19924
9072
12368
5888
10308
100664
1
101104
If the string starts with a number, then contains non-numeric characters, you can use the CAST() function or convert it to a numeric implicitly by adding a 0:
SELECT CAST('1234abc' AS UNSIGNED); -- 1234
SELECT '1234abc'+0; -- 1234
To extract numbers out of an arbitrary string you could add a custom function like this:
DELIMITER $$
CREATE FUNCTION `ExtractNumber`(in_string VARCHAR(50))
RETURNS INT
NO SQL
BEGIN
DECLARE ctrNumber VARCHAR(50);
DECLARE finNumber VARCHAR(50) DEFAULT '';
DECLARE sChar VARCHAR(1);
DECLARE inti INTEGER DEFAULT 1;
IF LENGTH(in_string) > 0 THEN
WHILE(inti <= LENGTH(in_string)) DO
SET sChar = SUBSTRING(in_string, inti, 1);
SET ctrNumber = FIND_IN_SET(sChar, '0,1,2,3,4,5,6,7,8,9');
IF ctrNumber > 0 THEN
SET finNumber = CONCAT(finNumber, sChar);
END IF;
SET inti = inti + 1;
END WHILE;
RETURN CAST(finNumber AS UNSIGNED);
ELSE
RETURN 0;
END IF;
END$$
DELIMITER ;
Once the function is defined, you can use it in your query:
SELECT ExtractNumber("abc1234def") AS number; -- 1234
To whoever is still looking, use regex:
select REGEXP_SUBSTR(name,"[0-9]+") as amount from `subscriptions`
Here I got success with this function:
select REGEXP_REPLACE('abc12.34.56-ghj^-_~##!', '[^0-9]+', '')
output: 123456
Explaining: basically I'm asking for mysql replace all 'not numbers' in interval from 0 to 9 to ''.
Based on Eugene Yarmash Answer. Here is a version of the custom function that extracts a decimal with two decimal places. Good for price extraction.
DELIMITER $$
CREATE FUNCTION `ExtractDecimal`(in_string VARCHAR(255))
RETURNS decimal(15,2)
NO SQL
BEGIN
DECLARE ctrNumber VARCHAR(255);
DECLARE in_string_parsed VARCHAR(255);
DECLARE digitsAndDotsNumber VARCHAR(255) DEFAULT '';
DECLARE finalNumber VARCHAR(255) DEFAULT '';
DECLARE sChar VARCHAR(1);
DECLARE inti INTEGER DEFAULT 1;
DECLARE digitSequenceStarted boolean DEFAULT false;
DECLARE negativeNumber boolean DEFAULT false;
-- FIX FIND_IN_SET cannot find a comma ","
SET in_string_parsed = replace(in_string,',','.');
IF LENGTH(in_string_parsed) > 0 THEN
-- extract digits and dots
WHILE(inti <= LENGTH(in_string_parsed)) DO
SET sChar = SUBSTRING(in_string_parsed, inti, 1);
SET ctrNumber = FIND_IN_SET(sChar, '0,1,2,3,4,5,6,7,8,9,.');
IF ctrNumber > 0 AND (sChar != '.' OR LENGTH(digitsAndDotsNumber) > 0) THEN
-- add first minus if needed
IF digitSequenceStarted = false AND inti > 1 AND SUBSTRING(in_string_parsed, inti-1, 1) = '-' THEN
SET negativeNumber = true;
END IF;
SET digitSequenceStarted = true;
SET digitsAndDotsNumber = CONCAT(digitsAndDotsNumber, sChar);
ELSEIF digitSequenceStarted = true THEN
SET inti = LENGTH(in_string_parsed);
END IF;
SET inti = inti + 1;
END WHILE;
-- remove dots from the end of number list
SET inti = LENGTH(digitsAndDotsNumber);
WHILE(inti > 0) DO
IF(SUBSTRING(digitsAndDotsNumber, inti, 1) = '.') THEN
SET digitsAndDotsNumber = SUBSTRING(digitsAndDotsNumber, 1, inti-1);
SET inti = inti - 1;
ELSE
SET inti = 0;
END IF;
END WHILE;
-- extract decimal
SET inti = 1;
WHILE(inti <= LENGTH(digitsAndDotsNumber)-3) DO
SET sChar = SUBSTRING(digitsAndDotsNumber, inti, 1);
SET ctrNumber = FIND_IN_SET(sChar, '0,1,2,3,4,5,6,7,8,9');
IF ctrNumber > 0 THEN
SET finalNumber = CONCAT(finalNumber, sChar);
END IF;
SET inti = inti + 1;
END WHILE;
SET finalNumber = CONCAT(finalNumber, RIGHT(digitsAndDotsNumber, 3));
IF negativeNumber = true AND LENGTH(finalNumber) > 0 THEN
SET finalNumber = CONCAT('-', finalNumber);
END IF;
IF LENGTH(finalNumber) = 0 THEN
RETURN 0;
END IF;
RETURN CAST(finalNumber AS decimal(15,2));
ELSE
RETURN 0;
END IF;
END$$
DELIMITER ;
Tests:
select ExtractDecimal("1234"); -- 1234.00
select ExtractDecimal("12.34"); -- 12.34
select ExtractDecimal("1.234"); -- 1234.00
select ExtractDecimal("1,234"); -- 1234.00
select ExtractDecimal("1,111,234"); -- 11111234.00
select ExtractDecimal("11,112,34"); -- 11112.34
select ExtractDecimal("11,112,34 and 123123"); -- 11112.34
select ExtractDecimal("-1"); -- -1.00
select ExtractDecimal("hello. price is 123"); -- 123.00
select ExtractDecimal("123,45,-"); -- 123.45
Here is my improvement over ExtractNumber function by Eugene Yarmash.
It strips not only non-digit characters, but also HTML entities like &#[0-9];, which should be considered as non-digit unicode characters too.
Here is the code without UDP on pure MySQL <8.
CREATE DEFINER = 'user'#'host' FUNCTION `extract_number`(
str CHAR(255)
)
RETURNS char(255) CHARSET utf8mb4 COLLATE utf8mb4_unicode_ci
DETERMINISTIC
NO SQL
SQL SECURITY DEFINER
COMMENT ''
BEGIN
DECLARE tmp VARCHAR(255);
DECLARE res VARCHAR(255) DEFAULT "";
DECLARE chr VARCHAR(1);
DECLARE len INTEGER UNSIGNED DEFAULT LENGTH(str);
DECLARE i INTEGER DEFAULT 1;
IF len > 0 THEN
WHILE i <= len DO
SET chr = SUBSTRING(str, i, 1);
/* remove &#...; */
IF "&" = chr AND "#" = SUBSTRING(str, i+1, 1) THEN
WHILE (i <= len) AND (";" != SUBSTRING(str, i, 1)) DO
SET i = i + 1;
END WHILE;
END IF;
SET tmp = FIND_IN_SET(chr, "0,1,2,3,4,5,6,7,8,9");
IF tmp > 0 THEN
SET res = CONCAT(res, chr);
END IF;
SET i = i + 1;
END WHILE;
RETURN res;
END IF;
RETURN 0;
END;
But if you are using UDP's PREG_REPLACE, you can use just following line:
RETURN PREG_REPLACE("/[^0-9]/", "", PREG_REPLACE("/&#[0-9]+;/", "", str));
I have rewritten this for MemSQL Syntax:
DROP FUNCTION IF EXISTS GetNumeric;
DELIMITER //
CREATE FUNCTION GetNumeric(str CHAR(255)) RETURNS CHAR(255) AS
DECLARE i SMALLINT = 1;
DECLARE len SMALLINT = 1;
DECLARE ret CHAR(255) = '';
DECLARE c CHAR(1);
BEGIN
IF str IS NULL
THEN
RETURN "";
END IF;
WHILE i < CHAR_LENGTH( str ) + 1 LOOP
BEGIN
c = SUBSTRING( str, i, 1 );
IF c BETWEEN '0' AND '9' THEN
ret = CONCAT(ret,c);
END IF;
i = i + 1;
END;
END LOOP;
RETURN ret;
END //
DELIMITER ;
SELECT GetNumeric('abc123def456xyz789') as test;
Based on Eugene Yarmash and Martins Balodis answers.
In my case, I didn't know whether the source string contains dot as a decimal separator. Although, I knew how the specific column should be treated. E.g. in case value came up as "10,00" hours but not as "1.00" we know that the last delimiter character should be treated as a dot decimal separator. For this purposes, we can rely on a secondary boolean param that specifies how the last comma separator should behave.
DELIMITER $$
CREATE FUNCTION EXTRACT_DECIMAL(
inString VARCHAR(255)
, treatLastCommaAsDot BOOLEAN
) RETURNS varchar(255) CHARSET utf8mb4
NO SQL
DETERMINISTIC
BEGIN
DECLARE ctrNumber VARCHAR(255);
DECLARE inStringParsed VARCHAR(255);
DECLARE digitsAndDotsNumber VARCHAR(255) DEFAULT '';
DECLARE digitsBeforeDotNumber VARCHAR(255) DEFAULT '';
DECLARE digitsAfterDotNumber VARCHAR(255) DEFAULT '';
DECLARE finalNumber VARCHAR(255) DEFAULT '';
DECLARE separatorChar VARCHAR(1) DEFAULT '_';
DECLARE iterChar VARCHAR(1);
DECLARE inti INT DEFAULT 1;
DECLARE digitSequenceStarted BOOLEAN DEFAULT false;
DECLARE negativeNumber BOOLEAN DEFAULT false;
-- FIX FIND_IN_SET cannot find a comma ","
-- We need to separate entered dot from another delimiter characters.
SET inStringParsed = TRIM(REPLACE(REPLACE(inString, ',', separatorChar), ' ', ''));
IF LENGTH(inStringParsed) > 0 THEN
-- Extract digits, dots and delimiter character.
WHILE(inti <= LENGTH(inStringParsed)) DO
-- Might contain MINUS as the first character.
SET iterChar = SUBSTRING(inStringParsed, inti, 1);
SET ctrNumber = FIND_IN_SET(iterChar, CONCAT('0,1,2,3,4,5,6,7,8,9,.,', separatorChar));
-- In case the first extracted character is not '.' and `digitsAndDotsNumber` is set.
IF ctrNumber > 0 AND (iterChar != '.' OR LENGTH(digitsAndDotsNumber) > 0) THEN
-- Add first minus if needed. Note: `inti` at this point will be higher than 1.
IF digitSequenceStarted = FALSE AND inti > 1 AND SUBSTRING(inStringParsed, inti - 1, 1) = '-' THEN
SET negativeNumber = TRUE;
END IF;
SET digitSequenceStarted = TRUE;
SET digitsAndDotsNumber = CONCAT(digitsAndDotsNumber, iterChar);
ELSEIF digitSequenceStarted = true THEN
SET inti = LENGTH(inStringParsed);
END IF;
SET inti = inti + 1;
END WHILE;
-- Search the left part of string until the separator.
-- https://stackoverflow.com/a/43699586
IF (
-- Calculates the amount of delimiter characters.
CHAR_LENGTH(digitsAndDotsNumber)
- CHAR_LENGTH(REPLACE(digitsAndDotsNumber, separatorChar, SPACE(LENGTH(separatorChar)-1)))
) + (
-- Calculates the amount of dot characters.
CHAR_LENGTH(digitsAndDotsNumber)
- CHAR_LENGTH(REPLACE(digitsAndDotsNumber, '.', SPACE(LENGTH(separatorChar)-1)))
) > 0 THEN
-- If dot is present in the string. It doesn't matter for the other characters.
IF LOCATE('.', digitsAndDotsNumber) != FALSE THEN
-- Replace all special characters before the dot.
SET inti = LOCATE('.', digitsAndDotsNumber) - 1;
-- Return the first half of numbers before the last dot.
SET digitsBeforeDotNumber = SUBSTRING(digitsAndDotsNumber, 1, inti);
SET digitsBeforeDotNumber = REPLACE(digitsBeforeDotNumber, separatorChar, '');
SET digitsAfterDotNumber = SUBSTRING(digitsAndDotsNumber, inti + 2, LENGTH(digitsAndDotsNumber) - LENGTH(digitsBeforeDotNumber));
SET digitsAndDotsNumber = CONCAT(digitsBeforeDotNumber, '.', digitsAfterDotNumber);
ELSE
IF treatLastCommaAsDot = TRUE THEN
-- Find occurence of the last delimiter within the string.
SET inti = CHAR_LENGTH(digitsAndDotsNumber) - LOCATE(separatorChar, REVERSE(digitsAndDotsNumber));
-- Break the string into left part until the last occurrence of separator character.
SET digitsBeforeDotNumber = SUBSTRING(digitsAndDotsNumber, 1, inti);
SET digitsBeforeDotNumber = REPLACE(digitsBeforeDotNumber, separatorChar, '');
SET digitsAfterDotNumber = SUBSTRING(digitsAndDotsNumber, inti + 2, LENGTH(digitsAndDotsNumber) - LENGTH(digitsBeforeDotNumber));
-- Remove any dot occurence from the right part.
SET digitsAndDotsNumber = CONCAT(digitsBeforeDotNumber, '.', REPLACE(digitsAfterDotNumber, '.', ''));
ELSE
SET digitsAndDotsNumber = REPLACE(digitsAndDotsNumber, separatorChar, '');
END IF;
END IF;
END IF;
SET finalNumber = digitsAndDotsNumber;
IF negativeNumber = TRUE AND LENGTH(finalNumber) > 0 THEN
SET finalNumber = CONCAT('-', finalNumber);
END IF;
IF LENGTH(finalNumber) = 0 THEN
RETURN 0;
END IF;
RETURN CAST(finalNumber AS DECIMAL(25,5));
ELSE
RETURN 0;
END IF;
END$$
DELIMITER ;
Here are some examples of usage:
--
-- SELECT EXTRACT_DECIMAL('-711,712,34 and 123123', FALSE); -- -71171234.00000
-- SELECT EXTRACT_DECIMAL('1.234', FALSE); -- 1.23400
-- SELECT EXTRACT_DECIMAL('1,234.00', FALSE); -- 1234.00000
-- SELECT EXTRACT_DECIMAL('14 9999,99', FALSE); -- 14999999.00000
-- SELECT EXTRACT_DECIMAL('-149,999.99', FALSE); -- -149999.99000
-- SELECT EXTRACT_DECIMAL('3 536 500.53', TRUE); -- 3536500.53000
-- SELECT EXTRACT_DECIMAL('3,536,500,53', TRUE); -- 3536500.53000
-- SELECT EXTRACT_DECIMAL("-1"); -- -1.00000
-- SELECT EXTRACT_DECIMAL('2,233,536,50053', TRUE); -- 2233536.50053
-- SELECT EXTRACT_DECIMAL('13.01666667', TRUE); -- 13.01667
-- SELECT EXTRACT_DECIMAL('1,00000000', FALSE); -- 100000000.00000
-- SELECT EXTRACT_DECIMAL('1000', FALSE); -- 1000.00000
-- ==================================================================================
Try,
Query level,
SELECT CAST('1&hash' AS UNSIGNED);
for PHP,
echo intval('13213&hash');
For any newcomers with a similar request this should be exactly what you need.
select DISTINCT CONVERT(SUBSTRING(referrerURL,71,6), SIGNED) as `foo`
from hotshotsdblog1.annonymoustracking
where advertiserid = 10
limit 10;
I suggest using a pivot table (e.g., a table that only contains a vector of ordered numbers from 1 to at least the length of the string) and then doing the following:
SELECT group_concat(c.elem separator '')
from (
select b.elem
from
(
select substr('PAUL123f3211',iter.pos,1) as elem
from (select id as pos from t10) as iter
where iter.pos <= LENGTH('PAUL123f3211')
) b
where b.elem REGEXP '^[0-9]+$') c
It can be done in PHP instead.
foreach ($query_result as &$row) {
$row['column_with_numbers'] = (int) filter_var($query_result['column_with_numbers'], FILTER_SANITIZE_NUMBER_INT);
}
Try this in php
$string = '9072&h';
echo preg_replace("/[^0-9]/", '', $string);// output: 9072
or follow this link to do this in MySql
Refer the link

Search set in text column

I have text column geotargeting that contains country codes separated by comma ("US,DE,CA,GB,IT"). If i have this input string "DE,CA,FR", I want to find all rows that contains one of the given countries.
Something like FIND_IN_SET(str,strlist) but i have 2 string lists.
Thanks.
I think only way you have to use a stored procedure. Check this not ran this code but it can help you
SET #myArrayOfValue = 'US,DE,CA,GB,IT';
SET #myArrayOfValue1 = 'DE,CA,FR';
WHILE (LOCATE(',', #myArrayOfValue) > 0)
DO
SET #value = ELT(1, #myArrayOfValue);
SET #value = SUBSTRING(#myArrayOfValue, LOCATE(',',#myArrayOfValue) + 1);
WHILE (LOCATE(',', #myArrayOfValue1 ) > 0)
DO
SET #value1 = ELT(1, #myArrayOfValue1 );
SET #value1 = SUBSTRING(#myArrayOfValue1 , LOCATE(',',#myArrayOfValue1 ) + 1);
END WHILE;
IF(value = #value1) THEN
// the value match.. do something here, may be insert it in a temporary table
END IF;
END WHILE;
DELIMITER $$
CREATE
FUNCTION find_set_in_set (set_1 TEXT, set_2 TEXT)
RETURNS BOOLEAN DETERMINISTIC
BEGIN
SET #i1 = LENGTH(set_1) - LENGTH(REPLACE(set_1, ',', '')) + 1;
WHILE (#i1 > 0) DO
SET #substr1 = SUBSTRING_INDEX(SUBSTRING_INDEX(set_1, ',', #i1), ',', -1);
SET #i1 = #i1 - 1;
SET #i2 = LENGTH(set_2) - LENGTH(REPLACE(set_2, ',', '')) + 1;
WHILE (#i2 > 0) DO
SET #substr2 = SUBSTRING_INDEX(SUBSTRING_INDEX(set_2, ',', #i2), ',', -1);
SET #i2 = #i2 - 1;
IF(#substr1 = #substr2) THEN
RETURN TRUE;
END IF;
END WHILE;
END WHILE;
RETURN FALSE;
END
$$

MySQL : how to remove double or more spaces from a string?

I couldn't find this question for MySQL so here it is:
I need to trim all double or more spaces in a string to 1 single space.
For example:
"The Quick Brown Fox"
should be :
"The Quick Brown Fox"
The function REPLACE(str, " ", " ") only removes double spaces, but leaves multiples spaces when there are more...
Here's an old trick that does not require regular expressions or complicated functions.
You can use the replace function 3 times to handle any number of spaces, like so:
REPLACE('This is my long string',' ','<>')
becomes:
This<>is<><><><>my<><><>long<><><><>string
Then you replace all occurrences of '><' with an empty string '' by wrapping it in another replace:
REPLACE(
REPLACE('This is my long string',' ','<>'),
'><',''
)
This<>is<>my<>long<>string
Then finally one last replace converts the '<>' back to a single space
REPLACE(
REPLACE(
REPLACE('This is my long string',
' ','<>'),
'><',''),
'<>',' ')
This is my long string
This example was created in MYSQL (put a SELECT in front) but works in many languages.
Note that you only ever need the 3 replace functions to handle any number of characters to be replaced.
The shortest and, surprisingly, the fastest solution:
CREATE FUNCTION clean_spaces(str VARCHAR(255)) RETURNS VARCHAR(255)
BEGIN
while instr(str, ' ') > 0 do
set str := replace(str, ' ', ' ');
end while;
return trim(str);
END
I know this question is tagged with mysql, but if you're fortunate enough to use MariaDB you can do this more easily:
SELECT REGEXP_REPLACE(column, '[[:space:]]+', ' ');
DELIMITER //
DROP FUNCTION IF EXISTS DELETE_DOUBLE_SPACES//
CREATE FUNCTION DELETE_DOUBLE_SPACES ( title VARCHAR(250) )
RETURNS VARCHAR(250) DETERMINISTIC
BEGIN
DECLARE result VARCHAR(250);
SET result = REPLACE( title, ' ', ' ' );
WHILE (result <> title) DO
SET title = result;
SET result = REPLACE( title, ' ', ' ' );
END WHILE;
RETURN result;
END//
DELIMITER ;
SELECT DELETE_DOUBLE_SPACES('a b');
This solution isn't very elegant but since you don't have any other option:
UPDATE t1 set str = REPLACE( REPLACE( REPLACE( str, " ", " " ), " ", " " ), " ", " " );
After searching I end up writing a function i.e
drop function if exists trim_spaces;
delimiter $$
CREATE DEFINER=`root`#`localhost` FUNCTION `trim_spaces`(`dirty_string` text, `trimChar` varchar(1))
RETURNS text
LANGUAGE SQL
NOT DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
COMMENT ''
BEGIN
declare cnt,len int(11) ;
declare clean_string text;
declare chr,lst varchar(1);
set len=length(dirty_string);
set cnt=1;
set clean_string='';
while cnt <= len do
set chr=right(left(dirty_string,cnt),1);
if chr <> trimChar OR (chr=trimChar AND lst <> trimChar ) then
set clean_string =concat(clean_string,chr);
set lst=chr;
end if;
set cnt=cnt+1;
end while;
return clean_string;
END
$$
delimiter ;
USAGE:
set #str='------apple--------banana-------------orange---' ;
select trim_spaces( #str,'-')
output: apple-banana-orange-
parameter trimChar to function could by any character that is repeating and you want to remove .
Note it will keep first character in repeating set
cheers :)
For MySQL 8+, you can use REGEXP_REPLACE function:
UPDATE `your_table`
SET `col_to_change`= REGEXP_REPLACE(col_to_change, '[[:space:]]+', ' ');
This is slightly general solution: from
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=56195&whichpage=1
create table t (s sysname)
insert into t select 'The Quick Brown Fox'
-- convert tabs to spaces
update t set s = replace(s, ' ',' ')
where charindex(' ', s) > 0
-- now do the work.
while 1=1
begin
update t
set s = substring(s, 1, charindex(' ', s, 1)-1) + ' ' + ltrim(substring(s,charindex(' ', s, 1), 8000))
where charindex(' ', s, 1) > 0
if ##rowcount = 0
break
end
select s
from t
If the string that you want to convert consists of only alphabets and multiple number of spaces [A-Za-z ]* then the following function will work. I found out a pattern when such strings are converted to hex. Based on that my solution follows. Not so elegant, but it doesn't require any procedures.
unhex(
replace(
replace(
replace(
replace(
replace(
replace(
hex(str)
,204,1014)
,205,1015)
,206,1016)
,207,1017)
,20,'')
,101,20)
)
If you are using php....
try{
$con = new PDO ("mysql:host=localhost;dbname=dbasename","root","");
}
catch(PDOException $e){
echo "error".$e-getMessage();
}
$select = $con->prepare("SELECT * FROM table");
$select->setFetchMode(PDO::FETCH_ASSOC);
$select->execute();
while($data=$select->fetch()){
$id = $data['id'];
$column = $data['column'];
$column = trim(preg_replace('/\s+/',' ', $column)); // remove all extra space
$update = $con->prepare("UPDATE table SET column=:column WHERE id='$id'");
$update->bindParam(':column', $column );
$update->execute();
// echo $column."<br>";
}
Follow my generic function made for MySQL 5.6. My intention was to use regular expression to identify the spaces, CR and LF, however, it is not supported by this version of mysql. So, I had to loop through the string looking for the characters.
CREATE DEFINER=`db_xpto`#`%` FUNCTION `trim_spaces_and_crlf_entire_string`(`StringSuja` text) RETURNS text CHARSET utf8 COLLATE utf8_unicode_ci
DETERMINISTIC
BEGIN
DECLARE StringLimpa TEXT;
DECLARE CaracterAtual, CaracterAnterior TEXT;
DECLARE Contador, TamanhoStringSuja INT;
SET StringLimpa = '';
SET CaracterAtual = '';
SET CaracterAnterior = '';
SET TamanhoStringSuja = LENGTH(StringSuja);
SET Contador = 1;
WHILE Contador <= TamanhoStringSuja DO
SET CaracterAtual = SUBSTRING(StringSuja, Contador, 1);
IF ( CaracterAtual = ' ' AND CaracterAnterior = ' ' ) OR CaracterAtual = '\n' OR CaracterAtual = '\r' THEN
/* DO NOTHING */
SET Contador = Contador;
/* TORNA OS ESPAÇOS DUPLICADOS, CR, LF VISUALIZÁVEIS NO RESULTADO (DEBUG)
IF ( CaracterAtual = ' ' ) THEN SET StringLimpa = CONCAT(StringLimpa, '*');END IF;
IF ( CaracterAtual = '\n' ) THEN SET StringLimpa = CONCAT(StringLimpa, '\\N');END IF;
IF ( CaracterAtual = '\r' ) THEN SET StringLimpa = CONCAT(StringLimpa, '\\R');END IF;
*/
ELSE
/* COPIA CARACTER ATUAL PARA A STRING A FIM DE RECONSTRUÍ-LA SEM OS ESPAÇOS DUPLICADOS */
SET StringLimpa = CONCAT(StringLimpa, CaracterAtual);
/*SET StringLimpa = CONCAT(StringLimpa, Contador, CaracterAtual);*/
SET CaracterAnterior = CaracterAtual;
END IF;
SET Contador = Contador + 1;
END WHILE;
RETURN StringLimpa;
END
In MySQL 8+:
SELECT REGEXP_REPLACE(str, '\\s+', ' ');
you can try removing more tan one space with regex
SELECT REGEXP_REPLACE('This is my long string',' +', ' ');
the result would be this: "This is my long string"

mysql replace regular expression what ever in the braces in a string

Below are the few rows of a column data in my mysql database
Data
test(victoryyyyy)king
java(vaaaarrrryy)side
(vkittt)sap
flex(vuuuuu)
k(vhhhhyyy)kk(abcd)k
In all rows there is random string that starts with
(v
and ends with
)
My task :- I have to replace all string from '(v' to ')' with empty space ( that is ' ') I shouldn't touch other values in the braces , in the above case I should not replace (abcd) in the last row
I mean for the above example the result should be
test king
java side
sap
flex
kkk(abcd)k
Could any one please help me ?
Thank You
Regards
Kiran
Mysql doesn't support regexes for replace tasks.
So you can only use string functions to find and substr necessary part.
I wrote my own function for this and it is working . Thanks every one .
drop FUNCTION replace_v;
CREATE FUNCTION replace_v (village varchar(100)) RETURNS varchar(100)
DETERMINISTIC
BEGIN
DECLARE a INT Default 0 ;
DECLARE lengthofstring INT Default 0 ;
DECLARE returnString varchar(100) Default '' ;
DECLARE charString varchar(100) Default '' ;
DECLARE found char(1) default 'N';
declare seccharString varchar(100) Default '' ;
set lengthofstring = length(village);
simple_loop: LOOP
SET a=a+1;
set charString=substr(village,a,1);
if(charString = '(') then
set seccharString=substr(village,a+1,1);
if( seccharString = 'v' || seccharString = 'V' || seccharString = 'p' || seccharString = 'P'
|| seccharString = 'm' || seccharString = 'M' ) then
set found='y';
end if;
end if ;
if(found='n') then
set returnString = concat (returnString,charString);
end if;
if(charString = ')') then
set found='n';
end if ;
IF a>=lengthofstring THEN
LEAVE simple_loop;
END IF;
END LOOP simple_loop;
if ( found = 'y') then
set returnString =village;
end if;
RETURN (replace( returnString,'&', ' '));
END