I have a MySQL database and I have a query as:
SELECT `id`, `originaltext` FROM `source` WHERE `originaltext` regexp '[0-9][0-9]'
This matches every originaltext that contains a two-digit number.
I need MySQL to return those numbers as a field, so I can manipulate them further.
Ideally I could also add a criterion that the number should be > 20, but I can do that separately as well.
If you want more regular expression power in your database, you can consider using LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library. LIB_MYSQLUDF_PREG is delivered in source code form only. To use it, you'll need to be able to compile it and install it into your MySQL server. Installing this library does not change MySQL's built-in regex support in any way. It merely makes the following additional functions available:
PREG_CAPTURE extracts a regex match from a string.
PREG_POSITION returns the position at which a regular expression matches a string.
PREG_REPLACE performs a search-and-replace on a string.
PREG_RLIKE tests whether a regex matches a string.
All these functions take a regular expression as their first parameter. This regular expression must be formatted like a Perl regular expression operator. E.g. to test if regex matches the subject case insensitively, you'd use the MySQL code PREG_RLIKE('/regex/i', subject). This is similar to PHP's preg functions, which also require the extra // delimiters for regular expressions inside the PHP string.
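As a rough sketch of how this could look for the question's table, assuming LIB_MYSQLUDF_PREG is already compiled and installed, and assuming the two-argument form of PREG_CAPTURE returns the whole match:

```sql
-- Assumption: LIB_MYSQLUDF_PREG is installed on the server.
-- The pattern is wrapped in Perl-style / delimiters, as described above.
SELECT `id`,
       PREG_CAPTURE('/[0-9]{2}/', `originaltext`) AS `twoDigits`
FROM `source`
WHERE PREG_RLIKE('/[0-9]{2}/', `originaltext`);
```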
If you want something simpler, you could adapt this function to better suit your needs.
CREATE FUNCTION REGEXP_EXTRACT(string TEXT, exp TEXT)
-- Extract the first longest string that matches the regular expression
-- If the string is 'ABCD', check all strings and see what matches: 'ABCD', 'ABC', 'AB', 'A', 'BCD', 'BC', 'B', 'CD', 'C', 'D'
-- It's not smart enough to handle things like (A)|(BCD) correctly in that it will return the whole string, not just the matching token.
RETURNS TEXT
DETERMINISTIC
BEGIN
  DECLARE s INT DEFAULT 1;
  DECLARE e INT;
  DECLARE adjustStart TINYINT DEFAULT 1;
  DECLARE adjustEnd TINYINT DEFAULT 1;
  -- Because REGEXP matches anywhere in the string, and we only want the part that matches, adjust the expression to add '^' and '$'
  -- Of course, if those are already there, don't add them, but change the method of extraction accordingly.
  IF LEFT(exp, 1) = '^' THEN
    SET adjustStart = 0;
  ELSE
    SET exp = CONCAT('^', exp);
  END IF;
  IF RIGHT(exp, 1) = '$' THEN
    SET adjustEnd = 0;
  ELSE
    SET exp = CONCAT(exp, '$');
  END IF;
  -- Loop through the string, moving the end pointer back towards the start pointer, then advance the start pointer and repeat
  -- Bail out of the loops early if the original expression started with '^' or ended with '$', since that means the pointers can't move
  WHILE (s <= LENGTH(string)) DO
    SET e = LENGTH(string);
    WHILE (e >= s) DO
      -- The third argument of SUBSTRING is a length, not an end position, so pass e - s + 1
      IF SUBSTRING(string, s, e - s + 1) REGEXP exp THEN
        RETURN SUBSTRING(string, s, e - s + 1);
      END IF;
      IF adjustEnd THEN
        SET e = e - 1;
      ELSE
        SET e = s - 1; -- ugh, such a hack to end it early
      END IF;
    END WHILE;
    IF adjustStart THEN
      SET s = s + 1;
    ELSE
      SET s = LENGTH(string) + 1; -- ugh, such a hack to end it early
    END IF;
  END WHILE;
  RETURN NULL;
END
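One way the function above might be called against the question's table is sketched below. In the mysql command-line client, the CREATE FUNCTION statement itself would also need a temporary DELIMITER change, as the other answers here do, so that the semicolons inside the body are not read as the end of the statement.

```sql
-- Sketch only: assumes REGEXP_EXTRACT has been created as shown above.
-- The WHERE clause skips rows that cannot match, so the slow
-- character-by-character scan only runs where it can succeed.
SELECT `id`,
       REGEXP_EXTRACT(`originaltext`, '[0-9][0-9]') AS `twoDigits`
FROM `source`
WHERE `originaltext` REGEXP '[0-9][0-9]';
```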
Before version 8.0, MySQL has no syntax for extracting text using regular expressions. You can use REGEXP to identify the rows containing two consecutive digits, but to extract them you have to use the ordinary string manipulation functions, which is very difficult in this case.
Alternatives:
Select the entire value from the database then use a regular expression on the client.
Use a different database that has better support for the SQL standard (may not be an option, I know). Then you can use this: SUBSTRING(originaltext from '%#"[0-9]{2}#"%' for '#').
I think the cleaner way is to use REGEXP_SUBSTR():
This extracts the first two consecutive digits:
SELECT REGEXP_SUBSTR(`originalText`,'[0-9]{2}') AS `twoDigits` FROM `source`;
This extracts the first two-digit sequence in the range 20-99 (for example, 1112 returns NULL; 1521 returns 52):
SELECT REGEXP_SUBSTR(`originalText`,'[2-9][0-9]') AS `twoDigits` FROM `source`;
I tested both on MySQL 8.0 and they work. That's all, good luck!
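To cover the "> 20" part of the question as well, the extracted text can be cast to a number and filtered. This is a minimal sketch for MySQL 8.0 or later; it only looks at the first two-digit run in each row.

```sql
-- REGEXP_SUBSTR returns a string, so cast it before comparing numerically.
-- HAVING is used because MySQL allows it to reference the column alias.
SELECT `id`,
       CAST(REGEXP_SUBSTR(`originaltext`, '[0-9]{2}') AS UNSIGNED) AS `twoDigits`
FROM `source`
WHERE `originaltext` REGEXP '[0-9]{2}'
HAVING `twoDigits` > 20;
```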
I'm having the same issue, and this is the solution I found (but it won't work in all cases) :
use LOCATE() to find the beginning and the end of the string you want to match
use MID() to extract the substring in between...
keep the regexp to match only the rows where you are sure to find a match.
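The three steps above could be sketched like this. The markers 'price: ' and ' USD' are hypothetical and stand for whatever fixed text surrounds the value in your data; as noted, this will not work in all cases, for example when the end marker also occurs before the start marker.

```sql
-- Hypothetical layout: the wanted digits sit between 'price: ' and ' USD'.
SELECT `id`,
       MID(`originaltext`,
           -- start just after the leading marker
           LOCATE('price: ', `originaltext`) + CHAR_LENGTH('price: '),
           -- length = position of the end marker minus the start position
           LOCATE(' USD', `originaltext`)
             - LOCATE('price: ', `originaltext`)
             - CHAR_LENGTH('price: ')) AS `extracted`
FROM `source`
-- keep the regexp so only rows that really contain the pattern are processed
WHERE `originaltext` REGEXP 'price: [0-9]+ USD';
```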
I use the following code as a stored function; it should extract any number made up of a single block of digits. It is part of my wider library.
DELIMITER $$
-- 2013.04 michal#glebowski.pl
-- FindNumberInText("ab 234 95 cd", TRUE) => 234
-- FindNumberInText("ab 234 95 cd", FALSE) => 95
DROP FUNCTION IF EXISTS FindNumberInText$$
CREATE FUNCTION FindNumberInText(_input VARCHAR(64), _fromLeft BOOLEAN) RETURNS VARCHAR(32)
BEGIN
  DECLARE _r VARCHAR(32) DEFAULT '';
  DECLARE _i INTEGER DEFAULT 1;
  DECLARE _start INTEGER DEFAULT 0;
  DECLARE _IsCharNumeric BOOLEAN;
  IF NOT _fromLeft THEN SET _input = REVERSE(_input); END IF;
  _loop: REPEAT
    SET _IsCharNumeric = LOCATE(MID(_input, _i, 1), "0123456789") > 0;
    IF _IsCharNumeric THEN
      IF _start = 0 THEN SET _start = _i; END IF;
    ELSE
      IF _start > 0 THEN LEAVE _loop; END IF;
    END IF;
    SET _i = _i + 1;
  UNTIL _i > length(_input) END REPEAT;
  IF _start > 0 THEN
    SET _r = MID(_input, _start, _i - _start);
    IF NOT _fromLeft THEN SET _r = REVERSE(_r); END IF;
  END IF;
  RETURN _r;
END$$
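A short usage sketch for the function above, assuming it was created as shown and that the text fits in the VARCHAR(64) parameter (longer values may be truncated or rejected, depending on the SQL mode):

```sql
-- Restore the normal statement delimiter after creating the function
DELIMITER ;

-- The two calls from the header comments
SELECT FindNumberInText('ab 234 95 cd', TRUE)  AS first_number,  -- 234
       FindNumberInText('ab 234 95 cd', FALSE) AS last_number;   -- 95

-- Applied to the question's table, keeping only numbers greater than 20
SELECT `id`,
       CAST(FindNumberInText(`originaltext`, TRUE) AS UNSIGNED) AS `n`
FROM `source`
WHERE `originaltext` REGEXP '[0-9][0-9]'
HAVING `n` > 20;
```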
If you want to return part of a string:
SELECT id , substring(columnName,(locate('partOfString',columnName)),10) from tableName;
LOCATE() returns the starting position of the matching string, which then becomes the starting position for SUBSTRING().
I know it's been quite a while since this question was asked but came across it and thought it would be a good challenge for my custom regex replacer - see this blog post.
...And the good news is it can, although it needs to be called quite a few times. See this online rextester demo, which shows the workings that got to the SQL below.
SELECT reg_replace(
         reg_replace(
           reg_replace(
             reg_replace(
               reg_replace(
                 reg_replace(
                   reg_replace(txt,
                     '[^0-9]+',
                     ',',
                     TRUE,
                     1, -- Min match length
                     0 -- No max match length
                   ),
                   '([0-9]{3,}|,[0-9],)',
                   '',
                   TRUE,
                   1, -- Min match length
                   0 -- No max match length
                 ),
                 '^[0-9],',
                 '',
                 TRUE,
                 1, -- Min match length
                 0 -- No max match length
               ),
               ',[0-9]$',
               '',
               TRUE,
               1, -- Min match length
               0 -- No max match length
             ),
             ',{2,}',
             ',',
             TRUE,
             1, -- Min match length
             0 -- No max match length
           ),
           '^,',
           '',
           TRUE,
           1, -- Min match length
           0 -- No max match length
         ),
         ',$',
         '',
         TRUE,
         1, -- Min match length
         0 -- No max match length
       ) AS `csv`
FROM tbl;
How do I convert an id INT column in MySQL to a base 62 alphanumeric string?
Basically I really need a MySQL implementation of the following:
http://kvz.io/blog/2009/06/10/create-short-ids-with-php-like-youtube-or-tinyurl/
I needed to do this for larger-than-INT binary data (base-256 UUIDs specifically), so I created this stored function:
DELIMITER //
CREATE FUNCTION base62(x VARBINARY(16)) RETURNS VARCHAR(22) DETERMINISTIC
BEGIN
  DECLARE digits CHAR(62) DEFAULT "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
  DECLARE n NUMERIC(39) DEFAULT 0;
  DECLARE s VARCHAR(22) DEFAULT "";
  DECLARE i INT DEFAULT 1;
  WHILE i <= LENGTH(x) DO
    SET n = n * 256 + ORD(SUBSTR(x, i, 1));
    SET i = i + 1;
  END WHILE;
  WHILE n > 0 DO
    SET s = CONCAT(SUBSTR(digits, (n MOD 62) + 1, 1), s);
    SET n = FLOOR(n / 62);
  END WHILE;
  RETURN s;
END//
You can remove the first loop if you've already got a numeric type. You may also wish to adjust the alphabet; for instance, the standard Base64 alphabet puts letters before digits.
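A possible way to call the function, assuming it was created as shown: a UUID with its dashes removed is 32 hex digits, which UNHEX() turns into the 16 bytes the parameter expects. Note that an input consisting only of zero bytes yields an empty string, since the second loop never runs.

```sql
-- Restore the normal statement delimiter after creating the function
DELIMITER ;

-- Encode a freshly generated UUID as a base-62 string
SELECT base62(UNHEX(REPLACE(UUID(), '-', ''))) AS short_id;
```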
It is better to generate the id with a UUID or auto increment and store it in MySQL as it is, then encode and decode it only when it is used on the front end. You can use this library to generate non-sequential unique ids from numbers.
http://hashids.org
I've been using this MySQL function successfully. It takes a BIGINT parameter as input and returns base-62, YouTube-style IDs.
CREATE FUNCTION `i2s`(
  `_n` BIGINT
)
RETURNS tinytext CHARSET latin1 COLLATE latin1_general_cs
LANGUAGE SQL
NOT DETERMINISTIC
NO SQL
SQL SECURITY DEFINER
COMMENT ''
BEGIN
  declare d char(62) default '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
  declare s tinytext default '0';
  declare i int default 1;
  if _n>0 then set s=''; end if;
  while _n > 0 do
    set s = concat(substr(d, (_n mod 62) + 1, 1),s);
    set _n = floor(_n / 62);
  end while;
  return s;
END
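A few sample calls, worked out by hand from the alphabet above (digits, then upper case, then lower case), may help to check an installation. In the mysql command-line client the CREATE FUNCTION statement would need to be wrapped in a DELIMITER change first.

```sql
SELECT i2s(0)   AS zero,        -- '0'  (the default when _n is not positive)
       i2s(61)  AS last_digit,  -- 'z'  (62nd character of the alphabet)
       i2s(62)  AS two_digits,  -- '10'
       i2s(125) AS sample;      -- '21' (125 = 2 * 62 + 1)
```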
I have a table with two columns, an IP address and a MAC address, both stored as integers. I convert the IP column with INET_NTOA() in the query below. Is there a function I can use to convert my mac column, whose data type is BIGINT, to a MAC address?
SELECT INET_NTOA(ip_address) AS myip,mymac
FROM table1
Assuming that you have stored the MAC address by suppressing all separators and converting the resulting HEX number into int, the conversion from this int to a human readable MAC address would be:
function int2macaddress($int) {
    $hex = base_convert($int, 10, 16);
    while (strlen($hex) < 12)
        $hex = '0'.$hex;
    return strtoupper(implode(':', str_split($hex,2)));
}
The function is taken from http://www.onurguzel.com/storing-mac-address-in-a-mysql-database/
The MySQL version for this function:
delimiter $$
create function itomac (i BIGINT)
  returns char(20)
  language SQL
begin
  declare temp CHAR(20);
  set temp = lpad (hex (i), 12, '0');
  return concat (left (temp, 2),':',mid(temp,3,2),':',mid(temp,5,2),':',mid(temp,7,2),':',mid(temp,9,2),':',mid(temp,11,2));
end;
$$
delimiter ;
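Once the function exists, it could be used like this. The first call uses the sample value 1234567890, which is 499602D2 in hexadecimal; the second applies it to the table and column names from the question.

```sql
-- 1234567890 = 0x499602D2, left-padded to 12 hex digits
SELECT itomac(1234567890) AS mac;  -- '00:00:49:96:02:D2'

-- Applied to the question's query
SELECT INET_NTOA(ip_address) AS myip,
       itomac(mymac)         AS mymac_text
FROM table1;
```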
You can also do it directly in SQL, like this:
select
  concat (left (b.mh, 2),':',mid(b.mh,3,2),':',mid(b.mh,5,2),':',mid(b.mh,7,2),':',mid(b.mh,9,2),':',mid(b.mh,11,2))
from (
  select lpad (hex (a.mac_as_int), 12, '0') as mh
  from (
    select 1234567890 as mac_as_int
  ) a
) b
Just use HEX():
For a numeric argument N, HEX() returns a hexadecimal string representation of the value of N treated as a longlong (BIGINT) number.
Therefore, in your case:
SELECT INET_NTOA(ip_address) AS myip, HEX(mymac)
FROM table1
Note that this won't insert byte delimiters, such as colon characters.
I have a stored procedure that needs to convert hexadecimal numbers to their decimal equivalent. I've read the documentation for the UNHEX() function, but it is returning a binary value. What I'm wanting to do is something like this:
CREATE PROCEDURE foo( hex_val VARCHAR(10) )
BEGIN
  DECLARE dec_val INTEGER;
  SET dec_val = UNHEX( hex_val );
  -- Do something with the decimal value
  select dec_val;
END
What am I missing? How can I convert the UNHEX()'d value to an unsigned integer?
You can use the CONV() function to convert between bases.
SET dec_val = CONV(hex_val, 16, 10);
conv(hex_val, 16, 10)
This converts a number from base 16 to base 10. The UNHEX function does something completely different: it converts pairs of hex digits to characters.
cast(conv(hex_val, 16, 10) as unsigned integer)
That should solve the problem, since CONV() returns a string and the cast turns it into an unsigned integer.
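Putting the answers together, the procedure from the question might end up looking like the sketch below. The variable is declared as BIGINT UNSIGNED here, which is a change from the question's INTEGER: a VARCHAR(10) hex string can hold up to 40 bits, more than a signed INTEGER can store.

```sql
DELIMITER //
CREATE PROCEDURE foo(hex_val VARCHAR(10))
BEGIN
  -- Wider than the question's INTEGER so that 10 hex digits always fit
  DECLARE dec_val BIGINT UNSIGNED;
  -- CONV returns a string; CAST turns it into an unsigned integer
  SET dec_val = CAST(CONV(hex_val, 16, 10) AS UNSIGNED);
  -- Do something with the decimal value
  SELECT dec_val;
END//
DELIMITER ;

CALL foo('FF');  -- returns 255
```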