How to generate a UUIDv4 in MySQL? - mysql

MySQL's UUID function returns a UUIDv1 GUID. I'm looking for an easy way to generate random GUIDs (i.e. UUIDv4) in SQL.

I've spent quite some time looking for a solution and came up with the following
mysql function that generates a random UUID (i.e. UUIDv4) using standard MySQL
functions. I'm answering my own question to share that in the hope that it'll be
useful.
-- Change the delimiter so that semicolons in the function body don't end the declaration
DELIMITER //
CREATE FUNCTION uuid_v4()
    RETURNS CHAR(36) NO SQL
BEGIN
    -- Generate the 2-byte (4 hex digit) strings that we will combine into a UUIDv4.
    -- Multiplying by 0x10000 rather than 0xffff lets FLOOR() reach 0xffff as well.
    SET @h1 = LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0');
    SET @h2 = LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0');
    SET @h3 = LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0');
    SET @h6 = LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0');
    SET @h7 = LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0');
    SET @h8 = LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0');
    -- 4th section starts with a 4 indicating the version
    SET @h4 = CONCAT('4', LPAD(HEX(FLOOR(RAND() * 0x1000)), 3, '0'));
    -- 5th section: the first half-byte can only be 8, 9, A or B
    SET @h5 = CONCAT(HEX(FLOOR(RAND() * 4 + 8)),
                     LPAD(HEX(FLOOR(RAND() * 0x1000)), 3, '0'));
    -- Build the complete UUID
    RETURN LOWER(CONCAT(
        @h1, @h2, '-', @h3, '-', @h4, '-', @h5, '-', @h6, @h7, @h8
    ));
END
//
-- Switch back the delimiter
DELIMITER ;
Note: The pseudo-random number generation used (MySQL's RAND) is not
cryptographically secure and thus has some bias which can increase the collision
risk.
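For comparison, here is a minimal Python sketch of the same layout, assuming the bit positions from RFC 4122 section 4.4. The helper name `uuid_v4_from_bytes` is made up for this illustration; it only shows which nibbles the SQL function fixes (the version digit and the variant digit), and it is not part of the answer above.

```python
import secrets
import uuid


def uuid_v4_from_bytes(raw: bytes) -> str:
    """Format 16 bytes as a version-4 UUID string (hypothetical helper)."""
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes")
    b = bytearray(raw)
    b[6] = (b[6] & 0x0F) | 0x40  # high nibble of byte 6 becomes the version digit 4
    b[8] = (b[8] & 0x3F) | 0x80  # top two bits of byte 8 become 10 (digit 8, 9, a or b)
    h = b.hex()
    return f"{h[0:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:32]}"


if __name__ == "__main__":
    generated = uuid_v4_from_bytes(secrets.token_bytes(16))
    # The standard library agrees that the result is a version-4 UUID.
    print(generated, uuid.UUID(generated).version)
```

Feeding it all-zero bytes gives 00000000-0000-4000-8000-000000000000, which makes the two fixed positions easy to see.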

Both existing answers rely on MySQL's RAND() function:
RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.
In practice, this means that UUIDs generated with this function may be biased, and collisions can occur more frequently than expected.
Solution
It's possible to generate a safe UUIDv4 on the MySQL side using the RANDOM_BYTES() function:
This function returns a binary string of len random bytes generated using the random number generator of the SSL library.
So we can update the function to:
CREATE FUNCTION uuid_v4s()
    RETURNS CHAR(36)
BEGIN
    -- 1st and 2nd blocks are made of 6 random bytes
    SET @h1 = HEX(RANDOM_BYTES(4));
    SET @h2 = HEX(RANDOM_BYTES(2));
    -- 3rd block starts with a 4 indicating the version; the remainder is random
    SET @h3 = SUBSTR(HEX(RANDOM_BYTES(2)), 2, 3);
    -- 4th block: the first nibble can only be 8, 9, A or B; the remainder is random
    SET @h4 = CONCAT(HEX(FLOOR(ASCII(RANDOM_BYTES(1)) / 64) + 8),
                     SUBSTR(HEX(RANDOM_BYTES(2)), 2, 3));
    -- 5th block is made of 6 random bytes
    SET @h5 = HEX(RANDOM_BYTES(6));
    -- Build the complete UUID
    RETURN LOWER(CONCAT(
        @h1, '-', @h2, '-4', @h3, '-', @h4, '-', @h5
    ));
END
This should generate UUIDv4 values that are random enough that collisions are not a practical concern.
NOTE: Unfortunately MariaDB doesn't support RANDOM_BYTES() (See https://mariadb.com/kb/en/function-differences-between-mariadb-105-and-mysql-80/#miscellaneous)
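The variant digit in the function above comes from the expression HEX(FLOOR(ASCII(RANDOM_BYTES(1)) / 64) + 8). As a quick sanity check, the following Python sketch mirrors that arithmetic for every possible byte value; the name `variant_nibble` is invented for this illustration. It suggests the mapping is uniform, with each of 8, 9, A and B produced by exactly 64 of the 256 byte values.

```python
from collections import Counter


def variant_nibble(byte_value: int) -> str:
    """Mirror HEX(FLOOR(byte / 64) + 8) for a single byte value in 0..255."""
    return format(byte_value // 64 + 8, "X")


# Tally the resulting hex digit over all 256 possible byte values.
counts = Counter(variant_nibble(b) for b in range(256))

if __name__ == "__main__":
    print(dict(counts))  # {'8': 64, '9': 64, 'A': 64, 'B': 64}
```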
Test
I created the following test scenario: insert random UUIDv4 values as the primary key of a table until 40,000,000 rows exist. When a collision is found, the existing row is updated by incrementing its collisions column:
INSERT INTO test (uuid) VALUES (uuid_v4()) ON DUPLICATE KEY UPDATE collisions=collisions+1;
The sum of collisions after 40 million rows with each function is:
+----------+----------------+
| RAND() | RANDOM_BYTES() |
+----------+----------------+
| 55 | 0 |
+----------+----------------+
The number of collisions in both scenarios tends to increase as the number of rows grows.
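To put the 55 collisions in perspective, here is a rough birthday-bound estimate in Python. It assumes an idealised generator in which every identifier carries a fixed number of independent, uniformly random bits (122 for a UUIDv4: 128 bits minus 4 version bits and 2 variant bits). The function names are made up for this sketch, and the model is only an approximation of how RAND() behaves.

```python
import math


def expected_collisions(rows: int, random_bits: int) -> float:
    """Approximate expected number of colliding pairs: n*(n-1)/2 divided by 2**bits."""
    return rows * (rows - 1) / 2 / 2.0 ** random_bits


def effective_bits(rows: int, observed_collisions: int) -> float:
    """Invert the estimate: how many uniform random bits would explain the observed count."""
    return math.log2(rows * (rows - 1) / 2 / observed_collisions)


if __name__ == "__main__":
    # Ideal UUIDv4 at 40 million rows: far below one expected collision.
    print(expected_collisions(40_000_000, 122))
    # Under this simplified model, 55 collisions correspond to only about 44 random bits.
    print(effective_bits(40_000_000, 55))
```

With 122 truly random bits the expected count at 40 million rows is on the order of 1e-22, so observing any collision at all points at the generator rather than at bad luck.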

In the off chance you're working with a DB and don't have perms to create functions, here's the same version as above that works just as a SQL expression:
-- 0x10000 and 0x1000 (rather than 0xffff and 0x0fff) let FLOOR() reach the top value of each range
SELECT LOWER(CONCAT(
    LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0'),
    LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0'), '-',
    LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0'), '-',
    '4',
    LPAD(HEX(FLOOR(RAND() * 0x1000)), 3, '0'), '-',
    HEX(FLOOR(RAND() * 4 + 8)),
    LPAD(HEX(FLOOR(RAND() * 0x1000)), 3, '0'), '-',
    LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0'),
    LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0'),
    LPAD(HEX(FLOOR(RAND() * 0x10000)), 4, '0')));

Adaptation of Elias Soares's answer using RANDOM_BYTES without creating a DB function:
SELECT LOWER(CONCAT(
HEX(RANDOM_BYTES(4)), '-',
HEX(RANDOM_BYTES(2)), '-4',
SUBSTR(HEX(RANDOM_BYTES(2)), 2, 3), '-',
CONCAT(HEX(FLOOR(ASCII(RANDOM_BYTES(1)) / 64)+8),SUBSTR(HEX(RANDOM_BYTES(2)), 2, 3)), '-',
HEX(RANDOM_BYTES(6))
))

Related

Mysql Export in sql format with custom where condition

I have requirement in my project to extract the rows in sql format. I have multiple select statements with same WHERE condition. Is it possible to set the WHERE condition in a param and make use of it in the select statements?
@set maxrows -1;
@export on;
@export set filename="C:\test\test12.sql" appendfile="true" format="sql";
SELECT ee.employeeId, ee.firstname, ee.lastName, ee.designation
FROM employee ee
WHERE ee.employeeId in (1, 2, 3, 4);
SELECT lib.employeeId, lib.bookId
FROM library lib
WHERE lib.employeeId in (1, 2, 3, 4);
SELECT lv.employeeId, lv.LeaveApplicationDate, lv.NoOfLeaves
FROM Leaves lv
WHERE lv.employeeId in (1, 2, 3, 4);
@export off;
As you can observe here, the WHERE condition is constant for each of SELECT Statements.
Is it possible to set in a param and make use of the same in all the select statements?
Example (below will not work):
@set maxrows -1;
@set employeeIds = 1, 2, 3, 4;
@export on;
@export set filename="c:\Work\test124536.sql" appendfile="true" format="sql";
SELECT ee.employeeId, ee.firstname, ee.lastName, ee.designation
FROM employee ee
WHERE ee.employeeId in (employeeIds);
@export off;

MySQL Adding two binary values stored in user defined variables

I have a SQL script....
SET @lat = 0;
SET @lat = (SELECT (CONV(SUBSTRING(data, 5,8),16,2)) FROM transaction_wtrax WHERE `show` = 0);
SET @lat = REPLACE(@lat, 1, 2);
SET @lat = REPLACE(@lat, 0, 1);
SET @lat = REPLACE(@lat, 2, 0);
The above results in a binary value for @lat.
I would like to add the value 1 to @lat.
I can add two binary literals by preceding the values with 0b,
i.e. SELECT 0b10001 + 0b1 (this works 100%).
However, the following fails to add binary when you are working with user-defined variables:
SELECT @lat + 0b1 or SELECT CONCAT('0b', @lat) + 0b1 (this does not work).
How can I add my @lat to 0b1?
Thank you.
You can use CONV to convert it to decimal, do the add and then convert back to binary. CONV doesn't mind if its input value is an integer or a string.
SELECT CONV(CONV(@lat, 2, 10) + 1, 10, 2)
e.g.
SELECT CONV(CONV('00000100100011001110000001000111',2,10)+1,10,2)
Output:
100100011001110000001001000
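The same round trip (parse the binary string, add in ordinary arithmetic, format back to binary) can be sketched in Python to check the output above. The function name is made up for this illustration. Like CONV, it drops leading zeros; the optional width argument, an addition of this sketch, pads them back.

```python
def add_to_binary_string(bits: str, increment: int = 1, width: int = 0) -> str:
    """Add an integer to a string of 0s and 1s and return the sum as a binary string."""
    total = int(bits, 2) + increment      # base-2 string -> integer, then add
    return format(total, "b").zfill(width)  # integer -> base-2 string, optionally padded


if __name__ == "__main__":
    value = "00000100100011001110000001000111"
    print(add_to_binary_string(value))                    # 100100011001110000001001000
    print(add_to_binary_string(value, width=len(value)))  # same value, padded to 32 digits
```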

MySQL - my months are current stored 0-11

I thought I ran into a bug with MySQL 5.1, but the bug was in the Perl code that's creating the timestamps. Perl's localtime uses 0-11 for months, but MySQL's datetime uses 1-12. So, I've got all these malformed timestamps that I need to update.
2012-00-19 09:03:30
This should be:
2012-01-19 09:03:30
The problem is that the date functions for MySQL return NULL on a 00 month. Is there a way to do this in MySQL?
EDIT: Solution =
UPDATE test_stats
SET start_time = CAST(CONCAT(SUBSTRING(start_time, 1, 5),
CAST((CAST(SUBSTRING(start_time, 6, 2) AS UNSIGNED) + 1) AS CHAR(2)),
SUBSTRING(start_time, 8, 12)) AS DATETIME);
By the way, I was using MySQL 5.1
This should work:
UPDATE MyTable
SET DateTimeField =
CAST (
SUBSTRING(DateTimeString, 1, 5) -- '2012-'
+ CAST((CAST(SUBSTRING(DateTimeString, 6, 2) AS INT) + 1) AS VARCHAR) -- '00' => '1'
+ SUBSTRING(DateTimeString, 8, 12) -- '-19 09:03:30'
AS DATETIME)
Test with this select
DECLARE @x VARCHAR(50) = '2012-00-19 09:03:30'
SELECT CAST(SUBSTRING(@x, 1, 5)
+ CAST((CAST(SUBSTRING(@x, 6, 2) AS INT) + 1) AS VARCHAR)
+ SUBSTRING(@x, 8, 12) AS DATETIME)
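The string surgery in both versions (keep the year, add 1 to the two-digit month, keep the rest) can be checked outside the database. Here is a small Python sketch under the assumption that every stored value has the exact shape YYYY-MM-DD HH:MM:SS with a zero-based month; the function name is invented for this illustration.

```python
import re
from datetime import datetime

# Year, zero-based month, and the untouched remainder of the timestamp.
_TIMESTAMP = re.compile(r"(\d{4})-(\d{2})-(\d{2} \d{2}:\d{2}:\d{2})")


def fix_zero_based_month(timestamp: str) -> datetime:
    """Shift a 0-11 month to 1-12 and parse the result as a real datetime."""
    match = _TIMESTAMP.fullmatch(timestamp)
    if match is None:
        raise ValueError(f"unexpected timestamp format: {timestamp!r}")
    year, month, rest = match.group(1), int(match.group(2)) + 1, match.group(3)
    return datetime.strptime(f"{year}-{month:02d}-{rest}", "%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(fix_zero_based_month("2012-00-19 09:03:30"))  # 2012-01-19 09:03:30
```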

How can I speed up my MySQL UUID v4 stored function?

I'm attempting to write a MySQL stored function to generate v4 UUIDs as described in RFC 4122's section 4.4 ( http://www.ietf.org/rfc/rfc4122.txt ). My initial naive effort after a few tweaks is the following:
CREATE FUNCTION UUID_V4()
RETURNS BINARY(16)
READS SQL DATA
BEGIN
    SET @uuid = CONCAT(
        LPAD( HEX( FLOOR( RAND() * 4294967296 ) ), 8, '0' ),
        LPAD( HEX( FLOOR( RAND() * 4294967296 ) ), 8, '0' ),
        LPAD( HEX( FLOOR( RAND() * 4294967296 ) ), 8, '0' ),
        LPAD( HEX( FLOOR( RAND() * 4294967296 ) ), 8, '0' )
    );
    SET @uuid = CONCAT(
        SUBSTR( @uuid FROM 1 FOR 12 ),
        '4',
        SUBSTR( @uuid FROM 14 FOR 3 ),
        SUBSTR( 'ab89' FROM FLOOR( 1 + RAND() * 4 ) FOR 1 ),
        SUBSTR( @uuid FROM 18 )
    );
    RETURN UNHEX(@uuid);
END
The above function is quite slow: almost 100 times slower than the built-in UUID(), according to MySQL's BENCHMARK() feature. Short of writing a UDF using MySQL's C API, are there any improvements I can make here to, say, shave off an order of magnitude from its runtime?
If there is an already existing, well-regarded UUID UDF or stored procedure, I'd be happy to hear about that, too.
I didn't test this for correctness or performance. The idea is simply to do a single concatenation instead of two.
create function uuid_v4()
returns binary(16)
begin
    set @h1 = lpad(hex(floor(rand() * 4294967296)), 8, '0');
    set @h2 = lpad(hex(floor(rand() * 4294967296)), 8, '0');
    set @h3 = lpad(hex(floor(rand() * 4294967296)), 8, '0');
    set @h4 = lpad(hex(floor(rand() * 4294967296)), 8, '0');
    set @uuid = concat(
        @h1,
        substr(@h2 from 1 for 4),
        '4',
        substr(@h2 from 6),
        substr('ab89' from floor(1 + rand() * 4) for 1 ),
        substr(@h3 from 2),
        @h4
    );
    return unhex(@uuid);
end
;
Also why do you use READS SQL DATA in your function?

Implementing parts of rfc4226 (HOTP) in mysql

Like the title says, I'm trying to implement the programmatic parts of RFC4226 "HOTP: An HMAC-Based One-Time Password Algorithm" in SQL. I think I've got a version that works (in that for a small test sample, it produces the same result as the Java version in the code), but it contains a nested pair of hex(unhex()) calls, which I feel can be done better. I am constrained by a) needing to do this algorithm, and b) needing to do it in mysql, otherwise I'm happy to look at other ways of doing this.
What I've got so far:
-- From the inside out...
-- Concatenate the user's secret and the number of times it's been used
-- find the SHA1 hash of that string
-- Turn the 40-character hex encoding into a 20-byte binary string
-- keep the first 4 bytes
-- turn those back into a hex representation
-- convert that into an integer
-- Throw away the most-significant bit (solves signed/unsigned problems)
-- Truncate to 6 digits
-- store into otp
-- from the otpsecrets table
select (conv(hex(substr(unhex(sha1(concat(secret, uses))), 1, 4)), 16, 10) & 0x7fffffff) % 1000000
into otp
from otpsecrets;
Is there a better (more efficient) way of doing this?
I haven't read the spec, but I think you don't need to convert back and forth between hex and binary, so this might be a little more efficient:
SELECT (conv(substr(sha1(concat(secret, uses)), 1, 8), 16, 10) & 0x7fffffff) % 1000000
INTO otp
FROM otpsecrets;
This seems to give the same result as your query for a few examples I tested.
This is absolutely horrific, but it works with my 6-digit OTP tokens. Call as:
select HOTP( floor( unix_timestamp()/60), secret ) 'OTP' from SecretKeyTable;
drop function if exists HOTP;
delimiter //
CREATE FUNCTION HOTP(C integer, K BINARY(64)) RETURNS char(6)
BEGIN
declare i INTEGER;
declare ipad BINARY(64);
declare opad BINARY(64);
declare hmac BINARY(20);
declare cbin BINARY(8);
set i = 1;
set ipad = repeat( 0x36, 64 );
set opad = repeat( 0x5c, 64 );
repeat
set ipad = insert( ipad, i, 1, char( ascii( substr( K, i, 1 ) ) ^ 0x36 ) );
set opad = insert( opad, i, 1, char( ascii( substr( K, i, 1 ) ) ^ 0x5C ) );
set i = i + 1;
until (i > 64) end repeat;
set cbin = unhex( lpad( hex( C ), 16, '0' ) );
set hmac = unhex( sha1( concat( opad, unhex( sha1( concat( ipad, cbin ) ) ) ) ) );
return lpad( (conv(hex(substr( hmac, (ascii( right( hmac, 1 ) ) & 0x0f) + 1, 4 )),16,10) & 0x7fffffff) % 1000000, 6, '0' );
END
//
delimiter ;
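When checking a SQL implementation like the one above, it can help to compare its output with a short reference outside the database. The following Python sketch follows the steps described in RFC 4226 (HMAC-SHA1 over the 8-byte big-endian counter, dynamic truncation, then reduction modulo 10^digits) using only the standard library. The function name and its parameters are choices made for this illustration, not something taken from the answers.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HOTP value as a zero-padded decimal string."""
    # HMAC-SHA1 of the counter encoded as an 8-byte big-endian integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects a 4-byte window.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # Test vectors from RFC 4226, Appendix D (secret "12345678901234567890").
    print(hotp(b"12345678901234567890", 0))  # 755224
    print(hotp(b"12345678901234567890", 1))  # 287082
```

Running the SQL function with the same secret and counter should give the same digits if it implements the RFC faithfully.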