Best way for Unique Random String for MySQL Long table

I know how to create random characters in both PHP and MySQL, but I need to create a 4-character random string for each row of a table with about 10 thousand rows. What is the best way to make sure the strings remain unique?
I can use a longer string if I need to, but not longer than 12 characters.
To keep it simple: the table already exists. I need to add an extra column and fill it with a 4-character random string, and the values must remain unique.

An option:
Put all your possible characters in a table with only one column.
val
------
0
1
...
9
a
b
...
z
Use this query
SELECT CONCAT(a.val,b.val,c.val,d.val)
FROM chars AS a
JOIN chars AS b
JOIN chars AS c
JOIN chars AS d
ORDER BY RAND()
LIMIT 10000
On the other hand, if you need to get one ID at a time, I see two approaches.
A. If you have a lot of unassigned IDs available.
In this case you just generate an ID and see if it's free. If not, try another one.
B. If you want to keep your assigned IDs and the available IDs at the same order of magnitude.
In this case it would be best to pre-generate all your IDs, shuffle them, and when you need one just pick the next available one. Say, put them all in a table, and when you assign one from that table, remove it so it can't be picked again.
If your allowed characters are 0-9a-z, this means the table will hold 36^4 = 1,679,616 rows. That's just a few MB of raw string data.
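For the original task (adding a column to an existing table and filling it), the shuffled pool can be matched to the existing rows by position. This is a minimal sketch, not a tested migration. It assumes MySQL 8.0 or later (for ROW_NUMBER()), the chars table described above with distinct values in val, and a hypothetical table my_table with an integer primary key id. The column name code and the index name uq_code are also assumptions.

```sql
-- Hypothetical names: my_table(id), new column code; chars(val) is the table above.
ALTER TABLE my_table ADD COLUMN code CHAR(4) NULL;

UPDATE my_table AS t
JOIN (
    -- Number the existing rows 1..N.
    SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn
    FROM my_table
) AS r ON r.id = t.id
JOIN (
    -- Number every possible 4-character string in random order.
    SELECT CONCAT(a.val, b.val, c.val, d.val) AS code,
           ROW_NUMBER() OVER (ORDER BY RAND()) AS rn
    FROM chars AS a
    JOIN chars AS b
    JOIN chars AS c
    JOIN chars AS d
) AS p ON p.rn = r.rn
SET t.code = p.code;

-- Let the server enforce uniqueness from here on.
ALTER TABLE my_table ADD UNIQUE KEY uq_code (code);
```

Because each pool string is paired with at most one row number, no two rows can receive the same string. The unique key is there to protect later inserts.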

Since those strings need to be unique, why not use a numeric auto-increment value and convert it to a character-based value, similar to converting decimal to hex?
If you choose, e.g., all letters and digits, you simply need to create a routine that converts an integer to a "base 62" number.
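MySQL has no built-in base 62 conversion, but its CONV() function handles bases up to 36, so the sketch below swaps in base 36 (digits plus one letter case) to show the idea without writing a custom routine. The table name my_table and the columns id and code are hypothetical.

```sql
-- 12345 in base 36 is 9IX; LPAD pads it to four characters.
SELECT LPAD(CONV(12345, 10, 36), 4, '0') AS code;   -- returns '09IX'

-- Hypothetical table and column names.
-- Ids up to 36^4 - 1 = 1679615 fit in 4 characters.
UPDATE my_table
SET code = LPAD(CONV(id, 10, 36), 4, '0');
```

Codes produced this way are unique as long as the ids are, but they are sequential and therefore guessable rather than random.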

You can make use of the DISTINCT keyword.
For example, the following query will only return unique rows by which you can validate that your 4 char random string remains unique:
mysql> SELECT DISTINCT random_strings FROM chars;

This may be lengthy, but would allow you to create what you need:
CREATE FUNCTION gen_alphanum () RETURNS CHAR(1)
NOT DETERMINISTIC NO SQL
-- Returns one random letter; CONCAT four calls to build a 4-character string.
RETURN
ELT(FLOOR(1 + (RAND() * 52)), 'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z',
'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z');

It sounds like you've already got the MySQL code for creating these random strings.
Consider this option:
Create a user-defined (stored) function in MySQL. Have this function run the SQL statements to generate and return the new random string. Make sure it uses NOT EXISTS(SELECT MyRandomString FROM MyTable), filtered on the candidate value, to check that the random string doesn't already exist in the table.
When inserting new rows, use this function's return value to assign to the MyRandomString column.
To update the existing data, simply:
UPDATE MyTable
SET MyRandomString = fn_CreateSomeRandomString()
When inserting:
INSERT INTO MyTable (foo, bar, MyRandomString)
VALUES ('','', fn_CreateSomeRandomString());
Here's a sample of that UDF on PasteBin.
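The PasteBin sample is not reproduced here. As a rough sketch of what such a function could look like (not the answerer's code), the version below draws a random 4-character base 36 string and retries until it finds one that is not in the table. The alphabet 0-9a-z and the use of CONV() are assumptions; the names MyTable, MyRandomString and fn_CreateSomeRandomString come from the statements above.

```sql
DELIMITER //
CREATE FUNCTION fn_CreateSomeRandomString() RETURNS CHAR(4)
NOT DETERMINISTIC READS SQL DATA
BEGIN
    DECLARE candidate CHAR(4);
    REPEAT
        -- Random integer in 0..36^4-1, written in base 36 and left-padded to 4 characters.
        SET candidate = LPAD(LOWER(CONV(FLOOR(RAND() * 1679616), 10, 36)), 4, '0');
    UNTIL NOT EXISTS (SELECT 1 FROM MyTable WHERE MyRandomString = candidate)
    END REPEAT;
    RETURN candidate;
END //
DELIMITER ;
```

The existence check alone does not protect against two sessions picking the same value at the same moment, so a UNIQUE index on MyRandomString should remain the actual guarantee, with the function only reducing how often a collision is hit.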

If you have MySQL 5.6, you can use TO_BASE64 as follows:
select LEFT( TO_BASE64( SHA(rand()) ), 6 ) ;
Alternatively if you don't have 5.6,
DELIMITER //
drop function if exists randChr //
create function randChr()
returns char
BEGIN
IF RAND() <= 0.5 THEN -- Lowercase
return CHAR( 97 + 25*rand() ) ;
ELSE -- uc
return CHAR( 65 + 25*rand() ) ;
END IF;
END //
drop function if exists randString //
create function randString( len int )
returns varchar(255)
BEGIN
SET @n = 0;
SET @res = '' ;
REPEAT
SET @res = concat( @res, randChr() ) ;
set @n = @n + 1 ;
UNTIL @n >= len END REPEAT;
return @res ;
END //
DELIMITER ;
-- USE:
select randString( 5 );
select randString( 60 );

Related

Add column with name of variable value

Inside a procedure, I want to create a temporary table "report" whose column names come from the row contents of another table, "descriptions". I get an error because my query uses the variable name "tmp_description" itself as the new column name instead of the variable's value. How can I use the variable's value as the name of the new column?
DECLARE n INT DEFAULT 0;
DECLARE i INT DEFAULT 0;
DECLARE tmp_description varchar(30);
...
CREATE TEMPORARY TABLE descriptions (description varchar(30));
insert into descriptions
select distinct description from pure;
SELECT COUNT(*) INTO n FROM descriptions;
SET i=0;
WHILE i<n DO
SELECT * INTO tmp_description FROM (SELECT * FROM descriptions LIMIT i,1) t1;
ALTER TABLE report
ADD COLUMN
tmp_description FLOAT(2) DEFAULT 0.0; <-- I get error here
SET i = i + 1;
END WHILE;

I don't see any value to doing this in a while loop. Your looping mechanism is all off anyway, because you are using LIMIT without ORDER BY -- which means that the row returned on each iteration is arbitrary.
Why not just construct a single statement? First run:
select group_concat('add column ', description, ' numeric(2)' separator ', ') as columns
from t;
Note that float(2) doesn't really make sense to me as a data type. I suspect that you really want a numeric/decimal type.
Then take the results. Prepend them with alter table report and run the code.
You could do this using dynamic SQL, but I see no advantage to doing that.
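For completeness, the dynamic SQL variant can be sketched as follows. It builds the whole ALTER TABLE statement in a user variable and runs it once. The tables report and descriptions are the ones from the question; DECIMAL(10,2) is an assumed column type, and the sketch assumes the descriptions contain no backticks and the combined text fits within group_concat_max_len.

```sql
-- Build one ALTER TABLE statement from all descriptions.
SELECT CONCAT('ALTER TABLE report ',
              GROUP_CONCAT('ADD COLUMN `', description, '` DECIMAL(10,2) DEFAULT 0'
                           SEPARATOR ', '))
INTO @ddl
FROM descriptions;

-- Run it as a prepared statement.
PREPARE stmt FROM @ddl;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```

This replaces the WHILE loop entirely, so the LIMIT without ORDER BY problem does not arise.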

Make unique string of characters/numbers in SQL

I have a table someTable with a column bin of type VARCHAR(4). Whenever I insert to this table, bin should be a unique combination of characters and numbers. Unique in this sense meaning has not appeared before in the table in another row.
bin is in the form of AA00, where A is a character A-F and 0 is a number 0-9.
Say I insert to this table once: it should come up with a bin value which doesn't appear before. Assuming the table was empty, the first bin could be AA11. On second insertion, it should be AA12, and then AA13, etc.
AA00, AA01, ... AA09, AA10, AA11, ... AA99, AB00, AB01, ... AF99, BA00, BA01, ... FF99
It doesn't matter that this table can contain only 3,600 possible rows. How do I create this code, specifically finding a bin that doesn't already exist in someTable? It can be in order as I've described or a random bin, as long as it doesn't appear twice.
CREATE TABLE someTable (
bin VARCHAR(4),
someText VARCHAR(32),
PRIMARY KEY(bin)
);
INSERT INTO someTable
VALUES('?', 'a');
INSERT INTO someTable
VALUES('?', 'b');
INSERT INTO someTable
VALUES('?', 'c');
INSERT INTO someTable
VALUES('?', 'd');
Alternatively, I can use the below procedure to insert instead:
CREATE PROCEDURE insert_someTable(tsomeText VARCHAR(32))
BEGIN
DECLARE var (VARCHAR(4) DEFAULT (
-- some code to find unique bin
);
INSERT INTO someTable
VALUES(var, tsomeText);
END
A possible outcome is:
+------+----------+
| bin | someText |
+------+----------+
| AB31 | a |
| FC10 | b |
| BB22 | c |
| AF92 | d |
+------+----------+

As Gordon said, you will have to use a trigger because it is too complex to do as a simple formula in a default. Should be fairly simple, you just get the last value (order by descending, limit 1) and increment it. Writing the incrementor will be somewhat complicated because of the alpha characters. It would be much easier in an application language, but then you run into issues of table locking and the possibility of two users creating the same value.
A better method would be to use a normal auto-increment primary key and translate it to your bin value. Consider your bin value as two base 6 characters followed by two base 10 digits. You then take the id generated by MySQL, which is guaranteed to be unique, and convert it to your special number system. Calculate the bin and store it in the bin column.
To calculate the bin:
Step one would be to get the lower 100 value of the decimal number (mod 100) - that gives you the last two digits. Convert to varchar with a leading zero.
Subtract that from the id, and divide by 100 to get the value for the first two digits.
Get the mod 6 value to determine the 3rd (from the right) digit. Convert to A-F by index.
Subtract this from what's left of the ID, and divide by 6 to get the 4th (from the right) digit. Convert to A-F by index.
Concatenate the three results to form the value for the bin.
You may need to edit the following to match your table name and column names, but it should do what you are asking. One possible improvement would be to have it cancel any inserts past the 3,600 limit. If you insert beyond the 3,600th record, it will duplicate previous bin values. Also, it won't insert AA00 first (id=1 gives 'AA01'), so it's not perfect. Lastly, you could put a unique index on bin, and that would prevent duplicates.
DELIMITER $$
CREATE TRIGGER `fix_bin`
BEFORE INSERT ON `so_temp`
FOR EACH ROW
BEGIN
DECLARE next_id INT;
SET next_id = (SELECT AUTO_INCREMENT FROM information_schema.TABLES WHERE TABLE_SCHEMA=DATABASE() AND TABLE_NAME='so_temp');
SET @id = next_id;
SET @Part1 = MOD(@id,100);
SET @Temp1 = FLOOR((@id - @Part1) / 100);
SET @Part2 = MOD(@Temp1,6);
SET @Temp2 = FLOOR((@Temp1 - @Part2) / 6);
SET @Part3 = MOD(@Temp2,6);
SET @DIGIT12 = RIGHT(CONCAT("00",@Part1),2);
SET @DIGIT3 = SUBSTR("ABCDEF",@Part2 + 1,1);
SET @DIGIT4 = SUBSTR("ABCDEF",@Part3 + 1,1);
SET NEW.`bin` = CONCAT(@DIGIT4,@DIGIT3,@DIGIT12);
END;
$$
DELIMITER ;
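The digit arithmetic from the steps above can be checked on its own, without the trigger. The sketch below is a single-expression version of the same conversion, applied to a few sample ids supplied inline; the derived table name sample is made up for the illustration.

```sql
SELECT id,
       CONCAT(
           SUBSTRING('ABCDEF', ((id DIV 600) MOD 6) + 1, 1),  -- first letter
           SUBSTRING('ABCDEF', ((id DIV 100) MOD 6) + 1, 1),  -- second letter
           LPAD(id MOD 100, 2, '0')                           -- two decimal digits
       ) AS bin
FROM (
    SELECT 0 AS id UNION ALL SELECT 1 UNION ALL SELECT 99
    UNION ALL SELECT 100 UNION ALL SELECT 3599
) AS sample;
-- 0 -> AA00, 1 -> AA01, 99 -> AA99, 100 -> AB00, 3599 -> FF99
```

Ids of 3600 and above wrap around to values already used, which is why the insert limit or a unique index on bin matters.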

MySQL fast check if hash exists

I'm trying to create a MySQL function that takes n and m as input and generates n random unique combinations of m ids from the result of a query.
The function will return one combination per call, and that combination must be distinct from all previous combinations.
During generation it must check another table: if the combination already exists, it should keep looping until it finds one that is unique. It should return the combination as dash-separated ids, or false if there is no room for another unique combination.
So I'm getting 100 random items like this:
SELECT
`Item`.`id`
FROM
`Item`
LEFT JOIN `ItemKeyword` ON `Item`.`id` = `ItemKeyword`.`ItemID`
WHERE
(`Item`.`user_id` = '2')
AND(`ItemKeyword`.`keywordID` = 7130)
AND(`Item`.`type` = 1)
ORDER BY RAND()
LIMIT 100
Past combinations are stored as md5 of concatenation of itemIDs by -.
So I need to concatenate result of this query by - and create md5 of it. Then to send another query into second table named Combination and check with hash column if it exists or not. And continue this loop until I get n results.
I can't figure out how to achieve this correctly and fast. Any suggestion?
Update:
Whole SQL Dump is here: https://gist.github.com/anonymous/e5eb3bf1a10f9d762cc20a8146acf866

If you are testing for uniqueness via the md5, you need to sort the list before taking the md5. This can be demonstrated with SELECT MD5('1-2'), MD5('2-1');
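One way to get that sorted list is to let GROUP_CONCAT do the ordering before the hash is taken. This is a small illustration with two ids supplied inline; the derived table name picked is made up.

```sql
-- The ids arrive in arbitrary order; ORDER BY inside GROUP_CONCAT normalizes them.
SELECT MD5(GROUP_CONCAT(id ORDER BY id SEPARATOR '-')) AS combo_hash
FROM (SELECT 2 AS id UNION ALL SELECT 1) AS picked;
-- combo_hash equals MD5('1-2') whichever order the ids were picked in.
```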
Get rid of LEFT, it seems useless. After that, the Optimizer can choose between starting with ItemKeyword instead of Item. (Without knowing the distribution of the data, I cannot say whether this might help.)
(It would be helpful if you provided SHOW CREATE TABLE for each table. In their absence, I will assume you are using InnoDB and have PRIMARY KEY(id) and PRIMARY KEY(keywordID).)
'Composite' indexes needed:
Item: INDEX(user_id, type, id)
ItemKeyword: INDEX(ItemID, keywordID)
ItemKeyword smells like a many:many mapping table. Most such tables can be improved, starting with tossing the id. See 7 tips on many:many .
I am somewhat lost in your secondary processing.
My tips on RAND may or may not be helpful.
Schema Critique
A PRIMARY KEY is a UNIQUE KEY is an INDEX; eliminate redundant indexes.
INT(4) -- the (4) means nothing; INT is always 32-bits (4 bytes) with a large range. See SMALLINT UNSIGNED (2 bytes, 0..64K range).
An MD5 should be declared CHAR(32) CHARACTER SET ascii, not 255, not utf8. (latin1 is OK.)
The table Combination (id + hash) seems to be useless. Instead, simply change KEY md5 (md5) USING BTREE, to UNIQUE(md5) in the table Item.
You have started toward utf8mb4 with SET NAMES utf8mb4;, yet the tables (and their columns) are still utf8. Emoji and Chinese need utf8mb4; most other text does not.
After addressing these issues, the original Question may be solved (as well as doing some cleanup). If not, please add some further clarification.
Minified
1. Get a sorted list of m unique ids. (I need "sorted" for the next step, and since you are looking for "combinations", it seems that "permutations" are not needed.)
SELECT GROUP_CONCAT(id) AS list
FROM (
SELECT id FROM tbl
ORDER BY RAND()
LIMIT $m
) AS x;
2. Check for uniqueness. Do this by taking MD5(list) (from above) and checking in a table of 'used' md5's. Note: Unless you are asking for a lot of combinations among a small list of ids, dups are unlikely (though not impossible).
3. Deliver the list. However, it is a string of ids separated by commas. Splitting this is best done in application code, not MySQL functions.
4. What will you do with the list? This could be important because it may be convenient to fold step 4 in with step 3.
Bottom line: I would do only step 1 and part of step 2 in SQL; I would build a 'function' in the application code to do the rest.

Permutations
DROP FUNCTION IF EXISTS unique_perm;
DELIMITER //
CREATE FUNCTION unique_perm()
RETURNS VARCHAR(255) CHARACTER SET ascii
NOT DETERMINISTIC
SQL SECURITY INVOKER
BEGIN
SET @n := 0;
iterat: LOOP
SELECT SUBSTRING_INDEX(
GROUP_CONCAT(province ORDER BY RAND() SEPARATOR '-'),
'-', 3) INTO @list -- Assuming you want M=3 items
FROM world.provinces;
SET @md5 := MD5(@list);
INSERT IGNORE INTO md5s (md5) VALUES (@md5); -- To prevent dups
IF ROW_COUNT() > 0 THEN -- Check for dup
RETURN @list; -- Got a unique permutation
END IF;
SET @n := @n + 1;
IF @n > 20 THEN
RETURN NULL; -- Probably ran out of combinations
END IF;
END LOOP iterat;
END;
//
DELIMITER ;
Output:
mysql> SELECT unique_perm(), unique_perm(), unique_perm()\G
*************************** 1. row ***************************
unique_perm(): New Brunswick-Nova Scotia-Quebec
unique_perm(): Alberta-Northwest Territories-New Brunswick
unique_perm(): Manitoba-Quebec-Prince Edward Island
1 row in set (0.01 sec)
Notes:
I hard-coded M=3; adjust as needed. (It could be passed in as an arg.)
Change column and table names for your needs.
Without the test on @n, you could get stuck in a loop if you run out of combinations. (However, if N is even modestly large, that is 'impossible', so you could remove the test.)
If M is large enough, you will need to increase @@group_concat_max_len. Also, the size in the RETURNS clause.
CREATE TABLE md5s ( md5 CHAR(32) CHARACTER SET ascii PRIMARY KEY ) ENGINE=InnoDB is needed. And, you will need to TRUNCATE md5s between batches of calls to this function.
That is a working example.
Flaw: It gives unique permutations, not unique combinations. If that is not adequate, read on...
Combinations
DROP FUNCTION IF EXISTS unique_comb;
DELIMITER //
CREATE FUNCTION unique_comb()
RETURNS VARCHAR(255) CHARACTER SET ascii
NOT DETERMINISTIC
SQL SECURITY INVOKER
BEGIN
SET @n := 0;
iterat: LOOP
SELECT GROUP_CONCAT(province ORDER BY province SEPARATOR '-') INTO @list
FROM ( SELECT province FROM world.provinces
ORDER BY RAND() LIMIT 2 ) AS x; -- Assuming you want M=2 items
SET @md5 := MD5(@list);
INSERT IGNORE INTO md5s (md5) VALUES (@md5); -- To prevent dups
IF ROW_COUNT() > 0 THEN -- Check for dup
RETURN @list; -- Got a unique combination
END IF;
SET @n := @n + 1;
IF @n > 20 THEN
RETURN NULL; -- Probably ran out of combinations
END IF;
END LOOP iterat;
END;
//
DELIMITER ;
Output:
mysql> SELECT unique_comb(), unique_comb(), unique_comb()\G
*************************** 1. row ***************************
unique_comb(): Quebec-Yukon
unique_comb(): Ontario-Yukon
unique_comb(): New Brunswick-Nova Scotia
1 row in set (0.01 sec)
Notes:
The subquery adds some to the cost.
Note that the items in each output string are now (necessarily) ordered.

Avoid row was cut by GROUP_CONCAT error on insert without changing group_concat_max_len

I have an insert that uses a GROUP_CONCAT. In certain scenarios, the insert fails with Row XX was cut by GROUP_CONCAT. I understand why it fails but I'm looking for a way to have it not error out since the insert column is already smaller than the group_concat_max_len. I don't want to increase group_concat_max_len.
drop table if exists a;
create table a (x varchar(10), c int);
drop table if exists b;
create table b (x varchar(10));
insert into b values ('abcdefgh');
insert into b values ('ijklmnop');
-- contrived example to show that insert column size varchar(10) < 15
set session group_concat_max_len = 15;
insert into a select group_concat(x separator ', '), count(*) from b;
This insert produces the error Row 2 was cut by GROUP_CONCAT().
I'll try to provide a few clarifications -
The data in table b is unknown. There is no way to say set group_concat_max_len to a value greater than 18.
I do know the insert column size.
Why group_concat 4 GB of data when you want the first x characters?
When the concatenated string is longer than 10 chars, it should insert the first 10 characters.
Thanks.

Your example GROUP_CONCAT is probably cooking up this value:
abcdefgh, ijklmnop
That is 18 characters long, including the separator.
Can you try something like this?
set session group_concat_max_len = 4096;
insert into a
select left(group_concat(x separator ', '),10),
count(*)
from b;
This will trim the GROUP_CONCAT result for you.
You can temporarily set group_concat_max_len if you need to, then set it back.
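Setting the value and restoring it afterwards could look like this. It is a sketch that reuses tables a and b from the question; the variable name @old_len is an arbitrary choice.

```sql
-- Remember the current session value so it can be restored.
SET @old_len = @@SESSION.group_concat_max_len;

SET SESSION group_concat_max_len = 4096;

INSERT INTO a
SELECT LEFT(GROUP_CONCAT(x SEPARATOR ', '), 10),
       COUNT(*)
FROM b;

-- Put the previous limit back.
SET SESSION group_concat_max_len = @old_len;
```

The change is scoped to the current session, so other connections are not affected.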

I don't know MySQL very well, nor whether there is a good reason to do this in the first place, but you could keep a running total of the lengths and limit the GROUP_CONCAT() to the rows where that total is under a certain maximum. You'll still need to set group_concat_max_len high enough to handle the longest single value (or use CASE logic to substring the values so they stay under the maximum length you want).
Something like this:
SELECT SUBSTRING(GROUP_CONCAT(col1 separator ', '),1,10)
FROM (SELECT *
FROM (SELECT col1
,@lentot := COALESCE(@lentot,0) + CHAR_LENGTH(col1) AS lentot
FROM Table1
)sub
WHERE lentot < 25
)sub2
Demo: SQL Fiddle
I don't know if it's SQL Fiddle being quirky or if there's a problem with the logic, but sometimes when running I get no output. Not big on MySQL so could definitely be me missing something. It doesn't seem like it should require 2 subqueries but filtering didn't work as expected unless it was nested like that.

Actually, a better way is to use DISTINCT.
I had to add two new fields to an existing stored procedure, with their values obtained through a LEFT JOIN. Because the joined value could be NULL, a single concatenated value was repeated, in some cases more than 100 times.
Because a group with that new field contained many NULL values, GROUP_CONCAT exceeded the maximum length (16384 in my case).
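The effect of DISTINCT inside GROUP_CONCAT can be seen on a few inline values. This is an illustration only; the derived table t and column v are made up, and the NULL row stands in for an unmatched LEFT JOIN row.

```sql
SELECT GROUP_CONCAT(v ORDER BY v SEPARATOR ', ')          AS with_dups,
       GROUP_CONCAT(DISTINCT v ORDER BY v SEPARATOR ', ') AS without_dups
FROM (
    SELECT 'x' AS v UNION ALL SELECT 'x'
    UNION ALL SELECT NULL UNION ALL SELECT 'y'
) AS t;
-- with_dups: 'x, x, y'    without_dups: 'x, y'
```

GROUP_CONCAT skips NULL values itself, so the savings come from collapsing the repeated non-NULL values.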

Is it possible to name column by index

I am wondering if it is possible to use SQL to create a table that name columns by index(number). Say, I would like to create a table with 10 million or so columns, I definitely don't want to name every column...
I know that I can write a script to generate a long string as an SQL command. However, I would like to know if there is a more elegant way to do so.
Like something I make up here:
CREATE TABLE table_name
(
number_columns 10000000,
data_type INT
)
I guess saying 10 million columns caused a lot of confusion. Sorry about that. I looked up the manual of several major commercial DBMS and seems it is not possible. Thank you for pointing this out.
But another question, which is the most important: does SQL support numerical naming of columns? Say all the columns have the same type and there are 50 columns, and when referring to them I write something like
SELECT COL.INDEX(3), COL.INDEX(2) FROM MYTABLE
Does the language support that?

Couldn't resist looking into this, and found that the MySQL Docs say "no" to this, that
There is a hard limit of 4096 columns per table, but the effective
maximum may be less for a given table

You can easily do that in Postgres with dynamic SQL. Consider the demo:
DO LANGUAGE plpgsql
$$
BEGIN
EXECUTE '
CREATE TEMP TABLE t ('
|| (
SELECT string_agg('col' || g || ' int', ', ')
FROM generate_series(1, 10) g -- or 1600?
)
|| ')';
END;
$$;
But why would you even want to give life to such a monstrosity?
As @A.H. commented, there is a hard limit on the number of columns in PostgreSQL:
There is a limit on how many columns a table can contain. Depending on
the column types, it is between 250 and 1600. However, defining a
table with anywhere near this many columns is highly unusual and often
a questionable design.
Emphasis mine.
More about table limitations in the Postgres Wiki.
Access columns by index number
As to your additional question: with a schema like the above you can simply write:
SELECT col3, col2 FROM t;
I don't know of a built-in way to reference columns by index. You can use dynamic SQL again. Or, for a table that consists of integer columns exclusively, this will work, too:
SELECT c[3] AS col3, c[2] AS col2
FROM (
SELECT translate(t::text, '()', '{}')::int[] AS c -- transform row to ARRAY
FROM t
) x

Generally when working with databases your schema should be more or less "defined" so dynamic column adding isn't a built in functionality.
You can, however, run a loop and continually ALTER TABLE to add columns like so:
BEGIN
SET @col_index = 0;
start_loop: LOOP
SET @col_index = @col_index + 1;
IF @col_index <= num_columns THEN
SET @alter_query = (SELECT CONCAT('ALTER TABLE table_name ADD COLUMN added_column_',@col_index,' VARCHAR(50)'));
PREPARE stmt FROM @alter_query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
ITERATE start_loop;
END IF;
LEAVE start_loop;
END LOOP start_loop;
END;
But again, like most of the advice you have been given, if you think you need that many columns, you probably need to take a look at your database design, I have personally never heard of a case that would need that.

Note: As @GDP mentioned, you can have only 4096 columns, and the idea is definitely not recommended. As @GDP also said, the database design should be reviewed to see whether something else would handle this requirement better.
However, setting the absurd requirement aside, I wondered how I would do this if I ever needed to. Why not create a custom (user-defined) MySQL function, e.g. create_table(), that receives the parameters you intend to send and in turn generates the required CREATE TABLE command?
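Generating the CREATE TABLE command does not strictly need a function. As a sketch under assumptions (MySQL 8.0 or later for the recursive CTE, a hypothetical table name wide_table, and 50 INT columns named col1 to col50), the statement text can be built in one query and then executed as a prepared statement:

```sql
-- Hypothetical table name wide_table; 50 INT columns named col1..col50.
WITH RECURSIVE seq (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < 50
)
SELECT CONCAT('CREATE TABLE wide_table (',
              GROUP_CONCAT(CONCAT('col', n, ' INT') ORDER BY n SEPARATOR ', '),
              ')')
INTO @ddl
FROM seq;

PREPARE stmt FROM @ddl;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```

For larger column counts the generated text can exceed the default group_concat_max_len, so that setting would need to be raised first. Prepared statements can be used inside stored procedures but not inside stored functions, so a routine that also executes the command would have to be a procedure.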

This is an option for finding columns using ordinal values. It might not be the most elegant or efficient but it works. I am using it to create a new table for faster mappings between data that I need to parse through all the columns / rows.
DECLARE @sqlCommand varchar(1000)
DECLARE @columnNames TABLE (colName varchar(64), colIndex int)
DECLARE @TableName varchar(64) = 'YOURTABLE' --Table Name
DECLARE @rowNumber int = 2 -- y axis
DECLARE @colNumber int = 24 -- x axis
DECLARE @myColumnToOrderBy varchar(64) = 'ID' --use primary key
--Store column names in a temp table
INSERT INTO @columnNames (colName, colIndex)
SELECT COL.name AS ColumnName, ROW_NUMBER() OVER (ORDER BY (SELECT 1))
FROM sys.tables AS TAB
INNER JOIN sys.columns AS COL ON COL.object_id = TAB.object_id
WHERE TAB.name = @TableName
ORDER BY COL.column_id;
DECLARE @colName varchar(64)
SELECT @colName = colName FROM @columnNames WHERE colIndex = @colNumber
--Create Dynamic Query to retrieve the x,y coordinates from table
SET @sqlCommand = 'SELECT ' + @colName + ' FROM (SELECT ' + @colName + ', ROW_NUMBER() OVER (ORDER BY ' + @myColumnToOrderBy+ ') AS RowNum FROM ' + @TableName + ') t2 WHERE RowNum = ' + CAST(@rowNumber AS varchar(5))
EXEC(@sqlCommand)
