Is it sufficient to use a UUID for generating random values in a column with a UNIQUE constraint?
Or should we append something like the current timestamp to the UUID?
Or is there a better way to generate random yet unique values for a SQL column?
"Random" number generation in most languages is a misnomer: the numbers are pseudorandom, meaning they come from a deterministic algorithm and are not truly random.
Also consider the scenario of generating a random number in the range 1-5. You might get duplicates by generating the same number twice, and if you generate more than 5 numbers you absolutely WILL have duplicates.
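To make the 1-5 point concrete, here is a small sketch (the function name and the fixed seed are my own choices for illustration). It draws numbers until one repeats and records how many draws that took; with only 5 possible values a repeat must show up by the 6th draw at the latest.

```python
import random

def draw_until_duplicate(low, high, rng):
    """Draw integers in [low, high] until one repeats; return the number of draws."""
    seen = set()
    draws = 0
    while True:
        value = rng.randint(low, high)  # randint includes both endpoints
        draws += 1
        if value in seen:
            return draws
        seen.add(value)

rng = random.Random(0)  # fixed seed only so the run is repeatable
counts = [draw_until_duplicate(1, 5, rng) for _ in range(1000)]
print(min(counts), max(counts))  # the maximum can never exceed 6
```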
A timestamp tends to work as a unique identifier if the rows are only written from one place. But then you also have to worry about things such as making sure it is in the correct format, especially when moving between languages/technologies.
For a user ID, why not just auto-increment a column? Start the column somewhere and increment from there. I would leave a few numbers unused at the beginning so that you have some free ones early on if you need them for test/admin accounts.
Is the following possible? I've been racking my brain to think of a solution.
I have an SQL table, a very simple one: a few text columns and two int columns.
What I ideally want to do is allow the user to add a row with just the text columns, and have SQL automatically put numbers in the integer columns.
Ideally I'd like these numbers to be random but not already exist in the column (so every row has a unique number). Also 10 digits long (but I think that might be pushing it).
Is there any way I can achieve this within the query itself?
Thanks
Sure - you pass the strings as parameters to the INSERT statement, and the numbers as well, after you have computed them. You can use a SQL function to generate the random number, or generate it in the code you're calling from.
You can generate unique int numbers for a row by setting the column to AUTO_INCREMENT. However, if you want something like a random hash, you need to do it in your backend (or in a stored procedure).
Just a thought: if you generate long enough random strings, you usually don't need to worry about duplicates. So it's safe to generate a random string, try to insert it, and repeat only if you get a duplicate entry error. That won't happen most of the time, so it might be quicker than checking first with a SELECT.
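A rough sketch of that insert-and-retry idea, assuming a table with a UNIQUE column. I'm using Python's built-in sqlite3 module as a stand-in for MySQL so the example is self-contained; the table name, column names and helper function are made up for illustration, and the exception class to catch will differ with a MySQL driver.

```python
import secrets
import sqlite3

def insert_with_random_code(conn, name, make_code=lambda: secrets.token_hex(16), max_attempts=5):
    """Insert a row with a random code, retrying only when the code already exists."""
    for _ in range(max_attempts):
        code = make_code()
        try:
            with conn:  # commits on success, rolls back if the insert fails
                conn.execute("INSERT INTO items (code, name) VALUES (?, ?)", (code, name))
            return code
        except sqlite3.IntegrityError:
            continue  # UNIQUE constraint rejected the code: generate another one
    raise RuntimeError("no free code found after several attempts")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (code TEXT NOT NULL UNIQUE, name TEXT)")
print(insert_with_random_code(conn, "first"))
```

The UNIQUE constraint is what actually guarantees uniqueness here; the random generator only makes retries rare.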
You can generate a random number using MySQL. This will generate a random number between 0 and 10.000:
FLOOR(RAND() * 10001)
If you really want the numbers to always be 10 digits long you can generate a number between 1.000.000.000 and 9.999.999.999 like this:
FLOOR(RAND() * 9000000000) + 1000000000
The chance of a new number colliding with any one existing row is about 1 in 9 billion (~0.00000001%), and the overall chance of a collision rises as you insert more rows. For a 0% chance of collision I'd suggest doing this the right way and handling this in code and not the database.
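To get a feel for how quickly the risk grows with the row count, here is a sketch using the standard birthday-problem approximation (my addition, not something from the answer above). It assumes every value is drawn uniformly from the 9.000.000.000 possible 10-digit numbers.

```python
import math

def collision_probability(rows, space):
    """Approximate chance that at least two of `rows` uniform random values coincide."""
    exponent = rows * (rows - 1) / (2.0 * space)
    return -math.expm1(-exponent)  # same as 1 - exp(-exponent), but more accurate near 0

SPACE = 9_000_000_000  # values from 1000000000 to 9999999999
for rows in (1_000, 100_000, 1_000_000):
    print(rows, collision_probability(rows, SPACE))
```

With a few thousand rows a duplicate is very unlikely, but by around a hundred thousand rows it is close to a coin flip, which is why a UNIQUE constraint or an explicit check is still needed.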
The random function explained:
What is happening is that RAND() generates a random decimal number between 0 and 1 (never actually 1). We multiply that number by the count of distinct values we want, which is the maximum minus the minimum plus 1. The plus 1 is needed because RAND() never returns 1: for a range of 0 to 10, RAND() * 10 tops out at 9.XXXX and floors to 9, whereas RAND() * 11 can reach 10.XXXX, which FLOOR() turns into 10. Finally we add the minimum number we want produced. For the 10-digit case the multiplier is 9.999.999.999 - 1.000.000.000 + 1 = 9.000.000.000 and the minimum is 1.000.000.000, so the largest possible result is 9.999.999.999 and the 10-digit boundary is never breached.
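The same floor-and-offset arithmetic, sketched in Python so the edge cases can be checked. The function name and the injectable `rand` parameter are my own; `random.random()` plays the role of RAND() here since it also returns a value in [0, 1).

```python
import math
import random

def random_in_range(minimum, maximum, rand=random.random):
    """Return an integer in [minimum, maximum], given rand() in [0, 1)."""
    span = maximum - minimum + 1  # how many distinct values are allowed
    return math.floor(rand() * span) + minimum

# Ten-digit case from the answer: multiplier 9000000000, offset 1000000000.
print(random_in_range(1_000_000_000, 9_999_999_999))
```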
If I have a database with the following information, how can I setup my next INSERT query so that the ID is filled in? (so that it is 5 in this instance.)
Basically, once it gets to 24, it will continue inserting in order (ex: 30,31,32)
You don't. Not with an auto-incrementing integer anyway.
You could change the column to not be an auto-incrementing integer, but then you'll need to determine the next ID before performing each insert which would make all of your INSERT queries unnecessarily complex and the code more difficult to maintain. Not to mention introducing a significant point of failure if multiple threads try to insert and the operation to find the next ID and insert a record isn't fully atomic.
Why do you even need this? There's no reason for a database-generated primary key integer to be contiguous like that. Its purpose is to be unique, and as long as it serves that purpose it's working. There's no need to "fill in the holes" left by previously deleted records.
You could add a different column to the database and perform the logic for finding the next contiguous number when inserting records on that column. But you'd still run into the same aforementioned problems of race conditions and unnecessary complexity.
Change your filename to something more meaningful than the id.
I think something like files/uploads/20130515_170349.wv (for the first row) makes a lot of sense (assuming you don't have more than one file per second).
This also has the advantage that ordering the file names alphabetically is chronological order, making it easier to see the newer and older files.
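A minimal sketch of building such a name, assuming the upload time is available as a datetime. The function name and the default directory and extension are only taken from the example path above.

```python
from datetime import datetime

def upload_filename(uploaded_at, extension="wv", directory="files/uploads"):
    """Build a sortable file name like files/uploads/20130515_170349.wv."""
    stamp = uploaded_at.strftime("%Y%m%d_%H%M%S")  # zero-padded, so text order is time order
    return f"{directory}/{stamp}.{extension}"

print(upload_filename(datetime(2013, 5, 15, 17, 3, 49)))
# files/uploads/20130515_170349.wv
```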
You can just give it the id field and value:
INSERT INTO table (id, etc, etc) VALUES (5, etc, etc);
However, I don't think you can do it dynamically. If id is auto-increment then it'll keep on incrementing whether or not previous tuples have been deleted, etc.
I am sure this is called something that I don't know the name of.
I want to generate ids like:
59AA307E-94C8-47D1-AA50-AAA7500F5B54
instead of the standard auto incremented number.
It doesn't have to be exactly like that, but would like a long unique string value for it.
Is there an easy way to do this?
I want to do it to reference attachments so the references are not easily guessed, unlike attachment=1
I know there are ways around that, but I figure the string based id would be better if possible, and I'm sure I am just not searching for the right thing.
Thank you
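Last time I checked, you can't specify UUID() as the default constraint for a column in MySQL. (MySQL 8.0.13 and later do accept expression defaults such as DEFAULT (UUID()), so this applies to older versions.) That means using a trigger: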
CREATE TRIGGER newid
BEFORE INSERT ON your_table_name
FOR EACH ROW
SET NEW.id = UUID();
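If you'd rather generate the value in application code and pass it to the INSERT, a sketch with Python's standard uuid module could look like this. The function name is hypothetical. Note that I'm using uuid4 (random) here, whereas MySQL's UUID() produces time-based version 1 values, so the two are not the same kind of UUID.

```python
import uuid

def new_attachment_id():
    """Return a random (version 4) UUID as an upper-case 36-character string."""
    return str(uuid.uuid4()).upper()

print(new_attachment_id())
```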
I know there are ways around that, but I figure the string based id would be better
I understand you're after security by obscurity, but be aware that CHAR/VARCHAR columns larger than 4 characters take more space than INT does (4 bytes). This will impact performance in retrieval and JOINs.
You could always just pass the regular auto_increment values through SHA1() or MD5() whenever it comes time to send it out to the "public". With a decent salting string before/after the ID value, it'd be pretty much impossible to guess what the original number was. You wouldn't get a fancy looking string like a UUID, but you'd still have a regular integer ID value to deal with internally.
If you're worried about the extra CPU time involved in repeatedly hashing the column, you can always store the hash value in a separate field when the record's created.
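A small sketch of the salted-hash idea using Python's hashlib in place of MySQL's SHA1(). The salt value and the function name are placeholders I made up; in practice the salt would be a long secret kept out of the code base.

```python
import hashlib

SALT = b"replace-with-a-long-secret"  # hypothetical value, for illustration only

def public_token(record_id, salt=SALT):
    """Hash salt + id + salt so the public token does not reveal the integer id."""
    payload = salt + str(record_id).encode("ascii") + salt
    return hashlib.sha1(payload).hexdigest()

print(public_token(1))
```

Looking a record up by its token then means comparing against the stored hash column, since the hash cannot be reversed back into the id.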
In MySQL, is it generally faster/more efficient/scalable to return 100 rows with 3 columns, or 1 row with 100 columns?
In other words, when storing many key => value pairs related to a record, is it better to store each key => value pair in a separate row with the record_id as a key, or to have one row per record_id with a column for each key?
Also, assume that keys will need to be added/removed fairly regularly, which I expect would affect the long-term maintainability of the many-column approach once the table gets sufficiently large.
Edit: to clarify, by "a regular basis" I mean the addition or removal of a key once a month or so.
You should never add or remove columns on a regular basis.
http://en.wikipedia.org/wiki/Entity-Attribute-Value_model
There are a lot of bad things about this model and I would not use it if there was any other alternative. If you don't know the majority (except a few user customizable fields) of data columns you need for your application, then you need to spend more time in design and figure it out.
If your keys are preset (known at design time), then yes, you should put each key into a separate column.
If they are not known in design time, then you have to return your data as a list of key-value pairs which you should later parse outside the RDBMS.
If you are storing key/value pairs, you should have a table with two columns, one for the key (make this the PK for the table) and one for the value (probably don't need this indexed at all). Remember, "The key, the whole key, and nothing but the key."
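Here is a rough sketch of the one-row-per-pair layout, using sqlite3 so it runs on its own. The table and column names are invented, and I've assumed a composite primary key of (record_id, attr_key) because the question stores pairs per record; with a single global set of pairs the key column alone would be the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE record_attributes (
        record_id  INTEGER NOT NULL,
        attr_key   TEXT    NOT NULL,
        attr_value TEXT,
        PRIMARY KEY (record_id, attr_key)
    )
""")
conn.executemany(
    "INSERT INTO record_attributes VALUES (?, ?, ?)",
    [(1, "color", "red"), (1, "size", "L"), (2, "color", "blue")],
)

def load_attributes(conn, record_id):
    """Fetch all key/value rows for one record and parse them into a dict."""
    rows = conn.execute(
        "SELECT attr_key, attr_value FROM record_attributes WHERE record_id = ?",
        (record_id,),
    )
    return dict(rows.fetchall())

print(load_attributes(conn, 1))
```

Adding or removing a key is then an INSERT or DELETE instead of an ALTER TABLE.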
In the multi-column approach, you will find that your table grows without bound, because removing a column will nuke all its values and you won't want to do it. I speak from experience here, having worked on a legacy system that had one table with almost 1000 columns, most of which were bit fields. Eventually, you stop being able to make the case to delete any of the columns, because someone might be using one and the last time you did it, you had to work till 2 am rolling back to backups.
First: determine how frequently your data needs to be accessed. If the data always needs to be retrieved in one shot and most of it is used, then consider storing all the key pairs as a serialized value or as an XML value. If you need to do any sort of complex analysis on that data and you need the value pairs, then columns are OK, but limit them to values that you know you will need to perform your queries on. It's generally easier to design queries that use one column per parameter than one row per parameter. You will also find it easier to work with the returned values if they are all in one row rather than many.
Second: separate your most frequently accessed data and put it in its own table and the other data in another. 100 columns is a lot by the way so I recommend that you split your data into smaller chunks that will be more manageable.
Lastly: if you have data that may change frequently, then you should define each key in one table and store every value against that key's numeric ID in another. This assumes that you will be using the same key more than once, and it should speed up your lookups.
Which is the easiest way to perform the following in MySQL 5.1?
I have a table with a primary key as an integer, currently running from 1 to 220. The PK runs sequentially depending on the order in which the rows were written to the table.
I want to be able to randomly reassign this primary key value, so that, for example, row 1 (with a PK of 1 currently) becomes 19 (for example), row 2 becomes 142 (for example), row 3 becomes 99 (for example), etc. and so forth so that all numbers between 1 and 220 will be reassigned to the PK.
Is there a simple way of doing this?
Thanks,
Tim
There is no simple way to do it entirely within SQL. (There is most likely a complex way that isn't worth it.) I recommend you make it the responsibility of application-level logic.
I also recommend that if this is for some kind of 'card-shuffling' type purpose, you use a secondary unique key instead of the primary key.
Thanks for your answers. However, I found a reply to a post which does exactly what I required. It is here:
MySQL query to assign a unique random number to each row
Update the primary key field with the RAND() function. To get an integer, you'd have to multiply it by 100, 1000, etc. (depending on how big a number you want) and then truncate the remaining decimals.
Your script, which I presume you'd write to do this, would need to ensure that a duplicate number wasn't generated, since a duplicate would make the update fail. I'd do this via a loop; a single update statement wouldn't work because of how RAND() works. (At least in other RDBMSs.)
Is it important for us to know the reason for your doing this? The reason might change my answer...
You'd want to assign a random value to the ID column, then sort by it and reassign the ID based on position in the sorted rows.
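If you go the application-level route, the reassignment amounts to shuffling the existing IDs, which guarantees every number from 1 to 220 is used exactly once. A minimal sketch follows; the function name and the fixed seed are mine, and writing the mapping back to the table is left out.

```python
import random

def shuffled_id_mapping(ids, rng=random):
    """Map each existing id to a new id taken from a random permutation of the same ids."""
    old_ids = list(ids)
    new_ids = list(old_ids)
    rng.shuffle(new_ids)  # in-place shuffle, so no value is lost or repeated
    return dict(zip(old_ids, new_ids))

mapping = shuffled_id_mapping(range(1, 221), random.Random(42))  # seed only for repeatability
print(mapping[1], mapping[2], mapping[3])
```

When applying the mapping, updating rows one at a time would hit duplicate-key errors midway, so one option is to first move every ID out of the 1-220 range (for example by adding a large offset) and then write the new values.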