How can I get better randomization in my sql query? - mysql

I am attempting to get a random bearing, from 0 to 359.9.
SET bearing = FLOOR((RAND() * 359.9));
I may call the procedure that runs this request within the same while loop, immediately one after the next. Unfortunately, the randomization seems to be anything but unique. e.g.
Results
358.07
359.15
357.85
I understand how randomization works, and I know because of my quick calls to the same function, the ticks used to generate the random number are very close to one another.
In any other situation, I would wait a few milliseconds in between calls or reinit my Random object (such as in C#), which would greatly vary my randomness. However, I don't want to wait in this situation.
How can I increase randomness without waiting?

"I understand how randomization works, and I know because of my quick calls to the same function, the ticks used to generate the random number are very close to one another."
That's not quite right. Where folks get into trouble is when they re-seed a random number generator repeatedly with the current time, and because they do it very quickly the time is the same and they end up re-seeding the RNG with the same seed. This results in the RNG spitting out the same sequence of numbers each time it is re-seeded.
Importantly, by "the same" I mean exactly the same. A "close" seed won't result in a "similar" sequence: an RNG returns either an identical sequence or a completely different one.
The correct solution to this is not to stagger your re-seeds, but actually to stop re-seeding the RNG. You only need to seed an RNG once.
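To make the re-seeding point concrete, here is a minimal sketch in Python rather than MySQL. It uses Python's standard random.Random class; the helper name draw and the seed values are made up for illustration.

```python
import random

def draw(seed, count=3):
    """Return `count` values from a generator freshly seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]

# Re-seeding with the same value replays the identical sequence.
same_a = draw(1234)
same_b = draw(1234)

# A "close" seed gives an unrelated sequence, not a similar one.
close = draw(1235)

# Seeding once and drawing repeatedly keeps advancing the sequence.
rng = random.Random(1234)
first = [rng.random() for _ in range(3)]
second = [rng.random() for _ in range(3)]
```

Here same_a and same_b are equal, close is different from both, and second continues on from first instead of repeating it.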
Anyway, that is neither here nor there. MySQL's RAND() function does not require explicit seeding. When you call RAND() without arguments, the seeding is taken care of for you, so you can call it repeatedly without issue. There's no time-based limit on how often you can call it.
Actually your SQL looks fine as is. There's something missing from your post, in fact. Since you're calling FLOOR() the result you get should always be an integer. There's no way you'll get a fractional result from that assignment. You should see integral results like this:
187
274
89
345
That's what I got from running SELECT FLOOR(RAND() * 359.9) repeatedly.
Also, for what it's worth, RAND() will never return 1.0. Its range is 0 <= RAND() < 1.0. You are safe using 360 instead of 359.9:
SET bearing = FLOOR(RAND() * 360);
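A rough way to see why 360 is the better multiplier, sketched in Python with exact fractions. It assumes an idealized r that is uniform on [0, 1); the function name bucket_probability is hypothetical.

```python
from fractions import Fraction

def bucket_probability(k, multiplier):
    """Probability that floor(r * multiplier) == k for r uniform on [0, 1)."""
    lo = Fraction(k)
    hi = min(Fraction(k + 1), multiplier)
    return max(hi - lo, Fraction(0)) / multiplier

m_old = Fraction(3599, 10)  # the 359.9 multiplier from the question
m_new = Fraction(360)

p_old_358 = bucket_probability(358, m_old)  # a full-width bucket
p_old_359 = bucket_probability(359, m_old)  # the last bucket is only 0.9 wide
p_new_359 = bucket_probability(359, m_new)  # every bucket is 1/360
```

With 359.9 the value 359 comes up slightly less often than the other integers; with 360 every integer from 0 to 359 is equally likely under this idealized model.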

Related

How to prevent (Random) expressions in a watch expression from being cached, or re-evaluate them?

Alternatively, is there a way to force a re-evaluation of a single watch expression?
Say I have the following watch expression:
> Random.splitmix 123 '(Random.natIn 0 100)
When I run this, I might see a result like:
Now evaluating any watch expressions (lines starting with `>`)... Ctrl+C cancels.
5 | > Random.splitmix 123 '(Random.natIn 0 100)
⧩
56
Saving the file again will show the same result every time (it is cached).
I'm not sure if Random results should never be cached (maybe that's still good default behavior to save on computation time), but just wondering what the best workarounds are for this.
debug.clear-cache doesn't work either in this situation, since each time the RNG (Random.splitmix) starts over with the same seed.
Of course, we can manually change the random seed, but this also may not always be desired behavior (and a minor nitpick would be that it involves unnecessary keystrokes and creates additional caching - one cached result per seed, so you have to recall which seeds you already used).
You can clear the expression cache with debug.clear-cache in UCM.
That said, re-evaluating your expression is actually going to give the same result every time! Random.splitmix is completely deterministic, so the result you get depends on the seed you provide and on nothing else.
So you could clear the cache here, but it’s not going to do anything.
To get a really random value, you need to use IO, which is not allowed in watch expressions. You’d need to provide I/O to your program using run in UCM.
Since the watch expression would somehow need to maintain the random state, which is likely more trouble than it is worth, manually editing the random seed is likely the best compromise. Just re-evaluating will always start from the initial value produced from the given random seed.
Alternatively or conjunctively, evaluating a list of random values may be useful.

Functions to guess a number automatically using a check function up to 50 times in JavaScript or Python

I'm stuck on the logical approach for the following exercise; I can choose between JS and Python:
(I already found a lot of ways to generate the random number; that's not the point of my question.)
Find a secret integer between 1 and 1000000 in fewer than 50 runs of the function verify()
This function solves the game without user input
It returns the solution by using the function verify(), which takes the number to verify as its argument and returns one of three values: -1 if the number to guess is greater, 1 if the number to guess is smaller, and 0 if it's the right number
The function verify() can't be called more than 50 times
More than a definitive answer, I'm looking to improve my reasoning, so please describe the logic.
Thanks in advance !
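One common way to approach an exercise like this is binary search: always guess the midpoint of the remaining range and use the -1 / 1 / 0 answer to discard half of it. Since 1000000 is less than 2^20, that needs at most 20 calls, well under the limit of 50. A minimal sketch in Python, where make_verify is a hypothetical stand-in for the exercise's verify() built from the return values described above:

```python
def find_secret(verify, low=1, high=1_000_000):
    """Binary search for the secret; returns (secret, number_of_verify_calls)."""
    calls = 0
    while low <= high:
        guess = (low + high) // 2
        calls += 1
        result = verify(guess)
        if result == 0:
            return guess, calls
        if result == -1:      # secret is greater than the guess
            low = guess + 1
        else:                 # result == 1: secret is smaller than the guess
            high = guess - 1
    raise ValueError("secret is not in the given range")

def make_verify(secret):
    """Hypothetical stand-in for the exercise's verify()."""
    def verify(n):
        if secret > n:
            return -1
        if secret < n:
            return 1
        return 0
    return verify
```

For example, find_secret(make_verify(123456)) returns the secret together with the number of verify() calls it used.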

Picking JSON objects out of array based on their value

Perhaps I think about this wrong, but here is a problem:
I have an NSMutableArray full of JSON objects. Each object looks like this; here are two of them, for example:
{
player = "Lorenz";
speed = "12.12";
},
{
player = "Firmino";
speed = "15.35";
}
Okay, so this is fine; this is dynamic info I get from a webserver feed. Now, let's pretend there are 22 such entries, and the speeds vary.
I want to have a timer going that starts at 1.0 seconds and goes to 60.0 seconds, and a few times a second I want it to grab all the players whose speed has just been passed. So for instance if the timer goes off at 12.0 , and then goes off again at 12.5, I want it to grab out all the player names who are between 12.0 and 12.5 in speed, you see?
The obvious easy way would be to iterate over the array completely every time that the timer goes off, but I would like the timer to go off a LOT, like 10 times a second or more, so that would be a fairly wasteful algorithm I think. Any better ideas? I could attempt to alter the way data comes from the webserver but don't feel that should be necessary.
NOTE: edited to reflect a corrected understanding that the number in 1 to 60 is incremented continuously across that range rather than being a random number in that interval.
Before you enter the timer loop, you should do some common preprocessing:
Convert the speeds from strings to numeric values upfront for fast comparison without having to parse each time. This is O(1) for each item and O(n) to process all the items.
Put the data in an ordered container such as a sorted list or sorted binary tree. This will allow you to easily find elements in the target range. This is O(n log n) to sort all the items.
On the first iteration:
Use binary search to find the start index. This is O(log n).
Use binary search to find the end index, using the start index to bound the search.
On subsequent iterations:
If each iteration increases by a predictable amount and the step between elements in the list is likewise a predictable amount, then just maintain a pointer and increment as per Pete's comment. This would make each iteration cost O(1) (just stepping ahead by a fixed amount).
If the steps between iterations and/or the entries in the list are not predictable, then do a binary search as in the initial case. Because the timer values are monotonically increasing (as I now understand the problem to be stating), you can still remember the index from the previous iteration; instead of resuming iteration directly from it, use it as the lower bound of the binary search so that you narrow the region being searched. This would make each iteration cost O(log m), where "m" is the number of remaining elements to be considered.
Overall, this produces an algorithm that is no worse than O((N + I) log N), where "I" is the number of iterations, compared to the full-scan approach, which is O(I * N). It also shifts most of the computation outside of the loop rather than inside it.
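The preprocessing and bounded binary search described above can be sketched as follows. This is in Python rather than Objective-C, purely for illustration; the class name SpeedIndex and its methods are made up, and the entries are assumed to be dictionaries with "player" and "speed" keys as in the question.

```python
from bisect import bisect_left

class SpeedIndex:
    """Players sorted by numeric speed, with a remembered scan position."""

    def __init__(self, entries):
        # Parse the string speeds once and sort once: O(n log n).
        parsed = sorted((float(e["speed"]), e["player"]) for e in entries)
        self.speeds = [s for s, _ in parsed]
        self.names = [n for _, n in parsed]
        self.position = 0  # index of the first player not yet returned

    def advance(self, current_time):
        """Return names with speed < current_time not returned before."""
        # Search only the tail that has not been consumed yet.
        end = bisect_left(self.speeds, current_time, self.position)
        passed = self.names[self.position:end]
        self.position = end
        return passed
```

Each timer tick then calls advance() with the current timer value and gets back only the players whose speed was passed since the previous tick.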
A modern computer can do billions of operations per second. Even if your timer goes off 1000 times per second, and you need to process 1000 entries, you will still be fine with a naive approach.
But to answer the question, the best approach would be to sort the data first based on speed, and then have an index of the last player whose speed was already passed. At the beginning the pointer, obviously, points at the first player. Then every time your timer goes off, you will need to process some continuous chunk of players starting at that index. Something along the lines of (in pseudocode):
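let index = 0;  // position of the first player not yet processed
players.sort((a, b) => a.speed - b.speed);  // ascending by speed (numeric speeds assumed)

function onTimer(currentSpeed) {
    // Process every player whose speed has been passed since the last tick.
    while (index < players.length && players[index].speed < currentSpeed) {
        processPlayer(players[index]);
        ++index;
    }
}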

MYSQL masking data from update very slow on large DB

I have a DEV DB with 16 million(ish) records. I need to 'mask' columns of personal data (name, address, phone, etc.). I found a nice function that does the data masking wonderfully: "Howto generate meaningful test data using a MySQL function".
The problem is, when I call the function, it is only processing about 30 records per second.
This is way too slow.
Is there any way to speed this up? Maybe create a temp table or something?
Here is the UPDATE statement that calls the function.
UPDATE table1
SET first_name = (str_random('Cc{3}c(4)')),
    last_name = (str_random('Cc{5}c(6)')),
    email = (str_random('c{3}c(5)[.|_]c{8}c(8)#[google|yahoo|live|mail]".com"')),
    address1 = (str_random('d{3}d{1} Cc{5} [Street|Lane|Road|Park]')),
    city = (str_random('Cc{5}c(6)')),
    state = (str_random('C{2}')),
    zip = (str_random('d{5}-d{4}'))
Thanks!!
Instead of calling a random function 7*16m times, it would probably be faster if you operated on procedurally generated text.
I checked out the str_random function you linked to. (That's very clever btw - cool stuff)
It calls RAND() once for each random character in the string and once each time you say "choose from list". That's a lot of rands.
I think one way to improve it would be to create and cache (in a table) a large set of random characters and instead of calling rand (say) 5 times for 5 random characters, call it once to determine an offset into the big string of random crap, then just increment the index it uses to pull from the string... (if it needs a bunch in a row - it can just pull them all at once in a row and multi-increment the offset)
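To illustrate the pooled-characters idea, here is a rough sketch in Python rather than MySQL. In the database, the pool would live in a table or a long string column instead; the class name RandomPool, the pool size, and the restriction to lowercase letters are all assumptions made for the example.

```python
import random
import string

class RandomPool:
    """Pre-generated pool of random lowercase letters, read by offset."""

    def __init__(self, size=100_000, seed=None):
        self.rng = random.Random(seed)
        # Pay for the per-character randomness once, up front.
        self.pool = "".join(self.rng.choices(string.ascii_lowercase, k=size))

    def take(self, length):
        """Return `length` letters using a single random call for the offset."""
        start = self.rng.randrange(len(self.pool) - length + 1)
        return self.pool[start:start + length]
```

A call such as RandomPool().take(5) then costs one random call instead of five, at the price of strings that are slices of a shared pool rather than independently generated.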
The str_random_character function that the parent function calls could be replaced by something that does this instead of calling rand into an array.
It's a bit beyond me for a throwaway piece of code, but it might put you (or a better mysql guru) on a path for speeding this puppy up (maybe).
A different option would be rather than random-masking all the data... can you transform the data in some way? Since you don't need the original back, you could do something like a caesar cipher on each character in their data based on a (single) rand call for the rotation count. (If you rotate the uppers, lowers, and digits in each string separately, the data will stay looking "normal" despite not being easily reversible because of the randomized rotation) -- I wouldn't slap a SECURE sticker on it but it would be a lot quicker and not easy to reverse.
I think I have a Caesar rotator that does that somewhere if it suffices.
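For reference, the rotation idea could look something like this. It is a sketch in Python, not the rotator mentioned above; the function name rotate_mask is made up, and the shift is limited to 1..9 so that it is never a multiple of 10 or 26 and every letter and digit actually changes.

```python
import random
import string

def rotate_mask(text, rng=random):
    """Rotate uppers, lowers and digits separately by one random shift."""
    shift = rng.randrange(1, 10)  # one random call per string

    def rotate(ch, alphabet):
        return alphabet[(alphabet.index(ch) + shift) % len(alphabet)]

    out = []
    for ch in text:
        if ch in string.ascii_uppercase:
            out.append(rotate(ch, string.ascii_uppercase))
        elif ch in string.ascii_lowercase:
            out.append(rotate(ch, string.ascii_lowercase))
        elif ch in string.digits:
            out.append(rotate(ch, string.digits))
        else:
            out.append(ch)  # punctuation and spaces are left untouched
    return "".join(out)
```

Because each character class rotates within itself, a name still looks like a name and a zip code still looks like a zip code. As noted above, this is obfuscation, not security.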

Math.random() Code Source?

From the ActionScript 3.0 documentation:
Global Functions > Math.random()
Returns a pseudo-random number n, where 0 <= n < 1. The number returned is calculated in an undisclosed manner, and is "pseudo-random" because the calculation inevitably contains some element of non-randomness.
I'm interested in reading the source code for Math.random() and assume it's the same in other C-based languages like AS3. Is it available for viewing?
Can anyone explain which elements make the code pseudo-random and why? Is it impossible to create a function that returns a truly random value?
There are a whole bunch of pseudo-random number generator functions. The most common one, if you aren't doing high-end crypto, is probably a linear congruential generator; see the Wikipedia article for a description and links to implementation code.
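To show what makes such a generator "pseudo-random", here is a minimal linear congruential generator sketched in Python. This is not the ActionScript implementation, which is undisclosed; the constants are the ones published in Numerical Recipes, and the class name is made up.

```python
class LinearCongruentialGenerator:
    """Minimal LCG: state = (a * state + c) mod m."""

    # Constants from Numerical Recipes; m = 2**32.
    A = 1664525
    C = 1013904223
    M = 2 ** 32

    def __init__(self, seed):
        self.state = seed % self.M

    def random(self):
        """Advance the state and return a float n with 0 <= n < 1."""
        self.state = (self.A * self.state + self.C) % self.M
        return self.state / self.M
```

Every output is a fixed arithmetic function of the previous state, so two generators started from the same seed produce exactly the same sequence. That determinism is the "element of non-randomness".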
To get real random numbers, you can use a web service such as random.org, which uses randomness from atmospheric noise.
A lot rely on the system time, if I remember rightly, since it changes so quickly.
If you hit the same system time, you get the same random number out.
As for true random, it's not possible: there's no bit in a computer that wasn't set. You could say it would be random if you went into something else's memory space and grabbed something, but that's all deterministic, just like the time.