How do I know if this is random enough? - language-agnostic

I wrote a program in Java that rolls a die and records the total number of times each value 1-6 is rolled. I rolled 6 million times. Here's the distribution:
#of 0's: 0
#of 1's: 1000068
#of 2's: 999375
#of 3's: 999525
#of 4's: 1001486
#of 5's: 1000059
#of 6's: 999487
(0 wasn't an option.)
Is this distribution consistent with random dice rolls?
What objective statistical tests might confirm that the dice rolls are indeed random enough?
EDIT: questions have been raised over the application: a game that I want to be as fair as can reasonably be achieved.

To test whether this particular distribution is consistent with the expected distribution of numbers rolled with a "fair" die, you need to perform Pearson's chi-square test.
Note that this still will not prove that your algorithm is "fair", only that these particular results look "fair".
To test whether your algorithm is "fair" in general, use the Diehard tests, as others have mentioned.
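As a rough illustration of the chi-square test on the counts from the question, here is a minimal Python sketch. The 5% significance level and the hard-coded critical value for 5 degrees of freedom are assumptions chosen for the example, not something the question specifies.

```python
# Observed counts for faces 1..6, copied from the question.
observed = [1000068, 999375, 999525, 1001486, 1000059, 999487]

def chi_square_statistic(counts):
    """Pearson's chi-square statistic against a uniform expectation."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Assumption: 5% significance level; 11.0705 is the chi-square
# critical value for 5 degrees of freedom (6 faces - 1).
CRITICAL_5_DF_5PCT = 11.0705

statistic = chi_square_statistic(observed)
looks_fair = statistic < CRITICAL_5_DF_5PCT
print(f"chi-square = {statistic:.5f}, consistent with a fair die: {looks_fair}")
```

A statistic below the critical value only means these totals give no evidence against fairness; it says nothing about the order in which the rolls occurred.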

If your random number generator passes the Diehard tests, that's the best you can do.
Even a physical die won't be perfect with 1/6 per face.
Increase the number of trials by an order of magnitude and run it again. If each face still comes up close to 1/6 of the time, you'll be fine.

This test alone isn't enough to determine randomness. Not that it's completely useless, but a "random" dice roller that outputs 1,2,3,4,5,6 and repeats would be perfectly random according to this test.
Another suggested test: pick a number, x, and each time it is rolled, record the statistics of what number comes next; you should see an even distribution again. Repeat for all six values of x. If it passes this test it is probably random enough to be used as a dice roller.
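The successor test described above could be sketched like this in Python. The helper name `successor_counts` is made up for the example, and Python's `random` module merely stands in for whatever generator is being tested.

```python
import random
from collections import Counter

def successor_counts(rolls, faces=6):
    """counts[x][y-1] = how often face y directly followed face x."""
    pairs = Counter(zip(rolls, rolls[1:]))
    return {x: [pairs[(x, y)] for y in range(1, faces + 1)]
            for x in range(1, faces + 1)}

# A "roller" that cycles 1,2,...,6 has perfect totals but fails this check.
cyclic = [i % 6 + 1 for i in range(600)]
cyclic_counts = successor_counts(cyclic)

# Assumption: Python's random module stands in for the generator under test.
rng = random.Random(12345)
prng_rolls = [rng.randint(1, 6) for _ in range(60000)]
prng_counts = successor_counts(prng_rolls)

for x in range(1, 7):
    print(x, cyclic_counts[x], prng_counts[x])
```

For the cyclic roller every row has a single non-zero entry, while a reasonable generator should give six roughly equal entries per row. Each row can then be fed to the same chi-square check used for the totals.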

The probability that 6'000'000 dice rolls will end up as exactly 1'000'000 of each outcome is close to 0. As long as the counts sum to the number of rolls, and the relative deviation of each count from its expected value goes toward 0 as the number of trials increases, your random function is not wrong.
You can either prove it mathematically or by testing the random function with larger and larger trial sequences to see that it converges.
Over repeated tests, the count for each outcome should approximately follow a Gaussian distribution. E.g. each outcome 1-6 should fall within a normal distribution centered around 1'000'000 with a variance of n·p·(1−p) = 6'000'000 · (1/6) · (5/6) ≈ 833'333, i.e. a standard deviation of about 913. The variance of the count itself grows with the number of rolls; it is the spread relative to the expected count that shrinks, in proportion to 1/√n.
The other tests, the Diehard tests, check that the actual sequence of dice rolls is random in itself, so that the 6'000'000 rolls are not, for example, 100'000 consecutive 1's, then 100'000 2's and so on, followed by some random sequences.

Related

Layer probability always 1

I'm using the caffe/example/mnist network to classify numbers. When I give the network a picture of a number it seems to work OK. But when I give the network a picture that is not a number, the softmax layer of the MNIST-trained network outputs probabilities in which one is always 1 and the others are 0, like:
[0,0,0,...,1,0,0,0].
I think it should be something like:
[0,0.1,0.2,...,0.4,0.1,0.2],
in which case I can say that this should not be a number. What is the problem?
It is hard to know what to expect since it hasn't been trained on non-numbers and it has to give a result that sums to 1. By using Softmax, you are telling the network that there is a number, while showing it a non-number. You cannot look at its output to then determine whether it is a number or not.
Furthermore, the training data for MNIST is very stereotypical and not good for generalization. The foreground numbers are always 255 and the background is always 0. The average value will be much closer to 0, since there are more background pixels. Simply presenting an image with a mean pixel value of 100 could bias the prediction toward numbers that typically have more pixels (like 8, perhaps). You can only expect the network to generalize to similar types of stimulus. For your task, you should do quite a lot of data augmentation.
You want to allow the probability to be zero for all numbers, which you can do by replacing the Softmax with independent per-class sigmoid outputs trained with a cross-entropy loss. This also allows the probabilities to sum to more than 1 (maximum 10). You could also try adding another class for "non-number" with the Softmax; however, you should then present non-number stimuli, and number stimuli that are more similar to natural stimuli (so that the two are not trivially separable).
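To see why a Softmax output can look fully confident on a non-number, here is a small Python sketch with no Caffe involved. The logits are invented for illustration; they are not taken from any trained network.

```python
import math

def softmax(logits):
    """Normalised exponentials: outputs always sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    """Independent per-class scores in (0, 1); no constraint on the sum."""
    return [1.0 / (1.0 + math.exp(-v)) for v in logits]

# Hypothetical logits for a non-digit image: every class scores low,
# but class 7 is "least bad" by a wide margin.
logits = [-30.0, -28.0, -35.0, -31.0, -29.0, -33.0, -30.0, -8.0, -32.0, -34.0]

soft = softmax(logits)
sig = sigmoid(logits)
print([round(p, 3) for p in soft])  # close to one-hot at index 7
print(max(sig))                     # stays tiny: no class is claimed
```

Softmax only looks at differences between logits, so the "least bad" class takes nearly all of the probability mass. The independent sigmoid scores can all stay near zero.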

Statistical method to know when enough performance test iterations have been performed

I'm doing some performance/load testing of a service. Imagine the test function like this:
bytesPerSecond = test(filesize: 10MB, concurrency: 5)
Using this, I'll populate a table of results for different sizes and levels of concurrency. There are other variables too, but you get the idea.
The test function spins up concurrency requests and tracks throughput. This rate starts off at zero, then spikes and dips until it eventually stabilises on the 'true' value.
However, it can take a while for this stability to occur, and there are a lot of combinations of input to evaluate.
How can the test function decide when it's performed enough samples? By enough, I suppose I mean that the result isn't going to change beyond some margin if testing continues.
I remember reading an article about this a while ago (from one of the jsperf authors) that discussed a robust method, but I cannot find the article any more.
One simple method would be to compute the standard deviation over a sliding window of values. Is there a better approach?
IIUC, you're describing the classic problem of estimating the confidence interval of the mean with unknown variance. That is, suppose you have n results, x1, ..., xn, where each of the xi is a sample from some process of which you don't know much: not the mean, not the variance, and not the distribution's shape. For some required confidence interval, you'd like to know whether n is large enough so that, with high probability, the true mean is within the interval around your sample mean.
(Note that with relatively-weak conditions, the Central Limit Theorem guarantees that the sample mean will converge to a normal distribution, but to apply it directly you would need the variance.)
So, in this case, the classic way to determine whether n is large enough is as follows:
Start by calculating the sample mean $\mu = \frac{1}{n}\sum_i x_i$. Also calculate the normalized sample variance $s^2 = \frac{1}{n-1}\sum_i (x_i - \mu)^2$.
Depending on the size of n:
If n > 30, the confidence interval is approximated as $\mu \pm z_{\alpha/2}\, s/\sqrt{n}$, where $z_{\alpha/2}$ is the standard normal critical value for confidence level $1 - \alpha$.
If n ≤ 30, the confidence interval is approximated as $\mu \pm t_{\alpha/2}\, s/\sqrt{n}$, where $t_{\alpha/2}$ is the corresponding critical value of Student's t distribution with n − 1 degrees of freedom, available from standard tables.
If the confidence interval is narrow enough, stop. Otherwise, increase n.
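The stopping rule above might look roughly like this in Python. This is a sketch of the large-sample (z) branch only; the function names, the 5% relative margin and the 95% confidence level are assumptions for the example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_half_width(samples, confidence=0.95):
    """Half-width of the z-based interval: z_{alpha/2} * s / sqrt(n)."""
    n = len(samples)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * stdev(samples) / sqrt(n)  # stdev() uses the n - 1 divisor

def enough_samples(samples, rel_margin=0.05, confidence=0.95, min_n=30):
    """True once the interval is within +/- rel_margin of the sample mean."""
    if len(samples) < min_n:  # too few samples for the z approximation
        return False
    half_width = confidence_half_width(samples, confidence)
    return half_width <= rel_margin * abs(mean(samples))

steady = [100 + (i % 5) for i in range(40)]  # small fluctuations
noisy = [10, 200] * 20                       # wild swings
print(enough_samples(steady), enough_samples(noisy))
```

Below `min_n` samples the sketch simply keeps sampling; handling that case properly would need the t critical value, which the Python standard library does not provide.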
Stability means the rate of change (derivative) is zero or close to zero.
You wrote: "The test function spins up concurrency requests and tracks throughput. This rate starts off at zero, then spikes and dips until it eventually stabilises on the 'true' value."
I would track your past throughput values, for example the last X values or so. From these values, I would calculate the rate of change (the derivative of your throughput). If the derivative is close to zero, the test is stable and I would stop it.
How to find X? Instead of a constant value such as 10, choosing a value based on the maximum number of tests may be more suitable, for example:
X = max(10,max_test_count * 0.01)
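A minimal Python sketch of this idea follows. The relative-change measure and the 1% tolerance are assumptions for illustration; the answer does not say how "close to zero" should be judged.

```python
def window_size(max_test_count):
    """X from the answer: at least 10, or 1% of the planned samples."""
    return max(10, int(max_test_count * 0.01))

def is_stable(throughputs, window, tolerance=0.01):
    """True when the mean relative step-to-step change over the last
    `window` values is below `tolerance` (a hypothetical threshold)."""
    recent = throughputs[-window:]
    if window < 2 or len(recent) < window:
        return False  # not enough history yet
    changes = [abs(b - a) / max(abs(a), 1e-12)
               for a, b in zip(recent, recent[1:])]
    return sum(changes) / len(changes) < tolerance

# Hypothetical throughput trace: a noisy ramp-up, then a plateau.
trace = [0, 50, 120, 90, 110, 95, 105, 98, 102, 100] + [100.0] * 10
x = window_size(100)
print(is_stable(trace[:10], x), is_stable(trace, x))
```

A flat derivative over a short window can also occur on a temporary plateau, so a larger X trades testing time for robustness.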

How to print probability for repeated measures logistic regression?

I would like SAS to print the probability of my binary dependent variable occurring ("Calliphoridae", a particular fly family, being present (1) or not (0)) at a specific value of my continuous independent variable ("degree_index", which was recorded from .055 to 2.89, but can be continuously recorded past 2.89 and always increases as time goes on) using PROC GENMOD. How do I change my code to print the probability, for example, that Calliphoridae is present at degree_index=.1?
My example code is:
proc genmod data=thesis descending;
  class Body_number;
  model Calliphoridae = degree_index / dist=binomial link=logit;
  repeated subject=Body_number / type=cs;
  estimate 'degreeindex=.1' intercept 1 degree_index 0 / exp;
  estimate 'degree_index=.2' intercept 1 degree_index .1 / exp;
run;
I get an output for the contrast estimate results in which the mean estimate at degree_index=.1 is .99 and at degree_index=.2 is .98.
I think that it is correctly modeling the probability... I just didn't include the square of the degree-day index. If you do, it allows the probability to increase and decrease. I realized this when I did the probability by hand, $p = \frac{e^{-1.1307x + 0.2119}}{1 + e^{-1.1307x + 0.2119}}$, to verify that this really was modeling the probability that y=1 for the mean estimates at specific x values... and then I realized that it is fitting a regression line and cannot increase and then decrease because there is only one x term. http://www.stat.sc.edu/~hansont/stat704/chapter14a.pdf
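The hand calculation can be reproduced with a few lines of Python, which also shows why a single linear term gives a probability that only moves in one direction. The coefficients are the ones quoted in the formula above; fitted values from your own PROC GENMOD output may differ.

```python
import math

# Coefficients copied from the hand-calculated formula above.
INTERCEPT = 0.2119
SLOPE = -1.1307

def probability_present(degree_index):
    """Inverse logit of the linear predictor intercept + slope * x."""
    eta = INTERCEPT + SLOPE * degree_index
    return math.exp(eta) / (1.0 + math.exp(eta))

for x in (0.1, 0.2, 1.0, 2.89):
    print(f"degree_index={x}: p={probability_present(x):.4f}")
```

With only one x term the printed probabilities change monotonically in x; adding a squared term to the linear predictor is what would let the curve rise and then fall.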

Temperature Scale in SA

First, this is not a question about temperature iteration counts or automatically optimized scheduling. It's how the data magnitude relates to the scaling of the exponentiation.
I'm using the classic formula:
if(delta < 0 || exp(-delta/tK) > random()) { // new state }
The input to the exp function is negative because delta/tK is positive, so the exp result is always less than 1. The random function also returns a value in the 0 to 1 range.
My test data is in the range 1 to 20, and the delta values are below 20. I pick a start temperature equal to the initial computed temperature of the system and linearly ramp down to 1.
In order to get SA to work, I have to scale tK. The working version uses:
exp(-delta/(tK * .001)) > random()
So how does the magnitude of tK relate to the magnitude of delta? I found the scaling factor by trial and error, and I don't understand why it's needed. To my understanding, as long as delta > tK and the step size and number of iterations are reasonable, it should work. In my test case, if I leave out the extra scale the temperature of the system does not decrease.
The various online sources I've looked at say nothing about working with real data. Sometimes they include the Boltzmann constant as a scale, but since I'm not simulating a physical particle system that doesn't help. Examples (typically with pseudocode) use values like 100 or 1000000.
So what am I missing? Is scaling another value that I must set by trial and error? It's bugging me because I don't just want to get this test case running, I want to understand the algorithm, and magic constants mean I don't know what's going on.
Classical SA has 2 parameters: startingTemperature and cooldownSchedule (= what you call scaling).
Configuring 2+ parameters is annoying, so in OptaPlanner's implementation, I automatically calculate the cooldownSchedule based on the timeGradient (which is a double going from 0.0 to 1.0 during the solver time). This works well. As a guideline for the startingTemperature, I use the maximum score difference of a single move. For more information, see the docs.
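To make the relationship between delta and the temperature concrete, here is a small Python sketch. The linear cooling formula and the numbers are assumptions for illustration; this is not OptaPlanner's actual code.

```python
import math

def acceptance_probability(delta, temperature):
    """Metropolis rule: improvements always pass; worsening moves pass
    with probability exp(-delta / temperature)."""
    if delta < 0:
        return 1.0
    return math.exp(-delta / temperature)

def linear_temperature(starting_temperature, time_gradient):
    """Assumed schedule: cool linearly as time_gradient runs 0.0 -> 1.0.
    The temperature reaches 0 at time_gradient = 1.0, so stop just before."""
    return starting_temperature * (1.0 - time_gradient)

# Hypothetical numbers in the spirit of the question: deltas below 20.
start = 20.0  # roughly the largest score difference of a single move
for gradient in (0.0, 0.5, 0.9, 0.99):
    t = linear_temperature(start, gradient)
    probs = [round(acceptance_probability(d, t), 3) for d in (1, 5, 20)]
    print(f"timeGradient={gradient}: T={t:.2f}, P(accept d=1,5,20)={probs}")
```

Only the ratio delta / temperature matters. If the temperature stays far above the typical delta, exp(-delta / temperature) stays near 1 and almost every move is accepted, which may be why an extra scale factor such as .001 was needed to make the search settle.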

What is the proper method of constraining a pseudo-random number to a smaller range?

What is the best way to constrain the values of a PRNG to a smaller range? If you use modulus and the size of the old range is not evenly divisible by the size of the new range, you bias toward the values 0 through (old_size mod new_size) - 1. I assume the best way would be something like this (this is floating point, not integer math)
random_num = PRNG() / max_original_range * max_smaller_range
But something in my gut makes me question that method (maybe floating point implementation and representation differences?).
The random number generator will produce consistent results across hardware and software platforms, and the constraint needs to as well.
I was right to doubt the pseudocode above (but not for the reasons I was thinking). MichaelGG's answer got me thinking about the problem in a different way. I can model it using smaller numbers and test every outcome. So, let's assume we have a PRNG that produces a random number between 0 and 31 and you want the smaller range to be 0 to 9. If you use modulus you bias toward 0 and 1 (each is produced by four of the 32 inputs, every other output by three). If you use the pseudocode above with truncation you bias toward 0 and 5. I don't think there can be a good way to map one set into the other, because 32 is not a multiple of 10. The best that I have come up with so far is to regenerate the random numbers that fall at or above the largest multiple of the new range (here, 30 and 31), but that has deep problems as well (reducing the period, time to generate new numbers until one is in the right range, etc.).
I think I may have naively approached this problem. It may be time to start some serious research into the literature (someone has to have tackled this before).
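Enumerating every outcome of the small model takes only a few lines of Python. The sketch assumes the scaling method truncates toward zero, and it adds the regenerate-when-out-of-range (rejection) variant for comparison.

```python
from collections import Counter

OLD_RANGE = 32   # PRNG yields 0..31
NEW_RANGE = 10   # wanted: 0..9

# How many raw values land on each output under each mapping.
modulus = Counter(x % NEW_RANGE for x in range(OLD_RANGE))
scaled = Counter(x * NEW_RANGE // OLD_RANGE for x in range(OLD_RANGE))

# Rejection: discard raw values at or above the largest multiple of NEW_RANGE.
limit = OLD_RANGE - OLD_RANGE % NEW_RANGE  # 30
rejection = Counter(x % NEW_RANGE for x in range(OLD_RANGE) if x < limit)

for name, counts in (("modulus", modulus), ("scaled", scaled),
                     ("rejection", rejection)):
    print(name, [counts[k] for k in range(NEW_RANGE)])
```

Only the rejection variant leaves every output with the same number of raw values (three each), at the cost of discarding 2 of every 32 draws on average.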
I know this might not be a particularly helpful answer, but I think the best way would be to conceive of a few different methods, try them out a few million times, and check the result sets.
When in doubt, try it yourself.
EDIT
It should be noted that many languages (like C#) have built-in range limiting in their random functions:
int maximumvalue = 20;
Random rand = new Random();
int value = rand.Next(maximumvalue); // 0 <= value < maximumvalue
And whenever possible, you should use those rather than any code you would write yourself. Don't Reinvent The Wheel.
This problem is akin to rolling a k-sided die given only a p-sided die, without wasting randomness.
In this sense, by Lemma 3 in "Simulating a dice with a dice" by B. Kloeckner, this waste is inevitable unless "every prime number dividing k also divides p". Thus, for example, if p is a power of 2 (and any block of random bits is the same as rolling a die with a power of 2 number of faces) and k has prime factors other than 2, the best you can do is get arbitrarily close to no waste of randomness, such as by batching multiple rolls of the p-sided die until p^n is "close enough" to a power of k.
Let me also go over some of your concerns about regenerating random numbers:
"Reducing the period": Besides batching of bits, this concern can be dealt with in several ways:
Use a PRNG with a bigger "period" (maximum cycle length).
Add a Bays–Durham shuffle to the PRNG's implementation.
Use a "true" random number generator; this is not trivial.
Employ randomness extraction, which is discussed in Devroye and Gravel 2015-2020 and in my Note on Randomness Extraction. However, randomness extraction is pretty involved.
Ignore the problem, especially if it isn't a security application or serious simulation.
"Time to generate new numbers until one is in the right range": If you want unbiased random numbers, then any algorithm that produces them will generally have to run forever in the worst case. Again, by Lemma 3, the algorithm will run forever in the worst case unless "every prime number dividing k also divides p", which is not the case if, say, k is 10 and p is 32.
See also the question: How to generate a random integer in the range [0,n] from a stream of random bits without wasting bits?, especially my answer there.
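The effect of batching can be estimated with a short Python sketch. It models one simple scheme only (accept a batch when it falls below the largest power of k that fits into p^n), which is an assumption made for illustration and not the method from the cited paper.

```python
def batch_efficiency(p, k, n):
    """Roll a p-sided die n times (p**n equally likely outcomes) and keep
    the batch only if it falls below the largest power of k that fits.
    Returns (k_rolls_per_accepted_batch, acceptance_probability)."""
    outcomes = p ** n
    m, power = 0, 1
    while power * k <= outcomes:
        power *= k
        m += 1
    return m, power / outcomes

for n in (1, 2, 3):
    m, accept = batch_efficiency(32, 10, n)
    print(f"{n} x 5 bits -> {m} decimal digits, accepted {accept:.3f} of the time")
```

Two batches of 5 bits give 1024 outcomes, just above 10^3, so very few batches are rejected. Three batches give 32768, far from any power of 10, so most are rejected. This is the sense in which p^n needs to be "close enough" to a power of k.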
If PRNG() is generating uniformly distributed random numbers then the above looks good. In fact (if you want to scale the mean etc.) the above should be fine for all purposes. I guess you need to ask what the error associated with the original PRNG() is, and whether further manipulating will add to that substantially.
If in doubt, generate an appropriately sized sample set, and look at the results in Excel or similar (to check your mean / std.dev etc. for what you'd expect)
If you have access to a PRNG function (say, random()) that'll generate numbers in the range 0 <= x < 1, can you not just do:
random_num = (int) (random() * max_range);
to give you numbers in the range 0 to max_range - 1?
Here's how the CLR's Random class works when limited (as per Reflector):
long num = maxValue - minValue;
if (num <= 0x7fffffffL) {
    return (((int) (this.Sample() * num)) + minValue);
}
return (((int) ((long) (this.GetSampleForLargeRange() * num))) + minValue);
Even if you're given a positive int, it's not hard to get it to a double. Just multiply the random int by (1/maxint). Going from a 32-bit int to a double should provide adequate precision. (I haven't actually tested a PRNG like this, so I might be missing something with floats.)
Pseudo-random number generators essentially produce a random series of 1s and 0s which, when appended to each other, form an infinitely large number in base two. Each time you consume a bit from your PRNG, you are dividing that number by two and keeping the remainder. You can do this forever without wasting a single bit.
If you need a number in the range [0, N), then you need the same, but instead of base two, you need base N. It's basically trivial to convert the bases. Consume the number of bits you need, and return the remainder of those bits to your PRNG to be used the next time a number is needed.