How would you write this algorithm for large combinations in the most compact way? - language-agnostic

The number of combinations of k items which can be retrieved from N items is described by the following formula.
c = N! / (k! * (N - k)!)
An example would be how many combinations of 6 Balls can be drawn from a drum of 48 Balls in a lottery draw.
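For concreteness, plugging that example into the formula gives c = 48! / (6! * 42!) = 12,271,512 possible draws.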
Optimize this formula to run with the smallest O time complexity
This question was inspired by the new WolframAlpha math engine and the fact that it can calculate extremely large combinations very quickly (e.g. the query linked below), and by a subsequent discussion on the topic on another forum.
http://www97.wolframalpha.com/input/?i=20000000+Choose+15000000
I'll post some info/links from that discussion after some people take a stab at the solution.
Any language is acceptable.

Python: O(min(k, n-k)^2)
def choose(n,k):
    k = min(k,n-k)
    p = q = 1
    for i in xrange(k):
        p *= n - i
        q *= 1 + i
    return p/q
Analysis:
The size of p and q will increase linearly inside the loop, if n-i and 1+i can be considered to have constant size.
The cost of each multiplication will then also increase linearly.
The sum of these costs over all iterations becomes an arithmetic series over k.
My conclusion: O(k^2)
If rewritten to use floating point numbers, the multiplications will be atomic operations, but we will lose a lot of precision. It even overflows for choose(20000000, 15000000). (Not a big surprise, since the result would be around 0.2119620413×10^4884378.)
def choose(n,k):
    k = min(k,n-k)
    result = 1.0
    for i in xrange(k):
        result *= 1.0 * (n - i) / (1 + i)
    return result

Notice that WolframAlpha returns a "Decimal Approximation". If you don't need absolute precision, you could do the same thing by calculating the factorials with Stirling's Approximation.
Now, Stirling's approximation requires the evaluation of (n/e)^n, where e is the base of the natural logarithm, which will be by far the slowest operation. But this can be done using the techniques outlined in another stackoverflow post.
If you use double precision and repeated squaring to accomplish the exponentiation, the operations will be:
3 evaluations of a Stirling approximation, each requiring O(log n) multiplications and one square root evaluation.
2 multiplications
1 division
The number of operations could probably be reduced with a bit of cleverness, but the total time complexity is going to be O(log n) with this approach. Pretty manageable.
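As a rough sketch of this idea (mine, not from the thread): applying Stirling's formula to base-10 logarithms, instead of evaluating (n/e)^n directly, keeps everything within ordinary double precision and sidesteps the overflow problem; the helper names below are made up for illustration.
import math

def log10_stirling_factorial(n):
    # Stirling's approximation in log10 form: ln n! ~ n*ln(n) - n + 0.5*ln(2*pi*n)
    return (n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)) / math.log(10)

def choose_approx(n, k):
    # log10 of C(n, k) from three Stirling evaluations; assumes 0 < k < n.
    log10_c = (log10_stirling_factorial(n)
               - log10_stirling_factorial(k)
               - log10_stirling_factorial(n - k))
    exponent = math.floor(log10_c)
    mantissa = 10 ** (log10_c - exponent)
    return mantissa, exponent

print(choose_approx(20000000, 15000000))  # roughly (2.0876..., 4884377)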
EDIT: There's also bound to be a lot of academic literature on this topic, given how common this calculation is. A good university library could help you track it down.
EDIT2: Also, as pointed out in another response, the values will easily overflow a double, so a floating point type with very extended precision will need to be used for even moderately large values of k and n.

I'd solve it in Mathematica:
Binomial[n, k]
Man, that was easy...

Python: approximation in O(1)?
Using Python's decimal implementation to calculate an approximation. Since it does not use any explicit loop, and the numbers are limited in size, I think it will execute in O(1).
from decimal import Decimal
ln = lambda z: z.ln()
exp = lambda z: z.exp()
sinh = lambda z: (exp(z) - exp(-z))/2
sqrt = lambda z: z.sqrt()
pi = Decimal('3.1415926535897932384626433832795')
e = Decimal('2.7182818284590452353602874713527')
# Stirling's approximation of the gamma function.
# Simplification by Robert H. Windschitl.
# Source: http://en.wikipedia.org/wiki/Stirling%27s_approximation
gamma = lambda z: sqrt(2*pi/z) * (z/e*sqrt(z*sinh(1/z)+1/(810*z**6)))**z
def choose(n, k):
    n = Decimal(str(n))
    k = Decimal(str(k))
    return gamma(n+1)/gamma(k+1)/gamma(n-k+1)
Example:
>>> choose(20000000,15000000)
Decimal('2.087655025913799812289651991E+4884377')
>>> choose(130202807,65101404)
Decimal('1.867575060806365854276707374E+39194946')
Any higher, and it will overflow. The exponent seems to be limited to 40000000.

Given a reasonable range of values for n and k, calculate them in advance and use a lookup table.
It's dodging the issue in some fashion (you're offloading the calculation), but it's a useful technique if you're having to determine large numbers of values.
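A minimal sketch of the precompute-and-look-up idea in Python, assuming some application-specific upper bound MAX_N (chosen arbitrarily here); the table is Pascal's triangle built with C(n, k) = C(n-1, k-1) + C(n-1, k), so no factorials are ever computed.
MAX_N = 100  # whatever bound fits your application

# table[n][k] holds C(n, k); every row starts and ends with 1.
table = [[1] * (n + 1) for n in range(MAX_N + 1)]
for n in range(2, MAX_N + 1):
    for k in range(1, n):
        table[n][k] = table[n - 1][k - 1] + table[n - 1][k]

def choose(n, k):
    return table[n][k]

print(choose(48, 6))  # 12271512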

MATLAB:
The cheater's way (using the built-in function NCHOOSEK): 13 characters, O(?)
nchoosek(N,k)
My solution: 36 characters, O(min(k,N-k))
a=min(k,N-k);
prod(N-a+1:N)/prod(1:a)

I know this is a really old question but I struggled with a solution to this problem for a long while until I found a really simple one written in VB 6 and after porting it to C#, here is the result:
public int NChooseK(int n, int k)
{
    var result = 1;
    for (var i = 1; i <= k; i++)
    {
        // After this step, result holds C(n - k + i, i), so the division
        // by i is always exact. Note that int overflows quickly for larger
        // n and k; switch to long or BigInteger for bigger inputs.
        result *= n - (k - i);
        result /= i;
    }
    return result;
}
The final code is so simple you won't believe it will work until you run it.
Also, the original article gives a nice explanation of how the author arrived at the final algorithm.

Related

How to map number in a range to another in the same range with no collisions?

Effectively what I'm looking for is a function f(x) that outputs into a range that is pre-defined. Calling f(f(x)) should be valid as well. The function should be cyclical, so calling f(f(...(x))) where the number of calls is equal to the size of the range should give you the original number, and f(x) should not be time dependent and will always give the same output.
While I can see that taking a list of all possible values and shuffling it would give me something close to what I want, I'd much prefer it if I could simply plug values into the function one at a time so that I do not have to compute the entire range all at once.
I've looked into Minimal Perfect Hash Functions but haven't been able to find one that doesn't use external libraries. I'm okay with using them, but would prefer to not do so.
If an actual range is necessary to help answer my question, I don't think it would need to be bigger than [0, 2^24-1], but the starting and ending values don't matter too much.
You might want to take a look at a Linear Congruential Generator. You would be looking for a full-period generator (say, m = 2^24), which means the parameters have to satisfy the Hull-Dobell theorem.
Calling f(f(x)) should be valid as well.
That should work.
the number of calls is equal to the size of the range should give you the original number
Yes, for an LCG with parameters satisfying the Hull-Dobell theorem you'll get the full period covered exactly once, and the (m+1)-th call will put you back where you started. The period of such an LCG is exactly m.
should not be time dependent and will always give the same output
An LCG is an O(1) algorithm and it is 100% reproducible.
An LCG is reversible as well, via the extended Euclidean algorithm; check Reversible pseudo-random sequence generator for details.
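As an illustration (my own sketch, not from the answer), here is a minimal full-period LCG for m = 2^24. The multiplier and increment are arbitrary values chosen only to satisfy the Hull-Dobell conditions for a power-of-two modulus (increment odd, multiplier ≡ 1 mod 4); they are not recommended "good" constants.
M = 2 ** 24
A = 4 * 1234567 + 1   # multiplier: a ≡ 1 (mod 4)
C = 2 * 424242 + 1    # increment: odd, so gcd(c, m) = 1

def f(x):
    return (A * x + C) % M

# Applying f exactly M times returns the seed, and the intermediate values
# visit every element of [0, M) exactly once. (This pure-Python check takes
# a few seconds.)
x = seed = 42
for _ in range(M):
    x = f(x)
assert x == seed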
Minimal perfect hash functions are overkill; all you've asked for is a function f that is
bijective, and
"cyclical" (i.e. f^N = id: applying f N times gives back the original value).
For a permutation to be cyclical in that way, its order must divide N (or be N, but that's just a special case of dividing N). Which in turn means the LCM of the orders of the sub-cycles must divide N. One way to do that is to just have one "sub"-cycle of order N. For power-of-two N, it's also really easy to have lots of small cycles of some other power-of-two order. General permutations do not necessarily satisfy the cycle requirement; of course they are bijective, but the LCM of the orders of the sub-cycles may exceed N.
In the following I will leave all reduction modulo N implicit. Without loss of generality I will assume the range starts at 0 and goes up to N-1, where N is the size of the range.
The only thing I can immediately think of for general N is f(x) = x + c where gcd(c, N) == 1. The GCD condition ensures there is only one cycle, which necessarily has order N.
For power-of-two N I have more inspiration:
f(x) = cx where c is odd. Bijective because gcd(c, N) == 1, so c has a modular multiplicative inverse. Also c^N = 1, because φ(N) = N/2 (since N is a power of two), so c^φ(N) = 1 (Euler's theorem) and hence c^N = 1.
f(x) = x XOR c where c < N. Trivially bijective and trivially cycles with a period of 2, which divides N.
f(x) = clmul(x, c) where c is odd and clmul is carry-less multiplication. Bijective because any odd c has a carry-less multiplicative inverse. Has some power-of-two cycle length (less than N) so it divides N. I don't know why though. This is a weird one, but it has decent special cases such as x ^ (x << k). By symmetry, the "mirrored" version also works.
Eg x ^ (x >> k).
f(x) = x >>> k where >>> is bit-rotation. Obviously bijective, and f^N(x) = x >>> Nk, where Nk mod N = 0, so it rotates all the way back to the unrotated position regardless of what k is.
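As a quick sanity check of the order-divides-N claim (my own brute-force sketch, not from the answer), over a small power-of-two range N = 256:
N = 256

def order(f):
    # Smallest e > 0 such that applying f e times is the identity on [0, N).
    perm = list(range(N))
    e = 0
    while True:
        perm = [f(x) % N for x in perm]
        e += 1
        if perm == list(range(N)):
            return e

for name, g in [("x + 3",  lambda x: x + 3),     # gcd(3, N) == 1
                ("5 * x",  lambda x: 5 * x),     # odd multiplier
                ("x ^ 77", lambda x: x ^ 77)]:   # XOR with a constant
    e = order(g)
    print(name, "has order", e, "which divides N:", N % e == 0)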

Pollard’s p−1 algorithm: understanding of Berkeley paper

This paper explains Pollard's p-1 factorization algorithm. I am having trouble understanding the case where the factor found is equal to the input, so we go back and change 'a' (basically page 2, point 2 in the aforementioned paper).
Why do we go back and increment 'a'?
Why do we not go ahead and keep incrementing the factorial? Is it because we keep going into the same cycle we have already seen?
Can I get all the factors using this same algorithm? For example, 49000 = 2^3 * 5^3 * 7^2, but currently I only get 7 and 7000. Perhaps I can use this get_factor() function recursively, but I am wondering about the base cases.
def gcd(a, b):
    if not b:
        return a
    return gcd(b, a % b)

def get_factor(input):
    a = 2
    for factorial in range(2, input - 1):
        '''we are not calculating the factorial itself: since we need the
        gcd with n anyway, we work mod n and reuse the previously
        calculated power'''
        a = a**factorial % input
        factor = gcd(a - 1, input)
        if factor == 1:
            continue
        elif factor == input:
            a += 1
        elif factor > 1:
            return factor

n = 10001077
p = get_factor(n)
q = n // p
print("factors of", n, "are", p, "and", q)
The linked paper is not a particularly good description of Pollard's p − 1 algorithm; most descriptions discuss smoothness bounds that make the algorithm much more practical. You might like to read this page at Prime Wiki. To answer your specific questions:
Why increment a? Because the original a doesn't work. In practice, most implementations don't bother; instead, a different factoring method, such as the elliptic curve method, is tried instead.
Why not increment the factorial? This is where the smoothness bound comes into play. Read the page at Mersenne Wiki for more details.
Can I get all factors? This question doesn't apply to the paper you linked, which assumes that the number being factored is a semi-prime with exactly two factors. The more general answer is "maybe." This is what happens at Step 3a of the linked paper, and choosing a new a may work (or may not). Or you may want to move to a different factoring algorithm.
Here is my simple version of the p − 1 algorithm, using x instead of a. The while loop computes the magical L of the linked paper (it's the least common multiple of the integers less than the smoothness bound b), which is the same calculation as the factorial of the linked paper, but done in a different way.
def pminus1(n, b, x=2):
    q = 0; pgen = primegen(); p = next(pgen)
    while p < b:
        x = pow(x, p**ilog(p, b), n)
        q, p = p, next(pgen)
    g = gcd(x - 1, n)
    if 1 < g < n: return g
    return False
You can see it in action at http://ideone.com/eMPHtQ, where it factors 10001 as in the linked paper as well as finding a rather spectacular 36-digit factor of fibonacci(522). Once you master that algorithm, you might like to move on to the two-stage version of the algorithm.
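If you want to run the routine above as-is, here are stand-in implementations of primegen() and ilog() that make it self-contained (my own sketches; the originals live in the linked ideone code). primegen() is a naive trial-division prime generator and ilog(p, b) returns the largest e with p**e <= b.
from math import gcd

def primegen():
    # Naive trial-division prime generator; fine for small smoothness bounds.
    yield 2
    primes = [2]
    c = 3
    while True:
        if all(c % p for p in primes if p * p <= c):
            primes.append(c)
            yield c
        c += 2

def ilog(p, b):
    # Largest e such that p**e <= b.
    e = 0
    while p ** (e + 1) <= b:
        e += 1
    return e

print(pminus1(10001, 10))   # 73 (10001 = 73 * 137), the example from the linked paper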

Why use a special function to generate pseudo-random numbers in Pollard rho factorization?

I've been getting myself acquainted with the Pollard Rho factorization from this page.
I think I understand almost everything there, but one thing I'm confused about – the fact that it uses f(x) = x^2 + a mod N to generate pseudo-random numbers for checking.
My question is, why can't we simply have a random number generator give us two random numbers (x_i, x_j) each time, where x_i, x_j < N?
Why use this function f(x)?
The particular random number generator doesn't matter. Pollard, in his original paper describing the algorithm, says "Other polynomials of degree ≥ 2 and other starting values can be used." Brent says "The choice of a (pseudo-) random u ∈ [0,1) is not essential; it merely makes the average-case analysis tractable." Pollard and Brent describe the use of a function other than x^2 + c to factor 2^(2^8) + 1. The advantage of the x^2 + c method is that it is simple to implement and gives a family of polynomials, making it easy to switch to another one if the first doesn't work.
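To illustrate the point about the polynomial family, here is a short Pollard rho sketch (my own, not from the linked page) that simply moves on to the next value of c whenever the current polynomial fails.
from math import gcd

def rho(n, c=1, x0=2):
    # Floyd cycle-finding with the iteration v -> v*v + c (mod n).
    f = lambda v: (v * v + c) % n
    x, y, d = x0, x0, 1
    while d == 1:
        x = f(x)          # tortoise
        y = f(f(y))       # hare
        d = gcd(abs(x - y), n)
    return d if d != n else None   # None means: try another c

def factor(n):
    # Assumes n is composite; tries c = 1, 2, 3, ... until a factor appears.
    c = 1
    while True:
        d = rho(n, c)
        if d:
            return d
        c += 1            # switch to the next polynomial in the family

print(factor(10403))  # a nontrivial factor (10403 = 101 * 103)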

Operate on all rows of a matrix without loop, octave

Suppose I have a matrix A and I want to obtain the following:
for i=1:m
  A(i,:) = something which depends on i;
endfor
Is there a way to obtain that without the loop?
Added: OK, I understand I have to be more specific.
I have two matrices B and C (all the matrices we are considering have m rows).
I want to record in the i-th row of A the product of the polynomials written in the i-th rows of B and C (so using the loop I would call the conv function).
Any ideas?
I don't think it's possible to do this without a for loop, as conv only accepts vector inputs, but I might be wrong. I can't see a way of using either bsxfun or arrayfun in conjunction with conv and matrix inputs, so I stand to be corrected.
That is a very, very general question, and not possible to answer without more details, mainly about what i will be involved with. Suppose the following:
for i = 1:m
  A(i,:) += i;
endfor
It could be written with the much more efficient:
A .+ (1:m)'
Just compare:
octave> n = 1000;
octave> A = B = rand (n);
octave> tic; for i = 1:n, B(i,:) += i; endfor; toc
Elapsed time is 0.051 seconds.
octave> tic; C = A.+ (1:n)'; toc
Elapsed time is 0.01 seconds.
octave> isequal (C, B)
ans = 1
If you have a very old version of Octave, you can instead do bsxfun (@plus, A, (1:m)').
However, if i on the right side of the expression will be used for indexing some other variable, then the solution would be different. Maybe the solution is cumsum, or some other cum* function. Your question is basically "how do I vectorize code?", which is a really large subject, and you haven't told us what you're trying to vectorize.

Is the use of machine epsilon appropriate for floating-point equality tests?

This is a follow-up to Testing for floating-point value equality: Is there a standard name for the “precision” constant?.
There is a very similar question Double.Epsilon for equality, greater than, less than, less than or equal to, greater than or equal to.
It is well known that an equality test for two floating-point values x and y should look more like this (rather than a straightforward =):
abs( x - y ) < epsilon , where epsilon is some very small value.
How to choose a value for epsilon?
It would obviously be preferable to choose for epsilon as small a value as possible, to get the highest-possible precision for the equality check.
As an example, the .NET framework offers a constant System.Double.Epsilon (= 4.94066 × 10^-324), which represents the smallest positive System.Double value that is greater than zero.
However, it turns out that this particular value can't be reliably used as epsilon, since:
0 + System.Double.Epsilon ≠ 0
1 + System.Double.Epsilon = 1 (!)
which is, if I understand correctly, because that constant is less than machine epsilon.
→ Is this correct?
→ Does this also mean that I can reliably use epsilon := machine epsilon for equality tests?
Removed these two questions, as they are already adequately answered by the second SO question linked-to above.
The linked-to Wikipedia article says that for 64-bit floating-point numbers (ie. the double type in many languages), machine epsilon is equal to:
2^-53, or approx. 0.000000000000000111 (a number with 15 zeroes after the decimal point)
→ Does it follow from this that all 64-bit floating point values are guaranteed to be accurate to 14 (if not 15) digits?
How to choose a value for epsilon?
Short Answer: You take a small value which fits your application's needs.
Long Answer: Nobody can know which calculations your application does and how accurate you expect your results to be. Since rounding errors add up, machine epsilon will almost always be too strict a tolerance, so you have to choose your own value. Depending on your needs, 0.01 may be sufficient, or maybe 0.00000000000001 or less will be required.
The question is, do you really want/need to do equality tests on floating point values? Maybe you should redesign your algorithms.
In the past when I have had to use an epsilon value it's been very much bigger than the machine epsilon value.
Although it was for 32-bit floats (rather than 64-bit doubles), we found that an epsilon value of 10^-6 was needed for most (if not all) calculated values in our particular application.
The value of epsilon you choose depends on the scale of your numbers. If you are dealing with the very large (10^10, say) then you might need a larger value of epsilon, as your significant digits don't stretch very far into the fractional part (if at all). If you are dealing with the very small (10^-10, say) then obviously you need an epsilon value that's smaller than this.
You need to do some experimentation, performing your calculations and checking the differences between your output values. Only when you know the range of your potential answers will you be able to decide on a suitable value for your application.
The sad truth is: There is no appropriate epsilon for floating-point comparisons. Use another approach for floating-point equality tests if you don't want to run into serious bugs.
Approximate floating-point comparison is an amazingly tricky field, and the abs(x - y) < eps approach works only for a very limited range of values, mainly because of the absolute difference not taking into account the magnitude of the compared values, but also due to the significant digit cancellation occurring in the subtraction of two floating-point values with different exponents.
There are better approaches, using relative differences or ULPs, but they have their own shortcomings and pitfalls. Read Bruce Dawson's excellent article Comparing Floating Point Numbers, 2012 Edition for a great introduction into how tricky floating-point comparisons really are -- a must-read for anyone doing floating-point programming IMHO! I'm sure countless thousands of man-years have been spent finding out the subtle bugs due to naive floating-point comparisons.
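As a concrete illustration of the relative-difference idea (a sketch in the spirit of the approaches Dawson discusses, not his code; rel_tol and abs_tol are application-specific knobs you would have to tune):
def nearly_equal(x, y, rel_tol=1e-9, abs_tol=0.0):
    # Relative test scaled by the larger magnitude, with an absolute floor
    # so that comparisons against zero do not always fail.
    return abs(x - y) <= max(rel_tol * max(abs(x), abs(y)), abs_tol)

print(nearly_equal(0.1 + 0.2, 0.3))            # True
print(0.1 + 0.2 == 0.3)                        # False
print(nearly_equal(1e-12, 0.0))                # False without an absolute floor
print(nearly_equal(1e-12, 0.0, abs_tol=1e-9))  # True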
I also have questions regarding what would be the correct procedure. However I believe one should do:
abs(x - y) <= 0.5 * eps * max(abs(x), abs(y))
instead of:
abs(x - y) < eps
The reason for this arises from the definition of the machine epsilon. Using python code:
import numpy as np
real = np.float64
eps = np.finfo(real).eps
## Let's get the machine epsilon
x, dx = real(1), real(1)
while x+dx != x: dx/= real(2) ;
print "eps = %e dx = %e eps*x/2 = %e" % (eps, dx, eps*x/real(2))
Which gives: eps = 2.220446e-16 dx = 1.110223e-16 eps*x/2 = 1.110223e-16
## Now for x=16
x, dx = real(16), real(1)
while x+dx != x: dx/= real(2) ;
print "eps = %e dx = %e eps*x/2 = %e" % (eps, dx, eps*x/real(2))
Which now gives: eps = 2.220446e-16 dx = 1.776357e-15 eps*x/2 = 1.776357e-15
## For x not equal to 2**n
x, dx = real(36), real(1)
while x+dx != x: dx/= real(2) ;
print "eps = %e dx = %e eps*x/2 = %e" % (eps, dx, eps*x/real(2))
Which returns: eps = 2.220446e-16 dx = 3.552714e-15 eps*x/2 = 3.996803e-15
However, despite the difference between dx and eps*x/2, we see that dx <= eps*x/2, so the bound serves its purpose for equality tests, tolerance checks, convergence tests in numerical procedures, and so on.
This is similar to what is described at www.ibiblio.org/pub/languages/fortran/ch1-8.html#02;
however if someone knows of better procedures or if something here is incorrect, please do say.
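For reference, here is the test proposed above wrapped in a small function, using NumPy's reported float64 epsilon; the example inputs are arbitrary.
import numpy as np

EPS = np.finfo(np.float64).eps   # 2.220446e-16, as printed above

def almost_equal(x, y):
    return abs(x - y) <= 0.5 * EPS * max(abs(x), abs(y))

print(almost_equal(1.0, 1.0 + EPS / 4))   # True: 1.0 + EPS/4 rounds back to 1.0
print(almost_equal(1.0, 1.0 + 10 * EPS))  # False: the difference exceeds EPS/2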