I have a simple class that contains three unsigned integer fields: a whole value, a numerator, and a denominator, representing a mixed number of the form:
<Whole> <Num>/<Den> // e.g. 3 1/2
I'd like to be able to multiply instances of this class with each other, but since my main app uses relatively large numbers, I'm concerned about overflow. Is there an algorithm for performing this kind of multiplication that minimizes the potential for overflow?
I'm OK with having overflow if it's unavoidable, what I'm looking for is for a way to "intelligently" multiply to avoid having overflow if it's possible.
I'm not sure if you actually needed info on multiplying mixed numbers... but this site explains how to do it fairly simply: Multiplying Mixed Numbers.
At any rate... the data structure you've created has inherited the limitations of its parts. That is to say, even if you were just working with plain unsigned ints, you would still end up with the potential for overflow. If you're worried about blowing out your unsigned int then you should consider bumping the type you're using up to something that can handle larger numbers.
Wikipedia has a pretty good summary on Arithmetic Overflow and some ideas for handling it: Arithmetic Overflow
Calculating the least-common-multiple (LCM) of the two denominators can help to keep the numbers small. There is a lot of info on wikipedia, have a look at the "Reduction by greatest common divisor" section of http://en.wikipedia.org/wiki/Least_common_multiple and the "Implementations" section of http://en.wikipedia.org/wiki/Euclidean_algorithm.
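In the same spirit, cross-cancelling by GCD before multiplying keeps the intermediate products as small as possible. A minimal Python sketch (the function name and the (whole, numerator, denominator) representation are assumptions matching the question):

```python
from math import gcd

def multiply_mixed(w1, n1, d1, w2, n2, d2):
    """Multiply (w1 + n1/d1) * (w2 + n2/d2), cross-cancelling by GCD
    before multiplying so the intermediate values stay as small as possible."""
    # convert both mixed numbers to improper fractions
    a, b = w1 * d1 + n1, d1
    c, d = w2 * d2 + n2, d2
    # cancel each numerator against the opposite denominator first
    g = gcd(a, d); a //= g; d //= g
    g = gcd(c, b); c //= g; b //= g
    num, den = a * c, b * d
    g = gcd(num, den)
    num //= g; den //= g
    # split back into mixed-number form
    return num // den, num % den, den
```

For example, 3 1/2 times 2 2/3 is 7/2 * 8/3; cancelling the 8 against the 2 first means multiplying 7 * 4 instead of 7 * 8, and multiply_mixed(3, 1, 2, 2, 2, 3) gives (9, 1, 3), i.e. 9 1/3.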
There is a way of doing it without resorting to arbitrary-precision arithmetic as well. Unless you are coding in assembly, it's more of a curiosity than a useful algorithm, but it may be worth mentioning.
int dividend = 0;
int result = 0;
int remainder = 0;
// consume the bits of num from the most significant downwards,
// interleaving the multiplication by whole with the division by div
while (num != 0) {
    boolean bit = (num & 0x80000000) != 0; // topmost bit of num
    dividend = remainder << 1;
    if (bit) {
        dividend += whole;
    }
    int quotient = dividend / div;
    result = (result << 1) + quotient;
    remainder = dividend % div;
    num = num << 1; // shift the consumed bit out
}
result <<= 1;
result += (remainder << 1) / div;
I know the loop is clumsy; my mind's gone blank and I can't rephrase it so that everything fits neatly into it, but you should get the general idea, which is basically to perform the multiplication bit by bit while you're doing the division.
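For what it's worth, the same interleaved multiply-and-divide idea can be written as a runnable Python sketch (the function name muldiv is made up). Python integers never overflow, so this is purely illustrative; the point is that with fixed-width types the working values stay bounded by roughly div plus the final quotient, never reaching the full product:

```python
def muldiv(whole, num, div):
    """Compute (whole * num) // div bit by bit, never forming the
    full product whole * num."""
    result = 0
    remainder = 0
    # consume the bits of num from most significant to least significant;
    # invariant: result/remainder are quotient/remainder of whole * prefix
    for i in reversed(range(num.bit_length())):
        remainder <<= 1
        result <<= 1
        if (num >> i) & 1:
            remainder += whole
        result += remainder // div
        remainder %= div
    return result
```

For example muldiv(7, 5, 3) returns 11, which is (7 * 5) // 3, without ever computing 35.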
I'm implementing a realtime graphics engine (C++ / OpenGL) that moves a vehicle over time along a specified course that is described by a polynomial function. The function itself was programmatically generated outside the application and is of a high order (I believe >25), so I can't really post it here (I don't think it matters anyway). During runtime the function does not change, so it's easy to calculate the first and second derivatives once to have them available quickly later on.
My problem is that I have to move along the curve with a constant speed (say 10 units per second), so my function parameter is not equal to the time directly, since the arc length between two points x1 and x2 differs depending on the function values. For example, the difference f(a+1) - f(a) may be way larger or smaller than f(b+1) - f(b), depending on how the function looks at points a and b.
I don't need a 100% accurate solution, since the movement is only visual and will not be processed any further, so any approximation is OK as well. Also please keep in mind that the whole thing has to be calculated at runtime each frame (60fps), so solving huge equations with complex math may be out of the question, depending on computation time.
I'm a little lost on where to start, so even any train of thought would be highly appreciated!
Since the criterion was not to have an exact solution, but a visually appealing approximation, there were multiple possible solutions to try out.
The first approach I implemented (suggested by Alnitak in the comments and later answered by coproc) approximates the actual arc-length integral by summing tiny steps. This version worked really well most of the time, but was not reliable at really steep angles and used too many iterations at flat angles. As coproc already pointed out in the answer, a possible solution would be to base dx on the second derivative.
All these adjustments could be made; however, I need a runtime-friendly algorithm, and with this one it is hard to predict the number of iterations, which is why I was not happy with it.
The second approach (also inspired by Alnitak) is utilizing the first derivative by "pushing" the vehicle along the calculated slope (which is equal to the derivative at the current x value). The function for calculating the next x value is really compact and fast. Visually there is no obvious inaccuracy and the result is always consistent. (That's why I chose it)
float current_x = ...; // stores current x
float f(float x) {...}
float f_derv(float x) {...}

void calc_next_x(float units_per_second, float time_delta) {
    float arc_length = units_per_second * time_delta;
    float derv_squared = f_derv(current_x) * f_derv(current_x);
    current_x += arc_length / sqrt(derv_squared + 1);
}
This approach, however, will probably only be accurate enough at high frame rates (mine is >60fps), since the object is always pushed along a straight line whose length depends on the frame time.
Given the constant speed and the time between frames the desired arc length between frames can be computed. So the following function should do the job:
#include <cmath>
typedef double (*Function)(double);
double moveOnArc(Function f, const double xStart, const double desiredArcLength, const double dx = 1e-2)
{
    double arcLength = 0.;
    double fPrev = f(xStart);
    double x = xStart;
    double dx2 = dx * dx;
    while (arcLength < desiredArcLength)
    {
        x += dx;
        double fx = f(x);
        double dfx = fx - fPrev;
        arcLength += sqrt(dx2 + dfx * dfx);
        fPrev = fx;
    }
    return x;
}
Since you say that accuracy is not a top criterion, with an appropriately chosen dx the above function might work right away. Of course, it could be improved by adjusting dx automatically (e.g. based on the second derivative) or by refining the endpoint with a binary search.
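A sketch of that combination (tiny chord steps plus a binary-search refinement of the final step) might look like the following; it is written in Python for brevity, the names are made up, and it is only meant to illustrate the idea:

```python
import math

def move_on_arc(f, x_start, desired_arc_length, dx=1e-2, refine_iters=20):
    """Step along y = f(x) summing chord lengths, then bisect inside the
    last step so the accumulated arc length matches the target closely."""
    arc = 0.0
    x = x_start
    f_prev = f(x)
    # coarse phase: fixed-size steps until the next step would overshoot
    while True:
        fx = f(x + dx)
        seg = math.hypot(dx, fx - f_prev)
        if arc + seg >= desired_arc_length:
            break
        arc += seg
        x += dx
        f_prev = fx
    # fine phase: binary search for the fraction of the last step needed
    lo, hi = 0.0, dx
    for _ in range(refine_iters):
        mid = (lo + hi) / 2
        if arc + math.hypot(mid, f(x + mid) - f_prev) < desired_arc_length:
            lo = mid
        else:
            hi = mid
    return x + (lo + hi) / 2
```

On the straight line f(x) = x, an arc length of sqrt(2) corresponds to moving one unit in x, and the sketch lands very close to x = 1.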
I want to make a program that can compare the prime factorisations of numbers of the size 100^100. I want the program to tell me if there is, for example, a 2 in the prime factorisation, but I don't want to know how many there are. Then I want to compare the prime factorisations of two numbers. Could someone help? The difficulty is really the size of the numbers, and the efficiency of the program: I would like comparing two numbers to take at most about 10 seconds.
I have this;
import java.util.ArrayList;
import java.util.List;
public class PrimeFactorisation {
public static List<Long> primeFactors(long number) {
    long n = number;
    List<Long> factors = new ArrayList<Long>();
    for (long i = 2; i <= n / i; i++) {
        while (n % i == 0) {
            factors.add(i);
            n /= i;
        }
    }
    if (n > 1) {
        factors.add(n);
    }
    return factors;
}

public static void main(String[] args) {
    System.out.println("Prime factors of 12345678901");
    for (Long factor : primeFactors(12345678901L)) {
        System.out.println(factor);
    }
}
}
This code is for Java, but I am primarily searching for something efficient, so I am willing to change the language...
In general, you won't be able to find factors of numbers that large. But if you only want to know whether two numbers have the same factors except for multiplicity, it's fairly easy. Use Euclid's algorithm to find the greatest common factor of the two numbers. If the result is 1, the numbers are coprime and do not have the same factors. Otherwise, factor the greatest common factor into primes; that should be much easier than factoring the two big numbers, since it will probably be much smaller. Then divide each of the big numbers by each of the common prime factors until it no longer divides evenly. If, after dividing out all of the common prime factors, the two remainders are the same, you can declare that the two original numbers have the same prime factors, just with different multiplicities; otherwise they have prime factors that aren't shared. Ask if you need to know more.
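A gcd-only variant of this idea avoids factoring even the GCD: repeatedly divide each number by its gcd with the shared part until nothing shared remains. Both numbers reduce to 1 exactly when they have the same set of prime factors. A hedged Python sketch (function names are made up; Python's built-in big integers make the gcd calls cheap even at the 100^100 scale):

```python
from math import gcd

def strip_shared(n, g):
    """Divide out of n every prime factor it shares with g."""
    d = gcd(n, g)
    while d > 1:
        n //= d
        d = gcd(n, g)
    return n

def same_prime_factors(a, b):
    """True iff a and b have exactly the same set of prime factors,
    ignoring multiplicities; no explicit factorisation is needed."""
    if a == 1 or b == 1:
        return a == b
    g = gcd(a, b)
    if g == 1:
        return False  # coprime: no shared prime factor at all
    # same prime sets iff removing g's primes reduces both to 1
    return strip_shared(a, g) == 1 and strip_shared(b, g) == 1
```

For example, 12 = 2^2 * 3 and 18 = 2 * 3^2 share the prime set {2, 3}, so same_prime_factors(12, 18) is True, while same_prime_factors(12, 10) is False because of the 5.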
That's a rather odd request. What is your use case? Perhaps there is some other way to solve the underlying problem.
These are huge integers. Using a huge-integer library, you will get most of them down to pretty low values quite quickly by taking out the low factors, but not all of them - primes aren't uncommon.
But you don't need the factors, you need to know whether two numbers share a common factor. There is, I believe, a test for that. It's fairly complicated and a bit beyond my mathematical expertise, but it's a component of randomised primality testing.
I have a lot of random floating point numbers residing in global GPU memory. I also have "buckets" that specify ranges of numbers they will accept and a capacity of numbers they will accept.
ie:
numbers: -2 0 2 4
buckets(size=1): [-2, 0], [1, 5]
I want to run a filtration process that yields me
filtered_nums: -2 2
(where filtered_nums can be a new block of memory)
But every approach I take runs into a huge overhead of trying to synchronize threads across bucket counters. If I try to use a single-thread, the algorithm completes successfully, but takes frighteningly long (over 100 times slower than generating the numbers in the first place).
What I am asking for is a general high-level, efficient, as-simple-as-possible approach algorithm that you would use to filter these numbers.
edit
I will be dealing with 10 buckets and half a million numbers, where every number falls into exactly one of the 10 bucket ranges. Each bucket will hold 43000 elements. (There are excess elements, since the objective is to fill every bucket, and many numbers will be discarded.)
2nd edit
It's important to point out that the buckets do not have to be stored individually. The objective is just to discard elements that would not fit into a bucket.
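For reference, a sequential Python sketch of the intended filtering semantics (the (lo, hi, capacity) bucket representation and the inclusive bounds are assumptions based on the example above); this is the behaviour any parallel version would need to reproduce:

```python
def filter_into_buckets(numbers, buckets):
    """buckets is a list of (lo, hi, capacity) triples; a number is kept
    if some bucket whose range contains it still has room."""
    counts = [0] * len(buckets)
    kept = []
    for x in numbers:
        for i, (lo, hi, cap) in enumerate(buckets):
            if lo <= x <= hi and counts[i] < cap:
                counts[i] += 1
                kept.append(x)
                break  # each number lands in at most one bucket
    return kept
```

With the example data, filter_into_buckets([-2, 0, 2, 4], [(-2, 0, 1), (1, 5, 1)]) yields [-2, 2]: the 0 and the 4 are discarded because their buckets are already full.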
You can use thrust::copy_if, which copies exactly the elements for which the predicate returns true:
struct within_limit
{
    __host__ __device__
    bool operator()(const float x)
    {
        return (x >= lo && x < hi);
    }
};

thrust::copy_if(input, input + N, result, within_limit());
You will have to replace lo and hi with constants for each bin.
I think you can templatize the kernel, but then again you will have to instantiate the template with actual constants. I can't see an easy way around it, but I may be missing something.
If you are willing to look at third party libraries, arrayfire may offer an easier solution.
array I = array(N, input, afDevice);
float **Res = (float **)malloc(sizeof(float *) * nbins);
for (int i = 0; i < nbins; i++) {
    array res = where(I >= lo[i] && I < hi[i]);
    Res[i] = res.device<float>();
}
I'm curious about an issue spotted in our team with a very large number:
var n:Number = 64336512942563914;
trace(n < Number.MAX_VALUE); // true
trace(n); // 64336512942563910
var a1:Number = n + 4;
var a2:Number = a1 - n;
trace(a2); // 8 Expect to see 4
trace(n + 4 - n); // 8
var a3:Number = parseInt("64336512942563914");
trace(a3); // 64336512942563920
n++;
trace(n); //64336512942563910
trace(64336512942563914 == 64336512942563910); // true
What's going on here?
Although n is large, it's smaller than Number.MAX_VALUE, so why am I seeing such odd behaviour?
I thought that perhaps it was an issue with formatting large numbers when being trace'd out, but that doesn't explain n + 4 - n == 8
Is this some weird floating point number issue?
Yes, it is a floating point issue, but it is not a weird one. It is all expected behavior.
The Number data type in AS3 is actually a "64-bit double-precision format as specified by the IEEE Standard for Binary Floating-Point Arithmetic (IEEE-754)" (source). Because the number you assigned to n has too many digits to fit into those 64 bits, it gets rounded off, and that's the reason for all the "weird" results.
If you need to do some exact big integer arithmetic, you will have to use a custom big integer class, e.g. this one.
Numbers in Flash are double-precision floating point numbers, meaning they store a number of significant digits and an exponent. The format favors a larger range of expressible numbers over precision due to memory constraints. For numbers with too many significant digits, rounding will occur, which is what you are seeing: the least significant digits are being rounded. Google "double precision floating point numbers" and you'll find a bunch of technical information on why.
It is the nature of the datatype. If you need precise numbers you should really stick to uint or int based integers. Other languages have fixed point or bigint number processing libraries (sometimes called BigInt or Decimal) which are wrappers around ints and longs to express much larger numbers at the cost of memory consumption.
We have an as3 implementation of BigDecimal copied from Java that we use for ALL calculations. In a trading app the floating point errors were not acceptable.
To be safe with integer math when using Numbers, I checked the AS3 documentation for Number and it states:
The Number class can be used to represent integer values well beyond
the valid range of the int and uint data types. The Number data type
can use up to 53 bits to represent integer values, compared to the 32
bits available to int and uint.
A 53 bit integer gets you to 2^53 - 1, if I'm not mistaken, which is 9007199254740991, or about 9 quadrillion. The other 11 bits that help make up the 64 bit Number are used in the exponent. The number used in the question is about 64.3 quadrillion. Going past that point (9 quadrillion) requires more bits for the significant number portion (the mantissa) than is allotted, and so rounding occurs. A helpful video explaining why this makes sense (by PBS Studio's Infinite Series).
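Since Python floats are the same IEEE-754 doubles that AS3's Number uses, the behaviour from the question can be reproduced there directly:

```python
# the integer literal has 56 significant bits, three more than a double's
# 53-bit mantissa, so conversion rounds to the nearest multiple of 8
n = float(64336512942563914)

print(int(n))        # 64336512942563912: the nearest representable double
print((n + 4) - n)   # 8.0: n + 4 rounds up to the next representable double
print(2 ** 53 - 1)   # 9007199254740991: largest gap-free integer range
```

Above 2^53 the spacing between adjacent doubles exceeds 1, which is exactly why n + 4 cannot land on an integer 4 away from n.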
So yeah, one must search for outside resources such as the BigInt. Hopefully, the resources I linked to are useful to someone.
This is, indeed, a floating point number approximation issue.
I guess n is too large compared to 4, so it has to stick with children of its age:
trace(n - n + 4) is ok since it does n-n = 0; 0 + 4 = 4;
Actually, Number is not the type to be used for large integers, but floating point numbers. If you want to compute large integers you have to stay within the limit of uint.MAX_VALUE.
Cheers!
The number of combinations of k items which can be retrieved from N items is described by the following formula.
c = N! / (k! * (N - k)!)
An example would be how many combinations of 6 Balls can be drawn from a drum of 48 Balls in a lottery draw.
Optimize this formula to run with the smallest O time complexity
This question was inspired by the new WolframAlpha math engine and the fact that it can calculate extremely large combinations very quickly (e.g. the link below), and by a subsequent discussion on the topic on another forum.
http://www97.wolframalpha.com/input/?i=20000000+Choose+15000000
I'll post some info/links from that discussion after some people take a stab at the solution.
Any language is acceptable.
Python: O(min(k, n-k)^2)
def choose(n, k):
    k = min(k, n - k)
    p = q = 1
    for i in range(k):
        p *= n - i
        q *= 1 + i
    return p // q
Analysis:
The size of p and q will increase linearly inside the loop, if n-i and 1+i can be considered to have constant size.
The cost of each multiplication will then also increase linearly.
This sum of all iterations becomes an arithmetic series over k.
My conclusion: O(k^2)
If rewritten to use floating point numbers, the multiplications will be atomic operations, but we will lose a lot of precision. It even overflows for choose(20000000, 15000000). (Not a big surprise, since the result would be around 0.2119620413×10^4884378.)
def choose(n, k):
    k = min(k, n - k)
    result = 1.0
    for i in range(k):
        result *= 1.0 * (n - i) / (1 + i)
    return result
Notice that WolframAlpha returns a "Decimal Approximation". If you don't need absolute precision, you could do the same thing by calculating the factorials with Stirling's Approximation.
Now, Stirling's approximation requires the evaluation of (n/e)^n, where e is the base of the natural logarithm, which will be by far the slowest operation. But this can be done using the techniques outlined in another stackoverflow post.
If you use double precision and repeated squaring to accomplish the exponentiation, the operations will be:
3 evaluations of a Stirling approximation, each requiring O(log n) multiplications and one square root evaluation.
2 multiplications
1 division
The number of operations could probably be reduced with a bit of cleverness, but the total time complexity is going to be O(log n) with this approach. Pretty manageable.
EDIT: There's also bound to be a lot of academic literature on this topic, given how common this calculation is. A good university library could help you track it down.
EDIT2: Also, as pointed out in another response, the values will easily overflow a double, so a floating point type with very extended precision will need to be used for even moderately large values of k and n.
I'd solve it in Mathematica:
Binomial[n, k]
Man, that was easy...
Python: approximation in O(1) ?
Using python decimal implementation to calculate an approximation. Since it does not use any external loop, and the numbers are limited in size, I think it will execute in O(1).
from decimal import Decimal
ln = lambda z: z.ln()
exp = lambda z: z.exp()
sinh = lambda z: (exp(z) - exp(-z))/2
sqrt = lambda z: z.sqrt()
pi = Decimal('3.1415926535897932384626433832795')
e = Decimal('2.7182818284590452353602874713527')
# Stirling's approximation of the gamma function.
# Simplification by Robert H. Windschitl.
# Source: http://en.wikipedia.org/wiki/Stirling%27s_approximation
gamma = lambda z: sqrt(2*pi/z) * (z/e*sqrt(z*sinh(1/z)+1/(810*z**6)))**z
def choose(n, k):
    n = Decimal(str(n))
    k = Decimal(str(k))
    return gamma(n+1)/gamma(k+1)/gamma(n-k+1)
Example:
>>> choose(20000000,15000000)
Decimal('2.087655025913799812289651991E+4884377')
>>> choose(130202807,65101404)
Decimal('1.867575060806365854276707374E+39194946')
Any higher, and it will overflow. The exponent seems to be limited to 40000000.
Given a reasonable number of values for N and k, calculate them in advance and use a lookup table.
It's dodging the issue in some fashion (you're offloading the calculation), but it's a useful technique if you're having to determine large numbers of values.
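For instance, a Pascal's-triangle table covering the lottery-sized inputs from the question can be built once and then indexed in O(1); a sketch in Python:

```python
def pascal_table(max_n):
    """Precompute C(n, k) for all 0 <= k <= n <= max_n using Pascal's
    rule C(n, k) = C(n-1, k-1) + C(n-1, k). O(max_n^2) time/space, exact."""
    table = [[1]]
    for n in range(1, max_n + 1):
        prev = table[-1]
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1]
        table.append(row)
    return table
```

After C = pascal_table(48), the 6-of-48 lottery case is just C[48][6], which is 12271512.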
MATLAB:
The cheater's way (using the built-in function NCHOOSEK): 13 characters, O(?)
nchoosek(N,k)
My solution: 36 characters, O(min(k,N-k))
a=min(k,N-k);
prod(N-a+1:N)/prod(1:a)
I know this is a really old question, but I struggled with a solution to this problem for a long while until I found a really simple one written in VB 6. After porting it to C#, here is the result:
public int NChooseK(int n, int k)
{
    var result = 1;
    for (var i = 1; i <= k; i++)
    {
        result *= n - (k - i);
        result /= i;
    }
    return result;
}
The final code is so simple you won't believe it will work until you run it.
Also, the original article gives some nice explanation on how he reached the final algorithm.