I'm currently writing some code where I have something along the lines of:
double a = SomeCalculation1();
double b = SomeCalculation2();
if (a < b)
DoSomething2();
else if (a > b)
DoSomething3();
And then in other places I may need to do equality:
double a = SomeCalculation3();
double b = SomeCalculation4();
if (a != 0.0)
DoSomethingUseful(1 / a);
if (b == 0.0)
return 0; // or something else here
In short, I have lots of floating point math going on and I need to do various comparisons for conditions. I can't convert it to integer math because such a thing is meaningless in this context.
I've read before that floating point comparisons can be unreliable, since you can have things like this going on:
double a = 1.0 / 3.0;
double b = a + a + a;
if ((3 * a) != b)
Console.WriteLine("Oh no!");
In short, I'd like to know: How can I reliably compare floating point numbers (less than, greater than, equality)?
The number range I am using is roughly from 1e-14 to 1e6, so I need to work with very small numbers as well as large ones.
I've tagged this as language agnostic because I'm interested in how I can accomplish this no matter what language I'm using.
TL;DR
Use the following function instead of the currently accepted solution to avoid some undesirable results in certain limit cases, while being potentially more efficient.
Know the expected imprecision of your numbers and feed it into the comparison function accordingly.
#include <algorithm> // std::min, std::max
#include <cassert>
#include <cfloat>    // FLT_EPSILON, FLT_MIN
#include <cmath>     // std::abs
#include <limits>

bool nearly_equal(
    float a, float b,
    float epsilon = 128 * FLT_EPSILON, float abs_th = FLT_MIN)
    // those defaults are arbitrary and could be removed
{
    assert(std::numeric_limits<float>::epsilon() <= epsilon);
    assert(epsilon < 1.f);

    if (a == b) return true;

    auto diff = std::abs(a - b);
    auto norm = std::min(std::abs(a) + std::abs(b), std::numeric_limits<float>::max());
    // or even faster: std::min(std::abs(a + b), std::numeric_limits<float>::max());
    // keeping this commented out until I update figures below
    return diff < std::max(abs_th, epsilon * norm);
}
Graphics, please?
When comparing floating point numbers, there are two "modes".
The first is the relative mode, where the difference between x and y is considered relative to their amplitude |x| + |y|. When plotted in 2D, it gives the following profile, where green means equality of x and y (I took an epsilon of 0.5 for illustration purposes).
The relative mode is what is used for "normal" or "large enough" floating point values. (More on that later.)
The second is the absolute mode, where we simply compare their difference to a fixed number. It gives the following profile (again with an epsilon of 0.5 and an abs_th of 1 for illustration).
This absolute mode of comparison is what is used for "tiny" floating point values.
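In code, the two modes boil down to two simple predicates (a sketch; epsilon and abs_th play the same roles as in the function above):

#include <cmath>

// Relative mode: the tolerance scales with the amplitude |x| + |y|.
bool rel_equal(float x, float y, float epsilon)
{
    return std::abs(x - y) < epsilon * (std::abs(x) + std::abs(y));
}

// Absolute mode: the tolerance is a fixed threshold.
bool abs_equal(float x, float y, float abs_th)
{
    return std::abs(x - y) < abs_th;
}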
Now the question is: how do we stitch together these two response patterns?
In Michael Borgwardt's answer, the switch is based on the value of diff, which should be below abs_th (Float.MIN_NORMAL in his answer). This switch zone is shown as hatched in the graph below.
Because abs_th * epsilon is smaller than abs_th, the green patches do not stick together, which in turn gives the solution a bad property: we can find triplets of numbers such that x < y1 < y2 and yet x == y2 but x != y1.
Take this striking example:
x = 4.9303807e-32
y1 = 4.930381e-32
y2 = 4.9309825e-32
We have x < y1 < y2, and in fact y2 - x is more than 2000 times larger than y1 - x. And yet with the current solution,
nearlyEqual(x, y1, 1e-4) == False
nearlyEqual(x, y2, 1e-4) == True
By contrast, in the solution proposed above, the switch zone is based on the value of |x| + |y|, which is represented by the hatched square below. It ensures that both zones connect gracefully.
Also, the code above has no branching, which could be more efficient: operations such as max and abs, which a priori need branching, often have dedicated assembly instructions. For this reason, I think this approach is superior to the alternative of fixing Michael's nearlyEqual by changing the switch from diff < abs_th to diff < eps * abs_th, which would then produce essentially the same response pattern.
Where to switch between relative and absolute comparison?
The switch between those modes is made around abs_th, which is taken as FLT_MIN in the accepted answer. This choice means that the representation of float32 is what limits the precision of our floating point numbers.
This does not always make sense. For example, if the numbers you compare are the results of a subtraction, perhaps something in the range of FLT_EPSILON makes more sense. If they are square roots of subtracted numbers, the numerical imprecision could be even higher.
It is rather obvious when you consider comparing a floating point with 0. Here, any relative comparison will fail, because |x - 0| / (|x| + 0) = 1. So the comparison needs to switch to absolute mode when x is on the order of the imprecision of your computation -- and rarely is it as low as FLT_MIN.
This is the reason for the introduction of the abs_th parameter above.
Also, by not multiplying abs_th by epsilon, the interpretation of this parameter is simple: it corresponds to the level of numerical precision that we expect on those numbers.
Mathematical rambling
(kept here mostly for my own pleasure)
More generally I assume that a well-behaved floating point comparison operator =~ should have some basic properties.
The following are rather obvious:
self-equality: a =~ a
symmetry: a =~ b implies b =~ a
invariance by opposition: a =~ b implies -a =~ -b
(We don't have "a =~ b and b =~ c implies a =~ c"; =~ is not an equivalence relation.)
I would add the following properties that are more specific to floating point comparisons
if a < b < c, then a =~ c implies a =~ b (closer values should also be equal)
if a, b, m >= 0 then a =~ b implies a + m =~ b + m (larger values with the same difference should also be equal)
if 0 <= λ < 1 then a =~ b implies λa =~ λb (perhaps less obvious to argue for).
These properties already put strong constraints on possible near-equality functions. The function proposed above satisfies them. Perhaps one or several otherwise obvious properties are missing.
When one thinks of =~ as a family of equality relations =~[Ɛ,t] parameterized by Ɛ and abs_th, one could also add
if Ɛ1 < Ɛ2 then a =~[Ɛ1,t] b implies a =~[Ɛ2,t] b (equality for a given tolerance implies equality at a higher tolerance)
if t1 < t2 then a =~[Ɛ,t1] b implies a =~[Ɛ,t2] b (equality for a given imprecision implies equality at a higher imprecision)
The proposed solution satisfies these as well.
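As a quick sanity check, these properties can be spot-checked on concrete values (a sketch that assumes the nearly_equal above, with its default arguments, is in scope; the sample values are arbitrary):

#include <cassert>

int main()
{
    // if a < b < c, then a =~ c implies a =~ b
    float a = 1.0f, b = 1.0f + 1e-6f, c = 1.0f + 1e-5f;
    if (nearly_equal(a, c)) assert(nearly_equal(a, b));

    // if a, b, m >= 0, then a =~ b implies a + m =~ b + m
    float m = 100.0f;
    if (nearly_equal(a, b)) assert(nearly_equal(a + m, b + m));
    return 0;
}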
Comparing for greater/smaller is not really a problem unless you're working right at the edge of the float/double precision limit.
For a "fuzzy equals" comparison, this (Java code, should be easy to adapt) is what I came up with for The Floating-Point Guide after a lot of work and taking into account lots of criticism:
public static boolean nearlyEqual(float a, float b, float epsilon) {
    final float absA = Math.abs(a);
    final float absB = Math.abs(b);
    final float diff = Math.abs(a - b);

    if (a == b) { // shortcut, handles infinities
        return true;
    } else if (a == 0 || b == 0 || diff < Float.MIN_NORMAL) {
        // a or b is zero or both are extremely close to it
        // relative error is less meaningful here
        return diff < (epsilon * Float.MIN_NORMAL);
    } else { // use relative error
        return diff / (absA + absB) < epsilon;
    }
}
It comes with a test suite. You should immediately dismiss any solution that doesn't, because it is virtually guaranteed to fail in some edge cases, like one value being 0, two very small values on opposite sides of zero, or infinities.
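Concretely, those edge cases look like this (a sketch written against the C++ nearly_equal from the answer above, assumed to be in scope with its default arguments; Michael's Java version ships with its own suite):

#include <cassert>
#include <limits>

int main()
{
    float inf = std::numeric_limits<float>::infinity();
    assert(nearly_equal(inf, inf));        // equal infinities
    assert(!nearly_equal(inf, -inf));      // opposite infinities
    assert(nearly_equal(0.0f, -0.0f));     // signed zeros
    assert(nearly_equal(1e-40f, -1e-40f)); // tiny values on opposite sides of zero
    assert(!nearly_equal(1.0f, 1.1f));     // plainly different values
    return 0;
}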
An alternative (see link above for more details) is to convert the floats' bit patterns to integer and accept everything within a fixed integer distance.
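For illustration, here is a sketch of that idea in C++ (this is not the Guide's exact code; the bit remapping is the usual trick for IEEE-754 floats, and the NaN handling is my own assumption):

#include <cmath>
#include <cstdint>
#include <cstring>

// Map a float's bit pattern onto a monotonically increasing unsigned line,
// so that adjacent floats map to adjacent integers.
static uint32_t ordered_bits(float f)
{
    uint32_t u;
    std::memcpy(&u, &f, sizeof u); // well-defined, unlike pointer casts
    // IEEE-754 floats are sign-magnitude: flip all bits of negative values so
    // they sort below the positives; for positives, just set the sign bit.
    return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
}

bool nearly_equal_ulps(float a, float b, uint32_t max_ulps)
{
    if (std::isnan(a) || std::isnan(b)) return false;
    uint32_t ua = ordered_bits(a), ub = ordered_bits(b);
    uint32_t diff = ua > ub ? ua - ub : ub - ua;
    return diff <= max_ulps; // note: +0 and -0 differ by 1 ULP in this mapping
}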
In any case, there probably isn't any solution that is perfect for all applications. Ideally, you'd develop/adapt your own with a test suite covering your actual use cases.
I had the problem of comparing floating point numbers A < B and A > B.
Here is what seems to work:

if ((A - B < Epsilon) && (fabs(A - B) > Epsilon))
{
    printf("A is less than B");
}
if ((A - B > Epsilon) && (fabs(A - B) > Epsilon))
{
    printf("A is greater than B");
}

The fabs (absolute value) takes care of the case where they are essentially equal.
We have to choose a tolerance level to compare float numbers. For example,

const float TOLERANCE = 0.00001f;
if (Math.Abs(f1 - f2) < TOLERANCE)
    Console.WriteLine("Oh yes!");
One note. Your example is rather funny.
double a = 1.0 / 3.0;
double b = a + a + a;
if (a != b)
Console.WriteLine("Oh no!");
Some maths here:

a = 1/3
b = 1/3 + 1/3 + 1/3 = 1
1/3 != 1

Oh, yes...
Do you mean
if (b != 1)
Console.WriteLine("Oh no!")
An idea I had for floating point comparison in Swift:
infix operator ~= {}

func ~= (a: Float, b: Float) -> Bool {
    return fabsf(a - b) < Float(FLT_EPSILON)
}

func ~= (a: CGFloat, b: CGFloat) -> Bool {
    return fabs(a - b) < CGFloat(FLT_EPSILON)
}

func ~= (a: Double, b: Double) -> Bool {
    return fabs(a - b) < DBL_EPSILON // DBL_EPSILON, not FLT_EPSILON, for doubles
}
Adaptation to PHP from Michael Borgwardt & bosonix's answer:
class Comparison
{
    const MIN_NORMAL = 1.17549435E-38; // from the Java specs
    // from http://floating-point-gui.de/errors/comparison/
    public function nearlyEqual($a, $b, $epsilon = 0.000001)
    {
        $absA = abs($a);
        $absB = abs($b);
        $diff = abs($a - $b);

        if ($a == $b) {
            return true;
        } elseif ($a == 0 || $b == 0 || $diff < self::MIN_NORMAL) {
            return $diff < ($epsilon * self::MIN_NORMAL);
        } else {
            return $diff / ($absA + $absB) < $epsilon;
        }
    }
}
You should ask yourself why you are comparing the numbers. If you know the purpose of the comparison then you should also know the required accuracy of your numbers. That is different in each situation and each application context. But in pretty much all practical cases there is a required absolute accuracy. It is only very seldom that a relative accuracy is applicable.
To give an example: if your goal is to draw a graph on the screen, then you likely want floating point values to compare equal if they map to the same pixel on the screen. If the size of your screen is 1000 pixels, and your numbers are in the 1e6 range, then you likely will want 100 to compare equal to 200.
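For instance, with a hypothetical to_pixel mapping (the function name, screen width and value range here are made up to match the example):

// Two values are "equal" for drawing purposes iff they land on the same pixel.
int to_pixel(double v, double v_min, double v_max, int width_px)
{
    return (int)((v - v_min) / (v_max - v_min) * width_px);
}

With width_px = 1000 and values in [0, 1e6], both 100 and 200 map to pixel 0, so an absolute accuracy of 1000 (one pixel's worth) captures exactly the equality you want.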
Given the required absolute accuracy, the algorithm becomes:
enum ComparisonResult { EQUAL, SMALLER, LARGER, UNORDERED }

public static ComparisonResult compare(float a, float b, float accuracy)
{
    if (Float.isNaN(a) || Float.isNaN(b)) // if NaN needs to be supported
        return ComparisonResult.UNORDERED;
    if (a == b)                           // short-cut, takes care of infinities
        return ComparisonResult.EQUAL;
    if (Math.abs(a - b) < accuracy)       // comparison wrt. the accuracy
        return ComparisonResult.EQUAL;
    if (a < b)                            // larger / smaller
        return ComparisonResult.SMALLER;
    else
        return ComparisonResult.LARGER;
}
The standard advice is to use some small "epsilon" value (chosen depending on your application, probably), and consider floats that are within epsilon of each other to be equal. e.g. something like
#define EPSILON 0.00000001
if ((a - b) < EPSILON && (b - a) < EPSILON) {
printf("a and b are about equal\n");
}
A more complete answer is complicated, because floating point error is extremely subtle and confusing to reason about. If you really care about equality in any precise sense, you're probably seeking a solution that doesn't involve floating point.
I tried writing an equality function with the above comments in mind. Here's what I came up with:
Edit: Change from Math.Max(a, b) to Math.Max(Math.Abs(a), Math.Abs(b))
static bool fpEqual(double a, double b)
{
double diff = Math.Abs(a - b);
double epsilon = Math.Max(Math.Abs(a), Math.Abs(b)) * Double.Epsilon;
return (diff < epsilon);
}
Thoughts? I still need to work out a greater than, and a less than as well.
I came up with a simple approach to adjusting the size of epsilon to the size of the numbers being compared. So, instead of using:
iif(abs(a - b) < 1e-6, "equal", "not")
if a and b can be large, I changed that to:
iif(abs(a - b) < (10 ^ -abs(7 - log(a))), "equal", "not")
I suppose that doesn't satisfy all the theoretical issues discussed in the other answers, but it has the advantage of being one line of code, so it can be used in an Excel formula or an Access query without needing a VBA function.
I did a search to see if others have used this method and I didn't find anything. I tested it in my application and it seems to be working well. So it seems to be a method that is adequate for contexts that don't require the complexity of the other answers. But I wonder if it has a problem I haven't thought of since no one else seems to be using it.
If there's a reason the test with the log is not valid for simple comparisons of numbers of various sizes, please say why in a comment.
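For reference, the same idea transliterated to C++ (a sketch; like the formula above it assumes a > 0, and it assumes log means base 10, which is what Excel's LOG gives):

#include <cmath>

// Epsilon shrinks or grows with the magnitude of a, mirroring
// 10 ^ -abs(7 - log(a)) from the formula above. Only valid for a > 0.
bool log_scaled_equal(double a, double b)
{
    double eps = std::pow(10.0, -std::fabs(7.0 - std::log10(a)));
    return std::fabs(a - b) < eps;
}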
You need to take into account that the truncation error is a relative one. Two numbers are about equal if their difference is about as large as their ULP (unit in the last place).
However, if you do floating point calculations, your error potential goes up with every operation (esp. careful with subtractions!), so your error tolerance needs to increase accordingly.
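To get a feel for how big one ULP is at different magnitudes, a quick check (C++; the printed values are for IEEE-754 doubles):

#include <cmath>
#include <cstdio>

int main()
{
    // Distance to the next representable double, i.e. one ULP, at two magnitudes:
    std::printf("%g\n", std::nextafter(1.0, 2.0) - 1.0); // ~2.22e-16
    std::printf("%g\n", std::nextafter(1e6, 2e6) - 1e6); // ~1.16e-10
    return 0;
}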
The best way to compare doubles for equality/inequality is by taking the absolute value of their difference and comparing it to a small enough (depending on your context) value.
double eps = 0.000000001; //for instance
double a = someCalc1();
double b = someCalc2();
double diff = Math.abs(a - b);
if (diff < eps) {
//equal
}
I have 16 unrelated binary strings (of the same length), e.g. 100000001010, 010100010010 and so on, and I need to find the bitstring in which position x is a 1 if position x is 1 in at least 2 of the 16 bitstrings.
Initially, I tried using bitwise XOR, and this works as long as an even number of strings contain a 1, but when an odd number of strings contain a 1, the answer comes out inverted.
A simple example (with 3 strings) would be:
A: 10101010
B: 01010111
C: 11011011
f(A,B,C)= answer
Expected answer: 11011011
Answer I'm getting right now: 11011001
I know I'm wrong somewhere, but I'm at a loss on how to proceed. Help much appreciated.
You can do something like
unsigned once = x[0], twice = 0; // once: bits seen so far; twice: bits seen at least twice
for (int i = 1; i < 16; ++i) {
    twice |= once & x[i];        // bit was already seen once and appears again now
    once  |= x[i];
}
// 'twice' now has a 1 exactly where at least 2 of the 16 strings have a 1
For the three-string example, the answer is (A AND B) OR (A AND C) OR (B AND C).
This has higher complexity than what you had originally, but unlike XOR (which computes the parity of the count, hence the flipped results for odd counts) it actually expresses "at least two".
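Checking that formula against the three-string example above (a quick sketch; the hex constants encode the 8-bit strings):

#include <cstdio>

int main()
{
    unsigned a = 0xAA; // 10101010
    unsigned b = 0x57; // 01010111
    unsigned c = 0xDB; // 11011011
    unsigned at_least_two = (a & b) | (a & c) | (b & c);
    std::printf("%02X\n", at_least_two); // prints DB, i.e. 11011011, as expected
    return 0;
}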
How would you plot these in SciLab or MatLab? I am new to these and have no idea how the software works. Please help.
Plot the following functions with different colors in Scilab or MatLab:
– f2(x) = log n
– f3(x) = n
– f4(x) = n log n
– f5(x) = n^2
– f6(x) = n^j (j > 2)
– f7(x) = c^n (c > 1)
– f8(x) = n!
where x = linspace(1, 50, 50).
Well, a lot of these are built-in functions. For example
>> x = linspace(1,50,50);
>> plot(x,log(x))
>> plot(x,x)
>> plot(x,x.*log(x))
>> plot(x,x.^2)
For n^j (j > 2) and c^n (c > 1) you need to pick particular values of j and c yourself, e.g. plot(x, x.^3) and plot(x, 2.^x).
For the last one, you should look at the function factorial.
It's not clear from the context whether you're supposed to plot them on different graphs or all on the same graph. If all on the same graph, then you can use
>> hold on;
to freeze the current axes - that means that any new lines will get drawn on top of the old ones, instead of being drawn on a fresh set of axes.
In Matlab (and probably in Scilab) you can supply a "line spec" argument to the plot function, which tells it what color and style to draw the line in. For example,
>> figure
>> hold on
>> plot(x,log(x),'b')
>> plot(x,x/10,'r')
>> plot(x,x.^2/1000,'g')
Tells Matlab to plot the function f(x)=log(x) in blue, f(x)=x/10 in red and f(x)=x^2/1000 in green, which results in this plot:
I can't comment or upvote yet but I'd add to Chris Taylor's answer that in Scilab the hold on and hold off convention isn't used. All plot commands output to the current axes, which are 'held on' all the time. If you want to generate a new figure or change the current axes you can use figure(n), where n can be any (nonconsecutive) positive integer - just a label really.
See also clf(n), gcf() and gca() - Scilab's figure handling differs quite a bit from Matlab's, though the matplotlib ATOMS module goes some way towards making Scilab look and behave more like Matlab.
In Scilab, it will be
x = 1:50;
clf
plot("ll", x,log, x,x, x,x.*log(x), x,x.^2)
gca().sub_ticks(2) = 8;
xgrid(color("grey"))
legend("$"+["ln(x)", "x", "x.ln(x)", "x^2"]+"$", "in_upper_left")
I've got these data types:
data PointPlus = PointPlus
{ coords :: Point
, velocity :: Vector
} deriving (Eq)
data BodyGeo = BodyGeo
{ pointPlus :: PointPlus
, size :: Point
} deriving (Eq)
data Body = Body
{ geo :: BodyGeo
, pict :: Color
} deriving (Eq)
It's the base datatype for characters, enemies, objects, etc. in my game (well, I just have two rectangles as the player and the ground right now :p).
When a key is pressed, the character moves right, left or jumps by changing its velocity. Moving is done by adding the velocity to the coords. Currently, it's written as follows:
move (PointPlus (x, y) (xi, yi)) = PointPlus (x + xi, y + yi) (xi, yi)
I'm just taking the PointPlus part of my Body and not the entire Body, otherwise it would be:
move (Body (BodyGeo (PointPlus (x, y) (xi, yi)) wh) col) = (Body (BodyGeo (PointPlus (x + xi, y + yi) (xi, yi)) wh) col)
Is the first version of move better? Anyway, if move only changes the PointPlus, there must be another function that calls it inside a new Body. Let me explain: there's a function update which is called to update the game state; it is passed the current game state, a single Body for now, and returns the updated Body.
update (Body (BodyGeo (PointPlus xy (xi, yi)) wh) pict) = (Body (BodyGeo (move (PointPlus xy (xi, yi))) wh) pict)
That tickles me. Everything is kept the same within Body except the PointPlus. Is there a way to avoid this complete "reconstruction" by hand? Like in:
update body = backInBody $ move $ pointPlus body
Without having to define backInBody, of course.
You're looking for "lenses". There are several different packages for lenses; here is a good summary of them.
My understanding is that a lens on a data type a for some field b provides two operations: a way to get the value of b and a way to get a new a with a different value of b. So you would just use a lens to work with the deeply nested PointPlus.
The lens packages provide useful functions for working with lenses as well as ways of generating lenses automatically (with Template Haskell), which can be very convenient.
I think they are worth looking into for your project, especially because you are likely to encounter similar problems with nesting in other places thanks to the structure of your data types.
So, I was just thinking about how cool chaining is and how it makes things easier to read. With a lot of languages, when applying a bunch of functions to a variable, you'd write something like this:
i(h(g(f(x))))
And you have to read it from right-to-left or inner-most to outer-most. You apply f first, then g, and so forth. But if it were chained, it would look more like
x|f|g|h|i
And you could read it like a normal human being. So, my question is, there has to be some languages that do it that way, what are they? Is that what these fancy-pants functional programming languages do?
Because of this, I usually end up creating a whole bunch of temp variables so that I can split it onto separate lines and make it more readable:
a = f(x)
b = g(a)
c = h(b)
what_i_really_wanted_all_along = i(c)
Whereas with my magical language, you could still split it onto different lines, if they're getting too long, without needing intervening variables:
x | f
| g
| h
| i
Yes, with F# you have the pipeline operator |> (also called the forward pipe operator); there is also a backward pipe <|.
You write it like: x |> f |> g |> h |> i
Check this blog post that gives a good idea of real life usage.
It's not exclusive to functional programming, though it is probably best implemented in functional languages, since the whole concept of function composition is squarely in functional programming's domain.
For one thing, any language with an object-oriented bent has chaining for methods which return an instance of the class:
obj.method1().method2().method3(); // JavaScript
MyClass->new()->f()->g()->i(); # Perl
Alternatively, the most famous yet the least "programming-language" example of this chaining pattern would be something completely non-OO and non-functional ... you guessed it, pipes in Unix. As in, ls | cut -c1-4 | sort -n. Since shell programming is considered a language, I say it's a perfectly valid example.
Well, you can do this in JavaScript and its relatives:
function compose()
{
    var funcs = Array.prototype.slice.call(arguments);
    return function(x)
    {
        var i = 0, len = funcs.length;
        while (i < len)
        {
            x = funcs[i].call(null, x);
            ++i;
        }
        return x;
    }
}
function doubleIt(x) { print('Doubling...'); return x * 2; }
function addTwo(x) { print('Adding 2...'); return x + 2; }
function tripleIt(x) { print('Tripling...'); return x * 3; }
var theAnswer = compose(doubleIt, addTwo, tripleIt)( 6 );
print( 'The answer is: ' + theAnswer );
// Prints:
// Doubling...
// Adding 2...
// Tripling...
// The answer is: 42
As you can see, the functions read left-to-right and neither the object nor the functions need any special implementation. The secret is all in compose.
What you're describing is essentially the Fluent Interface pattern.
Wikipedia has a good example from a number of languages:
http://en.wikipedia.org/wiki/Fluent_interface
And Martin Fowler has his write up here:
http://www.martinfowler.com/bliki/FluentInterface.html
As DVK points out - any OO language where a method can return an instance of the class it belongs to can provide this functionality.
C# extension methods accomplish something very close to your magical language, if a little less concisely:
x.f()
.g()
.h()
.i();
Where the methods are declared thus:
static class Extensions
{
    static T f<T>(this T x) { return x; }
    static T g<T>(this T x) { return x; }
    ...
}
Linq uses this very extensively.
Haskell. The following three examples are equivalent:
i (h (g (f x)))    (nested function calls)
x & f & g & h & i  (left-to-right chaining as requested; & is in Data.Function)
(i . h . g . f) x  (function composition, which is more common in Haskell)
http://www.haskell.org/haskellwiki/Function_composition
http://en.wikipedia.org/wiki/Function_composition_(computer_science)
I am not suggesting you use Mathematica unless you usually do some math, but it certainly is flexible enough to support postfix notation. In fact, you may define your own notation, but let's keep to Postfix for simplicity.
You may enter:
Postfix[Sin[x]]
To get
x // Sin
Which translates to Postfix notation. Or if you have a deeper expression:
MapAll[Postfix, Cos[Sin[x]]]
To get:
(Postfix[x]//Sin)//Cos
Here you may see Postfix[x] appear first, since for Mathematica x is just an expression to be evaluated later.
Conversely, you may input:
x // Sin // Cos
To get of course
Cos[Sin[x]]
Or you can use a very frequent idiom: applying Postfix itself in postfix form:
Cos[x] // Postfix
To get
x // Cos
HTH!
BTW:
As an answer to "Whereas with my magical language ...?", see this:
(x//Sin
// Cos
// Exp
// Myfunct)
gives
Myfunct[E^Cos[Sin[x]]]
PS: As an exercise for the readers :) ... how would you do this for functions that take n variables?
As has been previously mentioned, Haskell supports function composition, as follows:
(i . h . g . f) x, which is equivalent to: i(h(g(f(x))))
This is the standard order of operations for function composition in mathematics. Some people still consider this to be backward, however. Without getting too much into a debate over which approach is better, I would like to point out that you can easily define the flipped composition operator:
infixr 1 >>>, <<<
(<<<) = (.) -- regular function composition
(>>>) = flip (.) -- flipped function composition
(f >>> g >>> h >>> i) x
-- or --
(i <<< h <<< g <<< f) x
This is the notation used by the standard library Control.Category. (Although the actual type signature is generalized and works on other things besides functions). If you're still bothered by the parameter being at the end, you can also use a variant of the function application operator:
infixr 0 $
infixl 0 %

f $ x = f x     -- regular function application
(%) = flip ($)  -- flipped function application
i $ h $ g $ f $ x
-- or --
x % f % g % h % i
This is close to the syntax you want. To my knowledge, % is NOT a built-in operator in Haskell, but $ is. I've glossed over the fixity bits; if you're curious, they are the technicality that makes the above code parse as:
(((x % f) % g) % h) % i -- intended
and not:
x % (f % (g % (h % i))) -- what infixr would give, which then leads to a type error