I have an unsorted list of noisy X, Y points. They do, however, form a path through the world. I would like an algorithm to draw an approximation of this data using line segments.
This is similar to how you would use a line-fitting algorithm to pick an approximation of linear data. My problem is harder only because the path bends and winds around the world.
(Image: http://www.praeclarum.org/so/pathfinder.png)
Does anyone know of any standard / robust / easy to comprehend algorithms to accomplish this?
Q&A:
What do you mean by noisy? If I had an ideal realization of the path, then my set of points would be sampled from that ideal path with gaussian noise added to the X and Y elements. I do not know the mean or standard deviation of that noise. I may be able to guess at the std dev...
Do the points lie near, but not on, some ideal but complicated path which you seek to approximate? Yes.
Do you have any a priori information about the shape of the path? Any other way to get such information? Unfortunately not.
Bezier Interpolation may fit your problem.
This does not address the ordering of the points into a path, however; there are a number of approaches to consider:
Any "optimal" type of path (e.g. smallest direction change at each point on the path, * Shortest path through all points) will likely boil down the NP complete Travelling Salesman Problem (TSP).
A "reasonable" path to cluster the nodes and then route between clusters, and within clusters. Of course, the larger the cluster, or the larger the number of clusters the more this smaller problem looks like a large n TSP.
Ordering the points by one axis. If there are more than 2 dimensions, some dimensionality-reduction strategy may be useful, e.g. Independent Component Analysis.
With an unsorted list, you won't really know which points to include in each segment, so I guess you could just go with the closest point.
One way could be to pick a start point at random, and pick the closest point as the next point in each step. Add the first two points to a set S.
Fit a line to the points in S until the RMS exceeds some value, then clear S and start a new line.
The intersection of consecutive lines would be the end-points of the segments.
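A minimal Python sketch of this greedy idea (the fit_line_rms helper and the rms_limit threshold are my own choices, and I use a total-least-squares fit so vertical runs don't blow up; the final line-intersection step is left out):

import math

def fit_line_rms(points):
    # RMS of perpendicular residuals about the total-least-squares line:
    # the smaller eigenvalue of the 2x2 scatter matrix equals the sum of
    # squared orthogonal residuals.
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    lam = (tr - math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2
    return math.sqrt(max(lam, 0.0) / n)

def greedy_segments(points, rms_limit=0.5):
    # order by nearest neighbour, growing each segment until the fit degrades
    remaining = [tuple(p) for p in points]
    current = [remaining.pop(0)]            # arbitrary start point
    segments = []
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(p, current[-1]))
        remaining.remove(nxt)
        current.append(nxt)
        if len(current) > 2 and fit_line_rms(current) > rms_limit:
            segments.append(current[:-1])   # close the segment before nxt...
            current = [current[-2], nxt]    # ...and start a new one there
    segments.append(current)
    return segments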
If your points are close to each other, you can use normal "straight" lines (orthogonal lines) and the usual smoothing algorithms; you can treat the world as flat.
If they are far apart, you need to compensate for the curvature of the earth by using great circles to navigate from point to point; otherwise your straight lines will describe a longer route than necessary.
It is your call at what separation points become too far apart to connect with straight lines.
Further you have to know if you need to "visit" each point, or just need to go near, and how near that near is.
If you need to send the course(s) to a plane, ship or other traveller, you probably need to visit each point. If you get the GPS data from an object, you probably just want to plot a course on a screen, and remove the noise.
After seeing your edits:
If this is an object moving along some trajectory you want to plot, you might want to smooth the direction and speed instead of the x/y values. (Making your measured values a function of a fixed, increasing interval makes smoothing a lot easier.)
Here is a heuristic hack that might address the ordering problem for the data, if
you have enough points
the mean distance between points is small compared to the smallest radius of curvature expected of the path
the mean distance between points is not large compared to the std. dev. of the noise
the path is not self-crossing (you might get lucky, but no guarantees)
Proceed like this:
1. Pick (hopefully by a meaningful rather than random means) a starting point, p1.
2. Find all the points that lie within some clustering distance r_c of p1. Choose r_c small compared to the expected turning radius, but large compared to the scatter.
3. Call this cluster C1.
4. Find the point q1, the mean of the positions in C1.
5. Fit a line to the points in C1, project it to (or just beyond) the edge of the cluster, and find the nearest point in your original data. Label that point p2.
6. Iterate steps 2-5 until you run out of data.
Now you have a new list of points q1..qn that are ordered.
Off the top of my head, very rough, and only works under pretty good conditions...
Self-crossing behavior can probably be improved by requiring in step (5) that the new projected line lie within some maximum angle of the previous one.
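A rough Python sketch of the walk above; note that I've swapped step 5's per-cluster line fit for a cheaper projection through successive cluster means, so this is the spirit rather than the letter of the recipe (r_c remains the tuning knob):

import math

def order_by_clusters(points, r_c):
    data = set(tuple(p) for p in points)
    p = min(data)                           # a deterministic (not random) start
    ordered = []
    while data:
        cluster = [q for q in data if math.dist(p, q) <= r_c]
        if not cluster:
            break                           # no nearby data left
        cx = sum(q[0] for q in cluster) / len(cluster)
        cy = sum(q[1] for q in cluster) / len(cluster)
        ordered.append((cx, cy))            # step 4: the cluster mean q_i
        data -= set(cluster)
        if not data:
            break
        if len(ordered) >= 2:
            # project through the last two means, out past the cluster edge
            dx, dy = cx - ordered[-2][0], cy - ordered[-2][1]
            target = (cx + dx, cy + dy)
        else:
            # first step: head toward the far edge of the first cluster
            target = max(cluster, key=lambda q: math.dist(p, q))
        p = min(data, key=lambda q: math.dist(target, q))   # the next p
    return ordered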
The problem with the Bezier curve is that it doesn't actually go through the points you have sampled, and even though the sampled points are only distorted a little, the Bezier curve might actually be miles off.
A better approximation, and a solution that seems to resemble the original image far better, is a Catmull-Rom spline, because it does run through all the points on the curve.
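For illustration, here is a small Python sketch of a uniform Catmull-Rom spline; it assumes the points are already ordered (the ordering problem discussed elsewhere in this thread still has to be solved first) and duplicates the endpoints so the curve starts and ends on the data:

def catmull_rom(p0, p1, p2, p3, t):
    # point on the Catmull-Rom segment between p1 and p2, t in [0, 1]
    return tuple(
        0.5 * ((2 * b) + (-a + c) * t
               + (2*a - 5*b + 4*c - d) * t * t
               + (-a + 3*b - 3*c + d) * t * t * t)
        for a, b, c, d in zip(p0, p1, p2, p3))

def sample_catmull_rom(points, samples_per_segment=10):
    # duplicate the endpoints so the spline passes through all input points
    pts = [points[0]] + list(points) + [points[-1]]
    out = []
    for i in range(len(pts) - 3):
        for s in range(samples_per_segment):
            out.append(catmull_rom(pts[i], pts[i+1], pts[i+2], pts[i+3],
                                   s / samples_per_segment))
    out.append(points[-1])
    return out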
My approach would be to first sort your list of points, then use a bezier curve.
The trick is of course the sorting. Start with one random point and find the nearest point. Assume these two are connected. With those two endpoints, find the points nearest to them. Assume that the one with the smaller distance to its endpoint is connected to that point. Repeat until all points are connected.
I assume that there are still some problems with this approach, but maybe you can use it as a starting point (pun intended).
Edit: You can run it several times with different starting points, and then see where the results differ. That at least gives you some confidence about which points are connected to each other.
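A short Python sketch of that nearest-neighbour chaining, growing the chain at whichever end has the closer unused point (the function name and the deterministic first point are my own choices):

import math
from collections import deque

def chain_sort(points):
    remaining = [tuple(p) for p in points]
    chain = deque([remaining.pop(0)])       # the "random" start point
    while remaining:
        head = min(remaining, key=lambda p: math.dist(p, chain[0]))
        tail = min(remaining, key=lambda p: math.dist(p, chain[-1]))
        # attach the nearest unused point to whichever end it is closer to
        if math.dist(head, chain[0]) < math.dist(tail, chain[-1]):
            chain.appendleft(head)
            remaining.remove(head)
        else:
            chain.append(tail)
            remaining.remove(tail)
    return list(chain)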
A completely different approach, one that does not require another constraint, though details may depend on your application. It should work best if you have a "dense cloud of points" around the path.
Use a "cost" function that defines the difference between the curve and the cloud of points. Use a parametrized curve, and a standard optimization algorithm. - OR -
Start with a straight curve from start to end, then use a genetic algorithm to modify it.
The typical cost function would be to take the smallest distance between each point and the curve, and sum the squares.
I have not enough experience to suggest an optimization or genetic algorithm but I am sure it can be done :)
I could imagine a genetic algorithm as follows:
The path will be built from waypoints. Start by putting N waypoints in a straight line from start to end (N can be chosen depending on the problem). Mutations could be:
1. For each segment, if rnd() < x, a new waypoint is introduced in the middle.
2. For each waypoint, the X and Y coordinates are varied slightly.
You will need to include the total length in the cost function. Splitting might not be needed, or maybe x (the "split chance") might need to decrease as more waypoints are introduced. You may or may not want to apply (2) to the start and end points.
Would be fun to try that...
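In that spirit, here is a minimal Python sketch. To keep it short it is really a hill climber (a population of one) rather than a full GA; the length_weight, jitter, and split_chance knobs are my own, and the start/end points are held fixed:

import math, random

def cost(waypoints, cloud, length_weight=0.1):
    # sum of squared point-to-path distances, plus a total-length penalty
    def seg_d2(p, a, b):
        dx, dy = b[0] - a[0], b[1] - a[1]
        t = max(0.0, min(1.0, ((p[0]-a[0])*dx + (p[1]-a[1])*dy)
                               / (dx*dx + dy*dy + 1e-12)))
        return (p[0] - (a[0] + t*dx))**2 + (p[1] - (a[1] + t*dy))**2
    fit = sum(min(seg_d2(p, waypoints[i], waypoints[i+1])
                  for i in range(len(waypoints) - 1)) for p in cloud)
    length = sum(math.dist(waypoints[i], waypoints[i+1])
                 for i in range(len(waypoints) - 1))
    return fit + length_weight * length

def mutate(inner, split_chance=0.1, jitter=0.5):
    # mutation (2): jitter every interior waypoint;
    # mutation (1): occasionally split a segment in the middle
    out = []
    for i, w in enumerate(inner):
        out.append((w[0] + random.gauss(0, jitter),
                    w[1] + random.gauss(0, jitter)))
        if i < len(inner) - 1 and random.random() < split_chance:
            nxt = inner[i + 1]
            out.append(((w[0] + nxt[0]) / 2, (w[1] + nxt[1]) / 2))
    return out

def evolve(cloud, start, end, n=8, generations=500):
    # start with N waypoints on a straight line from start to end
    best = [(start[0] + (end[0] - start[0]) * i / (n - 1),
             start[1] + (end[1] - start[1]) * i / (n - 1)) for i in range(n)]
    best_cost = cost(best, cloud)
    for _ in range(generations):
        cand = [best[0]] + mutate(best[1:-1]) + [best[-1]]
        c = cost(cand, cloud)
        if c < best_cost:
            best, best_cost = cand, c
    return best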
I take it that "unsorted list" means that while your set of points is complete, you don't know what order they were travelled through?
The gaussian noise has to be basically ignored. We're given absolutely no information that allows us to make any attempt to reconstruct the original, un-noisy path. So I think the best we can do is assume the points are correct.
At this point, the task consists of "find the best path through a set of points", with "best" left vague. I whipped up some code that attempts to order a set of points in euclidean space, preferring orderings that result in straighter lines. While the metric was easy to implement, I couldn't think of a good way to improve the ordering based on that, so I just randomly swap points looking for a better arrangement.
So, here is some PLT Scheme code that does that.
#lang scheme

(require (only-in srfi/1 iota))

; a bunch of trig
(define (deg->rad d)
  (* pi (/ d 180)))

(define (rad->deg r)
  (* 180 (/ r pi)))

(define (euclidean-length v)
  (sqrt (apply + (map (lambda (x) (expt x 2)) v))))

(define (dot a b)
  (apply + (map * a b)))

(define (angle-ratio a b)
  (/ (dot a b)
     (* (euclidean-length a) (euclidean-length b))))

; given a list of 3 points, calculate the likelihood of the
; angle they represent. straight is better.
(define (probability-triple a b c)
  (let ([av (map - a b)]
        [bv (map - c b)])
    (cos (/ (- pi (abs (acos (angle-ratio av bv)))) 2))))

; makes a random 2d point. uncomment the bit for a 3d point
(define (random-point . x)
  (list (/ (random 1000) 100)
        (/ (random 1000) 100)
        #;(/ (random 1000) 100)))

; calculate the likelihood of an entire list of points
(define (point-order-likelihood lst)
  (if (null? (cdddr lst))
      1
      (* (probability-triple (car lst)
                             (cadr lst)
                             (caddr lst))
         (point-order-likelihood (cdr lst)))))

; just print a list of points
(define (print-points lst)
  (for ([p (in-list lst)])
    (printf "~a~n"
            (string-join (map number->string
                              (map exact->inexact p))
                         " "))))

; attempts to improve upon a list
(define (find-better-arrangement start
                                 ; by default, try only 10 times to find something better
                                 [tries 10]
                                 ; if we find an arrangement that is as good as one where
                                 ; every segment bends by 22.5 degrees (which would be
                                 ; reasonably gentle) then call it good enough. higher
                                 ; cut-offs are more demanding.
                                 [cut-off (expt (cos (/ pi 8))
                                                (- (length start) 2))])
  (let ([vec (list->vector start)]
        ; evaluate what we've started with
        [eval (point-order-likelihood start)])
    (let/ec done
      ; if the current list exceeds the cut-off, we're done
      (when (> eval cut-off)
        (done start))
      ; otherwise, try no more than 'tries' times...
      (for ([x (in-range tries)])
        ; pick two random points in the list
        (let ([ai (random (vector-length vec))]
              [bi (random (vector-length vec))])
          ; if they're the same...
          (when (= ai bi)
            ; increment the second by 1, wrapping around the list if necessary
            (set! bi (modulo (add1 bi) (vector-length vec))))
          ; take the values from the two positions...
          (let ([a (vector-ref vec ai)]
                [b (vector-ref vec bi)])
            ; swap them
            (vector-set! vec bi a)
            (vector-set! vec ai b)
            ; make a list out of the vector
            (let ([new (vector->list vec)])
              ; if it evaluates to better
              (when (> (point-order-likelihood new) eval)
                ; start over with it
                (done (find-better-arrangement new tries cut-off)))))))
      ; we fell out the bottom of the search. just give back what we started with
      start)))

; evaluate, display, and improve a list of points, five times
(define points (map random-point (iota 10)))
(define tmp points)

(printf "~a~n" (point-order-likelihood tmp))
(print-points tmp)

(set! tmp (find-better-arrangement tmp 10))
(printf "~a~n" (point-order-likelihood tmp))
(print-points tmp)

(set! tmp (find-better-arrangement tmp 100))
(printf "~a~n" (point-order-likelihood tmp))
(print-points tmp)

(set! tmp (find-better-arrangement tmp 1000))
(printf "~a~n" (point-order-likelihood tmp))
(print-points tmp)

(set! tmp (find-better-arrangement tmp 10000))
(printf "~a~n" (point-order-likelihood tmp))
(print-points tmp)
It seems that you know the 'golden curve' from your answers to the questions. I would suggest finding the Bezier curve of the 'golden curve', as suggested by @jamesh, and drawing that.
How many points do you have?
A Bezier curve, as mentioned, is a good idea if you have comparatively few points. If you have many points, building clusters as suggested by dmckee is the way to go.
However, you also need another constraint to define the order of the points. There have been many good suggestions for how to choose the points, but unless you introduce another constraint, any of them gives a possible solution.
Possible constraints I can think of:
shortest path
most straight segments
least total absolute rotation
directional preference (i.e. horizontal / vertical is more likely than crisscrossing)
In all cases, to meet the constraint you probably need to test all permutations of the sequence. If you start with a "good guess", you can terminate the other candidates quickly.
Using Mathematica I need to evaluate the integral of a function. Since it is taking the program too much to compute it, would it be possible to use parallel computation to shorten the time needed? If so, how can I do it?
I uploaded a picture of the integrand function:
I need to integrate it with respect to (x3, y3, x, y), all of them ranging over certain intervals (x3 and y3 from 0 to 1, x and y from 0 to 100). The parameters (a, b, c, ..., o) prevent the NIntegrate function from working. Any suggestions?
If you evaluate this
expr=E^((-(x-y)^4-(x3-y3)^4)/10^4)*
(f x+e x^2+(m+n x)x3-f y-e y^2-(m+n y)y3)*
((378(x-y)^2(f x+e x^2+(m+n x)x3-f y-e y^2-(m+n y)y3))/
(Pi(1/40+Sqrt[((x-y)^2+(x3-y3)^2)^3]))+
(378(x-y)(x3-y3)(h x+g x^2+(o+p x)x3-h y-g y^2-(o+p y)y3))/
(Pi(1/40+Sqrt[((x-y)^2+(x3-y3)^2)^3])))+
(h x+g x^2+(o+p x)x3-h y-g y^2-(o +p y) y3)*
((378(x-y)(x3-y3)(f x+e x^2+(m+n x)x3-f y-e y^2-(m+n y)y3))/
(Pi(1/40+Sqrt[((x-y)^2+(x3-y3)^2)^3]))+
(378 (x3 - y3)^2 (h x + g x^2 + (o + p x)x3-h y-g y^2-(o+p y)y3))/
(Pi(1/40+Sqrt[((x-y)^2+(x3-y3)^2)^3])));
list = List @@ Expand[expr]
then you will get a list of 484 expressions, each very similar in form to this
(378*f*h*x^3*x3)/(Pi*(1/40+Sqrt[(x^2+x3^2-2*x*y+y^2-2*x3*y3+y3^2)^3]))
Notice that you can then use NIntegrate in this way
f*h*NIntegrate[(378*x^3*x3)/(Pi*(1/40+Sqrt[(x^2+x3^2-2*x*y+y^2-2*x3*y3+y3^2)^3])),
{x,0,100},{y,0,100},{x3,0,1},{y3,0,1}]
but it gives warnings and errors about the convergence and accuracy, almost certainly due to your fractional powers in the denominator.
If you can find a way to pull out the scalar multipliers which are independent of x,y,x3,y3 and then perform that integration without warnings and errors and get an accurate result which isn't infinity then you could perhaps perform these integrals in parallel and total the results.
Some of the integrands are scalar multiples of others and if you combine similar integrands then you can reduce this down to 300 unique integrands.
I doubt this is going to lead to an acceptable solution for you.
Please check all this very carefully to make certain that no mistakes have been made.
EDIT
Since the variables that are independent of the integration appear to be easily separated from the dependent variables in the problem posed above, I think this will allow parallel NIntegrate
independentvars[z_] := (z/(z//.{e->1, f->1, g->1, h->1, m->1, n->1, o->1, p->1}))*
NIntegrate[(z//.{e->1, f->1, g->1, h->1, m->1, n->1, o->1, p->1}),
{x, 0, 100}, {y, 0, 100}, {x3, 0, 1}, {y3, 0, 1}]
Total[ParallelMap[independentvars, list]]
As I mentioned previously, the fractional powers in the denominator result in a flood of warnings and errors about convergence failing.
You can test this with the following much simpler example
expr = f x + f g x3 + o^2 x x3;
list = List @@ Expand[expr];
Total[ParallelMap[independentvars, list]]
which instantly returns
500000. f + 5000. f g + 250000. o^2
This is a very primitive method of pulling independent symbolic variables outside an NIntegrate. It gives absolutely no warning if one of the integrands is not in a form where this primitive attempt at extraction is appropriate, or if the extraction fails.
There may be a far better method that someone else has written out there somewhere. If someone could show a far better method of doing this then I would appreciate it.
It might be nice if Wolfram would consider incorporating something like this into NIntegrate itself.
Effectively what I'm looking for is a function f(x) that outputs into a range that is pre-defined. Calling f(f(x)) should be valid as well. The function should be cyclical, so calling f(f(...(x))) where the number of calls is equal to the size of the range should give you the original number, and f(x) should not be time dependent and will always give the same output.
While I can see that taking a list of all possible values and shuffling it would give me something close to what I want, I'd much prefer it if I could simply plug values into the function one at a time so that I do not have to compute the entire range all at once.
I've looked into Minimal Perfect Hash Functions but haven't been able to find one that doesn't use external libraries. I'm okay with using them, but would prefer to not do so.
If an actual range is necessary to help answer my question, I don't think it would need to be bigger than [0, 2^24-1], but the starting and ending values don't matter too much.
You might want to take a look at a Linear Congruential Generator (LCG). You should be looking at a full-period generator (say, m = 2^24), which means the parameters must satisfy the Hull–Dobell theorem.
Calling f(f(x)) should be valid as well.
should work
the number of calls is equal to the size of the range should give you the original number
Yes: for an LCG with parameters satisfying the Hull–Dobell theorem you'll get the full period covered exactly once, and the (m+1)-th call will put you back where you started.
The period of such an LCG is exactly m.
should not be time dependent and will always give the same output
An LCG is an O(1) algorithm, and it is 100% reproducible.
An LCG is reversible as well, via the extended Euclidean algorithm; see "Reversible pseudo-random sequence generator" for details.
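As a sketch in Python (multiplier and increment chosen only to satisfy Hull–Dobell, not for statistical quality; shown at m = 2^16 so the exhaustive check runs instantly, but the same conditions work at m = 2^24):

M = 2 ** 16                # try 2 ** 24 for the range in the question
A = 4 * 1234 + 1           # a - 1 divisible by 4 and by 2, the only prime factor of M
C = 2 * 5678 + 1           # c odd, so gcd(c, M) == 1

def f(x):
    return (A * x + C) % M

# full period: applying f exactly M times returns the starting value,
# visiting every value in [0, M) exactly once along the way
x, calls = f(0), 1
while x != 0:
    x, calls = f(x), calls + 1
print(calls == M)          # True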
Minimal perfect hash functions are overkill; all you've asked for is a function f that is:
bijective, and
"cyclical" (i.e. f^N is the identity)
For a permutation to be cyclical in that way, its order must divide N (or be N, but that's just a special case of dividing N). Which in turn means the LCM of the orders of the sub-cycles must divide N. One way to do that is to just have one "sub"-cycle of order N. For power-of-two N, it's also really easy to have lots of small cycles of some other power-of-two order. General permutations do not necessarily satisfy the cycle requirement; of course they are bijective, but the LCM of the orders of the sub-cycles may exceed N.
In the following I will leave all reduction modulo N implicit. Without loss of generality I will assume the range starts at 0 and goes up to N-1, where N is the size of the range.
The only thing I can immediately think of for general N is f(x) = x + c where gcd(c, N) == 1. The GCD condition ensures there is only one cycle, which necessarily has order N.
For power-of-two N I have more inspiration:
f(x) = cx where c is odd. Bijective because gcd(c, N) == 1, so c has a modular multiplicative inverse. Also c^N = 1, because φ(N) = N/2 (since N is a power of two), so c^φ(N) = 1 (Euler's theorem), and since φ(N) divides N, c^N = 1 as well.
f(x) = x XOR c where c < N. Trivially bijective and trivially cycles with a period of 2, which divides N.
f(x) = clmul(x, c) where c is odd and clmul is carry-less multiplication. Bijective because any odd c has a carry-less multiplicative inverse. Has some power-of-two cycle length (less than N), so it divides N; I don't know why, though. This is a weird one, but it has decent special cases such as x ^ (x << k). By symmetry, the "mirrored" version also works, e.g. x ^ (x >> k).
f(x) = x >>> k where >>> is bit-rotation. Obviously bijective, and f^N(x) = x >>> Nk, where Nk mod N = 0, so it rotates all the way back to the unrotated position regardless of what k is.
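A quick empirical check of these constructions in Python, on a deliberately small N = 2^8 (the constants are arbitrary picks satisfying the stated conditions):

N = 256

def cycle_length(f, x0):
    x, n = f(x0), 1
    while x != x0:
        x, n = f(x), n + 1
    return n

for name, f in [("x + c  ", lambda x: (x + 37) % N),       # gcd(37, N) == 1
                ("c * x  ", lambda x: (171 * x) % N),      # odd multiplier
                ("x XOR c", lambda x: x ^ 0b10110010),
                ("rotate ", lambda x: ((x >> 3) | (x << 5)) & (N - 1))]:
    lengths = {cycle_length(f, x) for x in range(N)}
    # every cycle length divides N, hence f applied N times is the identity
    print(name, sorted(lengths), all(N % L == 0 for L in lengths))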
From my understanding, the atan2() function exists in programming languages because atan() itself cannot always determine the correct theta since the output is restricted to -pi/2 to pi/2.
If this is the case, then the same problem applies to both asin() and acos(), both of which also have restricted ranges, so why are there no asin2() and acos2() functions?
First off, note that the syntaxes of the two arctan functions are atan(y/x) and atan2(y, x). This distinction is important, because by not performing the division you provide additional information, most importantly the individual signs of x and y. If you know the individual x and y coordinates, the particular solution to the atan function can be found (i.e. the solution which takes into account the quadrant that (x,y) is in).
If you go from tan(θ) = y/x to sin(θ) = y/sqrt(x²+y²), then the inverse operation asin takes y and sqrt(x²+y²) and combines that to obtain some information about the angle. Here it doesn't matter whether we perform the division ourselves or let some hypothetical asin2 function handle it. The denominator is always positive, so the divided argument contains just as much information as a separate numerator and denominator would. (At least in an IEEE environment where division by zero leads to a correctly-signed infinity.)
If you know the y coordinate and the hypotenuse sqrt(x²+y²), then you know the sine of the angle, but you cannot know the angle itself, since you cannot distinguish between negative and positive x values. Likewise, if you know the x coordinate and the hypotenuse, you know the cosine of the angle but you cannot know the sign of the y value.
So asin2 and acos2 are not mathematically feasible, at least not in an obvious way. If you had some kind of sign encoded into the hypotenuse, things might be different, but I can think of no situation where such a sign would arise naturally.
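A concrete illustration in Python: two points with the same y and the same hypotenuse have identical sines, so asin alone cannot separate them, while atan2 can:

import math

# (3, 4) and (-3, 4): same y, same hypotenuse (5), hence the same sine,
# but they lie on opposite sides of the y-axis
for x, y in [(3.0, 4.0), (-3.0, 4.0)]:
    h = math.hypot(x, y)
    print(math.degrees(math.asin(y / h)),   # 53.13 both times: asin can't tell
          math.degrees(math.atan2(y, x)))   # 53.13 vs 126.87: atan2 can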
Because asin2(y, x) and acos2(y, x) would each take the same parameters as atan2(y, x) and each give the same answer. Each would be equally valid, but we only need one such function.
The confusion arises from the name (of atan2). It's a function that, given x and y, computes the angle (made by the line from the origin to this point) with the (positive) x-axis. A name like angle_from(x, y) would arguably have been more appropriate.
There are times when a function like "acos2" is needed, for example when performing rotations of vectors in 3D space. Under those circumstances, I hard-code my own acos2 function which simply performs the following checks:
x_perp = sqrt(x*x + y*y)
r = sqrt(x*x + y*y + z*z)
if (x_perp .gt. 0.0d0) then
    phi = acos(x/x_perp)
else
    phi = 0.0d0
endif
if (y .lt. 0.0d0) phi = 2.0d0*pi - phi
theta = acos(z/r)
where theta and phi are the usual spherical coordinates and x, y, z the Cartesian coordinates. The problem arises when y is negative: there needs to be a phase shift in phi. There is no such problem for theta.
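For what it's worth, a direct Python transcription of that snippet (the function name is mine):

import math

def acos2_spherical(x, y, z):
    # spherical angles (theta, phi) from Cartesian (x, y, z),
    # with the phase shift in phi when y is negative
    x_perp = math.sqrt(x*x + y*y)
    r = math.sqrt(x*x + y*y + z*z)
    phi = math.acos(x / x_perp) if x_perp > 0.0 else 0.0
    if y < 0.0:
        phi = 2.0 * math.pi - phi
    theta = math.acos(z / r)
    return theta, phi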
I will explain in SIMPLE TERMS this way.
Refer to this image for the following explanation:
Task: Choose a function that will track the correct angle across a range -180 < θ < 180
Trial 1:
sin() is positive in the first and second quadrants: sin(30) = sin(150) = 0.5. It won't be easy to track quadrant change with sin().
Therefore, asin2() is not feasible.
Trial 2:
cos() is positive in the first and fourth quadrants: cos(60) = cos(300) = 0.5. Also, it won't be easy to track quadrant change with cos().
Therefore, acos2() is again not feasible.
Trial 3:
tan() is positive in the first and third quadrants, and in an interesting order.
It is positive in the 1st quadrant, negative in the 2nd, positive in the 3rd, negative in the 4th, and positive in the wrapped-around-1st quadrant.
such that tan(45) = 1, tan(135) = -1, tan(225) = 1, tan(315) = -1, and tan(360+45) = 1. Hurray! We can track quadrant change.
Notice that the unambiguous range is -180 < θ < 180. Also, note in my 45-degree-increment example above, if the sequence is 1,-1,.. the angle goes counter-clockwise, and if the sequence is -1,1,.. it goes clockwise. This idea should resolve directionality.
Therefore, atan2() BECOMES OUR CHOICE.
The problem is as follows,
I would be given a set of x and y coordinates (a coordinate array of around 30 to 40 thousand points) of a long rope. The rope is lying on the ground and can be in any shape.
Now I would be given a start point (essentially an x and y coordinate) and an ending point.
What is an efficient way to determine the set of x and y coordinates from the above-mentioned coordinate array that lie between the start and end points?
Exhaustive searching, i.e. looping 40k times, is not an acceptable solution (mentioned on the question paper).
A little bit of margin for error is acceptable.
We need to find the start point in the array, then the end point. For each, we can think of the rope as describing a function of distance from that point, and we're looking for the lowest point on that distance graph. If one point is a long way away and another is pretty close, we can do some kind of interpolation guess of where to search next.
distance
| /---\
|-- \ /\ -
| -- ------- -- ------ ---------- -
| \ / \---/ \--/
+-----------------------X--------------------------- array index
In the representation above, we want to find "X"... we look at the distances at a few points, get an impression of the slope of the distance curve, possibly even the rate of change of that slope, to help guide our next bit of probing....
To refine the basic approach of doing binary- or interpolated- searches in areas where we know the distance values are low, we may be able to use the following:
if we happen to be given the rope length and know the coordinate samples are equidistant along the rope, then we can calculate a maximum change in distance from our target point per sample.
if we know the rope has a stiffness ensuring it can't loop in a trivially small diameter, then
there's a known limit to how fast the slope of the curve can change
distance curve converges to vertical on both sides of the 0 point
you could potentially cross-reference/combine distance with, or use instead, the direction of each point from the target: only at the target would the direction instantly change ~180 degrees (how well the data points capture this still depends on the distance between adjacent samples and any stiffness of the rope).
Otherwise, there's always a risk the target point may weirdly be encased by two very distant points, frustrating our whole searching algorithm (that must be what they mean about some margin for error: every now and then this search would have to revert to an O(N) brute-force search because the trend analysis fails).
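A crude Python sketch of the probing idea, as a two-level coarse-to-fine search rather than full interpolation; it leans on the equidistant-samples assumption above and can miss in pathological foldings, which is exactly where the allowed margin for error gets spent (stride is a tuning knob):

def find_index_near(points, target, stride=200):
    d2 = lambda p: (p[0] - target[0])**2 + (p[1] - target[1])**2
    # coarse pass: probe every stride-th sample along the rope
    coarse = min(range(0, len(points), stride), key=lambda i: d2(points[i]))
    # fine pass: linear scan only in the neighbourhood of the coarse winner,
    # about len/stride + 2*stride distance evaluations in total
    lo, hi = max(0, coarse - stride), min(len(points), coarse + stride + 1)
    return min(range(lo, hi), key=lambda i: d2(points[i]))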
For a one-time search, sometimes linear traversal is the simplest, fastest solution. Maybe that's the case for this problem.
Iterate through the ordered list of points until finding the start or end, and then collect points until hitting the other endpoint.
Now, if we expected to repeat the search, we could build an index to the points.
Edit: This presumes no additional constraints beyond those mentioned by @koool. Constraining the distance between the points would allow the hill-climbing approach described in @Tony's answer.
I don't think you can solve it exactly using anything other than exhaustive search; consider, for example, a case where the rope is folded in half and the resulting doubled rope forms a spiral with the two ends at the centre.
However if we assume that long portions of the rope are in straight line, then we can eliminate a lot of points based on the slope check:
if (abs(slope(x[i],y[i],x[i+1],y[i+1])
        - slope(x[i+1],y[i+1],x[i+2],y[i+2])) < tolerance)
    eliminate(x[i+1], y[i+1]);
This will reduce the search time significantly if large portions of the rope are in a straight line, but it will still be linear with respect to the number of remaining points.
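A Python sketch of that elimination pass (I compare headings via atan2 instead of raw slopes so vertical runs don't divide by zero; wrap-around at ±pi is ignored for brevity):

import math

def thin_straight_runs(pts, tolerance=1e-2):
    # drop interior points where the heading barely changes
    keep = [pts[0]]
    for prev, cur, nxt in zip(pts, pts[1:], pts[2:]):
        h1 = math.atan2(cur[1] - prev[1], cur[0] - prev[0])
        h2 = math.atan2(nxt[1] - cur[1], nxt[0] - cur[0])
        if abs(h1 - h2) >= tolerance:
            keep.append(cur)
    keep.append(pts[-1])
    return keep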
So basically, you've got a sorted list of the points that comprise the entire rope and you're given two arbitrary points from within that list, and tasked with returning the sublist that exists between those two points.
I'm going to make the assumption that the start and end points that are provided are guaranteed to coincide exactly with points within the sorted list (otherwise it introduces a host of issues, particularly if the rope may be arbitrarily thin and passes by the start/end points multiple times).
That means all you're really looking for are the indices of the two provided coordinates. Or the index of one, and the answer to "is the second coordinate to the right or to the left?".
A simple O(n) solution to that would be:
For each index in array
    coord = array[index]
    if (coord == point1)
        startIndex = index
    if (coord == point2)
        endIndex = index
if (endIndex < startIndex)
    swap(startIndex, endIndex)
return array.sublist(startIndex, endIndex)
Or, if you wanted to optimize for repeated queries, I'd suggest a hashing-based approach where you map each coordinate to its index in the array. Something like:
// build the map (do this once, at init)
map = {}
For each index in array
    coord = array[index]
    map[coord] = index

// find a sublist (do this for each set of start/end points)
startIndex = map[point1]
endIndex = map[point2]
if (endIndex < startIndex)
    swap(startIndex, endIndex)
return array.sublist(startIndex, endIndex)
That's O(n) to build the map, but once it's built you can determine the sublist between any two points in O(1). Assuming an efficient hashmap, of course.
Note that if my assumption doesn't hold, then the same solutions are still usable, provided that as a first step you take the provided start and end points and locate the points in the array that best correspond to each one. As noted, unless you are given some constraints regarding the thickness of the rope then interpolating from an arbitrary coordinate to one that's actually part of the rope can only be guesswork at best.
Reading this question got me thinking: For a given function f, how can we know that a loop of this form:
while (x > 2)
    x = f(x)
will stop for any value x? Is there some simple criterion?
(The fact that f(x) < x for x > 2 doesn't seem to help since the series may converge).
Specifically, can we prove this for sqrt and for log?
For these functions, a proof that ceil(f(x)) < x for x > 2 would suffice. You could do one iteration to arrive at an integer, and then proceed by simple induction.
For the general case, probably the best idea is to use well-founded induction to prove this property. However, as Moron pointed out in the comments, this could be impossible in the general case and the right ordering is, in many cases, quite hard to find.
Edit, in reply to Amnon's comment:
If you wanted to use well-founded induction, you would have to define another strict order that would be well-founded. In the case of the functions you mentioned this is not hard: you can take x << y if and only if ceil(x) < ceil(y), where << is a symbol for this new order. This order is of course well-founded on numbers greater than 2, and both sqrt and log are decreasing with respect to it, so you can apply well-founded induction.
Of course, in general case such an order is much more difficult to find. This is also related, in some way, to total correctness assertions in Hoare logic, where you need to guarantee similar obligations on each loop construct.
There's a general theorem for when the sequence of iterations will converge. (A convergent sequence may not stop in a finite number of steps, but it is getting closer to a target. You can get as close to the target as you like by going far enough out in the sequence.)
The sequence x, f(x), f(f(x)), ... will converge if f is a contraction mapping. That is, there exists a positive constant k < 1 such that for all x and y, |f(x) - f(y)| <= k |x-y|.
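For example, sqrt is a contraction on [1, ∞), since |sqrt(x) - sqrt(y)| = |x - y| / (sqrt(x) + sqrt(y)) <= |x - y| / 2 there, with fixed point 1; a quick Python check shows the iteration diving below 2 almost immediately:

import math

x, steps = 1e30, 0
while x > 2:
    x = math.sqrt(x)          # contraction toward the fixed point 1
    steps += 1
print(steps, x)               # 7 steps: the gap to 1 shrinks double-exponentially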
(The fact that f(x) < x for x > 2 doesn't seem to help since the series may converge).
If we're talking about floats here, that's not true. If, for all x > n, f(x) is strictly less than x, then x will reach n at some point (because there are only a limited number of floating-point values between any two numbers).
Of course this means you need to prove that f(x) is actually less than x using floating-point arithmetic (i.e. proving it is less than x mathematically does not suffice, because f(x) = x may still hold in floats when the difference is too small to represent).
There is no general algorithm to determine whether a function f and a variable x will end or not in that loop. The Halting problem is reducible to that problem.
For sqrt and log we can safely do so because we happen to know the mathematical properties of those functions: sqrt approaches 1, and log eventually goes negative. So the condition x > 2 has to become false at some point.
Hope that helps.
In the general case, all that can be said is that the loop will terminate when it encounters an x_i <= 2. That doesn't mean that the sequence converges, nor even that it stays bounded below 2; it only means that the sequence contains a value that is not greater than 2.
That said, any sequence containing a subsequence that converges to a value strictly less than 2 will (eventually) halt. That is the case for the sequence x_{i+1} = sqrt(x_i), since it converges to 1. In the case of y_{i+1} = log(y_i), it will contain a value less than 2 before becoming undefined over the reals (it is well defined on the extended complex plane C*, but I don't think it will in general converge, except at any stable points that may exist, i.e. where z = log(z)). Ultimately what this means is that you need to perform some upfront analysis on the sequence to better understand its behavior.
The standard test for convergence of a sequence x_i to a point z is: given ε > 0, there is an n such that for all i > n, |x_i - z| < ε.
As an aside, consider the Mandelbrot set, M. The test for whether a particular point c in C is an element of M is whether the sequence z_{i+1} = z_i^2 + c stays bounded; it is unbounded exactly when some |z_i| > 2 occurs. For some elements of M the sequence converges (such as c = 0), but for many it does not (such as c = -1).
Sure. For all positive numbers x, the following inequality holds:
log(x) <= x - 1
(this is a pretty basic result from real analysis; it suffices to observe that the second derivative of log is always negative for all positive x, so the function is concave down, and that x-1 is tangent to the function at x = 1). From this it follows essentially immediately that your while loop must terminate within the first ceil(x) - 2 steps -- though in actuality it terminates much, much faster than that.
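A quick numerical check of that bound in Python (the loop terminates vastly sooner than ceil(x) - 2, as claimed):

import math

for x0 in [3.0, 10.0, 1e6, 1e300]:
    x, steps = x0, 0
    while x > 2:
        x = math.log(x)
        steps += 1
    print(x0, steps, steps <= math.ceil(x0) - 2)   # always True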
A similar argument will establish your result for f(x) = sqrt(x); specifically, you can use the fact that:
sqrt(x) <= x/(2 sqrt(2)) + 1/sqrt(2)
for all positive x.
If you're asking whether this result holds for actual programs, instead of mathematically, the answer is a little bit more nuanced, but not much. Basically, many languages don't actually have hard accuracy requirements for the log function, so if your particular language implementation had an absolutely terrible math library this property might fail to hold. That said, it would need to be a really, really terrible library; this property will hold for any reasonable implementation of log.
I suggest reading this wikipedia entry which provides useful pointers. Without additional knowledge about f, nothing can be said.