From my understanding, the atan2() function exists in programming languages because atan() itself cannot always determine the correct theta since the output is restricted to -pi/2 to pi/2.
If this is the case, then the same problem applies to both asin() and acos(), both of which also have restricted ranges, so why are there no asin2() and acos2() functions?
First off, note that the syntaxes of the two arctan functions are atan(y/x) and atan2(y, x). This distinction is important, because by not performing the division you provide additional information, most importantly the individual signs of x and y. If you know the individual x and y coordinates, the particular solution to the atan function can be found (i.e. the solution which takes into account the quadrant that (x,y) is in).
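To make the loss of sign information concrete, here is a small Python sketch. The helper names angle_atan and angle_atan2 are made up for illustration; only math.atan and math.atan2 are real library functions.

```python
import math

def angle_atan(x, y):
    """Angle from the quotient only: the signs of x and y are lost in y/x."""
    return math.degrees(math.atan(y / x))

def angle_atan2(x, y):
    """Angle from the separate coordinates: the quadrant is preserved."""
    return math.degrees(math.atan2(y, x))

# (1, 1) lies in the first quadrant and (-1, -1) in the third,
# yet y/x is 1.0 for both points.
for point in [(1.0, 1.0), (-1.0, -1.0)]:
    print(point, angle_atan(*point), angle_atan2(*point))
```

Both points give the same atan result, while atan2 separates them by 180 degrees.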
If you go from tan(θ) = y/x to sin(θ) = y/sqrt(x²+y²), then the inverse operation asin takes y and sqrt(x²+y²) and combines that to obtain some information about the angle. Here it doesn't matter whether we perform the division ourselves or let some hypothetical asin2 function handle it. The denominator is always positive, so the divided argument contains just as much information as separate numerator and denominator contain. (At least in an IEEE environment where division by zero leads to a correctly-signed infinity.)
If you know the y coordinate and the hypotenuse sqrt(x²+y²) then you know the sine of the angle, but you cannot know the angle itself, since you cannot distinguish between negative and positive x values. Likewise, if you know the x coordinate and the hypotenuse, you know the cosine of the angle but you cannot know the sign of the y value.
So asin2 and acos2 are not mathematically feasible, at least not in an obvious way. If you had some kind of sign encoded into the hypotenuse, things might be different, but I can think of no situation where such a sign would arise naturally.
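A quick way to see this in Python: two points mirrored across the y-axis have the same y and the same hypotenuse, so anything computed from those two numbers alone cannot tell them apart. The function name asin_angle is hypothetical.

```python
import math

def asin_angle(x, y):
    """Angle in degrees recovered from y and the hypotenuse alone."""
    r = math.hypot(x, y)
    return math.degrees(math.asin(y / r))

right = asin_angle(math.sqrt(3.0), 1.0)   # true angle: 30 degrees
left = asin_angle(-math.sqrt(3.0), 1.0)   # true angle: 150 degrees
print(right, left)  # identical values: the sign of x never enters
```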
Because asin2(y,x) and acos2(y,x) would each take the same parameters as atan2(y,x) and each give the same answer. Each would be equally valid, but we only need one such function.
The confusion arises from the name atan2. It's a function that, given x and y, computes the angle that the line from the origin to this point makes with the positive x-axis. A name like angle_from(x,y) would arguably have been more appropriate.
There are times when a function like "acos2" is needed, for example when performing rotations of vectors in 3D space. Under those circumstances, I hard-code my own acos2 function which simply performs the following checks:
x_perp=sqrt(x*x+y*y)
r=sqrt(x*x+y*y+z*z)
if(x_perp.gt.0.0d0) then
phi=acos(x/x_perp)
else
phi=0.0d0
endif
if(y.lt.0.0d0) phi=2.0d0*pi-phi
theta=acos(z/r)
where theta and phi are the usual spherical coordinates and x, y, z the Cartesian coordinates. The problem arises when y is negative: acos only returns values between 0 and pi, so phi has to be reflected to 2*pi - phi. There is no such problem for theta, which only ranges from 0 to pi.
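For comparison, here is a rough Python version of the azimuth part of that check, next to the same angle computed with atan2. This is only a sketch; the function names azimuth_acos and azimuth_atan2 are invented here, and the clamp is an extra safety measure that the Fortran code above does not have.

```python
import math

def azimuth_acos(x, y):
    """Azimuth phi in [0, 2*pi) from acos plus a sign check on y."""
    x_perp = math.hypot(x, y)
    if x_perp == 0.0:
        return 0.0  # angle undefined on the z-axis; 0 by convention
    # Clamp guards against x / x_perp drifting just outside [-1, 1].
    phi = math.acos(max(-1.0, min(1.0, x / x_perp)))
    if y < 0.0:
        phi = 2.0 * math.pi - phi
    return phi

def azimuth_atan2(x, y):
    """Same angle via atan2, shifted from (-pi, pi] into [0, 2*pi)."""
    return math.atan2(y, x) % (2.0 * math.pi)

for point in [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]:
    print(point, azimuth_acos(*point), azimuth_atan2(*point))
```

The two columns should agree up to rounding, which suggests the hand-written check does the same job as atan2(y, x) for the azimuth.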
I will explain in SIMPLE TERMS this way.
Refer to this image for the following explanation:
Task: Choose a function that will track the correct angle across a range -180 < θ < 180
Trial 1:
sin() is positive in the first and second quadrants, sin(30) = sin(150) = 0.5. It won't be easy to track quadrant change with sin().
Therefore, asin2() is not feasible.
Trial 2:
cos() is positive in the first and fourth quadrants, cos(60) = cos(300) = 0.5. Again, it won't be easy to track quadrant change with cos().
Therefore, acos2() is again not feasible.
Trial 3:
tan() is positive in the first and third quadrants, and in an interesting order.
It is positive in the 1st quadrant, negative in the 2nd, positive in the 3rd, negative in the 4th, and positive in the wrapped-around-1st quadrant.
such that tan(45) = 1 , tan(135) = -1, tan(225) = 1, tan(315) = -1, and tan(360+45) = 1. Hurray! we can track quadrant change.
Notice that the unambiguous range is -180 < θ < 180. Also, note in my 45-degree-increment example above, if the sequence is 1,-1,.. the angle goes counter-clockwise, and if the sequence is -1,1,.. it goes clockwise. This idea should resolve directionality.
Therefore, atan2() BECOMES OUR CHOICE.
For bit-wise shift (or rotation) operations, we usually have a pair of operators, for instance
x << n
x >> n
for left or right shift of x by n bits.
We want to define a single function
bitshift(x, n)
Before that, we have to determine, which shift is to be used for positive and negative n - what is the "sign" of each shift (or rotation) direction.
Is there a definition or convention for that?
(Please note that this question has nothing to do with signed/unsigned types)
UPDATES
Please also note that I am not asking for implementation details of this function, even though they might be somewhat related.
There are similar functions in scheme/lisp-like languages, like ash, which do left shift for positive n
Since shifting right by k places is equal to multiplying by 2^-k, and shifting left is equal to multiplying by 2^k, I think that should give you a hint.
Note: Reason I would argue for this way of looking at it is that it is common to consider multiplication as more fundamental operation in some sense than division is, although you could certainly argue the other way around.
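Under that convention (positive n means multiply by 2^n, so shift left), a minimal sketch of the function from the question could look like this in Python. It is only one possible convention, not an established standard.

```python
def bitshift(x: int, n: int) -> int:
    """Shift x by n bits: left for positive n, right for negative n."""
    if n >= 0:
        return x << n   # multiply by 2**n
    return x >> -n      # floor-divide by 2**(-n)

print(bitshift(5, 2))    # 20
print(bitshift(20, -2))  # 5
```

The explicit branch is needed in Python because the built-in << and >> operators reject a negative shift count.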
You can use a negative number as the argument.
For example
x << n
so pass n=2 if you want to left shift two positions
and pass n=-2 if you want to right shift two positions.
I am unsure how to use the Distributive property on the following function:
F = B'D + A'D + BD
I understand that F = xy + x'z would become (xy + x')(xy + z) but I'm not sure how to do this with three terms with two variables.
Also another small question:
I was wondering how to know what number a minterm is without having to consult (or memorise) the table of minterms.
For example how can I tell that xy'z' is m4?
When you're trying to use the distributive property there, what you're doing is converting minterms to maxterms. This is actually very related to your second question.
To tell that xy'z' is m4, read the term as a binary number where a complemented variable is 0 and an uncomplemented one is 1. xy'z' then becomes 100, binary for the decimal 4. That's really what a k-map/minterm table is doing for you to give a number.
Now an important extension of this: the number of possible combinations is 2^number of different variables. If you have 3 variables, there are 2^3 or 8 different combinations. That means you have min/maxterm possible numbers from 0-7. Here's the cool part: anything that isn't a minterm is a maxterm, and vice versa.
So, if you have variables x and y, and you have the expression xy', you can see that as 10, or m2. Because the numbers go from 0-3 with 2 variables, m2 implies M0, M1, and M3. Therefore, xy'=(x+y)(x+y')(x'+y').
In other words, the easiest way to do the distributive property in either direction is to note what minterm or maxterm you're dealing with, and just switch it to the other.
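Both ideas are easy to check mechanically. The sketch below uses a made-up helper, minterm_index, that reads a product term as a binary number, and then verifies the identity xy' = (x+y)(x+y')(x'+y') with a truth table.

```python
from itertools import product

def minterm_index(literals):
    """Read a product term such as ["x", "y'", "z'"] as a binary number.

    A plain literal contributes a 1 bit and a complemented one a 0 bit,
    with the first variable as the most significant bit.
    """
    bits = "".join("0" if lit.endswith("'") else "1" for lit in literals)
    return int(bits, 2)

print(minterm_index(["x", "y'", "z'"]))  # 4
print(minterm_index(["x", "y'"]))        # 2

# Truth-table check of xy' = (x + y)(x + y')(x' + y'), i.e. m2 = M0 * M1 * M3.
for x, y in product([False, True], repeat=2):
    lhs = x and not y
    rhs = (x or y) and (x or not y) and (not x or not y)
    assert lhs == rhs
```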
For more info/different wording.
This discussion came up in a previous question and I'm interested in knowing the difference between the two. Illustration with an example would be nice.
Basic Example
Here is an example from Leonid Shifrin's book Mathematica programming: an advanced introduction
It is an excellent resource for this kind of question. See: (1) (2)
ClearAll[a, b]
a = RandomInteger[{1, 10}];
b := RandomInteger[{1, 10}]
Table[a, {5}]
{4, 4, 4, 4, 4}
Table[b, {5}]
{10, 5, 2, 1, 3}
Complicated Example
The example above may give the impression that once a definition for a symbol is created using Set, its value is fixed, and does not change. This is not so.
f = ... assigns to f an expression as it evaluates at the time of assignment. If symbols remain in that evaluated expression, and later their values change, so does the apparent value of f.
ClearAll[f, x]
f = 2 x;
f
2 x
x = 7;
f
14
x = 3;
f
6
It is useful to keep in mind how the rules are stored internally. For symbols assigned a value as symbol = expression, the rules are stored in OwnValues. Usually (but not always), OwnValues contains just one rule. In this particular case,
In[84]:= OwnValues[f]
Out[84]= {HoldPattern[f] :> 2 x}
The important part for us now is the r.h.s., which contains x as a symbol. What really matters for evaluation is this form - the way the rules are stored internally. As long as x did not have a value at the moment of assignment, both Set and SetDelayed produce (create) the same rule above in the global rule base, and that is all that matters. They are, therefore, equivalent in this context.
The end result is a symbol f that has a function-like behavior, since its computed value depends on the current value of x. This is not a true function however, since it does not have any parameters, and triggers only changes of the symbol x. Generally, the use of such constructs should be discouraged, since implicit dependencies on global symbols (variables) are just as bad in Mathematica as they are in other languages - they make the code harder to understand and bugs subtler and easier to overlook. Somewhat related discussion can be found here.
Set used for functions
Set can be used for functions, and sometimes it needs to be. Let me give you an example. Here Mathematica symbolically solves the Sum, and then assigns that to aF[x], which is then used for the plot.
ClearAll[aF, x]
aF[x_] = Sum[x^n Fibonacci[n], {n, 1, \[Infinity]}];
DiscretePlot[aF[x], {x, 1, 50}]
If on the other hand you try to use SetDelayed then you pass each value to be plotted to the Sum function. Not only will this be much slower, but at least on Mathematica 7, it fails entirely.
ClearAll[aF, x]
aF[x_] := Sum[x^n Fibonacci[n], {n, 1, \[Infinity]}];
DiscretePlot[aF[x], {x, 1, 50}]
If one wants to make sure that possible global values for formal parameters (x here) do not interfere and are ignored during the process of defining a new function, an alternative to Clear is to wrap Block around the definition:
ClearAll[aF, x];
x = 1;
Block[{x}, aF[x_] = Sum[x^n Fibonacci[n], {n, 1, \[Infinity]}]];
A look at the function's definition confirms that we get what we wanted:
?aF
Global`aF
aF[x_]=-(x/(-1+x+x^2))
In[1]:= Attributes[Set]
Out[1]= {HoldFirst, Protected, SequenceHold}
In[2]:= Attributes[SetDelayed]
Out[2]= {HoldAll, Protected, SequenceHold}
As you can see by their attributes, both functions hold their first argument (the symbol to which you are assigning), but they differ in that SetDelayed also holds its second argument, while Set does not. This means that Set will evaluate the expression to the right of = at the time the assignment is made. SetDelayed does not evaluate the expression to the right of the := until the variable is actually used.
What's happening is more clear if the right hand side of the assignment has a side effect (e.g. Print[]):
In[3]:= x = (Print["right hand side of Set"]; 3)
x
x
x
During evaluation of In[3]:= right hand side of Set
Out[3]= 3
Out[4]= 3
Out[5]= 3
Out[6]= 3
In[7]:= x := (Print["right hand side of SetDelayed"]; 3)
x
x
x
During evaluation of In[7]:= right hand side of SetDelayed
Out[8]= 3
During evaluation of In[7]:= right hand side of SetDelayed
Out[9]= 3
During evaluation of In[7]:= right hand side of SetDelayed
Out[10]= 3
:= is for defining functions and = is for setting a value, basically.
That is, := is evaluated each time it is read, while = is evaluated once, when it is set.
think about:
x = 2
y = x
z := x
x = 4
Now, z is 4 if evaluated while y is still 2
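For readers more used to other languages, a loose analogy in Python may help: plain assignment stores the value computed right now, while wrapping the right-hand side in a function defers evaluation until it is called. This is only an analogy and not how Mathematica stores rules.

```python
x = 2
y = x           # evaluated now, like y = x: y stores the value 2
z = lambda: x   # evaluated on each call, like z := x
x = 4
print(y, z())   # 2 4
```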
I need an algorithm to perform a 2D bisection method for solving a 2x2 non-linear problem. Example: two equations f(x,y)=0 and g(x,y)=0 which I want to solve simultaneously. I am very familiar with the 1D bisection ( as well as other numerical methods ). Assume I already know the solution lies between the bounds x1 < x < x2 and y1 < y < y2.
In a grid the starting bounds are:
^
| C D
y2 -+ o-------o
| | |
| | |
| | |
y1 -+ o-------o
| A B
o--+------+---->
x1 x2
and I know the values f(A), f(B), f(C) and f(D) as well as g(A), g(B), g(C) and g(D). To start the bisection I guess we need to divide the points out along the edges as well as the middle.
^
| C F D
y2 -+ o---o---o
| | |
|G o o M o H
| | |
y1 -+ o---o---o
| A E B
o--+------+---->
x1 x2
Now considering the possibilities of combinations such as checking if f(G)*f(M)<0 AND g(G)*g(M)<0 seems overwhelming. Maybe I am making this a little too complicated, but I think there should be a multidimensional version of bisection, just as Newton-Raphson can easily be extended to multiple dimensions using gradient operators.
Any clues, comments, or links are welcomed.
Sorry, while bisection works in 1-d, it fails in higher dimensions. You simply cannot break a 2-d region into subregions using only information about the function at the corners of the region and a point in the interior. In the words of Mick Jagger, "You can't always get what you want".
I just stumbled upon the answer to this from geometrictools.com and C++ code.
edit: the code is now on github.
I would split the area along a single dimension only, alternating dimensions. The condition you have for the existence of a zero of a single function would be "you have two points of different sign on the boundary of the region", so I'd just check that for the two functions. However, I don't think it would work well, since zeros of both functions in a particular region don't guarantee a common zero (and a common zero might even exist in a different region that doesn't meet the criterion).
For example, look at this image:
There is no way you can distinguish the squares ABED and EFIH given only f() and g()'s behaviour on their boundary. However, ABED doesn't contain a common zero and EFIH does.
This would be similar to region queries using eg. kD-trees, if you could positively identify that a region doesn't contain zero of eg. f. Still, this can be slow under some circumstances.
If you can assume (per your comment to woodchips) that f(x,y)=0 defines a continuous monotone function y=f2(x), i.e. for each x1<=x<=x2 there is a unique solution for y (you just can't express it analytically due to the messy form of f), and similarly y=g2(x) is a continuous monotone function, then there is a way to find the joint solution.
If you could calculate f2 and g2, then you could use a 1-d bisection method on [x1,x2] to solve f2(x)-g2(x)=0. And you can do that by using 1-d bisection on [y1,y2] again for solving f(x,y)=0 for y for any given fixed x that you need to consider (x1, x2, (x1+x2)/2, etc) - that's where the continuous monotonicity is helpful -and similarly for g. You have to make sure to update x1-x2 and y1-y2 after each step.
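A rough sketch of this nested scheme in Python follows. It assumes exactly what the answer assumes (each curve is a single-valued continuous function of x inside the box, with a sign change across [y1, y2]), and the names bisect and solve_nested as well as the test system are made up for illustration. For simplicity it keeps the y bracket fixed instead of shrinking it after each step.

```python
def bisect(h, lo, hi, tol=1e-12):
    """Plain 1-D bisection; assumes h(lo) and h(hi) have opposite signs."""
    h_lo = h(lo)
    if h_lo == 0.0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h_mid = h(mid)
        if h_mid == 0.0:
            return mid
        if (h_lo < 0.0) != (h_mid < 0.0):
            hi = mid                 # sign change in the lower half
        else:
            lo, h_lo = mid, h_mid    # sign change in the upper half
    return 0.5 * (lo + hi)

def solve_nested(f, g, x1, x2, y1, y2):
    """Solve f = g = 0 by bisecting on x the gap between the implicit curves."""
    def f2(x):  # the y for which f(x, y) = 0
        return bisect(lambda y: f(x, y), y1, y2)

    def g2(x):  # the y for which g(x, y) = 0
        return bisect(lambda y: g(x, y), y1, y2)

    x = bisect(lambda u: f2(u) - g2(u), x1, x2)
    return x, f2(x)

# Hypothetical test system: a circle arc and a line that meet at (1, 1).
f = lambda x, y: x * x + y * y - 2.0
g = lambda x, y: y - x
print(solve_nested(f, g, 0.5, 1.2, 0.0, 2.0))
```

Every outer step costs two full inner bisections, which is the inefficiency mentioned in the answer.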
This approach might not be efficient, but should work. Of course, lots of two-variable functions don't intersect the z-plane as continuous monotone functions.
I don't have much experience with optimization, but I built a solution to this problem with a bisection algorithm like the one the question describes. There is still a bug to fix in my solution, because it computes the same root twice in some cases, but I think the fix is simple and I will try it later.
EDIT: I saw jpalecek's comment, and now I understand that some premises I assumed are wrong, but the method still works in most cases. More specifically, the zero is guaranteed only if the two functions change sign in opposite directions, and the cases of a zero at the vertices need to be handled. I think it is possible to build a justified and satisfactory heuristic for that, but it is a little complicated, and I now consider it more promising to take the function f_abs = abs(f, g) and build a heuristic to find its local minima by looking at the gradient direction at the midpoints of the edges.
Introduction
Consider the configuration in the question:
^
| C D
y2 -+ o-------o
| | |
| | |
| | |
y1 -+ o-------o
| A B
o--+------+---->
x1 x2
There are many ways to do that, but I chose to use only the corner points (A, B, C, D) and not the edge midpoints or the center point as the question suggests. Assume I have two functions f(x,y) and g(x,y) as you describe. In truth it is generally one function (x,y) -> (f(x,y), g(x,y)).
The steps are the following, and there is a summary (with Python code) at the end.
Step by step explanation
Calculate the product of each scalar function (f and g) with itself at pairs of corner points. Compute the minimum product for each function for each direction of variation (axis, x and y).
Fx = min(f(C)*f(B), f(D)*f(A))
Fy = min(f(A)*f(B), f(D)*f(C))
Gx = min(g(C)*g(B), g(D)*g(A))
Gy = min(g(A)*g(B), g(D)*g(C))
This looks at the products across two opposite sides of the rectangle and takes the minimum of them; a negative value indicates a sign change. It is a bit redundant, but it works well. Alternatively you can try another configuration, such as using the points E, F, G and H shown in the question, but I think it makes sense to use the corner points because they account better for the whole area of the rectangle. That is only an impression, though.
Compute the minimum over the two axes for each function.
F = min(Fx, Fy)
G = min(Gx, Gy)
Each of these values indicates the existence of a zero of the corresponding function, f or g, within the rectangle.
Compute the maximum of them:
max(F, G)
If max(F, G) < 0, then there is a root inside the rectangle. Additionally, if f(C) = 0 and g(C) = 0, there is a root too and we treat it the same way, but if the root is at another corner we ignore it, because another rectangle will compute it (I want to avoid computing roots twice). The statement below summarizes this:
guaranteed_contain_zeros = max(F, G) < 0 or (f(C) == 0 and g(C) == 0)
In this case we proceed by splitting the region recursively until the rectangles are as small as we want.
Otherwise, a root may still exist inside the rectangle. Because of that, we have to use some criterion to keep splitting these regions until we reach a minimum granularity. The criterion I used is to require that the largest dimension of the current rectangle be smaller than the smallest dimension of the original rectangle (delta in the code sample below).
Summary
The Python code below summarizes the steps:
def balance_points(x_min, x_max, y_min, y_max, delta, eps=2e-32):
    # f and g are assumed to be module-level functions of (x, y)
    width = x_max - x_min
    height = y_max - y_min
    x_middle = (x_min + x_max)/2
    y_middle = (y_min + y_max)/2
    # Corner points, labelled as in the diagram above
    A, B = (x_min, y_min), (x_max, y_min)
    C, D = (x_min, y_max), (x_max, y_max)
    Fx = min(f(*C)*f(*B), f(*D)*f(*A))
    Fy = min(f(*A)*f(*B), f(*D)*f(*C))
    Gx = min(g(*C)*g(*B), g(*D)*g(*A))
    Gy = min(g(*A)*g(*B), g(*D)*g(*C))
    F = min(Fx, Fy)
    G = min(Gx, Gy)
    largest_dim = max(width, height)
    guaranteed_contain_zeros = max(F, G) < 0 or (f(*C) == 0 and g(*C) == 0)
    if guaranteed_contain_zeros and largest_dim <= eps:
        # Small enough: report the center of the rectangle as the root
        return [(x_middle, y_middle)]
    elif guaranteed_contain_zeros or largest_dim > delta:
        # Split along the longer side and search both halves
        if width >= height:
            return balance_points(x_min, x_middle, y_min, y_max, delta, eps) + balance_points(x_middle, x_max, y_min, y_max, delta, eps)
        else:
            return balance_points(x_min, x_max, y_min, y_middle, delta, eps) + balance_points(x_min, x_max, y_middle, y_max, delta, eps)
    else:
        return []
Results
I have used similar code in a personal project (GitHub here), where it draws the rectangles of the algorithm and the root (the system has a balance point at the origin):
Rectangles
It works well.
Improvements
In some cases the algorithm computes the same zero twice. I think this can have two causes:
The functions give exactly zero at neighbouring rectangles (because of numerical truncation). In this case the remedy is to increase eps (enlarge the rectangles). I chose eps=2e-32 because 32 bits is half of the precision (on a 64-bit architecture), so it is probable that the function doesn't give an exact zero... but it was more of a guess, and I don't know if it is the best choice. However, if we decrease eps too much, the recursion exceeds the recursion limit of the Python interpreter.
The values f(A), f(B), etc. are near zero and the product is truncated to zero. I think this can be reduced if we use the product of the signs of f and g in place of the product of the function values.
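The second cause is easy to reproduce. In the sketch below (the helper names sign and opposite_signs are made up), the product of two tiny values of opposite sign underflows to -0.0, which is not less than zero, while the product of the signs still detects the change.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def opposite_signs(a, b):
    """True when a and b are both nonzero and differ in sign."""
    return sign(a) * sign(b) < 0

a, b = 1e-200, -1e-200
print(a * b < 0)             # False: the product underflows to -0.0
print(opposite_signs(a, b))  # True
```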
I think it is possible to improve the criterion for discarding a rectangle. It could consider how much the functions vary over the rectangle and how far the function is from zero, perhaps through a simple relation between the mean and variance of the function values at the corners. Alternatively (and more complicated), we could use a stack to store the values at each recursion level and stop the recursion once these values are seen to converge.
This is a similar problem to finding critical points in vector fields (see http://alglobus.net/NASAwork/topology/Papers/alsVugraphs93.ps).
If you have the values of f(x,y) and g(x,y) at the vertices of your quadrilateral and you are in a discrete problem (such that you have neither an analytical expression for f(x,y) and g(x,y) nor the values at other locations inside the quadrilateral), then you can use bilinear interpolation to get two equations (for f and g). For the 2D case the analytical solution reduces to a quadratic equation, and depending on its roots (1 root, 2 real roots, 2 imaginary roots) you may have 1 solution, 2 solutions, or no solutions, each of which may lie inside or outside your quadrilateral.
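As a sketch of how that could be worked out, assume the cell is mapped to the unit square with local coordinates (s, t) and corner values ordered A (0,0), B (1,0), C (0,1), D (1,1). That ordering and all the names below are assumptions for illustration. Writing f = a0 + a1 s + a2 t + a3 s t and likewise g with coefficients b, eliminating t gives a quadratic in s.

```python
import math

def bilinear_coeffs(vA, vB, vC, vD):
    """Coefficients of v(s, t) = c0 + c1*s + c2*t + c3*s*t on the unit square."""
    return vA, vB - vA, vC - vA, vA - vB - vC + vD

def bilinear_roots(f_corners, g_corners, tol=1e-12):
    """Common zeros of two bilinear interpolants inside the unit square."""
    a0, a1, a2, a3 = bilinear_coeffs(*f_corners)
    b0, b1, b2, b3 = bilinear_coeffs(*g_corners)
    # Quadratic qa*s^2 + qb*s + qc = 0 obtained by eliminating t.
    qa = b1 * a3 - b3 * a1
    qb = b0 * a3 + b1 * a2 - b2 * a1 - b3 * a0
    qc = b0 * a2 - b2 * a0
    if abs(qa) < tol:
        s_values = [] if abs(qb) < tol else [-qc / qb]
    else:
        disc = qb * qb - 4.0 * qa * qc
        if disc < 0.0:
            s_values = []  # complex roots: no real solution
        else:
            root = math.sqrt(disc)
            s_values = sorted({(-qb - root) / (2.0 * qa),
                               (-qb + root) / (2.0 * qa)})
    roots = []
    for s in s_values:
        den = a2 + a3 * s
        if abs(den) < tol:
            continue  # f does not determine t here; skipped in this sketch
        t = -(a0 + a1 * s) / den
        if 0.0 <= s <= 1.0 and 0.0 <= t <= 1.0:
            roots.append((s, t))
    return roots

# Corner values of f = s*t - 0.25 and g = s - t; common zero at (0.5, 0.5).
print(bilinear_roots((-0.25, -0.25, -0.25, 0.75), (0.0, 1.0, -1.0, 0.0)))
```

A root (s, t) maps back to the original cell as x = x1 + s (x2 - x1), y = y1 + t (y2 - y1). Degenerate cases are only partly handled here.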
If instead you have analytic functions of f(x,y) and g(x,y) and want to use them, this is not useful. Instead you could divide your quadrilateral recursively, however as it was already pointed out by jpalecek (2nd post), you would need a way to stop your divisions by figuring out a test that would assure you would have no zeros inside a quadrilateral.
Let f_1(x,y), f_2(x,y) be two functions which are continuous and monotonic with respect to x and y. The problem is to solve the system f_1(x,y) = 0, f_2(x,y) = 0.
The alternating-direction algorithm is illustrated below. Here, the lines depict sets {f_1 = 0} and {f_2 = 0}. It is easy to see that the direction of movement of the algorithm (right-down or left-up) depends on the order of solving the equations f_i(x,y) = 0 (e.g., solve f_1(x,y) = 0 w.r.t. x then solve f_2(x,y) = 0 w.r.t. y OR first solve f_1(x,y) = 0 w.r.t. y and then solve f_2(x,y) = 0 w.r.t. x).
Given the initial guess, we don't know where the root is. So, in order to find all roots of the system, we have to move in both directions.
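A minimal sketch of one direction of this alternating scheme, using 1-D bisection for each single-variable solve. The names bisect and alternating_direction, the fixed number of sweeps, and the linear test system are assumptions for illustration. Convergence is not guaranteed in general, and swapping the order of the two solves gives the other direction of movement.

```python
def bisect(h, lo, hi, tol=1e-12):
    """Plain 1-D bisection; assumes h(lo) and h(hi) have opposite signs."""
    h_lo = h(lo)
    if h_lo == 0.0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h_mid = h(mid)
        if h_mid == 0.0:
            return mid
        if (h_lo < 0.0) != (h_mid < 0.0):
            hi = mid
        else:
            lo, h_lo = mid, h_mid
    return 0.5 * (lo + hi)

def alternating_direction(f1, f2, x_range, y_range, y0, sweeps=50):
    """Alternate between solving f1 = 0 for x and f2 = 0 for y."""
    y = y0
    for _ in range(sweeps):
        x = bisect(lambda u: f1(u, y), *x_range)  # f1(x, y) = 0 w.r.t. x
        y = bisect(lambda v: f2(x, v), *y_range)  # f2(x, y) = 0 w.r.t. y
    return x, y

# Hypothetical monotone test system with its root at (1, 1).
f1 = lambda x, y: 2.0 * x + y - 3.0
f2 = lambda x, y: x + 2.0 * y - 3.0
print(alternating_direction(f1, f2, (0.0, 3.0), (0.0, 3.0), y0=0.0))
```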