Keep getting the error "Arguments are not sufficiently instantiated" and can't understand why

Keep getting the error Arguments are not sufficiently instantiated for the multiplication by addition rule I wrote as shown below.
mult(_, 0, 0).      % base case: anything times 0 is 0
mult(X, 1, X).      % base case: anything times 1 is itself
mult(X, Y, Z) :-
    Y > 1,
    Y1 is Y - 1,
    mult(X, Y1, Z1),
    Z is X + Z1.
I am new to Prolog and really struggling with even such simple problems.
Any recommendations for books or online tutorials would be great.
I am running it on SWI-Prolog on Ubuntu Linux.

In your definition of mult/3 the first two arguments have to be known. If one of them is still a variable, an instantiation error will occur. E.g., mult(2, X, 6) will yield an instantiation error, although X = 3 is a correct answer; in fact, the only one.
There are several options: successor arithmetic, constraints, or meta-logical predicates.
Here is a starting point with successor arithmetic:
add(0,Y,Y).
add(s(X),Y,s(Z)) :- add(X,Y,Z).
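To connect this back to the question, here is a minimal sketch of multiplication built on that add/3 (my addition, not part of the original answer); s(0) represents 1, s(s(0)) represents 2, and so on:
% Multiplication over successor numerals, using add/3 above.
mult(0, _, 0).
mult(s(X), Y, Z) :-
    mult(X, Y, Z1),    % Z1 = X * Y
    add(Y, Z1, Z).     % Z  = Y + X * Y = (X + 1) * Y
Because no arithmetic evaluation is involved, such queries work in several directions; e.g. mult(s(s(0)), M, s(s(s(s(0))))) finds M = s(s(0)).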
Another approach would be to use constraints over the integers. YAP and SWI have a library(clpfd) that can be used in a very flexible manner: Both for regular integer computations and the more general constraints. Of course, multiplication is already predefined:
?- A * B #= C.
A*B#=C.
?- A * B #= C, C = 6.
C = 6, A in -6.. -1\/1..6, A*B#=6, B in -6.. -1\/1..6.
?- A * B #= C, C = 6, A = 2.
A = 2, B = 3, C = 6.
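Applied to the original problem, this suggests a one-line constraint version of mult/3 (a sketch of mine, assuming the library(clpfd) mentioned above):
:- use_module(library(clpfd)).
% Works with variables in any argument position.
mult(X, Y, Z) :- Z #= X * Y.
Now mult(2, Y, 6) succeeds with Y = 3 instead of raising an instantiation error.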
Meta-logical predicates: I cannot recommend this option, in which you would use var/1, nonvar/1, ground/1 to distinguish the various cases and handle them differently. This is so error-prone that I have rarely seen a correct program using them. In fact, even very well-known textbooks contain serious errors!

I think you got the last two calls reversed. Don't you mean:
mult(X, Y, Z) :- Y > 1, Y1 is Y - 1, Z1 is X + Z, mult(X, Y1, Z1).
Edit: never mind that; I looked at the code again and it doesn't make sense. I believe your original code is correct.
As for why that error is occurring, I need to know how you're calling the predicate. Can you give an example input?
The correct way of calling your predicate is mult(+X, +Y, ?Z):
?- mult(5,0,X).
X = 0
?- mult(5,1,X).
X = 5
?- mult(5,5,X).
X = 25
?- mult(4,4,16).
yes
?- mult(3,3,10).
no
etc. Calling it with a free variable in either of the first two arguments will produce that error, because one of them ends up on the right-hand side of an is/2 or on one side of the >/2 comparison, and those predicates require fully instantiated arithmetic expressions.
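For example (the error text is the one quoted in the question):
?- mult(X, 2, 6).
ERROR: Arguments are not sufficiently instantiated
Here Y > 1 and Y1 is Y - 1 succeed, but Z is X + Z1 is reached with X still unbound, so is/2 throws.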


Can I define a Maxima function f(x) which assigns to the argument x

Sorry for the basic question, but it's quite hard to find much discussion of Maxima specifics.
I'm trying to learn some Maxima and wanted to use something like
x:2
x+=2
which as far as I can tell doesn't exist in Maxima. Then I discovered that I can define my own operators as infix operators, so I tried doing
infix("+=");
"+=" (a,b):= a:(a+b);
However this doesn't work: if I first set x:1 and then call x+=2, the function returns 3, but if I check the value of x I see it hasn't changed.
Is there a way to achieve what I was trying to do in Maxima? Could anyone explain why the definition I gave fails?
Thanks!
The problem with your implementation is that there is both too much and too little evaluation -- the += function doesn't see the symbol x, so it doesn't know which variable to assign the result to, and the left-hand side of an assignment isn't evaluated, so += thinks it is assigning to a, not x.
Here's one way to get the right amount of evaluation. ::= defines a macro, which is just a function that quotes its arguments and whose return value is evaluated again. buildq is a substitution function which quotes the expression into which you are substituting. So the combination of ::= and buildq here constructs the assignment expression (e.g. x: x + 1) and then evaluates it.
(%i1) infix ("+=") $
(%i2) "+="(a, b) ::= buildq ([a, b], a: a + b) $
(%i3) x: 100 $
(%i4) macroexpand (x += 1);
(%o4) x : x + 1
(%i5) x += 1;
(%o5) 101
(%i6) x;
(%o6) 101
(%i7) x += 1;
(%o7) 102
(%i8) x;
(%o8) 102
So it is certainly possible to do this, if you want to. But may I suggest that maybe you don't need it? Modifying a variable makes it harder to keep track, mentally, of what is going on. A programming policy such as one-time assignment can make it easier for the programmer to understand the program. This is part of a general approach called functional programming; perhaps you can take a look at that. Maxima has various features which make it possible to use functional programming, although you are not required to use them.
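As a small illustration (my addition, not from the original answer), values can be derived rather than updated in place, e.g. with map and lambda:
(%i1) map(lambda([u], u + 2), [1, 2, 3]);
(%o1) [3, 4, 5]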

Octave -inf and NaN

I searched the forum and found this thread, but it does not cover my question
Two ways around -inf
From a Machine Learning class, week 3: I am getting -Inf when using log(0), which later turns into a NaN. The NaN results in no answer from a sum formula, so no scalar for J (a cost function computed with matrix math).
Here is a test of my function
>> sigmoid([-100;0;100])
ans =
3.7201e-44
5.0000e-01
1.0000e+00
This is as expected, but the hypothesis requires ans = 1 - sigmoid:
>> 1-ans
ans =
1.00000
0.50000
0.00000
and the log(0) gives -Inf:
>> log(ans)
ans =
0.00000
-0.69315
-Inf
-Inf rows do not add to the cost function, but the -Inf carries through to NaN, and I do not get a result. I cannot find any material on -Inf, but am thinking there is a problem with my sigmoid function.
Can you provide any direction?
The typical way to avoid infinity in these cases is to add eps to the operand:
log(ans + eps)
eps is a very, very small value, and won't noticeably affect the output for values of ans unless ans is zero:
>> z = [-100;0;100];
>> g = 1 ./ (1+exp(-z));
>> log(1-g + eps)
ans =
0.0000
-0.6931
-36.0437
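To show where this guard might sit in practice (a sketch of mine; the cost function and variable names are assumptions, not from the thread), a logistic-regression cost could be written as:
function J = cost(theta, X, y)
  % Sigmoid hypothesis; X is the design matrix, y holds the 0/1 labels.
  h = 1 ./ (1 + exp(-X * theta));
  % eps keeps both log terms away from log(0) = -Inf.
  J = -mean(y .* log(h + eps) + (1 - y) .* log(1 - h + eps));
end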
Adding to the answers here, I really do hope you will provide some more context to your question (in particular, what you are actually trying to do).
I will go out on a limb and guess the context, just in case this is useful. You are probably doing machine learning, and trying to define a cost function based on the negative log likelihood of a model, and then trying to differentiate it to find the point where this cost is at its minimum.
In general for a reasonable model with a useful likelihood that adheres to Cromwell's rule, you shouldn't have these problems, but, in practice it happens. And presumably in the process of trying to calculate a negative log likelihood of a zero probability you get inf, and trying to calculate a differential between two points produces inf / inf = nan.
In this case, this is an 'edge case', and generally in computer science edge cases need to be spotted as exceptional circumstances and dealt with appropriately. The reality is that you can reasonably expect that inf isn't going to be your function's minimum! Therefore, whether you remove it from the calculations, or replace it by a very large number (whether arbitrarily or via machine precision) doesn't really make a difference.
So in practice you can do either of the two things suggested by others here, or even just detect such instances and skip them from the calculation. The practical result should be the same.
-Inf means negative infinity, which is the correct answer here: log(0) is minus infinity by definition.
The easiest thing to do is to check your intermediate results and if the number is below some threshold (like 1e-12) then just set it to that threshold. The answers won't be perfect but they will still be pretty close.
Using the following as the sigmoid function:
function g = sigmoid(z)
g = 1 ./ (1 + e.^-z);
end
Then the following code runs with no issues. Choose the threshold value in the max statement to be less than the expected noise in your measurements and you're good to go.
>> a = sigmoid([-100, 0, 100])
a =
3.7201e-44 5.0000e-01 1.0000e+00
>> b = 1-a
b =
1.00000 0.50000 0.00000
>> c = max(b, 1e-12)
c =
1.0000e+00 5.0000e-01 1.0000e-12
>> d = log(c)
d =
0.00000 -0.69315 -27.63102

Mathematica Integration taking too long

Using Mathematica I need to evaluate the integral of a function. Since it is taking the program too much to compute it, would it be possible to use parallel computation to shorten the time needed? If so, how can I do it?
I uploaded a picture of the integrand function:
I need to integrate it with respect to (x3, y3, x, y), all of them ranging over certain intervals (x3 and y3 from 0 to 1; x and y from 0 to 100). The parameters (a, b, c, ..., o) are preventing the NIntegrate function from working. Any suggestions?
If you evaluate this
expr=E^((-(x-y)^4-(x3-y3)^4)/10^4)*
(f x+e x^2+(m+n x)x3-f y-e y^2-(m+n y)y3)*
((378(x-y)^2(f x+e x^2+(m+n x)x3-f y-e y^2-(m+n y)y3))/
(Pi(1/40+Sqrt[((x-y)^2+(x3-y3)^2)^3]))+
(378(x-y)(x3-y3)(h x+g x^2+(o+p x)x3-h y-g y^2-(o+p y)y3))/
(Pi(1/40+Sqrt[((x-y)^2+(x3-y3)^2)^3])))+
(h x+g x^2+(o+p x)x3-h y-g y^2-(o +p y) y3)*
((378(x-y)(x3-y3)(f x+e x^2+(m+n x)x3-f y-e y^2-(m+n y)y3))/
(Pi(1/40+Sqrt[((x-y)^2+(x3-y3)^2)^3]))+
(378 (x3 - y3)^2 (h x + g x^2 + (o + p x)x3-h y-g y^2-(o+p y)y3))/
(Pi(1/40+Sqrt[((x-y)^2+(x3-y3)^2)^3])));
list = List @@ Expand[expr]
then you will get a list of 484 expressions, each very similar in form to this
(378*f*h*x^3*x3)/(Pi*(1/40+Sqrt[(x^2+x3^2-2*x*y+y^2-2*x3*y3+y3^2)^3]))
Notice that you can then use NIntegrate in this way
f*h*NIntegrate[(378*x^3*x3)/(Pi*(1/40+Sqrt[(x^2+x3^2-2*x*y+y^2-2*x3*y3+y3^2)^3])),
{x,0,100},{y,0,100},{x3,0,1},{y3,0,1}]
but it gives warnings and errors about the convergence and accuracy, almost certainly due to your fractional powers in the denominator.
If you can find a way to pull out the scalar multipliers that are independent of x, y, x3, y3, and can then perform the integration without warnings or errors and get an accurate result that isn't infinity, then you could perhaps perform these integrals in parallel and total the results.
Some of the integrands are scalar multiples of others; if you combine similar integrands you can reduce this down to 300 unique integrands, along the lines sketched below.
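A rough sketch of that grouping (my addition, untested against the full 484-term list; note it only gathers terms whose parameter-free parts are literally identical, so it can miss numeric multiples):
stripped[z_] := z //. {e -> 1, f -> 1, g -> 1, h -> 1, m -> 1, n -> 1, o -> 1, p -> 1}
(* gather terms sharing the same parameter-free integrand, then total each group *)
combined = Total /@ GatherBy[list, stripped];
Length[combined]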
I doubt this is going to lead to an acceptable solution for you.
Please check all this very carefully to make certain that no mistakes have been made.
EDIT
Since the variables that are independent of the integration appear to be easily separated from the dependent variables in the problem posed above, I think this will allow a parallel NIntegrate:
independentvars[z_] := (z/(z//.{e->1, f->1, g->1, h->1, m->1, n->1, o->1, p->1}))*
NIntegrate[(z//.{e->1, f->1, g->1, h->1, m->1, n->1, o->1, p->1}),
{x, 0, 100}, {y, 0, 100}, {x3, 0, 1}, {y3, 0, 1}]
Total[ParallelMap[independentvars, list]]
As I mentioned previously, the fractional powers in the denominator result in a flood of warnings and errors about convergence failing.
You can test this with the following much simpler example
expr = f x + f g x3 + o^2 x x3;
list = List @@ Expand[expr];
Total[ParallelMap[independentvars, list]]
which instantly returns
500000. f + 5000. f g + 250000. o^2
This is a very primitive method of pulling independent symbolic variables outside an NIntegrate. It gives absolutely no warning if one of the integrands is not in a form where this primitive attempt at extraction is appropriate, or if the extraction fails.
There may be a far better method that someone else has written out there somewhere. If someone could show a far better method of doing this then I would appreciate it.
It might be nice if Wolfram would consider incorporating something like this into NIntegrate itself.

What is the difference between Set ( = ) and SetDelayed ( := )?

This discussion came up in a previous question and I'm interested in knowing the difference between the two. Illustration with an example would be nice.
Basic Example
Here is an example from Leonid Shifrin's book Mathematica programming: an advanced introduction
It is an excellent resource for this kind of question. See: (1) (2)
ClearAll[a, b]
a = RandomInteger[{1, 10}];
b := RandomInteger[{1, 10}]
Table[a, {5}]
{4, 4, 4, 4, 4}
Table[b, {5}]
{10, 5, 2, 1, 3}
Complicated Example
The example above may give the impression that once a definition for a symbol is created using Set, its value is fixed, and does not change. This is not so.
f = ... assigns to f the expression as it evaluates at the time of assignment. If symbols remain in that evaluated expression and their values later change, so does the apparent value of f.
ClearAll[f, x]
f = 2 x;
f
2 x
x = 7;
f
14
x = 3;
f
6
It is useful to keep in mind how the rules are stored internally. For symbols assigned a value as symbol = expression, the rules are stored in OwnValues. Usually (but not always), OwnValues contains just one rule. In this particular case,
In[84]:= OwnValues[f]
Out[84]= {HoldPattern[f] :> 2 x}
The important part for us now is the r.h.s., which contains x as a symbol. What really matters for evaluation is this form - the way the rules are stored internally. As long as x did not have a value at the moment of assignment, both Set and SetDelayed produce (create) the same rule above in the global rule base, and that is all that matters. They are, therefore, equivalent in this context.
The end result is a symbol f that has function-like behavior, since its computed value depends on the current value of x. This is not a true function, however, since it has no parameters and merely reacts to changes of the symbol x. Generally, the use of such constructs should be discouraged, since implicit dependencies on global symbols (variables) are just as bad in Mathematica as they are in other languages - they make the code harder to understand and bugs subtler and easier to overlook. Somewhat related discussion can be found here.
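As a small illustration (my addition, not from the original answer), the hidden coupling disappears once the dependence is made explicit through a formal parameter:
ClearAll[f, x]
f[x_] := 2 x  (* the value depends only on the argument, not on a global x *)
f[7]
14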
Set used for functions
Set can be used for functions, and sometimes it needs to be. Let me give you an example. Here Mathematica symbolically evaluates the Sum and assigns the result to aF[x_], which is then used for the plot.
ClearAll[aF, x]
aF[x_] = Sum[x^n Fibonacci[n], {n, 1, \[Infinity]}];
DiscretePlot[aF[x], {x, 1, 50}]
If on the other hand you try to use SetDelayed then you pass each value to be plotted to the Sum function. Not only will this be much slower, but at least on Mathematica 7, it fails entirely.
ClearAll[aF, x]
aF[x_] := Sum[x^n Fibonacci[n], {n, 1, \[Infinity]}];
DiscretePlot[aF[x], {x, 1, 50}]
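(A common middle ground, added by me and not part of the original answer, is memoization: SetDelayed is used for the definition, but each computed value is cached with Set, so the Sum is evaluated only once per distinct argument.)
ClearAll[aF, x]
aF[x_] := aF[x] = Sum[x^n Fibonacci[n], {n, 1, \[Infinity]}]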
If one wants to make sure that possible global values for formal parameters (x here) do not interfere and are ignored during the process of defining a new function, an alternative to Clear is to wrap Block around the definition:
ClearAll[aF, x];
x = 1;
Block[{x}, aF[x_] = Sum[x^n Fibonacci[n], {n, 1, \[Infinity]}]];
A look at the function's definition confirms that we get what we wanted:
?aF
Global`aF
aF[x_]=-(x/(-1+x+x^2))
In[1]:= Attributes[Set]
Out[1]= {HoldFirst, Protected, SequenceHold}
In[2]:= Attributes[SetDelayed]
Out[2]= {HoldAll, Protected, SequenceHold}
As you can see by their attributes, both functions hold their first argument (the symbol to which you are assigning), but they differ in that SetDelayed also holds its second argument, while Set does not. This means that Set will evaluate the expression to the right of = at the time the assignment is made. SetDelayed does not evaluate the expression to the right of the := until the variable is actually used.
What's happening is more clear if the right hand side of the assignment has a side effect (e.g. Print[]):
In[3]:= x = (Print["right hand side of Set"]; 3)
x
x
x
During evaluation of In[3]:= right hand side of Set
Out[3]= 3
Out[4]= 3
Out[5]= 3
Out[6]= 3
In[7]:= x := (Print["right hand side of SetDelayed"]; 3)
x
x
x
During evaluation of In[7]:= right hand side of SetDelayed
Out[8]= 3
During evaluation of In[7]:= right hand side of SetDelayed
Out[9]= 3
During evaluation of In[7]:= right hand side of SetDelayed
Out[10]= 3
:= is for defining functions and = is for setting a value, basically.
That is, := is evaluated each time it is read; = is evaluated once, when it is set.
Think about:
x = 2
y = x
z := x
x = 4
Now, z is 4 if evaluated while y is still 2

Repeated application of functions

Reading this question got me thinking: For a given function f, how can we know that a loop of this form:
while (x > 2)
x = f(x)
will stop for any value x? Is there some simple criterion?
(The fact that f(x) < x for x > 2 doesn't seem to help since the series may converge).
Specifically, can we prove this for sqrt and for log?
For these functions, a proof that ceil(f(x)) < x for x > 2 would suffice: do one iteration to arrive at an integer, then proceed by simple induction.
For the general case, probably the best idea is to use well-founded induction to prove this property. However, as Moron pointed out in the comments, this could be impossible in the general case and the right ordering is, in many cases, quite hard to find.
Edit, in reply to Amnon's comment:
If you wanted to use well-founded induction, you would have to define another strict order that is well-founded. For the functions you mentioned this is not hard: you can take x << y if and only if ceil(x) < ceil(y), where << is a symbol for this new order. This order is of course well-founded on the numbers greater than 2, and both sqrt and log are decreasing with respect to it -- so you can apply well-founded induction.
Of course, in general case such an order is much more difficult to find. This is also related, in some way, to total correctness assertions in Hoare logic, where you need to guarantee similar obligations on each loop construct.
There's a general theorem for when then sequence of iterations will converge. (A convergent sequence may not stop in a finite number of steps, but it is getting closer to a target. You can get as close to the target as you like by going far enough out in the sequence.)
The sequence x, f(x), f(f(x)), ... will converge if f is a contraction mapping. That is, there exists a positive constant k < 1 such that for all x and y, |f(x) - f(y)| <= k |x-y|.
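For instance (my addition, not part of the original answer): on the interval [2, inf), sqrt is a contraction, since |sqrt(x) - sqrt(y)| = |x - y| / (sqrt(x) + sqrt(y)) <= |x - y| / (2 sqrt(2)), so k = 1/(2 sqrt(2)) < 1 works.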
(The fact that f(x) < x for x > 2 doesn't seem to help since the series may converge).
If we're talking about floats here, that's not true. If for all x > n f(x) is strictly less than x, it will reach n at some point (because there's only a limited number of floating point values between any two numbers).
Of course this means you need to prove that f(x) is actually less than x using floating point arithmetic (i.e. proving it is less than x mathematically does not suffice, because then f(x) = x may still be true with floats when the difference is not enough).
There is no general algorithm to determine whether a function f and a variable x will end or not in that loop. The Halting problem is reducible to that problem.
For sqrt and log we can safely decide it, because we happen to know the mathematical properties of those functions: iterated sqrt approaches 1, and iterated log eventually goes negative. So the loop condition x > 2 has to become false at some point.
Hope that helps.
In the general case, all that can be said is that the loop will terminate when it encounters an x_i <= 2. That doesn't mean that the sequence will converge, nor does it even mean that it stays bounded below 2. It only means that the sequence contains a value that is not greater than 2.
That said, any sequence containing a subsequence that converges to a value strictly less than two will (eventually) halt. That is the case for the sequence x_{i+1} = sqrt(x_i), since it converges to 1. In the case of y_{i+1} = log(y_i), the sequence will contain a value less than 2 before becoming undefined over the reals (it is well defined on the extended complex plane, C*, but I don't think it will in general converge, except at any stable points that may exist, i.e. where z = log(z)). Ultimately this means you need to perform some upfront analysis on the sequence to better understand its behavior.
The standard test for convergence of a sequence x_i to a point z is: given ε > 0, there is an n such that for all i > n, |x_i - z| < ε.
As an aside, consider the Mandelbrot set, M. The test for whether a particular point c in C is an element of M is whether the sequence z_{i+1} = z_i^2 + c (starting from z_0 = 0) remains bounded; c is ruled out as soon as some |z_i| > 2. For some points the sequence converges (such as c = 0), but for many it does not (such as c = -1, where it oscillates).
Sure. For all positive numbers x, the following inequality holds:
log(x) <= x - 1
(this is a pretty basic result from real analysis; it suffices to observe that the second derivative of log is negative for all positive x, so the function is concave down, and that x - 1 is tangent to it at x = 1). From this it follows essentially immediately that your while loop must terminate within the first ceil(x) - 2 steps, since each iteration decreases the value by at least 1 while it exceeds 2 -- though in actuality it terminates much, much faster than that.
A similar argument will establish your result for f(x) = sqrt(x); specifically, you can use the fact that:
sqrt(x) <= x/(2 sqrt(2)) + 1/sqrt(2)
for all positive x.
If you're asking whether this result holds for actual programs, instead of mathematically, the answer is a little bit more nuanced, but not much. Basically, many languages don't actually have hard accuracy requirements for the log function, so if your particular language implementation had an absolutely terrible math library this property might fail to hold. That said, it would need to be a really, really terrible library; this property will hold for any reasonable implementation of log.
I suggest reading this Wikipedia entry, which provides useful pointers. Without additional knowledge about f, nothing can be said.