I am not sure what the output return value is for when using fminunc.
>>options = optimset('GradObj','on','MaxIter','1');
>>initialTheta=zeros(2,1);
>>[optTheta, functionVal, exitFlag, output, grad, hessian] = fminunc(@CostFunc,initialTheta,options);
>> output
output =
scalar structure containing the fields:
iterations = 11
successful = 10
funcCount = 21
Even when I set the maximum number of iterations to 1, it still reports iterations = 11.
Could anyone please explain why this is happening?
Please also help me with the grad and hessian outputs, i.e. what they are used for.
Given we don't have the full code, I think the easiest thing for you to do to understand exactly what is happening is to just set a breakpoint in fminunc.m itself, and follow the logic of the code. This is one of the nice things about working with Octave, since the source code is provided and you can check it freely (there's often useful information in octave source code in fact, such as references to papers which they relied on for the implementation, etc).
From a quick look, it doesn't seem like fminunc expects a maxiter of 1. Have a look at line 211:
211 while (niter < maxiter && nfev < maxfev && ! info)
Since niter is initialised just before (at line 176) with the value of 1, in theory this loop will never be entered if your maxiter is 1, which defeats the whole point of the optimization.
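The effect of that guard can be checked with a toy model. This is only a sketch of the quoted condition `niter < maxiter`, not of fminunc itself: the function name `outer_loop_passes` is made up for illustration, the `nfev` and `info` conditions are ignored, and it assumes niter grows by one per pass.

```python
def outer_loop_passes(maxiter, niter_start=1):
    """Count how many times a loop guarded by `niter < maxiter` runs.

    Toy model only: ignores the nfev and info conditions of the real loop
    and assumes niter is incremented exactly once per pass.
    """
    niter = niter_start
    passes = 0
    while niter < maxiter:
        passes += 1
        niter += 1
    return passes


print(outer_loop_passes(1))  # 0: with niter starting at 1, the body never runs
print(outer_loop_passes(3))  # 2
```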
There are other interesting things happening in there too, e.g. the inner while loop starting at line 272:
272 while (! suc && niter <= maxiter && nfev < maxfev && ! info)
This uses "short-circuit evaluation" to first check whether the previous iteration was "unsuccessful", before checking whether the number of iterations has exceeded "maxiter".
In other words, if the previous iteration was successful, you don't get to run the inner loop at all, and you never get to increment niter.
What flags an iteration as "successful" seems to be defined by the ratio of "actual vs predicted reduction", as per the following (non-consecutive) lines:
286 actred = (fval - fval1) / (abs (fval1) + abs (fval));
...
295 prered = -t/(abs (fval) + abs (fval + t));
296 ratio = actred / prered;
...
321 if (ratio >= 1e-4)
322 ## Successful iteration.
...
326 nsuciter += 1;
...
328 endif
329
330 niter += 1;
In other words, it seems like fminunc counts iterations towards maxiter regardless of whether they were "successful" or "unsuccessful", with the exception that it does not like to "end" the algorithm on a "successful" turn (since the success condition is evaluated before the maxiter condition is checked).
Obviously this is an academic point, since you shouldn't even be entering this inner loop when you couldn't even make it past the outer loop in the first place.
I cannot really know exactly what is going on without knowing your specific code, but you should be able to follow easily if you run your code with a breakpoint at fminunc. The maths behind that implementation may be complex, but the code itself seems fairly simple and straightforward enough to follow.
Good luck!
I created a game that randomly picks a number, and the user has to guess that number. Normally, this would be easy, but I'm required to use functions to make it happen.
I have my code linked below. To explain:
get_num() function gives us a number (supposed to be 1 to 1000, but I have it 1 to 10 for troubleshooting)
ask_user() is an input that prompts the user to put in a number.
guess_check() is supposed to determine if your number is too high or too low.
num_guesses() is going to keep track of the number of times the user has guessed. But it's not done, you can ignore that for now.
I'm running PyCharm Community Edition 2021.3.2, for the record.
The problem: The program works fine, except it cannot tell if a number is too big or too small. When you keep guessing the same number, let's say "2", it will alternate between saying the number is too high and too low. Why? I have the if statements perfect. If you look at the screenshot, the correct number is 7, and I have the proper if statement. Yet, it still treats 2 as higher than 7. Why?
Here is a picture for proof:
Here is the code:
def main():
    answer = get_num()
    guess = ask_user()
    num_guesses = 0
    while guess != answer:
        check = guess_check(guess)
        if check == 2:
            print(f"Check is {check}. Your guess of {guess} is too high. Pick something lower. You're now on Guess {num_guesses}!\n")
            guess = ask_user()
        elif check == 1:
            print(f"Check is {check}. Your guess of {guess} is too low. Pick something higher. You're now on Guess {num_guesses}! The correct answer is {answer}\n")
            guess = ask_user()
    print(f"Check is {check}. Congratulations! Your guess of {guess} is correct! \n\nNumber of Guesses: {num_guesses}")
# This function actually gives us the number to guess
def get_num():
    # Randomly determines a number between 0 and 1000
    import random
    answer = random.randrange(1, 10)
    return answer
def ask_user():
    answer = int(input("Pick a number: "))
    return answer
def guess_check(guess):
    answer = get_num()
    if guess > answer:
        # Guess is too high, therefore value for guess_check() is 2
        check = 2
    elif guess < answer:
        # Guess is too low, therefore value for guess_check() is 1
        check = 1
    else:
        # Guess is right on, therefore value for guess_check() is 0
        check = 0
    return check
def num_guesses():
    number = 1
    guess = guess_check()
    #If Statement for whether guess is too high (2) or too low (1). If it's 0, then the number of guesses will not increase.'
    if guess == 1 or guess == 2:
        number += 1
    return number
main()
I've tried a few things to get around the problem:
I combined the ask_user() and guess_check() functions into 1 function. This did not make a difference.
Tried coding this exact same program without using functions. Ran just fine. The guess check part of the code ran without issues. So this tells me the functions are the reason this issue is happening.
Anyway, thanks so much for the help. You don't even know how much I appreciate this. I'm desperate.
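Judging only from the code shown, one likely cause is that guess_check() calls get_num() again, so every guess is compared against a freshly drawn number instead of the one stored in main(). Below is a minimal sketch of one way around that, passing the stored answer in as a second argument. The extra `answer` parameter, the `low`/`high` defaults and the `play()` helper (which takes a list of guesses instead of calling input()) are assumptions made for illustration, not part of the original program.

```python
import random


def get_num(low=1, high=10):
    """Draw the secret number once; randint includes both endpoints."""
    return random.randint(low, high)


def guess_check(guess, answer):
    """Return 2 if the guess is too high, 1 if too low, 0 if correct."""
    if guess > answer:
        return 2
    if guess < answer:
        return 1
    return 0


def play(answer, guesses):
    """Check each guess against the SAME answer; return how many were used.

    Returns None if none of the supplied guesses is correct.
    """
    for count, guess in enumerate(guesses, start=1):
        if guess_check(guess, answer) == 0:
            return count
    return None


print(guess_check(2, 7))     # 1: a guess of 2 is too low when the answer is 7
print(play(7, [2, 9, 7]))    # 3
```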
I'm having issues with indexing in my code. I'm trying to write Octave code for the power method (vector iteration), and the error 'x(4): out of bound 3' keeps popping up at line 6.
A=[6,-2,2,4;0,-4,2,2;0,0,2,-5;0,0,0,-3]
b=[12;10;-9;-3]
n=4
for i=rows(A):-1:1
  for j=i+1:rows(A)
    x(i)=[b(i)-A(i,j)*x(j)]/A(i,i); #error: 'x(4): out of bound 3'
  endfor
endfor
x
In the following line, note that you have x appearing twice; the first seeks to assign to it, but the second simply tries to access its value:
x(i) = [ b(i) - A(i,j) * x(j) ] / A(i,i);
⬑ assignment             ⬑ access
Assigning to an index that doesn't exist (yet) is absolutely fine; octave will simply fill in the intervening values with 'zeroes'. E.g.
>> clear x
>> x(3) = 1 % output: x = [0, 0, 1]
However, trying to access an index which doesn't exist yet is an error, since there's nothing there to access. This results in an "out of bound" error (and, in its error message, octave is kind enough to tell you what the last legitimate index is that you can access in that particular array).
Therefore, this is an error:
>> clear x
>> x(3) = 1 % output: x = [0, 0, 1]
>> 1 + x(4) % output: error: x(4): out of bound 3
Now, going back to your specific code, you are trying to access something that doesn't exist yet. The reason it doesn't exist yet is that your for loops are set up so that j takes a higher value than i at a particular step: you read x(j), which does not exist yet, in order to compute x(i), where i < j. This results in an out of bounds error (you are trying to access index j when you only have up to i available).
In your particular case, octave informs you that this happened when j was 4, and i was 3.
PS: I will echo @HansHirse's implied warning here, that you should always pay attention to your variables, and clear them appropriately in your scripts, especially if you plan to run it many times. Never use a variable that you haven't defined (or cleared) beforehand. Otherwise, x here may not be undefined when you run your script, say, a second time. This leads to all sorts of problems, e.g., your code works but for the wrong reasons, and then fails to work again when you run it the next day and x is now undefined. In this particular example, if you had an x in your workspace which had the right number of elements in it, your code would "work" but produce the wrong result, and you wouldn't know any better.
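For comparison, here is a minimal sketch of how the access-before-assignment problem can be avoided. It assumes the nested loop is meant to be back substitution on the upper-triangular system A x = b (the question calls it the power method, so that reading is an assumption), and the function name `back_substitute` is made up for illustration. The result vector is preallocated, and the whole sum over j is accumulated before the division, so every x[j] that is read has already been assigned.

```python
def back_substitute(A, b):
    """Solve A x = b for an upper-triangular A by back substitution."""
    n = len(b)
    x = [0.0] * n  # preallocate so that every index exists before it is read
    for i in range(n - 1, -1, -1):
        # x[j] for j > i was assigned in an earlier pass of this loop;
        # for the last row the range is empty and the sum is 0.
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x


A = [[6, -2, 2, 4],
     [0, -4, 2, 2],
     [0, 0, 2, -5],
     [0, 0, 0, -3]]
b = [12, 10, -9, -3]
print(back_substitute(A, b))  # [1.0, -3.0, -2.0, 1.0]
```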
I searched the forum and found this thread, "Two ways around -inf", but it does not cover my question.
From a Machine Learning class, week 3, I am getting -inf when using log(0), which later turns into a NaN. Because of the NaN, the sum formula gives no answer, so there is no scalar for J (a cost function which is the result of matrix math).
Here is a test of my function
>> sigmoid([-100;0;100])
ans =
3.7201e-44
5.0000e-01
1.0000e+00
This is as expected, but the hypothesis requires ans = 1-sigmoid
>> 1-ans
ans =
1.00000
0.50000
0.00000
and log(0) gives -Inf
>> log(ans)
ans =
0.00000
-0.69315
-Inf
-Inf rows do not add to the cost function, but the -Inf carries through to NaN, and I do not get a result. I cannot find any material on -Inf, but am thinking there is a problem with my sigmoid function.
Can you provide any direction?
The typical way to avoid infinity in these cases is to add eps to the operand:
log(ans + eps)
eps is a very, very small value, and won't affect the output for values of ans unless ans is zero:
>> z = [-100;0;100];
>> g = 1 ./ (1+exp(-z));
>> log(1-g + eps)
ans =
0.0000
-0.6931
-36.0437
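The same idea as a minimal Python sketch, using only the standard library. The helper names `sigmoid` and `safe_log` are made up for illustration; `sys.float_info.epsilon` is the double-precision machine epsilon (about 2.22e-16). Note that Python's math.log(0.0) raises a ValueError instead of returning -inf, so the guard matters there too.

```python
import math
import sys

EPS = sys.float_info.epsilon  # double-precision machine epsilon, about 2.22e-16


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def safe_log(p):
    """log(p + eps): finite even when p is exactly 0."""
    return math.log(p + EPS)


for z in (-100, 0, 100):
    print(z, safe_log(1.0 - sigmoid(z)))
```

For z = 100 the argument 1 - sigmoid(z) is exactly 0 in double precision, so the result is log(eps), roughly -36.04.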
Adding to the answers here, I really do hope you will provide some more context for your question (in particular, what you are actually trying to do).
I will go out on a limb and guess the context, just in case this is useful. You are probably doing machine learning, and trying to define a cost function based on the negative log likelihood of a model, and then trying to differentiate it to find the point where this cost is at its minimum.
In general for a reasonable model with a useful likelihood that adheres to Cromwell's rule, you shouldn't have these problems, but, in practice it happens. And presumably in the process of trying to calculate a negative log likelihood of a zero probability you get inf, and trying to calculate a differential between two points produces inf / inf = nan.
In this case, this is an 'edge case', and generally in computer science edge cases need to be spotted as exceptional circumstances and dealt with appropriately. The reality is that you can reasonably expect that inf isn't going to be your function's minimum! Therefore, whether you remove it from the calculations, or replace it by a very large number (whether arbitrarily or via machine precision) doesn't really make a difference.
So in practice you can do either of the two things suggested by others here, or even just detect such instances and skip them from the calculation. The practical result should be the same.
-Inf means negative infinity, which is the expected result here: log(x) tends to minus infinity as x approaches 0, and floating-point arithmetic returns -Inf for log(0).
The easiest thing to do is to check your intermediate results and if the number is below some threshold (like 1e-12) then just set it to that threshold. The answers won't be perfect but they will still be pretty close.
Using the following as the sigmoid function:
function g = sigmoid(z)
  g = 1 ./ (1 + e.^-z);
end
Then the following code runs with no issues. Choose the threshold value in the 'max' statement to be less than the expected noise in your measurements, and then you're good to go.
>> a = sigmoid([-100, 0, 100])
a =
3.7201e-44 5.0000e-01 1.0000e+00
>> b = 1-a
b =
1.00000 0.50000 0.00000
>> c = max(b, 1e-12)
c =
1.0000e+00 5.0000e-01 1.0000e-12
>> d = log(c)
d =
0.00000 -0.69315 -27.63102
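To show how the clamp keeps the whole cost finite, here is a hedged Python sketch. It assumes the usual logistic-regression cost J = mean(-y*log(h) - (1-y)*log(1-h)), which the question does not spell out, and the names `clamped_log` and `cost` are made up for illustration. Without the clamp, a row with y = 1 and h = 1 would contribute 0 * -Inf, which is NaN in floating-point arithmetic.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def clamped_log(p, floor=1e-12):
    """log of p, with p raised to `floor` first so the result stays finite."""
    return math.log(max(p, floor))


def cost(zs, ys, floor=1e-12):
    """Mean logistic cost with both log arguments clamped (assumed formula)."""
    total = 0.0
    for z, y in zip(zs, ys):
        h = sigmoid(z)
        total += -y * clamped_log(h, floor) - (1 - y) * clamped_log(1.0 - h, floor)
    return total / len(zs)


print(cost([-100, 0, 100], [0, 1, 1]))  # finite; only the z = 0 row contributes
print(cost([100], [0]))                 # about 27.63: a confidently wrong prediction
```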
In the following snippet (hardcoded values are just an example):
int nBulk = 30,      // bulk size
    nMax = 130,      // max records to retrieve
    nRetrieved = 0;  // records retrieved so far
do
{
    var response = GetRecords(nBulk);
    nRetrieved += response.Count;
    nBulk = nMax - nRetrieved >= nBulk ? nBulk : nMax - nRetrieved;
}
while (nRetrieved < response.Total && nRetrieved < nMax);
nBulk is assigned with a new value using a ternary expression.
Can the ternary expression be replaced with a simple arithmetic expression (i.e., no branches)?
Nothing indicates that nMax == response.Total, so I deduce that the last GetRecords request may effectively return fewer records than requested (nRetrieved <= nRequested).
If GetRecords is able to return just what is available, there is no need to modify nBulk inside the loop in the code you are showing.
However, if I saw such code, my deduction would be that modifying nBulk for the last read might be necessary because GetRecords has a side effect, like filling a buffer; in other words, that a guard is absolutely necessary in order to avoid a buffer overflow.
In this case, the branch would be the least of the problems: having a method/function that modifies state, but requires knowledge of the internal buffer limit (nMax) at the call site, is dangerous!
This is why I would tend to move the nMax guard inside the GetRecords method/function rather than outside. If you do so, the problem is resolved at the call site: there is no need to modify nBulk in the loop anymore, and no need to reason about nMax.
Of course, the test will be moved inside GetRecords and thus won't disappear. But frankly, why would that be a problem? Couldn't you increase the bulk size so that this test becomes absolutely negligible?
If you cannot for whatever reason, then:
1. Do a first read to get the Total remaining size to read.
2. Handle the case when nRemaining >= nMax by truncating to nMax outside the loop.
3. Loop while nRemaining >= nBulk, and decrement nRemaining by the number actually retrieved.
4. Perform a last GetRecords with a request for nRemaining.
There is still a test in the loop (nRemaining >= nBulk), so all this seems pretty useless.
To avoid this test, you must be sure that the number actually retrieved equals the requested nBulk whenever there is enough left to get (nothing in your code indicates that).
In this case, you can perform a division at step 3: nLoop = integer_floor_division(Total/nBulk). Run the loop that many times (and have a compiler unroll the loop for you in order to reduce the number of tests on average), then perform the GetRecords for the remainder (modulo) Total%nBulk at step 4.
All you will get, IMO, is code that is more fragile and brittle, with a risk of annihilating the marginal gains by filling up the code cache... The question you should ask is: do I really need this micro-optimization, and why?
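The division-based plan can be sketched as follows. This is a rough illustration in Python rather than C#, and it only computes the request sizes; it does not call GetRecords. The function name `plan_reads` is hypothetical, and the sketch assumes every read returns exactly the number of records requested, which is the precondition stated above. The remaining tests (the `min` and the check on the remainder) run once, outside the loop.

```python
def plan_reads(total, n_bulk, n_max):
    """Return the list of request sizes: full bulks, then one remainder read."""
    to_read = min(total, n_max)          # truncate to n_max once, outside the loop
    full, remainder = divmod(to_read, n_bulk)
    sizes = [n_bulk] * full              # `full` reads of exactly n_bulk records
    if remainder:
        sizes.append(remainder)          # one last, smaller read
    return sizes


print(plan_reads(total=500, n_bulk=30, n_max=130))  # [30, 30, 30, 30, 10]
```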
The answer is Yes, but it's pretty ugly:
nBulk = (nBulk + (nMax - nRetrieved) - Math.Abs(nBulk - (nMax - nRetrieved)))/2;
If you want to avoid Abs then you could do it this way:
nBulk = (nBulk + (nMax - nRetrieved)
- (int)Math.Sqrt((nBulk - (nMax - nRetrieved)) * (nBulk - (nMax - nRetrieved))))/2;
which is probably worse.
Long story short, there's no good "Language Agnostic" way to do this because there's no Language Agnostic way to convert from booleans to integers without branching. In those languages where you can just say:
int i = (true);
then it's a different story.
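Both ideas can be checked quickly in Python, where a bool is already an integer. The ternary in the question picks the smaller of nBulk and nMax - nRetrieved, and the identity used here is min(a, b) = (a + b - |a - b|) / 2. The function names are made up for illustration, and whether abs() or the comparison ends up branch-free at machine level depends on the language and compiler, so treat this as a sketch of the arithmetic only.

```python
def min_abs(a, b):
    """min(a, b) via the identity (a + b - |a - b|) / 2."""
    return (a + b - abs(a - b)) // 2  # the numerator is always even, so // is exact


def min_bool(a, b):
    """min(a, b) using a comparison result as the integer 0 or 1."""
    return b + (a - b) * (a < b)


n_bulk, n_max, n_retrieved = 30, 130, 120
print(min_abs(n_bulk, n_max - n_retrieved))   # 10
print(min_bool(n_bulk, n_max - n_retrieved))  # 10
```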
Is there a way to avoid creating an array in this Julia expression:
max((filter(n -> string(n) == reverse(string(n)), [x*y for x = 1:N, y = 1:N])))
and make it behave similar to this Python generator expression:
max(x*y for x in range(N+1) for y in range(x, N+1) if str(x*y) == str(x*y)[::-1])
The Julia version is 2.3 times slower than Python due to array allocation and N*N iterations vs. Python's N*N/2.
EDIT
After playing a bit with a few implementations in Julia, the fastest loop style version I've got is:
function f(N) # 320ms for N=1000 Julia 0.2.0 i686-w64-mingw32
    nMax = 0
    for x = 1:N, y = x:N
        n = x*y
        s = string(n)
        s == reverse(s) || continue
        nMax < n && (nMax = n)
    end
    nMax
end
but an improved functional version isn't far behind (only 14% slower or significantly faster, if you consider 2x larger domain):
function e(N) # 366ms for N=1000 Julia 0.2.0 i686-w64-mingw32
    isPalindrome(n) = string(n) == reverse(string(n))
    max(filter(isPalindrome, [x*y for x = 1:N, y = 1:N]))
end
There is an unexpected 2.6x performance improvement from defining the isPalindrome function, compared to the original version at the top of this page.
We have talked about allowing the syntax
max(f(x) for x in itr)
as a shorthand for producing each of the values f(x) in one coroutine while computing the max in another coroutine. This would basically be shorthand for something like this:
max(@task for x in itr; produce(f(x)); end)
Note, however, that this syntax that explicitly creates a task already works, although it is somewhat less pretty than the above. Your problem can be expressed like this:
max(@task for x=1:N, y=x:N
    string(x*y) == reverse(string(x*y)) && produce(x*y)
end)
With the hypothetical producer syntax above, it could be reduced to something like this:
max(x*y if string(x*y) == reverse(string(x*y)) for x=1:N, y=x:N)
While I'm a fan of functional style, in this case I would probably just use a for loop:
m = 0
for x = 1:N, y = x:N
    n = x*y
    string(n) == reverse(string(n)) || continue
    m < n && (m = n)
end
Personally, I don't find this version much harder to read and it will certainly be quite fast in Julia. In general, while functional style can be convenient and pretty, if your primary focus is on performance, then explicit for loops are your friend. Nevertheless, we should make sure that John's max/filter/product version works. The for loop version also makes other optimizations easier to add, like Harlan's suggestion of reversing the loop ordering and exiting on the first palindrome you find. There are also faster ways to check if a number is a palindrome in a given base than actually creating and comparing strings.
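As an illustration of the last point, a palindrome check can be done with integer arithmetic alone by reversing the digits. This is a sketch in Python rather than Julia, and the function name and the `base` parameter are assumptions for illustration; whether it is actually faster than the string comparison would need to be measured.

```python
def is_palindrome(n, base=10):
    """True if the digits of non-negative n in `base` read the same both ways."""
    if n < 0:
        return False
    original, reversed_n = n, 0
    while n > 0:
        n, digit = divmod(n, base)            # peel off the lowest digit
        reversed_n = reversed_n * base + digit  # and append it to the reversal
    return original == reversed_n


print(is_palindrome(9009))  # True
print(is_palindrome(9010))  # False
print(is_palindrome(5, 2))  # True: 5 is 101 in binary
```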
As to the general question of "getting flexible generators and list comprehensions in Julia", the language already has
A general high-performance iteration protocol based on the start/done/next functions.
Far more powerful multidimensional array comprehensions than most languages. At this point, the only missing feature is the if guard, which is complicated by the interaction with multidimensional comprehensions and the need to potentially dynamically grow the resulting array.
Coroutines (aka tasks) which allow, among other patterns, the producer-consumer pattern.
Python has the if guard but doesn't worry about comprehension performance nearly as much – if we're going to add that feature to Julia's comprehensions, we're going to do it in a way that's both fast and interacts well with multidimensional arrays, hence the delay.
Update: The max function is now called maximum (maximum is to max as sum is to +) and the generator syntax and/or filters work on master, so for example, you can do this:
julia> @time maximum(100x - x^2 for x = 1:100 if x % 3 == 0)
0.059185 seconds (31.16 k allocations: 1.307 MB)
2499
Once 0.5 is out, I'll update this answer more thoroughly.
There are two questions being mixed together here: (1) can you filter a list comprehension mid-comprehension (for which the answer is currently no) and (2) can you use a generator that doesn't allocate an array (for which the answer is partially yes). Generators are provided by the Iterators package, but the Iterators package seems to not play well with filter at the moment. In principle, the code below should work:
max((x, y) -> x * y,
    filter((x, y) -> string(x * y) == reverse(string(x * y)),
           product(1:N, 1:N)))
I don't think so. There aren't currently filters in Julia array comprehensions. See discussion in this issue.
In this particular case, I'd suggest just nested for loops if you want to get faster computation.
(There might be faster approaches where you start with N and count backwards, stopping as soon as you find something that succeeds. Figuring out how to do that correctly is left as an exercise, etc...)
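One possible shape for that countdown, sketched in Python rather than Julia: both loops run downwards, the inner loop stops as soon as the products can no longer beat the best palindrome found so far, and the outer loop stops once even x*N cannot beat it. The function name is made up for illustration, and this is one way to do it, not necessarily the approach the answer had in mind.

```python
def largest_palindrome_product(n):
    """Largest palindromic x*y with 1 <= x <= y <= n, or 0 if n < 1."""
    best = 0
    for x in range(n, 0, -1):
        if x * n <= best:
            break              # x*n is the largest product for this and any smaller x
        for y in range(n, x - 1, -1):
            p = x * y
            if p <= best:
                break          # products only shrink as y decreases
            s = str(p)
            if s == s[::-1]:
                best = p       # first hit for this x is the largest one for this x
                break
    return best


print(largest_palindrome_product(99))   # 9009
print(largest_palindrome_product(999))  # 906609
```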
As mentioned, this is now possible (using Julia 0.5.0)
isPalindrome(n::String) = n == reverse(n)
fun(N::Int) = maximum(x*y for x in 1:N for y in x:N if isPalindrome(string(x*y)))
I'm sure there are better ways that others can comment on. Time (after warm-up):
julia> @time fun(1000);
0.082785 seconds (2.03 M allocations: 108.109 MB, 27.35% gc time)