Octave use of iteration (start:step:end) - octave

Let's say I have an array A = [1 2 3 4 5 6 7 8 9 10]. I want to iterate through it and do something with each number.
A(start:step:end) -> since I want to iterate with step 1 I use A(1:10).
Question here is, how can I use that iteration? In C++ you would do
for (int i = 0; i < 10; i++)
{
//DO SOMETHING
}
I've spent 4 hours searching how to use that iteration. I have not found a single explanation to such a trivial thing: passing block of code to actually do something with numbers. Don't even know how to use current index (i.e., i in C++).
I have my function in Octave f = @(variable) (...), however, when I call f(A(1:10)) it does not pass each number to the function one at a time; it finishes the iteration first and then executes the function.
I'd expect something like
A(1:10) (DO SOMETHING WITH EACH NUMBER)
or in my example
A(1:10) ( f(INDEX) )
but that does not seem to work either.
I know Octave has a built-in for loop but in my case it is too slow.
That was simplified explanation, here is more advanced.
I want to multiply matrix A in such a way that one copy starts iterating at index 1 and the other at index 2 (e.g., A(1:end-1).*A(2:end)), and then use each product in my custom function.

An analogue of C++ loop
for (int i = 0; i < 10; i++)
{
//DO SOMETHING
}
would be
for i = 1:10
% DO SOMETHING
end
since Octave indices are 1-based.
There is no array.forEach(do something) construction in Octave.
Most of the time, a speed-up is achieved by passing arrays (vectors and matrices) to a function at once, and structuring the function itself so that it can handle an array. How to do the latter depends on what the function is.
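For readers coming from C++, here is a minimal sketch of what the vectorized Octave expression f(A(1:end-1).*A(2:end)) computes, written as an explicit loop. The function f below is a hypothetical stand-in for the asker's custom function, and apply_to_neighbour_products is a name made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the asker's custom function f.
double f(double x) { return 2.0 * x; }

// Equivalent of f(A(1:end-1) .* A(2:end)): multiply each element by its
// right-hand neighbour and apply f to every product.
std::vector<double> apply_to_neighbour_products(const std::vector<double>& a) {
    std::vector<double> out;
    if (a.size() < 2) return out;  // no neighbouring pairs to multiply
    out.reserve(a.size() - 1);
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        out.push_back(f(a[i] * a[i + 1]));
    }
    return out;
}
```

In Octave the whole loop collapses into one call, provided f is written with element-wise operators so that it accepts a vector.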

Related

F#: How to Call a function with Argument Byref Int

I have this code:
let sumfunc(n: int byref) =
    let mutable s = 0
    while n >= 1 do
        s <- n + (n-1)
        n <- n-1
        printfn "%i" s
sumfunc 6
I get the error:
(8,10): error FS0001: This expression was expected to have type
'byref<int>'
but here has type
'int'
So from that I can tell what the problem is, but I just don't know how to solve it. I guess I need to specify the number 6 to be a byref<int> somehow. I just don't know how. My main goal here is to make n, the function argument, mutable so I can change and use its value inside the function.
Good for you for being upfront about this being a school assignment, and for doing the work yourself instead of just asking a question that boils down to "Please do my homework for me". Because you were honest about it, I'm going to give you a more detailed answer than I would have otherwise.
First, that seems to be a very strange assignment. Using a while loop and just a single local variable is leading you down the path of re-using the n parameter, which is a very bad idea. As a general rule, a function should never modify values outside of itself — and that's what you're trying to do by using a byref parameter. Once you're experienced enough to know why byref is a bad idea most of the time, you're experienced enough to know why it might — MIGHT — be necessary some of the time. But let me show you why it's a bad idea, by using the code that s952163 wrote:
let sumfunc2 (n: int byref) =
    let mutable s = 0
    while n >= 1 do
        s <- n + (n - 1)
        n <- n-1
        printfn "%i" s
let t = ref 7
printfn "The value of t is %d" t.contents
sumfunc2 t
printfn "The value of t is %d" t.contents
This outputs:
The value of t is 7
13
11
9
7
5
3
1
The value of t is 0
Were you expecting that? Were you expecting the value of t to change just because you passed it to a function? You shouldn't. You really, REALLY shouldn't. Functions should, as far as possible, be "pure" -- a "pure" function, in programming terminology, is one that doesn't modify anything outside itself -- and therefore, if you run it twice with the same input, it should produce the same output every time.
I'll give you a way to solve this soon, but I'm going to post what I've written so far right now so that you see it.
UPDATE: Now, here's a better way to solve it. First, has your teacher covered recursion yet? If he hasn't, then here's a brief summary: functions can call themselves, and that's a very useful technique for solving all sorts of problems. If you're writing a recursive function, you need to add the rec keyword immediately after let, like so:
let rec sumExampleFromStackOverflow n =
    if n <= 0 then
        0
    else
        n + sumExampleFromStackOverflow (n-1)
let t = 7
printfn "The value of t is %d" t
printfn "The sum of 1 through t is %d" (sumExampleFromStackOverflow t)
printfn "The value of t is %d" t
Note how I didn't need to make t mutable this time. In fact, I could have just called sumExampleFromStackOverflow 7 and it would have worked.
Now, this doesn't use a while loop, so it might not be what your teacher is looking for. And I see that s952163 has just updated his answer with a different solution. But you should really get used to the idea of recursion as soon as you can, because breaking the problem down into individual steps using recursion is a really powerful technique for solving a lot of problems in F#. So even though this isn't the answer you're looking for right now, it is the answer you're going to be looking for soon.
P.S. If you use any of the help you've gotten here, tell your teacher that you've done so, and give him the URL of this question (http://stackoverflow.com/questions/39698430/f-how-to-call-a-function-with-argument-byref-int) so he can read what you asked and what other people told you. If he's a good teacher, he won't lower your grade for doing that; in fact, he might raise it for being honest and upfront about how you solved the problem. But if you got help with your homework and you don't tell your teacher, 1) that's dishonest, and 2) you'll only hurt yourself in the long run, because he'll think you understand a concept that you maybe haven't understood yet.
UPDATE 2: s952163 suggests that I show you how to use the fold and scan functions, and I thought "Why not?" Keep in mind that these are advanced techniques, so you probably won't get assignments where you need to use fold for a while. But fold is basically a way to take any list and do a calculation that turns the list into a single value, in a generic way. With fold, you specify three things: the list you want to work with, the starting value for your calculation, and a function of two parameters that will do one step of the calculation. For example, if you're trying to add up all the numbers from 1 to n, your "one step" function would be let add a b = a + b. (There's an even more advanced feature of F# that I'm skipping in this explanation, because you should learn just one thing at a time. By skipping it, it keeps the add function simple and easy to understand.)
The way you would use fold looks like this:
let sumWithFold n =
    let upToN = [1..n] // This is the list [1; 2; 3; ...; n]
    let add a b = a + b
    List.fold add 0 upToN
Note that I wrote List.fold. If upToN was an array, then I would have written Array.fold instead. The arguments to fold, whether it's List.fold or Array.fold, are, in order:
The function to do one step of your calculation
The initial value for your calculation
The list (if using List.fold) or array (if using Array.fold) that you want to do the calculation with.
Let me step you through what List.fold does. We'll pretend you've called your function with 4 as the value of n.
First step: the list is [1;2;3;4], and an internal valueSoFar variable inside List.fold is set to the initial value, which in our case is 0.
Next: the calculation function (in our case, add) is called with valueSoFar as the first parameter, and the first item of the list as the second parameter. So we call add 0 1 and get the result 1. The internal valueSoFar variable is updated to 1, and the rest of the list is [2;3;4]. Since that is not yet empty, List.fold will continue to run.
Next: the calculation function (add) is called with valueSoFar as the first parameter, and the first item of the remainder of the list as the second parameter. So we call add 1 2 and get the result 3. The internal valueSoFar variable is updated to 3, and the rest of the list is [3;4]. Since that is not yet empty, List.fold will continue to run.
Next: the calculation function (add) is called with valueSoFar as the first parameter, and the first item of the remainder of the list as the second parameter. So we call add 3 3 and get the result 6. The internal valueSoFar variable is updated to 6, and the rest of the list is [4] (that's a list with one item, the number 4). Since that is not yet empty, List.fold will continue to run.
Next: the calculation function (add) is called with valueSoFar as the first parameter, and the first item of the remainder of the list as the second parameter. So we call add 6 4 and get the result 10. The internal valueSoFar variable is updated to 10, and the rest of the list is [] (that's an empty list). Since the remainder of the list is now empty, List.fold will stop, and return the current value of valueSoFar as its final result.
So calling List.fold add 0 [1;2;3;4] will essentially return 0+1+2+3+4, or 10.
Now we'll talk about scan. The scan function is just like the fold function, except that instead of returning just the final value, it returns a list of the values produced at all the steps (including the initial value). (Or if you called Array.scan, it returns an array of the values produced at all the steps). In other words, if you call List.scan add 0 [1;2;3;4], it goes through the same steps as List.fold add 0 [1;2;3;4], but it builds up a result list as it does each step of the calculation, and returns [0;1;3;6;10]. (The initial value is the first item of the list, then each step of the calculation).
As I said, these are advanced functions, that your teacher won't be covering just yet. But I figured I'd whet your appetite for what F# can do. By using List.fold, you don't have to write a while loop, or a for loop, or even use recursion: all that is done for you! All you have to do is write a function that does one step of a calculation, and F# will do all the rest.
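If it helps to see the same two ideas outside F#, here is a rough C++ analogue (an illustration only, not part of the assignment). std::accumulate plays the role of List.fold; scan_add is a hand-written helper, a name invented here, that mimics List.scan by keeping every intermediate value with the initial value first.

```cpp
#include <numeric>
#include <vector>

// Fold: combine all items into one value, starting from the initial value 0.
int sum_with_fold(int n) {
    std::vector<int> up_to_n(n > 0 ? n : 0);
    std::iota(up_to_n.begin(), up_to_n.end(), 1);  // 1, 2, ..., n
    return std::accumulate(
        up_to_n.begin(), up_to_n.end(), 0,
        [](int value_so_far, int item) { return value_so_far + item; });
}

// Scan: like fold, but keep every intermediate value, initial value first.
std::vector<int> scan_add(int initial, const std::vector<int>& items) {
    std::vector<int> results{initial};
    int value_so_far = initial;
    for (int item : items) {
        value_so_far = value_so_far + item;  // one step of the calculation
        results.push_back(value_so_far);
    }
    return results;
}
```

Calling sum_with_fold(4) walks through the same steps as List.fold add 0 [1;2;3;4], and scan_add(0, {1, 2, 3, 4}) collects the values 0, 1, 3, 6, 10 along the way.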
This is such a bad idea:
let mutable n = 7
let sumfunc2 (n: int byref) =
    let mutable s = 0
    while n >= 1 do
        s <- n + (n - 1)
        n <- n-1
        printfn "%i" s
sumfunc2 (&n)
Totally agree with munn's comments, here's another way to implode:
let sumfunc3 (n: int) =
    let mutable s = n
    while s >= 1 do
        let n = s + (s - 1)
        s <- (s-1)
        printfn "%i" n
sumfunc3 7

Cuda Kernel with different array sizes

I am working on a fluid dynamics problem in CUDA and ran into the following problem.
If I have an array, e.g. debug_array with length 600, and an array
value_array with length 100, and I want to do something like
for(int i=0;i<6;i++)
{
debug_array[6*(bx*block_size+tx)+i] = value_array[bx*block_size+tx];
}
block_size would in this example be based on the 100 element array, e.g
4 blocks block_size 25
if value_array contains e.g 10;20;30;.....
I would expect debug_array to have groups of 6 similar values like
10;10;10;10;10;10;20;20;20;20;20;20;30......
The problem is that it is not picking up all values from the values array, any idea
why this isn't working or a good workaround.
What will work is if I define float val = value_array[bx*block_size+tx]; outside the for loop and keep this inside the loop debug_array[bx*block_size+tx+i] = val;
But I would like to avoid that as my kernels have between 5 and 10 device function inside the loop and it makes it just hard to read.
Thanks in advance, any advice will be appreciated.
Markus
There seems to be an error in computing the index:
Let's assume bx = 0 and tx = 0
The first 6 elements in debug_array will be filled with data.
Next thread, tx = 1: if the index is bx*block_size+tx+i, without the factor of 6, elements 1 to 6 will be filled with data (overwriting existing data).
Due to the threads working in parallel it is not determined which thread will be scheduled first and therefore which values will be written into the debug_array.
You should have written:
debug_array[6*(bx*block_size+tx)+i] = value_array[bx*block_size+tx];
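To see the overlap concretely, here is a small host-side C++ sketch (plain C++, not CUDA). The name fill_debug and the sequential loop over id are assumptions made for illustration; id stands for one thread's global index bx*block_size+tx. On a GPU the order of the overlapping writes is not fixed, so the sequential loop shows only one possible outcome of the unstrided version.

```cpp
#include <cstddef>
#include <vector>

// Host-side stand-in for the kernel: each value of id plays the role of
// one thread's global index bx*block_size+tx.
std::vector<float> fill_debug(const std::vector<float>& values, bool strided) {
    const std::size_t group = 6;
    std::vector<float> debug(values.size() * group, 0.0f);
    for (std::size_t id = 0; id < values.size(); ++id) {
        for (std::size_t i = 0; i < group; ++i) {
            // strided:   thread id owns slots 6*id .. 6*id+5 exclusively
            // unstrided: thread id writes slots id .. id+5, which overlap
            //            the slots written by its neighbours
            const std::size_t slot = strided ? group * id + i : id + i;
            debug[slot] = values[id];
        }
    }
    return debug;
}
```

With values 10, 20, 30 the strided version yields six 10s, six 20s and six 30s, while the unstrided version leaves most of the array untouched and keeps only one copy of the first values.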
If changing the code to move the value_array expression out of the loop and into a temp variable makes the code work - and that is the only code change you made - then this smells like a compiler bug.
Try changing your nvcc compiler options to reduce or disable optimizations and see if the value_array expression inside the loop changes behavior. Also, make sure you're using the latest CUDA tools.
Optimizing compilers will often attempt to move expressions that aren't dependent on the loop index variable out of the loop, exactly like your manual workaround. It's called "invariant code motion" and it makes loops faster by reducing the amount of code that executes in each iteration of the loop. If manually extracting the invariant code from the loop works, but letting the compiler figure it out on its own doesn't, that casts doubt on the compiler.
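As an illustration of what that transformation looks like, here is a host-side C++ sketch of the loop before and after hoisting. The function names are made up for this example, and id again stands for bx*block_size+tx; both versions are expected to produce identical results, which is exactly why a difference in behavior would be suspicious.

```cpp
#include <cstddef>
#include <vector>

// As written in the question: values[id] does not depend on i, yet the
// expression sits inside the loop body.
void fill_inline(std::vector<float>& debug, const std::vector<float>& values,
                 std::size_t id) {
    for (std::size_t i = 0; i < 6; ++i) {
        debug[6 * id + i] = values[id];
    }
}

// After hoisting the invariant expression by hand, as in the workaround.
void fill_hoisted(std::vector<float>& debug, const std::vector<float>& values,
                  std::size_t id) {
    const float val = values[id];  // evaluated once, outside the loop
    for (std::size_t i = 0; i < 6; ++i) {
        debug[6 * id + i] = val;
    }
}
```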

What is an effective way to filter numbers according to certain number-ranges using CUDA?

I have a lot of random floating point numbers residing in global GPU memory. I also have "buckets" that specify ranges of numbers they will accept and a capacity of numbers they will accept.
ie:
numbers: -2 0 2 4
buckets(size=1): [-2, 0], [1, 5]
I want to run a filtration process that yields me
filtered_nums: -2 2
(where filtered_nums can be a new block of memory)
But every approach I take runs into a huge overhead of trying to synchronize threads across bucket counters. If I try to use a single-thread, the algorithm completes successfully, but takes frighteningly long (over 100 times slower than generating the numbers in the first place).
What I am asking for is a general high-level, efficient, as-simple-as-possible approach algorithm that you would use to filter these numbers.
edit
I will be dealing with 10 buckets and half a million numbers. Where all the numbers fall into exactly 1 of the 10 bucket ranges. Each bucket will hold 43000 elements. (There are excess elements, since the objective is to fill every bucket, and many numbers will be discarded).
2nd edit
It's important to point out that the buckets do not have to be stored individually. The objective is just to discard elements that would not fit into a bucket.
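You can use thrust::copy_if, which keeps the elements for which the predicate is true (thrust::remove_copy_if does the opposite and drops them):
struct within_limit
{
__host__ __device__
bool operator()(const float x) const
{
return (x >= lo && x < hi); // lo and hi are placeholders for one bin's limits
}
};
thrust::copy_if(input, input + N, result, within_limit());
You will have to replace lo and hi with constants for each bin.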
I think you can templatize the kernel, but then again you will have to instantiate the template with actual constants. I can't see an easy way at it, but I may be missing something.
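One possible way around hard-coded constants is a functor that stores lo and hi as data members, so one predicate type serves every bin. Below is a minimal host-side sketch using the standard library's std::copy_if; filter_range is a hypothetical helper, and the interval is treated as half-open, [lo, hi), as in the predicate above. Thrust's copy_if accepts a functor of the same shape, though for device execution the operator would also need to be marked __host__ __device__.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Predicate that carries its bounds as data instead of hard-coded constants.
struct within_limit {
    float lo;
    float hi;
    bool operator()(float x) const { return x >= lo && x < hi; }
};

// Keep only the numbers that fall inside [lo, hi).
std::vector<float> filter_range(const std::vector<float>& input,
                                float lo, float hi) {
    std::vector<float> result;
    std::copy_if(input.begin(), input.end(), std::back_inserter(result),
                 within_limit{lo, hi});
    return result;
}
```

This sketch ignores the per-bucket capacity from the question; enforcing it would need an extra counting step.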
If you are willing to look at third party libraries, arrayfire may offer an easier solution.
array I = array(N, input, afDevice);
float **Res = (float **)malloc(sizeof(float *) * nbins);
for(int i = 0; i < nbins; i++) {
array res = where(I >= lo[i] && I < hi[i]);
Res[i] = res.device<float>();
}

input element 1 by 1 from other function w/o for loop

I have a function that calculates an array of numbers (randparam) that I want to input element by element into another function that does a simulation.
For example
function [randparam] = arraycode
code
randparam = results of code
% randparam is now a 1x1001 vector.
end
next I want to input randparam 1 by 1 into my simulation function
function simulation
x = constant + constant * randparam + constant
return
end
What makes this difficult for me is the return command in the simulation function: it only calculates one step of the equation x above, returns the result to another function, call it integrator, and then the integrator function calls the simulation function again to calculate x.
so the integrator function might look like
function integrator (x)
y = simulation(x) * 5
u = y+10
yy = simulation(x) * 10 + u
end
As you can see, the integrator function calls the simulation function twice, which creates two problems for me:
1. If I create a for loop in the simulation function where I input element by element using something like:
for i = 1:100
x = constant + constant * randparam(i) + constant
return
end
then every time my integrator function calls my simulation function again, my for loop starts at 1 all over again.
2. If I somehow kept i in my base workspace so that the for loop in my simulation function would know to step up from 1, then my y and yy calculations would have different x inputs, because as soon as simulation is called the second time for yy, i would already be i+1 thanks to the call for y.
Is there a way to avoid for loops in this scenario? One potential solution to problem number two is to duplicate the script but with a different name, and have my for loop use a different variable, but that seems rather inefficient.
Hope I made this clear.
thanks.
First, if you generically want to apply the same function to each element of an array and there isn't already a built in vectorized way to do it, you could use arrayfun (although often a simple for loop is faster and more readable):
%# randparam is a 1x1001 vector.
%#next I want to input randparam 1 by 1 into my simulation function
function simulation
x = constant + constant * randparam + constant
return
end
(Note: ask yourself what this function can possibly be doing, since it isn't returning a value and MATLAB doesn't pass by reference.) This is what arrayfun is for: applying a function to each element of an array (or vector, in this case). Again, you should make sure in your case that it makes sense to do this, rather than an explicit loop.
function x = simulation(input_val)
% your stuff, assigning the result to x
end
sim_results = arrayfun(@simulation, randparam);
Of course, the way you've written it, the line
x = constant + constant*randparam + constant;
can (and will) be done vectorized - if you give it a vector or matrix, a vector or matrix will be the result.
Second it seems that you're not clear on the "scope" of function variables in MATLAB. If you call a function, a clean workspace is created. So x from one function isn't automatically available within another function you call. Variables also go out of scope at the end of a function, so using x within a function doesn't change/overwrite a variable x that exists outside that function. And multiple invocations of a function each have their own workspace.
What's wrong with a loop at the integrator level?
function integrator (x)
for i=1:length(x)
y = simulation(x(i)) * 5
u = y+10
yy = simulation(x(i)) * 10 + u
end
And pass your entire randparam into integrator? It's not clear from your question whether you want simulation to return the same value when given the same input, or whether you want it to step twice with the same input, or whether you want a fresh input on every call. It is also not clear if simulation keeps any internal state. The way you've written the example, simulation depends only on the input value, not on any previous inputs or outputs, which would make it trivial to vectorize. If we're all missing the boat, please edit your question with more refined example code.
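Here is the same idea as a short C++ sketch, in case a second notation helps. It assumes simulation is stateless, as the example in the question suggests; the constants a, b and c are placeholders, and collecting yy into a vector is a choice made here for illustration.

```cpp
#include <vector>

// Hypothetical simulation step: depends only on its input, keeps no state.
double simulation(double randparam_value) {
    const double a = 1.0, b = 2.0, c = 3.0;  // placeholder constants
    return a + b * randparam_value + c;
}

// The loop lives in the integrator, so both calls to simulation for one
// element see the same input and no index has to be stored anywhere else.
std::vector<double> integrator(const std::vector<double>& randparam) {
    std::vector<double> yy_all;
    yy_all.reserve(randparam.size());
    for (double x : randparam) {
        const double y = simulation(x) * 5;
        const double u = y + 10;
        const double yy = simulation(x) * 10 + u;
        yy_all.push_back(yy);
    }
    return yy_all;
}
```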

Loop termination conditions

These for-loops are among the first basic examples of formal correctness proofs of algorithms. They have different but equivalent termination conditions:
1 for ( int i = 0; i != N; ++i )
2 for ( int i = 0; i < N; ++i )
The difference becomes clear in the postconditions:
The first one gives the strong guarantee that i == N after the loop terminates.
The second one only gives the weak guarantee that i >= N after the loop terminates, but you will be tempted to assume that i == N.
If for any reason the increment ++i is ever changed to something like i += 2, or if i gets modified inside the loop, or if N is negative, the program can fail:
The first one may get stuck in an infinite loop. It fails early, in the loop that has the error. Debugging is easy.
The second loop will terminate, and at some later time the program may fail because of your incorrect assumption of i == N. It can fail far away from the loop that caused the bug, making it hard to trace back. Or it can silently continue doing something unexpected, which is even worse.
Which termination condition do you prefer, and why? Are there other considerations? Why do many programmers who know this, refuse to apply it?
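The failure modes described above can be observed with a small C++ sketch. The loops are rewritten as while loops with a step budget (max_steps, an assumption added here) so that a non-terminating condition is reported instead of hanging; LoopResult, run_not_equal and run_less_than are names invented for this illustration.

```cpp
// Final counter value and whether the loop condition ever became false.
struct LoopResult {
    int final_i;
    bool terminated;
};

// Variant 1: for (int i = 0; i != n; i += step)
LoopResult run_not_equal(int n, int step, int max_steps) {
    int i = 0;
    for (int steps = 0; i != n; ++steps) {
        if (steps == max_steps) return {i, false};  // gave up: still looping
        i += step;
    }
    return {i, true};
}

// Variant 2: for (int i = 0; i < n; i += step)
LoopResult run_less_than(int n, int step, int max_steps) {
    int i = 0;
    for (int steps = 0; i < n; ++steps) {
        if (steps == max_steps) return {i, false};  // gave up: still looping
        i += step;
    }
    return {i, true};
}
```

With n = 5 and step = 2, the != variant exhausts its budget, while the < variant stops with i equal to 6 rather than 5; with a negative n the < variant stops immediately at 0.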
I tend to use the second form, simply because then I can be more sure that the loop will terminate. I.e. it's harder to introduce a non-termination bug by altering i inside the loop.
Of course, it also has the slightly lazy advantage of being one less character to type ;)
I would also argue, that in a language with sensible scope rules, as i is declared inside the loop construct, it shouldn't be available outside the loop. This would mitigate any reliance on i being equal to N at the end of the loop...
We shouldn't look at the counter in isolation - if for any reason someone changed the way the counter is incremented they would change the termination conditions and the resulting logic if it's required for i==N.
I would prefer the second condition since it's more standard and will not result in an endless loop.
In C++, using the != test is preferred for generality. Iterators in C++ have various concepts, like input iterator, forward iterator, bidirectional iterator, random access iterator, each of which extends the previous one with new capabilities. For < to work, random access iterator is required, whereas != merely requires input iterator.
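A short sketch of that point: the generic function below (sum_all is a made-up name) compiles for both std::vector and std::list because it only uses != and ++ on the iterators. Writing it < c.end() instead would still compile for std::vector but not for std::list, whose iterators are bidirectional and have no operator<.

```cpp
#include <list>
#include <vector>

// Works with any container of int whose iterators support != and ++.
template <typename Container>
int sum_all(const Container& c) {
    int total = 0;
    for (auto it = c.begin(); it != c.end(); ++it) {
        total += *it;
    }
    return total;
}
```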
If you trust your code, you can do either.
If you want your code to be readable and easily understood (and thus more tolerant to change from someone who you've got to assume to be a klutz), I'd use something like;
for ( int i = 0 ; i >= 0 && i < N ; ++i)
I always use #2 as then you can be sure the loop will terminate... Relying on it being equal to N afterwards is relying on a side effect... Wouldn't you just be better using the variable N itself?
[edit] Sorry...I meant #2
I think most programmers use the 2nd one, because it helps figure out what goes on inside the loop. I can look at it, and "know" that i will start as 0, and will definitely be less than N.
The 1st variant doesn't have this quality. I can look at it, and all I know is that i will start as 0 and that it won't ever be equal to N. Not quite as helpful.
Irrespective of how you terminate the loop, it is always good to be very wary of using a loop control variable outside the loop. In your examples you (correctly) declare i inside the loop, so it is not in scope outside the loop and the question of its value is moot...
Of course, the 2nd variant also has the advantage that it's what all of the C references I have seen use :-)
In general I would prefer
for ( int i = 0; i < N; ++i )
The punishment for a buggy program in production seems a lot less severe: you will not have a thread stuck forever in a for loop, a situation that can be very risky and very hard to diagnose.
Also, in general I like to avoid these kind of loops in favour of the more readable foreach style loops.
I prefer to use #2, only because I try not to extend the meaning of i outside of the for loop. If I were tracking a variable like that, I would create an additional test. Some may say this is redundant or inefficient, but it reminds the reader of my intent: At this point, i must equal N
@timyates - I agree one shouldn't rely on side-effects
I think you stated very well the difference between the two. I do have the following comments, though:
This is not "language-agnostic". I can see your examples are in C++, but there are languages where you are not allowed to modify the loop variable inside the loop and others that don't guarantee that the value of the index is usable after the loop (and some do both).
You have declared the i index within the for, so I would not bet on the value of i after the loop.
The examples are a little bit misleading as they implicitly assume that for is a definite loop. In reality it is just a more convenient way of writing:
// version 1
{ int i = 0;
while (i != N) {
...
++i;
}
}
Note how i is undefined after the block.
A programmer who knew all of the above would not make general assumptions about the value of i, and would be wise enough to choose i<N as the ending condition, to ensure that the exit condition will eventually be met.
Using either of the above in C# would cause a compiler error if you used i outside the loop
I prefer this sometimes:
for (int i = 0; (i <= (n-1)); i++) { ... }
This version shows directly the range of values that i can have. My take on checking lower and upper bound of the range is that if you really need this, your code has too many side effects and needs to be rewritten.
The other version:
for (int i = 1; (i <= n); i++) { ... }
helps you determine how often the loop body is called. This also has valid use cases.
For general programming work I prefer
for ( int i = 0; i < N; ++i )
to
for ( int i = 0; i != N; ++i )
Because it is less error prone, especially when code gets refactored. I have seen this kind of code turned into an infinite loop by accident.
I don't believe the argument that "you will be tempted to assume that i == N" is true. I have never made that assumption or seen another programmer make it.
From my standpoint of formal verification and automatic termination analysis, I strongly prefer #2 (<). It is quite easy to track that some variable is increased (before var = x, after var = x+n for some non-negative number n). However, it is not that easy to see that i==N eventually holds. For this, one needs to infer that i is increased by exactly 1 in each step, which (in more complicated examples) might be lost due to abstraction.
If you think about the loop which increments by two (i = i + 2), this general idea becomes more understandable. To guarantee termination one now needs to know that i%2 == N%2, whereas this is irrelevant when using < as the condition.