Why can't one element of a SubString Vector be tested in an if condition (Julia)?

I want to create a function that first filters one element out of a DataFrame in Julia and then tests whether that element is "missing". If it is, the function should return the value 0.0. My issue is that the "if" condition does not work and I don't know why. If the element is a String, the condition works; however, after filtering, the element is a 1-element Vector{SubString{String}}, and the condition does not work. I would like to know why, and whether it is possible to turn the vector element into a string object.
Note: "isequal", '==', '===' do not work either.
For example:
using CSV, DataFrames
example_ped = DataFrame(animal = collect(1:1:11),
                        sire = [fill(0,5); fill(4,3); fill(5,3)],
                        dam = [fill(0,4); fill(2,4); fill(3,3)])
CSV.write("ped_example.txt",example_ped, header=true,delim='\t')
pedi = CSV.read("ped_example.txt",delim = '\t', header=true, missingstrings=["0"], DataFrame)
pedi[!,1]=strip.(string.(pedi[!,1]))
pedi[!,2]=strip.(string.(pedi[!,2]))
pedi[!,3]=strip.(string.(pedi[!,3]))
Part of the function
function computAddRel!(ped,animal_1,animal_2)
    elder,recent = animal_1 < animal_2 ? (animal_1,animal_2) : (animal_2,animal_1)
    sireOfrecent = ped.sire[ped.animal.==recent]
    damOfrecent = ped[ped.animal.==recent,"dam"]
    if elder==recent
        f_inbreed = (sireOfrecent=="missing" || damOfrecent=="missing") ? 0.0 : 0.5*computAddRel!(ped,sireOfrecent,damOfrecent)
        adiv = 1.0 + f_inbreed
        return adiv
    end
end
If animal_1 and animal_2 are both equal to 5:
julia> sireOfrecent = pedi.sire[pedi.animal.==recent]
1-element Vector{Union{Missing, Int64}}:
missing
However, the condition evaluates to false:
julia> sireOfrecent=="missing"
false
julia> isequal(sireOfrecent,"missing")
false
Thanks in advance for your time.

You should write:
ismissing(only(sireOfrecent))
The meaning of this:
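only checks that you picked exactly one row: if not, you get an error, because the selection is ambiguous; if so, it extracts the element from the array.
ismissing is the function you should use to check whether a value is missing. In the session shown, sireOfrecent is a one-element vector holding missing, so comparing it with the string "missing" is false.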
Here are some examples:
julia> x = [missing]
1-element Vector{Missing}:
missing
julia> only(x)
missing
julia> ismissing(only(x))
true
julia> only([1, 2])
ERROR: ArgumentError: Collection has multiple elements, must contain exactly 1 element
julia> ismissing(only([1]))
false


How to define a default vector/matrix for a function's input

I would like to create a function in which one of the inputs is a matrix. But I also want the function to have a default input. For example, please see the following simple "test" function with an input "x":
def test(x=None):
    if x==None:
        y = np.array([[123], [123]])
    else:
        y = x
    return y
In this way, let's say I want to see the function without providing input:
print(test())
Would give:
[[123]
[123]]
However, if I want "x" to be a matrix or vector (like the following script):
z = np.array([[12], [12]])
print(test(z))
I got an error saying:
**"The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()".**
Fine, I want to comply with the error warning. Then I changed the function to:
def test(x=None):
    if x.all()==None:
        y = np.array([[123], [123]])
    else:
        y = x
    return y
With the matrix z as input, this returns (as expected):
print(test(z))
[[12]
 [12]]
However, with the revised script, if I want x to be None again:
print(test())
gives a new error:
**'NoneType' object has no attribute 'all'**
How can I solve this? I want the function to work either with x being a pre-defined matrix or not (a default).
You can use a default parameter; if no argument is provided, the default is used:
def test(x=np.array([[123], [123]])):
    return x
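An alternative sketch, not taken from the answer above, keeps `None` as the default and tests it with `is None`. The identity test never compares array elements, so it should avoid the ambiguous-truth-value error. It also builds a fresh default array on every call, instead of sharing the single array object that a default written in the signature would create.

```python
import numpy as np

def test(x=None):
    # `is None` checks identity, so it works whether x is None or an array
    if x is None:
        return np.array([[123], [123]])
    return x

print(test())                         # uses the default matrix
print(test(np.array([[12], [12]])))   # returns the matrix that was passed in
```

Because the check happens inside the function, the caller can pass a vector, a matrix, or nothing at all.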

OCTAVE: Checking existence of an element of a cell array

I am using Octave 4.0.0.
I define A{1, 1} = 'qwe', but when I check existence of A{1, 1}, as in
exist("A{1,1}")
or
exist("A{1,1}", "var")
it returns 0.
How can I check its existence?
To check whether an array has an element at, say, (3, 5), you need to verify that the array has at least 3 rows and 5 columns:
all(size(A) >= [3, 5])
You can of course first check that the variable A exists at all, and that it is a cell array. A complete solution might be something like
function b = is_element(name, varargin)
  b = false;
  % the variable must exist in the caller's workspace
  if ~evalin('caller', ['exist("' name '")'])
    return;
  end
  % ... and it must be a cell array
  if ~strcmp(evalin('caller', ['class(' name ')']), 'cell')
    return;
  end
  % ... with as many dimensions as indices were supplied
  if evalin('caller', ['ndims(' name ')']) ~= nargin - 1
    return;
  end
  b = all(evalin('caller', ['size(' name ')']) >= cell2mat(varargin));
endfunction
This function accepts a variable name and the multi-dimensional index you are interested in. It returns 1 if the object exists as a cell array of sufficient dimensionality and size to contain the requested element.

Better way than using `Task/produce/consume` for lazy collections expressed as coroutines

It is very convenient to use Tasks to express a lazy collection / a generator.
E.g.:
function fib()
    Task() do
        prev_prev = 0
        prev = 1
        produce(prev)
        while true
            cur = prev_prev + prev
            produce(cur)
            prev_prev = prev
            prev = cur
        end
    end
end
collect(take(fib(), 10))
Output:
10-element Array{Int64,1}:
1
1
2
3
5
8
13
21
34
55
However, they do not follow good iterator conventions at all.
They are as badly behaved as they can be.
They do not use the returned state:
start(fib()) == nothing #It has no state
So they are instead mutating the iterator object itself.
A proper iterator uses its state, rather than ever mutating itself, so that multiple callers can iterate it at once,
creating that state with start, and advancing it during next.
Debatably, that state should be immutable, with next returning a new state, so that it can be trivially teed. (On the other hand, that means allocating new memory, though on the stack.)
Furthermore, the hidden state is not advanced during next.
The following does not work:
@show ff = fib()
@show state = start(ff)
@show next(ff, state)
Output:
ff = fib() = Task (runnable) #0x00007fa544c12230
state = start(ff) = nothing
next(ff,state) = (nothing,nothing)
Instead the hidden state is advanced during done:
The following works:
@show ff = fib()
@show state = start(ff)
@show done(ff,state)
@show next(ff, state)
Output:
ff = fib() = Task (runnable) #0x00007fa544c12230
state = start(ff) = nothing
done(ff,state) = false
next(ff,state) = (1,nothing)
Advancing state during done isn't the worst thing in the world.
After all, it is often the case that it is hard to know when you are done, without going to try and find the next state. One would hope done would always be called before next.
Still it is not great, since the following happens:
ff = fib()
state = start(ff)
done(ff,state)
done(ff,state)
done(ff,state)
done(ff,state)
done(ff,state)
done(ff,state)
@show next(ff, state)
Output:
next(ff,state) = (8,nothing)
Which is really not what you expect. It is reasonable to assume that done is safe to call multiple times.
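To make the mechanism concrete, here is a rough Python model of the behaviour observed above. `TaskLike` and its method names are hypothetical stand-ins, and the assumption that `done` consumes a value is inferred from the outputs shown, not from Julia's source.

```python
def fib():
    """Generator standing in for the Task-based fib() above."""
    prev_prev, prev = 0, 1
    yield prev
    while True:
        cur = prev_prev + prev
        yield cur
        prev_prev, prev = prev, cur

class TaskLike:
    """Hypothetical wrapper mimicking how start/done/next treat a Task."""
    def __init__(self, gen):
        self.gen = gen
        self.result = None
        self.finished = False

    def start(self):
        return None  # no real state is handed back to the caller

    def done(self, state):
        # Assumed behaviour: done consumes one value as a side effect.
        try:
            self.result = next(self.gen)
        except StopIteration:
            self.finished = True
        return self.finished

    def next(self, state):
        # next only reports whatever done stored last
        return self.result, None

ff = TaskLike(fib())
state = ff.start()
for _ in range(6):
    ff.done(state)      # each call silently advances the hidden state
print(ff.next(state))   # (8, None)
```

Calling `next` on a fresh `TaskLike` without a preceding `done` gives `(None, None)`, which mirrors the `(nothing,nothing)` result earlier.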
Basically, Tasks make poor iterators. In many cases they are not compatible with other code that expects an iterator. (In many cases they are, but it is hard to tell which is which.)
This is because Tasks are not really meant to be used as iterators in these "generator" functions. They are intended for low-level control flow, and are optimized as such.
So what is the better way?
Writing an iterator for fib isn't too bad:
immutable Fib end
immutable FibState
    prev::Int
    prevprev::Int
end
Base.start(::Fib) = FibState(0,1)
Base.done(::Fib, ::FibState) = false
function Base.next(::Fib, s::FibState)
    cur = s.prev + s.prevprev
    ns = FibState(cur, s.prev)
    cur, ns
end
Base.iteratoreltype(::Type{Fib}) = Base.HasEltype()
Base.eltype(::Type{Fib}) = Int
Base.iteratorsize(::Type{Fib}) = Base.IsInfinite()
But it is a bit less intuitive.
For more complex functions, it is much less nice.
So my question is:
What is a better way to have something that works as a Task does, as a way to build up an iterator from a single function, but that is well behaved?
I would not be surprised if someone has already written a package with a macro to solve this.
The current iterator interface for Tasks is fairly simple:
# in share/julia/base/task.jl
start(t::Task) = nothing
function done(t::Task, val)
    t.result = consume(t)
    istaskdone(t)
end
next(t::Task, val) = (t.result, nothing)
Not sure why the devs chose to put the consumption step in the done function rather than the next function. This is what is producing your weird side-effect. To me it sounds much more straightforward to implement the interface like this:
import Base.start; function Base.start(t::Task) return t end
import Base.next; function Base.next(t::Task, s::Task) return consume(s), s end
import Base.done; function Base.done(t::Task, s::Task) istaskdone(s) end
Therefore, this is what I would propose as the answer to your question.
I think this simpler implementation is a lot more meaningful, fulfils your criteria above, and even has the desired outcome of outputting a meaningful state: the Task itself! (which you're allowed to "inspect" if you really want to, as long as that doesn't involve consumption :p ).
However, there are certain caveats:
Caveat 1: The task is REQUIRED to have a return value, signifying the final element in the iteration, otherwise "unexpected" behaviour might occur.
I'm assuming the devs chose the first approach to avoid exactly this kind of "unintended" output; however I believe this should have actually been the expected behaviour! A task expected to be used as an iterator should be expected to define an appropriate iteration endpoint (by means of a clear return value) by design!
Example 1: The wrong way to do it
julia> t = Task() do; for i in 1:10; produce(i); end; end;
julia> collect(t) |> show
Any[1,2,3,4,5,6,7,8,9,10,nothing] # last item is a return value of nothing
# corresponding to the "return value" of the
# for loop statement, which is 'nothing'.
# Presumably not the intended output!
Example 2: Another wrong way to do it
julia> t = Task() do; produce(1); produce(2); produce(3); produce(4); end;
julia> collect(t) |> show
Any[1,2,3,4,()] # last item is the return value of the produce statement,
# which returns any items passed to it by the last
# 'consume' call; in this case an empty tuple.
# Presumably not the intended output!
Example 3: The (in my humble opinion) right way to do it!
julia> t = Task() do; produce(1); produce(2); produce(3); return 4; end;
julia> collect(t) |> show
[1,2,3,4] # An appropriate return value ending the Task function ensures an
# appropriate final value for the iteration, as intended.
Caveat 2: The task should not be modified / consumed further inside the iteration (a common requirement with iterators in general), except in the understanding that this intentionally causes a 'skip' in the iteration (which would be a hack at best, and presumably not advisable).
Example:
julia> t = Task() do; produce(1); produce(2); produce(3); return 4; end;
julia> for i in t; show(consume(t)); end
24
More Subtle example:
julia> t = Task() do; produce(1); produce(2); produce(3); return 4; end;
julia> for i in t # collecting i is a consumption event
for j in t # collecting j is *also* a consumption event
show(j)
end
end # at the end of this loop, i = 1, and j = 4
234
Caveat 3: With this scheme it is expected behaviour that you can 'continue where you left off'. e.g.
julia> t = Task() do; produce(1); produce(2); produce(3); return 4; end;
julia> take(t, 2) |> collect |> show
[1,2]
julia> take(t, 2) |> collect |> show
[3,4]
However, if one would prefer the iterator to always start from the pre-consumption state of a task, the start function could be modified to achieve this:
import Base.start; function Base.start(t::Task) return Task(t.code) end;
import Base.next; function Base.next(t::Task, s::Task) consume(s), s end;
import Base.done; function Base.done(t::Task, s::Task) istaskdone(s) end;
julia> for i in t
for j in t
show(j)
end
end # at the end of this loop, i = 4, and j = 4 independently
1234123412341234
Interestingly, note how this variant would affect the 'inner consumption' scenario from 'caveat 2':
julia> t = Task() do; produce(1); produce(2); produce(3); return 4; end;
julia> for i in t; show(consume(t)); end
1234
julia> for i in t; show(consume(t)); end
4444
See if you can spot why this makes sense! :)
Having said all this, there is a philosophical point about whether the way a Task behaves with the start, next, and done functions matters at all, in that these functions are considered "an informal interface": i.e. they are supposed to be "under the hood" functions, not intended to be called manually.
Therefore, as long as they do their job and return the expected iteration values, you shouldn't care too much about how they do it under the hood, even if technically they don't quite follow the 'spec' while doing so, since you were never supposed to be calling them manually in the first place.
How about the following (uses fib defined in OP):
type NewTask
t::Task
end
import Base: start,done,next,iteratorsize,iteratoreltype
start(t::NewTask) = istaskdone(t.t)?nothing:consume(t.t)
next(t::NewTask,state) = (state==nothing || istaskdone(t.t)) ?
(state,nothing) : (state,consume(t.t))
done(t::NewTask,state) = state==nothing
iteratorsize(::Type{NewTask}) = Base.SizeUnknown()
iteratoreltype(::Type{NewTask}) = Base.EltypeUnknown()
function fib()
Task() do
prev_prev = 0
prev = 1
produce(prev)
while true
cur = prev_prev + prev
produce(cur)
prev_prev = prev
prev = cur
end
end
end
nt = NewTask(fib())
take(nt,10)|>collect
This is a good question, and is possibly better suited to the Julia list (now on the Discourse platform). In any case, with NewTask defined as above, an improved answer to a recent StackOverflow question is possible. See: https://stackoverflow.com/a/41068765/3580870

How do I set a function to a variable in MATLAB

As a homework assignment, I'm writing a code that uses the bisection method to calculate the root of a function with one variable within a range. I created a user function that does the calculations, but one of the inputs of the function is supposed to be "fun" which is supposed to be set equal to the function.
Here is my code, before I go on:
function [ Ts ] = BisectionRoot( fun,a,b,TolMax )
%This function finds the value of Ts by finding the root of a given function within a given range to a given
%tolerance, using the Bisection Method.
Fa = fun(a);
Fb = fun(b);
if Fa * Fb > 0
disp('Error: The function has no roots in between the given bounds')
else
xNS = (a + b)/2;
toli = abs((b-a)/2);
FxNS = fun(xns);
if FxNS == 0
Ts = xNS;
break
end
if toli , TolMax
Ts = xNS;
break
end
if fun(a) * FxNS < 0
b = xNS;
else
a = xNS;
end
end
Ts
end
The input arguments are defined by our teacher, so I can't mess with them. We're supposed to set those variables in the command window before running the function. That way, we can use the program later on for other things. (Even though I think fzero() can be used to do this)
My problem is that I'm not sure how to set fun to something, and then use that in a way that I can do fun(a) or fun(b). In our book they do something they call defining f(x) as an anonymous function. They do this for an example problem:
F = @ (x) 8-4.5*(x-sin(x))
But when I try doing that, I get the error, Error: Unexpected MATLAB operator.
If you guys want to try running the program to test your solutions before posting (hopefully my program works!) you can use these variables from an example in the book:
fun = 8 - 4.5*(x - sin(x))
a = 2
b = 3
TolMax = .001
The answer they get in the book for using those is 2.430664.
I'm sure the answer to this is incredibly easy and straightforward, but for some reason, I can't find a way to do it! Thank you for your help.
To get you going, it looks like your example is missing some syntax. Instead of either of these (from your question):
fun = 8 - 4.5*(x - sin(x)) % Missing function handle declaration symbol "@"
F = @ (x) 8-4.5*(x-sin9(x)) %Unless you have defined it, there is no function "sin9"
Use
fun = @(x) 8 - 4.5*(x - sin(x))
Then you would call your function like this:
fun = @(x) 8 - 4.5*(x - sin(x));
a = 2;
b = 3;
TolMax = .001;
root = BisectionRoot( fun,a,b,TolMax );
To debug (which you will need to do), use the debugger.
The command dbstop if error stops execution and opens the file at the point of the problem, letting you examine the variable values and function stack.
Clicking on the "-" marks in the editor creates a break point, forcing the function to pause execution at that point, again so that you can examine the contents. Note that you can step through the code line by line using the debug buttons at the top of the editor.
dbquit quits debug mode
dbclear all clears all break points
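For comparison, the core idea of the answer (pass the function itself as an argument, then call it inside) can be sketched in Python, where a `lambda` plays the role of the anonymous function handle. This is an illustrative sketch and not a translation of the assignment's required MATLAB signature: the name `bisection_root` and the loop structure are assumptions.

```python
import math

def bisection_root(fun, a, b, tol_max):
    """Bisection sketch; `fun` is any callable of one variable."""
    if fun(a) * fun(b) > 0:
        raise ValueError("no sign change between the given bounds")
    while True:
        x_ns = (a + b) / 2          # midpoint estimate
        tol = abs(b - a) / 2        # half-width of the current bracket
        f_ns = fun(x_ns)
        if f_ns == 0 or tol < tol_max:
            return x_ns
        if fun(a) * f_ns < 0:
            b = x_ns                # root lies in the left half
        else:
            a = x_ns                # root lies in the right half

# A lambda stands in for the anonymous function handle
fun = lambda x: 8 - 4.5 * (x - math.sin(x))
root = bisection_root(fun, 2, 3, 0.001)
print(root)
```

With the values from the question, the result should land close to the book's 2.430664.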

Matlab call a function from a function

I have two functions:
function [] = func_one()
S.pb = uicontrol('style','push','unit','pix','posit',[20 20 260 30],...
'string','Print Choices','callback',{@func_two,S});
and I have the second function:
function [a] = func_two(varargin)
a = 'alon';
end
I want func_one to return the variable a of func_two. How can I do that please?
I tried:
function [a] = func_one()
But I guess I have to do something with 'callback',{@func_two,S})
Thank you all!
If, as you say, you want func_one to return the value a in func_two then the easiest way to do this without using a callback is:
function [a] = func_one()
S.pb = uicontrol('style','push','unit','pix','posit',[20 20 260 30],...
'string','Print Choices');
a = func_two();
The above will allow you to run a = func_one, and a will be the string 'alon'.
If you really really want func_two() to be a callback of your pushbutton, and you want a='alon' to be assigned in the workspace of func_one (the function that calls func_two) then put this in func_two
assignin('caller','a',a)
And if neither is what you want, then maybe you can indicate why you want func_one to return what func_two returns - like the exact interaction you are hoping to have with your GUI and how it differs from what you're actually experiencing.
If you are designing a GUI programmatically, I suggest you use nested functions to share data. Example:
function IncrementExample()
    x = 0;
    uicontrol('Style','pushbutton', 'String','(0)', ...
        'Callback',@callback);
    function callback(o,e)
        %# you can access the variable x in here
        x = x + 1;
        %# update button text
        set(o, 'String',sprintf('(%d)',x))
        drawnow
    end
end
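For comparison, the same data-sharing pattern can be sketched in Python, where a nested function captures a variable from its enclosing function. The names `make_counter` and `on_click` are hypothetical, and no GUI is involved; this only illustrates the scoping idea.

```python
def make_counter():
    x = 0                      # state shared with the nested function

    def callback():
        nonlocal x             # rebind the enclosing variable, not a local copy
        x += 1
        return f"({x})"        # text a button label might show

    return callback

on_click = make_counter()
on_click()
print(on_click())   # (2)
```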