Applying a function to multiple rows of a data frame where the row is an argument in the function in R

Apologies for the rather long name, but I wanted to be specific. I am rather new to R and coding so please go easy on me.
I have a function as follows:
myfun = function(x, y, g) {return(1 / (1 + exp(y*g%*%x)))}
where x is any data frame with n rows and d columns, y is a scalar and integer, and g is a vector of length d (i.e. same as x). I want to run this function for each row of x without using loops.
I have tried various functions in the apply family, similar to the code below:
apply(x = a, 1, myfun(y = 1, g = b))
where a is a 3x3 data frame and b is a vector 3 elements long. The above code gives an error that I am missing an argument from myfun, but I am obviously clueless on what to try.
Thanks for any help in advance!
Edit: My actual data frame is huge, sparse, and not very straightforward (I think), so I will include an example data frame and other variables:
a = data.frame(c1 = seq(1,3,1), c2 = seq(4,6,1), c3 = seq(7,9,1))
b = c(1,2,3)
c = 1
Also, I think I may have not clearly stated an important piece of information. I want to actually do a summation of myfun over all the rows and values of b, so I actually want the following:
answer = myfun(a[1,], c, b[1]) + myfun(a[2,], c, b[2]) + myfun(a[3,], c, b[3])
In other words, a[1,] should be applied to myfun with b[1] as they are grouped together. I also made an edit to the function above because I forgot to include return(). Hopefully, this makes things more clear. Apologies for the confusion!
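For readers comparing approaches, here is a minimal Python sketch of the row-wise idea. It follows the first description of the problem, where every row is combined with the whole vector g through a dot product; that reading is an assumption, since the edit above pairs a[i,] with b[i] instead. The data mirror the example a, b and c, and the name per_row is made up for illustration.

```python
import math

def myfun(row, y, g):
    # One term of the sum: 1 / (1 + exp(y * (g . row)))
    dot = sum(gi * xi for gi, xi in zip(g, row))
    return 1.0 / (1.0 + math.exp(y * dot))

# Rows of the example data frame a (columns c1, c2, c3)
a = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
b = [1, 2, 3]
c = 1

# Apply myfun to every row without an explicit index loop, then sum
per_row = [myfun(row, c, b) for row in a]
total = sum(per_row)
```

The same shape (a function of one row, with the other arguments fixed, mapped over all rows) is what an apply-style call needs: the extra arguments are passed alongside the function instead of calling the function up front.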

Related

How to write function in Julia v1.1 that inputs multi-dimensional array?

I am trying to write a function in Julia that takes in a multi-dimensional array (a data cube) and rescales every entry from 0 to 1. However, whenever I run the code in atom, I get the error
LoadError: MethodError: no method matching -(::Array{Float64,2}, ::Float64)
Closest candidates are:
-(::Float64, ::Float64) at float.jl:397
-(::Complex{Bool}, ::Real) at complex.jl:298
-(::Missing, ::Number) at missing.jl:97
...
Stacktrace:
[1] rescale_zero_one(::Array{Float64,2}) at D:\Julio\Documents\Michigan_v2\CS\EECS_598_Data_Science\codex\Codex_3\svd_video.jl:40
[2] top-level scope at D:\Julio\Documents\Michigan_v2\CS\EECS_598_Data_Science\codex\Codex_3\svd_video.jl:50 [inlined]
[3] top-level scope at .\none:0
in expression starting at D:\Julio\Documents\Michigan_v2\CS\EECS_598_Data_Science\codex\Codex_3\svd_video.jl:48
I have the basics of what my function must do, but I really don't understand some of the notation and what the error is telling or how to fix it.
function rescale_zero_one(A::Array)
B = float(A)
B -= minimum(B)
B /= maximum(B)
return B
end
m,n,j = size(movie_cube)
println(j)
C = Array{Float64}(UndefInitializer(),m,n,j)
for k in 1:j
println(k)
C[:,:,j] = rescale_zero_one(movie_cube[:,:,j])
end
the variable movie_cube is a 3 dimensional data array of Float64 entries and I just want to rescale the entries from zero to one. However, the error that I mentioned keeps appearing. I would really appreciate any help with this code!
Try to use dot syntax for doing some operations in an array!
function rescale_zero_one(A::Array)
B = float.(A)
B .-= minimum(B)
B ./= maximum(B)
return B
end
This code is a bit faster and simpler (it only makes two passes over the input matrix rather than five in the previous answer):
function rescale(A::Matrix)
(a, b) = extrema(A)
return (A .- a) ./ (b - a)
end
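As a language-neutral illustration of the same min-max rescaling, here is a small Python sketch on a nested list. It assumes the input is non-empty and not constant (otherwise the span is zero and the division fails); the function name simply mirrors the Julia one.

```python
def rescale(matrix):
    # Find the global minimum and maximum of all entries
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo  # assumed non-zero
    # Shift by the minimum and divide by the range, entry by entry
    return [[(v - lo) / span for v in row] for row in matrix]

scaled = rescale([[1.0, 3.0], [5.0, 9.0]])
```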
This can be generalized to three dimensions, so that you don't need the outer loop over the dimensions in C. Warning: this solution is actually a bit slow, since extrema/maximum/minimum are slow when using the dims keyword, which is quite strange:
function rescale(A::Array{T, 3}) where {T}
mm = extrema(A, dims=(1,2))
a, b = first.(mm), last.(mm)
return (A .- a) ./ (b .- a)
end
Now you could just write C = rescale(movie_cube). You can even generalize this further:
function rescale(A::Array{T, N}; dims=ntuple(identity, N)) where {T,N}
mm = extrema(A, dims=dims)
a, b = first.(mm), last.(mm)
return (A .- a) ./ (b .- a)
end
Now you can normalize your multidimensional array along any dimensions you like. Current behaviour becomes
C = rescale(movie_cube, dims=(1,2))
Rescaling along the first dimension only (that is, each column of each slice separately) is
C = rescale(movie_cube, dims=(1,))
Default behaviour is to rescale the entire array:
C = rescale(movie_cube)
One more thing, this is a bit odd:
C = Array{Float64}(UndefInitializer(),m,n,j)
It's not wrong, but it is more common to use the shorter and more elegant:
C = Array{Float64}(undef, m, n, j)
You might also consider simply writing: C = similar(movie_cube) or C = similar(movie_cube, Float64).
Edit: Another general solution is to not implement the dimension handling in the rescale function, but to rather leverage mapslices. Then:
function rescale(A::Array)
(a, b) = extrema(A)
return (A .- a) ./ (b - a)
end
C = mapslices(rescale, A, dims=(1,2))
This is also not the fastest solution, for reasons I don't understand. I really think this ought to be fast, and might be sped up in a future version of Julia.
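The slice-by-slice idea behind mapslices can be sketched in Python as follows. This is only an analogy: it assumes the cube is stored as a list of 2-D frames (the counterpart of movie_cube[:, :, k]) and that no frame is constant. The names rescale_frame and rescale_cube are invented for the sketch.

```python
def rescale_frame(frame):
    # Min-max rescale a single 2-D frame to the range [0, 1]
    flat = [v for row in frame for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo  # assumed non-zero
    return [[(v - lo) / span for v in row] for row in frame]

def rescale_cube(frames):
    # Apply the 2-D function independently to every frame
    return [rescale_frame(frame) for frame in frames]

cube = [
    [[0.0, 2.0], [4.0, 8.0]],
    [[10.0, 20.0], [30.0, 50.0]],
]
result = rescale_cube(cube)
```

Each frame is normalized with its own minimum and maximum, which matches the dims=(1,2) behaviour described above.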

scilab - how to return matrices from a function with if-statements?

I have a scilab function that looks something like this (very simplified code just to get the concept of how it works):
function [A, S, Q]=myfunc(a)
A = a^2;
S = a+a+a;
if S > A then
Q = "Bigger";
else
Q = "Lower";
end
endfunction
And I get the expected result if I run:
--> [A,S,Q]=myfunc(2)
Q =
Bigger
S =
6.
A =
4.
But if I put matrices into the function I expect to get equivalent matrices back as an answer with a result but instead I got this:
--> [A,S,Q]=myfunc([2 4 6 8])
Q =
Lower
S =
6. 12. 18. 24.
A =
4. 16. 36. 64.
Why isn't Q returning matrices of values like S and A? And how do I achieve that it will return "Bigger. Lower. Lower. Lower." as an answer? That is, I want to perform the operation on each element of the matrix.
Because in your program you wrote Q = "Bigger" and Q = "Lower". That means that Q will only have one value. If you want to store the comparisons for every value in A and S, you have to make Scilab do that.
You can achieve such behavior by using loops. This is how you can do it by using two for loops:
function [A, S, Q]=myfunc(a)
A = a^2;
S = a+a+a;
//Get the size of input a
[nrows, ncols] = size(a)
//Traverse all rows of the input
for i = 1 : nrows
//Traverse all columns of the input
for j = 1 : ncols
//Compare each element
if S(i,j) > A(i,j) then
//Store each result
Q(i,j) = "Bigger"
else
Q(i,j) = "Lower"
end
end
end
endfunction
Beware of A = a^2. It can break your function. It behaves differently depending on whether the input a is a vector (1-by-n or n-by-1 matrix), a rectangular matrix (m-by-n matrix, m ≠ n), or a square matrix (n-by-n matrix):
Vector: it works like .^, i.e. it raises each element individually (see Scilab help).
Rectangular: it won't work because it has to follow the rule of matrix multiplication.
Square: it works and follows the rule of matrix multiplication.
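The difference between the two kinds of squaring can be made concrete with a short Python sketch. The helper names are made up, and plain nested lists stand in for Scilab matrices.

```python
def elementwise_square(m):
    # Counterpart of a.^2: square every entry on its own
    return [[v * v for v in row] for row in m]

def matrix_square(m):
    # Counterpart of a^2 for a square matrix: the matrix product m * m
    n = len(m)
    if any(len(row) != n for row in m):
        raise ValueError("matrix product m*m needs a square matrix")
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

m = [[1, 2], [3, 4]]
by_element = elementwise_square(m)
by_product = matrix_square(m)
```

For a rectangular input such as [[1, 2, 3], [4, 5, 6]], elementwise_square still works, while matrix_square raises an error, which mirrors the three cases listed above.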
I will add that in Scilab, the fewer loops the better, so @luispauloml's answer can be rewritten as:
function [A, S, Q]=myfunc(a)
A = a.^2; // used element wise power, see luispauloml advice
S = a+a+a;
Q(S > A) = "Bigger"
Q(S <= A) = "Lower"
Q = matrix(Q,size(a,1),size(a,2)) // a-like shape
endfunction
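For comparison, here is a rough Python sketch of the same element-wise logic. Nested lists stand in for matrices and the function keeps the name myfunc only for readability; it is not a translation of any Scilab API.

```python
def myfunc(a):
    # Element-wise square and triple of every entry
    A = [[v ** 2 for v in row] for row in a]
    S = [[3 * v for v in row] for row in a]
    # Compare entry by entry and keep the input's shape
    Q = [["Bigger" if s > sq else "Lower" for s, sq in zip(s_row, sq_row)]
         for s_row, sq_row in zip(S, A)]
    return A, S, Q

A, S, Q = myfunc([[2, 4, 6, 8]])
```

With the input from the question, Q holds one label per element instead of a single string.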

MATLAB Function Elements Range

Sorry for the noob question, but I am a beginner in MATLAB. I need to do the following task, but am stuck. "Write a function that takes three arguments x, a, b, where x is a matrix, and a and b are scalars. The function returns the number of elements in x that lie in the interval [a, b]." Here is what I have so far.
function y = count(x,a,b);
for value=a:b
length(value)
end
I need to call the function in the command prompt with the following line:
count(randn(20, 20), 0, 5)
However, I'm not getting anything close to the correct answer. Can anyone point me in the right direction? Thank you.
As Jonas suggested nnz and sum are faster options than numel(find(...)), with sum being the fastest, therefore:
function y = count(x,a,b)
y = sum(x(:)>=a & x(:)<=b); % >= and <= so the endpoints of [a, b] are included
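The counting idea (build a true/false mask, then add up the trues) looks like this in a minimal Python sketch. It assumes the matrix is a nested list and that the interval [a, b] is closed, as the task states; count_in_interval is a made-up name.

```python
def count_in_interval(x, a, b):
    # Count entries v with a <= v <= b across all rows
    return sum(1 for row in x for v in row if a <= v <= b)

sample = [[-1.0, 0.0, 2.5], [5.0, 5.1, 3.0]]
n_inside = count_in_interval(sample, 0, 5)
```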

Error plotting a function of 2 variables

I am trying to plot the function
f(x, y) = (x - 3).^2 - (y - 2).^2.
x is a vector from 2 to 4, and y is a vector from 1 to 3, both with increments of 0.2. However, I am getting the error:
"Subscript indices must either be real positive integers or logicals".
What do I do to fix this error?
I think I see what you are trying to achieve. You are writing your syntax like a mathematical function definition. Matlab interprets f as a 2-dimensional array and tries to assign the value of the expression to the element indexed at x,y. The values of x and y are not integers, so Matlab complains.
If you want to plot the output of the function (we'll call it z) as a function of x and y, you need to define the function quite differently . . .
f = @(x,y)(x-3).^2 - (y-2).^2;
x=2:.2:4;
y=1:.2:3;
z = f( repmat(x(:)',numel(y),1) , repmat(y(:),1,numel(x) ) );
surf(x,y,z);
xlabel('X'); ylabel('Y'); zlabel('Z');
This produces a surface plot of z over the x-y grid.
The f = @(x,y) part of the first line states you want to define a function called f taking variables x and y. The rest of the line is the definition of that function.
If you want to plot z as a function of both x and y, then you need to supply all possible combinations in your range. This is what the line containing the repmat commands is for.
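What "all possible combinations" means can be sketched without any plotting library. The Python snippet below is an illustration, not the Matlab method itself: it builds the same 11-by-11 grid of values, with rows following y and columns following x.

```python
def f(x, y):
    return (x - 3) ** 2 - (y - 2) ** 2

# 2 to 4 and 1 to 3 in steps of 0.2 give 11 points each
xs = [2 + 0.2 * i for i in range(11)]
ys = [1 + 0.2 * j for j in range(11)]

# Z[j][i] is f evaluated at (xs[i], ys[j]): one row per y, one column per x
Z = [[f(x, y) for x in xs] for y in ys]
```

The nested comprehension plays the role of the two repmat calls: every x is paired with every y before f is evaluated.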
EDIT
There is a neat Matlab function meshgrid that can replace the repmat version of the script as suggested by @bas (welcome bas, please scroll to bas' answer and +1 it!) ...
f = @(x,y)(x-3).^2 - (y-2).^2;
x=2:.2:4;
y=1:.2:3;
[X,Y] = meshgrid(x,y);
surf(x,y,f(X,Y));
xlabel('x'); ylabel('y'); zlabel('z');
I typically use the MESHGRID function. Like so:
x = 2:0.2:4;
y = 1:0.2:3;
[X,Y] = meshgrid(x,y);
F = (X-3).^2-(Y-2).^2;
surf(x,y,F);
xlabel('x');ylabel('y');zlabel('f')
This is identical to the answer by @learnvst; it just does the repmat-ing for you.
Your problem is that the function you are using uses integers, and you are trying to assign a double to it. Integers cannot have decimal places. To fix this, you can make it to where it increases in increments of 1, instead of 0.2

summing functions handles in matlab

Hi
I am trying to sum two function handles, but it doesn't work.
for example:
y1=@(x)(x*x);
y2=@(x)(x*x+3*x);
y3=y1+y2
The error I receive is "??? Undefined function or method 'plus' for input arguments of type 'function_handle'."
This is just a small example, in reality I actually need to iteratively sum about 500 functions that are dependent on each other.
EDIT
The solution by Clement J. indeed works, but I couldn't generalize it into a loop and ran into a problem. I have the function s=@(x,y,z)((1-exp(-x*y)-z)*exp(-x*y)); and a vector v that contains 536 data points and another vector w that also contains 536 data points. My goal is to sum s(v(i),y,w(i)) for i=1...536, giving one function of the variable y that is the sum of 536 functions. The syntax I tried in order to do this is:
sum=@(y)(s(v(1),y,z2(1)));
for i=2:536
sum=@(y)(sum+s(v(i),y,z2(i)))
end
The solution proposed by Fyodor Soikin works.
>> y3=@(x)(y1(x) + y2(x))
y3 =
@(x) (y1 (x) + y2 (x))
If you want to do it on multiple functions you can use intermediate variables :
>> f1 = y1;
>> f2 = y2;
>> y3=@(x)(f1(x) + f2(x))
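The key point is that functions cannot be added directly; a new function has to call both and add the results. A small Python sketch of that idea follows. The helper add is a made-up name, and functools.reduce is used only to show how the pattern extends to a longer list of functions.

```python
from functools import reduce

def add(f, g):
    # Return a new function whose value is f(x) + g(x)
    return lambda x: f(x) + g(x)

y1 = lambda x: x * x
y2 = lambda x: x * x + 3 * x

y3 = add(y1, y2)

# Folding add over a list sums any number of functions
total = reduce(add, [y1, y2, y1])
```

Note that the sum is formed from the values f(x) and g(x), never from the function objects themselves, which is what the failing y1+y2 attempted.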
EDIT after the comment:
I'm not sure I understand the problem. Could you define your vectors v and w outside the function, like this:
v = [5 4]; % your 536 data
w = [4 5];
y = 8;
s=@(y)((1-exp(-v*y)-w).*exp(-v*y))
s_sum = sum(s(y))
Note the dot in the multiplication to do it element-wise.
I think the most succinct solution is given in the comment by Mikhail. I'll flesh it out in more detail...
First, you will want to modify your anonymous function s so that it can operate on vector inputs of the same size as well as scalar inputs (as suggested by Clement J.) by using element-wise arithmetic operators as follows:
s = @(x,y,z) (1-exp(-x.*y)-z).*exp(-x.*y); %# Note the periods
Then, assuming that you have vectors v and w defined in the given workspace, you can create a new function sy that, for a given scalar value of y, will sum across s evaluated at each set of values in v and w:
sy = @(y) sum(s(v,y,w));
If you want to evaluate this function using an array of values for y, you can add a call to the function ARRAYFUN like so:
sy = @(y) arrayfun(@(yi) sum(s(v,yi,w)),y);
Note that the values for v and w that will be used in the function sy will be fixed to what they were when the function was created. In other words, changing v and w in the workspace will not change the values used by sy. Note also that I didn't name the new anonymous function sum, since there is already a built-in function with that name.
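The "fixed at creation time" behaviour can be imitated in Python, with one caveat: Python closures normally see later changes to a captured list, so this sketch takes an explicit snapshot to mimic what the answer describes for Matlab. The helper make_sy and the two-element data are assumptions for illustration.

```python
import math

def s(x, y, z):
    return (1 - math.exp(-x * y) - z) * math.exp(-x * y)

def make_sy(v, w):
    # Snapshot the data so later edits to v and w do not affect sy
    pairs = tuple(zip(v, w))
    def sy(y):
        # Sum s over every (v_i, w_i) pair for the given y
        return sum(s(vi, y, wi) for vi, wi in pairs)
    return sy

v = [5.0, 4.0]
w = [4.0, 5.0]
sy = make_sy(v, w)

before = sy(8.0)
v[0] = 100.0      # changing v afterwards ...
after = sy(8.0)   # ... leaves the result of sy unchanged
```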