rdbwselect in R not showing output - regression

I'm using the package rdrobust in R and Stata. I planned to fully implement the analysis in R, but encountered a problem with the function rdbwselect. This function computes different bandwidths depending on the selection procedure. By default, the procedure is Mean Square Error bwselect=mserd. However, I'm interested in exploring other procedures and comparing them. I then tried ALL=true; which is the option that according to the package "if specified, rdbwselect reports all available bandwidth selection procedures"
My issue is that, in R, rdbwselect is not showing me the bandwidths, not with the default not with the 'all' option or any other specification
x<-runif(1000,-1,1)
y<-5+3*x+2*(x>=0)+rnorm(1000)
## With default mserd
rdbwselect(y,x,)
## All selection procedures
rdbwselect(y,x,all= TRUE)
Output rdwselect
The output of both lines of rdbwselect code is exactly the same (see image), and it should not. I also try replicating the script from the rdrobust article in The R Journal (Page 49) and I don't get the same output as in the paper.
Nevertheless, the function is working in Stata 16
clear all
set obs 1000
set seed 1234
gen x = runiform(-1,1)
gen y = 5+3*x+2*(x>=0)+rnormal()
rdbwselect y x
rdbwselect y x, all
Could someone provide me with some guidance on why R is not showing me the complete expected output of the function rdbwselect? I'm wondering if this is an issue related to my version of R? Could this be a bug with the R package or the specific function rdbwselect? How can I verify the computation behind rdbwselect?
I appreciate any advice or follow-up questions.

Found the solution. All I needed to do was to embed the function within the summary() function
summary(rdbwselect(y,x,))
or add a pipe and the summary function
rdbwselect(y,x,all= TRUE) %>%
summary()
I want to post it as this is nowhere mentioned in the package documentation nor the article in The R Journal.

Related

Data frame error when converting iGraph to gexf object

I am trying to convert an iGraph object to a gexf object using the rgexf package so that I can write a file usable with Gephi, which I prefer for network visualization.
My iGraph object is created by reading in two CSVs: h.edges and h.nodes. There are both edge and node attributes. Once the files are read in, I create the iGraph object, calculate centrality measures and then attach the centrality measures as node attributes. The code looks like so:
iNet = graph_from_data_frame(d=h.edges, vertices = h.nodes, directed = F)
V(iNet)$degree = degree(iNet)
V(iNet)$eig = evcent(iNet)$vector
V(iNet)$betweenness = betweenness(iNet)
This appears to be working fine since I can do all the normal iGraph functions -- plot, calculate centralities, identify communities, etc. My problem comes when I try to convert this to a gexf object. I run the following code:
library(rgexf)
iNet.gexf igraph.to.gexf(iNet)
But get the below error message:
Error in `[.data.frame`(x, r, vars, drop = drop) :
undefined columns selected
Anyone know what's happening? Although I know the example here can all be done just by uploading the two CSVs straight to Gephi and running the calculations there, the end goal is to be able to attach iGraph's more robust calculations as attributes in ways that Gephi can't.

sympy autowrap (cython): limit of # of arguments, arguments in array form?

I have the following issue:
I want to use autowrap to generate a compiled version of a sympy matrix, with cells containing sympy expressions. Depending on the specification of my problem, the number of arguments can get very large.
I ran into the following 2 issues:
The number of arguments that autowrap accepts seems to be limited to 509.
i.e., this works:
import sympy
from sympy.utilities.autowrap import autowrap
x = sympy.symbols("x:509")
exp = sum(x)
cyt = autowrap(exp, backend="cython", args=x)
and this fails to compile:
x = sympy.symbols("x:510")
exp = sum(x)
cyt = autowrap(exp, backend="cython", args=x)
The message I get seems not very telling:
[...] (Full output upon request)
Generating code
c:\users\[classified]\appdata\local\temp\tmp2zer8vfe_sympy_compile\wrapper_module_17.c(6293) : fatal error C1001: An internal error has occurred in the compiler.
(compiler file 'f:\dd\vctools\compiler\utc\src\p2\hash.c', line 884)
To work around this problem, try simplifying or changing the program near the locations listed above.
Please choose the Technical Support command on the Visual C++
Help menu, or open the Technical Support help file for more information
LINK : fatal error LNK1257: code generation failed
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\VC\\BIN\\x86_amd64\\link.exe' failed with exit status 1257
Is there any way around this? I would like to use versions of my program that need ~1000 input variables.
(I have no understanding of C/cython. Is this an autowrap limitation, a C limitation ...?)
Partly connected to the above:
Can one compile functions that accept the arguments as array.
Is there any way to generate code that accepts a numpy array as input? I specifically mean one array for all the arguments, instead of providing the arguments as list. (Similar to lambdify using a DeferredVector). ufuncify supports array input, but as I understand only for broadcasting/vectorizing the function.
I would hope that an array as argument could circumvent the first problem above, which is most pressing for me. Apart from that, I would prefer array input anyways, both because it seems faster (no need to unpack the numpy array I have as input into a list), and also more straightforward and natural.
Does anyone have any suggestions what I can do?
Also, could anyone tell me whether f2py has similar limitations? This would also be an option for me if feasible, but I don't have it set up to work currently, and would prefer to know whether it helps at all before investing the time.
Thanks!
Edit:
I played around a bit with the different candidates for telling autowrap that the input argument will be something in array form, rather than a list of numbers. I'll document my steps here for posterity, and also to increase chances to get some input:
sympy.DeferredVector
Is what I use with lambdify for the same purpose, so I thought to give it a try. However, warning:
A = sympy.DeferredVector("A")
expression = A[0]+A[1]
cyt = autowrap(expression, backend="cython", args=A)
just completely crashed my OS - last statement started executing, (no feedback), everything got really slow, then no more reactions. (Can only speculate, perhaps it has to do with the fact that A has no shape information, which does not seem to bother lambdify, but might be a problem here. Anyways, seems not the right way to go.)
All sorts of array-type objects filled with the symbols in the expression to be wrapped.
e.g.
x0 ,x1 = sympy.symbols("x:2")
expression = x0 + x1
cyt = autowrap(expression, backend="cython", args=np.array([x0,x1]))
Still wants unpacked arguments. Replacing the last row by
cyt = autowrap(expression, backend="cython", args=[np.array([x0,x1])])
Gives the message
CodeGenArgumentListError: ("Argument list didn't specify: x0, x1 ", [InputArgument(x0), InputArgument(x1)])
Which is a recurrent theme to this approach: also happens when using a sympy matrix, a tuple, and so on inside the arguments list.
sympy.IndexedBase
This is actually used in the autowrap examples; however, in a (to me) inintuitive way, using an equation as the expression to be wrapped. Also, the way it is used there seems not really feasible to me: The expression I want to cythonize is a matrix, but its cells are themselves longish expressions, which I cannot obtain via index operations.
The upside is that I got a minimal example to work:
X = sympy.IndexedBase("X",shape=(1,1))
expression = 2*X[0,0]
cyt = autowrap(expression, backend="cython", args=[X])
actually compiles, and the resulting function correctly evaluates - when passed a 2d-np.array.
So this seems the most promising avenue, even though further extensions to this approach I keep trying fail.
For example this
X = sympy.IndexedBase("X",shape=(1,))
expression = 2*X[0]
cyt = autowrap(expression, backend="cython", args=[X])
gets me
[...]\site-packages\sympy\printing\codeprinter.py", line 258, in _get_expression_indices " rhs indices in %s" % expr)
ValueError: lhs indices must match non-dummy rhs indices in 2*X[0]
even though I don't see how it should be different from the working one above.
Same error message when sticking to two dimensions, but increasing the size of X:
X = sympy.IndexedBase("X",shape=(2,2))
expression = 2*X[0,0]+X[0,1]+X[1,0]+X[1,1]
cyt = autowrap(expression, backend="cython", args=[X])
ValueError: lhs indices must match non-dummy rhs indices in 2*X[0, 0] + X[0, 1] + X[1, 0] + X[1, 1]
I tried snooping around the code for autowrap, but I feel a bit lost there...
So I'm still searching for a solution and happy for any input.
Passing the argument as an array seems to work OK
x = sympy.MatrixSymbol('x', 520, 1)
exp = 0
for i in range(x.shape[0]):
exp += x[i]
cyt = autowrap(exp, backend='cython')
arr = np.random.randn(520, 1)
cyt(arr)
Out[48]: -42.59735861021934
arr.sum()
Out[49]: -42.597358610219345

define theano function with other theano function output

I am new to theano, can anyone help me defining a theano function like this:
Basically, I have a network model looks like this:
y_hat, cost, mu, output_hiddens, cells = nn_f(x, y, in_size, out_size, hidden_size, layer_models, 'MDN', training=False)
here the input x is a tensor:
x = tensor.tensor3('features', dtype=theano.config.floatX)
I want to define two theano functions for later use:
f_x_hidden = theano.function([x], [output_hiddens])
f_hidden_mu = theano.function([output_hiddens], [mu], on_unused_input = 'warn')
the first one is fine. for the second one, the problem is both the input and the output are output of the original function. it gives me error:
theano.gof.fg.MissingInputError: An input of the graph, used to compute Elemwise{identity}(features), was not provided and not given a value.
my understanding is, both of [output_hiddens] and [mu] are related to the input [x], there should be an relation between them. I tried define another theano function from [x] to [mu] like:
f_x_mu = theano.function([x], [mu]),
then
f_hidden_mu = theano.function(f_x_hidden, f_x_mu),
but it still does not work. Does anyone can help me? Thanks.
The simple answer is NO WAY. In here
Because in Theano you first express everything symbolically and afterwards compile this expression to get functions, ...
You can't use the output of theano.function as input/output for another theano.function since they are already a compiled graph/function.
You should pass the symbolic variables, such as x in your example code for f_x_hidden, to build the model.

Can't replicate result with degree.distribution function of igraph

I found this very simple example online:
library(igraph)
g <- graph.ring(5)
plot(g)
summary(g)
degree.distribution(g)
I got the same results up to degree.distribution(g)while instead of [1] 0 0 1 I get NULL.
Since I have exactly the same problem with this example (NULL result to function degree.distribution instead of [1] 0.135 0.280 0.315 0.110 0.095 0.050 0.005 0.010) I wonder if the problem might be in the package installation.
The degree.distribution function in igraph 0.6.5 does not work well with R 3.0.0 and newer because of some changes in the return value of the hist function in R. The developers already know about it and it will be fixed in the next release. Until it is released, you have to work around the bug by modifying the source code of the degree.distribution function according to this patch.

MATLAB integration of an anonymous function

I wanted to ask how can I can write this in MATLAB.
I want to integrate fp(z) with z(0,x). I tried this :
fpz=#(z) f1x(z) ./ quadl(f1x(z),0,1);
sol=int(fpz,0,x) --> i also tried sol=quadl(fpz,0,x)
y=solve('y=sol',x)
xf=# (y) y ; -->this is the function i want
where f1x=# (x) 1 ./(x.^2+1) and fpx = #(x) f1x(x) ./ quadl(f1x,0,1);
but it doesn't work.
Hello,thanks for helping.
The problem is that i want an analytically solution and i can't get one.
I want f1x to give me " 1/x^2+1" , fpx "4/pi*(1+x^2) and fpz "4ArcTan(x)/pi", instead of giving me "f1x=# 1./(x^2+1)"..
With the code you send me ,still the same problem.
I managed to come into this :
f1x=# (x) 1 ./(x.^2+1)
fpx = #(x) f1x(x) ./ quadl(f1x,0,1)
f2z=# (z) 1 ./(z.^2+1);
fpz=#(z) fpx(z) ./ quadl(f2z,0,1)
sol=int(fpz(z),z,0,x)
y=solve(subs('y=sol'),x)
xf=# (y) y
The "sol" and "y=" gives me analytically answer but it is wrong because i assume f1x and fpx,fpz doesn't return into analytically expressions.
0 Your definition of fpz doesn't make any sense; as Marcin already said, you're trying to integrate something that isn't a function. This shouldn't be a problem for your alternative version with f2z. I think the code in the original question should have had just f1x rather than f1x(z) in the first line.
1 Your revised version with f2z has a different problem: now in fpz you aren't actually providing the function with an argument.
The following code seems to be the kind of thing you had in mind, and works fine for me (in MATLAB R2008a, as it happens, but none of this should be different in other versions):
f1x = #(x) 1 ./ (x.^2+1);
fpx = #(x) f1x(x) ./ quadl(f1x,0,1);
fpz = #(z) fpx(z) ./ quadl(fpx,0,1);
Now evaluating fpz(3), for instance, spins for about half a second (on my old slow laptop computer) and returns 0.1273.
So I think the problems you've been having with integration of anonymous functions are just a matter of not being quite careful enough to distinguish between the function itself and a particular value of the function.
You have some further questions about the "solve" and "int" functions in the Symbolic Math Toolbox. You should take the following with a pinch of salt because I don't have that toolbox and am relying on the online documentation for it.
3 You're feeding the names of functions you've defined in MATLAB to the Symbolic Math Toolbox functions. I don't think that is supposed to work; "int" and "solve" expect explicit algebraic expressions, not MATLAB functions. (And, further, your functions all use numerical integration -- quad, quadl, etc. -- and there's no possible way that the symbolic functions can do anything useful with that.)
Finally: When you're asking questions of this sort, it's helpful if rather than "it doesn't work" you say how it doesn't work. For instance, your most recent comment is much more useful ("it gives me ..." followed by the actual output you get from MATLAB).