Generate small-world model in igraph using Newman-Watts algorithm

I want to generate small-world networks using igraph, but not using "rewiring" as implemented in watts.strogatz.game(). In the Newman-Watts variation all local links are kept fixed, and random "long-range" links are simply added on top of the network at a fixed rate (no edges are rewired or removed). I thought I could generate a lattice (e.g. g <- graph.lattice(length=20, dim=1, circular=TRUE)) and then put a classical random graph on top of that. However, I do not know how to do this using an existing graph as the input argument. Or maybe it is possible to add random edges with a specified probability?
Any help highly appreciated.
Thanks a lot!

Use graph.lattice to generate a lattice, then erdos.renyi.game with the same number of vertices and a fixed probability to generate a random graph. Then you can combine the two graphs using the %u% (union) operator. There is a small chance of multi-edges if the same edge happens to be part of both the lattice and the random graph, so you should also call simplify() on the union if you don't want that.

This seems to do the trick, in case anyone is interested. I just have to wrap this in a function to do this "rewiring" over and over again (a sketch follows the code below). Many thanks again, Tamas!
library(igraph)
g <- graph.lattice(length = 100, dim = 1, circular = TRUE)  # ring lattice with fixed local links
g2 <- erdos.renyi.game(100, 1/100)                          # random graph on the same vertex set
g3 <- g %u% g2                                              # union of the two graphs
g3 <- simplify(g3)                                          # drop any duplicate edges
plot.igraph(g3, vertex.size = 1, vertex.label = NA, layout = layout_in_circle)
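Wrapped up as a function, a minimal sketch of the same recipe looks like this (the function name newman_watts and its arguments n and p are just placeholder names, not from the thread):
newman_watts <- function(n, p) {
  lattice   <- graph.lattice(length = n, dim = 1, circular = TRUE)  # fixed local links
  shortcuts <- erdos.renyi.game(n, p)                               # random long-range links
  simplify(lattice %u% shortcuts)                                   # union, dropping duplicate edges
}
g3 <- newman_watts(100, 1/100)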

CycleGAN for unpaired image to image translation

Referring to the original paper on CycleGAN, I am confused about these lines:
The optimal G thereby translates the domain X to a domain Ŷ distributed identically to Y. However, such a translation does not guarantee that an individual input x and output y are paired up in a meaningful way – there are infinitely many mappings G that will induce the same distribution over ŷ.
I understand there are two sets of images with no pairing between them. So when the generator takes one image, say x, from set X as input and tries to translate it into an image similar to the images in set Y, my question is: there are many images present in set Y, so which y will our x be translated into? Is that what is pointed out in the lines of the paper quoted above? And is this the reason we use the cycle loss, to overcome this problem and to create some type of pairing between two random images by converting x to y and then converting y back to x?
The image x won't be translated to a concrete image y, but rather to the "style" of the domain Y. The input is fed to the generator, which tries to produce a sample from the desired distribution (the other domain); the generated image then goes to the discriminator, which tries to predict whether the sample comes from the actual distribution or was produced by the generator. This is just the normal GAN workflow.
If I understand it correctly, in the lines you quoted the authors explain the problems that arise with the adversarial loss alone. They say it again here:
Adversarial training can, in theory, learn mappings G and F that produce outputs identically distributed as target domains Y and X respectively. However, with large enough capacity, a network can map the same set of input images to any random permutation of images in the target domain, where any of the learned mappings can induce an output distribution that matches the target distribution. Thus, an adversarial loss alone cannot guarantee that the learned function can map an individual input x_i to a desired output y_i.
This is one of the reasons for introducing the concept of cycle consistency: it produces meaningful mappings and reduces the space of possible mapping functions (it can be viewed as a form of regularization). The idea is not to create a pairing between two random images that are already in the dataset (the dataset stays unpaired), but to make sure that if you map a real image from domain X to domain Y and then back again, you get the original image back.
Cycle consistency encourages the generators to avoid unnecessary changes and thus to generate images that share structural similarity with their inputs; it also prevents the generators from excessive hallucination and mode collapse.
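For reference, the cycle-consistency term the paper adds to the two adversarial losses is an L1 penalty on both reconstruction directions (writing it down from memory, so double-check the paper for the exact form):
L_cyc(G, F) = E_x[ ||F(G(x)) - x||_1 ] + E_y[ ||G(F(y)) - y||_1 ]
The full objective then combines this with the two GAN losses, weighted by a hyperparameter lambda.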
I hope that answers your questions.

Simulation from a nested glmer model

I'm having a problem generating simulations from a three-level glmer model when conditioning on the random effects (I'm actually using predict via bootMer, but the problem is the same).
This works:
library(lme4)
fit1 = glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)
simulate(fit1, re.form = NULL)
This fails:
cbpp$bigherd = rep(1:7, 8)
fit2 = glmer(cbind(incidence, size - incidence) ~ period + (1 | bigherd / herd),
             data = cbpp, family = binomial)
simulate(fit2, re.form = NULL)
Error: No random effects terms specified in formula
Many thanks for any ideas.
Update
Ben, many thanks for your help below, really appreciate it. I wonder if I can impose on you again.
What I want to do is simulate predictions on the response scale, and I'm not sure if I can use your workaround, or if there is an alternative to what I'm doing. Thank you!
This works as expected, but is not conditional on random effects:
FUN = function(.) {
  predict(., type = "response")
}
bootMer(fit2, FUN, nsim=3)$t
This doesn't work, as would be expected given above problem:
bootMer(fit2, FUN, nsim=3, use.u=TRUE)$t
As far as I can see, I can't pass re.form to bootMer.
Does the alternative below result in simulated predictions conditional on random effects without passing use.u to bootMer?
FUN = function(.) {
  predict(., type = "response", re.form = ~(1 | herd:bigherd) + (1 | bigherd))
}
bootMer(fit2, FUN, nsim=10)$t
I'm not sure what's going on yet, but here are two workarounds that do work:
simulate(fit2, re.form=lme4:::reOnly(formula(fit2)))
simulate(fit2, re.form=~(1|herd:bigherd) + (1|bigherd))
There must be something going wrong with the expansion of the "slash" term, because this doesn't work:
simulate(fit2, re.form=~(1|bigherd/herd))
I've posted this as an lme4 issue
These workarounds don't work for bootMer (which only takes the use.u argument, not re.form) in the current CRAN release (1.1-9).
It is fixed in the development version on Github (1.1-10): devtools::install_github("lme4/lme4") will install it, if you have compilation tools installed.
In the meantime you could just go ahead and implement your own parametric bootstrap (for parametric bootstrapping, bootMer is actually a very thin wrapper around simulate() / [refit() or update()] / FUN). Much of the complication has to do with parallel computation (you'd have to add some of it back in if you want parallel computation in your own PB implementation).
This is the outline of a hand-rolled parametric bootstrap:
nboot <- 10
nresp <- length(FUN(orig_fit))
res <- matrix(NA, nboot, nresp)
for (i in 1:nboot) {
  res[i, ] <- FUN(update(orig_fit, data = simulate(orig_fit, ...)))
  ## or use refit() for LMMs
  ## ... are options to simulate()
}
t(apply(res, 2, quantile, c(0.025, 0.975)))
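For example, a rough version of that loop for your fit2, using the re.form workaround inside simulate() so the draws condition on the random effects. This is only a sketch: it assumes the first column of the simulated response matrix holds the simulated success counts, so replacing the incidence column is enough to rebuild the cbind() response.
FUN <- function(fit) predict(fit, type = "response")
nboot <- 10
res <- matrix(NA, nboot, length(FUN(fit2)))
for (i in 1:nboot) {
  ## simulate conditional on the random effects (expanded "slash" term workaround)
  sim <- simulate(fit2, re.form = ~ (1 | herd:bigherd) + (1 | bigherd))[[1]]
  newdat <- cbpp
  newdat$incidence <- sim[, 1]   # assumed: column 1 is the simulated successes
  res[i, ] <- FUN(update(fit2, data = newdat))
}
t(apply(res, 2, quantile, c(0.025, 0.975)))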

Is it possible to plot complex variable in wxMaxima or Octave?

For example, if I want to plot sin(z) where z is a complex variable, how will I achieve it in either Octave or Maxima?
I don't know about Octave, but here is a message about that, with some code you can try in Maxima: https://www.ma.utexas.edu/pipermail/maxima/2007/006644.html
There may be more specific information for wxMaxima -- you can try their user forum: https://sourceforge.net/p/wxmaxima/discussion/435775/
(referring to Octave 4.0.0)
How do you want to represent the output of the function? Plotting either the real or imaginary part of the output can be done fairly simply using a 3-dimensional graph, where the x and y axes are the real and imaginary components of z, and the vertical axis is either the real or imaginary value of sin(z). Producing those is fairly simple in Octave. Here's a link to a script you can save and run to show an example.
Simply change the g = exp(f) line to g = sin(f).
Octave-help mailing list example
Note that the imaginary part plot is commented out. Just switch the # between the different plot commands if you want to see that part.
Now, are you instead looking for options to map the Z plane (z=x+iy) to the W plane (w=u+iv) and represent closed contours mapped by w=sin(z)? In that case you'll need to do parametric plotting as described on this FIT site. There is a link to his Matlab program at the bottom of the explanation that provides one method of using color coding to match z->w plane contour mapping.
Those m-files are written for Matlab, so a few things do not work, but the basic plotting is compatible with Octave 4.0.0 (the top-level ss13.m file will fail on calls to flops and imwrite).
But if you put your desired function into myfun13.m for f, df and d2f (sin(z), cos(z), and -sin(z) respectively), then run cvplot13, you'll get color maps showing the correspondence between the z and w planes.
wxMaxima has a plot3d that can do it. Since the expression to plot is in terms of x and y, I plotted the function's magnitude with abs(f(x+%i*y)):
plot3d(abs((x+%i*y-3)*(x+%i*y-5)*(x+%i*y-6)), [x,2,7], [y,-1,1], [grid,100,100], [z,0,5])$

Extract 3D coordinates from R PCA

I am trying to find a way to make 3D PCA visualizations from R more portable. I have run a PCA on a 2D matrix using prcomp().
1. How do I export the 3D coordinates of the data points, along with the labels and colors (RGB) associated with each?
2. What's the practical difference between princomp() and prcomp()?
3. Any ideas on how to best view the 3D PCA plot using HTML5 and canvas?
Thanks!
Here is an example to work from:
pc <- prcomp(~ . - Species, data = iris, scale = TRUE)
The axis scores are extracted from component x; as such you can just write them out (you don't say how you want them exported) as CSV using:
write.csv(pc$x[, 1:3], "my_pc_scores.csv")
If you want to assign information to these scores (the colours and labels, which are not something associated with the PCA but something you assign yourself), then add them to the matrix of scores and then export. In the example above there are three species with 50 observations each. If we want that information exported alongside the scores, then something like this will work:
scrs <- data.frame(pc$x[, 1:3], Species = iris$Species,
                   Colour = rep(c("red", "green", "black"), each = 50))
write.csv(scrs, "my_pc_scores2.csv")
scrs looks like this:
> head(scrs)
PC1 PC2 PC3 Species Colour
1 -2.257141 -0.4784238 0.12727962 setosa red
2 -2.074013 0.6718827 0.23382552 setosa red
3 -2.356335 0.3407664 -0.04405390 setosa red
4 -2.291707 0.5953999 -0.09098530 setosa red
5 -2.381863 -0.6446757 -0.01568565 setosa red
6 -2.068701 -1.4842053 -0.02687825 setosa red
Update: I missed the point about RGB. See ?rgb for ways of specifying this in R, but if all you want are the RGB strings, then change the above to use something like
Colour = rep(c("#FF0000","#00FF00","#000000"), each = 50)
instead, where you specify the RGB strings you want.
The essential difference between princomp() and prcomp() is the algorithm used to calculate the PCA. princomp() uses an eigendecomposition of the covariance or correlation matrix, whilst prcomp() uses the singular value decomposition (SVD) of the raw data matrix. princomp() only handles data sets where there are at least as many samples (rows) as variables (columns); prcomp() can handle that type of data and also data sets where there are more columns than rows. In addition, and perhaps of greater importance depending on what uses you have in mind, the SVD is preferred over the eigendecomposition for its better numerical accuracy.
I have tagged the Q with html5 and canvas in the hope specialists in those can help. If you don't get any responses, delete point 3 from your Q and start a new one specifically on the topic of displaying the PCs using canvas, referencing this one for detail.
You can find out about any R object by doing str(object_name). In this case:
m <- matrix(rnorm(50), nrow = 10)
res <- prcomp(m)
str(res)
If you look at the help page for prcomp by doing ?prcomp, you can discover that the scores are in res$x and the loadings are in res$rotation. These are labelled by PC already. There are no colours, unless you decide to assign some in the course of making a plot. See the respective help pages for a comparison of princomp and prcomp; basically, the difference between them comes down to the method used behind the scenes. I can't help you with your last question.
You state that you perform PCA on a 2D matrix. If this is your data matrix, there is no way to get 3D PCs. Of course, it might be that your 2D matrix is a covariance matrix of the data; in that case you need to use princomp (not prcomp!) and explicitly pass the covariance matrix m like this:
princomp(covmat = m)
Passing the covariance matrix like:
princomp(m)
does not yield the correct result.
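For example, a minimal sketch using the iris data from the earlier answer shows the difference between the two calls:
m <- cov(iris[, 1:4])           # suppose your 2D matrix is actually a covariance matrix
pc_cov <- princomp(covmat = m)  # correct: m is treated as a covariance matrix
pc_bad <- princomp(m)           # not what you want: the 4x4 matrix is treated as 4 observations
pc_raw <- princomp(iris[, 1:4]) # the usual call on the raw data matrix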

"Reverse" statistics: generating data based on mean and standard deviation

Having a dataset and calculating statistics from it is easy. How about the other way around?
Let's say I know some variable has an average X, standard deviation Y and assume it has normal (Gaussian) distribution. What would be the best way to generate a "random" dataset (of arbitrary size) which will fit the distribution?
EDIT: This kind of develops from this question; I could make something based on that method, but I am wondering if there's a more efficient way to do it.
You can generate standard normal random variables with the Box-Muller method. Then, to transform them to have mean mu and standard deviation sigma, multiply your samples by sigma and add mu; i.e. for each z from the standard normal, return mu + sigma*z.
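In R (where rnorm() already does all of this for you), a minimal sketch of that recipe might look like:
## Box-Muller: turn pairs of uniforms into standard normal draws
u1 <- runif(5000); u2 <- runif(5000)
z <- c(sqrt(-2 * log(u1)) * cos(2 * pi * u2),
       sqrt(-2 * log(u1)) * sin(2 * pi * u2))
## shift and scale to the desired mean (mu) and sd (sigma)
mu <- 100; sigma <- 15
x <- mu + sigma * z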
This is really easy to do in Excel with the norminv() function. Example:
=norminv(rand(), 100, 15)
would generate a value from a normal distribution with mean of 100 and stdev of 15 (human IQs). Drag this formula down a column and you have as many values as you want.
I found a page where this problem is solved in several programming languages:
http://rosettacode.org/wiki/Random_numbers
There are several methods to generate Gaussian random variables. The standard method is Box-Muller, which was mentioned earlier. A slightly faster alternative is the ziggurat algorithm:
http://en.wikipedia.org/wiki/Ziggurat_algorithm
Here's the wikipedia reference on generating Gaussian variables
http://en.wikipedia.org/wiki/Normal_distribution#Generating_values_from_normal_distribution
I'll give an example using R and the 2nd algorithm in the list here.
X <- 4; Y <- 2  # mean and sd
## the sum of 12 uniforms minus 6 is approximately standard normal (CLT / Irwin-Hall)
z <- sapply(rep(0, 100000), function(x) (sum(runif(12)) - 6) * Y + X)
plot(density(z))
> mean(z)
[1] 4.002347
> sd(z)
[1] 2.005114
> library(fUtilities)
> skewness(z,method ="moment")
[1] -0.003924771
attr(,"method")
[1] "moment"
> kurtosis(z,method ="moment")
[1] 2.882696
attr(,"method")
[1] "moment"
You could make it a kind of Monte Carlo simulation. Start with a wide random "acceptable range" and generate a few truly random values. Check your statistics and see if the average and variance are off. Adjust the "acceptable range" for the random values and add a few more values. Repeat until you have hit both your requirements and your population sample size.
Just off the top of my head, let me know what you think. :-)
The MATLAB function normrnd from the Statistics Toolbox can generate normally distributed random numbers with a given mu and sigma.
It is easy to generate a dataset with a normal distribution (see http://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform).
Remember that the generated sample will not have an exact N(0,1) distribution! You need to standardize it: subtract the mean and then divide by the standard deviation. Then you are free to transform this sample to a normal distribution with the given parameters: multiply by the standard deviation and then add the mean.
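In R, for example, the standardize-then-rescale step is just (mu, sigma and n here are whatever values you want):
n <- 1000; mu <- 4; sigma <- 2
z <- rnorm(n)                  # or any standard-normal generator, e.g. Box-Muller
z <- (z - mean(z)) / sd(z)     # standardize: the sample now has exactly mean 0 and sd 1
x <- mu + sigma * z            # rescale: the sample now has exactly mean mu and sd sigma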
Interestingly, numpy has a built-in function for that:
import numpy as np

def generate_dataset(mean, std, samples):
    dataset = np.random.normal(mean, std, samples)
    return dataset