Full or complete graph from supernodes - igraph

I am trying to build a complete graph from two different groups of nodes. The first one (A) has 8 vertices, and the second one (B) has 147 vertices. My first attempt was this:
g <- make_empty_graph(directed= FALSE)
g <- g + vertex(c(A, B))
g <- g + graph.full(A, B)
plot(g)
But, unfortunately, this procedure just generated a disconnected graph like this:
Could someone tell me the right way to produce a fully connected graph from two or more different groups of nodes in Csárdi's igraph? I appreciate your help!
Best,
A.

I am not quite sure what you are asking, but it seems to be one of two things.
You might want the fully connected graph on all of the nodes in A and B together. Since A and B here are vertex counts, you can get that with
g = graph.full(A + B)
You might want the full graph for A joined (as a disjoint union) to the full graph for B. That is also simple.
gA = graph.full(A)
gB = graph.full(B)
g = gA + gB
It is a bit messy to get a nice plot of this though. This works pretty well.
LOA = layout_with_fr(gA)
LOB = layout_with_fr(gB)
LOA[,1] = (LOA[,1] - min(LOA[,1]))/4 + max(LOB[,1]) +0.05
LOA[,2] = (LOA[,2] - min(LOA[,2]))/4 + max(LOB[,2]) +0.05
plot(g, layout=rbind(LOA,LOB), vertex.size=9, margin=-0.2)

Related

Solving a steady flow in a subdomain of a mesh using FiPy

I'm fairly new to FiPy and I'm currently facing an issue which I am sure can be solved easily: I want to solve a 3D steady flow of the form:
eq = ( DiffusionTerm(var=u) == -(1/mu) * dP + g_acc * (rho/mu) * ymax )
where the velocity u is in the y-direction and du/dy = 0.
I would like to let the PDE solve within a subdomain of the entire 3D mesh, meaning that:
0 <= X < Boundary_1, u=0.
Boundary_1 <= X < Boundary_2, u = PDE solution
Boundary_2 <= X < Lx, u = 0.
Currently I have tried the following:
mesh = Grid3D(dx=dx, dy=dy, dz=dz, Lx=(Lx-0), Ly =(ymax-ymin), Lz=(zmax-zmin))
u = CellVariable(name = "velocity", mesh = mesh)
X, Y, Z = mesh.cellCenters
LeftWall = (X <= xBoundary_left)
RightWall = (X > xBoundary_right)
FrontWall = (mesh.facesFront)
BackWall = (mesh.facesBack)
u.constrain(0., where=LeftWall)
u.constrain(0., where=RightWall)
u.constrain(0., where=FrontWall)
u.constrain(0., where=BackWall)
This leads to the solution shown in image 1 (see attached image). The boundary conditions in the X direction are not taken into account the way I would like, as you can see in the example in image 2, where only the PDE domain is shown.
What I am looking for is a way to define the boundary conditions at the faces of the subdomain in such a way that it does not 'crop' the solution, but rather only solves the PDE for that subdomain. If it is possible to 'stitch' meshes together, with a CellVariable u that has the value 0 on two meshes and the value of the solved PDE on the third, that would be great as well!
I have tried working with inner boundary conditions in the form of an Implicit source, but that ended up in different errors.
Any help would be much appreciated!
FiPy constraints do not work on internal faces.
In our own work, rather than only solving an equation in a subdomain, we modify the coefficients to cause different behaviors to dominate in different subdomains. Conservation of momentum and conservation of mass don't suddenly stop being true; rather different conditions lead to, e.g., different Reynolds numbers.
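To make that idea concrete, here is a rough sketch (not a drop-in solution: the grid size, the subdomain bounds xBoundary_left/xBoundary_right, and the physical constants below are placeholders) of solving one equation on the whole mesh while a spatially varying coefficient and source confine the interesting behavior to the subdomain:
from fipy import Grid3D, CellVariable, FaceVariable, DiffusionTerm

# Placeholder geometry and physics -- substitute your own values
nx = ny = nz = 20
dx = dy = dz = 0.05
xBoundary_left, xBoundary_right = 0.3, 0.7
dP, mu, rho, g_acc, ymax = 1.0, 1.0e-3, 1.0e3, 9.81, 1.0

mesh = Grid3D(dx=dx, dy=dy, dz=dz, nx=nx, ny=ny, nz=nz)
u = CellVariable(name="velocity", mesh=mesh, value=0.)

# Diffusion coefficient defined on faces: O(1) inside the subdomain,
# tiny outside, so the solution stays pinned near zero there.
Xf, Yf, Zf = mesh.faceCenters
D = FaceVariable(mesh=mesh, value=1e-6)
D.setValue(1., where=(Xf > xBoundary_left) & (Xf < xBoundary_right))

# Source term is only active inside the subdomain.
Xc, Yc, Zc = mesh.cellCenters
source = CellVariable(mesh=mesh, value=0.)
source.setValue(-(1. / mu) * dP + g_acc * (rho / mu) * ymax,
                where=(Xc > xBoundary_left) & (Xc < xBoundary_right))

# Exterior no-slip walls, as in the original snippet
u.constrain(0., where=mesh.facesFront)
u.constrain(0., where=mesh.facesBack)

eq = DiffusionTerm(coeff=D, var=u) == source
eq.solve(var=u)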
It is possible to solve different equations on different meshes and communicate between them. See, e.g.,
https://www.mail-archive.com/search?q=how+to+set+up+data+transfer+between+two+adjacent+nonuniform+meshs&l=fipy%40nist.gov
How to extract a plane from a 3D variable in FiPy (3D to 2D)
https://www.mail-archive.com/search?l=fipy%40nist.gov&q=How+to+combine+scipy.interpolate.interp2d+with+fipy+variables&x=0&y=0
https://www.mail-archive.com/search?l=fipy%40nist.gov&q=Spline+interpolation+and+fipy+variable&x=18&y=9

Store 2 previous arrays to implement the Leapfrog numerical scheme

In the context of numerically solving the advection equation, I am trying to implement the following recurrence formula in a time loop:
As you can see, I need the value from two time levels back (j-1) and the value at the previous level (j) to compute the value at time level (j+1).
I don't know how to implement this recurrence formula. Below is my attempt in Python, where u represents the array of values T at each iteration:
l = 1
# Time loop
for i in range(1, nt+1):
    # Leapfrog scheme
    # Store (i-1) value for scheme formula
    if (l < 2):
        atemp = copy(u)
        l = l + 1
    elif (l == 2):
        btemp = copy(atemp)
        l = 1
    u[1:nx-1] = btemp[1:nx-1] - cfl*(u[2:nx] - u[0:nx-2])
    t = t + dt
Coefficient cfl is equal to s.
But the simulation does not give fully correct results. I think my approach is not correct.
How can I implement this recurrence? I.e., mostly, how do I store the (j-1) value in time so I can inject it into the formula for computing (j+1)?
Update
In the formula:
the time index j has to start from j = 1, since we have the term T_(i,j-1).
So for the first iteration, we have:
T_(i,2) = T_(i,0) - s * (T_(i+1,1) - T_(i-1,1))
Then, if I only use a time loop (and not a spatial loop, with which I could compute dudx[i] = T[i+1] - T[i-1]), how can I compute (T_(i+1,1) - T_(i-1,1)) without precalculating dudx[i] = T_(i+1,1) - T_(i-1,1)?
That is the trick I tried to implement in my original question. The main problem is that I am required to use only a time loop.
The code would be simpler if I could use a 2D array with elements T[i][j], i for space and j for time, but I am not allowed to use a 2D array in my examination.
There are a few problems I see in your code. The first is notation. From the numerical scheme you posted, it looks like you are discretizing time with j and space with i, using central differences in both. But in your code the time loop is written in terms of i, which is confusing. I will use j for space and n for time here.
Second, this line
u[1:nx-1] = btemp[1:nx-1] - cfl*(u[2:nx] - u[0:nx-2])
is not correct, since for the spatial derivative du/dx you need to apply the central difference scheme at every spatial point of u. Hence, u[2:nx] - u[0:nx-2] is doing nothing like this; it is just subtracting what seems to be the solution including boundary points on the left from the solution including boundary points on the right. You need to properly calculate this spatial derivative.
Finally, the Leapfrog method, which indeed takes the n-1 solution into account, is usually implemented by keeping a copy of the previous time step in another variable such as u_prev. So if you use the Leapfrog time scheme plus the central difference spatial scheme, in the end you should have something like
u_prev = u_init
u = u_prev
for n in time...:
    u_new = u_prev - cfl*(dudx)
    u_prev = u
    u = u_new
Note that u_new on the LHS is the solution at time n+1, u_prev is at time n-1 and dudx uses u at the current time n. Also, you can compute dudx with
for j in space...:
    dudx[j] = u[j+1] - u[j-1]
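Putting the pieces together, a minimal runnable sketch of this bookkeeping (a sketch only, assuming a 1D advection problem; nx, nt, cfl and the initial condition below are made-up placeholders, and the very first step uses a simple forward-in-time update because leapfrog needs two starting levels):
import numpy as np

# Hypothetical problem setup
nx, nt, cfl = 101, 200, 0.4
x = np.linspace(0.0, 1.0, nx)
u_prev = np.exp(-200.0 * (x - 0.3) ** 2)   # solution at time level n-1
dudx = np.zeros(nx)

# First step: forward-in-time, centred-in-space, to obtain level n
for j in range(1, nx - 1):
    dudx[j] = u_prev[j + 1] - u_prev[j - 1]
u = u_prev - 0.5 * cfl * dudx

for n in range(1, nt):
    # Central-difference spatial derivative of the current level n
    for j in range(1, nx - 1):
        dudx[j] = u[j + 1] - u[j - 1]
    # Leapfrog update: level n+1 from level n-1 and the derivative at level n
    u_new = u_prev - cfl * dudx
    # Rotate the time levels: only two extra arrays are ever stored
    u_prev, u = u, u_new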

Extract 3D coordinates from R PCA

I am trying to find a way to make a 3D PCA visualization from R more portable.
I have run a PCA on a 2D matrix using prcomp().
How do I export the 3D coordinates of the data points, along with the labels and colors (RGB) associated with each?
What's the practical difference between princomp() and prcomp()?
Any ideas on how to best view the 3D PCA plot using HTML5 and canvas?
Thanks!
Here is an example to work from:
pc <- prcomp(~ . - Species, data = iris, scale = TRUE)
The axis scores are stored in component x; as such you can just write them out (you don't say how you want them exported) as a CSV using:
write.csv(pc$x[, 1:3], "my_pc_scores.csv")
If you want to assign information to these scores (the colours and labels, which are not something associated with the PCA but something you assign yourself), then add them to the matrix of scores and then export. In the example above there are three species with 50 observations each. If we want that information exported alongside the scores then something like this will work
scrs <- data.frame(pc$x[, 1:3], Species = iris$Species,
                   Colour = rep(c("red", "green", "black"), each = 50))
write.csv(scrs, "my_pc_scores2.csv")
scrs looks like this:
> head(scrs)
        PC1        PC2         PC3 Species Colour
1 -2.257141 -0.4784238  0.12727962  setosa    red
2 -2.074013  0.6718827  0.23382552  setosa    red
3 -2.356335  0.3407664 -0.04405390  setosa    red
4 -2.291707  0.5953999 -0.09098530  setosa    red
5 -2.381863 -0.6446757 -0.01568565  setosa    red
6 -2.068701 -1.4842053 -0.02687825  setosa    red
Update: I missed the point about RGB. See ?rgb for ways of specifying this in R, but if all you want are the RGB strings then change the above to use something like
Colour = rep(c("#FF0000","#00FF00","#000000"), each = 50)
instead, where you specify the RGB strings you want.
The essential difference between princomp() and prcomp() is the algorithm used to calculate the PCA. princomp() uses an eigen decomposition of the covariance or correlation matrix, whilst prcomp() uses the singular value decomposition (SVD) of the raw data matrix. princomp() only handles data sets where there are at least as many samples (rows) as variables (columns). prcomp() can handle that type of data and also data sets where there are more columns than rows. In addition, and perhaps of greater importance depending on what uses you had in mind, the SVD is preferred over the eigen decomposition for its better numerical accuracy.
I have tagged the Q with html5 and canvas in the hope specialists in those can help. If you don't get any responses, delete point 3 from your Q and start a new one specifically on the topic of displaying the PCs using canvas, referencing this one for detail.
You can find out about any R object by doing str(object_name). In this case:
m <- matrix(rnorm(50), nrow = 10)
res <- prcomp(m)
str(m)
If you look at the help page for prcomp by doing ?prcomp, you can discover that the scores are in res$x and the loadings are in res$rotation. These are labeled by PC already. There are no colors, unless you decide to assign some colors in the course of a plot. See the respective help pages for a comparison between princomp and prcomp. Basically, the difference between them has to do with the method used behind the scenes. I can't help you with your last question.
You state that you perform PCA on a 2D matrix. If this is your data matrix, there is no way to get 3D PCAs. Of course, it might be that your 2D matrix is a covariance matrix of the data; in that case you need to use princomp (not prcomp!) and explicitly pass the covariance matrix m like this:
princomp(covmat = m)
Passing the covariance matrix like:
princomp(m)
does not yield the correct result.

Normal vector from least squares-derived plane

I have a set of points and I can derive a least squares solution in the form:
z = Ax + By + C
The coefficients I compute are correct, but how would I get the vector normal to the plane from an equation of this form? Simply using the A, B and C coefficients from this equation does not seem to give a correct normal vector for my test dataset.
Following on from dmckee's answer:
a x b = (a2·b3 − a3·b2, a3·b1 − a1·b3, a1·b2 − a2·b1)
In your case a1 = 1, a2 = 0, a3 = A, b1 = 0, b2 = 1, b3 = B,
so a x b = (−A, −B, 1).
Form the two vectors
v1 = <1 0 A>
v2 = <0 1 B>
both of which lie in the plane and take the cross-product:
N = v1 x v2 = <-A, -B, +1> (or v2 x v1 = <A, B, -1> )
It works because the cross-product of two vectors is always perpendicular to both of the inputs. So using two (non-colinear) vectors in the plane gives you a normal.
NB: You probably want a normalized normal, of course, but I'll leave that as an exercise.
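As a quick numerical sanity check of the cross product above (a sketch using numpy, with made-up coefficients A and B):
import numpy as np

# Hypothetical fitted coefficients for z = A*x + B*y + C
A, B = 2.0, -1.5

# Two direction vectors lying in the plane
v1 = np.array([1.0, 0.0, A])
v2 = np.array([0.0, 1.0, B])

n = np.cross(v1, v2)            # -> array([-A, -B, 1.]), i.e. [-2. , 1.5, 1. ]
n_unit = n / np.linalg.norm(n)  # normalized normal
print(n, n_unit)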
A little extra color on the dmckee answer. I'd comment directly, but I do not have enough SO rep yet. ;-(
The plane z = Ax + By + C only contains the points (1, 0, A) and (0, 1, B) when C = 0, so we would really be talking about the plane z = Ax + By. That is fine, of course, since this second plane is parallel to the original one; it is the unique vertical translate of it that passes through the origin. The orthogonal vector we wish to compute is invariant under translations like this, so no harm is done.
Granted, dmckee's phrasing is that his specified "vectors" lie in the plane, not the points, so he's arguably covered. But it strikes me as helpful to explicitly acknowledge the implied translations.
Boy, it's been a while for me on this stuff, too.
Pedantically yours... ;-)

How to develop a Plagiarism detector?

I am planning to make a plagiarism detector as my Computer Science Engineering final-year project, and I would like your suggestions on how to go about it.
I would appreciate it if you could suggest which fields of CS I need to focus on, and also which language would be the most appropriate to implement it in.
The language is nearly irrelevant. Another question exists that discusses this a bit more. Basically, the method suggested there is to use Google: extract parts of the target text and search for them on Google.
I am making a plagiarism checker using Python as a hobby project.
The following steps are to be followed:
Tokenize the document.
Remove all the stop words using NLTK library.
Use the Gensim library to find the most relevant words, line by line. This can be done by building an LDA or LSA model of the document.
Use Google Search API to search for those words.
Note:
You might choose to use the Google API and search the whole document at once. This will work when you are working with a smaller amount of data. However, when building a plagiarism checker for sites and web-scraped data, you will need to apply the NLTK algorithms.
The Google Search API will return the top articles containing the same words produced by the LDA or LSA from the Gensim library functions.
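A rough sketch of steps 1-3 (a sketch, not exact code: it assumes NLTK and Gensim are installed and the NLTK tokenizer and stop-word data are already downloaded; the document string and the number of keywords are placeholders):
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from gensim import corpora, models

document = "Paste the text you want to check here."

# 1. Tokenize and 2. drop stop words and non-alphabetic tokens
stops = set(stopwords.words("english"))
tokens = [w.lower() for w in word_tokenize(document)
          if w.isalpha() and w.lower() not in stops]

# 3. A tiny LDA model over the (single-document) corpus to pull out the
#    most relevant words; a real checker would feed in many documents.
dictionary = corpora.Dictionary([tokens])
corpus = [dictionary.doc2bow(tokens)]
lda = models.LdaModel(corpus, num_topics=1, id2word=dictionary, passes=5)
keywords = [dictionary[term_id] for term_id, _ in lda.get_topic_terms(0, topn=10)]
print(keywords)  # candidate terms to feed into a web search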
Hope it helped.
Here is some simple code to compute the similarity percentage between two files:
import numpy as np

def levenshtein(seq1, seq2):
    # Classic dynamic-programming edit distance between two sequences
    size_x = len(seq1) + 1
    size_y = len(seq2) + 1
    matrix = np.zeros((size_x, size_y))
    for x in range(size_x):
        matrix[x, 0] = x
    for y in range(size_y):
        matrix[0, y] = y
    for x in range(1, size_x):
        for y in range(1, size_y):
            if seq1[x-1] == seq2[y-1]:
                matrix[x, y] = min(
                    matrix[x-1, y] + 1,
                    matrix[x-1, y-1],
                    matrix[x, y-1] + 1
                )
            else:
                matrix[x, y] = min(
                    matrix[x-1, y] + 1,
                    matrix[x-1, y-1] + 1,
                    matrix[x, y-1] + 1
                )
    # print(matrix)
    return matrix[size_x - 1, size_y - 1]

# Read both files and strip newlines and spaces before comparing
with open('original.txt', 'r') as file:
    data = file.read().replace('\n', '')
str1 = data.replace(' ', '')

with open('target.txt', 'r') as file:
    data = file.read().replace('\n', '')
str2 = data.replace(' ', '')

# Normalise the edit distance by the longer string's length
if len(str1) > len(str2):
    length = len(str1)
else:
    length = len(str2)

print(100 - round((levenshtein(str1, str2) / length) * 100, 2), '% Similarity')
Create two files, "original.txt" and "target.txt", in the same directory and fill them with the content to compare.
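As a quick sanity check of the levenshtein function itself (assuming it is defined as above), the classic "kitten"/"sitting" pair should report an edit distance of 3:
print(levenshtein("kitten", "sitting"))  # 3.0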
You had better try Python, because it's easy to develop a program like this with it. I'm also doing a project on a plagiarism detector. I suggest you tokenize the string first. It is actually complicated, but that is the way to go if you are trying to handle source code; if you are developing a plagiarism detector for text files, use the cosine similarity method, the LCS method, or simply consider token positions. A sketch of the cosine similarity approach follows below.
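A minimal, dependency-free sketch of the cosine similarity idea mentioned above (the helper name cosine_similarity is hypothetical, and a real checker would use better tokenization and weighting such as TF-IDF):
import math
import re
from collections import Counter

def cosine_similarity(text_a, text_b):
    # Bag-of-words cosine similarity between two strings, in [0, 1]
    counts_a = Counter(re.findall(r"\w+", text_a.lower()))
    counts_b = Counter(re.findall(r"\w+", text_b.lower()))
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

print(cosine_similarity("the cat sat on the mat", "the cat is on the mat"))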