Combining/Removing Edges - igraph

I am creating a circular lattice graph and considering its corresponding full graph. I call the edges in the full graph that are NOT in the lattice graph "non lattice" edges. Now I want to select a number of edges from the lattice to remove, and then add a number of randomly selected non lattice edges to create a NEW graph. Here is a small example; you will see where it breaks down (last line). Basically, I am having trouble with the edge lists being sequences. Also, if I can do this without having to name all of the nodes, that would be better, since eventually I will have large graphs! Here is my current code:
n <- 5 #number of nodes
k <- 1 #number of neighbors for lattice connections
g <- make_lattice(length = n, dim = 1, nei = k, circular = TRUE) #lattice
V(g)$name <- letters[1:n] #name nodes
lat_e <- E(g) #lattice edges
g1 <- g #make a copy
g1[V(g1), V(g1)] <- TRUE #add all possible edges
g1 <- simplify(g1) #remove loops
newe <- E(g1)
non_lat_e <- difference(newe, lat_e) #non lattice edges
n_switch <- 2 #want to change 2 lattice edges to non_lattice edges
e_rem <- sample(1:length(lat_e),n_switch)
e_add <- sample(1:length(non_lat_e), n_switch)
g <- delete_edges(g, lat_e[e_rem]) #delete lattice edges
g <- add.edges(g, non_lat_e[e_add]) #add non lattice edges. ERROR

You can't refer to nodes from one graph when adding edges to a new graph. You'll need to provide a pairwise vector of vertex names in order to add new edges. Something like this:
non_lattice_edges <- ends(g1, non_lat_e[e_add]) %>%
  t() %>%
  as.vector()
g <- add.edges(g, non_lattice_edges)
If you never delete a vertex, then you don't need to add vertex names because the vertex indices will stay consistent. Here is your code, but without the named vertices:
n <- 5 #number of nodes
k <- 1 #number of neighbors for lattice connections
g <- make_lattice(length = n, dim = 1, nei = k, circular = TRUE) #lattice
lat_e <- E(g) #lattice edges
g1 <- g #make a copy
g1[V(g1), V(g1)] <- TRUE #add all possible edges
g1 <- simplify(g1) #remove loops
newe <- E(g1)
non_lat_e <- difference(newe, lat_e) #non lattice edges
n_switch <- 2
e_rem <- sample(1:length(lat_e),n_switch)
e_add <- sample(1:length(non_lat_e), n_switch)
g <- delete_edges(g, lat_e[e_rem]) #delete lattice edges
non_lattice_edges <- ends(g1, non_lat_e[e_add]) %>%
  t() %>%
  as.vector()
g <- add.edges(g, non_lattice_edges)
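As a variation on the code above, here is a minimal sketch that avoids building the full graph by hand. It assumes igraph's `complementer()` is an acceptable stand-in for the "all possible edges minus lattice edges" step; that swap is not part of the original answer, and the names `non_lat`, `new_pairs` and `g_new` are only illustrative. Because no vertex is ever deleted, the vertex indices of the complement graph match those of the lattice, so no vertex names are needed.

```r
library(igraph)

set.seed(1)    # only for reproducibility of the random sampling
n <- 5         # number of nodes
k <- 1         # number of neighbors for lattice connections
n_switch <- 2  # how many lattice edges to swap for non-lattice edges

g <- make_lattice(length = n, dim = 1, nei = k, circular = TRUE)

# Assumption: the complement graph holds exactly the "non lattice" edges
non_lat <- complementer(g, loops = FALSE)

# Sample edge ids to remove from the lattice and to take from the complement
e_rem <- sample(ecount(g), n_switch)
e_add <- sample(ecount(non_lat), n_switch)

# ends() returns a two-column matrix of vertex ids; transposing and
# flattening gives the pairwise vector (v1, v2, v1, v2, ...) add_edges() expects
new_pairs <- as.vector(t(ends(non_lat, e_add, names = FALSE)))

g_new <- add_edges(delete_edges(g, e_rem), new_pairs)
```

The resulting graph keeps the original edge count, and since the added edges come from the complement they cannot duplicate a lattice edge.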

Related

In R ggraph, how to fix circular plot structure

The code below does not produce the desired plot (the data structure does not show in the current plot). How can I fix it to get the desired plot? Thanks!
library(ggraph)
library(igraph)
library(tidyverse)
md <- data.frame(category = c('FDM','FDM','FDM'),
                 item = c('A1','A1','C1'),
                 subitem = c('A11','A12','C1'),
                 amount = c(1,2,3))
vertices <- md %>% gather(key='type',value = 'item',- amount) %>% select(- type) %>%
  group_by(item) %>% summarise(amount= sum(amount))
pt <- graph_from_data_frame(md,vertices = vertices)
ggraph(pt,layout = 'circlepack', weight =amount)+
  geom_node_circle(aes(fill=depth))+
  geom_node_label(aes(label = paste0(name,'\n',amount )))+theme_void()

How can I adjust my pcor model for confounders and do it for many models at one time?

I have a dataset with many columns. First column is the outcome (Test)(Dependent variable, y). Columns 2-32 are confounders. Finally, columns 33-54 are miRNAs (expression)(Independent variable, x).
I want to compute a partial correlation (to obtain the p-value and estimate) between each of the independent variables and the dependent variable, adjusting for the confounders. Since my variables don't follow a normal distribution, I want to use the Spearman method.
I don't want to put all of them in the same model, I want different models, one by one. That is:
Model 1: Test vs miRNA1 by confounders
Model 2: Test vs miRNA2 by confounders
[...]
Model 21: Test vs miRNA21 by confounders
I tried with an auxiliary function. But it doesn't work. Any help? Thanks :)
The script is here:
#data
n <- 10000
nc <- 30
nm <- 20
y <- rnorm(n = n)
X <- matrix(rnorm(n = n*(nc+nm)), ncol = nc + nm)
df <- data.frame(y = y, X)
#variable names
confounders <- colnames(df)[2:31]
mirnas <- colnames(df)[32:51]
#auxiliary regression function
pcor_fun <- function(data, y_col, X_cols) {
  formula <- as.formula(paste(y_col, X_cols))
  pcor <- pcor.test(formula = formula, data = data, method = "spearman")
  pcor_summary <- summary(pcor)$coef
  return(pcor_summary)
}
#simple linear regressions
lm_list1 <- lapply(X = mirnas, FUN = pcor_fun, data = df, y_col = "y")
lm_list1[[1]]
#adjusting by confounders
lm_list2 <- lapply(X = mirnas, FUN = function(x) pcor_fun(data = df, y_col = "y", X_cols = c(confounders, x)))
lm_list2[[1]]
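One possible direction, offered as a sketch rather than a confirmed fix: assuming the `pcor.test` in the script comes from the ppcor package, that function takes the two variables and the conditioning variables as separate arguments (`x`, `y`, `z`) rather than a formula. The helper name `pcor_one` and the smaller simulated data below are hypothetical and only serve to keep the example quick to run.

```r
library(ppcor)

set.seed(42)
n  <- 200   # fewer rows than in the question, to keep the sketch fast
nc <- 3     # number of confounders (hypothetical, reduced)
nm <- 4     # number of miRNAs (hypothetical, reduced)
df <- data.frame(y = rnorm(n), matrix(rnorm(n * (nc + nm)), ncol = nc + nm))

confounders <- colnames(df)[2:(1 + nc)]
mirnas      <- colnames(df)[(2 + nc):(1 + nc + nm)]

# One model per miRNA: Spearman partial correlation of y with that miRNA,
# given all confounders
pcor_one <- function(mirna, data, y_col, conf_cols) {
  res <- pcor.test(x = data[[mirna]],
                   y = data[[y_col]],
                   z = data[, conf_cols],
                   method = "spearman")
  data.frame(mirna = mirna, estimate = res$estimate, p.value = res$p.value)
}

results <- do.call(rbind, lapply(mirnas, pcor_one,
                                 data = df, y_col = "y",
                                 conf_cols = confounders))
```

Each row of `results` then corresponds to one "Test vs miRNA by confounders" model.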

R - Issue with the DOM of the danish parliament (webscraping)

I've been working on a webscraping project for the political science department at my university.
The Danish parliament is very transparent about its democratic process and uploads all the legislative documents to its website. I've been crawling all pages starting from 2008. Right now I'm parsing the information into a dataframe, and I'm having an issue that I have not been able to resolve so far.
If we look at the DOM we can see that they named most of the objects div.tingdok-normal. The number of objects varies between 16 and 19. To parse the information correctly for my dataframe, I tried to grep out the necessary parts according to patterns. However, the issue is that sometimes my patterns match more than once, and I don't know how to tell R that I only want the first match.
For the sake of an example, I include some code:
library(RCurl) # getURL()
library(rvest) # read_html(), html_nodes(), html_text(), %>%
final.url <- "https://www.ft.dk/samling/20161/lovforslag/l154/index.htm"
to.save <- getURL(final.url)
p <- read_html(to.save)
normal <- p %>% html_nodes("div.tingdok-normal > span") %>% html_text(trim = TRUE)
tomatch <- c("Forkastet regeringsforslag", "Forkastet privat forslag", "Vedtaget regeringsforslag", "Vedtaget privat forslag")
type <- unique(grep(paste(tomatch, collapse="|"), normal, value = TRUE))
Maybe you can help me with that.
My understanding is that you want to extract the text of the webpage, because the "tingdok-normal" are related to the text. I was able to get the text of the webpage with the following code. Also, the following code identifies the position of the first "regex hit" of the different patterns to match.
library(pagedown)
library(pdftools)
library(stringr)
pagedown::chrome_print("https://www.ft.dk/samling/20161/lovforslag/l154/index.htm",
"C:/.../danish.pdf")
text <- pdftools::pdf_text("C:/.../danish.pdf")
tomatch <- c("(A|a)ftalen", "(O|o)pholdskravet")
nb_Tomatch <- length(tomatch)
list_Position <- list()
list_Text <- list()
for(i in 1 : nb_Tomatch)
{
  # Locates the first hit of the regex
  # To locate all regex hits, use stringr::str_locate_all
  list_Position[[i]] <- stringr::str_locate(text , pattern = tomatch[i])
  list_Text[[i]] <- stringr::str_sub(string = text,
                                     start = list_Position[[i]][1, 1],
                                     end = list_Position[[i]][1, 2])
}
Here is another approach:
library(RDCOMClient)
library(stringr)
library(rvest)
url <- "https://www.ft.dk/samling/20161/lovforslag/l154/index.htm"
IEApp <- COMCreate("InternetExplorer.Application")
IEApp[['Visible']] <- TRUE
IEApp$Navigate(url)
Sys.sleep(5)
doc <- IEApp$Document()
html_Content <- doc$documentElement()$innerText()
tomatch <- c("(A|a)ftalen", "(O|o)pholdskravet")
nb_Tomatch <- length(tomatch)
list_Position <- list()
list_Text <- list()
for(i in 1 : nb_Tomatch)
{
  # Locates the first hit of the regex in the page text fetched above
  # To locate all regex hits, use stringr::str_locate_all
  list_Position[[i]] <- stringr::str_locate(html_Content, pattern = tomatch[i])
  list_Text[[i]] <- stringr::str_sub(string = html_Content,
                                     start = list_Position[[i]][1, 1],
                                     end = list_Position[[i]][1, 2])
}
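If the goal is only to keep the first match within the character vector from the question, a base-R sketch might look like the following. The contents of `normal` here are a made-up stand-in for the scraped spans (the real page text is not reproduced), and the `NA` fallback for "no match" is an assumption about what is wanted in the dataframe.

```r
# Hypothetical stand-in for the scraped `normal` vector from the question
normal <- c("L 154",
            "Vedtaget regeringsforslag",
            "Af: ministeren",
            "Vedtaget regeringsforslag")

tomatch <- c("Forkastet regeringsforslag", "Forkastet privat forslag",
             "Vedtaget regeringsforslag", "Vedtaget privat forslag")
pattern <- paste(tomatch, collapse = "|")

# grep() returns every matching element in order; keep only the first,
# falling back to NA when nothing matches
hits <- grep(pattern, normal, value = TRUE)
type <- if (length(hits) > 0) hits[1] else NA_character_
```

Unlike `unique()`, indexing with `[1]` also handles the case where two different patterns match on the same page.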

How to extract an adjacency matrix of a giant component of a graph using R?

I would like to extract an adjacency matrix of a giant component of a graph using R.
For example, I can create an Erdos-Renyi graph g(n, p):
n = 100
p = 1.5/n
g = erdos.renyi.game(n, p)
coords = layout.fruchterman.reingold(g)
plot(g, layout=coords, vertex.size = 3, vertex.label=NA)
# Get the components of an undirected graph
cl = clusters(g)
# How many components?
cl$no
# How big are these (the first row is size, the second is the number of components of that size)?
table(cl$csize)
cl$membership
# Get the giant component
nodes = which(cl$membership == which.max(cl$csize))
# Color in red the nodes in the giant component and in sky blue the rest
V(g)$color = "SkyBlue2"
V(g)[nodes]$color = "red"
plot(g, layout=coords, vertex.size = 3, vertex.label=NA)
Here, I only want to extract the adjacency matrix of those red nodes.
It's easy to get the giant component as a new graph like below and then get the adjacency matrix.
g <- erdos.renyi.game(100, .015, directed = TRUE)
# if you have directed graph, decide if you want
# strongly or weakly connected components
co <- components(g, mode = 'STRONG')
gi <- induced.subgraph(g, which(co$membership == which.max(co$csize)))
# if you want here you can decide if you want values only
# in the upper or lower triangle or both
ad <- get.adjacency(gi)
But you might want to keep the vertex IDs of the original graph. In this case just subset the adjacency matrix:
g <- erdos.renyi.game(100, .015)
co <- components(g)
gi_vids <- which(co$membership == which.max(co$csize))
gi_ad <- get.adjacency(g)[gi_vids, gi_vids]
# you can even add the names of the nodes
# as row and column names.
# generating dummy node names:
V(g)$name <- sapply(
  seq(vcount(g)),
  function(i){
    paste(letters[ceiling(runif(5) * 26)], collapse = '')
  }
)
rownames(gi_ad) <- V(g)$name[gi_vids]
colnames(gi_ad) <- V(g)$name[gi_vids]
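To check that both approaches above give the same matrix, here is a small deterministic sketch. It uses the snake_case names `induced_subgraph()` and `as_adjacency_matrix()` in place of `induced.subgraph()` and `get.adjacency()`; the toy graph and the names `giant_vids`, `ad_sub` and `ad_full` are illustrative only.

```r
library(igraph)

# Toy graph: a triangle (1-2-3), a separate edge (4-5), and an isolated vertex 6
g <- make_graph(c(1, 2, 2, 3, 3, 1, 4, 5), n = 6, directed = FALSE)

co <- components(g)
giant_vids <- which(co$membership == which.max(co$csize))

# Approach 1: build the giant component as its own graph, then take its matrix
gi <- induced_subgraph(g, giant_vids)
ad_sub <- as_adjacency_matrix(gi, sparse = FALSE)

# Approach 2: subset the adjacency matrix of the full graph,
# which keeps the link to the original vertex ids
ad_full <- as_adjacency_matrix(g, sparse = FALSE)[giant_vids, giant_vids]
```

With `sparse = FALSE` the result is an ordinary dense matrix, which is convenient for small graphs; for large graphs the default sparse matrix is likely the better choice.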

kableExtra::text_spec - Rotate Text - Unwanted Commas

I am using RStudio > blogdown > Hugo to create a blog and am inserting this R code in a post. When the HTML is rendered, there are commas between the rotated letters. Why is that?
library("knitr")
library("kableExtra")
library("dplyr")
library("formattable")
library("stringr")
library("tidyverse")
p1 <- c("R Markdown is pretty neat. You can do things like this. I wonder why more people don't")
p1 <- c("Hello World!")
p2 <- c("do this. It's so much easier to read. NOTE: Those people live here.")
p_text <- unlist(strsplit(p1, "")) # strsplit returns a list. Make it a vector.
num_char <- length(p_text)
p_angle <- seq(30, 360, 30)
num_ang <- length(p_angle)
p_angle_long <- rep(p_angle, ceiling(num_char / num_ang)) # Repeat the angles for the length of the string
p_angle_long <- p_angle_long[1:num_char]
rtext <- text_spec(p_text, "html", bold = TRUE, angle = p_angle_long)
The output of text_spec is a vector with each letter (+ accompanying HTML tags) as a separate element. You can combine into a single string with paste0:
# Example RMarkdown chunk that produces rotated text:
```{r txt, results='asis'}
library("knitr")
library("kableExtra")
library("tidyverse")
p1 <- c("Hello World!")
p2 <- c("do this. It's so much easier to read. NOTE: Those people live here.")
p_text <- unlist(strsplit(p1, "")) # strsplit returns a list. Make it a vector.
num_char <- length(p_text)
p_angle <- seq(30, 360, 30)
num_ang <- length(p_angle)
p_angle_long <- rep(p_angle, ceiling(num_char / num_ang))
# Repeat the angles for the length of the string
p_angle_long <- p_angle_long[1:num_char]
rtext <- text_spec(p_text, "html", bold = TRUE, angle = p_angle_long)
cat(paste0(rtext, collapse=""))
```