Will shortest_paths consider all paths to vertices with the same name but different index IDs? - igraph

I have a vertex named "A" and another one with the same name "A" but a different index ID. If there is a vertex called "X" and I do:
graph.get_all_shortest_paths("X", "A" , mode = 'out')
Will it return the shortest paths considering both X to A¹ and X to A², or only the path to the first "A" that igraph sees, given that igraph already resolved the name to a single vertex when the edges were connected?
Thanks.

It appears that igraph defaults to the lowest vertex id if there are vertices with the same name. See below:
library(igraph)
set.seed(8675309)
el <- data.frame(to = sample(LETTERS[1:5], size = 10, replace = TRUE),
                 from = sample(LETTERS[1:5], size = 10, replace = TRUE))
g <- graph_from_edgelist(as.matrix(el))
g1 <- add_vertices(g, 2, attr = list(name = c("A","X" )))
g1 <- add_edges(g1, c(7,1, 7,4, 4,6, 2,1))
V(g1)[[]]
#+ 7/7 vertices, named, from f23a782:
# name
#1 A
#2 E
#3 C
#4 D
#5 B
#6 A
#7 X
plot(g1, vertex.label = make.unique(V(g1)$name))
First we check the shortest paths when the first A vertex is the closest one.
get.all.shortest.paths(g1, from = "X", to = "A")
#$res[[1]]
#+ 2/7 vertices, named, from f23a782:
#[1] X A
get.all.shortest.paths(g1, from = "X", to = V(g1)[name=="A"])
#$res[[1]]
#+ 2/7 vertices, named, from f23a782:
#[1] X A
#
#$res[[2]]
#+ 3/7 vertices, named, from f23a782:
#[1] X D A
Then we check the shortest paths when the second A (A.1) vertex is the closer one, and notice that the path to the first A is reported even though it is longer.
get.all.shortest.paths(g1, from = "D", "A")
#$res[[1]]
#+ 3/7 vertices, named, from f23a782:
#[1] D E A
get.all.shortest.paths(g1, from = "D", V(g1)[name=="A"])
#$res[[1]]
#+ 3/7 vertices, named, from f23a782:
#[1] D E A
#
#$res[[2]]
#+ 2/7 vertices, named, from f23a782:
#[1] D A
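If you need each duplicate to be addressable by name, one workaround is to make the vertex names unique first. A minimal sketch, reusing the g1 graph from above ("A.1" is the name make.unique assigns to the second A):
V(g1)$name <- make.unique(V(g1)$name)   # second "A" becomes "A.1"
get.all.shortest.paths(g1, from = "D", to = "A.1")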

Related

Plot surface in 3D where f(x,y,z) vanishes

I have a function f(x,y,z) = 1 + 2*x*y*z - x*x - y*y - z*z. My range of interest is [-1,1] in each variable.
Obviously I cannot draw a human-visible 4D plot, but I could plot the points in 3D where the function vanishes, or fill with red the region where it is positive.
Having forgotten almost everything about MATLAB and Octave, I searched for examples and tried:
[x y z] = meshgrid(-1:0.1:1, -1:0.1:1, -1:0.1:1);
coords = [x(:) y(:) z(:)];
V = 1.0 + 2*x.*y.*z -y.*y - z.*z - x.*x;
for p = 1 : 1 : 100
    if V(p) >= 0
        c = 'red';
        scatter3(x(:,p), y(:,p), z(:,p), 'c');
    end
end
It produces a plot, but it must be bogus, because there are no red dots and 1,1,1 is a solution, which doesn’t show up in the plot. Also the z axis is messed up, showing only negative values.
Please help.
Try this:
[x y z] = meshgrid( -1 : 0.1 : 1, -1 : 0.1 : 1, -1 : 0.1 : 1 );
V = 1.0 + 2 * x .* y .* z - y .* y - z .* z - x .* x;
scatter3( x(:), y(:), z(:), 100, V(:), 'filled' );

Parsing incomplete lists into data frames with two different problems

If you request web data through R, you often work with JSON or XML where fields are not named if there is no value for them. Sometimes there isn't even any data, and a given index comes out as an empty list. So I see this as two different problems. I'm proposing the solution I use to solve this as well, but I know there are better ones out there. For starters, I have a very messy, fake list that I created, which is missing field names (on purpose, per the XML/JSON spec) AND missing whole indexes (also on purpose).
(messy_list <- list(list(x = 2, y = 3),
                    list(),
                    list(y = 4),
                    list(x = 5)))
Now, here is how I break it down to what I would say is "solved".
library(plyr)
messy_list_no_empties <- lapply(messy_list, function(x) if(length(x) == 0) {list(NA, NA)} else x)
ldply(messy_list_no_empties, data.frame)[,1:2]
The end result is what I am looking for, but I would like to find a more elegant way to deal with this problem.
With purrr::map_df,
library(purrr)
messy_list <- list(list(x = 2, y = 3),
                   list(),
                   list(y = 4),
                   list(x = 5))
messy_list %>% map_df(~ list(x = .x$x %||% NA,
                             y = .x$y %||% NA))
#> # A tibble: 4 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1     2     3
#> 2    NA    NA
#> 3    NA     4
#> 4     5    NA
map_df iterates over the list like lapply and coerces the results to a data.frame. The function (in purrr's formula form) assembles a list with an x and a y element, looking for existing values if they're there. If they're not, the subsetting will return NULL, which %||% will replace with the value after it, NA.
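For reference, %||% is the null-default operator re-exported by purrr: x %||% y returns y when x is NULL and x otherwise. A quick illustration:
library(purrr)
NULL %||% NA   # NA, because the left-hand side is NULL
3 %||% NA      # 3, because the left-hand side is not NULL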
In mostly-equivalent base R,
as.data.frame(do.call(rbind,
                      lapply(messy_list, function(.x) {
                        list(x = ifelse(is.null(.x$x), NA, .x$x),
                             y = ifelse(is.null(.x$y), NA, .x$y))
                      })))
#>    x  y
#> 1  2  3
#> 2 NA NA
#> 3 NA  4
#> 4  5 NA
Note that the base approach won't handle mixed column types well. To deal with that, coerce everything to character (rbind probably will anyway, so just add stringsAsFactors = FALSE to as.data.frame) and then lapply type.convert over the columns, as sketched below.
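One way that character-then-convert route could look (a minimal sketch, reusing messy_list and the x/y fields from above):
rows <- lapply(messy_list, function(.x) {
  c(x = if (is.null(.x$x)) NA_character_ else as.character(.x$x),
    y = if (is.null(.x$y)) NA_character_ else as.character(.x$y))
})
res <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
res[] <- lapply(res, type.convert, as.is = TRUE)  # character columns back to numeric where possible
res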
Your method is already pretty compact, but if you're looking for other methods, one way might be to use rbindlist from data.table:
library(data.table)
new_list <- lapply(messy_list, function(x) if (identical(x, list())) {list(x = NA)} else {x})
rbindlist(new_list, fill = TRUE, use.names = TRUE)
#     x  y
# 1:  2  3
# 2: NA NA
# 3: NA  4
# 4:  5 NA
Note that the lapply is needed so that rbindlist doesn't drop the elements that are empty; see the check below.
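To see the difference, compare the call without that replacement step (the empty list() element should simply be dropped, leaving three rows):
rbindlist(messy_list, fill = TRUE, use.names = TRUE)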

How to write a JSON object from R dataframe with grouping

In general I feel there is a need to make JSON objects by folding multiple columns. As far as I know there is no direct way to do this; please point it out if there is.
I have data of this form:
A B C
1 a x
1 a y
1 c z
2 d p
2 f q
2 f r
How do I write JSON that looks like
{'query':'1', 'type':[{'name':'a', 'values':[{'value':'x'}, {'value':'y'}]}, {'name':'c', 'values':[{'value':'z'}]}]}
and similarly for 'query':'2'
I am looking to write them out in the Mongo import/export individual-JSON-lines format.
Any pointers are appreciated.
You've got a little "non-standard" thing going with two keys of "value" (I don't know if this is legal json), as you can see here:
(js <- jsonlite::fromJSON('{"query":"1", "type":[{"name":"a", "values":[{"value":"x"}, {"value":"y"}]}, {"name":"c", "values":[{"value":"z"}]}]}'))
## $query
## [1] "1"
##
## $type
## name values
## 1 a x, y
## 2 c z
... with a data.frame cell containing a list of data.frames:
js$type$values[[1]]
## value
## 1 x
## 2 y
class(js$type$values[[1]])
## [1] "data.frame"
If you can accept your "type" variable containing a vector instead of a named-list, then perhaps the following code will suffice:
jsonlite::toJSON(lapply(unique(dat[, 'A']), function(a1) {
  list(query = a1,
       type = lapply(unique(dat[dat$A == a1, 'B']), function(b2) {
         list(name = b2,
              values = dat[(dat$A == a1) & (dat$B == b2), 'C'])
       }))
}))
## [{"query":[1],"type":[{"name":["a"],"values":["x","y"]},{"name":["c"],"values":["z"]}]},{"query":[2],"type":[{"name":["d"],"values":["p"]},{"name":["f"],"values":["q","r"]}]}]

Haskell function about even and odd numbers

I'm new to Haskell; I started learning a couple of days ago and have a question about a function I'm trying to write.
I want to make a function that finds the factors of n (e.g. 375 has these factors: 1, 3, 5, 15, 25, 75, 125 and 375), then removes 1 and the number itself, and finally checks whether the number of odd factors in that list equals the number of even factors.
I thought of writing a function like this to calculate the first part:
factor n = [x | x <- [1..n], n `mod` x == 0]
But if I enter this at the prompt it says Not in scope: 'n'. The idea was to input a number like 375 so it would calculate the list. What am I doing wrong? I've seen functions entered at the prompt like this in books.
Then, to remove the elements I mentioned, I was thinking of applying tail and then init to the list. Do you think that's a good idea?
And finally I thought of making an if statement to verify the last part. For example, in Java, we'd write something like:
(x % 2 == 0)? even++ : odd++; // (I'm a beginner to Java as well)
and then, if even == odd, it would report that all conditions were verified (we had as many even factors as odd ones).
But in Haskell, as variables are immutable, how would I do the something++ thing?
Thanks for any help you can give :)
This small function does everything that you are trying to achieve:
f n = length evenFactors == length oddFactors
  where evenFactors = [x | x <- [2, 4..(n-1)], n `mod` x == 0]
        oddFactors  = [x | x <- [3, 5..(n-1)], n `mod` x == 0]
If the "command line" is ghci, then you need to
let factor n = [x | x <- [2..(n-1)], n `mod` x == 0]
In this particular case you don't need to range over [1..n] only to drop 1 and n afterwards; range from 2 to (n-1) instead.
Then you can simply use partition to split the list of divisors using a boolean predicate:
import Data.List
partition odd $ factor 10
In order to learn how to write a function like partition, study recursion.
For example:
partition p = foldr f ([], []) where
  f x ~(ys, ns) | p x = (x:ys, ns)
  f x ~(ys, ns)        = (ys, x:ns)
(Here we need to pattern-match the tuples lazily using "~", to ensure the pattern is not evaluated before the tuple on the right is constructed).
The counting can be done even more simply:
let y = factor 375
(length $ filter odd y) == (length y - (length $ filter odd y))
Create a file source.hs, then from ghci command line call :l source to load the functions defined in source.hs.
Following your own steps, this may be a solution to your problem:
-- computes the factors of n and takes the tail (strips the 1);
-- the filter then removes n itself from the list
factor n = filter (/= n) (tail [x | x <- [1..n], n `mod` x == 0])
-- checks if the number of odd and even factors is equal
oe n = let factors = factor n in
       length (filter odd factors) == length (filter even factors)
Calling oe 10 returns True, and oe 15 returns False.
(x % 2 == 0)? even++ : odd++;
Data.List provides a partition :: (a -> Bool) -> [a] -> ([a], [a]) function.
So we can separate odds from evens like this:
> let (odds,evens) = partition odd [1..]
> take 10 odds
[1,3,5,7,9,11,13,15,17,19]
> take 10 evens
[2,4,6,8,10,12,14,16,18,20]
Here is a minimal fix for your factor attempt as a list comprehension (put it in a source file, or prefix it with let at the ghci prompt):
factor nn = [x | x <- [1..nn], nn `mod` x == 0]

Subsetting a data frame in a function using another data frame as parameter

I would like to submit a data frame to a function and use it to subset another data frame.
This is the basic data frame:
foo <- data.frame(var1= c(1, 1, 1, 2, 2, 3), var2=c('A', 'A', 'B', 'B', 'C', 'C'))
I use the following function to find out the frequencies of var2 for specified values of var1.
foobar <- function(x, y, z) {
  a <- subset(x, (x$var1 == y))
  b <- subset(a, (a$var2 == z))
  n <- nrow(b)
  return(n)
}
Examples:
foobar(foo, 1, "A") # returns 2
foobar(foo, 1, "B") # returns 1
foobar(foo, 3, "C") # returns 1
This works. But now I want to submit a data frame of values to foobar. Instead of the above examples, I would like to submit df to foobar and get the same results as above (2, 1, 1)
df <- data.frame(var1=c(1, 1, 3), var2=c("A", "B", "C"))
When I change foobar to accept two arguments, like foobar(foo, df), and use y[, c(var1)] and y[, c(var2)] instead of the two scalar parameters, it still doesn't work. Is there a way to do this?
edit1: last paragraph clarified
edit2: var1 type corrected
Try this:
library(plyr)
match_df <- function(x, match) {
  vars <- names(match)
  # Create unique id for each row
  x_id <- id(match[vars])
  match_id <- id(x[vars])
  # Match identifiers and return subsetted data frame
  x[match(x_id, match_id, nomatch = 0), ]
}
match_df(foo, df)
#   var1 var2
# 1    1    A
# 3    1    B
# 5    2    C
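As an aside, recent versions of plyr also ship a match_df() function of their own (see ?plyr::match_df; its behaviour may differ slightly from the hand-rolled version above), so you may be able to call it directly:
library(plyr)
match_df(foo, df, on = c("var1", "var2"))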
Your function foobar is expecting three arguments, and you only supplied two arguments to it with foobar(foo, df). You can use apply to get what you want:
apply(df, 1, function(x) foobar(foo, x[1], x[2]))
And in use:
> apply(df, 1, function(x) foobar(foo, x[1], x[2]))
[1] 2 1 1
To respond to your edit:
I'm not entirely sure what y[, c(var1)] means, but here's an attempt at trying to figure out what you are trying to do.
What I think you were trying to do was: foobar(foo, y = df[, "var1"], z = df[, "var2"]).
First, note that the use of c() is not needed here; you can reference the columns you want by placing the name of the column in quotes OR reference the column by number (as I did above). Secondly, df[, "var1"] returns all of the rows for the column named var1, which has a length of three:
> length(df[, "var1"])
[1] 3
The function you defined is not set up to deal with vectors of length greater than 1. That is why we need to iterate through each row of your data frame, grab a single value, process it, and then go on to the next row of the data.frame. That is what the apply call does. It is equivalent to writing something along the lines of for (i in 1:nrow(df)) (see the sketch below), but is a more idiomatic way of handling such issues.
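For comparison, a hedged sketch of the equivalent explicit loop, assuming foo and df as defined in the question (numeric var1):
results <- integer(nrow(df))
for (i in seq_len(nrow(df))) {
  results[i] <- foobar(foo, df$var1[i], as.character(df$var2[i]))
}
results
# [1] 2 1 1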
Finally, is there a reason you generated var1 as a factor? It probably makes more sense to treat these as numeric in my opinion. Compare:
> str(df)
'data.frame': 3 obs. of 2 variables:
$ var1: Factor w/ 2 levels "1","3": 1 1 2
$ var2: Factor w/ 3 levels "A","B","C": 1 2 3
Versus
> df2 <- data.frame(var1=c(1,1,3), var2=c("A", "B", "C"))
> str(df2)
'data.frame': 3 obs. of 2 variables:
$ var1: num 1 1 3
$ var2: Factor w/ 3 levels "A","B","C": 1 2 3
In summary, apply is the function you are after here. You may want to spend some time thinking about whether your data should be numeric or a factor, but apply is still what you want.
foobar2 <- function(x, df) {
  # Helper that counts the rows of x matching a single (var1, var2) pair
  .dofun <- function(y, z) {
    a <- subset(x, x$var1 == y)
    b <- subset(a, a$var2 == z)
    n <- nrow(b)
    return(n)
  }
  # Apply the helper to each row of df, pairing var1 with var2
  ans <- mapply(.dofun, as.character(df$var1), as.character(df$var2))
  names(ans) <- NULL
  return(ans)
}
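A quick check against the question's example data (this should reproduce the counts asked for):
foobar2(foo, df)
# [1] 2 1 1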