permuting a path while keeping end vertices constant? - igraph

I have animal movement paths from GPS collars (the animal's location was recorded every 2 h). To study how the actual path compares to random paths, I need to generate alternate paths by randomly redistributing the original route segments between the actual beginning and end locations (the first and last vertices). I thought a good way to go would be to use the permute.vertices function in igraph. However, I cannot figure out how to keep the first and last vertices constant.
Here is a sample data set:
I'm starting out with a matrix of from-coordinates and to-coordinates that define the steps:
library(igraph)
library(igraph)
path <- matrix(c(-111.52, -111.49, -111.48, -111.47, -111.46,
                 35.34, 35.35, 35.33, 35.32, 35.31,
                 -111.49, -111.48, -111.47, -111.46, -111.5,
                 35.35, 35.33, 35.32, 35.31, 35.4),
               nrow=5, ncol=4)
path <- as.data.frame(path)
names(path) <- c("From.x", "From.y", "To.x", "To.y")
From <- 0:(nrow(path)-1)
To <- 1:nrow(path)
path <- cbind(From, To, path)
Turning the data.frame into a graph:
path <- graph.data.frame(path,directed=FALSE)
V(path)
Randomly permuting the vertices:
path2 <- permute.vertices(path, permutation=sample(vcount(path)))
V(path2)
How could I write the code to keep the first and last vertices always "0" and "5"? (or depending on the path, of course, a different number than "5")
I also then need to extract the coordinates from the permuted path and get them into a matrix. I tried it with the tkplot.getcoords command, but am not sure how to transform them back (I suppose tkplot transforms them somehow).
tkplot(path2)
tkplot.getcoords(1, norm = TRUE)
I'm using RStudio on Windows 8.

Keep the first and the last positions of the permutation vector fixed, and just permute the rest of the vertices; that keeps "0" and "5" in place:
perm <- c(1, sample(2:(vcount(path)-1)), vcount(path))
perm
# [1] 1 4 5 3 2 6
path2 <- permute.vertices(path, permutation=perm)
V(path2)
# Vertex sequence:
# [1] "0" "4" "3" "1" "2" "5"
For your other question, please explain better what you want, because I am not sure what kind of matrix you want to create.
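If the matrix you are after is just the original GPS columns of the permuted graph: note that tkplot.getcoords() returns the screen-layout coordinates of the interactive plot, not your data coordinates. A minimal sketch, assuming you want the From/To columns (which graph.data.frame stored as edge attributes) back as a matrix:
## the coordinate columns became edge attributes when the graph was built,
## so they can be pulled straight back out of the permuted graph
coords <- get.data.frame(path2, what = "edges")
coord_matrix <- as.matrix(coords[, c("From.x", "From.y", "To.x", "To.y")])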

Related

Networkx pyvis: change color of nodes

I have a dataframe with the columns source (person 1), target (person 2) and in_rewards_program (binary).
I created the network using the pyvis package:
from pyvis.network import Network

got_net = Network(notebook=True, height="750px", width="100%")
# got_net = Network(notebook=True, height="750px", width="100%", bgcolor="#222222", font_color="white")
# set the physics layout of the network
got_net.barnes_hut()
got_data = df
sources = got_data['source']
targets = got_data['target']
# create the graph using the pyvis network
edge_data = zip(sources, targets)
for e in edge_data:
    src = e[0]
    dst = e[1]
    # add nodes and edges to the graph
    got_net.add_node(src, src, title=src)
    got_net.add_node(dst, dst, title=dst)
    got_net.add_edge(src, dst)
neighbor_map = got_net.get_adj_list()
# add neighbor data to node hover data
for node in got_net.nodes:
    node["title"] += " Neighbors:<br>" + "<br>".join(neighbor_map[node["id"]])
    node["value"] = len(neighbor_map[node["id"]])  # this value attribute for the node affects node size
got_net.show("test.html")
I want to add functionality where the nodes are different colors based on the value in in_rewards_program: if the source node has 0, make the node red, and if it has 1, make it blue. I am not sure how to do this.
There is not much information to go on about your data, but based on your code I assume you can zip the "source" and "target" columns with the "in_rewards_program" column and add a conditional statement before adding the nodes, so that the node color changes based on the reward value. According to the pyvis documentation, you can pass a color parameter to the add_node method:
got_net = Network(notebook=True, height="750px", width="100%")
# set the physics layout of the network
got_net.barnes_hut()
sources = df['source']
targets = df['target']
rewards = df['in_rewards_program']
# create the graph using the pyvis network
edge_data = zip(sources, targets, rewards)
for src, dst, reward in edge_data:
    # add nodes and edges to the graph, coloring the source node by its reward flag
    if reward == 0:
        got_net.add_node(src, src, title=src, color='red')
    if reward == 1:
        got_net.add_node(src, src, title=src, color='blue')
    got_net.add_node(dst, dst, title=dst)  # target node keeps the default color
    got_net.add_edge(src, dst)

How to automatically crop an .OBJ 3D model to a bounding box?

In the now obsoleted Autodesk ReCap API it was possible to specify a "bounding box" around the scene to be generated from images.
In the resulting models, any vertices outside the bounding box were discarded, and any volumes that extended beyond the bounding box were truncated to have faces at the box boundaries.
I am now using Autodesk's Forge Reality Capture API, which replaced ReCap. Apparently, this new API does not allow the user to specify a bounding box.
So I am now searching for a program that takes an .OBJ file and a specified bounding box as input, and outputs a file of just the vertices and faces within this bounding box.
Given that there is no way to specify the bounding box in the Reality Capture API, I created this Python program. It is crude, in that it only discards faces that have vertices outside the bounding box. And the discarding is actually non-destructive: dropped faces are only commented out in the output OBJ file, which allows you to uncomment them later and use a different bounding box.
This may not be what you need if you truly want to remove all relevant v, vn, vt, vp and f lines that are outside the bounding box, because the OBJ file size remains mostly unchanged. But for my particular needs, keeping all the records and just using comments was preferable.
# obj3Dcrop.py
# (c) Scott L. McGregor, Dec 2019
# License: free for all non commercial uses. Contact author for any other uses.
# Changes and Enhancements must be shared with author, and be subject to same use terms
# TL;DR: This program uses a bounding box, and "crops" faces and vertices from a
# Wavefront .OBJ format file, created by Autodesk Forge Reality Capture API
# if one of the vertices in a face is not within the bounds of the box.
#
# METHOD
# 1) All lines other than "v" vertex definitions and "f" faces definitions
# are copied UNCHANGED from the input .OBJ file to an output .OBJ file.
# 2) All "v" vertex definition lines have their (x, y, z) positions tested to see if:
# minX < x < maxX and minY < y < maxY and minZ < z < maxZ ?
# If TRUE, we want to keep this vertex in the new OBJ, so we
# store its IMPLICIT ORDINAL position in the file in a dictionary called v_keepers.
# If FALSE, we will use its absence from the v_keepers dictionary as a way to identify
# faces that contain it and drop them. All "v" lines are also copied unchanged to the
# output file.
# 3) All "f" lines (face definitions) are inspected to verify that all 3 vertices in the face
# are in the v_keepers list. If they are, the f line is output unchanged.
# 4) Any "f" line that refers to a vertex that was cropped, is prefixed by "# CROPPED: "
# in the output file. Lines beginning # are treated as comments, and ignored in future
# processing.
# KNOWN LIMITATIONS: This program generates models in which the outside of bound faces
# have been removed. The vertices that were found outside the bounding box, are still in the
# OBJ file, but they are now disconnected and therefore ignored in later processing.
# The "f" lines for faces with vertices outside the bounding box are also still in the
# output file, but now commented out, so they don't process. Because this is non-destructive.
# we can easily change our bounding box later, uncomment cropped lines and reprocess.
#
# This might be an incomplete solution for some potential users. For such users
# a more complete program would delete unneeded v, vn, vt and vp lines when the v vertex
# that they refer to is dropped. But note that this requires renumbering all references to these
# vertex definitions in the "f" face definition lines. Such a more complete solution would also
# DISCARD all 'f' lines with any vertices that are out of bounds, instead of just commenting them out.
# Such a rewritten .OBJ file would be far more compact, but changing the bounding box would require
# saving the pre-cropped original.
# QUIRK: The OBJ file format defines v, vn, vt, vp and f elements by their
# IMPLICIT ordinal occurrence in the file, with each element type maintaining
# its OWN separate sequence. It then references those definitions EXPLICITLY in
# f face definitions. So deleting (or commenting out) element references requires
# appropriate rewriting of all the "f" lines, tracking all the new implicit positions.
# Such rewriting is not particularly hard to do, but it is one more place to make
# a mistake, and could make the algorithm more complicated to understand.
# This program doesn't bother, because all further processing of the output
# OBJ file ignores unreferenced v, vn, vt and vp elements.
#
# Saving all lines rather than deleting them to save space is a tradeoff involving considerations of
# Undo capability, compute cycles, compute space (unreferenced lines) and maintenance complexity choice.
# It is left to the motivated programmer to add this complexity if needed.
import sys
# bounding_box = sys.argv[1]  # should be the only string passed: (maxX, maxY, maxZ, minX, minY, minZ)
bounding_box = [10, 10, 10, -10, -10, 1]
maxX = bounding_box[0]
maxY = bounding_box[1]
maxZ = bounding_box[2]
minX = bounding_box[3]
minY = bounding_box[4]
minZ = bounding_box[5]
v_keepers = dict() # keeps track of which vertices are within the bounding box
kept_vertices = 0
discarded_vertices = 0
kept_faces = 0
discarded_faces = 0
discarded_lines = 0
kept_lines = 0
obj_file = open('sample.obj','r')
new_obj_file = open('cropped.obj','w')
original_v_number = 1  # the ordinal position of the next "v" vertex line to process
new_v_number = 1       # the new ordinal position this vertex would have if out-of-bounds vertices were discarded
for line in obj_file:
    line_elements = line.split()
    if not line_elements:  # blank line: copy it through unchanged
        new_obj_file.write(line)
        kept_lines = kept_lines + 1
        continue
    # Python doesn't have a SWITCH statement, but we only have three cases, so we'll just use cascading if stmts
    if line_elements[0] != "f":  # if it isn't an "f" type line (face definition)
        if line_elements[0] != "v":  # and it isn't a "v" type line either (vertex definition)
            # ************************ PROCESS ALL NON V AND NON F LINE TYPES ******************
            # then we just copy it unchanged from the input OBJ to the output OBJ
            new_obj_file.write(line)
            kept_lines = kept_lines + 1
        else:  # then line_elements[0] == "v":
            # ************************ PROCESS VERTICES ****************************************
            # a "v" line looks like this:
            # v x y z ...
            x = float(line_elements[1])
            y = float(line_elements[2])
            z = float(line_elements[3])
            if minX < x < maxX and minY < y < maxY and minZ < z < maxZ:
                # if the vertex is within the bounding box, we include it in the new OBJ file
                new_obj_file.write(line)
                v_keepers[str(original_v_number)] = str(new_v_number)
                new_v_number = new_v_number + 1
                kept_vertices = kept_vertices + 1
                kept_lines = kept_lines + 1
            else:  # if the vertex is NOT in the bounding box
                new_obj_file.write(line)
                discarded_vertices = discarded_vertices + 1
                discarded_lines = discarded_lines + 1
            original_v_number = original_v_number + 1
    else:  # line_elements[0] == "f":
        # ************************ PROCESS FACES ****************************************
        # an "f" line looks like this:
        # f v1/vt1/vn1 v2/vt2/vn2 v3/vt3/vn3 ...
        # We need to drop any face line where ANY of the 3 vertices v1, v2 or v3 is NOT in v_keepers.
        v = ["", "", ""]
        # Note that v1, v2 and v3 are the first "/"-separated elements within each line element.
        for i in range(0, 3):
            v[i] = line_elements[i + 1].split('/')[0]
        # now we can check whether EACH of these 3 vertices is in v_keepers
        if v[0] in v_keepers and v[1] in v_keepers and v[2] in v_keepers:
            new_obj_file.write(line)
            kept_lines = kept_lines + 1
            kept_faces = kept_faces + 1
        else:  # at least one of the vertices in this face has been cropped, so we comment out the face too
            discarded_lines = discarded_lines + 1
            discarded_faces = discarded_faces + 1
            new_obj_file.write("# CROPPED " + line)
# end of line processing loop
obj_file.close()
new_obj_file.close()
print ("kept vertices: ", kept_vertices ,"discarded vertices: ", discarded_vertices)
print ("kept faces: ", kept_faces, "discarded faces: ", discarded_faces)
print ("kept lines: ", kept_lines, "discarded lines: ", discarded_lines)
Unfortunately, (at least for now) there is no way to specify the bounding box in Reality Capture API.

Data left out when reading json online to R

I am trying to read online JSON data into R with the code below:
library('jsonlite')
address<-'https://data.cityofchicago.org/resource/qnmj-8ku6.json'
sample<-fromJSON(address)
The code did run and returned results in the right format of a table, but it only produced 1,000 observations, while the original city portal database has more than 200,000 observations. I am not sure what needs to be fixed to download the whole dataset. Please help.
You're using the wrong link to get the data. You can see the correct link by going to 'Export':
library(jsonlite)
address <- "https://data.cityofchicago.org/api/views/qnmj-8ku6/rows.json?accessType=DOWNLOAD"
sample <- fromJSON(address)
length(sample)
# [1] 2
length(sample[[2]])
# [1] 274228
You may want to get it as a .csv, though, to make it easier to work with straight away:
address <- "https://data.cityofchicago.org/api/views/qnmj-8ku6/rows.csv?accessType=DOWNLOAD"
sample_csv <- read.csv(address)
nrow(sample_csv)
# [1] 274228
str(sample_csv)
# 'data.frame': 274228 obs. of 22 variables:
# $ ID : int 10512552 10517063 10517120 10518590 10518648
# $ Case.Number : Factor w/ 274219 levels "HA107183","HA156050",..
# $ Date : Factor w/ 112977 levels "01/01/2014 01:00:00 AM",..
# $ Block : Factor w/ 27499 levels "0000X E 100TH PL",..
# $ IUCR : Factor w/ 331 levels "0110","0141",..
# $ Primary.Type : Factor w/ 33 levels "ARSON","ASSAULT",..
# $ Description : Factor w/ 310 levels "$500 AND UNDER",..
# ... etc
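Incidentally, 1,000 rows is Socrata's default page size for resource endpoints, which is why your original call stopped there. A hedged alternative, assuming the dataset's endpoint supports the standard SoDA query parameters: keep your original URL and raise the cap with $limit.
library(jsonlite)
## request more rows explicitly via the SoDA $limit parameter
address <- "https://data.cityofchicago.org/resource/qnmj-8ku6.json?$limit=300000"
sample <- fromJSON(address)
nrow(sample)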

Iteratively read a fixed number of lines into R

I have a JSON file I'm working with that contains multiple JSON objects in a single file. R is unable to read the file as a whole. But since each object occurs at regular intervals, I would like to iteratively read a fixed number of lines into R.
There are a number of SO questions on reading single lines into R, but I have been unable to extend those solutions to a fixed number of lines. For my problem I need to read 16 lines into R at a time (e.g. 1-16, 17-32, etc.).
I have tried using a loop but can't seem to get the syntax right:
## File
file <- "results.json"
## Create connection
con <- file(description=file, open="r")
## Loop over a file connection
for(i in 1:1000) {
  tmp <- scan(file=con, nlines=16, quiet=TRUE)
  data[i] <- fromJSON(tmp)
}
The file contains over 1000 objects of this form:
{
  "object": [
    [
      "a",
      0
    ],
    [
      "b",
      2
    ],
    [
      "c",
      2
    ]
  ]
}
With inspiration from @tomtom I was able to find a solution.
## File
file <- "results.json"
## Loop over a file
for(i in 1:1000) {
  tmp <- paste(scan(file=file, what="character", sep="\n", nlines=16,
                    skip=(i-1)*16, quiet=TRUE), collapse=" ")
  assign(x = paste("data", i, sep = "_"), value = fromJSON(tmp))
}
I couldn't create a connection as each time I tried the connection would close before the file had been completely read. So I got rid of that step.
I had to include the what="character" argument, as scan() expects numeric data by default.
I included sep="\n", paste() and collapse=" " to create a single string rather than the vector of characters that scan() creates by default.
Finally I just changed the final assignment operator to have a bit more control over the names of the output.
This might help:
EDITED to make it use a list and Reduce into one file
## Loop over a file connection
data <- list()
for(i in 1:1000) {
  tmp <- scan(file=con, nlines=16, skip=(i-1)*16, quiet=TRUE)
  data[[i]] <- fromJSON(tmp)
}
df <- Reduce(function(x, y) paste(x, y, collapse = " "), data)
You would have to make sure that you don't reach further than the end of the file though ;-)
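One way to avoid that is a connection-based variant: readLines() keeps its position on an open connection, so each call returns the next chunk and the loop stops cleanly at the end of the file. A minimal sketch under the same assumptions (16 lines per object):
library(jsonlite)
con <- file("results.json", open = "r")
data <- list()
i <- 1
repeat {
  chunk <- readLines(con, n = 16)  # next 16-line chunk; empty at end of file
  if (length(chunk) == 0) break
  data[[i]] <- fromJSON(paste(chunk, collapse = " "))
  i <- i + 1
}
close(con)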

Rpy2 - Select Results and Output to CSV File

I'm currently doing Cox proportional hazards modeling using Rpy2; I imagine my question will cover other functions and the results from calling them as well, though.
After I run the function, I have a variable which contains the results from the function, in the form of a vector. I have tried explicitly converting this to a DataFrame (resultsDataFrame = DataFrame(resultVector)). There are no errors returned when doing this. However, when I do resultsDataFrame.to_csvfile(filename) I get the following error:
Traceback (most recent call last):
File "<pyshell#171>", line 1, in <module>
modelFrame.to_csvfile('/Users/fortylashes/Documents/Matthews_Research/Cox_PH/ResultOutput_Exp1.csv')
File "/Library/Python/2.7/site-packages/rpy2/robjects/vectors.py", line 1031, in to_csvfile
'col.names': col_names, 'qmethod': qmethod, 'append': append})
RRuntimeError: Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) :
cannot coerce class ""coxph"" to a data.frame
Furthermore, when I simply do:
for result in resultVector:
print (result)
I get an extremely long list of results- including information on each entry in the dataset used in the model, for each variable (so 9,000 records x 9 variables = 81,000 unneeded results). The results I really need are at the bottom of this vector and look like this:
                   coef exp(coef) se(coef)      z       p
age_age6574   -0.057775     0.944  0.05469 -1.056 2.9e-01
age_age75plus -0.020795     0.979  0.04891 -0.425 6.7e-01
sex_female    -0.005304     0.995  0.03961 -0.134 8.9e-01
stage_late    -0.261609     0.770  0.04527 -5.779 7.5e-09
access        -0.000494     1.000  0.00069 -0.715 4.7e-01
Likelihood ratio test=36.6 on 5 df, p=7.31e-07  n=9752, number of events=2601
*NOTE: There were several more variables for which data was reported in the initial results (the 9,000 x 9 I was talking about), but which weren't actually used in the model.
I was wondering if there was a way to explicitly get this data, put it in one long ordered row, and then output it to a csv file?
::::UPDATE::::
When I call theModel.names I get a list of the various measures which can be called by numerical index:
[1] "coefficients" "var" "loglik"
[4] "score" "iter" "linear.predictors"
[7] "residuals" "means" "concordance"
[10] "method" "n" "nevent"
[13] "terms" "assign" "wald.test"
[16] "y" "formula" "call"
From this I can get the coefficients, which can then be exponentiated. I have not found, however, the p-value, the z score or the likelihood test ratio, which I will need.
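On the R side, summary() on a coxph fit exposes exactly those pieces, and rpy2 can call it the same way (robjects.r['summary'](theModel)). A hedged sketch in R, assuming theModel holds the coxph result:
smry <- summary(theModel)
smry$coefficients  # matrix with columns coef, exp(coef), se(coef), z, Pr(>|z|)
smry$logtest       # likelihood ratio test: test statistic, df, p-value
## flatten everything into one long ordered row and write it to a CSV file
row <- c(t(smry$coefficients), smry$logtest)
write.table(t(row), "cox_results.csv", sep = ",", row.names = FALSE, col.names = FALSE)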