Not saving html interactive file with R - html

I am trying to create a circos plot using the BioCircos R package. BioCircos can save plots as interactive .html files. However, when I run my script with Rscript, the saved .html file is empty. To save the file I used saveWidget() from the htmlwidgets package. Is something wrong with how I am using saveWidget()? The code I used follows:
#!/usr/bin/Rscript
######R script for BioCircos test
library(htmlwidgets)
library(BioCircos)
genomes <- list("chra1" = 217471166, "chra2" = 181034961, "chra3" = 153873357, "chra4" = 153961319, "chra5" = 164033575,
"chra6" = 154486312, "chra7" = 133565930, "chra8" = 147241510, "chra9" = 91218944, "chra10" = 52432566, "chrb1" = 843366180, "chrb2" = 842558404, "chrb3" = 707956555, "chrb4" = 635713434, "chrb5" = 567300182,
"chrb6" = 439630435, "chrb7" = 236595445, "chrb8" = 231667822, "chrb9" = 230778867, "chrb10" = 151572763, "chrb11" = 103205957) # custom genome
links_chromosomes_01 <- c("chra1", "chra2", "chra3", "chra4", "chra4", "chra5", "chra6", "chra7", "chra7", "chra8", "chra8", "chra9", "chra10") # Chromosomes on which the links should start
links_chromosomes_02 <- c("chrb2", "chrb3", "chrb1", "chrb9", "chrb10", "chrb4", "chrb5", "chrb6", "chrb1", "chrb8", "chrb3", "chrb7", "chrb6") # Chromosomes on which the links should end
links_pos_01 <- c(115060347, 102611974, 14761160, 128700431, 128681496, 42116205, 58890582, 40356090,
146935315, 136481944, 157464876, 39323393, 84752508, 136164354,
99573657, 102580613,
111139346, 120764772, 90748238, 122164776,
44933176, 18823342,
48771409, 128288229, 150613881, 18509106, 123913217, 51237349,
34237851, 53357604, 78270031,
25306417, 25320614,
94266153,
41447919, 28810876, 2802465,
45583472,
81968637, 27858237, 17263637,
30569409) ### links chra chromosomes
links_pos_02 <- c(410543481, 463189512, 825903588, 353914638, 354135472, 717707494, 643107332, 724899652,
583713545, 558756961, 642015290, 154999098, 340216235, 557731577,
643350872, 655077847,
85356666, 157889318, 226411560, 161566470,
109857786, 25338955,
473876792, 124495704, 46258030, 572314729, 141584107, 426419779,
531245660, 220131772, 353941099,
62422773, 62387030,
116923325,
76544045, 33452274, 7942164,
642047816,
215981114, 39278129, 23302654,
418922633) ### links chrb chromosomes
links_labels <- c("aldh1a3", "amh", "cyp26b1", "dmrt1", "dmrt3", "fgf20", "hhip", "srd5a3",
"amhr2", "dhh", "fgf9", "nr0b1", "rspo1", "wnt1",
"aldh1a2", "cyp19a1",
"lhx9", "pdgfb", "ptch2", "sox10",
"cbln1", "wt1",
"esr1", "foxl2", "gata4", "lrpprc", "serpine2", "srd5a2",
"asns", "ctnnb1", "srd5a1",
"cyp26a1", "cyp26c1",
"wnt4",
"ar", "nr5a1", "ptgds",
"fgf16",
"cxcr4", "pdgfa", "sox8",
"sox9")
tracklist <- BioCircosLinkTrack('myLinkTrack', links_chromosomes_01, links_pos_01,
links_pos_01, links_chromosomes_02, links_pos_02, links_pos_02,
maxRadius = 0.55, labels = links_labels)
#plotting results
plot_chra_chrb <- BioCircos(tracklist, genome = chra_chrb_genomes, genomeFillColor = "RdBu", chrPad = 0.02, displayGenomeBorder = FALSE, genomeLabelTextSize = "10pt", genomeTicksScale = 4e+3,
elementId = "chra_chrb_comp_plot_test.html")
saveWidget(plot_chra_chrb, "chra_chrb_comp_plot_test.html", selfcontained = F, libdir = "lib")
The command line to run this script:
Rscript /path_to/Circle_plot_test.r
I tried running this script in RStudio (without the saveWidget() call), but it took too long to run on my personal computer and the result was never displayed. This could be a memory issue, because when I removed some of the data the script generated the plot in RStudio without trouble and I was able to save it. Is there another way to save interactive .html files in R, or am I doing something wrong with the htmlwidgets package in my script?
Thanks all in advance for any help and comments.

When you said it took too long to run, that was a sign that something was wrong! You weren't getting anything when you used saveWidget because nothing was being returned from BioCircos.
I found two problems. The first one results in a blank output: you can't use a '.' in the element ID, because this ID is used in the generated HTML.
The huge delays came from the value you set for genomeTicksScale. That value controls a tick mark attribute. I'm not sure why you set it to 4e+3. However, when I comment out that line, the plot renders immediately, and I have no issues saving the widget either.
One other thing: you passed chra_chrb_genomes to the genome parameter of BioCircos, but that object is never defined in your script. I assumed you meant the genomes object from your question, since it was the only unused object.
The only things I changed were in the BioCircos function:
(plot_chra_chrb <- BioCircos(tracklist, genome = genomes, #chra_chrb_genomes,
genomeFillColor = "RdBu",
chrPad = 0.02,
displayGenomeBorder = FALSE,
genomeLabelTextSize = "10pt",
# genomeTicksScale = 4e+3, # problematic
elementId = "chra_chrb_comp_plot_test" # no periods
))

how to import variables from a json file to attributes in BUILD.bazel?

I would like to import variables defined in a JSON file (my_info.json) as attributes for Bazel rules.
I tried this (https://docs.bazel.build/versions/5.3.1/skylark/tutorial-sharing-variables.html) and it works, but I do not want to use a .bzl file; I want to import the variables directly into attributes in BUILD.bazel.
I want to use the variables imported from my_info.json as attributes in other BUILD.bazel files.
projects/python_web/BUILD.bazel
load("//projects/tools/parser:config.bzl", "MY_REPO","MY_IMAGE")
container_push(
name = "publish",
format = "Docker",
registry = "registry.hub.docker.com",
repository = MY_REPO,
image = MY_IMAGE,
tag = "1",
)
When I asked a similar question in the Bazel Slack, I was told that it is not possible to import variables directly into Bazel, and that the JSON variables need to be parsed and written into a .bzl file.
I also tried this code, but nothing is written to the config.bzl file.
my_info.json
{
"MYREPO" : "registry.hub.docker.com",
"MYIMAGE" : "michael/monorepo-python-web"
}
WORKSPACE.bazel
load("//projects/tools/parser:jsonparser.bzl", "load_my_json")
load_my_json(
name = "myjson"
)
projects/tools/parser/jsonparser.bzl
def _load_my_json_impl(repository_ctx):
    json_data = json.decode(repository_ctx.read(repository_ctx.path(Label(":my_info.json"))))
    config_lines = ["%s = %s" % (key, repr(val)) for key, val in json_data.items()]
    repository_ctx.file("config.bzl", "\n".join(config_lines))

load_my_json = repository_rule(
    implementation = _load_my_json_impl,
    attrs = {},
)
projects/tools/parser/BUILD.bazel
load("@aspect_bazel_lib//lib:yq.bzl", "yq")
load(":config.bzl", "MYREPO", "MY_IMAGE")
yq(
name = "convert",
srcs = ["my_info2.json"],
args = ["-P"],
outs = ["bar.yaml"],
)
Executing:
% bazel build projects/tools/parser:convert
ERROR: Traceback (most recent call last):
File "/Users/michael.taquia/Documents/Personal/Projects/bazel/bazel-projects/multi-language-bazel-monorepo/projects/tools/parser/BUILD.bazel", line 2, column 22, in <toplevel>
load(":config.bzl", "MYREPO", "MY_IMAGE")
Error: file ':config.bzl' does not contain symbol 'MYREPO'
While troubleshooting, I can see that the execution loads jsonparser.bzl but never enters the _load_my_json_impl function (based on print statements) and does not write anything to config.bzl.
Notes: Tested on macOS 12.6 (21G115), Darwin Kernel Version 21.6.0.
Is there a better way to do this? A code snippet would be very useful.

Running timeseries graphing function in Rmd producing cluttered x-axis labels (not present in test code)

I have a folder of xx .csv timeseries that I want to graph and knit into a clean HTML document. I have ggplot code that produces the plot I want from a single timeseries.csv. However, when I put the bones of that ggplot code in a function and call it inside a for loop to run each of the timeseries.csv files through it, I get some plots with quite different formatting.
Plot generated with my test ggplot code:
Plot generated with function and for loop:
Changes I'm trying to make to the ugly Rmd plot:
Nicely space the x-axis tick marks to whole mins (i.e. "11:14:00", "11:15:00")
Connect the data points (solved with subbing geom_line() with geom_path())
Example Rmd code below. Please note that the graphs produced by this example still have nice formatting; I'm not sure how to reproduce the problem short of posting a 500-row data frame. I also don't know how to post my Rmd code without SO applying the formatting commands in this post, so I put quotes (") around my header formatting and at the end of the code to disable them.
Edits and Updates
I am getting a persistent error: geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
As suggested by the commenters, I tried removing plot() and calling createChlDiffPlot() directly, and also replacing plot() with print(). Both produce the same ugly plots as before.
Replaced geom_line() with geom_path(). The points are now connected! The x-axis cluttering is still there.
The Time variable is read as hms num.
Many thanks for any help on this!
```
---
title: "Chl Filtration"
output:
flexdashboard::flex_dashboard:
theme: yeti
orientation: rows
editor_options:
chunk_output_type: console
---
```{r setup}
library(flexdashboard)
library(dplyr)
library(ggplot2)
library(hms)
library(ggthemes)
library(readr)
library(data.table)
#### Example Data
df1 <- data.frame(Time = as_hms(c("11:22:33","11:22:34","11:22:35","11:22:38","11:23:00","11:23:01","11:23:02")),
Chl_ug_L_Up = c(0.2,0.1,0.25,-0.2,-0.3,-0.15,0.1),
Chl_ug_L_Down = c(0.5,0.4,0.3,0.2,0.1,0,-0.1))
df2 <- data.frame(Time = as_hms(c("08:02:33","08:02:34","08:02:35","08:02:40","08:02:42","08:02:43","08:02:49")),
Chl_ug_L_Up = c(-0.2,-0.1,-0.25,0.2,0.3,0.15,-0.1),
Chl_ug_L_Down = c(-0.1,0,0.1,0.2,0.3,0.4,0.1))
data_directory = "./" # data folder in R project folder in the real deal
output_directory = "./" # output graph directory in R project folder
write_csv(df1, file.path(data_directory, "SO_example_df1.csv"))
write_csv(df2, file.path(data_directory, "SO_example_df2.csv"))
#### Function to create graphs
createChlDiffPlot = function(aTimeSeriesFile, aFileName, aGraphOutputDirectory, aType)
{
aFile_Mod = aTimeSeriesFile %<>%
select(Time, Chl_ug_L_Up, Chl_ug_L_Down) %>%
mutate(Chl_diff = Chl_ug_L_Up - Chl_ug_L_Down)
one_plot = ggplot(data = aFile_Mod, aes(x = Time, y = Chl_diff)) + # tried adding 'group = 1' in aes to connect points
geom_path(size = 1, color = "green") +
geom_point(color = "green") +
theme_gdocs() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.title = element_blank()) +
labs(x = "", y = "Chl Difference", title = paste0(aFileName, " - ", "Filtration"))
one_graph_name = paste0(gsub(".csv", "", aFileName), "_", aType, ".pdf")
ggsave(one_graph_name, one_plot, dpi = 600, width = 7, height = 5, units = "in", device = "pdf", aGraphOutputDirectory)
return(one_plot)
}
"``` ### remove the quotes when running example
Plots - After Velocity Adjustment
=====================================" ### remove quotes when running example
```{r, fig.width=13.5, fig.height=5}
all_files_Filtration = list.files(data_directory, pattern = ".csv")
# Loop to plot function
for(file in 1 : length(all_files_Filtration))
{
file_name = all_files_Filtration[file]
one_file = fread(file.path(data_directory, file_name))
# plot the time series agains
plot(createChlDiffPlot(one_file, file_name, output_directory, "Velocity_Paired"))
}
"``` #remove quotes when running example
```
I finally figured it out.
1) Replacing geom_line() with geom_path() connected the data points when rendered in Rmd.
2) df1$Time was stored as a difftime object. When I looked at the data frame in the global environment, it showed Time : hms num 11:11:09 .... This made me think my format was OK, but when I ran class(df1$Time) I got [1] "hms" "difftime". A quick Google search told me that difftime objects are not quite the same as hms, and my original time had been generated by subtracting times. I added a conversion to my mutate call:
select(Time, Chl_ug_L_Up, Chl_ug_L_Down) %>%
mutate(Chl_diff = Chl_ug_L_Up - Chl_ug_L_Down,
Time = as_hms(Time)) # convert difftime object to hms
I think ggplot has some auto-formatting for hms variables, which is why the difftime variable was producing ugly, crowded x-axes.

source layer updating along with output layer

The source layer is layer and the output layer is output. The script is updating the source layer with the new fields and their tallies, along with the output layer. I've tried deleting the fields from layer at the end; setting fc as a different output, copying fc to output at the end and then deleting the fields from fc/layer after that; and copying the source layer right off the bat (conceptually this makes the most sense to me... maybe I did it wrong). No dice.
Any ideas that would preserve the source layer as is but get this script to run and tally on the output? Thanks for any input!!
#workspace
arcpy.env.workspace = wspace = arcpy.GetParameterAsText(0)
#buildings
layer = arcpy.GetParameterAsText(1)
#trees
trees = arcpy.GetParameterAsText(2)
#buffer building to search
buffer = arcpy.GetParameterAsText(3)
#tree field interested in - tree condition, tree location, or tree pit
tf = arcpy.GetParameterAsText(4)
#output file
output = arcpy.GetParameterAsText(5)
#make feature layers to reference
treelayer = arcpy.MakeFeatureLayer_management(trees, trees + ".shp")
fc = arcpy.MakeFeatureLayer_management(layer, output)
pit = ["Sidewalk", "Continuous", "Lawn"]
if tf == "Tree Pit":
    for a in pit:
        arcpy.AddField_management(fc, a, "SHORT")
    with arcpy.da.SearchCursor(fc, ["OBJECTID"]) as fcrows:
        for a in fcrows:
            arcpy.SelectLayerByAttribute_management(fc, "NEW_SELECTION", "OBJECTID={}".format(a[0]))
            arcpy.SelectLayerByLocation_management(treelayer, "WITHIN_A_DISTANCE", fc, buffer, "NEW_SELECTION")
            tlrows = arcpy.da.SearchCursor(treelayer, "SITE")
            list1 = []
            for tlrow in tlrows:
                list1.append(int(tlrow[0]))
            fcrows1 = arcpy.da.UpdateCursor(fc, pit)
            for fcrow1 in fcrows1:
                if list1.count(1) > 0:
                    fcrow1[0] = list1.count(1)
                else:
                    fcrow1[0] = 0
                if list1.count(2) > 0:
                    fcrow1[1] = list1.count(2)
                else:
                    fcrow1[1] = 0
                if list1.count(3) > 0:
                    fcrow1[2] = list1.count(3)
                else:
                    fcrow1[2] = 0
                fcrows1.updateRow(fcrow1)
You don't want a variable equal to the function -- just make the feature layer.
arcpy.MakeFeatureLayer_management(layer, output)
Then, subsequent steps should affect only the output layer and ignore the source layer, e.g.:
for a in pit:
    arcpy.AddField_management(output, a, "SHORT")
with arcpy.da.SearchCursor(output, ["OBJECTID"]) as fcrows:
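A side note on the tally itself, offered as a rough sketch rather than as part of the fix above: the three if/else pairs in the question could probably be collapsed with collections.Counter, which returns 0 for missing keys. The helper name tally_sites is hypothetical, and the mapping of SITE codes 1, 2 and 3 to Sidewalk, Continuous and Lawn is assumed from the order used in the question's code. The snippet is plain Python and does not touch arcpy.

```python
from collections import Counter

# Field names from the question, in the order the script writes them.
PIT_FIELDS = ["Sidewalk", "Continuous", "Lawn"]


def tally_sites(site_codes):
    """Return counts of SITE codes 1, 2 and 3, in PIT_FIELDS order.

    Hypothetical helper: a Counter yields 0 for codes that never occur,
    so the separate if/else branches are not needed.
    """
    counts = Counter(int(code) for code in site_codes)
    return [counts[code] for code in (1, 2, 3)]


# Made-up SITE values for the trees selected around one building.
row_values = tally_sites([1, 3, 3, 2, 3])
print(dict(zip(PIT_FIELDS, row_values)))
```

Inside the update loop, the result could then be assigned in one step, for example fcrow1[0], fcrow1[1], fcrow1[2] = tally_sites(list1), before calling updateRow.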

Iteratively read a fixed number of lines into R

I have a JSON file I'm working with that contains multiple JSON objects in a single file. R is unable to read the file as a whole. But since each object occurs at regular intervals, I would like to iteratively read a fixed number of lines into R.
There are a number of SO questions on reading single lines into R, but I have been unable to extend these solutions to a fixed number of lines. For my problem I need to read 16 lines into R at a time (e.g. 1-16, 17-32, etc.).
I have tried using a loop but can't seem to get the syntax right:
## File
file <- "results.json"
## Create connection
con <- file(description=file, open="r")
## Loop over a file connection
for(i in 1:1000) {
tmp <- scan(file=con, nlines=16, quiet=TRUE)
data[i] <- fromJSON(tmp)
}
The file contains over 1000 objects of this form:
{
"object": [
[
"a",
0
],
[
"b",
2
],
[
"c",
2
]
]
}
With inspiration from @tomtom I was able to find a solution.
## File
file <- "results.json"
## Loop over a file
for(i in 1:1000) {
tmp <- paste(scan(file=file, what="character", sep="\n", nlines=16, skip=(i-1)*16, quiet=TRUE),collapse=" ")
assign(x = paste("data", i, sep = "_"), value = fromJSON(tmp))
}
I couldn't create a connection as each time I tried the connection would close before the file had been completely read. So I got rid of that step.
I had to include the what="character" variable as scan() seems to expect a number by default.
I included sep="\n", paste() and collapse=" " to create a single string rather than the vector of characters that scan() creates by default.
Finally I just changed the final assignment operator to have a bit more control over the names of the output.
This might help:
EDITED to make it use a list and Reduce into one file
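## Loop over a file connection (con is the open connection from the question)
data <- list()
for(i in 1:1000) {
## an open connection keeps its position, so no skip is needed here
tmp <- scan(file=con, what="character", sep="\n", nlines=16, quiet=TRUE)
data[[i]] <- fromJSON(paste(tmp, collapse=" "))
}
df <- Reduce(function(x, y) {paste(x, y, collapse = " ")}, data)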
You would have to make sure that you don't reach further than the end of the file though ;-)

Using the LDAvis package in R to create a gist file of the result

I'm using LDAvis for topic modeling and trying to use the as.gist option to create a gist. When serVis executes there is a timeout in curl::curl_fetch_memory after about 10 seconds. If I immediately execute serVis again I get a different error 'Problems parsing JSON' and from then on whenever serVis is run that same error recurs.
If I start all over with a fresh workspace the same behavior occurs. The first time serVis is run, curl::curl_fetch_memory times out after about 10 seconds. Subsequent executions return 'Problems parsing JSON'.
If I don't use the as.gist option it works fine, but of course doesn't create a gist.
Very rarely, it works and a gist is created. If I change parameters to reduce the size of the JSON object it usually works, which makes me think it may be related to size.
I have explored the various RCurlOptions timeout settings. Currently, they are set as
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem",
package = "RCurl"),
connecttimeout = 300, timeout = 3000,
followlocation = TRUE, dns.cache.timeout = 300))
Below is a console listing with debug set on curl::curl_fetch_memory.
> json <- createJSON(phi = cases$phi,
+ theta = cases$theta,
+ doc.len .... [TRUNCATED]
> serVis(json, open.browser = TRUE, as.gist = TRUE, description = 'APM Community')
debugging in: curl::curl_fetch_memory(url, handle = handle)
debug: {
output <- .Call(R_curl_fetch_memory, url, handle)
res <- handle_response_data(handle)
res$content <- output
res
}
Browse[2]> output <- .Call(R_curl_fetch_memory, url, handle)
Error: Timeout was reached
Browse[2]> output <- .Call(R_curl_fetch_memory, url, handle)
Browse[2]> rawToChar(output)
[1] "{\"message\":\"Problems parsing JSON\",\"documentation_url\":\"https://developer.github.com/v3\"}"
Browse[2]>
.
.
exiting from: curl::curl_fetch_memory(url, handle = handle)
Error: Problems parsing JSON
Any hints on how to debug this problem?