Graphviz - Squished on page and rank not respected - configuration

[enter image description here][1]I'm just learning GraphViz and I'm struggling with some Basic issues. I have two subgraphs that should be able to show basic nodes and links between, see code:
digraph G {
graph [rankdir="TB", splines=ortho, nodesep=2, ranksep=2.5, fontsize=18, fontname="Verdana"];
node [shape=record, height=3, width=3];
rankdir = TB;
subgraph cluster_0 {
node [style=filled, color=lightgrey, fontsize=18, fontname="Verdana"];
label = "DMZ 1";
PWI -> PWE -> PWI
}
subgraph cluster_1 {
node [style=filled, color=lightgrey, fontsize=18, fontname="Verdana"];
label = "Trusted Zone EBC" ;
PR1 -> PWI [headlabel="test text", labeldistance=2.5]
PWI -> PR1
PR1 -> PWE
PWE -> PR1
PWI -> PWE -> PWI
PFI -> PWE
PFI -> PWI
PWE -> PFI
PWI -> PFI
PBW -> PWI
PBW -> PWE
PWI -> PBW
PWE -> PBW
PSM -> PWE
PSM -> PWI
PWE -> PSM
PWI -> PSM
{rank = same; PR1; PFI; PSM; PBW}
{rank = same; PWI}
{rank = same; PWE}
}
}
Which leads to the below.. Am I hitting the edges of [enter image description here][2]a "canvas" if so, can I increase it? And why are the ranks not being respected where the top rank is being shown on the same as the lowest rank, as well as the middle rank being above them?
Thanks!
https://i.stack.imgur.com/meU6p.png

No, you're not hitting the edges of the "canvas", because the default is to expand it to fit your graph.
The problem with ranks in your graph is that you have edges going in both directions between the nodes. Since Graphviz considers the left side of an edge to have higher rank than the right side, all those edges confuses Graphviz.
If you instead always create edges from a higher ranked (according to what you want) node to a lower ranked node and use the dir attribute to specify that the edge should point in the backwards direction like so: n1 -> n2 [dir=back], Graphviz will get a consistent perception of the node ranks and respect your intended ranks.
Perhaps this is more in line with what you want:
<!DOCTYPE html>
<meta charset="utf-8">
<body>
<script src="https://d3js.org/d3.v5.js"></script>
<script src="https://unpkg.com/viz.js#1.8.2/viz.js" type="javascript/worker"></script>
<script src="https://unpkg.com/d3-graphviz#2.4.2/build/d3-graphviz.js"></script>
<div id="graph" style="text-align: center;"></div>
<script>
var dotSrc = `
digraph G {
graph [rankdir="TB", splines=ortho, nodesep=2, ranksep=2.5, fontsize=18, fontname="Verdana"];
node [shape=record, height=3, width=3];
rankdir = TB;
subgraph cluster_0 {
node [style=filled, color=lightgrey, fontsize=18, fontname="Verdana"];
label = "DMZ 1";
PWE -> PWI
PWE -> PWI [dir=back]
}
subgraph cluster_1 {
node [style=filled, color=lightgrey, fontsize=18, fontname="Verdana"];
label = "Trusted Zone EBC" ;
PWI -> PR1 [headlabel="test text", labeldistance=2.5 dir=back]
PWI -> PR1
PWE -> PR1 [dir=back]
PWE -> PR1
PWE -> PWI [dir=back]
PWE -> PWI
PWE -> PFI [dir=back]
PFI -> PWI
PWE -> PFI
PFI -> PWI [dir=back]
PWI -> PBW [dir=back]
PWE -> PBW [dir=back]
PWI -> PBW
PWE -> PBW
PWE -> PSM [dir=back]
PWI -> PSM [dir=back]
PWE -> PSM
PWI -> PSM
{rank = same; PR1; PFI; PSM; PBW}
{rank = same; PWI}
{rank = same; PWE}
}
}
`;
d3.select("#graph").graphviz()
.renderDot(dotSrc);
</script>

Related

web scraping in r with SelectorGadget

I was running this simple code below to scrape the employee number from this Fortune 500 page. I used the Chrome's extention: SelectorGadget to identify that the number I want matches with ".info__row--7f9lE:nth-child(13) .info__value--2AHH7"
library(rvest)
library(dplyr)
#download google chrome extention: SelectorGadget
link = "https://fortune.com/company/walmart/"
page = read_html(link)
Employees = page %>% html_nodes(".info__row--7f9lE:nth-child(13) .info__value--2AHH7") %>% html_text()
Employees
However, it returned "character(0)". Does anyone know what is the cause? I feel it must be a simple mistake somewhere. Thanks in advance!
Update
Here is the code I modified based on Jon's comments.
a <- c("https://fortune.com/company/walmart/", "https://fortune.com/company/amazon-com/"
,"https://fortune.com/company/apple/"
,"https://fortune.com/company/cvs-health/"
,"https://fortune.com/company/unitedhealth-group/"
, "https://fortune.com/company/berkshire-hathaway/"
, "https://fortune.com/company/mckesson/"
,"https://fortune.com/company/amerisourcebergen/"
, "https://fortune.com/company/alphabet/"
, "https://fortune.com/company/exxon-mobil/"
,"https://fortune.com/company/att/"
,"https://fortune.com/company/costco/"
,"https://fortune.com/company/cigna/"
, "https://fortune.com/company/cardinal-health/"
,"https://fortune.com/company/microsoft/"
,"https://fortune.com/company/walgreens-boots-alliance/"
,"https://fortune.com/company/kroger/"
, "https://fortune.com/company/home-depot/"
,"https://fortune.com/company/jpmorgan-chase/"
,"https://fortune.com/company/verizon/"
,"https://fortune.com/company/ford-motor/"
, "https://fortune.com/company/general-motors/"
,"https://fortune.com/company/anthem/"
, "https://fortune.com/company/centene/"
,"https://fortune.com/company/fannie-mae/"
, "https://fortune.com/company/comcast/"
, "https://fortune.com/company/chevron/"
,"https://fortune.com/company/dell-technologies/"
,"https://fortune.com/company/bank-of-america-corp/"
,"https://fortune.com/company/target/")
find_by_name <- function(list_data, name, elem = NULL) {
idx <- which(sapply(list_data, \(x) x$name) == name, arr.ind = TRUE)
stopifnot(length(idx) > 0)
if (length(idx) > 1) { idx <- idx[1] }
dat <- list_data[[idx]]
if (is.null(elem)) dat else dat[[elem]]
}
numEmp <- numeric()
for (i in 1:length(a)){
json_data <- read_html(a[i]) |>
html_element("script#preload") |>
html_text() |>
sub("\\s*window\\.__PRELOADED_STATE__ = ", "", x = _, perl = TRUE) |>
sub(";\\s*$", "", x = _, perl = TRUE) |>
fromJSON(simplifyVector = FALSE)
temp<-gsub(".*https://fortune.com", "", a[i])
page_data <- json_data$components$page[[temp]]
info_data <- page_data |>
find_by_name("body", "children") |>
find_by_name("company-about-wrapper", "children") |>
find_by_name("company-information", "config")
numEmp[i] <- info_data$employees # Results will be fed into this numEmp variable.
}
numEmp
An error says
Error in find_by_name(page_data, "body", "children") :
length(idx) > 0 is not TRUE
Should I somehow change the code stopifnot(length(idx) > 0)?
When I do document.querySelectorAll(".info__row--7f9lE:nth-child(13) .info__value--2AHH7") I see you want to scrape the # of employees. Maurits is right, looks like the data is downloaded as (inline) JSON and then rendered later. You can use Selenium to save the rendered page then apply your CSS selector there. Or you can extract the inline JSON and scrape it from there.
After some manual work, you can do the 2nd option like below in R 4.2.x
library(rvest)
library(jsonlite)
# R 4.1.x
sub2 <- function(x, pattern, replacement) sub(pattern, replacement, x = x, perl = TRUE)
url <- "https://fortune.com/company/walmart/"
json_data <- read_html(url) |>
html_element("script#preload") |>
html_text() |>
## sub("\\s*window\\.__PRELOADED_STATE__ = ", "", x = _, perl = TRUE) |> # R 4.2.x
sub2("\\s*window\\.__PRELOADED_STATE__ = ", "") |> # R 4.1.x
## sub(";\\s*$", "", x = _, perl = TRUE) |> # R 4.2.x
sub2(";\\s*$", "") |> # R 4.1.x
fromJSON(simplifyVector = FALSE)
page_data <- json_data$components$page[["/company/walmart/"]]
find_by_name <- function(list_data, name, elem = NULL) {
idx <- which(sapply(list_data, \(x) x$name) == name, arr.ind = TRUE)
stopifnot(length(idx) > 0)
if (length(idx) > 1) { idx <- idx[1] }
dat <- list_data[[idx]]
if (is.null(elem)) dat else dat[[elem]]
}
info_data <- page_data |>
find_by_name("body", "children") |>
find_by_name("company-about-wrapper", "children") |>
find_by_name("company-information", "config")
info_data$employees
#> [1] "2300000"
# Extra code to scrape company-data-table segments
library(purrr)
data_tables <- page_data |>
find_by_name("body", "children") |>
find_by_name("company-about-wrapper", "children") |>
find_by_name("company-table-wrapper", "children")
rows <- data_tables |>
lapply(\(x) c(x$config$data, x$config$change)) |>
purrr::flatten() |>
discard(~ is.null(.$key))
df <- data.frame(
key = rows |> map_chr(~ .$key),
title = rows |> map_chr(~ .$fieldMeta$title),
type = rows |> map_chr(~ .$fieldMeta$type),
value = rows |> map_chr(~ .$value)
)

R Shiny app loads, but radio buttons do not select values properly

This is my first time using stack overflow so apologies if I do this wrong.
I'm fairly new to coding in R and I'm trying to make a simple Shiny app using a TidyTuesday dataset. I wanted to make a map with points showing the different types of water systems ("water_tech") and radio buttons to choose which type of water system is plotted on the map. I got the app to load without an error message, however no matter which button is selected, all of the different types of water systems are plotted on the map, not just the one I selected (essentially, the buttons don't work). If anyone has any ideas about what could be causing this to happen I would greatly appreciate it!
Reproducible code:
### Load Libraries
library(shiny)
#> Warning: package 'shiny' was built under R version 4.0.4
library(shinythemes)
#> Warning: package 'shinythemes' was built under R version 4.0.4
library(tidyverse)
#> Warning: package 'ggplot2' was built under R version 4.0.5
#> Warning: package 'tibble' was built under R version 4.0.5
#> Warning: package 'tidyr' was built under R version 4.0.5
#> Warning: package 'dplyr' was built under R version 4.0.5
library(here)
#> here() starts at C:/Users/eruks/AppData/Local/Temp/Rtmp2jxqLH/reprex-2a306cec2120-white-boto
library(rnaturalearth)
#> Warning: package 'rnaturalearth' was built under R version 4.0.5
library(rnaturalearthdata)
#> Warning: package 'rnaturalearthdata' was built under R version 4.0.5
library(sf)
#> Warning: package 'sf' was built under R version 4.0.5
#> Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1
### Load Data
water <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-05-04/water.csv')
#>
#> -- Column specification --------------------------------------------------------
#> cols(
#> row_id = col_double(),
#> lat_deg = col_double(),
#> lon_deg = col_double(),
#> report_date = col_character(),
#> status_id = col_character(),
#> water_source = col_character(),
#> water_tech = col_character(),
#> facility_type = col_character(),
#> country_name = col_character(),
#> install_year = col_double(),
#> installer = col_character(),
#> pay = col_character(),
#> status = col_character()
#> )
### User Interface
ui <- fluidPage(theme = shinytheme("spacelab"),
# Application title
titlePanel("Water Access Points in Africa"),
# Sidebar with radio buttons for choosing which type of water system
sidebarLayout(
sidebarPanel(
radioButtons(inputId = "water_tech",
label = "Water system:",
choices = c("Hand Pump", "Hydram", "Kiosk", "Mechanized Pump", "Rope and Bucket", "Tapstand"),
selected = "Hand Pump")
),
mainPanel(
plotOutput("water_plot")
)
)
)
server <- function(input, output) {
water_clean <- water %>%
drop_na(water_tech) %>%
mutate(water_tech = ifelse(str_detect(water_tech, "Hand Pump"), "Hand Pump", water_tech),
water_tech = ifelse(str_detect(water_tech, "Mechanized Pump"), "Mechanized Pump", water_tech),
water_tech = as.factor(water_tech)) %>%
select(2, 3, 7, 9) %>%
filter(lon_deg > -25 & lon_deg < 52 & lat_deg > -40 & lat_deg < 35)
africa <- ne_countries(scale = "medium", returnclass = "sf", continent = "Africa")
rwater <- reactive({
water_clean %>%
filter(water_tech == input$water_tech)
})
output$water_plot <- renderPlot({
rwater() %>%
ggplot() +
geom_sf(data = africa,
fill = "#ffffff") +
geom_point(data = water_clean,
aes(x = lon_deg,
y = lat_deg,
color = water_tech)) +
theme_bw() +
theme(panel.grid = element_blank(),
axis.text = element_blank(),
axis.title = element_blank(),
axis.ticks = element_blank(),
panel.border = element_blank()) +
labs(x = "",
y = "")
})
}
# Run the application
shinyApp(ui = ui, server = server)
#> PhantomJS not found. You can install it with webshot::install_phantomjs(). If it is installed, please make sure the phantomjs executable can be found via the PATH variable.
```
<div style="width: 100% ; height: 400px ; text-align: center; box-sizing: border-box; -moz-box-sizing: border-box; -webkit-box-sizing: border-box;" class="muted well">Shiny applications not supported in static R Markdown documents</div>
<sup>Created on 2021-05-05 by the [reprex package](https://reprex.tidyverse.org) (v2.0.0)</sup>```
Thank you :)
rwater() has no effect in this code:
rwater() %>%
ggplot() +
geom_sf(data = africa,
fill = "#ffffff") +
geom_point(data = water_clean,
aes(x = lon_deg,
y = lat_deg,
color = water_tech))
because you enter the water_clean data in geom_point.
I think you want:
ggplot() +
geom_sf(data = africa,
fill = "#ffffff") +
geom_point(data = rwater(),
aes(x = lon_deg,
y = lat_deg,
color = water_tech))

Convert html output to image

I'm using R formattable package to render some data frames but the output is html ( it opens the browser after I run the script ).
The thing is I'm trying to render those tables under PowerBI which accepts R scripts but need the output to be an image (like a ggplot) not html. But I don't know how I can do it.
I've looked into R2HTML and htmlwidgets packages but I still didn't find a solution with those. ( I may have made some mistakes ).
Here is the dummy code I'm working with:
library(formattable)
DF <- data.frame(Ticker=c("", "", "", "IBM", "AAPL", "MSFT"),
Name=c("Dow Jones", "S&P 500", "Technology",
"IBM", "Apple", "Microsoft"),
Value=accounting(c(15988.08, 1880.33, NA,
130.00, 97.05, 50.99)),
Change=percent(c(-0.0239, -0.0216, 0.021,
-0.0219, -0.0248, -0.0399)))
DF
## Ticker Name Value Change
## 1 Dow Jones 15,988.08 -2.39%
## 2 S&P 500 1,880.33 -2.16%
## 3 Technology NA 2.10%
## 4 IBM IBM 130.00 -2.19%
## 5 AAPL Apple 97.05 -2.48%
## 6 MSFT Microsoft 50.99 -3.99%
formattable(DF, list(
Name=formatter(
"span",
style = x ~ ifelse(x == "Technology",
style(font.weight = "bold"), NA)),
Value = color_tile("white", "orange")
Change = formatter(
"span",
style = x ~ style(color = ifelse(x < 0 , "red", "green")),
x ~ icontext(ifelse(x < 0, "arrow-down", "arrow-up"), x))))
formattable(DF, list(
Name = formatter(
"span", style = x ~ ifelse(x == "Technology", style(font.weight = "bold"), NA)
),
Value = color_tile("white", "orange"),
Change = formatter(
"span", style = x ~ style(color = ifelse(x < 0 , "red", "green")),
x ~ icontext(ifelse(x < 0, "arrow-down", "arrow-up"), x)))
) -> w
htmlwidgets::saveWidget(as.htmlwidget(w), "/some/dir/table.html", selfcontained = TRUE)
webshot::webshot(url = "/some/dir/table.html", file = "/some/dir/table.png",
vwidth = 1000, vheight = 275)
The width/height is not necessarily going to come out precisely as what's specified and you'll need to do some manual guessing for it (or load up magick and see if you can auto-clip using it).
This relies on phantomjs and you may not be able to get your IT and/or security groups to enable the use of it.

Accessing the internet with knitr

When I run some code to draw a map using ggmap, in Rstudio it runs fine. When I run it using knitr it fails with the following error message:-
Error in download.file(url, destfile = destfile, quiet = !messaging, mode = "wb") :
cannot open URL 'http://maps.googleapis.com/maps/api/staticmap?center=-40.851253,172.799669&zoom=19&size=%20640x640&scale=%202&maptype=hybrid&sensor=false'
Calls: ... eval -> eval -> get_map -> get_googlemap -> download.file
In addition: Warning message:
In download.file(url, destfile = destfile, quiet = !messaging, mode = "wb") :
unable to connect to 'maps.googleapis.com' on port 80.
Execution halted
I am sure this is due to the way our network is set up, probably around permissions, but is anyone able to give me any clues as to how knitr would try to access the internet to download a map so I may be able to find a way though our firewall.
Code added but it works fine except through our network.
---
title: "Drawing a map"
author: "Alasdair Noble"
output: word_document
---
To draw a map
```{r echo=TRUE, warning=FALSE , results='markup', comment="",message=FALSE }
library(ggplot2)
library(grid)
library(GGally)
library(plyr)
library(RColorBrewer)
library(ggmap)
library(ggthemes)
```
```{r echo=TRUE, warning=FALSE , results='markup', comment="", message=FALSE }
Btrcup <- get_map(location=c(lon=171.799669, lat=-42.851253),zoom=19, maptype="hybrid")
Btrcupmap <- ggmap(Btrcup)
Btrcupmap
```

Shiny isolate - How not to send the hidden conditional input data to the server?

How can I ask Shiny not to send the hidden conditional input data to the server from the Shiny UI?
I have this problem which I don't know how to solve it.
When I select 'One site' and the dropdown select options for the site 2 will be hidden and I don't want any of the site 2 data to be sent to the server.
But Shiny does send the hidden input data to the server when I hit the GO button. How can I not to send it?
Below are my code,
ui.R,
# Site 1 options.
site1 <- selectInput(
inputId = "site1",
label = "Select a first site:",
choices = c('1a', '1b')
)
# Site 2 options.
site2 <- selectInput(
inputId = "site2",
label = "Select a second site:",
choices = c('2a', '2b')
)
shinyUI(
pageWithSidebar(
headerPanel("Shiny App"),
sidebarPanel(
selectInput(
"distribution",
"Please select a type:",
choices = c("Both sites", "One site")
),
# Site select input.
site1,
# Condition when the plot is a line plot.
conditionalPanel(
condition = "input.distribution == 'Both sites'",
site2
),
actionButton("goButton", "Go!")
),
mainPanel(
plotOutput("myPlot")
)
)
)
server.R,
shinyServer(
function(input, output, session) {
output$myPlot = renderPlot({
# Take a dependency on input$goButton
input$goButton
site1 <- isolate(input$site1)
site2 <- isolate(input$site2)
plot(1, 1, col = "white")
text(1, 1, paste(site1, " ", site2))
})
}
)
Here are the visual:
two sites (correct result),
one site (incorrect result),
the expected result,
Any ideas? Is it a bug from Shiny?