I want to create a function that takes an externally defined variable and uses it in a group by using dplyr. Here is what I have so far:
data(mtcars)
my_grp_col <- 'gear'
calculate_mean <- function(data, grouping_column, target){
data %>%
group_by(cyl, am, {{my_grp_col}}, target) %>%
summarize(mean(target, na.rm = T))
}
calculate_mean(data = mtcars, grouping_column = my_grp_col, target = mpg)
Essentially, I want to group by cyl, am, gear (which I have defined externally) and then calculate the mean of target (mpg).
The following works (note that target also needs {{...}} around it in this case):
library(dplyr)
data(mtcars)
my_grp_col <- 'gear'
calculate_mean <- function(data, grouping_column, target){
data %>%
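# group by the fixed columns plus the column named in the string
group_by(cyl, am, !!sym(grouping_column)) %>%
# embrace target so the column passed by the caller is used
summarize(mean_target = mean({{target}}, na.rm = TRUE))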
}
calculate_mean(data = mtcars, grouping_column = my_grp_col, target = mpg)
However, it looks much nicer if you pass grouping_column directly as a bare column name instead of defining it as a string first:
calculate_mean <- function(data, grouping_column, target){
data %>%
# both the grouping column and the target are passed as bare names
group_by(cyl, am, {{grouping_column}}) %>%
summarize(mean_target = mean({{target}}, na.rm = TRUE))
}
calculate_mean(data = mtcars, grouping_column = gear, target = mpg)
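If the grouping column has to stay a string, the .data pronoun is another option alongside !!sym(). This is only a minimal sketch, assuming dplyr 1.0 or later; the function name calculate_mean_chr and the output column mean_target are made up for illustration:

```r
library(dplyr)

# Hypothetical helper: grouping_column is a string, target is a bare column name.
calculate_mean_chr <- function(data, grouping_column, target) {
  data %>%
    # .data[[...]] looks the column up by its name stored in the string
    group_by(cyl, am, .data[[grouping_column]]) %>%
    # {{ }} forwards the bare column name given by the caller
    summarize(mean_target = mean({{ target }}, na.rm = TRUE), .groups = "drop")
}

my_grp_col <- "gear"
res <- calculate_mean_chr(mtcars, my_grp_col, mpg)
print(res)
```

The result should have one row per combination of cyl, am and gear, with the mean of mpg for that combination.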
I am trying to webscrape a site to get addresses for a set of names (part A) along with the longitude and latitudes (part B). I don't know how to do this all together, so I did this in two parts:
# part A
library(tidyverse)
library(rvest)
library(httr)
library(XML)
# Define function to scrape 1 page
get_info <- function(page_n) {
cat("Scraping page ", page_n, "\n")
page <- paste0("https://www.mywebsite/",
page_n, "?extension") %>% read_html
tibble(title = page %>%
html_elements(".title a") %>%
html_text2(),
adress = page %>%
html_elements(".marker") %>%
html_text2(),
page = page_n)
}
# Apply function to pages 1:10
df_1 <- map_dfr(1:10, get_info)
# Check dimensions
dim(df_1)
[1] 90
Here is part B:
# Recognize pattern in websites
part1 = "https://www.mywebsite/"
part2 = c(0:55)
part3 = "extension"
temp = data.frame(part1, part2, part3)
# Create list of websites
temp$all_websites = paste0(temp$part1, temp$part2, temp$part3)
# Scrape
df_2 <- list()
for (i in 1:10)
{tryCatch({
url_i <-temp$all_websites[i]
page_i <-read_html(url_i)
b_i = page_i %>% html_nodes("head")
listanswer_i <- b_i %>% html_text() %>% strsplit("\\n")
df_2[[i]] <- listanswer_i
print(listanswer_i)
}, error = function(e){})
}
# Extract long/lat from results
lat_long = grep("LatLng", unlist(df_2[]), value = TRUE)
df_2 = data.frame(str_match(lat_long, "LatLng(\\s*(.*?)\\s*);"))
df_2 = df_2 %>% filter(X1 != "LatLngBounds();")
> dim(df_2)
[1] 86 3
We can see that df_1 and df_2 have a different number of rows - but also, there is no common merge key between df_1 and df_2. How can I re-write my code in such a way that I can create a merge key between df_1 and df_2 such that I can merge the common records between these files together?
I am not sure multiple requests to the same URIs are needed. Some lat/long values are not listed on either the results pages or the result-specific linked webpage; e.g. Toronto Beaches Dentist, from the current page 2 results, has no lat/long shown on either page 2 or the website-specific page. In these cases, you may choose to fill the blanks using another service which returns lat/long based on an address.
You can rewrite your function and alter your regex patterns to produce two dataframes, join them, and return the resulting dataframe. With the regex changes given below, you can use the address column to join the two dataframes. I dislike using an address as a key, but it does appear to be internally consistent across the results page. I have used a left join to return all rows from the dentist listings, i.e. the practice business names.
library(tidyverse)
library(rvest)
urls <- sprintf("https://www.dentistsearch.ca/search-doctor/%i?category=0&services=0&province=55&city=&k=", 1:10)
pages <- lapply(urls, read_html)
get_dentist_info <- function(page) {
page_text <- page %>% html_text()
address_keys <- page_text %>%
str_match_all('marker_\\d+\\.set\\("content", "(.*?)"\\);') %>%
.[[1]] %>%
.[, 2]
lat_long <- page_text %>%
str_match_all("LatLng\\((.*)\\);(?![\\s\\S]+myOptions)") %>%
.[[1]] %>%
.[, 2]
lat_lon <- tibble(address = address_keys, lat_long = lat_long) %>%
separate(lat_long, into = c("lat", "long"), sep = ", ") %>%
mutate(lat = as.numeric(lat), long = as.numeric(long))
practice_info <- tibble(
title = page %>% html_elements(".title > a") %>% html_text(trim = T),
address = page %>% html_elements(".marker") %>% html_text()
)
dentist_info <- left_join(practice_info, lat_lon, by = "address")
return(dentist_info)
}
all_dentist_info <- map_dfr(pages, get_dentist_info)
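To see how the left join on address treats listings without coordinates, here is a small offline sketch. The two tibbles are made-up stand-ins for practice_info and lat_lon (the titles, addresses and coordinates are invented); anti_join() is used only to list the practices that would need another geocoding service:

```r
library(dplyr)

# Made-up stand-in for the scraped listings
practice_info <- tibble(
  title   = c("Practice A", "Practice B", "Practice C"),
  address = c("1 First St", "2 Second Ave", "3 Third Rd")
)

# Made-up stand-in for the coordinates; "2 Second Ave" has none
lat_lon <- tibble(
  address = c("1 First St", "3 Third Rd"),
  lat     = c(43.65, 43.70),
  long    = c(-79.38, -79.40)
)

# Keeps every listing; unmatched addresses get NA for lat and long
joined <- left_join(practice_info, lat_lon, by = "address")

# Listings with no coordinates at all
missing_coords <- anti_join(practice_info, lat_lon, by = "address")

print(joined)
print(missing_coords)
```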
I would like to parse addresses of all stores on the following website:
https://www.carrefour.fr/magasin/region/ looping through the regions. So starting for example with the region "auvergne-rhone-alpes-84", hence full url = https://www.carrefour.fr/magasin/region/auvergne-rhone-alpes-84. Note that I can add more regions afterwards, I just want to make it work with one for now.
carrefour <- "https://www.carrefour.fr/magasin/region/"
addresses_vector = c()
for (current_region in c("auvergne-rhone-alpes-84")) {
current_region_url = paste(carrefour, current_region, "/", sep="")
x <- GET(url=current_region_url)
html_doc <- read_html(x) %>%
html_nodes("[class = 'ds-body-text ds-store-card__details--content ds-body-text--size-m ds-body-text--color-standard-2']")
addresses_vector <- c(addresses_vector, html_doc %>%
rvest::html_nodes('body')%>%
xml2::xml_find_all(".//div[contains(@class, 'ds-body-text ds-store-card__details--content ds-body-text--size-m ds-body-text--color-standard-2')]") %>%
rvest::html_text())
}
I also tried with x%>% read_html() %>% rvest::html_nodes(xpath="/html/body/main/div[1]/div/div[2]/div[2]/ol/li[1]/div/div[1]/div[2]/div[2]")%>% rvest::html_text() (copying the whole xpath by hand) or x%>%read_html() %>%html_nodes("div.ds-body-text.ds-store-card__details--content.ds-body-text--size-m.ds-body-text--color-standard-2") %>%html_text() and several other ways but I always get a character(0) element returned.
Any help is appreciated!
You could write a couple of custom functions to help then use purrr to map the store data function to inputs from the output of the first helper function.
First, extract the region urls and extract the region names and region ids. Store these in a tibble. This is the first helper function get_regions.
Then use another function, get_store_info, to extract the store info from each region url. The store info is held in an attribute of a div tag; JavaScript running in the browser reads it from there to build the page, which does not happen when using rvest, so the function reads the attribute directly.
Apply the function that extracts the store info over the list of region urls and region ids.
If you use map2_dfr to pass both region id and region link to the function which extracts store data, you then have the region id to link back on to join the result of the map2_dfr to that of region tibble generated earlier.
Then do some column cleaning e.g., drop ones you don't want.
library(rvest)
library(purrr)
library(dplyr)
library(readr)
library(stringr) # str_match() is used in get_regions()
library(jsonlite)
get_regions <- function() {
url <- "https://www.carrefour.fr/magasin"
page <- read_html(url)
regions <- page %>% html_nodes(".store-locator-footer-list__item > a")
t <- tibble(
region = regions %>% html_text(trim = T),
link = regions %>% html_attr("href") %>% url_absolute(url),
region_id = NA_integer_
) %>% mutate(region_id = str_match(link, "-(\\d+)$")[, 2] %>%
as.integer())
return(t)
}
get_store_info <- function(region_url, r_id) {
region_page <- read_html(region_url)
store_data <- region_page %>%
html_node("#store-locator") %>%
html_attr(":context-stores") %>%
parse_json(simplifyVector = T) %>%
as_tibble()
store_data$region_id <- r_id
return(store_data)
}
region_df <- get_regions()
store_df <- map2_dfr(region_df$link, region_df$region_id, get_store_info)
final_df <- inner_join(region_df, store_df, by = 'region_id') # now clean columns within this.
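The key step in get_store_info, reading JSON out of an element attribute rather than out of rendered markup, can be tried offline. The sketch below uses an invented HTML fragment with a hypothetical data-stores attribute and made-up store names (the real page uses the :context-stores attribute shown above), and assumes R 4.0 or later for the raw string syntax:

```r
library(rvest)
library(jsonlite)

# Invented fragment: store data embedded as JSON in an attribute
html <- r"(<div id="store-locator" data-stores='[{"name":"Store A","city":"Lyon"},{"name":"Store B","city":"Grenoble"}]'></div>)"

page <- read_html(html)

stores <- page %>%
  html_element("#store-locator") %>%   # the element holding the data
  html_attr("data-stores") %>%         # raw JSON text from the attribute
  parse_json(simplifyVector = TRUE)    # JSON array of objects -> data frame

print(stores)
```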
I'd like to scrape the content of a page once the province (and the commune) are selected.
The following code correctly outputs the provinces and their values.
library(rvest)
page <- read_html(x = "https://www.solferinoesanmartino.it/progetto-torelli/progetto-torelli-risultati/")
text <- page %>% html_nodes(xpath='//select[@name="provincia"]/option')%>% html_text()
values <- page %>% html_nodes(xpath='//select[@name="provincia"]/option')%>% html_attr("value")
Res <- data.frame(text = text, values = values, stringsAsFactors = FALSE)
Res
Now, I'd like to access the page for each value, e.g. this might be helpful for getting access to value=19.
text <- page %>% html_nodes(xpath="//*/option[@value = '19']")%>% html_text()
text
The source code is the following
<div class="row results_form_search">
<form role="search" method="POST" class="search-form" action="/progetto-torelli/progetto-torelli-risultati/" id="search_location">
<input type="hidden" name="comune_from" value="" />
<div class="form-row">
<input type="text" name="cognome" placeholder="Cognome" autocomplete="off" value="">
<select name="provincia">
<option value="0" selected>Seleziona Provincia</option>
<option value="74"
>-
</option>
<option value="75"
>AGRIGENTO
</option>
<option value="19"
>ALESSANDRIA
This is where the content that I want to scrape might be.
<div class="row">
<ul class="listing_search">
</ul>
</div>
Thank you so much for your advice!
RSelenium may end up being the way to go. However, if you insert some judicious waits, or chunk your requests so the server isn't swamped, you can use rvest and make the same requests the page does.
You first need to generate all the combinations of province and comune (filtering out unwanted values); this can be done by making xmlhttp requests, using the value attribute for the options within the select for province, to gather back the comune dropdown options and their associated values.
You then make further requests, for each combination pair, to get the page content, which you would get when making selections from each of those dropdowns manually and pushing CERCA.
Pauses are needed as there are 10,389 valid combinations, by my reckoning, and, if you attempt to make all those requests one after the other, following the initial requests as well, the server will cut-off the connection.
Another option would be to split the combined dataframe into smaller dataframes, make the requests for each of those at timed intervals, and then combine the results.
library(rvest)
library(dplyr)
library(purrr)
get_provincias <- function(link) {
nodes <- read_html(link) %>%
html_nodes('[name="provincia"] > option:not([selected]):not(:contains("-")):not(:contains("\u0085"))')
df <- data.frame(
Provincia = nodes %>% html_text(trim = T),
id0 = nodes %>% html_attr("value")
)
return(df)
}
get_comunes <- function(id) {
link <- sprintf(
"https://www.solferinoesanmartino.it/db-torelli/_get_comuni.php?id0=%s&id1=0&_=%i",
id,
as.numeric(as.POSIXct(Sys.Date(), format = "%Y-%m-%d"))
)
# print(link)
nodes <- read_html(link) %>% html_nodes('option:not([value="0"])')
df <- data.frame(
id0 = id, # id1
Comune = nodes %>% html_text(trim = T),
id3 = nodes %>% html_attr("value")
)
return(df)
}
get_page <- function(prov_id, com_id) {
link <- sprintf(
"https://www.solferinoesanmartino.it/db-torelli/_get_soldati.php?id0=1&id1=&id2=%s&id3=%s&_=%i",
prov_id,
com_id,
as.numeric(as.POSIXct(Sys.Date(), format = "%Y-%m-%d"))
)
page <- read_html(link)
# print(page %>% html_node(".listing_name") %>% html_text(trim = T))
# print(tibble(id3 = com_id, page = page))
return(tibble(id3 = com_id, page = page))
}
provincias <- get_provincias("https://www.solferinoesanmartino.it/progetto-torelli/progetto-torelli-risultati")
comunes <- map_df(provincias$id0, get_comunes) %>% filter(Comune != "-")
combined <- dplyr::right_join(provincias, comunes, by = "id0")
# length(combined$Comune) -> 10389
results <- map2_dfr(combined$id0, combined$id3, .f = get_page)
final <- dplyr::inner_join(combined, results, by = "id3")
Below is a longer version, with the additional info you requested, where I played around with adding pauses. I still found that I could run up to, and including
combined <- dplyr::right_join(provincias, comunes, by = "id0")
in one go. But after that I needed to chunk requests into about 2000 requests batches with 20-30 minutes in between. You can try tweaking the timings below. I ended up using the commented out section to run each batch and then left a pause of 30 mins in between.
Some things to consider:
It seems that you can have comunes values like ... which still return listings. With that in mind you may wish to remove the :not parts of this:
html_nodes('[name="provincia"] > option:not([selected]):not(:contains("-")):not(:contains("\u0085"))')
as I assumed that was filtering out invalid results.
Next, you might consider writing a helper function with httr and retry,
to make the requests with backoff/retry, rather than use pauses.
Such a function might look like this:
httr::RETRY(
"GET",
<request url>,
times = 3,
pause_min = 20*60,
pause_base = 20*60)
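The RETRY call above could be wrapped in a small helper and used in place of read_html(link). This is only a sketch: the name fetch_with_retry and its arguments are made up, the 20 minute wait simply mirrors the values above, and it has not been tuned against this server:

```r
library(httr)
library(rvest)

# Hypothetical helper: GET a URL with retries, then parse the body as HTML.
fetch_with_retry <- function(url, times = 3, wait_secs = 20 * 60) {
  resp <- httr::RETRY(
    "GET", url,
    times = times,          # maximum number of attempts
    pause_min = wait_secs,  # wait at least this many seconds between attempts
    pause_base = wait_secs
  )
  httr::stop_for_status(resp)  # error out if the final attempt still failed
  rvest::read_html(httr::content(resp, as = "text", encoding = "UTF-8"))
}

# Usage inside get_data(), instead of read_html(link):
# page <- fetch_with_retry(link)
```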
Anyway, those are some ideas. Even without the server cutting the connection, via uses of waits, I still found it started to throttle requests, meaning some requests took quite a long time to complete. Optimizing this could potentially take a lot of time and effort. I spent a good few days playing around looking at chunk size and waits.
library(rvest)
library(dplyr)
library(purrr)
get_provincias <- function(link) {
nodes <- read_html(link) %>%
html_nodes('[name="provincia"] > option:not([selected]):not(:contains("-")):not(:contains("\u0085"))')
df <- data.frame(
Provincia = nodes %>% html_text(trim = T),
id0 = nodes %>% html_attr("value")
)
return(df)
}
get_comunes <- function(id) {
link <- sprintf(
"https://www.solferinoesanmartino.it/db-torelli/_get_comuni.php?id0=%s&id1=0&_=%i",
id,
as.numeric(as.POSIXct(Sys.Date(), format = "%Y-%m-%d"))
)
# print(link)
nodes <- read_html(link) %>% html_nodes('option:not([value="0"])')
df <- data.frame(
id0 = id, # id1
Comune = nodes %>% html_text(trim = T),
id3 = nodes %>% html_attr("value")
)
return(df)
}
get_data <- function(prov_id, com_id) {
link <- sprintf(
"https://www.solferinoesanmartino.it/db-torelli/_get_soldati.php?id0=1&id1=&id2=%s&id3=%s&_=%i",
prov_id,
com_id,
as.numeric(as.POSIXct(Sys.Date(), format = "%Y-%m-%d"))
)
# print(link)
page <- read_html(link)
df <- data.frame(
cognome = page %>% html_nodes(".listing_name") %>% html_text(trim = T),
livello = page %>% html_nodes(".listing_level") %>% html_text(trim = T),
id3 = com_id,# for later join back on comune
id0 = prov_id
)
Sys.sleep(.25) # pause for 0.25 sec
return(df)
}
get_chunks <- function(df, chunk_size) { # adapted from @BenBolker https://stackoverflow.com/a/7060331
n <- nrow(df)
r <- rep(1:ceiling(n / chunk_size), each = chunk_size)[1:n]
d <- split(df, r)
return(d)
}
write_rows <- function(df, filename) {
flag <- file.exists(filename)
df2 <- purrr::map2_dfr(df$id0, df$id3, .f = get_data)
write.table(df2,
file = filename, sep = ",",
append = flag,
quote = F, col.names = !flag,
row.names = F
)
Sys.sleep(60*10)
}
provincias <- get_provincias("https://www.solferinoesanmartino.it/progetto-torelli/progetto-torelli-risultati")
Sys.sleep(60*5)
comunes <- map_df(provincias$id0, get_comunes) %>% filter(Comune != "-")
Sys.sleep(60*10)
combined <- dplyr::right_join(provincias, comunes, by = "id0")
Sys.sleep(60*10)
chunked <- get_chunks(combined, 2000) # https://stackoverflow.com/questions/7060272/split-up-a-dataframe-by-number-of-rows
filename <- "prov_com_cog_liv.csv"
map(chunked, ~ write_rows(.x, filename))
## #### test case #####################
# df <- chunked[[6]]
#
# flag <- file.exists(filename)
#
# df2 <- map2_dfr(df$id0, df$id3, .f = get_data)
#
# write.table(df2,
# file = filename, sep = ",",
# append = flag,
# quote = F, col.names = !flag,
# row.names = F
# )
####################################
results <- read.csv(filename)
final <- dplyr::right_join(combined, results, by = "id3")
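Before running the long job, the chunking step can be checked on a toy data frame. The sketch below repeats get_chunks from above so that it runs on its own; the 7-row data frame and the chunk size of 3 are arbitrary:

```r
# Same chunking logic as above, repeated so the example is self-contained
get_chunks <- function(df, chunk_size) {
  n <- nrow(df)
  # chunk index for every row: 1,1,1,2,2,2,3 for n = 7 and chunk_size = 3
  r <- rep(1:ceiling(n / chunk_size), each = chunk_size)[1:n]
  split(df, r)
}

toy <- data.frame(id0 = 1:7, id3 = 101:107)
chunks <- get_chunks(toy, 3)

chunk_sizes <- sapply(chunks, nrow)
print(chunk_sizes)
```

The last chunk holds whatever rows are left over, so with 10,389 combinations and chunks of 2000 the final batch is smaller than the others.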
I am using the R programming language. I am trying to take different types of graphs (bar graphs, pie charts) and put them on the same page. I generated some fake data and made several graphs - then I put them together (see : Combining Different Types of Graphs Together (R))
library(dplyr)
library(ggplot2)
library(cowplot)
library(gridExtra)
library(plotly)
date= seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")
var <- rnorm(731,10,10)
group <- sample( LETTERS[1:4], 731, replace=TRUE, prob=c(0.25, 0.22, 0.25, 0.25) )
data = data.frame(date, var, group)
data$year = as.numeric(format(data$date,'%Y'))
data$year = as.factor(data$year)
###Pie
Pie_2014 <- data %>%
filter((data$year == "2014")) %>%
group_by(group) %>%
summarise(n = n())
Pie_2014_graph = ggplot(Pie_2014, aes(x="", y=n, fill=group)) +
geom_bar(stat="identity", width=1) +
coord_polar("y", start=0) +ggtitle( "Pie Chart 2014")
Pie_2015 <- data %>%
filter((data$year == "2015")) %>%
group_by(group) %>%
summarise(n = n())
Pie_2015_graph = ggplot(Pie_2015, aes(x="", y=n, fill=group)) +
geom_bar(stat="identity", width=1) +
coord_polar("y", start=0) +ggtitle( "Pie Chart 2015")
Pie_total = data %>%
group_by(group) %>%
summarise(n = n())
Pie_total_graph = ggplot(Pie_total, aes(x="", y=n, fill=group)) +
geom_bar(stat="identity", width=1) +
coord_polar("y", start=0) +ggtitle( "Pie Chart Average")
###bars
Bar_years = data %>%
group_by(year, group) %>%
summarise(mean = mean(var))
Bar_years_plot = ggplot(Bar_years, aes(fill=group, y=mean, x=year)) +
geom_bar(position="dodge", stat="identity") + ggtitle("Bar Plot All Years")
Bar_total = data %>%
group_by(group) %>%
summarise(mean = n())
Bar_total_plot = ggplot(Bar_total, aes(x=group, y=mean, fill=group)) +
geom_bar(stat="identity")+theme_minimal() + ggtitle("Bar Plot Average")
#assembling the graphs can be done two different ways
#first way
g1 <- grid.arrange(Pie_2014_graph, Pie_2015_graph , Pie_total_graph, nrow = 1)
g2 <- grid.arrange(Bar_total_plot, Bar_years_plot, nrow = 1)
g = grid.arrange(g1, g2, ncol = 1)
#second way
# arrange subplots in rows
top_row <- plot_grid(Pie_2014_graph, Pie_2015_graph, Pie_total_graph)
middle_row <- plot_grid(Bar_years_plot, Bar_total_plot)
# arrange our new rows into combined plot
p <- plot_grid(top_row, middle_row, nrow = 2)
p
From here, I am trying to use the plotly::ggplotly() command to make the above output "interactive" (move the mouse over the graphs and see labels). I know that this works for individual plots:
ggplotly(Bar_years_plot)
However, this command does not seem to work with the "cowplot" and the "gridExtra" outputs:
#gridExtra version:
ggplotly(g)
Error in UseMethod("ggplotly", p) :
no applicable method for 'ggplotly' applied to an object of class "c('gtable', 'gTree', 'grob', 'gDesc')"
#cowplot version: (produces empty plot)
ggplotly(p)
Warning messages:
1: In geom2trace.default(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]]) :
geom_GeomDrawGrob() has yet to be implemented in plotly.
If you'd like to see this geom implemented,
Please open an issue with your example code at
https://github.com/ropensci/plotly/issues
2: In geom2trace.default(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]]) :
geom_GeomDrawGrob() has yet to be implemented in plotly.
If you'd like to see this geom implemented,
Please open an issue with your example code at
https://github.com/ropensci/plotly/issues
Does anyone know if there is a quick way to use the ggplotly() function for objects created with "gridExtra" or "cowplot"?
I know that with a bit of work, it might be possible using "htmltools":
library(htmltools)
doc <- htmltools::tagList(
div(Pie_2014_graph, style = "float:left;width:50%;"),
div(Pie_2015_graph,style = "float:left;width:50%;"),
div(Pie_total_graph, style = "float:left;width:50%;"),
div(Bar_years_plot, style = "float:left;width:50%;"),
div(Bar_total_plot, style = "float:left;width:50%;"))
save_html(html = doc, file = "out.html")
But I am not sure how to do this.
Can someone please show me how to make the collections of graphs interactive either using ggplotly() or with htmltools()?
Thanks.
You should apply ggplotly() to the individual graphs, not to the combined object returned by grid.arrange() or plot_grid().
For example:
Pie_2014_graph = ggplotly(ggplot(Pie_2014, aes(x="", y=n, fill=group)) +
geom_bar(stat="identity", width=1) +
coord_polar("y", start=0) +ggtitle( "Pie Chart 2014") )
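Once each graph has been converted on its own, the resulting widgets can be laid out with htmltools, much like the attempt in the question. A minimal sketch with made-up data and two simple plots (the object names toy, p1, p2, w1 and w2 are invented; the float and width styles are copied from the question):

```r
library(ggplot2)
library(plotly)
library(htmltools)

# Made-up data for two simple plots
toy <- data.frame(group = LETTERS[1:4], n = c(10, 20, 15, 5))

p1 <- ggplot(toy, aes(x = group, y = n, fill = group)) +
  geom_col() + ggtitle("Plot 1")
p2 <- ggplot(toy, aes(x = group, y = n)) +
  geom_point() + ggtitle("Plot 2")

# Convert each ggplot separately
w1 <- ggplotly(p1)
w2 <- ggplotly(p2)

# Place the interactive widgets side by side
doc <- htmltools::tagList(
  htmltools::div(w1, style = "float:left;width:50%;"),
  htmltools::div(w2, style = "float:left;width:50%;")
)

# Write the page to disk, or use htmltools::browsable(doc) to view it
# htmltools::save_html(html = doc, file = "out.html")
```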
I defined some table functions in R using the expss package to automate tabulation.
One of my tables should show cases or percentages by category, followed by the mean. The mean can be based on the same category variable or on another variable.
Overall the code works perfectly. For some variables, though, I keep running into the error "Error in do.call(data.frame, c(x, alis)) : variable names are limited to 10000 bytes".
The code for this table:
Table2 = function (Q, banner=banner, caption , Q.mean, ddata=d, questlab=dquest, mis.val=999) {
x_totaln<-eval(substitute(x),ddata)
x_totaln[is.na(eval(substitute(Q),ddata))]<-NA
if(missing(Q.mean))
{Q_mean<-eval(substitute(Q),ddata)}
else
{Q_mean<-eval(substitute(Q.mean),ddata)}
Q_mean[Q_mean==mis.val]<-NA
if(missing(caption))
{caption<-eval(substitute(var_lab(Q_mean)),questlab)}
eval.parent(substitute(
{
banner %>%
tab_cells (x_totaln) %>%
tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
tab_cells (Q) %>%
tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
tab_stat_cpct(total_row_position = c("none"), label = '%') %>%
tab_cells (Q_mean) %>%
tab_stat_mean(label = 'Mean') %>%
tab_pivot (stat_position = "inside_rows") %>%
drop_c () %>%
custom_format2() %>%
set_caption(caption)
}
))
}
Overall, this code works perfectly:
Table2(Q8_cat)
For some variables though it generates the error code
Table2(age_cat,Q.mean=age,caption="Your age at the start of the programme?")
Error in do.call(data.frame, c(x, alis)) :
variable names are limited to 10000 bytes
19.
do.call(data.frame, c(x, alis))
whereas hard-coding the variables in the function works perfectly again:
Table2test = function () {
x_totaln<-eval(substitute(x),d)
x_totaln[is.na(eval(substitute(age_cat),d))]<-NA
Q_mean<-eval(substitute(age),d)
Q_mean[Q_mean==999]<-NA
banner %>%
tab_cells (x_totaln) %>%
tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
tab_cells (age_cat) %>%
tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
tab_stat_cpct(total_row_position = c("none"), label = '%') %>%
tab_cells (Q_mean) %>%
tab_stat_mean(label = 'Mean') %>%
tab_pivot (stat_position = "inside_rows") %>%
drop_c () %>%
custom_format2() %>%
set_caption("Your age at the start of the programme?")
}
Any advice? Or anyone any idea why the error occurs?
Thanks
When you substitute variables, in some cases they are represented as a structure. In that case the expression contains no variable name, only the value: tab_cells(structure(c(22, 23, 22, 23, ... many numbers))). We then try to use this long representation as a name in the resulting table, but R limits the length of names, so the function fails. The solution is quite simple: always set variable labels, which are then used as the names. The following code runs without any errors:
Table2 = function (Q, banner=banner, caption , Q.mean, ddata=d, questlab=dquest, mis.val=999) {
x_totaln<-eval(substitute(x),ddata)
x_totaln[is.na(eval(substitute(Q),ddata))]<-NA
var_lab(x_totaln) = "Total" # add label for total
if(missing(Q.mean))
{Q_mean<-eval(substitute(Q),ddata)}
else
{Q_mean<-eval(substitute(Q.mean),ddata)}
Q_mean[Q_mean==mis.val]<-NA
if(missing(caption))
{caption<-eval(substitute(var_lab(Q_mean)),questlab)}
eval.parent(substitute(
{
banner %>%
tab_cells (x_totaln) %>%
tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
tab_cells (Q) %>%
tab_stat_cases(total_row_position = c("none"),label = 'N') %>%
tab_stat_cpct(total_row_position = c("none"), label = '%') %>%
tab_cells ("|" = Q_mean) %>% # "|" suppress label for mean
tab_stat_mean(label = 'Mean') %>%
tab_pivot (stat_position = "inside_rows") %>%
drop_c () %>%
custom_format2() %>%
set_caption(caption)
}
))
}
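The substitution behaviour described above can be reproduced in base R without expss. In this sketch the function demo_substitute and the four numbers are invented for illustration, and tab_cells is only a symbol inside an unevaluated call, so nothing from expss is run:

```r
demo_substitute <- function(Q) {
  Q_mean <- c(22, 23, 22, 23)       # ordinary local variable holding values
  substitute(tab_cells(Q, Q_mean))  # build the call without evaluating it
}

expr <- demo_substitute(age_cat)

# The function argument keeps the name the caller typed ...
print(class(expr[[2]]))
# ... but the local variable is replaced by its values
print(class(expr[[3]]))

# Deparsing therefore yields the numbers instead of a variable name
print(deparse(expr))
```

With a real column of many values, that deparsed text is what becomes too long to serve as a name, which is why setting a label (or suppressing it with "|") avoids the error.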