How to hyperlink text files into an HTML file using R?

I have an HTML output file with a column named "Description". I want to link locally saved text files to those entries of this column whose value is "Report data does not match".
A snapshot of the HTML file follows:
There are dedicated text files for rows 12, 16, 17, 18, 19 and 20, which I want to link from the Description column.
The code that generates the HTML file is:
library(xtable)
extract1 <- result[,list(TestCaseID, breadcrumb, Discription),]
print(xtable(extract1), type = "html", file = "extracted.html")
How can I link the text files? Please let me know if the question needs any modification. Thanks in advance!

I recommend pre-processing the data according to your requirements. Because the names of the text files may change later, they should be provided as a separate column.
If some rows do not need a text file link, consider conditional processing for the NAs later.
The sample below is based on a main list. The text files reside in a subfolder.
The trick is to use the HTML href attribute together with sanitize.text.function, as shown below for your test cases.
You'll need to create some dummy text files like gauge-D00.txt, gauge-D01.txt, etc. in a subfolder to try the example.
# --------------------------------------------------------
# gauge main ID list
#---------------------------------------------------------
# ID,location,description,textfile
# D00,nature reserve,Otternhagener Moor,../gauge-D00.txt
# D01,nature reserve,Helstorfer Moor,../gauge-D01.txt
# FER,benchmark,Negenborner Weg,../gauge-FER.txt
#----------------------------------------------------------
# text files reside in /data-develop-text-file-link/
# ---------------------------------------------------------
library (xtable)
gaugelist <- structure(list(
ID = structure(1:3, .Label = c("D00", "D01", "FER"), class = "factor"),
location = structure(c(2L, 2L, 1L), .Label = c("benchmark", "nature reserve"), class = "factor"),
description = structure(c(3L, 1L, 2L), .Label = c("Helstorf", "Negenborn", "Otternhagen"), class = "factor"),
textfile = structure(c(2L, 3L, 1L), .Label = c("../gauge-FER.txt", "../gauge-D00.txt", "../gauge-D01.txt"), class = "factor")),
class = "data.frame", row.names = c(NA, -3L))
head(gaugelist)
# set HTML tag for linking to local file --------------------------------------------
gaugelist$description <- paste0("<a href='", gaugelist$textfile, "'>", gaugelist$description, "</a>")
head(gaugelist)
# remove textfile column from data.frame --------------
gaugelist$textfile <- NULL
head(gaugelist)
# print HTML table and sanitize by using your own function (add subfolder) ---------------------------------------
print(xtable(gaugelist), type = "html",
sanitize.text.function = function(str) gsub("..", "./data-develop-text-file-link", str, fixed = TRUE),
file = "gauge-list.html")
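The conditional handling of NAs mentioned above could look roughly like the following sketch. The data frame `cases`, its file paths and the output file name are made up for illustration; only rows that actually have a text file get an anchor tag.

```r
library(xtable)

# Hypothetical data: only some rows have an associated text file
cases <- data.frame(
  TestCaseID  = c("TC12", "TC13", "TC16"),
  Description = c("Report data does not match", "OK",
                  "Report data does not match"),
  textfile    = c("./reports/TC12.txt", NA, "./reports/TC16.txt"),
  stringsAsFactors = FALSE
)

# Wrap the description in an anchor only where a file is given
has_file <- !is.na(cases$textfile)
cases$Description[has_file] <- sprintf(
  '<a href="%s">%s</a>',
  cases$textfile[has_file], cases$Description[has_file]
)
cases$textfile <- NULL

# identity() leaves the anchor tags unescaped in the HTML output
out_file <- file.path(tempdir(), "cases.html")
print(xtable(cases), type = "html",
      sanitize.text.function = identity,
      file = out_file)
```

Note that passing `identity` switches off escaping for every cell, so any other `<` or `&` in the data would also reach the HTML unchanged.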
Edit:
It is slightly better to reference the current directory explicitly, i.e. to write ./data-develop-text-file-link with a leading ./. I edited the gsub replacement accordingly, but it makes no practical difference.
The layout of HTML and text files sketched in my answer above is modeled on a website structure: the HTML table sits at some root node and the text files sit in a directory below it. This makes it possible to upload the files to a server later or to keep them locally on the PC.
That's why I worked with relative links, which work for me in all browsers.
Please note that absolute paths to text files seem to be a problem with Microsoft Edge and Internet Explorer. As a test, copy the link with the right mouse button and paste it into Edge's address bar; the text file will open that way. I couldn't find any problems with Firefox and Chrome when testing with e.g. C:\Users\%USERNAME%\Documents or D:\_working\, e.g.:
# print HTML table and sanitize by using your own function (add subfolder) ---------------------------------------
print(xtable(gaugelist), type = "html",
sanitize.text.function = function(str) gsub("..", "file:///C:/Users/webma/Documents/data-develop-text-file-link", str, fixed = TRUE),
file = "gauge-list.html")
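If absolute links are needed, the file:/// prefix does not have to be typed by hand. The following is only a sketch: `to_file_url` is a hypothetical helper, not part of xtable, and it assumes ordinary local paths (no special handling of network shares).

```r
# Hypothetical helper: turn a local path into a file:// URL
to_file_url <- function(path) {
  # Use forward slashes on Windows; the file does not need to exist yet
  p <- normalizePath(path, winslash = "/", mustWork = FALSE)
  # Windows paths start with a drive letter and need an extra leading slash
  if (!startsWith(p, "/")) p <- paste0("/", p)
  # Encode spaces and similar characters, keep ":" and "/" as they are
  utils::URLencode(paste0("file://", p))
}

link_dir <- to_file_url(file.path(tempdir(), "data-develop-text-file-link"))
```

The result could then be used as the replacement string in the gsub call above in place of the hard-coded path.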

Related

python-docx (version 0.8.11) Inline Picture

I am using Windows 10 and python-docx (version 0.8.11).
1) How do I use add_picture and apply the text wrap format "square"?
2) Can we insert a floating picture?
Tried this...
from docx import Document
document = Document()
# Add a picture to the document
picture = document.add_picture('picture1.png')
# Set the text wrap type to 'square'
picture.text_wrap = True
# Save the document
document.save('document06Jan.docx')
No errors, but the picture's wrap format is still "In Line with Text".

R - combine image and table then export as PDF

I have four goals:
1. Connect to a PostgreSQL database and pull some data
2. Gloss up a table with some colour and formatting
3. Include an image (company logo) above it
4. Export as PDF
1 and 2 are easy enough and 4 seems possible even if not convenient, but I don't think R was designed to add and position images. I've attached some sample code of how I envision creating the table, and then a mockup of what I think the final version might look like. Can anyone advise on the best way to accomplish this?
Sample data:
data(mtcars)
df <- head(mtcars)
HTML approach: flexible and portable to other apps
library(tableHTML)
html_table <- df %>%
tableHTML(rownames = FALSE, border = 0) %>%
add_css_row(css = list(c('font-family', 'text-align'), c('sans-serif', 'center'))) %>%
add_css_header(css = list(c('background-color', 'color'), c('#173ACC', 'white')), headers = 1:ncol(df))
Grob approach: Creating a ggplot-like image. I've seen recommendations to use grid.arrange to place an image on top and export as a PDF
library(ggpubr)
tbody.style = tbody_style(color = "black",
fill = "white", hjust=1, x=0.9)
grob_table <- ggtexttable(df, rows = NULL,
theme = ttheme(
colnames.style = colnames_style(color = "white", fill = "#173ACC"),
tbody.style = tbody.style
)
)
gridExtra::grid.arrange(grob_table)
You are almost there. You just need to import your image (PNG, JPEG or SVG), then pass it to grid::rasterGrob. Use the options in rasterGrob to adjust the size etc. Then pass the logo grob and your table grob to gridExtra::grid.arrange:
logo_imported <- png::readPNG(system.file("img", "Rlogo.png", package="png"), TRUE)
lg <- grid::rasterGrob(logo_imported)
gridExtra::grid.arrange(lg, grob_table)
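You can then either render this to PDF by adding it to an rmarkdown report (probably best), or you can save directly to PDF by opening the pdf device before drawing and closing it afterwards:
pdf(file = "My Plot.pdf",
width = 4, # The width of the plot in inches
height = 4) # The height of the plot in inches
gridExtra::grid.arrange(lg, grob_table)
dev.off() # close the device so the file is written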
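As a variation on the above, the relative size of logo and table can be controlled when the layout is built. This is a sketch under assumptions: `gridExtra::tableGrob` stands in for the ggpubr table so the example is self-contained, and the height ratio and output file name are arbitrary.

```r
library(grid)
library(gridExtra)

# Sample logo shipped with the png package
logo <- png::readPNG(system.file("img", "Rlogo.png", package = "png"))

# Scale the logo to 80% of the height of its row; the width follows
# from the image's aspect ratio
lg  <- rasterGrob(logo, height = unit(0.8, "npc"))
tbl <- tableGrob(head(mtcars), rows = NULL)  # stand-in for grob_table

# arrangeGrob builds the layout without drawing it;
# heights gives the logo row 1 part and the table row 4 parts
page <- arrangeGrob(lg, tbl, ncol = 1, heights = c(1, 4))

out_file <- file.path(tempdir(), "table-with-logo.pdf")
pdf(out_file, width = 7, height = 5)  # open the device first
grid.draw(page)                       # then draw
dev.off()                             # then close, which writes the file
```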

Loop through multiple links from an Excel file, open and download the corresponding webpages

I downloaded from MediaCloud an Excel file with 1719 links to different newspaper articles. I am trying to use R to loop through each link, open it and download all the corresponding online articles in a single searchable file (HTML, CSV, TXT, PDF - doesn't matter) that I can read and analyze later.
I went through all similar questions on Stack Overflow and a number of tutorials for downloading files and managed to assemble this code (I am very new to R):
express <-read.csv("C://Users//julir//Documents//Data//express.csv")
library(curl)
for (express$url in 2:1720)
destfile <- paste0("C://Users//julir//Documents//Data//results.csv")
download.file(express$url, destfile, method = "auto", quiet = TRUE, cacheOK=TRUE)
Whenever I try to run it though I get the following error:
Error in download.file(express$url, destfile = express$url, method = "auto", : 'url' must be a length-one character vector
I tried also this alternative method suggested online:
library(httr)
url <- express$url
response <- GET(express$url)
html_document <- content(response, type = "text", encoding = "UTF-8")
But I get a similar error:
Error in parse_url(url) : length(url) == 1 is not TRUE
So I guess there is a problem with how the URLs are stored - but I can't understand how to fix it.
I am also not certain about the downloading process. Ideally I want all the text on each HTML page. It seems impractical to use selectors and rvest in this case, but I might be very wrong.
You need to loop through the URLs and read/parse each one individually. You are essentially passing a vector of URLs into one request, which is why you see that error.
I don't know your content/urls, but here's an example of how you would approach this:
library(xml2)
library(jsonlite) # rbind_pages()
library(dplyr) # %>%
library(stringi) # stri_trim_both()
df <- data.frame(page_n = 1:5, urls = sprintf('https://www.politifact.com/factchecks/list/?page=%s', 1:5))
# read each URL on its own and collect the matching links per page
result_info <- lapply(df$urls, function(i){
raw <- read_html(i)
a_tags <- raw %>% xml_find_all(".//a[contains(@href,'factchecks/2021')]")
urls <- xml2::url_absolute(xml_attr(a_tags, "href"),xml_url(raw))
titles <- xml_text(a_tags) %>% stri_trim_both()
data.frame(title = titles, links = urls)
}) %>% rbind_pages()
result_info %>% head()
title | links
Says of UW-Madison, "It cost the university $50k (your tax dollars) to remove" a rock considered by some a symbol of racism. | https://www.politifact.com/factchecks/2021/aug/14/rachel-campos-duffy/no-taxpayer-funds-were-not-used-remove-rock-deemed/
“Rand Paul’s medical license was just revoked!” | https://www.politifact.com/factchecks/2021/aug/13/facebook-posts/no-rand-pauls-medical-license-wasnt-revoked/
Every time outgoing New York Gov. Andrew Cuomo “says the firearm industry ‘is immune from lawsuits,’ it's false.” | https://www.politifact.com/factchecks/2021/aug/13/elise-stefanik/refereeing-andrew-cuomo-elise-stefanik-firearm-ind/
The United States' southern border is "basically open" and is "a super spreader event.” | https://www.politifact.com/factchecks/2021/aug/13/gary-sides/north-carolina-school-leader-repeats-false-claims-/
There is a “0.05% chance of dying from COVID.” | https://www.politifact.com/factchecks/2021/aug/13/tiktok-posts/experts-break-down-numbers-catching-or-dying-covid/
The Biden administration is “not even testing these people” being released by Border Patrol into the U.S. | https://www.politifact.com/factchecks/2021/aug/13/ken-paxton/biden-administration-not-even-testing-migrants-rel/
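Applied to the question's goal (all page text in one searchable file), the same one-URL-at-a-time idea might look like the sketch below. `fetch_text` is a hypothetical helper, and two local temporary files stand in for `express$url` so the example runs without network access; with real data you would pass the URL column instead.

```r
library(xml2)

# Hypothetical helper: return the text of one page's body, or NA on failure
fetch_text <- function(url) {
  tryCatch({
    page <- read_html(url)
    xml_text(xml_find_first(page, "//body"))
  }, error = function(e) NA_character_)
}

# Stand-in for express$url: two local pages and one entry that cannot be read
f1 <- tempfile(fileext = ".html")
f2 <- tempfile(fileext = ".html")
writeLines("<html><body><p>First article</p></body></html>", f1)
writeLines("<html><body><p>Second article</p></body></html>", f2)
urls <- c(f1, f2, file.path(tempdir(), "does-not-exist.html"))

# One request per URL; a failing URL yields NA instead of stopping the loop
articles <- data.frame(
  url  = urls,
  text = vapply(urls, fetch_text, character(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

# Everything ends up in a single CSV file
out_file <- file.path(tempdir(), "articles.csv")
write.csv(articles, out_file, row.names = FALSE)
```

Wrapping each request in tryCatch is a design choice here: with 1719 links, some are likely to be dead, and one failure should not abort the whole run.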

Edit map with "R for leaflet"

I have a script which allows me to generate a map with "R for leaflet":
library(htmlwidgets)
library(raster)
library(leaflet)
# PATHS TO INPUT / OUTPUT FILES
projectPath = "path"
#imgPath = paste(projectPath,"data/cea.tif", sep = "")
#imgPath = paste(projectPath,"data/o41078a1.tif", sep = "") # bigger than standard max size (15431804 bytes is greater than maximum 4194304 bytes)
imgPath = paste(projectPath,"/test.tif", sep = "")
outPath = paste(projectPath, "/leaflethtmlgen.html", sep="")
# load raster image file
r <- raster(imgPath)
# reproject the image, if necessary
#crs(r) <- sp::CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
# color palette, which is interpolated ?
pal <- colorNumeric(c("#FF0000", "#666666", "#FFFFFF"), values(r),
na.color = "transparent")
# create the leaflet widget
m <- leaflet() %>%
addTiles() %>%
addRasterImage(r, colors=pal, opacity = 0.9, maxBytes = 123123123) %>%
addLegend(pal = pal, values = values(r), title = "Test")
# save the generated widget to html
# contains the leaflet widget AND the image.
saveWidget(m, file = outPath, selfcontained = FALSE, libdir = 'leafletwidget_libs')
My problem is that this generates an HTML file, and I need the map to be dynamic. For example, when a user clicks an HTML button that is not integrated into the map, I want to add a rectangle to the map. Any solutions would be welcome...
Leaflet itself does not provide the interactive functionality you are looking for. One solution is to use shiny, which is a web application framework for R. From simple R code, it generates a web page, and runs R on the server-side to respond to user interaction. It is well documented, has a gallery of examples, and a tutorial to get new users started.
It works well with leaflet. One of the examples on the shiny web site uses it, and also includes a link to the source code.
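A minimal sketch of this combination, assuming a button outside the map that adds a rectangle when clicked. The input and output IDs, the coordinates and the map view are invented for illustration; the raster layer from the question is left out to keep the example short.

```r
library(shiny)
library(leaflet)

ui <- fluidPage(
  actionButton("add_rect", "Add rectangle"),  # button outside the map
  leafletOutput("map")
)

server <- function(input, output, session) {
  # Draw the base map once
  output$map <- renderLeaflet({
    leaflet() %>%
      addTiles() %>%
      setView(lng = 9.7, lat = 52.4, zoom = 10)
  })

  # leafletProxy modifies the existing map instead of redrawing it
  observeEvent(input$add_rect, {
    leafletProxy("map") %>%
      addRectangles(lng1 = 9.6, lat1 = 52.3, lng2 = 9.8, lat2 = 52.5,
                    fillOpacity = 0.2)
  })
}

# Build the app object; runApp(app) starts it
app <- shinyApp(ui, server)
```

Unlike the saved HTML widget, this needs a running R process (locally or on a Shiny server) to respond to the click.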
Update
Actually, if simply showing/hiding elements is enough, leaflet alone will suffice through the use of groups. From the question it's not very clear how dynamic you need it to be.
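The group-based variant could be sketched as follows. The group name and coordinates are assumptions; note that the toggle is the layers control inside the map, not an external HTML button.

```r
library(leaflet)

m <- leaflet() %>%
  addTiles() %>%
  # Put the rectangle into a named group
  addRectangles(lng1 = 9.6, lat1 = 52.3, lng2 = 9.8, lat2 = 52.5,
                group = "Rectangle") %>%
  # Checkbox control that shows or hides the group
  addLayersControl(overlayGroups = "Rectangle",
                   options = layersControlOptions(collapsed = FALSE)) %>%
  # Start with the rectangle hidden
  hideGroup("Rectangle")
```

The widget `m` can be written out with saveWidget as in the question, and the toggle keeps working in the saved HTML file without an R session.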

Generate an HTML report based on user interactions in shiny

I have a shiny application that allows my user to explore a dataset. The idea is that the user explores the dataset, and any interesting things the user finds he will share with his client via email. I don't know in advance how many things the user will find interesting. So, next to each table or chart I have an "add this item to the report" button, which isolates the current view and adds it to a reactiveValues list.
Now, what I want to do is the following:
Loop through all the items in the reactiveValues list,
Generate some explanatory text describing the item (This text should preferably be formatted HTML/markdown, rather than code comments)
Display the item
Capture the output of this loop as HTML
Display this HTML in Shiny as a preview
Write this HTML to a file
knitr seems to do exactly the reverse of what I want - where knitr allows me to add interactive shiny components in an otherwise static document, I want to generate HTML in shiny (maybe using knitr, I don't know) based on static values the user has created.
I've constructed a minimum not-working example below to try to indicate what I would like to do. It doesn't work, it's just for demonstration purposes.
ui = shinyUI(fluidPage(
title = "Report generator",
sidebarLayout(
sidebarPanel(textInput("numberinput","Add a number", value = 5),
actionButton("addthischart", "Add the current chart to the report")),
mainPanel(plotOutput("numberplot"),
htmlOutput("report"))
)
))
server = shinyServer(function(input, output, session){
#ensure I can plot
library(ggplot2)
#make a holder for my stored data
values = reactiveValues()
values$Report = list()
#generate the plot
myplot = reactive({
df = data.frame(x = 1:input$numberinput, y = (1:input$numberinput)^2)
p = ggplot(df, aes(x = x, y = y)) + geom_line()
return(p)
})
#display the plot
output$numberplot = renderPlot(myplot())
# when the user clicks a button, add the current plot to the report
observeEvent(input$addthischart,{
chart = isolate(myplot)
isolate(values$Report <- c(values$Report,list(chart)))
})
#make the report
myreport = eventReactive(input$addthischart,{
reporthtml = character()
if(length(values$Report)>0){
for(i in 1:length(values$Report)){
explanatorytext = tags$h3(paste(" Now please direct your attention to plot number",i,"\n"))
chart = values$Report[[i]]()
theplot = HTML(chart) # this does not work - this is the crux of my question - what should i do here?
reporthtml = c(reporthtml, explanatorytext, theplot)
# ideally, at this point, the output would be an HTML file that includes some header text, as well as a plot
# I made this example to show what I hoped would work. Clearly, it does not work. I'm asking for advice on an alternative approach.
}
}
return(reporthtml)
})
# display the report
output$report = renderUI({
myreport()
})
})
runApp(list(ui = ui, server = server))
You could capture the HTML of your page using html2canvas and then save the captured portion of the DOM as an image using this answer. This way your client can embed it in any HTML document without worrying about the origin of the page contents.
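A different, R-side route (not the html2canvas approach above) is to render each stored plot to a PNG file and embed it in the HTML as a base64 data URI. The sketch below assumes the ggplot2, htmltools and base64enc packages; `plot_to_img` is a hypothetical helper and `stored` stands in for the plots collected in values$Report.

```r
library(ggplot2)
library(htmltools)

# Hypothetical helper: render a ggplot to PNG and return an <img> tag
# with the image embedded as a base64 data URI
plot_to_img <- function(p, width = 5, height = 3) {
  f <- tempfile(fileext = ".png")
  ggsave(f, plot = p, width = width, height = height, dpi = 96)
  tags$img(src = base64enc::dataURI(file = f, mime = "image/png"))
}

# Stand-in for the plots stored in values$Report
stored <- lapply(c(3, 5), function(n) {
  df <- data.frame(x = 1:n, y = (1:n)^2)
  ggplot(df, aes(x = x, y = y)) + geom_line()
})

# One heading plus one embedded image per stored plot
report <- tagList(lapply(seq_along(stored), function(i) {
  tagList(
    tags$h3(paste("Now please direct your attention to plot number", i)),
    plot_to_img(stored[[i]])
  )
}))

# Write the report to a standalone HTML file
out_file <- file.path(tempdir(), "report.html")
save_html(report, out_file)
```

Inside the app, the same `report` object could be returned from renderUI for the preview, since Shiny renders htmltools tag lists directly.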