Table way too wide to fit in Markdown generated PDF - mysql

I am trying to display a table from an SQL query in a PDF using R Markdown. However, the table I get is too wide and does not fit in the document.
I was advised to use the pander package, so I tried the pandoc.table() function, which works well in the console, but for some reason it stops my document from rendering in R Markdown.
The code looks roughly like this:
rz = dbSendQuery(mydb, "select result.id result_id, company.id company_id, (...)")
datz = fetch(rz, n=-1)
It is a very long query but, as I said, it works both on MySQL and R console (working on RStudio).
So, when I do
kable(datz, "latex", col.names = colnames(datz), caption = "This is a sample table") %>% kable_styling(latex_options = "striped") %>% column_spec(1, bold = T, color = "red")
the results that get printed are too wide to fit in the PDF.
I do not know how to solve this. I tried pandoc.table() from the pander package, but the formatting of the result is very plain compared to the options I have in kable.

You have to use the scale_down option from kableExtra. It shrinks the table to fit the page width when it is too wide, so the font size will also be reduced.
Here is an example of the code you could use:
kable(your_dt, "latex", booktabs = T) %>%
kable_styling(latex_options = c("striped", "scale_down"))


Flextable : using superscript in the dataframe

This question has been asked a few times but, surprisingly, no answer was given.
I want some numbers in my dataframe to appear in superscript.
The functions compose and display are not suitable here since I don't know yet which values in my dataframe will appear in superscript (my tables are generated automatically).
I tried ^8^ (as for kable), $$10^-3$$, paste(expression(10^2)), "H\\textsubscript{123}", etc.
Nothing works! Help! I am pulling my hair out...
library(flextable)
bab = data.frame(c( "10\\textsubscript{-3}",
paste(as.expression(10^-3)), '10%-3%', '10^-2^' ))
flextable(bab)
I am knitting from R to HTML.
In HTML, you do superscripts using things like <sup>-3</sup>, and subscripts using <sub>-3</sub>. However, if you put these into a cell in your table, you'll see the full text displayed, it won't be interpreted as HTML, because flextable escapes the angle brackets.
The kable() function has an argument escape = FALSE that can turn this off, but flextable doesn't: see https://github.com/davidgohel/flextable/issues/156. However, there's a hackish way to get around this limitation: replace the htmlEscape() function with a function that does nothing.
For example,
```{r}
library(flextable)
env <- parent.env(loadNamespace("flextable")) # The imports
unlockBinding("htmlEscape", env)
assign("htmlEscape", function(text, attribute = FALSE) text, envir=env)
lockBinding("htmlEscape", env)
bab = data.frame(x = "10<sup>-3</sup>")
flextable(bab)
```
This will display the table with the -3 rendered as a superscript.
Be careful if you do this: there may be cases in your real tables where you really do want HTML escapes, and this code will disable that for the rest of the document. If you execute this code in an R session, it will disable escaping for the rest of the session.
And if you were thinking of using a document like this in a package you submit to CRAN, forget it. You shouldn't be messing with bindings like this in code that you expect other people to use.
Edited to add:
In fact, there's a way to do this without the hack given above. It's described in this article: https://davidgohel.github.io/flextable/articles/display.html#sugar-functions-for-complex-formatting. The idea is to replace the entries that need superscripts or subscripts with calls to as_paragraph, as_sup, as_sub, etc.:
```{r}
library(flextable)
bab <- data.frame(x = "dummy")
bab <- flextable(bab)
bab <- compose(bab, part = "body", i = 1, j = 1,
value = as_paragraph("10",
as_sup("-3")))
bab
```
This is definitely safer than the method I gave.
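Since the question mentions tables that are generated automatically, the same approach can be applied in a loop. The following is only a sketch under an assumption that is not in the original question: cells that need a superscript are marked with a single caret, as in "10^-3". The data frame and column name are made up for illustration.

```r
library(flextable)

# Hypothetical input: a caret marks the part that should be superscripted
bab <- data.frame(x = c("10^-3", "plain", "10^2"), stringsAsFactors = FALSE)
ft <- flextable(bab)

for (i in seq_len(nrow(bab))) {
  # Split on the literal caret; cells without one yield a single piece
  parts <- strsplit(bab$x[i], "^", fixed = TRUE)[[1]]
  if (length(parts) == 2) {
    ft <- flextable::compose(
      ft, i = i, j = "x", part = "body",
      value = as_paragraph(parts[1], as_sup(parts[2]))
    )
  }
}

ft
```

Cells without a caret are left untouched, so no escaping has to be disabled.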

Getting data from an html page using R

Can anyone help me understand why the code below does not return any data for the selected table?
library('httr')
library('rvest')
url= read_html("http://projects.worldbank.org/search?lang=en&searchTerm=&sectorcode_exact=AB")
table = html_node(url,"table#f05v5-sorting-table.border-top2.border-allside.clearboth")
Thanks!
You are missing some steps. Your workflow should look like this:
dat_html <- read_html(
"http://projects.worldbank.org/search?lang=en&searchTerm=&sectorcode_exact=AB"
)
dat_nodes <- html_nodes(dat_html, xpath = "xxxx")
dat <- html_table(dat_nodes)
dat will be a list with one data frame per matched table, so if you want a single data frame, you could take the first element:
dat_df <- dat[[1]]
Or, if you like tibbles:
dat_tbl <- tibble::as_tibble(dat[[1]])
I cannot find the table you are interested in on that webpage, so you have to replace "xxxx" by the xpath of the table you are interested in.
To find the XPath, if you are inspecting the page in Chrome or Chromium, you can right-click on the node in the inspector window and choose Copy, then Copy XPath.
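To see the workflow end to end without depending on the live page, here is a minimal sketch that parses a small inline HTML string. The markup, the table id "projects", and the values are all invented for illustration; on a real page you would use the XPath copied from the inspector.

```r
library(rvest)

# Hypothetical stand-in for a downloaded page
page <- read_html(
  '<html><body>
     <table id="projects">
       <tr><th>name</th><th>year</th></tr>
       <tr><td>alpha</td><td>2001</td></tr>
       <tr><td>beta</td><td>2002</td></tr>
     </table>
   </body></html>'
)

# Select the table by XPath, then convert it to a data frame
tbl_node <- html_node(page, xpath = '//table[@id="projects"]')
projects <- html_table(tbl_node)
projects
```

If the selection comes back empty on a real site, it is worth checking whether the table is present in the raw HTML at all, since read_html() does not run JavaScript.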

Accessing html Tables with rvest

So I am wanting to scrape some NBA data. The following is what I have so far, and it is perfectly functional:
install.packages('rvest')
library(rvest)
url = "https://www.basketball-reference.com/boxscores/201710180BOS.html"
webpage = read_html(url)
table = html_nodes(webpage, 'table')
data = html_table(table)
away = data[[1]]
home = data[[3]]
colnames(away) = away[1,] #set appropriate column names
colnames(home) = home[1,]
away = away[away$MP != "MP",] #remove rows that are just column names
home = home[home$MP != "MP",]
The problem is that these tables don't include the team names, which are important. To get this information, I was thinking I would scrape the four factors table on the webpage; however, rvest doesn't seem to recognize it as a table. The div that contains the four factors table is:
<div class="overthrow table_container" id="div_four_factors">
And the table is:
<table class="suppress_all sortable stats_table now_sortable" id="four_factors" data-cols-to-freeze="1"><thead><tr class="over_header thead">
This made me think that I could access the table via something along the lines of
table = html_nodes(webpage,'#div_four_factors')
but this doesn't seem to work, as I am getting just an empty list. How can I access the four factors table?
I am by no means an HTML expert, but it appears that the table you are interested in is commented out in the source code, and the comment is removed at some point before the page is rendered.
If we assume that the Home team is always listed second, we can just use positional arguments and scrape another table on the page:
table = html_nodes(webpage,'#bottom_nav_container')
teams <- html_text(table[1]) %>%
stringr::str_split("Schedule\n")
away$team <- trimws(teams[[1]][1])
home$team <- trimws(teams[[1]][2])
Obviously not the cleanest solution, but such is life in the world of web scraping.
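If the table really is inside an HTML comment, another possible route is to pull out the comment text and parse it as HTML again. This is only a sketch on a toy page: the markup and values below are invented, and whether it works on the real site depends on the page still being structured this way.

```r
library(rvest)

# Toy page (hypothetical markup): the table sits inside an HTML comment
page <- read_html(
  '<html><body><div id="all_four_factors"><!--
     <table id="four_factors">
       <tr><th>Team</th><th>Value</th></tr>
       <tr><td>AWAY</td><td>1</td></tr>
       <tr><td>HOME</td><td>2</td></tr>
     </table>
   --></div></body></html>'
)

# Collect the text of every comment node and re-parse it as HTML
comments <- html_text(html_nodes(page, xpath = "//comment()"))
hidden <- read_html(paste(comments, collapse = ""))

# The commented-out table is now an ordinary node
four_factors <- html_table(html_node(hidden, "#four_factors"))
four_factors
```

On the real page you would replace the inline string with the webpage object from the question.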

How to put entire datatable onto html report? (or at least left align)

I have several wide tables that should fit onto an html report, but I don't know how to do it.
Consider the following example. It's rather silly I know, because I could chop the digits off, but many of my tables have string columns that are about this long and cannot be chopped:
---
title: "DT Fitting"
output: html_document
---
```{r testTable, fig.align = 'left', fig.width = 6}
DT::datatable(datasets::euro.cross)
```
It renders an html report that looks like this:
Notice that I've tried using fig.align and fig.width to align or shrink the table, but they don't seem to work. Does anyone know how to put this single table onto the page so that it is completely visible?
It looks like a previous SO post captures this.
This allows you to set the width via options(width = some number). This doesn't seem to be the ideal solution if you have multiple wide tables.
Another option is to fix some columns when setting up the datatable and enable horizontal scrolling. Check out Section 5. It lets you fix the first and last columns and scroll the columns in between.
Based on the Datatables link:
```{r}
DT::datatable(datasets::euro.cross,
extensions = 'FixedColumns',
options = list(
dom = 't',
scrollX = TRUE,
fixedColumns = list(leftColumns = 2, rightColumns = 1))
)
```

Adjust size of leaflet map in rmarkdown html

I'd like to change the height and width of leaflet map outputs in an HTML document. Is there a simple way to do this in R Markdown without getting into the whole CSS business?
```{r}
library(leaflet)
library(dplyr)
m <- leaflet() %>% setView(lng = -71.0589, lat = 42.3601, zoom = 12)
m %>% addTiles()
```
Ideally, I want the map to be the same width as the code block.
I found that changing fig.width and fig.height did not change the size of my leaflet maps. However, changing a different parameter/option did.
Try changing the width/height using this in the header for the code chunk:
{r, width = 40, height = 30}
Alternatively, another thing that has worked for me is this (in this case, do not put anything in the chunk header):
m <- leaflet(width = "100%") %>% addTiles()
This works fine:
SomeData %>%
leaflet(height=2000, width=2000) %>%
addTiles() %>%
addMarkers(clusterOptions = markerClusterOptions())
Here SomeData is a dataframe containing columns: Lat and Long.
You can set a global figure size for the whole document, but I think this rescales the figures produced by your code chunks, not included images.
library(knitr)
opts_chunk$set(fig.width=12, fig.height=8)
Actually, I didn't check it with leaflet. I hope this code still works.
To elaborate on the other answers:
My understanding is that fig.width and fig.height don't work for leaflet (or plotly) because they are html objects, not 'true figures' as far as knitr is concerned. leaflet (and plotly) do however respond to out.width, which is why it works.
In case anyone ever has a similar situation to me:
I have a code chunk which conditionally includes either a leaflet map (for HTML output) or four saved ggplot .png images using knitr::include_graphics(c(p1, p2, p3, p4)) (for PDF output). In order to tile the four ggplot images in a 2x2 grid, I had to set out.width = '50%', but this also reduced the width of the leaflet output.
The solution was to include leaflet(width = '100%') in the code, and out.width = '50%' in the chunk header. leaflet(width = '100%') seems to override out.width = '50%', giving me either a full sized leaflet in html outputs, or 50% width ggplot figures.
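As a minimal sketch of the widget-level sizing discussed in these answers, the map below sets its own width and height in the leaflet() call. The 400 pixel height is an arbitrary value chosen for illustration; the coordinates are the ones from the question.

```r
library(leaflet)

# Size is set on the widget itself, so fig.width / fig.height are not involved
m <- leaflet(width = "100%", height = 400) %>%
  addTiles() %>%
  setView(lng = -71.0589, lat = 42.3601, zoom = 12)

m
```

In an R Markdown document this would go inside an ordinary chunk; as described above, a chunk-level out.width can stay in the header for the other figures.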