Converting LDAvis output to tiff - json

I'm working on topic modeling and I've recently discovered the excellent library 'LDAvis'. Unfortunately, the visual output of the library is in JSON, and I do not know how to convert it to TIFF, which is the format most academic journals require for figures.
What I need is a way to convert the output of LDAvis to TIFF.
Here is an example I've derived from the 'text2vec' package, taken from https://github.com/dselivanov/text2vec/issues/235:
library(text2vec)
library(magrittr)
data("movie_review")
tokens = movie_review$review[1:4000] %>%
  tolower %>%
  word_tokenizer
it = itoken(tokens, ids = movie_review$id[1:4000], progressbar = FALSE)
v = create_vocabulary(it) %>%
  prune_vocabulary(term_count_min = 10, doc_proportion_max = 0.2)
vectorizer = vocab_vectorizer(v)
dtm = create_dtm(it, vectorizer, type = "dgTMatrix")
lda_model = LDA$new(n_topics = 10, doc_topic_prior = 0.1, topic_word_prior = 0.01)
doc_topic_distr =
  lda_model$fit_transform(x = dtm, n_iter = 1000,
                          convergence_tol = 0.001, n_check_convergence = 25,
                          progressbar = FALSE)
# this plots LDAvis in current session
lda_model$plot()
What I want to accomplish are graphs like this:
and this:
Unfortunately, this is the best I can do, using a converter of sorts: I "print" the page from the browser, which automatically opens a PDF converter.
Thank you in advance.

Deleting commas in R Markdown html output

I am using R Markdown to create an html file for regression results tables, which are produced by stargazer and lfe in a code chunk.
library(lfe); library(stargazer)
data <- data.frame(x = 1:10, y = rnorm(10), z = rnorm(10))
result <- stargazer(felm(y ~ x + z, data = data), type = 'html')
I create an HTML file with the inline code r result after the chunk above. However, a bunch of commas appear at the top of the table.
When I check the html code, I see almost every </tr> is followed by a comma.
How can I delete these commas?
Maybe not what you are looking for exactly but I am a huge fan of modelsummary. I knit to HTML to see how it looks and then usually knit to pdf. The modelsummary equivalent would look something like this
library(lfe)
library(modelsummary)
data = data.frame(x = 1:10, y = rnorm(10), z = rnorm(10))
results = felm(y ~ x + z, data = data)
modelsummary(results)
There are a lot of ways to customize it through kableExtra and other packages. The documentation is really good. Here is kind of a silly example
library(kableExtra)
modelsummary(results,
             coef_map = c("x" = "Cool Treatment",
                          "z" = "Confounder",
                          "(Intercept)" = "(Intercept)")) %>%
  row_spec(1, background = "#F5ABEA")

Meta-model of the field function OpenTurns 1.16rc1

After updating OpenTURNS from 1.15 to 1.16rc1, I have an issue building the meta-model of a field function. I was using the following setting to reduce the computational burden:
ot.ResourceMap.SetAsUnsignedInteger("FittingTest-KolmogorovSamplingSize", 1)
algo = ot.FunctionalChaosAlgorithm(sample_X, outputSampleChaos)
algo.run()
metaModel = ot.PointToFieldConnection(postProcessing, algo.getResult().getMetaModel())
The "FittingTest-KolmogorovSamplingSize" key was removed in OpenTURNS 1.16rc1, and when I try to replace it with:
ot.ResourceMap.SetAsUnsignedInteger("FittingTest-LillieforsMaximumSamplingSize", 10)
Or with
ot.ResourceMap.SetAsUnsignedInteger("FittingTest-LillieforsMinimumSamplingSize", 1)
the code freezes. Is there any solution for this?
The proposed solution is simply to use another distribution to model your data. You could have used any other multivariate continuous distribution of proper dimension. IMO it is not a valid answer as the distribution has no link to your data.
After inspection, it appears that the problem has nothing to do with Lilliefors's test. In OT 1.15 we were using this test (under the wrong name of Kolmogorov) to automatically select a distribution suited to the input sample, but we switched to a more sophisticated selection algorithm (see MetaModelAlgorithm::BuildDistribution). It is based on a first pass using the raw Kolmogorov test (thus ignoring the fact that parameters have been estimated); then an information-based criterion is used to select the most relevant model (AIC, AICC or BIC, depending on the value of the "MetaModelAlgorithm-ModelSelectionCriterion" key in ResourceMap). The problem is caused by the TrapezoidalFactory class during the Kolmogorov phase. I will provide a fix ASAP in OpenTURNS master. In the meantime, I have modified the proposed solution into something better suited to your data:
degree = 6
dimension_xi_X = 3
dimension_xi_Y = 450
enumerateFunction = ot.HyperbolicAnisotropicEnumerateFunction(dimension_xi_X, 0.8)
basis = ot.OrthogonalProductPolynomialFactory(
    [ot.StandardDistributionPolynomialFactory(ot.HistogramFactory().build(sample_X[:, i]))
     for i in range(dimension_xi_X)],
    enumerateFunction)
basisSize = enumerateFunction.getStrataCumulatedCardinal(degree)
# basis = ot.OrthogonalProductPolynomialFactory(
#     [ot.HermiteFactory()] * dimension_xi_X, enumerateFunction)
# basisSize = 450  # enumerateFunction.getStrataCumulatedCardinal(degree)
adaptive = ot.FixedStrategy(basis, basisSize)
projection = ot.LeastSquaresStrategy(
    ot.LeastSquaresMetaModelSelectionFactory(ot.LARS(), ot.CorrectedLeaveOneOut()))
ot.ResourceMap.SetAsScalar("LeastSquaresMetaModelSelection-ErrorThreshold", 1.0e-7)
algo_chaos = ot.FunctionalChaosAlgorithm(sample_X, outputSampleChaos,
                                         basis.getMeasure(), adaptive, projection)
algo_chaos.run()
result_chaos = algo_chaos.getResult()
meta_model = result_chaos.getMetaModel()
metaModel = ot.PointToFieldConnection(postProcessing,
                                      algo_chaos.getResult().getMetaModel())
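For readers unfamiliar with the information criteria named above (AIC, AICC, BIC), here is a minimal sketch of their textbook definitions in plain Python. It is not the OpenTURNS implementation, which may scale or normalize these quantities differently. The candidate names, log-likelihoods and parameter counts are hypothetical values chosen only for illustration.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def aicc(log_likelihood, n_params, n_obs):
    """AIC with the small-sample correction term."""
    if n_obs <= n_params + 1:
        raise ValueError("AICc requires n_obs > n_params + 1")
    correction = 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic(log_likelihood, n_params) + correction


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fitted candidates: (name, log-likelihood, number of parameters).
candidates = [("Normal", -120.0, 2), ("Trapezoidal", -118.5, 4)]
n_obs = 100

# The model with the lowest criterion value is preferred.
best = min(candidates, key=lambda c: bic(c[1], c[2], n_obs))
print("selected by BIC:", best[0])
```

All three criteria reward a higher likelihood and penalize extra parameters; they differ only in how strongly the penalty grows with the sample size.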
I also implemented a quick and dirty estimator of the L2-error:
# Meta-model validation
# (assumes numpy is imported as np and openturns as ot, and that mesh, metaModel,
# month_1_parameters_MSE and month_1_simulated are defined earlier)
iMax = 5
# Input values
sample_X_validation = ot.Sample(np.array(month_1_parameters_MSE.iloc[:iMax, 0:3]))
print("sample size=", sample_X_validation.getSize())
# sample_X = ot.Sample(month_1_parameters_MSE[['Rseries','Rsh','Isc']])
# Output values
# month_1_simulated.iloc[0:1].transpose()
Field = ot.Field(mesh, np.array(month_1_simulated.iloc[0:1]).transpose())
sample_Y_validation = ot.ProcessSample(1, Field)
for k in range(1, iMax):
    sample_Y_validation.add(np.array(month_1_simulated.iloc[k:k+1]).transpose())
# Reference curves in red, meta-model predictions in blue
graph = sample_Y_validation.drawMarginal(0)
graph.setColors(['red'])
drawables = graph.getDrawables()
graph2 = metaModel(sample_X_validation).drawMarginal(0)
graph2.setColors(['blue'])
drawables = graph2.getDrawables()
graph.add(graph2)
graph.setTitle('Model/Metamodel Validation')
graph.setXTitle(r'$t$')
graph.setYTitle(r'$z$')
# After the merge, drawables[0:iMax] are the reference curves and
# drawables[iMax:2*iMax] the corresponding predictions
drawables = graph.getDrawables()
L2_error = 0.0
for i in range(iMax):
    # Mean squared difference between curve i and its prediction
    # (overwritten on each pass of the loop)
    L2_error = (drawables[i].getData()[:, 1] - drawables[iMax + i].getData()[:, 1]).computeRawMoment(2)[0]
print("L2_error=", L2_error)
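The loop above assigns to L2_error on each pass rather than accumulating, so only one curve's value survives. If an error aggregated over all validation curves is wanted, a minimal sketch in plain Python could look like this. The function names and the toy curves are hypothetical, and averaging over curves is one possible choice, not necessarily what was used to obtain the figures quoted in this answer.

```python
def mean_squared_difference(reference, prediction):
    """Mean of squared pointwise differences between two curves."""
    if len(reference) != len(prediction):
        raise ValueError("curves must have the same number of points")
    total = sum((r - p) ** 2 for r, p in zip(reference, prediction))
    return total / len(reference)


def average_curve_error(reference_curves, predicted_curves):
    """Average the per-curve mean squared difference over all curves."""
    if len(reference_curves) != len(predicted_curves):
        raise ValueError("need one prediction per reference curve")
    errors = [mean_squared_difference(r, p)
              for r, p in zip(reference_curves, predicted_curves)]
    return sum(errors) / len(errors)


# Toy data standing in for the simulated curves and the meta-model output.
reference_curves = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
predicted_curves = [[0.0, 1.5, 2.0], [1.0, 2.0, 1.0]]

overall_error = average_curve_error(reference_curves, predicted_curves)
print("average mean squared difference =", overall_error)
```

The per-curve quantity mirrors what computeRawMoment(2) returns for a one-column sample of differences, namely the mean of the squared values.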
You get an error of 79.488 with the previous answer and 1.3994 with the new proposal. Here is a graphical comparison.
Comparison between test data & previous answer
Comparison between test data & new proposal
The solution is to use:
degree = 1
dimension_xi_X = 3
dimension_xi_Y = 450
enumerateFunction = ot.LinearEnumerateFunction(dimension_xi_X)
basis = ot.OrthogonalProductPolynomialFactory(
    [ot.HermiteFactory()] * dimension_xi_X, enumerateFunction)
basisSize = 450  # enumerateFunction.getStrataCumulatedCardinal(degree)
adaptive = ot.FixedStrategy(basis, basisSize)
projection = ot.LeastSquaresStrategy(
    ot.LeastSquaresMetaModelSelectionFactory(ot.LARS(), ot.CorrectedLeaveOneOut()))
ot.ResourceMap.SetAsScalar("LeastSquaresMetaModelSelection-ErrorThreshold", 1.0e-7)
algo_chaos = ot.FunctionalChaosAlgorithm(sample_X, outputSampleChaos,
                                         basis.getMeasure(), adaptive, projection)
algo_chaos.run()
result_chaos = algo_chaos.getResult()
meta_model = result_chaos.getMetaModel()
metaModel1 = ot.PointToFieldConnection(postProcessing,
                                       algo_chaos.getResult().getMetaModel())
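To get a feel for what basisSize = 450 means with three inputs, the sketch below counts the polynomials of total degree at most p in d variables, which is the binomial coefficient C(d + p, p). It assumes that a linear enumeration orders terms by total degree, so that this count is what the commented-out getStrataCumulatedCardinal(degree) call would give; that is an assumption to check against the OpenTURNS documentation. The helper names are hypothetical.

```python
import math


def total_degree_basis_size(dimension, degree):
    """Number of multivariate monomials with total degree <= degree."""
    return math.comb(dimension + degree, degree)


def smallest_degree_covering(dimension, basis_size):
    """Smallest total degree whose full basis holds at least basis_size terms."""
    degree = 0
    while total_degree_basis_size(dimension, degree) < basis_size:
        degree += 1
    return degree


for p in (1, 6):
    print("degree", p, "->", total_degree_basis_size(3, p), "terms")
print("degree needed for 450 terms:", smallest_degree_covering(3, 450))
```

Under that assumption, a fixed basis of 450 terms in dimension 3 reaches well beyond degree 1, which is worth keeping in mind when comparing it with a degree-6 basis.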

dataExplorer::create_report produces html file that is blank

I am trying to create an HTML report with DataExplorer::create_report(). The code is as follows:
DataExplorer::create_report(iris, config = list(add_plot_qq = FALSE, global_ggtheme = quote(theme_minimal(base_size = 14))))
The code creates "report.html", which is blank when I open it in any browser. I am using DataExplorer version 0.8.0.
Nick's answer is correct. I made some updates in v0.8 to simplify report customization, i.e., #87. However, I would like to use this section to provide a little more information on that. Please do not accept this as an answer.
configure_report helps you write less code when adding or removing sections and when editing themes. However, its output is no different from the list used in previous versions. If you want, you can still build your own list and pass it to create_report. The template is here:
config <- list(
  "introduce" = list(),
  "plot_intro" = list(),
  "plot_str" = list(
    "type" = "diagonal",
    "fontSize" = 35,
    "width" = 1000,
    "margin" = list("left" = 350, "right" = 250)
  ),
  "plot_missing" = list(),
  "plot_histogram" = list(),
  "plot_qq" = list(sampled_rows = 1000L),
  "plot_bar" = list(),
  "plot_correlation" = list("cor_args" = list("use" = "pairwise.complete.obs")),
  "plot_prcomp" = list(),
  "plot_boxplot" = list(),
  "plot_scatterplot" = list(sampled_rows = 1000L)
)
After that, you can just call create_report as usual:
create_report(iris, config = config)
Hope this helps!
For those searching for the answer: use config = configure_report() instead of config = list():
DataExplorer::create_report(iris,
  config = configure_report(add_plot_qq = FALSE,
                            global_ggtheme = quote(theme_minimal(base_size = 14))))

RShiny integration with google sites

I would like to be able to add interactive shiny elements into a website. My HTML skills are not up to speed to make fancy websites from scratch. Google allows you to make nice slick well functioning websites fast, using sites.google.com.
I was wondering if it is possible to add R Shiny elements into a sites.google.com site.
For example, is it possible to put
library(plotly)
trace_0 <- rnorm(100, mean = 5)
trace_1 <- rnorm(100, mean = 0)
trace_2 <- rnorm(100, mean = -5)
x <- c(1:100)
data <- data.frame(x, trace_0, trace_1, trace_2)
p <- plot_ly(data, x = ~x, y = ~trace_0, name = 'trace 0', type = 'scatter', mode = 'lines') %>%
  add_trace(y = ~trace_1, name = 'trace 1', mode = 'lines+markers') %>%
  add_trace(y = ~trace_2, name = 'trace 2', mode = 'markers')
into https://sites.google.com/view/shinytest ?
EDIT: I read that in Shiny you can build a 'raw' HTML UI instead of a ShinyUI (shiny.rstudio.com/articles/html-ui.html). Would it be possible to extract the HTML from an existing site (e.g. the sites.google site from the example), keep all its functionality, and use that as a base HTML UI to which Shiny elements can be added (thus using the server part as the back end)?

Edit map with "R for leaflet"

I have a script which allows me to generate a map with "R for leaflet":
library(htmlwidgets)
library(raster)
library(leaflet)
# PATHS TO INPUT / OUTPUT FILES
projectPath = "path"
#imgPath = paste(projectPath,"data/cea.tif", sep = "")
#imgPath = paste(projectPath,"data/o41078a1.tif", sep = "") # bigger than standard max size (15431804 bytes is greater than maximum 4194304 bytes)
imgPath = paste(projectPath,"/test.tif", sep = "")
outPath = paste(projectPath, "/leaflethtmlgen.html", sep="")
# load raster image file
r <- raster(imgPath)
# reproject the image, if necessary
#crs(r) <- sp::CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
# color palette, which is interpolated ?
pal <- colorNumeric(c("#FF0000", "#666666", "#FFFFFF"), values(r),
                    na.color = "transparent")
# create the leaflet widget
m <- leaflet() %>%
  addTiles() %>%
  addRasterImage(r, colors = pal, opacity = 0.9, maxBytes = 123123123) %>%
  addLegend(pal = pal, values = values(r), title = "Test")
# save the generated widget to html
# contains the leaflet widget AND the image.
saveWidget(m, file = outPath, selfcontained = FALSE, libdir = 'leafletwidget_libs')
My problem is that this generates an HTML file, and I need the map to be dynamic. For example, when a user clicks an HTML button that is not part of the map, I want to add a rectangle to the map. Any solutions would be welcome.
Leaflet itself does not provide the interactive functionality you are looking for. One solution is to use shiny, which is a web application framework for R. From simple R code, it generates a web page, and runs R on the server-side to respond to user interaction. It is well documented, has a gallery of examples, and a tutorial to get new users started.
It works well with leaflet. One of the examples on the shiny web site uses it, and also includes a link to the source code.
Update
Actually, if simple showing/hiding of elements is enough, leaflet alone will suffice with the use of groups. From the question it's not very clear how dynamic you need it to be.