Generate HTML reports containing info stored in pdf files - html

I would like to generate an html report containing some outputs (graphs, statistics from R). The graphs are saved in pdf files.
My option: a Perl script that will generate the HTML report (by converting the PDF files into JPEG).
What other options would be ideal in this case?
I am working in UNIX environment.

If you are familiar with R, you can probably look at the knitr package. R2HTML is based on Sweave, which is not very extensible, whereas knitr is fully extensible and supports HTML naturally; see a minimal example with source.
You have many choices on how to save R graphics (pdf, png, jpeg, ...); see the dev option (graphical device). So there is no need for conversion from PDF to other bitmap formats on R's side.
You said you had Ruby and C output as well; I'm not sure how you are going to deal with them: do you want to generate the output dynamically (literate programming) or insert it manually? For the former, you can probably use the R function system() to run external programs (e.g. call C to generate PDF and call perl to convert). You can also define knitr hooks to do these jobs, but you may need to learn more about how hooks work in knitr.
Another approach is to convert your PDF output with a batch job and then modify the HTML code, e.g. replace <img src='foobar.pdf' /> with <img src='foobar.jpeg' /> in the HTML after you have converted all PDF files to JPEG. This should be easier.
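The HTML rewriting half of that batch job can be sketched as follows. This is only an illustration in Python (the question mentions Perl, where the same regular expression idea applies); the function name point_images_at_jpeg is made up for this example, and it assumes the JPEG files have already been produced by whatever converter you use and sit next to the PDFs under the same base name.

```python
import re

# Matches the src attribute of an <img> tag when it points at a .pdf file.
# Group 1 is everything up to (not including) the extension,
# group 2 is the quote character that closes the attribute value.
IMG_PDF_SRC = re.compile(
    r"""(<img\b[^>]*?\bsrc\s*=\s*['"][^'"]*?)\.pdf(['"])""",
    re.IGNORECASE,
)


def point_images_at_jpeg(html: str) -> str:
    """Rewrite <img src='x.pdf'> to <img src='x.jpeg'>; other tags are untouched."""
    return IMG_PDF_SRC.sub(r"\1.jpeg\2", html)


if __name__ == "__main__":
    sample = "<p><img src='foobar.pdf' /> and <a href='paper.pdf'>a link</a></p>"
    print(point_images_at_jpeg(sample))
```

Only image sources are rewritten, so ordinary links to PDF documents keep working. A regular expression is enough for machine-generated HTML like this; hand-written or irregular markup would be safer with a real HTML parser.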

If you have control over how the graphs and statistics are created in R, your easiest approach would probably be to use the R2HTML package to generate the HTML directly. This would involve a Sweave-like approach that substitutes R output in the appropriate places in an HTML template.
Also, R can create jpeg files (or gif's) for graphs as easily as it creates PDF's, so that conversion step can be avoided entirely.

Related

R Writing Excel Document

My question is whether or not anybody knows of a better way to do what I'm already doing. I'm creating a report as a list, and trying to render it both in HTML and Excel.
I'm developing a shiny app that generates reports for Qualtrics surveys.
The results table is a list of HTML strings that I paste together and display in a shinydashboard. Here's a dput of the example results tables.
Here's how I'm creating the html results tables list -- the html_tabelize() function in my package. Here's a dput of the example input.
In the shiny server.R file the way I create the Excel file is with the following code:
    # Serve the HTML tables under an .xls name so that Excel opens them
    output$downloadResults <- downloadHandler(
      filename = 'tables.xls',
      content = function(file) {
        write(html_tabelize(main()[['blocks']]), file)
      }
    )
To summarize: I get the blocks, I run html_tabelize on them, and then I write the HTML output to a file called "tables.xls". When I open that file, because Excel can render HTML, it renders something like this:
My concern and problem with what I'm doing are two-fold:
If I were writing an Excel document instead of simply rendering HTML in Excel, then I could perhaps get a better formatted document. I'd like that.
When you download the results tables xls file and try to open it, you get a warning from Excel. I don't want the users of my app to see this warning, because it's distracting and could worry them about something that isn't really a concern.
I know that options exist for writing Excel files in R, but so far what I've seen indicates that their input must be either a data frame, or a list of data frames. The list I am rendering from has different types of components, like the question text, as well as data frames of results. Originally I was using pandoc, but pandoc, even when run from R, is a system binary, and it's difficult to list as a dependency (and if I can't list it as a dependency, it's tough to make sure it's installed for the users of my app). Additionally, I found out pandoc doesn't even convert to "real" Excel -- it also just saves HTML in a .xls file. Does anybody have any suggestions as to how I can improve this part of my app?

Load a CSV template and write data to it via java

I have a CSV template file, say, having 10 columns.
I would like to load this CSV file template, and then write data to the relevant cells(say only to 5 of the 10 cells) through a java program.
I went through JSAPAR, SuperCSV etc., but I am not sure whether these libraries provide exactly what I need.
Is there any framework supporting this kind of operations?
Check out FreeMarker: http://freemarker.org/
Open your text file.
Enter FreeMarker parameters for the required cells.
Your template file may look something like this:
"Templatetext1","text2","text4","${myVal4}","${myVal5}","text6","${myVal7}","${myVal8}","${myVal9}","textInCell10"
Pass in the values, and you have your CSV from the template.
If you want to fill in multiple rows, you can use other elements such as <#list>.
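The placeholder idea itself can be sketched without FreeMarker. The snippet below is not FreeMarker and not Java: it uses Python's standard string.Template, which happens to accept the same ${name} placeholder syntax, purely to illustrate the substitution step. The function name fill_csv_template and the sample values are made up for this example.

```python
from string import Template


def fill_csv_template(template_line: str, values: dict) -> str:
    """Substitute ${name} placeholders in one CSV template line.

    Double quotes inside a value are doubled, as CSV requires for quoted
    cells. A placeholder with no matching value raises KeyError.
    """
    escaped = {key: str(value).replace('"', '""') for key, value in values.items()}
    return Template(template_line).substitute(escaped)


if __name__ == "__main__":
    template = '"Templatetext1","text2","${myVal4}","${myVal5}","textInCell10"'
    print(fill_csv_template(template, {"myVal4": 42, "myVal5": 'say "hi"'}))
```

Escaping matters with any template engine used this way: the engine knows nothing about CSV, so a value containing a double quote would otherwise break the row.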
OpenCSV is generally considered the best CSV toolkit for Java. It's a very lightweight library that makes working with CSV dead simple. I would recommend looking at it since it's not among the list of things you've tried yet.

How do I download contents of an html table generated by play 1.2.7 backend on java in xls

I've generated a table using play's #{list} tag and get pretty decent results. Now I need to be able to generate and download an xls version of the table and have no idea what to do. Any pointers at all will be much appreciated
Well you have various options.
Excel will open HTML files. So instead of rendering your table as a regular HTML page, you can stream it to the browser with the content type set to XLS.
While Excel will open this, it will still be an HTML file rather than an XLS(X) document.
You can generate as CSV from your data model and stream this to the browser. Again this will be a CSV rather than a proper XLS(X) document.
There also seem to be some solutions around which can do it using JavaScript. See as a starting point: Generate excel sheet from html tables using jquery
Finally, you can use something like Apache POI or JXLS to generate a 'proper' xls(x) document and stream this to the browser. I have some code here that will export HTML to a 'proper' xlsx file if this is the route you wish to go. The workflow is then to create some HTML from your data model and convert that to Excel, rather than having to build the Excel document programmatically using POI. https://github.com/alanhay/html-exporter

How can I create PDF output from rrdcgi?

I have created an rrdcgi script to display information about system performance with graphs. Now I would like to add an option for users to create a PDF on the fly with the details on the current page (images and information) plus a header and footer. I also want the generated PDF files to be saved in some location so that they can be easily accessed next time. Is this possible with rrdcgi? Any Perl code would be really appreciated.
I need these options.
You need to consider what you want to put in the PDF: Do you want an exact replica of the web page the user is viewing (so hard as to be close to impossible without having the user's browser installed on your side and using its print output), or do you want the same information in a roughly similar layout?
An important issue is how you are generating the HTML: I did something similar once to generate PDF receipts for experiment participants (now, I just output HTML with print styles).
The HTML is generated using HTML::Template although Template.pm would be just as fine.
It is then trivial to write another template, one that generates a LaTeX document which can be processed using pdflatex. If you save the data at the time the snapshot is requested, you can add the snapshot to a queue that generates documents asynchronously so that requests do not tie up the web server.
Update: Looking at rrdcgi, I now realize that it already does use a template. That is perfect: Instead of putting HTML in the template, put LaTeX code in the template and run rrdcgi with the --filter option to create a LaTeX source file which you can run through pdflatex. I guess the problem to solve there is how to use the exact same data that was used to generate the page the user is looking at.
If it is not possible to re-run rrdcgi with the exact same data, consider adding some JavaScript that submits the HTML source of the page the user is reviewing (or some JSON representation thereof) to a CGI script that parses the HTML and outputs LaTeX. Writing clean HTML in the original template and judicious use of class and id attributes would help there.
I do not have time to test any of these ideas right now, but I will take a look again within the next couple of days.
Is it worth the effort?
Why don't you add a FAQ explaining how to set up a PDF printer on Windows/Mac/Linux and provide a 'clean' page that can then be printed?
Since you apparently have to create the PDF, take a look at this post (what-is-the-best-perl-module-to-use-for-creating-a-pdf-from-scratch) here on SO.
There is also this post, that could combine the 'clean' HTML page and a server-side print.
Regarding the LaTeX route, if you have rrdcgi generate graphs in pdf format, pdflatex will be able to integrate them directly into the document, producing super quality pdf with graphs ... very slick. Sorry, no code.

Ways to export Tables/Views from mySQL Database to printer friendly format (other than phpMyAdmin)

I've created a bunch of views in a database and I'd like to export them to PDF. However, phpMyAdmin only lets me put a title on each page, and it's very limited in how I can lay out the output.
Does anybody have recommendations for software/scripts they have used?
TCPDF is a PHP class for generating PDF documents. They have many example scripts.
There are a fantastillion ways to do this, some ideas:
export CSV, import it into your favorite spreadsheet editor, format it, and get the PDF using a PDF printer.
export XML and process it using XSL-FO to produce the output you want (hacking required, fun).
export HTML (should work?), put a CSS stylesheet optimized for print layout on top, and use a PDF printer.
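The last idea (HTML plus a print stylesheet) can be sketched roughly as below. This is only an illustration in Python: the database query is left out, the rows are assumed to have been fetched already, and the function name rows_to_printable_html, the CSS rules, and the sample data are all made up for this example.

```python
from html import escape

# Screen styling plus a print section: repeat the header row on every
# printed page and try not to split a row across two pages.
PRINT_CSS = """
table { border-collapse: collapse; width: 100%; }
th, td { border: 1px solid #999; padding: 4px 8px; text-align: left; }
@media print {
  thead { display: table-header-group; }
  tr { page-break-inside: avoid; }
}
"""


def rows_to_printable_html(title, columns, rows):
    """Build a standalone HTML page holding one table; all cell text is escaped."""
    head = "".join(f"<th>{escape(str(col))}</th>" for col in columns)
    body = "\n".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return (
        "<!DOCTYPE html>\n<html><head><meta charset='utf-8'>"
        f"<title>{escape(title)}</title><style>{PRINT_CSS}</style></head>\n"
        f"<body><h1>{escape(title)}</h1>\n"
        f"<table><thead><tr>{head}</tr></thead>\n<tbody>\n{body}\n</tbody></table>\n"
        "</body></html>\n"
    )


if __name__ == "__main__":
    # Hypothetical rows standing in for the result of SELECT * FROM some_view
    page = rows_to_printable_html("Orders", ["id", "customer"], [(1, "Smith & Co"), (2, "Lee")])
    print(page)
```

Opening the resulting file in a browser and printing to a PDF printer then gives full control over the layout through ordinary CSS.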
Usually, I write a script to pull info from a database, generate a .csv, attach it to an email and send it on its way. Most scripting languages with MySQL support can do that, and they can also go as far as generating a .pdf file with the appropriate formatting (in my case, I use Ruby, so I could have used Prawn to generate a PDF; I have just chosen not to so far).