Generate single-file HTML code documentation

How can I use Doxygen to create the HTML documentation as a single, very long file? I want something like the RTF output, but as HTML.
The reason: I need my API published as a single, printable document; something that can be loaded into Word, converted to PDF, etc.

I think you can use HTMLDOC to convert the generated HTML files into a single HTML file. (I did not try it myself.)
The manual includes the following example for generating one HTML file from two source HTML files:
htmldoc --book -f output.html file1.html file2.html
There is also a GUI.

I don't think there's an option that produces the output as a single HTML file, but the RTF output may be suitable if you need an editable output format (I haven't tried this myself, so I don't know how well it works).
If you want good quality printable output, then Doxygen can output LaTeX format (set GENERATE_LATEX to YES in your doxygen configuration file). This can then be converted to PDF, although you'll need to install a LaTeX distribution such as MiKTeX.
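As a sketch, the relevant settings in the Doxygen configuration file might look like this (these are standard Doxyfile options; adjust the rest of the file to your project):

```
# Doxyfile fragment: enable printable output formats
GENERATE_LATEX = YES   # LaTeX sources, for PDF via pdflatex
USE_PDFLATEX   = YES
PDF_HYPERLINKS = YES
GENERATE_RTF   = YES   # single editable file, loadable in Word
```

After running doxygen, the latex/ output directory contains a Makefile; running make there produces refman.pdf (assuming a LaTeX distribution such as MiKTeX is installed).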

Related

How to convert Html to PostScript using GhostScript?

I want to convert HTML to PDF. I found that direct conversion is not possible, so I chose the route HTML --> PS --> PDF. I have successfully converted PS to PDF, but I can't convert HTML to PS through Ghostscript. Can anyone please tell me whether it is actually possible, and if so, how?
Platform: Windows
Note: no third-party/paid/DLL tools, please.
Ghostscript does not interpret HTML, so no, you cannot take HTML and create a PostScript (or PDF) file with Ghostscript directly. Since you can't convert HTML to PDF using Ghostscript, why would you expect to be able to convert HTML to PostScript with it?
You can (in general) print from a browser window to a virtual PostScript printer in order to create PostScript.
If that isn't acceptable then you will need to use some other means to create PostScript from HTML, in which case you may as well just go straight to PDF. wkhtmltopdf is open source, you could try that.

Convert Notepad++ user defined language to html

I have created a user-defined language using Notepad++. Now I want to convert files in it into HTML format. There are many free tools available that can convert, let's say, an RTF file to HTML, but how can I convert this user-defined-language file into HTML?
Notepad++ lets me save such a file with any extension, but the formatted text can only be viewed in Notepad++.
For example, if I save the file as draft.rtf and open it in Notepad++, the formatting (the highlighted text of the user-defined language) stays, but when I open the same file in Word the formatting is lost!
A simple copy-paste from Notepad++ to Word also loses the user-defined-language highlighting. Kindly help.
Check the NppExport plugin; it will export your file with the syntax highlighting as HTML (I think), and then you can use that somewhere else.
http://www.addictivetips.com/windows-tips/nppexport-for-notepad-export-highlighted-code-in-html-rtf-format/

Generate HTML reports containing info stored in pdf files

I would like to generate an HTML report containing some outputs (graphs, statistics) from R. The graphs are saved in PDF files.
My option: a Perl script that will generate the HTML report (converting the PDFs into JPEGs).
What other options would be ideal in this case?
I am working in UNIX environment.
If you are familiar with R, you can probably look at the knitr package. R2HTML is based on Sweave, which is not very extensible, whereas knitr is fully extensible and supports HTML naturally; see a minimal example with source.
You have many choices on how to save R graphics (pdf, png, jpeg, ...); see the dev option (graphical device). So there is no need for conversion from PDF to other bitmap formats on R's side.
You said you had Ruby and C output as well; I'm not sure how you plan to deal with them: do you want to generate the output dynamically (literate programming) or insert it manually? For the former, you can probably use the R function system() to run external programs (e.g., call C to generate a PDF and call Perl to convert it). You could also define knitr hooks to do these jobs, but you may need more time to learn how hooks work in knitr.
Another approach is to convert your PDF output with a batch job and then modify the HTML, e.g. replace <img src='foobar.pdf' /> with <img src='foobar.jpeg' /> after you have converted all the PDF files to JPEG. This should be easier.
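A minimal Python sketch of that batch rewrite step (the file names are hypothetical; only the `src` attributes of `<img>` tags are touched, so ordinary links to PDFs are left alone):

```python
import re

def pdf_imgs_to_jpeg(html: str) -> str:
    """Point <img> tags at the converted .jpeg files instead of the .pdf originals."""
    # Match ".pdf" only when it appears inside an <img ... src="..."> attribute.
    return re.sub(r"""(<img[^>]*\bsrc=['"][^'"]*)\.pdf""", r"\1.jpeg", html)

html = "<p><img src='foobar.pdf' /></p>"
print(pdf_imgs_to_jpeg(html))  # <p><img src='foobar.jpeg' /></p>
```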
If you have control over how the graphs and statistics are created in R, your easiest approach would probably be to use the R2HTML package to generate the HTML directly. This would include a Sweave-like approach that substitutes R output in appropriate places in an HTML template.
Also, R can create JPEG (or GIF) files for graphs just as easily as it creates PDFs, so that conversion step can be avoided entirely.

latex input to html and pdf output

Is there a Ruby gem that can parse a LaTeX-formatted string into an HTML string and a binary PDF string, including a BibTeX bibliography?
I'm using Textile (RedCloth) right now in my Rails app to get formatted HTML, but I'd like to use LaTeX instead. I would also like to use a *.bib file for references. And with LaTeX it should also be easy to build a PDF file, to provide a PDF version of the same article (nice to have)...
I could also do it with a system call and e.g. TeX Live, but then I'd have to save the user input to a file, manage those files, and put the result back into the database, and all of that would take some time. I don't like this approach...
Is there a nice way to do it?
You could try runtex, although I do not know whether it will do exactly what you want; I have not tested it.

HTML downloading and text extraction

What would be a good tool, or set of tools, to download a list of URLs and extract only the text content?
Spidering is not required, but control over the download file names, and threading would be a bonus.
The platform is Linux.
wget -O - <url> | html2ascii
(wget writes to a file by default; -O - sends the page to stdout so it can be piped.)
Note: html2ascii may also be installed as html2a or html2text (I wasn't able to find a proper man page for it online).
See also: lynx.
Python's Beautiful Soup allows you to build a nice extractor.
I know that w3m can be used to render an HTML document and dump the text content to a text file, for example:
w3m -dump www.google.com > file.txt
For the remainder, I'm sure that wget can be used.
Look for the Simple HTML DOM Parser for PHP on SourceForge. Use it to parse HTML that you have downloaded with cURL. Each DOM element has a "plaintext" attribute, which should give you only the text. I used this combination very successfully in a lot of applications for quite some time.
Perl (Practical Extraction and Report Language) is a scripting language that is excellent for this type of work. http://search.cpan.org/ contains a lot of modules with the required functionality.
Use wget to download the required HTML and then run html2text on the output files.
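The whole pipeline (download a URL list with controlled file names, threaded, text-only output) can also be sketched in standard-library Python; this substitutes `html.parser` for html2text/w3m, and the URL list at the bottom is hypothetical:

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text content, skipping <script> and <style> blocks."""
    def __init__(self):
        super().__init__()
        self.parts, self.skip = [], 0
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1
    def handle_data(self, data):
        if not self.skip and data.strip():
            self.parts.append(data.strip())

def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.parts)

def fetch_text(url: str, filename: str) -> None:
    # Control over the output file name, as asked for in the question.
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    with open(filename, "w", encoding="utf-8") as f:
        f.write(extract_text(html))

# Threaded download of a (hypothetical) URL list:
# urls = {"page1.txt": "http://example.com/a.html",
#         "page2.txt": "http://example.com/b.html"}
# with ThreadPoolExecutor(max_workers=4) as ex:
#     for name, url in urls.items():
#         ex.submit(fetch_text, url, name)
```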