How to convert HTML back to markdown? - html

I am looking for a way to convert Markdown to HTML and back, more out of interest than real need. I am aware that such a conversion loses information.
I am hoping for an html2text.pl-like conversion. If there is no such utility in Perl, I would try to use this script as the base for a CPAN module.

There you go: Pandoc can convert almost anything to anything. Sorry, no Perl though.
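If you want to drive Pandoc from a script, something like the following might work. It is only a sketch: it assumes a `pandoc` executable is on your PATH, and the helper names `pandoc_command` and `html_to_markdown` are made up for illustration.

```python
import shutil
import subprocess


def pandoc_command(source_format="html", target_format="markdown"):
    """Build the pandoc argument list for a stdin-to-stdout conversion."""
    return ["pandoc", "--from", source_format, "--to", target_format]


def html_to_markdown(html_text):
    """Pipe an HTML string through pandoc and return the Markdown it prints."""
    if shutil.which("pandoc") is None:
        # Assumption: pandoc is installed separately; it is not a Python package here.
        raise RuntimeError("pandoc executable not found on PATH")
    result = subprocess.run(
        pandoc_command(),
        input=html_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Swapping the two format arguments gives the Markdown-to-HTML direction, so the same helper covers the round trip.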

HTML::WikiConverter::Markdown seems up to the task.

Related

Use plasTeX on short strings

I'm trying to find a python package which will convert short strings like
A closed-form solution of
$\textbf{R}\textbf{R}_1=\textbf{R}_2\textbf{R}$
w.r.t $\textbf{R}$
to a reasonable HTML representation, like
A closed-form solution of
<i><b>R</b><b>R</b><sub>1</sub> = <b>R</b><sub>2</sub> <b>R</b></i>
w.r.t <i><b>R</b></i>
No LaTeX packages or document layout commands (\section etc.) will be involved; just the subset of TeX allowed in Stack Exchange postings.
While MathJax does handle this beautifully, unfortunately JavaScript options are off the table, as this is for an email digest: it has to be static HTML output. Inline CSS is fine. I know there's a Node.js version of MathJax that can approximate its output in static form (with a bunch of caveats about how the result won't be browser-responsive and other things I don't care about), but I want Python.
The best option I've found seems to be plasTeX, but all the documentation there seems to be about converting whole .tex files to .html, or, for some reason, .xml files, which is much more than I want to do.
I suppose, if need be, I could generate temporary .html files and then use BeautifulSoup to parse out only the part I'm interested in, but this seems a bit silly. Since I'm talking about doing this maybe 50 times per script invocation, this would certainly be doable.
Is there a simple way to use plasTeX or any other Python package to get HTML equivalents of short LaTeX snippets?
You may find https://github.com/alvinwan/TexSoup useful. Using this library, you could replace the boldfaced parts in two lines. Although a sufficient number of regexes could do, TexSoup gives you a bit more flexibility.
from TexSoup import TexSoup
soup = TexSoup(r"$\textbf{R}\textbf{R}_1=\textbf{R}_2\textbf{R}$")
for b in soup.find_all('textbf'):
    b.replace("<b>{args[0]}</b>".format(args=b.args))
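For comparison, here is a rough sketch of the regex route mentioned above, using only the standard library. It assumes the input is limited to \textbf{...}, simple subscripts, and $...$ math spans with no nested braces; the function name `tex_snippet_to_html` is made up for illustration, and text outside the math spans is passed through untouched.

```python
import html
import re

TEXTBF = re.compile(r"\\textbf\{([^{}]*)\}")
SUBSCRIPT = re.compile(r"_\{([^{}]*)\}|_(\w)")
MATH = re.compile(r"\$([^$]+)\$")


def _subscript(match):
    # Either the braced form _{...} or the single-character form _x matched.
    content = match.group(1) if match.group(1) is not None else match.group(2)
    return "<sub>" + content + "</sub>"


def _convert_math(match):
    # Escape <, > and & first so literal comparisons survive as HTML entities.
    body = html.escape(match.group(1), quote=False)
    body = TEXTBF.sub(r"<b>\1</b>", body)
    body = SUBSCRIPT.sub(_subscript, body)
    return "<i>" + body + "</i>"


def tex_snippet_to_html(text):
    """Rewrite each $...$ span as italic HTML with bold and subscript tags."""
    return MATH.sub(_convert_math, text)


print(tex_snippet_to_html(r"w.r.t $\textbf{R}$"))
```

Anything beyond this subset (fractions, Greek letters, nested groups) would need more rules or a real parser, which is where TexSoup or plasTeX earns its keep.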

How can I convert an XML file to an HTML file without XSLT?

My question is: how can I convert an XML file into HTML using Java code, without using XSLT, so that the HTML displays the same content in a similar layout?
Please help, I am very frustrated.
It isn't quite Java, but I have found Groovy's XmlSlurper and MarkupBuilder to be quite powerful for this purpose. The syntax is close enough to Java that there is no real learning curve.
See here and here.
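The underlying idea (parse the XML into a tree, walk it, and emit markup yourself) is not specific to Groovy. Here is a minimal sketch of that walk, written in Python rather than Java and using only the standard library. The choice of nested `<ul>`/`<li>` output and the name `xml_to_html` are assumptions for illustration; attributes and tail text are ignored.

```python
import xml.etree.ElementTree as ET
from html import escape


def xml_to_html(xml_text):
    """Render an XML document as a nested HTML list, one <li> per element."""
    root = ET.fromstring(xml_text)

    def render(elem):
        parts = ["<li><b>%s</b>" % escape(elem.tag)]
        text = (elem.text or "").strip()
        if text:
            parts.append(": " + escape(text))
        children = list(elem)
        if children:
            # Recurse so the HTML nesting mirrors the XML nesting.
            parts.append("<ul>")
            parts.extend(render(child) for child in children)
            parts.append("</ul>")
        parts.append("</li>")
        return "".join(parts)

    return "<ul>" + render(root) + "</ul>"
```

The same structure carries over to a Java DOM parser: a recursive method over the child nodes that appends tags to a StringBuilder.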

Tool to remove leading/trailing spaces in HTML files?

I have searched but could not find anything similar to what I need. I am looking for a tool that can remove leading/trailing spaces in my HTML files, which also have embedded JavaScript. In the end, I plan to use this tool within my NAnt scripts to perform this task on the fly with every deployment.
Is there already a tool that can do this, or maybe the best scripting language?
Basically, I would like what MS Word does for text using "justify (Ctrl+J)" to be done for my HTML files.
Here is the solution I found for this.
Using the htmlcompressor command-line tool, I was able to remove only the leading spaces of the HTML file, whereas fully minifying it didn't work.
Solution:
java -jar htmlcompressor.jar --preserve-comments --preserve-multi-spaces --preserve-line-breaks --output D:\html\foo-leading_spaces.htm D:\html\foo.htm
Since this tool generates my desired results, I can call it from my build scripts to perform this process on the fly.
Thanks everyone for your input; I hope this helps others in a similar situation.
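If a scripting language is acceptable instead of the Java tool, the per-line trimming itself is short. This is only a sketch, not a replacement for htmlcompressor: the function name `strip_line_whitespace` is made up, and it trims every line blindly, so it would also alter whitespace inside `<pre>` blocks, `<textarea>` content, and multi-line JavaScript strings.

```python
def strip_line_whitespace(text):
    """Remove leading and trailing whitespace from every line, keeping line breaks."""
    stripped = "\n".join(line.strip() for line in text.splitlines())
    # Preserve a final newline if the input had one.
    if text.endswith("\n"):
        stripped += "\n"
    return stripped


def strip_file(source_path, target_path):
    """Read source_path, trim each line, and write the result to target_path."""
    with open(source_path, encoding="utf-8") as src:
        cleaned = strip_line_whitespace(src.read())
    with open(target_path, "w", encoding="utf-8") as dst:
        dst.write(cleaned)
```

A build script could call `strip_file` once per HTML file before deployment.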

Create object hierarchy from Make output?

make -d and make -p provide useful information, but I need this in JSON format, so I can enumerate what libraries came from which source files, recursively. Is there a way to do this already (approximately close, anyhow)? Or is there a custom tool available? I've scoured the Intarwebs, and my search has come up dry. Thank you for any help!
Note: I'm looking for something that's similar to sysconfig.parse_makefile. In fact, what that does is pretty close to what I'm looking for, except that it's only useful for the implicit Makefile that is used to build Python. Any pointers?
It's not JSON, but the Perl CPAN module Makefile::GraphViz creates visualizations of the dependency graph from a makefile. If JSON is really what you want, you could probably capture the 'dot' dependency file that is generated and convert it to JSON fairly easily.
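If you would rather go straight from rule text to JSON, a rough sketch follows. It assumes input made of plain `target: prerequisites` lines and deliberately skips everything else (variable assignments, comments, recipe lines); real `make -p` output contains more than this handles, such as pattern rules and special targets. The names `parse_rules` and `leaf_sources` are made up for illustration.

```python
import json


def parse_rules(text):
    """Collect {target: [prerequisites]} from simple 'target: deps' lines."""
    graph = {}
    for line in text.splitlines():
        # Skip blanks, recipe lines (tab), comments, and variable assignments.
        if not line.strip() or line[0] in "\t#" or "=" in line or ":" not in line:
            continue
        targets, _, prereqs = line.partition(":")
        # lstrip handles double-colon rules; '|' marks order-only prerequisites.
        deps = [p for p in prereqs.lstrip(":").split() if p != "|"]
        for target in targets.split():
            graph.setdefault(target, []).extend(deps)
    return graph


def leaf_sources(graph, target, seen=None):
    """Recursively find the files with no rule of their own that target depends on."""
    seen = set() if seen is None else seen
    if target in seen:
        return set()
    seen.add(target)
    deps = graph.get(target)
    if not deps:
        return {target}
    found = set()
    for dep in deps:
        found |= leaf_sources(graph, dep, seen)
    return found


rules = "app: main.o util.o\n\tcc -o app main.o util.o\nmain.o: main.c\nutil.o: util.c util.h\n"
print(json.dumps(parse_rules(rules), indent=2))
```

`leaf_sources(graph, "app")` then answers the "which source files went into this library" question for one target.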

Which technology should I use to transform my LaTeX documents into HTML documents?

I want to write a little program that transforms my TeX files into HTML. I want to parse the documents and turn the macros (the built-in ones and of course my own) into HTML pieces. Here are my requirements:
predefined rules (e.g. \begin{itemize} \item text \end{itemize} => <br> <p>text </p> <br/>)
defining my own CSS style
ability to convert formulas (extract the formulas, load them into an image creator and then save the jpg/png)
easy to maintain and concise
I know there are several technologies out there, but I don't know which is best for me. Here are the technologies that come to mind:
Ruby (I/O is easy, formula loading via webrat),
XML XSLT (I don't think I need the extra overhead)
Perl (there are many libs out there but I'm not very familiar with it)
bash (I worked with sed and was surprised how easy it was to work with regular expressions)
latex2html ... (these converters won't work for me and they don't give me freedom in parsing)
Any suggestions, hints and comments are welcome.
Thanks for your time, folks.
Have a look at pandoc here. It can also be installed on Linux or OS X, though it won't do your custom macros. The only thing I've seen that can do a decent job with custom macros is tex4ht, but to really work well you need to be producing .DVI files. If you have a ton of custom macros, writing your own converter is going to take an ass load of time. Even if you only have a few custom macros, it's still going to be a pain. Good luck!
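To give a feel for what "writing your own converter" looks like at its very simplest, here is a sketch of a rule table for single-argument macros plus a flat itemize rule. Everything in it is an assumption for illustration: the `RULES` entries, the `\keyword` macro standing in for one of your own, and the choice of `<ul>`/`<li>` output. It does not handle nested environments, optional arguments, or formulas.

```python
import re

# Macro name -> HTML template; extend this with your own macros and CSS classes.
RULES = {
    "textbf": "<b>{0}</b>",
    "emph": "<em>{0}</em>",
    "keyword": '<span class="keyword">{0}</span>',  # hypothetical user macro
}

MACRO = re.compile(r"\\(\w+)\{([^{}]*)\}")
ITEMIZE = re.compile(r"\\begin\{itemize\}(.*?)\\end\{itemize\}", re.S)


def expand_macros(tex):
    """Apply RULES repeatedly so nested macros are expanded from the inside out."""
    def repl(match):
        template = RULES.get(match.group(1))
        # Unknown macros are left untouched rather than guessed at.
        return template.format(match.group(2)) if template else match.group(0)

    previous = None
    while previous != tex:
        previous, tex = tex, MACRO.sub(repl, tex)
    return tex


def convert_itemize(tex):
    """Turn each non-nested itemize environment into an HTML list."""
    def repl(match):
        items = [item.strip() for item in match.group(1).split(r"\item") if item.strip()]
        return "<ul>" + "".join("<li>%s</li>" % item for item in items) + "</ul>"

    return ITEMIZE.sub(repl, tex)


def tex_to_html(tex):
    return expand_macros(convert_itemize(tex))
```

This stays manageable for a few macros; the regexes stop being enough as soon as arguments can contain braces or environments can nest, which is the point where the warning above starts to bite.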
Six: TeX
Seven: Haskell
(I gave up trying to persuade SO to start numbering my list from 6).