In .md files, does HTML h1 and markdown # work the same? - html

I am learning how to make README.md files stylish and easy to read. I am also trying to learn good coding, GitHub, and repository structuring practices.
I found that I could style README.mds using HTML. However, I am a bit confused about how HTML interacts with .md files.
For example, does <h1>Project Title</h1> work the same as #Project Title in a README.md?
Additionally, is it considered bad practice to use HTML to format a README.md? I noticed a lot of my friends repositories do not use HTML.
Thank you for your time and assistance.
Both <h1></h1> and # seem to work the same in README.md. However, I am worried I am not following standard practice or am making my README.mds incompatible with other markdown features (or something like that).
I am so new to GitHub that I can't tell if I am doing something wrong by using HTML. Thanks!

Typically, yes, a top-level heading is equivalent to h1. Markdown is designed to be a shorthand for HTML and so it has a lot of correspondence. In addition, you can embed HTML in your Markdown file to do things that Markdown itself can't.
However, having said that, if you're writing Markdown, it's best to use the Markdown features wherever possible. Those will be more familiar to people and generally more flexible, substantially easier to read in plain text, and in some cases, such as READMEs and documents on GitHub, certain tags, as well as JavaScript and CSS, are restricted for security purposes. In addition, if you're converting Markdown to another format with a tool like pandoc, much of that HTML won't transfer.
Thus, you'll have more success if you think about Markdown as an independent document language you use to write text, where you may, from time to time, embed HTML to do very specific things, rather than writing a full Markdown-HTML hybrid. When you think about using Markdown as a document language primarily for text, and not a language that has support for formatting documents (e.g. alignments, colours, layout, etc.), then it becomes easier to write Markdown documents that work in a wide variety of contexts. Since your README is designed to basically communicate ideas using text, perhaps with a few images, that approach will help you write more typical Markdown.

Related

Format suitable to export to both html & pdf?

I need to maintain many documents which need to be able to be viewed as 2 different types of format: PDF & HTML. The document will be mostly text, but may contain some images or mathematical formulas.
My current approach is to maintain 2 files for each document. However, this approach is tiresome, as if the content needs to change, I need to modify BOTH versions of the file.
I want to find a way to easily keep both versions of the file in sync. Preferably (but not necessary), the approach should allow me to use tools like git, or svn.
A solution that comes to my mind is to use latex. Represent the document in latex, then export it to HTML/PDF. This way, whenever there is a change, I need only to modify one file (the latex file).
But I have zero experience working with latex. I'm not sure whether latex is suitable for this, I need advice. What do you guys think? Is latex suitable for this task? If not, what alternatives do I have?
First of all,
yes, LaTeX is suitable for this (and it works particularly well with formulæ).
The main processing paths are:
Use pdflatex to create a pdf directly from LaTeX
Use latex2html or tex4ht to convert your LaTeX source to HTML
I am biased (having authored a text book for LaTeX in German language), but I think LaTeX is definitely worth learning.
restructuredText (Python docutils) is good for this. There are a couple of paths from text to PDF; one of them goes through LaTeX and the other one is a pure Python rst2pdf.
If you have a lot of formulas, it might be worth doing it in LaTeX, but restructuredText source is a lot more readable than LaTeX source.
Sounds like a good candidate scenario for working in markdown and using pandoc to convert to both LaTeX and HTML. Formulas can be essentially written in LaTeX (thus making the maintenance of that output painless) and the markdown-to-HTML conversion can be expressed with the --mathjax option to yield proper display in HTML.

How can we implement Jeff Eaton's custom tags strategy in Ruby?

I distinctly recognise the problem described in Jeff Eaton's article The Battle for the Body Field on A List Apart. That of handing clients a CMS that strikes a balance between conceptual simplicity in editing content, and flexibility in the flow and structure of that content. Whilst generating clean, forward-compatible code and responsive layouts.
I'm now convinced that some sort of custom tags are the solution to this problem. Even if they are wrapped in a WYSIWYG editor.
I'd like to keep preprocessing server-side until web-components are more widely, natively supported. And I favour Ruby/Rails for development.
So what libraries are available that would help with preprocessing and expanding custom XML or HTML tags in this way?
XSLT seems too limited. And Radius is perhaps a contender, though it doesn't appear to still be in active development.
I tend to favour markdown because it's extensible and acts as a subgroup of HTML. In Ruby, the main contenders are Redcarpet and kramdown. There are others, but I have not used them.
Redcarpet is mature and solid. It is also highly performant and extensible. You can define your own custom tags and syntax. It allows you to pre-process and post-process content.
It has disadvantages, though. Since it adheres to the markdown standard it can be limiting. I wrote my own figure tag syntax, and found that it was being inserted between paragraph tags, leading to invalid HTML. This is not its own fault. It's how markdown works.
!![figure caption](image_url "img alt text")
An alternative is kramdown, which is written with flexibility in mind. It allows full customisation of your syntax.

using html/css, i would like to automatically generate a bibliography at the bottom of my website akin to latex's \bibliography command

I'll ask my question first, then give some background for those who are interested:
I would like to know if there is a command in html that will automatically generate a bibliography from a .bib file? This means that throughout the text, i would add something like <cite name="Jones2010">, and then at the bottom of the html (or css) file, I would write something like <makebib file="biblist.bib", format="APA">, and a bibliography would be generated using my .bib file, and formated according to the APA style. The functionality would be quite similar to footnotes, except that each footnote is populated according to some script that extracts the information from (essentially) an xml file and outputs the content in the desired format. It is not difficult to imagine somebody creating a tool to do just that, however, my google search skills have not enabled me to find such a tool. It is easy to find tools that convert bib files to html or xml, but that is not sufficient for my needs. I do not desire to publish my entire bib file online. Rather, for each document that I generate, I want several of the entries in the bib file to be included as footnotes. Any pointers will be greatly appreciated.
Now, the reason behind the question:
I have recently begun switching from writing all my manuscripts using latex to writing them using html/css. The advantages of this approach are fast: only 1 file for versioning (instead of .dvi, .ps, .aux, .blg, etc.), it is much smaller to share, other people can edit the html file and compile it much more easily, it is more configurable to my tastes, easier to read on screen, etc. The disadvantage for me, however, is that while I've been writing in latex for years, I've only just begin using html and css for scientific document creating. The main impetus for the switch was MathJaX, which enables me to to embed latex equations in my html files, and therefore, allows me to combine the advantages of latex with the advantages of css. I imagine that nearly all my colleagues will switch away from latex to this simpler format, assuming a few remaining issues get resolved, like ease of creating bibliographies.
Many thanks.
What you're asking isn't possible, unless when you specify html/css you really mean html/css/php or html/css/python or some other combination that includes an actual programming language, rather than just a markup language.
I understand your motivation, I'd love to switch to html instead of latex! However, I suspect an html-based solution would involve so much extra processing added on top to sort out bibliographies etc that the complexity would start approaching that of LaTeX by the time you got it all worked out.
I'd be pleased to be proven wrong on this!
I've done this, in the past, using XSLT and BibTeX. In outline, the steps are
Mark up your document using some convention or other: I used <span class='citation'>Smith99</span>
Write an XSLT script to transform that file into a .aux file with \citation commands in it
Use BibTeX along with a .bst file which spits out HTML rather than LaTeX
Use another XSLT script (or the same one, in a different mode) to pull the bibliography in
It's not quite as fiddly as it sounds, but you can look at how I did it on google code. In particular, see structure.xslt and plainhtml.bst.
If there's a more direct way, I'd be quite interested to hear about it.
Both answers so far are somewhat correct, although not quite what you were asking for. Part of the problem is that the question as it's phrased doesn't necessarily makes sense.
HTML is just markup; you need something to process the markup, be it python, php, ruby, etc.
And you probably want to write in XML (or XHTML), not HTML.
XSLT may work for you (once it's in XML), but remember, an XSLT document that defines a set of rules. You would get an XSLT engine to apply your XSLT rules against your XML document.
You can create an html bibliography from a .bib file using bibtex2html. This package takes a series of command line arguments and extracts the info from the BibTeX source and outputs a file with html markup.
As far as I know you cannot get it to read and parse the html document like the LaTeX \cite command but there are several ways to indicate the references you want. I find that the easiest way is to just maintain a text file of the BibTeX keys I use in my manuscript and then call this using the --citefile option. There is also a tool called bib2bib included that will take search commands.
It is a very flexible package and there are a lot of options so it works in a lot of situations. For example you can get it to omit the <html> headers from the output file so that you can directly paste into an existing html document.
The documentation is useful but make sure you look at the pdf documentation file and the man pages.

What language should I use for editing documents?

Document editors are nice but they have their limitations.
What is a good alternative to them?
I already know HTML and CSS and while they can do the job, they are ill-suited for printed documents.
I was thinking in learning LaTeX, because many scholars use it. But I wonder if someone would recommend another language such as postscript.
LaTeX is fine. You don't want to write postscript by hand.
I’m using LaTeX almost exclusively nowadays, at least for text documents (everything from CV over letters to manuals).
For quick one-off notes, I’m actually using Markdown (without a renderer. I just think that Markdown preserves document structure quite nicely even when used in text-only mode).
For presentations and spreadsheets, I use appropriate applications, though. In particular, I don’t think LaTeX is that well-suited to do the former (depending on your style of presentations, obviously. Mine have next to no text though …).
I finally got a chance to write an entire paper in LaTeX for my final semester of College and found it to be easier than I thought it would be. A couple of the nice things I found about it were
A fairly lightweight syntax for most things (tables being the only real offender, but no one can get text tables right).
An extremely wide array of syntax for doing anything from automatically marking up a chemical formula to writing inline lists.
Beautiful output automatically.
Extremely easy to write modular documents where I might store a chapter in a file and then simply \include{} it in another. One particularly nice use I found for this was to include code that I had written in the document simply by referencing the files.
Wonderful support for footnotes and bibliographic references.
Libraries for just about anything you can imagine.
The major drawbacks are, IMHO:
A lack of any real direction or life in the language. It feels dead, and not because it's done.
A frustrating build process, although there are tools to help with that, from a simple bash script to a full fledged make file.
If you're interested in learning LaTeX, I would recommend starting out by reading the Not So Short Introduction to LaTeX 2e PDF.
However, I decided against using LaTeX for most things that I write these days specifically because it feels dead and has a frustrating build process. I instead switched over to MultiMarkdown, as it is well supported and can be transformed into a large array of other formats, including LaTeX which can then be hand massaged if you really need to in order to get it the format expected by some publication. If you haven't played with MultiMarkdown or Markdown before, then I highly recommend checking them out. The syntax is extremely lightweight and natural, even compared to LaTeX. I find that except for some of the higher level typographical constructs, MultiMarkdown supports everything I need on a regular basis.
My 2 cents.
It depends on what you want to do. If you are planning to write a formal document, maybe for printing too, just go for LaTex.
Not difficolt as it may appear at the very beginning but professional and fulfilling.
If Web is your goal, go for HTML / CSS.
OpenOffice or Word would do the trick in most cases; do not underestimate them, if you are going to use them (example for job) take time to learn them.
To expand on zzzzBov's commmment, LaTeX is SUPPOSED to allow the writer to concentrate on the content and allow the compiler/documentclass to handle formatting (and that usually is true). If you use HTML/CSS to format you will probably be spending more time (rather than less) doing formatting. Imagine that the LaTeX documentclass is the CSS, only it is already written for you, and your LaTeX source is the content, only the tags are more functional (such as italics or equations) than for patching between the HTML and the CSS (<div ...>). I recommend the LaTeX wikibook as an easy way to start, and the short-math-guide, it if you need mathematics. Enjoy!

Why do I need Markdown?

Why do I need a Markdown with a front edit editor like WMD? What does the markdown do to the content that’s sent from the WMD editor?
How does Markdown store the content in the backend? Is it the same way like *bold* or in some other format? Why can’t I just do an html encode?
Sorry if I sounded very naïve.

			
				
It's probably helpful to take a step back and ask some of the larger questions. The issue Markdown is trying to solve is that of rich editing in the browser. Consider this: At some point, for any piece of software to enable rich text it has to describe the richness in a some manner, however that may be.
We could call that description of richness (by description of richness I mean like "this bit of text is bold" or "this bit of text is a hyperlink), we could call that description of richness "markup" -- it marks up the text with meta "richness".
Implementations of rich text can take on two approaches, either a.) hide the markup from the user or b.) let them have access to the markup.
For those who choose to hide it, the end result is very often WYSIWYG. The user is oblivious to what is happening behind the scenes. The editor takes care of the details. Think MS Word as an example. No one manipulates the Word markup format as a regular end user.
For implementations which choose to expose the markup, a markup language is then in order to allow users to interacat with it. Such markup languages would be things like HTML doing <tag> or BB code for example, doing things like [tag].
Markdown is one such of these languages.
As opposed to the former types I mentioned, Markdown has tried to design itself so that the markup renders common ASCII people already use. For example, it's common for people to asterisk their text to set it off, *important*, and this notation in Markdown is an indicator of italic.
In regards to storage, as Stephan pointed out, the system will most likely store the raw markdown, because the user will most likely need to have the possibility of editing, and the original markdown can be recalled for that purpose.
In most of the systems I've built, I store the markdown, and then normalize it to a 2nd field which caches the HTML rendering of the markdown. This way I don't have to do markdown->HTML rendering for every markdown field. It takes a little more space, but I'd rather the user have a faster response than use less DB storage space.
Care should also be taken when accepting Markdown from the browser, as it can easily contain <script> tags which need to be filtered out. Most markdown implementations will also recognize HTML intermingled with Markdown formatting, as so to be safe, you need to make sure your inputs and caches are sanitized properly.
The reason for using an alternate encoding system other than HTML is for security
Markdown and other such wiki style encoding systems do not usually support scripting languages
HTML supports scripting languages in many ways (
The two main security issues are:
Malware criminals use scripts in user generated content to attempt malware actions on the content readers computer by scripting to access known security holes
Free loaders using scripts to subvert the rest of the site by changing the content frame or styles i.e. ads, menu's, logos etc. This can also be criminal behaviour if not just annoying
By using an intermediate language such as Markdown you have total control on the rendered output
Filtering HTML is possible, but is also complex and risky
The other significant reason for an alternate encoding system is enforcement of style. Normal HTML has too many options. By limiting the available options, users can only use certain styles. The usually makes for cleaner looking and more readable content (compare SO to Ebay)
The main reason for using Markdown is the readability of a marked text. For instance, you can send it in a plain-text email and the reader will still understand the emphiasis, bullets, the text will be divided in paragraphs et cetera.
When you ask about storing data, it depends. If you enable Markdown in the WordPress blog engine, it stores data as the user has input it - in Markdown. In Stack Overflow, however, it seems like the data is stored as HTML. At least, the "Stack Overflow data dumps" contain HTML, not Markdown (I've seen people complaining) that they have to convert it back).
If you use the WMD editor, you can show the user how the outputs will look like after being converted to HTML. Even though Markdown syntax is really simple, it is not hard to make mistakes. Hence, it is best to show users the output.
Another reason for using Markdown instead of a WYSIWIG control - a WYSIWIG control allows the user to use HTML in data you are displaying on your web page. So, you have to be the one who decides when there is simply incorrect HTML and when it is an evil XSS/CSRF/whatever injection. In Markdown, you simply convert *something* to <b>something</b>, remove any unknow HTML elements and you're done.