How can I add header metadata without adding the <h1>? - html

I'm writing something in markdown and converting it to html with pandoc, but when I add the title variable in the yaml header, it also adds an <h1> to the top of the document, which I don't want. In the pandoc documentation it says to use the title-meta variable, but it still says
[WARNING] This document format requires a nonempty <title> element.
Is there a way to set the title without adding the title block?
command I'm using:
pandoc -s "file.md" -o "file.html"
output of pandoc --version:
pandoc 2.10.1
Compiled with pandoc-types 1.21, texmath 0.12.0.2, skylighting 0.8.5
Default user data directory: C:\Users\noah\AppData\Roaming\pandoc
Copyright (C) 2006-2020 John MacFarlane
Web: https://pandoc.org
This is free software; see the source for copying conditions.
There is no warranty, not even for merchantability or fitness
for a particular purpose.

One can set an explicit title with --metadata=title="My title" while simultaneously preventing the output of the <h1> and <header> elements by setting the template variable title to an empty string:
pandoc --metadata=title="Fancy title" --variable=title="" ...
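For scripted conversions, the same invocation can be assembled programmatically. This is only a minimal sketch in Python: the helper name `title_without_h1_args` and the file names are made up for illustration, and the function just builds the argument list from the command above, so actually running it requires pandoc on the PATH.

```python
import shlex


def title_without_h1_args(src, dst, title):
    """Build the pandoc argument list used in the answer above.

    --metadata=title=... supplies the document title, while the empty
    --variable=title= blanks the template variable, which is what the
    answer relies on to suppress the <header>/<h1> block.
    """
    return [
        "pandoc",
        "-s",
        src,
        "-o",
        dst,
        f"--metadata=title={title}",
        "--variable=title=",
    ]


# Hypothetical file names; print the shell-quoted command for inspection.
args = title_without_h1_args("file.md", "file.html", "Fancy title")
print(shlex.join(args))
```

Passing the list to `subprocess.run(args, check=True)` would run the conversion without any shell quoting concerns.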

Related

Add proper syntax name to code blocks when converting from HTML to Markdown with Pandoc

I need to convert some HTML to Markdown with Pandoc. All is fine except the code blocks in my document are not converted properly. I need them to appear in the resulting Markdown document as backtick-code blocks with syntax definition.
For example, if I have such source HTML:
<pre class="python"><code>
def myfunc(param):
    '''Description of myfunc'''
    return do_something(param)
</code></pre>
I want Pandoc to convert it into:
```python
def myfunc(param):
    '''Description of myfunc'''
    return do_something(param)
```
But what I am getting is:
``` {.python}
def myfunc(param):
    '''Description of myfunc'''
    return do_something(param)
```
It's almost there, but the syntax definition is in curly braces and with a dot, which is not recognised by my Markdown parser. How can I get ```python instead of ``` {.python} when converting HTML to Markdown?
I have control over the source HTML, so I can change it as needed. If there is a way to insert "raw markdown" into the HTML that Pandoc will pass through untouched, that would work for me too: I could embed those blocks in the source HTML exactly as I need them. But I can't find such an option in the docs.
This behavior is governed by the fenced_code_attributes extension. It is enabled by default; disabling it will give your desired output:
pandoc --to=markdown-fenced_code_attributes ...
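If the Markdown has already been generated and re-running pandoc is not an option, a post-processing step can do a similar rewrite. Note that this is a different technique from the extension flag above, sketched in Python under the assumption that the fence carries exactly one class such as `{.python}` and nothing else; the function name `simplify_fences` is made up for illustration.

```python
import re

# Matches an opening fence such as "``` {.python}" on its own line and
# captures the fence characters and the single class name.
_FENCE_ATTR = re.compile(
    r"^(`{3,}|~{3,})[ \t]*\{\.([\w+-]+)\}[ \t]*$", re.MULTILINE
)


def simplify_fences(markdown):
    """Rewrite "``` {.lang}" opening fences to "```lang".

    Fences with several attributes (ids, extra classes, key=value pairs)
    are left untouched, as are closing fences and plain fences.
    """
    return _FENCE_ATTR.sub(r"\1\2", markdown)
```

Where the pandoc flag can be used, it is the simpler choice, since it avoids parsing Markdown with a regular expression.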

Pandoc metadata not appearing in default HTML template

I'm converting org and markdown files to HTML using pandoc. I want to set metadata such as the title, subtitle, and author tags in an external YAML file and have them display using a template. However I can't get anything to appear beyond the normal body conversion.
I'm using the default HTML template. I've run the conversion concatenating the YAML config beforehand:
pandoc -t html -o output.html metadata.yaml input.md
I also tried including the yaml_metadata_block extension:
pandoc -t html+yaml_metadata_block -o output.html metadata.yaml input.md
Also, I've tried setting the variables in the command itself:
pandoc -t html -o output.html -V title="my title" input.md
My YAML file looks like this:
---
title: "my title"
subtitle: "my subtitle"
author: "the author"
...
Inspecting the default html template with pandoc -D html, it looks like when title etc. are defined, it places them in a header block:
$if(title)$
<header>
<h1 class="title">$title$</h1>
$if(subtitle)$
<p class="subtitle">$subtitle$</p>
$endif$
$for(author)$
<p class="author">$author$</p>
$endfor$
$if(date)$
<p class="date">$date$</p>
$endif$
</header>
But in every case, the html file only contains the converted text from input.md. I think this is the $body$ line defined in the default template.
How can I get these fields to appear in my html document?
My goodness, all I was missing was the -s option!
From the man page:
-s, --standalone
Produce output with an appropriate header and footer (e.g. a standalone HTML, LaTeX, TEI, or RTF file, not a fragment). This option is set automatically for pdf, epub, epub3, fb2, docx, and odt output.
Thus the following command works as expected:
pandoc -s -t html -o output.html metadata.yaml input.md

pandoc: Convert GitHub-flavoured MarkDown containing mixed html and markdown to html

My markdown was created according to the style from this top-result cheatsheet with HTML directives, and I convert it using this command:
pandoc -f gfm -t html --atx-headers -s -o out.html in.md
However, the generated html always ignores titles that have the following HTML code above them, leaving tons of ###, #### in my output HTML. My titles look like this:
# H1
<a name=toc-anchor-h2 />
## H2
<a name=toc-anchor-h3 />
### H3
<a name=toc-anchor-h4 />
#### H4
H1 works fine, but the # characters at the remaining levels are all treated by pandoc as plain text. How should I solve this problem?
The headers must be preceded by a blank line. The missing blank line is causing the Markdown parser to not recognize them as headers. Therefore, edit your document to the following:
# H1

<a name=toc-anchor-h2 />

## H2

<a name=toc-anchor-h3 />

### H3

<a name=toc-anchor-h4 />

#### H4
Or, if you are concerned that this moves the anchors too far away from the intended target, include them inline:
# H1
## <a name=toc-anchor-h2 />H2
### <a name=toc-anchor-h3 />H3
#### <a name=toc-anchor-h4 />H4
Or, as you are using Pandoc, you could use one of Pandoc's many extensions that assign identifiers directly to each header.
As it turns out, Pandoc's gfm variant of Markdown (which you are using) already includes the auto_identifiers extension. As the name implies, the auto_identifiers extension will cause id attributes to be auto-generated for every header. As a reminder, assigning an id attribute to an HTML element has the same effect as defining an anchor; you can link to either with a hash fragment. Therefore, you could simply remove your anchors and use the auto-generated ids which have already been assigned to the headers themselves.
However, if you would like to define your own custom id attributes for each header, then you may want to enable the header_attributes extension and alter your Markdown as follows:
# H1
## H2 {#toc-anchor-h2}
### H3 {#toc-anchor-h3}
#### H4 {#toc-anchor-h4}
which would generate the following HTML:
<h1 id="h1">H1</h1>
<h2 id="toc-anchor-h2">H2</h2>
<h3 id="toc-anchor-h3">H3</h3>
<h4 id="toc-anchor-h4">H4</h4>
Note that the "H1" header has an auto id assigned (based upon the text content of the element), while the remaining headers have the custom ids assigned to them.
One word of caution regarding the header_attributes extension: The syntax for defining the custom ids is non-standard and not supported by most Markdown implementations. If you want portable Markdown, then you should probably stick to the auto-generated ids as that does not require any non-standard markup in your documents.
Update: Note that according to the docs, the header_attributes extension is not compatible with gfm. Therefore, you wouldn't be able to use that extension. However, you get auto_identifiers by default. If you want custom identifiers, then you would need to use the custom raw HTML anchors. Of course, that gives you the added benefit of a portable Markdown document.
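For a large document, the blank-line fix from the start of this answer can be applied mechanically. The following is a rough sketch in Python, not a full Markdown parser: it assumes ATX-style headings (`#` through `######`) and treats any line starting with three or more backticks or tildes as toggling a code fence. The function name `blank_line_before_headings` is hypothetical.

```python
import re

_ATX_HEADING = re.compile(r"^#{1,6}(\s|$)")
_FENCE = re.compile(r"^(`{3,}|~{3,})")


def blank_line_before_headings(markdown):
    """Insert a blank line before each ATX heading that lacks one."""
    out = []
    in_fence = False
    for line in markdown.split("\n"):
        if _FENCE.match(line):
            # Toggle on every fence line; a simplification that ignores
            # fence length and nesting.
            in_fence = not in_fence
        elif not in_fence and _ATX_HEADING.match(line):
            # Only add a blank line when the previous line has content.
            if out and out[-1].strip():
                out.append("")
        out.append(line)
    return "\n".join(out)
```

Running it a second time on its own output changes nothing, so it should be safe to apply to files that are already partly fixed.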

Using the Author field of R Markdown in footer.html

R Markdown allows you to add a footer to your html output. The YAML header lets you give an author name using a specific field.
I would like to use this author name in my footer.html file, but cannot figure out how to achieve that.
Here is a minimal example:
fic.rmd:
---
title: "title"
author: "Mister-A"
output:
  html_document:
    includes:
      after_body: footer.html
---
content
And in the same folder the footer.html file:
I am - #author-name-field-that-I-don't-know-how-to-get -
Any help or advice would be much appreciated. Thank you very much.
If you want to be able to use the YAML parameters within sections of the report, you need to alter the base pandoc template. You can find all of them here.
The basic idea is to surround the variable name with dollar signs wherever the YAML value should appear in the output document. So, for example, $author$ is what is needed in this case.
Solution
We can create a copy of the pandoc template for HTML in our local directory using the following command. This is the same file as here.
# Copies the R Markdown template to the local directory so we can edit it
file.copy(rmarkdown:::rmarkdown_system_file("rmd/h/default.html"), to = "template.html")
In template.html, we need to add the pandoc tags. To add a footer, we want to add code to the bottom of the document. This is line 457 in the current template, but this may change in future versions, so we want to put it after the include-after tag:
$for(include-after)$
$include-after$
$endfor$
<hr />
<p style="text-align: center;">I am $author$</p>
$if(theme)$
$if(toc_float)$
</div>
</div>
$endif$
Finally, the R Markdown file looks like:
---
title: "title"
author: "Mister-A"
output:
  html_document:
    template: template.html
---
This is some text
As a possible extension of this, you may want to check out this post on designing a stylish footer.
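If the template gets re-copied whenever rmarkdown is updated, the manual edit above can be scripted rather than redone by hand. Here is a small sketch in Python (any scripting language would do); the function name `add_author_footer` is hypothetical, and it assumes the template contains the `$for(include-after)$ ... $endfor$` block shown above.

```python
FOOTER = '<hr />\n<p style="text-align: center;">I am $author$</p>\n'


def add_author_footer(template, footer=FOOTER):
    """Insert `footer` right after the $for(include-after)$ ... $endfor$ block."""
    start = template.find("$for(include-after)$")
    if start == -1:
        raise ValueError("template has no include-after block")
    end = template.find("$endfor$", start)
    if end == -1:
        raise ValueError("include-after block is not closed")
    # Continue past the end of the line that holds $endfor$.
    newline = template.find("\n", end)
    insert_at = len(template) if newline == -1 else newline + 1
    prefix = template[:insert_at]
    if not prefix.endswith("\n"):
        prefix += "\n"
    return prefix + footer + template[insert_at:]
```

To use it, read template.html into a string, pass it through the function, and write the result back; searching for the block instead of relying on line 457 keeps the script working if the template shifts.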

Pandoc HTML variables: `quotes` and `math`

Pandoc default HTML template contains these two variables:
quotes,
math.
How are they supposed to be used?
More specifically I see that quotes sets the values for the tag <q>. Is this tag used in markdown to HTML conversion?
tl;dr: they seem to be mostly obsolete legacies from previous versions of pandoc
quotes
A little archeology of pandoc commits shows that 'quotes' was added when pandoc switched from using <q> tags to inserting quotation marks directly. A new option, --html-q-tags, was added to keep the previous behavior: it wraps quotes in <q> and sets quotes to true so that a piece of CSS is added, as explained in the html template. See this commit to pandoc and this commit to pandoc-templates. See the behavior with the following file:
"hello world"
This:
pandoc test.md -t html --smart --standalone
Produces (skipping the usual head, with no css affecting <q>)
<p>“hello world”</p>
While this
pandoc test.md -t html --standalone --html-q-tags --smart
produces (skipping the usual header)
<style type="text/css">q { quotes: "“" "”" "‘" "’"; }</style>
</head>
<body>
<p><q>hello world</q></p>
</body>
You have to use --smart though.
math
It looks like this was introduced to include math rendering scripts inside the standalone file. See this commit from 2010. I think some command-line options that pick a math rendering system other than the current default, like --mathml, set this variable to a value that actually makes sense (such as a copy of the math rendering script). Try:
pandoc -t html --mathml
For the quotes variable, see @scoa's answer.
As for the math variable, here is what I found.
When using MathML, that is the option --mathml, the code block:
$if(math)$
$math$
$endif$
in the default HTML conversion template adds a portability script to the HTML output.
Anyway, Chrome and Edge do not currently support MathML and Firefox seems to support it without this script.
So, for a custom template, removing the $if(math)$ ... code block will not affect MathML rendering.
When using MathJax, that is the option --mathjax, $if(math)$ ... adds to the HTML output the script block:
<script src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_CHTML-full" type="text/javascript"></script>
This is always necessary to render the maths formulae.
When using --latexmathml, a giant script that converts the LaTeX-style math into MathML is inserted by the $if(math)$ ... code block. Without this code block in the conversion template, the script is not inserted and the maths can't be rendered.