Pandoc fails to embed metadata from the supplied YAML file - html

I need to convert some .xhtml files to regular .html (html5) with pandoc, and during the conversion I would like to embed some metadata (supplied via a YAML file) in the final files.
The conversion runs smoothly, but any attempt to embed the metadata invariably fails.
I tried many variations of this command, but it should be something like:
pandoc -s -H assets/header -c css/style.css -B assets/prefix -A assets/suffix --metadata-file=metadata.yaml input_file -o output_file --to=html5
The error I get is:
pandoc: unrecognized option `--metadata-file=metadata.yaml'
Try pandoc --help for more information.
I really don't get what's wrong with this, since I found this option in the pandoc manual.
Any ideas?

Your pandoc version is too old. Update to pandoc 2.3 or later.
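A minimal sketch of the fix, assuming a hypothetical metadata.yaml whose fields (title, author, lang) are just placeholders:

pandoc --version   # --metadata-file requires pandoc 2.3 or later

cat > metadata.yaml <<'EOF'
title: My Converted Page
author: Jane Doe
lang: en
EOF

pandoc -s -H assets/header -c css/style.css -B assets/prefix -A assets/suffix \
  --metadata-file=metadata.yaml input_file -o output_file --to=html5

With -s (standalone) and the html5 writer, those metadata fields end up in the <head> of the generated file.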

Related

Converting a Swagger YAML file to JSON from the command line

I'd like to convert a Swagger YAML file to JSON from the command line. The plan is to use this command line during a CI job. I've searched on google and found many solutions, but most of them use Python or Ruby, which I'd prefer not to use. For example: http://www.commandlinefu.com/commands/view/12218/convert-yaml-to-json
I'd like to do this without using Python or Ruby, and
I'd also like to be able to control the leading whitespace when formatting the JSON to match exactly the JSON that is output from Swagger's editor.swagger.io editor, when you choose File -> Download JSON
All this means is that I'd like the whitespace padding to be four spaces, like so:
{
    "swagger": "2.0",
    "info": {
        "title": "API TITLE",
I haven't tried the Python method in the link above, but the Ruby method uses two space whitespace padding. Perhaps there is a way to control that, but I don't want to use Ruby or Python in this solution anyway.
I'm sure that there are many "correct" answers to this question. I am looking for the solution that is most elegant with the least number of dependencies. Ideally, a diff of the resulting JSON file against a JSON file generated by the editor.swagger.io should be empty.
I think that you are looking for the swagger-codegen (now OpenApi-generator) functionality:
Running
swagger-codegen generate -i swagger.yaml -l swagger
will output a swagger.json file in the same location.
Update For CI:
If you can install it on your build machine, good for you.
If you can't, the GitHub page links to a Docker image running a Node.js server that can do the conversion (using a curl command, as suggested in a different answer).
Update For Docker:
If you use Docker, try swaggerapi/swagger-codegen-cli; there is a docker-compose example by Fabian & ckeeney a few answers down that might help.
Update about OpenApi:
This question is about Swagger and is a few years old. If you're just starting to use Swagger, you should switch to OpenApi instead; and if you have existing Swagger files, I suggest migrating them.
Using yamljs:
yaml2json swagger.yaml -p -i4
The output from this command diff'd against the JSON output from editor.swagger.io produces an empty diff.
This is indeed what I'm looking for, but it brings in a huge dependency (Node). I'm hoping for something even lighter, yet equally elegant.
swagger-codegen cli interface
As Liel has already pointed out, you can run
swagger-codegen generate -i swagger.yaml -l swagger
Docker
If you use Docker, then I suggest you try swaggerapi/swagger-codegen-cli.
You can generate a json file using docker with the following command:
docker run -v ./docs:/docs swaggerapi/swagger-codegen-cli generate -i /docs/swagger.yaml -l swagger -o /docs
I like to set up a docker-compose.yml to "alias" this command for easy reuse:
version: "2"
services:
gen-swagger:
volumes:
- ./docs:/docs
image: swaggerapi/swagger-codegen-cli
command: generate -i /docs/swagger.yaml -l swagger -o /docs
And now I can just run docker-compose run gen-swagger
For swagger-codegen version 3.0.4, use
swagger-codegen generate -i my_yaml.yaml -l openapi
to get a .json.
Another possibility for converting a swagger.yml file to swagger.json is an NPM package called swagger-cli.
npm install -g swagger-cli
Then you can convert a YAML file to JSON:
swagger-cli bundle -o api-spec.json api-spec.yml
You can use the online swagger codegen project to do this:
curl -X POST --header "Content-Type: application/json" --header "Accept: application/json" -d "{
\"spec\": {}
}" "https://generator.swagger.io/api/gen/clients/swagger-yaml"
Put the value of your swagger definition in the spec object. You'll get a link to download the converted & validated spec, in yaml format.
For options, take a look here:
http://generator.swagger.io/
I'd use https://openapi-generator.tech/
It's an npm install (I just used it locally: npm install @openapitools/openapi-generator-cli) and then
npx @openapitools/openapi-generator-cli generate -i source.yaml -g openapi -o outputdir
For Gradle with Kotlin, I've written the following in my build.gradle.kts:
import com.fasterxml.jackson.databind.JsonNode
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory
import java.nio.file.Path
and then, in some task such as compileJava, the code for the conversion:
val compileJava: Task by tasks.getting {
    val openApiDir = "${rootProject.projectDir}/openapi"
    val json: JsonNode? = ObjectMapper(YAMLFactory())
        .readTree(Path.of("$openApiDir/openapi.yaml").toFile())
    ObjectMapper().writerWithDefaultPrettyPrinter()
        .writeValue(Path.of("$openApiDir/openapi.json").toFile(), json)
}

Use Pandoc to convert tex file

I am trying to use pandoc to convert my TeX file to HTML or EPUB. It is not a complex LaTeX file with math formulas; it is something like a book.
But I have a problem. When I convert the file to PDF with pdflatex, the whole file is fine. But when I use
pandoc book.tex -s --webtex -o book.html
or
pandoc -S book.tex -o book.epub
it is as if there was no compilation: << is not replaced by «, and each command, like \emph{something}, is simply ignored and the word is deleted from the paragraph.
In fact, it is as if I had made a simple copy and paste, without the commands.
In older versions of pandoc, I think you needed to tell it the input format, otherwise it would assume it was markdown.
pandoc -f latex book.tex -S -o book.epub

Convert HTML with missing external references to epub?

I have saved the web page via "Save as..." in the browser as a single HTML file on disk.
http://pedrokroger.net/2012/10/using-sphinx-to-write-books/
Now, I'd like to convert it to epub.
pandoc -f html -t epub -S -R -s Using\ Sphinx\ to\ Write\ Technical\ Books\ -\ Pedro\ Kroger.html -o Using\ Sphinx\ to\ Write\ Technical\ Books\ -\ Pedro\ Kroger.epub
But pandoc throws an error:
pandoc: /images/pages/profile.png: openBinaryFile: does not exist (No such file or directory)
Is there an option to tell pandoc to ignore all external references and just convert the bare text?
This would be very convenient!

Recursive directory parsing with Pandoc on Mac

I found this question which had an answer to the question of performing batch conversions with Pandoc, but it doesn't answer the question of how to make it recursive. I stipulate up front that I'm not a programmer, so I'm seeking some help on this here.
The Pandoc documentation is slim on details regarding passing batches of files to the executable, and based on the script it looks like Pandoc itself is not capable of parsing more than a single file at a time. The script below works just fine in Mac OS X, but only processes the files in the local directory and outputs the results in the same place.
find . -name \*.md -type f -exec pandoc -o {}.txt {} \;
I used the following code to get something of the result I was hoping for:
find . -name \*.html -type f -exec pandoc -o {}.markdown {} \;
This simple script, run using Pandoc installed on Mac OS X 10.7.4, converts all matching files in the directory I run it in to Markdown and saves them in the same directory. For example, if I had a file named apps.html, it would convert that file to apps.html.markdown in the same directory as the source files.
While I'm pleased that it makes the conversion, and it's fast, I need it to process all files located in one directory and put the markdown versions in a set of mirrored directories for editing. Ultimately, these directories are in Github repositories. One branch is for editing while another branch is for production/publishing. In addition, this simple script is retaining the original extension and appending the new extension to it. If I convert back again, it will add the HTML extension after the markdown extension, and the file size would just grow and grow.
Technically, all I need to do is be able to parse one branch's directory and sync it with the production one; then, when all changed, removed, and new content is verified correct, I can run commits to publish the changes. It looks like the find command can handle all of this, but I just have no clue how to configure it properly, even after reading the Mac OS X and Ubuntu man pages.
Any kind words of wisdom would be deeply appreciated.
TC
Create the following Makefile:
TXTDIR=sources
HTMLS=$(wildcard *.html)
MDS=$(patsubst %.html,$(TXTDIR)/%.markdown, $(HTMLS))

.PHONY : all
all : $(MDS)

$(TXTDIR) :
	mkdir $(TXTDIR)

$(TXTDIR)/%.markdown : %.html $(TXTDIR)
	pandoc -f html -t markdown -s $< -o $@
(Note: The indented lines must begin with a TAB -- this may not come through in the above, since markdown usually strips out tabs.)
Then you just need to type 'make', and it will run pandoc on every file with a .html extension in the working directory, producing a markdown version in 'sources'. An advantage of this method over using 'find' is that it will only run pandoc on a file that has changed since it was last run.
Just for the record: here is how I achieved the conversion of a bunch of HTML files to their Markdown equivalents:
for file in $(ls *.html); do pandoc -f html -t markdown "${file}" -o "${file%html}md"; done
If you have a look at the -o argument, you'll see it uses shell string manipulation to replace the existing html extension with the md ending.
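For the recursive, mirrored-directory setup the question asks about, a rough sketch along the same lines (the src/ and out/ directory names are just assumptions) could be:

find src -name '*.html' -type f | while read -r file; do
  out="out/${file#src/}"        # same relative path, but under out/
  out="${out%.html}.md"         # swap the extension instead of appending to it
  mkdir -p "$(dirname "$out")"  # recreate the subdirectory in the mirror
  pandoc -f html -t markdown -s "$file" -o "$out"
done

This keeps the source tree untouched and avoids the ever-growing apps.html.markdown.html problem, since the extension is replaced rather than appended.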

Wget recognizes some part of my URL address as a syntax error

I am quite new with wget and I have done my research on Google but I found no clue.
I need to save a single HTML file of a webpage:
wget yahoo.com -O test.html
and it works. But when I try to be more specific:
wget http://search.yahoo.com/404handler?src=search&p=food+delicious -O test.html
here comes the problem: wget treats &p=food+delicious as shell syntax, and it says: 'p' is not recognized as an internal or external command.
How can I solve this problem? I really appreciate your suggestions.
The & has a special meaning in the shell. Escape it with \ or put the URL in quotes to avoid this problem.
wget http://search.yahoo.com/404handler?src=search\&p=food+delicious -O test.html
or
wget "http://search.yahoo.com/404handler?src=search&p=food+delicious" -O test.html
In many Unix shells, putting an & after a command causes it to be executed in the background.
Wrap your URL in single quotes to avoid this issue.
i.e.
wget 'http://search.yahoo.com/404handler?src=search&p=food+delicious' -O test.html
If you are using a Jupyter notebook, check that you have installed the wget package
pip install wget
before fetching from the URL.