Include data file as Assemble.io/YAML Front Matter variable - json

I have a base.hbs file that does a {{#foreach}} loop through some items in an array. That array comes from a YAML/FM variable, set on a per-page basis.
On some pages, that array could be huge, so I'd like to keep it stored in a separate JSON file. Is it possible to set a YAML/FM variable to the data contained within a JSON file stored inside my configured data folder?
I've tried to do:
---
my-array: {{ my-data-file }}
---
...but with no success. It doesn't fail the build, but nothing is there.
Update
Alright, so after messing around a bit more, it turns out that even though Assemble (via Handlebars) can recognize {{ my-data-file }}, YAML/FM doesn't like the dashes. It would fail with 'my' is not defined. So I changed the filename to use underscores, using:
---
my-array: <%= my_data_file %>
---
I'm keeping this up in case someone else runs into the same issue.

How to use different config files for different environments in airflow?

I'm using SparkKubernetesOperator which has a template_field called application_file. Normally on giving this field a file's name, airflow reads that file and templates the jinja variable in it (just like script field in the BashOperator).
So this works and the file information is shown in the Rendered Template tab with the jinja variables replaced with the correct values.
start_streaming = SparkKubernetesOperator(
    task_id='start_streaming',
    namespace='spark',
    application_file='user_profiles_streaming_dev.yaml',
    ...
    dag=dag,
)
I want to use different files in the application_file field for different environments, so I used a jinja template in the field. But when I change application_file to user_profiles_streaming_{{ var.value.env }}.yaml, the rendered output is just the file name user_profiles_streaming_dev.yaml and not the file contents.
I know that recursive jinja variable replacement is not possible in airflow but I was wondering if there is any workaround for having different template files.
What I have tried -
I tried using a different operator and doing xcom push to read the file contents and sending it to SparkKubernetesOperator. While this was good for reading different files based on environment, it did not solve the issue of having the jinja variable replaced.
I also tried making a custom operator which inherits from SparkKubernetesOperator and has a template_field applicaton_file_name, thinking that jinja replacement would take place twice, but this didn't work either.
I made an env file which had the environment details (dev/prod). Then I added this code to the start of my dag file:
ENV = None
with open('/home/airflow/env', 'r') as env_file:
    # Strip whitespace so a trailing newline does not end up in the file name
    value = env_file.read().strip()
    if not value:
        raise Exception("ENV FILE NOT PRESENT")
    ENV = value
and then accessed the environment in the code like this:
submit_job = SparkKubernetesOperator(
    task_id='submit_job',
    namespace="spark",
    application_file=f"adhoc_{ENV}.yaml",
    do_xcom_push=True,
    dag=dag,
)
This way I could have separate dev and prod files.
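The same idea can be wrapped in a small helper that validates the value before it ends up in a file name. This is only a sketch: read_env, application_file_for and the allowed set ("dev", "prod") are hypothetical names chosen for illustration, while the /home/airflow/env path and the adhoc_{ENV}.yaml pattern come from the snippet above.

```python
from pathlib import Path

# Hypothetical set of environments; adjust to whatever your deployment uses.
ALLOWED_ENVS = ("dev", "prod")


def read_env(env_path="/home/airflow/env"):
    """Return the environment name stored in env_path, e.g. 'dev'."""
    # strip() removes the trailing newline that most editors append
    value = Path(env_path).read_text().strip()
    if value not in ALLOWED_ENVS:
        raise ValueError(f"Unexpected environment {value!r} in {env_path}")
    return value


def application_file_for(env, prefix="adhoc"):
    """Build the per-environment application file name."""
    return f"{prefix}_{env}.yaml"
```

In the DAG file this would be used as application_file=application_file_for(read_env()). The value is then an ordinary string ending in .yaml when the DAG is parsed, which is the case where, as described in the question, the file itself gets read and templated.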

ElectronJs: What does curly braces '{}' mean in package json

I was going through some Electron package.json examples where I found some interpolations like the ones below:
"updater": {
  "urls": {
    "darwin": "{{& SQUIRREL_UPDATES_URL }}/update/%CHANNEL%/darwin?version=%CURRENT_VERSION%",
    "win32": "{{& SQUIRREL_UPDATES_URL }}/update/%CHANNEL%/win32",
    "linux": "{{& SQUIRREL_UPDATES_URL }}/update/%CHANNEL%/linux"
  }
},
"piwik": {
  "serverUrl": "{{& PIWIK_SERVER_URL }}"
},
"sentry": {
  "dsn": "{{& SENTRY_DSN_PRIVATE }}"
}
I do not really know the following:
What does {{}} mean in JSON?
Where do these variables exist?
What does & mean inside {{}}, as in "{{& SENTRY_DSN_PRIVATE }}"?
If anyone can explain, it would be really kind. Many thanks in advance.
I guess you are talking about Whatsie and its package.json.
If you take a look at one of the Gulp tasks located in the file tasks/compile.coffee, you'll be able to see the lines (in CoffeeScript):
# Move package.json
gulp.task 'compile:' + dist + ':package', ['clean:build:' + dist], ->
  gulp.src './src/package.json'
    .pipe mustache process.env
    .pipe gulp.dest dir
Here the actual package.json is piped through the mustache template engine: package.json acts as the template, and process.env supplies the data to be inserted into it.
Because package.json acts as a template for mustache, you can use mustache syntax in it.
Curly braces {{}} are part of that syntax. They are placeholders that are replaced by the actual data when the template is rendered. In the mustache docs you can also find this line:
You can also use & to unescape a variable: {{& name}}
So {{& name}} prevents the value from being escaped. Without the &, any dangerous characters in the output value are replaced by safer ones (originally to prevent XSS in templates), which transforms the initial value, and that is not always what you want. In this case the author wants to preserve the original value.
Going back to process.env: it is an object that gives access to environment variables in Node.js. The repository contains a file, .env-example, with an example of the env variables a developer has to set so that the application works differently in different environments (for example on a local machine or a CI server). Some of the variable names in this file are the ones used in package.json as template placeholders. I guess the author of the app uses all of this to simplify the build process for different environments.
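To make the difference between {{ name }} and {{& name }} concrete, here is a minimal Python sketch of placeholder substitution. It is not the mustache library and not what the Gulp task runs; render is a hypothetical helper that only handles simple variables, html.escape stands in for mustache's escaping, and the URL is a made-up example value.

```python
import html
import re

# Matches {{ NAME }} and {{& NAME }}; the optional "&" means "do not escape".
PLACEHOLDER = re.compile(r"\{\{(&?)\s*(\w+)\s*\}\}")


def render(template, data):
    """Replace each placeholder with the matching value from data."""
    def substitute(match):
        unescaped = match.group(1) == "&"
        # Missing keys are rendered as empty strings in this sketch
        value = str(data.get(match.group(2), ""))
        return value if unescaped else html.escape(value)
    return PLACEHOLDER.sub(substitute, template)


# Made-up value standing in for an environment variable
env = {"PIWIK_SERVER_URL": "https://example.com/piwik?a=1&b=2"}

print(render('"serverUrl": "{{& PIWIK_SERVER_URL }}"', env))
print(render('"serverUrl": "{{ PIWIK_SERVER_URL }}"', env))
```

The first call keeps the URL intact, while the second turns the & in the query string into &amp;, which is why the author uses the & form for URLs and DSNs.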

SSIS Foreachloop container inquiry

I have searched for this question everywhere and can't seem to find it so here we go.
I set up a Foreachloop container that uses the "Foreach File Enumerator". In the Files section, where you specify the file name format and an optional wildcard to return only certain files, I have F_*.csv, which works fine. However, I can't find a way to also return files whose names begin with D_. I'm aware this can be done with 2 separate Foreachloop containers, but is there any way to do it in a single one so that it checks for both kinds of files?
The reason I need this is because there are other csv files in that folder which don't begin with a D_ nor an F_ so I'm trying to exclude those.
Thanks in advance !
I do not believe you can specify a regex in the "Files" criteria. I would read in all files matching the broadest criteria ("*.csv") and then, inside the Foreach loop, evaluate the filename into a variable and test that variable in the control flow. Add a sequence container to perform the desired ETL, and on the connector to the sequence container add a constraint for the filename test: if the test fails, do nothing; if it passes, move on to the sequence container and perform the actions.
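The filename test itself is simple. The sketch below shows the logic in Python, outside of SSIS, purely as an illustration: in a package the same check would live in the constraint expression on the connector. The should_process name and the sample file names are hypothetical.

```python
# Prefixes taken from the question; everything else is skipped.
ALLOWED_PREFIXES = ("F_", "D_")


def should_process(filename):
    """Return True for CSV files whose names start with F_ or D_."""
    name = filename.upper()  # compare case-insensitively, as Windows does
    return name.endswith(".CSV") and name.startswith(ALLOWED_PREFIXES)


# Hypothetical folder contents, as returned by the broad *.csv enumeration
files = ["F_sales.csv", "D_stock.csv", "other.csv", "F_notes.txt"]
print([f for f in files if should_process(f)])
```

Only F_sales.csv and D_stock.csv pass the test; the other files fall through the "do nothing" branch.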

PanDoc: How to assign level-one Atx-style header (markdown) to the contents of html title tag

I am using PanDoc to convert a large number of markdown (.md) files to html. I'm using the following Windows command-line:
for %%i in (*.md) do pandoc -f markdown -s %%~ni.md > html/%%~ni.html
On a test run, the html looks OK, except for the title tag - it's empty. Here is an example of the beginning of the .md file:
#Topic Title
- [Anchor 1](#anchor1)
- [Anchor 2](#anchor2)
<a name="anchor1"></a>
## Anchor 1
Is there a way I can tell PanDoc to parse the
#Topic Title
so that, in the html output files, I will get:
<title>Topic Title</title>
?
There are other .md tags I'd like to parse, and I think solving this will help me solve the rest of it.
I don't believe Pandoc supports this out-of-the-box. The relevant part of the Pandoc documentation states:
Templates may contain variables. Variable names are sequences of alphanumerics, -, and _, starting with a letter. A variable name surrounded by $ signs will be replaced by its value. For example, the string $title$ in
<title>$title$</title>
will be replaced by the document title.
It then continues:
Some variables are set automatically by pandoc. These vary somewhat depending on the output format, but include metadata fields (such as title, author, and date) as well as the following:
And proceeds to list a bunch of variables (none of which are relevant to your question). However, the above quote indicates that the title variable is a metadata field. The metadata field can be defined in a pandoc_title_block, a yaml_metadata_block, or passed in as a command line option.
The docs note that:
... you may also keep the metadata in a separate YAML file and pass it to pandoc as an argument, along with your markdown files ...
So you have a couple of options:
Edit each document to add metadata defining its title (this could possibly be scripted).
Write your script to extract the title (perhaps with a regex that looks for #header in the first line) and pass it to Pandoc as a command-line option.
If you intend to include the metadata in new documents going forward, then the first option is probably the way to go. Run a script once to batch-edit your documents and then you're done. However, if you have no intention of adding metadata to any documents, I would consider the second option. You are already running a loop, so just get the title before calling Pandoc within your loop (although I'm not sure how to do that in a Windows script).
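The second option could look roughly like the following Python sketch, used in place of the batch loop. It assumes pandoc is on the PATH and that the title is the first line starting with a single #; extract_title, pandoc_command and convert_all are hypothetical helper names. The regex is deliberately naive and would also match a # comment inside a code block.

```python
import re
import subprocess
from pathlib import Path

# Level-one ATX header: a single "#", optional space, then the title text.
H1_PATTERN = re.compile(r"^#(?!#)\s*(.+?)\s*#*\s*$")


def extract_title(markdown_text):
    """Return the text of the first level-one ATX header, or None."""
    for line in markdown_text.splitlines():
        match = H1_PATTERN.match(line)
        if match:
            return match.group(1)
    return None


def pandoc_command(source, target, title):
    """Build the pandoc argument list; --metadata sets the title field."""
    command = ["pandoc", "-f", "markdown", "-s", str(source), "-o", str(target)]
    if title:
        command += ["--metadata", f"title={title}"]
    return command


def convert_all(source_dir=".", output_dir="html"):
    """Convert every .md file in source_dir, mirroring the batch loop."""
    out = Path(output_dir)
    out.mkdir(exist_ok=True)
    for source in sorted(Path(source_dir).glob("*.md")):
        title = extract_title(source.read_text(encoding="utf-8"))
        target = out / (source.stem + ".html")
        subprocess.run(pandoc_command(source, target, title), check=True)
```

Calling convert_all() from the folder that holds the .md files writes one .html file per document into html/. With pandoc's default HTML template, a title set this way will probably also be rendered as a heading at the top of the body; as far as I know, setting the pagetitle metadata field instead affects only the title tag.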

Jekyll Filename Without Date

I want to build a documentation site using Jekyll and GitHub Pages. The problem is that Jekyll only accepts filenames under _posts that follow the exact pattern YYYY-MM-DD-your-title-is-here.md.
How can I post a page in Jekyll without this filename pattern? Something like:
awesome-title.md
yet-another-title.md
etc.md
Thanks in advance.
Don't use posts; posts are things with dates. It sounds like you probably want to use collections instead: you get all the power of posts, but without the pesky date and naming requirements.
https://jekyllrb.com/docs/collections/
I use collections for almost everything that isn't a post, including 'pages' as well as more specific sections of my own site.
I guess that you are annoyed with the post url http://domaine.tld/category/2014/11/22/post.html.
You cannot bypass the filename pattern for posts, but you can use permalink (see documentation).
_posts/2014-11-22-other-post.md
---
title: "Other post"
date: 2014-11-22 09:49:00
permalink: anything-you-want
---
File will be anything-you-want/index.html.
Url will be http://domaine.tld/anything-you-want.
What I did without "abandoning" posts (although using collections or pages looks like a better and deeper solution) is a combination of what @igneousaur says in a comment plus using the same date as the prefix of all file names:
Use permalink: /:title.html in _config.yml (no dates in published URLs).
Use the format 0001-01-01-name.md for all files in _posts folder (jekyll is happy about the file names and I'm happy about the sorting of the files).
Of course, we can include any "extra information" in the name, maybe an incremental ID or anything else that helps us organize the files, e.g.: 0001-01-01-001-name.md.
The way I solved it was by adding _plugins/no_date.rb:
class Jekyll::PostReader
  # Don't use DATE_FILENAME_MATCHER so we don't need to put those stupid dates
  # in the filename. Also limit to just *.markdown, so it won't process binary
  # files from e.g. drags.
  def read_posts(dir)
    read_publishable(dir, "_posts", /.*\.markdown$/)
  end
  def read_drafts(dir)
    read_publishable(dir, "_drafts", /.*\.markdown$/)
  end
end
This overrides ("monkey patches") the standard Jekyll functions; the defaults for these are:
  # Read all the files in <source>/<dir>/_drafts and create a new
  # Document object with each one.
  #
  # dir - The String relative path of the directory to read.
  #
  # Returns nothing.
  def read_drafts(dir)
    read_publishable(dir, "_drafts", Document::DATELESS_FILENAME_MATCHER)
  end
  # Read all the files in <source>/<dir>/_posts and create a new Document
  # object with each one.
  #
  # dir - The String relative path of the directory to read.
  #
  # Returns nothing.
  def read_posts(dir)
    read_publishable(dir, "_posts", Document::DATE_FILENAME_MATCHER)
  end
With the referenced constants being:
DATELESS_FILENAME_MATCHER = %r!^(?:.+/)*(.*)(\.[^.]+)$!.freeze
DATE_FILENAME_MATCHER = %r!^(?>.+/)*?(\d{2,4}-\d{1,2}-\d{1,2})-([^/]*)(\.[^.]+)$!.freeze
As you can see, DATE_FILENAME_MATCHER as used in read_posts() requires a date ((\d{2,4}-\d{1,2}-\d{1,2})); I put date: 2021-07-06 in the frontmatter.
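To see which file names the date matcher accepts, the regex can be tried out in Python. This is an approximation rather than Jekyll's own code: the Ruby atomic group (?>...) is replaced with a plain non-capturing group, which I assume behaves the same for simple names like these, and parse_post_name is a hypothetical helper.

```python
import re

# Python approximation of Jekyll's DATE_FILENAME_MATCHER (assumption: the
# atomic group is swapped for a non-capturing one).
DATE_FILENAME_MATCHER = re.compile(
    r"^(?:.+/)*?(\d{2,4}-\d{1,2}-\d{1,2})-([^/]*)(\.[^.]+)$"
)


def parse_post_name(path):
    """Return (date, slug, extension) or None if the name has no date."""
    match = DATE_FILENAME_MATCHER.match(path)
    return match.groups() if match else None


for name in ["2014-11-22-other-post.md", "0001-01-01-name.md", "awesome-title.md"]:
    print(name, "->", parse_post_name(name))
```

The first two names are split into a date, a slug and an extension, while awesome-title.md yields None, which is why the stock read_posts skips it.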
I couldn't really get collections to work, and this also solves another problem I had where storing binary files such as images in _drafts would error out as it tried to process them.
Arguably a bit ugly, but it works well. Downside is that it may break on update, although I've been patching various things for years and never really had any issues with it thus far. This is with Jekyll 4.2.0.
I wanted to use posts but not have the date in the filenames. The closest I got was naming the posts with an arbitrary 'date' like 0001-01-01-cool-post.md and then using a different property to access the date.
If you use the last-modified-at plugin - https://github.com/gjtorikian/jekyll-last-modified-at - then you can use page.last_modified_at in your _layouts/post.html and whatever file you are running {% for post in site.posts %} in.
Now the dates are retrieved from the last git commit date (not author date) and the page.date is unused.
The JSON schema for the config file actually contains some useful information; see the code block below for some examples.
I have set the permalink to /:categories/:title. That drops the date and file extension, while preserving the categories.
I still use a proper date in the file name because you can use that date in your templates, for example to display the date on a post using {{ page.date }}.
{
  "global-permalink": {
    "description": "The global permalink format\nhttps://jekyllrb.com/docs/permalinks/#global",
    "type": "string",
    "default": "date",
    "examples": [
      "/:year",
      "/:short_year",
      "/:month",
      "/:i_month",
      "/:short_month",
      "/:day",
      "/:i_day",
      "/:y_day",
      "/:w_year",
      "/:week",
      "/:w_day",
      "/:short_day",
      "/:long_day",
      "/:hour",
      "/:minute",
      "/:second",
      "/:title",
      "/:slug",
      "/:categories",
      "/:slugified_categories",
      "date",
      "pretty",
      "ordinal",
      "weekdate",
      "none",
      "/:categories/:year/:month/:day/:title:output_ext",
      "/:categories/:year/:month/:day/:title/",
      "/:categories/:year/:y_day/:title:output_ext",
      "/:categories/:year/:week/:short_day/:title:output_ext",
      "/:categories/:title:output_ext"
    ],
    "pattern": "^((/(:(year|short_year|month|i_month|short_month|long_month|day|i_day|y_day|w_year|week|w_day|short_day|long_day|hour|minute|second|title|slug|categories|slugified_categories))+)+|date|pretty|ordinal|weekdate|none)$"
  }
}