How to Convert a folder of csv files to json files within Jekyll - jekyll

Is it possible to convert a folder of csv files to json as part of a Jekyll workflow? I currently use a python script to do this but would like to do it entirely within Jekyll

You can call your Python script as part of the build process, but you'll have to write a tiny plugin to do it. This also assumes you're not using GitHub Pages, which does not run custom plugins.
Make a _plugins directory in your site root.
Inside that directory, create csv_to_json.rb.
In that Ruby file, call your Python script with the following code:
# The following line tells Jekyll to run everything between 'do' and 'end'
# when it finishes writing the site to disk:
Jekyll::Hooks.register :site, :post_write do |_site|
  # Backticks are one way to call shell commands from Ruby:
  `python your_script_here.py` # replace with the correct filename
end
This is untested code. Relevant documentation here and here
There are quite a few ways to do this, but I think that's the simplest for your case.
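For reference, the kind of script the hook would call might look like the sketch below. This is only an assumption about what the asker's script does: it writes one JSON file per CSV file, using the header row as the keys. The function name convert_folder and the folder names in the usage note are hypothetical.

```python
import csv
import json
from pathlib import Path


def convert_folder(src_dir, dest_dir):
    """Write one .json file per .csv file found in src_dir; return the written paths."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    # Sort so the conversion order does not depend on the filesystem.
    for csv_path in sorted(src.glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.DictReader(handle))  # header row supplies the keys
        json_path = dest / (csv_path.stem + ".json")
        json_path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
        written.append(json_path)
    return written
```

A call such as convert_folder("_data/csv", "_site/json") (both paths are placeholders) would then go at the bottom of the script. Because the hook fires after the site is written, writing into the output folder at that point should not be overwritten by that build.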

You can:
1 - Store your CSV files in _data/foldername (e.g. _data/members); see Jekyll's data files.
2 - Collect all your data in a new array with the concat filter:
{% comment %} ### Create an empty array{% endcomment %}
{% assign all-members = "" | split: "" %}
{% for part in site.data.members %}
{% assign all-members = all-members | concat: part[1] %}
{% endfor %}
3 - Output the data using the jsonify filter: {{ all-members | jsonify }}
All in one, a members.json file can look like:
---
layout: null
---
{% assign all-members = "" | split: "" %}
{%- for part in site.data.members %}
{% assign all-members = all-members | concat: part[1] %}
{% endfor -%}
{{ all-members | jsonify }}

Thank you to both of you. I have gone the plugin route as this is the easiest and I may need to develop another plugin at some point so I may as well learn how to make one.

Related

Does Jinja support variable assignment as a result of a loop?

I've been using Jinja and DBT for a month now, and despite reading a lot about it, I didn't quite figure out how to create a list from another, using a simple for loop like I would in Python.
Just a toy example:
{%- set not_wanted_columns = ['apple', 'banana'] -%}
{%- set all_columns = ['kiwi', 'peach', 'apple', 'banana', 'apricot', 'pineapple'] -%}
What I want is a list as so:
{% set filtered_columns = ['kiwi', 'peach', 'apricot', 'pineapple'] %}
Naturally, I don't want to manually write this result because the full list might be dynamic or too long. I'm not even sure if Jinja does actually support this, although I do think this is a common problem.
As you have probably read from the documentation:
Please note that assignments in loops will be cleared at the end of the iteration and cannot outlive the loop scope. Older versions of Jinja had a bug where in some circumstances it appeared that assignments would work. This is not supported.
Source: https://jinja.palletsprojects.com/en/3.1.x/templates/#for
And I guess when you are speaking about
using a simple for loop like I would in Python
What you mean here is using a list comprehension.
So, as shown in the documentation, Jinja uses filters to achieve this:
Example usage:
{{ numbers|select("odd") }}
{{ numbers|select("divisibleby", 3) }}
Similar to a generator comprehension such as:
(n for n in numbers if test_odd(n))
(n for n in numbers if test_divisibleby(n, 3))
Source: https://jinja.palletsprojects.com/en/3.1.x/templates/#jinja-filters.select
There are actually four filters that act as generator comprehensions:
reject
rejectattr
select
selectattr
So, in your case, a reject filter would totally do the trick:
{%- set filtered_columns = all_columns
| reject('in', not_wanted_columns)
| list
-%}
But, if you really want, you could also achieve it in a for loop, provided the list is created before the loop:
{%- set filtered_columns = [] -%}
{%- for column in all_columns if column not in not_wanted_columns -%}
{% do filtered_columns.append(column) %}
{%- endfor -%}
The do statement is a way to call list.append() without its return value being printed out (in plain Jinja it requires the jinja2.ext.do extension): https://jinja.palletsprojects.com/en/3.1.x/templates/#expression-statement.
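For comparison, here is the plain Python that both Jinja variants mirror, using the toy lists from the question. It is only an illustration of the equivalence, not something to put in a dbt model; the name appended_columns is made up for the second variant.

```python
not_wanted_columns = ["apple", "banana"]
all_columns = ["kiwi", "peach", "apple", "banana", "apricot", "pineapple"]

# Equivalent of: all_columns | reject('in', not_wanted_columns) | list
filtered_columns = [c for c in all_columns if c not in not_wanted_columns]

# Equivalent of the for/do variant: start from an empty list and append.
appended_columns = []
for column in all_columns:
    if column not in not_wanted_columns:
        appended_columns.append(column)

print(filtered_columns)  # ['kiwi', 'peach', 'apricot', 'pineapple']
```

The second variant works in Jinja only because append mutates an existing list; a set inside the loop would be discarded at the end of each iteration, as the quoted documentation says.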

dbt run-operation to call macros that have non-string arguments

Say that I want to drop the ajs schema as a clean up activity on our dev db, not as part of a regular dbt workflow
dbt run-operation drop_schema --args '{relation: ajs}'
perhaps I need to wrap drop_schema into another macro drop_schema_str(schema_str) where schema_str is the string of the schema and it is used to make a Relation object before invoking drop_schema()?
Create this macro:
{% macro drop_schema_str(schema) %}
  {% set relation = api.Relation.create(database=target.database, schema=schema) %}
  {% do drop_schema(relation) %}
{% endmacro %}
Then invoke it with:
dbt run-operation drop_schema_str --args '{schema: ajs}'

How to loop through all files in Jekyll's _data folder, and pull their filenames?

Similar to the question at How to loop through all files in Jekyll's _data folder?, how would one loop through the files in their /_data directory (or a subdirectory) and pull the filenames of each file?
For example, if you had:
_data/
  navigation.yml
  news.yml
  people/
    advisors.yml
    board.yml
    staff.yml
... and you wanted to get the list of files inside /_data/people/?
If you loop through the subdirectory, each for item (in this case "file") will have file[0] as the filename (without the extension) and file[1] as the content of the file.
Thus, your code can look like:
{% for file in site.data.people %}
{{ file[0] }}
{% endfor %}
This would result in:
advisors
board
staff

Jekyll: What is the default _data sorting criteria?

When iterating through an array of files in the _data folder, what is the default criteria for sorting the files?
At first I was expecting it to be sorted alphabetically, but after some testing I realized it was not. Still, I couldn't figure out what was the criteria being used to sort the files.
{%- for file in site.data.folder -%}
{{ file | inspect }}
<br />
<br />
{%- endfor -%}
From what I understood file is an array containing the filename as the first element and the data as the second element, so I'm not sure using sort with any property name would work. When I tried I had the error message:
Liquid Exception: no implicit conversion of String into Integer
When using sort with no arguments, I could return the files sorted by filename alphabetic order:
{%- assign files = site.data.folder | sort -%}
{%- for file in files -%}
{{ file | inspect }}
<br />
<br />
{%- endfor -%}
So my questions are:
What is the default sorting criteria for _data files?
Is sorting in relation to an object property possible? (I'm thinking the issue with that one is having an array and not the pure objects when you access site.data.folder)
Example:
After creating the default Jekyll page, I created the _data/folder directory, where I'd include 5 random .json files:
_data/folder/a.json
_data/folder/b.json
_data/folder/c.json
_data/folder/d.json
_data/folder/e.json
The files have the following content:
_data/folder/a.json:
{"name":"Mike"}
_data/folder/b.json:
{"id":"4343"}
_data/folder/c.json:
[{"age":"29"},{"job":"journalist"}]
_data/folder/d.json:
{"name":"John"}
_data/folder/e.json:
{"haircolor":"green"}
With those files in place, I created a page named page.html on the root directory with:
---
---
<pre>{{ site.data.folder | inspect }}</pre>
<br />
<br />
{%- for file in site.data.folder -%}
<pre>{{ file | inspect }}</pre>
<br />
{%- endfor -%}
And the output of that page was:
{"e"=>{"haircolor"=>"green"}, "c"=>[{"age"=>"29"}, {"job"=>"journalist"}], "d"=>{"name"=>"John"}, "a"=>{"name"=>"Mike"}, "b"=>{"id"=>"4343"}}
["e", {"haircolor"=>"green"}]
["c", [{"age"=>"29"}, {"job"=>"journalist"}]]
["d", {"name"=>"John"}]
["a", {"name"=>"Mike"}]
["b", {"id"=>"4343"}]
The files were not ordered alphabetically, but instead in some apparently random order. I can get them in alphabetical order by using:
---
---
<pre>{{ site.data.folder | sort | inspect }}</pre>
<br />
<br />
{%- assign folder = site.data.folder | sort -%}
{%- for file in folder -%}
<pre>{{ file | inspect }}</pre>
<br />
{%- endfor -%}
Output:
[["a", {"name"=>"Mike"}], ["b", {"id"=>"4343"}], ["c", [{"age"=>"29"}, {"job"=>"journalist"}]], ["d", {"name"=>"John"}], ["e", {"haircolor"=>"green"}]]
["a", {"name"=>"Mike"}]
["b", {"id"=>"4343"}]
["c", [{"age"=>"29"}, {"job"=>"journalist"}]]
["d", {"name"=>"John"}]
["e", {"haircolor"=>"green"}]
But it's still unclear what is the ordering criteria on the call without sort.
Going from @ashmaroli's assumption that this was not a Jekyll issue, I did a little research on file ordering and ran into the following resources:
File ordering behavior while using Dir on Ruby
Indeterministic File order using Dir
The link describes counterintuitive behavior when loading multiple dependencies. If the order in which the files are loaded matters, the shortcut below could load them in a different order than expected.
Dir[File.join(File.dirname(__FILE__), 'example/*.rb')].each{ |f| require f }
This is apparently due to the underlying glob system call according to the answer in the link.
Python glob ordering
How is Pythons glob.glob ordered?
In the SO question above, the user is asking why the returned glob file order in Python is different than the order on the output of ls -l. Even though the question is about Python and not Ruby, the underlying call to the OS is likely the same. The OS is not required to deliver the files in any order, so they should be sorted after the call.
The first answer states that if you run ls -U you get the unordered list of files, which matches the order I have here when I make a list of _data objects on Jekyll without sorting. So this is most likely the cause of the weird ordering: it's OS dependent.
Since Jekyll orders the _post files, I think it wouldn't be a major issue to order _data files by default as well, to avoid any confusion. But as it was stated before in the question itself, it can be easily done with the sort filter.
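The same point can be seen from Python, whose documentation states that os.listdir returns entries in arbitrary order. The sketch below uses a temporary folder with the five file names from the question; it illustrates the OS-level behavior only and is not Jekyll's own code.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create the files in a deliberately non-alphabetical order.
    for name in ["e.json", "c.json", "d.json", "a.json", "b.json"]:
        with open(os.path.join(folder, name), "w", encoding="utf-8") as handle:
            handle.write("{}")

    raw_order = os.listdir(folder)    # whatever order the OS hands back
    stable_order = sorted(raw_order)  # deterministic on every platform

print(stable_order)  # ['a.json', 'b.json', 'c.json', 'd.json', 'e.json']
```

What raw_order contains depends on the filesystem, which is why no output is shown for it; sorting after the call is the portable fix, just like the sort filter in Liquid.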

How can I test jinja2 templates in ansible?

Sometimes I need to test some jinja2 templates that I use in my ansible roles. What is the simplest way for doing this?
For example, I have a template (test.j2):
{% if users is defined and users %}
{% for user in users %}{{ user }}
{% endfor %}
{% endif %}
and vars (in group_vars/all):
---
users:
- Mike
- Smith
- Klara
- Alex
There are currently four options:
1_Online (using https://cryptic-cliffs-32040.herokuapp.com/). Based on jinja2-live-parser code.
2_Interactive (using python and library jinja2, PyYaml)
>>> import yaml
>>> from jinja2 import Template
>>> template = Template("""
... {% if users is defined and users %}
... {% for user in users %}{{ user }}
... {% endfor %}
... {% endif %}
... """)
>>> # safe_load avoids the Loader argument that newer PyYAML requires for yaml.load
>>> values = yaml.safe_load("""
... ---
... users:
... - Mike
... - Smith
... - Klara
... - Alex
... """)
>>> print(template.render(values))
Mike
Smith
Klara
Alex
3_Ansible (using --check)
Create a test playbook, jinja2test.yml:
---
- hosts: 127.0.0.1
  tasks:
    - name: Test jinja2template
      template: src=test.j2 dest=test.conf
and run it:
ansible-playbook jinja2test.yml --check --diff --connection=local
sample output:
PLAY [127.0.0.1] **************************************************************
GATHERING FACTS ***************************************************************
ok: [127.0.0.1]
TASK: [Test jinja2template] ***************************************************
--- before: test.conf
+++ after: /Users/user/ansible/test.j2
@@ -0,0 +1,4 @@
+Mike
+Smith
+Klara
+Alex
changed: [127.0.0.1]
PLAY RECAP ********************************************************************
127.0.0.1 : ok=2 changed=1 unreachable=0 failed=0
4_Ansible (using -m template), thanks to @artburkart
Make a file called test.txt.j2
{% if users is defined and users %}
{% for user in users %}
{{ user }}
{% endfor %}
{% endif %}
Call ansible like so:
ansible all -i "localhost," -c local -m template -a "src=test.txt.j2 dest=./test.txt" --extra-vars='{"users": ["Mike", "Smith", "Klara", "Alex"]}'
It will output a file called test.txt in the current directory, which will contain the output of the evaluated test.txt.j2 template.
I understand this doesn't directly use a vars file, but I think it's the simplest way to test a template without using any external dependencies. Also, I believe there are some differences between what the jinja2 library provides and what ansible provides, so using ansible directly circumvents any discrepancies. When the JSON that is fed to --extra-vars satisfies your needs, you can convert it to YAML and be on your way.
If you have a jinja2 template called test.j2 and a vars file located at group_vars/all.yml, then you can test the template with the following command:
ansible all -i localhost, -c local -m template -a "src=test.j2 dest=./test.txt" --extra-vars=@group_vars/all.yml
It will output a file called test.txt in the current directory, which will contain the output of the evaluated test.j2 template.
I think this is the simplest way to test a template without using any external dependencies. Also, there are differences between what the jinja2 library provides and what ansible provides, so using ansible directly circumvents any discrepancies. It's also possible to test ad-hoc variables without making an additional vars file by using JSON:
ansible all -i "localhost," -c local -m template -a "src=test.j2 dest=./test.txt" --extra-vars='{"users": ["Mike", "Smith", "Klara", "Alex"]}'
You can use the debug module:
tasks:
  - name: show templating results
    debug:
      msg: "{{ lookup('template', 'template-test.j2') }}"
Disclaimer - I am the author of this, but I put together JinjaFx (https://github.com/cmason3/jinjafx).
This is a Python based tool that allows you to pass Jinja2 templates with a YAML file for variables. I originally wrote it so it can pass CSV based data to generate group_vars and host_vars for our deployments, but it also allows easy testing of Jinja2 templates - there is an online version at https://jinjafx.io
I needed to verify that the template I had defined gave the right result for the server it was created for. (The template included the hostname as a variable and other per host defined variables.)
None of the methods above worked for me. The solution was to add
check_mode: yes
diff: yes
to the task that runs the template module. This showed me the difference between the generated file and the file actually on the server, without changing the remote file.
For me this worked better than looking at the whole generated file, since the changes were the interesting part anyway.
It needs to log in to the remote machine, so the use case is limited.
Example of a complete command:
- name: diff server.properties
  check_mode: yes
  diff: yes
  ansible.builtin.template:
    src: "src.properties"
    dest: "/opt/kafka/config/server.properties"