Feed Jinja list with results of SQL query - jinja2

I'm new to DBT and Jinja and wondering if it is possible to dynamically define a list using a SQL query. Instead of manually declaring the items of the list like:
{% set myOrders = [123,234, 345, 456, 567] %}
Define the list with a SQL query, something like this:
{% set myOrders = SELECT DISTINCT OrderNum FROM OrdersTable ORDER BY OrderNum %}
Is this possible?
Thanks!

Yes! Not quite as you've written it, but this is supported.
First, a note that this is inherently difficult because DBT typically runs in two phases:
templates are compiled to make actual SQL queries (i.e. all the Jinja gets executed)
the compiled SQL queries are executed
But there is a flag, execute, that is true only during the execution phase. Wrapping code in {% if execute %} lets you work with query results that do not exist yet while the templates are being compiled.
Straightforwardly adapting the example in the docs for your use case:
{% set my_orders_query %}
SELECT DISTINCT OrderNum
FROM {{ ref('OrdersTable') }}
ORDER BY OrderNum
{% endset %}
{% set rows = run_query(my_orders_query) %}
{% if execute %}
{# Return the first column #}
{% set myOrders = rows.columns[0].values() %}
{% else %}
{% set myOrders = [] %}
{% endif %}
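
Once myOrders is populated, it behaves like any other Jinja list. As a minimal sketch, placed in the same model after the block above, it could drive a loop like this. The pivot-style query and the order_ column aliases are illustrative assumptions, not part of the docs example, and the sketch assumes OrderNum is numeric:

```jinja
select
    {% for order_num in myOrders %}
    {# one counting column per distinct order number #}
    sum(case when OrderNum = {{ order_num }} then 1 else 0 end) as order_{{ order_num }}{% if not loop.last %},{% endif %}
    {% endfor %}
from {{ ref('OrdersTable') }}
```

While the project is being parsed, myOrders is an empty list, so the loop renders nothing. The query is only run in the execution phase, when the list holds the real values.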


Condition based on the three first letters of a string?

In my Jinja template, the model.DataType value can be user-defined or built in. My requirement is: if model.DataType starts with the three letters ARR, then do a specific operation.
Example of values:
ARRstruct124
ARR_int123
ARR123123
CCHAR
UUINT
etc.
{% set evenDataType = model.eventDataType %}
{%if evenDataType | regex_match('^ARR', ignorecase=False) %}
// do the operation
{%else%}
// do the operation
{% endif %}
With this template, I am getting the error
{%if evenDataType | regex_match('^ARR', ignorecase=False) %}
jinja2.exceptions.TemplateAssertionError: no filter named 'regex_match'
There is indeed no regex_match filter among the Jinja built-in filters. You might have found some examples using it, but it is an additional filter provided by Ansible, so it won't work outside of Ansible.
That said, your requirement does not need a regex: you can use the startswith() method of a Python string.
So your template should be:
{% set evenDataType = model.eventDataType %}
{% if evenDataType.startswith('ARR') %}
`evenDataType` starts with 'ARR'
{% else %}
`evenDataType` does not start with 'ARR'
{% endif %}
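
Since the question is about the first three letters specifically, a slice comparison is another option. This is a small sketch that reuses the evenDataType variable from the template above and relies on Jinja accepting Python-style slices on strings:

```jinja
{% set evenDataType = model.eventDataType %}
{# compare only the first three characters; the match is case-sensitive #}
{% if evenDataType[:3] == 'ARR' %}
`evenDataType` starts with 'ARR'
{% else %}
`evenDataType` does not start with 'ARR'
{% endif %}
```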

How to set loop.index as a variable in Jinja

{% for data in data_list %}
{% set data_index = {{loop.index}} %}
for data_dict in data:
pass
In my inner loop, I need to use the loop index in the outer loop, so I intend to set it to a variable as above. But the syntax is invalid.
How to do that? Or is there another way to get the outer loop index?
I think you should not use expressions ({{ ... }}) inside statements ({% ... %}). Try this:
{% for data in data_list %}
{% set data_index = loop.index %}
for data_dict in data:
pass
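You could use Python's built-in enumerate function to get the index as the variable i and also use it in an inner loop if you want. Note that enumerate counts from 0, whereas loop.index counts from 1, and that enumerate is not available in Jinja templates unless you pass it in.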
{% for i,data in enumerate(data_list) %}
{{ i }}
{% for j in range(i) %}
{% endfor %}
{% endfor %}
All you need to do is pass enumerate, or whatever built-in Python function you need, as a parameter to the render_template function, as shown below:
@app.get("/foo")
def foo():
    return render_template("foo.html", enumerate=enumerate, range=range)
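
Another way to reach the outer index from an inner loop, without passing anything extra to the template, is to alias the outer loop object. The sketch below assumes each item of data_list is itself iterable; outer_loop is just an illustrative name:

```jinja
{% for data in data_list %}
  {# keep a handle on the outer loop before the inner loop shadows "loop" #}
  {% set outer_loop = loop %}
  {% for data_dict in data %}
    outer index: {{ outer_loop.index }}, inner index: {{ loop.index }}
  {% endfor %}
{% endfor %}
```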

How to use the if statement in jinja2 template

I want to check if a condition in the linked database is true and then execute some code, but I am getting an error such as
jinja2.exceptions.TemplateSyntaxError: expected token ':', got '}'
{% for prod in prod %}
{% if {{prod.sh}} is 1 %}
<pre>Lines to come if true</pre>
{% endif %}
{% endfor %}
The extensive Jinja2 documentation can be found at https://jinja.palletsprojects.com/en/2.11.x/templates/
About your code - I spot two problems.
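You should not use for prod in prod. Give the loop variable and the collection different names, for example for product in products.
You do not need curly braces around prod.sh inside an {% if %} statement. Inside a statement, {{ is parsed as the start of a dict literal, which is why the error complains about a missing ':'. The {{ ... }} syntax is only for printing a value into the output, for example directly within the HTML.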
So a working code could look like:
{% for product in products %}
{% if product.sh == 1 %}
<pre>Lines to come if true</pre>
{% endif %}
{% endfor %}
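
If the loop body only applies to matching rows, the condition can also be moved into the for statement itself. This is a sketch that assumes the collection is passed to the template under the hypothetical name products:

```jinja
{# the loop skips every product whose sh attribute is not 1 #}
{% for product in products if product.sh == 1 %}
<pre>Lines to come if true</pre>
{% endfor %}
```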

dbt jinja returning the results of a query

I am trying to model the following situation:
given some query, return a multi-column result set (e.g. with run_query or dbt_utils.get_query_results_as_dict)
iterate over it in a case statement
for example:
{% set conditions = dbt_utils.get_query_results_as_dict("select comment, criteria from "
~ ref('the_model')) %}
...
select case
{% for condition in conditions %}
when {{ condition["criteria"] }}
then {{ condition["comment"] }}
{% endfor %}
Have not been able to get this to work, any guidance appreciated.
Some ideas I tried:
get_column_values x2 and zipping them into a new list of tuples. zip not recognised
get the count(*) from the_model then trying to iterate over the range - ran into issues with types
various for conditions {% for k, v in conditions.items() %}
Was able to self resolve with the following:
{% set conditions = dbt_utils.get_query_results_as_dict("select criteria, comment from " ~ ref('reference_data') ~ " order by sequence desc") %}
with main as (
select * from {{ ref('my_other_model') }}
),
-- [NEEDS_REVIEW] there's probably a cleaner way to do this iteration - however it's interpolated result. Could do with the zip function.
comments as (
select
*,
case
{# {{- log(conditions, info=True) -}} #}
{%- for comment in conditions.COMMENT -%}
when {{ conditions.CRITERIA[loop.index0] }}
then '{{ comment }}'
{% endfor %}
end as comment
from main
)
select * from comments
The gotchas:
This was on Snowflake, so the keys returned by the function are upper-cased, as that is how I loaded the data.
loop.index0 gives the current (zero-based) iteration of the loop, which is used to index into the other column's tuple of values (in this case CRITERIA).
I added a SEQUENCE key to my reference data to ensure consistent rendering by ordering on it. The criteria overlap a little, so this was important.
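
An alternative that avoids indexing two parallel collections is to iterate over the rows returned by run_query. This is only a sketch: it assumes the returned table exposes its rows through .rows and that each row can be indexed by position (which sidesteps the upper-casing of column names), and it reuses the model names from the solution above:

```jinja
{% set conditions_query %}
    select criteria, comment
    from {{ ref('reference_data') }}
    order by sequence desc
{% endset %}

{% set results = run_query(conditions_query) %}

select
    *,
    case
    {# results only exists in the execution phase, hence the guard #}
    {% if execute %}
    {% for row in results.rows %}
        when {{ row[0] }} then '{{ row[1] }}'
    {% endfor %}
    {% endif %}
    end as comment
from {{ ref('my_other_model') }}
```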

Jinja2 for loop behaving similarly to with

I'd like to iterate over a set of objects and find the maximum of one particular attribute, however jinja2 ignores any action within an iterator on a variable declared outside of the iterator. For example:
{% set maximum = 1 %}
{% for datum in data %}
{% if datum.frequency > 1 %}
{% set maximum = datum.frequency %}
{% endif %}
{% endfor %}
{# maximum == 1 #}
datum.frequency is definitely greater than 1 for some datum in data.
EDIT (solution)
This is similar to this post, but there's a bit more to it. The following works and is very ugly.
{% set maximum = [1] %}
{% for datum in data %}
{% if datum.frequency > maximum[-1] %}
{% if maximum.append( datum.frequency ) %}{% endif %}
{% endif %}
{% endfor %}
{% set maximum = maximum[-1] %}
Have you considered writing a custom filter to return the highest value of a particular attribute within your collection? I prefer to minimize the amount of logic I use in Jinja2 templates as part of maintaining a 'separation of concerns'.
Here is a link to a very good example of how one can be written in python:
Custom jinja2 filter for iterator
Once your filter returns the value you need, apply it to the collection with '|', like so:
{% set maximum = data|filtername %}
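
If a custom filter is more than you need, the scoping problem from the question can also be handled inside the template. The sketch below assumes Jinja 2.10 or later, where both namespace objects and the max filter are available, and uses the data and frequency names from the question:

```jinja
{# a namespace object survives assignments made inside the loop #}
{% set ns = namespace(maximum=1) %}
{% for datum in data %}
  {% if datum.frequency > ns.maximum %}
    {% set ns.maximum = datum.frequency %}
  {% endif %}
{% endfor %}
{{ ns.maximum }}

{# or skip the loop entirely with built-in filters #}
{% set maximum = data | map(attribute='frequency') | max %}
```

The two variants differ slightly: the namespace version never drops below the starting value of 1, while the filter version returns the largest frequency actually present in data.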