dbt - only insert/merge columns that exist in the source object - Jinja2

I'm new to dbt and Jinja but trying my best.
We have a scenario where, when using an incremental merge, our destination table schema differs from our source schema, so we'd like to update/insert only the common columns.
I'm loading my source columns into a variable and then passing it as a config value, like so:
{%- set src_cols = adapter.get_columns_in_relation(ref('pre_Dim_Entities_Client')) -%}
{{
    config(
        materialized='incremental',
        unique_key='Entity_ID',
        source_columns=src_cols
    )
}}
SELECT *
FROM {{ ref('pre_Dim_Entities_Client') }}
Then I overrode the merge macro (merge.sql):
{% macro default__get_merge_sql(target, source, unique_key, dest_columns, predicates) -%}
{%- set predicates = [] if predicates is none else [] + predicates -%}
{%- set temp = config.get('source_columns') -%}
{{- print(temp) -}}
{%- set dest_cols_csv = get_quoted_csv(config.get('source_columns', default = dest_columns) | map(attribute="name"))%}
{%- set update_columns = config.get('merge_update_columns', default = dest_columns | map(attribute="quoted") | list) -%}
{%- set sql_header = config.get('sql_header', none) -%}
I assigned the config value to a temp variable and printed it to confirm what I suspected was the issue: the columns are not passed through the source_columns config as I expected.
What am I doing wrong? Alternatively, is there a better way to go about this?

Self-answer here:
It turns out that passing variables in the config block is problematic.
Instead, I passed the source object's name as a config value, and in the merge macro I retrieved its columns and set dest_columns to them.
{{
    config(
        materialized='incremental',
        unique_key='Entity_ID',
        source_schema='pre_Dim_Entities_Client'
    )
}}
SELECT *
FROM {{ ref('pre_Dim_Entities_Client') }}
And in the merge macro:
{% macro default__get_merge_sql(target, source, unique_key, dest_columns, predicates) -%}
    {%- set predicates = [] if predicates is none else [] + predicates -%}
    {#- Name of the source model, passed in from the model's config block -#}
    {%- set src_name = config.get('source_schema', none) -%}
    {% if src_name %}
        {#- Use the source model's columns instead of the destination's -#}
        {%- set dest_columns = adapter.get_columns_in_relation(ref(src_name)) -%}
    {% endif %}
    {%- set dest_cols_csv = get_quoted_csv(dest_columns | map(attribute="name")) -%}
    {%- set update_columns = config.get('merge_update_columns', default = dest_columns | map(attribute="quoted") | list) -%}
    {%- set sql_header = config.get('sql_header', none) -%}
Note:
My solution assumes that all source columns exist in the destination table.
If you need only the columns common to source and destination, modify the code accordingly.
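For the common-columns case mentioned in the note, the selection logic could look roughly like the sketch below. It is plain Python rather than dbt macro code, the column names are made up for illustration, and the case-insensitive matching is an assumption (some warehouses treat quoted identifiers as case-sensitive). The function name common_columns is hypothetical.

```python
def common_columns(source_cols, dest_cols):
    """Return the source column names that also exist in the destination.

    Matching is case-insensitive (an assumption, see the text above).
    The order of the source columns is preserved.
    """
    dest_lookup = {name.lower() for name in dest_cols}
    return [name for name in source_cols if name.lower() in dest_lookup]


# Hypothetical column lists, for illustration only.
source = ["Entity_ID", "Name", "Created_At", "Staging_Only"]
dest = ["ENTITY_ID", "NAME", "CREATED_AT", "Legacy_Flag"]

shared = common_columns(source, dest)
print(shared)  # ['Entity_ID', 'Name', 'Created_At']
```

In the macro, the same idea would mean filtering the source columns against the dest_columns argument before building the CSV, instead of replacing dest_columns outright.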

Related

Cleaner way to turn string separated by commas into JSON using Shopify Liquid?

I've got this working, but I was wondering if there is a cleaner way to write it in Liquid. I'm pulling a string from the theme settings, which looks like this:
"keyOne: valueOne, keyTwo:ValueTwo, keyThree:ValueThree"
Then in my Liquid file I have this.
This gets the string and splits it on commas:
{% assign messageList = settings.message | split: ',' %}
This loops over messageList and, as long as the list has a length, splits each entry again on the colon and outputs the key and value wrapped in quotes. It then checks for the last iteration so that it doesn't add a trailing comma:
<script class="js-product-messages" type="application/json">
{
    {%- for messages in messageList -%}
        {%- if forloop.length > 0 -%}
            {%- assign singleMessage = messages | split:':' -%}
            "{{ singleMessage[0] | strip }}":"{{ singleMessage[1] | strip }}"{% unless forloop.last %},{% endunless -%}
        {%- endif -%}
    {%- endfor -%}
}
</script>
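For reference, the transformation the Liquid above performs can be written in plain Python as follows. This is only a sketch of the logic, not Liquid: parse_messages is a hypothetical name, and unlike the template it lets a JSON encoder handle quoting, so values containing quotes are escaped. It also splits on the first colon only, whereas the template keeps just the second piece.

```python
import json


def parse_messages(raw):
    """Turn 'key: value, key2: value2' into a dict of stripped strings."""
    messages = {}
    for pair in raw.split(","):
        if ":" not in pair:
            continue  # skip empty or malformed entries
        key, value = pair.split(":", 1)  # split on the first colon only
        messages[key.strip()] = value.strip()
    return messages


raw = "keyOne: valueOne, keyTwo:ValueTwo, keyThree:ValueThree"
parsed = parse_messages(raw)
print(json.dumps(parsed))
```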

Comparing timestamps gives type error in jinja2 template

I'm trying to compare two timestamps but I'm getting a type error:
"TypeError: '<' not supported between instances of 'float' and 'str'"
The template code uses as_timestamp, and the documentation recommends using float() when comparing.
I'm trying to get the name and date of the sensor whose date value is closest to now() (in Home Assistant).
So I'm trying to do the following:
{%- set list = ['sensor.1','sensor.2','sensor.3','sensor.4'] -%}
{%- set data = namespace(min_date=float(as_timestamp(now() + timedelta(days = 360))), min_name = '') -%}
{%- for i in list %}
    {%- set value = states(i) %}
    {%- if float(as_timestamp(value)) < float(data.min_date) %}
        {%- set data.min_date = value -%}
        {%- set data.min_name = state_attr(i, 'friendly_name') -%}
    {% endif %}
{%- endfor %}
{{ data.min_name + ' (' + data.min_date | timestamp_custom('%Y-%m-%dT%H:%M:%S%z') + ')' }}
Any ideas why data.min_date is a str in the comparison, despite the value being defined with both float() and as_timestamp()?
I had to remove as_timestamp from the initial value assignment and add it to the if statement. That solved it.
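One way to read the error: inside the loop, data.min_date is overwritten with the raw state string, so a later comparison ends up with a float on one side and a str on the other. The plain-Python sketch below shows the same loop with the stored minimum kept numeric throughout. It is not Home Assistant template code; earliest_sensor, the sensor names, and the date strings are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def earliest_sensor(states, now):
    """Return (name, timestamp) of the sensor with the smallest date.

    `states` maps sensor names to ISO 8601 strings (an assumption about
    the state format).
    """
    min_ts = (now + timedelta(days=360)).timestamp()  # upper bound, as a float
    min_name = ""
    for name, value in states.items():
        ts = datetime.fromisoformat(value).timestamp()
        if ts < min_ts:
            min_ts = ts  # store the number, not the raw string
            min_name = name
    return min_name, min_ts


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
states = {
    "sensor.1": "2024-06-01T08:00:00+00:00",
    "sensor.2": "2024-03-01T12:00:00+00:00",
    "sensor.3": "2024-09-15T00:00:00+00:00",
}
name, ts = earliest_sensor(states, now)
print(name, datetime.fromtimestamp(ts, tz=timezone.utc).isoformat())
```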

Append string to array and join back into string

I'm using Home Assistant templates, which are written in Jinja2.
I have a group of entities (states.group.doors) that have a battery_level attribute. I want to build an array of the entities with battery_level < min_battery_level and display it as a comma-separated string.
I can't figure out what's wrong with my syntax. Two questions:
1. Is there a better way overall to create a list filtered for battery_level < min_battery_level, rather than building an array like I am?
2. If not, then there must be something wrong with the way I am building this array. Can someone spot it?
Thanks for the help.
The following code does successfully detect battery_level < 98 and displays true if anything meets that criterion, so I'm almost there.
{% set min_battery_level = 98 -%}
{% set ns = namespace(found=false, entities=[]) -%}
{% set entities = [] -%}
{% for entity_id in states.group.doors.attributes.entity_id -%}
    {% set parts = entity_id.split('.') -%}
    {% if (state_attr(entity_id, 'battery_level') | replace("%","") | int) < min_battery_level -%}
        {% set ns.found = true -%}
        {% set entities = entities + [entity_id] -%}
    {% endif -%}
{% endfor -%}
{{ ns.found }}
{{ entities | join(' ') }}
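Welp... I kept playing with it and got it working by storing the list on the namespace object (in Jinja2, a plain variable set inside a for loop does not persist outside the loop):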
{% set min_battery_level = 98 -%}
{% set ns = namespace(found=false, entities = []) -%}
{% for entity_id in states.group.doors.attributes.entity_id -%}
    {% set name = state_attr(entity_id, 'friendly_name') | string -%}
    {% set battery = state_attr(entity_id, 'battery_level') | replace("%","") | int -%}
    {% if (battery) < min_battery_level -%}
        {% set ns.found = true -%}
        {% set ns.entities = ns.entities + [name+' ('+battery|string+'%)'] -%}
    {% endif -%}
{% endfor -%}
{{ ns.found }}
{{ ns.entities | join(', ') }}
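The filter-then-join logic of the working template, restated as a plain-Python sketch for comparison. The door names and battery values are invented, and low_batteries is a hypothetical helper. Note that Python's int() raises on bad input, whereas the template filter is more forgiving.

```python
def low_batteries(doors, min_battery_level):
    """Return 'Name (NN%)' labels for doors below the threshold."""
    low = []
    for name, level in doors.items():
        battery = int(str(level).replace("%", ""))  # accepts '85%', '42' or 100
        if battery < min_battery_level:
            low.append(f"{name} ({battery}%)")
    return low


# Hypothetical data standing in for the group's entities.
doors = {"Front Door": "85%", "Back Door": 100, "Garage Door": "42"}
summary = ", ".join(low_batteries(doors, 98))
print(summary)  # Front Door (85%), Garage Door (42%)
```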

Why does this where filter not work in Jekyll?

I have a GitHub Pages site whose _includes directory has a file courses.html:
{% assign id = include.lessonID | split: '.' %}
{% assign courseID = id | first %}
{% assign node = site.data.courses | where: "id","1" %}
{% assign node = node[1] %}
{%- if node.id == empty -%}
    <h1> EMPTY NODE Warning</h1>
{%- else -%}
    <h2> DATA Found! </h2>
    ID: {{ node.id }}
{%- endif -%}
<p>CourseID: {{node.id}}</p>
<p>Name: {{ node.name }}</p>
<p>Link: {{ node.permalink }}</p>
{%- for node in site.data.courses -%}
    {%- if node.id == 1 -%}
        <p>{{ node.name }}</p>
        <p>{{ node.permalink }}</p>
    {%- endif -%}
{%- endfor -%}
It is used by a file in _layouts, also called courses.html:
{% include courses.html post=page.lessonInfo.lessonID post=page %}
Finally, there's a file lister.md with the following contents:
---
layout: courses
title: 'Test'
lessonInfo:
  lessonID : 1.1
  modName: 'Installing RHEL Server'
  chapterName: 'Using Essential Tools'
---
# There should be some course list around here!
The output is as follows:
DATA Found!
ID:
CourseID:
Name:
Link:
RHCSA
/rhcsa
So apparently the node variable isn't empty, but I can't access any of its properties when I select the element of the array using the where clause.
However, the second part, which uses an if statement in a for loop, works. How do I fix the where clause?
Edit
The suggestions by @JJJ did solve my problem, but I now have a related one. I can't replace the constant 1 in the expression where: "id","1" with a variable! I tried the normal where clause (both with and without quotes), which didn't work. So I tried a where expression, which doesn't work either:
{% assign node = site.data.courses | where: "id",courseID %}
Doesn't work!
{% assign node = site.data.courses | where_exp: "selNode","selNode.id == courseID" %}
Neither does this.
What am I doing wrong and how do I fix it?
Firstly, as in most programming languages, arrays are zero-indexed. node[1] contains the second node, not the first one. You probably meant {% assign node = node[0] %} instead.
Secondly, "if node.id == empty" isn't how you check whether a value exists. Just use "unless node" or "node.size == 0".
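The two points in this answer, sketched in plain Python rather than Liquid. The where helper is a hypothetical stand-in for the filter; it compares values as strings so that 1 and "1" match, which is an assumption made for this sketch and not a statement about Jekyll's internals. The second course entry is made up.

```python
def where(items, key, value):
    """Return all items whose `key` equals `value` (compared as strings)."""
    return [item for item in items if str(item.get(key)) == str(value)]


courses = [
    {"id": 1, "name": "RHCSA", "permalink": "/rhcsa"},
    {"id": 2, "name": "RHCE", "permalink": "/rhce"},  # hypothetical entry
]

matches = where(courses, "id", "1")

# The filter returns a list, so the first match is at index 0, not 1.
# Check that the list is non-empty before indexing into it.
node = matches[0] if matches else None

if node is None:
    print("EMPTY NODE Warning")
else:
    print(node["name"], node["permalink"])  # RHCSA /rhcsa
```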

how to access pillar data with variables?

I have pillar data set up like this:
vlan_tag_id:
  nginx: 1
  apache: 2
  mp: 3
  redis: 4
In the formula SLS file I do this:
{% set tag = pillar.get('vlan_tag_id', 'u') %}
So now I have a variable tag, which is a dictionary: {'apache': 2, 'nginx': 1, 'redis': 4, 'mp': 3}
At run time I pass a pillar value app, whose value will be one of:
1. apache
2. nginx
3. redis
4. mp
So if I pass apache at run time, I want something that will get me the value 2.
I can't do {{ salt['pillar.get']('vlan_tag_id:app', '')}} because app is itself a variable.
I tried doing {{ salt'pillar.get'}}, but it throws an error.
How can I do this?
Since tag is just another dictionary, you can do a get on that as well:
{%- set tag = pillar.get('vlan_tag_id', 'u') %}
{%- set app = pillar.get('app') %}
{{ tag.get(app) }} # Note lack of quotes
If you want to use the colon syntax, you can append the contents of app to the key string:
{%- set app = pillar.get('app') %}
{{ salt['pillar.get']('vlan_tag_id:' + app) }}
I find it simpler to follow if I alias pillar.get and break it up a bit:
{%- set pget = salt['pillar.get'] %}
{%- set app = pget('app') %}
{%- set tag = pget('vlan_tag_id') %}
{{ tag.get(app) }}
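Both lookups above come down to ordinary dictionary access with a variable key. The plain-Python sketch below mimics that: pillar_get is a hypothetical stand-in that walks a colon-delimited key through nested dicts and is only meant to illustrate the idea, not to reproduce Salt's actual implementation.

```python
def pillar_get(pillar, key, default=""):
    """Walk a colon-delimited key through nested dicts (illustrative only)."""
    value = pillar
    for part in key.split(":"):
        if not isinstance(value, dict) or part not in value:
            return default
        value = value[part]
    return value


pillar = {
    "vlan_tag_id": {"nginx": 1, "apache": 2, "mp": 3, "redis": 4},
    "app": "apache",  # the value passed at run time
}

app = pillar_get(pillar, "app")

# Option 1: fetch the dictionary, then look up the variable key (no quotes).
print(pillar["vlan_tag_id"].get(app))  # 2

# Option 2: build the colon-delimited key from the variable.
print(pillar_get(pillar, "vlan_tag_id:" + app))  # 2

# The literal string 'app' is not a key under vlan_tag_id, so this misses.
print(pillar_get(pillar, "vlan_tag_id:app", "missing"))  # missing
```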