How does Airflow decide to render template values?

I am working with Airflow 2.2.3 in GCP (Composer) and I am seeing inconsistent behavior which I can't explain when trying to use template values.
When I reference the templated value directly, it works without issue:
ts = '{{ ds }}' # results in 2022-05-09
When I reference the templated value in a function call, it doesn't work as expected:
ts_parts = '{{ ds }}'.split('-') # result ['2022-05-09']
The non-function call value is rendered without any issues, so it doesn't have any dependency on operator scope. There are examples here that show rendering outside of an operator, so I expect that not to be the issue. It's possible that Composer has a setting configured so that Airflow applies rendering to all Python files.
Here's the full code for reference
dag.py
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG('rendering_test',
         description='Testing template rendering',
         schedule_interval=None,  # only run on demand
         start_date=datetime(2020, 11, 10), ) as rendering_dag:
    ts = '{{ ds }}'
    ts_parts = '{{ ds }}'.split('-')
    literal_parts = '2022-05-09'.split('-')

    print_gcs_info = BashOperator(
        task_id='print_rendered_values',
        bash_command=f'echo "ts: {ts}\nts_parts: {ts_parts}\nliteral_parts {literal_parts}"'
    )
I thought that Airflow writes the files to some location with template values, then runs Jinja against them with some supplied values, then runs the resulting Python code. It looks like there is some logic applied if the line contains a function call? The documentation mentions none of these architectural principles and gives very limited examples.

Airflow does not render values outside of operator scope.
Rendering is a part of task execution, which means it is a step that happens only when the task is on the worker (after being scheduled).
In your code the rendering is top-level code which is not part of an operator's templated fields, thus Airflow considers it a regular string.
In your case, .split('-') is executed on the literal string '{{ ds }}' before it is ever rendered; the same would happen to, say, os.path.dirname() called on '{{ dag_run.conf.name }}'.
To fix your issue you need to put the Jinja string in the templated fields of the operator.
bash_command=""" echo "path: {{ dag_run.conf.name }} path: os.path.dirname('{{ dag_run.conf.name }}')" """
Triggering the DAG with {"name": "value"} will give: path: value path: os.path.dirname('value')
Note that if you wish to use an f-string with Jinja strings, you must double the number of braces ({ }):
source_file_path = '{{ dag_run.conf.name }}'

print_template_info = BashOperator(
    task_id='print_template_info',
    bash_command=f""" echo "path: { source_file_path } path: os.path.dirname('{{{{ dag_run.conf.name }}}}')" """
)
Edit:
Let me clarify - Airflow templates fields as part of task execution.
You can see in the code base that Airflow invokes render_templates before it invokes pre_execute() and before it invokes execute(). This means that this step happens only when the task is running on a worker. Templating outside of an operator is never part of a running task, so the templating step never runs on it.
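For the original question, the cleanest fix is to move the split inside the Jinja string, so it is evaluated at render time on the worker. A minimal sketch, assuming Airflow 2.x (the DAG and task ids here are made up; Jinja templates can call Python string methods):
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical DAG illustrating render-time evaluation.
with DAG('rendering_fix',
         schedule_interval=None,
         start_date=datetime(2020, 11, 10)) as dag:
    print_rendered_values = BashOperator(
        task_id='print_rendered_values',
        # The split lives inside the template, so it runs on the rendered
        # date string, e.g. ['2022', '05', '09'], not on the raw '{{ ds }}'.
        bash_command='echo "ts: {{ ds }} ts_parts: {{ ds.split(\'-\') }}"',
    )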

Related

How to pass Pulumi's Output<T> to the container definition of a task within ecs?

A containerDefinition within a Task Definition needs to be provided as a single valid JSON document. I'm creating a generic ECS service that should handle dynamic data. Here is the code:
genericClientService(environment: string, targetGroupArn: Output<string>) {
  return new aws.ecs.Service(`${this.domainName}-client-service-${environment}`, {
    cluster: this.clientCluster.id,
    taskDefinition: new aws.ecs.TaskDefinition(`${this.domainName}-client-${environment}`, {
      family: `${this.domainName}-client-${environment}`,
      containerDefinitions: JSON.stringify(
        clientTemplate(
          this.defaultRegion,
          this.domainName,
          this.taskEnvVars?.filter((object: { ENVIRONMENT: string }) => object.ENVIRONMENT === environment),
          this.ecrRepositories
        )
      ),
      cpu: "256",
      executionRoleArn: taskDefinitionRole.arn,
      memory: "512",
      networkMode: "awsvpc",
      requiresCompatibilities: ["FARGATE"],
    }).arn,
    desiredCount: 1,
    ...
The template needs information from an already built resource, this.ecrRepositories, which represents a list of ECR repositories. The problem: if you retrieve a repository URL via the necessary apply() method, it returns an Output<string>. Normally this would be fine, but since containerDefinitions needs to be a valid JSON document, Pulumi can't handle it, because calling JSON-related functions on an Output<T> is not supported:
Calling [toJSON] on an [Output<T>] is not supported. To get the value of an Output as a JSON value or JSON string consider either: 1: o.apply(v => v.toJSON()) 2: o.apply(v => JSON.stringify(v)) See https://pulumi.io/help/outputs for more details. This function may throw in a future version of @pulumi/pulumi.
Neither of the suggested considerations will work, as the dynamically passed variables are wrapped within a toJSON function callback. Because of this it won't matter how you pass resource information, since it will always be an Output<T>.
Is there a way to deal with this issue?
Assuming clientTemplate works correctly and the error happens in the snippet that you shared, you should be able to solve it with
containerDefinitions: pulumi.all(
  clientTemplate(
    this.defaultRegion,
    this.domainName,
    this.taskEnvVars?.filter((object: { ENVIRONMENT: string }) => object.ENVIRONMENT === environment),
    this.ecrRepositories
  )).apply(JSON.stringify),
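The key is that pulumi.all lifts the Outputs buried inside the definitions into a single Output of resolved values, so JSON.stringify only ever sees plain strings. A stripped-down sketch of the pattern (the repository and container names here are made up):
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

// An ECR repository whose URL is only known as an Output<string>.
const repo = new aws.ecr.Repository("app-repo");

// pulumi.all combines the Outputs; apply runs once they are resolved,
// so JSON.stringify receives plain values instead of Outputs.
const containerDefinitions = pulumi
  .all([repo.repositoryUrl])
  .apply(([repositoryUrl]) =>
    JSON.stringify([
      {
        name: "app",
        image: `${repositoryUrl}:latest`,
        essential: true,
      },
    ])
  );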

Terraform - Iterate over a List of Objects in a Template

I'm having issues iterating over a list of objects within a template interpreted by the templatefile function.
I have the following var:
variable "destinations" {
description = "A list of EML Channel Destinations."
type = list(object({
id = string
url = string
}))
}
This is passed in to the templatefile function as destinations. The snippet of template relevant is this:
Destinations:
%{ for dest in destinations ~}
  - Id: ${dest.id}
    Settings:
      URL: ${dest.url}
%{ endfor }
When planning Terraform this gives an error of:
Error: "template_body" contains an invalid YAML: yaml: line 26: did not find expected key
I have tried switching the template code to the following:
Destinations:
%{ for id, url in destinations ~}
  - Id: ${id}
    Settings:
      URL: ${url}
%{ endfor }
Which gives a different error:
Call to function "templatefile" failed:
../../local-tfmodules/eml/templates/eml.yaml.tmpl:25,20-23: Invalid template
interpolation value; Cannot include the given value in a string template:
string required., and 2 other diagnostic(s).
[!] something went wrong when creating the environment TF plan
I get the impression my iterating over the data type here is somehow incorrect but I cannot fathom how and I cannot find any docs about this at all.
Here is a cut down example of how I'm calling this module:
module "eml" {
source = "../../local-tfmodules/eml"
name = "my_eml"
destinations = [
{
id = "6"
url = "https://example.com"
},
{
id = "7"
url = "https://example.net"
}
]
<cut>
}
I've just found (after crafting a small Terraform module to test templatefile output only) that the original config DOES work (at least in TF v0.12.29).
The errors given are a bit of a red herring - the issue is to do with indentation within the template, e.g. instead of:
Destinations:
%{ for destination in destinations ~}
  - Id: ${destination.id}
    Settings:
      URL: ${destination.url}
%{ endfor ~}
it should be:
Destinations:
%{~ for destination in destinations ~}
  - Id: ${destination.id}
    Settings:
      URL: ${destination.url}
%{~ endfor ~}
Notice the extra tildes (~) at the beginning of the Terraform directives. They strip the whitespace and newlines around each directive, which is what makes the YAML alignment come out correctly (without them you get some lines incorrectly indented and some blank lines). After this change the original code in my question works as I expected and produces valid YAML.
You can't pass var.destinations as a list of maps to the template. It must be a list/set of strings.
But you could do the following:
templatefile("eml.yaml.tmpl",
{
ids = [for v in var.destinations: v.id]
urls = [for v in var.destinations: v.url]
}
)
where eml.yaml.tmpl is
Destinations:
%{ for id, url in zipmap(ids, urls) ~}
  - Id: ${id}
    Settings:
      URL: ${url}
%{ endfor ~}
Since you are aiming to generate a YAML result, I suggest following the advice in the templatefile documentation about generating JSON or YAML from a template.
Using the yamlencode function will guarantee that the result is always valid YAML, without you having to worry about correctly positioning newlines or quoting/escaping strings that might contain special characters.
Write your templatefile call like this:
templatefile("${path.module}/templates/eml.yaml.tmpl", {
destinations = var.destinations
})
Then, in the eml.yaml.tmpl, make the entire template be the result of calling yamlencode, like this:
${yamlencode({
  Destinations = [
    for dest in destinations : {
      Id = dest.id
      Settings = {
        URL = dest.url
      }
    }
  ]
})}
Notice that the argument to yamlencode is Terraform expression syntax rather than YAML syntax, because in this case it's Terraform's responsibility to do the YAML encoding, and all you need to do is provide a suitable value for Terraform to encode, following the mappings from Terraform types to YAML types given in the yamldecode documentation.
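With the module inputs shown in the question, the rendered output should come out roughly like the following (yamlencode quotes strings and controls indentation itself, so the exact formatting may vary slightly by Terraform version):
"Destinations":
- "Id": "6"
  "Settings":
    "URL": "https://example.com"
- "Id": "7"
  "Settings":
    "URL": "https://example.net"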

How to convert Pulumi Output<T> to string?

I am dealing with creating an AWS API Gateway. I am trying to create a CloudWatch Log group and name it API-Gateway-Execution-Logs_${restApiId}/${stageName}. I have no problem with the Rest API creation.
My issue is in converting restApi.id, which is of type pulumi.Output, to string.
I have tried these two versions, which are proposed in their PR#2496:
const restApiId = apiGatewayToSqsQueueRestApi.id.apply((v) => `${v}`);
const restApiId = pulumi.interpolate `${apiGatewayToSqsQueueRestApi.id}`
Here is the code where it is used:
const cloudWatchLogGroup = new aws.cloudwatch.LogGroup(
  `API-Gateway-Execution-Logs_${restApiId}/${stageName}`,
  {},
);
stageName is just a string.
I have also tried to apply again, like
const restApiIdString = restApiId.apply((v) => v);
I always get this error from pulumi up:
aws:cloudwatch:LogGroup API-Gateway-Execution-Logs_Calling [toString] on an [Output<T>] is not supported.
Please help me convert Output to string
@Cameron answered the naming question; I want to answer your question in the title.
It's not possible to convert an Output<string> to string, or any Output<T> to T.
Output<T> is a container for a future value T which may not be resolved even after the program execution is over. Most likely, your restApiId is generated by AWS at deployment time, so if you run your program in preview, there's no value for restApiId yet.
Output<T> is like a Promise<T> which will be eventually resolved, potentially after some resources are created in the cloud.
Therefore, the only operations with Output<T> are:
Convert it to another Output<U> with apply(f), where f: T -> U
Assign it to an Input<T> to pass it to another resource constructor
Export it from the stack
Any value manipulation has to happen within an apply call.
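A short sketch of those three operations, using a hypothetical S3 bucket as the resource:
import * as pulumi from "@pulumi/pulumi";
import * as aws from "@pulumi/aws";

const bucket = new aws.s3.Bucket("my-bucket");

// 1. Convert it to another Output<U> with apply(f)
const upperName: pulumi.Output<string> = bucket.id.apply((id) => id.toUpperCase());

// 2. Assign it to an Input<T> of another resource
const logBucket = new aws.s3.Bucket("logs", {
  tags: { source: bucket.id },
});

// 3. Export it from the stack
export const bucketId = bucket.id;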
So long as the Output is resolvable while the Pulumi script is still running, you can use an approach like the below:
import { Output } from "@pulumi/pulumi";
import * as gcp from "@pulumi/gcp";
import * as fs from "fs";

// create a GCP registry
const registry = new gcp.container.Registry("my-registry");
const registryUrl = registry.id.apply(_ => gcp.container.getRegistryRepository().then(reg => reg.repositoryUrl));

// create a GCP storage bucket
const bucket = new gcp.storage.Bucket("my-bucket");
const bucketURL = bucket.url;

function GetValue<T>(output: Output<T>) {
  return new Promise<T>((resolve, reject) => {
    output.apply(value => {
      resolve(value);
    });
  });
}

(async () => {
  fs.writeFileSync("./PulumiOutput_Public.json", JSON.stringify({
    registryURL: await GetValue(registryUrl),
    bucketURL: await GetValue(bucketURL),
  }, null, "\t"));
})();
To clarify, this approach only works when you're doing an actual deployment (ie. pulumi up), not merely a preview. (as explained here)
That's good enough for my use-case though, as I just want a way to store the registry-url and such after each deployment, for other scripts in my project to know where to find the latest version.
Short Answer
You can specify the physical name of your LogGroup by specifying the name input and you can construct this from the API Gateway id output using pulumi.interpolate. You must use a static string as the first argument to your resource. I would recommend using the same name you're providing to your API Gateway resource as the name for your Log Group. Here's an example:
const apiGatewayToSqsQueueRestApi = new aws.apigateway.RestApi("API-Gateway-Execution");

const cloudWatchLogGroup = new aws.cloudwatch.LogGroup(
  "API-Gateway-Execution", // this is the logical name and must be a static string
  {
    name: pulumi.interpolate`API-Gateway-Execution-Logs_${apiGatewayToSqsQueueRestApi.id}/${stageName}` // this is the physical name and can be constructed from other resource outputs
  },
);
Longer Answer
The first argument to every resource type in Pulumi is the logical name and is used for Pulumi to track the resource internally from one deployment to the next. By default, Pulumi auto-names the physical resources from this logical name. You can override this behavior by specifying your own physical name, typically via a name input to the resource. More information on resource names and auto-naming is here.
The specific issue here is that logical names cannot be constructed from other resource outputs. They must be static strings. Resource inputs (such as name) can be constructed from other resource outputs.
Encountered a similar issue recently. Adding this for anyone that comes looking.
For Pulumi Python, some policies require the input to be stringified JSON. Say you're writing an SQS queue and a DLQ for it; you may initially write something like this:
import json

import pulumi_aws

dlq = pulumi_aws.sqs.Queue()

queue = pulumi_aws.sqs.Queue(
    redrive_policy=json.dumps({
        "deadLetterTargetArn": dlq.arn,
        "maxReceiveCount": "3"
    })
)
The issue we see here is that the json lib errors out, stating that type Output cannot be parsed. When you print() dlq.arn, you see a memory address for it, like <pulumi.output.Output object at 0x10e074b80>.
In order to work around this, we have to leverage Output.all and write a callback function:
import json

import pulumi_aws
from pulumi import Output

def render_redrive_policy(arn):
    return json.dumps({
        "deadLetterTargetArn": arn,
        "maxReceiveCount": "3"
    })

dlq = pulumi_aws.sqs.Queue()

queue = pulumi_aws.sqs.Queue(
    redrive_policy=Output.all(arn=dlq.arn).apply(
        lambda args: render_redrive_policy(args["arn"])
    )
)
)

How to capture an attribute from a random JSON index in serverless artillery

In Artillery, how can I capture the attribute of a random index in a JSON array returned from a GET, so my subsequent POSTs are evenly distributed across the resources?
https://artillery.io/docs/http-reference/#extracting-and-reusing-parts-of-a-response-request-chaining
I'm using serverless artillery to run a load test, which under the hood uses artillery.io.
A lot of my scenarios look like this:
- get:
    url: "/resource"
    capture:
      json: "$[0].id"
      as: "resource_id"
- post:
    url: "/resource/{{ resource_id }}/subresource"
    json:
      body: "Example"
Get a list of resources, and then POST to one of those resources.
As you can see, I am using capture to capture an ID from the JSON response. My problem is that it is always getting the id from the first index of the array.
This means that in my load test I end up absolutely battering one single resource rather than hitting them evenly, which would be a more realistic scenario.
I would like to be able to do something like:
capture:
  json: "$[RANDOM].id"
  as: "resource_id"
but I have been unable to find anything in the JSONPath definition that would allow me to do so.
Define a setResourceId function in custom JS code, and tell Artillery to load your custom code by setting config.processor to the JS file path:
processor: "./custom-code.js"
- get:
url: "/resource"
capture:
json: "$"
as: "resources"
- function: "setResourceId"
- post:
url: "/resource/{{ resourceId }}/subresource"
json:
body: "Example"
The custom-code.js file contains the function below:
function setResourceId(context, next) {
  const randomIndex = Math.round(Math.random() * context.vars.resources.length);
  context.vars.resourceId = context.vars.resources[randomIndex].id;
}
Using this version:
------------ Version Info ------------
Artillery: 1.7.9
Artillery Pro: not installed (https://artillery.io/pro)
Node.js: v14.6.0
OS: darwin/x64
The answer above didn't work for me.
I got more info from here, and got it working with the following changes:
function setResourceId(context, events, done) {
  const randomIndex = Math.round(Math.random() * (context.vars.resources.length - 1));
  context.vars.resourceId = context.vars.resources[randomIndex].id;
  return done();
}

module.exports = {
  setResourceId: setResourceId
}

Index.js file continuously gives "JSON text did not start with array" despite being formatted as an array

I have a parse-server hosted by Heroku, which has an index.js file used for its configuration. I want to use Mailgun to set up functionality for the user to request a password reset, and I have set up the config file, following this answer, as follows:
var api = new ParseServer({
  appName: 'App Name',
  publicServerURL: 'https://<name>.herokuapp.com/parse',
  databaseURI: databaseUri || 'mongodb://localhost:27017/dev',
  cloud: process.env.CLOUD_CODE_MAIN || __dirname + '/cloud/main.js',
  appId: process.env.APP_ID || 'myAppId',
  masterKey: process.env.MASTER_KEY || '', //Add your master key here. Keep it $
  serverURL: process.env.SERVER_URL || 'http://localhost:1337/parse', // Don't$
  liveQuery: {
    classNames: ["Posts", "Comments"] // List of classes to support for query s$
  },
  push: JSON.parse(process.env.SERVER_PUSH || "{}"),
  verifyUserEmails: true, //causing errors
  emailAdapter: { //causing errors
    module: 'parse-server-simple-mailgun-adapter',
    options: {
      fromAddress: 'parse@example.com',
      domain: '<domain>',
      apiKey: '<key>',
    }
  }
});
This code does not work, though, because of the verifyUserEmails and emailAdapter. Removing both of them removes the "JSON text did not start with array" error. Adding either one of them back in results in the error being thrown. I have no idea why, since I do not see any obvious reason why they wouldn't be set up correctly.
Do I need to set up the cooresponding config vars in heroku in addition to having them in the config file? I considered this, but appName and publicServerURL are not set up in this way and don't give this error.
emailAdapter.options.apiKey doesn't need a comma at the end, since it's the last element of its object.
I wouldn't be surprised if you're also leaving in the comma at the end of verifyUserEmails when you include it improperly as well.
options: {
  fromAddress: 'parse@example.com',
  domain: '<domain>',
  apiKey: '<key>',
}
This is not valid JSON, because there is a comma at the end of the apiKey line. The last item in a JSON object does not have a comma.
For anyone that is repeatedly running into this issue, I have figured out exactly what was going wrong. Despite the error informing me that my JSON was incorrectly formatted, it turns out that the module was misnamed. According to this post, the updated module has been renamed to '@parse/simple-mailgun-adapter'. Inserting this into index.js, after ensuring I had run npm install --save @parse/simple-mailgun-adapter in my local repo, fixed the issue.
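Putting that together, the relevant part of index.js would look roughly like this (a sketch, with the question's placeholders kept and @parse/simple-mailgun-adapter installed):
var api = new ParseServer({
  // ...other options as in the question...
  verifyUserEmails: true,
  emailAdapter: {
    module: '@parse/simple-mailgun-adapter', // renamed module, not parse-server-simple-mailgun-adapter
    options: {
      fromAddress: 'parse@example.com',
      domain: '<domain>',
      apiKey: '<key>'
    }
  }
});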