Can I use one packer builder with many provisioners and still run parallel builds? - packer

I have what seems like a valid use case for an unsupported - afaik - scenario, using packer.io, and I'm worried I might be missing something...
So, in Packer, I can:
add many builders,
give each builder a different name,
use the builder name in the only section of the provisioners, and finally
run packer build -only=<builder_name> to effectively limit my build to only the provisioners combined with the specific builder.
This is all fine.
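For illustration, the pattern I mean looks roughly like this (a minimal sketch; the builder names, image and inline commands are made up):

{
  "builders": [
    { "type": "docker", "name": "web", "image": "ubuntu:16.04", "commit": true },
    { "type": "docker", "name": "db", "image": "ubuntu:16.04", "commit": true }
  ],
  "provisioners": [
    { "type": "shell", "inline": ["echo web-specific setup"], "only": ["web"] },
    { "type": "shell", "inline": ["echo db-specific setup"], "only": ["db"] }
  ]
}

Running packer build -only=web then builds just the web image with its provisioners.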
What I am now trying to do, is use the same base image to create 3 different builds (and resulting AMIs). Obviously, I could just copy-paste the same builder config 3 times and then use 3 different provisioners, linking each to the respective builder, using the only parameter.
This feels totally wasteful and very error-prone though... It sounds like I should be able to use the same builder and just limit which provisioners are applied...?
Is my only solution to use 3 copy-pasted builders? Is there any better solution?

I had the same issue, where I wanted to build 2 different AMIs (one for staging, one for production) and the only difference between them was the ansible group to apply during provisioning. Building off the answer by @Rickard ov Essen, I wrote a bash script using jq to duplicate the builder section of the config.
Here's my packer.json file:
{
  "builders": [
    {
      "type": "amazon-ebs",
      "name": "staging",
      "region": "ap-southeast-2",
      "source_ami_filter": {
        "filters": {
          "virtualization-type": "hvm",
          "name": "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*",
          "root-device-type": "ebs"
        },
        "owners": ["099720109477"],
        "most_recent": true
      },
      "instance_type": "t2.nano",
      "ssh_username": "ubuntu",
      "force_deregister": true,
      "force_delete_snapshot": true,
      "ami_name": "my-ami-{{ build_name }}"
    }
  ],
  "provisioners": [
    {
      "type": "ansible",
      "playbook_file": "provisioning/site.yml",
      "groups": ["{{ build_name }}"]
    }
  ]
}
The ansible provisioner uses the variable build_name to choose which ansible group to run.
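For reference, provisioning/site.yml can then key its plays off those groups, along these lines (a sketch; the role names are made up):

- hosts: staging
  roles:
    - base
    - staging_overrides

- hosts: production
  roles:
    - base
    - production_overrides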
Then I have a bash script build.sh which runs the packer build:
#!/bin/bash
jq '.builders += [.builders[0] | .name = "production"]' packer.json > packer_temp.json
packer build packer_temp.json
rm packer_temp.json
You can see what the packer_temp.json file looks like on this jqplay.
If you need to add more AMIs you can just keep adding more jq filters:
jq '.builders += [.builders[0] | .name = "production"] | .builders += [.builders[0] | .name = "test"]' packer.json > packer_temp.json
This will add another AMI for test.

only filters on builder names, so that is not an option.
You could solve this with any of these approaches:
Preprocess the json and create 3 templates from one.
Use a template with a user variable defining which build it is and build 3 times. Use conditions on the variable in your scripts to run the correct scripts.
Build a base AMI with the common parts of the template and then run 3 different builds on top of it, provisioning the differences.
In general, Packer tries to solve one thing well; by not including an advanced DSL for describing different build flavours, the scope decreases. It's easy to preprocess and create the json for more advanced use cases.
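For example, the second approach might be sketched like this (the variable name and script names are made up). The template passes a user variable through to a single dispatcher script:

"provisioners": [
  {
    "type": "shell",
    "execute_command": "sh '{{.Path}}' '{{user `flavor`}}'",
    "scripts": ["scripts/dispatch.sh"]
  }
]

and scripts/dispatch.sh branches on its argument:

#!/bin/sh
# $1 is the flavor passed in from the template's user variable
case "$1" in
  web)    sh scripts/web.sh ;;
  worker) sh scripts/worker.sh ;;
  *)      echo "unknown flavor: $1" >&2; exit 1 ;;
esac

The three builds then become packer build -var 'flavor=web' template.json and so on.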

Related

jq-win64.exe: Parsing data from a JSON file in Windows Batch File

I have the following JSON file (song.json) that contains:
{
  "Result": [
    {
      "ItemTitle": "Sometimes It Hurts",
      "Artists": [
        "Voost"
      ],
      "MediaEnd": "00:02:15.8490000",
      "Extro": "00:02:12.8200000",
      "MediaId": 9551,
      "ActualLength": "00:02:12.8200000",
      "ItemType": "Song"
    },
    {
      "ItemTitle": "Been a Long Time (Full Intention 2021 Remix)",
      "Artists": [
        "The Fog"
      ],
      "MediaEnd": "00:03:11.3170000",
      "IntroEnd": "00:00:07.4700000",
      "Extro": "00:03:08.6300000",
      "MediaId": 9489,
      "ActualLength": "00:03:08.6300000",
      "ItemType": "Song"
    }
  ],
  "ExceptionMessage": null,
  "FailMessage": null,
  "ExceptionTypeName": null
}
I want to extract the first “ItemTitle” and the first “Artist” and save them as variables.
In this example the result I am looking for would be:
ItemTitle=Sometimes It Hurts
Artist=Voost
I have been trying to use jq-win64.exe as this needs to run in a Windows Batch File, but I can’t get the syntax right. I have tried various examples that I have found on here but none of them appears to work as required. Can anyone suggest a solution?
First things first.
Since you seem to be having trouble on several fronts, I would suggest first getting your jq query working the way you want
by using jq with the -f command-line option. That way, you can write your query without having to worry about
Windows rules for escaping characters on the command line. When the results are as you wish, you might even decide to
leave well enough alone.
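For example, if the query is saved in a file named query.jq (name assumed), it can be run without any command-line escaping:

jq-win64.exe -r -f query.jq song.json

(The -r option outputs raw strings rather than JSON-quoted ones.)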
Next, to obtain the values you want, it would seem you will want a jq query like this:
.Result | first(.[].ItemTitle), first(.[].Artists[])
With your JSON, this produces:
Sometimes It Hurts
Voost
But you say you want these values in the KEY=VALUE form. This can be achieved by modifying the basic query, e.g. as follows:
.Result|"ItemTitle=\(first(.[].ItemTitle))", "Artist=\(first(.[].Artists[]))"
Somehow I doubt this is really what you want, but the rest should be clear sailing.
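If you do want those KEY=VALUE lines as batch variables, one possible (untested) sketch, reusing the query.jq file from above, is:

@echo off
rem split each KEY=VALUE output line on the first "=" and set it as a variable
for /f "tokens=1* delims==" %%A in ('jq-win64.exe -r -f query.jq song.json') do set "%%A=%%B"
echo %ItemTitle%
echo %Artist%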

Using jq to combine multiple JSON files

First off, I am not an expert with JSON files or with JQ. But here's my problem:
I am simply trying to download card data (for the MtG card game) through an API, so I can use it in my own spreadsheets etc.
The card data from the API comes in pages, since there is so much of it, and I am trying to find a nice command line method in Windows to combine the files into one. That will make it nice and easy for me to use the information as external data in my workbooks.
The data from the API looks like this:
{
  "object": "list",
  "total_cards": 290,
  "has_more": true,
  "next_page": "https://api.scryfall.com/cards/search?format=json&include_extras=false&order=set&page=2&q=e%3Alea&unique=cards",
  "data": [
    {
      "object": "card",
      "id": "d5c83259-9b90-47c2-b48e-c7d78519e792",
      "oracle_id": "c7a6a165-b709-46e0-ae42-6f69a17c0621",
      "multiverse_ids": [
        232
      ],
      "name": "Animate Wall",
      ......
    },
    {
      "object": "card",
      ......
    }
  ]
}
Basically I need to take what's inside the "data" part from each file after the first, and merge it into the first file.
I have tried a few examples I found online using jq, but I can't get it to work. I think it might be because in this case the data is sort of under an extra level, since there is some basic information, then the "data" category is beneath it. I don't know.
Anyway, any help on how to get this going would be appreciated. I don't know much about this, but I can learn quickly so even any pointers would be great.
Thanks!
To merge the .data elements of all the responses into the first response, you could run:
jq 'reduce inputs.data as $s (.; .data += $s)' page1.json page2.json ...
Alternatives
You could use the following filter in conjunction with the -n command-line option:
reduce inputs as $s (input; .data += ($s.data))
Or if you simply want an object of the form {"data": [ ... ]} then (again assuming you invoke jq with the -n command-line option) the following jq filter would suffice:
{data: [inputs.data] | add}
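Invoked from the command line, those alternatives would look like this (file names assumed):

jq -n 'reduce inputs as $s (input; .data += ($s.data))' page1.json page2.json > combined.json
jq -n '{data: [inputs.data] | add}' page1.json page2.json > combined.json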
Just to provide closure, @peak provided the solution. I am using it in conjunction with the method found here for using wildcards in batch files to address multiple files. The code looks like this now:
set expanded_list=
for /f "tokens=*" %%F in ('dir /b /a:-d "All Cards\!setname!_*.json"') do call set expanded_list=!expanded_list! "All Cards\%%F"
jq-win32 "reduce inputs.data as $s (.; .data += $s)" !expanded_list! > "All Cards\!setname!.json"
All the individual pages for each card set are named "setname"_"pagenumber".json
The code finds all the pages for each set and combines them into one variable which I can pass into jq.
Thanks again!

packer templates and conditional statements

I'd like to use conditional statements in the packer template at the "provisioners" stage.
"provisioners": [
{
"execute_command": "echo 'vagrant'|sudo -S sh '{{.Path}}'",
"override": {
"virtualbox-iso": {
"scripts": [
"scripts/base.sh",
"scripts/puppet.sh",
]
}
},
"type": "shell",
}
]
For instance, if the user, at the packer build command line, somehow specifies a "puppet" parameter, then "scripts/puppet.sh" will be executed, otherwise skipped.
How can I do that?
I don't think that this is possible with Packer's native template format, because Packer uses JSON for configuration, which, as far as I know, does not support flow-control mechanisms like conditional statements. But it should be possible to achieve similar behaviour with user variables and the shell provisioner.
The idea
The easiest way should be to set a user variable from the build command and pass this variable from Packer to the shell provisioner script, which detects the value of the user variable and calls the appropriate provisioner script, e.g. puppet, salt, ...
Disclaimer:
I didn't test this approach, but it should give you a hint to what I mean and maybe you can come up with an even better solution. ;-)
The problem solving approach
1. Define the variable which indicates the used provisioner:
There are multiple ways to define a user variable,
by calling the packer build command with the -var flag (see end of answer)
define the user variables in the box-template file
The packer box template file: **template.json**
"variables": [
"using_provision_system": "puppet"
]
"provisioners": [
{...}
define a variable definition file and specify the path to it in the build command with -var-file
The variable file: **variables.json**
This is a great alternative if you want to define variables in a separate file.
{
  "using_provision_system": "puppet"
}
2. Calling the shell script which calls the provisioner scripts:
Now modify the execute_command in a way that the 'master' script is called with the defined variable as argument.
"provisioners": [
{
"execute_command": "echo 'vagrant'|sudo -S sh '{{.Path}}' '{{user `using_provision_system`}}'",
"override": {
"virtualbox-iso": {
"scripts": [
call_provisioner_script.sh
]
}
},
"type": "shell",
}
]
Notice, we only need to specify one script. This one 'master' script takes the passed variable as argument and compares the value with some predefined provisioner names in a switch condition.
(Short: It chooses which provisioner scripts will be executed.)
master provision script: call_provisioner_script.sh
case $1 in
  puppet*) sh puppet_provisioner.sh ;;
  *) sh shell_provisioner.sh ;;
esac
Take care!
Because this script will run inside your box, you might need to upload the scripts to the box before this command gets executed.
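A file provisioner placed before the shell provisioner can handle that upload, for example (the paths here are made up):

{
  "type": "file",
  "source": "scripts/",
  "destination": "/tmp/scripts"
}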
3. Last step is to build your box :)
Calling Packer's build command:
#define variable in build command (first option from above)
$ packer build -var 'using_provision_system=puppet' template.json
#define variables in variable file and set path in build command
$ packer build -var-file=variables.json template.json
Instead of using user variables, there might also be a way to set environment variables in your box and use that type of variable to specify the provisioner.
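For instance, the shell provisioner's environment_vars option can forward the user variable as an environment variable (a sketch):

{
  "type": "shell",
  "environment_vars": ["USING_PROVISION_SYSTEM={{user `using_provision_system`}}"],
  "scripts": ["call_provisioner_script.sh"]
}

The script would then branch on $USING_PROVISION_SYSTEM instead of $1.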
An alternative way is to define 2 different builders of the same type but with different names. You can then exclude a provisioning step from a specific build using the only field:
{
  "source": "foo.txt",
  "destination": "/opt/foo.txt",
  "type": "file",
  "only": ["docker-local"]
}
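The matching builders section would then contain two entries differing only in name, something like this (a sketch; the docker settings are made up):

"builders": [
  { "type": "docker", "name": "docker-local", "image": "ubuntu:16.04", "commit": true },
  { "type": "docker", "name": "docker-remote", "image": "ubuntu:16.04", "commit": true }
]

The file provisioner above would then run only in the docker-local build.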

Build system for typescript using different flags

I want to use the different flags (sourcemap, out, target) that the TypeScript compiler provides. I am trying to define a build system in Sublime Text 2 but am unable to do so.
I have already read this question.
Basically I want to do something like the following:
tsc src/main/ts/myModule.ts --out src/main/js/myModule.js --sourcemap --target ES5
Just add them to the cmd array:
{
  "cmd": ["tsc", "$file", "--out", "src/main/js/myModule.js", "--sourcemap", "--target", "ES5"],
  "file_regex": "(.*\\.ts?)\\s\\(([0-9]+)\\,([0-9]+)\\)\\:\\s(...*?)$",
  "selector": "source.ts",
  "osx": {
    "path": "/usr/local/bin:/opt/local/bin"
  }
}
First of all let me say that I'm using Sublime Text 3 on Windows and Typescript 1.0.
I don't think that SublimeText2 is so much different, though...
If you're on similar conditions, take a look at my current configuration file:
{
  "cmd": ["tsc", "$file"],
  "file_regex": "(.*\\.ts?)\\s*\\(([0-9]+)\\,([0-9]+)\\)\\:\\s(.+?)$",
  "selector": "source.ts",
  "windows": {
    "cmd": ["tsc.cmd", "$file", "--target", "ES5"]
  }
}
Please note that I tweaked the regex so that it matches the TSC error format (and brings you to the line containing the error when you double-click it in the error log...).
Besides that, I think the command line which actually gets run is the lower one: as a matter of fact, I had it working only by placing the options down there... (in this specific case I'm requesting ES5 compilation; your parameters will differ).
This assumes you have a tsc.cmd available on your PATH; if not, put the full path of tsc.cmd or tsc.exe instead of "tsc.cmd" and be sure to escape backslashes \ as \\...
This works in my situation; maybe in other contexts the options should also be placed on the first line...
Hope this helps :)

Is there any way in Elasticsearch to get results as CSV file in curl API?

I am using Elasticsearch.
I need results from Elasticsearch as a CSV file.
Any curl URL or any plugins to achieve this?
I've done just this using cURL and jq ("like sed, but for JSON"). For example, you can do the following to get CSV output for the top 20 values of a given facet:
$ curl -X GET 'http://localhost:9200/myindex/item/_search?from=0&size=0' -d '
{
  "from": 0,
  "size": 0,
  "facets": {
    "sourceResource.subject.name": {
      "global": true,
      "terms": {
        "order": "count",
        "size": 20,
        "all_terms": true,
        "field": "sourceResource.subject.name.not_analyzed"
      }
    }
  },
  "sort": [
    {
      "_score": "desc"
    }
  ],
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      }
    }
  }
}' | jq -r '.facets["sourceResource.subject.name"].terms[] | [.term, .count] | @csv'
"United States",33755
"Charities--Massachusetts",8304
"Almshouses--Massachusetts--Tewksbury",8304
"Shields",4232
"Coat of arms",4214
"Springfield College",3422
"Men",3136
"Trees",3086
"Session Laws--Massachusetts",2668
"Baseball players",2543
"Animals",2527
"Books",2119
"Women",2004
"Landscape",1940
"Floral",1821
"Architecture, Domestic--Lowell (Mass)--History",1785
"Parks",1745
"Buildings",1730
"Houses",1611
"Snow",1579
I've used Python successfully, and the scripting approach is intuitive and concise. The ES client for python makes life easy. First grab the latest Elasticsearch client for Python here:
http://www.elasticsearch.org/blog/unleash-the-clients-ruby-python-php-perl/#python
Then your Python script can include calls like:
import elasticsearch
import csv

es = elasticsearch.Elasticsearch(["10.1.1.1:9200"])
# this returns up to 500 rows, adjust to your needs
res = es.search(index="YourIndexName", body={"query": {"match": {"title": "elasticsearch"}}}, size=500)
sample = res['hits']['hits']

# then open a csv file, and loop through the results, writing to the csv
with open('outputfile.tsv', 'wb') as csvfile:
    # we use TAB delimited, to handle cases where freeform text may have a comma
    filewriter = csv.writer(csvfile, delimiter='\t', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    # create column header row
    filewriter.writerow(["column1", "column2", "column3"])  # change the column labels here
    for hit in sample:
        # fill columns 1, 2, 3 with your data
        col1 = hit["some"]["deeply"]["nested"]["field"].decode('utf-8')  # replace these nested key names with your own
        col1 = col1.replace('\n', ' ')
        # col2 = , col3 = , etc...
        filewriter.writerow([col1, col2, col3])
You may want to wrap the calls to the column['key'] references in try / catch error handling, since documents are unstructured, and may not have the field from time to time (depends on your index).
I have a complete Python sample script using the latest ES python client available here:
https://github.com/jeffsteinmetz/pyes2csv
You can use the elasticsearch head plugin.
Once you have the plugin installed, it is available at:
http://localhost:9200/_plugin/head/
Navigate to the structured query tab, provide the query details, and you can select 'csv' format from the 'Output Results' dropdown.
I don't think there is a plugin that will give you CSV results directly from the search engine, so you will have to query ElasticSearch to retrieve results and then write them to a CSV file.
Command line
If you're on a Unix-like OS, then you might be able to make some headway with es2unix which will give you search results back in raw text format on the command line and so should be scriptable.
You could then dump those results to a text file or pipe them to awk or similar to format as CSV. There is a -o flag available, but it only gives 'raw' format at the moment.
Java
I found an example using Java - but haven't tested it.
Python
You could query ElasticSearch with something like pyes and write the results set to a file with the standard csv writer library.
Perl
Using Perl then you could use Clinton Gormley's GIST linked by Rakesh - https://gist.github.com/clintongormley/2049562
Shameless plug. I wrote estab - a command line program to export elasticsearch documents to tab-separated values.
Example:
$ export MYINDEX=localhost:9200/test/default/
$ curl -XPOST $MYINDEX -d '{"name": "Tim", "color": {"fav": "red"}}'
$ curl -XPOST $MYINDEX -d '{"name": "Alice", "color": {"fav": "yellow"}}'
$ curl -XPOST $MYINDEX -d '{"name": "Brian", "color": {"fav": "green"}}'
$ estab -indices "test" -f "name color.fav"
Brian green
Tim red
Alice yellow
estab can handle export from multiple indices, custom queries, missing values, list of values, nested fields and it's reasonably fast.
If you are using Kibana (app/discover in general), you can make your query in the UI, then save it and use Share -> CSV Reports. This creates a CSV with a line for each record; columns are comma-separated.
I have been using stash-query (https://github.com/robbydyer/stash-query) for this.
I find it quite convenient and it works well, though I struggle with the install every time I redo it (this is due to me not being very fluent with gems and Ruby).
On Ubuntu 16.04 though, what seemed to work was:
apt install ruby
sudo apt-get install libcurl3 libcurl3-gnutls libcurl4-openssl-dev
gem install stash-query
and then you should be good to go. In order, these commands install Ruby, install the curl dependencies for Ruby (the stash-query tool works via the REST API of Elasticsearch), and install stash-query.
This blog post describes how to build it as well:
https://robbydyer.wordpress.com/2014/08/25/exporting-from-kibana/
You can use elasticsearch2csv, a small and effective Python 3 script that uses the Elasticsearch scroll API and handles big query responses.
You can use this GIST. It's simple.
It's in Perl and you can get some help from it.
Please download it and see the usage on GitHub. Here is the link:
GIST GitHub
Or if you want it in Java, then go for elasticsearch-river-csv:
elasticsearch-river-csv