We are building a service. It has to read config from a file. We are currently using YAML and Jackson for deserializing the YAML. We have a situation where our YAML file needs to inherit/extend one or more other YAML files. E.g., something like:
extends: base.yaml
appName: my-awesome-app
...
so part of the config is stored in base.yaml. Is there any library that supports this? Bonus points if it allows inheriting from more than one file. We could switch to JSON instead of YAML.
Neither JSON nor YAML has the ability to include files. Whatever you do will be a pre-processing step that puts base.yaml and your actual file together.
A crude way of doing this would be:
#include base.yaml
appName: my-awesome-app
Let this be your file. Upon loading, you first read the first line, and if it starts with #include, you replace it with the content of the included file. You need to do this recursively. This is basically what the C preprocessor does with C files and includes.
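As an illustration only, here is an untested Java sketch of that line-based expansion; it scans every line rather than just the first, and wraps the IOException in an UncheckedIOException purely to keep the lambda simple.

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Collectors;

// Recursively replace "#include <file>" lines with the (already expanded)
// content of the referenced file, then hand the resulting text to the YAML parser.
static String expandIncludes(String path) throws IOException {
    return Files.readAllLines(Paths.get(path)).stream()
            .map(line -> {
                if (line.startsWith("#include ")) {
                    try {
                        return expandIncludes(line.substring("#include ".length()).trim());
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
                return line;
            })
            .collect(Collectors.joining("\n"));
}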
Drawbacks are:
even if both files are valid YAML, the result may not be.
if either file includes a directives end or document end marker (--- or ...), you will end up with two separate documents in one file.
you cannot replace any values from base.yaml inside your file.
So an alternative would be to actually operate on the YAML structure. For this, you need the API of the YAML parser (SnakeYAML in your case) and parse your file with that. You should use the compose API:
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringWriter;
import java.util.List;

import org.yaml.snakeyaml.DumperOptions;
import org.yaml.snakeyaml.Yaml;
import org.yaml.snakeyaml.emitter.Emitter;
import org.yaml.snakeyaml.nodes.MappingNode;
import org.yaml.snakeyaml.nodes.Node;
import org.yaml.snakeyaml.nodes.NodeTuple;
import org.yaml.snakeyaml.nodes.ScalarNode;
import org.yaml.snakeyaml.resolver.Resolver;
import org.yaml.snakeyaml.serializer.Serializer;

private Node preprocess(final Reader myInput) throws IOException {
    final Yaml yaml = new Yaml();
    final Node node = yaml.compose(myInput);
    processIncludes(node);
    return node;
}

private void processIncludes(final Node node) throws IOException {
    if (node instanceof MappingNode) {
        final List<NodeTuple> values = ((MappingNode) node).getValue();
        for (final NodeTuple tuple : values) {
            if ("!include".equals(tuple.getKeyNode().getTag().getValue())) {
                final String includedFilePath =
                        ((ScalarNode) tuple.getValueNode()).getValue();
                final Node content = preprocess(new FileReader(includedFilePath));
                // now merge the content in your preferred way into the values list;
                // that will change the content of the node.
            }
        }
    }
}

public String executePreprocessor(final Reader source) throws IOException {
    final Node node = preprocess(source);
    final StringWriter writer = new StringWriter();
    final DumperOptions dOptions = new DumperOptions();
    final Serializer ser = new Serializer(new Emitter(writer, dOptions),
            new Resolver(), dOptions, null);
    ser.open();
    ser.serialize(node);
    ser.close();
    return writer.toString();
}
This code would parse includes like this:
!include : base.yaml
appName: my-awesome-app
I used the private tag !include so that there will not be name clashes with any normal mapping key. Mind the space between !include and the colon. I didn't give code to merge the included file because I did not know how you want to handle duplicate mapping keys. It should not be hard to implement, though. Be aware of bugs; I have not tested this code.
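If it helps, here is a minimal, untested sketch of one possible merge policy (the policy itself is my assumption, not part of the approach described above): keys already present in the including file win over keys coming from base.yaml. It needs java.util.Set and java.util.HashSet added to the imports above, and the caller would also want to remove the !include tuple itself from the values list afterwards.

private void merge(final MappingNode target, final MappingNode included) {
    // Collect the scalar keys that the including file already defines.
    final List<NodeTuple> targetTuples = target.getValue();
    final Set<String> existingKeys = new HashSet<>();
    for (final NodeTuple tuple : targetTuples) {
        if (tuple.getKeyNode() instanceof ScalarNode) {
            existingKeys.add(((ScalarNode) tuple.getKeyNode()).getValue());
        }
    }
    // Copy over only those tuples from the included file whose key is not present yet.
    for (final NodeTuple tuple : included.getValue()) {
        final Node key = tuple.getKeyNode();
        final boolean alreadyPresent = key instanceof ScalarNode
                && existingKeys.contains(((ScalarNode) key).getValue());
        if (!alreadyPresent) {
            targetTuples.add(tuple);
        }
    }
}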
The resulting String can be the input to Jackson.
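For instance, a minimal sketch of that hand-off, assuming the jackson-dataformat-yaml module is on the classpath; Config is a hypothetical POJO matching the expanded YAML structure.

import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;

public Config loadConfig(final String path) throws IOException {
    final String expandedYaml;
    try (Reader reader = new FileReader(path)) {
        // resolve all !include directives first, then let Jackson bind the result
        expandedYaml = executePreprocessor(reader);
    }
    final ObjectMapper mapper = new ObjectMapper(new YAMLFactory());
    return mapper.readValue(expandedYaml, Config.class);
}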
I created the tool jq-front for much the same need.
You can do it with the following syntax, combined with the yq command:
extends: [ base.yaml ]
appName: my-awesome-app
...
$ yq -j . your.yaml | jq-front | yq -y .
Note that you need to place the file names to be extended in an array, since the tool supports multiple inheritance.
Points you may not like:
It's quite slow. (But for configuration data it might be OK, since you can convert it to an expanded file once and your system will no longer need the original one after that.)
Objects inside an array may not behave as expected, since the tool relies on the * operator of jq.
I am looking at extracting the root element of a JSON document. It looks like this is possible neither with JsonPointer nor with JsonPath, as my attempts to find such an expression have been unsuccessful. Any tips would be appreciated. TIA.
Sample document:
{
"MESSAGE1_ROOT_INPUT": {
"CTRL_SEG": "test"
}
}
The below using gson 2.9.0:
$.*~
produces:
{"CTRL_SEG": "test"}
while JSONPath Online produces this:
[
"MESSAGE1_ROOT_INPUT"
]
The goal is to get the text "MESSAGE1_ROOT_INPUT" using JsonPath/JsonPointer expression(s). Note that extracting this the traditional way (substring or regex on the stringified JSON text) would preferably be my last resort.
Background: We are building an API service that accepts JSON documents with different roots, such as MESSAGE2_ROOT_INPUT, MESSAGE3_ROOT_INPUT, etc. The further routing of a message will be based on this root.
Supported/Employed Languages: Java/GSON Library/RegEx
Gson does not natively support JSONPath or JSON Pointer. However, you can quite efficiently obtain the name of the first property using JsonReader:
import java.io.IOException;
import java.io.Reader;

import com.google.gson.stream.JsonReader;

public static String getFirstPropertyName(Reader reader) throws IOException {
    // Don't have to call JsonReader.close(); that would just close the provided reader
    JsonReader jsonReader = new JsonReader(reader);
    jsonReader.beginObject();
    return jsonReader.nextName();
}
There are however two things to keep in mind:
This only reads the beginning of the JSON document; it neither verifies that the complete JSON document has valid syntax, nor checks whether there might be more top-level properties.
This consumes some data from the Reader; to further process the data you have to buffer it so it can be re-read (you can also first store the JSON in a String and pass a StringReader to JsonReader).
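For example, a minimal self-contained sketch of the second option, buffering the document in a String so it can be read twice; the class name and sample document are only illustrative, and JsonParser.parseString requires Gson 2.8.6 or later.

import java.io.IOException;
import java.io.StringReader;

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import com.google.gson.stream.JsonReader;

public class RootNameExample {
    public static void main(String[] args) throws IOException {
        // Buffer the whole document in a String so it can be read twice.
        String json = "{\"MESSAGE1_ROOT_INPUT\": {\"CTRL_SEG\": \"test\"}}";

        // First pass: peek at the root property name, e.g. to route the message.
        JsonReader jsonReader = new JsonReader(new StringReader(json));
        jsonReader.beginObject();
        String root = jsonReader.nextName();
        System.out.println(root); // MESSAGE1_ROOT_INPUT

        // Second pass: full parse of the same String for further processing.
        JsonObject document = JsonParser.parseString(json).getAsJsonObject();
        System.out.println(document.get(root)); // {"CTRL_SEG":"test"}
    }
}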
I'm loading a JSON file with jsondecode() in Terraform, and I need to dynamically look up a path in the JSON tree. E.g., say I have the following JSON in file.json:
{
"some1": {
"path1": {
"key1": value1
"key2": value2
}
}
}
If I load this into a local called myjson then I could write local.myjson.some1.path1.key1 to get value1.
But I need the path to be an input. The following does not work:
locals {
tree = jsondecode(file("file.json"))
path = ["some1", "path1", "key1"]
value = local.tree[local.path]
}
I looked at all the built-in functions in Terraform, such as lookup, flatten, etc., but I could not see any combination that would allow me to loop over the elements of local.path to extract successively deeper elements of local.tree. The exception is try, which works nicely, but the max depth is hardcoded:
locals {
level1 = try(local.tree[local.path[0]], null)
level2 = try(local.level1[local.path[1]], local.level1)
level3 = try(local.level2[local.path[2]], local.level2)
level4 = try(local.level3[local.path[3]], local.level3)
...
result = try(local.levelN[local.path[N]], local.levelN)
}
so regardless of how many levels there actually are in local.tree, result will contain the value.
I can live with hardcoded N, but is there a better way, that does not have that limitation? (short of creating a custom provider that defines a data source that does this)
The Terraform language has no built-in functionality for this sort of arbitrary dynamic traversal.
As you noted in your question, it is possible in principle for a provider to offer this functionality. It wasn't clear to me whether you didn't want to use a provider at all or just didn't want to be the one to write it, so in case it was the latter I can at least offer a provider I already wrote and published which can potentially address this need: apparentlymart/javascript. It exposes a JavaScript interpreter into the Terraform language, which you can use for arbitrarily complex data manipulation:
terraform {
  required_providers {
    javascript = {
      source  = "apparentlymart/javascript"
      version = "0.0.1"
    }
  }
}

variable "traversal_path" {
  type = list(string)
}

data "javascript" "example" {
  source = <<-EOT
    for (var i = 0; i < path.length; i++) {
      data = data[path[i]]
    }
    data
  EOT

  vars = {
    data = jsondecode(file("${path.module}/file.json"))
    path = var.traversal_path
  }
}

output "result" {
  value = data.javascript.example.result
}
I can run this with different values of var.traversal_path to select different parts of the data structure in the JSON file:
$ terraform apply -var='traversal_path=["some1", "path1", "key1"]' -auto-approve
data.javascript.example: Reading...
data.javascript.example: Read complete after 0s
Changes to Outputs:
+ result = "value1"
You can apply this plan to save these new output values to the Terraform state, without changing any real infrastructure.
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
result = "value1"
$ terraform apply -var='traversal_path=["some1", "path1", "key2"]' -auto-approve
data.javascript.example: Reading...
data.javascript.example: Read complete after 0s
Changes to Outputs:
~ result = "value1" -> "value2"
You can apply this plan to save these new output values to the Terraform state, without changing any real infrastructure.
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
result = "value2"
$ terraform apply -var='traversal_path=["some1", "path1", "key3"]' -auto-approve
data.javascript.example: Reading...
data.javascript.example: Read complete after 0s
Changes to Outputs:
- result = "value2" -> null
You can apply this plan to save these new output values to the Terraform state, without changing any real infrastructure.
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
I included the final example above to be explicit that escaping into JavaScript for this problem means adopting some of JavaScript's behaviors rather than Terraform's: JavaScript handles looking up a non-existent object property by returning undefined rather than returning an error as Terraform would, and the javascript data source translates that undefined into a Terraform null. If you want to treat that as an error as Terraform would, you'd need to write some logic into the loop to test whether data is defined after each step. You can use the JavaScript throw statement to raise an error from inside the given script.
Of course it's not ideal to embed one language inside another like this, but since the Terraform language is intended for relatively straightforward declarations rather than general computation, I think it's reasonable to use an escape hatch like this when the overall problem fits within the Terraform language and only one small part of it would benefit from the generality of a general-purpose language.
Bonus chatter: if you prefer a more functional style to the for loop I used above then you can alternatively make use of the copy of Underscore.js that's embedded inside the provider, using _.propertyOf to handle the traversal in a single statement:
source = <<-EOT
_.propertyOf(data)(path)
EOT
I would like to use NiFi to encrypt the attributes in a JSON document but not the keys, as I would like to upload the data to a MongoDB server. Is there a way to do this? For the project I am using Twitter data as a proof of concept. So far I have used the EvaluateJsonPath processor to extract only the text of the tweet, and I can encrypt this text, but the resulting JSON no longer has a key. Can NiFi recreate a JSON document that attaches a key to this attribute that I extracted? Is there a better way to do this?
Unfortunately, this workflow isn't well supported by existing Apache NiFi processors. You could probably fashion a workflow that splits the JSON content into attributes, splits each attribute into the content of an individual flowfile, encrypts that content, merges the flowfiles back, and reconstitutes the now-encrypted content into attributes via UpdateAttribute.
I have created a Jira for a new NiFi processor to make this much simpler. My recommendation until such time as that is available is to use the ExecuteScript processor to achieve this. I have provided a template with an example, which you can import directly into your NiFi instance and connect to your flow. The body of the ExecuteScript processor is provided below (you can see how I initialized the AES/GCM cipher, and change the algorithm, key, and IV to your desired values).
import javax.crypto.Cipher
import javax.crypto.SecretKey
import javax.crypto.spec.IvParameterSpec
import javax.crypto.spec.SecretKeySpec
import java.nio.charset.StandardCharsets

FlowFile flowFile = session.get()
if (!flowFile) {
    return
}

try {
    // Get the raw values of the attributes
    String normalAttribute = flowFile.getAttribute('Normal Attribute')
    String sensitiveAttribute = flowFile.getAttribute('Sensitive Attribute')

    // Instantiate an encryption cipher
    // Lots of additional code could go here to generate a random key, derive a key from a password, read from a file or keyring, etc.
    String keyHex = "0123456789ABCDEFFEDCBA9876543210" // * 2 for 256-bit encryption
    SecretKey key = new SecretKeySpec(keyHex.getBytes(StandardCharsets.UTF_8), "AES")
    IvParameterSpec iv = new IvParameterSpec(keyHex[0..<16].getBytes(StandardCharsets.UTF_8))
    Cipher aesGcmEncCipher = Cipher.getInstance("AES/GCM/NoPadding", "BC")
    aesGcmEncCipher.init(Cipher.ENCRYPT_MODE, key, iv)

    String encryptedNormalAttribute = Base64.encoder.encodeToString(aesGcmEncCipher.doFinal(normalAttribute.bytes))
    String encryptedSensitiveAttribute = Base64.encoder.encodeToString(aesGcmEncCipher.doFinal(sensitiveAttribute.bytes))

    // Add a new attribute with the encrypted normal attribute
    flowFile = session.putAttribute(flowFile, 'Normal Attribute (encrypted)', encryptedNormalAttribute)

    // Replace the sensitive attribute inline with the cipher text
    flowFile = session.putAttribute(flowFile, 'Sensitive Attribute', encryptedSensitiveAttribute)

    session.transfer(flowFile, REL_SUCCESS)
} catch (Exception e) {
    log.error("There was an error encrypting the attributes: ${e.getMessage()}")
    session.transfer(flowFile, REL_FAILURE)
}
I have a CSV file as follows:
A;B;C
1;test;22
2;test2;33
where the first line is a kind of header and the others are data. I need to import all data rows with respect to the header and report how many rows are correct and how many are not.
My first idea is to split the source file into multiple files in the form of:
file1:
A;B;C
1;test;22
file2:
A;B;C
2;test2;33
How can I do this in Camel, and how can I collect the data necessary to print a summary report?
Take a look at BeanIO and the Camel BeanIO component.
It looks like a good fit for your scenario.
You could probably build upon the example code on the first page of BeanIO.
BeanIO
http://beanio.org/
Camel BeanIO component
http://camel.apache.org/beanio.html
You should not need to split your incoming file if the only thing you need to do is collect and count successful and unsuccessful records.
If the CSV is not too big and fits in memory, I would read and convert the CSV file into a list of Java objects. The latest Camel CSV component can convert a CSV file into a List<Map>; before Camel 2.13 it produced a List<List>. After having converted the CSV file into a List of something, you can write your own processor to iterate over the List and check its content, as sketched below.
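As a rough, untested sketch of such a processor: it assumes the unmarshalled body is a List<List<String>> (one inner list per row) and uses a made-up validity rule (same number of columns as the header, none of them empty), which you would replace with your real checks.

import java.util.List;

import org.apache.camel.Exchange;
import org.apache.camel.Processor;

public class CsvSummaryProcessor implements Processor {
    @Override
    public void process(Exchange exchange) throws Exception {
        List<List<String>> rows = exchange.getIn().getBody(List.class);
        List<String> header = rows.remove(0);

        int valid = 0;
        int invalid = 0;
        for (List<String> row : rows) {
            // A row counts as correct when it matches the header width and has no empty cells.
            boolean ok = row.size() == header.size()
                    && row.stream().noneMatch(String::isEmpty);
            if (ok) {
                valid++;
            } else {
                invalid++;
            }
        }

        // Expose the totals so a later route step can log or report the summary.
        exchange.getIn().setHeader("validRows", valid);
        exchange.getIn().setHeader("invalidRows", invalid);
    }
}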
You can unmarshal the file as a CSV file, remove the first line (the header), and then do your validations as desired. Here is an example of a Camel route implementation:
from("file:mydir/filename?noop=true")
.unmarshal()
.csv()
.process(validateFile())
.to("log:my.package?multiline=true")
Then you need to define the validateFile() method using the Camel Processor class, like this:
public Processor validateFile() {
    return new Processor() {
        @Override
        public void process(Exchange exchange) throws Exception {
            List<List<String>> data = (List<List<String>>) exchange.getIn().getBody();
            List<String> headerLine = data.remove(0);
            System.out.println("header: " + headerLine);
            System.out.println("total lines: " + data.size());
            // iterate over each line
            for (List<String> line : data) {
                System.out.println("Total columns: " + line.size());
                System.out.println(line.get(0)); // first column
            }
        }
    };
}
In this method you can validate each file line/column as you wish and then print the result or even write this report to another output file.
Use as references the File and CSV component pages from the Apache Camel docs:
http://camel.apache.org/file.html
http://camel.apache.org/csv.html
I want to parse a JSON file into objects and save them to a database. I just created a Groovy script that runs in the Grails console (typing grails console on the command line). I did not create a Grails app or a domain class. Inside this small script, when I call save, I get
groovy.lang.MissingMethodException: No signature of method: Blog.save()
is applicable for argument types: () values: []
Possible solutions: wait(), any(), wait(long), isCase(java.lang.Object),
sleep(long), any(groovy.lang.Closure)
Am I missing something?
I'm also confused: if I do save, is it going to save data to a table called Blog? Should I set up any database connection here? (With a Grails domain class we don't need to, but is it different using pure Groovy?)
Many Thanks!
import grails.converters.*
import org.codehaus.groovy.grails.web.json.*

class Blog {
    String title
    String body
    static mapping = {
        body type: "text"
        attachment type: "text"
    }
    Blog(title, body, slug) {
        this.title = title
        this.body = body
    }
}
Here I parse the JSON:
// parse json
List parsedList = JSON.parse(new FileInputStream("c:/ning-blogs.json"), "UTF-8")
def blogs = parsedList.collect { JSONObject jsonObject ->
    new Blog(jsonObject.get("title"), jsonObject.get("description"), "N/A");
}
Then I loop over blogs and save each object:
for (i in blogs) {
    // println i.title; I'll get the information needed.
    i.save();
}
I don't have much experience with Grails, but from a quick googling it seems that for a class to be treated as a domain (model) class, it needs to be either in the correct convention package/directory or in a legacy jar with Hibernate mappings/JPA annotations. Thus your example can't work. Why not define that model in your domain package?