NiFi UpdateRecord on child paths with FlowFile Attributes - json

A source JSON of the form
{"fu": "bar"}
should be changed to
{"fu": {"bar": "chi"}}
with the value "chi" coming from one of the flowfile attributes.
This sounds like an obvious task for the UpdateRecord processor; however, I seem unable to apply it successfully.
The result looks like
{"fu":"bar","bar":"chi"}
seemingly ignoring the child structure in the processor's property name.
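In other words, a dynamic property with a child path, presumably something of this form (the attribute name chi.attr is illustrative):
/fu/bar = ${chi.attr}
still produces the flat result above instead of the nested structure.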
What is the correct way to do this?

Related

Apache NiFi: Changing Date and Time format in CSV

I have a CSV which contains a column with a date and time, and I want to change the format of that date-time column. The first three rows of my CSV look like the following.
Dater,test1,test2,test3,test4,test5,test6,test7,test8,test9,test10,test11
20011018182036,,,,,166366183,,,,,,
20191018182037,,27,94783564564,,162635463,817038655446,,,0,,
I want to change the CSV to look like this.
Dater,test1,test2,test3,test4,test5,test6,test7,test8,test9,test10,test11
2001-10-18-18-20-36,,,,,166366183,,,,,,
2019-10-18-18-20-37,,27,94783564564,,162635463,817038655446,,,0,,
How is this possible?
I tried using the UpdateRecord processor, but this approach doesn't work: the data gets routed to the failure relationship of the UpdateRecord processor. Can you suggest a method to complete the task?
I was able to accomplish this using the UpdateRecord processor. The expression language I used is ${field.value:toDate('yyyyMMddHHmmss'):format('yyyy-MM-dd HH:mm:ss')}.
This alone didn't work: every time, the data was routed to the failure path of the UpdateRecord processor.
To fix this error I changed the configuration of the CSVRecordSetWriter: the Schema Access Strategy must be changed to Use String Fields from Header (it is Use Schema Name Property by default).
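Putting it together, the UpdateRecord configuration looks roughly like this (a sketch; /Dater is the column from the sample CSV, and Replacement Value Strategy must be set to Literal Value so the expression below is evaluated per field):
Replacement Value Strategy: Literal Value
/Dater: ${field.value:toDate('yyyyMMddHHmmss'):format('yyyy-MM-dd HH:mm:ss')}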
Strategy: use UpdateRecord to manipulate the timestamp value using expression language:
${field.value:toDate():format('ddMMyyyy')}
Flow: GenerateFlowFile -> UpdateRecord.
In UpdateRecord, set up the record reader and writer to inherit the schema and to include the header line; leave the other properties untouched.
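For testing, the sample data from the question can be fed in through GenerateFlowFile's Custom Text property (a sketch):
Dater,test1,test2,test3,test4,test5,test6,test7,test8,test9,test10,test11
20011018182036,,,,,166366183,,,,,,
20191018182037,,27,94783564564,,162635463,817038655446,,,0,,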
However, this solution might not satisfy you because of a strange problem. When you format the date like this:
${field.value:toDate():format('dd-MM-yyyy')}
ConvertRecord routes to the failure relationship: type coercion does not work properly. Maybe it is a bug; I could not find a solution for this problem.

How to print JSON property names in NiFi?

I have a JSON document in the following format:
{
"nm_questionario":{"isEmpty":"MSGE1 - Nome do Questionário"},
"ds_questionario":{"isEmpty":"MSGE1 - Descrição do Questionário"},
"dt_inicio_vigencia":{"isEmpty":"MSGE1 - Data de Vigência"}
}
How can I print the names of the properties using NiFi? I want to retrieve the names nm_questionario, dt_inicio_vigencia, and ds_questionario. I have tried many things already, but to no avail.
You can use a LogAttribute processor with Log Payload set to true to print the full contents to your $NIFI_HOME/logs/nifi-app.log file. You can also use a PutFile processor to write the contents to a flat file on disk. If you need to do something programmatic with those values, you can use the EvaluateJsonPath processor to extract various pieces of the content into named attributes, which you can then manage with UpdateAttribute or, again, LogAttribute.
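For example, if the property names are known in advance, EvaluateJsonPath with Destination set to flowfile-attribute could pull each message into an attribute (a sketch; the attribute names on the left are illustrative):
nm_questionario = $.nm_questionario.isEmpty
ds_questionario = $.ds_questionario.isEmpty
dt_inicio_vigencia = $.dt_inicio_vigencia.isEmpty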

How to output original JSON along with transformed JSON in JoltTransformation

I want to save both the transformed and the original JSON into HBase using the same key. I am using JoltTransformJson + EvaluateJsonPath to transform the JSON and to find an element in the transformed JSON; I want to use this element to save both the transformed and the original JSON.
If I can get the original JSON along with the transformed JSON, then I can save both of them using the same key.
The JoltTransformJson processor only has success and failure relationships, and success is going to be the flow file with the content after the transform. So the only way to get the original content is to route the flow file from before JoltTransformJson, so that it goes to an HBase processor and also to the JoltTransformJson processor.
You could also first insert the original JSON into HBase and then continue on to the transform, so something like:
Source -> PutHBaseJson -> JoltTransformJson -> PutHBaseJson
The first PutHBaseJson inserts the original JSON, the second the transformed JSON. As long as you use the same row id, they'll be part of the same row.
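A sketch of the relevant PutHBaseJson settings, assuming the key element can also be extracted from the original JSON into an attribute before the first put (row.id and my_table are illustrative names):
Table Name: my_table
Row Identifier: ${row.id}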

How to specify a key for a Kafka producer in Apache NiFi?

I have a simple pipeline using Apache NiFi, and I want to publish some messages to a Kafka topic using the existing Kafka publisher processor.
The problem is how to specify the Kafka key using the Apache NiFi expression language.
I tried something like ${message:jsonPath('$.key')}, but of course I got an error because the object message does not exist.
I also tried to use the filename object, which is something like a default object name for incoming messages, but it didn't help.
With another Kafka publisher processor this is possible by setting the Message Key Field property, but what about the PublishKafka processor?
NiFi expression language can only reference flow file attributes, and cannot directly reference the content (this is done on purpose).
So if you want to use the value of a field from your json document as the key, then you need to first use another processor like EvaluateJsonPath to extract the value of that field into a flow file attribute.
Let's say you have a field "foo" in your JSON document; you might use EvaluateJsonPath with Destination set to "flowfile-attribute" and then add a dynamic property like:
foo = $.foo
Then in PublishKafka set the key property to ${foo}.
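Concretely, the two configurations might look like this (a sketch; the Kafka Key property name is as found in recent PublishKafka processors):
EvaluateJsonPath:
  Destination = flowfile-attribute
  foo = $.foo
PublishKafka:
  Kafka Key = ${foo}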
Keep in mind this only makes sense if you have a single JSON document per flow file; if you have multiple, it is unclear what the key should be, since you can only have one "foo" attribute on the flow file but many "foo" fields in its content.

Neo4j node property containing raw JSON as metadata

Is it possible to have a node property that is a raw JSON string and to filter on it with Cypher?
I have a node with some defined properties and metadata (a raw JSON string).
I would like to select or filter on those metadata properties.
Something like this:
START movie=node:TYPE_INDEX(Type = 'MOVIE') // Start with the reference
MATCH movie-[t:TAG]->tag
WHERE collect(movie.Metadata).RatingPress > 3
RETURN distinct movie.Label
And the metadata is something like this:
{"RatingPress" : "0", "RatingSpectator" : "3"}
I expected to use the collect function to access the property, like this:
collect(movie.Metadata).RatingPress
But, of course, it fails...
Is there a way to bind a JSON string from a node property with Cypher?
That goes against the principles of properties. Why not set the properties from the JSON metadata directly on the node?
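For example (a sketch, reusing the index lookup from the question):
START movie=node:TYPE_INDEX(Type = 'MOVIE')
SET movie.RatingPress = 0, movie.RatingSpectator = 3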
But to answer your question:
No, Cypher has no knowledge of JSON.
We treat the entire node as a JSON blob. Since Neo4j doesn't support hierarchical properties, we flatten the JSON into delimited property names on save and unflatten them on read. You can then write Cypher queries against (for example) the property name "foo.bar.baz". The queries tend to look a bit funky because you'll need to quote such names with backticks, but it works.
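A query over the flattened properties might then look like this (a sketch; the index lookup is reused from the question, and the flattened property name follows the scheme just described):
START movie=node:TYPE_INDEX(Type = 'MOVIE')
WHERE movie.`Metadata.RatingPress` > 3
RETURN DISTINCT movie.Label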