I am working on a project to process data and, depending upon the contents of the data, format it for use by another system. Some of the data provided are not of use to that other system and some of it is so sparsely populated that it would be of no use - is there a way, using Freemarker, to prevent the output of a file at all based upon the contents of the data? I have tried using <#if> statements, but if the checks do not pass, I simply get a blank file output.
This is not up to FreeMarker, but to the software that calls it. FreeMarker doesn't create files; it just writes its output to a Writer, and that Writer is provided by the software that calls FreeMarker. So that software could implement logic where the file isn't created until something non-whitespace is written, or it could expose a directive to FreeMarker that drops the output file.
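As a rough illustration of the first option, here is a minimal sketch of a Writer that the calling software could hand to FreeMarker. The class name LazyFileWriter and its fileCreated() method are made up for this example and are not part of FreeMarker; the sketch assumes the whole output goes to a single UTF-8 file.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not part of FreeMarker): a Writer that holds back
// leading whitespace and only creates the target file once a
// non-whitespace character is written.
class LazyFileWriter extends Writer {
    private final Path target;
    private final StringBuilder pending = new StringBuilder();
    private Writer delegate; // stays null until real content arrives

    LazyFileWriter(Path target) {
        this.target = target;
    }

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        if (delegate != null) {
            delegate.write(cbuf, off, len);
            return;
        }
        pending.append(cbuf, off, len);
        if (!hasNonWhitespace(cbuf, off, len)) {
            return; // still nothing worth creating a file for
        }
        // First real content: create the file and replay the buffered whitespace.
        delegate = Files.newBufferedWriter(target, StandardCharsets.UTF_8);
        delegate.write(pending.toString());
        pending.setLength(0);
    }

    private static boolean hasNonWhitespace(char[] cbuf, int off, int len) {
        for (int i = off; i < off + len; i++) {
            if (!Character.isWhitespace(cbuf[i])) {
                return true;
            }
        }
        return false;
    }

    @Override
    public void flush() throws IOException {
        if (delegate != null) {
            delegate.flush();
        }
    }

    @Override
    public void close() throws IOException {
        if (delegate != null) {
            delegate.close();
        }
    }

    // Lets the caller find out afterwards whether any file was written.
    boolean fileCreated() {
        return delegate != null;
    }
}
```

The caller would pass an instance to Template.process(dataModel, writer) in place of a FileWriter and close it afterwards. If every <#if> check fails and the template emits only whitespace, no file appears on disk.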
Related
I have an SFTP trigger in a logic app which fires when a file is added to a certain file area. It is a CSV-formatted file and I want the rows to be parsed and converted into JSON. What is the best way to convert CSV data into JSON without using any custom connectors?
I cannot find any built-in connectors doing this job. And as far as I know there are no logic apps functions doing the job either.
Right now, there is no connector or action in Logic Apps that provides an out-of-the-box solution for your requirement. You would need to loop through the array and do the conversion yourself, but I would not suggest using the loop and variable actions, as they may take more time and cost you more.
The alternative would be to use inline code (JavaScript) to do the conversion as per your requirement. Please note that you will need an Integration Account to run your inline code.
Please refer to the JavaScript code and modify it according to your needs. I have used '_' to differentiate the nested objects. For more details you can refer to the previous discussion here.
For complex calculations you can offload this functionality to an Azure Function, write your code in one of the supported languages, and call the Azure Function from the logic app.
1. Created the logic app as shown below:
2. Created a container in the storage account and uploaded a CSV file to the container.
3. Next, used a Compose action to split the contents of the CSV file on every new line into an array.
a. Here is the expression used in the SplitLines Compose action:
split(body('Get_blob_content_(V2)'),decodeUriComponent('%0D%0A'))
b. Follow the below MS Doc to write expressions:
4. Removed the last (empty) line from the previous output using another Compose action, as shown below:
take(outputs('SplitLines'),add(length(outputs('SplitLines')),-1))
5. Separated the field names using a Compose action:
split(first(outputs('SplitLines')), ',')
6. Formed the JSON as shown below using a Select action:
From: skip(outputs('RemoveLastLine'), 1)
Map:
outputs('SplitFieldName')[0] : split(item(), ',')?[0]
outputs('SplitFieldName')[1] : split(item(), ',')?[1]
Tested the logic app and it runs successfully.
Content of the CSV file is as shown below:
The CSV data is formatted as JSON:
Reference: Use data operations in Power Automate (contains video) - Power Automate | Microsoft Docs
Credit: #Iason Koulas
I'm looking for ideas for an Open Source ETL or Data Processing software that can monitor a folder for CSV files, then open and parse the CSV.
For each CSV row the software will transform the CSV into a JSON format and make an API call to start a Camunda BPM process, passing the cell data as variables into the process.
Looking for ideas,
Thanks
You can use a Java WatchService or Spring FileSystemWatcher as discussed here with examples:
How to monitor folder/directory in spring?
referencing also:
https://www.baeldung.com/java-nio2-watchservice
Once you have picked up the CSV you can use my example here as inspiration or extend it: https://github.com/rob2universe/csv-process-starter specifically
https://github.com/rob2universe/csv-process-starter/blob/main/src/main/java/com/camunda/example/service/CsvConverter.java#L48
The example starts a configurable process for every row in the CSV and includes the content of the row as JSON process data.
I wanted to limit the dependencies of this example, so the CSV parsing logic applied is very simple. Commas inside field values may break the example, and special characters may not be handled correctly. A more robust implementation could replace the simple Java String .split(",") with an existing CSV parser library such as OpenCSV.
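To illustrate where a plain .split(",") falls short without adding a dependency, here is a minimal sketch of a quote-aware line splitter plus a flat JSON renderer. The class CsvRowToJson and its methods are hypothetical and are not taken from the linked repository. The sketch assumes one record per line (no line breaks inside quoted fields) and treats every value as a string; a real parser library covers more cases.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from the linked repository: splits one CSV
// line while respecting double-quoted fields, then renders a row as JSON.
class CsvRowToJson {

    // Splits a single CSV line. Commas inside quoted fields stay in the field,
    // and a doubled quote inside a quoted field becomes one literal quote.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"');
                    i++; // skip the second quote of the escaped pair
                } else if (c == '"') {
                    inQuotes = false;
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    // Builds a flat JSON object with string values from header and row cells.
    // Missing trailing cells are rendered as empty strings.
    static String toJson(List<String> headers, List<String> cells) {
        StringBuilder json = new StringBuilder("{");
        for (int i = 0; i < headers.size(); i++) {
            if (i > 0) {
                json.append(',');
            }
            String value = i < cells.size() ? cells.get(i) : "";
            json.append('"').append(escape(headers.get(i))).append("\":\"")
                .append(escape(value)).append('"');
        }
        return json.append('}').toString();
    }

    // Escapes backslashes, double quotes, and control characters for JSON.
    private static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With a header line name,comment and a data line Ada,"likes a, b", splitLine keeps the quoted comma inside the second field, whereas .split(",") would produce three pieces.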
The file watcher would actually be a nice extension to the example. I may add it when I get around to it, but would also accept a pull request in case you fork my project.
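For anyone who wants to try that extension, a minimal sketch based on the JDK's WatchService could look like the following. CsvFolderWatcher and its methods are hypothetical names and are not part of the example project; the handler is where the CSV conversion and process start would be called.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.Locale;
import java.util.function.Consumer;

// Hypothetical watcher, not part of the linked example project.
class CsvFolderWatcher {

    // True for file names ending in .csv, ignoring case.
    static boolean isCsv(Path file) {
        Path name = file.getFileName();
        return name != null
                && name.toString().toLowerCase(Locale.ROOT).endsWith(".csv");
    }

    // Blocks the calling thread and passes every newly created CSV file
    // in the folder to the handler.
    static void watch(Path folder, Consumer<Path> handler)
            throws IOException, InterruptedException {
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            folder.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            while (true) {
                WatchKey key = watcher.take(); // waits for the next batch of events
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                        continue; // events were lost; a folder rescan could go here
                    }
                    Path created = folder.resolve((Path) event.context());
                    if (isCsv(created)) {
                        handler.accept(created);
                    }
                }
                if (!key.reset()) {
                    break; // the folder is no longer accessible
                }
            }
        }
    }
}
```

Note that ENTRY_CREATE can fire before the producer has finished writing the file, so the handler may need to wait or retry. One common workaround is to have the producer write under a temporary name and rename the file to .csv when it is complete.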
I am trying to create a flow within Apache NiFi to collect files from a 3rd party RESTful API and I have set up my flow with the following:
InvokeHTTP - ExtractText - PutFile
I can collect the file that I am after, as I have specified it in my Remote URL. However, when I get all of the data from said file, the flow outputs multiple (hundreds of) copies of the same file to my output directory.
3 things I need help with:
1: How do I get the flow to output the file as a readable .csv rather than just a file with no extension?
2: How can I stop the processor once I have all of the data that I need?
3: The JSON file that I have been supplied with gives me the option to get files from a certain date range:
https://api.3rdParty.com/reports/v1/scheduledReports/877800/1553731200000
Or I can choose a specific file:
https://api.3rdParty.com/reports/v1/scheduledReports/download/877800/201904/CTDDaily/2019-04-02T01:50:00Z.csv
But how can I create a command in Nifi to automatically check for newer files, as this process will be running daily and we will be looking at downloading a new file each day.
If this is too broad, please help me by letting me know so I can edit this post.
Thanks.
Note: 3rdParty host name has been renamed to comply with security - therefore links will not directly work. Thanks.
1) You can change the filename of the flow file to anything you want using the UpdateAttribute processor. If you want it to have a ".csv" extension, add a property named "filename" with a value of "${filename}.csv" (without the quotes when you enter it).
2) By default most processors have a scheduling strategy of timer-driven with a run schedule of 0 seconds, which means they keep running as fast as possible. Go to the scheduling tab in the processor's configuration and set an appropriate schedule. It sounds like you probably want CRON scheduling to run it daily.
3) You can use NiFi expression language statements to create dynamic time ranges. I don't fully understand the syntax for the API that you have to communicate with, but you could do something like this for the URL:
https://api.3rdParty.com/reports/v1/scheduledReports/877800/${now():toNumber()}
Where now() returns the current date and time, and toNumber() converts it to the current timestamp in epoch milliseconds.
You can also format it to a date string if necessary:
${now():format('yyyy-MM-dd')}
https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
I'm writing a Puppet (3.6.2) module that reads data fields from a CSV file via the extlookup function and I cannot figure out how to tell extlookup that the first line is the header line. Does extlookup support this? If not, can anyone recommend an external function I could import and use?
thanks,
PS - Yes I know about hiera, and having the data in YAML or JSON files but my requirement is CSV files only.
Brandon
The behavior of extlookup() is pretty well documented. It makes no special provision for column headers, which are by no means an inherent feature of CSV format. Indeed, if your header line is not readable as a data line, then your file is not CSV at all.
Supposing that your file is indeed valid CSV, the absolute simplest solution would be to ignore the issue. It presents a problem only if the first column heading duplicates an actual or potential data name. If it does not, then you will never look up or use the pseudo-value represented by the first row.
If your file in fact is not CSV on account of its first line, or if the first column name conflicts with a real data name, then it seems the next best alternative would be to just remove that line, or to avoid creating it in the first place. I don't see any reason why one of these should not be possible.
"I know about hiera, and having the data in YAML or JSON files but my requirement is CSV files only."
How sad. Do be aware that extlookup() has long been deprecated, and it was removed from Puppet 4.
I'm inclined to suggest you implement a translator from CSV to Hiera-friendly YAML, and use Hiera in your module. Alternatively, Hiera supports custom backends, and it's not too hard to write one. I am unaware of an existing CSV backend for Hiera, but you could write one. Ignoring a header line would then be under your control, and you would simultaneously achieve a measure of future-proofing.
I have a CSV template file, say, having 10 columns.
I would like to load this CSV file template, and then write data to the relevant cells (say, only to 5 of the 10 cells) through a Java program.
I went through JSAPAR, SuperCSV etc., but am not sure whether these libraries have exactly the "stuff" I need.
Is there any framework supporting this kind of operations?
Check out FreeMarker: http://freemarker.org/
Open your text file.
Enter FreeMarker parameters for the required cells.
Your template file may look something like below:
"Templatetext1","text2","text4","${myVal4}","${myVal5}","text6","${myVal7}","${myVal8}","${myVal9}","textInCell10"
Pass in the values, you have your csv from template.
If you want to pass for multiple rows you can use other elements like <#list> etc.
OpenCSV is generally considered the best CSV toolkit for Java. It's a very lightweight library that makes working with CSV dead simple. I would recommend looking at it since it's not among the list of things you've tried yet.