As far as I understand, it is possible to specify a delimiter on Firehose as long as you activate Dynamic Partitioning: the "New line delimiter" option appears under the Dynamic partitioning section.
However, I need to specify a delimiter without using Dynamic Partitioning, and I cannot see this option anywhere. Is it possible to achieve that?
Note: I am trying to find a built-in solution other than manually appending data + "\n" in the producer application.
You have to set up a Lambda transformation function for Firehose that processes the records and appends a newline to each of them.
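For example, here is a minimal sketch of such a transformation function in Node.js (it follows the standard Firehose data-transformation record contract; error handling is omitted):

exports.handler = async (event) => {
  const records = event.records.map((record) => {
    // Firehose hands each record over base64-encoded: decode, append a newline, re-encode.
    const payload = Buffer.from(record.data, 'base64').toString('utf8');
    return {
      recordId: record.recordId,
      result: 'Ok',
      data: Buffer.from(payload + '\n', 'utf8').toString('base64'),
    };
  });
  return { records };
};

You then enable data transformation on the delivery stream and point it at this function.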
I'm a total noob when it comes to NiFi - so please feel free to highlight any stupidity/ignorance.
I'm reading messages from a Kafka topic using NiFi.
Each message contains JSON with a field called Function and then a whole bunch of different fields, based on the Function. For example, if Function = "Login", you can expect a username and password field, but if Function = "Pay", you can expect "From", "To" and "Amount" fields.
I need to process each type of Function differently. So, basically, I want to read the message from Kafka, determine the function and then route the message, based on the function to the appropriate set of rules.
It sounds like this should be simple - but for one small complication. I have about 500 different types of Functions. So, I don't want to add a RouteOnAttribute node for each function.
Is there a better way to do this? If this was "real code", I suppose I'm looking for the difference between "if" statements and some sort of "switch/case" statement...
You would first use EvaluateJsonPath to extract the function into a flow file attribute, then RouteOnAttribute, which would need 500 conditions added to it, and then connect each of those 500 conditions to whatever follow-on processing is required. The only other thing you could do is implement a custom processor that handles the 500 conditions internally.
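As a rough illustration (the attribute name and the property values here are just examples, not required names), the two processors could be configured along these lines:

EvaluateJsonPath
  Destination: flowfile-attribute
  function: $.Function

RouteOnAttribute
  Routing Strategy: Route to Property name
  Login: ${function:equals('Login')}
  Pay: ${function:equals('Pay')}
  ...one dynamic property per Function value...

Each dynamic property on RouteOnAttribute becomes its own outgoing relationship, which is why 500 functions still means 500 entries and 500 connections unless you go the custom-processor route.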
I have a JSON file in Azure Blob storage that I need to parse and insert rows into SQL using the Logic App.
I am using "Get Blob Content", and my first attempt was to then pass the result to "Parse JSON". It returns an error: "InvalidTemplate. Unable to process template language expressions in action 'Parse_JSON' inputs at line '1' and column '2856'".
I found some discussion indicating that the content needs to be converted to a string, so I used "Compose" and edited the code as suggested to
"inputs": "#base64ToString(body('Get_blob_content').$content)"
This works, but then the InvalidTemplate issue gets pushed to the Parse JSON action and I get the InvalidTemplate error there. I have tried wrapping the output in a JSON expression and a few other things, but I just can't get it to parse.
If I take a sample or even the entire JSON and put it into the INPUT of the Parse function it works without issue but it will not accept the blob content as JSON.
The only thing I have been able to do successfully from blob content is to take it as a string and update a row in SQL to later use the OPENJSON in SQL...but I run into an issue there that is for another post.
I am at a loss of what to do.
You don't post much information about your Logic App actions, so maybe you could refer to my flow design. I tested with JSON data containing an array.
Below is a picture of my flow. I'm not using a Compose action; I use decodeBase64(body('Get_blob_content')['$content']) as the Parse JSON content.
And if you select a property from the JSON, you need to set the array index. I set a variable to get a value with body('Parse_JSON')[1]['name'].
You could give this a try; if it still fails, please provide more information or a sample so we can run a test.
Got myself in a tricky situation. I'm using local storage to save values from a popup window and then paste them into an input when focus returns to the parent window.
But something rather awkward happens when I try to store ';'-separated values: I get only the 1st set, losing all the rest of the string. What makes it more bizarre is that after saving my value, I test it by calling
alert('SELECTED : ' + localStorage.getItem('MyStr'));
and the whole string is there... but in the script where I retrieve this value, when I check
alert(localStorage.getItem('MyStr'));
Only the 3rd set is there. That is, I store something like
abcdefg;123323;ffasfs;5445;iuiuifa;
but when I need to get it back, there's only
ffasfs
I could use some help then, I'm all new to this whole thing, and killing myself to get a website working.
Thanks in advance, sorry if my question looks stupid.
Store your values in localStorage as JSON strings. This may even help you build more complex objects for the future.
For now though... Just do:
localStorage.setItem("your key", JSON.stringify("abcdef;1234;whatever"));
This procedure will not only sanitize your input but also create the opportunity to store serialized objects in the future.
It's important to note that while JSON.stringify is supported pretty much everywhere, not all browsers have it built in.
For those cases, check out json2.js.
Hope that helps.
I’m trying to use LOAD CSV to create nodes with the labels being set to values from the CSV. Is that possible? I’m trying something like:
LOAD CSV WITH HEADERS FROM 'file:///testfile.csv' AS line
CREATE (x:line.label)
...but I get an invalid syntax error. Is there any way to do this?
bicpence,
First off, this is pretty easy to do with a Java batch import application, and they aren't hard to write. See this batch inserter example. You can use opencsv to read your CSV file.
If you would rather stick with Cypher, and if you have a finite set of labels to work with, then you could do something like this:
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///testfile.csv' AS line
CREATE (n:load {lab:line.label, prop:line.prop});
CREATE INDEX ON :load(lab);
MATCH (n:load {lab:'label1'})
SET n:label1
REMOVE n:load
REMOVE n.lab;
MATCH (n:load {lab:'label2'})
SET n:label2
REMOVE n:load
REMOVE n.lab;
Grace and peace,
Jim
Unfortunately not; parameterized labels are not supported.
Chris
You can do a workaround: create all the nodes under a temporary label first, then filter on them and create the desired nodes, then remove the old ones.
LOAD CSV WITH HEADERS FROM 'file:///testfile.csv' AS line
CREATE (tmp:tmp {label: line.label})
WITH tmp
CREATE (x:Person {name: tmp.label})
WITH tmp
DELETE tmp
Paste this into http://console.neo4j.org to see an example:
LOAD CSV WITH HEADERS FROM "http://docs.neo4j.org/chunked/2.1.2/csv/import/persons.csv" AS csvLine
CREATE (p:tmp { id: toInt(csvLine.id), name: csvLine.name })
WITH p
CREATE (pp:Person { name: labels(p)[0]})
WITH p, pp
DELETE p
RETURN pp
I looked around at a few questions like this and concluded that a nice, concise way to handle this kind of frustration with not being able to easily add dynamic labels through LOAD CSV is simply to use your favorite programming language to read the CSV lines and produce a text file of Cypher statements that will create the Neo4j node/edge structure you want. You can then also edit that text file directly to further customize your commands.
I personally used Java, since I am most comfortable with it. I read each line of the CSV into a custom object representing a row of my CSV file, then printed to a file one line per row containing the Cypher statement I wanted. All I had to do then was cut and paste those commands into the Neo4j browser command line.
This way you can build your commands however you want and completely avoid the limitations of LOAD CSV with Cypher.
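As a rough sketch of the same idea (shown here in Node.js rather than Java; the file name and CSV columns are made up for illustration):

const fs = require('fs');

// Naive CSV handling (no quoting/escaping); assumes a header row of "label,name".
const rows = fs.readFileSync('testfile.csv', 'utf8').trim().split('\n').slice(1);
const statements = rows.map((row) => {
  const [label, name] = row.split(',');
  return `CREATE (:${label} {name: '${name}'});`;
});
fs.writeFileSync('import.cypher', statements.join('\n'));

The generated import.cypher can then be pasted into the Neo4j browser command line, exactly as described above.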
Jim Biard's answer works, but it uses PERIODIC COMMIT, which is useful but deprecated.
I was able to write a query that:
Loads from CSV
Uses multiple transactions
Creates nodes
Appends labels
Will work for 4.5 and onwards
:auto LOAD CSV WITH HEADERS FROM 'file:///nodes_build_ont_small.csv' AS row
CALL {
  WITH row
  CALL apoc.create.node([row.label], {id: row.id})
  YIELD node
  RETURN null AS ignored
} IN TRANSACTIONS OF 100 ROWS
RETURN null
It seems that the APOC procedures are more useful than the built-in commands here, since this is not possible (at least in my attempts) with plain CREATE.
I have an OLE DB Data source and a Flat File Destination in the Data Flow of my SSIS Project. The goal is simply to pump data into a text file, and it does that.
Where I'm having problems is with the formatting. I need to be able to rtrim() a couple of columns to remove trailing spaces, and I have a couple more that need their leading zeros preserved. The current process is losing all the leading zeros.
The rtrim() can be done by simple truncation and ignoring the truncation errors, but that's very inelegant and error prone. I'd like to find a better way, like actually doing the rtrim() function where needed.
Exploring similar SSIS questions & answers on SO, the thing to do seems to be "use a Script Task", but that's usually just thrown out there with no details, and it's not at all an intuitive thing to set up.
I don't see how to use scripting to do what I need. Do I use a Script Task on the Control Flow, or a Script Component in the Data Flow? Can I do rtrim() and pad strings where needed in a script? Anybody got an example of doing this or similar things?
Many thanks in advance.
With SSIS, there are many possible solutions! From what you mention, you could use a Derived Column transform within a Data Flow to perform the trimming and padding - you would use an expression to do this, and it would be relatively straightforward. E.g.,
RTRIM([ColumnName])
to trim and something along the lines of
RIGHT("000000" + [ColumnName], 6)
to pad (this is off the top of my head so syntax may not be exact).
As for the scripting method, that is also valid. You would use the Script Component transform in the Data Flow and use VB.NET or C# (if you have 2008) string manipulation methods (e.g. strVariable.TrimEnd() and strVariable.PadLeft()).