BizTalk 2010 aggregation - CSV

I have an input XML file like this:
<root>
<item ... />
<item ... />
<item ... />
</root>
and I need to construct and send messages in two ways from the same orchestration:
1. Send one XML file per item to one destination.
2. Send the whole batch of "item" elements as a single CSV flat file.
My file is currently handled by a pipeline that extracts every "item" from the envelope. The problem is that I also need to merge all "item" elements based on a certain condition.
Any ideas on how to achieve this?

There seem to be at least two ways of going about this. It isn't clear how you 'arrive' at the input XML batch file, and this will drive the decision, IMO.
1. Since it seems that you've already got all the messages in a single XML batch at the start, this should be quite easy. Before you debatch them in a pipeline, you need to ensure that you also publish this batch message (root ...) into the MessageBox if it isn't already there (i.e. direct binding, if the message doesn't already come from the MessageBox).
You can then create a map for the CSV file which takes the root message as input and filters out the items which you don't want in the CSV. To do the filtering in the map, you could use a looping functoid with conditionals, or my preference would be to implement the map in XSLT and apply templates only to the desirable items with an XPath filter. A subscribing FILE send port, which filters (BTS.MessageType) on the incoming XML batch message, can then apply this map.
The individual XML files would then be processed by your debatching pipeline, and another subscribing physical FILE send port can then write them out.
2. Alternatively, if it is too late and the root XML file has already been debatched (and you can't get to the original XML file for whatever reason), you would need to use another orchestration to reassemble the messages needed for the CSV (scatter and gather). This will be more complicated, as you will likely need to correlate the messages (e.g. on some batch identifier), apply a timer, etc.
Have a look at the Pipeline Aggregator sample for how to collect the 'desirable' CSV messages into a Microsoft.XLANGs.Pipeline.SendPipelineInputMessages variable using a loop, and then use a pipeline to assemble the batch. If the criteria for 'desirable' are already promoted on the individual item messages, you can apply the filter on your receive; if not, you will need a decision in your loop to determine whether or not to add each message to the batch.

Jenkins API XPath-like functionality for JSON

I am trying to use the Jenkins API to retrieve a list of running jobs' build URLs, and this works with the query:
https://jenkins.server.com/computer/api/xml?tree=computer[executors[currentExecutable[url]]]&depth=1&xpath=//url&wrapper=buildUrls
This searches for all executors on a given Jenkins server, then grabs the URLs and wraps them in an XML buildUrls object.
(What I actually want is a count of total running jobs, but I can filter in the API and then take a .size of the result client side.)
However, the application I am using only accepts JSON. Although I could convert to JSON, I am wondering if there is a way to have this return a JSON object which contains just the buildUrls. Currently, if I change the return format to JSON, the xpath modification is removed (since it is XML only),
and I instead get back the list of all executors and their status.
I think my answer lies within tree, but I can't seem to get the result I want.
To return a JSON object, you can modify the URL as below:
http://jenkins.server.com/computer/api/json?tree=computer[executors[currentExecutable[url]]]&depth=1&pretty=true
It will not be possible to get only the build URLs; you need to iterate through all the executables to get the total number of running jobs. The reason is that the build URLs alone cannot tell you where each build is running, so unfortunately we need to iterate through each executor.
The output will contain one entry per executor, with a currentExecutable (and its url) present only for the executors that are currently busy.
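A rough client-side sketch of that iteration in JavaScript (an illustration only, not from the original answer; it assumes the computer query above and that anonymous read access is allowed or credentials are added to the request):
// Walk every executor on every node; busy executors carry a currentExecutable with the build URL.
const url = "http://jenkins.server.com/computer/api/json?tree=computer[executors[currentExecutable[url]]]&depth=1";
fetch(url)
  .then((res) => res.json())
  .then((data) => {
    const buildUrls = [];
    for (const computer of data.computer) {
      for (const executor of computer.executors) {
        // Idle executors have no currentExecutable, so only running builds are collected.
        if (executor.currentExecutable) {
          buildUrls.push(executor.currentExecutable.url);
        }
      }
    }
    console.log("Total running jobs:", buildUrls.length, buildUrls);
  });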
Another way of doing the same thing is parsing via jobs:
http://jenkins.server.com/api/json?tree=jobs[name,url,lastBuild[building,timestamp]]&pretty=true
You can check the building parameter. But here too you will not be able to filter the output by URL directly in Jenkins; iteration through each and every job would be required.
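The same count can be taken from the jobs query; a minimal sketch under the same assumptions:
// A job counts as running when its last build is still flagged as building.
const jobsUrl = "http://jenkins.server.com/api/json?tree=jobs[name,url,lastBuild[building,timestamp]]";
fetch(jobsUrl)
  .then((res) => res.json())
  .then((data) => {
    const running = data.jobs.filter((job) => job.lastBuild && job.lastBuild.building);
    console.log("Running jobs:", running.length);
    console.log(running.map((job) => job.url));
  });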

JSON file that would be used in Elasticsearch

I want to know if the JSON files that would be used in Elasticsearch should have a predefined structure, or can any JSON document be uploaded?
I've seen some JSON documents where, before each record, there is something like this:
{"index":{"_index":"plos","_type":"article","_id":0}}
{"id":"10.1371/journal.pone.0007737","title":"Phospholipase C-β4 Is Essential for the Progression of the Normal Sleep Sequence and Ultradian Body Temperature Rhythms in Mice"}
Theoretically you can upload any JSON document. However, be mindful that Elasticsearch can create or change the index mapping based on your create/update actions. So if you send a JSON document that includes a previously unknown field? Congratulations, your index mapping now contains a new field! In the same way, the data type of a field might also be affected by introducing a document with data of a different type. So my advice is to be very careful in constructing your requests to avoid surprises.
FYI, the syntax you posted looks like a bulk request (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html). Those do place some demands on the syntax, to clarify what you want to do to which documents. An index call sending a single document is very unrestricted, though.
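For illustration, a minimal sketch of such a single-document index call in JavaScript, assuming an Elasticsearch 7+ instance reachable at localhost:9200 (on the older version implied by the question's _type field, the path would be /plos/article/0 instead of /plos/_doc/0):
// Index one arbitrary JSON document; any new fields are added to the mapping dynamically.
const doc = {
  id: "10.1371/journal.pone.0007737",
  title: "Phospholipase C-β4 Is Essential for the Progression of the Normal Sleep Sequence and Ultradian Body Temperature Rhythms in Mice"
};
fetch("http://localhost:9200/plos/_doc/0", {
  method: "PUT",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(doc)
})
  .then((res) => res.json())
  .then((body) => console.log(body.result)); // "created" on first insert, "updated" afterwards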

Node-RED HTTP POST for every JSON object (loop)

I'm working in Node-RED and I need to understand how to loop over a JSON object and, for each JSON object, make an HTTP request towards another endpoint.
In other words, here's the use case I need to represent:
Node-RED makes an HTTP POST to get all devices.
All devices are returned in a JSON document, with several pieces of information and a DEVICE_ID.
For every device ID, Node-RED has to make another HTTP request, passing that ID, to get all the resources for that device.
I'm stuck, since I expected Node-RED to have a dedicated block for loops, but that is not the case. So how can I work around this?
If I use the "function" block and type the code for my loop there, how do I "come back" to the flow and use the HTTP blocks? Thank you!
In flow-based programming, you tend to avoid loops. The most "flow-like" approach would be to split the JSON output into multiple messages.
If you are familiar with JavaScript, the easiest way to do this is with a function node. In the JS code, create a loop over the JSON and, within the loop, use node.send(msgobject) to output a new message. You can remove the normal return msg at the end of the code, as you don't need it in this instance. Obviously, you have to create your msgobject for each loop iteration (or create it inline). When the loop exits and you reach the end of your function node code, Node-RED will simply continue on to the next node in the flow; you don't need to do anything special.
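A minimal sketch of such a function node, assuming the incoming msg.payload is the array of devices returned by the first HTTP request and that each entry carries the DEVICE_ID field mentioned in the question:
// Emit one message per device; each message can drive its own HTTP request downstream.
const devices = msg.payload;
for (const device of devices) {
    // Put the id somewhere the following nodes can reach it, e.g. to build the request URL.
    node.send({ payload: device, deviceId: device.DEVICE_ID });
}
// No trailing "return msg": every message has already been emitted with node.send().
return null;
Downstream, an http request node can then build its URL from each message, either by having the function (or a change node) set msg.url, or by referencing the property with mustache syntax such as {{{deviceId}}} in the node's URL field.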
Otherwise, you can use the core split node (or one of the contributed split nodes) to split the data into multiple messages.
Some specialized nodes exist for this purpose, for example node-red-contrib-loop.

How can I display an XML page instead of JSON, for a dataset

I am using the pycsw extension to produce a CSW file. I have harvested data from one CKAN instance [1] into another [2], and am now looking to run the pycsw 'paster load' command:
paster ckan-pycsw load -p /etc/ckan/default/pycsw.cfg -u [CKAN INSTANCE]
I get the error:
Could not pass xml doc from [ID], Error: Start tag expected, '<' not found, line 1, column 1
I think it is because when I visit this URL:
[CKAN INSTANCE 2]/harvest/object/[ID]
it comes up with a JSON file as opposed to the XML it is expecting.
I have run the pycsw load command on other CKAN instances and have had no problems with them. They also display an XML file at the URL stated above, so I want to know how to get CKAN to serve an XML file instead of JSON.
Thanks in advance for any help!
As you've worked out, your datasets need to be in ISO(XML) format to load into a CSW server. A CKAN instance only has a copy of a dataset in ISO(XML) format if it harvested it from a CSW.
If you use the CKAN(-to-CKAN) harvester in the chain, then the ISO(XML) record doesn't get transferred with it. So you'd either need to add this functionality to the CKAN(-to-CKAN) harvester, or get rid of the CKAN-to-CKAN harvest step.
Alternatively, if the record originated in CKAN, then it has no ISO(XML) version anyway, and you'd need to create one somehow.

How to deal with 3+ message formats in Mule?

Let's suppose I'm dealing with 3 (very) different message formats inside my Mule ESB; I'll call them A, B and C. They could be, for example, XML (via socket), some custom text format, and SOAP transporting another kind of XML (not the same XML that is transported via socket). All of them can be transformed into each other; A, B and C carry the same information, only in different formats.
Each of them will have its own entry point into the flow, some format validations, etc.
But there's some (actually a lot of) logic that I need to execute for all of them, like extracting some information, routing based on the content, enriching, etc.
What should I do? I did some research on integration patterns but didn't find anything about this situation or anything similar.
The easiest approach seems to be taking one of the formats (let's take B) as the "default" one for my "main flow" and implementing all the common logic based on it. Then every message that arrives is transformed to B, and later transformed again to the destination format, even if the two endpoints use the same format.
Examples:
1) An "A" hits my app, it's transformed to "B" to execute the common logic, then it's transformed back to "A" to be delivered.
2) A "C" hits my app, it's transformed to "B" to execute the common logic, then it's transformed to "A" to be delivered.
So my question is: does Mule have a feature that provides a better way of doing something like this, or does the solution above look reasonable?
Thanks in advance.
There are a few options; any of these can be implemented in Mule. The first two are close to what you have suggested.
Normalizer: http://eaipatterns.com/Normalizer.html
Canonical Data Model: http://eaipatterns.com/CanonicalDataModel.html
Routing slip: http://eaipatterns.com/RoutingTable.html
Envelope: http://eaipatterns.com/EnvelopeWrapper.html
Which you use will depend on your messages and what you need to do with them.
With Canonical Data Model, for example, you could build a separate flow for each incoming type that:
Receives an object in its own format.
Translates that object to the canonical object.
Passes that message on to the main processing flow.
The main flow would only need to know how to process that object.
Any endpoints that need the original object back would sit behind a transformer that reverses the transformation.
You could pick one of your existing objects and use message variables to remember the original format, or create a new object that remembers the original type itself.