Boomi integration - Dynamically inject mapping information - integration

We are in the process of evaluating integration solutions and are comparing Mule and Boomi.
The use case is to read an Excel file, map its columns to a pre-defined set of JSON attributes, and then use the JSON to insert records into a database. The mapping may vary from one Excel template to another, because the column names in one template may differ from those in another.
How do I inject mapping information (source vs target) from outside the integration flow?
Note: In Mule, I'm able to do that using a mapping variable (value is JSON) that I inject using Mule DataWeave language.

Boomi's mapping component is static in terms of structure but more versatile solutions are certainly possible.
The data processor component opens up Groovy, JavaScript, and XSLT 3.0 as options. These are Turing-complete languages that can be used to bend Boomi to almost any outcome.
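To make the idea of a data-driven mapping concrete, here is a minimal sketch in Python. It is not Boomi code and does not use any Boomi API; the column names, the mapping JSON, and the helper `apply_mapping` are all hypothetical. The same logic could presumably be ported to whichever scripting language the script step offers.

```python
import json


def apply_mapping(row, mapping):
    """Rename the keys of one spreadsheet row using an external mapping.

    row     -- dict of source column name -> cell value
    mapping -- dict of source column name -> target JSON attribute
    Columns that are not listed in the mapping are dropped.
    """
    return {target: row[source] for source, target in mapping.items() if source in row}


# Hypothetical mapping, supplied from outside the flow (a file, a property, ...).
mapping_json = '{"Cust Name": "customerName", "Amt": "amount"}'
mapping = json.loads(mapping_json)

# One row as it might be read from an Excel template (hypothetical columns).
row = {"Cust Name": "Acme", "Amt": 120.5, "Notes": "ignored"}

record = apply_mapping(row, mapping)
print(json.dumps(record))
```

Because the mapping is plain data, a new Excel template would only need a new mapping document, not a change to the transformation code.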
You could make the Boomi UI available to those who need to write the maps in JSON. It's a pretty simple interface to learn. By using a route component, there could be one "parent" process that governs a process for each template, with a map for each template. Such a solution would be pretty easy to build and run, and it would allow the template-specific processes to be deployed independently of the "parent".
You could map to a generic columnar structure and then dynamically alter the target columns by writing a SQL procedure.
I've come across attempts to do what you're describing (not using either Boomi or Mulesoft) which were tragic failures: https://www.zdnet.com/article/uk-rural-payments-agency-rpa-it-failure-and-gross-incompetence-screws-farmers/ I draw your attention to the NAO's points:
ensure the system specifications retain a realistic level of flexibility
and
bespoke software is costly to develop, needs to be thoroughly tested, and takes more time to implement
The usual goal of a requirement like yours is to make transformation/ETL available to "non-programmers", which denies the reality that delivering an outcome takes many more skills than "programming".

Related

Class diagram for xpages project

I am working on a project with XPages. I wanted to know how to represent my project as a class diagram. Notes is a document database, not a relational one. How could I represent my entities?
In Domino, documents are merely evidence of the existence of people, processes, and physical entities (products, offices, inventory, etc.). Ideally, your classes should model those things.
For instance, you might have classes like Employee, with properties like firstName, lastName, hireDate; maybe Asset, with properties like category, model, serialNumber; or perhaps Request, with properties like status, requester, dateApproved. Eventually the values of each of these properties might be stored as item values in Domino documents, but defining these first as attributes of classes allows you to follow a simple pattern to develop your application:
1) Use your class structure to rapidly define the nature of each "thing" your application interacts with, without worrying yet what each must look like or how and where the data will ultimately be stored.
2) Once you have these classes defined, you can bind visual components on an XPage (such as input fields like edit boxes and radio button groups) very easily using the #{dataSource.propertyName} syntax.
3) When these two steps are done, all you have left to do is to add two methods to each of these entity classes: one to write the data, and another to retrieve it.
Following this approach makes it very easy to rapidly build the application, but also protects your user interface from changes in how you wish the data to be stored. Initially, each object might represent a single document. As the application grows in either complexity or adoption, however, you may decide to segregate the data such that many documents are created to represent a single entity. Or at some point you might even decide to store some, or all, of the data outside of Domino (DB2, SQL, etc.). If your XPage components are bound to properties of these entity classes, all you need to do to change how or where the data is stored is to update the two methods you created in step 3 of the above list: alter how you write and retrieve the data. Your actual XPage design elements don't need to change at all.
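The pattern described above can be sketched roughly as follows. This is Python rather than the Java you would use in XPages, and the plain dict called `store` is only a stand-in for a Domino database; the method names `save` and `load` and the key format are assumptions made for the illustration.

```python
class Employee:
    """Entity class: the UI binds to these properties, not to the storage."""

    def __init__(self, first_name="", last_name="", hire_date=""):
        self.first_name = first_name
        self.last_name = last_name
        self.hire_date = hire_date

    def save(self, store):
        # Only save() and load() know how the data is laid out in storage.
        key = f"{self.last_name},{self.first_name}"
        store[key] = {
            "firstName": self.first_name,
            "lastName": self.last_name,
            "hireDate": self.hire_date,
        }
        return key

    @classmethod
    def load(cls, store, key):
        doc = store[key]
        return cls(doc["firstName"], doc["lastName"], doc["hireDate"])


# A dict stands in for the document database (hypothetical).
store = {}
key = Employee("Ada", "Lovelace", "1843-01-01").save(store)
employee = Employee.load(store, key)
print(employee.first_name, employee.last_name, employee.hire_date)
```

If the storage later moves to several documents per entity, or to SQL, only `save` and `load` would need to change; anything bound to `first_name`, `last_name` and `hire_date` stays as it is.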
Depends how you look at it. You can always think of the following relations: Notes Form <-> Java POJO and Notes View <-> Java Collections.
See http://www.pipalia.co.uk/notes-development/rethinking-xpages-part-two/ for some tips on using Java world standards when working with XPages.

Build system that is not file-centric

We have a software infrastructure which works pretty much like a software build system: Information is gathered from different sources and used to generate some outputs. Like in traditional software builds we have different types of output, dependency trees, etc.
The main difference is that our sources, intermediate results and outputs are not inherently file-based. Rather, they're (uniquely addressable) data objects.
Right now we're mapping our data structure to files and directories in combination with a traditional build system (SCons), but that does not scale, both w.r.t. performance and (more importantly) w.r.t. maintainability. Hence I'm looking for an infrastructure that's built for this purpose from the ground up.
As an illustration, assume you have 3 XML documents A, B and C. Let's say that B/foo/bar is to be calculated from A/x/y and A/x/z, and that similarly C/a/b is calculated from A/x/y. I need an infrastructure to
Implement these relationships (i.e. the transformations and their dependencies)
Automatically re-build the relevant parts after changes are made
One major problem with using files is that, if I map A, B and C to some files A.xml, B.xml and C.xml and use a traditional build system, then any change to A.xml will trigger a rebuild of B.xml and C.xml, even if A/x/y and A/x/z (the original dependencies of B) are not modified. For fine-grained dependency resolution I would therefore need to map each of A, B and C not to a file, but to a directory where each sub-directory represents an element, each file represents an attribute, etc. As I said, this does not scale for us.
(Please note that our system is not actually based on XML)
Right now I'm looking for any existing software, infrastructure or concept which points in this direction, regardless of implementation language and underlying data structures.
It sounds like you need an active object database management system (ODBMS) like GemStone/S. ODBMSs provide the traditional persistence services without the old cost of mapping data structures to files, along with the well-known benefits of object technology. Since you've mentioned dependency trees and addressable objects: in ODBMSs, navigational references are stored as part of the data, allowing any complex interaction pattern among objects to be represented and accessed. This is especially true when you foresee a system which makes use of inheritance, object nesting and cross-referencing.
Although an object engine may seem oversized for your requirements, it is common for large-scale production business systems to store and execute methods using ODBMSs within a concurrent, multiuser environment. It doesn't come for free, because you have to invest in the human part of the equation (education and experience), but once the initial fear is overcome, the investment pays off.
For re-building (subscribed) parts after changes (notifications from announcers) are made, you may use the Observer design pattern, or one of its variants (SASE or the Announcements framework), to implement your announce/subscription architecture. With this type of event framework there are intrinsic problems which are hard to solve with traditional file-based solutions, as you have noticed already. For example, it is typical for a dependency mechanism to manage the replacement of an object (in your example, an XML document) by another one. Any modern event framework should ensure that, when an object is removed, all dependents attached to the old object are updated to the new reference.
Finally, there is a free GemStone/S stack which includes an object dependency framework, so you may experiment with a real object database.
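To illustrate the kind of fine-grained, observer-style rebuild discussed here, below is a deliberately naive sketch in Python. It has nothing to do with GemStone/S or SCons; the class `DependencyGraph` and its methods are hypothetical, and the addresses reuse the A/x/y example from the question. Each derived value is recomputed only when one of its own sources actually changes.

```python
from collections import defaultdict


class DependencyGraph:
    """Push-based rebuild over addressable values instead of files."""

    def __init__(self):
        self.values = {}                    # address -> current value
        self.rules = {}                     # target -> (function, source addresses)
        self.dependents = defaultdict(set)  # source -> targets derived from it
        self.rebuilt = []                   # log of recomputed targets

    def define(self, target, sources, func):
        self.rules[target] = (func, sources)
        for source in sources:
            self.dependents[source].add(target)

    def set(self, address, value):
        if address in self.values and self.values[address] == value:
            return  # unchanged value: nothing downstream needs rebuilding
        self.values[address] = value
        for target in sorted(self.dependents[address]):
            func, sources = self.rules[target]
            if all(s in self.values for s in sources):
                self.rebuilt.append(target)
                self.set(target, func(*(self.values[s] for s in sources)))


graph = DependencyGraph()
graph.define("B/foo/bar", ["A/x/y", "A/x/z"], lambda y, z: y + z)
graph.define("C/a/b", ["A/x/y"], lambda y: y * 2)

graph.set("A/x/y", 1)
graph.set("A/x/z", 10)
graph.rebuilt.clear()

graph.set("A/x/z", 20)  # only B/foo/bar depends on A/x/z
print(graph.rebuilt, graph.values["B/foo/bar"], graph.values["C/a/b"])
```

A real system would also need ordering for diamond-shaped dependencies, cycle detection and persistence, none of which this sketch attempts.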
So nothing comes to mind that solves exactly your problem, but there are a few tools that might get you a little closer than you are now:
1) You might be able to throw something together using FUSE that would give you better control over how your data objects are mapped out to files. FUSE basically allows you to construct arbitrary file systems from whatever backing data you want. (The Python bindings are pretty friendly, but there are a number of other language interfaces available as well.) Then you could use a traditional build tool and take advantage of file-like objects that are better associated with your data.
2) CMake has a pretty extensible language for writing custom targets that you might be able to press into service. Unfortunately its language is pretty didactic and has something of a steep learning curve, so it wouldn't be my first choice.

Hardcoded database select. IDs vs names vs something else?

I am currently refactoring a project where so far a lot of data was kept as constants and arrays in the code. Also there are a lot of redundancies. Now I want to move all that data into the db, but I am not sure how I would do the mapping. The data is rarely dynamically selected based on user input but rather specifically selected in the code. It is used at a very core level of the application, but it is actually not THE core. Also a database is already being used, so there would be no real extra effort.
My idea would be to use a Mapping class in which I have constants pointing to the IDs of the respective rows. Is that a good idea?
Another idea would be to index the name column and just query directly by name.
The database would probably have the following columns: id, name, polynom and params.
So, basically we are talking math data. For example: 1, "Price approximation", "20x^3 - 5x^2 + 11x", "non-cumulated".
I think this question is language-agnostic but since there might be a language-specific (or even framework-specific) best practice, here is what I use: PHP5 with the Yii Framework.
I don't have much experience with PHP or Yii, but here are my 2 cents...
If these are constants and collections of constants that technically define your application (application architecture constants), and the end user shouldn't have control over them, I would put them in a configuration file instead of your database, unless you've built a module to easily access and modify them. Whether you implement a mapping class (or a configuration class) to retrieve them is not important, but be consistent in how you retrieve them. If you have too many to manage in a configuration file, then storing them in the database would be appropriate, but make sure you provide an easy way to modify them. To make your source code readable, I'd use descriptors that a human can understand and map those descriptors to the respective rows, like you mentioned.
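As a rough sketch of mapping readable descriptors to rows: the example below uses Python with an in-memory SQLite database instead of PHP/Yii, purely for illustration. The table name `formulas`, the enum `Formula` and the helper `fetch_formula` are hypothetical; the columns and the sample row are the ones from the question.

```python
import sqlite3
from enum import Enum


class Formula(Enum):
    # Readable descriptors in code; each value is the unique name stored in the table.
    PRICE_APPROXIMATION = "Price approximation"


def fetch_formula(conn, formula):
    """Look a row up by its indexed, unique name rather than by a hardcoded id."""
    return conn.execute(
        "SELECT id, name, polynom, params FROM formulas WHERE name = ?",
        (formula.value,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE formulas ("
    "id INTEGER PRIMARY KEY, name TEXT UNIQUE, polynom TEXT, params TEXT)"
)
conn.execute(
    "INSERT INTO formulas VALUES (?, ?, ?, ?)",
    (1, "Price approximation", "20x^3 - 5x^2 + 11x", "non-cumulated"),
)

row = fetch_formula(conn, Formula.PRICE_APPROXIMATION)
print(row)
```

Keying on a unique name keeps the code readable and avoids depending on auto-generated ids, which may differ between environments; the trade-off is that renaming a row then requires a matching code change.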
If these are user defined constants, then you should definitely provide an interface. But keep the same architecture as the application architecture constants.
In a perfect program/application (or even better--an application framework), nothing is hard coded, and everything is controlled by constants (switches). If you're able to achieve this successfully without the need to maintain your source code, you will win the Nobel Peace Prize.

best practices for writing to a file from multiple methods

I have a class that contains a bunch of methods for checking data I scrape every week (for things like well-formedness and other errors in gathering the data). Each of these methods performs a test, and then prints out a summary of the test.
I want to print out the output from these tests to a file, but I'm not sure what the best way to do it is. For example...
Should the class hold an instance variable for the file, with each method opening, appending to, and closing the file? (A problem is that methods sometimes call other methods, so this seems kinda messy?)
Should each method get passed the file as a parameter? (Seems messy as well.)
Should each method return a string, with a "central" method that calls all the other tests writing all these strings to a file?
I'm not really familiar with using logger libraries -- would that be a solution?
My particular context
I have a scraper that pulls data from various websites and stores them in a database. Websites change all the time, so I'm writing a "scrape checker" program that checks my scrapes for various things, like:
number of empty results
length of results
weird characters in results
and so on
So I have methods like:
check_num_empty_results
check_weird_characters
check_scrape (calls a bunch of other checks)
check_scrape_pair (sometimes I want to check pairs of scrapes together, e.g., to match results against each other, so this is different from checking each one in isolation)
etc.
I want my "scrape checker" program to print out a file that summarizes all the checks.
Separation of concerns. Write code that focuses on the scraping activity and returns the value(s) scraped. Then use aspect-oriented programming for logging, which can simplify the problem greatly, as the aspect holds the reference to the file or logging API.
Ultimately, it depends on what language you're using.
The first solution makes the most sense if your language permits it. For each instance of the logging class, have a field for the file object that you're reading from/writing to. This is basically equivalent to passing the file object as a parameter to every method.
That said, most mature languages have modules that will do a lot of this work for you; off the top of my head, sh/awk, Perl, and Python all come to mind as being suited to this task (though if you want to, you could use Java or something else).
Seems like a logging framework would be a perfect solution for this. If you are using Java or .NET, log4j and log4net are pretty much the de-facto standards for that.
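As an illustration of the logging approach, here is a minimal sketch using Python's standard `logging` module (the question does not say which language is in use, so Python is an assumption). The method names come from the question; the check logic, the log format, and the helper `make_logger` are made up for the example.

```python
import logging
import os
import tempfile


def make_logger(path):
    """Create a logger that writes every check summary to one file."""
    logger = logging.getLogger("scrape_checker")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.FileHandler(path, mode="w", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


class ScrapeChecker:
    def __init__(self, logger):
        # Methods never open or close the file; they only talk to the logger.
        self.log = logger

    def check_num_empty_results(self, results):
        empty = sum(1 for r in results if not r)
        self.log.info("empty results: %d of %d", empty, len(results))
        return empty

    def check_weird_characters(self, results):
        weird = sum(1 for r in results if not r.isprintable())
        self.log.info("results with non-printable characters: %d", weird)
        return weird

    def check_scrape(self, results):
        # Checks can call each other freely; output still lands in one place.
        self.check_num_empty_results(results)
        self.check_weird_characters(results)


path = os.path.join(tempfile.mkdtemp(), "checks.log")
logger = make_logger(path)
ScrapeChecker(logger).check_scrape(["a", "", "b\x00"])
for handler in logger.handlers:
    handler.close()

with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(lines)
```

The file is configured once, outside the class, so the nested-call problem from the first option and the extra parameter from the second both go away.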

Testing and mocking with Flex

I am developing a "dumb" front-end: an AIR application that interacts with a "smart" LiveCycle server. There are currently about 20 request & response pairs for the application. For many reasons (testing, developing outside the corporate network, etc.), we have several XML files of fake data, and if a certain configuration flag is set, the files are loaded, and a specific file is parsed and used to create a mock response. Each XML file is a set of responses for a different situation, all internally consistent. We currently have about 10 XML files, each corresponding to a different situation we can run into. This is probably going to grow to 30-50 XML files.
The current system was developed by me during one of those 90-hour-week release cycles, when we were under duress because LiveCycle was down again and we had a deadline to meet. Most of the minor crap has been cleaned up.
The fake data is in an object called FakeData, with properties like customerType1:XML, customerType2:XML, overdueCustomer1:XML, etc. Then in the FakeData constructor, all of the properties are set like this:
customerType1 = FileUtil.loadXML(File.applicationDirectory.resolvePath("fakeData/customerType1.xml"));
And whenever you need some fake data (this happens in special FakeDelegates that extend the real LiveCycle Delegates), you get it from an instance of FakeData.
This is awful, for many reasons, but it works. One embarrassing part is that every time you create an instance of FakeData, it reloads all the XML files.
I'm trying to figure out if there's a design pattern that is not Singleton that can handle this more elegantly. The constraints are:
No global instances can be required (currently, all the code dealing with the fake data, including the fake delegates, is pulled out of production builds without any side-effects, and it needs to stay that way). This puts the Factory pattern out of the running.
It can handle multiple objects using the XML data without performance issues.
The XML files are read centrally so that the other code doesn't have to know where the XML files are, and so some preprocessing can be done (like creating a map of certain tag values and the associated XML file).
Design patterns, or other architecture suggestions, would be greatly appreciated.
Take a look at ASMock, which was developed by a good friend of mine (and a member here, Richard Szalay) and is based on .NET's Rhino Mocks. We've used it in several production environments now, so I can vouch for its stability.
You should be able to get rid of any fake tests (more like integration tests) by using the mock object instead.
Wouldn't it make more sense to do traditional mocking with a mocking framework? Depending on your implementation, it might be possible to set up the Expects by reading the fake-data XML files.
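To show what "traditional mocking" looks like in general, here is a small sketch using Python's standard `unittest.mock`. It is not ActionScript and not ASMock; the class `CustomerScreen`, the method `get_customer` and the returned data are hypothetical stand-ins for a delegate and its response.

```python
from unittest.mock import Mock


class CustomerScreen:
    """Code under test: depends on a delegate, not on the real server."""

    def __init__(self, delegate):
        self.delegate = delegate

    def title(self, customer_id):
        customer = self.delegate.get_customer(customer_id)
        return f"{customer['name']} ({customer['type']})"


# The mock replaces the delegate; the canned response could just as well
# be parsed from one of the existing fake-data files.
delegate = Mock()
delegate.get_customer.return_value = {"name": "Acme", "type": "overdue"}

screen = CustomerScreen(delegate)
title = screen.title(42)

# Verify the expected interaction with the delegate.
delegate.get_customer.assert_called_once_with(42)
print(title)
```

Because the mock is created inside the test, nothing global is needed and no fake-data class has to ship with, or be stripped from, the production build.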
Here is a Google Code project that offers mocking for ActionScript.