How to create a custom node in KNIME?

I have added all the KNIME plugins to Eclipse and I want to create my own custom node, but I do not understand how to pass data from one node to another.
I looked at the "File Reader" node that ships with KNIME. I would like the source code or the jar file for this node, but I cannot find it.
I searched the Eclipse plugin folder for a similar name, but still did not find it.
Can someone please tell me how to pass data from one node to another, and how to identify the classes, jar and source code for any node provided by KNIME?

Assuming that your data is a standard datatable, then you need to subclass NodeModel, with a call to the supertype constructor:
public MyNodeModel() {
    // One incoming table, one outgoing table
    super(1, 1);
}
You need to override the default #execute(BufferedDataTable[] inData, ExecutionContext exec) method - this is where the meat of the node work is done and the output table created. Ideally, if your input and output table have a one-to-one row mapping then use a ColumnRearranger class (because this reduces disk IO considerably, and if you need it, allows simple parallelisation of your node), otherwise your execute method needs to iterate through the incoming datatable and generate an output table.
The #configure(DataTableSpec[] inSpecs) method needs to be implemented to at least provide a spec for the output table if this can be determined before the node is executed (it normally can, and this allows downstream nodes to be configured too, but the 'Transpose' node is an example of a node which cannot do so).
There are various other methods which you also need to implement, but in some cases these will be empty methods.
In addition to the NodeModel, you need to implement some other classes too - a NodeFactory, optionally a NodeSettingsPane and optionally a NodeView.
In Eclipse you can view the sources for many nodes, and the KNIME community 'book' pages all have a link to their source code. Take a look at https://tech.knime.org/developer-guide and https://tech.knime.org/developer/example for a step-by-step guide. Questions posted to the KNIME forums (including a developer forum) generally get rapid responses, and KNIME run a Developer Training Course a few times a year if you want to spend a few days learning more. Last but not least, it is worth familiarising yourself with the noding guidelines, which describe best practice for how your node should behave.

Source code for KNIME nodes is now available on GitHub.
Alternatively, in the Eclipse KNIME SDK you can look under your project > Plugin Dependencies > knime-base.jar > org.knime.base.node.io.filereader for the File Reader source code.
knime-base.jar is added to your project by default when it is created with the KNIME SDK.


Boomi integration - Dynamically inject mapping information

We are in the process of evaluating integration solutions and comparing Mule and Boomi.
The use case is to read an Excel file, map the columns to a pre-defined set of JSON attributes, and then use the JSON to insert records into a database. The mapping may vary from one Excel template to another, because the column names in one Excel file may differ from those in another.
How do I inject mapping information (source vs target) from outside the integration flow?
Note: In Mule, I'm able to do that using a mapping variable (whose value is JSON) that I inject using the Mule DataWeave language.
Boomi's mapping component is static in terms of structure but more versatile solutions are certainly possible.
The data processor component opens up Groovy, JavaScript, and XSLT 3.0 as options. These are Turing-complete languages that can be used to bend Boomi to almost any outcome.
You could make the Boomi UI available to those who need to write the maps in JSON. It's a pretty simple interface to learn. By using a route component, there could be one "parent" process that governs a process for each template, with a map for each template. Such a solution would be pretty easy to build and run, and it would allow the template-specific processes to be deployed independently of the "parent".
You could map to a generic columnar structure and then dynamically alter the target columns by writing a SQL procedure.
I've come across attempts to do what you're describing (not using either Boomi or Mulesoft) which were tragic failures: https://www.zdnet.com/article/uk-rural-payments-agency-rpa-it-failure-and-gross-incompetence-screws-farmers/ I draw your attention to the NAO's points:
ensure the system specifications retain a realistic level of flexibility
and
bespoke software is costly to develop, needs to be thoroughly tested, and takes more time to implement
The usual goal of a requirement like yours is to make transformation/ETL available to "non-programmers", which denies the reality that delivering an outcome takes many more skills than "programming".

Questionnaire tool to create config files

I have an application that needs a configuration file with several inputs which depend on the project that is going to be delivered. Things that are included in this conf file are the IPs of databases, flags that activate certain functions depending on the customer's needs, the values of some title screens, etc. A short example of a file could be something like:
postgresdb=192.156.98.98
transactions.enabled=true
application.name="client-1-logistics"
historicaldb=196.125.125.16
....
These files can become large and it might be difficult to find which parameters must be changed, especially if the configuration process has to be done by an external department.
I was looking for some kind of tool or framework that lets you create a questionnaire in which the user answers yes or no questions and fills out boxes with specific IPs or messages, and gets the required configuration file as a result. This would be much tidier, as you could group the questions into sections, and it has the potential to customise the configuration process with more context on the different parameters.
Does anyone know of such a framework? How do you handle this kind of complex configuration process?
The approach I outline below is not exactly what you are looking for, but it might provide some food for thought.
1. Use a template engine (for example, Velocity, or any of the several dozen listed in Wikipedia) to create a templated version of your configuration file, containing lots of boilerplate configuration that won't change, with the occasional ${variable_name} placeholder (the syntax for a placeholder will vary from one template engine to another).
2. Write a small metadata file containing variable_name=value settings.
3. Write a trivial program that: (a) parses the metadata file and loads the variable_name=value settings into a Map (the template engine might refer to the Map as, say, a context object); (b) uses the template engine to parse the template file; (c) merges/evaluates/instantiates the parsed template file with the settings in the Map; and (d) writes the result to the target configuration file.
You might be able to use steps 1 and 3 above without change. It is only step 2 that you need to adapt to your questionnaire requirements. Instead of a questionnaire, perhaps you could give users a document that explains how to write the metadata file.
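The three steps above can be sketched in a few lines. This is only an illustration, assuming Python's standard-library string.Template as the template engine instead of Velocity; the placeholder names (postgres_ip, transactions_enabled, app_name) and the helper functions are made up for the example, and the template and metadata are inline strings rather than files.

```python
from string import Template


def parse_settings(text):
    """Parse 'name=value' lines into a dict, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip()
    return settings


def render_config(template_text, settings):
    """Fill ${name} placeholders; raises KeyError if a setting is missing."""
    return Template(template_text).substitute(settings)


# Step 1: the templated configuration file (hypothetical placeholder names).
TEMPLATE = """\
postgresdb=${postgres_ip}
transactions.enabled=${transactions_enabled}
application.name="${app_name}"
"""

# Step 2: the small metadata file with the answers for one customer.
METADATA = """\
# answers collected for one customer
postgres_ip=192.156.98.98
transactions_enabled=true
app_name=client-1-logistics
"""

# Step 3: parse the settings, merge them into the template, emit the result.
rendered = render_config(TEMPLATE, parse_settings(METADATA))
print(rendered)
```

Because substitute() fails on a missing placeholder, a forgotten answer shows up as an error instead of a silently broken configuration file.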

What are the output files of the VxWorks Workbench kernel configuration GUI

I'm trying to generate a VxWorks 6.9.4.8 kernel configuration that is identical to another kernel workbench project. The Workbench 3.3.6 only allows GUI configuration.
Is there an underlying kernel configuration file, produced by the GUI, which can be replaced?
After updating the kernel configuration using the Workbench GUI, I see the following files have changed:
linkSyms.c,
prjComps.h,
prjConfig.c, and
prjParams.h
I guess my question is: which one, if any, uniquely identifies the kernel as built?
prjComps.h will contain the names of all the components you have chosen in the kernel configuration GUI.
The first step in creating a new kernel configuration based on another one is to use the GUI configurator to add the components that are missing from prjComps.h. It is best to use a diff tool such as 'Beyond Compare' and keep reducing the differences by adding or removing components. Remember not to edit this file directly, but only via the GUI configurator, because the tool calculates the dependent components and adds or removes them.
Second step is to create the new prjParams.h as above.
Workbench also lets you edit the kernel configuration from the command line via the vxprj tool in VxWorks 6.9 (this tool has been replaced by "wrtool" in VxWorks 7). Right-click the image project and choose 'Open Wind River vxWorks 6.9 Development Shell'.
If you want to add a component, e.g. the telnet client (INCLUDE_TELNET_CLIENT), you can use the following command
vxprj component add INCLUDE_TELNET_CLIENT
To remove a component
vxprj component remove INCLUDE_TELNET_CLIENT
For more on the vxprj tool, you can look up the documentation in Workbench itself.
The project configuration is held in a handful of files in the kernel project directory.
These are:
.project
.cproject
.wrproject
projectname.wpj
Files such as prjComps.h, prjParams.h and prjConfig.c are all generated by the configuration tool; they are not configuration files themselves. Instead, they are generated C code that contains, amongst other things, a list of selected components.
These files are also re-generated, I believe, when you rebuild the project.
As such, these are not really the authoritative source you are interested in.
For this, you need to look at the project files. In terms of a list of components, the most interesting is the .wpj file, which contains amongst other things a list of explicitly and implicitly included components.
The explicitly included components are those you manually selected in the Kernel Configuration GUI, the implicitly included are those that were then included to satisfy dependencies.
This distinction can sometimes make comparing kernel configurations tricky. In that case you may want to fall back on the generated files, e.g. prjComps.h, but you should always remember that these are a representation of the configuration, not the source.
The .project etc. configuration files are big and complex, but a decent diff tool, such as Beyond Compare, can make comparisons of the project directories fairly easy.
Thanks for the clue, #endTunnel. I looked at that file, and noticed that a few files get modified when I save my GUI selections.
prjComps.h - all the components #included in the kernel build
prjParams.h - the additional parameters set for the enabled components
prjConfig.c - the configuration and initialization calls for each module included.
'linkSyms.c' also gets modified. Not sure how that is used, yet.
I can now use diff to compare kernel configurations, and perhaps even duplicate a configuration (haven't tried that yet).
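For the comparison itself, a plain diff is noisy if the two files list the components in a different order. Here is a rough sketch of comparing only the component sets, assuming each selected component appears in prjComps.h as a `#define INCLUDE_...` line (check this against your own generated files); the two header snippets below are invented stand-ins, not real Workbench output.

```python
import re

# Assumption: one '#define INCLUDE_<NAME>' line per selected component.
DEFINE_RE = re.compile(r"^\s*#\s*define\s+(INCLUDE_\w+)", re.MULTILINE)


def components(header_text):
    """Return the set of INCLUDE_* names defined in a prjComps.h-style text."""
    return set(DEFINE_RE.findall(header_text))


def compare(reference_text, candidate_text):
    """Return (only in reference, only in candidate) as sorted lists."""
    ref, cand = components(reference_text), components(candidate_text)
    return sorted(ref - cand), sorted(cand - ref)


# Hypothetical contents of two generated headers.
REFERENCE = """\
#define INCLUDE_KERNEL
#define INCLUDE_TELNET_CLIENT
"""
CANDIDATE = """\
#define INCLUDE_KERNEL
#define INCLUDE_SHELL
"""

missing, extra = compare(REFERENCE, CANDIDATE)
print("add via the configurator:", missing)
print("remove via the configurator:", extra)
```

The two lists are the components you would then add or remove with the GUI or vxprj, rather than by editing the generated file.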

Is there a preprocessor for json files? [closed]

I have some configuration files in which I store complex object values as serialized JSON. Currently there is a configuration file for each environment (localhost, dev, prod etc.) and for each client installation. Most of the values are identical across environments, but not all. So for three environments and four clients I currently have 12 files in total to manage.
If this were a web.config file, there would be web.config transforms that would solve the problem. If this were C#, I'd have compiler preprocessor directives that could be used to substitute the different values based on the current build configuration.
Does anyone know of anything that works basically this way, or have some good suggestions on tried and true ways to proceed? What I would like is to reduce the number of files to a single file for each installation that suffices for every environment.
Configuration of configuration always seems a bit overdone to me, but you could use a properties file for the parts that change, and apache ant's <replace> task to do the substitutions. Something like this:
<replace
    file="configure.json"
    propertyFile="config-of-config.properties">
  <replacefilter
      token="#token1#"
      property="property.key"/>
</replace>
Jsonnet from Google is a language whose syntax is a superset of JSON, adding high-level language features that help to model data in JSON format. The compilation step produces JSON. I have used it in a project to describe complex deployment environments that sometimes inherit from one another and that share domain attributes, albeit using them differently from one instance to another.
As an example, an instance contains applications, tenant subscriptions for those applications, contracts, destinations and so forth. The values of all of these attributes are objects that recur throughout environments.
Their docs are very thorough, and don't miss the std functions, because they make for some very powerful data rendering capabilities.
I wrote the Squirrelistic JSON Preprocessor, which uses Golang text template syntax to generate JSON files based on the parameters provided.
A JSON template can include references to other templates and use conditional logic, comments, variables and everything else that the Golang text/template package provides.
This really comes down to your full stack.
If you're talking about some application that runs solely client-side, with no server-side processing, whatsoever, then there's really no such thing as pre-processing.
You can process the data further before actually using it, but that won't mean that it will be processed prior to the page being served -- it means that people have to sit around, waiting for that to happen before the apps which need that data can be initialized.
The benefit of using JSON, to begin with is that it's just a data-store, and is quite language-agnostic, and quite widely supported, now. So if it's not 100% client-side, there's nothing stopping you from pre-processing in whatever language you're using on the server, and caching those versions of those files, to serve (and cache) to users, based on their need.
If you really, really need a system to do live processing of config-files, on the client-side, and you've gone through the work of creating app-views which load early, but show the user that they're deferring initialization (ie: "loading..."/spinners), then download a second JSON file, which holds all of the needed implementation-specific data (you'll have 12 of these tiny little files, which should be simple to manage), parse both JSON files into JS objects, and extend the large config object with the additional data in the secondary file.
Please note: Use localStorage or some other storage facility to cache this, so that for HTML5 browsers this longer load only happens one time.
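The "extend the large config object with the secondary file" step is language-agnostic. A minimal sketch of it in Python rather than client-side JS, with made-up keys and values, where a small per-environment file only states what differs from the shared base:

```python
import json


def deep_merge(base, override):
    """Return a new dict: override wins, nested dicts are merged recursively.

    Neither input is modified.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical shared config and a tiny environment-specific file.
BASE_JSON = '{"db": {"host": "localhost", "port": 5432}, "features": {"transactions": true}}'
PROD_JSON = '{"db": {"host": "db.prod.example"}}'

config = deep_merge(json.loads(BASE_JSON), json.loads(PROD_JSON))
print(json.dumps(config, indent=2, sort_keys=True))
```

Lists are replaced wholesale here instead of being merged element by element; whether that is what you want depends on your config.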
There is one, https://www.npmjs.com/package/json-variables
Conceptually, it is a function which takes a string (JSON contents sprinkled with specially marked variables) and produces a string with those variables resolved. It does for JSON what Sass or Less does for CSS: it's used to DRY up the source code.
Here's an example.
You'd put something like this in JSON file:
{
"firstName": "customer.firstName",
"message": "Hi %%_firstName_%%",
"preheader": "%%_firstName_%%, look what's inside"
}
Notice how it's DRY — single source of truth for the firstName value.
json-variables would process it into:
{
"firstName": "customer.firstName",
"message": "Hi customer.firstName",
"preheader": "customer.firstName, look what's inside"
}
that is, Hi %%_firstName_%% would look for firstName at the root level (but equally, it could be a deeper path, for example, data1.data2.firstName). Resolving also "bubbles up" to the root level, also you can use custom data structures and more.
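To make the idea concrete, here is a rough sketch of that kind of resolution. It is not the json-variables implementation or its API, only an illustration of the concept written in Python; it handles the %%_name_%% markers and dotted paths looked up from the root, does a single pass, and skips the library's other features such as bubbling up.

```python
import json
import re

# Matches %%_path_%% where path is a key or a dotted path from the root.
PLACEHOLDER = re.compile(r"%%_(.+?)_%%")


def lookup(root, path):
    """Follow a dotted path such as 'data1.data2.firstName' from the root."""
    node = root
    for part in path.split("."):
        node = node[part]
    return node


def resolve(node, root):
    """Return a copy of node with placeholders in strings replaced (one pass)."""
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(value, root) for value in node]
    if isinstance(node, str):
        return PLACEHOLDER.sub(lambda m: str(lookup(root, m.group(1))), node)
    return node


SOURCE = """
{
  "firstName": "customer.firstName",
  "message": "Hi %%_firstName_%%",
  "preheader": "%%_firstName_%%, look what's inside"
}
"""

data = json.loads(SOURCE)
resolved = resolve(data, data)
print(json.dumps(resolved, indent=2))
```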
Missing pieces of a JSON-processing task puzzle are:
Means to merge multiple JSON files, various ways (object-merge-advanced)
Means to orchestrate actions: Gulp is good if your preferred programming language is JS
Means to get/set values by path (object-path - its notation uses dots only, no brackets key1.key2.array.2 instead of key1.key2.array[2])
Means to maintain the same set of keys across set of JSON files - you add a key in one, it's added on all others (object-fill-missing-keys)
In the described case, there are at least two possible approaches: one-to-many, or many-to-many.
The former: Gulp could "bake" many JSON files from one or more JSON-like source files, with json-variables DRY-ing up the references.
The latter: alternatively, a "managed" set of JSON files could be rendered into a set of distribution files. Gulp watches the src folder and runs object-fill-missing-keys to normalise schemas, maybe even sorting objects (yes, it's possible, with sorted-object).
It all depends on how similar the desired JSON files are, how the values are customised, and whether that is done manually or programmatically.

Testing and mocking with Flex

I am developing a "dumb" front-end: an AIR application that interacts with a "smart" LiveCycle server. There are currently about 20 request and response pairs for the application. For many reasons (testing, developing outside the corporate network, etc.), we have several XML files of fake data, and if a certain configuration flag is set, the files are loaded and a specific file is parsed and used to create a mock response. Each XML file is a set of responses for a different situation, all internally consistent. We currently have about 10 XML files, each corresponding to a different situation we can run into. This is probably going to grow to 30-50 XML files.
The current system was developed by me during one of those 90-hour-week release cycles, when we were under duress because LiveCycle was down again and we had a deadline to meet. Most of the minor crap has been cleaned up.
The fake data is in an object called FakeData, with properties like customerType1:XML, customerType2:XML, overdueCustomer1:XML, etc. Then in the FakeData constructor, all of the properties are set like this:
customerType1 = FileUtil.loadXML(File.applicationDirectory.resolvePath("fakeData/customerType1.xml"));
And whenever you need some fake data (this happens in special FakeDelegates that extend the real LiveCycle Delegates), you get it from an instance of FakeData.
This is awful, for many reasons, but it works. One embarrassing part is that every time you create an instance of FakeData, it reloads all the XML files.
I'm trying to figure out if there's a design pattern that is not Singleton that can handle this more elegantly. The constraints are:
No global instances can be required (currently, all the code dealing with the fake data, including the fake delegates, is pulled out of production builds without any side-effects, and it needs to stay that way). This puts the Factory pattern out of the running.
It can handle multiple objects using the XML data without performance issues.
The XML files are read centrally so that the other code doesn't have to know where the XML files are, and so some preprocessing can be done (like creating a map of certain tag values and the associated XML file).
Design patterns, or other architecture suggestions, would be greatly appreciated.
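One shape that fits these constraints is a small repository object that loads each file lazily, caches the parsed result, and is handed to the fake delegates through their constructors instead of living in a global. A sketch of the idea follows, in Python rather than ActionScript so it can be run standalone; the class names, the read_text callable and the sample XML are all invented for the illustration.

```python
import xml.etree.ElementTree as ET


class FakeDataRepository:
    """Loads each fake-data document on first use and caches the parsed tree."""

    def __init__(self, read_text):
        # read_text: callable mapping a scenario name to its XML text.
        # Only this object knows where the XML actually lives.
        self._read_text = read_text
        self._cache = {}
        self.loads = 0  # counts real loads, just to show the caching

    def get(self, name):
        if name not in self._cache:
            self._cache[name] = ET.fromstring(self._read_text(name))
            self.loads += 1
        return self._cache[name]


class FakeCustomerDelegate:
    """Stand-in for a fake delegate; it receives the repository, no global."""

    def __init__(self, repository):
        self._repository = repository

    def customer_name(self, scenario):
        return self._repository.get(scenario).findtext("name")


# In-memory stand-in for the fakeData/ directory (hypothetical content).
FILES = {"customerType1": "<customer><name>Ada</name></customer>"}

repository = FakeDataRepository(FILES.__getitem__)
first = FakeCustomerDelegate(repository)
second = FakeCustomerDelegate(repository)
names = [first.customer_name("customerType1"), second.customer_name("customerType1")]
print(names, "loads:", repository.loads)
```

Whatever creates the fake delegates in a debug build creates one repository and passes it along, so production code never references it and each file is parsed at most once.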
Take a look at ASMock, which was developed by a good friend of mine (and a member here, Richard Szalay) and is based on .NET's Rhino Mocks. We've used it in several production environments now, so I can vouch for its stability.
You should be able to get rid of any fake tests (more like integration tests) by using the mock object instead.
Wouldn't it make more sense to do traditional mocking with a mocking framework? Depending on your implementation, it might be possible to set up the Expects by reading the fake-data XML files.
Here is a Google Code project that offers mocking for ActionScript.