PDI: how to replace a variable name in metadata

I use PDI 4.4 and have many transformations and jobs that use the ${checkPeriod} variable.
I want to copy this repository to a new project and replace ${checkPeriod} with ${checkPeriod_dy} in all transformations and jobs.
How can I replace all occurrences of the variable?
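There is no built-in repository-wide rename that I know of, but one possible approach, assuming a file-based repository where transformations are stored as .ktr files and jobs as .kjb files (both plain XML), is a bulk text replacement over the copied project. A minimal sketch; the path and class name are hypothetical, and for a database repository this does not apply:

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RenameVariable {
    public static void main(String[] args) throws IOException {
        Path root = Paths.get("/path/to/copied/project"); // hypothetical location
        Files.walk(root)
             .filter(p -> p.toString().endsWith(".ktr") || p.toString().endsWith(".kjb"))
             .forEach(p -> {
                 try {
                     String xml = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
                     // plain-text replacement of the variable reference;
                     // PDI also accepts the %%checkPeriod%% syntax, so cover it too
                     String updated = xml.replace("${checkPeriod}", "${checkPeriod_dy}")
                                         .replace("%%checkPeriod%%", "%%checkPeriod_dy%%");
                     if (!updated.equals(xml)) {
                         Files.write(p, updated.getBytes(StandardCharsets.UTF_8));
                     }
                 } catch (IOException e) {
                     throw new UncheckedIOException(e);
                 }
             });
    }
}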

Related

Using specific CSV rows in JMeter CSV dataset (based on condition)

I have a single CSV file with the following sample data:
Column headers:
USER_VAR,PASSWORD_VAR,APP1,APP2,APP3
Actual data:
username1,password1,true,blank,blank
username2,password2,true,blank,blank
username3,password3,blank,true,blank
username4,password4,blank,true,blank
username5,password5,blank,blank,true
I have 3 different (independent) JMeter Scripts.
JMeterScript-APP1
JMeterScript-APP2
JMeterScript-APP3
I want to use the same CSV file with all these 3 scripts in such a way that each script only uses the rows specific to it in the given CSV.
JMeterScript-APP1 should only process first 2 rows (APP1=true).
JMeterScript-APP2 should only process 3rd and 4th row (APP2=true).
JMeterScript-APP3 should only process 5th row (APP3=true).
I have tried implementing this scenario using the CSV Data Set Config, but it seems that JMeter does not provide any built-in support for this. Can anyone share workarounds for doing it?
If required, I can also manipulate the data in the last 3 columns of the CSV before feeding it to the JMeter scripts.
I believe the easiest way is using an If Controller where you can check whether this or that variable is set, like:
${__javaScript(vars.get("APP1") != null,)}
So if the ${APP1} variable is not defined, the underlying samplers will not be executed; this way you will be able to "skip" the unwanted samplers.
vars is a shorthand for the JMeterVariables class instance; it provides read/write access to all JMeter Variables in scope.
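Note that in the sample CSV above the unused cells hold the literal text blank rather than being empty, so (an assumption on my part, depending on how the file is prepared) you may need either to clear those cells or to compare the value instead, for example:
${__javaScript("${APP1}" == "true",)}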

Passing a path as a parameter in Pentaho

In a Job I am checking whether the file that I want to read is available or not. If this CSV exists, I want to read the data and save it in a database table within a transformation.
This is what I have done so far:
1) I have created the job, 2) I have implemented some parameters, one of them with the path for the file, 3) I have indicated that I am going to pass this value to the transformation.
Now, the thing is, I am sure this should be something very simple to implement, but even though I have followed some blogs, I have not succeeded with this part of the process. I've tried to follow this example:
http://diethardsteiner.blogspot.com.co/2011/03/pentaho-data-integration-scheduling-and.html
My question remains the same: how can I indicate to the transformation that it has to use the parameter that I am passing to it from the job?
You just mixed up the columns.
Parameter should be the name of the parameter in the transformation you are running.
Value is the value you are passing.
Since you are passing a variable and not a constant value, you use the ${} syntax to indicate this.
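For example, assuming the transformation declares a parameter named filePath (a hypothetical name) and the job sets a variable of the same name, the Parameters tab of the job entry would be filled in as:
Parameter: filePath
Value: ${filePath}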

How to combine multiple MySQL databases using D2RQ?

I have four different MySQL databases that I need to convert into Linked Data and then run queries on the aggregated data. I have generated the D2RQ maps separately and then manually copied them together into a single file. I have read up on some material about customizing the maps but am finding it hard to do so in my case because:
The ontology classes do not correspond to table names. In fact, most classes are column headers.
When I open the combined mapping in Protege, it generates only 3 classes (ClassMap, Database, and PropertyBridge) and lists all the column headers as instances of these.
If I import this file into my ontology, everything becomes annotations.
Please suggest an efficient way to generate a single graph that is formed by mapping these databases to my ontology.
Here is an example. I am using the EEM ontology to refine the mapping file generated by D2RQ. This is a section from the mapping file:
map:scan_event_scanDate a d2rq:PropertyBridge;
d2rq:belongsToClassMap map:scan_event;
d2rq:property vocab:scan_event_scanDate;
d2rq:propertyDefinitionLabel "scan_event scanDate";
d2rq:column "scan_event.scanDate";
# Manually added
d2rq:datatype xsd:int;
.
map:scan_event_scanTime a d2rq:PropertyBridge;
d2rq:belongsToClassMap map:scan_event;
d2rq:property vocab:scan_event_scanTime;
d2rq:propertyDefinitionLabel "scan_event scanTime";
d2rq:column "scan_event.scanTime";
# Manually added
d2rq:datatype xsd:time;
.
The ontology I am interested in has the following:
Data property: eventOccurredAt
Domain: EPCISevent
Range: datetime
Now, how should I modify the mapping file so that the date and time are two different relationships?
I think the best way to generate a single graph of your 4 databases is to convert them one by one to a Jena Model using D2RQ, and then use the Union method to create a global model.
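A minimal sketch of that approach, assuming D2RQ 0.8.x (whose ModelD2RQ class and bundled pre-Apache Jena use the package names below); the mapping file names are hypothetical:

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import de.fuberlin.wiwiss.d2rq.jena.ModelD2RQ;

public class UnionOfDatabases {
    public static void main(String[] args) {
        // one D2RQ-backed model per database, each with its own mapping file
        Model db1 = new ModelD2RQ("file:mapping-db1.ttl");
        Model db2 = new ModelD2RQ("file:mapping-db2.ttl");
        Model db3 = new ModelD2RQ("file:mapping-db3.ttl");
        Model db4 = new ModelD2RQ("file:mapping-db4.ttl");

        // createUnion merges two models, so nest the calls to cover all four
        Model union = ModelFactory.createUnion(
                ModelFactory.createUnion(db1, db2),
                ModelFactory.createUnion(db3, db4));

        // the union model can now be queried (e.g. with SPARQL) as one graph
        System.out.println("Triples in union: " + union.size());
    }
}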
For your D2RQ mapping file, you should read The mapping language carefully; it's not normal to have classes corresponding to columns.
If you give an example of your table structure, I can give you an illustration of a mapping file.
Good luck

Which data flow transform component is able to separate & extract data by specifying a delimiter type such as a comma or a pipe?

If a table field contains a string value delimited by a pipe, like:
"John|Lee|12-01-2015"
when SSIS reads it, is there an SSIS component where you can specify a separator and map each part into the next flow stage?
Say the data flow reads it ("John|Lee|12-01-2015") into a variable; can anything then separate the whole string by | and map each sub-value into a set of columns via configuration?
I know it is pretty simple to use C# to do it, and in SSIS a Script Component can be used. I just wonder if there is a dedicated SSIS component for it, pretty much like a flat file source, but one that reads from a variable rather than a file.
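For illustration, the splitting logic that such a Script Component would carry (in C# in a real package) boils down to a delimiter split; a minimal sketch in Java, with the output column names being my own assumption:

public class PipeSplitDemo {
    public static void main(String[] args) {
        String row = "John|Lee|12-01-2015";
        // split on the pipe; the argument is a regex, so the pipe is escaped
        String[] parts = row.split("\\|");
        String firstName = parts[0]; // "John"  (column names are assumed)
        String lastName  = parts[1]; // "Lee"
        String date      = parts[2]; // "12-01-2015"
        System.out.println(firstName + " / " + lastName + " / " + date);
    }
}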

JMeter / AMQ - Substitute substring when reading strings from JSON file

I've been bashing against a brick wall on this ever since Monday, when the customer told me that we needed to simulate up to 50,000 pseudo-concurrent entities for the purposes of performance testing. This is the setup. I have text files full of JSON objects containing JSON data that looks a bit like this:
{"customerId"=>"900", "assetId"=>"NN_18_144", "employee"=>"", "visible"=>false,
"GenerationDate"=>"2012-09-21T09:41:39Z", "index"=>52, "Category"=>2...}
It's one object to a line. I'm using JMeter's JMS Publisher to read the lines sequentially:
${__StringFromFile(${PATH_TO_DATA_FILES}scenario_9.json)}
Each of the files contains a different scenario.
What I need to do is read the files in and substitute assetId's value with a randomly selected value from a list of 50,000 non-sequential, pre-generated strings (I can't possibly have a separate file for each assetId, as that would involve littering the load injector with 50,000 files and configuring a thread group within JMeter for each). Programmatically, it's a trivial matter to perform the substitution, but it's not so simple to do it in JMeter on the fly.
Normally, I'd treat this as the interesting technical challenge that it is and spend a few days working it out, but I only have the weekend, which I suspect I'll spend sleeping overnight in the office anyway.
Can anyone help me with this, please?
Thanks.
For reading your assets, use a CSV Data Set Config; I suppose assetId will be the variable name.
Modify your expression:
${__StringFromFile(${PATH_TO_DATA_FILES}scenario_9.json, lineToSubstitute)}
To do the substitution, add a Beanshell Sampler or a JSR223 Sampler (using Groovy) and code the substitution:
// read the values set by the CSV Data Set Config and __StringFromFile
String assetId = vars.get("assetId");
String lineToSubstitute = vars.get("lineToSubstitute");
// one possible substitution (an assumption, based on the "key"=>"value"
// format shown above): swap whatever assetId value the line carries
String lineSubstituted = lineToSubstitute.replaceAll("\"assetId\"=>\"[^\"]*\"", "\"assetId\"=>\"" + assetId + "\"");
vars.put("lineSubstituted", lineSubstituted);
If your JSON body is always the same or you have only small changes in it, you should:
Use an HTTP Sampler with a RAW POST body
Put the JSON body in it, with variables for the asset ids (see the sketch after this list)
Put the asset ids in a CSV Data Set Config
Avoid using ${__StringFromFile} as it has a cost.
If you need scripting, use a JSR223 Post Processor with the script in an external file + caching (available since JMeter 2.8) so that the script is compiled.
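A minimal sketch of such a parameterized body, assuming the CSV Data Set Config stores each pre-generated id in the assetId variable (the other fields are copied verbatim from the sample line above):
{"customerId"=>"900", "assetId"=>"${assetId}", "employee"=>"", "visible"=>false, "GenerationDate"=>"2012-09-21T09:41:39Z", "index"=>52, "Category"=>2...}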