Splitting Values in new column in rapidminer 5 - rapidminer

In RapidMiner 5 I would like to split the values of a column and put the new values in a new column. For example:
BIC|"Cognome"|"Nome"|"Via_completa"|"Civico"|"Esponente"|"CAP"|"Frazione"|"Comune"|"Provincia"
50417273|"ACCROCIA"|"ALESSANDRO"|"VIA NAPOLI"|"66"||"00100"||"MILANO"|"MI"
The "Via_completa" column is made up of the values 'VIA' and 'NAPOLI'. To normalize the address according to my data set, I would like to split the values 'VIA' and 'NAPOLI' in the column "Via_completa", create a new column called 'DUG', and place the value 'VIA' in the new column.
Like this:
BIC|"Cognome"|"Nome"|"DUG"|"Via_completa"|"Civico"|"Esponente"|"CAP"|"Frazione"|"Comune"|"Provincia"
50417273|"ACCROCIA"|"ALESSANDRO"|"VIA"|"NAPOLI"|"66"||"00100"||"MILANO"|"MI"
In Excel there is a 'text to columns' function. Is there an operator in RapidMiner that does the same thing?
Thanks,
Friso

The Split operator is one option. You need to supply a regular expression to its split pattern parameter; for a space, \s+ should suffice. The operator outputs new attributes named after the original attribute with _1, _2, ... appended, so you might need to do some renaming afterwards.
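To see what the \s+ pattern does to the example value, here is a minimal Python sketch. It only mimics the behavior described above (regex split, then _1, _2, ... suffixes); the function name split_attribute is hypothetical and is not RapidMiner's API.

```python
import re


def split_attribute(name, value, pattern=r"\s+"):
    """Split a value on a regex and name the pieces name_1, name_2, ..."""
    parts = re.split(pattern, value.strip())
    return {f"{name}_{i}": part for i, part in enumerate(parts, start=1)}


row = split_attribute("Via_completa", "VIA NAPOLI")
print(row)  # {'Via_completa_1': 'VIA', 'Via_completa_2': 'NAPOLI'}

# The renaming step: first piece becomes DUG, second stays as Via_completa.
renamed = {"DUG": row["Via_completa_1"], "Via_completa": row["Via_completa_2"]}
print(renamed)  # {'DUG': 'VIA', 'Via_completa': 'NAPOLI'}
```

Note that a street name containing several words would produce more than two pieces with this pattern, so the renaming step would need to account for that.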

Related

Comparing multiple fields in two datasets to return a 3rd value

I am in Report Builder and my primary dataset comes from a SQL database. I also created a second dataset (enter data). I need to compare two fields from each dataset to retrieve the correct value from the second dataset and populate a column on my report. I have tried IIF and Lookup expressions but I keep getting the error "report item expressions can only refer to fields within the current dataset".
I have attached a screenshot of what I am trying to do.
This is the IIF statement I tried to use: if AcctNum and ProdId match each other, return IncodeNumber.
=IIF((Fields!AcctNum.Value=Fields!AcctNum.Value, "IncodeAccount") AND
(Fields!ProdId.Value =Fields!ProdId.Value, "IncodeAccount")),(Fields!IncodeNumber.Value, "IncodeAccount"),"True")
See code in my problem.
You need to use LOOKUP(). The problem with LOOKUP() is that it can only compare a single value from each dataset. However, we can easily get around this by concatenating the two values you need to compare.
Note: This assumes the expression will be in a tablix that is bound to your first dataset and that IncodeAccount is your second dataset, the one holding the values you want to look up. If this is not the case, just adjust the expression accordingly.
So in your case, you probably need something like this:
=LOOKUP(
Fields!AcctNum.Value & "||" & Fields!ProdId.Value,
Fields!AcctNum.Value & "||" & Fields!ProdId.Value,
Fields!IncodeNumber.Value,
"IncodeAccount"
)
I've used two pipe symbols to join the values to avoid incorrect matches being found. For example, Account 123 with product ID 4567 would incorrectly match Account 1234 with product ID 567, as they would both be 1234567 when joined. By using the ||, the keys would be 123||4567 and 1234||567 respectively.
You may need to convert the values to strings using CStr().
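The collision described above is easy to reproduce outside SSRS. The following Python sketch uses hypothetical sample rows and IncodeNumber values (they are not from the question) and a hypothetical lookup helper that returns the first match on the concatenated key.

```python
def composite_key(acct_num, prod_id, sep="||"):
    """Join the two values into one string key, as in the LOOKUP expression."""
    return f"{acct_num}{sep}{prod_id}"


# Hypothetical stand-in for the IncodeAccount dataset.
incode_account = [
    {"AcctNum": 1234, "ProdId": 567, "IncodeNumber": "A-1"},
    {"AcctNum": 123, "ProdId": 4567, "IncodeNumber": "B-2"},
]


def lookup(acct_num, prod_id, rows, sep="||"):
    """Return IncodeNumber of the first row whose composite key matches."""
    wanted = composite_key(acct_num, prod_id, sep)
    for r in rows:
        if composite_key(r["AcctNum"], r["ProdId"], sep) == wanted:
            return r["IncodeNumber"]
    return None


print(lookup(123, 4567, incode_account))          # B-2
print(lookup(123, 4567, incode_account, sep=""))  # A-1 (wrong row: both keys are 1234567)
```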
Alternative approach
If you are going to do this 'join' multiple times in the same dataset then you could add a calculated column to the dataset that concatenates the two columns. Then you can use this single field in the lookup which will make things a little simpler.
Or, you could do this concatenation in a database view which would make things even easier.

What is the SSRS Multi Value Data Type and how to use

I have a multi-select.
I think the underlying datatype is int || array(int). This is pretty frustrating that you have to do a check to see if a multi-value is present before jumping into an index. But how does this value get passed to SQL?
It's easy enough to use in an IN (@variable) statement. How else can it be used? Is it a string or a table? From my investigations it appears to be a single table row with many unnamed columns, but I'm not really sure.
Finally, when you want to simulate a multi-select in a query inside visual studio, for example to "Refresh Fields" how do you do that? For example "1,2,3", {1,2,3} or #{1,2,3}. It's not (123) because that is -123.
It depends on what you are trying to do and in what context.
As you said, if you have a dataset query that is a SQL script (as opposed to a stored proc), then you can use IN(@paramName). In this instance SSRS takes the parameter values (not the labels) and injects them into the SQL statement as a string, e.g. '1,2,3'. The result would be IN(1,2,3). If you want to pass in a list of, say, countries, then you would have to set the parameter values to be the same as the parameter labels. So rather than Value = 1, Label = Spain you would have Value = Spain and Label = Spain. Used in an IN(), this would generate something like IN('Spain', 'France').
If you try to do the same with a stored proc, e.g. EXEC myProc @myParam, then the parameter values would be passed as a single string, which would then need to be split out by the proc.
If you just want to show a list of the selected parameter values or labels in your report, then you can simply do something like
=JOIN(Parameters!myParam.Value, ",")
or
=JOIN(Parameters!myParam.Label, ",")
where "," is the delimiter
If you pop this expression in a text box, you'll get a list of the selected parameter values/labels.
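As a rough illustration of the round trip described above (join the selected values into one delimited string, then split that string again on the receiving side), here is a small Python sketch. The data structure and the helper join_values are hypothetical; Spain = 1 follows the example above and France = 2 is an assumed value.

```python
# Hypothetical selection in a multi-value parameter.
selected = [
    {"value": 1, "label": "Spain"},
    {"value": 2, "label": "France"},
]


def join_values(items, field, delimiter=","):
    """Mimic =JOIN(Parameters!myParam.Value, ",") for either values or labels."""
    return delimiter.join(str(item[field]) for item in items)


values_csv = join_values(selected, "value")
labels_csv = join_values(selected, "label")
print(values_csv)  # 1,2
print(labels_csv)  # Spain,France

# What a receiving procedure would have to do with the single string.
ids = [int(part) for part in values_csv.split(",")]
print(ids)  # [1, 2]
```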
I think it's a kind of madness but I found a workaround to get a table of values from the results from SSRS. I query the IDs against a source table using IN(). I hope there is a better way of doing this?
SELECT [TblFeeBillingCycleID]
FROM [TblFeeBillingCycle]
WHERE [TblFeeBillingCycleID] IN(@intCycleId)

Filter Multivalue Parameter on Dataset

I have a multi-value parameter that contains 3 options: >250K, <250K, >2M.
I also have a table that consists of multiple columns.
Because the parameter is multi-value, I am having difficulty filtering the dataset.
I need to filter the dataset by checking, (if > 250K is selected, filter the dataset accordingly), (if < 250K is selected, filter the dataset accordingly) and (if > 2M is selected, filter the dataset accordingly).
I was told to use a join and split on the parameter within the >250K condition, then do a contains to see if it contains any of the parameter values, but I am not advanced enough in coding to be able to do that.
Any suggestions? Thanks in advance.
I previously tried the method below, but then I came to realise that it won't work because the parameter is multi-value.
I know it's been a while since you raised this. You were on the right track, but all you should need to do is add a filter to the Tablix on the field you will be filtering, use the 'In' operator, and in the Value box type [@Yourparametername]. The square brackets and case sensitivity are important. Also ensure the expression type is correct; in your case it looks like you are using Integer. The image should help.
If you want to use multi-value parameters in the dataset, you can read the parameter values using JOIN.
Example:
If you want to read multiple values for @MyParamter in a dataset, then in the dataset parameters you need to use =JOIN(Parameters!myMultiParamter.Value,",") as the expression to read all selected values in CSV form.
Now the @ParameterValues param has all selected values as comma-separated values, and you can use them in your dataset code as your design requires.
Note: It's not necessary to use a comma; you can use anything you want to separate the values.
Your SQL query's WHERE clause should look like this:
WHERE
(
(0 IN (@Parameter) AND ValueColumn<250000)
OR
(1 IN (@Parameter) AND ValueColumn>=250000)
OR
(2 IN (@Parameter) AND ValueColumn>=2000000)
)
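The logic of that WHERE clause can be checked quickly in Python. This is only a sketch of the predicate: the option codes 0, 1 and 2 and the thresholds are taken from the query above, while the sample amounts and the function name matches are made up for illustration.

```python
def matches(value, selected):
    """Same boolean logic as the WHERE clause; selected holds the chosen codes."""
    return (
        (0 in selected and value < 250_000)
        or (1 in selected and value >= 250_000)
        or (2 in selected and value >= 2_000_000)
    )


amounts = [100_000, 300_000, 5_000_000]  # hypothetical ValueColumn values

print([v for v in amounts if matches(v, {0})])        # [100000]
print([v for v in amounts if matches(v, {1})])        # [300000, 5000000]
print([v for v in amounts if matches(v, {0, 2})])     # [100000, 5000000]
print([v for v in amounts if matches(v, {0, 1, 2})])  # [100000, 300000, 5000000]
```

Because the branches are combined with OR, selecting several options returns the union of the matching rows.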
Once you have the returned value, you can also use CHARINDEX or PATINDEX and keep the rows where the index is > 0. For instance, suppose the string returned from SSRS is '01,02,03' and your WHERE clause compares against something like right(field, 2), which would result in the value '03'. You change your WHERE clause to where patindex('%' + right(field, 2) + '%', @returnedstring) > 0, which will give you results. This keeps you from having to parse apart the @returnedstring parameter in your SQL code.
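The pattern test above amounts to a substring check. A minimal Python sketch of the same idea follows; the field values such as 'AB03' and both function names are hypothetical. It also shows one caveat worth keeping in mind: if the codes differ in length, a plain substring test can match part of a longer code.

```python
def contains_code(returned_string, field):
    """Substring test, analogous to PATINDEX('%' + RIGHT(field, 2) + '%', s) > 0."""
    code = field[-2:]  # last two characters, like RIGHT(field, 2)
    return code in returned_string


def contains_code_strict(returned_string, field, sep=","):
    """Stricter variant: the code must equal one whole delimited item."""
    return field[-2:] in returned_string.split(sep)


print(contains_code("01,02,03", "AB03"))  # True
print(contains_code("01,02,03", "AB04"))  # False

# Caveat: with codes of different lengths a substring test can over-match.
print(contains_code("123,45", "AB12"))         # True ('12' is inside '123')
print(contains_code_strict("123,45", "AB12"))  # False
```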

lookup in ssrs and return a value

I have a table with three columns, in column 1 I have a name and column 2 I have a quantity.
I want to look up column 1 and return the value in column 2.
I used the expression below and it wouldn't return what I wanted.
=Lookup(fields!NAME.Value, "Paul" ,1 , 0)
Could anybody tell me what expression I need to use?
You are close, sort of. To use the Lookup function properly, you need to change some things. Here is an example using the Lookup function.
=Lookup(Fields!Field1.Value, Fields!Field1.Value, Fields!Field2.Value, "DatasetB")
It takes 4 parameters. The first is the field/value from the current (in scope) data set that you want to match in the lookup data set. The second is the field in the lookup data set to match on. The third is the field to return when a match is found, and the last parameter is the name of the lookup data set.
Based on the expression in your question, it may actually work like this:
=Lookup("Paul" , Fields!NAME.Value, Fields!QUANTITY.Value , "DataSet2")
Of course, hard coding the name in the first parameter is probably not what you want to do.
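To make the four arguments concrete, here is a small Python sketch of the same first-match semantics. The helper lookup and the rows in dataset2 are hypothetical; only the NAME and QUANTITY field names come from the expression above.

```python
def lookup(source_value, match_field, result_field, dataset):
    """Return result_field from the first row whose match_field equals source_value."""
    for row in dataset:
        if row[match_field] == source_value:
            return row[result_field]
    return None  # no match found


# Hypothetical contents of "DataSet2".
dataset2 = [
    {"NAME": "Paul", "QUANTITY": 5},
    {"NAME": "Anna", "QUANTITY": 3},
]

print(lookup("Paul", "NAME", "QUANTITY", dataset2))  # 5
print(lookup("Zoe", "NAME", "QUANTITY", dataset2))   # None
```

In a report you would normally pass a field from the current row as the first argument instead of the hard-coded "Paul".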

SSRS lookupset - row output

I am looking to use the LookupSet function. I have a one-to-many relationship where one Risk has many Actions.
Using this function I can return the Actions as a comma-separated line. However, what I really need is for each entry to display in a new row of a table. Is this possible?
I have replaced the comma so that instead of creating a comma-separated line it inserts a newline. However, I need the entries to line up with other values, which is why I would like them to come out as rows in a table.
Can it display on separate lines rather than separate rows? If so, then you can simply separate the fields using carriage return/line feed pairs, like so:
=Join(LookupSet(Fields!Id.Value, Fields!Id.Value, Fields!FieldToLookup.Value, "LookupDataset"), vbCrLf)
Otherwise you are better off defining a subreport and embedding that in your table. The subreport is simply a report that is just a table with the report taking a parameter. You add this to the table in the main report using the toolbar and then set the parameters in the subreport properties to pass in the identification field name from the table's dataset.
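For the separate-lines option, the Join(LookupSet(...), vbCrLf) expression behaves roughly like the following Python sketch: collect every matching value, then join them with a carriage return/line feed pair. The helper lookup_set, the Action field and the sample rows are hypothetical.

```python
def lookup_set(source_value, match_field, result_field, dataset):
    """Return result_field from every row whose match_field equals source_value."""
    return [row[result_field] for row in dataset if row[match_field] == source_value]


# Hypothetical one-to-many data: one Risk Id, several Actions.
actions = [
    {"Id": 1, "Action": "Review policy"},
    {"Id": 1, "Action": "Train staff"},
    {"Id": 2, "Action": "Audit"},
]

# "\r\n" is the carriage return/line feed pair that vbCrLf stands for.
cell_text = "\r\n".join(lookup_set(1, "Id", "Action", actions))
print(cell_text)
```

The result is still a single cell containing several lines, which is why it will not line up row by row with other columns.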