Filter Example - Meta data - Attribute Range - rapidminer

I am new to RapidMiner. I used Filter Examples to remove rows I did not want from my input data, e.g. to remove rows where Attr1='AB'. But when I check the meta data, the range for Attr1 still includes 'AB'. Have I gone wrong somewhere?

The operator Remove Unused Values will remove any nominal values that are not used within the example set.

Related

Pass a Parameter in SSRS while removing Dashes

I am passing a unique ID as a parameter in an SSRS report. In the source table, the unique ID does not contain dashes. However, the user may enter the unique ID with or without dashes ("-"). Is there a way to remove dashes from the parameter?
For example, unique ID 3120-20268-8 is stored in the table as 3120202688. How can I retrieve records if the user passes multiple values, with or without dashes, in the SSRS report?
When I use the query below, it returns records for a single value only, and gives an error when more than one value is provided.
select * from Table
where Unique_ID in (REPLACE(@Unique_ID,'-',''))
For more than one value, it gives the errors below:
The replace function requires 3 argument(s).
Query execution failed for dataset 'ATL_List'.
Thanks
One of the simplest mechanisms for this is to create an expression based parameter to hold the sanitised input. This parameter would be hidden so the user is not aware of it, but the rest of the usage of the parameter is the same.
NOTE: You could do something similar with a query based default value, but this case is easier to do via a simple expression
Single Value Parameter
Create a new parameter:
set it to hidden
Set the default value expression:
=CStr(Parameters!inputID.Value).Replace("-","")
Multi-Value Parameter
This is only slightly trickier, in the expression we can join the selected values together into a CSV string, then process that value and then split it back:
Set the parameter to multi-value, but still hidden:
Set the default value expression:
=Join(Parameters!inputID.Value,",").Replace("-","").Split(",")
Without going into too much detail, if we made the sanitised parameter temporarily visible, just to demonstrate the conversion, it should look like this:
The parameter MUST be hidden!
NOTE: DO NOT make your sanitised parameter visible as in the above screenshot in your deployed report! Doing so means it will not pick up changes made to the input value after it has rendered the first time.
Remember that we have exploited the default value; we haven't defined an expression that always executes.
When the parameter is hidden, its value is calculated each time the report is rendered; it's just harder to visualise that behaviour in this static post:
In your DataSet query you would just use the sanitised parameter:
SELECT * FROM Table WHERE Unique_ID IN (@sanitisedMultiValue)
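As a rough illustration of what the multi-value expression does, here is the same join, replace, split sequence in Python. The function name `sanitise_ids` and the sample inputs are hypothetical, chosen only to mirror the example ID from the question; this is a sketch of the string logic, not SSRS code.

```python
def sanitise_ids(values):
    """Strip dashes from every ID, mirroring Join(...).Replace(...).Split(...)."""
    joined = ",".join(values)          # Join(values, ",")
    cleaned = joined.replace("-", "")  # .Replace("-", "")
    return cleaned.split(",")          # .Split(",")


if __name__ == "__main__":
    # Hypothetical user input: one ID typed with dashes, one without.
    print(sanitise_ids(["3120-20268-8", "3120202688"]))
```

The round trip through a comma-separated string only works because a comma never appears inside an ID; if it could, a different separator would be needed.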
You should be able to use the replace function in your report to format the parameter value after it has been entered, something like the below
replace(Fields!Paramater.Value,"-","")=FieldinYourTable

Filter Multivalue Parameter on Dataset

So I have a multi-value parameter that contains 3 options: >250K, <250K, >2M.
I also have a table that consists of multiple columns.
Because the parameter is multi-value, I am having difficulties filtering the dataset.
I need to filter the dataset by checking each option: if >250K is selected, filter the dataset accordingly; if <250K is selected, filter the dataset accordingly; and if >2M is selected, filter the dataset accordingly.
I was told to use a join and split on the parameter within the >250K condition, then do a contains to see if it contains any of the parameter values, but my coding knowledge is not advanced enough to do that.
Any suggestions? Thanks in advance.
I previously tried the method below, but then I came to realise that it won't work because the parameter is multi-value.
I know it's been a while since you raised this. You were on the right track, but all you should need to do is add a filter to the Tablix on the field you are filtering, use the 'In' operator, and in the Value box type [@Yourparametername]; the square brackets and case sensitivity are important. Also ensure the expression type is correct; in your case it looks like you are using Integer. The image should help.
If you want to use a multi-value parameter in the dataset, you can read the parameter values using JOIN.
Example:
If you want to read multiple values for @MyParamter in a dataset, as in the following example:
Dataset Parameters
you need to use =JOIN(Parameters!myMultiParamter.Value,",") as an expression to read all selected values in CSV form.
Expression
Now the @ParameterValues param holds all selected values as comma-separated values, and you can use them in your dataset code as your design requires.
Note: The separator does not have to be a comma; you can use any delimiter you want.
Your SQL query's WHERE clause should look like:
Where
(
(0 IN (@Parameter) AND ValueColumn<250000)
OR
(1 IN (@Parameter) AND ValueColumn>=250000)
OR
(2 IN (@Parameter) AND ValueColumn>=2000000)
)
One parameter
Two parameters
All parameters
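To make the logic of that WHERE clause easier to follow, here is a minimal Python sketch of the same condition. The codes 0, 1 and 2 and the thresholds come from the clause above; the function names and the sample values are hypothetical.

```python
def row_matches(selected, value):
    """Return True if value passes at least one of the selected range filters."""
    return (
        (0 in selected and value < 250_000)
        or (1 in selected and value >= 250_000)
        or (2 in selected and value >= 2_000_000)
    )


def filter_rows(selected, values):
    """Keep only the values that match the selected parameter codes."""
    return [v for v in values if row_matches(selected, v)]


if __name__ == "__main__":
    sample = [100, 250_000, 3_000_000]  # hypothetical ValueColumn contents
    print(filter_rows({0}, sample))
    print(filter_rows({0, 2}, sample))
```

Because the branches are combined with OR, selecting several options returns the union of the matching rows, and selecting code 1 already covers everything code 2 would match.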
Once you return the value, you can also use charindex or patindex and look for rows where the index is > 0. For instance, suppose the string returned from SSRS is '01,02,03' and your where clause uses something like right(field, 2), which would result in the value '03'. You change your where clause to where patindex('%' + right(field, 2) + '%', @returnedstring) > 0, which will give you results. This keeps you from having to parse apart the @returnedstring parameter in your SQL code.
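The pattern-matching idea can be sketched in Python as a plain substring test. The function names and the sample field values are hypothetical; `key_selected` mirrors the patindex check, while `key_selected_exact` is an alternative that compares whole items instead.

```python
def key_selected(field, returned_string):
    """Substring test, like patindex('%' + right(field, 2) + '%', s) > 0."""
    key = field[-2:]  # right(field, 2)
    return key in returned_string


def key_selected_exact(field, returned_string, sep=","):
    """Stricter variant: the key must equal one whole item in the list."""
    key = field[-2:]
    return key in returned_string.split(sep)


if __name__ == "__main__":
    print(key_selected("AB03", "01,02,03"))
    print(key_selected("AB04", "01,02,03"))
```

The substring test is safe when every code has the same fixed width, as in '01,02,03'. If codes could have different lengths, a short key might match inside a longer code, which is the case the stricter variant avoids.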

SSRS chart series labels: Field!axisfield.Value not current value

I am trying to dynamically format the labels on my SSRS charts based on the underlying value. I'm trying to do this in two scenarios, one to format dates as ordinals and another to choose the appropriate number of decimal places based on actual values present. However, when I use the expression editor with an expression something like this...
=IIF(MAX(ABS(Fields![axisfield].Value))<2, "0.0%","0%")
...the Fields![axisfield].Value is always returning the first value from the dataset, meaning, in this example, if the first value is less than two, the labels will be formatted with one decimal place, even if it is the only one less than two. (So the 'MAX' function is essentially irrelevant.)
That example is attempting to set the overall formatting based on the largest data point in the series, in this next one I'm trying to format each label separately to get Ordinal dates (i.e. 1st, 2nd, etc, and yes, this formula is incomplete: it doesn't need to be to illustrate the problem):
="dd"+IIF(DatePart("d", Fields!date.Value)=1,"\s\t"
,IIF(DatePart("d", Fields!date.Value)=2,"\n\d"
,IIF(DatePart("d", Fields!date.Value)=3,"\r\d"
,"\t\h")))
This will give 1st, 2st, 3st and so on, as the first row in the dataset is for the first.
So, my question is, how do I get round this and, in the first example get the true maximum, and in the second reference the actual value being formatted?
Thanks!
I've had the same issues with using custom functions for setting label visibility. (see my entry for this: How to Display Only 1 Value Label in SSRS 2012 Calculated/Derived Series? )
I believe the issue is that data and fields are bound to the underlying data series but are not bound and accessible within the label itself.
You should be able to set the formatting in the function for the series data itself (as in the 2nd example) and then just set data labels, which will use the underlying series field value. An example with your data might be something like the following, which returns the values with the format:
="dd"+IIF(DatePart("d", Fields!date.Value)=1,Format(Fields!date.Value, "\s\t")
,IIF(DatePart("d", Fields!date.Value)=2,Format(Fields!date.Value,"\n\d")
,IIF(DatePart("d", Fields!date.Value)=3,Format(Fields!date.Value, "\r\d")
,Format(Fields!date.Value,"\t\h"))))
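The question notes that its ordinal formula is deliberately incomplete. For reference, the full day-of-month rule (including the 11th, 12th and 13th exceptions) can be sketched in Python as follows; the function names are hypothetical and this is not an SSRS expression.

```python
def ordinal_suffix(day):
    """Return the English ordinal suffix for a day of the month."""
    if 11 <= day % 100 <= 13:  # 11th, 12th, 13th are exceptions
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def ordinal_day(day):
    """Format a day number with its suffix, e.g. 2 becomes '2nd'."""
    return f"{day}{ordinal_suffix(day)}"


if __name__ == "__main__":
    print([ordinal_day(d) for d in (1, 2, 3, 4, 11, 21, 22, 23, 31)])
```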
In the first example, you can get the max value by referring to the Dataset, as opposed to the field. Your code would then be:
=IIF(ABS(MAX(Fields![axisfield].Value, "YourDatasetName"))<2, "0.0%","0%")
(I changed the order of operations for Abs and Max because you have to use an aggregate function when referring to the whole dataset. Only then can you refer to the specific value.)
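Swapping the order of Abs and Max can change the result when the data contains negative values. The Python sketch below, with hypothetical function names and sample data, compares the two orderings so you can judge whether the difference matters for your series.

```python
def max_of_abs(values):
    """Largest magnitude in the series: MAX(ABS(x))."""
    return max(abs(v) for v in values)


def abs_of_max(values):
    """Magnitude of the largest value: ABS(MAX(x))."""
    return abs(max(values))


def label_format(values):
    """Pick a percent format from ABS(MAX(x)), as in the expression above."""
    return "0.0%" if abs_of_max(values) < 2 else "0%"


if __name__ == "__main__":
    positives = [0.5, 1.5]   # both orderings agree here
    mixed = [-3.0, 1.0]      # the orderings disagree here
    print(max_of_abs(positives), abs_of_max(positives))
    print(max_of_abs(mixed), abs_of_max(mixed))
```

For all-positive data the two orderings give the same number, so the format choice is unaffected. With a large negative value, ABS(MAX(x)) ignores it, which may or may not be what the labels should reflect.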

Splitting Values in new column in rapidminer 5

In RapidMiner 5 I would like to split the values of a column and put the new values in a new column. For example:
BIC|"Cognome"|"Nome"|"Via_completa"|"Civico"|"Esponente"|"CAP"|"Frazione"|"Comune"|"Provincia"
50417273|"ACCROCIA"|"ALESSANDRO"|"VIA NAPOLI"|"66"||"00100"||"MILANO"|"MI"
The "Via_completa" column is made up of the values 'VIA' and 'NAPOLI'. To normalize the address according to my data set, I would like to split the values 'VIA' and 'NAPOLI' in the column "Via_completa", create a new column called 'DUG', and place the value 'VIA' in the new column.
Like this:
BIC|"Cognome"|"Nome"|"DUG"|"Via_completa"|"Civico"|"Esponente"|"CAP"|"Frazione"|"Comune"|"Provincia"
50417273|"ACCROCIA"|"ALESSANDRO"|"VIA"|"NAPOLI"|"66"||"00100"||"MILANO"|"MI"
In Excel there is a 'text to columns' function. Is there an operator in RapidMiner that performs this function?
Thanks,
Friso
The Split operator is one option. You need to supply a regular expression to the split pattern parameter of this operator. For a space \s+ should suffice. This outputs new attributes with names based on the original attribute but with _1, _2, ... appended so you might need to do some renaming afterwards.
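To show what splitting on \s+ produces, here is a small Python sketch that imitates the described behaviour, naming the pieces with _1, _2, ... suffixes. It is not RapidMiner code; the function name `split_attribute` is hypothetical, and the second sample street name is invented for illustration.

```python
import re


def split_attribute(name, value, pattern=r"\s+"):
    """Split value on a regex and name the parts name_1, name_2, ..."""
    parts = re.split(pattern, value)
    return {f"{name}_{i}": part for i, part in enumerate(parts, start=1)}


if __name__ == "__main__":
    print(split_attribute("Via_completa", "VIA NAPOLI"))
    # A street name with more words yields more than two parts.
    print(split_attribute("Via_completa", "VIA GIUSEPPE VERDI"))
```

In the question's example, Via_completa_1 would then be renamed to DUG. Street names containing several words produce additional parts, so some post-processing may be needed to rejoin everything after the first token.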

Replace Variable Numerical Value in String

I am in the process of migrating between CMSs. The old forum embedded attachments using [attachment=attachment#]imageURL[/attachment], and I want to convert the old attachment tags into [img][/img].
Getting the closing tag replaced with [/img] was easy, given that it's a single fixed string. My problem is that the opening tag contains a unique numerical value in the range 1-4000, e.g. [attachment=3789]picture.jpg[/attachment].
Is there a way for me to run a similar replace query that either ignores the number in the opening tag and just replaces it, or removes the entire number range within that part of the string?
I am unable to replace all numbers in that field because the image names may contain numerical values, so the replacement must apply only to the number within that tag.
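One possible approach, sketched here in Python rather than as a database query, is a regular expression that matches the digits only when they sit inside the opening tag. The function name and the second sample file name are hypothetical; whether your database offers an equivalent regex replace depends on its engine and version.

```python
import re

# Matches only the opening tag, e.g. [attachment=3789]
ATTACHMENT_OPEN = re.compile(r"\[attachment=\d+\]")


def convert_attachment_tags(text):
    """Rewrite [attachment=N]...[/attachment] as [img]...[/img]."""
    text = ATTACHMENT_OPEN.sub("[img]", text)
    return text.replace("[/attachment]", "[/img]")


if __name__ == "__main__":
    print(convert_attachment_tags("[attachment=3789]picture.jpg[/attachment]"))
    # Digits in the file name are left untouched.
    print(convert_attachment_tags("[attachment=12]img2024.jpg[/attachment]"))
```

Because the pattern is anchored to the literal text [attachment= and the closing bracket, numbers that appear in image names are never modified.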