Storing text with name in sequence - rapidminer

I am using the RapidMiner 5 GUI and I want to store all the values of an attribute in different text files. But if I use any write utility, such as Write or Write Document, it either overwrites the data or gives an error.
I want to store each value in a separate file with a sequence number or something similar attached to the file name.
Is there any way to do this?

With the Loop Attributes operator you can loop over the attributes, and with the Generate Macro operator you can build whatever file name you prefer from the macro that holds the attribute name.

Related

NiFi : Regular Expression in ExtractText gets CSV header instead of data

I'm working on a flow where I get CSV files. I want to put the records into different directories based on the first field in the CSV record.
For example, the CSV file would look like this:
country,firstname,lastname,ssn,mob_num
US,xxxx,xxxxx,xxxxx,xxxx
UK,xxxx,xxxxx,xxxxx,xxxx
US,xxxx,xxxxx,xxxxx,xxxx
JP,xxxx,xxxxx,xxxxx,xxxx
JP,xxxx,xxxxx,xxxxx,xxxx
I want to get the value of the first field, i.e. country, and put the records into a directory based on it. US records go to the US directory, UK records go to the UK directory, and so on.
The flow that I have right now is:
GetFile ----> SplitText(line split count = 1 & header line count = 1) ----> ExtractText (line = (.+)) ----> PutFile(Directory = \tmp\data\${line:getDelimitedField(1)}). I need the header line replicated across all the split files for a different purpose, so I have to keep it.
The incoming CSV file is split into multiple flow files, each with the header, as expected. However, the regex I have given in the ExtractText processor is evaluated against the CSV header of each split flow file instead of the record. So instead of getting US or UK in the "line" attribute, I always get "country", and all the files go to \tmp\data\country. How can I resolve this?
I believe getDelimitedField only works on a single line and is likely not moving past the newline in your split file.
I would advocate a slightly different approach: alter your ExtractText to find the country code with a regular expression, which avoids having to include the contents of the file as an attribute.
A regex of ^.*\n+(\w+) matches the first (header) line and then captures the first run of word characters on the following line, up to the comma, in capture group 1. The captured value is placed under the attribute name you specify, with the group index appended (e.g. country.1).
I have created a template that should get the value you are looking for available at https://github.com/apiri/nifi-review-collateral/blob/master/stackoverflow/42022249/Extract_Country_From_Splits.xml
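The effect of that regex on one split file can be checked outside NiFi. The following is a minimal sketch in Python, assuming a split that contains the header line followed by one record and uses \n line endings. NiFi evaluates Java regular expressions, so this only illustrates the pattern, and the function name extract_country is hypothetical.

```python
import re

# Same pattern as in the answer: match the header line, then capture the
# first run of word characters on the next non-empty line.
COUNTRY_PATTERN = re.compile(r"^.*\n+(\w+)")


def extract_country(split_content):
    """Return the first field of the record line, or None if nothing matches."""
    match = COUNTRY_PATTERN.match(split_content)
    return match.group(1) if match else None


# One split file as produced by SplitText: header line plus a single record.
sample_split = "country,firstname,lastname,ssn,mob_num\nUS,xxxx,xxxxx,xxxxx,xxxx\n"
print(extract_country(sample_split))
```

Because the capture group holds only the country code, the PutFile directory can reference that attribute directly instead of calling getDelimitedField on the whole content.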

SSIS 2008 sequence number

I have a requirement where the output file needs to be saved (dynamically) with the naming convention FileName_YYYY-MM-DD_FileNumber, where the file number is a sequence number. For example:
ABC_2009-01-01_001
ABC_2009-01-01_002
ABC_2009-01-01_003 and so on
I am able to get the name part and the date part using an expression on the .TXT connection, but I am unable to get the sequence number part. I would appreciate it if anyone could help me out with a solution.
Thanks in advance!
Use a package variable that starts with "1" and add one to it for each new file.
EDIT: To populate the variable, one way is to use a script task that opens a FileSystemObject, gets all the file names in the folder, parses them, and works out the highest sequence number. Then just add one to that number and set the value of the variable to the result.
And no, I don't have any code handy that does that. You'll need to write it yourself.
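The parsing logic described in the edit can be sketched as follows. This is an illustration in Python rather than an SSIS script task, and it assumes file names of the form Prefix_YYYY-MM-DD_NNN with no extension, as in the question. The function name next_file_name and the three-digit width are assumptions.

```python
import re


def next_file_name(existing_names, prefix, date_text, width=3):
    """Return the next name in the Prefix_Date_NNN sequence.

    existing_names is any iterable of file names, for example the result
    of os.listdir() on the output folder.
    """
    pattern = re.compile(
        r"^" + re.escape(prefix) + "_" + re.escape(date_text) + r"_(\d+)$"
    )
    highest = 0  # no matching file yet means the sequence starts at 1
    for name in existing_names:
        match = pattern.match(name)
        if match:
            highest = max(highest, int(match.group(1)))
    # Zero-pad the new sequence number to the requested width.
    return f"{prefix}_{date_text}_{highest + 1:0{width}d}"


existing = ["ABC_2009-01-01_001", "ABC_2009-01-01_002", "notes.txt"]
print(next_file_name(existing, "ABC", "2009-01-01"))
```

In a package, the returned value would be assigned to the variable that the connection string expression reads.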

How to change Column Delimiter in MS VSTS for web performance test

I am using Microsoft VSTS to performance test a web application.
I am adding a data pool (.csv file) to parameterize multiple values, but the problem is that the .csv file is shown in column-delimited form, like this:
VariableA,VariableB,Variable3
Test1,Test2,Test3
Test4,Test5,Test6
But I want these multiple values in a single column. Whenever the column-delimited type is selected, the values in the .csv file are automatically split into different columns.
In HP LoadRunner we have 3 options [Column, Tab, Space]. I looked for something similar in the VSTS data pool settings but could not find any option.
I am trying to do this:
VariableA
Test1,Test2,Test3
Test5,Test6,Test7
Kindly help me out.
If you want to use Test1,Test2,Test3 in the first iteration and Test5,Test6,Test7 in the second iteration, then try the following in your CSV file:
VariableA
"Test1,Test2,Test3"
"Test5,Test6,Test7"
This should treat Test1,Test2,Test3 as a single value, because the quotes keep the commas from being read as column delimiters.
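The quoting convention can be checked with any standard CSV parser. The sketch below uses Python's csv module to show that each quoted line is read as one field. It assumes the VSTS data source follows the same quoting rule, which is what the answer relies on.

```python
import csv
import io

# The CSV content suggested in the answer: one column, quoted values.
csv_text = 'VariableA\n"Test1,Test2,Test3"\n"Test5,Test6,Test7"\n'

# Each parsed row is a list of fields; the quoted commas stay inside one field.
rows = list(csv.reader(io.StringIO(csv_text)))
for row in rows:
    print(len(row), row)
```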

How can i extract value of each fields of each table of multiple tables of html file using perl

I want to convert the HTML output of a file into IVIL format. For that, I want to retrieve the value of each field of the multiple tables present in the HTML file. How should I do it?
Have a look at HTML::TableExtract. The module can easily extract information from HTML tables.

How to create dynamic number of output files with SSIS?

I will be creating flatfiles and based on the data in the batch, it might be necessary to split the data into an undetermined number of files.
I can make the connection string dynamic with an expression, but that is only evaluated when the package starts. I'd like to change that expression to include a '-a' or '-b' in the filename.
Alternately, if I have to create new connection manager objects at run time on demand, how do I go about that?
First, determine your naming scheme for the output files and work out the expression formula you will need.
Put the Data Flow Task in a loop.
Within this Data Flow Task, define the source and the destination, the destination being the Flat File Destination. Read the source and add a derived column that sets a value in another variable, which you will later use in the file name expression.
Connect the Flat File Destination to a Connection Manager. First define some path, then add an Expression that builds the Connection String from your file name scheme (path + file name + extension). The file name is the tricky part: you will have to write conditional expressions based on the values you get from the source (the SSIS expression language uses the ? : conditional operator for this rather than IIF).
1) Create a global variable (a variable created within the scope of the package) and assign it to the file name property.
2) Change the variable during the loop.
EDITED
see for more details...
You can access the data set in a script (in the script component) and write out to a set of files based on your criteria.
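The script-based approach amounts to routing each row to a file chosen by some criterion. Below is a minimal sketch of that idea in Python rather than in an SSIS script component. The function name write_split_files, the base name "output", and the choice of a column value as the file suffix are all assumptions for illustration.

```python
import csv
import os


def write_split_files(rows, key_index, out_dir, base_name="output"):
    """Write each row to a file selected by the value in column key_index.

    Returns the sorted list of file paths that were created.
    """
    handles = {}
    writers = {}
    try:
        for row in rows:
            key = row[key_index]
            if key not in writers:
                # First row for this key: open a new file such as output-a.csv.
                path = os.path.join(out_dir, f"{base_name}-{key}.csv")
                handles[key] = open(path, "w", newline="")
                writers[key] = csv.writer(handles[key])
            writers[key].writerow(row)
    finally:
        # Close every file even if a row fails part way through.
        for handle in handles.values():
            handle.close()
    return sorted(os.path.join(out_dir, f"{base_name}-{key}.csv") for key in handles)
```

Called with rows whose first column is "a" or "b", this produces output-a.csv and output-b.csv in out_dir, so the number of files is decided by the data rather than fixed in advance.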