In this sheet, I have the input data below:
As seen, the courses are separated by /
I want to display the same in the format below, where each line shows one course only, with the data of the student repeated:
I know that =split(C3," / ",true,true) can split the courses into 2 columns in the same row, but I need them in the same column. So I tried =TRANSPOSE(split(C3," / ",true,true)), which works fine for the first line only, but it fails when used with ARRAYFORMULA.
Any thoughts? I'm open to any potential solution: a formula, a script, or anything else.
UPDATE
I tried this trick, creating a new column showing number of courses for each student as =ArrayFormula(LEN(REGEXREPLACE(C11:C13, "[^/]", ""))+1)
Then I used REPT to repeat each row based on the number of courses: =arrayformula({transpose(split(concatenate(rept(B11:B13 & ",",D11:D13)),",",false,true)),transpose(split(concatenate(REPT(C11:C13 & ",",D11:D13)),",",false,true))}) and ended up with:
But here the courses are still joined together. How can I split them?
I've added two sheets to your sample spreadsheet. "Sheet2" is a cleanup of your testing sheet, "Sheet1." The other sheet ("Erik Help") references Sheet2, not Sheet1, and contains the following formula in cell A1:
=ArrayFormula({"Student ID","Student Name","Course";SUBSTITUTE(SPLIT(QUERY(FLATTEN(SPLIT(FILTER(SUBSTITUTE("/ "&Sheet2!C3:C,"/","/ "&Sheet2!A3:A&"zzz~"&Sheet2!B3:B&"~"),Sheet2!A3:A<>""),"/")),"Select * WHERE Col1 Is Not Null"),"~"),"zzz","")})
This one array formula produces all headers and results.
A virtual array is formed between the curly brackets { }. Headers are introduced first followed by a semicolon, which means "bump down one row to continue." The header titles can be changed as you like.
How It Works:
An additional "/ " is concatenated to the front of every non-blank entry in Sheet2!C3:C. Then SUBSTITUTE replaces every one of these forward slashes with Col A data, "zzz~", Col B data and "~". The tildes (~) will be used later by the outer SPLIT. The "zzz" is added to make sure that ID numbers are converted to text so that they hold formatting throughout the processing and don't turn into real numbers; later, the outer SUBSTITUTE will replace those with null (i.e., get rid of the 'zzz').
Once the initial concatenations are complete, they are SPLIT at the forward slash and then FLATTENed into one column. QUERY removes any blank rows in this virtual array so far. The remaining results are again SPLIT at the tilde. Finally, that outer SUBSTITUTE removes the temporary instances of 'zzz'.
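For anyone who finds the steps easier to follow outside of Sheets, here is a rough Python sketch of the same pipeline. It is only an illustration of the logic, not what the formula literally executes: the sample rows and the function name explode_courses are made up, and the explicit strip() calls are my own choice for cleaning up the spaces around each piece.

```python
# Made-up sample rows for illustration: (student ID, student name, courses).
rows = [
    ("001", "Alice", "Math / Physics"),
    ("002", "Bob", "Chemistry"),
    ("", "", ""),  # blank row, dropped like FILTER(..., A3:A<>"")
]


def explode_courses(rows):
    """Return one (id, name, course) tuple per course."""
    # FILTER: keep only rows with a non-blank ID.
    kept = [r for r in rows if r[0] != ""]
    # Prefix "/ ", then replace every "/" with "/ " + ID + "zzz~" + name + "~".
    tagged = [
        ("/ " + courses).replace("/", "/ " + sid + "zzz~" + name + "~")
        for sid, name, courses in kept
    ]
    # SPLIT at "/" and FLATTEN into a single list.
    pieces = [piece for text in tagged for piece in text.split("/")]
    # QUERY ... Is Not Null: drop the empty pieces.
    pieces = [piece for piece in pieces if piece.strip()]
    # Outer SPLIT at "~", then remove the temporary "zzz" marker.
    return [
        tuple(field.strip().replace("zzz", "") for field in piece.split("~"))
        for piece in pieces
    ]


result = explode_courses(rows)
for row in result:
    print(row)
```

The IDs stay as strings the whole way through, which is the same job the "zzz" marker does in the formula.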
I also added a custom conditional formatting (CF) formula for the color banding on alternate rows.
You can try this one:
Formula:
=ARRAYFORMULA(TRIM(QUERY(SPLIT(FLATTEN(IF(IFERROR(SPLIT(C3:C5, "/"))="",,
A3:A5&"×"&B3:B5&"×"&SPLIT(C3:C5, "/"))), "×"),
"where Col3 is not null")))
Output:
Reference:
How to transpose & split multiple columns and repeat specific cells in a column
I have a large data set, roughly 7000 lines. It was generated with a particular piece missing. Is there a way I can add the missing information en masse? Below is an example line from my dataset:
PRIPOS;20150527;EUR;AAAAA;Maxi Dresses;5050300000000;22200000;Thyme;Thyme;6;32;AAAAAA MAXI DRESS;AAAAAA MAXI DRESS;2;All AAAAA Products;000;Dresses;100;Maxi Dresses;10000;Soft Maxi Dress;000.00;00.00;;;;;SS15;;;Insert;;
The field containing 32 needs to be considered; the field containing Insert is where data needs to be added. The 32 represents a size, and the Insert should represent a different size. The file contains around 7k lines, all with different information.
Is there a particular text editor that will allow me to use a wildcard in a replace function, or any ideas for a script? Failing this, I would assume dumping it into a SQL table and updating via a query would be the quickest method?
Thanks a lot.
You could load it into Excel and put a formula in the Insert column that looks at the 11th column and sets its value based on that. Set your list separator character to a semicolon in the regional settings first.
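If you would rather script it, the same idea can be sketched in a few lines of Python. This is only a sketch under assumptions: the SIZE_MAP values are invented (the question doesn't say which size goes with 32), the function name is made up, and it assumes no field contains a semicolon or quoting, so a plain split is enough.

```python
# Hypothetical mapping from the size in the 11th column to the size to insert.
SIZE_MAP = {"32": "XS", "34": "S"}

SIZE_COL = 10           # zero-based index of the 11th column
PLACEHOLDER = "Insert"  # the field that needs to be filled in


def fill_missing_size(line, size_map=SIZE_MAP):
    """Replace the placeholder field using the size found in the 11th column."""
    fields = line.split(";")
    new_size = size_map.get(fields[SIZE_COL])
    if new_size is None:
        return line  # unknown size: leave the line untouched
    fields = [new_size if field == PLACEHOLDER else field for field in fields]
    return ";".join(fields)


sample = (
    "PRIPOS;20150527;EUR;AAAAA;Maxi Dresses;5050300000000;22200000;Thyme;Thyme;"
    "6;32;AAAAAA MAXI DRESS;AAAAAA MAXI DRESS;2;All AAAAA Products;000;Dresses;"
    "100;Maxi Dresses;10000;Soft Maxi Dress;000.00;00.00;;;;;SS15;;;Insert;;"
)
fixed = fill_missing_size(sample)
print(fixed)
```

To process the whole file you would loop over its lines and write each result to a new file, keeping the original as a backup.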
I am reading the contents of a file and adding them as a row in a MySQL database. The column to which the file contents are added is a TEXT column.
Multiple files will be uploaded; their contents are extracted in a cron job and added to the TEXT column, one row per file.
My files are sometimes empty. In that case, a row with no content is created.
Now I need to retrieve this content in another cron job and perform some activities. I would like to filter and retrieve only those rows where content exists, for example with a WHERE clause such as LENGTH(TRIM(ContentCol)) > 0. Since it is a TEXT column, I am unable to use the LENGTH and TRIM functions.
Also, when I use the LENGTH function, it shows varying lengths. I see values such as 5 or 1 even though there is no visible value in the cell.
How can I perform this criteria?
Well, I use the function BIT_LENGTH(string), but only with short text data.
This function returns the length in bits, so I evaluate it like this:
BIT_LENGTH(string) > 0
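Separately from the SQL check, the lengths of 5 and 1 on cells that look empty could come from whitespace such as spaces or line breaks. That is an assumption on my part, since the question doesn't show the raw values. If that is the case, the second cron job could also filter on its side after fetching. A minimal Python sketch, with made-up values standing in for what the TEXT column might return:

```python
def has_content(text):
    """Return True when text holds something other than whitespace."""
    return text is not None and text.strip() != ""


# Hypothetical values as they might come back from the TEXT column.
fetched = ["real data", "", "\n", " \r\n  ", None]

# Keep only the rows that actually carry content.
non_empty = [text for text in fetched if has_content(text)]
print(non_empty)
```

Here "\n" has a length of 1 and " \r\n  " has a length of 5, yet both look empty when displayed.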
I have a flat file that I need to parse in SSIS. Part of this parsing is to chop off a load of extra text at the bottom of the file. To help do this, I added a row number to each row using a Script Transformation.
In the Script Transformation (ST) under Inputs and Outputs I have an Input Column defined called Column256_in (it has a length of 256) and its ID is 59.
For output columns I have defined Column256_out, which has an ID of 68 and a MappedColumnID of 59. There is another output column called rowCount.
There is script code contained in the ST that calculates the row number for each row.
When I run the SSIS package with a data grid after the Script Transformation, I get the following:
Column256_in contains the data from the original text file.
rowCount is populated correctly. (I did something right today!)
Column256_out is empty --> I thought that the MappedColumnID of 59 would populate this column with the data from Column256_in.
What does the MappedColumnID attribute do on the output column?
Thanks for your assistance.
KD
MappedColumnID is just an alternative way of identifying the columns instead of using their names.
From MSDN
The use of these properties is not required. These properties provide an easier way for developers to associate related columns, such as input and output columns, in custom data flow components.
I have a table with 48 fields. I am filtering some data and need to use a script component for 10 of the fields, in which I am actually changing the data. On the other 38 fields I only want to do a trim. I know I can do this in a script component, but I would rather do it in a more efficient way. Thanks!
Try using a Derived Column transformation. If it's really nothing but a trim, you can even replace the contents of the field without creating a new field in your data flow.