SSIS flat file source: trim blank spaces

Is there any way I can trim the blank spaces when I read data from a CSV file?
Thanks.

The Send Mail task requires a ';' between email IDs. If you are building one large string of addresses to send your email to, consider using a script component to remove the spaces and append a ';' between each one.
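A minimal sketch of such a script component, assuming the input column is called Email and a package variable User::Recipients has been added under ReadWriteVariables (both names are placeholders, not anything from your package):

private System.Text.StringBuilder recipients = new System.Text.StringBuilder();

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Strip stray spaces from the address and append it with a ';' separator
    if (recipients.Length > 0) recipients.Append(";");
    recipients.Append(Row.Email.Trim().Replace(" ", ""));
}

public override void PostExecute()
{
    base.PostExecute();
    // Read/write variables can only be set here; the Send Mail task can then pick up User::Recipients
    Variables.Recipients = recipients.ToString();
}

The Send Mail task's ToLine can then be driven from that variable via a property expression.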

Use a Derived Column component after the CSV source. In the editor you should see the available columns in the top-left; drag the email column into the expression and replace any spaces with empty strings. You should also set the derived column to replace the email column (or you could add it as a new column, if you need that).
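For example, assuming the flat file column is called Email (substitute your own column name), either of these derived column expressions would do it:
REPLACE(Email," ","")
LTRIM(RTRIM(Email))
The first removes every space in the value; the second only trims leading and trailing blanks.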

String gets separated into different rows in SSIS

I am currently working on an input file that has a column containing three different values in a single cell. Although this data is not used in the transformation, I need to read it from the source and then ignore it when it is loaded into the staging table.
The issue I face is that it gets loaded into separate rows rather than one cell.
This column is read as a string data type. What change do I need to make to resolve this issue? Please let me know if more details are needed to answer the question.
I have uploaded a sample file to Google Drive: https://drive.google.com/file/d/17hn8xmRd4CWsgKBzHgdwnR9W4jTJ9lTn/view?usp=sharing
The following is a screenshot of the CSV data as opened in a text editor.
Having downloaded sample.csv from your link, the first thing I did was open it in a text editor (Notepad++, TextPad, Visual Studio, etc.) and look at what you have.
Row 1 is column headers
Encoded in UTF-8 with BOM (byte order mark)
Line endings are CR/LF (carriage return and line feed)
The column delimiter appears to be a comma (,)
A double quote, ", is used as the text qualifier, but only when needed
There are CR/LF characters in the actual data
I then define my flat file connection manager based on that data
Finally, I have a data flow with a Flat File Source to a Derived Column and drop a Data Viewer between them
As you can see, configuring your Flat File Connection Manager as I show will allow all the data to flow into your table as expected.
What is happening now is that the CR/LF, which is our row delimiter, takes precedence over the embedded CR/LF in the column data. By setting the double quote as the text qualifier, the reader correctly "skips" the embedded CR/LF and only ends a row when a CR/LF is encountered outside of the quotes.
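For reference, the Flat File Connection Manager settings that match the sample file above are roughly as follows (a sketch based on the sample; adjust to your own file):
Format: Delimited
Text qualifier: "
Header row delimiter: {CR}{LF}
Row delimiter: {CR}{LF}
Column delimiter: Comma {,}
Column names in the first data row: checked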

Remove last blank row in CSV using Logic App

I have a CSV file stored in SFTP where the last row is a blank, so the data looks like this in text:
a,b,c
d,e,f
,,
How can I use a Logic App to remove that final row and then save the result to a blob? I have the following, but I think it will need some extra steps before the blob creation.
Considering the same sample, here is my Logic App.
In Compose_2 I take the index of the last line break, which marks the start of the empty last row. Below is the expression that I used to retrieve the last index.
lastIndexOf(variables('Sample'),'\n')
Then in Compose_3 I select only the part I want:
substring(variables('Sample'),0,outputs('Compose_2'))
Here is the Final Result
NOTE:
Make sure you remove the extra '\' that gets attached to '\n' in the code view for Compose_2, so that the expression contains a real newline character rather than the literal text \n.
So the final Compose_2 looks like
lastIndexOf(variables('Sample'),'
')
Updated Answer
If the received data is coming from a CSV, then you can use the take() expression to retrieve the rows you want. Here are a few screenshots for a detailed explanation:
Below is the expression in the Compose connector
take(outputs('Split_To_Get_Rows'),sub(length(outputs('Split_To_Get_Rows')),1))
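For completeness, here is a sketch of the expressions around that step (Split_To_Get_Rows and Sample are the names used above; the final join back into text before the blob creation is my own assumption). The rows can be produced with
split(variables('Sample'),'\n')
and once the take() above has dropped the empty last row, the remaining rows can be stitched back together for the blob content with something like
join(outputs('Compose'),'\n')
where 'Compose' stands for whatever the take() action is actually named. The earlier note about removing the extra '\' in code view applies to these '\n' delimiters as well.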

To validate the header name of each field of a csv file in azure blob storage using Azure data factory v2 pipelines

I have a scenario where the user will upload a file with some data and a header. I need to process the file and make sure that the field names in the header are correct and have no whitespace and no special characters.
e.g. the file the user drops into the storage account contains the following header
I need to change it to this
How can I do this in ADF v2?
Data Factory won't really do this as is, but if this is part of a larger ETL process, you can rename the columns in a Data Flow using Select.
Source:
Add a Select node and go to the "Select settings" tab. If you know the schema, you can just fix the columns manually here:
You can also use a Rule-based mapping to remove spaces from all the column names. To do this, remove all the existing mappings and add the following:
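A sketch of that rule-based mapping, using the Data Flow expression language (the regexReplace() variant is only needed if you also have special characters to strip):
Matching condition: true()
Name as: replace($$, ' ', '')
or, to drop anything that is not a letter, digit or underscore:
Name as: regexReplace($$, '[^a-zA-Z0-9_]', '')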
"true()" in this context means apply to all columns, and '$$' refers to the column name. The "Inspect" tab will show the updated column names:

Adwords csv file in attachment is not parsing properly

I am trying to use Google Apps Script to extract data from an email attachment, which is basically an Adwords report as csv file.
Here is the gist of the code
var dataTest3 = Utilities.parseCsv(msg.getAttachments()[0].getDataAsString());
SpreadsheetApp.getActive().getSheetByName("Sheet1").getRange(1, 1, dataTest3.length, dataTest3[0].length).setValues(dataTest3);
msg is the GmailMessage object.
The result that I am getting is an array with a strange format.
The data shows OK but its value is strange.
Any idea how I can make it parse into the spreadsheet like a normal CSV? It opens up like a normal CSV when downloaded.
Thanks
The description "basically an Adwords report as csv file" needs to be investigated... what exactly is the file format? With only pictures of your problem, the best I can do is guess that your file is using some custom delimiter, not commas.
"CSV" stands for Comma Separated Values, but in practice it applies to text files with a number of different field delimiters - call them Delimiter-separated values. Common delimiters include commas (,), tabs (\t), colons (:), v-bars (|), and sometimes just spaces (usually between quote-enclosed text fields).
Instead of using the version of Utilities.parseCsv(csv) that assumes a comma delimiter, you can use Utilities.parseCsv(csv, delimiter) to specify a custom delimiter. You should be able to determine what the delimiter is by reviewing the attachment in the debugger.
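For example, if inspection shows the report is tab-delimited (an assumption to verify against your actual attachment), the parsing line becomes:
var dataTest3 = Utilities.parseCsv(msg.getAttachments()[0].getDataAsString(), '\t'); // second argument sets the delimiter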
You could also try adapting importFromCSV() from How to Import tab-delimited "CSV", which automatically detects tab or comma delimiters.

Parse tab separated text file in Google Sheets

I have a txt file available on the web which contains tab separated values (TSV/CSV) like this:
Product_Id tab Color tab Price tab Quantity
Item1 tab Red tab $5.2 tab 5
Item2 tab Blue tab $7.5 tab 10
I imported the txt file into a Google Spreadsheet using the IMPORTDATA(url) formula. The problem is that now I need to split the text to columns. I tried the following formulas without success:
Split(A1,"\t")
Split(A1," ")
Split(A1,"<tab>")
Another thing I tried is to use the SUBSTITUTE function, but I just can't figure out how to match the tab character in Google Spreadsheets.
Sheets strips tabs by default when you paste text using a standard paste. Tab-delimited data can be pasted and automatically parsed using:
Right Click -> Paste special -> Paste values only
IMPORTDATA(url) seems to handle tabs automatically, as others have mentioned before, if the URL ends in ".tsv".
I had trouble trying to import a file from Dropbox even though the file was named "something.tsv", because the url was
"https://www.dropbox.com/s/xxxxxxx/something.tsv?dl=1"
I managed to solve the problem by adding a dummy query parameter to the url:
"https://www.dropbox.com/s/xxxxxxx/something.tsv?dl=1&x=.tsv"
NOTE: I know this question was asked back in 2014 and I am answering this question some 5 years later. I am posting the answer here in hopes that someone else who googles their way here will be saved the headache and can be helped by how I devised a solution.
SUMMARY OF THE ISSUE: By default the IMPORTDATA() function will properly process a tab-delimited file only if the file name ends with the extension .TSV
UPDATE Nov 14, 2019:
In a comment below, Poul shared that he has found an undocumented parameter for the IMPORTDATA() function by which you can specify the delimiter used to split the data. As of this writing, the official documentation makes no reference to this parameter.
In effect the documentation should look something like the following:
IMPORTDATA("url","delimiter")
So, if you wanted to force a file to be split on the TAB character, it would look something like
IMPORTDATA("url","\t")
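For instance, assuming the delimiter parameter works as Poul reported, the tab-delimited file used in the formula further below could be imported without relying on its extension:
=IMPORTDATA("https://iso639-3.sil.org/sites/iso639-3/files/downloads/iso-639-3_Latin1.tab","\t")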
PRIOR ANSWER:
UPDATE: I am leaving my original answer just in case it might be helpful if the answer above, which includes undocumented functionality, does not continue to work.
ORIGINAL ANSWER: After seemingly countless attempts, I figured out how to coax Google Sheets into importing a tab-delimited file regardless of the extension.
For those looking for the quick and dirty answer, copy the following into a cell of a Google Sheet to give it a try:
=ARRAYFORMULA(IFERROR(SPLIT(IMPORTDATA("https://iso639-3.sil.org/sites/iso639-3/files/downloads/iso-639-3_Latin1.tab"),CHAR(9),FALSE,FALSE)))
For those that want to know a bit more, I will try to explain how each of the nested functions are helping to create the final solution:
=ARRAYFORMULA( IFERROR( SPLIT( IMPORTDATA(URL-HERE) ,CHAR(9),FALSE,FALSE) ) )
IMPORTDATA() - the primary function that pulls in the data file from the web
SPLIT - split the row by tab, note the use of char(09) to generate the tab character; also note the use of FALSE for the last parameter which was required in my case to ensure empty cells were not collapsed together
IFERROR - used to catch situations where an import might fail, the error will be trapped and not returned to the spreadsheet
ARRAYFORMULA - this function ensures that every line in the file is parsed; without this, only the first line of the file would be returned to the spreadsheet
It turns out that IMPORTDATA(url) can import a tab separated file, but it expects the file name to have the .tsv extension. This is inconsistent with Excel, where a tab-separated export results in *.txt.
If you can ensure that you use a .tsv extension, then your problem is solved.
You can also use the Sheets UI to import the file (into a new Spreadsheet). Select File > Import..., then Upload > Select a file from your computer. When the file selection dialog opens, paste the URL into the file name field, and click Open. The file will be downloaded to your PC then uploaded to Drive, through the Import dialog that will let you choose the delimiter.
(Validated on Windows 8.1 with Chrome; I don't know how this will behave on other OSes or browsers.)
Edit: See this gist.
importFromCSV(string fileName, string sheetName)
Populates a sheet with contents read from a CSV file located in the user's GDrive. If either parameter is not provided, the function will open inputBoxes to obtain them interactively.
Automatically detects tab or comma delimited input.
I had luck using split() and indicating only a single space as the delimiter, even though the data I pasted in had tabs separating each "column": =SPLIT(A1, " ", TRUE) where A1 had data separated by one or more spaces. It seems that pasting in TSV data results in conversion from tabs to spaces.
This could be done in two steps, leveraging the fact that a pasted tab effectively becomes multiple spaces.
Steps are as follows:
Select the columns which have tab-separated data, then trim the tabs down to a single space using Data -> Data cleanup -> Trim whitespace.
Now the usual Data -> Split text to columns should work out of the box, or after selecting space as the separator.