Flat File to retain commas from SQL data - ssis

I'm importing a SQL view into SSIS using the Flat File Connection Manager. One of my columns in SQL contains commas (for example, 123 Main St, Boston, MA). When I import the data into SSIS, the commas within the column are treated as delimiters, and the column is broken into several columns. I have done a lot of research online and have followed some workarounds, but they aren't working for me.
In SQL Server, I added double quotes around the values that have commas in them:
' "'+CAST(a.Address as varchar(100))+'" '
So, 123 Main St, Boston, MA now reads “123 Main St, Boston, MA”
Then in my SSIS Flat File Connection Manager,
In the General tab:
Text Qualifier is set to “
Header Row Delimiter is set to {CR}-{LF}
In the columns tab:
Row delimiter is set to {LF}
Column delimiter is set to Comma {,}
And in the Advanced tab, all of my columns have the TextQualified property set to True.
After all of this, my column with commas in it is still being separated into multiple columns. Am I missing a step? How can I get the SSIS package to treat my address column as one column and not break it out into several columns?
EDIT: Just to add more specifics: I am pulling from a SQL view that has double quotes around any field that has commas in it. I am then emailing that file and opening it in MS Excel. When I open the file, it reads as follows:
123 Main St Boston MA" " (In three cells)
And I need it to read as
123 Main St, Boston, MA (in one cell)

Have a look at this: Commas within CSV Data
If there is a comma in a column then that column should be surrounded
by a single quote or double quote. Then if inside that column there is
a single or double quote it should have an escape character before it,
usually a \
Example format of CSV
ID - address - name
1, "Some Address, Some Street, 10452", 'David O\'Brian'
Replace every comma delimiter with another unique delimiter that does not occur in any of the values, such as the vertical bar ( | ).
Change the column delimiter to this new delimiter, and set the text qualifier to the double quote ( " ).
You can automate the replacement with a Script Task placed before the Data Flow Task. You can use the replace script from here.
Also have a look at these resources.
Fixing comma problem in CSV file in SSIS
How to handle extra comma inside double quotes while processing a CSV file in SSIS
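The delimiter swap described above can be sketched outside SSIS as a small pre-processing step. This is only an illustration in Python, not the Script Task the answer links to: the function name to_pipe_delimited and the sample row are made up for the example, and it assumes the input is quoted with double quotes.

```python
import csv
import io


def to_pipe_delimited(csv_text: str) -> str:
    """Re-emit comma-delimited, double-quote-qualified text as pipe-delimited text.

    Hypothetical helper for illustration only.
    """
    # skipinitialspace lets the reader accept a space after each comma,
    # as in the example row format shown in the answer.
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    out = io.StringIO()
    # QUOTE_MINIMAL only quotes a field if it contains the new delimiter,
    # a quote character, or a line break.
    writer = csv.writer(out, delimiter="|", quotechar='"',
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


sample = 'ID,address,name\n1, "Some Address, Some Street, 10452", David\n'
converted = to_pipe_delimited(sample)
print(converted)
```

Because the parsing is done by a CSV reader rather than a plain string replace, commas inside quoted values are left alone and only the real delimiters become pipes.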

I ended up recreating the package, using the same parameters that are listed in my question. I also replaced this
' "'+CAST(a.Address as varchar(100))+'" '
with this in my SQL view
a.Address
And it now runs as desired. Not sure what was going on there. Thanks to everyone for their comments and suggestions.

Related

SSIS - How to insert all values inside ""

There is a requirement that all values exported from SQL Server into a flat file (.csv) be wrapped in double quotation marks, so that 123 is written as "123".
I am having difficulty with this. I tried a Derived Column with the expression "\"\"" + [columnName] + "\"\"", but it does not work at all.
Please note that I need the column headers to be wrapped in "" as well.
Many thanks!
If you mean you want to export data from SQL Server into a csv file using SSIS, and that you want the values to be double quoted, you just need to set the Text Qualifier property of your Destination connection to a double quote " character.
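For reference, the target format the question asks for (every value and every header wrapped in double quotes) looks like the output of the following Python sketch. It only illustrates the file layout and says nothing about how SSIS produces it; the column names and values are invented for the example.

```python
import csv
import io

# Hypothetical header and rows, for illustration only.
header = ["id", "amount"]
rows = [[1, 123], [2, 456]]

out = io.StringIO()
# QUOTE_ALL wraps every field, numeric or not, in the quote character.
writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\r\n")
writer.writerow(header)
writer.writerows(rows)

result = out.getvalue()
print(result)
```

Comparing the exported file against a layout like this is a quick way to check whether the header row was qualified as well as the data rows.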

Reading CSV file in SAP Data Services using double quotes as text delimiter - but only single double quotes in column value

I am reading a CSV file in SAP Data Services Designer and using (") as the text delimiter. The client sometimes sends data that has only one double quote in a column, without a closing " to mark the end of the string. Because of this, the reader treats the next several rows as one single column, until it encounters the next ". I need to retain " as the text delimiter, as that is also required. Is there a way to avoid the anomaly where the software only sees one "?
Thanks!

Escape semicolon; MYSQL for Excel

I want to import data from an Excel sheet into a MySQL database with the MySQL for Excel plugin. Some cells contain text with semicolons, and I have already figured out that this causes a SQL error. I tried escaping the semicolons with a backslash but I still get the error message. How can I escape the semicolon?
Kai,
this behaviour is purely the fault of MySQL for Excel, and seems to be a bug.
In the meantime, if you are not keen on changing your Excel data as suggested by others there is a workaround:
In your MySQL-for-Excel window click Options and then select Preview SQL statements before they are sent to the server and Accept.
Then proceed as normal with export / append data using the Add-in, but when a Review SQL script window appears, copy the contents into a different SQL tool (MySQL workbench, HeidiSQL, SQLWorkbench etc), and run. Then click cancel in the Mysql-for-Excel popups, and refresh the query if necessary.
Also: feel free to report the bug at: http://bugs.mysql.com/
Replace the semicolon with some unique text e.g. [SEMICOLON].
Next import the data to SQL and run something like
UPDATE your_table
SET your_field = REPLACE(your_field, '[SEMICOLON]', ';')
WHERE your_field LIKE '%[SEMICOLON]%'
I think all you need to do is consider the requirements Excel has when it imports data from CSV files (the parsing rules are probably the same or similar)
In your case, if a field contains any special characters, just quote the values with double quotes before importing the content in Excel.
So:
UPDATE table
SET field = CONCAT('"', field, '"')  -- MySQL treats || as logical OR unless PIPES_AS_CONCAT is enabled
WHERE field like '%,%'
The following rules should apply:
Fields containing a line-break, double-quote, and/or commas should be quoted
Any field may be quoted (with double quotes)
A (double) quote character in a field must be represented by two (double) quote characters.
More details: Wikipedia: Comma-separated values
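The three rules above can be seen in action with Python's csv module, whose default writer follows the same conventions. This is a minimal sketch with made-up field values, not something taken from the answer.

```python
import csv
import io

# Hypothetical field values covering each rule.
fields = ["plain", "has,comma", 'has "quote"', "two\nlines"]

out = io.StringIO()
# The default dialect quotes only where needed and doubles embedded quotes.
csv.writer(out, lineterminator="\n").writerow(fields)

line = out.getvalue()
print(line)
```

The plain value is written as-is, while the values containing a comma, a double quote, or a line break are each wrapped in double quotes, with the embedded quote doubled.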

Using Excel to create a CSV file with special characters and then Importing it into a db using SSIS

Take this XLS file
I then save this XLS file as CSV and then open it up with a text editor. This is what I see:
Col1,Col2,Col3,Col4,Col5,Col6,Col7
1,ABC,"AB""C","D,E",F,03,"3,2"
I see that the double quote character in column C was stored as AB""C, the column value was enclosed with quotations and the double quote character in the data was replaced with 2 double quote characters to indicate that the quote is occurring within the data and not terminating the column value. I also see that the value for column G, 3,2, is enclosed in quotes so that it is clear that the comma occurs within the data rather than indicating a new column. So far, so good.
I am a little surprised that not all of the column values are enclosed in quotes, but even this seems reasonable if I assume that Excel only adds text qualifiers when special characters like a comma or a double quote character exist in the data.
Now I try to use SQL Server to import the CSV file. Note that I specify a double quote character as the Text Qualifier character.
And a comma character as the Column delimiter character. However, note that SSIS imports column 3 incorrectly, i.e., it does not translate the two consecutive double quote characters into a single occurrence of a double quote character.
What do I have to do to get Excel and SSIS to get along?
Generally people avoid the issue by using column delimiter characters that are LESS LIKELY to occur in the data, but this is not a real solution.
I find that if I modify the file from this
Col1,Col2,Col3,Col4,Col5,Col6,Col7
1,ABC,"AB""C","D,E",F,03,"3,2"
...to this:
Col1,Col2,Col3,Col4,Col5,Col6,Col7
1,ABC,"AB"C","D,E",F,03,"3,2"
i.e., replacing the two consecutive quotes in column C's value with a single quote, the data is loaded properly. However, this is a little confusing to me. First of all, how does SSIS determine that the double quote between the B and the C is not terminating that column value? Is it because the following characters are not a comma column delimiter or a row delimiter (CRLF)? And why does Excel export it this way?
According to Wikipedia, here are a couple of traits of a CSV file:
Fields containing line breaks (CRLF), double quotes, and commas
should be enclosed in double-quotes. For example:
"aaa","b CRLF
bb","ccc" CRLF
zzz,yyy,xxx
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
However, it looks like SSIS doesn't like it that way when importing. What can be done to get Excel to create a CSV file that could contain ANY special characters used as column delimiters, text delimiters or row delimiters in the data? There's no reason that it can't work using the approach specified in Wikipedia, which is what I thought the old MS DTS packages used to do...
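As a point of comparison, a parser that implements the quoting rules quoted above reads the Excel-generated file as intended. The sketch below uses Python's csv module on the exact two lines shown earlier; it only demonstrates the expected parse and is not a fix for the SSIS behaviour.

```python
import csv
import io

# The file as saved by Excel, with CRLF row delimiters.
excel_csv = (
    "Col1,Col2,Col3,Col4,Col5,Col6,Col7\r\n"
    '1,ABC,"AB""C","D,E",F,03,"3,2"\r\n'
)

# newline="" leaves line endings untouched so the csv reader handles them.
rows = list(csv.reader(io.StringIO(excel_csv, newline="")))
header, data = rows

print(data)
```

Here the doubled quote in column 3 comes back as a single double quote character, and the commas in columns 4 and 7 stay inside their fields, giving seven values to match the seven headers.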
Update:
If I use Notepad to change the input file to
Col1,Col2,Col3,Col4,Col5,Col6,Col7,Col8
"1","ABC","AB""C","D,E","F","03","3,2","AB""C"
Excel reads it just fine
but SSIS returns
The preview sample contains embedded text qualifiers ("). The flat file parser does not support embedding text qualifiers in data. Parsing columns that contain data with text qualifiers will fail at run time.
Conclusion:
Just like the error message says in your update...
The flat file parser does not support embedding text qualifiers in data. Parsing columns that contain data with text qualifiers will fail at run time.
Confirmed bug in Microsoft Connect. I encourage everyone reading this to click on this aforementioned link and place your vote to have them fix this stinker. This is in the top 10 of the most egregious bugs I have encountered.
Do you need to use a comma delimiter?
I used a pipe delimiter with no Text qualifier and it worked fine. Here is my output from the text file.
1|ABC|AB"C|D,E|F|03|3,2
You have 3 options in my opinion.
Read the data into a stage table.
Run any update queries you need on the columns
Now select your data from the stage table and output it to a flat file.
OR
Use pipes as your delimiters.
OR
Do all of this in a C# application and build it in code.
You could send the row to a script in SSIS and parse and build the file you want there as well.
Using text qualifiers and "character" delimited fields is problematic for sure.
Have Fun!

CSV output file Error

I am exporting data from a SQL table to a CSV file. A few of my columns in the table contain comma-separated data; when loading these into the CSV file, the data is split into two columns.
Example
data in Sql table
ename desig Industry
Roy PM Business,Analyst
Rem PL Marketting and Production
King PM Marketting, Analyst
When exporting the same data to a CSV file, it comes out this way:
ename desig Industry
Roy PM Business Analyst
Rem PL Marketting and Production
King PM Marketting Analyst
Since this is CSV format, it is delimiting after comma and taking Analyst as another column instead of same with Industry column.
My required output in CSV File
ename desig Industry
Roy PM Business,Analyst
Rem PL Marketting and Production
King PM Marketting, Analyst
My Flat File Connection Manager settings are below
in General tab
Header Row Delimiter {CR}-{LF}
Columns Tab
Row Delimiter {CR}-{LF}
Column Delimiter Comma{,}
I changed these settings, but I am still facing the same issue.
EDIT
Since it appears from new information that you are not using a CSV (comma separated values) file, but are instead using a pipe delimited file, the issue appears to be that whatever processes this file downstream is using both | and , as delimiters and is therefore splitting the last field at its internal comma. Since you don't want to follow the industry standard of using a text qualifier on fields with embedded commas, I am not sure what I can suggest without more detail about what you are using to test the file, what processes the file after you are done, etc.
Update your question with more information and I will refine my answer. First guess without new information is that since you are using the .CSV name on your file that the comma handling is automatic by whatever is processing downstream.
/EDIT
Change the Text Qualifier to " (the double quotation mark). This should wrap text fields in double quotes, following the standard practice for CSV files of quoting strings so that embedded commas don't cause a field break.