Cannot import Unicode flat file in SSIS

I have a flat file with almost 300 columns that needs to be imported into SQL Server. It seems that SSIS can't read the CSV file if I mark it as Unicode; it loses the ability to recognize CR and LF:
The specified header or data row delimiter "{CR}{LF}" is not found
after scanning 524288 bytes of the file "...\contact.csv". Do you want
to continue scanning this file?
What am I doing wrong?
EDIT
Based on comments, it seems I need to clarify: yes, I did check that {CR}{LF} is present at the end of each line and that it's set as the row delimiter in the connector.
The problem is with the "unicode" checkbox. If I uncheck it, the file is read fine. If I check it, it doesn't find {CR}{LF} any more.
It also doesn't matter what encoding I set on the file, as that only affects the default "code page" selection.

OK, after a while I found an answer.
The Unicode checkbox is still not working, but you can go to the Advanced section of the Flat File Connection Manager and set your string columns to Unicode (DT_WSTR). It's kind of tedious, and I don't know what I would do if I had 200 columns, but for my small data set it worked.
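If the goal is simply to get a UTF-16 file into SQL Server and the connection manager keeps fighting you, a plain BULK INSERT can read the file directly. This is only a hedged sketch, not the fix above: the table name and path are hypothetical, and it assumes the file really is UTF-16 with {CR}{LF} row endings.

BULK INSERT dbo.Contact
FROM 'C:\import\contact.csv'
WITH (
    DATAFILETYPE    = 'widechar',  -- read the file as UTF-16
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',        -- BULK INSERT treats \n as {CR}{LF} here
    FIRSTROW        = 2            -- skip the header row
);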

Related

Import fails from CSV file into SQL Server 2012 table

I am trying to import a rather large (520k rows) .CSV file into a SQL Server 2012 table. The file uses a delimiter of ";".
Please do not edit my delimiter. It is the three-character sequence ";" (quote, semicolon, quote). I know that may seem strange, but that is what they used. It is not just a semicolon.
I don't think the delimiter is the issue, because I replaced it with a tab and it seemed to be okay. When I try importing the file, I get a text truncation error, even though I set the column to 255 just to be sure it had plenty of room.
Even when I delete the offending row, the next row causes the same error. I don't see any offending characters in the data, so I am at a loss as to what the issue is.
I ended up using EOL Conversion in Notepad++, selecting Windows format, and then created a script to import the data.
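The post doesn't show the script itself, but a minimal sketch of what such an import could look like in T-SQL follows; the table name and file path are hypothetical, and the three-character delimiter is passed verbatim.

BULK INSERT dbo.ImportTarget
FROM 'C:\data\file.csv'
WITH (
    FIELDTERMINATOR = '";"',   -- the literal quote-semicolon-quote delimiter
    ROWTERMINATOR   = '\r\n',  -- Windows line endings, after the EOL conversion
    FIRSTROW        = 2        -- skip the header row
);

Note that with this delimiter the first and last columns of each row keep a stray leading/trailing quote, which may need trimming afterwards.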

MySQL Exporting Arabic/Persian Characters

I'm new to MySQL and I'm working with it through phpMyAdmin.
My problem is that I have imported some tables (from a .sql file) into a database with the utf8_general_ci collation, and they contain some Arabic and Persian characters. However, when I export the data into an Excel file, it appears as the following:
The original value: أحمد الكمالي
The exported value: أحمد  الكمالي
I have searched for this issue and tried to solve it by setting the output and the server connection to the same format, utf8_general_ci. But, for some reason I don't know, phpMyAdmin doesn't allow me to change to the same format; it forces me to choose utf8mb4_general_ci.
Anyway, when I export the data, I make sure that the format is UTF-8, but it still appears like that.
How can I solve or fix it?
Note: here are some screenshots, organized by number, if you want to check:
http://www.megafileupload.com/rbt5/Screenshots.rar
I found an easier way to rebuild the Excel file with the correct characters:
1. Export your data from MySQL normally, in CSV format.
2. Open a new Excel workbook and go to the Data tab.
3. Select "From Text" (if you can't find it, it is under "Get External Data").
4. Select your file.
5. Change the file origin to Unicode (UTF-8) and select Next ("Delimited" is checked by default).
6. Select the comma delimiter and press Finish.
You will see your language's characters correctly.
Mojibake. Probably...
The bytes you have in the client are correctly encoded in utf8mb4 (good).
You connected with SET NAMES latin1 (or set_charset('latin1') or ...), probably by default. (It should have been utf8mb4.)
The columns in the tables may or may not have been CHARACTER SET utf8mb4, but they should have been.
(utf8 and utf8mb4 work equally well for Arabic/Persian.)
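A hedged sketch of the forward-looking fix under those assumptions (the table name is hypothetical); repairing rows that were already stored as mojibake is a separate exercise:

SET NAMES utf8mb4;  -- make the client connection speak utf8mb4

-- Convert the table so its text columns actually store utf8mb4:
ALTER TABLE contacts CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;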
Please provide more details if this explanation does not suffice.

How to set the flat file column delimiter to an unprintable character [duplicate]

This question already has an answer here:
How to read a flatfile with lowercase thorn as the delimiter
I need to create a CSV file with a column delimiter of CTRL-A. Is that possible with the flat file destination? If it is, what's the syntax? If it isn't, is there a solution short of a custom destination?
I took a similar approach to sorrell, but my outcome was slightly different:
In SSMS, create the character and copy the output. Note that this will look like nothing if you paste it anywhere:
SELECT char(1) -- ASCII decimal 1 is SOH, the Ctrl-A control character
EDIT:
Make sure to copy the result of this query from the results window. You can confirm you have it by pasting into Notepad (it will show the cursor move one space) or into Notepad++ (it will show a highlighted "SOH").
Here is where I found which decimal to use: http://www.unix-manuals.com/refs/misc/ascii-table.html
Paste that value into the Column delimiter box of the Flat File Connection Manager.
In Notepad the output looks like a file with no separators at all, since SOH is invisible; in Notepad++ each delimiter shows up as the highlighted SOH (Start of Heading) marker from the ASCII table linked above.
I haven't fully tested this, but it seems doable.
Create a template flat file with the headings. I used LINQPad to create the Ctrl-A character using a Unicode escape (\u0001). You could also get there the ASCII route using \x01 (same character, just pointing this out in case you need to use it in code). Here's what's in my flat file (a T-SQL way to generate the same line is sketched after these steps):
ColumnA□ColumnB
Create a Flat File Destination, and create a New Flat File Connection. Select Delimited as the type.
Browse to your flat file template, check the Unicode box (if the file is Unicode), and if the data contains headers, check that box too.
Copy the Ctrl-A character from your template and paste it into the Column delimiter box. Then click the Refresh button.
You should now be able to work with the delimited columns. If you need to manually recreate that character in code, you can always use \x01 or \u0001.
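As an alternative to the LINQPad step, the same Ctrl-A-delimited header line can be produced in T-SQL and copied from the results window; the column names here are just the ones from the template above.

SELECT 'ColumnA' + CHAR(1) + 'ColumnB' AS TemplateHeader;  -- CHAR(1) is Ctrl-A / SOH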

SQL Server Export Unicode & Import via SSIS

(SQL Server 2008)
So here's my task:
I need to export query results to a file, and then import that file using SSIS into another DB.
Specific to the task, the data contains every awkward Unicode character you can think of, so delimiting with commas, pipes, etc. is out of the question.
Here are the options SSMS gives me for export format:
Column Aligned
Comma/Tab/Space delimited
Custom delimiter
And here are the options SSIS gives me for a flat file data source:
Delimited (custom)
Fixed Width
Ragged Right
So, given that a delimiter character is out of the question, I cannot see another method that both SSMS and SSIS agree on.
Such as fixed width?
It seems strange that two closely related MS products have such different options.
Or have I missed something here?
Any advice appreciated!
It seems you need to try different combinations of options while creating the delimited flat file for your exported query results.
Try setting the code page to UTF-8, with and without Unicode. Also use a text qualifier, such as " or any character of your choice that you think might work, and try different options for the column delimiter.
Once you are able to create the delimited file, apply the same settings to the file while importing it into the other DB.
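On the text-qualifier point: a qualifier is what lets a plain delimiter survive awkward data, because embedded delimiters stay inside the quotes. A small T-SQL illustration (QUOTENAME also doubles any embedded quote characters; its input is limited to 128 characters):

SELECT QUOTENAME(N'awkward, value | with "delimiters"', '"') AS QualifiedValue;
-- returns: "awkward, value | with ""delimiters"""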

SSIS package to export data into a CSV file for FTP

I'm creating an SSIS package to get a .csv file onto my local server and transfer it to FTP.
When I get my CSV onto the FTP site and open it in Excel, my data gets shifted over into other columns. Is there any internal setup I need to change?
I also tried a different text qualifier; that still did not work.
It sounds like there may be hidden characters in your data set. If you are using commas, you may want to consider a less commonly used character for the delimiter, such as a pipe (|). For instance, an address may naturally contain commas, while a pipe in an address field is probably a typo and far less likely. Things that shift data cells are often tab characters and CRLFs. You can also open your data set in a text editor like Notepad++ and choose the "Show All Characters" option under the View > Show Symbol menu to see exactly which character it is. If it's rampant in your data set, you can use the REPLACE function within the Derived Column task to scrub the data as it comes out of the data source.
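If the source is a SQL query, the same scrub can be done before SSIS ever sees the data. This is a hedged sketch with hypothetical table and column names, stripping tabs and CRLFs the same way the Derived Column REPLACE would:

SELECT REPLACE(
           REPLACE(
               REPLACE(Address, CHAR(9), ' '),  -- tabs become spaces
               CHAR(13), ''),                   -- strip carriage returns
           CHAR(10), '') AS CleanAddress        -- strip line feeds
FROM dbo.Contacts;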