SSIS Data conversion Package Error - ssis

This is what happens every time I try to run the package:

It appears this error is coming from a data flow task that reads from a text or Excel file source and loads into a database destination. The initial errors, which are likely causing the later ones, are due to a data type mismatch: some of the source fields are defined as Unicode where non-Unicode is expected. The message shows this happening with columns VILLE, HABITATION, and PROFESSION.
This can be corrected by inserting a Data Conversion transformation between the source and destination components in the data flow. There you can convert the data types, creating new columns that can then be mapped in the destination.
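To make the Unicode to non-Unicode step concrete, here is a rough Python sketch of what such a conversion amounts to. It is not SSIS code. The code page (1252), the sample values, and the "Copy of ..." naming for the new columns are assumptions chosen for illustration.

```python
def to_non_unicode(value: str, code_page: str = "cp1252") -> bytes:
    """Encode Unicode text to a single-byte code page; raise if a character cannot be represented."""
    return value.encode(code_page)


def convert_row(row: dict, columns: list, code_page: str = "cp1252") -> dict:
    """Return a copy of the row with converted copies of the named columns added."""
    converted = dict(row)
    for name in columns:
        # The original column is kept; the converted value goes into a new column.
        converted["Copy of " + name] = to_non_unicode(row[name], code_page)
    return converted


# Sample values are invented for illustration; only the column names come from the error message.
row = {"VILLE": "Montréal", "HABITATION": "Maison", "PROFESSION": "Ingénieur"}
result = convert_row(row, ["VILLE", "HABITATION", "PROFESSION"])
print(result["Copy of VILLE"])
```

A value containing characters outside the chosen code page would raise an error here, which is roughly the situation where a conversion reports a potential loss of data.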
Hope this helps.

Related

Load big one line flat json file in ssis

I am trying to load a big file which basically is a json format flat file from my local drive to SQL Server by using SSIS. It's a one line file and I don't need to specify columns and rows as I am going to parse it as soon as it's in SQL Server by OPENJSON.
But when I tried to create a Flat File Source in Visual Studio SSIS, I was not able to, even though I used the 'fixed width' format according to the solution here: import large flat file with very long string as SSIS package. The max width seems to be 32000, while the json file could be much bigger.
Here are my settings:
There are other options for loading the data with T-SQL, like OPENROWSET, but our SQL Server instance is installed on a different server from the one where we do our dev work, so there are some security limits between them.
So I am just wondering: is this a limitation of Flat File Source in SSIS, or did I not do it right?
You're likely looking for the Import Column transformation. https://learn.microsoft.com/en-us/sql/integration-services/data-flow/transformations/import-column-transformation?view=sql-server-ver15
Define a Data Flow as OLE DB Source -> Import Column -> OLE DB Destination.
OLE DB Source
Really, any source but this is the easiest to reproduce
SELECT 'C:\curl\output\source_data.txt' AS SourceFilePath;
That will add a column named SourceFilePath with a single row.
Import Column
Reference the article on Import Column Transformation, but the summary is:
Check the column that will provide the path.
Add a column to the Import Column Collection to hold the file content. Change the data type to DT_TEXT/DT_NTEXT depending on whether your data is Unicode, and note the LineageID value.
Click back to Import Column Input and find the column name. Scroll down to the Custom Properties and use the LineageID above for FileDataColumnID where it says 0. Otherwise, you get this error:
The "Import Column.Outputs[Import Column Output].Columns[FileContent]" is not referenced by any input column. Each output column must be referenced by exactly one input column.
OLE DB Destination
Any data sink will do, but the important thing is to map our column from the previous step to an n/varchar(max) column in the database.
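The shape of that data flow (one row carrying a path, and a step that attaches the whole file as a single value) can be sketched outside SSIS. The Python below is only an analogy under assumed names: the function `import_column` and the demo file contents are invented, while `SourceFilePath` and `FileContent` mirror the column names used above.

```python
import json
import os
import tempfile


def import_column(rows, path_column="SourceFilePath", content_column="FileContent"):
    """For each row, read the file named in path_column and attach its full text."""
    for row in rows:
        with open(row[path_column], encoding="utf-8") as handle:
            # The whole file becomes one value, with no column or row parsing.
            yield {**row, content_column: handle.read()}


# Demo with a throwaway one-line JSON file (contents invented for illustration).
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "source_data.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write('{"items": [1, 2, 3]}')
    output_rows = list(import_column([{"SourceFilePath": path}]))

# Parsing happens afterwards, much as OPENJSON would be applied once the text is in the database.
parsed = json.loads(output_rows[0]["FileContent"])
print(len(output_rows), parsed["items"])
```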

SSIS Errors for simple CSV Data Flow

Sorry to darken your day with my troubles, but SSIS has broken me! I am new to SSIS and I just seem to be misunderstanding it.
For background: I have a few versions of a basic package that includes a Foreach Loop container and a Data Flow with a few Derived Columns that imports CSV files into a SQL Server Staging table. It is very straightforward and does include an Execute SQL task and a File Move but those work fine. The issues are with the Foreach loop and the Data Flow.
I have one version of this package (let’s call it “A”) that seemed to be working fine. It would process multiple files in a folder, insert records into the staging table, properly execute the SQL Statements, and move the files to Archive. Everything seemed fine until I carefully QA’d the process. Turns out it was duplicating the data from one file, and never importing the data from a second Source File! Yet, the second/dupe round of data included the Source Filename (via a derived column) of the second file (but the data from the first). So it looked like I had successfully processed BOTH files until I looked at the actual data and saw that none of the values from the second source file were ever written to the Staging table.
Once I discovered this, I figured that the problem was in the Foreach loop and how I set up the different file path & name variables. So, I decided to try to make a new version of the package. I started by copying package A and created package B. In B, I deleted the Source Connection manager and created a new Connection Manager along with all new file & path variables. I then tried to clean up/fix/replace various elements in my Data Flow and Foreach loop. In the process, I discovered that the Advanced Mappings from A – which DID work – were virtually all set up as String (even the Currency and Date columns). That did not seem right, so I modified each source money column by changing to data type Currency, and changed each date-related column to data type Date.
What followed has been dozens and dozens of Errors and I cannot get Package B to run. I have even changed all of the B data types back to String (mirroring the setup in Package A which DID work). But, still no joy.
This leads me to ask a few questions to those of you smarter than I:
1) Why can’t SSIS interpret Source CSV data using the proper data type? I.e. why do I need to set every Input column as a STRING when some columns are clearly & completely Numeric, Currency or Dates? (Yes, the Source CSV files are VERY clean – most don’t even have NULLS)
a. When I do change the Advanced mapping for a date-related Source column to Date, I get the ever present error message: [Flat File Source [30]] Error: Data conversion failed. The data conversion for column "Settle Date" returned status value 2 and status text "The value could not be converted because of a potential loss of data.".
2) When I reset the data types back to String in package B, I still get errors – usually Truncation errors (and Yes – I have adjusted the length to 250 in one of these columns).
a. Error Message: "The value could not be converted because of a potential loss of data.".
b. When I reset the Mappings to ignore the column (as a test), it throws a similar error at the next column.
3) Any ideas why Package A would dupe a file’s data and not process the second file, yet throw no errors and move both to Archive?
4) Why does the Data Viewer appear to have parsing errors (it shows data in the wrong columns) but when you use the Copy data feature in the data viewer and paste it into Excel, all of the data lines up perfectly?
5) Are there any tips & tricks that a rookie SSIS user needs to understand and which might not be apparent through the documentation and searching web articles as well as this site?
I can provide further details if they will help, but these packages are really very simple and should not be causing me this much frustration.
THANKS for any insights.
DGP
Wow, seems like you have a lot of SSIS issues... I think the reason the same file is being extracted is the way your 'variable mappings' are defined.
Have you had a look at this guide and followed it:
https://www.simple-talk.com/sql/ssis/ssis-basics-introducing-the-foreach-loop-container/
Hope this helps.
Shaheen
Thanks Tab & Shaheen,
To all SSIS rookies - please learn from my mistakes!
It appears that my issue was actually in how I identified the TEXT QUALIFIER in the Connection Manager. I had entered "" and that was causing problems with how my columns were being parsed. The parsing issues caused unexpected values to appear in some of the columns and that was causing the errors in the package.
When I tried changing the Text Qualifier to only ONE double quote - " - the whole thing worked!
As I mentioned - and as Shaheen suspected - my initial issue with the duplicate processing was probably due to how I set up the Foreach loop. I had already fixed that, but was still getting errors until I fixed the Text Qualifier.
I have only tested it a few times but it looks like that was the issue.
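For anyone who wants to see why the qualifier matters, here is a small Python sketch (not SSIS) that parses the same line with and without a recognised text qualifier. The sample data is invented. Python's csv module cannot express a two-character qualifier like "", so "no qualifier recognised" stands in for the misconfigured case.

```python
import csv
import io

# Sample data invented for illustration; "Settle Date" is a column named in the question.
SAMPLE = 'Account,Description,Settle Date\r\n1001,"Bond, 5% coupon",2015-03-31\r\n'


def parse(text, qualifier):
    """Parse comma-delimited text; qualifier=None means no text qualifier is recognised."""
    if qualifier is None:
        reader = csv.reader(io.StringIO(text, newline=""), quoting=csv.QUOTE_NONE)
    else:
        reader = csv.reader(io.StringIO(text, newline=""), quotechar=qualifier)
    return list(reader)


good = parse(SAMPLE, '"')
bad = parse(SAMPLE, None)
print(good[1])  # three fields, the comma inside the description is preserved
print(bad[1])   # four fields, so the date lands in the wrong column
```

Once the fields shift like this, a date column receives text that is not a date, which matches the kind of conversion errors described above.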
Thanks for the contributions.
DGP

SSIS Use DataFlow task with variables instead of a source database

I have a task that I am working on that has me stumped. Hoping you can help me. I am using a data flow task which is basically inserting a row into a sqlite table. I was doing this using a "SQL Task", but unfortunately the only way to successfully insert a GUID into the sqlite table is to convert it to a byte stream using the data flow task. I do not want to use a source database because my data is not flowing from one table to another. I really just want to take my populated variables and convert them to a byte stream which I can then insert successfully into a sqlite database. The issue is, I cannot use a data flow task without a source database.
My work-around so far has been to declare a source database/table and only one column (but never use it in the data flow). This works fine and I am able to insert the row into sqlite using my pre-set variables, but I am left with a somewhat annoying message in my Output log every time I do this:
Warning: 0x80047076 at , SSIS.Pipeline: The output column "" (117) on output "OLE DB Source Output" (11) and component "OLE DB Source" (1) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.
Anyone know of a good way to get this warning not to show up?
In your dataflow choose a Script Component.
When prompted to choose Source, Destination, or Transformation, choose Source.
Add your pre populated variables to the CustomProperties.ReadOnlyVariables section of the script tab.
Go to the Inputs and Outputs section.
Add a column to the default output for each of your variables.
In your script (if using C#), put something similar to the following in the CreateNewOutputRows() method:
Output0Buffer.AddRow();
Output0Buffer.ContainerName = Variables.ContainerName;
Output0Buffer.TaskName = Variables.TaskName;
Output0Buffer.TaskStartDate = Variables.ContainerStartTime;
Save your script.
Connect your script component to your destination object.
If this is causing your package execution to fail, you have the option of ignoring these warnings/errors.
Just double-click the source component in the data flow, navigate to the last tab ("Error Output") in the left-hand pane, and select the option to ignore the errors. (I don't know exactly how that option is worded.)

What are the non-obvious causes of a data type mismatch while loading data in an SSIS package?

I'm very new to SSIS, so please bear with me. A developer gave me an SSIS package and asked me to create a scheduled job on our database server to run it. He says it runs on his development box, but I'm seeing the job fail with the following data type mismatch error:
0xC020837F The data type of column "output column 'col1' does not match the data type "System.Byte[]" of the source column 'col1'"
I opened the package in Visual Studio, and in the Input and Output Properties of the item, it shows both the External Column and Output Column as being of data type database timestamp [DT_DBTIMESTAMP]. I checked the source column on the server and verified that it is a datetime column. Are there any other reasons this error could be thrown?
It looks like your source table definition is not the same in the development and production environments. Since you didn't provide enough details about what kind of source component and connection manager you use, or what your source query is (maybe you CAST or CONVERT some data), we have to make some assumptions.
As stated in SSIS Error and Message Reference, error code 0xC020837F (-1071611009) has name DTS_E_ADOSRCDATATYPEMISMATCH and description:
The data type of "" does not match the data type "" of the source column "__".
From the error name (DTS_E_ADOSRCDATATYPEMISMATCH) and the "System.Byte[]" part of the error message, I conclude that you are probably using the ADO NET Source component.
For a start, check the following: open the properties of the source component, uncheck the particular column, and check it again. This forces the source component to refresh its external and output columns. This trick works for the OLE DB source, so it might help you as well.
If that doesn't help, check the following links to see if some of your source data types map to System.Byte[]:
Integration Services Data Types
SQL Server Data Types Mappings (ADO.NET)
Working with Data Types in the Data Flow
Probably, in either the development or the production environment, the column is of type timestamp, image, varbinary, or some other type that maps to managed System.Byte[], but in the other it is not. Please recheck the source table definitions.
If this answer doesn't help you, please post the CREATE statements for your source tables as well as the source query itself.
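One way to do that recheck is to compare the column definitions from both environments side by side. The Python below is a minimal sketch under stated assumptions: the two dictionaries are hypothetical stand-ins for what you might pull from each server's metadata, and `find_type_mismatches` is an invented helper name. The set of types lists SQL Server types that ADO.NET maps to System.Byte[].

```python
# SQL Server types that ADO.NET surfaces as System.Byte[].
BYTE_ARRAY_TYPES = {"binary", "varbinary", "image", "timestamp", "rowversion"}


def find_type_mismatches(dev_columns, prod_columns):
    """Compare two {column: type} mappings and report columns whose types differ."""
    report = []
    for name in sorted(set(dev_columns) | set(prod_columns)):
        dev_type = dev_columns.get(name)
        prod_type = prod_columns.get(name)
        if dev_type != prod_type:
            # Flag the mismatch if either side is a type that maps to Byte[].
            maps_to_bytes = bool({dev_type, prod_type} & BYTE_ARRAY_TYPES)
            report.append((name, dev_type, prod_type, maps_to_bytes))
    return report


# Hypothetical definitions; real ones would come from each environment's table metadata.
dev = {"col1": "datetime", "col2": "int"}
prod = {"col1": "timestamp", "col2": "int"}
print(find_type_mismatches(dev, prod))
```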

Erroneous row numbers in an SSIS task

I am importing a text file into a SQL Server table which has a number of constraints. I have created a package and its associated tasks.
At the end of the SSIS package execution, I want to know the numbers of the erroneous rows that were not successfully loaded into the DB. Is there any direct API or variable available in the Dts namespace that gives this information?
Kindly share with me any knowledge to get this information.
Thanks,
Rahul
The error (red line) output of your import step inside the data flow lets you redirect to an error table. This should list the information you are after.
http://msdn.microsoft.com/en-us/library/ms140083.aspx
Error Outputs ( http://msdn.microsoft.com/en-us/library/ms140080.aspx )
Sources, destinations, and transformations can include error outputs. You can specify how the data flow component responds to errors in each input or column by using the Configure Error Output dialog box. If an error or data truncation occurs at run time and the data flow component is configured to redirect rows, the data rows with the error are sent to the error output. By default, an error output contains the output columns and two error columns: ErrorCode and ErrorColumn. The output columns contain the data from the row that failed, ErrorCode provides the error code, and ErrorColumn identifies the failing column.
For more information, see Handling Errors in the Data Flow.
Redirect the error rows on the destination component, pipe them through a count operation and then log that to a log table or whatever.
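The redirect-and-count idea can be sketched in plain Python to show the logic. This is not SSIS code: the row rule, the sample lines, and the `RowNumber` field are assumptions for illustration. An SSIS error output carries ErrorCode and ErrorColumn rather than a row number, so a row number would have to be added to the rows yourself, as the enumeration does here.

```python
def load_with_error_output(lines, convert):
    """Send rows that convert cleanly to 'loaded' and the rest to an error output."""
    loaded, error_rows = [], []
    for row_number, line in enumerate(lines, start=1):
        try:
            loaded.append(convert(line))
        except ValueError as exc:
            # Redirect the failing row instead of failing the whole load.
            error_rows.append({"RowNumber": row_number, "Raw": line, "Error": str(exc)})
    return loaded, error_rows


def convert(line):
    """Hypothetical row rule: exactly two fields, the second numeric."""
    name, amount = line.split(",")
    return name, float(amount)


# Sample lines invented for illustration.
lines = ["alice,10.5", "bob,not-a-number", "carol,7", "dave"]
loaded, error_rows = load_with_error_output(lines, convert)
print(len(error_rows), [row["RowNumber"] for row in error_rows])
```

The count and the list of row numbers are what you would then write to a log table.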