SSIS Lookup failure even though all columns are matching - ssis

I am getting the strangest error. A bunch of lookups are failing even though the data from the source is exactly the same as the data in the destination.
There are 4 columns from the source, and all are of varchar type. I am using this source data to match against the destination using a Cache Connection Manager.
I ran the queries side by side and compared the results, and they are the same.
I also put them in Notepad++ with 'Show All symbols' turned on to look for any special characters, but I don't see any.
Any idea what may be causing the issue?

Look for possible nonprinting characters. I had this issue: I spent some time on it, and finally I used a hex viewer and found that in one data source one column had an LF at the end of the string.
Or maybe the code pages of the matching columns just don't match?
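As a rough illustration of the hex-viewer check above, here is a minimal Python sketch that lists control, format, and unusual space characters in a value. The function name and the sample values are made up for the example; in practice you would feed it values pulled from both sides of the lookup.

```python
import unicodedata


def describe_hidden_chars(value: str) -> list[tuple[int, str, str]]:
    """Return (index, code point, name) for characters that are easy to miss by eye."""
    findings = []
    for index, char in enumerate(value):
        category = unicodedata.category(char)
        # Cc = control (LF, CR, TAB), Cf = format (zero-width space, BOM),
        # Zs = space separators; a plain ASCII space is not reported.
        if category in ("Cc", "Cf") or (category == "Zs" and char != " "):
            # Control characters have no Unicode name, so supply a fallback.
            name = unicodedata.name(char, "<unnamed control>")
            findings.append((index, f"U+{ord(char):04X}", name))
    return findings


# Hypothetical values: they look identical on screen but differ by a trailing LF.
source_value = "ACME-001"
destination_value = "ACME-001\n"

print(source_value == destination_value)
print(describe_hidden_chars(destination_value))
# A hex dump of the bytes is the same check a hex viewer would give you.
print(destination_value.encode("utf-8").hex(" "))
```

If the function reports nothing on either side, a code page mismatch between the two columns is the next thing to rule out.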

Related

Reading Encrypted data with Datastage Tool

I need your help with the DataStage 11.7 tool. I am reading an AES-encrypted column from my source, and the column type is nvarchar. When we start our job and read data from the source, the job runs successfully and what looks like exactly the same data is moved to my target database with the same column type.
The problem occurs when I query the data to check whether my source and target values are the same: the query does not return any rows. Visually, the source and target values look the same, but the SQL statement returns nothing. The database is Vertica.
Column value are special Alpha numeric and special characters like �D�&7��x��d$�Q
I'm not at all sure this is even properly possible via DataStage: treating encrypted data as a varchar. Some databases have internal keys that go with the data, which requires decrypting before extracting. I'm assuming that decrypting, transporting, landing and then encrypting is not an option.
But if I had to take a stab in the dark:
The very first thing I'd check is that the character set and collation are the same on both databases at the table level. A difference can result in blank results on the target side.
Also check that the NLS map in DataStage (the map for stages and the collation locale) is set accordingly. I don't know what that setting should be, but making it the same in DataStage and in both databases would be ideal; a search should help. You need to tell us what is already set in the databases, and run tests. I'm not sure the DataStage default of ISO-8859-1 will work.
Please post your solution if you find one.
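To show why the character set matters for ciphertext stored in a text column, here is a small Python sketch. The byte values are made up and only stand in for AES output, and the helper name is hypothetical; it does not reproduce what DataStage or Vertica actually do, it only shows that one encoding round-trips arbitrary bytes and another does not.

```python
def round_trip(data: bytes, encoding: str) -> bytes:
    """Decode bytes as text and encode them back, replacing anything invalid."""
    return data.decode(encoding, errors="replace").encode(encoding, errors="replace")


# Made-up bytes standing in for an encrypted value (assumption for illustration).
ciphertext = bytes([0xE2, 0x44, 0x9A, 0x26, 0x37, 0xFF, 0x80, 0x78])

# ISO-8859-1 maps every byte value to exactly one code point, so nothing is lost.
print(round_trip(ciphertext, "iso-8859-1") == ciphertext)

# Interpreted as UTF-8, the invalid sequences become U+FFFD and the original
# bytes cannot be recovered, so an equality comparison later finds no match.
print(round_trip(ciphertext, "utf-8") == ciphertext)
```

Two values that were damaged this way can still look alike on screen, which would be consistent with a comparison query that returns nothing.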

Data Cleanse ENTIRE Access Table of Specific Value (SQL Update Query Issues)

I've been searching for a quick way to do this after my first few thoughts have failed me, but I haven't found anything.
My Issue
I'm importing raw client data into an Access database, where the flat file they provide is parsed and converted into a standardized format for our organization. I do this for all of our clients, but this particular client's software gives us a file that puts "(NULL)" in every field that should be NULL. lol. As a result, I have a ton of strings rather than null fields!
My goal is to do a data cleanse of the entire TABLE, rather than perform the cleanse at the FIELD level (as I do in my temporary solution below).
Data Cleanse
Temporary Solution:
I can't add those strings to our data warehouse, so for now I just have a query with an IIF statement that replaces "(NULL)" with "" for each field (which took a while to set up, since the client file has roughly 96 fields). This works. However, we work with hundreds of clients, so I'd like a scalable solution that doesn't require many changes if another client has a similar file; not to mention that if this client changes something in their file, I might have to redo my field-specific statements.
Long-term Solution:
My first thought was an UPDATE query. I was hoping I could do something like:
UPDATE [ImportedRaw_T]
SET [ImportedRaw_T].* = ""
WHERE ([ImportedRaw_T].* = "(NULL)");
This would be easily scalable, since for further clients I'd only need to change the table name and replace "(NULL)" with their particular default. Unfortunately, you can't use SELECT * with an update query.
Can anyone think of a workaround to the SELECT * issue for the update query, or have a better solution for cleansing an entire table, rather than doing the cleanse at the field level?
SIDE NOTES
This conversion is 100% automated currently (Access is called via a watch folder batch), so anything requiring manual data manipulation / human intervention is out.
I've tried using a batch script to just cleanse the data in the .txt file before importing to Access - however, this caused an issue with the fixed-width format of the .txt, which has caused even larger issues with the automatic import of the file to Access. So I'd prefer to do this in Access if possible.
Any thoughts and suggestions are greatly appreciated. Thanks!
Unfortunately, it's impossible to do this in SQL using wildcards instead of column names; there is no such syntax.
I would suggest a VBA solution, where you loop through all the fields of the table and, if a field's data type is string, generate and execute an SQL UPDATE command for that field.
Also, use Null instead of "" if you really need Nulls in the fields instead of empty strings; they can behave differently in calculations.
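The loop described above could look roughly like the following. This is only a sketch of the idea in Python, using an in-memory SQLite table as a stand-in for the Access table (an assumption made so the example runs anywhere); the column names are invented, and in Access you would do the equivalent in VBA by walking the table's fields.

```python
import sqlite3


def null_out_placeholder(conn: sqlite3.Connection, table: str, placeholder: str) -> int:
    """Set every text column value equal to `placeholder` to NULL; return values changed."""
    # The table name is interpolated directly, so it must come from trusted code.
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    changed = 0
    for _cid, name, declared_type, *_rest in columns:
        type_name = declared_type.upper()
        # Skip anything that is not a string column.
        if "CHAR" not in type_name and "TEXT" not in type_name:
            continue
        # One generated UPDATE per string field, as the answer suggests.
        cursor = conn.execute(
            f'UPDATE {table} SET "{name}" = NULL WHERE "{name}" = ?', (placeholder,)
        )
        changed += cursor.rowcount
    conn.commit()
    return changed


conn = sqlite3.connect(":memory:")
# Hypothetical columns; the real import table has roughly 96 fields.
conn.execute("CREATE TABLE ImportedRaw_T (ClientName TEXT, City TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO ImportedRaw_T VALUES (?, ?, ?)",
    [("Acme", "(NULL)", 10), ("(NULL)", "(NULL)", 20)],
)

print(null_out_placeholder(conn, "ImportedRaw_T", "(NULL)"))
print(conn.execute("SELECT * FROM ImportedRaw_T ORDER BY Amount").fetchall())
```

Because the field list is read from the table itself, only the table name and the placeholder string change from one client to the next.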

Split columns in SSIS into a numeric part and the remainder of the column value

I'm trying to figure out how to convert my existing DTS files to DTSX hosted on a SQL 2005 server.
In my first try (first DTS) I'm already stuck.
I don't wanna look how things are done using DTS and wanna focus on the new DTSX techniques.
What has to be done:
Check if input file exists, else exit (not done yet)
Truncate destination table
Import file into DB
Report if everything was alright.
The import file step is where I'm stuck. I have a fixed-width flat text file where house number and extension are in a single column. The database has two columns for them.
I first tried a derived column but couldn't find a check for splitting off the (first) numeric part.
When searching for the use of regex, I read about the "script component", which I read isn't compatible with SQL 2005.
Is there another possibility?
This brings me to a second question: is it possible to use SQL Server Data Tools (SSDT) with SQL Server 2005?
You have a job ahead of you. You will need to use Derived Column transformations, so look them up. It will be helpful to add Data Viewers to your dataflows so you can see what data is moving through the flow, and what SSIS thinks is in there.
In your case, you are going to have to manipulate strings. There are two string data types, DT_STR and DT_WSTR (roughly, VARCHAR and NVARCHAR). SSIS is very particular about data types, and you may have to convert one type into the other using cast operations. E.g., (DT_WSTR, 50)Blah casts column Blah to a fifty-character DT_WSTR. The DT_STR cast also needs to know the code page, e.g. (DT_STR, 50, 1252)Blah.
You will likely need the SSIS version of Immediate If, which checks if a condition is true, and returns either one result or the other: Blah==100?1:0 The result of this will be a 1 for those records where column Blah equals 100, and 0 for all other records. You can nest these Immediate If statements, with one inside the next, inside the next, and so forth.
You will need at least two new columns being created in the Derived Column widget, one for the numbers, and one for what follows. So here is one very painful way to do what you want. Use a string function to check whether the first character is a number, and then either return it, or don't. Do the same for the second character, etc.
Of course, I'm sure there's a better way to do this. Your toolbox consists of the functions in the Derived Column widget, so that's what you've got to work with. (Or, alternatively, you might do better in this case with a SQL UPDATE statement, which you would execute as a subsequent task in the Control Flow above, not in the Data Flow below.)
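To make the character-by-character logic concrete, here is a minimal sketch in Python. It is not SSIS expression syntax; it only shows the rule the nested conditionals (or a later SQL UPDATE) would have to implement. The function name and the sample inputs are made up.

```python
def split_house_number(raw: str) -> tuple[str, str]:
    """Split a value such as '12A' into its leading digits and whatever follows."""
    text = raw.strip()
    position = 0
    # Walk forward while the current character is an ASCII digit.
    while position < len(text) and text[position] in "0123456789":
        position += 1
    # Everything before `position` is the number, the rest is the extension.
    return text[:position], text[position:].strip()


# Hypothetical fixed-width column values.
for sample in ["12A", "7", "  105 bis ", "A12"]:
    print(repr(sample), "->", split_house_number(sample))
```

A value with no leading digits comes back with an empty number part, which is a case the Derived Column version would need to handle explicitly as well.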
Fair warning: the SQL Server 2005 version of SSIS has many, many, many bugs and frustrations which were fixed or improved in later versions of SQL Server. Even if you just go to SQL Server 2008 you will save yourself many headaches.

SSIS missing data from SQL table using Fast Load

I have a bit of a problem. When I set up an SSIS package and fire it off, it shows me the number of rows going into the SQL table, but when I query the table, almost 40000 rows are missing compared with the last count after the conditional split that I have in the package.
What causes this problem? Even if I set it to a normal table or view, it still does the same thing. But here I have to use the fast load option, as a lot of source files are being loaded. This is only testing before sending it to production, and I am stuck at the moment. Is there a way I can work around this problem and get all the data that is supposed to be pumped into the table? Please also note that the conditional split removes any NULL values, as seen in the first picture.
Check the Error Output page (listed under Connection Manager and Mappings) within the destination component. If the Error setting is set to Ignore Failure or Redirect Row, the component will succeed, but only the successful rows will be inserted.
What is the data source? Try checking your data and make sure you don't have any terminators stored in one of the rows.
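One way to check for stray terminators before loading is a quick scan of the field values. The sketch below is in Python and assumes the rows are already available as tuples of strings; the function name, the default terminator set, and the sample rows are all assumptions for illustration.

```python
def find_embedded_terminators(rows, terminators=("\r", "\n", "\t")):
    """Return (row number, column number, repr of value) for fields holding a terminator."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for column_number, value in enumerate(row, start=1):
            # Only string fields can carry a row or column terminator.
            if isinstance(value, str) and any(t in value for t in terminators):
                problems.append((row_number, column_number, repr(value)))
    return problems


# Hypothetical rows; the second one has a line feed hidden inside a field.
sample_rows = [("1001", "Smith"), ("1002\n", "Jones"), ("1003", None)]
print(find_embedded_terminators(sample_rows))
```

Adjust the terminator tuple to whatever row and column delimiters the source files actually use.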

SSIS OLE DB conditional "insert"

I have no idea whether this can be done or not, but basically, I have the following data flow:
Extracts the data from an XML file (works fine)
Simply splits the records based on an enclosed condition (works fine)
Had to add a derived column object due to some character set issues (might be better methods, but it works)
Now "Step 4" is where I'm stuck: I'd only like to insert the values that have a corresponding match in my database. For instance, the XML has about 6000 records, and of those maybe 10 need to be matched and inserted; I'd rather do that than insert all 6000 and do the compare after the fact (which I could also do, but I was hoping there'd be another method). I was thinking that I might be able to run a SQL insert command within the OLE DB Destination object where the ID value in the file matches, but I'm not 100% clear on how, or whether it's even possible. Should I simply go the temp table route and scrub the data after the fact, or can I do this directly in the destination? Any suggestions would be greatly appreciated.
EDIT
Thanks to the last comment from billinkc, I managed to get a bit closer: I can identify the matches and use that result set, but somehow it seems to be running the data flow twice, which is strange. I took the lookup object out to see whether it was the cause, and that seems to be the case. Is there any reason why it would run this entire flow twice with the addition of the lookup? I should have a total of 8 matches, which I confirmed with the data viewer output, but then it seems to run a second time for the same file.
Is there a reason you can't use a Lookup transformation to find existing records? Configure it so that it routes non-matching records to the no match output, and then only connect the match found connector to the "Navigator Staging Manager Funds" destination.
I believe that answers what you've asked, but I wonder if you're expressing the right desire. My assumption is that the lookup would go against the existing destination, so the lookup returns, say, the id 10 for a row. All of the out-of-the-box destinations in SSIS only perform inserts, so a row that found a match would now get doubled. As you are looking for existing rows, that usually implies you'd want to update an existing row. If that's the case, there is a transformation designed for it, the OLE DB Command. It is the component that allows for updates. There is a performance problem with that component: it issues a single update statement per row flowing through it. For 10 rows, I think it'd be fine. Otherwise, the pattern you'd use is to write all the new rows (inserts) into your destination table and write all of your changed rows (updates) into a second, staging-type table. After the data flow is complete, use an Execute SQL Task to perform a set-based update statement.
There are third party options that handle combined upserts. I know Pragmatic Works has an option and there are probably others on the tasks and components site.
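The insert-plus-staged-update pattern described in this answer can be sketched outside SSIS as follows. This is a Python illustration using an in-memory SQLite database as a stand-in for the real destination (an assumption); the table names, column names, and sample rows are hypothetical. The matched and unmatched lists play the role of the Lookup's match and no match outputs, and the final statement plays the role of the Execute SQL Task.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Funds (FundId INTEGER PRIMARY KEY, FundName TEXT);
    CREATE TABLE Funds_UpdateStaging (FundId INTEGER PRIMARY KEY, FundName TEXT);
    INSERT INTO Funds VALUES (1, 'Old name A'), (2, 'Old name B');
    """
)

# Hypothetical incoming rows, e.g. parsed from the XML file.
incoming = [(1, "New name A"), (3, "Brand new C")]

# "Lookup": split incoming rows by whether their id already exists.
existing_ids = {row[0] for row in conn.execute("SELECT FundId FROM Funds")}
matched = [row for row in incoming if row[0] in existing_ids]
unmatched = [row for row in incoming if row[0] not in existing_ids]

# New rows go straight to the destination; changed rows go to staging.
conn.executemany("INSERT INTO Funds VALUES (?, ?)", unmatched)
conn.executemany("INSERT INTO Funds_UpdateStaging VALUES (?, ?)", matched)

# One set-based update after the "data flow", instead of one UPDATE per row.
conn.execute(
    """
    UPDATE Funds
    SET FundName = (SELECT s.FundName FROM Funds_UpdateStaging AS s
                    WHERE s.FundId = Funds.FundId)
    WHERE FundId IN (SELECT FundId FROM Funds_UpdateStaging)
    """
)
conn.commit()

print(conn.execute("SELECT * FROM Funds ORDER BY FundId").fetchall())
```

If the goal is only to keep the matching rows, as in the question, the same split applies: send the matched list to the destination and leave the no match output unconnected.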