I have 5000 rows in a source file with 11 columns and no duplicate rows, and I loaded all of those rows into the destination table.
Now I have inserted one new record into the source and tried performing a Lookup. In the Lookup I mapped all 11 columns and did not pass anything through, and the output of the Lookup (the No Match output) is passed to the destination.
But my output has 5000 (previously loaded) + 5001 = 10001 rows, while I only need that one new record to be inserted into the destination table.
Why is this happening? Can someone tell me where I am going wrong?
Alternative: I tried using SCD, but I couldn't figure out which column could be my business key.
The columns are: Purchased Date, City, Investor Name, Investor City, Developer Name, Square feet, Area Purchased...
The way this usually works is this:
[Load Source File/Table] => [Lookup based on destination]
On match: either ignore the row or send it to an update path.
On no match: insert the record.
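In set-based T-SQL the same pattern is an insert guarded by a NOT EXISTS check. A minimal sketch, using placeholder table names and only three of the 11 columns for brevity (with no single business key, the comparison would have to cover all 11):

-- Insert only the source rows that do not already exist in the destination.
INSERT INTO dbo.Destination (PurchasedDate, City, InvestorName)  -- plus the other columns
SELECT s.PurchasedDate, s.City, s.InvestorName
FROM dbo.Source AS s
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.Destination AS d
                  WHERE d.PurchasedDate = s.PurchasedDate
                    AND d.City = s.City
                    AND d.InvestorName = s.InvestorName);

Rows that already match are skipped, so rerunning the load only picks up the genuinely new records.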
I have two OLE DB sources, such as:
DB Source1 = select count(*) from A
DB Source2 = select count(*) from B
Now I need to get the count of records uploaded: DB Source1 - DB Source2.
For example, if DBSource1 = 9 and DBSource2 = 1, then the records uploaded will be 9 - 1 = 8.
Finally, I need them loaded into a Flat File Destination with the following columns:
RecordsReceived  ErrorRecords  RecordsUploaded
9                1              8
How do I achieve this?
TIA :)
You should look into the Row Count transformation. It counts the records that flow through it and stores the count in a variable you have declared. You can use those variables later in your script to write them to a flat file.
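Alternatively, since both inputs are single-value count queries, one OLE DB source query can produce all three columns at once and feed the Flat File Destination directly. A sketch using the tables from the post:

-- RecordsReceived, ErrorRecords and RecordsUploaded in a single row.
SELECT t.RecordsReceived,
       t.ErrorRecords,
       t.RecordsReceived - t.ErrorRecords AS RecordsUploaded
FROM (SELECT (SELECT COUNT(*) FROM A) AS RecordsReceived,
             (SELECT COUNT(*) FROM B) AS ErrorRecords) AS t;

With the example above this returns 9, 1, 8.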
I have to process a flat file whose syntax is as follows, one record per line.
<header>|<datagroup_1>|...|<datagroup_n>|[CR][LF]
The header has a fixed-length field format that never changes (ID, timestamp etc). However, there are different types of data groups and, even though fixed-length, the number of their fields vary depending on the data group type. The three first numbers of a data group define its type. The number of data groups in each record varies also.
My idea is to have a staging table to which I would insert all the data groups. So two records like these,
12320160101|12323456KKSD3467|456SSGFED43520160101173802|
98720160102|456GGLWSD45960160108854802|
would produce three records in the staging table:
ID   Timestamp    Data
123  01/01/2016   12323456KKSD3467
123  01/01/2016   456SSGFED43520160101173802
987  02/01/2016   456GGLWSD45960160108854802
This would allow me to preprocess the staged records for further processing (some would be discarded, some would have their data broken down further). My question is how to break the flat file down into the staging table. I can split the entire record on the pipe (|) and then use a Derived Column transformation to break down the header with SUBSTRING. After that it gets trickier because of the varying number of data groups.
The solution I came up with myself doesn't try to split at the Flat File Source, but rather in a script. My Data Flow is simply Flat File Source => Script Component => OLE DB Destination.
So the Flat File Source output is just a single column containing the entire line. The Script Component defines output columns for each column in the staging table. The script looks like this:
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Requires "using System.Globalization;" at the top of the script.
    // RemoveEmptyEntries drops the empty string produced by the trailing pipe,
    // which would otherwise create an extra row with an empty data group.
    var splits = Row.Line.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);

    // Emit one output row per data group, repeating the header fields.
    for (int i = 1; i < splits.Length; i++)
    {
        Output0Buffer.AddRow();
        // The Substring offsets assume my header layout; adjust them to match yours.
        Output0Buffer.ID = splits[0].Substring(0, 11);
        Output0Buffer.Time = DateTime.ParseExact(splits[0].Substring(14, 14),
            "yyyyMMddHHmmssFFF", CultureInfo.InvariantCulture);
        Output0Buffer.Datagroup = splits[i];
    }
}
Note that the SynchronousInputID property (Script Transformation Editor > Inputs and Outputs > Output0) must be set to None. Otherwise Output0Buffer won't be available in your script. Finally, the OLE DB Destination just maps the script output columns to the staging table columns. This solves the problem I had with creating multiple output records from a single input record.
I want to load data from staging into the model, from a source table into a dimension in the model. I want to apply the following.
source table:
ID | Name       | STRDATE  | ENDDATE
1  | amr hassan | 1-1-2016 | 2099-12-31
After updating the Name column from 'amr hassan' to 'amr', I want the updated record to look like the following in the target table:
Dim_target_table:
ID | Name       | STRDATE   | ENDDATE
1  | amr hassan | 1-1-2016  | 21-1-2016
1  | amr        | 21-1-2016 | 2099-12-31
After the Lookup, create a Conditional Split and write this logic in it:
For new records: ISNULL(LKP_EMPNO). For updates: (ENAME != LKP_ENAME || SAL != LKP_SAL) && (LKP_EMPNO == EMPNO)
Connect the new-records output of the Conditional Split to an OLE DB Destination; it loads the new records. Connect the updates output to a Multicast, and from the Multicast connect one output to an OLE DB Destination and the other to an OLE DB Command.
In the OLE DB Command, write this query:
Update EMPLOYEE_D set ename=?, sal=?, enddate=getdate(), currentflag=0 where DW_EMP_ID=?
DW_EMP_ID is an identity column.
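For reference, a rough set-based T-SQL sketch of the same Type 2 logic. Column names are taken from the post where available; the staging table name STG_EMPLOYEE and the strdate column are assumptions:

-- 1. Expire the current dimension row when a tracked attribute changed.
UPDATE d
SET    d.enddate = GETDATE(), d.currentflag = 0
FROM   EMPLOYEE_D AS d
JOIN   STG_EMPLOYEE AS s ON s.EMPNO = d.EMPNO
WHERE  d.currentflag = 1
  AND  (s.ENAME <> d.ename OR s.SAL <> d.sal);

-- 2. Insert a new current row for brand-new and just-expired employees.
INSERT INTO EMPLOYEE_D (EMPNO, ename, sal, strdate, enddate, currentflag)
SELECT s.EMPNO, s.ENAME, s.SAL, GETDATE(), '2099-12-31', 1
FROM   STG_EMPLOYEE AS s
LEFT JOIN EMPLOYEE_D AS d ON d.EMPNO = s.EMPNO AND d.currentflag = 1
WHERE  d.EMPNO IS NULL;

The update must run before the insert, so that changed employees no longer have a current row and fall through to the insert.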
No need to build this yourself. Just drag in the Slowly Changing Dimension transformation and it will take care of it:
http://www.msbiguide.com/2012/03/how-to-defineimplement-type-2-scd-in-ssis-using-slowly-changing-dimension-transformation/
Using Rails 4, Ruby 2, MySQL
I would like to find all the records in my database which are repeats of another record - but not the original record itself.
This is so I can update_attributes(:duplicate => true) on each of these records and leave the original one not marked as a duplicate.
You could say that I am looking for the opposite of uniq.* I don't want the unique values; I want all the values which are not unique after the fact. But I don't want every value which has a duplicate, as that would include the original.
I don't mind using pure SQL or Ruby for this, but I would prefer to use Active Record to keep it Railsy.
Let's say the table is called "Leads" and we are looking for rows where the field "telephone_number" is the same. I would leave record 1 alone and mark records 2, 3 and 4 as duplicate = true.
* If I wanted the opposite of uniq I could do something like "Find keep duplicates in Ruby hashes":
b = a.group_by { |h| h[:telephone_number] }.values.select { |a| a.size > 1 }.flatten
But that is all of the duplicated records; I want all the duplicated ones other than the original each one is compared to.
I'm assuming your query collects the Leads that share a telephone number. Be careful with b = b.shift: shift returns the element that was removed, so that assignment would replace b with the first record rather than the remaining ones. Instead, drop the first (original) record from each group before flattening:
b = a.group_by { |h| h[:telephone_number] }.values.select { |g| g.size > 1 }.flat_map { |g| g.drop(1) }
Then you can continue with your original thought, update_attributes(:duplicate => true), on each record left in b.
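If pure SQL is acceptable (the question says it is), a MySQL sketch of the same idea, assuming a leads table with id, telephone_number and duplicate columns:

-- Flag every row that has an earlier row (lower id) with the same number;
-- the first occurrence in each group is left unflagged.
UPDATE leads AS d
JOIN   leads AS o
  ON   o.telephone_number = d.telephone_number
 AND   o.id < d.id
SET    d.duplicate = true;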
Using SSIS (2008) and T-SQL: how can you take the filename of a file, look up that value in a SQL table column, and then return a value from another column?
I have a folder with .jpg image files of products. All the filenames are in the format eancode.jpg, for example 1234567891023.jpg. Each filename is unique: one EAN code per image file for every product.
The productID (primary key, varchar) and the eancode (varchar) are stored in the same SQL table (without the .jpg extension, of course).
What I would like to do is rename the file from eancode.jpg to productID.jpg.
This is the process I had in mind, for example:
Files in FolderA:
ean1.jpg
ean2.jpg
ean3.jpg
Steps/Tasks:
1. Get the filename ean1 from ean1.jpg.
2. Look up ean1 in the table column EANcode.
3. Return the corresponding productID: select productID from Table where EANcode = 'ean1'.
4. Store the value returned in step 3 in a package variable (?).
5. Use the stored value from step 4 to rename the file.
6. Do steps 1 - 5 in a Foreach Loop for all images; if no match can be found, it should do nothing with the file.
The main focus of my question is on steps 2 - 4.
Thanks!
Your suggested solution would work perfectly. Set the result set (in the middle of the General tab) of the Execute SQL Task to Single row, and then map the zero-based result value (Result Name 0) to your package variable on the Result Set tab.
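For steps 2 - 3, the statement inside that Execute SQL Task could look like the following. A minimal sketch, assuming a table name of dbo.Products (the real table and column names come from your schema); the ? parameter is mapped on the Parameter Mapping tab to the variable holding the current EAN code:

-- Returns the productID for one EAN code. Result Name 0 maps to the
-- package variable that is later used to rename the file.
SELECT productID
FROM   dbo.Products
WHERE  EANcode = ?;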