Master-detail migration in SSIS 2008

I have two MSSQL 2008 databases, dbA and dbB.
dbA contains a master-detail table pair: AMaster and ADetail. Correspondingly, dbB contains BMaster and BDetail. The only difference between A and B is the type of the primary key: in the source database (dbA) it is an integer, but in the destination (dbB) it is a uniqueidentifier.
Dear colleagues: how do I describe a data flow in SSIS to handle this case? I need to copy all fields but replace the key with the new type.

If you do not want to store the natural primary key in the destination, how do you manage updates? If a record is changed in the source and you want to replicate the corresponding change in the destination table, how will you do that? It is not logically possible: either you keep the integer-based natural key in the destination table, or you keep a mapping table that stores the old natural key and its corresponding new key.
Also, a uniqueidentifier is a far bigger number, and I don't think that converting an integer to a UID is a good option. If you really want to do it, do it this way: INT >> HEX STRING >> UID. Read more about uniqueidentifier here: http://msdn.microsoft.com/en-us/library/ms187942.aspx
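As a hedged illustration of the INT >> HEX STRING >> UID idea (table and column names below are hypothetical, not from the question), a deterministic conversion in T-SQL could look like this:

-- Left-pad the integer into 16 bytes, then cast the binary to uniqueidentifier.
DECLARE @old_id INT = 12345;
SELECT CAST(CAST(@old_id AS BINARY(16)) AS UNIQUEIDENTIFIER) AS new_id;

-- Or keep an explicit mapping table, so later updates from the source stay traceable:
CREATE TABLE dbo.KeyMap (
    old_id INT NOT NULL PRIMARY KEY,                  -- natural key from dbA
    new_id UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID()  -- surrogate key used in dbB
);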

Related

PowerApps: Access-like update query

I’m pretty new to PowerApps and need to migrate an Access database over to PowerApps, first of all its tables to Dataverse. It’s a typical use case for a model-driven app, with many relationships between the tables. All Access tables had an autogenerated ID field as their primary key.
I transferred all tables via Excel export/import to Dataverse. Before importing, I renamed all ID fields (columns) to ID_old and let Dataverse create its own autogenerated ID field for each table.
What I want to achieve is to re-establish all relationships between the tables, where the foreign key points to the new primary key provided by Dataverse, as I want to avoid double keys. As a first step I created relationships between the ID_old field and the corresponding (old) foreign key field in the related table.
In good old Access, I’d now simply run an update query, filling the new (still empty) foreign key field with the new ID of the related table. Finally, I would change the relationship to the new primary and foreign keys and then delete the old ID fields.
Where I got stuck is the update query. I searched the net and found a couple of options like the UpdateIf/Patch functions, Power Query, or Excel export/import, and some more. They all read pretty complicated and time-intensive, and I think I must have overlooked a very simple solution for such a common problem.
Is there someone out there who might point me in the right (and simple) direction? Thanks!
A more efficient approach would be to start by creating extra ID columns in Access. Generate your GUIDs and fix your foreign keys there. This can be done efficiently with a few SQL update statements.
When it comes to transferring your Access tables to Dataverse, you just provide your Access shadow primary keys in the Create message.
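A minimal sketch of that idea in Access SQL (all table and column names here are illustrative): give each parent table a new AutoNumber column of type Replication ID, which Access fills with a GUID per row, then rewrite each child table's foreign key with one update join:

-- Child.Parent_NewGuid and Parent.NewGuid are the assumed shadow GUID columns.
UPDATE Child INNER JOIN Parent ON Child.Parent_ID_old = Parent.ID_old
SET Child.Parent_NewGuid = Parent.NewGuid;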
I solved the issue as follows, which is pretty efficient in my perception. I’m assuming you have an auto-numbered ID field in every Access table, which you used for your relationships:
Export your tables from Access to Excel.
Rename your ID fields to ID_old in all tables using Excel, as well as your foreign key fields to e.g. ForeignKey_old. This will make it easy to identify the fields later in Dataverse.
Import into Dataverse, using the Power Query tool. Important: make sure that you choose ID_old as an additional primary key field in the last import step.
Re-create all relationships in Dataverse, using the Lookup datatype. This will create a new, yet empty column in your table.
Now use the “Edit in Excel” feature to open your table in Excel. You should get your prefix_foreignkey_old column with the old foreign keys displayed, as well as the reference to your related table, e.g. prefix_referencetable.prefix_id_old, which is still empty.
Now just copy the complete prefix_foreignkey_old column values into the prefix_referencetable.prefix_id_old column.
Import the changes and you’re done.
Hope this is helpful for some of you out there.

Issues with Slowly Changing Dimension Table with Primary Key

I have a package designed by another developer who used to work for the company. The package takes data from the source and inserts it into the destination. The Slowly Changing Dimension task has 4 columns set as historic attributes, meaning it will insert a new row when any of their values changes. The business key is called PropertyID.
In the destination table, PropertyID is the primary key. When the package runs, we get a primary key violation error, which is understandable, because the destination table cannot insert a duplicate value when there is a change in a historic attribute. It is maybe not the best design.
I want to correct this but I am not sure of the right approach. I tried adding a new INT IDENTITY column (to use as a business key in the SCD wizard) to the destination table and removing the primary key from the current PropertyID column. But the INT IDENTITY column does not show up in the SCD wizard.
If someone can show me the right approach, I would be most grateful.
Thanks.
In a slowly changing dimension, the destination table will have two types of keys, the surrogate key which will tie out to the fact table, and the business key, which identifies the record from the source.
You do not want the business key as the primary key on the destination in a slowly changing dimension. That is the point of the SCD: you will have multiple rows per business key, since you are tracking changes. If you do not want this, and your table is all type 1 changes (overwrites with the current value), then the SCD transform is not what you want.
See this link ... https://en.wikipedia.org/wiki/Surrogate_key
It looks like you are trying to design a Type 2 SCD, as the changed records are getting inserted. In that case there should be a date field to track when a particular record was changed, as well as to identify the current record. The primary key in the destination table should then be a composite of PropertyID and a date field. You can refer to the link below to see how a Type 2 SCD is designed.
http://datawarehouse4u.info/SCD-Slowly-Changing-Dimensions.html
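For illustration, a typical Type 2 destination could look like the T-SQL sketch below (column names are hypothetical): a surrogate primary key plus date columns that track each row's validity window.

CREATE TABLE dbo.DimProperty (
    PropertySK    INT IDENTITY(1,1) NOT NULL PRIMARY KEY, -- surrogate key
    PropertyID    INT          NOT NULL,                  -- business key from the source
    Attribute1    NVARCHAR(50) NULL,                      -- a historic attribute tracked by the SCD
    EffectiveFrom DATETIME     NOT NULL,
    EffectiveTo   DATETIME     NULL                       -- NULL marks the current row
);
-- Several rows per PropertyID are expected, so uniqueness is enforced per validity window:
CREATE UNIQUE INDEX UX_DimProperty ON dbo.DimProperty (PropertyID, EffectiveFrom);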

Is it okay to use the same column as a primary key for different tables?

I am a total novice to this whole database world and I have a question. I am building a database for the final project of my master's class. The database includes cities, counties, and demographic data for the state of Colorado, and it will ultimately be used as a spatial database. At this point I have all my tables built in Access and have an ODBC connection to PostgreSQL to import the tables after they are created. Access does not allow shapefiles to be added to the database; PostgreSQL does.
My question is about primary keys: each of my tables in Access shares a FIPS code (this code allows me to join the demographic data to a shapefile and display the data in ArcMap with the proper coordinates). I have many demographic data tables with this FIPS code. Is it acceptable to set the FIPS code as the primary key for each table? Or does each table need its own individual primary key that is different from the others?
Thanks for the help!
The default PK is “ID”, so there is really no problem with using this default for all tables.
In fact it means for any table or code you write you can now always rest easy as to what the primary key is going to be.
And if you copy or re-name a table, then again you know the ID.
Some people do prefer having the table name as part of the PK, but that does violate data normalization, since you are now attaching an external attribute to that PK column.
However for a FK (foreign key), since the VERY definition of the column is an external dependency, then I tend to include the table name like this:
Customers_ID
And once again due to this naming convention, then you can always “guess” or “know” the name of a FK column (table name + ID).
At the end of the day, there is not really a convention on this issue. However, I recommend that for all tables you create, you allow Access to create that default PK of “id”. This of course assumes your database design is not using natural keys. The debate of natural keys vs. surrogate keys (an autonumber PK “id”) has many pros and cons; you can google “natural keys vs surrogate keys” for endless discussions on the issue.
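Since the question targets PostgreSQL, a minimal sketch of that convention might look like this (table names invented for illustration):

CREATE TABLE Customers (
    ID SERIAL PRIMARY KEY                             -- same default "ID" PK on every table
);
CREATE TABLE Orders (
    ID SERIAL PRIMARY KEY,
    Customers_ID INTEGER REFERENCES Customers (ID)    -- FK named table name + ID
);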

SSIS lookup transformation for very large table

I have two tables, Person and Ownership, with a one-to-many relationship (each person has many ownerships). The primary key is person_id of type GUID. I am developing an SSIS package to load data from both tables to the destination. In the destination tables, I want to add a surrogate key so that in future my SQL joins use the surrogate key, not the GUID (because GUID joins are slow). I did the following tasks:
in the destination database I created the Person table with one additional column, person_id_sk, of type bigint (an identity column).
loaded data into Person.
similarly created the Ownership table with one additional column, person_id_sk, of type bigint.
loaded data into Ownership with a Lookup transformation to fetch each person's surrogate key.
The process is very slow, as there are millions of records in both tables and the package needs to run twice a week.
Is this the only way to insert surrogate key values in a parent-child relationship, or is there a more efficient way?
regards,
An SQL operation would almost certainly be faster than the SSIS one. Load the Person data in SSIS, with the database engine creating the IDENTITY surrogate key. Leave the GUID in so that it goes to the table. Load the Ownership data in SSIS, but don't do anything about the surrogate key at this stage.
Then update the child rows' FK in a SQL operation like this:
UPDATE o
SET o.person_id_sk = p.person_id_sk
FROM Ownership o
INNER JOIN Person p
    ON o.person_id = p.person_id;  -- join on the original GUID key
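One assumption worth stating: with millions of rows this UPDATE only performs well if the GUID join columns are indexed first, for example (index names are hypothetical):

CREATE INDEX IX_Person_person_id ON Person (person_id) INCLUDE (person_id_sk);
CREATE INDEX IX_Ownership_person_id ON Ownership (person_id);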

SSIS import using foreign key data?

I have an old database (OldDB) with a table (let's call it Call) that I'm migrating with SSIS (2008) to a new database (NewDB) with the following setup:
OldDB.Call has a column called Status, which is currently varchar(1) and holds values such as "C", "D", etc.
NewDB now maps all the possible statuses in its own table with a foreign key constraint, so that OldDB.Call.Status is now NewDB.CallStatus.id. Example values in NewDB.Call.StatusID would be 1, 2, 3 and so forth.
NewDB.CallStatus also has a column called Status, which holds the actual nvarchar(1) value of A, B, C, etc.
I'm using SSIS to migrate the data. So far, I know I need to use a Sort transformation for each source and then a Merge Join transformation to map the new NewDB.Call.StatusID to the OldDB.Call.Status value. For whatever reason, it seems to start just fine but ends up grabbing other columns (like a description column, for example) and shoving the wrong kind of data in there. In short, it's not mapping the foreign key like it should.
I've found numerous examples on the web on how to do this (like this), but it seems like I'm missing some key, critical piece of information, because I keep borking it.
In a perfect world, a step-by-step would be great, but a good, concise tutorial or explanation would be useful as well. In short, I need to know how to hook those two tables up, map the value in OldDB to the foreign key in the NewDB, and store that value in NewDB.Call.StatusID.
I would use the Lookup Transformation for this requirement.
Within the Lookup definition, the Connection would point to your NewDB.CallStatus (writing a SELECT that returns only the needed columns is best practice, rather than just choosing the table, since the Lookup caches whatever the query returns). On the Columns pane, map Status to Status, and choose the id column as the lookup column to return; it will populate Call.StatusID.
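For example, the reference query could be as small as this (names taken from the question; the returned id is what ends up in Call.StatusID):

-- Return only what the Lookup needs: the match column and the key to carry forward.
SELECT id, Status
FROM dbo.CallStatus;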
Now your data flow will carry that added column downstream, and you can deliver it (typically using an OLE DB Destination).
The Lookup's default mode is Full Cache, which will be much faster and use much less memory than a Sort and Merge Join solution.