Using SSIS to import data from Excel into multiple tables

I have one Excel sheet whose data has to be saved in 4 tables. I have to create unique IDs for each table. I also have to insert some data into table 1, and the unique ID created there will then be used for inserting data into the second table (referential integrity). Moreover, one table will always get records inserted, but for the other 3 tables, if some data already exists then it has to be updated and not inserted. I'm new to SSIS, so please guide me on how to proceed further in SSIS.

loads of requirements :)
First, here is an example of a package that loads an Excel sheet into a SQL database.
You can easily follow it to build your package.
Differences:
You say you need to insert the same data into 4 tables, so between your Excel source and your destination you will add a Multicast component, and then instead of 1 destination you will have 4. The Multicast will create 4 copies of your data, so you can insert into your 4 tables.
The IDs may be a problem: since the 4 destinations execute separately, you can't get the ID inserted into the first table to update the second. I suggest you do it with T-SQL in an "Execute SQL Task" after everything is imported.
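For example, a rough sketch of what that Execute SQL Task might run, assuming hypothetical names: Table1 generates an identity column Table1Id, and Table2 also received a natural key from the spreadsheet (CustomerCode here) that can be joined back to Table1.

-- Hypothetical example: after the multicast has loaded all the tables,
-- resolve the foreign key in Table2 by joining back to Table1 on a
-- natural key that both copies received from the spreadsheet.
UPDATE t2
SET    t2.Table1Id = t1.Table1Id
FROM   dbo.Table2 AS t2
JOIN   dbo.Table1 AS t1
       ON t1.CustomerCode = t2.CustomerCode;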
If that is not possible, you will need 4 separate data flows, where in each one you do the insert reading from your Excel file and join with the result of the previous insert using a Lookup component.

Import it into a temp table on SQL Server. Then you will be able to write queries that read from the temp table and write to the multiple tables.
Hope this solves your problem as per your requirement.
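A minimal sketch of that approach, assuming a staging table dbo.ExcelStaging filled by the data flow and hypothetical target tables and columns; the MERGE handles the "update if the row already exists" requirement from the question.

-- Step 1 (data flow): load the Excel sheet into dbo.ExcelStaging.
-- Step 2 (Execute SQL Task): distribute the staged rows to the targets.

-- Table that always receives inserts:
INSERT INTO dbo.Table1 (CustomerCode, CustomerName)
SELECT DISTINCT s.CustomerCode, s.CustomerName
FROM dbo.ExcelStaging AS s;

-- Tables that must be updated when the row already exists:
MERGE dbo.Table2 AS tgt
USING (SELECT DISTINCT CustomerCode, Address, City FROM dbo.ExcelStaging) AS src
      ON tgt.CustomerCode = src.CustomerCode
WHEN MATCHED THEN
    UPDATE SET tgt.Address = src.Address,
               tgt.City    = src.City
WHEN NOT MATCHED THEN
    INSERT (CustomerCode, Address, City)
    VALUES (src.CustomerCode, src.Address, src.City);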

Related

SSIS Delete records using one table that exists in a different table from different databases (SSDT)

I'm using a data flow task and 2 OLE DB sources. The 2 sources bring in data from tables on 2 different databases on the same server. The 2 tables can be mapped by IDs. All of the IDs from the second table (closedstops) exist in the first table (stops). I need to remove all the closed stops by ID from the first table. Afterwards I need to export the first table out of the database into a text file.
Do I need to use a merge join before deleting, or do I need to use an OLE DB command to delete records (see attached screenshot)? I have looked at many questions and answers on Stack Overflow as well as tutorials, and none of them quite answer my question. Any help is greatly appreciated. Thank you.
Closed stops is the driver table. Leave it be.
Instead of an OLE DB Source table for "stops" change that to a Lookup Component. You are only interested in rows that match.
And then you can use your OLE DB Command to fire off single delete statements.
My preference for performance and traceability would be to insert all the "to be deleted" IDs into a table on the Stops database. When the Data Flow has completed, an Execute SQL Task would then fire up to perform the deletes in a set-based operation.
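A sketch of that set-based step, assuming the data flow writes the matched IDs into a hypothetical dbo.StopsToDelete table on the Stops database:

-- Execute SQL Task, run after the data flow completes
-- (table and column names are assumptions).
DELETE s
FROM   dbo.Stops AS s
JOIN   dbo.StopsToDelete AS d
       ON d.StopId = s.StopId;

-- Clear the staging table so the package can be rerun.
TRUNCATE TABLE dbo.StopsToDelete;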

SSIS Package: How to update data in more than two tables using a single SSIS package

I want to update three tables from a CSV file in a single SSIS package.
I am done with updating a single table by comparing the CSV file and the table; I have attached the screenshot and it is working fine. But when I try to update more than 3 tables in a single package, I run into problems updating the records.
So please share the detailed steps to update multiple (more than two) tables.
Please be clear about how you want to update more than three tables.
Or do you want to insert and update data based on conditions?

SSIS Script component - Reference data validation

I am in the process of extending an SSIS package, which takes in data from a text file, 600,000 lines of data or so, modifies some of the values in each line based on a set of business rules and persists the data to a database, database B. I am adding in some reference data validation, which needs to be performed on each row before writing the data to database B. The reference data is stored in another database, database A.
The reference data in database A is stored in seven different tables; each table only has 4 or 5 columns of type varchar. Six of the tables contain < 1 million records and the seventh has 10+ million rows. I don't want to keep hammering the database for each line in the file, and I just want to get some feedback on my proposed approach and ideas on how best to manage the largest table.
The reference data checks will need to be performed in the script component, which acts as a source in the data flow. It has an ADO.NET connection. On pre-execute, I am going to retrieve the reference data from database A (the tables which have < 1 million rows) using the ADO.NET connection, loop through them all with a SqlDataReader, convert them to .NET objects (one per table) and add them to a dictionary.
As I process each line in the file, I can use the dictionaries to perform the reference data validation. Is this a good approach? Anybody got any ideas on how best to manage the largest table?

Excel to Multiple Tables in One Database Output - PDI

I'm using Pentaho Data Integration for my ETL process...
I have multiple Excel files that I need to merge and upload into one database. However, I cannot distribute the fields into their corresponding tables in the database. I can only send them to one table at a time. Is there any other way to do this? How can I have multiple target tables?
P.S. I'm using MySQL Workbench for the database.
Thank you for your help!
You can connect multiple Table output steps to your last processing step and set it to copy all rows to every target step. Connect the Table outputs (or Insert/Update steps, etc.) like in the image, then right-click the step where the stream splits and select Copy Data to Next Steps. In each Table output you obviously only specify the columns that apply to that table.

How to use LOAD DATA INFILE to insert into multiple tables?

I use a Python program which inserts many new entries into the database; these new entries are spread across multiple tables.
I'm using LOAD DATA INFILE to load the file, but this solution only works for one table, and I don't want to do this multiple times.
I found http://forge.mysql.com/worklog/task.php?id=875, but I'm not quite sure if it's already implemented or not.
I am doing exactly what you are trying to do as follows:
Step 1: Create a temp table (holding all the fields of the import file)
Step 2: LOAD DATA LOCAL INFILE -> into the temp table
Step 3: INSERT INTO Table1 ( fieldlist ) SELECT ( matching fieldlist ) FROM TempTable ... include JOINs, WHERE, and ON DUPLICATE KEY UPDATE as necessary
Step 4: Repeat step 3 with the second table insert query and so on.
Using this method I am currently importing each of my 22MB data files and parsing them out to multiple tables (6 tables, including 2 audit/changes tables).
Without knowing your table structure and data file structure it is difficult to give you a more detailed explanation, but I hope this helps get you started.
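For illustration, here is a rough MySQL sketch of those steps, with the file layout, table names, and keys all assumed (customers.customer_code is taken to be a unique key):

-- Step 1: staging table mirroring the columns of the import file.
CREATE TEMPORARY TABLE import_staging (
    customer_code VARCHAR(50),
    customer_name VARCHAR(100),
    order_date    DATE,
    amount        DECIMAL(10,2)
);

-- Step 2: bulk-load the file into the staging table.
LOAD DATA LOCAL INFILE '/path/to/data.csv'
INTO TABLE import_staging
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;

-- Step 3: distribute rows to the first target table,
-- updating existing rows on duplicate key.
INSERT INTO customers (customer_code, customer_name)
SELECT DISTINCT customer_code, customer_name
FROM import_staging
ON DUPLICATE KEY UPDATE customer_name = VALUES(customer_name);

-- Step 4: repeat for the remaining tables, joining back as needed.
INSERT INTO orders (customer_id, order_date, amount)
SELECT c.customer_id, s.order_date, s.amount
FROM import_staging AS s
JOIN customers AS c ON c.customer_code = s.customer_code;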
Loading data from a local file to insert new data across multiple tables isn't yet supported (v 5.1).
I don't think LOAD DATA can do that, but why not duplicate the table after importing?
See
Duplicating table in MYSQL without copying one row at a time
Or, if you can go outside MySQL, Easiest way to copy a MySQL database?