FlatFile Source in SSIS


In my Flat File Source, I want to transfer all of this data to an OLE DB destination, but I want to divide the data into different tables.
Example:
Table one starts at the first %F and ends before the next %F in col[0].
Table two starts at the second %F and has a different header, because its fields differ from those of the first table.
Is this possible in SSIS?

It looks like a single flat file contains the data of two tables. From the image, the two tables also have different structures, so I think it is difficult to load the file in one step.
Maybe these steps will help you.
Step 1. Load all the data, including the column headers, into one table (say, a table named [Table]). In this table, make sure you add an auto-increment column.
Step 2. A query like the one below identifies the row where the second table starts.
Select Top 1 Column0 From [Table] Where Column1 = '%F' Order By Column0 Desc
In your SSIS package, add a variable to store the result of this query.
Step 3. Add a data flow task with [Table] as the source. After the source, add a Conditional Split.
If Column0 < variable value, send the row to [Table1],
else send it to [Table2].
Some further modifications may still be needed.
Added as per comment:
If the file contains more than two tables:
Step 1. Load all data into one table.
Step 2. Add an additional column ([columnX] in the image). Its value should identify the table each row belongs to.
Step 3. Use a Conditional Split again, using columnX to route each row to its corresponding table.
As per request, added edit:
Use logic like this. Run the script in SSMS and look at the result.
Declare @table table (id int identity(1,1), Col1 varchar(5), ColX int)
Insert into @table (Col1) Values
('%F'),('%R'),('%R'),('%R'),('%R'),('%R'),('%R'),
('%F'),('%R'),('%R'),('%R'),('%R'),('%R'),('%R'),
('%F'),('%R'),('%R'),('%R'),('%R')
-- Before the update: ColX is NULL for every row
Select *
from @table A
-- Number the %F rows, pair each one with the next, and stamp that number on all rows in between.
-- max(id) + 1 closes the last range so that the final row is included as well.
Update Y
Set ColX = Z.X
From @table Y Join(
Select A.id FromId,B.id ToId,A.X From
(
Select id,ROW_NUMBER() Over (Order By id) X From (
Select id from @table Where Col1 = '%F'
Union
Select max(id) + 1 id From @table ) Lu ) A,
(
Select id,ROW_NUMBER() Over (Order By id) X From (
Select id from @table Where Col1 = '%F'
Union
Select max(id) + 1 id From @table ) Lu ) B
Where A.X = B.X - 1 ) Z On Y.id >= Z.FromId and Y.id < Z.ToId
-- After the update: ColX holds the table number of each row
Select *
from @table A
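The same numbering idea can be sketched outside the database. This is only an illustration, assuming each row is a tuple whose first field is the %F / %R marker; the function name split_by_marker and the sample rows are made up for the example and are not part of the original package.

```python
from collections import defaultdict


def split_by_marker(rows, marker="%F"):
    """Group rows into tables; a row whose first field equals `marker` starts a new table."""
    tables = defaultdict(list)
    table_no = 0  # plays the role of ColX / columnX
    for row in rows:
        if row[0] == marker:
            table_no += 1  # a header row opens the next table
        tables[table_no].append(row)
    return dict(tables)


# Hypothetical sample: two tables with different headers in one file
rows = [
    ("%F", "id", "name"),
    ("%R", "1", "Ann"),
    ("%R", "2", "Bob"),
    ("%F", "code", "qty", "price"),
    ("%R", "A1", "3", "9.50"),
]

tables = split_by_marker(rows)
for table_no, table_rows in tables.items():
    print(table_no, len(table_rows))
```

Rows that appear before the first marker would land in table 0, which makes stray leading lines easy to spot.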

Related

Table is specified twice, both as a target for 'UPDATE' and as a separate source for data in mysql

I have the query below in MySQL. I want to check whether the branch id and year of the 'finance' rows in branch_master match the branch id and year in manager, and if so, update the status in the manager table for that branch id.
UPDATE manager as m1
SET m1.status = 'Y'
WHERE m1.branch_id IN (
SELECT m2.branch_id FROM manager as m2
WHERE (m2.branch_id,m2.year) IN (
(
SELECT DISTINCT branch_id,year
FROM `branch_master`
WHERE type = 'finance'
)
)
)
but I am getting this error:
Table 'm1' is specified twice, both as a target for 'UPDATE' and as a
separate source for data

This is a typical MySQL restriction and can usually be circumvented by selecting from a derived table, i.e. instead of
FROM manager AS m2
use
FROM (select * from manager) AS m2
The complete statement:
UPDATE manager
SET status = 'Y'
WHERE branch_id IN
(
select branch_id
FROM (select * from manager) AS m2
WHERE (branch_id, year) IN
(
SELECT branch_id, year
FROM branch_master
WHERE type = 'finance'
)
);

The correct answer is in this SO post.
The problem with the accepted answer here, as was already mentioned multiple times, is that it creates a full copy of the whole table. This is far from optimal and uses the most space. The idea is to materialize only the subset of data used for the update, so in your case it would look like this:
UPDATE manager as m1
SET m1.status = 'Y'
WHERE m1.branch_id IN (
SELECT * FROM(
SELECT m2.branch_id FROM manager as m2
WHERE (m2.branch_id,m2.year) IN (
SELECT DISTINCT branch_id,year
FROM `branch_master`
WHERE type = 'finance')
) t
)
Basically, you just wrap your previous source query inside
SELECT * FROM (...) t

Try to use the EXISTS operator:
UPDATE manager as m1
SET m1.status = 'Y'
WHERE EXISTS (SELECT 1
FROM (SELECT m2.branch_id
FROM branch_master AS bm
JOIN manager AS m2
WHERE bm.type = 'finance' AND
bm.branch_id = m2.branch_id AND
bm.year = m2.year) AS t
WHERE t.branch_id = m1.branch_id);
Note: The query uses an additional nesting level, as proposed by @Thorsten, as a means to circumvent the "Table is specified twice" error.

Try:
UPDATE manager as m1
SET m1.status = 'Y'
WHERE m1.branch_id IN (
(SELECT DISTINCT branch_id
FROM branch_master
WHERE type = 'finance'))
AND m1.year IN ((SELECT DISTINCT year
FROM branch_master
WHERE type = 'finance'))

The problem I had with the accepted answer is that it creates a copy of the whole table, which wasn't an option for me. I tried to execute it, but after several hours I had to cancel it.
A very fast way, if you have a huge amount of data, is to create a temporary table that holds only the columns the check needs:
Create the TMP table
CREATE TEMPORARY TABLE tmp_manager
(branch_id bigint not null,
year datetime null,
index (branch_id, year));
Populate the TMP table
insert into tmp_manager (branch_id, year)
select distinct branch_id, year
from manager;
Update with a join (type is a column of branch_master, so it is filtered there)
UPDATE manager as m
inner JOIN tmp_manager as tmp_m on tmp_m.branch_id = m.branch_id and tmp_m.year = m.year
inner JOIN branch_master as bm on bm.branch_id = tmp_m.branch_id and bm.year = tmp_m.year
SET m.status = 'Y'
WHERE bm.type = 'finance';

This is by far the fastest way:
UPDATE manager m
INNER JOIN branch_master b on m.branch_id=b.branch_id AND m.year=b.year
SET m.status='Y'
WHERE b.type='finance'
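Note that in a 1:n relationship a manager row can match several branch_master rows. MySQL's multiple-table UPDATE changes each matching row only once, and here every match would set the same value anyway, so that is no problem. Keep this in mind for something like "SET price=price+5": it is applied once per row, not once per match.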

Maybe not a solution, but some thoughts about why it doesn't work in the first place:
Reading data from a table and also writing data into that same table is somewhat an ill-defined task. In what order should the data be read and written? Should newly written data be considered when reading it back from the same table? MySQL refusing to execute this isn't just because of a limitation, it's because it's not a well-defined task.
The solutions involving SELECT ... FROM (SELECT * FROM table) AS tmp just dump the entire content of a table into a temporary table, which can then be used in any further outer queries, like for example an update query. This forces the order of operations to be: Select everything first into a temporary table and then use that data (instead of the data from the original table) to do the updates.
However if the table involved is large, then this temporary copying is going to be incredibly slow. No indexes will ever speed up SELECT * FROM table.
I might be having a slow day today... but isn't the original query identical to this one, which shouldn't have any problems?
UPDATE manager as m1
SET m1.status = 'Y'
WHERE (m1.branch_id, m1.year) IN (
SELECT DISTINCT branch_id,year
FROM `branch_master`
WHERE type = 'finance'
)
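To compare the two forms, here is a small sketch that runs both against the same made-up rows. It uses Python's sqlite3 module rather than MySQL (SQLite does not raise the "specified twice" error, and it needs version 3.15 or later for row values), so it only illustrates which rows each WHERE clause selects. The sample data is an assumption. On this data the two updates differ: the nested form marks every manager row of a matching branch, while the direct form marks only the rows whose year also matches.

```python
import sqlite3

NESTED = """
UPDATE manager SET status = 'Y'
WHERE branch_id IN (
    SELECT m2.branch_id FROM manager AS m2
    WHERE (m2.branch_id, m2.year) IN (
        SELECT DISTINCT branch_id, year FROM branch_master WHERE type = 'finance'))
"""

DIRECT = """
UPDATE manager SET status = 'Y'
WHERE (branch_id, year) IN (
    SELECT DISTINCT branch_id, year FROM branch_master WHERE type = 'finance')
"""


def run(sql):
    """Build a fresh in-memory database, apply one UPDATE, return the manager rows."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE manager (branch_id INTEGER, year INTEGER, status TEXT);
        CREATE TABLE branch_master (branch_id INTEGER, year INTEGER, type TEXT);
        -- hypothetical rows: branch 1 has two years, only 2015 is 'finance'
        INSERT INTO manager VALUES (1, 2015, 'N'), (1, 2016, 'N'), (2, 2015, 'N');
        INSERT INTO branch_master VALUES (1, 2015, 'finance'), (2, 2015, 'audit');
    """)
    con.execute(sql)
    rows = con.execute(
        "SELECT branch_id, year, status FROM manager ORDER BY branch_id, year"
    ).fetchall()
    con.close()
    return rows


print("nested:", run(NESTED))
print("direct:", run(DIRECT))
```

If each branch has only one year in manager, both forms select the same rows.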

Generic procedure to perform SCD in sql

I have 2 tables in MS SQL Server. I can perform SCD through custom insert/update/delete statements and also through a MERGE statement.
Awesome Merge
I want to know whether there is any generic procedure that could serve the purpose: we just pass it 2 tables and it performs the SCD. Is there any such option in SQL Server 2008?
Thanks

No, there isn't, and there can't be a generic one that works for whatever tables you pass to it, for several reasons:
How do you know which SCD type? (Okay, could be another parameter, but...)
How do you know which column should be historicized and which should be overwritten?
How do you determine which column is the business key, the surrogate key, the expiration column and so on?
To specify the columns in an update statement you must write dynamic SQL, which is possible, but then the points above come into play.
Not a reason why it's impossible, but also consider: for a proper UPSERT one usually works with temporary tables, and the MERGE statement sucks for SCDs except in special cases. That is because you can't use a MERGE statement together with an INSERT/UPDATE, and you would have to disable foreign keys for that, since an UPDATE is implemented as DELETE THEN INSERT (or something like that, I don't remember clearly, but I had those problems when I tried).
I prefer doing it this way (SCD type 2 and SQL Server that is):
Step 1:
IF EXISTS (
SELECT * FROM sys.objects
WHERE name = 'tmpDimSource')
DROP TABLE tmpDimSource;
SELECT
*
INTO tmpDimSource
FROM
(
SELECT whatever
FROM yourTable
) src; /*a derived table needs an alias in SQL Server*/
Step 2:
IF EXISTS (
SELECT * FROM sys.objects
WHERE name = 'tmpDimYourDimensionName')
DROP TABLE tmpDimYourDimensionName;
SELECT * INTO tmpDimYourDimensionName FROM D_yourDimensionName WHERE 1 = 0;
INSERT INTO tmpDimYourDimensionName
(
sid, /*a surrogate id column*/
theColumnsYouNeedInYourDimension,
validFrom
)
SELECT
ISNULL(d.sid, 0),
ds.theColumnsYouNeedInYourDimension,
DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0) /*the current date*/
FROM
tmpDimSource ds
LEFT JOIN D_yourDimensionName d ON ds.whateverId = d.whateverId
;
The ISNULL(d.sid, 0) in step 2 is important. It returns the surrogate id of your dimension, if an entry already exists, otherwise 0.
Step 3:
UPDATE D_yourDimensionName SET
validTo = DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 1, 0) /*yesterday*/
FROM
D_yourDimensionName d
INNER JOIN tmpDimYourDimensionName t ON d.sid = t.sid
WHERE t.sid <> 0 AND
(
d.theColumnWhichHasChangedAndIsImportant <> t.theColumnWhichHasChangedAndIsImportant OR
d.anotherColumn <> t.anotherColumn
)
;
In Step 3 you mark the existing entry as not valid anymore and keep a history of it. The valid entry you get with WHERE validTo IS NULL.
You can also add another UPDATE to overwrite any other column with the new value if needed.
Step 4:
INSERT INTO D_yourDimensionName
SELECT * FROM tmpDimYourDimensionName
WHERE sid = 0;
And that's it.
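A compact way to see the type 2 flow end to end is the sketch below. It is a simplified variation, not the exact steps above: it runs on SQLite through Python's sqlite3 module, skips the staging copy with sid = 0, uses a single load date for both valid_to and valid_from, and also inserts a new version for changed rows. The table names dim and src, the columns business_id and city, and the function apply_scd2 are all made up for the example.

```python
import sqlite3


def apply_scd2(con, load_date):
    """Expire changed current rows, then insert new versions for new or changed keys."""
    # Close the current version when the tracked attribute differs from the source
    con.execute("""
        UPDATE dim SET valid_to = ?
        WHERE valid_to IS NULL
          AND EXISTS (SELECT 1 FROM src
                      WHERE src.business_id = dim.business_id
                        AND src.city <> dim.city)
    """, (load_date,))
    # Insert a current version for every source key that has no open row left
    con.execute("""
        INSERT INTO dim (business_id, city, valid_from, valid_to)
        SELECT s.business_id, s.city, ?, NULL
        FROM src AS s
        WHERE NOT EXISTS (SELECT 1 FROM dim AS d
                          WHERE d.business_id = s.business_id
                            AND d.valid_to IS NULL)
    """, (load_date,))


con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim (sid INTEGER PRIMARY KEY AUTOINCREMENT,
                      business_id INTEGER, city TEXT,
                      valid_from TEXT, valid_to TEXT);
    CREATE TABLE src (business_id INTEGER, city TEXT);
    INSERT INTO dim (business_id, city, valid_from, valid_to) VALUES
        (1, 'Berlin', '2020-01-01', NULL),
        (2, 'Paris',  '2020-01-01', NULL);
    -- key 1 changed, key 2 is unchanged, key 3 is new
    INSERT INTO src VALUES (1, 'Munich'), (2, 'Paris'), (3, 'Rome');
""")

apply_scd2(con, "2024-06-01")
history = con.execute(
    "SELECT business_id, city, valid_from, valid_to FROM dim ORDER BY business_id, sid"
).fetchall()
for row in history:
    print(row)
```

As in the steps above, the current version of each key is the row with valid_to IS NULL.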

MySQL Insert from another table with 2 option WHERE statement

I have done my research but cannot figure out how to do this. It is super simple to insert from another table, but I want to include WHERE conditions.
I want to insert the values of a single column, column_Q from table A, into table B's column_Q, where table A's column_W = '100' and the column_Q value does not already exist in table B.
I tried:
INSERT INTO B (column_Q) select DISTINCT(column_Q)
from A WHERE column_W = 100 AND b.column_Q<>a.column_Q;
Where am I going wrong?
PS. Both tables already contain values. No field is NULL.

INSERT
INTO b (q)
SELECT DISTINCT q
FROM a
WHERE a.w = 100
AND a.q NOT IN
(
SELECT q
FROM b
)
If your b.q has a UNIQUE constraint defined on it, then just use:
INSERT
IGNORE
INTO b (q)
SELECT q
FROM a
WHERE w = 100

You cannot refer to the left side of the "assignment", because there is no current row from B to compare to (that would be the one you are inserting). You need to check whether a matching row is already present in B, like this:
INSERT INTO B (column_Q)
SELECT DISTINCT(A.column_Q)
FROM A
WHERE A.column_W = 100
AND NOT EXISTS (
SELECT *
FROM B
WHERE B.column_Q = A.column_Q
);
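A quick way to check the NOT EXISTS form is to run it against a few made-up rows. The sketch below uses Python's sqlite3 module instead of MySQL, which is an assumption made only so the example is self-contained; the statement itself is the one above, and the sample values are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE A (column_Q TEXT, column_W INTEGER);
    CREATE TABLE B (column_Q TEXT);
    -- 'x' appears twice with W = 100, 'y' is already in B, 'z' has another W
    INSERT INTO A VALUES ('x', 100), ('x', 100), ('y', 100), ('z', 50);
    INSERT INTO B VALUES ('y');
""")

con.execute("""
    INSERT INTO B (column_Q)
    SELECT DISTINCT A.column_Q
    FROM A
    WHERE A.column_W = 100
      AND NOT EXISTS (SELECT 1 FROM B WHERE B.column_Q = A.column_Q)
""")

b_values = sorted(row[0] for row in con.execute("SELECT column_Q FROM B"))
print(b_values)
```

Only 'x' is added: DISTINCT collapses the duplicate, 'y' is filtered out by NOT EXISTS, and 'z' fails the column_W test.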

How can you find ID gaps in a MySQL recordset?

The issue here is related to another question I had...
I have millions of records, and the ID of each record is auto-incremented. Unfortunately, a generated ID is sometimes thrown away, so there are many gaps between IDs.
I want to find the gaps, and re-use the ids that were abandoned.
What's an efficient way to do this in MySQL?

First of all, what advantage are you trying to get by reusing the skipped values? An ordinary INT UNSIGNED will let you count up to 4,294,967,295. With "millions of records" your database would have to grow a thousand times over before running out of valid IDs. (And then using a BIGINT UNSIGNED will bump you up to 18,446,744,073,709,551,615 values.)
Trying to recycle values MySQL has skipped is likely to use up a lot of your time trying to compensate for something that really doesn't bother MySQL in the first place.
With that said, you can find missing IDs with something like:
SELECT id + 1
FROM the_table
WHERE NOT EXISTS (SELECT 1 FROM the_table t2 WHERE t2.id = the_table.id + 1);
This will find only the first missing number in each sequence (e.g., if you have {1, 2, 3, 8, 10} it will find {4, 9}, plus 11, the value right after the current maximum), but it's likely to be efficient, and of course once you've filled in an ID you can always run it again.
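To see what the query returns for the example set, it can be run against an in-memory SQLite database from Python. Using SQLite instead of MySQL is an assumption made for the sake of a self-contained sketch; the query text is unchanged apart from an ORDER BY for stable output.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE the_table (id INTEGER PRIMARY KEY)")
con.executemany(
    "INSERT INTO the_table (id) VALUES (?)",
    [(i,) for i in (1, 2, 3, 8, 10)],
)

gap_starts = [row[0] for row in con.execute("""
    SELECT id + 1
    FROM the_table
    WHERE NOT EXISTS (SELECT 1 FROM the_table t2 WHERE t2.id = the_table.id + 1)
    ORDER BY 1
""")]
print(gap_starts)
```

The result contains the first missing ID of each gap and also the value after the current maximum, since nothing follows the last row either.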

The following will return a row for each gap in the integer field "n" in mytab:
/* cs will contain 1 row for each contiguous sequence of integers in mytab.n
and will have the start of that chain.
ce will contain the end of that chain */
create temporary table cs (rn int auto_increment primary key, n int); /* "row" is a reserved word in MySQL 8.0, hence rn */
create temporary table ce like cs;
insert into cs (n) select n from mytab where n-1 not in (select n from mytab) order by n;
insert into ce (n) select n from mytab where n+1 not in (select n from mytab) order by n;
select ce.n + 1 as bgap, cs.n - 1 as egap
from cs, ce where cs.rn = ce.rn + 1;
If you want the contiguous chains instead of the gaps, the final select should be:
select cs.n as bchain, ce.n as echain from cs,ce where cs.rn=ce.rn;
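If the IDs can be pulled into application code, the same gap ranges can be computed in one pass over the sorted values. This is a minimal sketch and not a replacement for the SQL above; the function name find_gaps is made up for the example.

```python
def find_gaps(ids):
    """Return (first_missing, last_missing) for every gap between consecutive ids."""
    ordered = sorted(set(ids))
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt - prev > 1:
            # everything strictly between prev and nxt is missing
            gaps.append((prev + 1, nxt - 1))
    return gaps


print(find_gaps([1, 2, 3, 8, 10]))
```

Each tuple corresponds to one (bgap, egap) row of the temporary-table query.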

This variant also reports a gap that starts at 1 when the first ID is missing:
(SELECT
1 AS gap_start,
MIN(e.id) - 1 AS gap_end
FROM
factura_entrada e
WHERE
NOT EXISTS(
SELECT
1
FROM
factura_entrada
WHERE
id = 1
)
HAVING
MIN(e.id) IS NOT NULL /* no row at all when id 1 exists */
LIMIT 1)
UNION
SELECT
a.id + 1 AS gap_start,
MIN(b.id)- 1 AS gap_end
FROM
factura_entrada AS a,
factura_entrada AS b
WHERE
a.id < b.id
GROUP BY
a.id
HAVING
gap_start < MIN(b.id);

If you are using MariaDB (with the Sequence storage engine), you have a faster option:
SELECT * FROM seq_1_to_50000 where seq not in (select col from table);
docs: https://mariadb.com/kb/en/mariadb/sequence/

Select all from table except the records from a file

I have a table A with one column named a, and a file "test.txt" contains:
111111AAAA
222222BBBB
3333DDDDDD
.....
The records in test.txt have the same type as the "a" column.
How do I select everything from A except the records in "test.txt"?
Update:
I tried 3 ways and the results are not equal, which is strange.
-- 7073 records, using NOT IN
SELECT * from mt_users WHERE TERMINAL_NUMBER_1 NOT IN (SELECT TERMINAL_NUMBER FROM A);
-- 7075 records, using NOT EXISTS
SELECT * from mt_users WHERE NOT EXISTS (SELECT 1 FROM A WHERE A.TERMINAL_NUMBER = mt_users.TERMINAL_NUMBER_1);
-- 7075 records, using LEFT JOIN
SELECT * FROM mt_users m LEFT JOIN A a ON m.TERMINAL_NUMBER_1 = a.TERMINAL_NUMBER WHERE a.TERMINAL_NUMBER IS NULL;
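One possible explanation for the different counts is NULL handling: a row whose TERMINAL_NUMBER_1 is NULL never satisfies NOT IN, but it does pass NOT EXISTS and the LEFT JOIN ... IS NULL test. Whether the real table contains such rows is an assumption; the sketch below only reproduces the effect with made-up data, using Python's sqlite3 module.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE mt_users (TERMINAL_NUMBER_1 TEXT);
    CREATE TABLE A (TERMINAL_NUMBER TEXT);
    -- two users without a terminal number (NULL)
    INSERT INTO mt_users VALUES ('111'), ('222'), (NULL), (NULL);
    INSERT INTO A VALUES ('111');
""")


def count(sql):
    return con.execute(sql).fetchone()[0]


not_in = count("""
    SELECT COUNT(*) FROM mt_users
    WHERE TERMINAL_NUMBER_1 NOT IN (SELECT TERMINAL_NUMBER FROM A)
""")
not_exists = count("""
    SELECT COUNT(*) FROM mt_users
    WHERE NOT EXISTS (SELECT 1 FROM A
                      WHERE A.TERMINAL_NUMBER = mt_users.TERMINAL_NUMBER_1)
""")
left_join = count("""
    SELECT COUNT(*) FROM mt_users m
    LEFT JOIN A a ON m.TERMINAL_NUMBER_1 = a.TERMINAL_NUMBER
    WHERE a.TERMINAL_NUMBER IS NULL
""")
print(not_in, not_exists, left_join)
```

Here NOT IN returns two rows fewer than the other two forms, one per NULL value, which matches the pattern of 7073 versus 7075.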

First, put all records from the file into a new table (newTable) and make sure there are no extra spaces at the beginning or end of any field.
select a from tableA t where not exists(select 1 from newTable n where n.a = t.a)

Step 1. Put the records from test.txt into a different table.
Step 2.
SELECT a from tableA WHERE a NOT IN (SELECT a FROM newTable)

Doing what aF wrote would be my first answer too. If you can't or don't want to do that, try "NOT IN" like this:
SELECT a FROM A WHERE a NOT IN(...)
You have to generate the content of the () in the code that builds your query.
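Generating the list in code could look like the sketch below. It is an illustration under assumptions: it uses Python's sqlite3 module, reads the records from an in-memory stand-in for test.txt, and the helper name select_except is made up. The values are passed as bound parameters instead of being pasted into the SQL text.

```python
import io
import sqlite3


def select_except(con, values):
    """Return all values of A.a that are not in `values`."""
    if not values:
        return [row[0] for row in con.execute("SELECT a FROM A ORDER BY a")]
    placeholders = ", ".join("?" for _ in values)  # one ? per record
    sql = f"SELECT a FROM A WHERE a NOT IN ({placeholders}) ORDER BY a"
    return [row[0] for row in con.execute(sql, values)]


# Stand-in for open("test.txt"); note the trailing space on the second line
file_like = io.StringIO("111111AAAA\n222222BBBB \n3333DDDDDD\n")
values = [line.strip() for line in file_like if line.strip()]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE A (a TEXT)")
con.executemany(
    "INSERT INTO A VALUES (?)",
    [("111111AAAA",), ("444444EEEE",), ("3333DDDDDD",), ("555555FFFF",)],
)

remaining = select_except(con, values)
print(remaining)
```

Databases limit the number of bound parameters per statement, so for a large file loading the records into a table first is probably the more robust route.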
