Generate 10 queries to run in SSIS - sql-server-2008

I have a driver table, date_driver_table, that contains 10 dates: Jan 2014, Feb 2014, ..., Nov 2014.
I need to run a query
select * from records_Jan2014 where recdate='Jan 2014'
This is query 1. After it runs and puts the result set into a SQL Server table, query 2,
select * from records_Feb2014 where recdate='Feb 2014'
will then run and insert into the same SQL Server table, then query 3, and so forth until there are no dates left in the driver table.
So in SSIS I have an Execute SQL Task with a full result set that puts all the dates from the date driver table into an Object-typed variable called date, which then feeds a Foreach Loop with a String-typed variable called single date. Inside the loop is a data flow with a source and a destination of a SQL Server table. The problem is how to set up the source so it runs query 1 and puts the results in the table, then runs query 2, and so on.
I was thinking of maybe creating 10 files with the SQL and then using the OLE DB source with a file as the SQL to run, but I'm sure there is a way to do this with the Foreach Loop. Can anyone point me to how to do this? The question is how to set up the Foreach Loop so it runs query 1 and puts the results into the table, then runs query 2 and puts those into the table, and so on until all the records are done.
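For reference, a minimal sketch of the driver query the full-result-set Execute SQL Task could run (the column name recdate in the driver table is an assumption, since it is not named above):

SELECT recdate
FROM date_driver_table
ORDER BY recdate;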

Used a SQL command expression pointing to a variable on the ADO.NET source. The variable was then fed from an Execute SQL Task which produced the list to process.
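A minimal sketch of the kind of expression that could build the per-iteration statement, assuming the Foreach Loop's string variable is named User::SingleDate and holds values like 'Jan 2014' (the variable name is an assumption):

"SELECT * FROM records_" + REPLACE(@[User::SingleDate], " ", "") + " WHERE recdate = '" + @[User::SingleDate] + "'"

For 'Jan 2014' this evaluates to the query 1 text shown above. The expression can be set on a second string variable (EvaluateAsExpression = True) that the source's SQL command reads, or on the SqlCommand property of the ADO.NET source exposed through the Data Flow Task's Expressions page, so each pass of the loop runs the next month's query.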

Related

Data Flow Task - Set two User Date Variables as Parameters

I am creating an SSIS package that will run each month. This particular stored procedure needs to run for one week at a time since the data returned is very large.
I have set up my stored procedure with two parameters: @StartDT and @EndDT. I created two SSIS variables: StartDT and Wk1EndDT (I'll create the other start and end dates for the weeks once I get this one working).
StartDT has this expression:
(DT_DATE)((DT_WSTR, 4)YEAR(DATEADD("mm", -1, GETDATE())) + "-" +RIGHT("0" + (DT_WSTR,2)MONTH(DATEADD("mm", -1, GETDATE())),2)+"-01")
Wk1EndDT has this expression:
DATEADD("DD",7, #[User::StartDT])
I'm using a DataFlow task with a SQL command text of:
EXECUTE dbo.uspUploadWk1 ?,?
When I go to preview the results, I receive the following error message:
There was an error displaying the preview.
No value given for one or more required parameters. (Microsoft SQL Server Native Client 11.0)
I have the parameters set like this:
I am not sure why this isn't working. I've searched all over and have not found an answer. I am using Visual Studio 2015.
Assuming an OLE DB connection manager, the Mappings tab should be using a zero-based ordinal system in the Parameters column. Yes, it defaults to naming them Parameter0, Parameter1, etc., but for an OLE DB connection manager you use the ordinal position of the question marks, ?, starting at zero.
For ODBC it becomes 1-based counting, but it still uses ? as the parameter placeholder.
ADO.NET uses named parameters, so we'd match EXECUTE dbo.uspUploadWk1 @Parameter0, @Parameter1, but the ADO.NET source component doesn't support parameterization.
See the reference on parameters and mapping for the Execute SQL Task; the syntax remains the same for Data Flow Task components.
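A minimal sketch of how the OLE DB mapping lines up here, assuming the two SSIS variables described above (the stored procedure parameter names come from the question):

-- SQL command text in the source:
EXECUTE dbo.uspUploadWk1 ?, ?
-- Parameter mapping, zero-based, following the order of the question marks:
--   first ?  -> ordinal 0 (Parameter0) -> User::StartDT   -> @StartDT
--   second ? -> ordinal 1 (Parameter1) -> User::Wk1EndDT  -> @EndDT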

Converting MS Access queries to MariaDB

I'm struggling to create MariaDB SQL commands that will produce the same output as the three queries below, which I'm currently using with an MS Access database. My Excel VBA script calls only the third query (Hours to Heat Elecric WH), with the date value substituted dynamically. For the purposes of this question that command would look like this:
SELECT ElectricWH_Data.*
FROM ElectricWH_Data
WHERE (ElectricWH_Data.Date_Reading) > #06/01/19#;
This is an abstract of the resulting table:
Date_Time      Date       Time    Max WH Out   Min WH Out
6/27/18 0:52   06/27/18   00.52   60.38        43.56
6/28/18 0:52   06/28/18   00.52   60.50        44.44
6/29/18 0:32   06/29/18   00.32   60.13        45.38
6/30/18 0:32   06/30/18   00.32   60.19        47.13
7/1/18 0:12    07/01/18   00.12   60.50        47.56
7/2/18 0:42    07/02/18   00.42   60.44        44.94
7/3/18 0:42    07/03/18   00.42   60.38        46.88
I would like to duplicate this process but using a MariaDB database and SQL commands. Can you assist?
By the way, I am aware that dates and date formats are handled differently in MariaDB.
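For example, a minimal sketch of what the Excel-facing command above might look like against MariaDB, assuming the same table and column names; MariaDB takes a quoted ISO date literal in place of Access's #06/01/19# syntax:

SELECT ElectricWH_Data.*
FROM ElectricWH_Data
WHERE ElectricWH_Data.Date_Reading > '2019-06-01';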
Below are the SQL queries from the MS Access database.
GetTempDataByDay:
SELECT
Min(PiSolarWH.Electric_WH_Out) AS MinOfElectric_WH_Out,
Max(PiSolarWH.Electric_WH_Out) AS MaxOfElectric_WH_Out,
Format(PiSolarWH.Date_Reading,'mm/dd/yy') AS TheDay
FROM
PiSolarWH
GROUP BY
Format(PiSolarWH.Date_Reading,'mm/dd/yy');
ElectricWHData:
SELECT
PiSolarWH.Date_Reading,
Format([PiSolarWH.Date_Reading],'mm/dd/yy') AS TheDate,
Format([Date_Reading],'hh.mm') AS DayTime,
GetTempDataByDay.MaxOfElectric_WH_Out AS Expr1,
GetTempDataByDay.MinOfElectric_WH_Out AS Expr2
FROM
GetTempDataByDay, PiSolarWH
WHERE
Format([PiSolarWH.Date_Reading],'mm/dd/yy') = [GetTempDataByDay].[TheDay]
AND GetTempDataByDay.MaxOfElectric_WH_Out = [PiSolarWH].[Electric_WH_Out];
Hours to Heat Elecric WH:
SELECT
PiSolarWH.Date_Reading,
Format([Date_Reading],'hh.mm') AS DayTime,
GetTempDataByDay.MaxOfElectric_WH_Out,
PiSolarWH.Electric_WH_Out,
Format([PiSolarWH.Date_Reading],'mm/dd/yy') AS Expr1
FROM
GetTempDataByDay,
PiSolarWH
WHERE
GetTempDataByDay.MaxOfElectric_WH_Out = [PiSolarWH].[Electric_WH_Out]
AND Format([PiSolarWH.Date_Reading],'mm/dd/yy') = [GetTempDataByDay].[TheDay];
OK, I figured it out! MariaDB's stored VIEWs work like MS Access stored queries. I was able to add the three MS Access queries (of course with modified syntax) to the database as stored VIEWs. They work exactly like those in MS Access. Here is one example:
CREATE VIEW GetTempDataByDay AS
SELECT
date_reading,
Min(temps.Electric_WH_Out) AS MinOfElectric_WH_Out,
Max(temps.Electric_WH_Out) AS MaxOfElectric_WH_Out,
date(temps.Date_Reading) AS TheDay
FROM
temps
GROUP BY
date(temps.Date_Reading);
This view is then used in the other two VIEWs which I created to duplicate the MS Access stored queries.
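For completeness, a minimal sketch of what one of those other two views might look like, assuming the underlying table is named temps as in the view above and that DATE_FORMAT stands in for Access's Format(); the view, column, and alias names follow the Access queries and are not confirmed by the poster:

CREATE VIEW ElectricWHData AS
SELECT
    t.Date_Reading,
    DATE_FORMAT(t.Date_Reading, '%m/%d/%y') AS TheDate,
    DATE_FORMAT(t.Date_Reading, '%H.%i') AS DayTime,
    g.MaxOfElectric_WH_Out AS Expr1,
    g.MinOfElectric_WH_Out AS Expr2
FROM GetTempDataByDay AS g
JOIN temps AS t
    ON DATE(t.Date_Reading) = g.TheDay
   AND t.Electric_WH_Out = g.MaxOfElectric_WH_Out;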
Thanks for your time....RDK

How to compare two tables' row counts: if the counts match, OK; if not, restart the SSIS package

I have made an SSIS package in which I built a data flow for incremental data. The source and destination server IPs are different. Below you can find the flow diagrams of my package: the control flow diagram and the data flow diagram.
The package is working fine.
In the Execute SQL Task: it controls the log table and starts the incremental load. The query I used is:
insert into audit_log (
Packagename,
process_date,
start_datetime,
end_datetime,
Record_processed,
status
)values('CRM-TO-TRANSORGDB',null,GETDATE(),null,null,null);
select MAX(ID) as ID,MAX(process_date) as proc_date from audit_log where Packagename ='CRM-TO-TRANSORGDB' ;
I store the ID and proc_date in variables.
In Execute SQL Task 1: it just updates the log table.
UPDATE audit_log
SET
process_date=?,
end_datetime = GETDATE(),
status='SUCCESS',
record_processed=?
WHERE (packagename = 'CRM-TO-TRANSORGDB') AND ID=? ;
This is the query we have used to update the log table.
In the data flow I am simply fetching all the records and putting them into the destination table.
That is all I have done.
But my questions are:
1) How do I compare the total row counts of the source table and the destination table in the SSIS package?
2) If they don't match, how can the package restart my task automatically?
@Thomas, as per your instructions I have done the following:
1) I made an Execute SQL Task each for the source and the destination counts.
2) I added the Execute Package Task and the condition for the counts not matching, with the precedence constraint expression row_count_src != row_count_dest.
In Source_table_count I used the query below:
select count(SubOrderID) as row_count_src from fact_suborder_journey
WHERE Suborderdate between '2016-06-01' and GETDATE()-1 ;
In dest_table_count I used the query below:
select count(SubOrderID) as row_count_dest from fact_suborder_journey
WHERE Suborderdate between '2016-06-01' and GETDATE()-1 ;
I added the two variables as Int64 in this SSIS package and mapped them in the result set; the picture shows what I have done.
But after doing all this I am getting this error:
[Execute SQL Task] Error: An error occurred while assigning a value to variable "row_count_src": "The type of the value being assigned to variable "User::row_count_src" differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object.".
I haven't tested this completely, but you might be able to do something like this. It creates a loop of your package and will keep executing as long as your count variables differ from each other.
What have I done?
First I have a Data Flow Task which moves data from source to destination.
Then I have an Execute SQL Task which counts all rows from TableA (the source table) and maps the result to variable count1.
Then I have an Execute SQL Task which counts all rows from TableB (the destination table) and maps the result to variable count2.
Then I create an Execute Package Task where I reference the package itself, and I make a precedence constraint with an expression saying count1 != count2.
Because if they are different you want to restart the task; if they are equal, the final Execute Package Task will never be executed.
Hope that is something like what you were after?
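A minimal sketch of that precedence constraint expression, assuming the two variables are named count1 and count2 as above (Evaluation operation set to Expression):

@[User::count1] != @[User::count2]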
If I understand your challenge correctly...
1) In the data flow task, use a Row Count transformation between the source and destination to capture the rows written to the destination. This count is stored in a variable.
2) In the control flow, get the max row count available from the log table and store that in a variable (see the sketch after this list).
3) Create an Execute Package Task that executes this same package, and put a precedence constraint before it that checks whether the variable from step 1 differs from the variable from step 2.
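A minimal sketch of what that step-2 query against the log table shown earlier might look like, assuming Record_processed holds the rows written per run (an assumption, not confirmed by the poster):

SELECT MAX(Record_processed) AS row_count_log
FROM audit_log
WHERE Packagename = 'CRM-TO-TRANSORGDB';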

Find out Total Number of Rows affected by SQL Command Variable in Data Flow Task

I have a SQL command from variable (in general it is a SELECT statement) as the source in a Data Flow Task.
The destination is a .csv file.
Problem: even when no rows are returned by the SQL command variable, the .csv file is still generated, just without records. I don't want to generate the file if the SELECT statement (from the SQL command variable) returns no records.
Please advise me.
Simple procedure:
You could count the rows with a query before the export, using an Execute SQL Task; if the number of rows is greater than 0, then proceed with the export.
The following is a possible solution:
use a query like SELECT COUNT(*) AS MYCOUNT FROM... (as sketched after this list)
use a package variable (myVariable, associated with MYCOUNT) to contain the number of rows
set Result Set = Single Row in SQL Task Editor
map the variable in tab Result Set in SQL Task Editor (MYCOUNT - myVariable)
use two arrows from the Execute SQL Task; on each arrow choose Evaluation operation: Expression, with Expression myVariable > 0 on the first arrow and myVariable == 0 on the second, and choose Logical OR; in this way you have a bifurcation!
connect the export to the arrow with myVariable > 0
connect the other arrow to another possible task, for example it can warn you that there are no rows via email
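A minimal sketch of the counting query, assuming a hypothetical table dbo.SourceTable standing in for whatever the SQL command variable actually selects from:

SELECT COUNT(*) AS MYCOUNT
FROM dbo.SourceTable;

With Result Set = Single row on the Execute SQL Task and MYCOUNT mapped to myVariable on the Result Set tab, the two precedence constraint expressions above decide which branch runs.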
For counting rows you can also use the Row Count transformation (present in the latest SSIS versions); it counts rows as they pass through a data flow and stores the final count in a variable.
I hope it helps.

SQL Server 2008: insert into table in batches

I have a linked server (Sybase) set up in SQL Server from which I need to draw data. The Sybase server sits on the other side of the world, and connectivity is pretty shoddy. I would like to insert data into one of the SQL Server tables in manageable batches (e.g. 1000 records at a time). I.e. I want to do:
INSERT INTO [SQLServerTable] ([field])
SELECT [field] from [LinkedServer].[DbName].[dbo].[SybaseTable]
but I want to fetch 1000 records at a time and insert them.
Thanks
Karl
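For reference, a minimal T-SQL-only sketch of that kind of keyed batching, assuming [SybaseTable] exposes an increasing integer key column [ID] (not stated in the question) and that [ID] is also copied into [SQLServerTable]:

DECLARE @lastID int = 0;

WHILE 1 = 1
BEGIN
    -- Pull the next 1000 rows across the linked server, keyed on ID.
    INSERT INTO [SQLServerTable] ([ID], [field])
    SELECT TOP (1000) [ID], [field]
    FROM [LinkedServer].[DbName].[dbo].[SybaseTable]
    WHERE [ID] > @lastID
    ORDER BY [ID];

    IF @@ROWCOUNT = 0 BREAK;

    -- Advance the watermark to the highest key copied so far.
    SELECT @lastID = MAX([ID]) FROM [SQLServerTable];
END

Whether the TOP and WHERE actually get pushed to the remote Sybase side depends on the provider, so this may still pull more data over the link than expected; that caveat is part of why a client-side batching approach like the one below can be attractive.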
I typically use Python with the pyodbc module to perform batches like this against a SQL Server. Take a look and see if it is an option; if so, I can provide you an example.
You will need to modify a lot of this code to fit your particular situation; however, you should be able to follow the logic. You can comment out the cnxn.commit() line to roll back the transactions until you get everything working.
import pyodbc

#This is an MS SQL2008 connection string
conn='DRIVER={SQL Server};SERVER=SERVERNAME;DATABASE=DBNAME;UID=USERNAME;PWD=PWD'
cnxn=pyodbc.connect(conn)
cursor=cnxn.cursor()
rowCount=cursor.execute('SELECT Count(*) from RemoteTable').fetchone()[0]
cnxn.close()

count=0
lastID=0
while count<rowCount:
    #You may want to close the previous connection and start a new one in this loop. Otherwise
    #the connection will be open the entire time defeating the purpose of performing the transactions in batches.
    cnxn=pyodbc.connect(conn)
    cursor=cnxn.cursor()
    rows=cursor.execute('SELECT TOP 1000 ID, Field1, Field2 FROM INC WHERE ((ID > %s)) ' % (lastID)).fetchall()
    for row in rows:
        cursor.execute('INSERT INTO LOCALTABLE (FIELD1, FIELD2) VALUES (%s, %s)' % (row.Field1, row.Field2))
    cnxn.commit()
    cnxn.close()
    #The [0] assumes the id is the first field in the select statement.
    lastID=rows[len(rows)-1][0]
    count+=len(rows)
    #Pause after each insert to see if the user wants to continue.
    raw_input("%s down, %s to go! Press enter to continue." % (count, rowCount-count))