Can an SSIS script component send more than 25,000 rows at a time?

We have several SSIS packages that sync records from Microsoft AX to our warehouse management system. Each package uses a script component that calls a query service within AX to get the records, which we then insert into our target table in SQL Server. On a single run it will grab exactly 25,000 rows and send those over. Is there a way to increase how many it sends in a single run? We didn't set this 25,000 limit; it's just something we've always seen happen with any package that tries to send more than that.

No, there is no row limitation within SSIS itself. Here we have 25,001 rows flowing through a data flow.
Is there a default limit of 25,000 rows on Dynamics AX queries? Yes, there is:
From a single call to the query service, the default maximum number of rows that can be returned from a query data source is 25,000. This means that each data table in the dataset returned by the query service can have up to 25,000 records if you do not specify a record limit.
https://learn.microsoft.com/en-us/dynamicsax-2012/appuser-itpro/handling-large-datasets-returned-by-the-query-service
That article also describes how to handle larger result sets, by specifying a record limit or by paging through the data in chunks.

Related

SSIS: How to get the number of updated and deleted rows in an audit?

Imagine that you want to save in a variable the number of rows that were updated or deleted in a table.
These are the steps I did:
First, in the Control Flow I created a Data Flow Task.
Then, in the Data Flow, I created a source (in my case an Excel file), created two variables to count those rows (countDeleted and countUpdated), connected the variables to two Row Count transformations, and then connected my destination (OLE DB).
Now, in the Control Flow, what do I do?
Create an Execute SQL Task? Or a Script Task? What is the best way to do it? What is the piece of code to use?
Thanks for your help.
PS: I only have 4 weeks of SSIS experience, sorry for my noobieness :)
An OLE DB destination only inserts. It can't UPDATE or DELETE.
What's your logic for updating or deleting?
If you're just starting out and reading about doing things in SSIS, you will eventually find advice to use the OLE DB Command to perform row-by-row deletes and inserts.
In my opinion this is to be avoided. It does not scale (it works fine for small recordsets, then fails for large recordsets), and it is difficult to maintain parameter mappings in the OLE DB Command. You should try it anyway, though, to familiarise yourself with it.
My advice is to load the Excel data into a staging table, perform batch DELETE and UPDATE statements to load the data, and use @@ROWCOUNT to capture the records updated.
For example:
Your existing data flow, as described, can be used to load into a table called StagingTable.
Before your data flow, you should run an Execute SQL Task (this lives in the Control Flow pane, not the Data Flow pane) that clears the staging table:
TRUNCATE TABLE StagingTable;
So first get that working: repeatedly running your package should clear the staging table, then load Excel into it without creating duplicates.
This in itself is a challenge as Excel is a terrible data interchange format.
Once you have that working, you add an Execute SQL Task at the end that runs some SQL that deletes the records you want and captures the count. For example:
DELETE FROM MyFinalTable WHERE PrimaryKey IN (SELECT PrimaryKey FROM StagingTable);
SELECT @@ROWCOUNT;
Then you follow the instructions here to load that back to your SSIS variable
http://microsoft-ssis.blogspot.com/2011/03/rowcount-for-execute-sql-statement.html
What are you doing with this row count? Are you writing it to a logging table? Save yourself the bother of pulling it back into an SSIS variable and just write it directly:
DELETE FROM MyFinalTable WHERE PrimaryKey IN (SELECT PrimaryKey FROM StagingTable);
INSERT INTO LogTable (TableName, Operation, RowsAffected)
SELECT 'MyFinalTable', 'Delete', @@ROWCOUNT;
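The batch UPDATE mentioned above works the same way; here is a minimal sketch of the matching update-and-log step, assuming a hypothetical column called SomeColumn (adjust the SET list to your real schema):
-- Update existing rows in the final table from the freshly loaded staging rows
UPDATE f
SET f.SomeColumn = s.SomeColumn
FROM MyFinalTable f
INNER JOIN StagingTable s ON s.PrimaryKey = f.PrimaryKey;
-- @@ROWCOUNT still refers to the statement immediately above
INSERT INTO LogTable (TableName, Operation, RowsAffected)
SELECT 'MyFinalTable', 'Update', @@ROWCOUNT;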
In my experience it is not a good idea to build convoluted logic into SSIS packages if you can instead do it in the database. Although it does depend on the person who has to eventually maintain it. Hopefully you can appreciate that this T-SQL approach is a more straightforward, code-based approach, as opposed to having to dig around in property pages and events and other places inside SSIS packages.
I assume that you're using an Execute SQL Task for the updates and deletes? As @Nick.McDermaid mentioned, using an OLE DB Command within a Data Flow presents various issues when performing DML. You can find the number of rows updated, inserted, or deleted in a table through an Execute SQL Task by using the ExecValueVariable property of this task. Set the variable that will hold the row count to this property and it will return the number of affected rows. Note that it will only return the number of rows affected by the last statement in the Execute SQL Task, regardless of how many batches (i.e. GO separators) are in the component.
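As a quick illustration of that caveat, if the Execute SQL Task ran the batch below, the variable bound to ExecValueVariable would receive only the count from the final statement (table and column names here are hypothetical):
-- Suppose the UPDATE touches 500 rows and the DELETE touches 12
UPDATE dbo.SomeTable SET IsProcessed = 1 WHERE IsProcessed = 0;
DELETE FROM dbo.SomeTable WHERE IsObsolete = 1;
-- The variable bound to ExecValueVariable receives 12, not 500 or 512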

Regarding executing data flow tasks in a specific order in SSIS

I created 4 Data Flow Tasks, shown in the image in the order I want them to run:
Customers
Locations
Orders
Items
Each data task flow consists of:
Flat File data source
Data Conversion
OLE DB Destination
When I run the package, according to the Execution Results tab in the image, they run not in the order I wanted, but:
Customers
Items
Locations
Orders
The execution completes successfully, though. I was able to check that all the data was successfully inserted into the DB.
Questions:
Does the sequence of table migration matter when the tables have foreign keys?
If yes, then why didn't the above execution throw any errors?
How do I specify the order of execution?
Thanks.
The order you've specified is the order in which it has actually run. The Execution Results tab is showing them alphabetically.
If you'd like to see the order and prove it to yourself, add some tasks in the sequence to write to a table with datetimes in, or make or inspect a log (using the SSIS Catalog if you're using a Project deployment, or a Log Provider if using Package deployment).
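For instance, a minimal sketch of such a logging step, assuming a hypothetical PackageRunLog table (the name and columns are illustrative):
-- One-off setup: a simple audit table
CREATE TABLE dbo.PackageRunLog (
    TaskName varchar(50) NOT NULL,
    RanAt datetime2 NOT NULL DEFAULT SYSDATETIME()
);
-- Put an Execute SQL Task like this after each Data Flow Task,
-- then compare the RanAt values after a run
INSERT INTO dbo.PackageRunLog (TaskName) VALUES ('Customers');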
The order is exactly the way you set it. But if it's important to you to be sure it runs in a specific order, you can set the precedence constraint to Completion.
That way the second task will not start until the first completes.

How to View Data In SSRS Reports

I have an SSIS package with only two tasks: an Execute SQL Task and a Data Flow Task. The first task truncates a table (Table_1) and the second task loads data into the truncated table, Table_1. An SSRS report gets its data from that table, Table_1. My SSIS package runs every hour. While the SSIS package is running, users complain that they are not able to view data in the report. How can I make it so that users can still view data in my report while my SSIS package is running?
Off the top of my head: have a table that stores a value indicating whether the load is running or not, e.g. setting a single value to 1 when running and 0 when not running.
Then, in your SSRS report, instead of a 'SELECT * FROM Table_1', use a stored procedure that has logic to look up the value in the above table. If 0, return the result set normally; if 1, return some kind of friendly message that data is currently being loaded and the report is not available.
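A minimal sketch of that idea, assuming hypothetical object names (dbo.LoadStatus and dbo.GetReportData; the package would set IsLoading to 1 at the start of the run and back to 0 at the end):
-- One-off setup: a single-row flag table
CREATE TABLE dbo.LoadStatus (IsLoading bit NOT NULL);
INSERT INTO dbo.LoadStatus (IsLoading) VALUES (0);
GO
CREATE PROCEDURE dbo.GetReportData
AS
BEGIN
    IF EXISTS (SELECT 1 FROM dbo.LoadStatus WHERE IsLoading = 1)
        -- Friendly placeholder while the hourly load is running
        SELECT 'Data is currently being loaded; please try again shortly.' AS Message;
    ELSE
        SELECT * FROM dbo.Table_1;
END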

Access Continuous Form with Linked Table - How to Avoid Hitting Database Server for Every Row in Form?

I'm migrating the data from an Access database to SQL Server via the SQL Server Migration Assistant (SSMA). The Access application will continue to be used with the local tables converted to linked tables.
One continuous form hangs for 15 - 30 seconds when it's loading. It displays approximately 2000 records. When I looked in SQL Server Profiler to see what it was doing, it was making a separate call to the backend database for each record in the form. So the delay when the form opens is caused by the 2000-odd separate calls to the database.
This is amazingly inefficient. Is there any way to get Access to make a single call to the backend database and retrieve all the records at once?
I don't know if this is relevant but the Record Source for the form is a view in the SQL Server backend database, which is linked to via an Access linked table (so, hopefully, Access just sees it as a table, not a view). I needed an Instead Of trigger on the view in SQL Server, and a unique index on the linked table in Access, to allow the records to be updated via the form.
If the act of opening that continuous form really does generate ~2000 separate SQL queries (one for every row in the view) then that is unusual behaviour for Access interacting with a SQL Server linked "table". Under normal circumstances what takes place is:
Access submits a single query to return all of the Primary Key values for all rows in the table/view. This query may be filtered and/or sorted by other columns based on the Filter and Order By properties of the form. This gives Access a list of the key values for every row that might be displayed in the form, in the order in which they will appear.
Access then creates a SQL prepared statement using sp_prepexec to retrieve entire rows from the table/view ten (10) rows at a time. The first call looks something like this...
declare @p1 int
set @p1=4
exec sp_prepexec @p1 output,N'@P1 int,@P2 int,@P3 int,@P4 int,@P5 int,@P6 int,@P7 int,@P8 int,@P9 int,@P10 int',N'SELECT "ID","AgentName" FROM "dbo"."myTbl" WHERE "ID" = @P1 OR "ID" = @P2 OR "ID" = @P3 OR "ID" = @P4 OR "ID" = @P5 OR "ID" = @P6 OR "ID" = @P7 OR "ID" = @P8 OR "ID" = @P9 OR "ID" = @P10',358,359,360,361,362,363,364,365,366,367
select @p1
...and each subsequent call uses sp_execute, something like this
exec sp_execute 4,368,369,370,371,372,373,374,375,376,377
Access repeats those calls until it has retrieved enough rows to fill the current page of continuous forms. It then displays those forms immediately.
Once the forms have been displayed, Access will "pre-fetch" a couple more batches of rows (10 rows each) in anticipation of the user hitting PgDn or starting to scroll down.
If the user clicks the "Last Record" button in the record navigator, Access again uses sp_prepexec and sp_execute to request enough 10-row batches to fill the last page of the form, and possibly pre-fetch another couple of batches in case the user decides to hit PgUp or start scrolling up.
So in your case if Access really is causing SQL Server to run individual queries for every single row in the view then there may be something particular about your SQL View that is causing it. You could test that by creating an Access linked table to a single SQL Table or a simple one-table SQL View, then use SQL Server Profiler to check if opening that linked table causes the same behaviour.
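For example, a deliberately simple one-table view for that test might look like this (reusing the table and column names from the trace above):
-- Link to this view from Access, open it, and profile the calls
CREATE VIEW dbo.vw_myTblSimple
AS
SELECT ID, AgentName
FROM dbo.myTbl;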
Turned out the problem was two aggregate fields. One field's Control Source was =Count(ID) and the other field's Control Source was =Sum(Total_Qty).
Clearing the control sources of those two fields allowed the form to open quickly. SQL Server Profiler shows it calling sp_execute, as Gord Thompson described, to retrieve seven batches of 10 rows at a time. Much quicker than making 2000 calls to retrieve one row at a time.
I've come across the same problem again but this time with a different cause. I'm including it here for completeness, to help anyone in a similar situation:
This time the underlying query was hanging and SQL Server Profiler showed the same behaviour as before, with Access making separate calls to the SQL Server database to bring back one record at a time, for every record in the query.
The cause turned out to be the ORDER BY clause in the query. I guess Access had to pull back all the records in the linked table from SQL Server before being able to order them. It makes sense when I think about it, although I don't know why Access doesn't just pull all the records through at once instead of getting them one at a time.
I would try setting the Recordset Type to Snapshot (on the Data tab of the Form's property sheet and/or the property sheet of the query you are using for the form source)

What is the best tool to use to transfer Data from Reporting Database to another?

I have a reporting database and have to transfer data from it to another server, where we run some other reports or functions on the data. What is the best way to transfer data periodically, say monthly or bi-weekly? I can use SSIS, but is there any way I can put a WHERE clause on which rows should be extracted from the source database? For example, I only want to extract data for the current month. Please do let me know.
Thanks,
Vivek
For scheduling periodic extractions, I'd leave that to SQL Agent.
As for restricting the results by some condition, that's an easy thing. Instead of the Table Name access mode (and you should always use SQL Command or SQL Command From Variable over Table Name/Table Name From Variable, as they are faster), use a parameterized query.
Add a parameter. If you're using an OLE DB connection manager, your indicator for a parameter is ?. With ADO.NET it will be @parameterName.
Now wire the filter up by clicking the Parameters... button. With OLE DB, it's ordinal position starting at 0. If you wanted to use the same parameter twice, you would have to list it each time, or use the ADO.NET connection manager.
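For example, a minimal sketch of an OLE DB source query filtered to the current month (the table and column names are hypothetical):
-- The ? placeholders map by ordinal: parameter 0 = start of this month,
-- parameter 1 = start of next month (both supplied from SSIS variables)
SELECT OrderID, OrderDate, Amount
FROM dbo.SalesOrders
WHERE OrderDate >= ? AND OrderDate < ?;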
The biggest question you will have to answer is how to identify which row(s) need to go. The possibilities are endless: query the target database and find the most recent modified date for a table, or the highest key value. You could create a local table that tracks what's been sent and query that. You could perform an incremental load / ETL instrumentation to identify new/updated/unchanged rows, etc. One option is sketched below.
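As one sketch of the high-water-mark approach, the query below could run in an Execute SQL Task against the target, with the result stored in an SSIS variable and passed as the source query's parameter (the names are hypothetical):
-- Find the most recent row already transferred; fall back to a floor
-- date on the very first run
SELECT ISNULL(MAX(ModifiedDate), '1900-01-01') AS LastLoaded
FROM dbo.TargetTable;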