Sudden increase in package failures executed against SharePoint/Office 365 from SSIS

Beginning yesterday afternoon (12/9, Central US time) we began seeing a marked increase in SSIS package execution failures. These packages have been in operation for several months and experienced no failures on 12/8. Initially I brushed it off as temporary, but now it seems as if none of them are working. Several of these packages run hourly, with the first failure around 10:30 on 12/9. Between 10:30 and 15:00 most succeeded, but after 15:00 on 12/9 most failed.
I'm testing with a relatively simple dataflow package. I have two sources (SQL and Sharepoint). From the sources, I compare the two and then update the Sharepoint list with any changes that have been made (SQL query is the authoritative record). The Source Sharepoint list is the same list that is being updated. As a further test, I removed all steps except for querying the Sharepoint list and sorting it. The initial query still fails.
Errors are happening inconsistently within the dataflow package. For example, in this morning's testing I had one run (and only one) that made it through the package to the point where it should have tried to add, update, or delete list items; the table comparison resulted in updates to the SharePoint list, and the package failed when attempting to update the records. Most runs (including all recent attempts) fail when the dataflow initially queries the SharePoint list. There are only two records in the SharePoint list and two records in the SQL table.
I'm connecting to SharePoint using MS Graph. Testing the connection (Connection Manager) within VS 2019 has succeeded every time. I've verified that the secret I'm using is not expired; I created a new secret and am receiving the same error. Usually a preview of the SharePoint source succeeds, but not always, and even when the preview succeeds, attempting to debug and run the package fails. I'm not seeing any alerts on Microsoft or Azure that would indicate the problem is on their side, though I feel like something must have changed there.
I have opened a support ticket with CozyRoc and they have directed me to open a ticket with Microsoft. Microsoft's support request workflow is directing me here.
In the production All Execution reports, the error I'm getting back is:
"Data Flow Task:Error: Attempt to read message string for 0xc02090f5 failed with error 0xc02090f2. Make sure all message related files are registered."
Initial research pointed me toward a data typing issue, but I haven't changed anything in our SharePoint, SSIS, or SQL environment that would have changed the data types.
This appears to be very repeatable so I can try providing more information if needed.
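A hedged sketch for digging past that stub message: the query below (package name is a placeholder) pulls the full error and warning text for failed executions straight from the SSIS catalog views that back the All Executions report:

    -- full error/warning text for recent failed runs of the package
    -- (package name below is a placeholder)
    SELECT  e.execution_id,
            e.package_name,
            e.status,                        -- 4 = failed
            m.message_time,
            m.message_source_name,
            m.message
    FROM    SSISDB.catalog.executions     AS e
    JOIN    SSISDB.catalog.event_messages AS m ON m.operation_id = e.execution_id
    WHERE   e.package_name = 'SharePointSync.dtsx'   -- placeholder name
      AND   e.status = 4
      AND   m.message_type IN (120, 130)             -- 120 = error, 130 = warning
    ORDER BY m.message_time DESC;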

Looks like the answer was to wait. After about a week, the issue resolved as quickly as it appeared. I didn't find a way to report my issue directly to Microsoft without adding a support plan, and I was in the process of finding alternative methods to address our needs when it resolved by itself.

SSIS package wrote 0 rows

Yes, I read the other questions on the same topic, but they do not cover my issue.
We run two environments: DEV and PROD. The two were synced last week, meaning they ought to contain the same data, run the same SSIS packages, and read from the same source data.
However, today we had a package on PROD go through its usual steps (three tables being truncated and then loaded from an OLE DB source to an OLE DB destination, one after the other). The package finished without throwing an error, yet the first two tables contain data whereas the last one does not.
On DEV, everything looks fine.
I went through the package history, and it actually shows it wrote 0 rows.
Yesterday, however, it worked as intended.
When I manually ran the package, it wrote data. When I click "Preview", it displays data. When I manually run the source query, it consistently returns data, the same amount of rows, every time. The SSIS catalog has not been updated (no changes were deployed to PROD between yesterday and today).
The source query does not use table variables, but it does use CTEs. I have seen suggestions to add SET NOCOUNT ON (a minimal example is sketched just after this question), and I'm willing to accept this could be an explanation. However, those answers seem to indicate the package never writes any data, whereas this package has worked successfully before, and works successfully on DEV.
Does anyone have an explanation I can give my customer for why one package suddenly chose not to write any data, and how can I ensure this won't happen again, for either this package or any of the other packages?
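For reference, a minimal sketch of the SET NOCOUNT ON suggestion mentioned above, placed at the top of a CTE-based source query (table and column names are hypothetical):

    SET NOCOUNT ON;   -- suppress extra "N rows affected" messages that can confuse the OLE DB source

    WITH src AS (
        SELECT Id, Amount          -- hypothetical columns
        FROM   dbo.SourceTable     -- hypothetical table
    )
    SELECT Id, Amount
    FROM   src;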
This can be tricky. Try the following:
1. Under Integration Services Catalogs -> SSISDB -> project -> (right-click) Reports -> Standard Reports -> All Executions, check whether the ETL job lost contact with the warehouse at any point.
2. If you have logging enabled, try to see at which task_name your package started returning 0 rows:
select
    data_stats_id,
    execution_id,
    package_name,
    task_name,
    source_component_name,
    destination_component_name,
    rows_sent
from
    ssisdb.catalog.execution_data_statistics
How are you handling transactions and checkpoints? This is important if you want to know the root cause of this issue. It may be that a loss of connectivity forced a rollback of the writes to the warehouse.
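Building on the query above, a hedged variant (package and destination names are placeholders) that totals rows_sent per execution, so a run that reported success but moved 0 rows stands out:

    -- total rows sent to the destination per execution of the package
    SELECT  eds.execution_id,
            SUM(eds.rows_sent) AS total_rows_sent
    FROM    ssisdb.catalog.execution_data_statistics AS eds
    WHERE   eds.package_name = 'LoadTables.dtsx'                   -- placeholder
      AND   eds.destination_component_name = 'OLE DB Destination'  -- adjust to your component name
    GROUP BY eds.execution_id
    ORDER BY eds.execution_id DESC;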
As it turns out, the issue was caused by an oversight.
Because we run DEV and PROD on the same server (we know, and have recommended the customer at the very least consider using separate instances), we use variables, set via environment variables, to point each package at the proper environment.
The query feeding this particular package was updated, and apparently, rather than using the variable to switch databases, the database was hard-coded (likely as a result of testing, then forgetting to switch back to the variable). The loads for DEV and PROD run at the same time, and we suspect that while PROD was ready, DEV was still processing the source tables, and thus 0 rows were returned.
We only found this out today because the load ran fine again right up until this morning. I was too late to catch it with Profiler, but because it was only this package, I checked the query and spotted the hard-coded reference to _DEV.
Thanks everyone for chiming in.

SSIS: failed to mark the cache as filled with data

I'm creating an Integration Services package in Visual Studio 2010 Shell. I have a lookup transformation in the package that was working fine until recently.
About two weeks ago, the lookup transformation started failing. It will process about 1,800 rows and then crash (the number of rows processed before crashing varies with each run). The error I receive is listed below:
Error: 0xC0010202 at Check if source key exists in bridge table, Lookup 1 [746]: Failed to mark the cache as filled with data.
One of the things that I find perplexing is that I receive this message regardless of whether I use full, partial, or no cache. I'm confused as to why the lookup would try to mark the cache as filled even when I set the lookup to not use a cache.
I've googled this and tried to adjust every setting and advanced setting that I could get my hands on but to no avail. Has anyone else had any experience like this?

Crystal Reports Server - get list of currently running reports and their progress

Hope it's the right place to ask this question - usually I use SO to ask about programming...
I'm doing a project that involves Crystal Reports Server. From code, I'm able to schedule reports successfully, but when I look at the BI launch pad I don't see the report in My Recently Run Documents (I do see failed reports in that list - ones that have wrong database credentials).
When I go to the Central Management Console, find my reports in their folders, and open Properties > History, I see the report status as "Running" - and it has stayed like that far longer than it should, for two different reports I have sent.
How can I diagnose what the problem is, and why it is stuck? There are no error messages anywhere about it.
How can I get a full history of all reports in the system (not just one single report at a time)? And how can I see currently running reports?
How can I stop a running report?
I really hope this is the right place for these kinds of questions... if not, I would be very happy to get a referral.
Thanks
How can I get a full history of all reports in the system?
Open the CMC and then click on the Instance Manager. At the bottom of the page, you can filter on the object type and status. That way, you can get a full overview of all running reports on your platform.
How can I stop a running report?
If you select a running instance (either in a document's history page or in the Instance Manager), you'll notice that there is no stop button. Instead, you have to delete the running instance. It might not stop running immediately though (depending on what it's doing), but it will be removed immediately from the list of instances.
How can I diagnose what the problem is?
What I would recommend is to enable tracing on all related servers (thus your job server, processing server, etc) and then retry scheduling the report. This should generate additional logging on the server which you can use to diagnose the issue.
The trace files have the extension .glf (generic log file) and are located in the logging folder on your Crystal Server. Have a look at the command-line property of each of the servers for which you're enabling tracing; you should find the log folder path there somewhere.
Make sure to turn the tracing off again as soon as you're finished, as tracing will not only create extra strain on your servers (causing the system to slow down), but it will also result in very large log files.
Before starting with tracing, have a look at the existing log files to see if it doesn't already contain error messages that might help you diagnose the issue. Sort the log files by date, and look at the most recent one for each of the servers involved. If there's nothing in there, start with tracing, but remove the existing .glf files to minimise log contamination (some files will be locked, just ignore them).

Deadlock on logging variable value changes using a SQL task

Morning
I've been reading "SQL Server 2008 Integration Services Problem - Design - Solution". It outlines a way of logging variable changes which I'm trying to replicate in SQL 2005.
Create variables e.g. PackageId, RecordsAffected. - Set Raise ChangeEvent to true.
Create a string variable, e.g. strVariableValue. - Set Raise ChangeEvent to false.
On the package event handler: OnVariableValueChanged add a script task "SCR Convert value to string".
Add ReadOnlyVariables: System::VariableValue
Add ReadWriteVariables: User::strVariableValue
In the script, set a local variable to System::VariableValue.Value.ToString
Set the variable User::strVariableValue to the local variable
Add an "Execute SQL Task" component "SQL Log Variable Value Changed" calling a SP with no resultsets.
Set parameter mapping to User::PackageId, System::VariableName, User::strVariableValue
When this is run, I get a deadlock on User::PackageID
Error: 0xC001405B at SQL Log Variable Value Changed: A deadlock was detected while trying to lock variable "User::_PackageID" for read access. A lock could not be acquired after 16 attempts and timed out.
The script step succeeds but the Execute SQL task fails. I'm using Visual Studio 2005 Version 8.0.50727.42, Microsoft SQL Server Integration Services Designer Version 9.00.4035.00 and BIDSHelper Version 1.4.3.0.
Any ideas?
Eureka!
I had the same problem and led to a few deadend posts, then I discovered the root.
I had the framework working just fine and wanted to force some info to be logged.
So I changed the value of the framework variable "strVariableValue" and this caused the deadlock with the change event task.
I fixed by creating my own variable "strLogMe" and putting whatever I wanted to log.
Moral: don't touch the framework variables
Did you use the code sample from the book? All the files are available on the Wiley website for free. The code sample includes an SSIS package, SQL scripts, and VB code for the script. If this doesn't work for you, then let me know, since one of my team members found a way to log variable changes that is different from this methodology.
I was getting this error ("a deadlock was detected", etc.) suddenly, and it seemed to coincide with I.T. applying a Microsoft Windows patch on the server. The packages in question used script tasks with read-only and/or read-write variables declared in the SSIS UI. It looked like an environmental issue (the packages had worked for months, then suddenly stopped working, even though I hadn't changed any code), and various older blog posts described companies applying server patches and then having their SSIS packages break; the advice in those posts was to change the way the variables are locked - don't reference them in the UI, lock them explicitly in code. So I tried the same thing. It didn't fix it.
It turns out someone had removed the user under whose identity the packages run from the AD group; that group membership was required because the package copies a file from a directory that requires read permissions. These packages are typically called by a SQL Agent job using a proxy identity. When the package was executed manually from SSMS, it worked, but when it was run via the SQL Agent job, it failed.
The bottom line is, it was just coincidence that the packages started failing around the time of the Windows update. But the other (main) point is, if your package is trying to access a file on the network, and the identity (or proxy identity) under which that package runs does not have permissions to the source or target directory, then your package could fail and the problem could manifest itself in this cryptic way, where it looks like a variable deadlock issue, but it's actually a file share permissions issue. I only wasted a day on this, but... maybe this will be useful to somebody in the future.
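On the proxy-identity angle, a hedged sketch (assuming the jobs live in msdb on the same instance) to list which proxy and Windows credential each SSIS job step actually runs under:

    -- which proxy (if any) each SSIS job step uses, and the Windows identity behind it
    SELECT  j.name AS job_name,
            s.step_name,
            p.name AS proxy_name,
            c.credential_identity
    FROM    msdb.dbo.sysjobs      AS j
    JOIN    msdb.dbo.sysjobsteps  AS s ON s.job_id = j.job_id
    LEFT JOIN msdb.dbo.sysproxies AS p ON p.proxy_id = s.proxy_id
    LEFT JOIN sys.credentials     AS c ON c.credential_id = p.credential_id
    WHERE   s.subsystem = 'SSIS';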

Reporting Services Subscription fails but manually running report works

Our company has a huge nasty report that takes about 50-60 minutes to run (it's for a university and lists all sorts of payment information for all students registered in courses). While it has been running each morning at 5am as a subscription, it recently stopped working and displays "An error has occurred during report processing." in the properties window for the subscription.
If I manually run the report from inside Visual Studio it will work every time, but the subscription will now always fail. I had our DBA turn on trace logging and it gave us no helpful information whatsoever. I've also set the subscription to run at different times throughout the day, with no success. The report is supposed to put an Excel file on a file share and it works for the other 5 subscriptions to this report (I have 6 subscriptions, only 1 of the 6 has a parameter set that returns values from a larger dataset). So this means that it has permission to write to the file share. Any ideas?
Could it be trying to write more than 65,536 rows to the Excel file (the row limit of the older .xls format)? If so, it will just fail.
Also you might check the configuration for IIS to see if the report is causing a timeout.
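Another hedged angle, assuming you can query the ReportServer catalog database directly (its name can vary per install): check the subscription's last recorded status and the recent subscription-driven executions, whose row counts also help test the 65,536-row theory:

    -- last outcome of each subscription, from the ReportServer database
    SELECT  c.[Path],
            s.Description,
            s.LastStatus,
            s.LastRunTime
    FROM    dbo.Subscriptions AS s
    JOIN    dbo.[Catalog]     AS c ON c.ItemID = s.Report_OID
    ORDER BY s.LastRunTime DESC;

    -- recent subscription-driven executions, with row counts
    SELECT TOP (20)
            ItemPath, TimeStart, TimeEnd, [Status], [RowCount], ByteCount
    FROM    dbo.ExecutionLog3
    WHERE   RequestType = 'Subscription'
    ORDER BY TimeStart DESC;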