SSRS Subscriptions Duplicating email - sql-server-2008

Had an interesting error today and couldn't find anything online about it so wondered if any of you guys had seen this behavior before.
We had an out-of-memory error and CPU usage was spiking this morning on our report server. A clean reboot seemed to rectify the issue, however since then all the email subscriptions have been sending multiple times. What do I mean by this? As far as SSRS is concerned the subscription ran once at its normal time (10am); this has been proven by scrutinizing the logs to see if another execution occurred (it didn't), and by renaming the SPROC that the report references to ensure that a re-execution would fail, yet nothing failed and the mail was resent anyway. I then checked the Exchange queues and turned on logging for the connection, and I could see a new mail being resubmitted to the Exchange mail queue every 30 minutes.
The question is: what process is causing that mail to be resubmitted to the Exchange server, and how, other than another reboot, do we stop the emails resending?
Thanks in advance
-- Further --
Having done more digging, we have noticed that the [ReportServer].[dbo].Notifications table is populated with all of the reports that are sending out multiple times, with the Attempts column incrementing every time a duplicate email is sent out.
We still don't know why these are resending.
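For anyone else digging into this, a query along these lines can be used to watch the retry counts (a sketch only; the column names come from the SSRS 2008 ReportServer schema and may differ slightly between versions):
-- Sketch: pending notifications and their retry counts in the ReportServer catalog
SELECT n.NotificationID, n.SubscriptionID, s.Description, n.Attempt, n.ProcessStart, n.ProcessAfter
FROM [ReportServer].[dbo].[Notifications] n
JOIN [ReportServer].[dbo].[Subscriptions] s ON s.SubscriptionID = n.SubscriptionID
ORDER BY n.ProcessStart;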

It seems to come down to the logging level... If you switch the Report Server Service logging level down to level 2 (exceptions, restarts and warnings), this error seems to manifest itself; however, when the logging level is switched back up to 3 or above, the error seems to disappear. Some similar behavior is noted here: http://social.msdn.microsoft.com/Forums/en-NZ/sqlreportingservices/thread/b78bb6e2-0810-4afd-ba6b-8b09a243f349

Check the SQL Agent jobs (named with GUIDs) for the subscriptions. Maybe the schedules on those got messed up somehow.
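If it helps, a query roughly like the following maps the GUID-named Agent jobs back to their subscriptions and reports; this is a sketch against the standard ReportServer tables, so adjust the database name and joins to your setup:
-- Sketch: map GUID-named SQL Agent jobs to their SSRS subscriptions
SELECT j.name AS AgentJobName, c.Name AS ReportName, su.Description, su.LastStatus, su.LastRunTime
FROM msdb.dbo.sysjobs j
JOIN [ReportServer].[dbo].[ReportSchedule] rs ON CONVERT(nvarchar(128), rs.ScheduleID) = j.name
JOIN [ReportServer].[dbo].[Subscriptions] su ON su.SubscriptionID = rs.SubscriptionID
JOIN [ReportServer].[dbo].[Catalog] c ON c.ItemID = su.Report_OID;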

Related

Why does SSRS need to recycle the application domain

I'm working with MS Reporting Services 2016. I noticed that the application domain is set by default to recycle every 12 hours. Now the impact on users after a recycle is either slow response from reporting services or a failed report. Both disappear after a refresh of the report, but this is not ideal.
I have come across a SO answer where people suggest that you can turn off the scheduled recycle by setting the configuration attribute RecycleTime to zero.
I have also read about writing a script to manually restart Reporting Services, which also recycles the app domain, followed by a script that simply loads a report at a controlled time to remove the first-time load issues. However, this all seems like a workaround to me and I would rather not have to do it.
My concern is that there must be a logical reason for having the scheduled recycle time, but I cannot find any information explaining this. Does anyone know if there is a negative impact from turning off the scheduled application domain recycle?
RecycleTime is a setting aimed at making sure SSRS isn't holding on to RAM it doesn't need and potentially starving the rest of the machine. Disabling the recycle essentially removes the ability to claw back memory that was grabbed for a brief period of intensive processing.
If you are confident your machine is suitably resourced, you can turn the recycle off; if not, schedule it for an out-of-hours time and define a Cache Refresh Plan to cache any super important reports immediately afterwards to minimise user impact.
Further reading here: https://www.mssqltips.com/sqlservertip/2735/prevent-sql-server-reporting-services-slow-startup/
I guess I'm possibly oversimplifying this, but SSRS was designed to recycle every 12 hours (default) for a reason. If it ain't broke, don't fix it. In my case, I wanted to control when the recycle occurred. I execute a one-line PowerShell script from a SQL Agent job at 6:50 am, then generate a subscription report at 7 am, which kick-starts SSRS, and the users do not see any performance degradation.
restart-service 'ReportServer'
Leaving the SSRS config file setting at 720 minutes lets the recycle occur again at 6:50 pm. Subscription reports generate throughout the night, so if a human gets on SSRS after hours there should be no performance issue because the system is already running.
Are we possibly overthinking it?

MySQL transactions: reads while writing

I'm implementing PayPal Payments Standard in the website I'm working on. The question is not related to PayPal, I just want to present this question through my real problem.
PayPal can notify your server about a payment in two ways:
PayPal IPN - after each payment PayPal sends a (server-to-server) notification to a URL (chosen by you) with the transaction details.
PayPal PDT - after a payment (if you set this up in your PP account) PayPal will redirect the user back to your site, passing the transaction id in the url, so you can query PayPal about that transaction, to get details.
The problem is that you can't be sure which one happens first:
Will your server be notified by IPN first?
Will the user be redirected back to your site first?
Whichever happens first, I want to be sure I'm not processing a transaction twice.
So, in both cases, I query my DB against the transaction id coming from PayPal (and the payment status too, but that doesn't matter now) to see if I have already saved and processed that transaction. If not, I process it and save the transaction id with the other transaction details into my database.
QUESTION
What happens if I start processing the first request (let it be the PDT, so the user was redirected back to my site but my server hasn't been notified by IPN yet), and before I actually save the transaction to the database, the second request (the IPN) arrives and tries to process the transaction too, because it doesn't find it in the DB?
I would love to make sure that while I'm writing a transaction into the database, no other query can read the table looking for that given transaction id.
I'm using InnoDB, and don't want to lock the whole table for the duration of the write.
Can this be solved simply with transactions, or do I have to lock that row "manually"? I'm really confused, and I hope some more experienced MySQL developers can help make this clear for me and solve the problem.
Native database locks are almost useless in a Web context, particularly in situations like this. MySQL connections are generally NOT persistent: when a script shuts down, so does the MySQL connection, all locks are released, and any in-flight transactions are rolled back.
e.g.
situation 1: You direct a user to paypal's site to complete the purchase
When they head off to paypal, the script which sent the HTTP redirect terminates and shuts down. Locks/transactions are released/rolled back, and they come back to a "virgin" status as far as the DB is concerned. Their record is no longer locked.
situation 2: Paypal does a server-to-server response. This will be done via a completely separate HTTP connection, utterly distinct from the connection established by the user to your server. That means any locks you establish in the yourserver<->user connection will be distinct from the paypal<->yourserver session, and the paypal response may encounter locked tables. And of course, there's no way of predicting when the paypal response comes in. If the network gods smile upon you and paypal's not swamped, you get a response very quickly, possibly while the user<->you connection is still open. If things are slow and the response is delayed, that response MAY encounter unlocked tables/rows because the user<->server session has completed.
You COULD use persistent MySQL connections, but they open up a whole other world of pain. For example, consider the case where your script has a bug which gets triggered halfway through processing. You connect, do some transaction work, set up some locks... and then the script dies. Because the MySQL connection is persistent, MySQL will NOT see that the client script has died, and it will keep the transactions/locks in flight. But the connection is still sitting there in the shared pool, waiting for another session to pick it up. When it invariably is picked up, that new script has no idea that it's gotten this old "stale" connection. It'll step into the middle of a mess of locks and transactions it has no idea exists. You can VERY easily get yourself into a deadlock situation like this, because your buggy script has dumped garbage all over the system and other scripts cannot cope with that garbage.
Basically, unless you implement your own locking mechanism on top of the system, e.g. UPDATE users SET locked=1 WHERE id=XXX, you cannot use native DB locking mechanisms in a Web context except in 1-shot-per-script contexts. Locks should never be attempted over multiple independent requests.
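To make that concrete, one common pattern (a different technique from a locked flag column, but with the same goal) is to put a unique key on the PayPal transaction id and let the database reject the second insert, so whichever request arrives second simply skips processing. A minimal sketch, with made-up table and column names:
-- Illustrative schema: the unique key on the PayPal transaction id is what prevents double-processing
CREATE TABLE payments (
  id INT AUTO_INCREMENT PRIMARY KEY,
  paypal_txn_id VARCHAR(32) NOT NULL,
  status VARCHAR(20) NOT NULL DEFAULT 'processing',
  UNIQUE KEY uq_paypal_txn (paypal_txn_id)
) ENGINE=InnoDB;

-- Each handler (IPN or PDT) tries to claim the transaction before doing any work.
-- 1 affected row = we own it and should process; 0 affected rows = the other request got there first.
INSERT IGNORE INTO payments (paypal_txn_id) VALUES ('TXN123');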

Is there any way you could stop a data driven subscription in SSRS 2008?

Is there any way you can stop a DDS? I have a process that adds a .pdf to 400 different folders. I was just wondering if there was any way to stop it, because sometimes it interferes with other things on the server I'm working on and slows it down significantly.
There's no supported way to do this. There is a hack you can try, but it involves manipulating the tables in the ReportServer database, which is unpredictable at best. Here are the steps:
Stop the SSRS server. This will make it quit sending new subscriptions, but it doesn't actually stop the DDS.
Find the relevant row in the ActiveSubscriptions table.
Delete all of the rows in the Notifications table with an ActivationID that corresponds to the ActiveID in ActiveSubscriptions.
Delete the row in ActiveSubscriptions.
Restart the SSRS server
Any subscriptions that SSRS had already queued up with the SQL Server Agent will still be processed, but it should stop sending new ones. As I said, this is a hack, and it's difficult to say what else you might break by doing this.
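If you do go down that road, the deletes from steps 2-4 look roughly like this; it's a sketch only, so back up the ReportServer database first and check the column names against your SSRS version:
-- Sketch of steps 2-4: remove the active subscription's queued notifications
DECLARE @ActiveID uniqueidentifier;
SELECT @ActiveID = ActiveID
FROM [ReportServer].[dbo].[ActiveSubscriptions]
WHERE SubscriptionID = '00000000-0000-0000-0000-000000000000';  -- placeholder: the DDS subscription's GUID

DELETE FROM [ReportServer].[dbo].[Notifications] WHERE ActivationID = @ActiveID;
DELETE FROM [ReportServer].[dbo].[ActiveSubscriptions] WHERE ActiveID = @ActiveID;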

What causes mysterious hanging threads in ColdFusion -> MySQL communication

One of the more interesting "features" in ColdFusion is how it handles external requests. The basic gist of it is that when a query is made to an external source through <cfquery> or any other external request like that, it passes the external request on to a specific driver, and at that point CF itself is unable to suspend it. Even if a timeout is specified on the query or in the cfsetting, it is flatly ignored for all external requests.
http://www.coldfusionmuse.com/index.cfm/2009/6/9/killing.threads
So with that in mind the issue we've run into is that somehow the communication between our CF server and our mySQL server sometimes goes awry and leaves behind hung threads. They have the following characteristics.
The hung thread shows up in CF and cannot be killed from FusionReactor.
There is no hung thread visible in mySQL, and no active running query (just the usual sleeps).
The database is responding to other calls and appears to be operating correctly.
Max connections have not been reached for either the DB or the user.
It seems to me the only likely candidate is that somehow CF is making a request, mySQL is responding to that request, but with an answer which CF ignores, and CF continues to keep the thread open waiting for a response from mySQL. That would explain why the database seems to show no signs of problems, but CF keeps a thread open waiting for the mysterious answer.
Usually these hung threads appear randomly on otherwise working scripts (such as posting a comment on a news article). Even while one thread is hung for that script, other requests for that script will go through, which would imply that the script isn't necessarily at fault, but rather the condition faced when the script was executed.
We ran some tests to determine that it was not a mysql-generated max_connections error... we created a user, gave it a limit of 1 max connection, tied up that connection with a sleep(1000) query and executed another query. Unfortunately, it correctly errored out without generating a hung thread.
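For anyone who wants to reproduce that test, it was along these lines (a sketch; the user name is made up and the grant syntax is for the MySQL 5.x we were running):
-- Throwaway user limited to a single connection
CREATE USER 'conntest'@'%' IDENTIFIED BY 'secret';
GRANT ALL ON testdb.* TO 'conntest'@'%' WITH MAX_USER_CONNECTIONS 1;

-- Session 1 (as conntest): tie up the only allowed connection
SELECT SLEEP(1000);

-- Session 2 (as conntest): this correctly errors out with a max_user_connections error
-- instead of producing a hung thread
SELECT 1;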
So, I'm left at this point with absolutely no clue what is going wrong. Is there some other connection limit or timeout which could be causing the communication between the servers to go awry?
One of the things you should start to look at is the hardware between the two servers. It is possible that you have a router or bridge or NIC that is dropping occasional packets. This can result in the mySQL box thinking it has completed the task while the CF server sits there and waits for a complete response indefinitely, creating a hung thread.
3com has some details on testing for packet loss here: http://support.3com.com/infodeli/tools/netmgt/tncsunix/product/091500/c11ploss.htm#22128
We had a similar problem with a MS SQL server. There, the root cause was a known issue in which, for some reason, the server thinks it's shutting down, and the thread hangs (even though the server is, obviously, not shutting down).
We weren't able to eliminate the problem, but were able to reduce it by turning off pooled DB connections and fiddling with the connection refresh rate. (I think I got that label right -- no access to administrator at my new employment.) Both are in the connection properties in Administrator.
Just a note: The problem isn't entirely with CF. The problem, apparently, affects all Java apps. Which does not, in any way, reduce how annoyed I get by this.
Long story short, but I believe the cause was ColdFusion 8's image processing. It was just buggy, and now in CF9 I have never seen that problem again.

SSIS Job Monitoring and Reporting

Our shop relies heavily on SSIS to run our back end processes and database tasks. Overall, we have hundreds of jobs, and for the most part, everything runs efficiently and smoothly.
Most of the time, we have a job failure due to an external dependency failing (data not available, files not delivered, etc). Right now, our process is set up to email us every time a job fails. SSIS will generate an email sending us the name of the job and the step it failed on.
I'm looking at creating a dashboard of sorts to monitor this process more efficiently. I know that the same information available in the Job History window from SSIS is also available by querying the msdb database. I want to set up a central location to report failures (probably using SQL Reporting Services), and also a more intelligent email alert system.
Has anyone else dealt with this issue? If so, what kind of processes/reporting did you create around the SSIS procedures to streamline notification of job failures or alerts?
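For context, the kind of msdb query I have in mind is roughly this (a sketch; run_status 0 means failed, and the undocumented msdb.dbo.agent_datetime helper just combines the integer run_date/run_time columns):
-- Sketch: recent SQL Agent job step failures, i.e. the same data as the Job History window
SELECT j.name AS job_name,
       h.step_id,
       h.step_name,
       msdb.dbo.agent_datetime(h.run_date, h.run_time) AS run_datetime,
       h.message
FROM msdb.dbo.sysjobhistory h
JOIN msdb.dbo.sysjobs j ON j.job_id = h.job_id
WHERE h.run_status = 0
ORDER BY run_datetime DESC;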
We have a similar setup at our company. We primarily rely on letting the jobs notify us when there is a problem and we have employees who check job statuses at specific times to ensure that everything is working properly and nothing was overlooked.
My team receives a SQL Server Agent Job Activity Report HTML email every morning at 6am and 4pm that lists all failed jobs at the top, running jobs below that, and all other jobs below that grouped into daily, weekly, monthly, quarterly, and other categories. We essentially monitor SQL Server Agent jobs, not the SSIS packages themselves. We rely on job categories and job schedule naming conventions to automate grouping in the report.
We have a similar setup for monitoring our SSRS subscriptions. However, we only monitor this one once a day since most of our subscriptions are triggered around 3am-4am in the morning. The SSRS Subscription Activity Report goes one step further than the SQL Server Agent Job Activity Report in that it has links to the subscription screen for the report and has more exception handling built into it.
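Something along these lines against the ReportServer catalog gives the raw data for that kind of subscription report (a sketch; LastStatus is free text, so the failure filter below is only approximate):
-- Sketch: subscriptions whose last run does not look successful
SELECT c.Path, c.Name AS ReportName, su.Description, su.LastStatus, su.LastRunTime
FROM [ReportServer].[dbo].[Subscriptions] su
JOIN [ReportServer].[dbo].[Catalog] c ON c.ItemID = su.Report_OID
WHERE su.LastStatus NOT LIKE 'Mail sent%'
  AND su.LastStatus NOT LIKE 'Done%'
ORDER BY su.LastRunTime DESC;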
Aside from relying on reports, we also have a few jobs that are set to notify the operator via email upon job completion instead of upon job failure. This makes it easy to quickly check if all the major ETL processes have run successfully. It's sort of an early indicator of the health of the system. If we haven't received this email by the time the first team member gets into the office, then we know something is wrong. We also have a series of jobs that will fail with a job error if certain data sources haven't been loaded by a specific time. Before I had someone working an early shift, I used to check my iPhone for the email anytime I woke up in the middle of the night (which happened a lot back then since I had a newborn baby). On the rare occasion that I didn't receive an email indicating everything completed or I received an error regarding a job step, then I logged onto my machine via remote desktop to check the status of the jobs.
I considered having our data center guys check on the status of the servers by running a report each morning around 4am, but in the end I determined this wouldn't be necessary since we have a person who starts work at 6am. The main concern I had about implementing this process is that our ETL changes over time and it would have been necessary for me to maintain documentation on how to check the jobs properly and how to escalate the notifications to my team when a problem was detected. I would be willing to do this if the processes had to run in the middle of the night. However, our ETL runs every hour of the day, so if we had to kick off all our major ETL processes in the early morning, we would still complete loading our data warehouse and publishing reports before anyone made it into the office. Also, our office starts REALLY late for some reason, so people don't normally run our reports interactively until 9am onward.
If you're not looking to do a total custom build-out, you can use https://cronitor.io to monitor ETL jobs.
Current SSRS Job monitoring Process:
Currently there is no SSRS job monitoring process. If an SSRS job fails, a user creates an incident, and the TOPS Reporting and SSRS developer teams then start working on the basis of that incident. As a result, this process has a huge turnaround time to resolve the issue.
Proposed SSRS Job monitoring Process:
An SSRS subscription monitoring job will help the TOPS Reporting and SSRS developer teams proactively monitor the SSRS jobs. This job will produce a report listing the failed reports along with the general execution log status and subscription error log status. A developer can then understand the reason for a report failure from this report and start working proactively to resolve the issue.
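A possible starting point for that failed-report list is the execution log view in the ReportServer database (a sketch; on SSRS 2008 the view is ExecutionLog2 and some column names differ):
-- Sketch: non-successful report executions from the last day
SELECT ItemPath, UserName, RequestType, Status, TimeStart, TimeEnd
FROM [ReportServer].[dbo].[ExecutionLog3]
WHERE Status <> 'rsSuccess'
  AND TimeStart >= DATEADD(day, -1, GETDATE())
ORDER BY TimeStart DESC;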