SMTP Send Adapter: Errors and Suspended Messages Until Host Instance Restart

I've got a BizTalk application that has to send out several hundred, close to a thousand, emails over the course of 3 or 4 hours in the morning. The app will run fine for several days; then it slows way down, and eventually I see all of the outgoing messages just sitting in the 'active' state, not doing anything, with this warning...
The adapter failed to transmit message going to send port "" with URL "". It will be retransmitted after the retry interval specified for this Send Port. Details: "The transport failed to connect to the server."
I don't see any unusual load on the box, no high CPU, disk, or Network utilization.
After I restart the host instance hosting this SMTP send port, they all continue and run fine for a day or two, until I have this issue again.
I've been scratching my head on what may be causing this issue... any ideas?

Possibly look for throttling conditions, especially memory throttling (Throttling State 4) - use Perfmon or SCOM on this counter.
Also, in Task Manager, look at the memory of your BizTalk service hosts and add the Commit Size column (i.e. including virtual memory). It is possible that your orchestrations aren't releasing memory or are too memory-intensive (e.g. remember to Dispose() XLangMessages in custom assemblies).
If you do find Throttling State 4 and are sure that you aren't leaking, you might want to bump the throttling threshold up from 25 to, say, 50 - see here. But IMHO, 100% as suggested in the article sounds dangerous.

Related

Where is this "stalled" time coming from?

This is a screenshot of Chrome's network timing, and I'm trying to figure out the "stalled" time of ~250 ms. Google's docs say the stalled status covers time spent either waiting for a socket (possibly due to the six-connections-per-host limit) or negotiating a proxy. However, in my case this is the only active request (so it's not a connection-limit issue), and there is no other network activity on the computer or even the network. I'm also not using any form of proxy or VPN.
I eventually figured out that this "stalled" time disappears if I change from https to plain http, so at first I figured it was SSL setup time. But if that were the case, why isn't it showing up in the "SSL" section of the timing?
What's causing this "stalled" time, which takes up about 30% of the load time?
Edit
I had my buddy on the opposite coast hit the same page and it's worse for him, which suggests that it could be server-proximity related?
[Screenshots: network timing over HTTPS vs. HTTP]
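One way to see where the TLS negotiation actually lands is the Resource Timing API; here's a small sketch you could run in the page's console (the resource URL is a placeholder, and the bucket names only roughly approximate DevTools' labels):

```typescript
// Break one resource's timeline into DevTools-like buckets using the
// Resource Timing API ("https://example.com/app.js" is a placeholder).
const entry = performance
  .getEntriesByType("resource")
  .find((e) => e.name === "https://example.com/app.js") as
  PerformanceResourceTiming | undefined;

if (entry) {
  console.table({
    // Rough stand-in for DevTools' "stalled" bucket: time before DNS starts.
    stalled: entry.domainLookupStart - entry.fetchStart,
    dns: entry.domainLookupEnd - entry.domainLookupStart,
    // TCP connect includes the TLS handshake...
    connect: entry.connectEnd - entry.connectStart,
    // ...and this is the TLS portion; it is 0 for plain http.
    ssl: entry.secureConnectionStart > 0
      ? entry.connectEnd - entry.secureConnectionStart
      : 0,
    ttfb: entry.responseStart - entry.requestStart,
    download: entry.responseEnd - entry.responseStart,
  });
}
```

Comparing the https and http runs this way should show whether the missing ~250 ms is really handshake time that DevTools is folding into "stalled" rather than "SSL".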

OpenShift idling, rationale - just a service differentiation?

I have found several SO Q&As related to OpenShift's concept of idling an application when there is no inbound traffic for 24 hours. Setting aside the fact that there are hacks to work around it, I was wondering what the actual effect is, since OpenShift claims that the application is brought back to a fully live state when an incoming request arrives. In that case, apart from the fact that the HTTP request that brings the application back from idle to live/running would run a trifle slower, is there any other inconvenience that I am missing here?
In my experience, the first call after idling consistently fails; subsequent calls then work. This is probably due to a timeout, since it takes a while to spin up the gears. It may also depend on how long your application takes to activate, so this could be specific to my kind of application.
However, I switched from the Free plan to Bronze a while ago and have not experienced any problems since.
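If you stay on the free tier, one way to paper over that first failed call is simply to retry it on the client. A minimal sketch, assuming an HTTP endpoint reachable with fetch (the URL, retry count, and delay are made up):

```typescript
// Retry the first request a few times: an idled gear can take longer to
// spin up than the default client timeout allows.
async function fetchWithRetry(
  url: string,
  retries = 3,
  delayMs = 5_000,
): Promise<Response> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const res = await fetch(url);
      if (res.ok) return res;
    } catch {
      // Network-level failure: likely the gear is still waking up.
    }
    if (attempt < retries) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Gave up after ${retries + 1} attempts: ${url}`);
}

// Usage (hypothetical endpoint):
// const data = await fetchWithRetry("https://myapp.example.com/api/status");
```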

Difference between a ping and a heartbeat?

I had never heard of a heartbeat until the Heartbleed bug. I wonder what the difference is between this and a ping, and whether there are other signals for managing the connection (i.e., ones that are not data packets).
Strictly speaking, ping refers to using an ICMP ECHO request to see if the destination computer is reachable. It tests the network, but not whether the target computer is able to usefully respond to any other particular service request (I've seen computers which were ping-able but which were functionally down; the OS kernel — which is what responds to pings — was up, but all user processes were dead).
However, the term has been extended to cover any sort of client-initiated check of whether the other end is up, often done inside the protocol of interest so that you can find out whether the target machine is able to do useful work.
With heartbeats, I've typically thought of them as being where the service regularly pushes the notification to somewhere else (as opposed to being prompted by a client). The idea is that the heartbeat monitor detects if it hasn't had a heartbeat signal for a while and applies “emergency CPR” (i.e., restarts the service) if that happens. It's similar to a watchdog timer in hardware.
I view a ping and a heartbeat as being complementary: one is for the client to learn whether the service is up, and the other is for the service provider to learn whether the service is up. (The provider could use a ping, and probably does via their Nagios setup, but a heartbeat monitors something slightly different — internal timers, in particular — and is pretty cheap to implement so there's no reason not to use one.)
Ironically, the Heartbleed bug was in what I'd consider to be a ping-ing mechanism. But it's called that because it's based on a (mis-)implementation of the SSL Heartbeat Extension. Terminology's all too often just there to be abused…
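To make the push-style heartbeat described above concrete, here is a minimal sketch in TypeScript (all names are made up, and the restart hook is just a placeholder for whatever recovery action you would actually take): the service beats on its own timer, and the monitor "applies CPR" when the beats stop.

```typescript
// Push-style heartbeat: the *service* periodically announces "I'm alive",
// and a separate monitor reacts when the announcements stop.

class HeartbeatMonitor {
  private lastBeat = Date.now();

  constructor(
    private readonly timeoutMs: number,
    private readonly onDead: () => void, // e.g. restart the service
  ) {
    // Check twice per timeout window whether a beat has been missed.
    setInterval(() => {
      if (Date.now() - this.lastBeat > this.timeoutMs) {
        this.onDead();
        this.lastBeat = Date.now(); // avoid restarting in a tight loop
      }
    }, this.timeoutMs / 2);
  }

  // Called by the service itself, not prompted by a client.
  beat(): void {
    this.lastBeat = Date.now();
  }
}

// Hypothetical wiring: the service beats every second; the monitor
// restarts it if it hears nothing for five seconds.
const monitor = new HeartbeatMonitor(5_000, () => {
  console.log("no heartbeat for 5s, restarting service...");
  // restartService();  // assumed recovery hook
});

setInterval(() => monitor.beat(), 1_000); // the service's own timer
```

A ping would be the inverse: the client actively asks the service "are you there?" and draws its own conclusion from the reply (or lack of one).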

Chromium: is communicating with the page faster than communicating with a worker?

Suppose I've got the following parts in my system: Storage (S) and a number of Clients (C). The clients are separate Web Workers and I'm actually trying to emulate something like shared memory for them.
Right now I've got just one Client and it's communicating with the Storage pretty intensively. For the sake of testing it is spinning in a for-loop, requesting some information from the Storage and processing it (processing is really cheap).
It turns out that this is slow. I checked the process list and noticed chrome --type=renderer eating lots of CPU, so I thought it might be redrawing the page or doing some kind of DOM processing after each message, since the Storage is running in the page context. OK, so I decided to try moving the Storage to a separate Worker so that the page is totally idle, and… ended up with even worse performance, exactly twice as slow (I've tried a Shared Worker and a Dedicated Worker with explicit MessageChannels, with the same results).
So, here is my question: why is sending a message from one Worker to another exactly twice as slow as sending a message from a Worker to the page? Are they routing messages through the page? Is it by design, or a bug? I was going to check the source code, but I'm afraid it's a bit too complex, and probably someone is already familiar with this part of Chromium's internals…
P.S. I'm testing in Chrome 27.0.1453.93 on Linux and Chrome 28.0.1500.20 on Windows.
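For anyone wanting to reproduce the comparison, here is a rough benchmark sketch. The two worker scripts ("storage-worker.js" and "client-worker.js") are assumptions: the storage worker simply echoes every message back on the port it arrived on, and the client worker drives the ping-pong over the transferred port and reports the elapsed time to the page.

```typescript
const ROUNDS = 10_000;

// Case 1: page <-> worker ping-pong.
function benchPageToWorker(): Promise<number> {
  return new Promise((resolve) => {
    const worker = new Worker("storage-worker.js"); // assumed echo worker
    let remaining = ROUNDS;
    const start = performance.now();
    worker.onmessage = () => {
      if (--remaining === 0) {
        worker.terminate();
        resolve(performance.now() - start);
      } else {
        worker.postMessage(0);
      }
    };
    worker.postMessage(0);
  });
}

// Case 2: worker <-> worker ping-pong over an explicit MessageChannel.
// Each worker gets one end of the channel; the client worker is assumed to
// run the same ROUNDS of ping-pong and post the elapsed ms back to the page.
function benchWorkerToWorker(): Promise<number> {
  return new Promise((resolve) => {
    const storage = new Worker("storage-worker.js");
    const client = new Worker("client-worker.js");
    const channel = new MessageChannel();
    storage.postMessage({ port: channel.port1 }, [channel.port1]);
    client.postMessage({ port: channel.port2, rounds: ROUNDS }, [channel.port2]);
    client.onmessage = (e) => {
      storage.terminate();
      client.terminate();
      resolve(e.data as number); // elapsed ms reported by the client worker
    };
  });
}

benchPageToWorker()
  .then((ms) => console.log(`page <-> worker: ${ms.toFixed(1)} ms`))
  .then(benchWorkerToWorker)
  .then((ms) => console.log(`worker <-> worker: ${ms.toFixed(1)} ms`));
```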

How can I run teardown code when the Flash VM closes?

Is there a way to register code to be run when Flash is about to close (e.g., when the user closes the browser or when DOM manipulation causes the embedded player to be removed)?
In particular, I'd like for my application to send a closing packet to a remote service so the user's peers know that the user has no chance of coming back without having to wait for a timeout. I'm using URLLoader and URLRequest to maintain a BOSH connection, so I welcome solutions applicable to this specific case. However, if there are NetConnection-specific solutions, I'm sure I can learn from them too.
I'm happy to accept that this callback won't be run on a kill -9, but it would be nice to have the more graceful exit paths allow for some code execution.
It seems like the better solution would be to handle this on the server side, no? The server should be able to detect the disconnection, at which point you could invalidate the session.
However, you could go with a client/socket-based solution, albeit with much more overhead. Using FMS or some other real-time (RTMP) server, you could dispatch events to your web server when a connection has dropped (though you might have issues in the case of low network connectivity or an internet drop). I would suggest going against this, though, as in my experience FMS sucks :)
Is setting extremely low timeouts not a possibility (i.e. < 10 seconds)?
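To illustrate the low-timeout, server-side approach suggested above, here is a rough sketch in TypeScript; the session IDs, the notifyPeers hook, and the timeout values are all made up, and nothing here is BOSH-specific. Each incoming request refreshes a "last seen" timestamp, and a periodic sweep treats silent sessions as gone so peers hear about it quickly instead of waiting for a long protocol timeout.

```typescript
// Server-side presence with a short timeout: each poll/packet from a client
// refreshes its "last seen" time; a periodic sweep marks silent clients as
// disconnected. notifyPeers() is an assumed hook into your own messaging.

const TIMEOUT_MS = 10_000;   // the "extremely low" timeout from the answer
const SWEEP_MS = 2_000;

const lastSeen = new Map<string, number>();

// Call this from whatever handles each incoming BOSH request / packet.
export function touchSession(sessionId: string): void {
  lastSeen.set(sessionId, Date.now());
}

function notifyPeers(sessionId: string): void {
  // Assumed: push a "user left" event to the user's peers.
  console.log(`session ${sessionId} timed out, notifying peers`);
}

setInterval(() => {
  const now = Date.now();
  for (const [sessionId, seen] of lastSeen) {
    if (now - seen > TIMEOUT_MS) {
      lastSeen.delete(sessionId);
      notifyPeers(sessionId);
    }
  }
}, SWEEP_MS);
```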