postfix: timing of client responses in a milter and in after-queue processing?

I'm currently using postfix-2.11.3, and I am doing a lot of message processing through a milter. This processing takes place before the client is notified that the message is accepted, and it sometimes involves enough work that it delays the client's receipt of the initial SMTP 250 2.0.0 Ok: queued as xxxxxxxxxxx message.
During large email blasts to my server, this milter processing can cause a backlog, and in some cases, the client connections time out while waiting for that initial 250 ... message.
My question is this: if I rewrite my milter as a postfix after-queue filter with no before-queue processing, will clients indeed get the initial 250 messages right away, with perhaps subsequent SMTP messages coming later? Or will the 250 message still be deferred until after postfix completes the after-queue filtering?
And is it possible for the client to receive the initial 250 and then a later 4xx or 5xx response for that same message, in case the after-queue filter subsequently decides to reject it?
I know I could test this by writing an after-queue filter. However, my email server is busy, and I don't have a test server available, and so I'd like to know in advance whether an after-queue filter can behave in this manner.
Thank you for any wisdom you could share about this.

I managed to set up a postfix instance on a test machine, and I was able to install a dummy after-queue filter. This allowed me to figure out the answer to my question. It turns out that postfix indeed sends the 250 2.0.0 Ok: queued as xxxxxxxxxxx message before the after-queue filter completes.
This means that I can indeed move my slower milter processing to the after-queue filter in order to give senders a quicker SMTP response.
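For reference, the plumbing of a simple after-queue filter follows the shape of the example in Postfix's FILTER_README; the service name, the script path, and the "dummy" next-hop below are placeholders:

# master.cf: run the filter as a pipe(8) service
filter    unix  -       n       n       -       10      pipe
  flags=Rq user=filter null_sender=
  argv=/usr/local/bin/filter.sh -f ${sender} -- ${recipient}

# master.cf: hand every message accepted over SMTP to that service
smtp      inet  n       -       n       -       -       smtpd
  -o content_filter=filter:dummy

Because the filter only runs after the message is safely in the incoming queue, smtpd answers 250 without waiting for it.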

Related

Should a message queue server be facing the Internet directly or not?

I have the following use case:
message size: ~4kb
protocol type: considering MQTT
message queue server: considering RabbitMQ or Mosquitto
arrival rate: up to 50k msg/s
each message is sent from a mobile client with varying network connectivity
What I would like to know is: what is the better way for the system to ingest the messages?
A) expose the message queue server directly to the Internet and process the messages later for consistency/validity (of course with a load balancer in front of the servers)
B) expose a server that can read the message in the native format, apply some basic validity checks and then queue the message to an internal message queue server
I'm leaning towards the second option, but I have no real arguments for its pros and cons versus the first option, so can you please advise on this one?
Thank you.
Your question has two parts:
Whether or not to expose the message queue server to the internet
Whether or not to process the message immediately
For the first question, I would advise putting the server behind a firewall. That way, you have more tools to protect your server against attacks from the internet.
For the second question, it depends on whether the server must inform the mobile client about the result of processing its message, and whether that result must be known immediately:
If you don't need to send feedback to the mobile client and the message doesn't need to be processed immediately, I would advise logging the message and then processing it later in batch mode;
If you need to send feedback to the mobile client but the message doesn't need to be processed immediately, I would advise running a sanity check on the message, sending the feedback to the mobile client, and then logging the message for batch processing;
Otherwise, I would advise running the sanity check, processing the message, and sending the feedback to the mobile client.
In this advice, I have suggested batch mode over online mode as much as possible: when you operate in batch mode, you have more options to use your computing resources efficiently in a simple way.
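For the shape of option B, here is a minimal sketch of a validating front-end in Python, assuming a RabbitMQ broker on localhost reached via the pika client; the queue name and the validity checks are placeholders:

import json
import pika  # RabbitMQ client library; assumes a broker on localhost

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="ingest", durable=True)  # hypothetical queue name

def handle_incoming(raw):
    """Validate one ~4kb message from a mobile client, then enqueue it."""
    if len(raw) > 4096:  # basic sanity check before anything is queued
        return False
    try:
        json.loads(raw)  # e.g. require well-formed JSON
    except ValueError:
        return False
    channel.basic_publish(exchange="", routing_key="ingest", body=raw)
    return True  # immediate feedback for the mobile client

The actual processing of queued messages can then run later, in batch mode, on the internal side.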

Sending email with Flask errors with SMTPHandler

I saw in the documentation an extremely easy way to send emails on Flask errors. My question is whether this will considerably affect the performance of the app. As in, is the process running my app actually sending the email?
My current hunch is that because SMTP is a server running on another process, it will enqueue the email properly and send it when it can, meaning it won't affect the performance of the app.
Well, SMTPHandler inherits from logging.Handler. Looking at logging.Handler, while it does several things to handle being called from multiple threads, it doesn't do anything to spawn multiple threads. Logging calls happen on the thread they are called from. So, if I am reading the code correctly, a logging call will block the thread it is running on until it completes (which means that if your SMTP server takes 30 seconds to respond, your erroring thread will take time_to_error + 30 seconds + time_to_send + time_to_respond_to_request_with_500).
That said, I could be misreading the code. However, you'd be better off using SysLogHandler and letting syslog handle sending you messages out of band.
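A minimal sketch of that alternative, assuming a Linux host where syslog listens on /dev/log (the logger name is a placeholder):

import logging
from logging.handlers import SysLogHandler

# SysLogHandler writes to the local syslog socket, which is a fast,
# local operation; syslog then forwards alerts out of band.
handler = SysLogHandler(address="/dev/log")
handler.setLevel(logging.ERROR)

logger = logging.getLogger("myapp")  # hypothetical logger name
logger.addHandler(handler)
logger.error("this call returns quickly; delivery happens elsewhere")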

Ejabberd Message Acknowledgment from Server

I have set up an ejabberd server for my little mobile chat app, and have implemented XEP-0184 for message delivery status as well.
But I have an issue: how can my app know whether a message has indeed reached the ejabberd server?
My scenario: I walk into an area with a weak signal, barely strong enough to keep the connection alive, with frequent timeouts. If I send a message out, how can I confirm that it reached the server?
Hope I am clear enough on my question. Thanks in advance!
I wrote an ejabberd mod for this which you can find at:
https://github.com/kmtsar/ejabberd-mods
A possible approach would be to implement XEP-0198 Stream Management. Stream management is a standard feature in recent ejabberd versions.
With that in place, a client can ask the server to keep a count of received stanzas and, when interested, ask the server to confirm that count.
The client can then tell whether one or more stanzas were received or not.
This can be done for every single stanza: the client requests an ack for the last sent stanza and expects an ACK from the server.
In theory you could implement just the "Basic Ack Scenarios" - no need for the full XEP (which includes stream resumption).
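For reference, the basic ack exchange is tiny; the stanzas below are paraphrased from XEP-0198, with an illustrative counter value:

<!-- client enables stream management after authenticating -->
<enable xmlns='urn:xmpp:sm:3'/>
<enabled xmlns='urn:xmpp:sm:3'/>

<!-- client sends a message, then asks the server for an ack -->
<message to='peer@example.com'><body>hello</body></message>
<r xmlns='urn:xmpp:sm:3'/>

<!-- server confirms it has handled 1 stanza on this stream -->
<a xmlns='urn:xmpp:sm:3' h='1'/>

If the <a/> comes back with the expected count, the message reached the server even if the TCP connection dies a moment later.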

How do internet streaming data feeds work?

This can be any type of data feed, let's just say for this example stock market data since that's a popular one. But I'm talking about real time data feeds. So it continuously sends data.
I'm trying to understand how this is implemented. Does this happen in some way over HTTP? I just don't get how the clients hook up to the server and how the server sends continuous data. Can anyone clue me in on how this works technically? How might a client hook up to the server, in C# or Java or something? Does this happen over HTTP, or maybe some other way? Please go into details.
Thanks
It's not really any different from normal HTTP traffic, just longer.
What happens when you open a website? (very rough overview)
Your computer contacts a server, establishes a connection.
Server starts sending data split into packets to you.
Your computer receives the packets, possibly out-of-order and some with significant delay, and re-assembles them into a website.
Your computer sends an acknowledgment for received packets.
The server will retransmit any packets it hasn't received an acknowledgment for within a significant amount of time, assuming they were dropped on the way.
In between receiving packets and/or acknowledgments, both wait.
When all data necessary for displaying the site is transferred, your computer thanks the server for its time and disconnects.
If at any point in this process either party stops responding for a significant amount of time, either side may drop the connection, assuming technical difficulties.
What's happening with "streaming data feeds"? (even rougher overview)
A client contacts a server, establishing a connection.
The server starts sending data split into packets to the client.
The client receives the packets, possibly out-of-order and some with significant delay, and re-assembles them.
The client sends an acknowledgment for received packets.
The server will retransmit any packets it hasn't received an acknowledgment for within a significant amount of time, assuming they were dropped on the way.
In between receiving packets and/or acknowledgments, both wait.
The only differences are that the client doesn't hang up on the server, because it's still expecting data, and that both sides aren't as quick to drop the connection.
The thing is that web servers (for web sites) are specialized for the task of delivering small snippets of data to many people, so they're quick to hang up on you once all data has been sent. Your server script can simply not quit though, and the connection will stay alive. Here's a tiny PHP script that will demonstrate that:
while (true) {
    echo '.';
    @ob_flush();  // flush PHP's own output buffer, if any
    flush();      // ask the web server to send the output now
    sleep(1);
}
This will send a new . every second, indefinitely (note that the web server still needs to be configured not to terminate the script and not to buffer its output).
Try the Wikipedia article about TCP/IP for the basics and this article about long-polling/HTTP streaming for concrete examples.
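On the consuming side, the client just keeps reading the same response. A minimal sketch in Python, assuming the requests package and a placeholder URL for a streaming endpoint like the script above:

import requests

# One long-lived HTTP request; the body is consumed incrementally
# as the server flushes more data.
response = requests.get("http://example.com/feed", stream=True)
for chunk in response.iter_content(chunk_size=None):
    # each chunk arrives as the server produces it; the loop only
    # ends if the server closes the connection
    print(chunk.decode("utf-8", errors="replace"), end="", flush=True)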

Socket throttling because client not reading data fast enough?

I have a client/server connection over a TCP socket, with the server writing to the client as fast as it can.
Looking over my network activity, the production client receives data at around 2.5 Mb/s.
A new lightweight client that I wrote just to read and benchmark the rate gets about 5.0 Mb/s (which is probably around the max speed the server can transmit).
I was wondering what governs the rates here, since the client sends no data to the server to tell it about any rate limits.
In TCP it is the client. If the server's TCP window is full, it needs to wait until more ACKs arrive from the client. This is hidden from you inside the TCP stack, but TCP provides guaranteed delivery, which also means that the server can't send data faster than the rate at which the client is processing it.
TCP has flow control and it happens automatically. Read about it at http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Flow_control
When the pipe fills due to flow control, the server's socket write operations won't complete until the flow control is relieved.
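You can watch this happen with a tiny sketch in Python (the loopback address, buffer sizes, and timeout are arbitrary): a receiver that accepts a connection but never reads, and a sender whose send() eventually stalls.

import socket
import threading
import time

server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

def receiver():
    conn, _ = server.accept()
    time.sleep(30)  # accept, but deliberately never read

threading.Thread(target=receiver, daemon=True).start()

client = socket.socket()
client.connect(("127.0.0.1", port))
client.settimeout(2.0)  # so the stall shows up as a timeout

sent = 0
try:
    while True:
        client.send(b"x" * 65536)  # fills the client's send buffer
        sent += 65536              # and the receiver's receive buffer
except socket.timeout:
    print("send() stalled after", sent, "bytes buffered")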
The server is writing data at 5.0 Mb/s, but if your client is the bottleneck, the server has to wait until the data in its send buffer is completely sent to the client, or until enough space is released to put in more data.
Since, as you said, the lightweight client was able to receive at 5.0 Mb/s, it is the post-receive operations in your production client that you have to check. If you are receiving data and then processing it before you read more data, then this might be the bottleneck.
It is better to receive data asynchronously: as soon as one receive is complete, ask the client socket to start receiving data again, while you process the received data in a separate thread pool thread. This way your client is always available to receive incoming data, and the server can send at full speed.