zeromq push/pull architecture -high water mark - message-queue

Question about zeromq push-pull architecture:
how zeromq buffers messages both at sender and receiver end.
what will happen if in the following cases:
1.The main memory of system goes low before reaching the high watermark limit.
if high watermark limit is reached (e.g in 1000 messages)
then what will happen? will it maintain a queue for the blocked messages as well ?
3.what if zeromq enters the exceptional state and did not get back for hours?
Thanks

Related

Azure service bus queue design

Each user in my application has gmail account. the application needs to be in sync with incoming emails. for each user every 1 minute the application should ask gmail servers if there is something new. 99% of the time nothing is new.
From what I know gmail dosen't provide web-hooks
In order to reduce the load from my servers and specially from the DB I want to use the service bus queue in the following manner.
queue properties:
query method: PEEK_AND_LOCK
lock time : 1 minute
max delivery count: X
flow:
queue listener receiving message A from the queue and process it.
if nothing is new the listener will not delete the message from the queue
the message will be delivered again after lock time (1 minute)
basically instead of sending new message to the queue again and again to be processed we just leave them in the queue and relying on the re-delivery mechanism.
we are expecting many users in the near future (100,000-500,000) which means many messages in the queue in a given moment which needs to be processed each minute.
lets assume the messages size is very small and less the 5GB all together
I am assuming that the re delivery mechanism is used mainly for error handling and I wonder whether our design is reasonable and the queue is apt for that task? or if there are any other suggestions to achieve our goal.
Thanks
You are trying to use the Service Bus Queue as a scheduler which it not really is. As a result SB Queue will be main bottleneck. With your architecture and expected number of users you will find yourself quickly blocked by limitations of the Service Bus queue. For example you have only max 100 concurrent connections, which means only 100 listeners (assuming long-pooling method).
Another issue might be max delivery count property of SB Queue. Even if you set it to int.MaxValue now, there is no guarantee that Azure Team will not limit it in the future.
Alternative solution might be that you implement your own scheduler worker role (using already existing popular tools, like Quartz.NET for example). Then you can experiment - you can host N jobs (which actually do Gmail api requests) in one worker role (each job runs every X minute(s)), and each job may handle M number of users concurently. Worker role could be easily scaled if number of users increases. Numbers N and M can depend on the worker role configuration and can be determined on practice. If applicable, just to save some resources, X can be made variable, for example, based on the time of the day (maybe you don't need to check emails so often at night). Hope it helps.

Different Retry policies for different messages in MSMQ

We have a requirement for an API, which allows asynchronous updates via a MSMQ message queue, that I'm putting together which will allow the developer consuming the API to specify different retry policies per message. So a high priority client system, e.g. for sales will submit all messages with 5 delivery attempts (retries) and 15 minutes between each attempt, whereas a low priority client system, e.g. back-end mail shot system will allow users to update their marketing preferences, submitting messages with 3 retries and an hour between each attempt.
Is there a way in the System.Messaging MSMQ (version 3 or 4) implementation to specify number of retries, retry delay and things like whether messages are sent to a dead letter queue or just deleted? (and if so, how?)
I would be open to using other messaging frameworks if they fulfilled this requirement.
Is there a way in the System.Messaging MSMQ (version 3 or 4) implementation to specify number of retries
Depending on which operating system/msmq version you're using, specifying retry semantics is highly sophisticated in WCF. The following is for Windows 2008 and MSMQ4 using a transactional queue.
The main setting on the binding is called MaxRetryCycles. One retry cycle is an attempt to successfully read a message from a queue and process it inside the handling method. This "attempt" can actually be made up of multiple attempts, as defined by the msmq binding property ReceiveRetryCount. ReceiveRetryCount is the number of times an application will try to read the message and process it before rolling back the de-queue transaction. This marks the end of one retry cycle.
You can also introduce a delay in between cycles with the RetryCycleDelay property.
A more complicated consideration is what to do with the messages which fail even after multiple retry cycles.
allow the developer consuming the API to specify different retry policies per message
I am not sure how you could do this with MSMQ - as far as I'm aware it's only possible to set retry semantics on a per-endpoint basis. If you're using transactions then you can't even allow API users to set the priority of the messages being sent (transactional queues guarantee delivery in order).
The only thing you could do is host a another instance of your API as high-priority and one for low priority. These could be hosted on different environments, and this has the added benefit that low priority messages won't be competing for system resources with high priority messages.
Hope this helps.

how Message Queue System Works?

I have studied Message Queues System in my class but I still don't get it how these Message Queues System work in real time scenarios? Is there any tutorial which can help me to get the complete picture?
Can someone explain me how these systems work?
An example: My thread or process can send a message to your message queue, and having sent it, my code goes on to do something else. Your code, when it gets around to it, reads the next message from the message queue, and then decides what to do about that message. Message queues avoid needing to have a critical section or mutex shared between the two threads, or processes. The underlying message queue layer itself takes care of making sure that messages get into the queue without race conditions affecting the integrity of the queue.
Message queues can be used for both one-way and two-way, asynchronous messaging. For one-way use, my thread can use it to keep your thread appraised of key events in my thread, without acknowledgement back from your thread. For two-way use, after my thread sends a message to your thread, your thread may need to send data back to my thread via my message queue.
The message queue layer uses lower level synchronization schemes to insure that no two writers to the queue can write at the same time. It insures that all writes to the queue are atomic. It also insures that a reader of the queue cannot read a partially written message from the queue.
Most message queue APIs also offer support for reading messages from the queue based on a filter that you designate. Say for instance that you consider messages from a time critical thread to be more important that other messages. You can each time you check your queue for messages, first check for messages from the critical thread, and service those messages first. Your thread would then go onto to process the rest of the messages as normal, provided no more messages from the critical thread are found.
A C tutorial of the UNIX message queues
That's a complex topic but to put it simply:
Message Queues are one of the best ways, if not the best, to
implement distributed systems.
Now you might ask, what is a distributed system? It is an integrated system that spans multiple machines, clients or nodes which execute their tasks in parallel in a non-disruptive way. A distributed system should be robust enough to continue to operate when one or more nodes fail, stop working, lag or are taken down for maintenance.
Then you might ask, what is a message queue? It is a message-oriented middleware that enables the development of a distributed system by using asynchronous messages for inter-node communication through the network.
And finally you might ask, what is all that good for? This is good for implementing applications with a lot of moving parts called nodes which needs real-time monitoring and real-time reaction capabilities. To summarize they provide: parallelism (nodes can truly run in parallel), tight integration (all nodes see the same messages in the same order), decoupling (nodes can evolve independently), failover/redundancy (when a node fails, another one can be running and building state to take over immediately), scalability/load balancing (just add more nodes), elasticity (nodes can lag during activity peaks without affecting the system as a whole) and resiliency (nodes can fail / stop working without taking the whole system down).
Check this article which discusses a message queue infrastructure in detail.

Performance comparison between ZeroMQ, RabbitMQ and Apache Qpid

I need a high performance message bus for my application so I am evaluating performance of ZeroMQ, RabbitMQ and Apache Qpid. To measure the performance, I am running a test program that publishes say 10,000 messages using one of the message queue implementations and running another process in the same machine to consume these 10,000 messages. Then I record time difference between the first message published and the last message received.
Following are the settings I used for the comparison.
RabbitMQ: I used a "fanout" type exchange and a queue with default configuration. I used the RabbitMQ C client library.
ZeroMQ: My publisher publises to tcp://localhost:port1 with ZMQ_PUSH socket, My broker listens on tcp://localhost:port1 and resends the message to tcp://localhost:port2 and my consumer listens on tcp://localhost:port2 using ZMQ_PULL socket. I am using a broker instead of peer to to peer communication in ZeroMQ to to make the performance comparison fair to other message queue implementation that uses brokers.
Qpid C++ message broker: I used a "fanout" type exchange and a queue with default configuration. I used the Qpid C++ client library.
Following is the performance result:
RabbitMQ: it takes about 1 second to receive 10,000 messages.
ZeroMQ: It takes about 15 milli seconds to receive 10,000 messages.
Qpid: It takes about 4 seconds to receive 10,000 messages.
Questions:
Have anyone run similar performance comparison between the message queues? Then I like to compare my results with yours.
Is there any way I could tune RabbitMQ or Qpid to make it performance better?
Note:
The tests were done on a virtual machine with two allocated processor. The result may vary for different hardware, however I am mainly interested in relative performance of the MQ products.
RabbitMQ is probably doing persistence on those messages. I think you need to set the message priority or another option in messages to not do persistence. Performance will improve 10x then. You should expect at least 100K messages/second through an AMQP broker. In OpenAMQ we got performance up to 300K messages/second.
AMQP was designed for speed (e.g. it does not unpack messages in order to route them) but ZeroMQ is simply better designed in key ways. E.g. it removes a hop by connecting nodes without a broker; it does better asynchronous I/O than any of the AMQP client stacks; it does more aggressive message batching. Perhaps 60% of the time spent building ZeroMQ went into performance tuning. It was very hard work. It's not faster by accident.
One thing I'd like to do, but am too busy, is to recreate an AMQP-like broker on top of ZeroMQ. There is a first layer here: http://rfc.zeromq.org/spec:15. The whole stack would work somewhat like RestMS, with transport and semantics separated into two layers. It would provide much the same functionality as AMQP/0.9.1 (and be semantically interoperable) but significantly faster.
Hmm, of course ZeroMQ will be faster, it is designed to be and does not have a lot of the broker based functionality that the other two provide. The ZeroMQ site has a wonderful comparison of broker vs brokerless messaging and drawbacks & advantages of both.
RabbitMQ Blog:
RabbitMQ and 0MQ are focusing on different aspects of messaging. 0MQ puts much more focus on how the messages are transferred over the wire. RabbitMQ, on the other hand, focuses on how messages are stored, filtered and monitored.
(I also like the above RabbitMQ post above as it also talks about using ZeroMQ with RabbitMQ)
So, what I'm trying to say is that you should decide on the tech that best fits your requirements. If the only requirement is speed, ZeroMQ. But if you need other aspects such as persistence of messages, filtering, monitoring, failover, etc well, then that's when u need to start considering RabbitMQ & Qpid.
I am using a broker instead of peer to to peer communication in ZeroMQ to to make the performance comparison fair to other message queue implementation that uses brokers.
Not sure why you want to do that -- if the only thing you care about is performance, there is no need to make the playing field level. If you don't care about persistence, filtering, etc. then why pay the price?
I'm also very leery of running benchmarks on VM's -- there are a lot of extra layers that can affect the results in ways that are not obvious. (Unless you're planning to run the real system on VM's, of course, in which case that is a very valid method).
I've tested c++/qpid
I sent 50000 messages per second between two diferent machines for a long time with no queuing.
I didn't use a fanout, just a simple exchange (non persistent messages)
Are you using persistent messages?
Are you parsing the messages?
I suppose not, since 0MQ doesn't have message structs.
If the broker is mainly idle, you probably haven't configured the prefetch on sender and receptor. This is very important to send many messages.
We have compared RabbitMQ with our SocketPro (http://www.udaparts.com/) persistent message queue at the site http://www.udaparts.com/document/articles/fastsocketpro.htm with all source codes. Here are results we obtained for RabbitMQ:
Same machine enqueue and dequeue:
"Hello world" --
Enqueue: 30000 messages per second;
Dequeue: 7000 messages per second.
Text with 1024 bytes --
Enqueue: 11000 messages per second;
Dequeue: 7000 messages per second.
Text with 10 * 1024 bytes --
Enqueue: 4000 messages per second;
Dequeue: 4000 messages per second.
Cross-machine enqueue and dequeue with network bandwidth 100 mbps:
"Hello world" --
Enqueue: 28000 messages per second;
Dequeue: 1900 messages per second.
Text with 1024 bytes --
Enqueue: 8000 messages per second;
Dequeue: 1000 messages per second.
Text with 10 * 1024 bytes --
Enqueue: 800 messages per second;
Dequeue: 700 messages per second.
Try to configure prefetch on sender and receptor with a value like 100. Prefetching just sender is not enough
We've developed an open source message bus built on top of ZeroMQ - we initially did this to replace Qpid. It's brokerless so it's not a totally fair comparison but it provides the same functionality as brokered solutions.
Our headline performance figure is 140K msgs per second between two machines but you can see more detail here: https://github.com/Abc-Arbitrage/Zebus/wiki/Performance

Detecting slow readers with zmq(zeromq)

I'm trying to replace a small homegrown messaging system, and are playing around a bit with zmq .
I'll be needing to detect slow readers, and boot/disconnect them - slow readers pretty much meaning a particular consumer whos queue size is above a certain threshold.
So far it seems zmq blocks every consumer if one of them is a bit slow (fair enough) - but
I can't find any way to detect a potential slow consumer. Anyone have any experience with
wether and how this is possible with zmq - or have any other broker-less messaging system to recccommend ?
As of zeromq-2.0.7, you can set the ZMQ_HWM option on a ZMQ_PUB socket to control the maximum number of messages that can be queued for a subscriber. Once the high-water mark has been reached, all further messages destined for that subscriber will be dropped until the queue size drops back below the high-water mark. This limits the amount of memory dedicated to what you call a slow reader.
However, because the ZeroMQ library exposes sockets, not clients, there is no way for you to identify and forcibly disconnect unwanted clients without modifying the library itself.
There is a section in the ZeroMq Guide regarding this, it suggests implementing a pattern the call the "Suicidal Snail Pattern".
Basically, it reverses the dependency and tries to convince slow subscribers to disconnect/kill themselves by giving them a way to detect if they have become slow readers.