Set Expiry of GoogleCloudPubsub Message - publish

Is there a way to set an expiry on a Google Pub/Sub message that I publish (e.g. one that fails with a bad request), so that it is not retried indefinitely on failure?
I cannot configure this with the retry policy, because I want messages that fail with valid errors to keep retrying indefinitely.

There is only a global, seven-day expiration. However, you can add a timestamp as a message attribute, check it at the beginning of your pipeline, and throw the message away if it is too old.
That said, we don't recommend using the timestamp alone to decide whether to throw a message away: if you have a big backlog and the consumer cannot catch up, valid messages may be discarded even though they are being processed for the first time.
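For illustration, here is a minimal sketch of the timestamp-attribute approach with the Python Pub/Sub client; the project, topic and subscription names, the one-hour cutoff, and the process() step are placeholders.

    import time
    from google.cloud import pubsub_v1

    MAX_AGE_SECONDS = 3600  # example cutoff: discard anything older than an hour

    # Publisher side: stamp each message with its publish time as an attribute.
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("my-project", "my-topic")
    publisher.publish(topic_path, b"payload", published_at=str(time.time()))

    # Subscriber side: drop messages whose attribute timestamp is too old.
    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path("my-project", "my-subscription")

    def callback(message):
        published_at = float(message.attributes.get("published_at", "0"))
        if time.time() - published_at > MAX_AGE_SECONDS:
            message.ack()    # ack the expired message so it is never redelivered
            return
        process(message)     # placeholder for your real pipeline step
        message.ack()

    future = subscriber.subscribe(subscription_path, callback=callback)
    future.result()          # block and keep pulling messages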
Here is another idea. When you publish a message, you get its message ID in the API response, which you can later use to identify individual messages. In your pipeline, you can increment a retry count per message ID, so that you know how many times a particular message has been retried, and then throw away messages that have been retried more than N times. This strategy is more reliable in my opinion. The retry counts are not critical data, so you may be able to keep them just in memory.
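A rough sketch of the retry-counter idea, keeping the counts in a plain in-process dict; the limit of 5 retries and the process() step are placeholders, and the subscriber setup is the same as in the previous sketch.

    from collections import defaultdict

    MAX_RETRIES = 5
    retry_counts = defaultdict(int)   # message_id -> times this message was seen

    def callback(message):
        retry_counts[message.message_id] += 1
        if retry_counts[message.message_id] > MAX_RETRIES:
            message.ack()                       # give up: ack so it stops being redelivered
            retry_counts.pop(message.message_id, None)
            return
        try:
            process(message)                    # placeholder for your pipeline step
            message.ack()
        except Exception:
            message.nack()                      # transient failure: let Pub/Sub redeliver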

Related

Artemis receives duplicate messages after failover [duplicate]

In order to test communication performance in the event of a failure, I numbered each message and sent them continuously, at about 30 messages per second. I found that even with the HA policy set, consumers repeatedly receive a small number of already-received messages after failover/failback. Is this normal?
I know that Artemis provides automatic duplicate-message detection by giving each message a unique value, which can prevent a message from being sent twice, but the duplicates I receive have different "client ack messageID" values. Does this mean it cannot prevent receiving duplicate messages?
Depending on how you've written your client you can get duplicates on failover because some message acknowledgements may get lost when the failure happens. For example, if you receive a message from the broker and process it but then the broker fails before you send the acknowledgement (or fails while the acknowledgement is in transit) then the backup will still have the message you received already and will dispatch it again.
If you don't want duplicates to be a problem for your client then you have a couple of options:
Use a transaction on your client and don't commit until the acknowledgement has been confirmed successfully. If the acknowledgement fails, roll back the transaction.
Make sure your consumer is idempotent so duplicates don't really matter.
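As a minimal sketch of the second option, here is an idempotent consumer in Python that deduplicates on an application-level message ID stored in a Redis set; the key name, the handle() signature and the process() step are illustrative assumptions, not part of the Artemis API.

    import redis

    r = redis.Redis()

    def handle(message_id: str, body: bytes) -> None:
        # SADD returns 0 if the ID was already in the set, i.e. we saw this message before.
        if r.sadd("processed-message-ids", message_id) == 0:
            return            # duplicate delivery; safe to drop
        process(body)         # placeholder for your real processing step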

Azure Service Bus: How to keep messages sent from the sender in FIFO order

I have read a few questions on Stack Overflow. They say we can enable session support on the queue to keep messages FIFO. Some mention that ordering cannot be guaranteed, and that to make sure messages are processed in order we have to handle it manually during processing, using the timestamp.
Is that true?
Azure Service Bus queues themselves follow FIFO. In some cases, the processing of the messages may not be sequential. If you are sure that the size of the payload will be consistent, then you can go with normal queues, which will process the messages in order (this works for me).
If the payload size varies between messages, it is preferable to go with session-enabled queues, as Sean Feldman mentioned in his answer.
To send/receive messages in FIFO mode, you need to enable "Require Sessions" on the queue and use message sessions to send/receive messages. The timestamp doesn't matter; what matters is the session.
Upon sending, set the message's SessionId.
Upon receiving, either accept any session using MessageReceiver, or a specific session using the lower-level API (SessionClient) and specifying the session ID.
A good start would be to read the documentation and have a look at this sample.
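For comparison, here is a rough sketch of the same idea using the Python azure-servicebus SDK (v7) rather than the .NET classes mentioned above; the connection string and queue name are placeholders, and the queue must have "Require Sessions" enabled.

    from azure.servicebus import ServiceBusClient, ServiceBusMessage

    CONN_STR = "<service-bus-connection-string>"
    QUEUE = "my-session-queue"

    with ServiceBusClient.from_connection_string(CONN_STR) as client:
        # Sending: every message published with the same session_id keeps FIFO order.
        with client.get_queue_sender(QUEUE) as sender:
            for i in range(3):
                sender.send_messages(ServiceBusMessage(f"msg {i}", session_id="order-42"))

        # Receiving: lock that session and read its messages in order.
        with client.get_queue_receiver(QUEUE, session_id="order-42", max_wait_time=5) as receiver:
            for msg in receiver:
                print(str(msg))
                receiver.complete_message(msg)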

SSIS - Script Component pulling information from RabbitMQ

A question that might be mostly theoretical, but I'd love to have my concerns put to rest (or confirmed).
I built a Script Component to pull data from RabbitMQ. On RabbitMQ, we basically set up a durable queue. This means messages will continue to be added to the queue, even when the server reboots. This construction allows us to periodically execute the package and grab all "new" messages since the last time we did so.
(We know RabbitMQ isn't set up to accommodate this kind of scenario; rather, it expects a constant listener to process messages. However, we are not comfortable having some task start when SQL Server starts and run pretty much 24/7 to handle that, so we built something we can schedule to run every n minutes and empty the queue that way. If we were unable to run the task, we would most likely be dealing with a failed SQL Server and would have different priorities.)
The component sets up a connection, and then connects to the specific exchange + queue we are pulling messages from. Messages are in JSON format, so we deserialize the message into a class we defined in the script component.
For every message found, we disable auto-acknowledge, so we can process it and only acknowledge it once we're done with it (which ensures the message will be processed, and doesn't slip through). Then we de-serialize the message and push it onto the output buffer of the script component.
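For illustration only, the same consume-then-ack pattern looks like this in Python with pika (rather than the client used inside the SSIS script component); the host, queue name and process_row() step are placeholders.

    import json
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("rabbit-host"))
    channel = connection.channel()

    # Pull messages without auto-ack; stop after 5 seconds of inactivity.
    for method, properties, body in channel.consume("my-durable-queue",
                                                    auto_ack=False,
                                                    inactivity_timeout=5):
        if method is None:
            break                                   # queue drained for this scheduled run
        row = json.loads(body)                      # deserialize the JSON message
        process_row(row)                            # placeholder: push to the output buffer
        channel.basic_ack(method.delivery_tag)      # ack only after successful processing

    channel.cancel()
    connection.close()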
There are a few places where things can go wrong, so we built a number of Try/Catch blocks into the code. However, since we're dealing with a queue and we need the information to remain available to us, I'm wondering if someone can explain how/when a message that is sent to the output buffer is actually processed.
Is it batched up and then pushed? Is it sent straight away, and is the SSIS component perhaps not updating information back to SSIS in a timely fashion?
Could we acknowledge a message that somehow ends up not being committed to our database, yet is already popped from the queue (as I think happens once a message is acknowledged)?

How to write an event trigger which sends alerts to a JMS queue

Is there any example of triggering an event that sends messages to a JMS queue when a table is updated/inserted, etc., for MySQL/PostgreSQL?
This sounds like a good task for pg_message_queue (which you can get off Google Code or PGXN), which allows you to queue requests. pg_message_queue doesn't do a great job of parallelism yet (in terms of parallel queue consumers), but I don't think you need that.
What you really want to do (and what pg_message_queue provides) is a queue table to hold the JMS message, and then a trigger to queue that message. Then the question is how you get it from there to JMS. You have basically two options (both of which are supported):
LISTEN for notifications, and when those come in handle them.
Periodically poll for notifications. You might do this if you have a lot of notifications coming in, so you can batch them every minute or so, or if you have few notifications coming in and you want to process them at midnight.
Naturally that is PostgreSQL only. Doing the same on MySQL? I don't know how to do that. I think you would be stuck with polling the table, but you could use pg_message_queue to understand basically how to do the rest. Note that in all cases this is fully transactional so the message would not be sent until after transaction commit, which is probably what you want.
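A rough sketch of the first option (LISTEN for notifications) using psycopg2; the channel name queue_message and the relay_to_jms() helper are placeholders, and forwarding to JMS depends on your broker's client library.

    import select
    import psycopg2

    conn = psycopg2.connect("dbname=mydb")
    conn.set_session(autocommit=True)

    cur = conn.cursor()
    cur.execute("LISTEN queue_message;")   # the table trigger should NOTIFY this channel

    while True:
        # Wait up to 60 seconds for the server to push a notification.
        if select.select([conn], [], [], 60) == ([], [], []):
            continue                       # timed out; loop and wait again
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            relay_to_jms(note.payload)     # placeholder: hand the payload to your JMS client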

How to retract a message in RabbitMQ?

I've got something like a job queue over RabbitMQ and, upon a request to cancel a job, I'd like to retract the tasks that have not yet started processing (their messages have not been ack'd), which corresponds to retracting these messages from the queues that they've been routed to.
I haven't found this functionality in AMQP or in the RabbitMQ API; perhaps I haven't searched well enough? Or will I have to use a workaround (it's not hard, but still)?
I would solve this scenario by having the worker check some sort of authoritative data source to determine whether the job should proceed. For example, the worker would check the job's status in a database to see if the job was already canceled.
For scenarios where jobs may be processed faster than the authoritative store can be updated and read, a less strongly guaranteed data store that trades other characteristics for speed may be useful.
An example of this would be to use Redis as the store for canceling processing of a message instead of a relational DB like MySQL. Redis is very fast, but makes fewer guarantees regarding the data it holds, whereas MySQL is much slower, but offers more guarantees about the data it holds.
In the end, the concept of checking with another source for whether or not to process a message is the same, but the way you implement that depends on your particular scenario.
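Here is a minimal sketch of that idea with pika and Redis: before working on a job, the worker checks a cancellation flag. The key name, queue name and the job_id header are illustrative assumptions (the publisher would set the header, and whatever handles cancellation would SADD the job ID), not RabbitMQ features.

    import pika
    import redis

    r = redis.Redis()

    def on_message(channel, method, properties, body):
        job_id = (properties.headers or {}).get("job_id")    # set by the publisher
        if job_id and r.sismember("cancelled-jobs", job_id):
            channel.basic_ack(method.delivery_tag)            # drop the cancelled job
            return
        run_job(body)                                         # placeholder for the real work
        channel.basic_ack(method.delivery_tag)

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.basic_consume("jobs", on_message, auto_ack=False)
    channel.start_consuming()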
RabbitMQ doesn't let you modify or delete messages after they've been enqueued. For that, you want some kind of database to hold the state of each job, and to use RabbitMQ to notify interested parties of changes in that state.
For lowish volumes, you can kludge it together with a queue per job. Create the queue, post the job description to the queue, and announce the name of the queue to the workers. If the job needs to be cancelled before it is processed, delete the job's queue; when the workers come to fetch the job description, they'll notice the queue has vanished.
Lighter-weight and generally better would be to use Redis or another key/value store to hold the job state (with a deleted or absent record meaning a cancelled or nonexistent job) and to use RabbitMQ to notify about new/removed/changed records in the key/value store.
There are at least two ways to achieve your goal:
basic.reject will requeue the message if requeue=true is set (otherwise it will discard the message).
(supported since RabbitMQ 2.0.0; see http://www.rabbitmq.com/blog/2010/08/03/well-ill-let-you-go-basicreject-in-rabbitmq/).
basic.recover will ask the broker to redeliver unacked messages on the channel.
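A rough illustration of those two calls with pika; the queue name and the should_drop() predicate are placeholders.

    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()

    method, properties, body = channel.basic_get("jobs", auto_ack=False)
    if method is not None:
        if should_drop(body):                                          # placeholder predicate
            channel.basic_reject(method.delivery_tag, requeue=False)   # discard the message
        else:
            channel.basic_reject(method.delivery_tag, requeue=True)    # put it back on the queue

    # Alternatively, ask the broker to redeliver every unacked message on this channel.
    channel.basic_recover(requeue=True)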
You need to subscribe to all the queues to which messages have been routed, and consume them with ack.
For instance, if you publish to a topic exchange with "test" as the routing key, and there are 3 persistent queues which subscribe to "test", you would need to consume those three queues. It might be better to add another queue which your consumer processes would also listen to, and tell them to ignore those messages.
An alternative, since you are using RabbitMQ, is to write a custom exchange plugin that will accept some out-of-band instruction to clear all queues. For instance, you might have that exchange read a special message header that tells it to clear all queues to which the message is destined. This does require writing Erlang code, but there are 4 different exchange types implemented, so you would only need to copy the most similar one and write the code for the new behaviours. If you only use custom headers for this, then the body of the message can be a normal message for the consumers.
To sum up:
1) the publisher needs to consume the messages itself
2) the publisher can send a special message in a special queue to tell consumers to ignore the message
3) the publisher can send a special message to a custom exchange that will clear any existing messages from the queues before sending this special message to consumers.