What is the best way to make sure you have all the messages in the proper order in your queue such as SQS or Kafka?
Basically, I need to make sure all the messages that I have in the queue are ordered, for example based on date+time, and process them online. I need to do this because I only need to process the first 1000 messages in the queue online. What is the best approach here?
Related
How to limit the number of transactions per second on a table in MySQL?
For example, to prevent brute-force logins via an API
As David says, do this on the API. You cannot and should not limit your database. There's no way to distinguish the origin of the query, so you'll just shut down the database for everyone if one person decides to flood it, making a denial-of-service attack easier.
As for solutions, there are many options.
Nginx has a rate-limiting feature built in that can limit requests per interval of time, and is very flexible. This can be focused on particular endpoints, paths, or other criteria, making it easy to protect whatever parts of your system are vulnerable.
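If you also want that requests-per-interval idea inside the application itself, here is a minimal sketch in Python of a fixed-window limiter; the window size, request cap, and client key are illustrative assumptions, not anything nginx-specific:

```python
import time
from collections import defaultdict

# Minimal fixed-window rate limiter, analogous in spirit to nginx's limit_req.
# The window length and request cap below are illustrative assumptions.
class RateLimiter:
    def __init__(self, max_requests=10, window_seconds=1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.windows = defaultdict(lambda: [0.0, 0])  # client -> [window_start, count]

    def allow(self, client_id):
        now = time.time()
        window_start, count = self.windows[client_id]
        if now - window_start >= self.window_seconds:
            # New window: reset the counter and admit the request.
            self.windows[client_id] = [now, 1]
            return True
        if count < self.max_requests:
            self.windows[client_id][1] = count + 1
            return True
        return False  # Over the limit: reject (e.g., respond 429).

limiter = RateLimiter(max_requests=5, window_seconds=1.0)
for i in range(7):
    print(i, limiter.allow("203.0.113.7"))  # The last two calls print False.
```

Note that nginx's limit_req implements a leaky-bucket variant rather than a fixed window, but the effect for this use case is the same: excess requests per interval get rejected.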
You'll also need to block clients that are trying to attack your system. Consider something like fail2ban which can read logs and automatically block source traffic from offenders. Log every failed attempt and this tool can do the rest.
I want an admin account to send an announcement to all users in the db.
Right now, my Message table stores one row per user-to-user message, with senderId, receiverId, etc.
My problem is: can I treat announcements the same way as user-to-user messages, and if so, would it be wise to write to the Message table n times for the n users in the db every time there is an announcement?
So I want to see if there is a cleaner approach to this.
It depends on how much time/effort you want to invest in this.
Separate table for announcements: You won't be able to reuse your current messaging system, but you will have maximum flexibility (special GUI features for announcements, they won't get mixed up with normal messages, etc.)
Modify your current messaging system to support multi-recipient and/or broadcast messages (see the sketch after this list). With this you can reuse most of your current GUI with some backend modifications.
Do the simplest possible thing and send a message to everyone. This is very easy to implement. The obvious downside is that you will have a lot of copied messages in your DB, which may or may not be a problem.
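If you go with option 2, here is a minimal sketch of what a broadcast-capable schema could look like, using Python's sqlite3 so it is self-contained; the NULL-receiverId convention and any columns beyond senderId/receiverId are assumptions:

```python
import sqlite3

# Sketch of option 2: one broadcast row instead of n copies per announcement.
# Convention (an assumption): receiverId IS NULL marks an announcement
# visible to every user; senderId/receiverId follow the question's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Message (
    id         INTEGER PRIMARY KEY,
    senderId   INTEGER NOT NULL,
    receiverId INTEGER,            -- NULL = broadcast/announcement
    body       TEXT NOT NULL,
    sentAt     TEXT DEFAULT CURRENT_TIMESTAMP
);
""")

def send_direct(sender_id, receiver_id, body):
    conn.execute("INSERT INTO Message (senderId, receiverId, body) VALUES (?, ?, ?)",
                 (sender_id, receiver_id, body))

def send_announcement(admin_id, body):
    # One row total, regardless of how many users exist.
    conn.execute("INSERT INTO Message (senderId, receiverId, body) VALUES (?, NULL, ?)",
                 (admin_id, body))

def inbox(user_id):
    # A user sees their direct messages plus every announcement.
    return conn.execute(
        "SELECT senderId, body FROM Message "
        "WHERE receiverId = ? OR receiverId IS NULL ORDER BY sentAt",
        (user_id,)).fetchall()

send_direct(1, 2, "hi")
send_announcement(99, "maintenance tonight")
print(inbox(2))  # The direct message plus the announcement.
```

Per-user read/unread state for announcements would then live in a separate (userId, messageId) table rather than in Message itself.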
Is there any example where we can trigger an event to send messages to a JMS queue when a table is updated/inserted etc. for MySQL/PostgreSQL?
This sounds like a good task for pg_message_queue (which you can get off Google Code or PGXN), which allows you to queue requests. pg_message_queue doesn't do a great job of parallelism yet (in terms of parallel queue consumers), but I don't think you need that.
What you really want to do (and what pg_message_queue provides) is a queue table to hold the JMS message, and then a trigger to queue that message. Then the question is how you get it from there to JMS. You have basically two options (both of which are supported):
LISTEN for notifications, and handle them as they come in (see the sketch after this list).
Periodically poll for notifications. You might do this if you have a lot of notifications coming in, so you can batch them every minute or so, or if you have few notifications coming in and you want to process them at midnight.
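For the LISTEN option, here is a minimal sketch in Python with psycopg2; the DSN and the channel name job_queue are assumptions (pg_message_queue has its own conventions for these):

```python
import select
import psycopg2

# Sketch of the LISTEN option: block until the trigger sends a NOTIFY,
# then fetch and forward the queued row. The DSN and channel name are
# illustrative assumptions.
conn = psycopg2.connect("dbname=app user=app")
conn.autocommit = True  # LISTEN/NOTIFY is used outside explicit transactions.
cur = conn.cursor()
cur.execute("LISTEN job_queue;")

while True:
    # Wait up to 60s for activity on the connection's socket.
    if select.select([conn], [], [], 60) == ([], [], []):
        continue  # Timed out; loop (or poll the queue table here as a fallback).
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        print(f"got NOTIFY on {notify.channel}: {notify.payload}")
        # Here you would read the queue table and publish to JMS.
```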
Naturally that is PostgreSQL only. Doing the same on MySQL? I don't know how to do that. I think you would be stuck with polling the table, but you could use pg_message_queue to understand basically how to do the rest. Note that in all cases this is fully transactional so the message would not be sent until after transaction commit, which is probably what you want.
Assume I am building Netflix and I want to log each view by the userID and the movie ID.
The format would be viewID, userID, timestamp.
However, in order to scale this, assume we're getting 1000 views a second. Would it make sense to queue these views to SQS, and then our queue readers can dequeue them one by one and write them to the MySQL database? This way the database is not overloaded with write requests.
Does this look like something that would work?
Faisal,
This is a reasonable architecture; however, you should know that writing to SQS is going to be many times slower than writing to something like RabbitMQ (or any local message queue).
By default, SQS FIFO queues support up to 3,000 messages per second with batching, or up to 300 messages per second (300 send, receive, or delete operations per second) without batching. To request a limit increase, you need to file a support request.
That being said, starting with SQS wouldn't be a bad idea since it is easy to use and debug.
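As a sketch of the reader side, here is roughly what a batching consumer could look like in Python with boto3; the queue URL and message format are assumptions, and insert_views stands in for a hypothetical single multi-row INSERT:

```python
import json
import boto3

# Sketch of a queue reader that batches SQS receives so the database sees
# grouped inserts instead of 1000 individual writes per second.
# The queue URL below is an illustrative assumption.
sqs = boto3.client("sqs", region_name="us-east-1")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/view-log"

def drain_once():
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=10,   # SQS's per-call maximum.
        WaitTimeSeconds=20,       # Long polling: fewer empty responses.
    )
    messages = resp.get("Messages", [])
    if not messages:
        return 0
    rows = [json.loads(m["Body"]) for m in messages]
    # insert_views(rows) would be a hypothetical single multi-row INSERT
    # executed in one transaction against MySQL.
    print(f"would insert {len(rows)} view rows")
    sqs.delete_message_batch(
        QueueUrl=QUEUE_URL,
        Entries=[{"Id": m["MessageId"], "ReceiptHandle": m["ReceiptHandle"]}
                 for m in messages],
    )
    return len(rows)
```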
Additionally, you may want to investigate MongoDB for logging; check out the following references:
MongoDB is Fantastic for Logging: http://blog.mongodb.org/post/172254834/mongodb-is-fantastic-for-logging
Capped Collections: http://blog.mongodb.org/post/116405435/capped-collections
Using MongoDB for Real-time Analytics: http://blog.mongodb.org/post/171353301/using-mongodb-for-real-time-analytics
I've got something like a job queue over RabbitMQ and, upon a request to cancel a job, I'd like to retract the tasks that have not yet started processing (their messages have not been ack'd), which corresponds to retracting these messages from the queues that they've been routed to.
I haven't found this functionality in AMQP or in the RabbitMQ API; perhaps I haven't searched well enough? Or will I have to use a workaround (it's not hard, but still)?
I would solve this scenario by having the worker check some sort of authoritative data source to determine if the job should proceed or not. For example, the worker would check the job's status in a database to see if the job was canceled already.
For scenarios where the speed of processing jobs may be faster than the speed with which the authoritative store can be updated and read, a less guaranteed data store that trades speed for other characteristics may be useful.
An example of this would be to use Redis as the store for canceling processing of a message instead of a relational DB like MySQL. Redis is very fast, but makes fewer guarantees regarding the data it holds, whereas MySQL is much slower, but offers more guarantees about the data it holds.
In the end, the concept of checking with another source for whether or not to process a message is the same, but the way you implement that depends on your particular scenario.
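A minimal sketch of that check in Python, assuming redis-py for the flag store and pika for the RabbitMQ consumer; the key naming and queue name are illustrative:

```python
import redis
import pika

# Sketch: before doing any work, the consumer checks an authoritative flag
# in Redis. Cancelling a job is just setting that flag; the message itself
# is never touched in RabbitMQ. The key naming scheme is an assumption.
r = redis.Redis()

def cancel_job(job_id):
    r.set(f"job:{job_id}:cancelled", 1, ex=86400)  # Expire the flag after a day.

def on_message(channel, method, properties, body):
    job_id = body.decode()
    if r.exists(f"job:{job_id}:cancelled"):
        # Job was retracted: ack and drop without processing.
        channel.basic_ack(delivery_tag=method.delivery_tag)
        return
    print(f"processing job {job_id}")
    channel.basic_ack(delivery_tag=method.delivery_tag)

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()
ch.queue_declare(queue="jobs", durable=True)
ch.basic_consume(queue="jobs", on_message_callback=on_message)
ch.start_consuming()
```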
RabbitMQ doesn't let you modify or delete messages after they've been enqueued. For that, you want some kind of database to hold the state of each job, and to use RabbitMQ to notify interested parties of changes in that state.
For lowish volumes, you can kludge it together with a queue per job. Create the queue, post the job description to the queue, and announce the name of the queue to the workers. If the job needs to be cancelled before it is processed, delete the job's queue; when the workers come to fetch the job description, they'll notice the queue has vanished.
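A minimal sketch of this kludge with pika; the job.<id> queue-naming convention is an assumption:

```python
import pika

# Sketch of the queue-per-job kludge: one queue per job; deleting the
# queue is how a job gets cancelled. The job.<id> naming is an assumption.
conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()

def submit_job(job_id, description):
    queue = f"job.{job_id}"
    ch.queue_declare(queue=queue)
    ch.basic_publish(exchange="", routing_key=queue, body=description)
    return queue  # Announce this name to the workers out of band.

def cancel_job(job_id):
    # Deleting the queue retracts the not-yet-fetched job description.
    ch.queue_delete(queue=f"job.{job_id}")

def fetch_job(job_id):
    try:
        method, header, body = ch.basic_get(queue=f"job.{job_id}")
    except pika.exceptions.ChannelClosedByBroker:
        # 404: the queue has vanished, i.e. the job was cancelled.
        # (The broker closes the channel; reopen it before reusing ch.)
        return None
    return body  # None here just means the queue exists but is empty.
```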
Lighter-weight and generally better would be to use Redis or another key/value store to hold the job state (with a deleted or absent record meaning a cancelled or nonexistent job) and to use RabbitMQ to notify about new/removed/changed records in the key/value store.
At least two ways to achieve your target:
basic.reject will requeue the message if requeue=true is set (otherwise it will discard the message).
(supported since RabbitMQ 2.0.0; see http://www.rabbitmq.com/blog/2010/08/03/well-ill-let-you-go-basicreject-in-rabbitmq/).
basic.recover will ask the broker to redeliver unacked messages on the channel (both calls are sketched below).
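For reference, a minimal sketch of both calls using Python's pika client; the queue name tasks is an assumption:

```python
import pika

# Sketch of the two redelivery mechanisms described above.
conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()

# 1) basic.reject with requeue=True puts one specific message back on the queue.
method, header, body = ch.basic_get(queue="tasks")  # "tasks" is an assumption.
if method is not None:
    ch.basic_reject(delivery_tag=method.delivery_tag, requeue=True)

# 2) basic.recover asks the broker to redeliver every unacked
#    message on this channel.
ch.basic_recover(requeue=True)
```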
You need to subscribe to all the queues to which messages have been routed, and consume them with ack.
For instance, if you publish to a topic exchange with "test" as the routing key, and there are 3 persistent queues which subscribe to "test", you would need to consume those three queues. It might be better to add another queue which your consumer processes would also listen to, and tell them to ignore those messages.
An alternative, since you are using RabbitMQ, is to write a custom exchange plugin that will accept some out-of-band instruction to clear all queues. For instance, you might have that exchange read a special message header that tells it to clear all queues to which this message is destined. This does require writing Erlang code, but there are 4 different exchange types implemented, so you would only need to copy the most similar one and write the code for the new behaviours. If you only use custom headers for this, then the body of the message can be a normal message for the consumers.
To sum up:
1) the publisher needs to consume the messages itself
2) the publisher can send a special message in a special queue to tell consumers to ignore the message
3) the publisher can send a special message to a custom exchange that will clear any existing messages from the queues before sending this special message to consumers.