Can a NATS publisher send a single message into multiple queues? - message-queue

I'm building a system where two different entities need to process messages from the same source (in different ways - for example one will log all messages while another entity might want to aggregate data).
Ideally each entity is fully scalable for performance and resiliency, so we have multiple publishers, multiple log subscribers and multiple aggregation subscribers, but still each message generated by each publisher is processed by exactly one log subscriber and one aggregation subscriber.
With AMQP we can achieve this by publishing to a fan-out exchange that distributes messages to two queues, where each queue has many subscribers. I understand that the same behavior can be achieved in NATS by simply having all the subscribers listen on the same "subject" and use two distinct "queue group names" based on their roles.
In such an instance, messages to the subject will be delivered to one subscriber from each queue group, i.e. each message will be delivered exactly n times, n being the number of distinct queue groups and not the number of subscribers. Is this correct?

Indeed, you can use queue subscribers (for instance, in Go the API is func (nc *Conn) QueueSubscribe(subj, queue string, cb MsgHandler) (*Subscription, error)).
The queue parameter is the group name; in your example it could be log and aggregation. You can create as many queue subscribers as you want in each of these groups, and only one member of each group will receive a given message.
For instance, suppose that you publish a message on subject foo and you have 10 queue subscribers on foo with queue name log and 10 queue subscribers on foo with queue name aggregation. The message will be delivered to 2 subscribers, 1 for group log and 1 for group aggregation.
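To make that concrete, here is a minimal Go sketch of the two-group setup using the nats.go client (the subject and group names are taken from the example above; the connection URL is a placeholder and error handling is pared down):

package main

import (
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	// Placeholder connection URL; adjust for your deployment.
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	// All subscribers listen on the same subject "foo", but in two distinct
	// queue groups; each published message is delivered to exactly one
	// member of "log" and one member of "aggregation".
	nc.QueueSubscribe("foo", "log", func(m *nats.Msg) {
		log.Printf("log worker got: %s", m.Data)
	})
	nc.QueueSubscribe("foo", "aggregation", func(m *nats.Msg) {
		log.Printf("aggregation worker got: %s", m.Data)
	})

	nc.Publish("foo", []byte("hello"))
	nc.Flush()
}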
Hope this helps.

Your approach is correct. The concept of a queue in nats.io is to distribute messages among the subscribers listening on that queue. Suppose you have 10 subscribers (S1-S10) listening on a subject and registered in the same queue group: the first message will be sent to S1, the next to S2, and so on in a cyclic manner.
You just need to make sure that all the subscribers stay connected to the server: if a subscriber goes offline, the NATS server only becomes aware of it after a certain number of outstanding PING-PONG requests, and during that interval it may still forward messages to the offline node. Thus you need to carefully set:
PING-PONG interval
Max outstanding PING requests
https://nats.io/documentation/server/gnatsd-config/
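These map to the ping_interval and ping_max settings in the server configuration linked above; a rough sketch (the values here are placeholders to tune for your deployment):

ping_interval: "30s"  # how often the server PINGs each client connection
ping_max: 3           # unanswered PINGs before the server considers the client gone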

Related

Best practice to get summary/filters from a large SQL table

In our system, a user can be of 2 types:
Client
Affiliate client
A client can have one listing and an affiliate client can have multiple listings.
We get calls against listings and store them in a call_logs table.
This table now has ~5 million records and grows every minute. A call log belongs to a listing and a user (client/affiliate). So a call log has either (a client id and a listing id) or (an affiliate client id and a listing id).
We show filters on the call logs page of an admin portal to filter the call logs. The admin can filter the call logs by client, affiliate client, and listing.
This is how I get clients, affiliates, and listings from the call logs table to make filters.
select * from `call_logs` order by `call_logs`.`id` asc limit 500 offset 0
So after getting the 500 call logs, I extract:
Users with type client and store them in a variable named client filter array.
Users with type affiliate client and store in a variable called affiliate filter array.
Unique listings and store in a listings filter array.
After looping through the whole call_logs table (I chunk it into 500 rows per loop), I display those filters on the admin portal page.
Issue/Problem:
Because the table is large, separating users by type (client and affiliate client) and collecting the listings takes a long time, so the admin has to wait for the page to load.
I have also tried loading those filters via AJAX after the page has loaded, but most of the time it still takes too long.
I can't simply GROUP BY user_id, affiliate_client_id, or listing_id: if I group by affiliate_client_id, an affiliate can have multiple listings with different calls and the query will pick only one of them, and the same applies to listing.
You need an index on the user_id and call_fee columns; this query calculates the call fee for all users in a single statement:
SELECT `user_id`,sum(`call_fee`)
FROM `call_logs`
GROUP BY `user_id`;
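For the index part, a composite index covering both columns could be created like this (the index name is just an example):
CREATE INDEX idx_call_logs_user_fee ON call_logs (user_id, call_fee);
With (user_id, call_fee) indexed together, MySQL can satisfy the GROUP BY and SUM from the index alone instead of scanning the whole table.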

Which will be more efficient for server to handle

Let's say I have a billing web app which serves many users to maintain their businesses.
E.g. each record has 4 columns:
Receipt No, Qty, Amt, Owner.
Now, when multiple people use the same online software, we identify an organisation's transactions by the owner's name.
When someone generates their organisation's sales report, the query has to filter rows by the owner-name condition.
So this is slowing down the process.
If we instead make a master/slave setup with a separate database per organisation, then we need only the 3 columns Receipt No, Qty, Amt, since we can identify the organisation by the database name.
Which will be more efficient for the server to handle?
I also need combined reports across all organisations.
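For concreteness, the two alternatives being compared might look roughly like this (table, column, and database names are assumptions):

-- single shared table: every per-organisation report has to filter by owner
SELECT receipt_no, qty, amt FROM receipts WHERE owner = 'acme';

-- one database per organisation: no owner column or filter needed
SELECT receipt_no, qty, amt FROM acme_db.receipts;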
P.S.: I am using MySQL for the database and the web2py platform.

Data sync from MySQL to NoSQL key-value store

I have a legacy system with MySQL at the backend and Python as the primary programming language.
Recently we have a scenario where we need to display a dashboard with the information present in the MySQL database. The data in the table changes every second.
This can be thought of as similar to a bidding application where people bid constantly. Every time a user bids, a record goes into the database; when a user updates their bid, the previous value is overwritten.
I also have a few clients who monitor this dashboard, which updates the statistics.
I need to order this data in real time, as people bid in real time.
I'd prefer not to run queries against MySQL, because at any given second I may have a few thousand clients querying the database, which will put load on it.
Please advise.
If you need to collect and order data in realtime you should be looking at the atomic ordered map and ordered list operations in Aerospike.
I have examples of using KV-ordered maps at rbotzer/aerospike-cdt-examples.
You could use a similar approach with the user's ID being the key, the bid being a list with the structure [1343, { foo: bar, ts: 1234, this: that} ]. The bid amount in cents (integer) is the first element of the list, all the other information is in a map in the second element position.
This will allow you to update a user bid with a single map operation, get back the user's bid with a single operation, order by rank (on the bid amount) to get the top bids ordered, get all the bids in a particular range, etc. You would have one record per item, with all the bids in this KV-sorted map.

Performance considerations for table design of chat service

We are implementing a chat service, and I'd like some input on table design. Our service uses MySQL, and our DB has 2 tables, Threads and Messages. The Threads table stores all the chat threads, and the Messages table stores all the messages. A chat thread can have multiple messages, while a message belongs to only one thread. Each message is identified by a column in the Messages table called messageId.
We need to get the messageId of the last message of each thread from time to time in our service. I can see 2 options:
1. Add a column called lastMessageId to Threads to keep track of the last message; each time a message is inserted into the Messages table, we need to update the Threads table as well.
2. Each time we need the last message's id, perform a query on the Messages table to find the last message.
Which option should I take, and why?
I would suggest going for option 2, for the reasons below.
You said that you need the last message id only from time to time, which means it is not that frequent.
Running a query from time to time is less intensive than performing an UPDATE on every insert.
You can further fine-tune the query by creating indexes on the Messages table.
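For example, assuming Messages has a threadId column referencing Threads (the column and index names here are assumptions), the lookup and a supporting index could be:

-- last message of a given thread
SELECT messageId
FROM Messages
WHERE threadId = ?
ORDER BY messageId DESC
LIMIT 1;

-- composite index so the lookup is a simple index seek
CREATE INDEX idx_messages_thread_message ON Messages (threadId, messageId);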

Basic Normalization Question

This might not exactly be a "normalization" question; it's more about the type of data I am saving.
I've just written a specification for a messaging and email system. The idea is that I need to save all of the messages which are internal to my web service, but also know if an email has been sent with that message.
Here is the specification.
Specification
Any messages are stored in one table.
Messages can be from unregistered users, or registered users.
An unregistered user message will just have a return email address
A registered user message will have the user id of the sender
Messages are either owned by a user (meaning the message is sent to that user) or shared by user roles.
When a message is owned by a user, we record some information about this message (same table as the message).
a) Has the user opened/read the message?
b) Was an _email sent_ to the owner of the message or is it just an internal message
c) Date the message was first read
d) Date the message was sent
When a message is sent to a group of users, meaning that they are sent to "All Users", or "All Owners" or "All SuperAdmin"...
a) The message is saved once in the messages table with a sent date
b) Each individual open is tracked in a separate table
c) A field records if a direct _email has been sent_, or if it is just saved internally in the system. (separate table)
Messages can be threaded; this means that if a message is responded to, the response is a child of the original message.
Messages have different "Types", meaning that a message can be "System Notice", "Enquiry", "Personal Message", "Private Message", "Transactional Information"
Messages which are linked to an enquiry for a product will save the ID of the product being enquired about (i.e. the relevant property).
End Specification
Now the actual question...
As you can see in bullet 1)(b), for a message which is sent to an individual user I am recording whether an email was also sent for that message.
However, when an email is sent to a group of users, I am recording whether an email was sent in a completely different table.
Obviously this is because I can't save this information in the same table.
What are your opinions on this model? I'm not duplicating any data, but I am separating where the data is saved. Should I just have an email_sent table to record all of this information?
It is hard to say whether your current design is good or bad. On the surface, I think that it is a mistake to separate the same piece of information into two places. It may seem easier to have a note about an individual email sent in the table which is closer to the individual and notes about emails sent to groups closer to the groups. However, your code is going to have to go looking in two places to find information about any email or about all emails in general.
If the meaning of the flag email_sent is the same for an individual user as it is for a member of a group of users, then looking in two places all the time for what is essentially one kind of information will be tedious (which in code terms comes down to being potentially slow and hard to support).
On the other hand, it may be that email_sent is something that is not important to your transactional or reporting logic and is just a mildly interesting fact that is "coming along for the ride". In this case, trying to force two different email_sent flags into one place may require an inconvenient and inadvisable mash-up of two entities that ought to be distinct because of all of their other, more important attributes.
It is difficult to give a conclusive answer without having a better understanding of your business requirement, but this is the trade-off you have to consider.
Create 3 tables:
MSG with id (key auto), msgtext, type (value U or R), userId/roleId
ROLES with roleId, userId
ACCS with userId, MsgId, date opened, read, etc
MSG records the message, with a type to see if it's from a role or unregistered user
ROLES points one role to many users
ACCS records everything, for a user, registered or not.
To retrieve, join the MSG type U with ACCS
join MSG type R with ROLES and then with ACCS
To retrieve all, UNION them
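A rough SQL sketch of that retrieval, using the table and column names proposed above (the exact column names and join conditions are assumptions):

-- messages addressed directly to a user (type U), joined with their access records
SELECT m.id, m.msgtext, a.userId, a.read
FROM MSG m
JOIN ACCS a ON a.MsgId = m.id
WHERE m.type = 'U'

UNION

-- messages addressed to a role (type R), expanded to every user in that role;
-- m.roleId here stands in for the shared userId/roleId column of MSG
SELECT m.id, m.msgtext, r.userId, a.read
FROM MSG m
JOIN ROLES r ON r.roleId = m.roleId
JOIN ACCS a ON a.MsgId = m.id AND a.userId = r.userId
WHERE m.type = 'R';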