Backfill data when creating a new microservice - message-queue

Given a microservice system using queues (SQS/SNS or RabbitMQ, for example), with services such as:
A create-book service, where you create books and can edit them until they are published. This service has a datastore of all books created, and it publishes a message when a book has been published.
A lending service, where users can borrow a book. This service reads messages from the create-book service to know which books exist. It stores all loans and sends a message when a book is loaned and when it is returned.
This has been running for a while.
We now want to create a new service that shows statistics for loans.
It will listen to the create-book service's messages to get information about the books.
It will listen to the lending service's messages to gather the statistics.
What is the best practice for getting data about books that were created before the statistics service existed into this new service? Since the system is not using Kafka or similar, that data is no longer available in the queue; it exists only in the datastore of the create-book service.
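One common approach (my assumption, not something stated in the question) is a one-off backfill job owned by the create-book service: it reads the service's own datastore and republishes a BOOK_PUBLISHED event for every already-published book, so the new statistics service ingests history through the same path as live traffic. A minimal sketch, assuming SNS via boto3; the topic ARN, event shape, and datastore query are placeholders:

```python
import json

import boto3  # assumption: SNS is the fan-out mechanism, as suggested in the question

# Hypothetical names; the real topic ARN and datastore query are yours to fill in.
sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:eu-west-1:123456789012:book-published"

def fetch_published_books():
    """Placeholder for a paginated read of the create-book service's datastore,
    e.g. SELECT id, title, published_at FROM books WHERE published_at IS NOT NULL."""
    yield {"id": "b-1", "title": "Example Book", "published_at": "2020-01-01T00:00:00Z"}

def backfill():
    for book in fetch_published_books():
        # Re-emit the same event shape the live publisher uses, so the new
        # statistics service needs no special backfill code path.
        sns.publish(
            TopicArn=TOPIC_ARN,
            Message=json.dumps({
                "type": "BOOK_PUBLISHED",
                "book_id": book["id"],
                "title": book["title"],
                "published_at": book["published_at"],
            }),
        )

if __name__ == "__main__":
    backfill()
```

One caveat: republishing to the shared topic re-delivers events to the existing consumers too, so either their handlers must be idempotent or the backfill should target a queue subscribed only by the new service.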

Related

Accessing Tracking Records with the Consumer API

I'm currently working on a project where I need to read and possibly update information from tracking records. I haven't found anything in the Knowledge Base that refers to accessing any kind of tracking record (LoanApp, Account, Share, etc.) through the API. Is it possible to read and/or update fields in any of the tracking records?
There isn't much support for reading and updating tracking records (I'm assuming you mean SymXchange external tracking records) via the API. Updating, in particular, is not available.
For reading, one option may be to have the Banno Admin at the financial institution enable the Restricted Claim, which is https://api.banno.com/consumer/claim/external_tracking_records (that's a scope name, not a URL). You'll want to read this page in the Authentication Framework docs: https://jackhenry.dev/open-api-docs/authentication-framework/overview/openidconnectoauth/
The gist is that the claim (when enabled by the admin at the FI, and also requested by your code) provides SymXchange external tracking records as part of the Identity Token.
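For illustration, requesting a claim by name is standard OpenID Connect: you pass it in the claims request parameter of the authorization request. A sketch of what that could look like (the endpoint, client_id, and redirect_uri are placeholders; check the linked Authentication Framework docs for the exact parameters Banno expects):

```python
import json
from urllib.parse import urlencode

# Placeholders: your real client_id, redirect_uri, and the authorize endpoint.
AUTHORIZE_ENDPOINT = "https://example.banno.com/oauth/authorize"

# Ask for the restricted claim by name via the OIDC "claims" request parameter;
# when the FI admin has enabled it, the data comes back in the Identity Token.
claims = {
    "id_token": {
        "https://api.banno.com/consumer/claim/external_tracking_records": None
    }
}

params = {
    "response_type": "code",
    "client_id": "your-client-id",
    "redirect_uri": "https://your.app/callback",
    "scope": "openid",
    "claims": json.dumps(claims),
}

print(f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}")
```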

How does Dropbox's response queue work, exactly?

I am reading this write-up: https://medium.com/@narengowda/system-design-dropbox-or-google-drive-8fd5da0ce55b. In the Synchronization Service part, it says:
The Response Queues that correspond to individual subscribed clients are responsible for delivering the update messages to each client. Since a message will be deleted from the queue once received by a client, we need to create separate Response Queues for each client to be able to share an update message which should be sent to multiple subscribed clients.
The context is that we need a response queue to send file updates from one client to the other clients. I am confused by this statement: if Dropbox has 100 million clients, we need to create 100 million queues. That is unimaginable to me. For example, a Kafka cluster can support up to about 5K topics (https://stackoverflow.com/questions/32950503/can-i-have-100s-of-thousands-of-topics-in-a-kafka-cluster), so we would need 20K Kafka clusters in this case. Which queuing system can do 100 million "topics"?
I'm not sure, but I expect such notifications go to clients via WebSockets only.
Additionally, as the Medium blog states, if a client is not online then messages may have to be persisted in a DB. When the client comes back online, it can request all updates after a certain timestamp, after which a WebSocket is set up to facilitate future communication.
Happy to hear your thoughts on this.
P.S.: Most Dropbox system-design blogs/vlogs have just copied from each other without going into low-level detail.
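To make that catch-up-then-subscribe flow concrete, here is a rough server-side sketch (my own illustration, not from the blog), assuming the Python websockets package (>= 10.1) and an in-memory stand-in for the persisted updates:

```python
import asyncio
import json

import websockets  # assumption: the 'websockets' package, >= 10.1

UPDATES = []         # stand-in for updates persisted in a DB, ordered by ts
SUBSCRIBERS = set()  # currently connected sockets; no per-client queue needed

async def handler(ws):
    # 1) Catch-up: the client says the last timestamp it saw, and we replay
    #    everything persisted after that point.
    hello = json.loads(await ws.recv())
    since = hello.get("since", 0)
    for update in UPDATES:
        if update["ts"] > since:
            await ws.send(json.dumps(update))
    # 2) Live phase: keep the socket open and push future updates as they
    #    arrive, instead of creating one response queue per client.
    SUBSCRIBERS.add(ws)
    try:
        await ws.wait_closed()
    finally:
        SUBSCRIBERS.discard(ws)

async def broadcast(update):
    UPDATES.append(update)  # persist first, so offline clients catch up later
    for ws in set(SUBSCRIBERS):
        await ws.send(json.dumps(update))

async def main():
    async with websockets.serve(handler, "localhost", 8765):
        await asyncio.Future()  # run forever

if __name__ == "__main__":
    asyncio.run(main())
```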

Sending a message from Azure Service Bus to AWS SQS

We have a requirement to implement Azure Service Bus as an integration point for various applications (including apps hosted in AWS). Each application will have its own SQS queue. The idea is to use Azure Service Bus with topics and subscription filters to route messages to each SQS queue accordingly. However, I am not sure how we can pick messages up from a subscription and push them to AWS SQS; I have not been able to find a solution for this.
These are two inherently different messaging services, so you will either need to find a third-party connector/bridge between the two or create your own: a process that retrieves messages from one broker and forwards them to the other. A minimal sketch of such a bridge follows.
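Hand-rolled, that forwarding process might look like this (a sketch, assuming the azure-servicebus v7 and boto3 SDKs; the connection string, topic, subscription, and queue URL are placeholders):

```python
import boto3
from azure.servicebus import ServiceBusClient  # pip install azure-servicebus

# Placeholders: fill in your own connection string, names, and queue URL.
SB_CONN_STR = "Endpoint=sb://..."
TOPIC, SUBSCRIPTION = "integration-events", "aws-app-a"
SQS_QUEUE_URL = "https://sqs.eu-west-1.amazonaws.com/123456789012/app-a"

sqs = boto3.client("sqs")

with ServiceBusClient.from_connection_string(SB_CONN_STR) as sb:
    receiver = sb.get_subscription_receiver(
        topic_name=TOPIC, subscription_name=SUBSCRIPTION
    )
    with receiver:
        for msg in receiver:  # blocks, yielding messages as they arrive
            # Forward to SQS first, then settle the Service Bus message; a
            # crash in between re-delivers rather than loses the message.
            sqs.send_message(QueueUrl=SQS_QUEUE_URL, MessageBody=str(msg))
            receiver.complete_message(msg)
```

This gives at-least-once delivery, so the SQS consumers should be idempotent.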
When it comes to third parties, there's an example you could have a look at: NServiceBus has a community extension called Router, which allows achieving exactly what you're looking for.
Disclaimer: I contribute and work on NServiceBus

How to sync the database with the microservices (and the new one)?

I'm developing a website with a microservice architecture, and each service owns a database that stores the data the microservice needs.
The Post and Video services need user information, so both of them subscribe to the NEW_USER_EVENT.
The NEW_USER_EVENT is triggered whenever a new user registers.
Once the services receive a NEW_USER_EVENT, they write the incoming user information to their own databases, so they can do their work without asking the User service. A subscriber along these lines is sketched below.
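For concreteness, each subscriber looks roughly like this (a sketch using the pynsq client, with placeholder addresses and names):

```python
import json

import nsq  # pip install pynsq

def handler(message):
    user = json.loads(message.body)
    # Placeholder: upsert into this service's own database, keyed by user id
    # so that redelivered events stay idempotent.
    print("storing user", user.get("id"))
    return True  # acknowledge the message (FIN); return False to requeue

reader = nsq.Reader(
    message_handler=handler,
    lookupd_http_addresses=["http://127.0.0.1:4161"],  # placeholder address
    topic="NEW_USER_EVENT",
    channel="post-service",  # each service uses its own channel, so every service sees every event
)

if __name__ == "__main__":
    nsq.run()
```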
So far so good. But here comes the question:
What if I'm going to create a new service? How do I get the already-registered users' information into the new service?
Maybe I can copy the information from the existing services. But the events are pushed through a messaging queue (NSQ).
If I'm going to copy the data from one of the microservices, how do I know which service has the latest user information? (Some services may not have received the latest events yet.)
Read More:
The Hardest Part About Microservices: Your Data
Intro to Microservices, Part 4: Dependencies and Data Sharing
What if I'm going to create a new service? How do I get the registered users' information into the new service?
You have to replay, from the beginning of time, all events to which this new service subscribes (you should have an "event store" that keeps every event that has ever happened in your application). You can also apply smarter logic when replaying by starting from the most recent events and going back in time; that way you restore the most valuable data first. Just be careful to handle interdependent events correctly.
Data source: the events are pushed by the messaging queue (NSQ). If I'm going to copy the data from one of the microservices, how do I make sure the copy source has the latest user information?
You are not talking about doing backups, right?
Aside from backups, in event-driven systems people usually don't copy data the classical way, row by row. Instead, they replay events from the event store from the beginning of time and feed those events to the event handlers of the new service (or new instance). As a result, the new service eventually becomes consistent with the other parts of the system. A minimal sketch of such a replay follows.
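Here is what that replay can look like, assuming a hypothetical append-only event store that can be read in order from offset 0:

```python
def replay_into(service_handlers, event_store):
    """Feed every historical event to the new service's handlers, in order."""
    for event in event_store.read_from(offset=0):
        handler = service_handlers.get(event["type"])
        if handler:
            handler(event)  # handlers must be idempotent: replays may repeat

class InMemoryStore:
    """Stand-in for a real event store; only the read side matters here."""
    def __init__(self, events):
        self.events = events
    def read_from(self, offset=0):
        return iter(self.events[offset:])

# Usage: the new service registers the same handlers it uses for live events.
handlers = {
    "NEW_USER_EVENT": lambda e: print("insert user", e["payload"]["id"]),
}
store = InMemoryStore([
    {"type": "NEW_USER_EVENT", "payload": {"id": 1}},
    {"type": "NEW_USER_EVENT", "payload": {"id": 2}},
])
replay_into(handlers, store)
```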

Service for accessing multiple user cloud storage accounts from single server

I'm working on a free educational web app for school music teachers and students that will let them collaborate and share mp3 recordings. Since earning revenue is not the goal, I'm looking for ways to reduce file storage costs. A single teacher assignment might produce hundreds of recorded responses.

Instead of saving these recordings to my own storage (or to a service like Amazon's S3), I was wondering whether there are any cloud storage services that teachers could sign up for, similar to Google Drive, and then give my server app access to for storing their class's recordings. I'd still manage the info for the recordings and other data in a single database on my own server, but I'd save any large files to the storage provided to me by each teacher. I haven't found any examples of this sort of thing with services like Google Drive or Dropbox, but if it's possible with those or any other services, I'd appreciate a link to some info.

The expectation is that a teacher would pay the storage company for its service according to the school's usage. The service would have to be simple for teachers to sign up for and grant me access to, which I think puts some of the developer-oriented services out of reach.
Suggestions for different strategies are also welcome. I'd prefer not to handle financial transactions (so I don't want to rent space to people).
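One data point: with something like Google Drive, the upload mechanics are simple once a teacher has granted your server OAuth access; the hard part is making that consent flow simple enough for teachers. A sketch using the google-api-python-client package, assuming you already hold the teacher's OAuth credentials (all names hypothetical):

```python
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

def upload_recording(creds: Credentials, path: str, name: str) -> str:
    """Upload one mp3 to the teacher's own Drive and return the file id,
    which you would store in your own database with the rest of the metadata."""
    drive = build("drive", "v3", credentials=creds)
    media = MediaFileUpload(path, mimetype="audio/mpeg")
    created = (
        drive.files()
        .create(body={"name": name}, media_body=media, fields="id")
        .execute()
    )
    return created["id"]
```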