How can we sync data between Amazon DynamoDB and a relational database?

We are planning to develop an application with Amazon DynamoDB. The application collects information from our clients' databases (our clients use MySQL, Oracle, MS SQL, or other relational databases), does some processing in our application, and sends the results back to the clients' databases. This synchronization should run continuously (or at a one-minute interval).
I want to know: are there any tools (or techniques) available for synchronization between Amazon DynamoDB and a relational database?

You can consider an Elastic MapReduce (EMR) job, which reads from DynamoDB, transforms the data, and writes it back to the relational database (http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/EMRforDynamoDB.html).
Edit: Also look at AWS Data Pipeline (http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-taskrunner-rdssecurity.html).
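Whatever tool drives the job, the core of each sync pass is the same: read items out of DynamoDB's typed JSON format and upsert them into a relational table. A minimal sketch of that pass, with the actual boto3 scan and MySQL upsert left as comments (table and attribute names here are illustrative assumptions, not from the question):

```python
# Hedged sketch of one sync pass: flatten DynamoDB's typed attribute format
# ({"S": ...}, {"N": ...}) into plain values, then hand each row to an
# upsert callback. In a real job the items would come from boto3's
# table.scan() (or a DynamoDB Stream) and the callback would run an
# INSERT ... ON DUPLICATE KEY UPDATE against MySQL.

def flatten_item(item):
    """Convert e.g. {"id": {"S": "u1"}, "balance": {"N": "42"}}
    into {"id": "u1", "balance": 42}."""
    out = {}
    for key, typed in item.items():
        (dtype, value), = typed.items()  # each attribute has exactly one type tag
        if dtype == "N":
            out[key] = float(value) if "." in value else int(value)
        else:  # "S", "BOOL"; lists/maps/sets would need recursion, omitted here
            out[key] = value
    return out

def sync_once(dynamo_items, upsert_row):
    for item in dynamo_items:
        upsert_row(flatten_item(item))
```

Running this on a schedule (e.g. a one-minute cron or an EMR/Data Pipeline job) gives the always-on sync the question asks for; a DynamoDB Stream would give near-real-time sync instead of polling.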

Related

Syncing data between MySQL and Node.js

In my project, I am automatically storing a lot of data in MySQL. However, I want the same data to sync with another Node.js application's own data storage. What would be the fastest and easiest way to do this when writing to both storages simultaneously isn't possible?
So, for example, one Node.js application stores a variable "balance" in MySQL. I want that same balance updated in another Node.js application's own storage, but my current Node.js app is not connected to a socket or any other data-transport mechanism. So how could the other Node.js application fetch the data from MySQL?
It sounds like your project structure follows the saga pattern.
For the update problem in your question, you can use Kafka: create a topic and have both Node applications consume messages from that same topic to update the data in their own databases.
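The pattern above boils down to one app publishing a balance-update message and every consumer applying it to its own store. A minimal sketch of the message format and consumer logic (shown in Python; topic and field names are assumptions, and the actual produce/consume calls are left as comments since they need a running Kafka broker):

```python
import json

# Hedged sketch: a producer encodes balance updates as JSON messages on a
# topic, and each consumer applies them to its own local store. A real
# producer would call e.g. producer.send("balance-updates", value=...),
# and the consumer loop would iterate over a KafkaConsumer.

def encode_update(user_id, balance):
    # value published to the "balance-updates" topic
    return json.dumps({"user_id": user_id, "balance": balance}).encode("utf-8")

def apply_update(local_store, message_bytes):
    # body of each consumer's loop: for msg in consumer: apply_update(store, msg.value)
    update = json.loads(message_bytes.decode("utf-8"))
    local_store[update["user_id"]] = update["balance"]
    return local_store
```

Because both Node apps consume the same topic independently, each keeps its own copy of the balance without the apps ever talking to each other directly.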

Mirroring homogeneous data from one MySQL RDS to another MySQL RDS

I have two MySQL RDS instances (hosted on AWS). One is my "production" RDS, and the other is my "performance" RDS. These instances have the same schema and tables.
Once a year, we take a snapshot of the production RDS and load it into the performance RDS, so that our performance environment has data similar to production. This process takes a while, since data specific to the performance environment must be re-added each time we do this mirror.
I'm trying to find a way to automate this process, and to achieve the following:
Do a one time mirror in which all data is copied over from our production database to our performance database.
Continuously (preferably weekly) mirror all new data (but not old data) from our production RDS to our performance RDS.
During the continuous mirroring, I'd like the production data not to overwrite anything already in the performance database. I'd only want new data to be inserted into the performance database.
During the continuous mirroring, I'd like to change some of the data as it goes onto the performance RDS (for instance, I'd like to obfuscate user emails).
The following are the tools I've been researching to assist me with this process:
AWS Database Migration Service seems to be capable of handling a task like this, but the documentation recommends using different tools for homogeneous data migration.
Amazon Kinesis Data Streams also seems able to handle my use case. I could write a "fetcher" program that gets all new data from the prod MySQL binlog and sends it to Kinesis Data Streams, then write a Lambda that transforms the data (deciding what to send, add, or obfuscate) and delivers it to my destination (the performance RDS, or, if I can't write to it directly, a consumer HTTP endpoint I write that updates the performance RDS).
I'm not sure which of these tools to use - DMS seems to be built for migrating heterogeneous data and not homogeneous data, so I'm not sure if I should use it. Similarly, it seems like I could create something that works with Kinesis Data Streams, but the fact that I'll have to make a custom program that fetches data from MySQL's binlog and another program that consumes from Kinesis makes me feel like Kinesis isn't the best tool for this either.
Which of these tools is best capable of handling my use case? Or is there another tool that I should be using for this instead?
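Whichever pipeline is chosen, the transform step the question describes (point 4 above) is the same pure function: pass each row through unchanged except for the fields that must be obfuscated. A hedged sketch of what that Lambda's core could look like (column names are illustrative assumptions):

```python
import hashlib

# Hedged sketch of the obfuscation transform between the stream and the
# performance RDS. Hashing the local part keeps emails stable across runs
# (so rows still join on email) while never exposing the real address.

def obfuscate_email(email):
    local, _, domain = email.partition("@")
    digest = hashlib.sha256(local.encode("utf-8")).hexdigest()[:12]
    return f"{digest}@{domain or 'example.invalid'}"

def transform_row(row, obfuscated_columns=("email",)):
    """Copy a row, obfuscating only the named columns."""
    out = dict(row)
    for col in obfuscated_columns:
        if out.get(col):
            out[col] = obfuscate_email(out[col])
    return out
```

The same function works whether it runs inside a Lambda consuming Kinesis records or inside a DMS transformation rule replacement.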

SQL vs NoSQL for a realtime messaging app?

I am creating a messaging app. I have my users stored in a MySQL database and my messages stored in Google Datastore, a NoSQL database. However, I was wondering what the drawbacks would be of having my messages in a MySQL database as well, since I am fetching the message and the user simultaneously.
Are there performance drawbacks?
Generally, using different databases shouldn't hurt you if your backend architecture is well defined; a database only stores the data you manipulate. I take it you use MySQL for authentication and store message data in Google Datastore. Performance drawbacks are more likely to come from your server's bandwidth than from the split itself.
That said, I'd suggest using the same database to store all the data; it will be more stable and easier to manage.
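If the two-store setup is kept, the "fetching the message and the user simultaneously" part of the question is the key: issuing both lookups concurrently means the second store adds at most one round-trip of latency, not two. A sketch of that pattern (the fetchers here are stand-ins I've made up; real ones would call a MySQL driver and the Datastore client):

```python
import asyncio

# Hedged sketch: user row lives in MySQL, message lives in the NoSQL store,
# and both are fetched concurrently with asyncio.gather. The sleep(0) calls
# are placeholders for the real database round-trips.

async def fetch_user(user_id):
    await asyncio.sleep(0)  # placeholder for a SELECT against MySQL
    return {"id": user_id, "name": "alice"}

async def fetch_message(message_id):
    await asyncio.sleep(0)  # placeholder for a Datastore lookup
    return {"id": message_id, "text": "hi"}

async def fetch_both(user_id, message_id):
    return await asyncio.gather(fetch_user(user_id), fetch_message(message_id))

user, message = asyncio.run(fetch_both(1, 7))
```

The equivalent in a Node.js backend would be `Promise.all` over the two queries.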

How to convert MS SQL tables to DynamoDB tables?

I am new to Amazon DynamoDB, and I have eight (8) MS SQL tables that I want to migrate to DynamoDB.
What process should I use for converting and migrating the database schema and data?
I was facing the same problem a year back when I started migrating an app from SQL to DynamoDB. I am not sure whether automated tools exist, but I can share what we did for the migration:
Check whether your existing data types can be mapped to DynamoDB types or need to change. You can merge some tables that receive fewer updates into a single item using the List and Map types, or use a Set if required.
The most important thing is to review all your existing queries. They will be the core information you need when you design your DynamoDB tables.
Make sure you distribute hash keys properly.
Use GSIs and LSIs for searching and sorting purposes (project only those attributes that will be needed; this will save money).
Some points that will save some money:
If your tables are read-heavy, try using a caching mechanism; otherwise be ready to increase the tables' read throughput.
If your tables are write-heavy, implement a queuing mechanism, such as SQS.
Keep checking the status of all your important tables in the Management Console. It provides various metrics that will help you manage the tables' throughput.
I have written a blog post covering all the challenges we faced while moving from a relational database to a NoSQL database.
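The key-schema and GSI advice above translates directly into the request body passed to boto3's `dynamodb.create_table(**params)`. A hedged sketch (table, key, and index names are illustrative, not from the question; the GSI uses a `KEYS_ONLY` projection to keep storage and read costs down, per the projection tip above):

```python
# Illustrative create_table parameters: a composite primary key for even
# hash-key distribution, plus a GSI for querying by status that projects
# only the keys it needs.

order_table_params = {
    "TableName": "Orders",
    "AttributeDefinitions": [
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "order_id", "AttributeType": "S"},
        {"AttributeName": "status", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "customer_id", "KeyType": "HASH"},   # partition key
        {"AttributeName": "order_id", "KeyType": "RANGE"},     # sort key
    ],
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "status-index",
            "KeySchema": [{"AttributeName": "status", "KeyType": "HASH"}],
            "Projection": {"ProjectionType": "KEYS_ONLY"},  # cheapest projection
            "ProvisionedThroughput": {"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
        }
    ],
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
}
```

Designing this dict is where the query review pays off: every access pattern you found in the SQL app should map to either the primary key or one of the indexes.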

How can I use firebase alongside a primary database?

I want to leverage Firebase's real-time syncing features but use it as a secondary database. How would such an architecture be possible? In this scenario, a PostgreSQL or similar database would be used as the primary store, and Firebase would be used for real-time syncing and delivery of data to clients.
This would be beneficial because, for example, if Firebase went down, my service would keep running and only lose its real-time sync features, as opposed to going down completely. And in any other scenario where Firebase becomes difficult to use, there would be an in-house copy of the data to fall back on.
Ideally Firebase would be the real-time, eventually consistent store, while the data gets stored in the SQL database in parallel.
Is this a possible scenario, and are there any example architectures?
Thanks
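One common shape for this is a dual-write with the relational database as the source of truth: commit to SQL first, then push to Firebase on a best-effort basis so a Firebase outage degrades only the real-time feature. A hedged sketch (the clients are injected stand-ins; a real version would use a PostgreSQL driver and the Firebase Admin SDK, and would queue failed pushes for backfill rather than dropping them):

```python
# Hedged sketch of the "Firebase as secondary" architecture: the SQL write
# is the durable one; the Firebase push is best-effort real-time fan-out.

def save_record(sql_store, firebase_client, key, value):
    sql_store[key] = value  # 1. durable write to the primary (e.g. PostgreSQL)
    try:
        firebase_client.set(key, value)  # 2. push to Firebase for real-time delivery
    except Exception:
        pass  # real code would enqueue a retry/backfill instead of silently dropping
    return value

class FakeFirebase:
    """Stand-in for a Firebase client; .set mirrors the write."""
    def __init__(self):
        self.data = {}
    def set(self, key, value):
        self.data[key] = value
```

With this ordering, readers that need correctness hit SQL, clients that need live updates subscribe to Firebase, and a Firebase failure never blocks the primary write.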