I want to leverage Firebase's real-time syncing features but use it as a secondary database. How would such an architecture be possible? In this scenario, PostgreSQL or a similar database would be the primary store, and Firebase would be used for real-time syncing and delivery of data to clients.
This is beneficial because, for example, if Firebase goes down, my service would keep running and lose only its real-time sync features, as opposed to going down completely. And if any other difficulty with Firebase comes up, there is an in-house copy of the data to turn to.
Ideally Firebase would be the real-time, eventually consistent store, while the data gets stored in the SQL database in parallel.
Is this a possible scenario, and are there any example architectures?
Thanks
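One way the setup described above could look, as a minimal sketch rather than a reference architecture: the server writes to the primary SQL store first and then makes a best-effort push to the real-time store. `primaryDb.insertMessage` and `realtime.publish` are hypothetical adapter methods standing in for a PostgreSQL client and the Firebase SDK; they are not real library calls, and the in-memory fakes exist only so the sketch runs on its own.

```javascript
// Sketch: write to the primary store first, then mirror to the real-time store.
// `primaryDb` and `realtime` are hypothetical adapters (assumptions), e.g. thin
// wrappers around a PostgreSQL client and the Firebase SDK.
async function saveMessage(primaryDb, realtime, message) {
  // The primary write must succeed; if it throws, the whole request fails.
  const saved = await primaryDb.insertMessage(message);

  // The real-time push is best effort: an outage only costs live updates.
  let synced = true;
  try {
    await realtime.publish(`messages/${saved.id}`, saved);
  } catch (err) {
    synced = false;
    // A real system would probably record this for a later re-sync.
    console.warn(`real-time sync skipped for ${saved.id}: ${err.message}`);
  }
  return { saved, synced };
}

// In-memory stand-ins so the sketch runs without any external service.
function createFakePrimaryDb() {
  const rows = [];
  return {
    rows,
    async insertMessage(message) {
      const row = { id: rows.length + 1, ...message };
      rows.push(row);
      return row;
    },
  };
}

function createFakeRealtime({ down = false } = {}) {
  const published = new Map();
  return {
    published,
    async publish(path, value) {
      if (down) throw new Error('real-time store unavailable');
      published.set(path, value);
    },
  };
}
```

With `createFakeRealtime({ down: true })` the call still resolves and the row is still stored, which is the degraded mode the question asks for. Keeping the two stores consistent after an outage is a separate problem that this sketch does not address.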
Related
I have a full stack app that uses React, Node.js, Express, and MySQL. I want the react app to respond to database updates similar to Firebase: When data changes, I want a real-time notification sent to my app.
I want to use stock MySQL (no plugins), so that I can use AWS RDS or whatever.
I will use socket.io to push the real-time notifications to the web app.
To avoid off-target responses, I'll summarize various approaches that are not what I am looking for:
The server could poll, or each client could poll. (Not real-time, but included for completeness. When I search, polling is the only solution I find.)
Write a wrapper that handles all MySQL updates, handles subscriptions, and sends the notifications. This is a complicated component that adds complexity. Firebase is popular because it both increases performance and reduces complexity. I like Firebase a lot but want to do the same thing with MySQL.
Use Firebase to handle the real-time notifications. The MySQL wrapper could use Firebase to handle the subscriptions and notifications, but there is still the problem of triggering the notifications in the first place. Also, I don't want to use Firebase. (For example, my application needs to run in an air-gapped environment.)
The question: Using a stock MySQL database, when a table changes, can a notification server discover the change in real-time (no polling), so that it can send notifications?
The approach that works is to listen to the binary logs. This way, any change to the database will be communicated in real-time. The consumer of the binary logs can then publish this information in a number of ways. A common choice is to feed a stream of events to Apache Kafka.
Debezium, Maxwell, and NiFi work this way.
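To make the last hop of that pipeline concrete, here is a rough sketch of turning change events into client notifications. It assumes events shaped like Debezium's change-event envelope (`op`, `before`, `after`, `source.table`). The `emitter` is a plain Node.js `EventEmitter` standing in for a socket.io server, and the `table:<name>` channel naming is my own invention for the example.

```javascript
const { EventEmitter } = require('events');

// Debezium encodes the operation type as a single letter in the `op` field.
const OPERATIONS = { c: 'created', u: 'updated', d: 'deleted', r: 'snapshot' };

// Convert one change event into a notification payload for clients.
function toNotification(changeEvent) {
  const { op, before, after, source } = changeEvent;
  return {
    table: source.table,
    type: OPERATIONS[op] || 'unknown',
    // Deletes carry the last known row in `before`; other operations use `after`.
    row: op === 'd' ? before : after,
  };
}

// Forward each change event to whoever listens on the table's channel.
// `emitter` is a stand-in for a socket.io server (assumption); with socket.io
// this would likely become something like io.to(channel).emit('change', ...).
function forwardChanges(changeEvents, emitter) {
  for (const changeEvent of changeEvents) {
    const notification = toNotification(changeEvent);
    emitter.emit(`table:${notification.table}`, notification);
  }
}

// Usage: subscribe to one table and feed in a single insert event.
const bus = new EventEmitter();
bus.on('table:orders', (n) => console.log(n.type, n.row));
forwardChanges(
  [{ op: 'c', before: null, after: { id: 1 }, source: { table: 'orders' } }],
  bus
);
```

In a real deployment the array of events would be replaced by whatever consumer reads the stream (for example a Kafka consumer), which is left out here.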
I'm working on migration/integration of large on-premise Oracle monolithic app to cloud based Microservices. For a long time, microservices will need to be fed from and synchronized with the Oracle DB.
One of the alternatives is using Oracle GoldenGate for DB-to-DB(s) near-real-time replication. The advantage is that it seems to be reliable and resilient. The disadvantage is that it works on low-level CDC/DB changes (as opposed to app-level events).
An alternative is creating higher level business events from source DB by enriching data and then pushing it to something like Kafka. The disadvantage is that it puts more load on source DB, and requires durability on the source.
Anybody dealt with similar problems? Any advice is appreciated.
The biggest problem for us has been that legacy data is on the LAN, and our microservices are in the public cloud (in an attempt to avoid a "new legacy" hybrid cloud future).
Oracle GoldenGate for Big Data can push change records as JSON to Kafka/Confluent. There's also the option to write your own handlers. You can find a lot of our PoC code on GitHub.
As time has gone by, it became apparent that the number of feeds was going to end up in the 300+ range, and we're now considering a data virtualisation + caching approach rather than pushing the legacy data to the cloud apps.
We have built a LAMP-stack API application with PHP Laravel. It currently uses a local MySQL instance. We have mostly implemented views in AngularJS.
In order to use Firebase, we need to sync data between the authoritative store in MySQL and anything relevant that exists on Firebase, as close to real-time as possible. This means that other parts of the app which are not real-time and don't use Firebase can also serve up fresh content that was very recently entered into the system.
I know that Firebase is essentially a NoSQL database in the cloud. My question is: how do I write a wrapper or a means to sync the canonical version of my Firebase data into my database of record, MySQL?
Update to answer - our final decision - ditching Firebase as an option
We have decided against this, as we can easily have a socket.io instance on the same server with an extremely low-latency connection to MySQL, so that the two can remain in sync. There's no need to go across the web when resources and endpoints can exist on localhost. It also gives us the option to run our app without any internet connection, which is important if we sell an on-premise appliance to large companies.
A NoSQL sync platform like Firebase is really just a temporary store that makes reads/writes faster in semi-real-time. If they attempt to get into the "we also persist everything for you" business, that's a whole different ask with much more commitment required.
The guarantee of eventual consistency between MySQL and Firebase is more important to get right first, to prevent problems down the line. Also, an RDBMS is essential to our app: it's the only way to attack a lot of data-heavy problems in our analytics/data mappings. There are very strong reasons most of the world still uses an RDBMS like MySQL. You can make those very reliable too, through Amazon RDS and Google Cloud SQL.
There's no specific problem beyond scaling real-time sync that Firebase actually solves for us, which other open source frameworks don't already solve. If their JS lib actually handled offline scenarios (when you START offline) elegantly, I might have considered it, but it doesn't do that yet.
So, YMMV - but in our specific case, we're not considering Firebase for the reasons given above.
The entire topic is incredibly broad, definitely too broad to provide a simple answer to.
I'll stick to the use-case you provided in the comments:
Imagine that you have a checklist stored in MySQL, comprised of some attributes and a set of steps. The steps are stored in another table. When someone updates this checklist on Firebase, how would I sync MySQL as well?
If you insist on combining Firebase and MySQL for this use-case, I would:
Set up your Firebase as a work queue: var ref = new Firebase('https://my.firebaseio.com/workqueue')
Have the client push a work item into Firebase: ref.push({ task: 'id-of-state', newState: 'newstate' })
Set up a (Node.js) server that:
monitors the work queue (ref.on('child_added', ...))
updates the item in the MySQL database
removes the task from the queue
See this GitHub project for an example of a work queue on top of Firebase: https://github.com/firebase/firebase-work-queue
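The server-side steps above could be sketched roughly like this. It is only an outline under assumptions: `queue` and `db` are hypothetical adapters, where `queue.onTaskAdded` would wrap the Firebase `child_added` listener, `queue.remove` would delete the queued child, and `db.updateState` would issue the MySQL UPDATE. The in-memory queue is there so the sketch runs without Firebase.

```javascript
// Sketch of the worker from the steps above. `queue` and `db` are hypothetical
// adapters (assumptions): `queue.onTaskAdded` would wrap the Firebase
// `child_added` listener and `db.updateState` would issue the MySQL UPDATE.
function startWorker(queue, db) {
  queue.onTaskAdded(async (key, item) => {
    try {
      // 1. Apply the requested change to the database of record.
      await db.updateState(item.task, item.newState);
      // 2. Remove the task only after the write succeeded, so a failed
      //    update stays in the queue and can be retried.
      await queue.remove(key);
    } catch (err) {
      console.error(`task ${key} failed, leaving it queued: ${err.message}`);
    }
  });
}

// In-memory queue that mimics push / child_added / remove semantics.
function createMemoryQueue() {
  const tasks = new Map();
  const pending = [];
  let listener = null;
  let nextKey = 1;
  return {
    tasks,
    push(item) {
      const key = `task-${nextKey++}`;
      tasks.set(key, item);
      if (listener) pending.push(listener(key, item));
      return key;
    },
    onTaskAdded(callback) { listener = callback; },
    async remove(key) { tasks.delete(key); },
    // Helper: resolves once every handler started so far has finished.
    idle() { return Promise.all(pending); },
  };
}
```

Removing the task only after the database write succeeds means a crash between the two steps can replay a task, so the update should be safe to apply twice.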
I come from the cliché land of PHP and MySQL on Dreamhost. BUT! I am also a JavaScript genie and I've been dying to get on the Node.js train. In my reading I've inadvertently discovered a NoSQL solution called Redis!
With my shared web host and limited server experience (I know how to install Linux on one of my old Dells and do some basic server admin), how can I get started using Redis and Node.js? And the next best question is: what does one even use Redis for? In what situations would Redis be better suited than MySQL? And does Node.js remove the necessity for Apache? If so, why do developers recommend using an nginx server?
Lots of questions, but there doesn't seem to be a solid source out there with this info all in one place!
Thanks again for your guidance and feedback!
NoSQL is just an inadequate buzzword.
I'll attempt to answer the latter part of the question.
Redis is a key-value store database system. Speed is its primary objective, so most of its use comes from event driven implementations (as it goes over in its reddit tutorial).
It excels at areas like logging, message transactions, and other reactive processes.
Node.js on the other hand is mainly for independent HTTP transactions. It is basically used to serve content (much like a web server, but Node.js really wouldn't be necessarily public facing) very fast which makes it useful for backend business logic applications.
For example, you could have a C program calculate stock values and have Node.js serve the results for another internal application to retrieve, or use Node.js to serve a web page you are developing so your coworkers can view it internally.
It really excels as a middleman between applications.
Redis
Redis is an in-memory datastore: all your data is stored in memory, meaning that a huge database means huge memory usage, but with really fast access and lookup.
It is also a key-value store: you don't have any relationships or queries to retrieve your data. You can only set a key-value pair and retrieve it by its key. (Redis also provides useful types such as sets and hashes.)
These particularities make Redis really well suited for storing sessions in a web application, creating indexes on a database, and handling real-time data like analytics.
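As an illustration of the session use case, here is a small sketch of storing sessions as expiring key-value pairs. It assumes a `client` exposing the promise-based `set`/`get` methods of the node-redis v4 client. The `session:` key prefix and the 30-minute TTL are arbitrary choices for the example.

```javascript
// Sketch: session storage on top of a key-value store with expiry.
// `client` is assumed to expose the promise-based `set`/`get` of the
// node-redis v4 client; any object with the same two methods works.
const SESSION_TTL_SECONDS = 1800; // arbitrary value chosen for the example

function sessionKey(sessionId) {
  return `session:${sessionId}`;
}

async function saveSession(client, sessionId, data) {
  // Values are plain strings, so the session object is serialized to JSON.
  await client.set(sessionKey(sessionId), JSON.stringify(data), {
    EX: SESSION_TTL_SECONDS, // expire automatically, no cleanup job needed
  });
}

async function loadSession(client, sessionId) {
  // A missing or expired key comes back as null.
  const raw = await client.get(sessionKey(sessionId));
  return raw === null ? null : JSON.parse(raw);
}
```

Note that everything goes through a single key lookup: there is no query by user or by date, which is exactly the limitation described above.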
So if you need something that will "replace" MySQL for storing your basic application models, I suggest you try something like MongoDB, Riak or CouchDB, which are document stores.
Document stores manage your data as something analogous to JSON objects (I know it's a huge shortcut).
Read this article if you want to know more about popular nosql databases.
Node.js
Node.js provides asynchronous I/O for the V8 JavaScript engine.
When you run a Node server, it listens on a port on your machine (e.g. 3000). It does not do any sort of domain name resolution or virtual host handling, so you have to put an HTTP server acting as a proxy, such as Apache or nginx, in front of it.
Choosing nginx over Apache in production is a matter of performance, and I find it easier to use. But I suggest you use the one you're most comfortable with.
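To show what "listens on a port" means in practice, here is a minimal sketch using only Node's built-in `http` module. The function names `createApp` and `start` are made up for the example, and port 3000 is just the example port from above.

```javascript
const http = require('http');

// A bare Node.js server: it answers every request arriving on its port and
// knows nothing about domain names or virtual hosts.
function createApp() {
  return http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(`Hello from Node.js, you asked for ${req.url}\n`);
  });
}

// Bind to the loopback interface only, so the server is reachable solely
// through whatever proxy runs on the same machine.
function start(port = 3000) {
  return new Promise((resolve) => {
    const server = createApp();
    server.listen(port, '127.0.0.1', () => resolve(server));
  });
}
```

Calling `start()` makes the app reachable at http://127.0.0.1:3000. The proxy in front (Apache or nginx) would then be configured to forward requests for your domain to that address.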
To get started, just install them and start playing with them. HowToNode
You can get a free plan from https://redistogo.com/ - it is a hosted Redis database instance.
Quick intro to redis data types and basic commands is available here - http://redis.io/topics/data-types-intro.
A good comparison of when to use what is here - http://playbook.thoughtbot.com/choosing-platforms/databases/
We are building a large-scale e-commerce web site to service over 100,000 users, but we expect the number of users to grow rapidly over the first year. In general, the site functions very much like eBay, where users can create, update, and remove listings. Users can also search listings and purchase an item of interest. Basically, the system has transactional and non-transactional requirements:
**Transactional**
Create a listing (multi-record update)
Remove a listing
Update a listing
Purchase a listing (multi-record update)
**Non-Transactional**
Search listings
View a listing
We want to leverage the power of scalable, document-based NoSQL data stores such as Couch or MongoDB, but at the same time we need a relational store to support our ACID transactional requirements. So we have come up with a hybrid solution which uses both technologies.
Since the site is "read mostly", and to meet the scalability needs, we set up a MongoDB data store. For the transactional needs, we set up a MySQL Cluster. As the middleware component we use a JBoss app server cluster.
When a "search" request comes in, JBoss directs the request to Mongo to handle the search which should produce very quick results while not burdening MySQL. When a listing is created, updated, removed, or purchased, JBoss services the transactions against MySQL. To keep MongoDB and MySQL synchronized, all transactional requests handled by JBoss against MySQL would include a final step in the business logic that updates the corresponding document in MongoDB via the listing id; we plan to use the MongoDB Java API to facilitate this integration of updating the document.
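The write and read paths described above might look roughly like the following. This is only a sketch, written in JavaScript for brevity even though our stack is Java on JBoss. `sql.transaction`, `tx.insertOrder`, `tx.markListingSold`, `documents.upsert` and `documents.find` are hypothetical adapter methods, not actual MySQL or MongoDB driver calls.

```javascript
// Sketch of the write path: commit to the relational store, then refresh the
// search document as the final step. `sql` and `documents` are hypothetical
// adapters (assumptions) for the MySQL and MongoDB clients.
async function purchaseListing(sql, documents, listingId, buyerId) {
  // Multi-record update inside one ACID transaction.
  const listing = await sql.transaction(async (tx) => {
    await tx.insertOrder({ listingId, buyerId });
    return tx.markListingSold(listingId);
  });

  // Final step: update the corresponding document via the listing id.
  // This runs after the commit, so a failure here leaves the document
  // store stale until the update is retried.
  try {
    await documents.upsert(listingId, listing);
    return { listing, searchIndexFresh: true };
  } catch {
    return { listing, searchIndexFresh: false };
  }
}

// Read path: searches never touch the relational store.
async function searchListings(documents, predicate) {
  return documents.find(predicate);
}
```

The `searchIndexFresh` flag makes one property of this design visible: the document update is not part of the MySQL transaction, so the two stores can briefly disagree.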
So, in essence, since the site is read mostly, the architecture allows us to scale out MongoDB horizontally to accommodate more users. Using MySQL allows us to leverage the ACID properties of relational databases while keeping our MongoDB store updated through the JBoss middleware.
Is there anything wrong with this architecture? No platform can offer consistency, availability, and partition-tolerance at the same time -- NoSQL systems usually give up consistency -- but at least with this hybrid approach we can realize all three at the cost of additional complexity in the system, and we are ok with that since all of our requirements are being met.
There is nothing wrong with this approach.
In fact, I am currently working on an e-commerce application which leverages both SQL and NoSQL. Ours is a Rails application; 90% of the data is stored in Mongo, and only transactional and inventory items are stored in MySQL. All the transactions are handled in MySQL, and everything else goes to Mongo.
If you have already built it, there isn't too much wrong with the architecture aside from being a little too enterprisey. Starting from scratch on a system like this though, I'd probably leave out SQL and the middleware.
The loss of consistency in NoSQL data stores isn't as complete as you suggest. Aside from the fact that many of them do support transactions and can be set up for immediate consistency on particular queries, I suspect some of your requirements are simply an artefact of designing things relationally. Your concern seems to be around operations that require updates to multiple records - Is a listing really multiple records, or just set up that way because SQL records have to have a flat structure?
Also, if search and view are handled outside of MySQL, you have effectively set up an eventual consistency system anyway.