I have a bit theoretical question.
I have a microservice which handles 200/300 req/sec on peak. In each request we make several calls to DB (MySQL) and return some information.
Given the RPS, does it make sense to cache the data on app level?
Or modern MySQL servers can easily withstand such load and use their own Db-level cache instead?
Thanks
Related
I have a dashboard application that has PHP backend and javascript frontend. Data is read from multiple sources and I have access to databases of all the sources.
While designing the application, is it a good idea to store remote data locally instead of hitting the remote database everytime the application has a request?
Store locally? Reason being the data is not live. I can write a cron to run in the background to update the data every 5 min and the application always will read the data from local DB thereby giving faster load times.
Read from remote every time? Since I have direct database access to all these remote DB's, I do not notice any performance gain of storing data locally over fetching from remote everytime.
Which approach scales better?
What you're describing is called "caching." It's a common optimization.
Fetching data remotely is much more expensive than getting it out of a local cache.
You should learn the Latency Numbers Every Programmer Should Know.
The tricky part of caching is knowing when you need to discard the local cached copy of data and re-fetch it from the remote database. This is a hard problem with no single answer.
There's an old joke attributed to Phil Karlton:
“There are only two hard things in Computer Science: cache invalidation and naming things.”
In my work I need to revamp the web which need to accept numerous connection always. Before I use the JSON to get the data until now.But now I want to direct call the DB and get the data. As I know use cache is the best way for my web. But in initial the concurrent access to DB is often happen.Any advice for me to handle the situation. Because I want the web that can get the updated data always.
Thanks.
Following are my suggestions
If you want to use cache, you have to automate your cache clear process whenever there is an update in the particular data you hit. But this is practically possible if your data is updated infrequently.
If your budget allows, Put your DB in a cluster (Write in master and read from master&slave)
In worst case,ensure your db is properly indexed.
I'm developing an App where I store my data in a DB online using HTTP POSTO and GET.
I need to implement some reliability to my software, so if the user presses the button, and there is no connection, the data should be stored in something (file? sqlite?) and then when the connection is again on, send the HTTP request to send data.
Any advices or pieces of code to show me how to do this?
Thanks.
Sounds good and pretty forward for me. Just go.
You use a local sqlite db as "cache". To keep it simple, do not implement any logic about that into your apps normal code. Just use the local db. Then, separately, you code a synchronizer. That one checks for the online connection and synchronizes the the local sqlite database with a remote database, maybe mysql.
This should be perfectly fine for all applications that to not require immediate exchange of the data with other processes all the time.
There is one catch, though: the low performance of sqlite on bigger data sets. That is an issue with all single file database solutions. So this approach probably is only valid for small data sets in total, or if you can reduce the usage of the local database to only a part of the total data, maybe only the time critical stuff.
Another workaround might be to use joins over two separate databases, the local and the remote one. But such things really boost the complexity of code, so think thrice if that really is required.
The situation is that about 50.000 electronic devices are going to connect to a webservice created in node.js once per minute. Each one is going to send a POST request containg some JSON data.
All this data should be secured.
The web service is going to receive those requests, saving the data to a database.
Also reading requests are possible to get some data from the DB.
I think to build up a system based on the following infrastructure:
Node.js + memcached + (mysql cluster OR Couchbase)
So, what memory requirements do I need to assign to my web server to be able to handle all this connections? Suppose that in the pessimistic possibility I would have 50.000 concurrent requests.
And what if I use SSL to secure the connections? Do I add too much overhead per connection?
Should I scale the system to handle them?
What do you suggest me?
Many thanks in advance!
Of course, it is impossible to provide any valuable calculations, since it is always very specific. I would recommend you just to develop scalable and expandable system architecture from the very beginning. And use JMeter https://jmeter.apache.org/ for load testing. Then you will be able to scale from 1000s to unlimited connections.
Here is a 1 000 000 connections article http://www.slideshare.net/sh1mmer/a-million-connections-and-beyond-nodejs-at-scale
Remember that your nodejs application will be single threaded. Meaning your performance will degrade horribly when you increase the number of concurrent requests.
What you can do to increase your performance is create a node process for each core that you have on your machine all of them behind a proxy (say nginx), and you can also use multiple machines for your app.
If you make requests only to memcache then your api won't degrade. But once you start querying mysql it will start throttling your other requests.
Edit:
As suggested in the comments you could also use clusters to fork worker processes and let them compete amongst each other for incoming requests. (Workers will run on a separate thread, thereby allowing you to use all cores).
Node.js on multi-core machines
Is there an implementation of database (mysql) query caching written purely in Node.js?
I'm writing a Node web app and was planning on caching queries with memcached, but while considering this I realised it's probably possible to do the caching through a separate Node.js layer instead
To explain:
You could query the database through a node server on a separate port, returning data from memory where available and loading it into memory where it isn't.
Anyone know how Node.js would compare to memcache in terms of return speed on hashed arrays? Is this a pipe-dream or something I should look at?
I went ahead and wrote a caching solution for private use that stored the data in a shared object. This wasn't really query caching, it stores specific results instead of raw sql results ordered by hashes, but it kept what I needed in memory and was ridiculously easy to write.
Since I originally asked this question a number of node caching solutions have emmerged:
ptarjan/node-cache
tcs-de/nodecache
vxtindia/node-cache
mape/node-caching
I haven't used any of these but one of them might well be of use to someone else.
There are now also redis and memcached clients for node.
You can definitely implement something like this in node, and it could be an interesting project, but it depends on your needs. If you're just doing this for a hobby project, by all means, build a caching layer in node and try it out. Let us know how it goes!
If this is for production use, then I would recommend sticking to the established caching layers (memcached, redis, etc) as they have already gone through all of the growing pains associated with building a scalable caching system.
I have written a node.js module that performs MySQL query caching using memcached.
The module is named Memento and is available at https://www.npmjs.com/package/memento-mysql
Enjoy!