PHP, MySQL - Storing data locally vs fetching from remote every time

I have a dashboard application with a PHP backend and a JavaScript frontend. Data is read from multiple sources, and I have access to the databases of all the sources.
While designing the application, is it a good idea to store remote data locally instead of hitting the remote databases every time the application gets a request?
Store locally? The data is not live, so I can write a cron job that runs in the background and updates the data every 5 minutes, and the application will always read from the local DB, giving faster load times.
Read from remote every time? Since I have direct database access to all these remote DBs, I do not notice any performance gain from storing data locally over fetching from remote every time.
Which approach scales better?

What you're describing is called "caching." It's a common optimization.
Fetching data remotely is much more expensive than getting it out of a local cache.
You should learn the Latency Numbers Every Programmer Should Know.
The tricky part of caching is knowing when you need to discard the local cached copy of data and re-fetch it from the remote database. This is a hard problem with no single answer.
There's an old joke attributed to Phil Karlton:
“There are only two hard things in Computer Science: cache invalidation and naming things.”
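To make the cron-driven approach from the question concrete, here is a minimal sketch of the refresh job, written in TypeScript with the mysql2 library. The connection details, table name, and columns are hypothetical stand-ins for your real sources:

```typescript
// refresh-cache.ts -- run from cron, e.g.: */5 * * * * node refresh-cache.js
import mysql, { RowDataPacket } from 'mysql2/promise';

async function refreshCache(): Promise<void> {
  // Hypothetical connection details; substitute one block per remote source.
  const remote = await mysql.createConnection({
    host: 'remote.example.com', user: 'reader', password: 'secret', database: 'source',
  });
  const local = await mysql.createConnection({
    host: 'localhost', user: 'app', password: 'secret', database: 'cache',
  });

  // Pull the current snapshot from the remote source (illustrative table).
  const [rows] = await remote.query<RowDataPacket[]>(
    'SELECT id, metric, value, updated_at FROM metrics'
  );

  // Upsert into the local copy so dashboard requests never touch the remote DB.
  for (const r of rows) {
    await local.execute(
      `INSERT INTO metrics (id, metric, value, updated_at) VALUES (?, ?, ?, ?)
       ON DUPLICATE KEY UPDATE metric = VALUES(metric), value = VALUES(value),
                               updated_at = VALUES(updated_at)`,
      [r.id, r.metric, r.value, r.updated_at]
    );
  }

  await remote.end();
  await local.end();
}

refreshCache().catch((err) => { console.error(err); process.exit(1); });
```

Dashboard reads then only ever touch the local database, so page loads are independent of the remote sources' latency; the data is at most one cron interval stale.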

Related

What is the best strategy to store redis data to MySQL for permanent storage?

I am running a couple of crawlers that produce millions of datasets per day. The bottleneck is the latency between the spiders and the remote database: if the spider server is located too far away, the latency slows the crawlers down to the point where they can no longer complete the datasets needed for a day.
In search of a solution I came upon Redis, with the idea of installing Redis on the spider server, where it will temporarily store the collected data with low latency, and then somehow push that data to MySQL.
The setup so far is like this:
About 40 spiders running on multiple instances feed one central remote MySQL 8 server on a dedicated machine over TCP/IP.
Each spider writes different datasets. One kind of spider gets positions and prices of search results: there are 100 results per page, with around 200-300 inserts for one page, and a delay of about 2-10 s before the next request/page.
The latter is the problem, as the spider yields every position within the page and creates a remote insert within a transaction, maybe even a connect (not sure at the moment).
This currently only works because the spiders and the remote MySQL server are close (same data center), with ping times of 0.0x ms; it does not work with ping times of 50 ms, as the spiders cannot write fast enough.
Is Redis, or maybe DataMQ, a valid approach to solve this problem, or are there other recommended ways of doing it?
Did you mean you have installed a Redis server on each spider server?
Actually, that is not an ideal solution for your case. But if you have already done so and still want to use MySQL to persist your data, a cron job on each server is an option.
You can create a cron job on each spider server (based on your dataset and your needs, you can choose a daily or hourly sync job) and write a data transfer script that scans your Redis and transfers the data to the MySQL tables, as sketched below.
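As a rough illustration of that transfer script (a sketch, not a drop-in solution): it assumes the spiders LPUSH each dataset as a JSON string onto a Redis list, and that the local Redis is version 6.2+ so RPOP accepts a count. The key name, table, and columns are made up. TypeScript with ioredis and mysql2:

```typescript
// transfer.ts -- cron job on each spider server: drain the local Redis buffer
// into the central MySQL server in large batches.
// Assumes spiders LPUSH each dataset as JSON onto "crawl:results" and that
// the local Redis is 6.2+ (RPOP with a count). Names are hypothetical.
import Redis from 'ioredis';
import mysql from 'mysql2/promise';

async function drain(): Promise<void> {
  const redis = new Redis(); // local instance, sub-millisecond latency
  const db = await mysql.createConnection({
    host: 'central.example.com', user: 'spider', password: 'secret', database: 'crawl',
  });

  const BATCH = 1000;
  for (;;) {
    // Take up to BATCH items off the tail of the list.
    const items = await redis.rpop('crawl:results', BATCH);
    if (!items || items.length === 0) break;

    const rows = items.map((s) => JSON.parse(s));
    // One multi-row INSERT per batch: the 50 ms round trip is paid once
    // per thousand rows instead of once per row.
    await db.query(
      'INSERT INTO results (keyword, position, price) VALUES ?',
      [rows.map((r) => [r.keyword, r.position, r.price])]
    );
  }

  await db.end();
  redis.disconnect();
}

drain().catch((err) => { console.error(err); process.exit(1); });
```

The point of batching is that one multi-row INSERT costs a single round trip, which is what makes the high-latency link tolerable.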
I recommend using MongoDB instead of MySQL to store the data.

Reliability in Android when the connection is off

I'm developing an app where I store my data in an online DB using HTTP POST and GET.
I need to implement some reliability in my software: if the user presses the button and there is no connection, the data should be stored somewhere (a file? SQLite?) and then, when the connection is back, the HTTP request should be sent.
Any advice or pieces of code showing how to do this?
Thanks.
Sounds good and pretty straightforward to me. Just go for it.
You use a local SQLite DB as a "cache". To keep it simple, do not implement any logic about that in your app's normal code; just use the local DB. Then, separately, you code a synchronizer. It checks for the online connection and synchronizes the local SQLite database with a remote database, maybe MySQL.
This should be perfectly fine for all applications that do not require immediate exchange of the data with other processes all the time.
There is one catch, though: the low performance of sqlite on bigger data sets. That is an issue with all single file database solutions. So this approach probably is only valid for small data sets in total, or if you can reduce the usage of the local database to only a part of the total data, maybe only the time critical stuff.
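A minimal sketch of that store-and-forward pattern, in TypeScript with better-sqlite3 for brevity (on Android you would use the platform SQLite API instead; the endpoint URL and table layout are invented for the example):

```typescript
// sync.ts -- store-and-forward: writes go to a local SQLite "outbox",
// a separate synchronizer flushes it once the connection is back.
// Endpoint URL and table layout are invented; requires Node 18+ for fetch.
import Database from 'better-sqlite3';

const db = new Database('outbox.db');
db.exec(
  'CREATE TABLE IF NOT EXISTS outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)'
);

// Called by the button handler: always write locally, never block on the network.
export function enqueue(payload: object): void {
  db.prepare('INSERT INTO outbox (payload) VALUES (?)').run(JSON.stringify(payload));
}

// The synchronizer: run it on a timer or on a "connectivity restored" event.
export async function flush(): Promise<void> {
  const rows = db
    .prepare('SELECT id, payload FROM outbox ORDER BY id')
    .all() as { id: number; payload: string }[];

  for (const row of rows) {
    const res = await fetch('https://example.com/api/data', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: row.payload,
    });
    if (!res.ok) return; // still offline or server error: keep the row, retry later
    db.prepare('DELETE FROM outbox WHERE id = ?').run(row.id); // delivered
  }
}
```

The button handler only ever calls enqueue(), so it behaves identically online and offline; flush() can run on a timer or be triggered by a connectivity event.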
Another workaround might be to use joins over two separate databases, the local and the remote one. But such things really boost the complexity of code, so think thrice if that really is required.

Rails app with remote database - should I duplicate in my app or connect remotely

I'm building a new rails app for a client. They already have a separate rails app that manages users (with all the standard Devise fields) and don't want to have to maintain users in both apps, which makes total sense.
I'm able to connect to their remote database using database.yml for the connection details and establish_connection: in my User model. It works, although it is a bit slow (going over the public internet). I'm concerned that relying on this remote database for something that is queried A LOT will seriously slow down my app. I also won't be able to do joins with the remote database.
My thought is to duplicate the user table in my app and have a cron job that runs once every few hours (or even more frequently) that keeps my table in sync with the "master".
Is there any reason not to do that? Is it a terrible idea from a design perspective?
I should mention that my DB is postgres and the remote DB is mysql. I also started reading up on the DbCharmer gem (http://dbcharmer.net/) but I don't fully understand it yet.
--Edit:--
I should also mention that I will need to read other tables from the remote DB, not just the users table.
I would recommend caching their DB locally: when you look up a remote record, record it locally (if it existed remotely), or record a negative result locally if it didn't exist remotely; that is, you cache a record of the remote record's absence. Remember to cache negative results for less time than positive results.
You can then look at your local cache and see if there's a fresh-enough result to return and only query the remote if the locally cached result is stale or there isn't a locally cached result.
This is how I'd do it personally; I'd cache rather than copy and sync. You can certainly combine the two approaches by pre-fetching commonly fetched things into the cache on a regular basis, though.
There's no need to use Pg for the local cache; you can just as easily use Redis/memcached/whatever (and I'm a Pg dev, so I'm not exactly biased in favour of Redis).
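A sketch of that look-aside pattern with negative caching, using Redis from TypeScript (the key scheme, TTLs, sentinel value, and the remote lookup are all illustrative, not prescriptive):

```typescript
// Look-aside cache with negative caching. Key scheme, TTLs and the remote
// lookup are illustrative. The sentinel marks a cached "does not exist".
import Redis from 'ioredis';

const redis = new Redis();
const POSITIVE_TTL = 300; // found records stay cached for 5 minutes
const NEGATIVE_TTL = 60;  // cache absence for less time than presence
const MISS = '__MISS__';

// Stand-in for the slow query over the public internet.
async function fetchUserRemote(id: number): Promise<string | null> {
  return null;
}

export async function getUser(id: number): Promise<string | null> {
  const key = `user:${id}`;

  const cached = await redis.get(key);
  if (cached !== null) {
    // Fresh-enough local answer, positive or negative: no remote round trip.
    return cached === MISS ? null : cached;
  }

  const user = await fetchUserRemote(id); // stale or empty cache: go remote
  if (user !== null) {
    await redis.set(key, user, 'EX', POSITIVE_TTL);
  } else {
    await redis.set(key, MISS, 'EX', NEGATIVE_TTL); // remember the absence too
  }
  return user;
}
```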

Mysql with Node.js: Does it make sense to have node.js save/load stuff to/from the database all the time?

So I have a small game in Node.js (only the server, of course) which has map data and player accounts stored in a MySQL database. Right now I have constructed it in a way that minimizes the number of queries, by loading data from the database, keeping it in JavaScript objects/arrays or whatever seems appropriate, and only writing to the database when needed.
Now I was thinking: is this really worth it? In many cases it would be a lot better (the data would be safer and WAY more up-to-date) to hardly store any data on the server and just load it from the database when needed (and write it when it needs to be changed).
My question is: is it efficient/safe/recommendable to have the server read from and write to the database often, rather than keeping data from the database in JavaScript variables on the server?
Additional info:
-The nodejs server and my mysql server are on the same machine and a query usually takes less than 1ms or maybe 3ms for big queries like loading room data.
-I am using a module simply called mysql.
-If needed I will include extra info, just ask in a comment.
It really depends on your use case. Generally speaking, I would not add another layer of caching in Node.js, but would handle that in your DB with a bigger cache and optimized queries.
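For scale, a sketch of the query-per-request style being recommended, using a mysql2 connection pool from TypeScript (table and column names are invented): with sub-millisecond local queries, the pool plus InnoDB's buffer pool is usually cache enough.

```typescript
// Query-per-request through a pool: with MySQL on the same machine and
// sub-millisecond queries, each request just hits the DB and relies on
// InnoDB's buffer pool. Table and column names are invented.
import mysql from 'mysql2/promise';

const pool = mysql.createPool({
  host: 'localhost', user: 'game', password: 'secret', database: 'game',
  connectionLimit: 10,
});

export async function loadRoom(roomId: number) {
  const [rows] = await pool.execute(
    'SELECT tile_x, tile_y, tile_type FROM room_tiles WHERE room_id = ?',
    [roomId]
  );
  return rows;
}

// Write-through on change: the database is always as fresh as the last action.
export async function savePlayerPosition(playerId: number, x: number, y: number): Promise<void> {
  await pool.execute('UPDATE players SET x = ?, y = ? WHERE id = ?', [x, y, playerId]);
}
```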

best practice: mysql remote mobile devices sync over 3G connection

Currently we have one master MySQL server that connects every hour to 100 remote mobile devices [vehicles] over a 3G connection [not very reliable: a few cars get disconnected daily while a sync is in progress]. The sync is done through a .NET Windows service tool. After checking the remote MySQL status, the master starts performing the sync. Sometimes the sync payload is about 6-8 MB. The sync is performed for one table only, using a non-transactional approach.
The MySQL server version in use is 4.1.22.
Questions:
Is it useful to make the sync transactional, knowing that only one table gets synced? Or is there no value added?
The sync data is loaded onto the remote machine using the MySQL statement:
LOAD DATA LOCAL INFILE
The file format is CSV. How can I send the data in a compressed format, without developing a tool that resides on the remote device?
Is it good practice, or good architecture in the sync domain, to deploy a remote application that performs the sync after the data is sent, or should the sync be done directly by the master? I mean, a tool that resides on the remote machine will be difficult to update or fix when new requirements appear. But it would save a lot of bandwidth for the sync operation, and it would eliminate the errors that can arise when a disconnection occurs while a live master sync is in progress. So if this is the recommended way, only compressed data would be sent, and using some sort of checksum I would verify that the whole payload arrived; otherwise the request would be initiated again.
please share your thoughts and experience.
thanks,
Firstly, I would change the approach to a client-initiated sync rather than a server-initiated sync. A many-to-one design will scale much more easily than your current one-to-many setup. My comments above give a few good examples of the required client-to-server syncing.
Secondly, turn on transactional record entry. There is no reason not to have it. This will guarantee that the information gets entered in a timely fashion and can even provide more 'metadata' (such as which clients are slow to update, etc.).
Lastly, you can enhance this uploading by taking a different look at it. If you were to implement a service on the server side that takes in data via a POST from the client, you'd be able to send the data to the server without issues. It would be just like uploading a file to a server. Once your 6-8 MB file is uploaded, it is put into the database. The great thing about this is that if your server is Apache (or, in your case, IIS), you can have every single client uploading data at the same time without much of an issue. At that point, inserting into the MySQL server takes virtually no time and your process continues without a problem.
This is the way I'd handle your situation...
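As a sketch of that client-initiated upload service (Node/TypeScript with Express standing in for the IIS service; the URL, table, and CSV layout are assumptions):

```typescript
// upload-server.ts -- client-initiated sync: the vehicle POSTs its CSV
// gzip-compressed; the server decompresses and bulk-inserts. A dropped 3G
// connection just means the client retries the whole POST. Names invented.
import express from 'express';
import { gunzipSync } from 'zlib';
import mysql from 'mysql2/promise';

const app = express();
const pool = mysql.createPool({
  host: 'localhost', user: 'sync', password: 'secret', database: 'fleet',
});

// Collect the raw gzipped body; assumes each CSV line is "vehicle_id,recorded_at,reading".
app.post('/sync', express.raw({ type: 'application/gzip', limit: '20mb' }), async (req, res) => {
  try {
    const csv = gunzipSync(req.body).toString('utf8');
    const rows = csv.trim().split('\n').map((line) => line.split(','));

    // One transactional multi-row insert: all-or-nothing per upload, so a
    // checksum mismatch or failure can simply be answered with a retry.
    const conn = await pool.getConnection();
    try {
      await conn.beginTransaction();
      await conn.query(
        'INSERT INTO vehicle_data (vehicle_id, recorded_at, reading) VALUES ?',
        [rows]
      );
      await conn.commit();
      res.sendStatus(200);
    } catch (e) {
      await conn.rollback();
      throw e;
    } finally {
      conn.release();
    }
  } catch (e) {
    res.sendStatus(500);
  }
});

app.listen(8080);
```

Gzip typically shrinks a CSV payload severalfold, so the 6-8 MB transfer over 3G becomes much smaller, and the transactional insert means a partial upload never leaves half-written rows behind.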