Rails 3: How to implement a query cache in MongoDB - mysql

I did some research on MongoDB and realised that it doesn't have a query cache.
MongoDB does not implement a query cache: MongoDB serves all queries directly from the indexes and/or data files. (http://docs.mongodb.org/manual/faq/fundamentals/)
Is there a way to implement a query cache in Rails for MongoDB? I just want the same behaviour as the MySQL query cache: the same database query should be faster the second time it runs.
Thanks!

You could add a caching layer using memcached, but MongoDB will probably still have the data paged into memory from the last read/write operation already. Running memcached on your MongoDB server means it competes for memory with MongoDB's memory-mapped file model. Less memory for MongoDB means more swapping to disk.
If you're running map reduce jobs (large enough to cause paging), it may be worth caching results, but tracking updates properly could be very tricky.
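To illustrate caching the result of an expensive job, here is a minimal sketch of a time-based cache in Python. It sidesteps update tracking by simply letting entries expire. The names `TTLCache`, `fetch` and `run_expensive_report` are hypothetical, and the dictionary stands in for memcached or any other store. In a Rails app, `Rails.cache.fetch` with `expires_in` plays the same role.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-process cache; each entry expires after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # key -> (expiry time, cached value)
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def fetch(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling compute() on a miss or expiry."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


def run_expensive_report() -> dict:
    # Hypothetical placeholder for a slow map-reduce job or aggregation.
    return {"orders": 42}


cache = TTLCache(ttl_seconds=300)
report = cache.fetch("daily_report", run_expensive_report)
```

Expiry trades freshness for simplicity: readers may see data up to `ttl_seconds` old, but no write path has to know about the cache.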

Related

Why is MySQL more used than redis in persistence

I can think of two reasons:
1. MySQL and Redis both provide persistence, so why is MySQL used more than Redis for persistence? Maybe it is because Redis has no indexes and cannot answer queries directly from disk. But since we can query from memory, there is no need to query from disk.
2. Redis saves data to disk periodically, so data loss may occur. But does MySQL save data to disk immediately after an insert, with no time window?
Redis and MySQL are really two very different technologies. Redis primarily serves as a cache for storing data temporarily as a key-value store. While it is true that Redis can be configured to write back to a database or file under the hood, Redis itself is neither of these things. Instead, Redis is meant to store data which generally would be considered volatile.
On the other hand, MySQL is a database and a full blown data store. MySQL is suitable for permanently storing data, and also exposes a rich API for making it easy to query and search its data.
In terms of common ground, a query against a MySQL column which has a hash index would behave somewhat similarly to a lookup in a Redis cache, each using a certain key. But the difference is that Redis serves lookups from memory and will generally be much faster than a database, roughly 100 times as a ballpark figure, although the actual factor depends on the workload. For this reason, when a lightning fast cache technology is needed, MySQL often will not be suitable for this purpose, but a cache like Redis might be suitable.

Where to use MySQL query caching

problem
I am developing a system and came across the query caching concept as a way to get faster response times. I want to find out which kinds of web application traffic benefit from query caching and which do not, and what the downsides of query caching are.
Whether the Query Cache is good for you depends on:
which MySQL version you are using
the scale of your application
what kind of queries you want to cache
How it works
If the MySQL Query Cache is enabled, MySQL stores the text of each SELECT statement together with the result that was sent to the client. Whenever a query arrives, MySQL looks for an identical statement in the query cache, and if it finds one, it returns the stored result without parsing or executing the query again.
Issues & Limitations
Please remember that although the cache stores the result of your query, you will not receive old/stale data from a cached query: if any of the underlying tables of a cached query changes, all cached queries that use that table are invalidated.
Among other things, there are serious limitations to the Query Cache. It is not used for queries inside stored procedures, functions and triggers. It is also not used for queries which are subqueries of an outer query.
It was once considered a great tool for speeding up queries, but the MySQL development team decided to retire the feature after finding scalability issues with it: the query cache is deprecated as of MySQL 5.7.20 and removed in MySQL 8.0.
Do read this article on the MySQL Server Team's blog about retiring the Query Cache in MySQL 8.0.
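As a rough mental model of that behaviour, the following Python sketch keys cached results by the exact statement text and drops every entry that reads a table once that table changes. It is an illustration only, not MySQL's implementation. `ToyQueryCache`, `select` and `table_changed` are made-up names, and the caller passes in the set of tables a statement reads, which the real server works out itself.

```python
from typing import Callable, Dict, List, Set, Tuple

Rows = List[tuple]


class ToyQueryCache:
    """Illustrative model only; not MySQL's actual implementation."""

    def __init__(self) -> None:
        # Exact statement text -> (tables it reads, cached result rows)
        self._entries: Dict[str, Tuple[Set[str], Rows]] = {}
        self.hits = 0
        self.misses = 0

    def select(self, sql: str, tables: Set[str], execute: Callable[[], Rows]) -> Rows:
        """Serve an identical statement from the cache, else execute and store it."""
        cached = self._entries.get(sql)
        if cached is not None:
            self.hits += 1
            return cached[1]
        self.misses += 1
        rows = execute()
        self._entries[sql] = (set(tables), rows)
        return rows

    def table_changed(self, table: str) -> None:
        """Drop every cached statement that reads the modified table."""
        stale = [sql for sql, (used, _) in self._entries.items() if table in used]
        for sql in stale:
            del self._entries[sql]


cache = ToyQueryCache()
users = [(1, "ann"), (2, "bo")]
query = "SELECT * FROM users WHERE id = 1"
run = lambda: [row for row in users if row[0] == 1]

cache.select(query, {"users"}, run)          # miss: executed and stored
cache.select(query, {"users"}, run)          # hit: identical text
cache.select(query.lower(), {"users"}, run)  # miss: the text differs
cache.table_changed("users")                 # any write to users flushes its entries
cache.select(query, {"users"}, run)          # miss again
```

The model makes the scalability problem visible: on a write-heavy table, entries are flushed almost as fast as they are stored, so the cache adds bookkeeping without producing hits.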

Caching the data result of complex computation

I have a Spring Boot server application. Clients of this server ask for statistics about different things all the time. These statistics can be shared among clients, and need not be real time.
It's good enough if these statistics are refreshed every 15-30 mins.
Also, computing these statistics requires reading the whole database.
So, I'd like to cache these computed statistics and update them now and then.
What is your suggestion, what tool or pattern should I use?
I have the following ideas so far:
using memcached
upgrading to MySQL 5.7 which has JSON store, and store the data there
Please keep in mind that the hardware of my server is not too powerful: 512MB RAM and 1 CPU (cheapest option in DigitalOcean).
Thank you in advance!
Edit 1:
These statistics are composed of quite simple data structures: int to int maps, lists, etc. and they do NOT fit well into a relational database.
Edit 2:
The whole data set is only a few megabytes. The crucial point is that creating this data requires a lot of database reads, and a lot of clients are asking for it.
I also want to keep my server application stateless. I think it's important to mention.
A simple solution to the problem is to save the data in JSON format to a file, and that's it.
Additionally, this file can be on a ram disk partition, so it will be blazing fast.
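A minimal sketch of that file-based approach, written in Python rather than the asker's Spring Boot stack: the statistics are recomputed only when the JSON file is missing or older than a chosen age. The function name `load_stats`, its parameters and the file path are assumptions made for illustration.

```python
import json
import os
import tempfile
import time
from typing import Callable


def load_stats(path: str, max_age_seconds: float, compute: Callable[[], dict]) -> dict:
    """Return stats from the JSON file, recomputing if it is missing or too old."""
    try:
        age = time.time() - os.path.getmtime(path)
        if age <= max_age_seconds:
            with open(path, "r", encoding="utf-8") as f:
                return json.load(f)
    except FileNotFoundError:
        pass  # No snapshot yet; fall through and build one.

    payload = json.dumps(compute())
    # Write to a temp file, then rename, so readers never see a partial file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(payload)
    os.replace(tmp_path, path)
    # Return the round-tripped value so callers always see the same shape.
    return json.loads(payload)
```

A call such as `load_stats("/mnt/ramdisk/stats.json", 15 * 60, compute_stats)` (hypothetical path and function) would match the 15 minute refresh window. JSON object keys are always strings, so an int to int map comes back with string keys. Because the state lives in the file, the application process itself stays stateless.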

Using Redis to cache SQL result

I have a SQL-based application and I like to cache the result using Redis. You can think of the application as an address book with multiple SQL tables. The application performs the following tasks:
40% of the time:
Create a new record / Update an existing record
Bulk update multiple records
Review an existing record
60% of the time:
Search records based on user's criteria
This is my current approach:
The system caches a record when a record is created or updated.
When user performs a search, the system will cache the query result.
On top of that, I have a Redis look-up table (Redis Set) which stores the MySQL record ID and the Redis cache key. That way I can delete the Redis caches if the MySQL record has been changed (e.g., bulk update).
What if a new record is created after the system caches the search result? If the new record matches the search criteria, the system will always return the old cache (which does not include the new record), until the cache is deleted (which won't happen until an existing record in the cache is updated).
The search is driven by the users and the combination of the search condition is countless. It is not possible to evaluate which cache should be deleted when a new record is created.
So far, the only solution is to remove all caches of a MySQL table when a record is created. However this is not a good choice because lots of records are created daily.
In this situation, what's the best way to implement Redis on top of MySQL?
Here's a surprising thing when it comes to PHP and MySQL (I am not sure about other languages) - not caching stuff into memcached or Redis is actually faster. Much faster. Basically, if you just built your app and queried MySQL - you'd get more out of it.
Now for the "why" part.
InnoDB, the default engine, is a superb engine. Specifically, its memory management (allocation and what not) is superior to that of any in-memory storage solution. That's a fact, you can look it up or take my word for it - it will, at least, perform as well as Redis.
Now what happens in your app - you query MySQL and cache the result into redis. However, MySQL is also smart enough to keep cached results. What you just did is create an additional file descriptor that's required to connect to Redis. You also used some storage (RAM) to cache the result that MySQL already cached.
Here comes another interesting part - the preferred way of serving PHP scripts is by using php-fpm - it's much quicker than any mod_* crap out there. Down to the core, php-fpm is a supervisor process that spawns child processes. They don't shut down after the script is served, which means they cache connections to MySQL - connect once, use multiple times. Basically, if you serve scripts using php-fpm, they will reuse the already established connection to MySQL, meaning that you won't be opening and closing connections for each request - this is extremely resource friendly and it lets you have lightning fast connection to MySQL. MySQL, being memory efficient and having the cached result is much quicker than Redis.
Now what does all of this mean for you - having a proper setup lets you have small code that's simple, easy, doesn't involve Redis and eliminates all the problems that you might have with cache invalidation and what not and you won't waste your memory to contain the same data twice.
Ingredients you need for this to work:
php-fpm
MySQL and InnoDB based tables and most of all - sufficient RAM and tweaked innodb_buffer_pool_size variable. That one controls how much RAM InnoDB is allowed to allocate for its purposes - the larger the better.
You eliminated Redis from the game, you kept your code simple and easy to maintain, you didn't duplicate data, you didn't introduce additional system to the play and you let software that's meant to take care of data do its job. Pretty cheap trade-off for maximum usefulness, even if you compile all the software from scratch - it won't take more than an hour or so to get it up and running.
Or, you can just ignore what I wrote and look for a solution using Redis.
We ran into the same problem and chose to do the same thing you are thinking of: remove all query caches affected by the table. It is not ideal, as you said, but fortunately our share of writes is not as high as 40%, so it has been OK so far.
That's the nature of query based caching. As an alternative you can add entity based caching. Instead of caching the search result only, cache the entire table and do the search inside memory. We use C# LINQ so we can do pretty common queries in memory but if the search is too complicated then you are out of luck.
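The entity-based alternative can be sketched as follows, in Python rather than C# LINQ. The class `EntityCache` and the `load_all` loader are hypothetical, and the sketch assumes a single process and a table small enough to hold in memory. Because searches run against the cached records themselves, a newly created record shows up in the next search without any query-level invalidation.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


class EntityCache:
    """Keeps one table's records in memory and searches them there."""

    def __init__(self, load_all: Callable[[], List[Record]]):
        # load_all is a hypothetical loader, e.g. one "SELECT * FROM contacts".
        self._by_id: Dict[int, Record] = {int(r["id"]): dict(r) for r in load_all()}

    def upsert(self, record: Record) -> None:
        """Call after every successful INSERT or UPDATE in the database."""
        self._by_id[int(record["id"])] = dict(record)

    def delete(self, record_id: int) -> None:
        """Call after a successful DELETE in the database."""
        self._by_id.pop(record_id, None)

    def search(self, predicate: Callable[[Record], bool]) -> List[Record]:
        """Evaluate arbitrary search criteria against the cached records."""
        return [dict(r) for r in self._by_id.values() if predicate(r)]


contacts = EntityCache(lambda: [{"id": 1, "name": "Ann", "city": "Oslo"}])
contacts.upsert({"id": 2, "name": "Bo", "city": "Oslo"})  # new record
in_oslo = contacts.search(lambda r: r["city"] == "Oslo")  # includes the new record
```

With several application servers, the in-process dictionary would have to be replaced by a shared store, and searches that the in-memory predicate cannot express would still go to the database.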

MySQL Cluster with Memcache Support

I've been reading up about MySQL Cluster 7, and it appears that there is some support for a memcache storage engine.
Does the implementation require any custom code in the application (making requests to memcache), or is it integrated to the point where I could run
select cars.* from cars WHERE cars.id = 100
and MySQL cluster + memcache would be able to "automatically" look at the memcache cache first, and if there wasn't a hit, look in MySQL?
Likewise with updates: would I have to set the data in memcache manually with every modification, or is there a mechanism that will do it for me?
Memcached would not provide the functionality that you describe. Memcached is key-value storage, and it does not automatically cache any query results. You would need to write code to store the results. Some frameworks make this easier.
MySQL's query caching can cache query results, but you're still hitting MySQL.
MySQL's NDB cluster is a clustered in-memory storage engine that is able to serve up relational data very fast thanks to load balancing and partitioning.
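The hand-written caching code that plain memcached requires usually follows the cache-aside pattern, sketched below in Python. To keep the example runnable, `sqlite3` stands in for MySQL and a plain dictionary stands in for a memcached client. Both are assumptions for illustration, as are the function names and the `cars:<id>` key format.

```python
import sqlite3
from typing import Dict, Optional, Tuple

# Stand-ins (assumptions): sqlite3 plays the role of MySQL and a dict plays
# the role of a memcached client with get/set/delete operations.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, model TEXT)")
db.execute("INSERT INTO cars (id, model) VALUES (100, 'Saab 900')")
db.commit()
cache: Dict[str, Tuple[int, str]] = {}


def get_car(car_id: int) -> Optional[Tuple[int, str]]:
    """Look in the cache first; on a miss, read the database and store the row."""
    key = f"cars:{car_id}"
    if key in cache:
        return cache[key]
    row = db.execute("SELECT id, model FROM cars WHERE id = ?", (car_id,)).fetchone()
    if row is not None:
        cache[key] = row
    return row


def update_car_model(car_id: int, model: str) -> None:
    """Write to the database, then drop the cached copy so it is re-read."""
    db.execute("UPDATE cars SET model = ? WHERE id = ?", (model, car_id))
    db.commit()
    cache.pop(f"cars:{car_id}", None)
```

Deleting the key on update, rather than overwriting it, keeps the write path simple: the next `get_car` call repopulates the cache from the database.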
Take a look at this blog to learn more about the implementation and capabilities of the memcached API for MySQL Cluster:
http://www.clusterdb.com/mysql-cluster/scalabale-persistent-ha-nosql-memcache-storage-using-mysql-cluster/
Essentially, the API is implemented as a plug-in to the memcached server, which can then communicate directly with the data nodes via memcached commands, without going through an SQL layer. This gives you very fast native access to your data, with full persistence, scalability, write throughput and schema or schemaless data storage.