I want to use Redis as a cache for MySQL. The main idea is:
Query
read from redis
if the key does not exist, read from MySQL and add the result to the Redis cache
Add
write to mysql directly
Update & Delete
write to mysql
invalidate the cache of redis
My question is: how to invalidate the cache?
I know I can delete the key or set an expiry time. Is that the usual way, or are there any standard methods to invalidate the cache?
Thanks!
You need some trigger to tell Redis that the MySQL data has been updated.
This can be part of your code: whenever you save data to MySQL, you invalidate the Redis cache as well.
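A minimal sketch of doing this in application code, assuming a cache client with get/set/delete methods shaped like redis-py's. The `InMemoryCache` stand-in, the `UserRepository` class, the `user:<id>` key scheme and the dict used in place of MySQL are all hypothetical and only there to keep the example self-contained:

```python
import json


class InMemoryCache:
    """Hypothetical stand-in for a Redis client; exposes get/set/delete only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, ex=None):
        # `ex` mirrors redis-py's expiry argument; this stand-in ignores it.
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class UserRepository:
    """Cache-aside reads, invalidate on write. `db` is a dict standing in for MySQL."""

    def __init__(self, db, cache, ttl_seconds=300):
        self.db = db
        self.cache = cache
        self.ttl_seconds = ttl_seconds

    def _key(self, user_id):
        return f"user:{user_id}"  # assumed key scheme

    def get(self, user_id):
        cached = self.cache.get(self._key(user_id))
        if cached is not None:
            return json.loads(cached)
        row = self.db.get(user_id)  # cache miss: fall back to the database
        if row is not None:
            self.cache.set(self._key(user_id), json.dumps(row), ex=self.ttl_seconds)
        return row

    def update(self, user_id, row):
        self.db[user_id] = row                 # write to the database first
        self.cache.delete(self._key(user_id))  # then drop the stale entry

    def delete(self, user_id):
        self.db.pop(user_id, None)
        self.cache.delete(self._key(user_id))
```

Deleting the key rather than rewriting it keeps the write path simple: the next read repopulates the cache from the database. The expiry time acts as a safety net for any write that bypasses this code.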
Alternatively, you can use a change stream such as http://debezium.io/ to capture changes in the database and take the necessary actions, like invalidating the cache.
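As a rough illustration of the change-stream route, the sketch below turns one already-parsed change event into cache keys to drop. It assumes Debezium's usual envelope fields (`op`, `before`, `after`, `source.table`); the `table:id` key scheme, the `id` primary-key column and the plain dict used as the cache are assumptions made for the example, and consuming the events from Kafka is left out:

```python
def keys_to_invalidate(event, key_column="id"):
    """Return the cache keys made stale by one Debezium-style change event."""
    # The envelope may or may not be wrapped in a "payload" field.
    payload = event.get("payload", event)
    # Only updates ("u") and deletes ("d") can leave stale entries behind;
    # freshly created rows were never cached in the scheme described above.
    if payload.get("op") not in ("u", "d"):
        return []
    table = payload["source"]["table"]
    keys = []
    for image in (payload.get("before"), payload.get("after")):
        if image and key_column in image:
            key = f"{table}:{image[key_column]}"  # assumed key scheme
            if key not in keys:
                keys.append(key)
    return keys


def apply_event(cache, event):
    """Drop every key affected by the event from a dict-like cache."""
    for key in keys_to_invalidate(event):
        cache.pop(key, None)
```

Both row images are checked so that an update which changes the primary key invalidates the old and the new key.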
I have a Spring Boot service that writes to an Apache Ignite cache.
The cache is write-through, write-behind and read-through.
I have MySQL as the persistence store that the cache writes behind to.
I would like to remove from the cache all entries that have already been written to the MySQL database.
I am using the cache as a staging area so that I do not make a DB call every second; instead, I set the cache to write behind every 30 seconds.
I would like to know how to clear the cache once the write-behind is complete.
You may call cache.clear(); it will not remove entries from the underlying third-party persistence, if that's what you are asking.
I think the only way is to extend the CacheStore that you use: after the writes to the DB, create a task to remove the persisted records. Make sure that you execute it asynchronously and with withSkipStore, otherwise you can easily get a deadlock and/or thread starvation.
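The shape of that idea can be sketched without Ignite. The following is not Ignite code: `StagingCache`, its `skip_store` flag (a loose analogue of withSkipStore) and the dict standing in for MySQL are hypothetical, and the point is only to show eviction running on a separate thread and bypassing the store:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StagingCache:
    """Toy write-behind cache; `db` is a dict standing in for MySQL (assumption)."""

    def __init__(self, db):
        self.db = db
        self.entries = {}
        self.dirty = set()  # keys not yet written to the store
        self._lock = threading.RLock()
        self._evictor = ThreadPoolExecutor(max_workers=1)

    def put(self, key, value):
        with self._lock:
            self.entries[key] = value
            self.dirty.add(key)

    def remove(self, key, skip_store=False):
        with self._lock:
            self.entries.pop(key, None)
            self.dirty.discard(key)
            if not skip_store:
                self.db.pop(key, None)  # a normal remove also deletes from the store

    def flush(self):
        """Write dirty entries to the store, then schedule their eviction."""
        with self._lock:
            batch = {key: self.entries[key] for key in self.dirty}
            self.db.update(batch)
            self.dirty.clear()
        # Evict on a separate thread, never on the thread doing the store write.
        return self._evictor.submit(self._evict, batch)

    def _evict(self, batch):
        with self._lock:
            for key, written in batch.items():
                # Keep entries that were modified again after the flush.
                unchanged = key not in self.dirty and self.entries.get(key) == written
                if unchanged:
                    self.remove(key, skip_store=True)
```

Without the `skip_store` path, removing an entry from the cache would also delete the row that was just persisted, which is the opposite of what the question asks for.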
I am trying out the Apache Ignite data grid to query cached data using SQL.
I can load data into Ignite caches on startup from MySQL and CSV files, and I am able to query it using SQL.
For production, in addition to loading the caches on startup, I want to keep updating different caches whenever new data is available in MySQL and when new CSV files are created for some caches.
I cannot use read-through because I will be using SQL queries.
How can this be done in Ignite?
Read-through cannot be configured for SQL queries. See this discussion on the Apache Ignite Users forum:
http://apache-ignite-users.70518.x6.nabble.com/quot-Read-through-quot-implementation-for-sql-query-td2735.html
If you elaborate a bit on your use case, I can suggest an alternative.
If you're updating the database directly, the only way to achieve this is to reload the data manually. You can have a trigger on the DB that initiates the reload in some way, or a mechanism that periodically checks whether there were any changes.
However, the preferable way is to never update the DB directly, and instead always go through the Ignite API with write-through. This way you guarantee that the cache and the DB are always consistent.
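If direct database updates cannot be avoided, the periodic check could look roughly like the sketch below. It assumes every row carries an `updated_at` column that is bumped on each change; the `items` table, its columns, the dict used as the cache and the use of Python's built-in sqlite3 module in place of a MySQL driver are all assumptions made so the example can run on its own:

```python
import sqlite3


def reload_changed_rows(conn, cache, last_seen):
    """Copy rows modified after `last_seen` into `cache`.

    Returns the new high-water mark to pass to the next poll.
    """
    rows = conn.execute(
        "SELECT id, name, updated_at FROM items "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    for row_id, name, updated_at in rows:
        cache[row_id] = name  # overwrite whatever the cache held before
        last_seen = max(last_seen, updated_at)
    return last_seen
```

A scheduler would call this every few seconds with the value returned by the previous call. Two limits of this approach are worth knowing: deleted rows are not detected unless deletes are recorded somewhere (for example as soft deletes), and rows committed later with a timestamp equal to the current mark are missed by the strict `>` comparison.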
My app uses a MySQL DB that is a slave of a remote master DB, and I use memcache to cache some of the DB data.
My slave DB is updated whenever there are updates in the master DB. In my application I want to know when my local (slave) DB is updated, so that I can invalidate the related cached data and display the fresh data I got from the master.
Is there any way to run some program when the slave MySQL DB is updated? I would then filter the query and work out whether I need to clear the cache or not.
Thanks
First of all, you are looking for a solution similar to what Facebook did in their DB architecture (as I remember, they patched MySQL for this).
You can build your own solution based on one of these techniques:
Parse the replication log on the slave side and remove the cache entry when you see an update of its data in the log.
Load a UDF (user-defined function) for memcached, and attach a trigger on the replica side (it will call the UDF remove function) to the tables of interest inside MySQL.
Please note that this configuration is complicated to support and maintain. If you can tolerate stale data in the cache, a small TTL may be enough.
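To make the small-TTL option concrete, here is a minimal in-process sketch. The `TTLCache` class and its `loader` callback are hypothetical; with memcached you would normally pass an expiry time when setting the key instead of tracking it yourself:

```python
import time


class TTLCache:
    """Entries expire after `ttl_seconds`; stale reads are bounded by that window."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock   # injectable so the expiry logic can be tested
        self._entries = {}   # key -> (expires_at, value)

    def get(self, key, loader):
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]    # still fresh enough
        value = loader(key)  # expired or missing: go back to the database
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

The trade-off is explicit: a reader may see data that is up to `ttl_seconds` old, and in exchange no replication-log parsing or triggers are needed.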
As Kirugan says, it's as simple as writing your own SQL parser, ensuring that you also provide an indexed lookup keyed to the underlying data for anything you insert into the cache, and then cross-referencing the datasets for any DML you apply to the database. Of course, this will be a lot simpler if you create a simplified, abstract syntax to represent the DML, at the cost of losing the flexibility of SQL and, of course, having to re-implement any legacy code using your new syntax. Apart from fixing the existing code, it should only take a year or two to get this working right. Basing your syntax on MySQL's handler API rather than SQL will probably save a lot of pain later in the project.
Of course, if you need full cache consistency then you need to ensure that a logical transaction now spans all the relevant datacentres, which will have something of an adverse impact on your performance (certainly much slower than just referencing the master directly).
For a company like Facebook, with hundreds of thousands of servers and terabytes of data (and no requirement for cache consistency), such an approach to solving the problem leads to massive savings. If you only have 2 servers, a better solution would be to switch to multi-master replication, possibly add another database node, optimize the storage (e.g. switching to SSDs / adding fast bcache), make sure you have session affinity to the DBMS from the application (but not sticky sessions), and spend some time tuning your DBMS, particularly its cache performance.
I have not come across a good suggestion on how to keep the database and memcache in sync.
I use MySQL 5.5.28, Zope 2.12.19 in my web application.
Some of the suggestions go like this: on a cache hit, the data is served from memcache; after that, the cache is invalidated and the data is selected again from the database to repopulate the cache. But we opted to use a cache in the first place precisely because database operations are expensive. So how does this give faster access?
The other solution seems to be to update memcache using triggers on the source table. Any input on this would be appreciated, as I do not understand how it is done.
Below are the best solutions that I could find to the above questions.
The answer to my first question concerns the use of a cache with rapidly changing data.
Well, caching is not ideal if the data changes frequently. This holds when the number of users is small.
But if the number of hits to the website increases, then caching is really useful when the following approach is used:
INSERT, UPDATE or DELETE operations will invoke triggers that would invalidate the cache.
And when the page is loaded, SELECT will be used and the resulting data will be stored in the cache until it is changed again. Because the triggers handle INSERT, UPDATE and DELETE on the respective tables, the application's code does not have to be modified throughout the system. Only SELECT needs to be handled in the code.
Regarding my second question on how to use triggers to manipulate the cache, the link below has been extremely useful:
http://code.openark.org/blog/mysql/using-memcached-functions-for-mysql-an-automated-alternative-to-query-cache
Is it possible to add an external cache provider like Ehcache or memcache to a MySQL database? By doing this I am hoping to improve the performance of MySQL. Is this possible?
Not "add to MySQL" per se, as far as I know. You can easily use Memcache(d), for instance, in your system if you want: not to be rude about it, but basically you just 'use' it.
Now:
You have code that requests data from your database.
With Memcache:
You have code that checks if the data is in the cache. If it isn't, request it from the database and add it to the cache, then return.
In this case you didn't add it to MySQL itself, but it does help you get better results (if you do it right, of course). For SQL caching you have the query cache of the database itself, but that only works for queries that are exactly identical, byte for byte.