I have a Spring Boot service that writes to an Apache Ignite cache.
The cache is configured with write-through, write-behind, and read-through.
I have MySQL as the persistence store that the cache writes behind to.
I would like to clear from the cache all entries that have already been written to the MySQL database.
I am using the cache as a staging area so that I do not make a DB call every second; instead, I have set the cache to write behind every 30 seconds.
I would like to know how to clear the cache once the write-behind is complete.
You may call cache.clear(); it will not remove entries from the underlying third-party persistence store, if that is what you are asking.
I think the only way is to extend the CacheStore that you use: after writing to the DB, create a task that removes the persisted records from the cache, but make sure you execute it asynchronously and with withSkipStore, otherwise you can easily get a deadlock and/or thread starvation.
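A minimal sketch of that approach, assuming a hand-written CacheStoreAdapter, a cache named "stagingCache", and Long/String key and value types (all of these names are hypothetical); the actual JDBC statements against MySQL are elided:

import javax.cache.Cache;
import org.apache.ignite.Ignite;
import org.apache.ignite.cache.store.CacheStoreAdapter;
import org.apache.ignite.resources.IgniteInstanceResource;

public class EvictingCacheStore extends CacheStoreAdapter<Long, String> {

    @IgniteInstanceResource
    private Ignite ignite;

    @Override
    public void write(Cache.Entry<? extends Long, ? extends String> entry) {
        // 1. Persist the entry to MySQL here (plain JDBC, elided).

        // 2. Asynchronously remove the now-persisted entry from the cache,
        //    skipping the store so the removal is not written back to MySQL.
        ignite.<Long, String>cache("stagingCache")
              .withSkipStore()
              .removeAsync(entry.getKey());
    }

    @Override
    public String load(Long key) {
        // Read-through: fetch the value from MySQL (elided).
        return null;
    }

    @Override
    public void delete(Object key) {
        // Remove the row from MySQL (elided).
    }
}

removeAsync() avoids blocking the write-behind flusher thread, and withSkipStore() prevents the removal from being propagated back to MySQL as a delete.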
I'm working on an akka-http/Slick web service, and I need to do the following in a transaction:
Insert a row in a table
Call some external web service
Commit the transaction
The web service I need to call is sometimes really slow to respond (let's say ~2 seconds).
I'm worried that this might keep the SQL connection open for too long, and that it'll exhaust Slick's connection pool and affect other independent requests.
Is this a possibility? Or does Slick do something to make sure this "idle" mid-transaction connection does not starve the pool?
If it is something I should be worried about - is there anything I can do to remedy this?
If it matters, I'm using MySQL with TokuDB.
The Slick documentation seems to say that this will be a problem:
The use of a transaction always implies a pinned session.
And
You can use withPinnedSession to force the use of a single session, keeping the existing session open even when waiting for non-database computations.
From: http://slick.lightbend.com/doc/3.2.0/dbio.html#transactions-and-pinned-sessions
I want to use Redis as a cache for MySQL, and the main idea is:
Query
read from Redis
if it does not exist, read from MySQL and add it to the Redis cache
Add
write to MySQL directly
Update & Delete
write to MySQL
invalidate the Redis cache entry
My question is: how to invalidate the cache?
I know I can delete it or set an expiry time, but is that the usual way, or are there standard methods to invalidate the cache?
Thanks!
You would need a trigger to tell Redis that the MySQL data has been updated.
This can be part of your code: whenever you save data to MySQL, you invalidate the Redis cache as well.
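A minimal sketch of that pattern using the Jedis client; the UserDao interface, the connection-pool settings, and the "user:<id>" key scheme are all assumptions, not part of any standard API:

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

public class UserCacheInvalidation {

    // Hypothetical DAO that performs the actual MySQL update (JDBC / ORM).
    public interface UserDao {
        void update(long userId, String newName);
    }

    private final JedisPool jedisPool = new JedisPool("localhost", 6379);
    private final UserDao userDao;

    public UserCacheInvalidation(UserDao userDao) {
        this.userDao = userDao;
    }

    public void updateUser(long userId, String newName) {
        // 1. Write to MySQL first.
        userDao.update(userId, newName);

        // 2. Invalidate the cached copy; the next read misses,
        //    falls back to MySQL, and repopulates Redis.
        try (Jedis jedis = jedisPool.getResource()) {
            jedis.del("user:" + userId);
        }
    }
}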
You can use change streams like http://debezium.io/ to capture changes in the database and take the necessary actions, such as invalidating the cache.
I am trying the Apache Ignite data grid to query cached data using SQL.
I can load data into Ignite caches on startup from MySQL and CSV files, and I am able to query them using SQL.
To deploy in production, in addition to loading the caches on startup, I want to keep updating the different caches as new data becomes available in MySQL and as CSVs are created for some caches.
I cannot use read-through because I will be using SQL queries.
How can this be done in Ignite?
Read-through cannot be configured for SQL queries. You can go through this discussion on the Apache Ignite users forum:
http://apache-ignite-users.70518.x6.nabble.com/quot-Read-through-quot-implementation-for-sql-query-td2735.html
If you elaborate a bit on your use case, I can suggest an alternative.
If you're updating the database directly, the only way to achieve this is to manually reload the data. You can have a trigger on the DB that somehow initiates the reload, or a mechanism that periodically checks whether there were any changes.
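For the periodic-check option, a rough sketch that reloads an Ignite cache from its store on a fixed schedule; the cache name, the key/value types, and the 60-second interval are all assumptions:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;

public class CacheReloader {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public void start(Ignite ignite) {
        // Every 60 seconds re-run the CacheStore's loadCache() so rows that
        // were changed directly in MySQL become visible to SQL queries.
        scheduler.scheduleAtFixedRate(() -> {
            IgniteCache<Long, String> cache = ignite.cache("sqlCache");
            cache.loadCache(null);
        }, 60, 60, TimeUnit.SECONDS);
    }
}

Note that loadCache(null) reloads everything the store returns; if the tables are large, a predicate or store-side arguments can restrict the reload to recent changes.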
However, the preferable way to do this is to never update the DB directly, but to always go through the Ignite API with write-through enabled. This way you guarantee that the cache and the DB are always consistent.
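A rough sketch of that write-through setup; MyJdbcPersonStore, the cache name, and the key/value types are hypothetical placeholders:

import javax.cache.Cache;
import javax.cache.configuration.FactoryBuilder;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.store.CacheStoreAdapter;
import org.apache.ignite.configuration.CacheConfiguration;

public class WriteThroughSetup {

    // Hypothetical store: in a real deployment the three methods below
    // would run JDBC statements against MySQL.
    public static class MyJdbcPersonStore extends CacheStoreAdapter<Long, String> {
        @Override public String load(Long key) { return null; /* SELECT ... */ }
        @Override public void write(Cache.Entry<? extends Long, ? extends String> e) { /* INSERT or UPDATE ... */ }
        @Override public void delete(Object key) { /* DELETE ... */ }
    }

    public static void main(String[] args) {
        Ignite ignite = Ignition.start();

        CacheConfiguration<Long, String> cfg = new CacheConfiguration<>("personCache");
        cfg.setReadThrough(true);
        cfg.setWriteThrough(true);
        cfg.setCacheStoreFactory(FactoryBuilder.factoryOf(MyJdbcPersonStore.class));

        IgniteCache<Long, String> cache = ignite.getOrCreateCache(cfg);

        // Every update goes through Ignite, which writes it to MySQL
        // synchronously, so the cache and the DB never diverge.
        cache.put(1L, "Alice");
    }
}

Once all writes go through the Ignite API, SQL queries on the cache always see the same data that lands in MySQL.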
At my work I need to revamp a website that must always accept numerous connections. Until now I have been fetching the data as JSON, but now I want to call the DB directly to get the data. As far as I know, using a cache is the best approach for my site, but initially concurrent access to the DB will happen often. Any advice on how to handle this situation? I want the site to always serve the most up-to-date data.
Thanks.
Following are my suggestions:
If you want to use a cache, you have to automate your cache-clearing process so the cache is invalidated whenever the particular data you serve is updated. This is only practical if your data is updated infrequently.
If your budget allows, put your DB in a cluster (write to the master and read from the master and slaves).
In the worst case, ensure your DB is properly indexed.
I have the following setup:
Several data-processing workers get their configuration from the Django view get_conf() over HTTP.
The configuration is stored in a Django model using the MySQL / InnoDB backend.
The configuration model has an overridden save() method which tells the workers to reload the configuration.
I have noticed that sometimes the workers do not receive the changed configuration correctly. In particular, when the conf reload time was shorter than usual, the workers got the "old" configuration from get_conf() (missing the most recent change). The transaction mode used in Django is the default autocommit.
I have come up with the following possible scenario that could cause the behavior:
New configuration is saved
save() returns but MySQL / InnoDB is still processing the (auto)commit
Workers are booted and make http request for new configuration
MySQL (auto)commit finishes
Is step 2 in the above scenario possible? That is, can the Django model's save() return before the data is actually committed to the DB when the default autocommit transaction mode is being used? Or, to go one layer down, can a MySQL autocommitted INSERT or UPDATE operation finish before the commit is complete (i.e. before the update/insert is visible to other transactions)?
The object in memory may be stale; please try refreshing the object from the database after save():
obj.save()
obj.refresh_from_db()
reference: https://docs.djangoproject.com/en/1.8/ref/models/instances/#refreshing-objects-from-database
This definitely looks like a race condition.
The scenario you describe should never happen if there's only one script and one database. When you call save(), the method doesn't return until the data is actually committed to the database.
If, however, you're using a master/slave configuration, you could be the victim of replication delay: if you write to the master but read from the slaves, then it is entirely possible that your script doesn't wait long enough for the replication to occur, and you read the old conf from a slave before it has had the opportunity to replicate from the master.
Such a configuration can be set up in Django using database routers, or it can be done on the DB side with a DB proxy. Check that out.