Django save() behavior with autocommit transactions - MySQL

I have the following setup:
Several data-processing workers get their configuration from a Django view, get_conf(), over HTTP.
The configuration is stored in a Django model using the MySQL/InnoDB backend.
The configuration model has an overridden save() method that tells the workers to reload their configuration.
I have noticed that the workers sometimes do not receive the changed configuration correctly. In particular, when the configuration reload happened sooner than usual, the workers got the "old" configuration from get_conf() (missing the most recent change). Django is using its default autocommit transaction behavior.
I have come up with the following possible scenario that could cause the behavior:
1. New configuration is saved.
2. save() returns, but MySQL/InnoDB is still processing the (auto)commit.
3. Workers are booted and make an HTTP request for the new configuration.
4. The MySQL (auto)commit finishes.
Is step 2 in the above scenario possible? That is, can Django's model save() return before the data is actually committed to the DB when the autocommit transaction mode is used? Or, one layer down: can a MySQL autocommitted INSERT or UPDATE operation return before the commit is complete, i.e. before the update/insert is visible to other transactions?

The object may be stale; try refreshing it from the database after saving:
obj.save()
obj.refresh_from_db()
reference: https://docs.djangoproject.com/en/1.8/ref/models/instances/#refreshing-objects-from-database
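
If the goal is to notify the workers only after the change is actually committed, a related option (an editor's sketch, not part of the answer above; it assumes Django 1.9+, which introduced the hook) is transaction.on_commit(), which defers a callback until the surrounding transaction has committed:

from django.db import models, transaction

def notify_workers():
    # Stand-in for however the workers are signaled (HTTP call, queue message, ...).
    pass

class Configuration(models.Model):  # hypothetical model name
    payload = models.TextField()

    def save(self, *args, **kwargs):
        super().save(*args, **kwargs)
        # The callback runs only once the data is visible to other
        # transactions, so a worker reacting to it cannot read a
        # pre-commit snapshot.
        transaction.on_commit(notify_workers)

Under plain autocommit the callback fires as soon as save() returns; inside transaction.atomic() it is deferred until the outermost block commits.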

This definitely looks like a race condition.
The scenario you describe should never happen if there is only one script and one database. When you call save(), the method does not return until the data has actually been committed to the database.
If, however, you are using a master/slave configuration, you could be a victim of replication delay: if you write to the master but read from a slave, it is entirely possible that your script does not wait long enough for the replication to occur, and you read the old conf from the slave before it has had a chance to replicate from the master.
Such a configuration can be set up in Django using database routers, or on the DB side using a DB proxy. Check that out.
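
For reference, a minimal sketch of such a router (an editor's illustration; the alias "replica" and the module path are assumptions, not from the answer):

# routers.py (hypothetical module)
class PrimaryReplicaRouter:
    def db_for_read(self, model, **hints):
        return "replica"  # must match an alias in settings.DATABASES (assumption)

    def db_for_write(self, model, **hints):
        return "default"  # writes always go to the master

    def allow_relation(self, obj1, obj2, **hints):
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return db == "default"

It is registered via DATABASE_ROUTERS = ["routers.PrimaryReplicaRouter"] in settings. Note that exactly this read/write split reintroduces the race described above, so reads that must be fresh should be pinned to the master with Model.objects.using("default").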

Related

What could cause a MySQL DB read to return stale data

I am chasing a problem in a MySQL application. At some point my client INSERTs some data, using a query wrapped in a START TRANSACTION; ... COMMIT; statement. Right after that, another client comes and reads back the data, and it is not there (I am sure of the order of events).
I am running Node.js, Express, and mysql2, and I use connection pooling with multiple-statement queries.
What is interesting is that I see weird things in MySQL Workbench. I just had a Workbench instance that would not see the newly inserted data either. I opened a second one; it saw the new data. Minutes later, the first instance still would not see the new data. I hit 'Reconnect to DBMS', and now it sees it. The Workbench behaviour, if it also applies to my Node client, would explain the bad results I see in Node/mysql2.
There is some sort of caching going on somewhere... no idea where to start :-( Any pointers? Thanks!
It sounds like your clients are living in their own snapshot of the database, which would be true if they have an open transaction using the REPEATABLE-READ isolation level. In other words, no data committed after that client started its transaction will be visible to that client.
One workaround is to force a new transaction to start. Just run COMMIT in the client session where it appears to be viewing stale data. That will resolve any open transaction and the next query will start a new transaction.
Another way you can test is to use a locking read query such as SELECT ... FOR UPDATE. This will read the most recently committed data, regardless of the client's transaction isolation level. That is, even if the client had started their transaction using REPEATABLE-READ, a locking read behaves as if they had started their transaction with READ-COMMITTED.
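
To make the snapshot behaviour concrete, here is a small illustration (an editor's sketch in Python with PyMySQL rather than mysql2, since the semantics live in MySQL itself; the table conf and the credentials are placeholders):

import pymysql

# Two connections simulate the two clients; PyMySQL disables autocommit by default.
a = pymysql.connect(host="localhost", user="root", password="secret", database="test")
b = pymysql.connect(host="localhost", user="root", password="secret", database="test")

with a.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM conf")  # first read pins A's REPEATABLE-READ snapshot
    print(cur.fetchone())

with b.cursor() as cur:
    cur.execute("INSERT INTO conf (value) VALUES ('new')")
b.commit()  # B's insert is now committed

with a.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM conf")  # still the old snapshot: the insert is invisible
    print(cur.fetchone())

a.commit()  # ends A's open transaction, discarding the stale snapshot

with a.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM conf")  # a new transaction: the insert is visible now
    print(cur.fetchone())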

Reset MySQL database to core data (Docker)

In order to test my Node.js microservice architecture, I am building the entire architecture with Docker. Now I want to run tests with Newman (Postman). In the before-each hook, i.e. before every HTTP test request, the database(s) should contain a predefined dataset.
So now to the core question: is there a simple way to reset the entire database, so that the architecture keeps running (it does anyway) but the data in the database is reset to a predefined state? (Maybe via an SQL statement?)
I read about ROLLBACK, but I do not think it will work, because the ROLLBACK would have to happen from another service within my architecture. Also, there is not just one MySQL request per HTTP test request, but several.
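One common approach (an editor's sketch, not an answer from the thread): dump the seeded database once with mysqldump, which by default emits DROP TABLE IF EXISTS before each CREATE TABLE, and replay that dump in the before-each hook. All names below (seed.sql, testdb, the credentials) are assumptions:

import subprocess

def reset_database():
    # Replays the seed dump; mysqldump's DROP/CREATE statements make this idempotent.
    with open("seed.sql", "rb") as dump:
        subprocess.run(
            ["mysql", "-h", "127.0.0.1", "-u", "root", "-psecret", "testdb"],
            stdin=dump,
            check=True,
        )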

Clear Apache Ignite Cache after WriteBehind

I have a Spring Boot service that writes to an Apache Ignite cache.
The cache is configured with write-through, write-behind, and read-through.
I use MySQL as the persistence store that the cache writes behind to.
I would like to remove from the cache all entries that have already been written to the MySQL database.
I am using the cache as a staging area so that I do not make a DB call every second; instead, the cache writes behind every 30 seconds.
I would like to know how to clear the cache once the write-behind is complete.
You may call cache.clear(); it will not remove entries from the underlying third-party persistence, if that is what you are asking.
I think the only way is to extend the CacheStore implementation that you use: after the writes to the DB, create a task that removes the persisted records from the cache. Make sure you execute it asynchronously and with withSkipStore(), otherwise you can easily get a deadlock and/or thread starvation.
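
To illustrate the first point, an editor's sketch using Ignite's Python thin client, pyignite (the service in the question is Java, but the cache semantics are the same; the cache name and address are placeholders): cache.clear() empties the cache without touching rows already flushed to MySQL.

from pyignite import Client

client = Client()
client.connect("127.0.0.1", 10800)  # default thin-client port

cache = client.get_or_create_cache("staging")  # placeholder cache name
cache.put(1, "pending-row")

# clear() removes in-memory entries only; anything the write-behind
# store has already flushed to MySQL stays in MySQL.
cache.clear()

client.close()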

Master-slave in Dropwizard and Hibernate

I have an application written in Dropwizard that uses Hibernate to connect to the DB (MySQL). Due to new features being released, I am expecting high load on the read APIs, and I am thinking of serving those reads from a slave DB. What are the different ways in which I can configure master-slave, and what are the trade-offs?
The way I solved it:
I have two session factories: the default one, which talks to the master, and another one, named "slaveDb", which talks to the slave database.
I created separate DAOs for the same entity, one for slave interactions and one for the master; the slave DAO is bound to the slaveSessionFactory.
The @UnitOfWork annotation has a "value" attribute. If you do not set it (which we do not in many cases), the annotation processor works on top of the default session factory. If you give a name there, the annotation processor uses the session factory with that particular name.
P.S. In my case I have a single slave, as the application's load is not that high and I wanted the slave only for report generation; with many slaves this solution does not scale well. Also, since I give the slave machine's details in my config.yaml file, I did not need to mark the underlying connection as read-only.
If you are using the @UnitOfWork annotation, then no and yes.
No, it does not directly allow you to talk to the DB as read-only.
Yes, you can create two resources, each using a different DB (master/slave):
one resource for writes and critical reads (master), another for read-only access (slave).
https://groups.google.com/forum/#!topic/dropwizard-user/nxURxVWDtEY
Also, as the link suggests, the MySQL driver can do this automatically, but for that the session's readOnly flag must be true, which UnitOfWorkApplicationListener does not set properly even if you set readOnly to true in @UnitOfWork.

Grails session handling in waiting thread with Hibernate and MySQL InnoDB

In order to implement client-side notifications in an AJAX-driven application that I am developing with Grails (and GWT), I have implemented a service method that blocks until it is signaled. I am using a monitor object to wait for the signal. Once signaled, the thread queries the database for new objects and returns the entities to the browser.
It works perfectly fine with the in-memory database, but not as I expect when I use the MySQL database connector. What happens: a findAllBy... call will only find objects that were created before the request started.
The lifecycle of my service method:
request from the client
Hibernate session is created by Grails
service queries the database for new objects
if there are none: wait
incoming signal: query the database for new objects (DOES NOT FIND NEW OBJECTS when using MySQL; works fine with the in-memory DB)
The MySQL query log shows all the queries as expected, but the result of findAllBy... is just an empty array.
I disabled the query cache and the second-level cache. The behaviour is the same whether or not the connection is pooled.
What am I doing wrong? Should I close the Hibernate session? Flush it? Use a transaction for my queries? Or somehow force the findAllBy... method to query the database?
Just a guess, but this sounds like a transaction isolation level problem where you are experiencing a phantom read. Does your service need to be transactional? If not, set transactional=false in the service.
I think you need to flush the session on the save calls for the new objects you are looking for, e.g.
DomainOfFrequentlyAddedStuff.save(flush: true)
Then they should be persisted to the DB promptly, so they will show up in your findAllBy... query.