django/innodb -- problem with old sessions and transactions - mysql

We just switched our MySQL database from MyISAM to InnoDB, and we are seeing an odd issue arise in Django. Whenever we make a database transaction, existing sessions never pick it up. We can see the new record in the database from a mysql terminal, but an existing Django session (i.e. a shell that was already open) does not register the change. For example:
Shell 1:
>>> my_obj = MyObj.objects.create(foo="bar")
>>> my_obj.pk
1
Shell 2 (was open before the above):
>>> my_obj = MyObj.objects.filter(pk=1)
[]
Shell 3 (MySQL):
mysql> select id from myapp_my_obj where id = 1;
+----+
| id |
+----+
|  1 |
+----+
Does anyone know why this might be happening?
EDIT: To clarify, Shell 2 was opened before Shell 1; then I ran the create in Shell 1, and then tried to view the object I created from Shell 2.
EDIT2: The big picture is that I have a Celery task that is passed the primary key of the object that was created. With MyISAM it found the object every time; now it throws ObjectDoesNotExist, even though I can see that the object exists in the database.

Your create() command commits the transaction for the current shell, but doesn't do anything to the transaction in the second shell.
https://docs.djangoproject.com/en/dev/topics/db/transactions/
Your second shell can't see what's done in the first because it is in a transaction of its own. Transactions isolate sessions from each other: each transaction sees a consistent snapshot of the database, including for SELECT statements, and a commit takes effect at a single point in time. This is the I (isolation) in ACID. Try running
from django.db import transaction; transaction.commit()
in the second shell. That should commit the current transaction and start a new one. You can also use transaction.rollback() to achieve the same thing if you haven't modified anything in the db in the current shell.
Edit 2:
You may need to grab your specific db connection to make this work. Try this:
import django.db
django.db.connection._commit()
More information about this problem here:
http://groups.google.com/group/django-users/msg/55fa3724d2754013
The relevant bit is:
If you want script1.py (using an InnoDB table) to see committed updates from other transactions, you can change the transaction isolation level like so:
from django.db import connection
connection.cursor().execute('set transaction isolation level read committed')
Alternatively you can enable the database's version of auto-commit, which
"commits" queries as well as updates, so that each new query by script1 will
be in its own transaction:
connection.cursor().execute('set autocommit=1')
Either one allows script1 to see script2's updates.
So, the tl;dr is that you need to set your InnoDB transaction isolation to READ-COMMITTED.
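In Django this can also be set once in settings rather than per-cursor. A minimal sketch, assuming the stock django.db.backends.mysql backend (the database name and credentials below are hypothetical placeholders); init_command runs for every new database connection:

```python
# settings.py -- sketch; NAME/USER/PASSWORD are hypothetical placeholders.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'mydb',
        'USER': 'myuser',
        'PASSWORD': 'secret',
        'OPTIONS': {
            # Runs on every new connection: use READ COMMITTED instead
            # of InnoDB's default REPEATABLE READ snapshot behaviour.
            'init_command': 'SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED',
        },
    },
}
```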

Related

Sphinx indexer does not fetch data when MySQL transaction is used

Please help me with running PHPUnit tests that check a module using the Sphinx search engine.
To search in that module I use two Sphinx indexes, docs and docsdelta. After new data appears in the DB, I do the following:
exec("indexer docsdelta --rotate");
exec("indexer --merge docs docsdelta --rotate");
This works well on my website: I can add a new document through the web interface and it appears in the search.
However, when I run a PHPUnit test that creates a new document on the fly,
exec("indexer docsdelta --rotate");
does not fetch any new data. My PHPUnit tests use transactions to roll the database back to its previous state, and I noticed that the indexer works properly if I switch off transactions. I can also see the new data in the DB just before and after running the indexer. Maybe I missed something, but I do not understand why a transaction has any influence on the indexer.
Is there a way to use the docsdelta indexer together with MySQL transactions?
Thank you in advance for help!
To make changes made inside a transaction visible outside of it (i.e. to indexer), you need to change the isolation level of the indexer's SELECT queries. You can do it like this:
sql_query_pre = SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
You can read more about MySQL isolation levels here: https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html

Getting stale results in multiprocessing environment

I am using 2 separate processes via multiprocessing in my application. Both have access to a MySQL database via sqlalchemy core (not the ORM). One process reads data from various sources and writes them to the database. The other process just reads the data from the database.
I have a query which gets the latest record from a table and displays its id. However, it always displays the first id that existed when I started the program rather than the latest inserted id (new rows are created every few seconds).
If I use a separate MySQL tool and run the query manually I get correct results, but SQLAlchemy always gives me stale results.
Since you can see the changes your writer process is making with another MySQL tool, your writer process is indeed committing the data (at least if you are using InnoDB).
InnoDB shows you the state of the database as of when you started your transaction. Whatever other tools you are using probably have an autocommit feature turned on where a new transaction is implicitly started following each query.
To see the changes in SQLAlchemy do as zzzeek suggests and change your monitoring/reader process to begin a new transaction.
One technique I've used to do this myself is to add autocommit=True to the execution_options of my queries, e.g.:
result = conn.execute(select([table]).where(table.c.id == 123).execution_options(autocommit=True))
Assuming you're using InnoDB, the data on your connection will appear "stale" for as long as you keep the current transaction running, or until you commit the other transaction. For one process to see data from the other process, two things need to happen: 1. the transaction that created the new data must be committed, and 2. the current transaction, assuming it has already read some of that data, must be rolled back, or committed and started again. See The InnoDB Transaction Model and Locking.
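The reader-side half of that can be sketched with plain DB-API calls. The sketch below uses sqlite3 as a stand-in driver and a made-up items table; with MySQLdb and InnoDB the same conn.rollback() before each poll discards the stale REPEATABLE READ snapshot, so the next SELECT starts a fresh transaction and sees newly committed rows:

```python
import sqlite3  # stand-in for the MySQL driver; the DB-API pattern is the same


def fetch_latest_id(conn):
    """Poll for the newest row, ending any open read snapshot first."""
    # Discard whatever transaction (and snapshot) this connection holds.
    # Under InnoDB REPEATABLE READ this is what makes new commits visible.
    conn.rollback()
    cur = conn.execute("SELECT id FROM items ORDER BY id DESC LIMIT 1")
    row = cur.fetchone()
    return row[0] if row else None
```

Note that conn.execute here is a sqlite3 convenience; with MySQLdb you would go through conn.cursor(). The writer still has to commit its own transaction; the rollback only refreshes the reader's view.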

Should I commit after a single select

I am working with MySQL 5.0 from python using the MySQLdb module.
Consider a simple function to load and return the contents of an entire database table:
def load_items(connection):
    cursor = connection.cursor()
    cursor.execute("SELECT * FROM MyTable")
    return cursor.fetchall()
This query is intended to be a simple data load and not have any transactional behaviour beyond that single SELECT statement.
After this query is run, it may be some time before the same connection is used again to perform other tasks, though other connections can still be operating on the database in the mean time.
Should I be calling connection.commit() soon after the cursor.execute(...) call to ensure that the operation hasn't left an unfinished transaction on the connection?
There are two things you need to take into account:
the isolation level in effect
what kind of state you want to "see" in your transaction
The default isolation level in MySQL is REPEATABLE READ which means that if you run a SELECT twice inside a transaction you will see exactly the same data even if other transactions have committed changes.
Most of the time people expect to see committed changes when running the second select statement - which is the behaviour of the READ COMMITTED isolation level.
If you did not change the default level in MySQL, but you do expect to see changes when you run a SELECT twice, then you can't do it in the "same" transaction: you need to commit after your first SELECT statement.
If, on the other hand, you actually want a consistent view of the data throughout your transaction, then obviously you should not commit.
then after several minutes, the first process carries out an operation which is transactional and attempts to commit. Would this commit fail?
That totally depends on your definition of "is transactional". Anything you do in a relational database "is transactional" (that's not entirely true for MySQL actually, but for the sake of argument you can assume it is if you are only using InnoDB as your storage engine).
If that "first process" only selects data (i.e. a "read-only transaction"), then of course the commit will work. If it tried to modify data that another transaction has already committed, and you are running with REPEATABLE READ, you will probably get an error (after waiting until any locks have been released). I'm not 100% sure about MySQL's behaviour in that case.
You should really try this manually with two different sessions using your favorite SQL client to understand the behaviour. Do change your isolation level as well to see the effects of the different levels too.
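Applied to the load function from the question, the commit-after-SELECT version looks like this (a sketch; it works with any DB-API connection, MySQLdb included):

```python
def load_items(connection):
    """Load all rows from MyTable, then end the implicit transaction."""
    cursor = connection.cursor()
    cursor.execute("SELECT * FROM MyTable")
    rows = cursor.fetchall()
    # The SELECT implicitly opened a transaction (MySQLdb disables
    # autocommit by default); committing here means the connection does
    # not sit on a stale REPEATABLE READ snapshot until its next use.
    connection.commit()
    return rows
```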

Mysql InnoDB table locked but I can "select" from another session. What gives?

During development of some code, I needed to write-lock an InnoDB table in order to avoid race conditions and other concurrency problems. A read lock is not good enough: a parallel session that reads a table locked by another session may get false data, as what it reads might evaporate (be deleted) once the locking session finishes its job.
So much for why I need a write lock. Comments are welcome on this, but it would simply take too long to explain why (to my humble mind) I cannot see any way other than a complete lock of the table.
Now, for my tests, I have opened two mysql command line sessions, both with regular user (no root or similar). In one session I did:
lock tables mytable write;
which completed fine (Query OK, 0 rows affected...)
On the second command line session I connected to the same DB and ran a simple SELECT * on the same table. To my surprise, I got a full response.
In more tests from the actual web application, I noticed that in some use cases involving the web app (PHP + PDO with the persistent-connections attribute on), a command line or web MySQL connection did block until the lock was released, but I could not identify what exactly caused this (desired) effect; it also involves a different environment (PHP + PDO as described, versus two command line sessions).
My question is: why? Why isn't the second command line session, running a simple SELECT on the write-locked table, blocked?
Does this have to do with the nature of InnoDB locks being row-based? If so, how exactly?
How do I get such a simple lock implemented on an InnoDB table? I know I can create a 'semaphore' MyISAM table with no purpose other than to act as a 'traffic light', but that loses the benefit of DB-level protection and moves all the protection into the app level, where it may be done (or wrongly done).
TIA!
MySQL version is 5.1.54 (Ubuntu 11.04).
While InnoDB has row level locking, it also has multi-version concurrency control (http://en.wikipedia.org/wiki/Multiversion_concurrency_control), which means that readers do not need to be blocked by writers: they simply see the current committed version of each record. (In the technical implementation, an update modifies the row in place and the previous version is written to undo space for older transactions.)
If you want to make the write lock block readers, you need to change the SELECT to be FOR UPDATE (i.e. SELECT * FROM my_table WHERE cola = n FOR UPDATE).

Force Django to commit

Setup:
Python script A inserts data to a database every 15 minutes
Python script B queries a few of the latest entries from the database every few minutes
Both use Django's ORM, run on the same machine and use a local MySQL database.
The Problem:
B fetches entries, except for the latest one, even though A saves it minutes before.
I suspected that A doesn't close the transaction, thus B sees the database without the last entry. Indeed when examining the MySQL logs, I noticed the commit for each INSERT happens right before the next INSERT.
Even though it's supposed to be redundant, I added the @commit_on_success decorator to the A function that includes the save(), but it did not help.
How can I force Django (or MySQL?!) to commit right after the save()?
UPDATE:
I discovered that the commits DO happen - I was misled to believe they don't because MySQL's General Query Log only has 1-second resolution.
In light of this and other new information, I've reasked the question here.
You can use the commit_manually decorator and commit whenever you want.
Straight from the documentation:
from django.db import transaction

@transaction.commit_manually
def viewfunc(request):
    ...
    # You can commit/rollback however and whenever you want
    transaction.commit()
    ...
    # But you've got to remember to do it yourself!
    try:
        ...
    except:
        transaction.rollback()
    else:
        transaction.commit()
This answers the question you asked, though I wonder if there might be something else at work.
NOTE: commit_manually was deprecated in 1.6 and removed in 1.8.
The problem is caused by the fact that MySQL defaults to the REPEATABLE READ transaction isolation level. That means the same query within a transaction is expected to return the same values: changes committed by other transactions won't show up. You can do two things here:
Set transaction isolation level to READ-COMMITTED in the settings. As explained here.
Force a commit in script B, thus closing its current transaction; once a new transaction starts, you will see all changes committed before it. Model.objects.update() does the trick.