Sphinx indexer does not fetch data when MySQL transaction is used - mysql

Please help me with running PHPUnit tests that check a module which uses the Sphinx search engine.
To search in that module I use two Sphinx indexes, docs and docsdelta. After new data appears in the DB, I do the following:
exec("indexer docsdelta --rotate");
exec("indexer --merge docs docsdelta --rotate");
This works well on my website: I can add a new document through the web interface and it appears in the search results.
However, when I run a PHPUnit test that creates a new document on the fly,
exec("indexer docsdelta --rotate");
does not fetch any new data. My PHPUnit tests use transactions to roll back the database to its previous state, and I noticed that the indexer works properly if I switch transactions off. I can also see the new data in the DB just before and after running the indexer. Maybe I missed something, but I do not understand why a transaction has any influence on the indexer.
Is there a way to use the docsdelta indexer together with MySQL transactions?
Thank you in advance for your help!

To make the changes you make inside a transaction visible outside of it, i.e. to the indexer, you need to change the isolation level of the indexer's SELECT queries. You can do it like this:
sql_query_pre = SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
You can read more about MySQL isolation levels here: https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html
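For reference, that directive goes inside the source block that feeds the delta index in sphinx.conf. A rough sketch of what it might look like; the source name, credentials, table and columns below are placeholders, not taken from the question:
source docsdelta
{
    type            = mysql
    sql_host        = localhost
    sql_user        = sphinx_user
    sql_pass        = sphinx_pass
    sql_db          = mydb
    # make rows written inside still-open transactions visible to the indexer
    sql_query_pre   = SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
    sql_query       = SELECT id, title, body FROM documents WHERE is_new = 1
}
After changing the config you would re-run exec("indexer docsdelta --rotate") as before.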

Related

Getting stale results in multiprocessing environment

I am using two separate processes via multiprocessing in my application. Both have access to a MySQL database via SQLAlchemy Core (not the ORM). One process reads data from various sources and writes it to the database. The other process just reads the data from the database.
I have a query which gets the latest record from a table and displays its id. However it always displays the first id, which was created when I started the program, rather than the latest inserted id (new rows are created every few seconds).
If I use a separate MySQL tool and run the query manually I get correct results, but SQLAlchemy is always giving me stale results.
Since you can see the changes your writer process is making with another MySQL tool, that means your writer process is indeed committing the data (at least it does if you are using InnoDB).
InnoDB shows you the state of the database as of when you started your transaction. Whatever other tools you are using probably have an autocommit feature turned on where a new transaction is implicitly started following each query.
To see the changes in SQLAlchemy do as zzzeek suggests and change your monitoring/reader process to begin a new transaction.
One technique I've used to do this myself is to add autocommit=True to the execution_options of my queries, e.g.:
result = conn.execute( select( [table] ).where( table.c.id == 123 ).execution_options( autocommit=True ) )
Assuming you're using InnoDB, the data on your connection will appear "stale" for as long as you keep the current transaction running, or until you commit the other transaction. In order for one process to see the data from the other process, two things need to happen: 1. the transaction that created the new data needs to be committed, and 2. the current transaction, assuming it has already read some of that data, needs to be rolled back or committed and started again. See The InnoDB Transaction Model and Locking.
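To make the second requirement concrete, here is a rough sketch of a reader loop with SQLAlchemy Core (1.x-style select([...]), matching the snippet above) that ends its transaction on every poll so the next SELECT sees freshly committed rows. The engine URL, table and column names are invented for illustration:
import time
from sqlalchemy import create_engine, MetaData, Table, select

engine = create_engine("mysql://user:secret@localhost/mydb")   # hypothetical URL
metadata = MetaData()
# hypothetical table with an auto-increment "id" column
records = Table("records", metadata, autoload=True, autoload_with=engine)

conn = engine.connect()
while True:
    trans = conn.begin()            # start a fresh transaction for this poll
    row = conn.execute(
        select([records]).order_by(records.c.id.desc()).limit(1)
    ).fetchone()
    trans.commit()                  # end it, so the next poll gets a new snapshot
    print(row)
    time.sleep(5)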

Should I commit after a single select

I am working with MySQL 5.0 from Python using the MySQLdb module.
Consider a simple function to load and return the contents of an entire database table:
def load_items(connection):
    cursor = connection.cursor()
    cursor.execute("SELECT * FROM MyTable")
    return cursor.fetchall()
This query is intended to be a simple data load and not have any transactional behaviour beyond that single SELECT statement.
After this query is run, it may be some time before the same connection is used again to perform other tasks, though other connections can still be operating on the database in the mean time.
Should I be calling connection.commit() soon after the cursor.execute(...) call to ensure that the operation hasn't left an unfinished transaction on the connection?
There are two things you need to take into account:
the isolation level in effect
what kind of state you want to "see" in your transaction
The default isolation level in MySQL is REPEATABLE READ which means that if you run a SELECT twice inside a transaction you will see exactly the same data even if other transactions have committed changes.
Most of the time people expect to see committed changes when running the second select statement - which is the behaviour of the READ COMMITTED isolation level.
If you did not change the default level in MySQL and you do expect to see changes in the database when you run a SELECT twice in the same transaction, then you can't do it in the "same" transaction: you need to commit after your first SELECT statement.
If you actually want to see a consistent state of the data throughout your transaction, then obviously you should not commit.
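If you go the commit-after-the-read route, the load_items function from the question could look roughly like this (a sketch only; the connection parameters are made up):
import MySQLdb

def load_items(connection):
    cursor = connection.cursor()
    cursor.execute("SELECT * FROM MyTable")
    rows = cursor.fetchall()
    cursor.close()
    # end the snapshot opened by the SELECT, so later work on this
    # connection starts a fresh transaction and sees newer commits
    connection.commit()
    return rows

connection = MySQLdb.connect(host="localhost", user="app", passwd="secret", db="mydb")
items = load_items(connection)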
then after several minutes, the first process carries out an operation which is transactional and attempts to commit. Would this commit fail?
That totally depends on your definition of "is transactional". Anything you do in a relational database "is transactional" (that's not entirely true for MySQL actually, but for the sake of argument you can assume it is if you are only using InnoDB as your storage engine).
If that "first process" only selects data (i.e. a "read-only transaction"), then of course the commit will work. If it tried to modify data that another transaction has already committed and you are running with REPEATABLE READ, you will probably get an error (after waiting until any locks have been released). I'm not 100% sure about MySQL's behaviour in that case.
You should really try this manually with two different sessions using your favorite SQL client to understand the behaviour. Do change your isolation level as well to see the effects of the different levels too.
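A minimal way to try it with two sessions of your SQL client (table and values are made up):
-- session 1
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;   -- the consistent snapshot starts with this read

-- session 2
START TRANSACTION;
UPDATE accounts SET balance = 100 WHERE id = 1;
COMMIT;

-- session 1 again
SELECT balance FROM accounts WHERE id = 1;   -- under REPEATABLE READ this still shows the old value
COMMIT;
SELECT balance FROM accounts WHERE id = 1;   -- new transaction, now shows 100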

mysql ensure only one access at a time

I'm connecting to a MySQL database from several threads in Java. Sometimes the threads read and update the same column of a database table, so some inconsistencies appear.
In Java there is the synchronized keyword, which limits access to a resource to one thread at a time. Is there a comparable mechanism for the MySQL database, so that these inconsistencies do not occur?
You should use transactions with the appropriate isolation level.
Simplified example:
START TRANSACTION;
...
COMMIT;
See the MySQL docs about transactions and the MySQL docs about isolation levels.
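For the read-then-update pattern described in the question, the usual technique is to lock the row you are about to change inside the transaction with SELECT ... FOR UPDATE. A sketch with made-up table and column names:
START TRANSACTION;
-- other threads doing the same SELECT ... FOR UPDATE block here until we commit
SELECT counter FROM stats WHERE id = 42 FOR UPDATE;
UPDATE stats SET counter = counter + 1 WHERE id = 42;
COMMIT;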
Try using a version column; that way you can tell whether someone else modified your data while you were working on it.
Meaning: add a column named "updated" of type DATETIME, and whenever you update, check that the "updated" value you read earlier is the same as the one currently on the row in the DB; if they differ, you know someone else has worked on your record.
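A sketch of that optimistic check in SQL (table, column names and timestamp are hypothetical):
-- read the row and remember its "updated" value
SELECT id, payload, updated FROM jobs WHERE id = 7;

-- later: only apply the change if nobody touched the row in the meantime
UPDATE jobs
SET payload = 'new value', updated = NOW()
WHERE id = 7 AND updated = '2016-03-01 12:00:00';
-- if this affects 0 rows, someone else updated the record first and you should reload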

django/innodb -- problem with old sessions and transactions

We just switched our MySQL database from MyISAM to InnoDB, and we are seeing an odd issue arise in Django. Whenever we make a database transaction, the existing sessions do not pick it up... ever. We can see the new record in the database from a MySQL terminal, but the existing Django sessions (i.e. a shell that was already open) do not register the change. For example:
Shell 1:
>>> my_obj = MyObj.objects.create(foo="bar")
>>> my_obj.pk
1
Shell 2 (was open before the above)
>>> my_obj = MyObj.objects.filter(pk=1)
[]
Shell 3 (MySQL):
mysql> select id from myapp_my_obj where id = 1;
id
1
Does anyone know why this might be happening?
EDIT: To clarify, Shell 2 was opened before Shell 1, then I make the create Shell 1, then I try to view the object that I created in Shell 2.
EDIT2: The big picture is that I have a celery task that is being passed the primary key from the object that is created. When I was using MyISAM, it found it every time, and now it throws ObjectDoesNotExist, even though I can see that the object is created in the database.
Your create() command commits the transaction for the current shell, but doesn't do anything to the transaction in the second shell.
https://docs.djangoproject.com/en/dev/topics/db/transactions/
Your second thread can't see what was done in the first because it is in a transaction of its own. Transactions isolate the database so that when a transaction is committed, everything happens at a single point in time, including SELECT statements. This is the A in ACID. Try running
from django.db import transaction; transaction.commit()
in the second shell. That should commit the current transaction and start a new one. You can also use transaction.rollback() to achieve the same thing if you haven't modified anything in the db in the current shell.
Edit Edit:
You may need to grab your specific db connection to make this work. Try this:
import django.db
django.db.connection._commit()
More information about this problem here:
http://groups.google.com/group/django-users/msg/55fa3724d2754013
The relevant bit is:
If you want script1.py (using an InnoDB table) to see committed updates from other transactions you can change the transaction isolation level like so:
from django.db import connection
connection.cursor().execute('set transaction isolation level read committed')
Alternatively you can enable the database's version of auto-commit, which "commits" queries as well as updates, so that each new query by script1 will be in its own transaction:
connection.cursor().execute('set autocommit=1')
Either one allows script1 to see script2's updates.
So, the tl;dr is that you need to set your InnoDB transaction isolation to READ-COMMITTED.
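If you want that setting applied to every new connection instead of running the SET statement by hand, one option (a sketch; the database name and credentials are placeholders) is the MySQL backend's init_command in settings.py:
# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'mydb',
        'USER': 'myuser',
        'PASSWORD': 'secret',
        'HOST': 'localhost',
        'OPTIONS': {
            # executed on every new connection, so sessions see other
            # transactions' commits without an explicit commit/rollback
            'init_command': 'SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED',
        },
    }
}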

How can I find out if there is a transaction open in MySQL?

How can I find out if there is a transaction open in MySQL? I need to start a new one if there is no transaction open, but I don't want to start a new one if there is one running, because that would commit the running transaction.
UPDATE:
I need to query the database in one method of my application, but that query could be called either as part of a bigger transaction or as a transaction on its own. Changing the application to track whether it has opened a transaction would be more difficult, as it could be started from many pieces of code. Although it would be possible, I'm looking for a solution that is faster to implement. A simple IF statement in SQL would be effortless.
Thank you
I am assuming you are doing this as a one-off and not trying to establish something that can be done programmatically. You can get the list of currently active processes using: SHOW PROCESSLIST
http://dev.mysql.com/doc/refman/5.1/en/show-processlist.html
If you want something programmatic, then I would suggest taking an explicit lock on a table at the beginning of your transaction and releasing it at the end.
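A sketch of that explicit-lock approach (the table name is hypothetical). Note that LOCK TABLES implicitly commits any transaction already open on the session, so it has to wrap the whole unit of work:
LOCK TABLES orders WRITE;
-- read and modify the table here; other sessions wait until the lock is released
UNLOCK TABLES;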