I'm using the DBI package to send queries to a MySQL server. I'd like to ensure that these queries are sent as a single transaction in order to avoid table locks.
I use the dbSendQuery function to send queries:
df <- fetch(dbSendQuery(connection,
                        statement = "SELECT * FROM table"),
            n = -1)
The DBI package says little about handling transactions; what it does offer is the functions dbCommit, dbRollback, and dbCallProc, listed in the vignette under the header:
Note: The following methods deal with transactions and stored procedures.
None of these seems to relate to sending queries as a single transaction.
How can I make sure I'm sending these queries as a single transaction?
Warning: not tested.
You would need some help from MySQL itself. By default, MySQL runs with autocommit mode enabled; to suspend it for a batch of statements, you need to issue a START TRANSACTION statement. I suspect dbCommit and dbRollback simply execute COMMIT and ROLLBACK, respectively.
Details: http://dev.mysql.com/doc/refman/5.0/en/commit.html
So you would need to do something like
dbSendQuery(connection, "START TRANSACTION")  # suspends autocommit for this session
# add your dbSendQuery calls here
dbCommit(connection)                          # sends COMMIT; use dbRollback(connection) to undo instead
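For what it's worth, the same SQL-level protocol from Python with MySQLdb (the driver used in later questions below) would look roughly like this; a minimal sketch, not tested, with placeholder connection parameters and table name:
import MySQLdb

conn = MySQLdb.connect(host="localhost", user="user",
                       passwd="secret", db="mydb")   # placeholders
cur = conn.cursor()
try:
    cur.execute("START TRANSACTION")      # suspend autocommit
    cur.execute("SELECT * FROM mytable")  # your queries go here
    rows = cur.fetchall()
    conn.commit()                         # sends COMMIT
except MySQLdb.Error:
    conn.rollback()                       # sends ROLLBACK on any error
finally:
    conn.close()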
We have a process where we need to test a SQL script before it runs against a production database. The approach is to execute the script within a transaction, with a ROLLBACK statement at the end, capturing before/after output that shows the script's effects:
select now();
start transaction;
select 'data before any changes', ...;
<insert / update / delete statements>;
select 'data after changes', ...;
rollback;
We've used this approach for many years with MSSQL; however, we're having trouble implementing it with MySQL. If everything goes to plan, things are great and it works exactly as expected. However, if we run into any errors (e.g. a typo in the SQL or a table-constraint violation), the script aborts ... but COMMITS whatever ran before the error!
How is this remotely considered an acceptable outcome? It drastically reduces the effectiveness of transactions if an error condition results in committing changes. I'm gobsmacked by this behavior and simply cannot understand the rationale.
I've been searching all over for a resolution to this and am coming up basically empty-handed. Options I've seen:
Catch the error in client code and roll back (a sketch of this appears after this list). This isn't really an option: our workflow just runs the SQL script, and the "commit" needs to happen within the script rather than in caller code (after testing the script, we run it without the transaction or rollback).
Use DECLARE ... HANDLER FOR ...; to roll back on error. This looked promising, but it appears to be supported only within a stored procedure (or similar), and is therefore not suitable for one-off scripts.
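For reference, a minimal sketch of the ruled-out client-code option, in Python with MySQLdb; the statement list and connection parameters are placeholders:
import MySQLdb

statements = [
    "SELECT 'data before any changes'",
    # <insert / update / delete statements>
    "SELECT 'data after changes'",
]

conn = MySQLdb.connect(host="localhost", user="user",
                       passwd="secret", db="mydb")   # placeholders
cur = conn.cursor()
try:
    cur.execute("START TRANSACTION")
    for stmt in statements:
        cur.execute(stmt)
    conn.rollback()              # test mode: always roll back
except MySQLdb.Error as e:
    conn.rollback()              # an error no longer commits earlier work
    print("script aborted:", e)
finally:
    conn.close()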
What am I missing? Thanks
SQLAlchemy's docs say that when you use a Session to query, the Session starts a transaction; in effect, every operation runs inside a transaction.
But here is my problem: I'm using a MySQL middleware, mycat, for read/write splitting, and if you send a transaction, all queries are routed to the write server, even a plain SELECT.
I'd like these queries to run without a transaction, without resorting to raw SQL. How can I stop the SQLAlchemy Session from starting a transaction? Or is there a better MySQL middleware for this?
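For what it's worth, one avenue to experiment with (a sketch only; whether mycat then routes the SELECT to a read server is untested here): SQLAlchemy can put a single connection into driver-level autocommit via an execution option, so the query is sent outside any explicit transaction. The URL and table name are placeholders:
from sqlalchemy import create_engine, text

engine = create_engine("mysql+pymysql://user:secret@localhost/mydb")

# Run a read-only query in autocommit mode, outside BEGIN ... COMMIT.
with engine.connect().execution_options(
        isolation_level="AUTOCOMMIT") as conn:
    rows = conn.execute(text("SELECT * FROM mytable")).fetchall()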
I'm not sure whether this is an issue with phpMyAdmin or whether I'm not fully understanding how transactions work, but I want to be able to step through a series of queries within a transaction and either ROLLBACK or COMMIT based on the returned results. I'm using the InnoDB storage engine.
Here's a basic example:
START TRANSACTION;
UPDATE students
SET lastname = "jones"
WHERE studentid = 1;
SELECT * FROM students;
ROLLBACK;
As a single batch, this works entirely fine, and if I'm happy with the results I can re-run the whole thing with COMMIT in place of ROLLBACK.
However, if all these queries can be run separately, why does phpMyAdmin lose the transaction?
For example, if I do this:
START TRANSACTION;
UPDATE students
SET lastname = "jones"
WHERE studentid = 1;
SELECT * FROM students;
Then this:
COMMIT;
SELECT * FROM students;
The update I made in the transaction is lost, and lastname retains its original value, as if the update never took place. I was under the impression that transactions can span multiple queries, and I've seen a couple of examples of this:
1: Entirely possible in Navicat, a different IDE
2: Also possible in PHP via MySQLi
Why then am I losing the transaction in phpMyAdmin, if transactions are able to span multiple individual queries?
Edit 1: After doing a bit of digging, it appears that there are two other ways a transaction can be implicitly ended in MySQL:
Disconnecting a client session will implicitly end the current transaction. Changes will be rolled back.
Killing a client session will implicitly end the current transaction. Changes will be rolled back.
Is it possible that phpMyAdmin is ending the client session after Go is hit and a query is submitted?
Edit 2:
Just to confirm that this is a phpMyAdmin-specific issue, I ran the same sequence as multiple separate queries in MySQL Workbench, and it worked exactly as intended, retaining the transaction. So it appears to be a failure on phpMyAdmin's part.
Is it possible that phpMyAdmin is ending the client session after Go is hit and a query is submitted?
That is pretty much how PHP works. You send the request, it gets processed, and once done, everything (including MySQL connections) gets thrown away. With the next request, you start afresh.
There is a feature called persistent connections, but it does its own cleanup as well; otherwise the code would somehow have to hand the same user the same connection again, which could prove very difficult given the way PHP works.
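To illustrate the point: a transaction can span any number of statements as long as they all travel over the same, still-open connection. A minimal sketch in Python with MySQLdb, reusing the students table from the question (connection parameters are placeholders):
import MySQLdb

# One long-lived connection: the transaction survives across calls.
conn = MySQLdb.connect(host="localhost", user="user",
                       passwd="secret", db="school")
cur = conn.cursor()

cur.execute("START TRANSACTION")
cur.execute("UPDATE students SET lastname = 'jones' WHERE studentid = 1")
cur.execute("SELECT * FROM students")  # sees the uncommitted update
print(cur.fetchall())

conn.commit()   # only now is the change permanent
conn.close()    # disconnecting before COMMIT would have rolled it back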
I have a multithreaded application that periodically fetches the whole content of a MySQL table (with a SELECT * FROM query).
The application is written in Python, uses the threading module for multithreading, and uses mysql-python (MySQLdb) as the MySQL driver (using SQLAlchemy as a wrapper produces similar results).
I use InnoDB engine for my MySQL database.
I wrote a simple test to check the performance of SELECT * queries in parallel and discovered that all of those queries are executed sequentially.
I explicitly set the isolation level to READ UNCOMMITTED, although that does not seem to help with performance.
The code snippet making the DB call is below:
import Queue  # Python 2 standard-library queue module (raises Queue.Empty)

#performance.profile()
def test_select_all_raw_sql(conn_pool, queue):
    '''
    conn_pool - connection pool to get a MySQL connection from
    queue - task queue
    '''
    query = '''SELECT * FROM table'''
    try:
        conn = conn_pool.connect()
        cursor = conn.cursor()
        cursor.execute("SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED")
        # execute until the queue is empty (Queue.Empty is thrown)
        while True:
            id = queue.get_nowait()
            cursor.execute(query)
            result = cursor.fetchall()
    except Queue.Empty:
        pass
    finally:
        cursor.execute("SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ")
        conn.close()
Am I right to expect these queries to be executed in parallel?
If so, how can I achieve that in Python?
MySQL allows many connections from a single user or from many users. Within any one connection, it uses at most one CPU core and executes one SQL statement at a time.
A "transaction" can be composed of multiple SQL statements, and the transaction is treated atomically. Consider the classic banking application:
BEGIN;
UPDATE ... -- decrement from one user's bank balance.
UPDATE ... -- increment another user's balance.
COMMIT;
Those statements are performed serially (in a single connection); either all of them succeed or all of them fail as a unit ("atomically").
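The same example from Python with MySQLdb, as a minimal sketch (table, column, and connection details are placeholders):
import MySQLdb

conn = MySQLdb.connect(host="localhost", user="user",
                       passwd="secret", db="bank")   # placeholders
cur = conn.cursor()
try:
    cur.execute("START TRANSACTION")
    cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
    cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
    conn.commit()        # both updates become visible together
except MySQLdb.Error:
    conn.rollback()      # neither update survives
finally:
    conn.close()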
If you need to do things in "parallel", have a client (or clients) that can run multiple threads (or processes) and have each one make its own connection to MySQL; see the sketch below.
A minor exception: There are some extra threads 'under the covers' for doing background tasks such as read-ahead or delayed-write or flushing stuff. But this does not give the user a way to "do two things at once" in a single connection.
What I have said here applies to all versions of MySQL/MariaDB and all client packages accessing them.
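A minimal sketch of the one-connection-per-thread pattern with the threading module and MySQLdb (connection parameters and table name are placeholders):
import threading
import MySQLdb

def worker():
    # Each thread opens its OWN connection, so the server can execute
    # the statements concurrently instead of serializing them.
    conn = MySQLdb.connect(host="localhost", user="user",
                           passwd="secret", db="mydb")
    cur = conn.cursor()
    cur.execute("SELECT * FROM mytable")
    rows = cur.fetchall()
    conn.close()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()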
I am working with MySQL 5.0 from python using the MySQLdb module.
Consider a simple function to load and return the contents of an entire database table:
def load_items(connection):
    cursor = connection.cursor()
    cursor.execute("SELECT * FROM MyTable")
    return cursor.fetchall()
This query is intended to be a simple data load and not have any transactional behaviour beyond that single SELECT statement.
After this query is run, it may be some time before the same connection is used again to perform other tasks, though other connections may still be operating on the database in the meantime.
Should I be calling connection.commit() soon after the cursor.execute(...) call to ensure that the operation hasn't left an unfinished transaction on the connection?
There are two things you need to take into account:
the isolation level in effect
what kind of state you want to "see" in your transaction
The default isolation level in MySQL is REPEATABLE READ, which means that if you run a SELECT twice inside a transaction you will see exactly the same data, even if other transactions have committed changes in the meantime.
Most of the time people expect to see committed changes when running the second SELECT statement; that is the behaviour of the READ COMMITTED isolation level.
If you did not change the default level in MySQL and you do expect to see others' committed changes when you run a SELECT twice, then you can't do it in the "same" transaction: you need to commit after your first SELECT statement.
If you actually want to see a consistent snapshot of the data throughout your transaction, then of course you should not commit.
"…then after several minutes, the first process carries out an operation which is transactional and attempts to commit. Would this commit fail?"
That totally depends on your definition of "transactional". Anything you do in a relational database is transactional (that's not entirely true for MySQL, but for the sake of argument you can assume it as long as InnoDB is your only storage engine).
If that "first process" only selects data (i.e. a read-only transaction), then of course the commit will work. If it tried to modify data that another transaction has already committed, and you are running with REPEATABLE READ, you will probably get an error (after waiting until any locks have been released). I'm not 100% sure about MySQL's behaviour in that case.
You should really try this manually with two different sessions, using your favorite SQL client, to understand the behaviour (a sketch follows below). Change your isolation level as well, to see the effects of the different levels.
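A minimal sketch of that two-session experiment in Python with MySQLdb (table, column, and connection details are placeholders). Under the default REPEATABLE READ level, session A keeps seeing its first snapshot until it ends its own transaction:
import MySQLdb

def connect():
    return MySQLdb.connect(host="localhost", user="user",
                           passwd="secret", db="mydb")  # placeholders

a, b = connect(), connect()
ca, cb = a.cursor(), b.cursor()

ca.execute("SELECT * FROM MyTable")       # A's snapshot starts here
print("A before:", ca.fetchall())

cb.execute("UPDATE MyTable SET value = value + 1")
b.commit()                                # B commits a change

ca.execute("SELECT * FROM MyTable")       # same data under REPEATABLE READ
print("A again:", ca.fetchall())

a.commit()                                # end A's transaction ...
ca.execute("SELECT * FROM MyTable")       # ... now B's change is visible
print("A after commit:", ca.fetchall())

a.close(); b.close()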