What does synchronize_session mean in SQLAlchemy? [duplicate]

I am trying to update some records in the table using the following code:
session.query(Post).filter(
    Post.title.ilike("%Regular%")
).update({"status": False})
But the problem is that the code throws the following exception:
InvalidRequestError: Could not evaluate current criteria in Python: "Cannot evaluate BinaryExpression with operator <function ilike_op at 0x7fbb88450ea0>". Specify 'fetch' or False for the synchronize_session parameter.
However, if I pass synchronize_session=False to the update(), it works miraculously.
session.query(Post).filter(
    Post.title.ilike("%Regular%")
).update({"status": False}, synchronize_session=False)
So what is the use of synchronize_session?

Query.update is a bulk operation; that is, it operates outside of SQLAlchemy's unit of work model.
synchronize_session specifies how (and whether) objects already present in the session should be kept in sync with the changes the bulk UPDATE makes in the database.
From the docs:
synchronize_session chooses the strategy to update the attributes on objects in the session. Valid values are:
False - don’t synchronize the session. This option is the most efficient and is reliable once the session is expired, which typically occurs after a commit(), or explicitly using expire_all(). Before the expiration, updated objects may still remain in the session with stale values on their attributes, which can lead to confusing results.
So, with synchronize_session=False, the values updated in the database will not be updated in the session.
'fetch' - performs a select query before the update to find objects that are matched by the update query. The updated attributes are expired on matched objects.
Passing 'fetch' makes SQLAlchemy identify the objects in the session that are affected by the update; when their attributes are next accessed, SQLAlchemy queries the database to get the updated values.
'evaluate' - Evaluate the Query’s criteria in Python straight on the objects in the session. If evaluation of the criteria isn’t implemented, an exception is raised.
In your code you do not specify a value for synchronize_session, so the default, 'evaluate', applies. SQLAlchemy cannot evaluate ilike in Python against the objects in your session without delegating to the database, so it raises an exception to make the developer decide whether or not to synchronize the values in the session with the values in the database.
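As a minimal, self-contained sketch of the difference (assumed names: a Post model with title and status columns, SQLite, and SQLAlchemy 1.4-style imports; this is illustrative, not the original code):

from sqlalchemy import Boolean, Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Post(Base):
    __tablename__ = "posts"
    id = Column(Integer, primary_key=True)
    title = Column(String(200))
    status = Column(Boolean, default=True)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    post = Post(title="Regular maintenance")
    session.add(post)
    session.commit()

    # 'fetch' finds the matched rows first and expires the updated attributes
    # on matched objects, so the in-memory object shows the new value on next access.
    session.query(Post).filter(Post.title.ilike("%Regular%")).update(
        {"status": False}, synchronize_session="fetch"
    )
    print(post.status)  # False

    # With False the session is left alone; the in-memory attribute stays stale
    # until the session is expired (e.g. by commit() or expire_all()).
    session.query(Post).filter(Post.title.ilike("%Regular%")).update(
        {"status": True}, synchronize_session=False
    )
    print(post.status)  # still False (stale)
    session.expire_all()
    print(post.status)  # True, re-read from the database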

Related

Why do writes to MySQL fail to persist when run from Python's unittest?

I am currently writing a test case for a newly-created MySQL database using unittest in Python 3.8. The database is an AWS RDS instance running Aurora MySQL 5.6 — it has a table users with a single primary key field uuid VARCHAR(36). The test case is as follows:
import unittest
import mysql.connector
from config import MYSQL_CONNECTION_INFO


class SQLSchemaTests(unittest.TestCase):
    """Verifies the correct behavior of the schema itself. (i.e. that the tables were set up correctly)"""

    def setUp(self):
        self.cnxn = mysql.connector.connect(**MYSQL_CONNECTION_INFO)
        self.cursor = self.cnxn.cursor()

    def tearDown(self):
        self.cnxn.close()

    def test_create_users(self):
        """Verify that a client can create user entries in the data store with appropriate parameters."""
        self.cursor.execute("SELECT COUNT(*) from users")
        user_entries_count = self.cursor.fetchone()[0]
        self.assertEqual(user_entries_count, 0)

        self.cursor.execute("INSERT INTO users (uuid) VALUES ('aaa-bbb-ccc-ddd-eee')")

        self.cursor.execute("SELECT COUNT(*) from users")
        user_entries_count = self.cursor.fetchone()[0]
        self.assertEqual(user_entries_count, 1)
What confuses me is that this test case passes every time it's run — in other words with no cleanup action on my part it doesn't fail due to duplicate entries. I used PyCharm's debugger to place a breakpoint after the INSERT statement, and then ran the SELECT COUNT(*) from users in a separate database console while test execution was paused: the result came back as zero. What is more, when I used the database console to write an identical entry to the users table only then did the test fail due to a duplicate entry.
I'd like to know the following:
Why don't the INSERT statements within the unit test persist to the table? Is it caused by the MySQL connector, unittest, or something else?
What are the rules that dictate how this happens? Under what circumstances is this behavior guaranteed?
Is there any official documentation that could clarify these points?
In order to see the insertion persist between tests I needed to add self.cnxn.commit() after I called execute on the INSERT statement: the Python connector docs specify that auto-commit is disabled by default.
Moreover, the reason that I could get back an updated count from within the test but not from the separate database console is due to transaction isolation at the database level (in this case, set to REPEATABLE-READ). More information is available in the MySQL docs and in the Wikipedia article on isolation in databases.
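For reference, a minimal sketch of both fixes using the same mysql.connector API as the test above (connection details still come from the existing config module):

import mysql.connector
from config import MYSQL_CONNECTION_INFO

# Option 1: commit explicitly after each write so it persists and becomes
# visible to other connections.
cnxn = mysql.connector.connect(**MYSQL_CONNECTION_INFO)
cursor = cnxn.cursor()
cursor.execute("INSERT INTO users (uuid) VALUES ('aaa-bbb-ccc-ddd-eee')")
cnxn.commit()
cnxn.close()

# Option 2: enable autocommit when connecting; every statement is then
# committed as soon as it executes.
cnxn = mysql.connector.connect(autocommit=True, **MYSQL_CONNECTION_INFO)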

How does Hibernate get the AutoIncrement Value on Identity Insert

I am working on a high scale application of the order of 35000 Qps, using Hibernate and MySQL.
A large table has an auto-increment primary key, and the generation strategy defined in Hibernate is IDENTITY. Show SQL is enabled as well.
Whenever an insert happens I see only one query being fired in the DB, which is an INSERT statement.
A few questions follow:
1) I was wondering how does Hibernate get the AutoIncrement Value after insert?
2) If the answer is "SELECT LAST_INSERT_ID()", why does it not show up at VividCortex or in Show Sql Logs...?
3) How does "SELECT LAST_INSERT_ID()" account for multiple autoincrements in different tables?
4) If MySql returns a value on Insert, why aren't the MySql clients built so that we can see what is being returned?
Thanks in Advance for all the help.
You should call SELECT LAST_INSERT_ID().
Practically, you can't do the same thing as the MySQL JDBC driver using another MySQL client. You'd have to write your own client that reads and writes the MySQL protocol.
The MySQL JDBC driver gets the last insert id by parsing packets of the MySQL protocol. The last insert id is returned in this protocol by a MySQL result set.
This is why SELECT LAST_INSERT_ID() doesn't show up in query metrics. It's not calling that SQL statement, it's picking the integer out of the result set at the protocol level.
You asked how it's done internally. A relevant line of code is https://github.com/mysql/mysql-connector-j/blob/release/8.0/src/main/protocol-impl/java/com/mysql/cj/protocol/a/result/OkPacket.java#L55
Basically, it parses an integer from a known position in a packet as it receives a result set.
I'm not going to go into any more detail about parsing the protocol. I don't have experience coding a MySQL protocol client, and it's not something I wish to do.
I think it would not be a good use of your time to implement your own MySQL client.
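(As a cross-language aside, the same protocol-level behaviour is easy to observe from Python: mysql-connector-python exposes the insert id it reads from the server's OK packet as cursor.lastrowid, without sending any extra SQL. The connection details and the users table below are placeholders; the table is assumed to have an AUTO_INCREMENT primary key.)

import mysql.connector

cnx = mysql.connector.connect(user="app", password="secret", database="test")
cur = cnx.cursor()

cur.execute("INSERT INTO users (name) VALUES ('alice')")

# No extra round trip: the driver took the generated id straight from the
# OK packet the server returned for the INSERT.
print(cur.lastrowid)

# The SQL-level equivalent, which would show up in query logs and metrics:
cur.execute("SELECT LAST_INSERT_ID()")
print(cur.fetchone()[0])

cnx.commit()
cnx.close()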
1) It probably uses the standard JDBC mechanism to get generated values.
2) It's not "SELECT LAST_INSERT_ID()".
3) You execute it immediately after inserting into one table, and you thus get the values that have been generated by that insert. But that's not what is being used, so it's irrelevant.
4) Not sure what you mean by that: the MySQL JDBC driver allows doing that, using the standard JDBC API.
(Too long for a comment.)
SELECT LAST_INSERT_ID() uses the value already available in the connection. (This may explain its absence from any log.)
Each table has its own auto_inc value.
(I don't know any details about Hibernate.)
35K qps is possible, but it won't be easy.
Please give us more details on the queries -- SELECTs? writes? 35K INSERTs?
Are you batching the inserts in any way? You will need to do so.
What do you then use the auto_inc value in?
Do you use BEGIN..COMMIT? What value of autocommit?

Perl5 DBI Mysql: reliable way to get last_insert_id

In my code I use database->last_insert_id(undef,undef,undef,"id"); to get the autoincremented primary key. This works 99.99% of the time. But once in a while it returns a value of 0.
In such situations, running a SELECT with a WHERE clause matching the values from the INSERT statement shows that the insert was successful, indicating that the last_insert_id method failed to return the proper value.
Is this a known problem with a known fix? Or should I be following up each call to last_insert_id with a check to see if it is zero and, if so, a SELECT statement to retrieve the correct ID value?
My version of mysql is
mysql Ver 14.14 Distrib 5.7.19, for Linux (x86_64)
Edit1: Adding the actual failing code.
use Dancer2::Plugin::Database;
<Rest of the code to create the insert parameter>
eval {
    database->quick_insert("build", $job);
    $job->{idbuild} = database->last_insert_id(undef, undef, undef, "idbuild");
    if ($job->{idbuild} == 0) {
        my $build = database->quick_select("build", $job);
        $job->{idbuild} = $build->{idbuild};
    }
};
debug("=================Scheduler build Insert=======================*** ERROR :Got Error", $@) if $@;
Note: I am using Dancer's Database plugin, whose description says:
Provides an easy way to obtain a connected DBI database handle by
simply calling the database keyword within your Dancer2 application
Returns a Dancer::Plugin::Database::Core::Handle object, which is a
subclass of DBI's DBI::db connection handle object, so it does
everything you'd expect to do with DBI, but also adds a few
convenience methods. See the documentation for
Dancer::Plugin::Database::Core::Handle for full details of those.
I've never heard of this type of problem before, but I suspect your closing note may be the key. Dancer::Plugin::Database transparently manages database handles for you behind the scenes. This can be awfully convenient... but it also means that you could change from using one dbh to using a different dbh at any time. From the docs:
Calling database will return a connected database handle; the first time it is called, the plugin will establish a connection to the database, and return a reference to the DBI object. On subsequent calls, the same DBI connection object will be returned, unless it has been found to be no longer usable (the connection has gone away), in which case a fresh connection will be obtained.
(emphasis mine)
And, as ysth has pointed out in comments on your question, last_insert_id is handle-specific, which suggests that, when you get 0, that's likely to be due to the handle changing on you.
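(To see why a silently swapped handle matters, here is the per-connection behaviour of LAST_INSERT_ID() illustrated in Python with mysql-connector-python rather than DBI; the connection details and the name column are placeholders, and the build table is assumed to have an AUTO_INCREMENT idbuild key.)

import mysql.connector

# Two independent connections to the same database.
conn_a = mysql.connector.connect(user="app", password="secret", database="test")
conn_b = mysql.connector.connect(user="app", password="secret", database="test")

cur_a = conn_a.cursor()
cur_a.execute("INSERT INTO build (name) VALUES ('job-1')")

# LAST_INSERT_ID() is tracked per connection: the connection that did the
# INSERT sees the generated id...
cur_a.execute("SELECT LAST_INSERT_ID()")
print(cur_a.fetchone()[0])  # the new idbuild value

# ...while a connection that has not inserted anything reports 0, which is
# exactly the symptom seen when the plugin hands back a fresh handle.
cur_b = conn_b.cursor()
cur_b.execute("SELECT LAST_INSERT_ID()")
print(cur_b.fetchone()[0])  # 0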
But there is hope! Continuing on in the D::P::DB docs, there is a database_connection_lost hook available which is called when the database connection goes away and receives the defunct handle as a parameter, which would allow you to check and record last_insert_id within the hook's callback sub. This could provide a way for you to get the id without the additional query, although you'd first have to work out a means of getting that information from the callback to your main processing code.
The other potential solution, of course, would be to not use D::P::DB and manage your database connections yourself so that you have direct control over when new connections are created.

Can MySqlBulkLoader be used with a transaction?

Can MySqlBulkLoader be used with a transaction? I don't see a way to explicitly attach a transaction to an instance of the loader. Is there another way?
As stated here by a member of the MySQL documentation team:
It's not atomic. The records loaded prior to the error will be in the
table.
The workaround is to import the data into a dedicated table and then execute INSERT INTO ... SELECT ..., which will be an atomic operation. On huge data sets this is a potential problem because of the long transaction.
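A rough sketch of that staging-table pattern, driven from Python with mysql-connector rather than MySqlBulkLoader (connection details, file name, and the orders/orders_staging tables are all made up):

import mysql.connector

cnx = mysql.connector.connect(user="app", password="secret", database="test",
                              allow_local_infile=True)
cur = cnx.cursor()

# Step 1: bulk-load into a dedicated staging table. Per the quote above, this
# step on its own is not atomic.
cur.execute(
    "LOAD DATA LOCAL INFILE '/tmp/orders.csv' "
    "INTO TABLE orders_staging FIELDS TERMINATED BY ','"
)
cnx.commit()

# Step 2: copy into the real table. INSERT ... SELECT is a single statement,
# so either all rows arrive or none do.
try:
    cur.execute("INSERT INTO orders SELECT * FROM orders_staging")
    cnx.commit()
except mysql.connector.Error:
    cnx.rollback()
    raise

# Step 3: clear the staging table for the next load (TRUNCATE commits implicitly).
cur.execute("TRUNCATE TABLE orders_staging")
cnx.close()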
The MySQL manual indicates that the MySqlBulkLoader is a wrapper of 'LOAD DATA INFILE'. While looking at the 'LOAD DATA INFILE' documentation I noticed this paragraph:
If you specify IGNORE, input rows that duplicate an existing row on a unique key value are skipped. If you do not specify either option, the behavior depends on whether the LOCAL keyword is specified. Without LOCAL, an error occurs when a duplicate key value is found, and the rest of the text file is ignored. With LOCAL, the default behavior is the same as if IGNORE is specified; this is because the server has no way to stop transmission of the file in the middle of the operation.
I found no discussion on transactions but the above paragraph would indicate that transactions are not possible.
A workaround would be to import the data into an import table and then use a separate stored procedure that processes the data into the desired table inside a transaction.
So, in answer to the question: it does not appear that MySqlBulkLoader itself can participate in a transaction; staging the data and copying it over in a transactional step is the practical alternative.

SqlDateTime overflow on INSERT when date is correct using a Linq to SQL DataContext

I get an SqlDateTime overflow error (Must be between 1/1/1753 12:00:00 AM and 12/31/9999 11:59:59 PM.) when doing an INSERT using a Linq DataContext connected to a SQL Server database, at the point where I call SubmitChanges().
When I use the debugger the date value is correct. Even if I temporarily update the code to set the date value to DateTime.Now, it will not do the insert.
Did anybody find a work-around for this behaviour? Maybe there is a way to check what SQL the DataContext submits to the database.
Do you have the field set as autogenerated in the designer? If that's not the problem, I'd suggest setting up logging of the data context actions to the console and checking the actual SQL generated to make sure that it's inserting that column, then trace backward to find the problem.
context.Log = Console.Out;
FWIW, I often set my "CreatedTime" and "LastUpdatedTime" columns up as autogenerated (and readonly) in the designer and give them a suitable default or use a DB trigger to set the value on insert or update. When you set it up as autogenerated, it won't include it in the insert/update even if modified. If the column doesn't allow nulls, then you need to supply an alternate means of setting the value, thus the default constraint and/or trigger.
Are you sure you're looking at the right Date column? Happened to me once, and the error turned out to be caused by another non-nullable Date column that wasn't set before submitting.
I came across this recently.
The error might as well say "something's preventing the save!", because in my case it was not the DateTime value that was the problem.
I thought I was passing a value in for the primary key, and what was arriving was "null". Being the key, it can't be null - and so my problem was completely somewhere else. By resolving the null, the problem disappeared.
We all hate misleading errors - and this is one of them.
Lastly, as a suggestion: if you do find conversion of dates a problem, then don't use dates at all. .NET's DateTime class exposes a Ticks value, and you can also construct a new DateTime(ticks). The only gotcha is that the equivalent of ticks in JavaScript has a different starting point in history, so you may need a conversion if you ever pass DateTimes from C# to JavaScript.
I suggest you change your project's Target Framework; maybe your SQL Server version is newer than your .NET Framework. I saw the same issue:
My project's Target Framework was 3.5.
SQL Server was 2012.
I then changed the Target Framework to 4.0 and the issue was solved.
Bottom line: watch the order of your calls to SubmitChanges() and ensure that all objects that would be "submitted" are actually ready to be submitted. This often happens to me when I'm in the middle of setting the attributes of a new LINQ object (e.g., the ".FirstName" of a new "tblContact"), and then some conditional logic requires the creation of a separate, related record (e.g., a new "tblAddress" record). The code goes to create the "tblAddress" and calls SubmitChanges() to save that record, but that SubmitChanges() also tries to insert the unfinished "tblContact" record, which maybe doesn't yet have a required "BirthDate" field value set. Thus, the exception appears to occur while inserting the "tblAddress" object/record, but actually refers to the missing "BirthDate" on the "tblContact" object/record.