I'm using z3c.saconfig to configure SQLAlchemy in a Plone/Zope application. In this application we created a single SQLAlchemy Session with the z3c.saconfig named_scoped_session("dbmyapp") method. The Session is created and works very well, but we created just one Session for the whole app.
Can this single Session per app become a bottleneck?
By the way, can we create more than one Session per app? Are there any advantages?
Snippet of our configure.zcml:
<configure xmlns="http://namespaces.zope.org/zope"
xmlns:db="http://namespaces.zope.org/db">
<include package="z3c.saconfig" file="meta.zcml" />
<db:engine name="dbmyapp" url="oracle://user:pass@hostname:port/sid" />
<db:session name="dbmyapp" engine="dbmyapp" />
</configure>
The session machinery takes care of providing you with one connection per thread; since code within a single thread can only execute sequentially, that connection cannot become a bottleneck.
Different parts of the code can request their own session; the session machinery will reuse connections as required. This is not something you generally have to worry about; it is all handled for you by z3c.saconfig and its dependencies.
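What named_scoped_session gives you can be illustrated with a stdlib-only sketch (the SessionRegistry class and factory below are hypothetical stand-ins, not the actual z3c.saconfig/scoped_session implementation): each thread lazily creates and then reuses its own session object, so threads never share one.

```python
import threading

class SessionRegistry:
    """Toy sketch of a thread-scoped session registry (hypothetical
    names; not the real z3c.saconfig/scoped_session implementation)."""

    def __init__(self, factory):
        self._factory = factory          # creates a new "session" object
        self._local = threading.local()  # per-thread storage

    def __call__(self):
        # Each thread lazily creates and then reuses its own session.
        if not hasattr(self._local, "session"):
            self._local.session = self._factory()
        return self._local.session

Session = SessionRegistry(object)  # object() stands in for a real session factory

sessions = {}

def worker(name):
    sessions[name] = Session()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread got its own session; repeated calls in one thread reuse it.
print(sessions[0] is sessions[1])  # False
print(Session() is Session())      # True
```

So even though the application configures only one named session, each thread transparently works with its own instance.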
Related
We have a web service built using Asp.net Web API. We use NHibernate as our ORM connecting to a MySQL database.
We have a couple of controller methods that do a large number (1,000-3,000) of relatively cheap queries.
We're looking at improving the performance of these controller methods and almost all of the time is spent doing the NHibernate queries so that's where we're focusing our attention.
In the medium term the solutions are things like reducing the number of queries (perhaps by doing fewer, larger queries) and/or parallelizing the queries (which would take some work, since NHibernate does not have an async API and sessions are single-threaded), and things like that.
In the short term we're looking at improving the performance without taking on either of those larger projects.
We've done some performance profiling and were surprised to find that it looks like a lot of the time in each query (over half) is spent opening the connection to MySQL.
It appears that NHibernate is opening a new connection to MySQL for each query and that MySqlConnection.Open() makes two round trips to the database each time a connection is opened (even when the connection is coming from the pool).
Here's a screenshot of one of our performance profiles where you can see these two things:
We're wondering if this is expected or if we're missing something like a misconfiguration/misuse of NHibernate or a way to eliminate the two round trips to the database in MySqlConnection.Open().
I've done some additional digging and found something interesting:
If we add .SetProperty(Environment.ReleaseConnections, "on_close") to the NHibernate configuration then Open() is no longer called and the time it takes to do the query drops by over 40%.
It seems this is not a recommended setting though: http://nhibernate.info/doc/nhibernate-reference/transactions.html#transactions-connection-release
Based on the documentation I expected to get the same behavior (no extra calls to Open()) if I wrapped the reads inside a single NHibernate transaction but I couldn’t get it to work. This is what I did in the controller method:
using (var session = _sessionFactory.OpenSession())
{
    using (var transaction = session.BeginTransaction())
    {
        // controller code
        transaction.Commit();
    }
}
Any ideas on how to get the same behavior using a recommended configuration for NHibernate?
After digging into this a bit more, it turns out there was a mistake in my test implementation; after fixing it, using transactions eliminates the extra calls to Open() as expected.
Not using transactions is considered bad practice, so adding them is welcome in any case.
Moreover, as you seem to have found out yourself, the default connection release mode auto currently always translates to AfterTransaction, which with NHibernate (v2 to v4 at least) releases connections after each statement when no transaction is ongoing for the session.
From Connection Release Modes:
Note that with ConnectionReleaseMode.AfterTransaction, if a session is considered to be in auto-commit mode (i.e. no transaction was started) connections will be released after every operation.
So simply wrapping your session usages in transactions should do it. Since that does not seem to be the case for your application, I suspect other issues.
Is your controller code using other sessions? NHibernate explicit transactions apply only to the session from which they were started (or to sessions opened from that session with ISession.GetSession(EntityMode.Poco)).
So you need to handle a transaction for each opened session.
You may use a TransactionScope instead for wrapping many sessions in a single transaction. But each session will still open a dedicated connection. This will in most circumstances promote the transaction to distributed, which has a performance penalty and may fail if your server is not configured to enable it.
You may instead configure and use a contextual session to replace the many sessions per controller action with only one. Of course, you can achieve this with dependency injection too.
Notes:
About reducing the number of queries issued by an application, there are some easy to leverage features in NHibernate:
Batching of lazy-loads (batch fetching): configure your lazily loaded entities and collections so that accessing one loads not only itself but also a batch of other pending entities (of the same class) or collections (the same collection on other parent entities). Add the batch-size attribute on collections and classes. I have written a detailed explanation of it in this other answer.
Second level cache, which allows caching of data across HTTP requests. Transactions are mandatory for it to work.
Future queries, as proposed by Low.
Going parallel for a web API looks to me like a doomed road. Threads are a valuable resource for a web application: the more threads a request uses, the fewer requests the web application will be able to serve in parallel. So going that way will very likely be a major pain for your application's scalability.
The OnClose mode is not recommended because it delays connection release until session closing, which may occur quite late after the last transaction, especially when using a contextual session. Since your session usage looks very localized, with the session likely closed very soon after the last query, it should not be an issue for your application.
parallelize the queries (which would take some work, since NHibernate does not have an async API and sessions are single-threaded), and things like that.
You can defer the execution of the queries using NHibernate Futures,
The following code (extracted from the reference article) will execute a single round trip even though two values are retrieved:
using (var s = sf.OpenSession())
using (var tx = s.BeginTransaction())
{
    // Both queries are defined as futures; neither hits the database yet.
    var blogs = s.CreateCriteria<Blog>()
        .SetMaxResults(30)
        .Future<Blog>();

    var countOfBlogs = s.CreateCriteria<Blog>()
        .SetProjection(Projections.Count(Projections.Id()))
        .FutureValue<int>();

    // Accessing the first result triggers one round trip for both queries.
    Console.WriteLine("Number of blogs: {0}", countOfBlogs.Value);
    foreach (var blog in blogs)
    {
        Console.WriteLine(blog.Title);
    }
    tx.Commit();
}
You can also use NHibernate batching to reduce the number of queries.
I am performing benchmark testing for the application I am currently working on. After a lot of iterations, we were able to identify the time-consuming component.
It's a web-based application using Spring Data JPA with Hibernate as the persistence provider.
From the monitoring tool we found that the proxy for org.springframework.orm.jpa.SharedEntityManagerCreator:invoke:289 is where a lot of time is spent when running a large number (2,000) of concurrent threads.
Kindly let me know the possible cause and solution.
Below are the versions I am working with:
Spring - 4.1.7.RELEASE
Hibernate - 4.2.15.Final
Spring Data JPA - 1.8.0.RELEASE
Below is the drilled down call graph
SharedEntityManagerCreator is creating a fresh EntityManager instance for that particular thread. If you work with JPA, that is what's supposed to happen, as an EntityManager is, by definition of the spec, not thread-safe.
The line number you posted implies that it's the reflective method call on the EntityManager instance that is taking that much time. So I'd inspect what method actually gets called, what it does, and why it takes so long. SharedEntityManagerCreator is basically just forwarding the call.
I'm confused about the session object in SQLAlchemy. Is it like a PHP session, where a session covers all the transactions of a user, or is a session an entity that scopes the lifetime of a single transaction?
For every transaction in SQLAlchemy, is the procedure as follows:
- create and open session
- perform transaction
- commit or rollback
- close session
So, my question is: for a client, do we create a single session object, or is a session object created whenever we have a transaction to perform?
I would be hesitant to compare a SQLAlchemy session with a PHP session, since typically a PHP session refers to cookies, whereas SQLAlchemy has nothing to do with cookies or HTTP at all.
As explained by the documentation:
A Session is typically constructed at the beginning of a logical
operation where database access is potentially anticipated.
The Session, whenever it is used to talk to the database, begins a
database transaction as soon as it starts communicating. Assuming the
autocommit flag is left at its recommended default of False, this
transaction remains in progress until the Session is rolled back,
committed, or closed. The Session will begin a new transaction if it
is used again, subsequent to the previous transaction ending; from
this it follows that the Session is capable of having a lifespan
across many transactions, though only one at a time. We refer to these
two concepts as transaction scope and session scope.
The implication here is that the SQLAlchemy ORM is encouraging the
developer to establish these two scopes in his or her application,
including not only when the scopes begin and end, but also the expanse
of those scopes, for example should a single Session instance be local
to the execution flow within a function or method, should it be a
global object used by the entire application, or somewhere in between
these two.
As you can see, it is completely up to the developer of the application to determine how to use the session. In a simple desktop application, it might make sense to create a single global session object and just keep using that session object, committing as the user hits "save". In a web application, a "session per request handled" strategy is often used. Sometimes you use both strategies in the same application (a session-per-request for web requests, but a single session with slightly different properties for background tasks).
There is no "one size fits all" solution for when to use a session. The documentation does give hints as to how you might go about determining this.
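The session-per-request strategy mentioned above can be sketched as a context manager; FakeSession here is a toy stand-in for a real Session (illustrative only), but the open/commit-or-rollback/close shape matches the pattern the documentation describes:

```python
from contextlib import contextmanager

class FakeSession:
    """Toy stand-in for a SQLAlchemy Session (illustrative only)."""
    def __init__(self):
        self.committed = False
        self.closed = False
    def commit(self):
        self.committed = True
    def rollback(self):
        self.committed = False
    def close(self):
        self.closed = True

@contextmanager
def session_scope():
    # Session per logical operation: open at the start, commit on
    # success, roll back on error, and always close at the end.
    session = FakeSession()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()

# One "request" handled with its own short-lived session:
with session_scope() as s:
    pass  # query and modify objects here

print(s.committed, s.closed)  # True True
```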
I'm currently using SQLAlchemy with MySQL and Pyro to build a server program. Many clients connect to this server to make requests. The program only provides information from the MySQL database and sometimes performs some calculations.
Is it better to create a session for each client or to use the same session for all clients?
What you want is a scoped_session.
The benefits are (compared to a single shared session between clients):
No locking needed
Transactions supported
Connection pool to database (implicit done by SQLAlchemy)
How to use it
You just create the scoped_session:
Session = scoped_session(some_factory)
and access it in your Pyro methods:
class MyPyroObject:
    def remote_method(self):
        Session.query(MyModel).filter...
Behind the scenes
The code above guarantees that the Session is created and closed as needed. The session object is created as soon as you first access it in a thread and will be removed/closed after the thread finishes (ref). As each Pyro client connection gets its own thread with the default settings (don't change it!), you will have one session per client.
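That one-session-per-client-thread behavior can be demonstrated without Pyro or SQLAlchemy, using only the stdlib (the database, table, and helper names below are illustrative stand-ins): each "client" thread lazily opens its own connection, so no locking between clients is needed.

```python
import os
import sqlite3
import tempfile
import threading

# One shared database file; each "client" thread lazily opens its own
# connection, mirroring scoped_session's one-session-per-thread behavior.
path = os.path.join(tempfile.mkdtemp(), "clients.db")
init = sqlite3.connect(path)
init.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
init.executemany("INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b")])
init.commit()
init.close()

local = threading.local()

def get_session():
    # Lazily create one connection per thread, as scoped_session would.
    if not hasattr(local, "conn"):
        local.conn = sqlite3.connect(path)
    return local.conn

results = {}

def handle_client(client_id, item_id):
    conn = get_session()  # this client's own thread-local connection
    row = conn.execute(
        "SELECT name FROM items WHERE id = ?", (item_id,)).fetchone()
    results[client_id] = row[0]
    conn.close()

threads = [threading.Thread(target=handle_client, args=(i, i + 1))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

Each client thread queried the shared database through its own private connection, with no lock in sight.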
The best I can do is to create a new Session for every client request. I hope there is no performance penalty.
This is more of a follow-up thread, though the question is very generic. I have a Django web application with a REST API (implemented with tastypie), accessed through an Apache web server.
I am adding API call logging functionality so that for every call to the web application I will be creating an entry in a specific application log table in MySQL database.
User base of this application is limited. I am not expecting large volume of concurrent API calls at this point or near future.
I have the following options:
1. multithread locking
2. multiprocess locking mechanisms
3. ORM transaction or database locking
I am not sure if I should use any locking feature at all for wrapping MySQL db entry creation / update operations.
How are these kinds of cases typically dealt with for large volumes of concurrent POST API calls in Django web applications?
Thanks,