I'm using the Autobahn server in Twisted to provide an RPC API. Some calls require queries to the database, and multiple clients may be connected to the server via WebSocket.
I am using the SQLAlchemy ORM to access the database.
What are the pros and cons of the following two approaches for dealing with SQLAlchemy sessions?
1. Create and destroy a session for every RPC call
2. Create a single session when the server starts and use it in every RPC call
Which would you recommend and why? (I'm leaning towards 2)
The recommended way of doing SQL-based database access from Twisted (and Autobahn) with databases like PostgreSQL, Oracle or SQLite would be twisted.enterprise.adbapi.
twisted.enterprise.adbapi will run queries on a background thread pool, which is required, since most database drivers are blocking.
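As a rough illustration (not from the original answer), a minimal adbapi sketch could look like the following; the DB-API module, connection parameters, pool sizes and query are placeholder assumptions:

# Queries run on adbapi's background thread pool; results come back as a
# Deferred that fires on the reactor thread.
from twisted.enterprise import adbapi
from twisted.internet import reactor

dbpool = adbapi.ConnectionPool(
    "psycopg2",                    # DB-API 2.0 module to use (assumed)
    dbname="mydb", user="myuser",  # passed through to psycopg2.connect()
    cp_min=3, cp_max=10,           # connection/thread pool size
)

def on_rows(rows):
    print("fetched %d rows" % len(rows))
    reactor.stop()

def on_error(failure):
    print("query failed:", failure)
    reactor.stop()

d = dbpool.runQuery("SELECT id, name FROM users WHERE active = %s", (True,))
d.addCallbacks(on_rows, on_error)
reactor.run()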
Sidenote: for PostgreSQL there is also a natively asynchronous, non-blocking driver: txpostgres.
Now, if you put an ORM like SQLAlchemy on top of the native SQL driver, I'm not sure how this will work together (if at all) with twisted.enterprise.adbapi.
So, regarding the options you mention: running the session directly in the RPC handler is a no-go, since most drivers are blocking (and Autobahn's RPCs run on the main thread, i.e. the Twisted reactor thread, which you MUST not block). Whichever option you choose, you need to put the database session(s) in background threads (again, so you don't block).
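As a sketch of what "session(s) in background threads" can look like with SQLAlchemy (my own example, not part of the original answer), each call can open a short-lived session inside deferToThread so the reactor thread never blocks; the engine URL, the User model and the query are assumptions:

# Per-call SQLAlchemy sessions, executed on the reactor's thread pool.
from twisted.internet.threads import deferToThread
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("postgresql+psycopg2://user:pass@localhost/mydb")
Session = sessionmaker(bind=engine)

def _blocking_fetch(user_id):
    # Runs on a thread-pool thread, so blocking on the driver is fine here.
    session = Session()
    try:
        user = session.query(User).filter_by(id=user_id).first()
        return None if user is None else user.name
    finally:
        session.close()

def rpc_get_user_name(user_id):
    # Call this from the RPC handler on the reactor thread; returns a Deferred.
    return deferToThread(_blocking_fetch, user_id)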
If you're using SQLAlchemy and Twisted together, consider using Alchimia rather than the built-in adbapi.
After reaching some bottlenecks with the MySQL cluster my application uses (too many connections, deadlocks), I've been advised to queue all the writes using Kafka or Redis. A near real-time solution sounds, indeed, much better than having database connection errors in production, so I went for this recommendation.
However, I'm having trouble understanding where Kafka (or a Redis stream) fits in my architecture. I currently use multiple MySQL servers behind an HAProxy, and split read and write queries onto different ports to prevent the cluster from going out of sync. Each port is then directed to one server (for write queries) or to many servers (for read queries).
I assume I would have to configure the write endpoint so that all write queries go through Kafka first, but it seems like I simply know too little to connect the dots based on the tutorials I've been watching (most examples turn certain MySQL tables into streams, but that would be too late for my use case).
My question would be: is there a simple way to redirect SQL queries to Kafka (something like switching the MySQL host and port configuration to Kafka or using some kind of plug-in) or do I have to write my own adapter?
(I'm using PHP with Laravel for my app)
I just did some reading about serverless computing and FaaS. If we use FaaS to access an arbitrary database, we need to establish and close a database connection on every invocation. In, let's say, a Node application, we would usually establish the connection once and reuse it for multiple requests.
Correct?
I have a hosted MongoDB at mLab and have been thinking about implementing a REST API with Google's Cloud Functions service, but I don't know how to handle the database connection efficiently.
For sure, things will get clearer while coding and testing, but I would like to know my chances of success before spending a lot of time.
Thanks
Stefan
Serverless platforms reuse the underlying containers between distinct function invocations whenever possible. Hence you can set up a database connection pool in the global function scope and reuse it for subsequent invocations - as long as the container stays warm. GCP has a guide here using MySQL but I imagine the same applies to MongoDB.
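As a rough Python sketch of that pattern (the connection string, database, collection and entry-point names are placeholders, not taken from the answer or the guide):

# The client lives in the global scope, so it is created once per container
# instance and reused by every invocation while the instance stays warm.
import json
from pymongo import MongoClient

client = MongoClient("mongodb://user:password@ds012345.mlab.com:12345/mydb")
db = client["mydb"]

def api(request):
    # HTTP Cloud Function entry point; `request` is a flask.Request.
    doc = db["items"].find_one({}, {"_id": 0}) or {}
    return json.dumps(doc), 200, {"Content-Type": "application/json"}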
Question 1:
I am using MySQL Connector/J to connect to MySQL, and I am currently creating a new connection for every request. I need to use a connection pool. Should I choose c3p0, or can I use the MysqlConnectionPool class provided by the connector library?
Question 2:
I may need to load balance / fail over between two MySQL database servers. I could use jdbc:mysql://host,host2/dbname to do the failover automatically. I want to use a connection pool and failover in combination. How should I achieve that?
I'd recommend using c3p0 or something similar. It'll integrate into a Java EE app server better, and it's database agnostic.
Your second question is a good deal more complicated. Load balancing is usually done with an appliance of some kind, like an F5 or ACE, that stands between the client and the load balanced instances. Is that how you're doing it? How do you plan to keep the data in synch if you load balance between the two? If the connections aren't "sticky", you'll expect to find INSERTed data in both instances.
Maybe this reference can help you get started:
http://www.howtoforge.com/loadbalanced_mysql_cluster_debian
I have several Rails apps running on a single MySQL server. All of them run the same app, and all of the databases have the same schema, but each database belongs to a different customer.
Conceptually, here's what I want to do:
Customer.all.each do |customer|
  connection.execute("use #{customer.database}")
  customer.do_some_complex_stuff_with_multiple_models
end
This approach does not work because, when this is run in a web request, the underlying model classes cache different database connections from the A/R connection pool. So the connection on which I execute the "use" statement, may not be the connection the model uses, in which case it queries the wrong database.
I read through the Rails A/R code (version 3.0.3), and came up with this code to execute in the loop, instead of the "use" statement:
ActiveRecord::Base.clear_active_connections!
ActiveRecord::Base.establish_connection(each_customer_database_config)
I believe that the connection pool is per-thread, so it seems like this would clobber the connection pool and re-establish it only for the one thread the web request is on. But if the connections are shared in some way I'm not seeing, I would not want that code to wreak havoc with other active web requests in the same app.
Is this safe to do in a running web app? Is there any other way to do this?
IMO switching to a new database connection for different requests is a very expensive operation. AR maintains a limited pool of connections.
I guess you should move to PostgreSQL, where you have the concept of schemas.
In an ideal SQL world this is the structure of a database
database --> schemas --> tables
In MySQL, a database and a schema are the same thing. Postgres has separate schemas, which can hold tables for different customers. You can switch the schema on the fly, without changing the AR connection, by setting:
ActiveRecord::Base.connection.schema_search_path = "CUSTOMER'S SCHEMA"
Developing it requires a bit of hacking, though.
Switching databases by connecting/disconnecting is really slow and is not going to work, due to AR connection pools and internal caches. Try using ActiveRecord::Base.table_name_prefix = "customer_" and keep the database constant.
Right now, connections in ActiveRecord are scoped per class. They may look per-thread because, before Ruby 1.9, Ruby threads performed poorly and implementations tended to use processes instead of threads, but that may not stay true for long.
Since AR scopes connections to a model class, you can create a different model class for each database you have, using the answer given in this question.
The code will look something like this (I have not tested it):
Customer.all.each do |customer|
  c_class = Class.new(ActiveRecord::Base)
  c_class.establish_connection(each_customer_database_config)
  c_class.table_name = customer.table_name
  c_class.do_something_on_diff_models_using_customer_from_diff_conn(customer.id)
  c_class.clear_active_connections!
end
Why not keep the same db and tables and just have each of your models belong_to a customer? Then you can find all the models for that customer with:
Customer.all.each do |customer|
  customer.widgets
  customer.wodgets
  # etc
end
What is the most efficient way of implementing queues to be read by another thread/process?
I'm thinking of using a basic MySQL table with polling on sleep. This sounds like the most scalable option (it doesn't even have to be on the same server), but it might result in too many queries to the DB.
You have several options, and it really depends on what you are trying to get the system to do.
Fork child processes, and talk to them over their stdin/stdout pipes.
Create a named pipe on the file system, like /tmp/mysql.sock. This is basically using sockets to communicate across processes (see the minimal sketch after this list).
Set up a message broker. I'd recommend giving ActiveMQ a try, and using the Stomp client for Perl. This is probably your most scalable solution.
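For the named-pipe option, a minimal Python sketch (my own, with a made-up path and one job per line as the message format) could look like this:

import os

FIFO_PATH = "/tmp/work_queue.fifo"   # hypothetical location

if not os.path.exists(FIFO_PATH):
    os.mkfifo(FIFO_PATH)

# Producer process: opening for write blocks until a reader has the FIFO open.
def produce(job_id):
    with open(FIFO_PATH, "w") as fifo:
        fifo.write("%s\n" % job_id)

# Consumer process (run separately): each line read is one job.
def consume():
    with open(FIFO_PATH) as fifo:
        for line in fifo:
            print("got job:", line.strip())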
This is one of those things that is simple to write yourself to your exact specifications. I wrote a toy one here:
http://github.com/jrockway/app-queue
I am not sure it compiles anymore, as AnyEvent::Subprocess has changed significantly since I wrote it. But you can steal the ideas.
Basically, I think an RPC-style infrastructure is the best. You have a server that handles keeping the data. Then clients connect and add data or remove data via RPC calls. This gives you ultimate flexibility with the semantics. You can be "transactional" so that if a client takes data and then never says "hey, I am done with it", you can assume the client died and give the job to another client. You can also ensure that each job is only run once.
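To make that "transactional" hand-out concrete, here is a small single-process Python sketch of the idea (class name, lease timeout and data shapes are invented; a real server would sit behind sockets or an RPC layer):

import time
from collections import deque

class LeasedQueue:
    # Jobs that are taken but never acknowledged within the lease window
    # are assumed lost and handed out again.
    def __init__(self, lease_seconds=30):
        self.lease_seconds = lease_seconds
        self.pending = deque()   # jobs waiting to be taken
        self.in_flight = {}      # job_id -> (job, deadline)

    def add(self, job_id, job):
        self.pending.append((job_id, job))

    def take(self):
        self._reclaim_expired()
        if not self.pending:
            return None
        job_id, job = self.pending.popleft()
        self.in_flight[job_id] = (job, time.time() + self.lease_seconds)
        return job_id, job

    def done(self, job_id):
        # The client finished the job, so drop it for good.
        self.in_flight.pop(job_id, None)

    def _reclaim_expired(self):
        now = time.time()
        for job_id, (job, deadline) in list(self.in_flight.items()):
            if deadline < now:   # client presumably died
                del self.in_flight[job_id]
                self.pending.append((job_id, job))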
Anyway, making a queue work with a relational database table involves a bit of effort. You should use something like KiokuDB for the persistence. (You can physically store the data in MySQL if you desire, but KiokuDB provides a nicer Perl API on top of that.)
In PostgreSQL you could use the NOTIFY/LISTEN combination; you would only need to wait on the PG connection socket after running LISTEN.
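With psycopg2, for example, the waiting side can look roughly like this (DSN and channel name are placeholders):

# Block on the connection socket with select() and wake up when a NOTIFY
# arrives on the listened channel.
import select
import psycopg2
import psycopg2.extensions

conn = psycopg2.connect("dbname=queue_db user=queue_user")
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

cur = conn.cursor()
cur.execute("LISTEN new_job;")

while True:
    # Wait up to 60 seconds for the socket to become readable.
    if select.select([conn], [], [], 60) == ([], [], []):
        continue  # timed out; loop and wait again
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        print("NOTIFY from pid %s on %s: %r" % (notify.pid, notify.channel, notify.payload))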