MySQL ODBC timeout from R

I'm using R to read some data from a MySQL database using the RODBC package. The data is then processed and some results are sent back to the database. The problem is that the server closes the connection after about a minute of inactivity, which is roughly how long the local processing takes. It's a shared server, so the host won't bump up the timeout.
I think there are two possibilities to get around this:
Open a connection before every database transaction and close it immediately after
Send some small 'ping' command to the server every 30 seconds or so to let the server know that I'm still there.
I can implement the first fairly easily, but it seems pretty slow to constantly open and close connections. Does anyone know an efficient command for the second? Or is there a better way altogether?

The first solution is the one I prefer. It's really hard to do the latter with a single-threaded program like R: if R is busy running an analysis, there's no way for it to send the ping. Unless you are doing hundreds of reads/writes, opening and closing the connection should not introduce a significant amount of overhead.
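The connect-per-transaction pattern is language-agnostic; here is a minimal sketch in Python using the pymysql driver (the question itself uses RODBC in R, and the host, credentials and database name below are placeholder assumptions):

    import pymysql

    def run_query(sql, params=None):
        # Open a fresh connection for each unit of work so the server's
        # idle timeout never fires mid-session, and close it right after.
        conn = pymysql.connect(host="localhost", user="app",
                               password="secret", database="mydb")
        try:
            with conn.cursor() as cur:
                cur.execute(sql, params)
                rows = cur.fetchall()
            conn.commit()
            return rows
        finally:
            conn.close()

Each call pays the connection-setup cost once, typically a few milliseconds on a local network, so for a handful of reads and writes per run the overhead is negligible.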

Related

Is there a queue for SQL jobs?

I'm currently working on creating a huge MySQL database by parsing XML files released on an FTP server.
On a single computer it takes ages, because of the huge number of SQL INSERT INTO statements to execute.
So I modified my code to build the database on AWS by creating a cluster, launching a database, building everything there, and downloading the dump back.
However, I have a question: is there a "queue" for the SQL requests that are sent? I mean, if all of my nodes send requests to the database at the same time, what's going to happen?
Thanks
On MySQL you can use SHOW FULL PROCESSLIST to see the open connections and what query they are running at the moment.
There is no queue of requests, but some requests wait for others to complete before starting, because they attempt to use rows or tables that are locked by requests that are currently running.
Only one request is executed at a time for each connection.
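If you want to watch this from a script rather than a MySQL shell, here is a minimal sketch in Python with the pymysql driver (host and credentials are assumptions):

    import pymysql

    conn = pymysql.connect(host="localhost", user="app",
                           password="secret", database="mydb")
    with conn.cursor() as cur:
        # One row per open connection: Id, User, Host, db, Command,
        # Time, State, and the statement currently running (Info).
        cur.execute("SHOW FULL PROCESSLIST")
        for row in cur.fetchall():
            print(row)
    conn.close()

Run it while your nodes are inserting and you will see which connections are executing and which are waiting on locks (visible in the State column).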

Handling doctrine 2 connections in long running background scripts

I'm running PHP command-line scripts as RabbitMQ consumers which need to connect to a MySQL database. Those scripts run as Symfony2 commands using the Doctrine2 ORM, meaning that opening and closing the database connection is handled behind the scenes.
The connection is normally closed automatically when the CLI command exits, which by definition doesn't happen for a long time in a background consumer.
This becomes a problem when the consumer is idle (no incoming messages) for longer than the wait_timeout setting in the MySQL server configuration. If no message is consumed for longer than that period, the database server closes the connection, and the next message fails with a "MySQL server has gone away" exception.
I've thought about 2 solutions for the problem:
Open the connection before each message and close the connection manually after handling the message.
Implement a ping message which runs a dummy SQL query like SELECT 1 FROM table every n minutes, and call it using a cronjob.
The problem with the first approach is: if the traffic on that queue is high, there might be significant overhead for the consumer in opening/closing connections. The second approach just sounds like an ugly hack to deal with the issue, but at least I can keep a single connection during high-load times.
Are there any better solutions for handling doctrine connections in background scripts?
Here is another solution: try to avoid long-running Symfony 2 workers; they will always cause problems due to their long execution time. The kernel isn't made for that.
The solution here is to build a proxy in front of the real Symfony command, so that every message triggers a fresh Symfony kernel. Sounds like a good solution to me.
http://blog.vandenbrand.org/2015/01/09/symfony2-and-rabbitmq-lessons-learned/
My approach is a little bit different. My workers only process one message, then die. I have supervisor configured to create a new worker every time. So, a worker will:
1. Ask for a new message.
2. If there are no messages, sleep for 20 seconds before exiting; otherwise supervisor will think something is wrong and stop respawning the worker.
3. If there is a message, process it.
4. Maybe, if processing a message is super fast, sleep for the same reason as in step 2.
5. After processing the message, just finish.
This has worked very well using AWS SQS.
Comments are welcomed.
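As a rough illustration, here is what such a one-message-per-worker process could look like in Python with the pika RabbitMQ client (the queue name and the process() helper are hypothetical placeholders):

    import sys
    import time
    import pika

    def process(body):
        # Hypothetical work: replace with the real message handler.
        print("processing", body)

    # Connect to the broker; host and queue name are assumptions.
    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()

    # Ask for exactly one message (step 1).
    method, properties, body = channel.basic_get(queue="tasks")

    if method is None:
        # No message: sleep before exiting (step 2), so the process
        # supervisor does not see a rapidly dying worker and give up.
        time.sleep(20)
    else:
        process(body)  # do the actual work (step 3)
        channel.basic_ack(method.delivery_tag)

    # Finish either way (step 5); the supervisor starts a fresh worker,
    # which also gets a fresh database connection.
    connection.close()
    sys.exit(0)

Because every worker dies after one message, no database connection ever lives long enough for wait_timeout to matter.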
This is a big problem when running PHP scripts for a long time. For me, the best solution is to restart the script periodically. You can see how to do this in this topic: How to restart PHP script every 1 hour?
You should also run multiple instances of your consumer. Add a counter to each one and terminate it after a certain number of runs. Then you need a tool to ensure a consistent number of worker processes, something like this: http://kamisama.me/2012/10/12/background-jobs-with-php-and-resque-part-4-managing-worker/

Connecting and disconnecting from MySQL continuously with Excel

Newbie question....sorry
I have a simple MySQL database running on our intranet (Windows server) which more than 20 people connect to for searching/inserting records, etc.
This is done with a simple Excel GUI.
The process is:
Search strings are typed into Excel cells
VBA opens a connection to MySQL and the query is run
The results retrieved are put into Excel
The connection to MySQL is closed with VBA
The above process generally takes 0-2 seconds, with fewer than 100 records retrieved.
Everything runs fine so far.
In order to be able to connect more people in the future, I would like some feedback on whether it is OK to continuously connect and disconnect from MySQL in the way I am doing.
Can it cause some type of crash, memory leaks, etc.?
Is there some better way to do this?
I am hoping to get fewer than 2,000 users, but I understand that the more users connected, the worse it is.
By disconnecting after each search/insert, I am hoping to keep the number of live connections as low as possible.
thanks for your input
This constant connecting and disconnecting is an expensive process.
A better way would be to use server-side scripting to manage your connections. That way you would have a single persistent connection to the server, and the users would execute their queries through that one connection. You would also need to implement some sort of job queue for execution.
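As a rough sketch of that architecture, here is a single-threaded connection owner in Python with the pymysql driver; the credentials, table name and example query are all placeholder assumptions:

    import queue
    import threading
    import pymysql

    jobs = queue.Queue()  # holds (sql, params, reply_queue) tuples

    def db_worker():
        # One persistent connection, owned by one thread; every user
        # query is funnelled through it via the job queue.
        conn = pymysql.connect(host="localhost", user="app",
                               password="secret", database="mydb")
        while True:
            sql, params, reply = jobs.get()
            with conn.cursor() as cur:
                cur.execute(sql, params)
                reply.put(cur.fetchall())
            conn.commit()

    threading.Thread(target=db_worker, daemon=True).start()

    # A caller submits a job and waits for the result:
    reply = queue.Queue()
    jobs.put(("SELECT * FROM records WHERE name LIKE %s", ("%foo%",), reply))
    print(reply.get())

The Excel clients would then talk to this service (over HTTP, for example) instead of each opening their own MySQL connection.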

MySQL connections: should I keep the connection alive or start a new one before each transaction?

I'm making my first foray into MySQL and I have a question about how to handle the connection(s) my application has.
What I am doing now is opening a connection and keeping it alive until I terminate my program. I do a mysql_ping() every now and then and the connection is started with MYSQL_OPT_RECONNECT.
The other option I can think of would be to start a new connection before doing anything that requires the database and to close it after I'm done.
What are the pros and cons of these two approaches?
What are the "side effects" of a long-lived connection?
What is the most used method of handling this?
Cheers ;)
Some extra details
At this point I am keeping the connection alive, and I ping it every now and again to know its status and reconnect if needed.
In spite of this, when there is sustained concurrency with queries happening in quick succession, I get a "server has gone away" message, and after a while the connection is re-established.
I'm left wondering if this is a side effect of a prolonged connection or just a case of bad MySQL server configuration.
Any ideas?
In general there is quite some overhead incurred when opening a connection. Depending on how often you expect this to happen it might be OK, but if you are writing any kind of application that executes more than just a very few commands per run, I would recommend a connection pool (for server-type apps), or at least keeping a single connection (or very few) open from your standalone app for some time and reusing it for multiple transactions.
That way you have better control over how many connections get opened at the application level, even before the database server gets involved. This is a service an application server offers you, but it can also be rolled by hand rather easily if you want to keep things smaller.
Apart from performance reasons a pool is also a good idea to be prepared for peaks in demand. When a lot of requests come in and each of them tries to open a separate connection to the database - or as you suggested even more (per transaction) - you are quickly going to run out of resources. Keep in mind that every connection consumes memory inside MySQL!
Also you want to make sure to use a non-root user to connect, because if you don't (I think it is tied to the MySQL SUPER privilege), you might find yourself locked out. MySQL reserves at least one connection for an administrator for problem fixing, but if your app connects with that privilege, all connections would already be used up when you try to put out the fire manually.
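Since a pool can indeed be rolled by hand, here is a minimal hand-rolled sketch in Python with the pymysql driver; the pool size, credentials and choice of driver are all assumptions:

    import queue
    import pymysql

    POOL_SIZE = 5  # an assumption; size it to your expected concurrency

    def make_conn():
        # Connect as a dedicated application user, not root/SUPER,
        # as advised above.
        return pymysql.connect(host="localhost", user="app",
                               password="secret", database="mydb")

    pool = queue.Queue()
    for _ in range(POOL_SIZE):
        pool.put(make_conn())

    def run_query(sql, params=None):
        conn = pool.get()  # blocks when all connections are in use,
                           # capping load before it ever reaches MySQL
        try:
            conn.ping(reconnect=True)  # revive the connection if the
                                       # server closed it while idle
            with conn.cursor() as cur:
                cur.execute(sql, params)
                result = cur.fetchall()
            conn.commit()
            return result
        finally:
            pool.put(conn)  # hand the connection back for reuse

Blocking on pool.get() is exactly the demand-peak protection described above: at most POOL_SIZE connections ever exist, no matter how many requests arrive.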
Unless you are worried about having too many connections open (i.e. over 1,000), you should leave the connection open. There is overhead in connecting/reconnecting that will only slow things down. If you know you are going to need the connection to stay open for a while, run this query instead of pinging periodically:
SET SESSION wait_timeout=#
Where # is the number of seconds to leave an idle connection open.
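From a client script this could look like the following (a sketch in Python with the pymysql driver; the eight-hour value is just an illustration, and the setting only affects the current session):

    import pymysql

    conn = pymysql.connect(host="localhost", user="app",
                           password="secret", database="mydb")
    with conn.cursor() as cur:
        # Let this session sit idle for up to 8 hours (28800 seconds)
        # before the server closes it; other sessions keep the default.
        cur.execute("SET SESSION wait_timeout = 28800")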
What kind of application are you writing? If it's a web script: keep it open. If it's an executable, pool your connections (if necessary; most of the time a singleton will do).

Constant MySQL Connection or Connect when needed

I am building a little daemon which periodically (every 30 seconds) checks for new data and enters it into a local MySQL database.
I was just wondering whether it is better to create a connection to the database when the application launches and use that connection throughout until the application closes, or to open a connection only when there is new data, close it after the data has been added, and then repeat this when there is new data 30 seconds later.
Thank you.
I would recommend that you do whatever you find easiest to code. Don't waste any time trying to solve what will most likely be a non-problem.
If it turns out there is any difficulty with contention, connection limits or other such things you can fix it later.
That depends.
In your case the performance won't matter, since you won't be performing thousands of queries/logins per second, and the overhead of a new connection/login is in (tens of) milliseconds.
If you use a single connection, you have to make sure your daemon handles sudden disconnections from the MySQL side and is able to recover. Also, if you ever move your application to a different server than the MySQL instance, many firewalls will drop prolonged connections every now and then.
If you create a new connection every time and then disconnect when finished, things like firewalls cleaning up old connections won't bite you so easily.
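A rough sketch of that connect-per-cycle daemon loop, in Python with the pymysql driver (the fetch_new_data() helper, table and column names are hypothetical):

    import time
    import pymysql

    def fetch_new_data():
        # Hypothetical helper: returns new rows from wherever the
        # daemon's data actually comes from.
        return []

    while True:
        rows = fetch_new_data()
        if rows:
            # A fresh connection per cycle means a firewall or the
            # server's idle timeout can never kill a connection we
            # still care about.
            conn = pymysql.connect(host="localhost", user="app",
                                   password="secret", database="mydb")
            try:
                with conn.cursor() as cur:
                    cur.executemany(
                        "INSERT INTO readings (value) VALUES (%s)", rows)
                conn.commit()
            finally:
                conn.close()
        time.sleep(30)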