I have a database, let's say in MySQL, that logs runs of client programs that connect to the database. When doing a run, the client program will connect to the database, insert a "Run" record with the start timestamp into the "Runs" table, enter its data into other tables for that run, and then update the same record in the "Runs" table with the end timestamp of the run. The end timestamp is NULL until the end of the run.
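For concreteness, here is that flow sketched in Python; the table/column names and the mysql.connector driver are just stand-ins for whatever the clients actually use:

import mysql.connector  # assumed driver; any MySQL client behaves the same way

conn = mysql.connector.connect(user="client", password="...", database="runs_db")
cur = conn.cursor()

# Start of run: insert the Run record; the end timestamp starts out NULL.
cur.execute("INSERT INTO Runs (start_ts, end_ts) VALUES (NOW(), NULL)")
run_id = cur.lastrowid
conn.commit()

# ... the run itself: many smaller transactions writing data for this run_id ...

# End of run: fill in the end timestamp. If the client dies before this point,
# end_ts stays NULL forever, indistinguishable from a run still in progress.
cur.execute("UPDATE Runs SET end_ts = NOW() WHERE id = %s", (run_id,))
conn.commit()
conn.close()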
The problem is that the client program can be interrupted -- someone can hit Ctrl+C, the system can crash, etc. This would leave the end timestamp as NULL; i.e. I couldn't tell the difference between a run that's still ongoing and one that terminated ungracefully at some point.
I wouldn't want to wrap the entire run in a transaction, because runs can take a long time and upload a lot of data, and the data from a partial run would still be wanted. (There will be lots of smaller transactions during the run, however.) I also need to be able to view the data in real time from another SQL connection while a client is uploading it, so a mega-transaction for the entire run would not be good for that purpose either.
During a run, the client holds a continuous session with the SQL server, so it would be nice if there were a "trigger" or similar functionality that fires when the connection closes and updates the Run record with the end timestamp. It would also be nice if such a "trigger" could record a status like "completed successfully" vs. "terminated ungracefully" to boot.
Is there a solution for this in MySQL? How about PostgreSQL or any other popular relational database system?
I have a table with >19M rows from which I want to create a subtable (I'm breaking the table into several smaller tables). So I'm doing a CREATE TABLE new_table (SELECT ... FROM big_table). I run the query in MySQL Workbench.
The query takes a really long time to execute, so eventually I get a "Lost connection to MySQL server" message. However, after a few minutes the new table is there, and it seems to contain all the data that was supposed to be copied over (I'm doing a GROUP BY, so I cannot just check that the number of rows is equal in both tables).
My question is: Am I guaranteed that the query is completed even though I lose connection to the database? Or could MySQL interrupt the query midway and still leave a table with incomplete data?
Am I guaranteed that the query is completed even though I lose connection to the database?
No. There are several reasons other than a connection timeout for getting lost-connection errors. The server might crash after running out of disk space or hitting a hardware fault. An administrator might have terminated your session.
"Guarantee" is a strong word in the world of database management, because it's other people's data. You should not assume that any query ran correctly to completion unless it ended gracefully.
If you're asking because an overnight query failed and you don't want to repeat it, you can inspect the table with checks like COUNT(*) to convince yourself it completed. But please don't rely on this kind of hackery in production with other people's data.
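If you do inspect, a rough sanity check might compare the number of GROUP BY groups in the source with the number of rows that landed in the new table. A sketch, where mysql.connector and all the table/column names are placeholders:

import mysql.connector  # assumed driver

conn = mysql.connector.connect(user="app", password="...", database="mydb")
cur = conn.cursor()

# CREATE TABLE ... SELECT with a GROUP BY should produce one row per group,
# so the distinct group count in the source should match the copied row count.
cur.execute("SELECT COUNT(DISTINCT group_col) FROM big_table")
(expected,) = cur.fetchone()
cur.execute("SELECT COUNT(*) FROM new_table")
(actual,) = cur.fetchone()

print(f"groups in source: {expected}, rows in new table: {actual}")
conn.close()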
I am building a monitoring application for a machine; its position must be read and stored every second for a period of a month. I wrote a procedure to fill a table with an initial value of 0.
CREATE DEFINER=`root`@`localhost` PROCEDURE `new_procedure`(start_date DATETIME, end_date DATETIME)
BEGIN
    DECLARE interval_var INT DEFAULT 1;
    -- Insert one zero-valued row per second from start_date through end_date
    WHILE end_date >= start_date DO
        INSERT INTO table1(`datetime`, value) VALUES(start_date, 0);
        SET start_date = DATE_ADD(start_date, INTERVAL interval_var SECOND);
    END WHILE;
END
This procedure is very slow, and most of the time the connection to the MySQL server is lost. For example, when I tried to fill the table from "2016-01-14 07:00:00" to "2016-01-15 07:00:00", the procedure reached 2016-01-14 07:16:39 and crashed due to a lost connection to the database.
Is there a more efficient way to create a template table for a month with one-second increments and 0 values? My monitoring application is built in VB.NET, and I have tried writing VB code to create this template table, but it was slower and more likely to crash than running the procedure directly from MySQL Workbench.
I would recommend looking at the application architecture first. Expecting the whole system to run without failing for even one second for an entire month is asking an awful lot. Also, think about whether you really need this much data.
1 record/sec * 3600 sec/hr * 24 hr/day * 30 day/mo works out to 2,592,000 -- about 2.6 million records. Trying to process that much information will cause a lot of software to choke.
If the measurement is discrete, you may be able to cut this dramatically by only recording changes in the database. If it's analog, you may have no choice.
I would recommend creating two separate applications: a local monitor that stores data locally, and a MySQL-side application it reports to every hour or so (see the sketch below). That way, if the database is unavailable, the monitor keeps right on collecting data, and when the database is available again, it can transfer everything recorded since the last connection. The MySQL application can then store the data in the database in one shot. If that fails, it can retry, keeping its own copy of the data until it has been stored in the database.
Ex:
machine => monitoring station app => mysql app => mysql database
It's a little more work, but each application will be pretty small, and they may even be reusable. It will also make troubleshooting much easier and dramatically increase the fault tolerance of the system.
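Here is a minimal sketch of the buffering idea in Python, assuming a reading is just a (timestamp, position) pair; read_position_somehow() is a hypothetical stand-in for the real sensor read, and the buffer lives in memory (a local file or SQLite database would survive restarts):

import time
import mysql.connector  # assumed driver

buffer = []  # readings collected while the database is unreachable

def flush_to_mysql(rows):
    # Push everything buffered so far in a single transaction.
    conn = mysql.connector.connect(user="monitor", password="...", database="mydb")
    cur = conn.cursor()
    cur.executemany(
        "INSERT INTO table1 (`datetime`, value) VALUES (FROM_UNIXTIME(%s), %s)",
        rows,
    )
    conn.commit()
    conn.close()

while True:
    buffer.append((time.time(), read_position_somehow()))  # hypothetical sensor read
    if len(buffer) >= 3600:  # roughly one flush per hour at 1 reading/sec
        try:
            flush_to_mysql(buffer)
            buffer.clear()
        except mysql.connector.Error:
            pass  # database unavailable; keep buffering and retry next cycle
    time.sleep(1)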
Is it possible to view the database entries (for example with phpMyAdmin) which were created by a factory? My tests are successful, so the database entry should exist. But when I add sleep(60) to my test (after creating the entry), I can't find any entries in my database.
In most setups for FactoryGirl, your database entries will be inserted in a transaction that is never committed. That means the records will never be visible outside that one test.
If you're using RSpec, you can set config.use_transactional_fixtures = false.
If you're using DatabaseCleaner, you can use DatabaseCleaner.strategy = :truncation.
After either of these changes, test data will actually be committed and the records will be visible outside the test. This will likely make your tests a little slower.
I am using 2 separate processes via multiprocessing in my application. Both have access to a MySQL database via SQLAlchemy Core (not the ORM). One process reads data from various sources and writes it to the database; the other process just reads the data from the database.
I have a query which gets the latest record from a table and displays its id. However, it always displays the first id, created when I started the program, rather than the latest inserted id (new rows are created every few seconds).
If I use a separate MySQL tool and run the query manually I get correct results, but SQLAlchemy always gives me stale results.
Since you can see the changes your writer process is making from another MySQL tool, your writer process is indeed committing the data (at least it is if you are using InnoDB).
InnoDB shows you the state of the database as of when you started your transaction. Whatever other tool you are using probably has autocommit turned on, so a new transaction is implicitly started after each query.
To see the changes in SQLAlchemy, do as zzzeek suggests and change your monitoring/reader process to begin a new transaction.
One technique I've used to do this myself is to add autocommit=True to the execution_options of my queries, e.g.:
result = conn.execute(select([table]).where(table.c.id == 123).execution_options(autocommit=True))
Assuming you're using InnoDB, the data on your connection will appear "stale" for as long as you keep the current transaction running, or until you commit the other transaction. In order for one process to see the data from the other process, two things need to happen: (1) the transaction that created the new data must be committed, and (2) the current transaction, assuming it has already read some of that data, must be rolled back, or committed and started again. See The InnoDB Transaction Model and Locking.
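A sketch of the reader side, written against the same 1.x-style SQLAlchemy Core API the question uses (the engine URL and table name are placeholders): the point is simply that each poll ends the previous transaction before the next SELECT, so InnoDB can't keep serving the old snapshot.

import time
from sqlalchemy import MetaData, Table, create_engine, select

engine = create_engine("mysql://user:pass@localhost/mydb")  # placeholder URL
table = Table("some_table", MetaData(), autoload=True, autoload_with=engine)

while True:
    # A fresh connection, and therefore a fresh transaction, per poll,
    # so each SELECT sees rows committed since the previous iteration.
    with engine.connect() as conn:
        row = conn.execute(
            select([table]).order_by(table.c.id.desc()).limit(1)
        ).fetchone()
    print(row)
    time.sleep(5)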
I'm very new to MySQL. I'm trying to execute some commands, wait a certain period of time until they are uploaded, and then check that the data from the commands is actually in the table so I can delete these files.
Is there any sort of first-priority/instant way of executing these commands? Or a way of setting a maximum amount of time these commands may take to execute?
Thanks. By the way, I am using C# and the OdbcCommand class.
You don't need to check anything. If the database doesn't return an error, the data is written. MySQL does have a way to defer inserts, but it's not part of the default form of INSERT, so you don't need to worry about it. When the query finishes without an error, the data is in the database (unless you are in a transaction, in which case the data is in the database after you commit the transaction).
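For illustration, the same semantics in Python with an assumed mysql.connector driver (the same rule applies through ODBC in C#): a statement that returns without error under autocommit is already stored; under an explicit transaction it is stored once commit() returns.

import mysql.connector  # assumed driver

conn = mysql.connector.connect(user="app", password="...", database="mydb")
cur = conn.cursor()

# With autocommit on, a successful execute() means the row is in the database.
conn.autocommit = True
cur.execute("INSERT INTO t (col) VALUES (%s)", ("value",))

# Inside an explicit transaction, the row is durable only after commit().
conn.autocommit = False
cur.execute("INSERT INTO t (col) VALUES (%s)", ("value",))
conn.commit()

conn.close()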