Flyway does not handle implicitly committed statements when the Flyway process crashes - MySQL

Ran into this situation recently using Spring Boot (1.2.3) and Flyway (3.1), and could not find much about how to handle it:
The server was spinning up and executing a long-running ALTER TABLE ... ADD COLUMN statement against a MySQL database (5.6), which takes 20-30 minutes. While the statement was running, the server process was hard-terminated because it was not responding to health checks within the given timeframe. Since the MySQL server was already processing the statement, it continued to completion, but the migration was marked neither as failed nor as successful. When another server was spun up, it tried to execute the script again, which failed because the column already existed.
Given that the server could crash at any time for any reason during a long-running script, I would like to understand the established patterns for handling this situation, other than idempotent scripts or a manual DB upgrade process.
Perhaps a setting that indicates the database platform uses implicit commits, so the script is marked as run as soon as it is sent to the server?

You bring up a good point, but unfortunately I don't think Flyway or Spring Boot has any native support for this.
One workaround, ugly as it is, is to implement the beforeEachMigrate and afterEachMigrate callbacks that Flyway provides. You could use them to maintain a separate migration table that keeps track of which migrations have been started and which ones have been completed. Then, if it contains unfinished migrations the next time your application starts, you can shut it down with a descriptive error message.
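A minimal sketch of that workaround, assuming Flyway's SQL callback scripts (beforeEachMigrate.sql / afterEachMigrate.sql, supported since Flyway 3.0) and a hypothetical migration_audit table; all names here are made up:

    -- beforeEachMigrate.sql: runs before every individual migration
    CREATE TABLE IF NOT EXISTS migration_audit (
        id          INT AUTO_INCREMENT PRIMARY KEY,
        started_at  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
        finished_at TIMESTAMP NULL
    );
    INSERT INTO migration_audit (started_at) VALUES (CURRENT_TIMESTAMP);

    -- afterEachMigrate.sql: runs after every individual migration
    UPDATE migration_audit SET finished_at = CURRENT_TIMESTAMP
    WHERE finished_at IS NULL;

At startup, before invoking Flyway, the application would run SELECT COUNT(*) FROM migration_audit WHERE finished_at IS NULL and refuse to start if the count is non-zero, since that means a migration was started but never recorded as finished.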
I recommend creating a feature request about it. If you do, please link us to it!

My approach would be to have a separate migration script for any long-running SQL that causes an implicit commit. Flyway makes it really easy to add minor-versioned scripts, so there's no good reason to overcomplicate the implementation with what you're suggesting. If you're using PostgreSQL you probably wouldn't need to do this, but Oracle and MySQL would require it.
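For example (hypothetical script and column names), the long-running, implicitly committed DDL gets a minor-versioned script of its own, so if the process crashes you know exactly which single statement to verify before repairing the schema history:

    -- V2__create_orders.sql
    CREATE TABLE orders (id INT AUTO_INCREMENT PRIMARY KEY);

    -- V2_1__add_orders_status.sql: the only statement in this script
    ALTER TABLE orders ADD COLUMN status VARCHAR(20) NOT NULL DEFAULT 'new';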

Related

ActiveRecord transactions: record created but not found in database

We're using Rails v6.0.2.1 on JRuby with managed PostgreSQL (using the JDBC adapter) hosted on DigitalOcean. Lately we have been seeing issues where a transaction creates a record and passes its id to Sidekiq for further processing, but the record isn't in the database, so Sidekiq fails because the record cannot be found.
There are some Concurrent::Future calls inside the transaction as well. DigitalOcean doesn't show any deadlocks in the relevant time period, and DO currently doesn't provide a dump of the PostgreSQL logs. How do we debug this?
While looking at the Rails production logs, we found that those transactions didn't log any BEGIN ... COMMIT or ROLLBACK messages either.
I assume you might have been using an after_save callback on your models? Anyway, the issue is that Sidekiq is fast. Sometimes too fast.
What is happening is that Sidekiq is picking up the job before the transaction is committed. That will trigger the error you are seeing (e.g. RecordNotFound).
You have two solutions: simply retry the job (on a second pass the record will most likely be in the database), or move to the after_commit callback, and these errors will disappear.
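To see why, here is the race at the database level (illustrative SQL; the table name and id are made up):

    -- connection A (Rails request)            -- connection B (Sidekiq worker)
    BEGIN;
    INSERT INTO records (name) VALUES ('x');
    -- after_save enqueues the job here
                                               SELECT * FROM records WHERE id = 42;
                                               -- 0 rows: A has not committed yet
    COMMIT;
    -- the row only becomes visible now

after_commit fires at the COMMIT line instead, so the worker can no longer win the race.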
This issue was discussed here in the past

How to update data in Redis and MySQL at the same time?

I'm building a background service which boils down to a very complicated queue system. The idea is to use Redis as non-persistent storage, and have a pub/sub scheme which runs on an interval.
All of the subscribers will be behind a load balancer. This removes the complicated problem of maintaining state between all the servers behind the load balancer.
But this introduces a new problem: how can I ensure that the non-persistent (Redis) and persistent (MySQL) databases are both updated by my application(s)?
It seems like I'm forced to prioritize one, and if I HAVE to prioritize one, I will prioritize persistence. But in that scenario, what happens if MySQL is updated, Redis is not, and for some reason I have lost the connection to MySQL and cannot undo my last write?
There are two possible solutions to your problem:
The first is to follow these steps:
a. Start MySQL transaction with START TRANSACTION
b. Run your MySQL query INSERT INTO ...
c. Run your Redis command
d. Finish your MySQL transaction with a COMMIT statement if the Redis command succeeded, or a ROLLBACK statement if it failed
Using transactions ensures that data is consistent in both stores.
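Concretely, the sequence from steps a-d looks like this (the table and payload are hypothetical; the Redis call happens in application code between the SQL statements):

    START TRANSACTION;
    INSERT INTO queue_items (payload) VALUES ('job-123');
    -- application code: issue the Redis command here, e.g. LPUSH myqueue job-123
    COMMIT;   -- if the Redis command succeeded
    ROLLBACK; -- instead, if the Redis command failed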
The second is to write a Lua script for Redis using the LuaSQL library (https://realtimelogic.com/ba/doc/en/lua/luasql.html), in which you connect to MySQL, insert your data, and then send commands to Redis as well. This Lua script can then be called from the client side with a single EVAL or EVALSHA command.
You can also try the MySQL UDF plugin (https://github.com/Ideonella-sakaiensis/lib_mysqludf_redis).
See the post: how to move data from mysql to redis

Node.js, Express, MySQL - update schema

I have a small app running on a production server. In the next update the DB schema will change; this means the production database schema will need to change and some data manipulation will be required.
What's the best way to do this? I.e., run a one-off script to complete these tasks when I deploy to the production server?
Stack:
Nodejs
Expressjs
MySQL using node mysql
Codeship
Elasticbeanstalk
Thanks!
"The best way" depends on your circumstances. Is this a rather seldom occurrence, or is it likely to happen on a regular basis? How many production servers are there? Are there other environments, e.g. for integration tests, staging etc.? Do your developers have an own DB environment on their machines? Does your process involve continuous integration?
The more complex your landscape is, the better it is to use solutions like Todd R suggested (Liquibase, Flywaydb).
If you just have one production server and it can be down for maintenance for a few hours, then it could be sufficient to:
Schedule a maintenance downtime with your stakeholders and users
Shutdown the server
Create a backup
Update the database structure and contents as necessary
Deploy software updates
Restart the server
Test the result (manually or automatically)
Inform your stakeholders and users
If anything goes wrong, roll back to a backed-up version of the database and your software.
Having database update scripts is advisable. Having tested them at least once is even more advisable. Creating a backup in advance is essential.
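A minimal sketch of such a one-off update script, assuming a hypothetical users table whose name columns are being merged:

    -- update-2016-01.sql: run once during the maintenance window
    ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
    UPDATE users SET full_name = CONCAT(first_name, ' ', last_name);
    ALTER TABLE users DROP COLUMN first_name, DROP COLUMN last_name;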
http://www.liquibase.org/ or http://flywaydb.org/ - pretty "heavy" for one-time use, but if you'll need to change the schema again in the future, it's probably worth investing the time to learn one of these.

Execute command when replicating database on publisher and subscriber?

I have two MSSQL 2012 databases.
I have snapshot replication configured, where the first server is the publisher and distributor, and the other is a subscriber.
I would like to be able to execute a command on the publisher just before the replication job occurs, and then another command on the subscriber just after the replication finishes.
I believe this should be a pull snapshot replication, so that the agent is located on the subscriber server.
Is this even possible?
EDIT: Due to the nature of snapshot replication, I switched to using transactional replication, thus removing my ability to execute scripts on replication start and stop.
I never did find a way to execute commands while data is replicating, as I switched to transactional replication. The job handling this replication type starts and then just keeps running, unlike snapshot replication, where the job starts, replicates data, and stops.
Instead, I set up the jobs I needed using the task scheduler. My services transfer files to and from a web server through the database, and will only transfer files that are not already present.
Using the task scheduler is working pretty well, and it is MUCH simpler and more stable than having something execute a SQL script, which would then run a PowerShell remoting command to connect to the server and execute the service.
I just thought I would add this in case anyone else stumbles on a similar problem :)

The best practice for creating a daemon on a Linux server

Here is the scenario:
We have a site running on Node.js. Periodically, we pull some data from the internet, analyze it, and update a MySQL database.
My questions are:
What is the best practice for creating a Linux daemon? gcc? Can I do it in PHP or another language?
Since Node.js will be accessing the same database, how can we create a mutex?
How can we manage the daemon? For example, if the daemon crashes, we want to restart it automatically.
You can use forever.js ... see How does one start a node.js server as a daemon process?. It answers your 1st and 3rd questions. I guess you should have searched Stack Overflow or just googled a bit!
You can code a daemon in any language: C, C++, Ocaml, Haskell, ... (but I won't code it in PHP).
The most important thing in coding a daemon is to make sure the code is robust and detects faults.
Concurrent access to the database should be handled by the MySQL server.
If you only share resources through a shared database, you can use its transaction isolation guarantees to stop other processes from seeing incomplete data.
This means that you need to either do your operation atomically in SQL (a single statement) or use a transaction.
In any case, it means you need to use a transactional engine in MySQL (probably InnoDB), and your application needs to be aware of deadlocks and handle them correctly.
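A sketch of both patterns, assuming a hypothetical tasks table on InnoDB:

    -- atomic single statement: only one process can claim a given row
    UPDATE tasks SET claimed_by = 'daemon-1'
    WHERE claimed_by IS NULL
    ORDER BY id LIMIT 1;

    -- or an explicit transaction with row locking
    START TRANSACTION;
    SELECT id FROM tasks WHERE claimed_by IS NULL ORDER BY id LIMIT 1 FOR UPDATE;
    UPDATE tasks SET claimed_by = 'daemon-1' WHERE id = 42;  -- id from the SELECT
    COMMIT;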