Alembic migration hangs after creating db session - sqlalchemy

I'm using this approach to talk to an external database in my Alembic migration. I can run SELECT statements and read data from the other database, but the Alembic migration always ends up hanging after running my statements. I've noticed that removing...
Session = sessionmaker(bind=engine)
session = Session()
... fixes the issue, but I obviously can't do much without that.
I see no errors, what could be going on?
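For context, here is a minimal sketch of the kind of setup described above; the connection URL, query, and the cleanup calls at the end are illustrative assumptions, not taken from the original post:

from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

def upgrade():
    # Engine pointing at the external database (URL is a placeholder)
    engine = create_engine("postgresql://user:pass@other-host/otherdb")
    Session = sessionmaker(bind=engine)
    session = Session()

    # Reading from the other database works fine
    rows = session.execute(text("SELECT id, name FROM some_table")).fetchall()
    # ... use rows to drive the migration ...

    # Assumption: the connections held by this session/engine are what keep
    # the migration from finishing, so they are released explicitly here
    session.close()
    engine.dispose()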

Related

ActiveRecord transactions: record created but not found in database

We're using Rails v6.0.2.1 on JRuby with managed PostgreSQL (using the JDBC adapter) hosted on DigitalOcean. Lately, we have been seeing issues where a transaction creates a record and passes its id to Sidekiq for further processing, but the record isn't in the database and Sidekiq fails because the record is not found.
There are some Concurrent::Future calls inside the transaction as well. DigitalOcean doesn't show any deadlocks in that time period, and currently DO doesn't provide a dump of the PostgreSQL logs. How do we debug this?
While looking at the Rails production logs, we found that those transactions didn't log any BEGIN...COMMIT or ROLLBACK messages either.
You might be using an after_save callback on your models, I assume? Anyway, the issue is that Sidekiq is fast. Sometimes, too fast.
What is happening is that Sidekiq is picking up the job before the transaction is committed. That will trigger the error you are seeing (e.g. RecordNotFound).
You have two solutions: simply retry the job (on a second pass the record will most likely be in the database), or move to an after_commit callback and these errors will disappear.
This issue was discussed here in the past

How to update data in Redis and MySQL at the same time?

I'm building a background service which boils down to a very complicated queue system. The idea is to use Redis as non-persistent storage, and have a sub/pub scheme which runs on an interval.
All of the subscribers will be behind a load balancer. This removes the complicated problem of maintaining state between all the servers behind the load balancer.
But, this introduces a new problem...how can I ensure that the non-persistent (Redis) and persistent (MySQL) databases are both updated by my application(s)?
It seems like I'm forced to prioritize one, and if I HAVE to prioritize one, I will prioritize persistence. But, in that scenario, what happens if MySQL is updated, Redis is not, and for some reason I have lost the connection to MySQL and cannot undo my last write?
There are two possible solutions to your problem:
Following these steps:
a. Start MySQL transaction with START TRANSACTION
b. Run your MySQL query INSERT INTO ...
c. Run your Redis command
d. Finish your MySQL transaction with a COMMIT statement if the Redis command succeeded, or ROLLBACK if it failed
Using transactions this way keeps the data consistent in both stores (see the sketch at the end of this answer).
Write a Lua script for Redis using the LuaSQL library (https://realtimelogic.com/ba/doc/en/lua/luasql.html), where you connect to MySQL, insert your data, and then send commands to Redis as well. This Lua script can then be called from the client side with a single EVAL or EVALSHA command.
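A minimal sketch of the first approach (steps a-d), assuming PyMySQL and redis-py as the client libraries; the connection details, table, and key names are only illustrative:

import pymysql
import redis

# Connection details are placeholders
db = pymysql.connect(host="localhost", user="app", password="secret",
                     database="queue", autocommit=False)
r = redis.Redis(host="localhost", port=6379)

def enqueue(payload):
    cur = db.cursor()
    try:
        db.begin()                                             # a. START TRANSACTION
        cur.execute("INSERT INTO jobs (payload) VALUES (%s)",  # b. the MySQL write
                    (payload,))
        r.lpush("jobs", payload)                               # c. the Redis command
        db.commit()                                            # d. COMMIT if Redis succeeded
    except Exception:
        db.rollback()                                          # d. ROLLBACK otherwise
        raise
    finally:
        cur.close()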
You can try the mysql udf plugin (https://github.com/Ideonella-sakaiensis/lib_mysqludf_redis)
See the post: how to move data from mysql to redis

Flyway does not handle implicitly committed statements when the Flyway process crashes

Ran into this situation recently using Spring Boot (1.2.3) and Flyway (3.1), and could not find much about how to handle it:
A server was spinning up and executing a long-running ALTER TABLE ... ADD COLUMN statement against a MySQL database (5.6), which took 20-30 minutes. While the script was running, the server process was hard-terminated because it was not responding to health checks within the given timeframe. Since the MySQL server was already processing the statement, it continued to completion, but the script was not marked as failed or successful. When another server was spun up, it tried to execute the script again, which failed because the column already existed.
Given that the server could crash at any time for any reason during a long-running script, I would like to understand established patterns for handling this situation, other than idempotent scripts or a manual db upgrade process.
Perhaps a setting indicating that the server platform uses implicit commits, so that the script is marked as run as soon as it is sent to the server?
You bring up a good point but unfortunately, I don't think Flyway or Spring Boot have any native support for this.
One workaround, ugly as it is, is to implement the beforeEachMigrate and afterEachMigrate callbacks that Flyway provides. You could use them to maintain a separate migration table that keeps track of which migrations have been started and which ones have been completed. Then, if it contains unfinished migrations the next time your application starts, you can shut it down with a descriptive error message.
I recommend creating a feature request about it. If you do, please link us to it!
My approach would be to have separate migration scripts for any long-running SQL that has an implicit commit. Flyway makes it really easy to add minor version numbered scripts, so there's not a good reason to overcomplicate the implementation with what you're suggesting. If you're using PostgreSQL you probably wouldn't need to do this, but Oracle and MySQL would require it.

Rails adapter solutions for MySQL Cluster (NDB)?

I'm setting up a high-availability environment for a customer. There are a pair of load-balanced hosts serving http requests to our Rails application, which is deployed in parallel on both hosts.
Additionally, there are two MySQL hosts available. I want to run MySQL Cluster (NDB) on both hosts (i.e., multi-master) to have a fully redundant configuration. I'm specifically trying to avoid a master-slave configuration based on database replication; I feel like that makes the writable node a single point of failure.
I'm looking for some guidance on how best to interface our Rails app to a multi-master MySQL cluster like this. Almost all of the database adapters I can find are for master-slave setups. Failover_adapter sounds very promising, but it's woefully outdated. I haven't managed to turn up anything similar developed in the last five years.
Is anyone aware of any gems to use or approaches to take to serve as an adapter between a Rails application and a multi-master MySQL cluster like I've described?
I've just recently set this up. There shouldn't be much work to be done.
The mysql2 adapter gem specifies the engine as InnoDB, which is obviously not suitable for clustering. You need to add this to an initializer file:
class ActiveRecord::ConnectionAdapters::Mysql2Adapter
  def create_table(table_name, options = {})
    super(table_name, options.reverse_merge(:options => "ENGINE=NDB"))
  end
end
This is the original in the adapter.
def create_table(table_name, options = {})
  super(table_name, options.reverse_merge(:options => "ENGINE=InnoDB"))
end
Another key aspect is that you don't want migrations taking place at the same time, so in the deploy.rb file:
task :migrate, :max_hosts => 1 do
  # sleep 0.5
  run "cd #{release_path} && bundle exec rake db:migrate RAILS_ENV=#{rails_env}"
end
Max hosts prevents cap from running the migrations in parallel. This is important because you don't want the cluster running create-table type statements at the same time. It may even be worth adding the delay I have commented out above, just for a little extra safety.
One more key aspect. Don't forget to set:
DataMemory =
IndexMemory =
The default values are extremely low. Typically IndexMemory is around DataMemory/5 to DataMemory/10.
One more pitfall I've seen so far: on your mysqld nodes, make sure you set
ndb_autoincrement_prefetch_sz
to at least 100. Otherwise bulk inserts will take forever. The default is 1.
EDIT:
ndb_autoincrement_prefetch_sz
Leave this variable completely alone. Don't set it. It can cause auto-increment values to get out of sync across the cluster, which is a nightmare to debug.
Additionally, make sure your NDB nodes don't run on the same server as the NDB MGM nodes.
Happy coding.
I ultimately was not able to find an adapter solution that did what I wanted.
I settled on using the mysql2 adapter and pointing it at a reverse-proxy (I used haproxy) in front of my database cluster that could handle load-balancing and failover between the master nodes.

If I run a migration with Django South and it crashes, is my database ever corrupted?

I'm playing around with Django South and have been impressed by its power, but in the process of doing some migrations I've managed to do things that cause errors in the middle of a migration: things like having a syntax error or runtime exception in a data migration file, or deciding I didn't actually want to do something and hitting Ctrl-C during a migration to abort it prematurely, etc.
I'm using MySQL as a database backend. Do I need to worry about the integrity of my database when something goes wrong with South? Do transactions ensure that all problems are rolled back on error?
The database should rollback nicely:
http://south.aeracode.org/docs/migrationstructure.html#transactions
Anyway, can't you just check the db tables?
A couple of notes:
You can print the existing migrations with
manage.py migrate --list
This also shows which migrations have been applied
You can also manually roll back to a previous migration using
manage.py migrate <app_name> 0010
where 0010 is the last safe migration.
Hope this helps