Alembic and development / production databases - sqlalchemy

For development purposes I am using sqlite, though I expect to use postgres in the production environment.
I see that alembic supports multiple databases.
What I am less clear on is whether the migration scripts for different database engines are the same -- in other words, can I use the same migration scripts for postgres and for sqlite, or should I maintain entirely separate alembic environments for them?

Alembic migrations are written with SQLAlchemy datatypes. SQLAlchemy has both generic types and vendor-specific types.
If you use vendor-specific datatypes, your migrations won't work across multiple vendors; otherwise they should.
For more information about types in SQLAlchemy, see http://docs.sqlalchemy.org/en/rel_1_1/core/type_basics.html

Related

Doctrine, SQLite and Enums

We have an application running on Symfony 2.8 with a package named "liip/functional-test-bundle". We plan on using PHPUnit to run functional tests on our application, which uses MySQL for its database.
The 'functional test bundle' package allows us to use the entities as a schema builder for an in-memory SQLite database, which is very handy because:
It requires zero configuration to run
It's extremely fast to run tests
Our tests can run independently of each other and of the development data
Unfortunately, some of our entities use 'enums', which are not supported by SQLite, and our technical lead has opted to keep the existing enums while refraining from adding new ones.
Ideally we need this in the project sooner rather than later, so the team can start writing new tests in the future to help maintain the stability of the application.
I have 3 options at this point, but I need help choosing the correct one and performing it correctly:
Convince the technical lead that enums are a bad idea and lookup tables could be used instead (which may cost time when the workload is already high)
Switch to using MySQL for the testing database. (This will require additional configuration for our tests to run, and may be slower)
Have doctrine detect when enums are used on a SQLite driver, and switch them out for strings. (I would have no idea how to do this, but this is, in my opinion, the most ideal solution)
Which action is the best, and how should I carry it out?
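For what it's worth, option 3 is exactly how SQLAlchemy (Python) handles this problem, via type variants; a Doctrine solution would need the analogous trick of a custom DBAL type that maps the enum to a string on the SQLite platform. A sketch of the idea in SQLAlchemy terms, with invented names:

```python
# Sketch of the "enum on MySQL, plain string on SQLite" idea, shown with
# SQLAlchemy's type variants; the Doctrine equivalent would be a custom DBAL
# type that compiles differently per platform. Names are invented.
import sqlalchemy as sa
from sqlalchemy.dialects import mysql, sqlite

status_type = sa.Enum("active", "archived", name="status").with_variant(
    sa.String(32), "sqlite"  # fall back to VARCHAR where ENUM is unsupported
)

print(status_type.compile(dialect=mysql.dialect()))   # an ENUM(...) type
print(status_type.compile(dialect=sqlite.dialect()))  # a VARCHAR type
```

The column is declared once; the dialect decides at DDL-compile time which concrete type is emitted, so the in-memory SQLite test database and the production MySQL database share one schema definition.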

Alembic Migrations without Database

Is it possible to have Alembic auto-generate migrations without it having access to the database?
For example django / south are able to do this by comparing the current version of a Model against a previous snapshot of the Model.
No, this isn't possible. In the relevant issue, zzzeek said:
while the reflection-based comparison has its issues, it really is a very fundamental assumption these days particularly in the openstack world where autogen features are used in unit test suites to ensure the migrated schema matches the model. I don't have plans right now to pursue the datafile-based approach, it would be an enormous undertaking for a system that people seem to be mostly OK with as is.
Though an alternative approach could be to spin up a new database on demand, run the migrations from empty to head, generate against it, then discard the database.

Arel in SQLite and other databases

I am new to databases and Ruby on Rails applications, and I have a question about generating queries from an ORM.
My database is SQLite, and I have code that generates queries for it. If I change databases, can I still use the same code?
In addition, I use Arel because it provides more ready-made methods for complex queries, and I call the method .to_sql before generating a query.
If I want to use the same code with another database, can I still execute the query, perhaps using something other than to_sql?
In general, Ruby on Rails code is portable between databases without doing anything more than adjusting your config/database.yml file (for connection details) and updating your Gemfile (to use the correct database adapter gem).
Database portability is most likely when you do not rely on specific, hardcoded SQL as a way to invoke queries. Instead use Rails' associations, relations, and query tools wherever possible. Specific SQL often creeps in via .where() clauses, so be thoughtful there and minimize/simplify those as much as practical (for instance, multiple simple scopes that can be chained may give you better results than larger, more complex single scopes). Also use Arel.matches when you depend on "LIKE"-type clauses, instead of hardcoding LIKE details in .where() calls, because different databases (such as PostgreSQL) handle case sensitivity differently.
Your best defense against surprises when switching databases is a robust set of automated unit tests (e.g., RSpec, minitest) and integration tests (e.g., Capybara). These are especially important where you cannot avoid database-specific SQL (say, for optimization or odd/complex queries).
Since SQLite is simpler than most other robust engines like MySQL or Postgres, the operations you depend on are likely to be safe anyway. You're most vulnerable when using an advanced or database-specific feature, but you're also generally more aware when you're doing that, so you can write protective tests to warn you when switching database engines.

Performing a join across multiple heterogenous databases e.g. PostgreSQL and MySQL

There's a project I'm working on, kind of a distributed Database thing.
I started by creating the conceptual schema, and I've partitioned the tables such that I may need to perform joins between tables in MySQL and PostgreSQL.
I know I can write some middleware that breaks down the SQL queries, issues sub-queries targeting the individual DBs, and then merges the results, but I'd like to do this in SQL if possible.
My search so far has turned up the Federated storage engine for MySQL, but it seems to work only between MySQL databases.
If this is possible, I'd appreciate some pointers on what to look at, preferably in Python.
Thanks.
It might take some time to set up, but PrestoDB is a valid OpenSource solution to consider.
see https://prestodb.io/
You connect to Presto with JDBC and send it the SQL; it interprets the different connections, dispatches sub-queries to the different sources, then does the final work on the Presto node before returning the result.
From the Postgres side, you can try using a foreign data wrapper such as mysql_fdw. Queries with joins can then be run through various Postgres clients, such as psql, pgAdmin, psycopg2 (for Python), etc.
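The FDW route is a one-time piece of configuration on the Postgres side; roughly the following, where the host, credentials, and table definitions are all placeholders:

```sql
-- Rough one-time setup for mysql_fdw on the Postgres side.
-- Host, credentials, and table definitions are placeholders.
CREATE EXTENSION mysql_fdw;

CREATE SERVER mysql_server
  FOREIGN DATA WRAPPER mysql_fdw
  OPTIONS (host 'mysql.example.com', port '3306');

CREATE USER MAPPING FOR CURRENT_USER
  SERVER mysql_server
  OPTIONS (username 'app', password 'secret');

CREATE FOREIGN TABLE mysql_orders (
  customer_id integer,
  total numeric
)
SERVER mysql_server
OPTIONS (dbname 'shop', table_name 'orders');

-- After that, joins against native Postgres tables work as usual:
-- SELECT c.name, o.total FROM customers c JOIN mysql_orders o ON o.customer_id = c.id;
```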
This is not possible with SQL.
Your options are to write your own "middleware", as you hinted at. To do that in Python, you would use the standard DB-API drivers for both databases, issue the individual queries, and then merge their results. An ORM like SQLAlchemy will go a long way toward helping with that.
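To make the middleware idea concrete, here is a self-contained sketch. Two separate DB-API connections stand in for the MySQL and PostgreSQL drivers (e.g. mysqlclient and psycopg2); sqlite3 is used only so the example runs anywhere, and the table and column names are invented:

```python
import sqlite3

# Two DB-API connections standing in for the real PostgreSQL and MySQL
# drivers; sqlite3 keeps the sketch self-contained. Names are invented.
db_a = sqlite3.connect(":memory:")  # pretend this is PostgreSQL
db_b = sqlite3.connect(":memory:")  # pretend this is MySQL

db_a.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
db_a.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "alice"), (2, "bob")])
db_b.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
db_b.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 9.5), (1, 3.0), (2, 7.25)])

# Issue a sub-query against each database separately...
customers = {cid: name
             for cid, name in db_a.execute("SELECT id, name FROM customers")}
orders = db_b.execute("SELECT customer_id, total FROM orders").fetchall()

# ...then perform the join in application code (a simple hash join).
joined = [(customers[cid], total) for cid, total in orders if cid in customers]
```

The merge step is where the complexity lives (joins, ordering, pagination all have to be re-implemented in application code), which is why a single converged node or an FDW is usually preferable when the data volumes allow it.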
The other option is to use an integration layer. There are many options out there, though none that I know of are written in Python. Mule ESB, Apache ServiceMix, WSO2, and JBoss MetaMatrix are some of the more popular ones.
You can colocate the data on a single RDBMS node (either PostgreSQL or MySQL for example).
Two main approaches
Readonly - You might use read replicas of both source systems, then use a process to copy the data to a new writable, converged node; OR
Primary - You might choose one of the two databases as the primary. Move the data from the other into it using a conversion process (e.g. ETL or off-the-shelf table-level replication).
Then you can just run the query on the one RDBMS with JOINs as usual.
BONUS: You can also read logs from any RDBMS that can ship them through Kafka, and make this as complex as required.

Modern, smart ways of updating MySql table definitions with software updates?

A common occurrence when rolling out the next version of a software package is that some of the data structures change. When you are using an SQL database, an appropriate series of ALTERs and UPDATEs may be required. I've seen (and created myself) many ways of doing this over the years. For example, RoR has the concept of migrations. However, everything I've done so far seems a bit hairy to maintain or has other shortcomings.
In a magical world I'd be able to specify the desired schema definition, and have something automatically sort out what alters, updates, etc. are needed to move from the existing database layout...
What modern methodologies/practices/patterns exist for rolling out table definition changes with software updates? Do any MySql specific tools/scripts/commands exist for this kind of thing?
Have you looked into Flyway or dbdeploy? Flyway is Java-specific but, I believe, works with any DB; dbdeploy supports more languages and, again, multiple databases.