SQLAlchemy, MySQL, and Python - how should I handle booleans?

I have a Pyramid application that I am using with SQLAlchemy and MySQL. For database fields that I wanted to treat as boolean, I've been using a "BIT" data type on the SQLAlchemy side, and BIT(1) on the MySQL side.
This had all been working fine, but I was checking some newly updated code on my web host, where the version of phpMyAdmin is newer than the one I'm using locally. Browsing a table that has a BIT field in the newer phpMyAdmin, none of the data appears - it's just blank. On my local instance, BIT fields display as 0 or 1. When I try to inline-edit the field in the hosted phpMyAdmin, it won't accept any value I enter. My application code, however, appears to toggle the true/false values just fine.
That got me wondering - with this setup, should I be approaching it differently? SQLAlchemy does support Boolean, which seems more intuitive and appropriate. Should I use that and set the MySQL fields to TINYINT instead?
What is the conventionally accepted way to handle booleans between SQLAlchemy and MySQL?

MySQL has a BOOL type (which is what SQLAlchemy uses), so I'm not sure why you don't just use that. It is an alias for TINYINT(1).
from sqlalchemy import Boolean and you should be good to go.
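For example, a minimal declarative model might look like this (a sketch only; the model, column, and connection details below are made up, not from the question):

from sqlalchemy import Column, Integer, Boolean, create_engine
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Account(Base):
    # Hypothetical model for illustration.
    __tablename__ = 'accounts'
    id = Column(Integer, primary_key=True)
    is_active = Column(Boolean, nullable=False, default=False)

# On MySQL, Boolean is emitted as BOOL (an alias for TINYINT(1)); SQLAlchemy
# converts between 0/1 in the database and Python True/False automatically.
engine = create_engine('mysql://user:password@localhost/mydb')
Base.metadata.create_all(engine)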

Related

Knex: universal way to get the last inserted id

I'm using Knex because I'm working on an application that I would like to use with multiple database servers - currently SQLite3, PostgreSQL, and MySQL.
I'm realizing that this might be more difficult than I expected.
On MySQL, it appears that this syntax will return an array with an id:
knex('table').insert({ field: 'value'}, 'id');
On postgres I need something like this:
knex('table').insert({ field: 'value'}, 'id').returning(['id']);
In each case, the structure they return is different. The latter doesn't break MySQL, but on SQLite it will throw a fatal error.
The concept of 'insert a record, get an id' seems to exist everywhere though. What am I missing in Knex that lets me write this once and use everywhere?
Way back in 2007, I implemented the database access class for a PHP framework. It was to support MySQL, PostgreSQL, SQLite, Microsoft SQL Server, Oracle, and IBM DB2.
When it came time to support auto-incremented columns, I discovered that all of these implement that feature differently. Some have SERIAL, some have AUTO_INCREMENT (or AUTOINCREMENT), some have SEQUENCE, some have GENERATED, and some support multiple solutions.
The solution was to not try to write one implementation that worked with all of them. I wrote classes using the Adapter Pattern, one for each brand of SQL database, so I could implement each adapter class tailored to the features supported by the respective database. The adapter satisfied an interface that I defined in my framework, to allow the primary key column to be defined and the last inserted id to be fetched in a consistent manner. But the internal implementation varied.
This was the only sane way to develop that code, in my opinion. When it comes to variations of SQL implementations, it's a fallacy that one can develop "portable" code that works on multiple brands.
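To make that concrete, here is a minimal Python sketch of the adapter idea (the class names and DB-API usage are illustrative only, not the original framework's code):

class InsertAdapter(object):
    # Interface: insert a row and return its generated id.
    def insert_returning_id(self, conn, table, values):
        raise NotImplementedError

class MySQLAdapter(InsertAdapter):
    def insert_returning_id(self, conn, table, values):
        cols = ', '.join(values)
        marks = ', '.join(['%s'] * len(values))
        cur = conn.cursor()
        cur.execute('INSERT INTO %s (%s) VALUES (%s)' % (table, cols, marks),
                    list(values.values()))
        return cur.lastrowid  # backed by MySQL's LAST_INSERT_ID()

class PostgresAdapter(InsertAdapter):
    def insert_returning_id(self, conn, table, values):
        cols = ', '.join(values)
        marks = ', '.join(['%s'] * len(values))
        cur = conn.cursor()
        cur.execute('INSERT INTO %s (%s) VALUES (%s) RETURNING id'
                    % (table, cols, marks),
                    list(values.values()))
        return cur.fetchone()[0]  # PostgreSQL's RETURNING clause

The calling code only ever sees insert_returning_id(); which adapter gets instantiated depends on which database the application is configured to use.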

Spark jdbc write (to MySQL) missing milliseconds in DATETIME and TIMESTAMP columns

There are similar questions, but I am beginning to think mine is related to the Spark jdbc APIs, since both components seem to be working correctly on their own. I am using Spark 2.4 (which has millisecond support for timestamps) and MySQL 5.7.x, which supports fractional seconds.
I created a simple Dataset, with a TimestampType column, and when I show() it, here is what I get:
+-----------------------+
|           my_timestamp|
+-----------------------+
|2021-02-06 12:11:45.335|
+-----------------------+
When I write this to MySQL (using dataset.write()), it creates the table automatically, with SQL TIMESTAMP type for the column, and the milliseconds part is lost upon insert.
For a second test, I created the table manually and defined the column as TIMESTAMP(3). When I manually insert timestamps with a ms part into it, everything works correctly. But when I write using the Spark jdbc APIs, once again the ms part is truncated and it becomes 2021-02-06 12:11:45.0.
The only workaround that comes to mind is to keep the column as a long/BIGINT and convert it to DATETIME/TIMESTAMP when querying.
Am I doing something wrong here?
Well, StringType to the rescue. Apparently if I keep the Spark column as String with a value formatted the way MySQL expects, e.g. "2020-11-20 23:06:41.745", I can insert into a MySQL TIMESTAMP(3) column without any truncation or other problems.
This feels more like a workaround, so I still want to learn if there is a way to do this correctly.
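For reference, here is roughly what that workaround looks like in PySpark (a sketch only; the column, table, and connection details are placeholders, and it assumes the target table already exists with a TIMESTAMP(3) column):

from pyspark.sql import functions as F

# Format the timestamp as a string that keeps the milliseconds,
# e.g. '2021-02-06 12:11:45.335'.
df_str = df.withColumn(
    'my_timestamp',
    F.date_format('my_timestamp', 'yyyy-MM-dd HH:mm:ss.SSS'))

(df_str.write
    .format('jdbc')
    .option('url', 'jdbc:mysql://localhost:3306/mydb')
    .option('dbtable', 'my_table')
    .option('user', 'user')
    .option('password', 'password')
    .mode('append')
    .save())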

How to query data of the JSON datatype from PostgreSQL in SQLAlchemy?

In SQLAlchemy, I use classical mapping with autoload=True. I want to query data of the JSON type from PostgreSQL, but I get this warning:
D:\Python27\lib\site-packages\sqlalchemy\dialects\postgresql\base.py:1706: SAWarning: Did not recognize type 'json' of column 'comm_media_alias'
  name, format_type, default, notnull, domains, enums, schema)
How should I deal with this problem?
The JSON type is a relatively new data type in PostgreSQL, and SQLAlchemy has only recently added functionality to properly detect it. At the time of this question, SQLAlchemy could not recognize the JSON column type during reflection because that support did not exist yet.
SQLAlchemy 0.9 was released on December 30th, 2013. This version contains support for the PostgreSQL JSON data type, so I would recommend upgrading to this version and trying again.
If you can't upgrade (or the upgrade still isn't working for you), you can also change your column type to something else (like TEXT).
Another thing to note is that this is not an error: it is a warning. I'm not sure exactly what SQLAlchemy will do when working with a column whose type it did not recognize, but it might end up working anyway.
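If you do upgrade, you can also tell SQLAlchemy the column's type explicitly when reflecting, which sidesteps detection entirely. A rough sketch (the table name and connection URL are made up; only the column name comes from the warning above):

from sqlalchemy import Table, Column, MetaData, create_engine
from sqlalchemy.dialects.postgresql import JSON

engine = create_engine('postgresql://user:password@localhost/mydb')
metadata = MetaData()

# Columns listed explicitly here override what reflection finds, so
# 'comm_media_alias' is mapped as JSON even if the dialect cannot detect it.
comm_media = Table('comm_media', metadata,
                   Column('comm_media_alias', JSON),
                   autoload=True, autoload_with=engine)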

switching from MySQL to PostgreSQL for Ruby on Rails for the sake of Heroku

I'm trying to push a brand new Ruby on Rails app to Heroku. Currently, it sits on MySQL. It looks like Heroku doesn't really support MySQL and so we are considering using PostgreSQL, which they DO support.
How difficult should I expect this to be? What do I need to do to make this happen?
Again, please note that my databases (both development & production) are completely empty right now.
Common issues:
1. GROUP BY behavior. PostgreSQL has a rather strict GROUP BY: if you use a GROUP BY clause, then every column in your SELECT must either appear in your GROUP BY or be used in an aggregate function.
2. Data truncation. MySQL will quietly truncate a long string to fit inside a char(n) column unless your server is in strict mode; PostgreSQL will complain and make you truncate your string yourself.
3. Quoting is different. MySQL uses backticks for quoting identifiers whereas PostgreSQL uses double quotes.
4. LIKE is case-insensitive in MySQL but not in PostgreSQL. This leads many MySQL users to use LIKE as a case-insensitive string equality operator.
(1) will be an issue if you use AR's group method in any of your queries or GROUP BY in any raw SQL. Search for the error message column "X" must appear in the GROUP BY clause or be used in an aggregate function and you'll find examples and common solutions.
(2) will be an issue if you use string columns anywhere in your application and your models aren't properly validating the length of all incoming string values. Note that creating a string column in Rails without specifying a limit actually creates a varchar(255) column, so there is an implicit :limit => 255 even though you didn't specify one. An alternative is to use t.text for your strings instead of t.string; this will let you work with arbitrarily large strings without penalty (for PostgreSQL at least). As Erwin notes below (and every other chance he gets), varchar(n) is a bit of an anachronism in the PostgreSQL world.
(3) shouldn't be a problem unless you have raw SQL in your code.
(4) will be an issue if you're using LIKE anywhere in your application. You can fix this one by changing a like b to lower(a) like lower(b) (or upper(a) like upper(b) if you like to shout) or a ilike b, but be aware that PostgreSQL's ILIKE is non-standard.
There are other differences that can cause trouble but those seem like the most common issues.
You'll have to review a few things to feel safe:
group calls.
Raw SQL (including any snippets in where calls).
String length validations in your models.
All uses of LIKE.
If you have no data to migrate, it should be as simple as telling your Gemfile to use the pg gem instead, running bundle install, and updating your database.yml file to point to your PostgreSQL databases. Then just run your migrations (rake db:migrate) and everything should work great.
Don't feel you have to migrate to Postgres - there are several MySQL Addon providers available on Heroku - http://addons.heroku.com/cleardb is the one I've had the most success with.
It should be simplicity itself: port the DDL from MySQL to PostgreSQL.
Does Heroku have any schema creation scripts? I'd depend on those if they were available.
MySQL and PostgreSQL handle some things differently (e.g. AUTO_INCREMENT columns in MySQL, sequences in PostgreSQL). But the port shouldn't be too hard. How many tables? Tens are doable.

LONGTEXT valid in migration for PGSQL and MySQL

I am developing a Ruby on Rails application that stores a lot of text in a LONGTEXT column. I noticed that when deployed to Heroku (which uses PostgreSQL), I am getting insert exceptions because the values for two of the columns are too large for their column size. Is there something special that must be done to get an equivalently large text column type in PostgreSQL?
These were defined as "string" datatype in the Rails migration.
If you want the longtext datatype in PostgreSQL as well, just create it. A domain will do:
CREATE DOMAIN longtext AS text;
CREATE TABLE foo(bar longtext);
In PostgreSQL the required type is text. See the Character Types section of the docs.
A new migration that updates the model's datatype to 'text' should do the job. Don't forget to restart the database. If you still have problems, take a look at your model with 'heroku console' and just enter the model name.
If the db restart doesn't fix the problem, the only way I found was to reset the database with 'heroku pg:reset'. That's no fun if you already have important data in your database.