What is the cost of using NULL in database columns? - mysql

I use MySQL and SQLite often and plan on bringing more PostgreSQL into my workflow soon. With that in mind, what are the costs of using NULL in each database? I heard that MySQL adds an extra bit to each NULL column value to mark it as nullable.

This question was answered separately for PostgreSQL:
How much disk-space is needed to store a NULL value using postgresql DB?
and for MySQL:
NULL in MySQL (Performance & Storage)
To recap, both use a bitmask in the row header to mark which columns are NULL.
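As a rough illustration of what such a bitmask costs, here is a minimal sketch that assumes one bit per nullable column, rounded up to whole bytes. That matches how InnoDB's COMPACT row format sizes its NULL bit vector; PostgreSQL's null bitmap also uses one bit per column, but it is only present in rows that actually contain a NULL, and header alignment padding can absorb part of it. The function name is made up for illustration.

```python
def null_bitmap_bytes(nullable_columns: int) -> int:
    """Bytes needed for a one-bit-per-column NULL bitmap, rounded up to whole bytes."""
    if nullable_columns < 0:
        raise ValueError("column count cannot be negative")
    return (nullable_columns + 7) // 8


if __name__ == "__main__":
    # Per-row overhead for a few table widths (assumption: every column is nullable).
    for n in (0, 1, 8, 9, 200):
        print(n, "nullable columns ->", null_bitmap_bytes(n), "byte(s) per row")
```

In other words, the overhead is on the order of a bit per column per row, which is usually negligible next to the data itself.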

Related

DBT MySQL producing boolean columns as INT, should be TINYINT

I'm trying to change how we run some transformations on our tables in RDS MySQL. The table has 20 million records and 200 columns. We have a pipeline that runs monthly: we download the table to an EC2 instance, use Python to do the transformation, and then re-upload it.
When I presented dbt, my boss wanted to see it working because of the benefits: everything would stay in SQL (I am the only Python person in our small 20-person company), and we would get documentation, automated tests and version control, all of which we really need at the moment. I made it happen: I wrote SQL in dbt that produces the same results as the Python script and runs directly on the MySQL database using the https://pypi.org/project/dbt-mysql/ adapter.
There are some problems, and the one whose solution I think will help me most is about booleans in MySQL. I already know the whole story about boolean, tinyint(1), etc. But all columns intended to be "boolean" end up in the tables as INT, and I want them as TINYINT, because INT takes 4 times the space it should.
Edit: added more information thanks to feedback
My raw table comes with all columns as strings, and I'm trying to cast them to the correct types. As this one should be boolean, I expected it to be converted to tinyint(1). When I create a table via pandas and there is a bool column, the table column is tinyint(1). But when I try to do something like this in SQL, the column becomes int.
The code is really just that:
SELECT IF(myStrColumn = '1', TRUE, FALSE)
FROM myRawTable
The resulting column is created as int, but I wanted it to be tinyint(1) to represent a boolean.
tinyint is not a valid target type for CAST, as per the documentation (https://dev.mysql.com/doc/refman/8.0/en/cast-functions.html#function_cast), so casting doesn't work.
After looking at the MySQL docs, I think you have two options:
Create a new, custom table materialization that allows you to leverage the MySQL syntax:
create table my_table (my_col tinyint) as select ...
Add a post-hook that narrows the column after you've created the table:
{{ config(
    materialized="table",
    post_hook="alter table {{ this }} modify my_col tinyint"
) }}
For #1, there is a guide to creating materializations in the dbt docs, but it is a complex and advanced topic. I think the dbt-mysql adapter uses the vanilla/default table materialization in the global project. You may want to check out the MySQL incremental materialization macro, which is here.
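With 200 columns there may be many boolean columns to narrow, and MySQL accepts several MODIFY clauses in a single ALTER TABLE, so the table is altered in one statement instead of once per column. Below is a minimal Python sketch that builds such a statement, which could then be pasted into the post-hook. The helper name, the table name and the column names are all hypothetical.

```python
import re

# Conservative identifier check so the generated SQL cannot be broken by odd names.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_modify_tinyint(table: str, columns: list[str]) -> str:
    """Build one ALTER TABLE that narrows every listed column to TINYINT(1)."""
    if not columns:
        raise ValueError("need at least one column")
    for name in [table, *columns]:
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    # Note: MODIFY restates the whole column definition, so attributes such as
    # NOT NULL or DEFAULT must be repeated here if the columns need them.
    clauses = ", ".join(f"MODIFY `{col}` TINYINT(1)" for col in columns)
    return f"ALTER TABLE {table} {clauses}"


if __name__ == "__main__":
    print(build_modify_tinyint("my_table", ["is_active", "is_deleted"]))
```

In a dbt post-hook the table name would come from {{ this }} rather than a literal string.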

configuring hibernate for postgres for col=null instead of "is null"

I read through this post...
Is there a SQL mode for MySQL allowing "WHERE x = NULL" (meaning "WHERE x IS NULL")?
In Hibernate, NamedQuery is the go-to approach in our company, and the Criteria API is a rare one-off. Named queries are easy to write, for example:
select c from Customer c where c.firstName=:firstName and c.middleName=:middleName and c.lastName=:lastName
Of course, then we ran into middleName being null, and things broke. Then we ran into firstName being null, which broke as well. We are going to have situations like these all over the schema, so we are looking for a way to have
c.middleName = null interpreted as c.middleName is null when the parameter is null.
Is there a Hibernate setting for this so that it just works correctly in both MySQL and Postgres (the ideal situation), or an equivalent of the parameter from the post above for Postgres? Where is it, and how can I set it up? Or does it have to be done per query (ugh)?
We are in the process of switching from MySQL to postgres since MySQL's method of dealing with uniqueConstraints on null columns is very non-ideal for us compared to postgres (in that MySQL allows John null Smith twice where postgres does not).
thanks,
Dean
In PostgreSQL, the IS NOT DISTINCT FROM operator is the null-safe equality predicate. Unlike MySQL's <=> operator, which is a MySQL-specific extension, IS NOT DISTINCT FROM is defined by the SQL standard (as an optional feature), but MySQL does not support it, so neither syntax works on both databases.
I'm not aware of Hibernate having any null-safe equality operator, so I guess the only way to handle such queries is through the native query feature. Obviously, such queries are specific to the underlying database, so this is not really ideal.
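To make the semantics concrete, here is a small sketch using Python's built-in sqlite3 module as a stand-in, since it needs no server. SQLite's IS operator behaves as a null-safe equal in the same way that <=> does in MySQL and IS NOT DISTINCT FROM does in PostgreSQL. The table layout and the find helper are assumptions made for this example, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (first_name TEXT, middle_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [("John", None, "Smith"), ("John", "Q", "Smith")],
)


def find(middle_name, op):
    """Count matching customers, comparing middle_name with the given operator."""
    # op is only ever "=" or "IS" here; it is never taken from user input.
    sql = (
        "SELECT COUNT(*) FROM customer "
        f"WHERE first_name = ? AND middle_name {op} ? AND last_name = ?"
    )
    return conn.execute(sql, ("John", middle_name, "Smith")).fetchone()[0]


plain_null = find(None, "=")    # '= NULL' is never true, so nothing matches
safe_null = find(None, "IS")    # null-safe comparison finds the NULL row
safe_value = find("Q", "IS")    # and still behaves like '=' for non-NULL values

if __name__ == "__main__":
    print(plain_null, safe_null, safe_value)
```

The same query text therefore serves both the NULL and the non-NULL case, which is what the question is asking the ORM to do automatically.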

Why are Laravel timestamp fields nullable?

This is not a strictly technical question.
I'm using Laravel for several projects, and today I started wondering: why can Laravel timestamp fields be null? Is there a portability reason behind this choice, or is it just a convenience?
What principle did they apply in making this choice?
This is because of MySQL. If the columns weren't declared nullable, some MySQL versions would fill in their own values. When the columns are marked as nullable, MySQL doesn't fill in its own values, and the values supplied by the framework are used.

MySQL Indices on Encrypted Field

Is there a way to encrypt a field in a database and still have useful indexes on it?
For example, in the medical arena you need to encrypt patient information. If I do this on a patient name field, is there a way to still be able to have indexes on the decrypted value?
I'm thinking of using AES_ENCRYPT() on the field, but would really like to know if there is a trick to do the indexing on the decrypted value, not on the field's value (which would be encrypted).
AES_ENCRYPT() and AES_DECRYPT() are functions. So the question in more general terms is:
Can MySQL do indexing on functions?
As of MySQL 5.6 the answer is no, although other SQL engines support it. For example, Oracle has done it since 8i and MS SQL Server has done it since 2000.
It looks like this might be possible in MariaDB 5.2 (https://mariadb.com/kb/en/mariadb/virtual-computed-columns/), which is a community-developed fork of MySQL.
References:
Is it possible to have function-based index in MySQL?
http://use-the-index-luke.com/sql/where-clause/functions

Compatibility Query for inserting null values in datefield

We use both Oracle and MySQL databases in our project, so we are trying to write common queries that work on both.
I'm trying to insert a date into a date field in tables on both databases. The only date format both databases accept is 2013-07-19. In the course of our DML operations we ran into a problem when inserting dates that are empty or NULL.
Each database has its own syntax for storing an "empty" value in a date field/column.
MySQL allows dates to be 0000-00-00 or NULL (written explicitly in the appropriate position). Oracle does not support the 0000-00-00 format; it only allows date fields to be empty.
How can we write common queries in this type of situation?
INSERT INTO TABLENAME(DATEFIELD) VALUES(NULL);
...works both in MySQL and Oracle, so you could use that.
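If the insert is issued from application code, binding a parameter instead of writing the literal NULL achieves the same thing. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for a real MySQL or Oracle connection; Python DB-API drivers map None to SQL NULL, but the placeholder style differs between drivers (? here). The empty_to_null helper is a made-up name.

```python
import sqlite3


def empty_to_null(value):
    """Map an empty or whitespace-only date string to None, which is sent as NULL."""
    if value is None or not value.strip():
        return None
    return value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (datefield DATE)")

raw_dates = ["2013-07-19", "", None]
conn.executemany(
    "INSERT INTO tablename (datefield) VALUES (?)",
    [(empty_to_null(d),) for d in raw_dates],
)

null_count = conn.execute(
    "SELECT COUNT(*) FROM tablename WHERE datefield IS NULL"
).fetchone()[0]

if __name__ == "__main__":
    print("rows stored as NULL:", null_count)
```

Normalizing empty values to NULL before the insert keeps the query text identical across databases and avoids the MySQL-only 0000-00-00 value.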