INNODB -> Should the default column be zero or null? - mysql

My English is a little weak, sorry.
When using INNODB, must the column be 0 or should it be null?
Does the problem occur if the joined columns are defined as 0?
Finally, which is better in terms of performance?
Thanks.

Ignore the problem. You do not have to provide a DEFAULT value. You do not have to declare a column NULL (or NOT NULL).
Instead...
Think about the application.
Case 1: Your application will always have a value for that column, and it will always specify the value. Then declaring the column to be NOT NULL and not providing a DEFAULT makes sense.
Case 2: You don't have a value yet. Example: The table has start_date and end_date. You create a row when the thing "starts", so you fill in start_date (see Case 1) but leave end_date empty. "Empty" could be encoded as NULL and DEFAULT NULL. Later you UPDATE the table to fill in the end_date with a real value.
Case 3: The table has a "counter". When you initially add a row, the counter needs to start with "1". Later you will UPDATE ... SET counter=counter+1. You could either explicitly put the "1" in when you create the row, or you could leave out the value when inserting, but have the column declared NOT NULL DEFAULT '1'
NULL could represent "not-yet-filled-in" (Case 2), "don't know the value", "use the default", and several other things. This is an application choice.
There are other uses for NULL. Unless you have a use for NULL, declare each column NOT NULL.
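If, when INSERTing, you specify all the columns, then there is no need for DEFAULTs. DEFAULTs are a convenience, not a necessity. Without a DEFAULT, omitting a NOT NULL column gives you "0" for numeric columns or '' for strings when strict SQL mode is off; in strict mode (the default since MySQL 5.7) the INSERT fails with an error instead.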
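Cases 2 and 3 above can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the table name task and its columns are made up for illustration, and only the NULL/DEFAULT behaviour shown here is assumed to carry over.

```python
import sqlite3


def demo_defaults():
    """Insert a row giving only start_date, then fill in the rest later."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE task (
            id         INTEGER PRIMARY KEY,
            start_date TEXT    NOT NULL,           -- Case 1: always supplied
            end_date   TEXT    DEFAULT NULL,       -- Case 2: not known yet
            counter    INTEGER NOT NULL DEFAULT 1  -- Case 3: starts at 1
        )
        """
    )
    # Only start_date is given; end_date and counter fall back to their defaults.
    conn.execute("INSERT INTO task (start_date) VALUES ('2024-01-01')")
    before = conn.execute(
        "SELECT start_date, end_date, counter FROM task WHERE id = 1"
    ).fetchone()

    # Later: the end date becomes known and the counter is bumped.
    conn.execute(
        "UPDATE task SET end_date = '2024-02-01', counter = counter + 1 WHERE id = 1"
    )
    after = conn.execute(
        "SELECT start_date, end_date, counter FROM task WHERE id = 1"
    ).fetchone()
    conn.close()
    return before, after


before, after = demo_defaults()
print(before)  # end_date comes back as None (NULL), counter as 1
print(after)
```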
Performance -- Probably not an issue. Do you have a particular example we should discuss?
JOINing -- I would avoid joining on a column that might have NULL in it. This rarely happens in "real life", since one joins the unique identifier for rows in one table with a column in the other table:
FROM A JOIN B ON B.id = A.b_id
That is, B.id is probably the PRIMARY KEY of B, hence cannot be NULL. On the other hand, A.b_id could be NULL to indicate there is no row in B corresponding to the row in A. No problem.

Related

what is the problem of "null" values in mysql or other dbs?

As far as I know, NULL can be indexed in InnoDB. But many colleagues say NULL values are bad DB design. So I don't know what the problem with NULL is: is it the three-valued logic (equal, not equal, unknown), or something else that stops people from using nullable columns?
NULL is an essential piece of SQL. It indicates a value that does not exist.
Avoiding NULL would lead to situations where you make up arbitrary special values for items that do not exist (0, -1, empty strings etc). That would be bad design.
I would suggest that 80% of columns should be declared NOT NULL.
But there are many cases where NULL is the 'natural' thing to use --
A start_time is known but the end_time is not yet known.
An optional attribute
etc.
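As for indexing, read the rules, and be ready to abide by them. PRIMARY KEY disallows NULLs, but UNIQUE allows them. In a UNIQUE index MySQL does not treat NULLs as equal to each other, so several rows may hold NULL.
As already mentioned, IS NULL and IS NOT NULL work as expected, but = NULL does not, because any comparison with NULL yields NULL rather than true. Note also <=>, the NULL-safe equality operator, for which NULL <=> NULL is true.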
LEFT JOIN creates artificial NULLs (when the 'right' table's row is missing). An example of that usage:
FROM a
LEFT JOIN b ON ...
WHERE b.id IS NULL
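As a rough illustration of that LEFT JOIN ... IS NULL pattern, here is a sketch using Python's sqlite3 module in place of MySQL. The tables a and b and their contents are invented for the example; a.b_id is deliberately nullable and not declared as a foreign key.

```python
import sqlite3


def rows_without_match():
    """Compare an inner JOIN with the LEFT JOIN ... IS NULL anti-join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE a (id INTEGER PRIMARY KEY, b_id INTEGER);
        INSERT INTO b (id, name) VALUES (1, 'first'), (2, 'second');
        -- row 11 has no b at all (NULL), row 12 points at a b that is missing
        INSERT INTO a (id, b_id) VALUES (10, 1), (11, NULL), (12, 99);
        """
    )
    # Inner join: rows of a without a matching b silently disappear.
    matched = conn.execute(
        "SELECT a.id FROM a JOIN b ON b.id = a.b_id ORDER BY a.id"
    ).fetchall()
    # Anti-join: keep only the rows of a for which the join produced no b.
    unmatched = conn.execute(
        """
        SELECT a.id
        FROM a
        LEFT JOIN b ON b.id = a.b_id
        WHERE b.id IS NULL
        ORDER BY a.id
        """
    ).fetchall()
    conn.close()
    return matched, unmatched


matched, unmatched = rows_without_match()
print(matched)
print(unmatched)
```

Testing b.id (the primary key of b) is what makes this safe: it can only be NULL in the result when the LEFT JOIN found no row.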
See COALESCE() for a way to turn NULL into something else. Example:
SELECT ...,
( SELECT name FROM foo WHERE ... ) AS foo_name,
...
FROM ...
may deliver NULL. This would be friendlier:
SELECT ...,
COALESCE(( SELECT name FROM foo WHERE ... ), 'N/A') AS foo_name,
...
FROM ...
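To make the two query shapes above concrete, here is a small sketch run against SQLite through Python's sqlite3 module (standing in for MySQL). The item table and the WHERE condition inside the subquery are assumptions filled in only for the demonstration.

```python
import sqlite3


def foo_names():
    """Run the scalar subquery with and without COALESCE."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE foo  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE item (id INTEGER PRIMARY KEY, foo_id INTEGER);
        INSERT INTO foo  (id, name)   VALUES (1, 'widget');
        INSERT INTO item (id, foo_id) VALUES (1, 1), (2, NULL);
        """
    )
    # A scalar subquery that finds no row evaluates to NULL.
    raw = conn.execute(
        """
        SELECT id,
               (SELECT name FROM foo WHERE foo.id = item.foo_id) AS foo_name
        FROM item ORDER BY id
        """
    ).fetchall()
    # COALESCE replaces that NULL with a readable placeholder.
    friendly = conn.execute(
        """
        SELECT id,
               COALESCE((SELECT name FROM foo WHERE foo.id = item.foo_id), 'N/A') AS foo_name
        FROM item ORDER BY id
        """
    ).fetchall()
    conn.close()
    return raw, friendly


raw, friendly = foo_names()
print(raw)
print(friendly)
```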
Personally, I often shun NULL in these two places:
string VARCHAR(99) NOT NULL DEFAULT ('') -- empty string is usually good enough
choice ENUM('unknown', 'this', 'that') NOT NULL -- easier to display and test
There is virtually no performance difference (space or speed) between NULL and NOT NULL.

Alternatively setting Null values in columns

I am building a database and one of the tables contains the columns "sensor_id" and "station_id". When someone inserts a new row in the table, "sensor_id" may be NULL, but then "station_id" must not be NULL under any circumstances. Vice versa, when "station_id" is NULL, "sensor_id" must not be NULL. If a value is entered in both columns, that is not a problem.
I am currently working in MySQL Workbench and it seems that my choices are to set both columns as NN(Not NULL) which is too strict implementation as one of them is sufficient, to set just one of them as NN which means that one specific column must always be filled(not the case either) or set none of them NN which is too loose as at least one of both values must be given.
Visually the table looks like this (sorry for the Microsoft Word substitute, but I have problems with the MySQL server and cannot access the database):
Alert_id is the primary key of the table, so duplicate values are allowed for the other two columns.
How could I implement this?
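You should add a constraint to this table, like this (note that MySQL enforces CHECK constraints only from version 8.0.16; earlier versions parse and ignore them):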
CONSTRAINT CheckSensorStationNotNull CHECK (station_id is not null or sensor_id is not null)
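A quick way to see how such a constraint behaves is the sketch below. It runs the same CHECK expression in SQLite via Python's sqlite3 module, which enforces CHECK constraints; the table name alert and the helper try_insert are assumptions made for the example.

```python
import sqlite3


def make_alert_table():
    """Create a table where at least one of sensor_id / station_id must be set."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE alert (
            alert_id   INTEGER PRIMARY KEY,
            sensor_id  INTEGER,
            station_id INTEGER,
            CONSTRAINT CheckSensorStationNotNull
                CHECK (station_id IS NOT NULL OR sensor_id IS NOT NULL)
        )
        """
    )
    return conn


def try_insert(conn, sensor_id, station_id):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO alert (sensor_id, station_id) VALUES (?, ?)",
            (sensor_id, station_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_alert_table()
results = [
    try_insert(conn, 1, None),     # only sensor given
    try_insert(conn, None, 2),     # only station given
    try_insert(conn, 1, 2),        # both given
    try_insert(conn, None, None),  # neither given
]
print(results)
```

Only the last insert, where both columns are NULL, should be refused.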

SQL coalesce(): what type does the combined column have?

Let's say I use coalesce() to combine two columns into one in a select, and subsequently in a view constructed around that select.
Tables:
values_int
id INTEGER(11) PRIMARY KEY
value INTEGER(11)
values_varchar
id INTEGER(11) PRIMARY KEY
value VARCHAR(255)
vals
id INTEGER(11) PRIMARY KEY
value INTEGER(11) //foreign key to both values_int and values_varchar
The primary keys between values_int and values_varchar are unique and that allows me to do:
SELECT vals.id, coalesce(values_int.value, values_varchar.value) AS value
FROM vals
JOIN values_int ON values_int.id = vals.value
JOIN values_varchar ON values_varchar.id = vals.value
This produces a nice assembled view with an ID column and a combined value column that contains the actual values from the two other tables.
What type does this combined column have?
When turned into view and then queried with a WHERE clause using this combined "value" column, how is that actually handled type-wise? I.e. WHERE value > 10
Some rambling thoughts at the end (most likely wrong):
The reason I am asking is that the alternative to this design is to have all three tables merged into one, with INT values in one column and VARCHAR in the other. That would of course produce a lot of NULL values in both columns but save me the JOINs. For some reason I do not like that solution because it would require additional type checking to choose the right column and deal with the NULL values, but maybe the presented design would require the same (if the resulting column is actually VARCHAR). I would hope that it actually passes the WHERE clause down the view to the source (so that the column does NOT have a type per se), but I am likely wrong about that.
Your query should be explicit to be clear. In this case MySQL is using varchar.
I would write this query like this to be clear:
coalesce(values_int.value, cast(values_varchar.value as signed), 0)
or
coalesce(cast(values_int.value as char(20)), values_varchar.value, '0')
You should put in that last value unless you want the column to be NULL when both columns are NULL.
Returns the data type of expression with the highest data type precedence. If all expressions are nonnullable, the result is typed as nonnullable.
So in your case the type will be VARCHAR(255)
Let's say I use coalesce() to combine two columns into one
NO, that's not the use of COALESCE function. It's used for choosing a provided default value if the column value is null. So in your case, if values_int.value IS NULL then it will select the value in values_varchar.value
coalesce(values_int.value, values_varchar.value) AS value
If you want to combine the data, use the CONCAT() function instead, like this (note that in MySQL CONCAT() returns NULL if any argument is NULL):
concat(values_int.value, values_varchar.value) AS value
Verify it yourself. An easy way to check in MySQL is to DESCRIBE a VIEW you create to capture your dynamic column:
mysql> CREATE VIEW v AS
-> SELECT vals.id, coalesce(values_int.value, values_varchar.value) AS value
-> FROM vals
-> JOIN values_int ON values_int.id = vals.value
-> JOIN values_varchar ON values_varchar.id = vals.value;
Query OK, 0 rows affected (0.01 sec)
Now DESCRIBE v will show you what's what. Note that under MySQL 5.1, I see the column as varbinary(255), but under 5.5 I see varchar(255).

Mysql test against value excludes NULL entries - can someone explain?

I've got a table shop_categories with a field called category_is_hidden which is defined as:
category_is_hidden tinyint(4) DEFAULT NULL
In the database, the values for that field are either 1 or NULL.
SELECT * FROM shop_categories where category_is_hidden IS NULL
returns all the null entries.
SELECT * FROM shop_categories where category_is_hidden <> 1
returns an empty set (that is, it excludes the null values).
Why does that last statement not include null entries? Isn't null <> 1?
Edit: Tested on MySQL 5.1 & 5.5
Since your category_is_hidden column appears to be a flag, I'd change it to tinyint(1) and make it be either 1 or 0 instead of 1 or NULL. Allowing a column to be NULL adds a little to its storage requirements (InnoDB keeps one bit per nullable column in the record header, rounded up to whole bytes), which can slightly increase row and index size.
Next, the question you actually asked. NULL by definition is UNKNOWN. Your query says "give me everything where category_is_hidden is not 1". But the NULL column values are all unknown, so MySQL doesn't know whether they are not 1: the comparison evaluates to NULL, which WHERE treats as not true. Since your values are only 1 or NULL, rewrite the WHERE as category_is_hidden IS NULL. If your column is going to be tri-state (1, NULL, other value), the WHERE needs an OR to allow for that: category_is_hidden <> 1 OR category_is_hidden IS NULL.
If a field is null, then it means that it does not have a value. It is not zero, or an empty string. If you check whether NULL <> 1, the result is not true, because NULL is not a number; it is not anything, and therefore cannot be compared.
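The behaviour described in these answers can be reproduced with a short sketch. It uses Python's sqlite3 module rather than MySQL (the three-valued comparison logic is the same standard SQL behaviour); the category_id column and the sample rows are assumptions added for the example.

```python
import sqlite3


def hidden_flag_queries():
    """Show which rows each WHERE clause returns when the flag is 1 or NULL."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE shop_categories (
            category_id        INTEGER PRIMARY KEY,
            category_is_hidden INTEGER DEFAULT NULL
        );
        INSERT INTO shop_categories (category_id, category_is_hidden)
        VALUES (1, 1), (2, NULL), (3, NULL);
        """
    )

    def ids(where):
        sql = f"SELECT category_id FROM shop_categories WHERE {where} ORDER BY category_id"
        return [row[0] for row in conn.execute(sql)]

    result = {
        "is_null": ids("category_is_hidden IS NULL"),
        # NULL <> 1 evaluates to NULL, which WHERE does not accept as true.
        "not_equal": ids("category_is_hidden <> 1"),
        # Adding the explicit NULL test brings those rows back.
        "not_equal_or_null": ids(
            "category_is_hidden <> 1 OR category_is_hidden IS NULL"
        ),
        # The bare comparison itself comes back as NULL (None in Python).
        "null_vs_one": conn.execute("SELECT NULL <> 1").fetchone()[0],
    }
    conn.close()
    return result


result = hidden_flag_queries()
for name, value in result.items():
    print(name, value)
```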

MySQL NULL or NOT NULL That is The Question?

What is the difference between NULL and NOT NULL? And when should they be used?
NULL means you do not have to provide a value for the field...
NOT NULL means you must provide a value for the fields.
For example, if you are building a table of registered users for a system, you might want to make sure the user-id is always populated with a value (i.e. NOT NULL), but the optional spouse's name field can be left empty (NULL).
I would suggest
Use NOT NULL on every field if you can
Use NULL if there is a sensible reason it can be null
Leaving a field nullable when NULL has no sensible meaning for it is likely to introduce bugs, when nulls enter it by accident. Using NOT NULL prevents this.
The commonest reason for NULL fields is that you have a foreign key field which is optional, i.e. not always linked, for a "zero or one" relationship.
If you find you have a table with lots of columns, many of which can be NULL, that starts sounding like an antipattern; consider whether vertical partitioning makes more sense in your application context :)
Some databases offer another use for NULL: in Oracle, for example, a row whose indexed columns are all NULL gets no index entry, so an index that starts with a usually-NULL column (e.g. an "active" flag set on only 1% of rows) stays small. Note that this does not apply to MySQL's InnoDB, which stores NULLs in its indexes.
What is the difference between NULL and NOT NULL?
When creating a table or adding a column to a table, you need to specify the column's optionality using either NULL or NOT NULL. NOT NULL means that the column cannot have a NULL value for any record; NULL means NULL is allowed (even when the column has a foreign key constraint). Because NULL isn't a value, you can see why some call it optionality: a database table requires that, for a column to exist, there must be an instance of that column for every record within the table.
And when should they be used?
That is determined by your business rules.
Generally you want as many columns as possible to be NOT NULL because you want to be sure data is always there.
NOT NULL means that a column cannot have the NULL value in it. Instead, if nothing is specified for that column when inserting a row, it will use whatever default is specified; if no default is specified, MySQL uses its implicit default for that type when strict SQL mode is off, and rejects the INSERT in strict mode.
Fields that aren't NOT NULL can potentially have their value as NULL (which essentially means a missing/unknown/unspecified value). NULL behaves differently from normal values, most notably in comparisons.
As others have answered, NOT NULL simply means that NULL is not a permitted value. However, you will always have the option of empty string '' (for varchar) or 0 (for int), etc.
One nice feature when using NOT NULL is that you can get an error or warning should you forget to set the column's value during INSERT (assuming the NOT NULL column has no DEFAULT).
The main hiccup with allowing NULL columns is that NULL rows will never be found with the <> (not equal) operator. For example, with the following categories:
Desktops
Mobiles
NULL -- probably embedded devices
The = operator works as expected
select * from myTable where category="Desktops";
However, the <> operator will exclude any NULL entries.
select * from myTable where category<>"Mobiles";
-- returns only desktops, embedded devices were not returned
This can be the cause of subtle bugs, especially if the column has no NULL data during initial testing, but later some NULL values are added due to subsequent development. For this reason I set all the columns to NOT NULL.
However, it can be helpful to allow NULL values when using a UNIQUE KEY/INDEX. Normally a unique key requires the column (or combination of columns) to be unique for the whole table. Unique keys are a great safeguard that the database will enforce for you.
In some cases, you may want the safeguard for most of the rows, but there are exceptions.
If any column referenced by that particular UNIQUE KEY is NULL, then the uniqueness will no longer be enforced for that row. Obviously this would only work if you permit NULLs on that column, understanding the hiccup I explained above.
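The effect on a unique key can be sketched as follows, again with Python's sqlite3 module standing in for MySQL (both allow several NULLs in a UNIQUE column). The account table and its email column are hypothetical.

```python
import sqlite3


def unique_with_nulls():
    """Show that UNIQUE rejects duplicate values but not repeated NULLs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE account (
            id    INTEGER PRIMARY KEY,
            email TEXT UNIQUE          -- nullable on purpose
        )
        """
    )
    # Two rows without an email: uniqueness is not enforced between NULLs.
    conn.execute("INSERT INTO account (email) VALUES (NULL)")
    conn.execute("INSERT INTO account (email) VALUES (NULL)")

    # Two rows with the same real value: the second one is rejected.
    conn.execute("INSERT INTO account (email) VALUES ('a@example.com')")
    try:
        conn.execute("INSERT INTO account (email) VALUES ('a@example.com')")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    null_rows = conn.execute(
        "SELECT COUNT(*) FROM account WHERE email IS NULL"
    ).fetchone()[0]
    conn.close()
    return null_rows, duplicate_rejected


null_rows, duplicate_rejected = unique_with_nulls()
print(null_rows, duplicate_rejected)
```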
If you decide to allow NULL values, consider writing your <> statements with an additional condition to detect NULLs.
select * from myTable where category<>"Desktops" or category is null;
NOT NULL is a column constraint. Use it on any column whose value will always be known (never unknown or missing), so that there is no need for NULLs in that column. Primary key columns are intrinsically NOT NULL, so saying so explicitly is redundant.
NULL is a keyword occurring in many contexts -- including as a column constraint, where it means the same as the default (i.e., nulls are allowed) -- but also in many other contexts, e.g. to insert a null in a column as part of an INSERT...VALUES statement.
Also note that NULL is not equal to anything else, even not to NULL itself.
For example:
mysql> select if(NULL=NULL, "null=null", "null!=null");
+------------------------------------------+
| if(NULL=NULL, "null=null", "null!=null") |
+------------------------------------------+
| null!=null |
+------------------------------------------+
1 row in set (0.00 sec)
This definition of NULL is very useful when you need a unique key on a column that is partially filled. In such case you can just leave all the empty values as NULL, and it will not cause any violation of the uniqueness key, since NULL != NULL.
Here is an example of how you can see if something is NULL:
mysql> select if(null is null, "null is null", "null is not null");
+------------------------------------------------------+
| if(null is null, "null is null", "null is not null") |
+------------------------------------------------------+
| null is null |
+------------------------------------------------------+
1 row in set (0.01 sec)
If you're not sure, use NOT NULL.
Despite the common belief, NOT NULL doesn't require you to fill all fields; it just means whatever you omit will have the default value. So no, it doesn't mean pain. Also, NULL is less efficient in terms of indexing, and causes many edge case situations when processing what you receive from a query.
So, while of course NULL values have a theoretical meaning (and in rare cases you can benefit from this), most of the time NOT NULL is the way to go. NOT NULL makes your fields work like any variable: they always have a value, and you decide if that value means something or not. And yes, if you need all the possible values and one extra value that tells you there's simply nothing there, you can still use NULL.
So why do they love NULL so much?
Because it's descriptive. It has a semantic meaning, like "Nah, wait, this is not just an empty string, this is a lot more exotic - it's the lack of information!" They will explain how it's different to say "time is 00:00" and "I don't know what time it is". And this is valid; it just takes some extra effort to handle this. Because the system will allocate extra space for the information "is there a value at all" and it will constantly struggle checking it out. So for the tiny piece of semantic beauty, you sacrifice time and storage. Not much, but still. (Instead, you could have said "99:99" which is clearly an invalid time and you can assign a constant to it. No, don't even start, it's just an example.)
The whole phenomenon reminds me of the good old isset debate where people are somehow obsessed with the beauty of looking at a nonexistent array index and getting an error message. This is completely pointless. Practical defaults are a blessing, they simplify the work in a "you know what I mean" style and you can write more readable, more concise, more expressive code that will make sense even after you spend 4 years with other projects.
Inevitable NULLs
If you have JOINs, you will encounter NULLs sooner or later, when you want to join another row but it's not there. This is the only valid case where you can't really live without NULLs; you must know it's not just an empty record you got, it's no record at all.
Otherwise? NOT NULL is convenient, efficient, and gives you fewer surprises. In return, it will be called ignorant and outrageous by some people with semantical-compulsive disorder. (Which is not a thing but should be.)
TL;DR
Prefer NOT NULL when possible.
NULL is a weird thing - for the machine, too.