Database normalization of an entity with varied properties - mysql

In my application I have an entity, table, called actions with varied properties. To clarify the case, the following is the table actions structure:
id,
status_id(not null),
section_id(not null),
job_id (not null)
equipment_id (null),
cause_id (null),
solution_id (null),
created_at,
closed_at,
action_type (not null) char(3)
Where all fields suffixed with _id are foreign keys and the action_type is very limited and defined list of actions types, so I defined it in a configuration file i.e there is no database entity for action_type.
My question is more general than this one: Can a foreign key be NULL and/or duplicate? where I'm asking about normalization principal.
In my case, some action types has no need, for example, for equipment_id, where others need equipment_id but not need both cause_id and solution_id, etc
In my database design, the actions table looks like Many to Many conjugation table.
The above design allows, easily, to get many statistics data about sections and jobs without need to perform complex join queries.
My question is: Does my normalization and design correct?

Yes. A foreign key containing a NULL represents a case where a relationship is optional, and the relationship is not present in this instance.
In your case, there may be entries where there is simply no corresponding equipment, and equipment_id is accordingly left NULL. When a join is done to the reference table, rows with NULL in the foreign key will simply drop out.

Yes if these _id are not so important so in normalization you can set them as null or remove them :)

Related

Foreign Key or Composite Key?

I just started applying everything that I read about table relationships but I'm kind of confused on how to insert data on tables with MANY-TO-MANY relationship considering there's a third table.
Right now I have these tables.
subject
name
code PK
units
description
schoolyear
schoolyearId PK
yearStart
yearEnd
schoolyearsubjects (MANY TO MANY table)
id PK
code FK
schoolyearId FK
But the problem with the above schoolyearsubjects table is that, I don't know how I can insert the schoolyearId from the GUI. On the GUI screenshot, as soon as "Save" button is clicked, a TRANSACTION containing 2 INSERT statements (to insert on subject) and (to insert on schoolyearsubjects) will execute. If I stick with the above, I'll have to insert the schoolyearId. schoolyearId definitely won't come from GUI.
I'm thinking of changing the columns of schoolyearsubjects and schoolyear to this:
schoolyear
--composite keys (yearStart, yearEnd)
yearStart (PK)
yearEnd (PK)
schoolyearsubjects(MANY TO MANY table)
id PK
code (FK)
yearStart (FK) --is this possible?
yearEnd (FK) --is this possible?
1.) Is the solution to change the columns and make a composite key so I can just insert yearStart and yearEnd values instead of schoolyearId?
2.) Is my junction / linking table schoolyearsubjects correct?
3.) What can you advise?
I'd appreciate any help.
Thanks.
For me schoolyear is a period, and as such, there is no need to use a surrogate key here. This always makes things more confusing, and it is always more difficult to develop a graphical interface for it (I'm talking about how we model periods as developers).
If you stop to think, periods are seen itself as something unique. Will you have a period equal to the other? Stop and think. Even if you have, this will occur in years or different times. So we already have a primary key for schoolyear. Eliminate "schoolyard PK" from schoolyear. Use composite key here with yearStart and yearend. So, your schoolyear entity (in future, table) will be like:
yearStart PK
yearEnd PK
In the intermediate table, you will have 3 fields as composite primary key (also foreign key!):
yearStart PK FK (from schoolyear)
yearEnd PK FK (from schoolyear)
code PK FK (from subject)
This will permit that a period have only a single subject. If, on the other hand, you want a period with more than one subject, you would have to put a surrogate key here.
Now, to draw the graphical interface, you will only have to use a select box (combo box). In this case you will have each item as a text, something like "from year X to Y" (a period). Your users can very well understand and select it.
Note: In anyway, you may not have the ID of a record in an interface, but the values that identify it. This is permissible to be seen, and identifies a record of their remaining.
If, however, you do not have periods as something unique, then "yearStart" and "yearEnd" are fields in subject entity, and there is no schoolyear entity. To be honest, the entity "schoolyear" should only exist if you want to reuse it's records to relationships with other records of other(s) table(s). I'm not saying this is or is not the case. Watch out as well. If you do this you say that every period has only one subject (as fields). I do not know if this is exactly what you want. We must always remember the most important thing in shaping an ER-Diagram:
CONTEXT
Check your context. What does it ask? If you have any questions, please comment. If you can offer me some more context here, I can help you more.
Assuming you have parameters #code, #yearStart and #yearEnd with values from the UI:
INSERT INTO schoolyearsubjects ( code, yearStart, yearEnd )
SELECT #code, y.yearStart, y.yearEnd
FROM schoolyear y
WHERE #yearStart <= y.yearStart
AND y.yearEnd <= #yearEnd;
...but I think you have a design flaw with your schoolyearsubjects because it allows duplicates e.g. doing this:
INSERT INTO schoolyearsubjects VALUES ( 'code red', '2016', '2017' );
INSERT INTO schoolyearsubjects VALUES ( 'code red', '2016', '2017' );
INSERT INTO schoolyearsubjects VALUES ( 'code red', '2016', '2017' );
looks like it would result in three de facto duplicate rows.
With your current scheme you can insert the schoolyearId with a request as follows:
INSERT INTO schoolyearsubjects (id, code, schoolyearId)
VALUES ( ${id},
${code_from_GUI},
( SELECT schoolyearId
FROM schoolyear
WHERE yearStart=${start_from_GUI} AND yearEnd=${end_from_GUI})
);
For this to work, the unique constraint on (yearStart, yearEnd) in the schoolyear table is required.
As to the rest of your questions:
1) You can use a composite key in the schoolyear table it will work either way.
2) The schoolyearsubjects is correct as it allows to write join queries. If you get rid of the schoolyearId columns than you will not probably need the schoolyear table alltogether as all data you may want to get will be in the schoolyearsubjects table.
3) This article may help to deside what type of key to use.

Nullable foreign keys - good or bad practice?

Let's assume I have 2 tables: foo and bar.
In third table I want to store different kind of data, however every row will have a reference to either foo OR bar.
Is it correct if I create 2 NULLable foreign keys - foo_id and bar_id - in the third table, or is it againts database design principles?
Basically, I thought all the time that foreign keys need to ALWAYS have a "parent", so if I try to e.g. INSERT a row with no primary key matched (or, in this case, with a foreign key set to NULL), I will get an error. Nullable FK-s are new to me, and they feel a bit off.
Also, what are the alternatives? Is it better to create separate tables storing single reference? Isn't this creating redundancy?
Linking tables?
Help.
A nullable FK is "okay". You will still get an error when you try to insert a non-existing parent key (it is just NULL that is allowed now).
The alternative is two link tables, one for foo and one for bar.
Things to consider:
Link tables allow for 1:N. If you don't want that, you can enforce it by primary key on the link table. That is not necessary for the id column solution (they are always 1:N).
You can avoid columns with mostly NULL values using link tables. In your case, though, it seems that you have NULL for exactly half the values. Probably does not qualify as "mostly". This becomes more interesting with more than two parent tables.
You may want to enforce the constraint that exactly one of your two columns is NULL. This can be done with the id column version using a check constraint. It cannot be done with link tables (unless you use triggers maybe).
it is depend on the business logic of the program. if the foreign key field must has a value , it is bad to set it null-able .
for example .
a book table has category_id field which the value is reference from bookCategory table.
each record in book table must has category . if for some reason you set it as nullable . this will cause some record in book table with category_id is null.
the problem will show up in report. the following 2 query will return different total_book
select count(*) as total_book from book;
select
count(*) as total_book
from
book
inner join bookCategory
on book.category_id = category.id
my advice is don't use null-able unless you expect value and no-value . alot of complex system that sometime have value different from one report and another , usually is cause by this.

Can a foreign key act as a primary key?

I'm currently designing a database structure for our team's project. I have this very question in mind currently: Is it possible to have a foreign key act as a primary key on another table?
Here are some of the tables of our system's database design:
user_accounts
students
guidance_counselors
What I wanted to happen is that the user_accounts table should contain the IDs (supposedly the login credential to the system) and passwords of both the student users and guidance counselor users. In short, the primary keys of both the students and guidance_counselors table are also the foreign key from the user_accounts table. But I am not sure if it is allowed.
Another question is: a student_rec table also exists, which requires a student_number (which is the user_id in the user_accounts table) and a guidance_counsellor_id (which is also the user_id in the user_accounts) for each of its record. If both the IDs of a student and guidance counselor come from the user_accounts table, how would I design the student_rec table? And for future reference, how do I manually write it as an SQL code?
This has been bugging me and I can't find any specific or sure answer to my questions.
Of course. This is a common technique known as supertyping tables. As in your example, the idea is that one table contains a superset of entities and has common attributes describing a general entity, and other tables contain subsets of those entities with specific attributes. It's not unlike a simple class hierarchy in object-oriented design.
For your second question, one table can have two columns which are separately foreign keys to the same other table. When the database builds the query, it joins that other table twice. To illustrate in a SQL query (not sure about MySQL syntax, I haven't used it in a long time, so this is MS SQL syntax specifically), you would give that table two distinct aliases when selecting data. Something like this:
SELECT
student_accounts.name AS student_name,
counselor_accounts.name AS counselor_name
FROM
student_rec
INNER JOIN user_accounts AS student_accounts
ON student_rec.student_number = student_accounts.user_id
INNER JOIN user_accounts AS counselor_accounts
ON student_rec.guidance_counselor_id = counselor_accounts.user_id
This essentially takes the student_rec table and combines it with the user_accounts table twice, once on each column, and assigns two different aliases when combining them so as to tell them apart.
Yes, there should be no problem. Foreign keys and primary keys are orthogonal to each other, it's fine for a column or a set of columns to be both the primary key for that table (which requires them to be unique) and also to be associated with a primary key / unique constraint in another table.

composite foreign key referencing one or more columns

I have a table (tableA) that joins 3 other tables with primary keys 'vehicle','engine','transmission' I would like to be able to assign parts to one or more of these eg 'only this vehicle' or 'only this vehicle with this engine' or 'any vehicle with this engine'
My plan is to have a parts table (tableB) with also primary keys 'vehicle','engine','transmission' and I'd like to be able to insert for example:
4844, null, null to assign a part to only 'vehicle' or
4844, 240, null to assign a part to 'only this vehicle with this engine'.
Is there some way I can enforce integrity at the database level so that.
the fields that are filled in in tableB must reference fields in tableA.
at least one of the fields must be filled in.
the option exists not to insert data into them all?
read up on data model patterns. there are several good books (hay, fowler, silverston, blaha) and hay covers this kinda stuff well
if you want to do this sort of stuff, please use a real database like postgres that has check constraints. for your questions 2 and 3, this is easily solved with checks.

mysql many to many relationship

Been reading the tutorial How to handle a Many-to-Many relationship with PHP and MySQL .
In this question I refer to the "Database schema" section which states the following rules:
This new table must be constructed to
allow the following:
* It must have a column which links back to table 'A'.
* It must have a column which links back to table 'B'.
* It must allow no more than one row to exist for any combination of rows from table 'A' and table 'B'.
* It must have a primary key.
Now it's crystal clear so far.
The only problem I'm having is with the 3rd rule ("It must allow no more than one row to exist for any combination").
I want this to be applied as well, but it doesn't seem to work this way.
On my test instance of mysql (5.XX) I'm able to add two rows which reflect the same relationship!
For example, if I make this relation (by adding a row):
A to B
It also allows me to make this relation as well:
B to A
So the question is two questions actually:
1) How do I enfore the 3rd rule which will not allow to do the above? Have only one unique relation regardless of the combination.
2) When I'll want to search for all the relations of 'A', how would the SQL query look like?
Note #1: Basically my final goal is to create a "friendship" system, and as far as I understand the solution is a many-to-many table. Suggest otherwise if possible.
Note #2: The users table is on a different database from the relations (call it friendships) table. Therefore I cannot use foreign keys.
For the first question:
Create a unique constraint on both
columns
Make sure you always sort the columns. So if your table has the
colummns a and b than make sure
that a is less than or equal to
b
For the second question:
SELECT
*
FROM
many_to_many_table
WHERE
a = A or b = A
It sounds like you want a composite primary key.
CREATE TABLE relationship (
A_id INTEGER UNSIGNED NOT NULL,
B_id INTEGER UNSIGNED NOT NULL,
PRIMARY KEY (A_id, B_id)
);
This is how you setup a table so that there can only ever be one row that defines tables A and B as related. It works because a primary key has to be unique in a table so therefore the database will allow only one row with any specific pair of values. You can create composite keys that aren't a primary key and they don't have to be unique (but you can create a unique non-primary key, composite or not), but your specification requested a primary key, so that's what I suggested.
You can, of course, add other columns to store information about this specific relationship.
Ok WoLpH was faster, I basically agree (note that you have to create a single constraint on both columns at the same time!). And just to explain why you collide with the rules you mentioned: Typically, A and B are different tables. So the typical example for n:m relations would allow entries (1,0) and (0,1) because they'd be refering to different pairs. Having table A=table B is a different situation (you use A and B as users, but in the example they're tables).