Decomposing relations to Fourth Normal Form - relational-database

Disclosure: I am taking Stanford's online database course. The forum there is dead, and I'm hoping for some help on SO.
Here's the quiz question:
Consider relation R(A,B,C,D,E) with multivalued dependencies:
A -» B, B -» D
and no functional dependencies. Suppose we decompose R into 4th Normal Form. Depending on the order in which we deal with 4NF violations, we can get different final decompositions. Which one of the following relation schemas could be in the final 4NF decomposition?
And here is my thinking:
Since we are given that there are no functional dependencies, the only key is set of attributes (A,B,C,D,E). In other words, both multivalued dependencies in the question are violating, and we must decompose them.
I am following the decomposition algorithm given in lecture:
Compute keys for R [done]
Repeat until all relations are in 4NF
Pick any R' with nontrivial A -» B that violates 4NF
Decompose R' into R_1(A, B) and R_2(A, rest)
Compute functional dependencies and multivalued dependencies for R_1 and R_2
Compute keys for R_1 and R_2
I see two ways to decompose the relations: start with A -» B or B -» D.
Starting with A -» B
R(A,B,C,D,E)
|
+-----------+
| |
R_1(A,B) R_2(A,C,D,E)
Since B and D are no longer in the same relation, we do not have a 4NF violation, and we're done. I'm not sure how to compute the FDs, MVDs, and keys at this point.
Starting with B -» D
R(A,B,C,D,E)
|
+-----------+
| |
R_1(B,D) R_2(B,A,C,E)
|
+----------+
| |
R_3(A,B) R_4(A,C,E)
At this point, (A and B) and (B and D) are decomposed into their own relations, so we have no violations, and we're done.
The answer choices:
At this point, I'm completely stumped. I do not see any of the relations in the answer choices, nor can I come up with an idea that will get me there:
CE
AD
AE
ABD
I don't need the answer outright, but what am I missing?

A correct answer is AD.
How is this obtained?
Consider that, like for functional dependencies, you can have multivalued dependencies implied by other multivalued dependencies. For instance, there is a pseudo-transitivity rule (or multi-valued transitivity rules) that says:
If X →→ Y holds, and Y →→ Z holds, then X →→ Z − Y holds
For this rule, from A →→ B and B →→ D you can derive A →→ D. So, if you decompose the relation in 4NF you could start from this dependency, and get a table with attributes AD. Or, alternatively, in your first decomposition, after finding R_1(A,B) and R_2(A,C,D,E), you should continue to decompose R_2, since it still contain the non-trivial MVD A →→ D, to find R_3(A, D) and R_3(A, C, E).

Related

How to deal with transitive functional dependency in MySQL

I'm relatively new to MySQL. Thank you in advance if any of you could point a direction on my problem. I am stuck on my project on the problem below. I have a data table below to show insurance premiums under different insurance variants.
ins_id (PK)
insurance_type
local_household (Variant A)
ss_medical_level (Variant B)
company_type (Variant C)
base
percentage
1
ss_pension
yes
x
x
2
ss_pension
no
x
x
3
ss_medical
yes
x
x
4
ss_medical
no
first
corporation
x
x
5
ss_medical
no
first
non_corporation
x
x
6
ss_medical
no
second
x
x
7
ss_medical
no
third
x
x
Variant C -> Variant B -> Variant A. They are transitive functional dependent by the sequence C > B > A
(-> means depends on)
In short, I am seeking a way to maintain these variants.
What I am trying to achieve is to design a schema to maintain the variants, and make it possible for easy future sustainable management on the variants. There will be more variants added for other types insurance. I had a draft design on the schema relations, but I am stuck on the variants part. I still did not get my head around on the transitive functional dependency design.
my current schema
I think I found the direction to solve my problem. I got inspired by this thread to use a path enumeration method.
https://www.youtube.com/watch?v=wuH5OoPC3hA

Finding the strongest normal form and if it isn't in BCNF decompose it?

I know how to do these problems easily when the input is basic. I know the rules for the 1st,2nd,and 3rd normal forms as well as BCNF. HOWEVER I was looking through some practice exercises and I saw some unusual input:
Consider the following collection of relations and dependencies.
Assume that each relation is obtained through decomposition from a
relation with attributes ABCDEFGHI and that all the known
dependencies over relation ABCDEFGHI are listed for each question.
(The questions are independent of each other, obviously, since the given dependencies over ABCDEFGHI are different.)
R2(A,B,F) AC → E, B → F
R1(A,C,B,D,E) A → B, C → D
I can solve 2:
A+=AB
C+=CD
AC+=ABCD
ACE=ABCDE
So ACE is the candidate key, and none of A, C and E are superkeys. It isn't bcnf for sure. Decompose it and obtain (ACE)(AB)(CD) etc etc.
BUT Number 1 is confusing me! Why is there AC → E when neither C nor E is in R2? How could this be solved? It can't be an error because many other exercises are like this :/
Another question, what happens when one functional dependency is in BCNF and others are not? Do we just ignore this functional dependency while decomposing the others into BCNF?
If I understand correctly the text of the exercise, the dependencies are those holding on the original relation (ABCDEFGHI): “all the known dependencies over relation ABCDEFGHI are listed for each question”.
So, assuming that in the original relation the only specified dependencies are AC → E and B → F, this means that the dependency AC → E is lost in the decomposed relation R2(A,B,F), that the (only) candidate key of the relation is AB, the schema is not in 2NF (since F depends on a part of a key), and that to decompose that schema in BCNF you must decompose it in (AB) and (BF).

Finding the Candidate Keys of a Relation Using the FD's

I have definitely checked out many different related posts, as suggested when creating this question. I have also done different sample problems from online sources as well from a similar problem. However, I am stuck on the problem below specifically.
Given the following relation R and the set of functional dependencies S that hold on R, find all candidate keys for R. Show your work.
R(A, B, C, D, E, F)
S:
AB → C
AC → B
AD → E
BC → A
E → F
Initially, I broke the attributes into groups: attributes found only on the left, only on the right, and on both sides (they are D, ABCE, and F respectively). I also know that I should try to compute the closure of D. This is where I get stuck. At first glance, this seems like I am unable to solve this problem, which isn't true. I also tried computing the closures of (AD), (BD), (CD), and (ED) because I thought that the closure of D = D. Any thoughts?
The keys here are ABD, ACD and BCD.
You were on the right track. After dividing the attributes in three groups, the attributes under "only on the left" list are always a part of the key. Here that attribute is D.
"I also tried computing the closures of (AD), (BD), (CD), and (ED)"
As you couldn't determine the key while taking attributes in groups of 2 you should have then tried making group of 3 attributes and check their closure.

Decomposing a relation into BCNF

I'm having trouble establishing when a relation is in Boyce-Codd Normal Form and how to decompose it info BCNF if it is not. Given this example:
R(A, C, B, D, E) with functional dependencies: A -> B, C -> D
How do I go about decomposing it?
The steps I've taken are:
A+ = AB
C+ = CD
R1 = A+ = **AB**
R2 = ACDE (since elements of C+ still exist, continue decomposing)
R3 = C+ = **CD**
R4 = ACE (no FD closures reside in this relation)
So now I know that ACE will compose the whole relation, but the answer for the decomposition is: AB, CD, ACE.
I suppose I'm struggling with how to properly decompose a relation into BCNF form and how to tell when you're done. Would really appreciate anyone who can walk me through their thought process when solving these problems. Thanks!
Although the question is old, the other questions/answers don't seem to provide a very clear step-by-step general answer on determining and decomposing relations to BCNF.
1. Determine BCNF:
For relation R to be in BCNF, all the functional dependencies (FDs) that hold in R need to satisfy property that the determinants X are all superkeys of R. i.e. if X->Y holds in R, then X must be a superkey of R to be in BCNF.
In your case, it can be shown that the only candidate key (minimal superkey) is ACE.
Thus both FDs: A->B and C->D are violating BCNF as both A and C are not superkeys or R.
2. Decompose R into BCNF form:
If R is not in BCNF, we decompose R into a set of relations S that are in BCNF.
This can be accomplished with a very simple algorithm:
Initialize S = {R}
While S has a relation R' that is not in BCNF do:
Pick a FD: X->Y that holds in R' and violates BCNF
Add the relation XY to S
Update R' = R'-Y
Return S
In your case, the iterative steps are as follows:
S = {ABCDE} // Intialization S = {R}
S = {ACDE, AB} // Pick FD: A->B which violates BCNF
S = {ACE, AB, CD} // Pick FD: C->D which violates BCNF
// Return S as all relations are in BCNF
Thus, R(A,B,C,D,E) is decomposed into a set of relations: R1(A,C,E), R2(A,B) and R3(C,D) that satisfies BCNF.
Note also that in this case, functional dependency is preserved but normalization to BCNF does not guarantee this.
1NF -> 2NF -> 3NF -> BCNF
According to given FD set "ACE" forms the key.
Clearly R(A,B,C,D,E) is not in 2NF.
2NF decomposition gives R1(A,B) , R2(C,D) and R3(A,C,E).
this decomposition decomposed relations are in 3NF and also in BCNF.

What are Boolean Networks?

I was reading this SO question and I got intrigued by what are Boolean Networks, I've looked it up in Wikipedia but the explanation is too vague IMO. Can anyone explain me what Boolean Networks are? And if possible with some examples too?
Boolean networks represent a class of networks where the nodes have states and the edges represent transitions between states. In the simplest case, these states are either 1 or 0 – i.e. boolean.
Transitions may be simple activations or inactivations. For example, consider nodes a and b with an edge from a to b.
f
a ------> b
Here, f is a transition function. In the case of activation, f may be defined as:
f(x) = x
i.e. b's value is 1 if and only if a's value is 1. Conversely, an inactivation (or repression) might look like this:
f(x) = NOT x
More complex networks use more involved boolean functions. E.g. consider:
a b
\ /
\ /
\ /
v
c
Here, we've got edges from a to c and from b to c. c might be defined in terms of a and b as follows.
f(a, b) = a AND NOT b
Thus, c is activated only if a is active and b is inactive, at the same time.
Such networks can be used to model all kinds of relations. One that I know of is in systems biology where they are used to model (huge) interaction networks of chemicals in living cells. These networks effectively model how certain aspects of the cells work and they can be used to find deficiencies, points of attack for drugs and similarities between unrelated components that point to functional equivalence. This is fundamentally important in understanding how life works.