Not sure about violations of 3NF in the following relation - relational-database

I have the following relation, but I am not sure about the violation of the 3NF:
R={A,B,C,D,E,F,G}
FDs:
{AF-->BCG, B-->DE, CG-->EF, E-->G}
My keys are {{A,F}},{{A,C,G}},{{A,C,E}},{{A,B,C}}
Using the following definition from Wikipedia, for every non-trivial FD X-->Y at least one of these must hold:
X is a superkey, or
Every element of Y-X, the set difference between Y and X, is a prime attribute (i.e., each attribute in Y-X is contained in some candidate key)
Then the only violations would be B-->DE and CG-->EF, because there X is not a superkey and therefore every element on the right side must be part of a key. Is that correct?
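One way to check this mechanically is to compute attribute closures, enumerate the candidate keys by brute force, and then test each given FD against the two conditions quoted above. The sketch below is only an illustration: the helper names (closure, candidate_keys, third_nf_violations) are made up for this example, and it checks only the listed FDs, not every implied one. Under those assumptions it reports B-->DE as the only violating FD (because of D), while E, F and G all turn out to be prime.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute force: minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            cand = set(combo)
            # Skip supersets of keys already found (they are not minimal).
            if any(k <= cand for k in keys):
                continue
            if closure(cand, fds) == attributes:
                keys.append(cand)
    return keys

def third_nf_violations(attributes, fds):
    """Return (lhs, offending non-prime attributes) for each violating FD."""
    prime = set().union(*candidate_keys(attributes, fds))
    violations = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == attributes:
            continue  # lhs is a superkey, so the FD is fine
        non_prime = (rhs - lhs) - prime
        if non_prime:
            violations.append((lhs, non_prime))
    return violations

R = set("ABCDEFG")
FDS = [(set("AF"), set("BCG")), (set("B"), set("DE")),
       (set("CG"), set("EF")), (set("E"), set("G"))]

keys = candidate_keys(R, FDS)
violations = third_nf_violations(R, FDS)
print(sorted("".join(sorted(k)) for k in keys))  # ['ABC', 'ACE', 'ACG', 'AF']
print(violations)                                # B with the non-prime attribute D
```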

Related

Different versions of 3NF?

I have a question on the definition of 3NF given by Chris Date in his book "Database Design and Relational Theory", page 78.
The definition given in the book is: "A relvar R is in 3NF iff for every non-trivial FD X -> Y, either X is a superkey, or Y is a subkey."
(For Date "Y is a subkey" means that Y is contained in a candidate key, and no assumption is made on the cardinality of the set Y in the Date definition.)
It seems to me, however, that this definition is not equivalent to the usual definition (that can be found in other references) saying that "R is in 3NF if for every FD X -> Y, either the FD is trivial, or X is a superkey, or every element in Y\X is contained in a candidate key".
Consider now the relvar with 5 attributes R(A,B,C,D,E) with the following FD cover:
{A,B} -> C,
{C,D} -> E,
E -> B
These imply {A,E} -> {B,C}. The candidate keys of R are K1 = {A,B,D}, K2 = {A,C,D} and K3 = {A,E,D} and so the FD {A,E} -> {B,C} shows that R is not in 3NF if we use Date's definition.
However, it is in 3NF if we use the "usual" definition (since every attribute is actually contained in a candidate key).
Is there something I do not understand? Or is Date really using another (stronger than the usual one) definition of 3NF?
The Date definition says "Y is a subkey means that Y is contained in a key, ...".
The 'usual definition' (where did you get that) says "or every element in Y\X is contained in a key".
Then they both say "Y ... contained in a key".
You can equivalently write {A,E} -> {B,C} as two FDs {A,E} -> {B}; {A,E} -> {C}. Now each Y\X is "contained in a [candidate] key".
So you seem to be quibbling about the wording "every element in Y", which the Date definition isn't explicit about(?) Or perhaps it is, and you haven't quoted Date in full?
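The difference between the two readings can be made concrete on the example from the question. This is just a sketch: the candidate keys are hard-coded as stated in the question, and the variable names are invented for illustration. It tests {A,E} -> {B,C} once with Y taken as a whole set and once attribute by attribute.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

FDS = [(set("AB"), set("C")), (set("CD"), set("E")), (set("E"), set("B"))]
KEYS = [set("ABD"), set("ACD"), set("ADE")]  # candidate keys as given in the question
lhs, rhs = set("AE"), set("BC")

# {A,E} -> {B,C} is implied by the cover, and {A,E} is not a superkey.
implied = rhs <= closure(lhs, FDS)
lhs_is_superkey = closure(lhs, FDS) == set("ABCDE")

# Whole-set reading: Y itself must fit inside a single candidate key.
whole_set_reading = any(rhs <= key for key in KEYS)

# Attribute-wise reading: each attribute of Y\X must lie in some candidate key.
per_attribute_reading = all(any(a in key for key in KEYS) for a in rhs - lhs)

print(implied, lhs_is_superkey)                   # True False
print(whole_set_reading, per_attribute_reading)   # False True
```

The attribute-wise test gives the same result as splitting the FD into {A,E} -> {B} and {A,E} -> {C} and applying the whole-set test to each part.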

How to prove 3NF?

I am trying really hard to wrap my head around how to prove 3NF.
I actually have the answer, but if someone knows this well enough to make me understand it, I would be very grateful. OK, here it goes:
If R is in 3NF according to Definition 2, R must be in 3NF according to Definition 1.
Recall that if R is in 3NF according to Definition 1, then the following two conditions must be satisfied.
i) For R, we don’t have any transitive functional dependency between a non-key attribute and a key through some other non-key attribute.
ii) For R, we don’t have any partial functional dependency between a non-key attribute and a key.
Assume that R does not satisfy (i). Then we must have a transitive FD: X → A, A → B, where X is a key and A and B are non-key attributes. But according to Definition 2, R has no such FDs. (That is, for A → B, either A must be a superkey or B must be a prime attribute.) Contradiction. So R must satisfy (i).
Assume that R does not satisfy (ii). Then we must have a proper subset S of a key X (S ⊂ X) such that there exists a non-key attribute A with S → A. However, according to Definition 2, S must then be a superkey, which a proper subset of a key cannot be. Contradiction. So R must satisfy (ii).
Therefore, R is in 3NF according to Definition 1.
If R is in 3NF according to Definition 1, R must be in 3NF according to Definition 2.

Assume that R is not in 3NF according to Definition 2. Then we must have an FD X → A such that X is not a superkey and A is not a prime attribute. Consider a key X’ of R. We must have
X’ → X → A.
It is a transitive FD between the non-key attribute A and the key X’ through X. If X is a non-key attribute, then R is not in 3NF according to Definition 1. Contradiction. If X appears in X’, we have a partial FD. So R is not in 2NF, contradicting Definition 1.

How can I determine the candidate keys in this relation

I have the relation R(ABCDEF) and the functional dependencies F{AC->B, BD->F, F->CE}
I have to find all the candidate keys for the relation (using Armstrong's axioms).
I did this:
A->A, B->B, C->C, D->D, E->E, F->F
From F->CE => F->C and F->E
And then:
1. BD->F
2. F->E
3. BD->E
4. BD->EF
5. BD->BD
6. BD->BDEF
7. BD->F
8. F->CEF
9. BD->CEF => BD->BCDEF
Now I am trying to get A on the right hand side of BD->BCDEF so BD can become a candidate key.
It would be great if someone could help.
EDIT:
1. ABD->ABCDEF
2. ACD->BD
3. ACD->ABD => AC->B and ACD->ABCDEF => BD->ABCDEF
The last step in your (edited) logic is
AC->B and ACD->ABCDEF, therefore BD->ABCDEF
It looks like you've replaced AC with B on the left-hand side. You seem to be thinking in arithmetical terms, not in terms of Armstrong's rules of inference. There isn't a rule of inference that says "if AC->B, then wherever AC appears, you can replace AC with B". (Sometimes it looks like that's what happens, but it's not.) AC and B aren't equal, and they're not equivalent.
Imagine that people's names are unique. Then "name" would determine "height", and "name" would determine "weight". But you can't replace name with height; you can't say that "height" determines "weight". The terms aren't equal, and they're not equivalent.
BD is not a candidate key, but ABD is. (There are others.)
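A quick way to see this is to compute the closures directly. The following is a minimal sketch (the helper names are invented for this example, and the key search is plain brute force): the closure of BD never reaches A, while the closure of ABD covers every attribute.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

R = set("ABCDEF")
FDS = [(set("AC"), set("B")), (set("BD"), set("F")), (set("F"), set("CE"))]

bd_closure = closure("BD", FDS)    # stops at BCDEF: nothing ever produces A
abd_closure = closure("ABD", FDS)  # reaches all of ABCDEF

# Brute-force search for all candidate keys, smallest sets first.
keys = []
for size in range(1, len(R) + 1):
    for combo in combinations(sorted(R), size):
        cand = set(combo)
        if not any(k <= cand for k in keys) and closure(cand, FDS) == R:
            keys.append(cand)

print("".join(sorted(bd_closure)))                # BCDEF
print(sorted("".join(sorted(k)) for k in keys))   # ['ABD', 'ACD', 'ADF']
```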
Rules of Thumb:
An Attribute appearing only on the left hand side in your FDs is in all keys.
An Attribute not appearing in any of your FDs is in all keys.
An Attribute only appearing on the right hand side in your FDs is not in any key.
A candidate key is the left hand side of a derived FD on which all attributes depend.
Example:
R(ABCDE), (A->C, AB->D, D->B)
E does not appear in any FDs. E is in all Keys.
A appears only on the left hand side. A is in all keys.
C appears only on the right hand side. C is not in any keys.
Keys will include the attributes:
AE
Find dependencies with every possible key from AE:
A->C
A->AC (X->XY axiom)
E->E
AE->ACE (from previous 2 FDs)
Not all attributes are on the right, therefore AE is not a key, only part of all keys.
Start combining AE with BCD and see what comes out:
ADE->ABCDE (as D->B, and by X->XY axiom D->BD. This is a key, by last rule of thumb)
ACE->ACE
ABE->ABCDE (AE->ACE, AB->ABD from the axioms; this is a key)
ABCE->ABCDE, ABDE->ABCDE (superkeys of ABE, so ignore)
ACDE->ABCDE (superkey of ADE)
Assuming I've done this correctly, then ABE and ADE are keys.
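The rules of thumb above translate fairly directly into code. This is a rough sketch under the assumption that FDs are given as (lhs, rhs) pairs of sets; the function name keys_by_rules_of_thumb is made up for this example. It fixes the attributes that must be in every key, drops those that can be in none, and only tries combinations of the rest.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def keys_by_rules_of_thumb(attributes, fds):
    lhs_attrs = set().union(*(lhs for lhs, _ in fds))
    rhs_attrs = set().union(*(rhs for _, rhs in fds))
    core = attributes - rhs_attrs    # left-side only, or in no FD: in every key
    never = rhs_attrs - lhs_attrs    # right-side only: in no key
    optional = sorted(attributes - core - never)
    keys = []
    # Extend the core with ever larger subsets of the remaining attributes.
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            cand = core | set(extra)
            if not any(k <= cand for k in keys) and closure(cand, fds) == attributes:
                keys.append(cand)
    return core, never, keys

R = set("ABCDE")
FDS = [(set("A"), set("C")), (set("AB"), set("D")), (set("D"), set("B"))]
core, never, keys = keys_by_rules_of_thumb(R, FDS)
print(sorted(core), sorted(never))                # ['A', 'E'] ['C']
print(sorted("".join(sorted(k)) for k in keys))   # ['ABE', 'ADE']
```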

Difference between declarative and model-based specification

I've read definition of these 2 notions on wiki, but the difference is still not clear. Could someone give examples and some easy explanation?
A declarative specification describes an operation or a function with a constraint that relates the output to the input. Rather than giving you a recipe for computing the output, it gives you a rule for checking that the output is correct. For example, consider a function that takes an array a and a value e, and returns the index of an element of the array matching e. A declarative specification would say exactly that:
function index (a, e)
returns i such that a[i] = e
In contrast, an operational specification would look like code, e.g., with a loop through the indices to find i. Note that declarative specifications are often non-deterministic; in this case, if e matches more than one element of a, there are several values of i that are valid (and the specification doesn't say which to return). This is a powerful feature. Also, declarative specifications are often not total: here, for example, the specification assumes that such an i exists (and in some languages you would add a precondition to make this explicit).
To support declarative specification, a language must generally provide logical operators (especially conjunction) and quantifiers.
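To make the contrast concrete, here is a rough Python sketch (Python is not a specification language, and all the function names are invented for illustration). The declarative part is a predicate that only checks a proposed answer; the two operational functions are different recipes that both satisfy it, which shows the non-determinism mentioned above.

```python
def satisfies_index_spec(a, e, i):
    """Declarative view: is i an acceptable result? Nothing about how to find it."""
    return 0 <= i < len(a) and a[i] == e

def index_first(a, e):
    """Operational view, recipe 1: scan from the left."""
    for i, x in enumerate(a):
        if x == e:
            return i
    raise ValueError("precondition violated: e does not occur in a")

def index_last(a, e):
    """Operational view, recipe 2: scan from the right."""
    for i in range(len(a) - 1, -1, -1):
        if a[i] == e:
            return i
    raise ValueError("precondition violated: e does not occur in a")

a = [3, 7, 3]
first, last = index_first(a, 3), index_last(a, 3)
# Different answers, yet both are correct according to the same specification.
print(first, satisfies_index_spec(a, 3, first))  # 0 True
print(last, satisfies_index_spec(a, 3, last))    # 2 True
```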
A model-based language is one that uses standard mathematical structures (such as sets and relations) to describe the state. Alloy and Z are both model based. In contrast, algebraic languages (such as OBJ and Larch) use equations to describe state implicitly. For example, to specify an operation that inserts an element in a set, in an algebraic language you might write something like
member(insert(s,e),x) = (e = x) or member(s,x)
which says that after you insert e into s, the element x will be a member of the set if you just inserted that element (e is equal to x) or if it was there before (x is a member of s). In contrast, in a language like Z or Alloy you'd write something like
insert (s, e)
s' = s U {e}
with a constraint relating the new value of the set (s') to the old value (s).
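The two styles can be mimicked in Python as a loose analogy (this is not Z, Alloy, OBJ or Larch syntax, and the names are made up for this sketch). The model-based check constrains the new state directly as a set, whereas the algebraic law only relates the operations member and insert to each other.

```python
def insert_spec(s, e, s_after):
    """Model-based: the state is an explicit set, and s' must equal s U {e}."""
    return s_after == s | {e}

# A candidate implementation, used below to test the algebraic law.
def insert(s, e):
    return s | {e}

def member(s, x):
    return x in s

def algebraic_law_holds(s, e, x):
    """Algebraic: member(insert(s,e),x) = (e = x) or member(s,x)."""
    return member(insert(s, e), x) == ((e == x) or member(s, x))

s = frozenset({1, 2})
print(insert_spec(s, 3, frozenset({1, 2, 3})))  # True
print(insert_spec(s, 3, s))                     # False: 3 was not added
print(all(algebraic_law_holds(s, e, x) for e in range(5) for x in range(5)))  # True
```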
For many examples of declarative, model-based specification, see the materials on Alloy at http://alloy.mit.edu, or my book Software Abstractions. You can also see comparisons between model-based declarative languages through an example in the appendix of the book, available at the book's website http://softwareabstractions.org.

how to find the highest normal form for a given relation

I've gone through internet and books and still have some difficulties on how to determine the normal form of this relation
R(A, B, C, D, E, F, G, H, I)
FDs =
B→G
BI→CD
EH→AG
G→DE
So far I've found that the only candidate key is BHI (or BFHI, if I should count F), since the attribute F is not used at all; it is totally independent of the given FDs.
What am I supposed to do with the attribute F then?
How do I determine the highest normal form for the relation R?
What am I supposed to do with the attribute F then?
You could observe the fact that the only FD in which F gets mentioned is the trivial one, F->F. It's not explicitly mentioned precisely because it is trivial. Nonetheless, all of Armstrong's axioms apply to trivial ones equally well. So you can use this trivial one, e.g. applying augmentation, to go from B->G to BF->GF.
How to determine the highest normal form for the relation R?
First, test the condition of first normal form. If satisfied, the NF is at least 1. Check the condition of second normal form. If satisfied, the NF is at least 2. Check the condition of third normal form. If satisfied, the NF is at least 3.
Note :
"checking the condition of first normal form", is a bit of a weird thing to do in a formal process, because there exists no such thing as a formal definition of that condition, unless you go by Date's, but I have little doubt that your course does not follow that definition.
Hint :
Given that the sole key is BFHI, which is the first clause of "the key, the whole key, and nothing but the key" that gets violated by, say, B->G?
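On the question of F, a closure computation shows why it has to be part of the key. This is only a sketch with invented helper names: any attribute that never appears on a right-hand side cannot be derived from anything else, so it must be in every key, and here those attributes alone already determine the whole relation.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

R = set("ABCDEFGHI")
FDS = [(set("B"), set("G")), (set("BI"), set("CD")),
       (set("EH"), set("AG")), (set("G"), set("DE"))]

# Attributes that no FD can produce must belong to every candidate key.
rhs_attrs = set().union(*(rhs for _, rhs in FDS))
core = R - rhs_attrs

print("".join(sorted(core)))                   # BFHI
print(closure(core, FDS) == R)                 # True: the core is itself a key
print("".join(sorted(closure("BHI", FDS))))    # ABCDEGHI: F is never reached
```

Since every key must contain the core and the core is already a key, it is the only candidate key.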