How would transitive functional dependencies occur here?

How would transitive functional dependencies occur here? - relational-database

so im reading up on database normalization, and it seems like for the most part, a lot of us are already following up to 2NF or even 3NF without realizing it. I wonder why our professor 4 years ago told us thats a topic discussed in masters database course because its "too complicated" lol...sounds straight forward to me really...
anyways, part of this article here, talks about 3NF, and to achieve that, you need to have 2NF and no no transitive functional dependencies.
the example given is this image, but i dont understand...how could a value of a non-key column change a value of another non-key column? if anything, it sounds to me like a glitch in the system if that were to happen...
Consider the table 1. Changing the non-key column Full Name may change Salutation.

That article is terrible. So clearly some people find it difficult to understand normalisation. (Many people struggle with the difference between 3NF vs Boyce-Codd NF, which the article ducks out of explaining.) The article says
Normalization helps produce database systems that are cost-effective and have better security models.
That's not the chief reason for normalising a design. Indeed in the early days of the Relational Model (when disk was expensive, so for example dates had two-digit years), normalisation (i.e. vertical partitioning) was the opposite of cost-effective, and a lot of attention was paid to trade-offs in (partially) denormalised schemas.
The chief reason for normalisation is to avoid duplicating information and/or duplicates getting out of step, called 'update anomalies'. Specifically:
A transitive functional dependency is when changing a non-key column, might cause any of the other non-key columns to change
Is a terrible way to put it. But might mean: when you update one column (Full Name) of one row (Membership Id 3) you need to also update other column(s) (Salutation) of the same or other row(s) (Membership Id 2?); or if you don't, you break the consistency of the data.
The article doesn't tell us what FDs are expected to hold. Does Full Name determine Salutation? Is it possible the Membership ID 3 Robert Phil could qualify as a Doctor and therefore change his Salutation without Member 2 also becoming a Doctor? Then there is no FD from Full Name to Salutation, and what looks like duplicated entry is not.
Presumably what the example is trying to show (I'm not sure, because it's wrong) is that there's a dependency between Full Name and Salutation. Introducing a Salutation Id is so ... stupid, I'm very tempted to say "not even wrong". It has not removed the Transitive Functional Dependency at all.
Normalisation would (assuming there is a FD from Full Name to Salutation):
Put Full Name and Salutation in a separate table, keyed by Full Name -- that represents one of the FDs.
Remove Salutation from the Membership table.
Not introduce a Salutation ID field.
You can recover the original Membership table by joining to the Full Name, Salutation table.
The alleged 3NF form has not removed the Transitive Functional Dependency, and so is not in 3NF. All it has done is replace a Transitive FD from Membership ID to Full Name to Salutation with one from Membership ID to Full Name to Salutation ID. So if Member 3 changes their name from Robert Phil to Roberta Phil, under the initial design Salutation would have to change in step from Mr to Ms; under the alleged 3NF design still the Salutation ID has to change from 1 to 2.
There are other reasons to think that alleged 3NF design is not 3NF. I expect a dependency from Person to Full Name and to Address. There's no column Person, with the consequence there are two Mr Robert Phils. Are they the same person? Then what if they flatted together? The article tries to introduce a composite key {Full Name, Address}, but that won't help; and it's quite common for same-named father and son to live at the same address. We'd have two people with same name at same address. (What if one of them then qualified as a Doctor?)
Normalisation would introduce a Person ID, key to a Person table, with columns Full Name, Salutation, Address. The partitioned Membership table would have columns Membership ID (key), Person ID (Foreign Key references Person).

Related

Normalize two tables with same primary key to 3NF

I have two tables currently with the same primary key, can I have these two tables with the same primary key?
Also are all the tables in 3rd normal form
Ticket:
-------------------
Ticket_id* PK
Flight_name* FK
Names*
Price
Tax
Number_bags
Travel class:
-------------------
Ticket id * PK
Customer_5star
Customer_normal
Customer_2star
Airmiles
Lounge_discount
ticket_economy
ticket_business
ticket_first
food allowance
drink allowance
the rest of the tables in the database are below
Passengers:
Names* PK
Credit_card_number
Credit_card_issue
Ticket_id *
Address
Flight:
Flight_name* PK
Flight_date
Source_airport_id* FK
Dest_airport_id* FK
Source
Destination
Plane_id*
Airport:
Source_airport_id* PK
Dest_airport_id* PK
Source_airport_country
Dest_airport_country
Pilot:
Pilot_name* PK
Plane id* FK
Pilot_grade
Month
Hours flown
Rate
Plane:
Plane_id* PK
Pilot_name* FK

This is not meant as an answer but it became too long for a comment...
Not to sound harsh, but your model has some serious flaws and you should probably take it back to the drawing board.
Consider what would happen if a Passenger buys a second Ticket for instance. The Passenger table should not hold any reference to tickets. Maybe a passenger can have more than one credit card though? Shouldn't Credit Cards be in their own table? The same applies to Addresses.
Why does the Airport table hold information that really is about destinations (or paths/trips)? You already record trip information in the Flights table. It seems to me that the Airport table should hold information pertaining to a particular airport (like name, location?, IATA code et cetera).
Can a Pilot just be associated with one single Plane? Doesn't sound very likely. The pilot table should not hold information about planes.
And the Planes table should not hold information on pilots as a plane surely can be connected to more than one pilot.
And so on... there are most likely other issues too, but these pointers should give you something to think about.
The only tables that sort of looks ok to me are Ticket and Flight.

Re same primary key:
Yes there can be multiple tables with the same primary key. Both in principle and in good practice. We declare a primary or other unique column set to say that those columns (and supersets of them) are unique in a table. When that is the case, declare such column sets. This happens all the time.
Eg: A typical reasonable case is "subtyping"/"subtables", where entities of a kind identified by a candidate key of one table are always or sometimes also of the kind identifed by the same values in another table. (If always then the one table's candidate key values are also in the other table's. And so we would declare a foreign key from the one to the other. We would say the one table's kind of entity is a subtype of the other's.) On the other hand sometimes one table is used with attributes of both kinds and attributes inapplicable to one kind are not used. (Ie via NULL or a tag indicating kind.)
Whether you should have cases of the same primary key depends on other criteria for good design as applied to your particular situation. You need to learn design including normalization.
Eg: All keys simple and 3NF implies 5NF, so if your two tables have the same set of values as only & simple primary key in every state and they are both in 3NF then their join contains exactly the same information as they do separately. Still, maybe you would keep them separate for clarity of design, for likelihood of change or for performance based on usage. You didn't give that information.
Re normal forms:
Normal forms apply to tables. The highest normal form of a table is a property independent of any other table. (Athough you might choose that form based on what forms & tables are alternatives.)
In order to normalize or determine a table's highest normal form one needs to know (in general) all the functional dependencies in it. (For normal forms above BCNF, also join dependencies.) You didn't give them. They are determined by what the meaning of the table is (ie how to determine what rows go in it in any given situation) and the possible situtations that can arise. You didn't give them. Your expectation that we could tell you about the normal forms your tables are in without giving such information suggests that you do not understand normalization and need to educate yourself about it.
Proper design also needs this information and in general all valid states that can arise from situations that arise. Ie constraints among given tables. You didn't give them.

Having two tables with the same key goes against the idea of removing redundancy in normalization.
Excluding that, are these tables in 1NF and 2NF?
Judging by the Names field, I'd suggest that table1 is not. If multiple names can belong to one ticket, then you need a new table, most likely with a composite key of ticket_id,name.

Is it proper to make a grand-parent key, a primary key, in its grand-child, in a multi-level identifying relationship?

Asked this here a couple of days ago, but haven't gotten many views, let alone a response, so I'm reposting to stackoverflow.
I'm modeling a DB for a conference ticketing system. In this system attendees are members of an attendee group, which belong to a conference. These relationships are identifying, and therefore FKs must be PKs in the respective children.
My current model:
Q: Is it proper to have attendeeGroupConferenceId FK, as a PK, in the attendee table, as MySQL Workbench has automatically set up for me?
On one side one would get a performance boost by keeping it in there for quick association at "check in". However, it does not strictly necessary since the combination of id, attendeeGroupId, and a corresponding lookup of conferenceId in the respective attendeeGroup table, is enough. (Therefore becomes redundant data.)
To me, it feels like it might violate some form of normalization, but I plan on keeping it in for the speed boost as described. I'm just curious about what proper design says about giving it PK status or not.

You definitely don't need the attendeeGroupConferenceId in your attendee table. It's redundant and notice that candidate key is the combination of (attendeeGroupId, personId), not the attendeeGroupConferenceId alone.
The table attendee also seems to violate the Second normal form (2NF) as it is.
My suggestion is to remove the attribute attendeeGroupConferenceId. In any case you can just join the tables in your queries to get extra info rather than keeping an extra attribute.

Relational Table Normalization

I would like to know if the relational table in BCNF
Student(StudentNum, NRIC, DateOfBirth, BookTitle)
• Student’s number (StudentNum) uniquely identifies the National Registration Identity Card (NRIC) and the date of birth of the student (DateOfBirth).
• The NRIC determines the student’s date of birth (DateOfBirth).
According to my analysis, the relation is in 2NF. And after changing to BCNF it looks like this
Student(StudentNum, NRIC, BookTitle)
StudentDetails(NRIC, DateOfBirth)
My query;
Before the change 2NF
After the change BCNF
Am i correct.

The idea behind normalization is to eliminate redundancy of rows as much as possible by putting creating additional tables for those columns that may seem to inflict any redundancy on the table.
For example if we start with the following table: Student(StudentNum, NRIC, DateOfBirth, BookTitle) The unique columns in this case are the StudentNum and NRIC, the other 2 fields are not because other students may have the same date of birth and others may have borrowed the book of the same title. From there we see the need to normalize in order for us not to fall into the redundancy of data, for example what if the same student borrowed 100 different books?
If everything is in a single table, we may end up with lots of redundant(repetitive) data.
I suggest you check out this guide to the 5 normal forms http://www.bkent.net/Doc/simple5.htm
Update:
I think the beginning relation better to be considered 1st normal form given that everything is within a single table. Your resulting relation is 2NF I guess because what if different students have borrowed the same book? That may lead to repetition in the Student table.
I think you have to give more info regarding the scenario constituting your relations so we can analyze this better. It highly depends on the business rules.

No. Student is in 1NF, not 2NF.
You're starting with
Student(StudentNum, NRIC, DateOfBirth, BookTitle)
and these dependencies.
StudentNum->NRIC
StudentNum->DateOfBirth
NRIC->DateOfBirth
A relation is in 2NF if and only if
it's in 1NF, and
every non-prime attribute is dependent on the whole of every candidate key, not just on part of any candidate key.
So your first job is to determine the candidate keys of the Student relation. The Student relation has only one candidate key, and that's {StudentNum, BookTitle}.
Your textbook should have at least one algorithm for determining all the candidate keys of a relation.
Since NRIC is dependent on StudentNum, and StudentNum isn't a candidate key (it's just part of a candidate key), the relation Student is not in 2NF. Fix that by changing
Student(StudentNum, NRIC, DateOfBirth, BookTitle)
to this, by eliminating the partial key dependency on StudentNum.1
Student (StudentNum, NRIC, DateOfBirth)
StudentBooks (StudentNum, BookTitle)
StudentBooks has no non-prime attributes at all; it's now in 6NF. Student is in 2NF, but not yet in 3NF or BCNF. Do you know why?
It seems you did know why. There is indeed a transitive dependency: StudentNum->NRIC, and
NRIC->DateOfBirth. Fix the transitive dependency like this.
Student (StudentNum, NRIC)
NRIC (NRIC, DateOfBirth)
StudentBooks (StudentNum, BookTitle)
All three of those relations are in 6NF.
This decomposition might look a little odd. That's because textbook examples usually don't use meaningful names for either relations or attributes. Relations are usually named R{ABCD}, R1{ABC}, R2{AD}, etc. The decomposition above involved
projecting a new relation {StudentNum, NRIC, DateOfBirth} to eliminate the partial key dependency,
observing that the name "Student" no longer identified the relation consisting of {StudentNum, BookTitle},
moving the name "Student" from the original relation to the new, projected relation, and
coming up with a new name, StudentBooks, for the original relation.

Always create unique keys whenever possible?

Should you always create unique keys whenever possible?
For example let's say I have a table with three fields, student ID, first name, last name and the student ID is the primary key.
If no two students have the first & last name, should I create a unique key for those two fields?

Yes, you should use unique indexes even when you already have a primary key when the column or combination of columns are unique. It's good to have constraints in your database to prevent bad data. However, this is not what you have in your case. Even if you currently have no students with duplicate names that can easily happen in the future. Names are not unique in the world.
U.S. Social Security numbers are almost always unique (they can be reused after a number of years, but it's unlikely to ever happen in your case), so they might make for a good candidate for a unique index. If you have non-U.S. students though then you would need to make the column nullable.

Yes, usually having unique IDs (surrogate keys) is best. In this case, last name and first name are not enough for a primary key. Even if you no duplicate names now, you can't be sure you won't have two John Smith's in the future.

Don't make the assumption that no two students will have the same name.
When the underlying model suggests it, it is a good idea to create unique keys. Constraints like these will ensure cohesive data and prevent errors. But in your case the underlying model does not suggest this to be the case.

Unique keys should follow business definitions; if the studentID is a "semi-natural" key (it has unique meaning that exists beyond your specific database), then that should suffice as your unique key.
If the studentID is simply an identity value that is assigned by the database as a row-number, then you probably need some other unique key to avoid entering the same student twice.

Primitive primary key with no relation to data domain is one of widely accepted best practices
( just imagine - one of your students decides to marry )
Another good practice (though from NoSql) world is to use GUID - this way keys are unique, and different datasets can be mixed in same table without collisions.
PS: you could save some storage space, but today it is cheap and there is no need to sacrifice good practices for it

Yes!
If you ever need to update or delete rows from the table, it is very advantageous to have something to uniquely identify each row in the table.
With your example, I don't think it's possible to guarantee no two students will share the same name. Even adding a date of birth still can't guarantee they'll always be unique. I'd recommend adding an auto incrementing INT or BIGINT as the primary key.
You can always add the Unique constraint as well and remove it if it becomes an issue.

A simple way to do it is use an auto-generated Guid (Globally Unique Identifier) to identify a student. It is "guarenteed" to be unique every time it is generated. Names can change (like when somebody gets married), but some auto generated value has no meaning so should never need to be changed.
http://en.wikipedia.org/wiki/Globally_unique_identifier

Your database constraints should be DBMS understood business rules. Is there a business rule that states that no two students may have the same first and last name combination? I presume not, therefore do not create a unique key for those two fields. Perhaps best not to presume, though, and ask a business domain expert e.g. the enrolment officer.
Note that a row in this table is a proposition I.e. that there exists a student enrolled with first name 'x' and last name 'y' and student ID 'z'. Clearly the DBMS has not concept of whether this proposition is true in the real world. What normally happens is that there will be a trusted source to verify data. The enterprise will authorize an officer (director etc) in this role. Let's say it is the enrolment officer who is responsible for verifying that 'x y' is a real person, that they are eligible to be enrolled, and the person is who they say they are. Typically, they will require sight of documents (certificates, passport, etc), take up references, interview the person, check public records, etc. Of course, the enrolment officer may delegate their responsibility to other members of staff or engage an agent.
At some point they will be satisfied and for convenience will issue they own identifier, the student ID. Mistakes do happen and it may turn out that this value is not unique, in which case it would be the enrolment officer's responsibility to resolve the problem and issue a new student to. Perhaps they will use software to generate the value to mitigate against such problems. The student ID will be issued to the student and will be used within the enterprise to identify the person for the convenience of all concerned. They may even be issued with a document (e.g. photo ID card) to assist in identification, based on the level of trust in a given context (e.g. may need to produce photo ID to sit an exam). If the student forgets their ID, loses their issued documents, etc then the enrolment office will be able to retrieve it from records e.g. with reference to copy documents taken during the verification process; they are unlikely to use first name and last name alone.
The point is, the trusted source for the identifier is the enrolment officer on behalf of the enterprise, rather than the database, the DBMS or any other kind of software involved in the process. Therefore, it probably is acceptable to make student ID the sole identifier for stents within the database. Consider, however, that an auto-increment column generated on one hardware build of a single DBMS within the enterprise is probably not suitable for the allocation of such significant identifier values.

unnecessary normalization

My friend and I are building a website and having a major disagreement. The core of the site is a database of comments about 'people.' Basically people can enter comment and they can enter the person the comment is about. Then viewers can search the database for words that are in the comment or parts of the person name. It is completely user generated. For example, if someone wants to post a comment on a mispelled version of a person's name, they can, and that's OK. So there may be multiple spellings of different people listed as several different entries (some with middle name, some with nickname, some mispelled, etc.), but this is all OK. We don't care if people make comments about random people or imaginary people.
Anyway, the issue is about how we are structuring the database. Right now it is just one table with the comment ID as the primary key, and then there is a field for the 'person' the comment is about:
comment ID - comment - person
1 - "he is weird" - John Smith
2 - "smelly girl" - Jenny
3 - "gay" - John Smith
4 - "owes me $20" - Jennyyyyyyyyy
Everything is working fine. Using the database, I am able to create pages that list all the 'comments' for a particular 'person.' However, he is obsessed that the database isn't normalized. I read up on normalization and learned that he was wrong. The table IS currently normalized, because the comment ID is unique and dictates the 'comment' and the 'person.' Now he is insistant that 'person' should have it's OWN table because it is a 'thing.' I don't think it is necessary, because even though 'person' really is the bigger container (one 'person' can have many 'comments' about them), the database seems to operate just fine with 'person' being an attribute of the comment ID. I use various PHP calls for different SQL selections to make it magically appear more sophisticated on the output and the different way the user can search and see results, but in reality, the set-up is quite simple. I am now letting users rank comments with thumbs up and thumbs down, and I keep a 'score' as another field on the same table.
I feel that there is currently no need to have a separate table for just unique 'person' entries because the 'persons' don't have their own 'score' or any of their own attributes. Only the comments do. My friend is so insistant that it is necessary for efficiency. Finally I said, "OK, if you want me to create a separate table and let 'person' be it's own field, then what would be the second field? Because if a table has just a single column, it seems pointless. I agree that we may later create a need to give 'person' it's own table, but we can deal with that then." He then said that strings can't be primary keys, and that we would convert the 'persons' in the current table to numbers, and the numbers would be the primary key in the new 'person' table. To me this seems unnecessary and it would make the current table harder to read. He also thinks it will be impossible to create the second table later, and that we need to anticipate now that we might need it for something later.
Who is right?

In my opinion your friend is right.
Person should live in a different table and you should try to normalize. Don't overdo-it, though.
In the long run you may want to do more things with your site, say you want to attach multiple files to a person (ie. pictures) you'll be very thankfull then for the normalization.

Creating a new table for person and using the key of that table in place of the person attribute has nothing to do with normalization. It may be a good idea for other reasons but doing so does not make the database "more normalized" than not doing it. So you are right: as far as normalization is concerned, creating another table is unnecessary.

I would vote for your friend. I like to normalize and plan for the future and even if you never need it, this normalization is so easy to do it literally takes no time. You can create a view that you query in order to make your SQL cleaner and eliminate the need for you to join the tables yourself.

If you have already reached all of your capabilities and have no plans for expansion of capabilities I think you leave it as it is.
If you plan to add more, namely allowing people to have accounts, or anything really, I think it might be smart to separate your data into Person, Comments tables. Its not hard and makes expanding your functionality easier.

You're right.
Person may be a thing in general, but not in your model. If you were going to hassle people into properly identifying the person they're talking about, a Person table would be necessary. For example, if the comments were only about persons already registered in the database.
But here it looks like you have an unstructured data, without identity; and that nothing/nobody is interested in making sure whether "jenny" and "jennyyy" are in fact the same person, not to mentionned "jenny doe", and "my cousin"...

Well, there are two schools of thought. One says, create your data model in the most normalized way possible, then de-normalize if you need more efficiency. The other is basically "do the minimum work necessary for the job, then change it as your requirements change". Also known as YAGNI (You aren't going to need it).
It all depends on where you see this going. If this is all it will be, then your approach is probably fine. If you intend to improve it with new features over time, then your friend is right.

If you never intend to associate the person column with a user or anything else and data apparently needs no consistency or data integrity checks, just why is this in a relational database at all? Wouldn't this be a use case for a nosql database? Or am I missing something?

Normalization is all about functional dependencies (FD's). You need to identify all of the
FD's that exist among the attributes of your data model before it can be fully normalized.
Lets review what you have:
Any given instance of a CommentId functionally determines the Person (FD: CommentId -> Person)
Any given instance of a CommentId functionally determines the Comment (FD: CommentId -> Comment)
Any given instance of a CommentId functionally determines the UserId (FD: CommentId -> UserId)
Any given instance of a CommentId functionally determines the Score (FD: CommentId -> Score)
Everything here is a dependant attribute on CommentId and
CommentId alone. This might lead you to the belief that a relation (table) containing all of, or a subset of, the
above attributes must be normalized.
First thing to ask yourself is why did you create the CommentId attribute anyway? Strictly speaking,
this is a manufactured attribute - it does not relate to anything 'real'. CommentId is
commonly referred to as a surrogate key. A surrogate key is just a made up value that stands in
for a unique value set corresponding to some other group of attributes. So what group of attributes is CommentId
a surrogate for? We can figure that
out by asking the following questions and adding new FD's to the model:
1) Does a Comment have to be unique? If so the FD: Comment -> CommentId must be true.
2) Can the same Comment be made multiple times as long as it is about a different Person? If so, then
FD: Person + Comment -> CommentId must be true and the FD in 1 above is false.
3) Can the same Comment be made multiple times about the same Person provided it was made by
different UserId's? If so, the FDs in 1 and 2 cannot be true but
FD: Person + Comment + UserId -> CommentId may be true.
4) Can the same Comment be made multiple times about the same Person by the same UserId but
have different Scores? This implies FD: Person + Comment + UserId' + Score -> CommentId is true and the others are false.
Exactly one of the above 4 FD's above must be true. Whichever it is affects how your data model is normalized.
Suppose FD: Person + Comment + UserId -> CommentId turns out to be true. The logical
consequences are that:
Person + Comment + UserId and CommentId serve as equivalent keys with respect to Score
Score should be put in a relation with one but not both of its keys (to avoid transitive dependencies).
The obvious choice would be CommentId since it was specifically created as a surrogate.
A relation comprised of: CommentId, Person, Comment, UserId is needed to tie the
Key to its surrogate.
From a theoretical point of view, the surrogate key CommentId is not
required to make your data model or database work. However, its presence may affect how relations are constructed.
Creation of surrogate keys is a practical issue of some importance.
Consider what might happen if you choose to not use a surrogate key but the full
attribute set Person + Comment + UserId in its place, especially if it was required
on multiple tables as a foreign or primary key:
Comment might add a lot of space overhead
to your database because it is repeated in multiple tables. It is probably more than a couple of characters long.
What happens if someone chooses to edit a Comment? That change needs to be propagated
to all tables where Comment is part of a key. Not a pretty sight!
Indexing long complex keys can take a lot of space and/or make for slow update performance
The value assigned to a surrogate key never changes, no matter what you do to the values
associated to the attributes that it determines. Updating the dependant attributes is now
limited to the one table defining the surrogate key. This is of huge practical significance.
Now back to whether you should be creating a surrogate for Person. Does Person live
on the left hand side of many, or any, FDs? If it does, its value will propogate through your
database and there is a case for creating a surrogate for it. Whether Person is a text or numeric attribute is irrelevant to the choice of creating a surrogate key.
Based on what you have said, there is at best a weak argument to create a
surrogate for Person. This argument is based on the suspicion that its value may at some point become a key or part of a key at some point in the future.

Here's the deal. Whenever you create something, you want to make sure that it has room to grow. You want to try to anticipate future projects and future advancements for your program. In this scenario, you're right in saying that there is no need currently to add a persons table that just holds 1 field (not counting the ID, assuming you have an int ID field and a person name). However, in the future, you may want to have other attributes for such people, like first name, last name, email address, date added, etc.
While over-normalizing is certainly harmful, I personally would create another, larger table to hold the person with additional fields so that I can easily add new features in the future.

Whenever you're dealing with users, there should be a dedicated table. Then you can just join the tables and refer to that user's ID.
user -> id | username | password | email
comment -> id | user_id | content
SQL to join the comments to the users:
SELECT user.username, comment.content FROM user JOIN comment WHERE user.id = comment.user_id;
It'll make it so much easier in the future when you want to find information about that specific user. The amount of extra effort is negligible.
Concerning the "score" for each comment, that should also be a separate table as well. That way you can connect a user to a "like" or "dislike."

With this database, you might feel that it is okay but there may be some problem in the future when you want the users to know more from the database.Suppose you want to know about the number of comments made on a person with the name='abc'.In this case ,you will have to go through the entire table of comments and keep counting.In place of this, you can have an attribute called 'count' for every person and increment it whenever a comment is made on that person.
As far as normalization is concerned,it is always better to have a normalized database because it reduces redundancy and makes the database intuitive to understand. If you are expecting that your database will go large in future then normalization must be present.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008