I am trying to create a simple registration program using VB.Net, with MySQL as its database. Here's my simple table for the basic information.
However, I am trying to improve my basic knowledge of table normalization, which is why I separated the Date field: I want to avoid inserting the same date repeatedly within one day. I mean, when 50 individuals register in one day, it should simply add a single date (record) to the tblRegDate table instead of adding it 50 times. Is there any way to do this? Is it possible in VB.Net and MySQL? Should I add or modify some field, or should I add a condition in VB.Net? The table above is what my friend taught me, but I discovered that it doesn't eliminate the redundancy. Kindly give me some instructions or direct me to a site with a simple tutorial for this. Thanks in advance!
Here is my MySQL code:
CREATE TABLE tblInfo(
Number INT AUTO_INCREMENT,
LastName VARCHAR(45),
FirstName VARCHAR(45),
MiddleName VARCHAR(45),
Gender ENUM('M','F'),
BirthDate DATE,
PRIMARY KEY(Number));
CREATE TABLE tblRegDate(
IDRegDate INT AUTO_INCREMENT,
Date TIMESTAMP,
Number INT,
PRIMARY KEY(IDRegDate),
FOREIGN KEY(Number) REFERENCES tblInfo(Number));
As I see it, in this case you gain no advantage from separating out a single field. You'll lose a lot of performance.
Table normalization isn't about never having a repeated value. It's more about "separating the concerns".
It is also important not to let complexity explode in your database. Separating out single fields would end up in a database no one is able to understand.
The question is: is there more information about a registration? For example web page, IP, ...
Then you should have two tables, for example "Person" and "Registration". You would then have two semantically different things, which shouldn't be mixed up.
There are a lot of examples and information you can find via Google and Wikipedia:
http://en.wikipedia.org/wiki/Database_normalization
Actually, it is not a good idea to separate the timestamp from the table.
You would need another table, e.g. timeTable. It would have two columns, id and timestamp, and you would reference this id in your tblRegDate table as a foreign key.
The foreign key is an integer and takes 4 bytes. A DATE, on the other hand, takes 3 bytes.
Therefore I would recommend keeping the date in your tblRegDate and not in an extra table.
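To make the size argument concrete, here is a rough sketch of what the split design would look like. The names tblDate, DateID and RegDay are made up for illustration, and this tblRegDate is a variant of the one in the question, not something the asker posted.

```sql
-- Hypothetical lookup table: one row per calendar day.
CREATE TABLE tblDate(
  DateID INT AUTO_INCREMENT,
  RegDay DATE NOT NULL,
  PRIMARY KEY(DateID),
  UNIQUE KEY(RegDay)          -- guarantees a day is stored only once
);

-- Variant of tblRegDate: each row stores a 4-byte INT that
-- points at a 3-byte DATE, so nothing is saved per registration.
CREATE TABLE tblRegDate(
  IDRegDate INT AUTO_INCREMENT,
  DateID INT NOT NULL,
  Number INT NOT NULL,
  PRIMARY KEY(IDRegDate),
  FOREIGN KEY(DateID) REFERENCES tblDate(DateID),
  FOREIGN KEY(Number) REFERENCES tblInfo(Number)
);

-- Registering person 1 today now takes two statements instead of one.
INSERT IGNORE INTO tblDate(RegDay) VALUES (CURRENT_DATE);
INSERT INTO tblRegDate(DateID, Number)
SELECT DateID, 1 FROM tblDate WHERE RegDay = CURRENT_DATE;
```

So the date is stored once, but every registration still carries a column of at least the same size, plus an extra join whenever you want to read the date back.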
When you normalize DB structures, always keep ACID in mind - http://en.wikipedia.org/wiki/ACID
Based on the fields you have, you should just keep it as a single table. Separating out the registration date is not a good design because you'll have to do a lookup every time. In real life, you can consider indexing the reg date if your app always searches or sorts by regdate.
And if you FK the RegDate table to the user table, that is not efficient either.
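As a minimal sketch of the single-table version, assuming a new column called RegDate and an index called idx_regdate (both names are mine, not from the question):

```sql
CREATE TABLE tblInfo(
  Number INT AUTO_INCREMENT,
  LastName VARCHAR(45),
  FirstName VARCHAR(45),
  MiddleName VARCHAR(45),
  Gender ENUM('M','F'),
  BirthDate DATE,
  RegDate DATE NOT NULL,        -- assumed column: the registration day
  PRIMARY KEY(Number),
  INDEX idx_regdate(RegDate)    -- helps searching and sorting by RegDate
);

-- Registrations per day, without any join.
SELECT RegDate, COUNT(*) AS Registrations
FROM tblInfo
GROUP BY RegDate
ORDER BY RegDate;
```

The same date does appear in 50 rows if 50 people register that day, but that is a repeated value, not the kind of redundancy normalization is meant to remove.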
P.S. Also keep in mind that there are several levels of DB normalization (normal forms). If you are new to DB design, you should consider learning how to move a DB design from 1st to 2nd, and from 2nd to 3rd normal form.
We rarely go to 4th normal form or beyond in real-life situations. Transaction systems usually stay at 3rd most of the time.
Hope that makes sense.
I have a list of records to store in a database table, but I'm having some difficulty designing the database. The following is the data to be stored:
The Class (rows) and the Day (columns) will continue to grow in the future. My initial idea has 2 designs.
The table design in the database would be exactly the same as the current table. But what happens if I want to add Day13? It would suffer in the future as the columns keep growing.
Add 1 column as result:
It looks like a better solution to the problem of the growing Day columns, but the problem is that it will keep a large number of records in the database, which makes queries slower as more and more data is inserted.
Any ideas or techniques on how to optimize the database design? Thank you.
So a Class can have a result and a date. Just make sure to have a unique primary key on your Class table and make the correct data types for your fields.
What I think you need is a ClassId as the primary key and a ClassName field (varchar) to store the class name. Don't write Day1; it should be stored as a date.
Maybe something simple like this.
One table for Classes (ClassID, ClassName)
One table for Day/Period/whatever you call it (DayID, DayName, DayInMonth etc..)
One table for Results(ResultId,DayId,ClassId, Result)
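A rough sketch of those three tables, just to illustrate. The column types, the DayDate column and the DECIMAL type for Result are assumptions on my part, since the question doesn't say what a result looks like.

```sql
CREATE TABLE Classes(
  ClassID INT AUTO_INCREMENT PRIMARY KEY,
  ClassName VARCHAR(50) NOT NULL UNIQUE
);

CREATE TABLE Days(
  DayID INT AUTO_INCREMENT PRIMARY KEY,
  DayDate DATE NOT NULL UNIQUE       -- a real date instead of "Day1"
);

CREATE TABLE Results(
  ResultID INT AUTO_INCREMENT PRIMARY KEY,
  DayID INT NOT NULL,
  ClassID INT NOT NULL,
  Result DECIMAL(10,2),              -- assumed type
  UNIQUE KEY(ClassID, DayID),        -- one result per class per day
  FOREIGN KEY(DayID) REFERENCES Days(DayID),
  FOREIGN KEY(ClassID) REFERENCES Classes(ClassID)
);

-- All results for one class, oldest first.
SELECT d.DayDate, r.Result
FROM Results r
JOIN Days d ON d.DayID = r.DayID
WHERE r.ClassID = 1
ORDER BY d.DayDate;
```

Adding a new day or a new class is then just an INSERT, and the unique key on (ClassID, DayID) doubles as the index that keeps lookups fast as the table grows.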
A friend and I are working on a database that stores information about cPanel hosting accounts, such as what settings, apps, and features each account is using.
Most of the fields are boolean, such as whether or not the account has any wordpress sites, any php 5.4 driven sites, any ruby on rails sites, etc...
A small number of fields are non-boolean data like disk usage in MB, hostname of the server the account resides on, and the username of the account, etc...
In my mind, it makes sense to store ALL this information in one single table.
So the table might have the following columns:
php54 boolean,
wordpress boolean,
ror boolean,
username varchar(8),
hostname varchar(20),
usage_mb int(9),
I figure that the primary key could be (username,hostname).
However, my friend has already set up the database with multiple tables that look like this:
Fact Table:
id int(11),
php54 boolean,
wordpress boolean,
ror boolean,
usage_mb int(9),
User Table:
id int(11),
factid int(11),
hostid int(11),
username varchar(8)
Hostname Table:
id int(11),
hostname varchar(20),
ip varchar(15),
Where each table's primary key is "id" and the user table references the hostname table and fact table using 'hostid' and 'factid' foreign keys (respectively).
I believe my friend's rationale behind multiple tables is to organize the data based on the type of data, despite all the data being related to one single, unique account.
My rationale is that all the data belongs to one unique account, so every single row is 1:1. Does it make sense to have multiple tables then?
I would think multiple tables would be sensible if a row in one table could reference multiple rows in another table... But in this case each row from each table can only be associated with one single row from any other table... so I think one table is fine.
Should this data be in multiple tables, or in one single table?
We're both sort of noobs figuring things out as we go.
At which point does it make sense to use multiple tables?
Currently it's really difficult to write an API that adds the data associated with one single account to three separate tables, as all the primary keys auto-increment, and other than that there isn't any key unique to the account that would make it easy to update existing data.
Sorry if none of this makes sense
In your case, I don't think having multiple tables with one-to-one relationships is the right way.
It is not forbidden, and in some cases it can be helpful (see "Is there ever a time where using a database 1:1 relationship makes sense?"), but you'll have to deal with unnecessary joins in your queries.
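For what it's worth, here is a sketch of the single-table layout with the (username, hostname) key proposed in the question. The table name accounts and the sample values are invented for the example.

```sql
CREATE TABLE accounts(
  username VARCHAR(8) NOT NULL,
  hostname VARCHAR(20) NOT NULL,
  php54 BOOLEAN NOT NULL DEFAULT FALSE,
  wordpress BOOLEAN NOT NULL DEFAULT FALSE,
  ror BOOLEAN NOT NULL DEFAULT FALSE,
  usage_mb INT UNSIGNED NOT NULL DEFAULT 0,
  PRIMARY KEY(username, hostname)    -- natural key from the question
);

-- Insert the account, or update it if the key already exists.
-- VALUES() in the UPDATE clause is deprecated in recent MySQL 8.0
-- releases (a row alias is preferred there) but is still accepted.
INSERT INTO accounts(username, hostname, php54, wordpress, ror, usage_mb)
VALUES ('alice', 'host1.example.com', TRUE, TRUE, FALSE, 512)
ON DUPLICATE KEY UPDATE
  php54 = VALUES(php54),
  wordpress = VALUES(wordpress),
  ror = VALUES(ror),
  usage_mb = VALUES(usage_mb);
```

With a key that is unique to the account, the API only needs this one statement per account, which should take care of the update problem described in the question.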
Ignoring ids, the way you find out what your CKs (candidate keys) are and whether you should decompose is the topic of normalization to higher NFs (normal forms). This formalizes your notion of "a row in one table can reference multiple rows in another" (among others). Guessing using common sense here, there's no particular need to decompose. Introducing ids not visible at the business level is always technically unnecessary but is done for its own practical/ergonomic reasons. For further explanation/justification, see information modeling & database design textbook chapters on design, CKs, NFs & surrogates; read some. Vague notions like "same type of data" are not helpful.
(TL;DR "At what point should I create a separate table?" is a basic question with a complex answer that requires learning some stuff.)
I had one single table that had lots of problems. I was saving comma-separated data in some fields, and afterwards I wasn't able to search it. Then, after searching the web and finding a lot of solutions, I decided to split it into several tables.
That one table became 5 tables.
The first table is called agendamentos_diarios; this is the table where I'm going to store the schedules.
The second table is called tecnicos, and it stores the technicians' names. Two fields: id (primary key) and the name (varchar).
The third table is called agendamento_tecnico. This is the link table where I'm going to store the ids from the first and the second table. That's because some schedules are going to be attended by one or more technicians.
The fourth table is called veiculos (vehicles). It has the id and the name of the vehicle (two fields).
The fifth table is the link between the first table and the vehicles table. Same thing: I'm going to store the schedule id and the vehicle id.
I have an image that can explain it better than I can in words.
Am I doing it correctly? Is there a better way of storing data to MySQL?
I agree with @Strawberry about the ids, but normally it is the Hibernate mapping type that does this. If you are not using Hibernate to design your tables, you should take the id out of agendamento_tecnico and agendamento_veiculos. That way you guarantee uniqueness. If you don't want to do that, create a unique key on the FK fields of those tables.
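If you keep the id columns, the unique key could look something like this. I can't see your exact column names, so id_agendamento, id_tecnico and id_veiculo are guesses, and the key names are arbitrary.

```sql
-- Assumed column names; adjust them to the real schema.
ALTER TABLE agendamento_tecnico
  ADD UNIQUE KEY uq_agendamento_tecnico (id_agendamento, id_tecnico);

ALTER TABLE agendamento_veiculos
  ADD UNIQUE KEY uq_agendamento_veiculo (id_agendamento, id_veiculo);
```

Either way, the same technician or vehicle can no longer be linked to the same schedule twice.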
I notice that you separated the vehicles table from your technicians. In your model the same vehicle can be in two different schedules at the same time (which doesn't make sense). It would be better if the vehicle were linked in the agendamento_tecnico table, which would become agendamento_tecnico_veiculo.
Looking at your table, I note (I'm Brazilian) that you have a column called "servico", which means service. Your schedule table is designed for only one service. What if the same schedule has more than one service? To solve this, you can create a services table and an m-n relationship with the schedule. It will be easier to create reports and to keep the services well separated in your database.
There is also a nome_cliente field, which holds the client for that schedule. It would be better to have a cliente (client) table and link the schedule to it with an FK.
As said before, there is no right answer. You have to think about your problem and how it might grow. Modeling a database properly will avoid a lot of headaches later.
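A sketch of the services and client suggestions, under the assumption that agendamentos_diarios has an INT primary key called id. The table and column names clientes, servicos, agendamento_servico, id_servico and id_cliente are my own invention.

```sql
CREATE TABLE clientes(
  id INT AUTO_INCREMENT PRIMARY KEY,
  nome VARCHAR(100) NOT NULL
);

CREATE TABLE servicos(
  id INT AUTO_INCREMENT PRIMARY KEY,
  nome VARCHAR(100) NOT NULL
);

-- m-n link: one schedule can have many services and vice versa.
CREATE TABLE agendamento_servico(
  id_agendamento INT NOT NULL,
  id_servico INT NOT NULL,
  PRIMARY KEY(id_agendamento, id_servico),
  FOREIGN KEY(id_agendamento) REFERENCES agendamentos_diarios(id),
  FOREIGN KEY(id_servico) REFERENCES servicos(id)
);

-- The schedule points at a client row instead of repeating nome_cliente.
ALTER TABLE agendamentos_diarios
  ADD COLUMN id_cliente INT NULL,
  ADD FOREIGN KEY(id_cliente) REFERENCES clientes(id);
```

Once the existing names have been copied into clientes and the new column is filled in, the old servico and nome_cliente columns could be dropped.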
Better is subjective, there's no right answer.
My natural instinct would be to break that schedule table up even more.
Looks like data about the technician and the client is duplicated.
Then again, you might have made a decision to de-normalise for perfectly valid reasons.
Doubt you'll find anyone on here who disagrees with you not having comma separated fields though.
Where you call a halt to the changes depends on your circumstances now. Comma-separated fields caused you an issue, so you got rid of them. So which part of where you are now is causing you an issue?
Looks OK, especially for a first try.
One comment: I would name PKs/FKs (ids) the same in all tables and not use 'id' as the name. (Additionally, we use '#' or '_' as the end character of primary/foreign keys. Example: technicos.technico_, and agendamento_tecnico has the fields agend_tech_ and technico_.) But this is not a common convention. It makes queries a bit more complex (because you must fully qualify the fields), but it makes the database schema more readable (you know at a glance which PK belongs to which FK).
Another comment: the two associative (I never wrote that word before!) tables, joining technos and agendamento_tecnico, have their own ID field, but they do not need it, because the two (primary/unique) keys of the tables they join are unique themselves, so you can use them together as the PK for these tables, like:
CREATE TABLE agendamento_tecnico (
technico_ int not null,
agend_tech_ int not null,
primary key(technico_,agend_tech_)
);
I need to create a table for students with IDs starting with either M or I, followed by 7 digits. I thought about creating a table with one field for the letter (M or I) and a second field for the 7 digits. Then I would use those to create a composite primary key. But I don't think that's what I'm looking for. I would like one column for the student ID.
This is what I have so far:
create table student(
student_id_first ENUM('M', 'I')
, student_id_digits char(8) not null unique
, first_name varchar(50)
, last_name varchar(50)
, Primary Key(student_id_first, student_id_digits)
) ;
Thoughts?
Thanks.
You should not be putting data validation in your database layer. You're also creating a compound key which adds additional, unnecessary overhead. If at some point you need to adjust or relax the rules, you need to re-define the database schema.
This also introduces pointless complexity when retrieving data and having to assemble the actual student identifier. Unless you have a very good reason for splitting them, keep them together.
Just use one column and enforce what goes in there in your application. Even a very basic ORM will give you the ability to do this.
Your simplest approach here is to define a field id char(8) and make it the primary key.
The question that opens up (the question your teacher may ask) is "what happens if some rogue software client inserts an id like 'K1234567', 'DEADBEEF', or 'I123' that doesn't comply with your business rule for id values?"
Potentially valid answers to this question for you to give to your teacher are:
if we were in production we would test the software clients to prevent this happening.
we can run a daily production purge process to get rid of rows with malformed id numbers.
what the heck do you want? this is a week's homework for a two credit hour class! I already gave you a schema that's just as good as most payroll systems!
Seriously, it does not sound from your original question that your teacher is insisting that you design a dbms schema that enforces this particular business rule. But only you know if that is the case.
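For illustration, the one-column version plus a query that a cleanup job could use to find malformed ids might look roughly like this. The column name student_id is my choice, and the pattern is only my reading of the rule "M or I followed by 7 digits".

```sql
CREATE TABLE student(
  student_id CHAR(8) NOT NULL,    -- letter plus 7 digits in one column
  first_name VARCHAR(50),
  last_name VARCHAR(50),
  PRIMARY KEY(student_id)
);

-- Rows whose id does not follow the rule.
-- Note: with a case-insensitive collation, REGEXP also accepts
-- lowercase 'm' or 'i' as the first character.
SELECT student_id
FROM student
WHERE student_id NOT REGEXP '^[MI][0-9]{7}$';
```

Ids like 'K1234567', 'DEADBEEF' or 'I123' would all show up in that result, so they can be reviewed or purged.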
I'm currently working on a blog for a college news organization. Each post, though, will represent a full show, with multiple contributors and multiple titles.
For example, a post might have three news stories, each with its own title and some contributors for each:
"Story 1" by (id1) and (id2)
"Story 2" by (id3)
"Story 3" by (id4) and (id5)
So for each post, there would be an index (1, 2, 3...) for each individual story, a VARCHAR for the title, and ids that represent contributors, whose details are stored in another "contributors" table. The problem is that I don't know how many stories there will be, or how many contributors there will be per story. It could range from ~3 at the least to up to 6. In case our show expands in the future, I'd like to have the capability to scale up to even more than 6 posts, too.
I want to represent this structure concisely in a MySQL column, but I'm not sure how to do that. One solution would be to create another MySQL table to save the details for each individual story, but I'd prefer to avoid that hassle. The ideal solution would be if I could somehow create an "array" within a MySQL column, which could store (for each story) an index, a string, and multiple ids to show who the contributors are.
Is this possible, or will I have to create a new table to keep track of each story?
Don't use a column - use a table. It can be a simple InnoDB table which doesn't really hurt performance at all. Define a combined primary key (story_id, contributor_id) and insert all contributions in that table.
What you describe in your question is an M:N relationship squeezed into a single column. Don't ever go there - it's a very bad thing to do and is, in fact, nearly impossible to work with in relational databases.
Save yourself some future heartburn. Create the extra table. It looks like a table of [Posts] with a one-to-many relationship to [Stories] where [Stories] has a many-to-many relationship to [Contributors].
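Roughly, that structure could be sketched like this. All table and column names here are assumptions (including the layout of the contributors table, which the question mentions but doesn't show).

```sql
CREATE TABLE posts(
  post_id INT AUTO_INCREMENT PRIMARY KEY,
  posted_on DATE NOT NULL
);

CREATE TABLE contributors(
  contributor_id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
);

-- One post has many stories; story_index is the 1, 2, 3... position.
CREATE TABLE stories(
  story_id INT AUTO_INCREMENT PRIMARY KEY,
  post_id INT NOT NULL,
  story_index INT NOT NULL,
  title VARCHAR(255) NOT NULL,
  UNIQUE KEY(post_id, story_index),
  FOREIGN KEY(post_id) REFERENCES posts(post_id)
);

-- Many-to-many link between stories and contributors.
CREATE TABLE story_contributors(
  story_id INT NOT NULL,
  contributor_id INT NOT NULL,
  PRIMARY KEY(story_id, contributor_id),
  FOREIGN KEY(story_id) REFERENCES stories(story_id),
  FOREIGN KEY(contributor_id) REFERENCES contributors(contributor_id)
);

-- One row per story of post 1, with its contributors joined into a string.
SELECT s.story_index, s.title,
       GROUP_CONCAT(c.name ORDER BY c.name SEPARATOR ', ') AS contributor_names
FROM stories s
JOIN story_contributors sc ON sc.story_id = s.story_id
JOIN contributors c ON c.contributor_id = sc.contributor_id
WHERE s.post_id = 1
GROUP BY s.story_id, s.story_index, s.title
ORDER BY s.story_index;
```

The number of stories per post and contributors per story is then unlimited, and the "array" you wanted is produced at query time instead of being stored.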
You could store a comma-delimited string of contributor ids or story ids in one column, but how, exactly, would you relate them? Your best bet in that case would seem to be an 'array' of 'arrays', where your main string consists of pairs of strings strung together with commas. I (so it's just my opinion, okay?) would avoid using this unless totally necessary (and I can't think of one instance at this time)...
So create your relationships tables. Just to illustrate one approach to the idea:
-- a story may have multiple contributors
CREATE TABLE story_contributor_rel (
story_id INT NOT NULL
, contributor_id INT NOT NULL
);
-- a post may have multiple stories
CREATE TABLE post_story_rel (
post_id INT NOT NULL
, story_id INT NOT NULL
);
Or cheat it a bit, but I'd recommend against this also(!):
-- a less-normalized way
CREATE TABLE post_relationships (
post_id INT NOT NULL
, story_id INT NOT NULL
, contributor_id INT NOT NULL
);
These are just the simplest approaches. Naturally, you'd want to have additional identity columns and/or proper indexing and primary key settings, but this is just the clearest way I can illustrate the point I'm driving at.
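As one possible way to add those keys to the two relationship tables (the index name idx_contributor is arbitrary):

```sql
-- Each (story, contributor) pair may appear only once; the extra
-- index speeds up "which stories did this contributor work on?".
ALTER TABLE story_contributor_rel
  ADD PRIMARY KEY (story_id, contributor_id),
  ADD INDEX idx_contributor (contributor_id);

-- Each (post, story) pair may appear only once.
ALTER TABLE post_story_rel
  ADD PRIMARY KEY (post_id, story_id);
```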
Imagine this too: if you were to put all those relationships into encoded columns, then without the application it would not be easy for anyone to understand what's going on in your tables. If you don't put any logic into the table structures and you track relationships properly (meaning relationship tables), then it becomes transparent. One look at these tables and it would not take long to understand.
That's just my opinion. :) Cheers!