I started studying databases a week ago, and I have focused on the relational model. I am pretty sure this is a dumb question, but is this relation valid? I suppose it isn't, but the tuples are not duplicated, which makes me doubt that. Again, forgive my ignorance.
-------------------------------
|Name | Number | Location |
--------------------------------
| Mike | 123 | New York |
--------------------------------
| Mike | 564 | New York |
-------------------------------
The set of tuples as presented satisfies 1NF trivially. The only ways to present the same data so that it violates 1NF are like so (violates atomicity):
-------------------------------
| Name | Number | Location |
--------------------------------
| Mike | 123,564 | New York |
-------------------------------
Or like so (contains repeating groups):
-------------------------------------------
| Name | NumberA | NumberB | Location |
--------------------------------------------
| Mike | 123 | 564 | New York |
-------------------------------------------
It's not really possible to go beyond that. What's Number? The address number? The number of entities with Name of Mike? A unique identifier? What is Location? Does it relate to Name, Number, or both? If there's no unique key set of fields on the table, it technically violates 1NF since the table could allow duplicate rows.
Beyond that, the terms "valid" and "invalid" aren't really defined terms for use with relational algebra. The phrase commonly used is "violates normal form". The only truly invalid relation is making one where one does not exist, like, say, a relation between the weight of an Oreo cookie and the number of stars in a given photograph.
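To make the point about keys concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes, purely for illustration, that (name, number) is the key of the relation; the table name contact is hypothetical, and the question does not say what the real key is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: (name, number) is ASSUMED to be the key.
conn.execute("""
    CREATE TABLE contact (
        name     TEXT    NOT NULL,
        number   INTEGER NOT NULL,
        location TEXT    NOT NULL,
        PRIMARY KEY (name, number)
    )
""")

# The two tuples from the question: one atomic value per column.
rows = [("Mike", 123, "New York"), ("Mike", 564, "New York")]
conn.executemany("INSERT INTO contact VALUES (?, ?, ?)", rows)


def try_insert(row):
    """Return True if the row was stored, False if the key rejected it."""
    try:
        conn.execute("INSERT INTO contact VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Re-inserting an existing tuple is refused because the key already exists.
duplicate_accepted = try_insert(("Mike", 123, "New York"))
row_count = conn.execute("SELECT COUNT(*) FROM contact").fetchone()[0]
```

With a declared key the table cannot hold duplicate rows, which is the property the answer says is missing when no unique set of fields exists.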
I'm very new to Access and my teacher is... hard to follow. So I feel like there's something pretty basic I'm probably missing here. I think the biggest problem I'm having with this question is that I'm struggling to find the words to communicate what I actually need to do, which is really putting a damper on my google-fu.
In terms of what I think I want to do, I want to make a record reference another table in its entirety.
Main
+----+-------+--------+-------+----------------------------+
| PK | Name | Phone# | [...] | Cards |
+----+-------+--------+-------+----------------------------+
| 1 | Bob | [...] | [...] | < Reference to 2nd table > |
| 2 | Harry | [...] | [...] | [...] |
| 3 | Ted | [...] | [...] | [...] |
+----+-------+--------+-------+----------------------------+
Bob's Cards
+----+-------------+-----------+-------+-------+-------+
| PK | Card Name | Condition | Year | Price | [...] |
+----+-------------+-----------+-------+-------+-------+
| 1 | Big Slugger | Mint | 1987 | .20 | [...] |
| 2 | Quick Pete | [...] | [...] | [...] | [...] |
| 3 | Mac Donald | [...] | [...] | [...] | [...] |
+----+-------------+-----------+-------+-------+-------+
This would necessitate an entire new table for each record in the main table though, if it's even possible.
But the only alternative solution I can think of is to add 'Card1, Condition1, [...], Card2, Condition2, [...], Card3, [...]' fields to the main table and then add another set of fields any time someone increases the maximum number of cards stored.
So I'm sort of left believing there is some other approach I should be taking that our teacher has failed to properly explain. We haven't even touched on forms and reports yet so I don't need to worry about working them in.
Any pointers?
(Also, the entirety of this data and structure is only a rough facsimile of my own, as I'd rather learn how to do it and apply it myself than be like 'here's my data, pls fix.')
Third option successfully found in comments by the helpful Minty.
This depends on a number of things, however to keep it simple you
would normally add one field to the cards table, with a number data
type, called CardOwnerID. In your example it would be 1, indicating Bob.
This is known as a foreign key (FK). However, if you have a table of
cards and multiple possible owners then you need a third table: a
junction table. This would consist of the Main Person ID and the Card
ID. – Minty
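The foreign key Minty describes could look roughly like the sketch below. It uses Python's built-in sqlite3 rather than Access, so the column types are only an approximation; the snake_case names (card_owner_id, card_condition) are my own spelling of the fields in the question, and the values shown as [...] in the question are stored as NULL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
    CREATE TABLE main (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );

    -- One cards table for everybody; card_owner_id says whose card it is.
    CREATE TABLE cards (
        id             INTEGER PRIMARY KEY,
        card_name      TEXT NOT NULL,
        card_condition TEXT,
        year           INTEGER,
        price          REAL,
        card_owner_id  INTEGER NOT NULL REFERENCES main(id)
    );

    INSERT INTO main (id, name) VALUES (1, 'Bob'), (2, 'Harry'), (3, 'Ted');

    INSERT INTO cards (card_name, card_condition, year, price, card_owner_id) VALUES
        ('Big Slugger', 'Mint', 1987, 0.20, 1),
        ('Quick Pete',  NULL,   NULL, NULL, 1),
        ('Mac Donald',  NULL,   NULL, NULL, 1);
""")

# "Bob's Cards" is now just a filter on the shared table, not a table of its own.
bobs_cards = [
    name
    for (name,) in conn.execute(
        "SELECT card_name FROM cards WHERE card_owner_id = ? ORDER BY id", (1,)
    )
]
```

Harry and Ted get cards by inserting more rows with their own IDs in card_owner_id, so no new table or extra columns are needed as collections grow.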
I think I'm facing a problem that many of you have faced before. I have a registration form where a user can pick any language on the planet and then pick his skill level for the respective language from a selectbox.
So, for example:
Language1: German
Skill: Fluent
Language2: English
Skill: Basic
I'm thinking what's the best way to store these values in a MySQL database.
I thought of two ways.
First way: creating a column for each language and assigning a skill value to it.
--------------------------------------------------
| UserID | language_en | language_ge |
--------------------------------------------------
| 22 | 1 | 4 |
--------------------------------------------------
| 23 | 3 | 4 |
--------------------------------------------------
So the language is always the column's name and the number represents the skill level (1. Basic, 2. Average ... )
I believe this is a nice way to work with these things and it is also pretty fast. The problem starts when there are 50 languages or more. It doesn't sound like a good idea to make 50 columns where the script always has to check them all to see if a user has any skill in that language.
Second way: inserting an array in one of the table's column. The table will look like this:
----------------------------------
| UserID | languages |
----------------------------------
| 22 | "ge"=>"4", "en"=>"1" |
----------------------------------
This way the user with ID 22 has skill level 4 for German and skill level 1 for English. This is fine because we don't need to check 50 additional columns (or even more), but it's not the right way in my eyes anyway.
We would have to parse a lot of results to find, for example, a user who has level 1 for German and level 2 for Spanish, regardless of the English skill level. That will take the server longer, and when the data grows we are in trouble.
I bet many of you have experienced this kind of issue. Please, can someone advise me how to sort this out?
Thanks a lot.
I'd advise you to have a separate table with all the languages:
Table: Language
+------------+-------------------+--------------+
| LanguageID | LanguageNameShort | LanguageName |
+------------+-------------------+--------------+
| 1 | en | English |
| 2 | de | German |
+------------+-------------------+--------------+
And another table to link the users to the languages:
Table: LanguageLink
+--------+------------+--------------+
| UserID | LanguageID | SkillLevelID |
+--------+------------+--------------+
| 22 | 1 | 1 |
| 22 | 2 | 4 |
| 23 | 1 | 3 |
| 23 | 2 | 4 |
+--------+------------+--------------+
This is the normalised way to represent that kind of relation in a DB. All data is easily searchable and you don't have to change the DB schema if you add a language.
To render a user's languages you could use a query like this. It will give you one row per language a user speaks:
SELECT
    LanguageLink.UserID,
    LanguageLink.SkillLevelID,
    Language.LanguageNameShort
FROM
    LanguageLink
    JOIN Language ON LanguageLink.LanguageID = Language.LanguageID
WHERE
    LanguageLink.UserID = 22
If you want to go further, you could create another table for the skill level:
Table: Skill
+--------------+-----------+
| SkillLevelID | SkillName |
+--------------+-----------+
| 1 | bad |
| 2 | mediocre |
| 3 | good |
| 4 | perfect |
+--------------+-----------+
What I've done here is called database normalization. I'd recommend reading about it, as it may help you design further databases.
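As a rough illustration of why this layout is easy to search, here is a sketch in Python with the built-in sqlite3 module (the question uses MySQL, where the SQL would be nearly identical). The helper users_with and the composite primary key on LanguageLink are my own assumptions, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Language (
        LanguageID        INTEGER PRIMARY KEY,
        LanguageNameShort TEXT NOT NULL UNIQUE,
        LanguageName      TEXT NOT NULL
    );
    CREATE TABLE LanguageLink (
        UserID       INTEGER NOT NULL,
        LanguageID   INTEGER NOT NULL REFERENCES Language(LanguageID),
        SkillLevelID INTEGER NOT NULL,
        PRIMARY KEY (UserID, LanguageID)  -- assumed: one level per user and language
    );
    INSERT INTO Language VALUES (1, 'en', 'English'), (2, 'de', 'German');
    INSERT INTO LanguageLink VALUES (22, 1, 1), (22, 2, 4), (23, 1, 3), (23, 2, 4);
""")


def users_with(skills):
    """Return IDs of users who have every (language code -> level) pair in skills.

    Hypothetical helper: languages not mentioned in skills are ignored.
    """
    pairs = list(skills.items())
    if not pairs:
        return []
    condition = " OR ".join(
        "(l.LanguageNameShort = ? AND ll.SkillLevelID = ?)" for _ in pairs
    )
    params = [value for pair in pairs for value in pair]
    query = f"""
        SELECT ll.UserID
        FROM LanguageLink AS ll
        JOIN Language AS l ON l.LanguageID = ll.LanguageID
        WHERE {condition}
        GROUP BY ll.UserID
        HAVING COUNT(*) = ?
        ORDER BY ll.UserID
    """
    return [row[0] for row in conn.execute(query, params + [len(pairs)])]


german_4 = users_with({"de": 4})
german_4_english_1 = users_with({"de": 4, "en": 1})
```

The search never touches languages it was not asked about, and adding a 51st language is one INSERT into Language rather than a schema change.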
Our company is developing a CRM and we have now come to the point where we have to decide how we want to handle relationships. This is an important point because there are going to be tons of them, and changing the structure later would simply not be cool.
I know 3 ways we could do it:
One relationship table:
The way I would do this is to create one table holding all the relationships.
Table: releationships
+----+-------------+-----------+--------------+------------+
| id | record_type | record_id | belongs_type | belongs_id |
+----+-------------+-----------+--------------+------------+
| 1 | person | 42 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 2 | person | 43 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 3 | note | 23 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 4 | attachment | 13 | company | 12 |
+----+-------------+-----------+--------------+------------+
Multiple relationship tables:
I think this is how SugarCRM, for example, does it.
Table: company_realationships
+----+-----------+------------+--------+
| id | record_id | has_type | has_id |
+----+-----------+------------+--------+
| 1 | 12 | person | 42 |
+----+-----------+------------+--------+
| 2 | 12 | person | 43 |
+----+-----------+------------+--------+
| 3 | 12 | note | 23 |
+----+-----------+------------+--------+
| 4 | 12 | attachment | 13 |
+----+-----------+------------+--------+
All in the record table:
Table: person
+----+-----------+------------+
| id | name | company_id |
+----+-----------+------------+
| 42 | luke | 12 |
+----+-----------+------------+
| 43 | other guy | 12 |
+----+-----------+------------+
etc.
So my question is: which is the best way of handling lots of relationships?
Are there other ways to do it?
What are disadvantages / advantages?
Is there a special way that high-traffic sites handle their relationships?
Thanks for your help guys :)
So my question is: which is the best way of handling lots of relationships?
The third one or the variation of it (see below).
Every "M:N" relationship should be represented by its own junction table. OTOH, a "1:N" relationship doesn't need an additional table, just a proper foreign key in the table on the "N" side.
If I understand your description correctly, the third option models a 1:N relationship between company and person. If by any chance you wanted to model an M:N relationship between them, you'd have a junction table: company_person ( company_id, person_id, PK (company_id, person_id) ).
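The company_person junction table could be sketched as follows, using Python's built-in sqlite3 so it runs on its own. The company names and the second company (13) are made up for illustration; only the IDs 12, 42 and 43 come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- M:N: one row per (company, person) pair, each pair at most once.
    CREATE TABLE company_person (
        company_id INTEGER NOT NULL REFERENCES company(id),
        person_id  INTEGER NOT NULL REFERENCES person(id),
        PRIMARY KEY (company_id, person_id)
    );

    INSERT INTO company VALUES (12, 'Company 12'), (13, 'Company 13');  -- hypothetical names
    INSERT INTO person  VALUES (42, 'luke'), (43, 'other guy');
    INSERT INTO company_person VALUES (12, 42), (12, 43), (13, 42);
""")

# A person can now belong to several companies...
companies_of_luke = [
    row[0]
    for row in conn.execute(
        "SELECT company_id FROM company_person WHERE person_id = ? ORDER BY company_id",
        (42,),
    )
]

# ...and a company can still list all of its people.
people_of_12 = [
    row[0]
    for row in conn.execute(
        """
        SELECT p.name
        FROM company_person AS cp
        JOIN person AS p ON p.id = cp.person_id
        WHERE cp.company_id = ?
        ORDER BY p.id
        """,
        (12,),
    )
]
```

If a person can only ever belong to one company, the company_id column on person from option 3 is enough and the junction table is unnecessary.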
Are there other ways to do it?
Sometimes, inheritance (aka. category, subtype, generalization hierarchy etc.) can be used to lower the number of possible "relatable" combinations. In a nutshell, make a relationship to a parent, then every child inherited from that parent is automatically involved in that relationship.
For an example, take a look at this post.
What are disadvantages / advantages?
Enforcing constraints (including FKs) declaratively is better (less prone to errors and probably more performant) than enforcing them through triggers, which is again better than enforcing them in the client code.
Choose a design that better adheres to that principle. For example, your options 1 and 2 don't allow the DBMS to enforce FKs declaratively.
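A small sketch of that difference, again in Python with sqlite3 as a stand-in for whatever DBMS the CRM uses: option 3 lets the database reject a person pointing at a non-existent company, while the generic type/id table from option 1 accepts the same dangling reference. The helper name accepted and the ID 999 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY);

    -- Option 3: a real foreign key that the DBMS can check.
    CREATE TABLE person (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        company_id INTEGER NOT NULL REFERENCES company(id)
    );

    -- Option 1: type/id pairs. belongs_id may point into any table,
    -- so no FOREIGN KEY constraint can be declared on it.
    CREATE TABLE releationships (
        id           INTEGER PRIMARY KEY,
        record_type  TEXT NOT NULL,
        record_id    INTEGER NOT NULL,
        belongs_type TEXT NOT NULL,
        belongs_id   INTEGER NOT NULL
    );

    INSERT INTO company (id) VALUES (12);
""")


def accepted(sql, params):
    """Return True if the statement succeeded, False if a constraint rejected it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Company 999 does not exist.
fk_accepts_orphan = accepted(
    "INSERT INTO person (id, name, company_id) VALUES (?, ?, ?)",
    (44, "orphan", 999),
)
generic_accepts_orphan = accepted(
    "INSERT INTO releationships (record_type, record_id, belongs_type, belongs_id) "
    "VALUES (?, ?, ?, ?)",
    ("person", 44, "company", 999),
)
```

With the generic table, catching the bad reference is left to triggers or client code, which is exactly the ordering of preferences described above.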
Is there a special way that high-traffic sites handle their relationships?
Good logical design followed by good physical implementation is the only solid basis for good performance. It's hard to "bolt-on" the performance on top of a bad design.
Perhaps, you'd like to take a look at:
ERwin Methods Guide
Use The Index, Luke!
And when it comes to performance, don't guess! Measure on realistic amounts of data.
I'm building a website where each user has different classes
+----+-----------+---------+
| id | subject | user_id |
+----+-----------+---------+
| 1 | Math 140 | 2 |
| 2 | ART 240 | 2 |
+----+-----------+---------+
Each class then will have bunch of Homework files, Class-Papers files and so on.
And here I need your help. What will be the better approach: Build one table like this:
+----+-----------+--------------------------------------------------+--------------+
| id | subject   | Homework                                         | Class-Papers |
+----+-----------+--------------------------------------------------+--------------+
| 1  | Math 140  | www.example.com/subjects/Math+140/file_name.pdf  | bla-bla      |
| 2  | Math 140  | www.example.com/subjects/Math+140/file_name.pdf  | bla-bla      |
| 3  | Math 140  | www.example.com/subjects/Math+140/file_name.pdf  | bla-bla      |
| 4  | ART 240   | www.example.com/subjects/ART +240/file_name.pdf  | bla-bla      |
| 5  | ART 240   | www.example.com/subjects/ART +240/file_name.pdf  | bla-bla      |
+----+-----------+--------------------------------------------------+--------------+
And then just separate the content when I want to display it,
OR build a table for every single subject and then just load the necessary table?
Or if you can suggest something better or more common/useful/efficient please go ahead.
You should read about normalization and relational design before attempting this.
This is a one-to-many relationship - model it as such.
A table for every subject is crazy. You'll have to add a new table for every subject.
A better solution will make it possible to add new subjects simply by adding data. That's what the relational model is all about.
Don't worry about tables; think about it in natural language first.
A SUBJECT(calculus) can have many COURSES(differential, integral, multi-variable).
A COURSE(differential calculus) can have many SECTIONs (Mon 9-10 am in room 2 of the math building).
A STUDENT(first name, last name, student id) can sign up for zero or more SECTIONs. The list of SECTIONs for a given STUDENT is a TRANSCRIPT.
Each STUDENT has one TRANSCRIPT per semester (fall 2012).
A SECTION can have zero or more ASSIGNMENTs.
These are the tables you'll need for this simple problem. Worry about the names and how they relate before you start writing SQL. You'll be glad you did.
I most definitely would NOT create a separate table for each subject. If you did, then if you wanted a query like "list all the homework for student X", you would have to access different tables depending on which subjects that student was enrolled in. Worse, anytime someone added a new subject, you would have to create a new table. If down the road you decide you need a new attribute of homework, instead of updating one table, you would have to update every one of these subject tables. It's just bad news all around.
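For the simpler two-table version of the question (classes and their files), a one-to-many layout might look like the sketch below, written with Python's built-in sqlite3. The table name class_file, the kind column and the file names are all hypothetical; the point is only that "all homework for a user" becomes one query and a new subject is one new row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
    CREATE TABLE class (
        id      INTEGER PRIMARY KEY,
        subject TEXT NOT NULL,
        user_id INTEGER NOT NULL
    );

    -- One row per file; kind says whether it is homework or a class paper.
    CREATE TABLE class_file (
        id       INTEGER PRIMARY KEY,
        class_id INTEGER NOT NULL REFERENCES class(id),
        kind     TEXT NOT NULL CHECK (kind IN ('homework', 'class-paper')),
        url      TEXT NOT NULL
    );

    INSERT INTO class VALUES (1, 'Math 140', 2), (2, 'ART 240', 2);

    INSERT INTO class_file (class_id, kind, url) VALUES
        (1, 'homework',    'www.example.com/subjects/Math+140/hw1.pdf'),
        (1, 'homework',    'www.example.com/subjects/Math+140/hw2.pdf'),
        (1, 'class-paper', 'www.example.com/subjects/Math+140/notes1.pdf'),
        (2, 'homework',    'www.example.com/subjects/ART+240/hw1.pdf');
""")

# "List all the homework for user 2" is a single query across every subject.
homework_for_user_2 = conn.execute(
    """
    SELECT c.subject, f.url
    FROM class_file AS f
    JOIN class AS c ON c.id = f.class_id
    WHERE c.user_id = ? AND f.kind = 'homework'
    ORDER BY f.id
    """,
    (2,),
).fetchall()

# Adding a subject is just data; the number of tables does not change.
conn.execute("INSERT INTO class VALUES (3, 'Physics 101', 2)")
table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
).fetchone()[0]
```

A fuller model along the SUBJECT / COURSE / SECTION / ASSIGNMENT lines described above would follow the same pattern, with one foreign key per "has many" sentence.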
I wish to save some semantic information about data in a table. How can I save this information in MySQL, such that I can access the data and also search for the articles using the semantic data?
For example, I have an article about Apple and Microsoft. The semantic data will be like
Person : Steve Jobs
Person : Steve Ballmer
Company : Apple
Company : Microsoft
I want to save the information without losing the info that Steve Jobs and Steve Ballmer are persons and Apple and Microsoft are companies. I also want to search for articles about Steve Jobs / Apple.
Person and Company are not the only possible types, hence adding new fields is not viable. Since the type of the data is to be saved, I cannot use FullText field type directly.
Update - These are two options that I am considering.
1. Save the data in a full text column as a serialized PHP array.
2. Create another table with 3 columns:
--------------------------------
| id | subject | object |
--------------------------------
| 1 | Person | Steve Ballmer |
| 1 | Person | Steve Jobs |
| 1 | Company | Microsoft |
| 1 | Company | Apple |
| 2 | Person | Obama |
| 2 | Country | US |
--------------------------------
You're working on a hard and interesting problem! You may get some interesting ideas from looking at the Dublin Core Metadata Initiative.
http://dublincore.org/metadata-basics/
To make it simple, think of your metadata items as all fitting in one table.
e.g.
Ballmer employed-by Microsoft
Ballmer is-a Person
Microsoft is-a Organization
Microsoft run-by Ballmer
SoftImage acquired-by Microsoft
SoftImage is-a Organization
Joel Spolsky is-a Person
Joel Spolsky formerly-employed-by Microsoft
Spolsky, Joel dreamed-up StackOverflow
StackOverflow is-a Website
Socrates is-a Person
Socrates died-on (some date)
The trick here is that some, but not all, of your first and third column values need to be BOTH arbitrary text AND serve as indexes into the first and third columns. Then, if you're trying to figure out what your database has on Spolsky, you can full-text search your first and third columns for his name. You'll get out a bunch of triplets. The values you find will tell you a lot. If you want to know more, you can search again.
To pull this off you'll probably need to have five columns, as follows:
Full text subject (whatever your user puts in)
Canonical subject (what your user puts in, massaged into a standard form)
Relation (is-a etc)
Full text object
Canonical object
The point of the canonical forms of your subject and object is to allow queries like this to work even if your user puts in "Joel Spolsky" in one place and "Spolsky, Joel" in another while meaning the same person.
SELECT *
FROM relationships a
JOIN relationships b ON (a.canonical_object = b.canonical_subject)
WHERE MATCH (a.subject, a.object) AGAINST ('Spolsky')
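Here is a runnable sketch of the five-column idea, using Python's built-in sqlite3. Two things are assumptions on my part: the canonical form is simply a lower-cased "first last" string, and a plain LIKE stands in for MySQL's MATCH ... AGAINST, which needs a FULLTEXT index that SQLite does not have in this form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        subject           TEXT NOT NULL,  -- as typed by the user
        canonical_subject TEXT NOT NULL,  -- massaged into a standard form
        relation          TEXT NOT NULL,
        object            TEXT NOT NULL,  -- as typed by the user
        canonical_object  TEXT NOT NULL
    )
""")

# A few of the triples from the answer; canonical values are ASSUMED lower-case.
triples = [
    ("Joel Spolsky",  "joel spolsky",  "is-a",                 "Person",        "person"),
    ("Joel Spolsky",  "joel spolsky",  "formerly-employed-by", "Microsoft",     "microsoft"),
    ("Spolsky, Joel", "joel spolsky",  "dreamed-up",           "StackOverflow", "stackoverflow"),
    ("StackOverflow", "stackoverflow", "is-a",                 "Website",       "website"),
    ("Microsoft",     "microsoft",     "is-a",                 "Organization",  "organization"),
]
conn.executemany("INSERT INTO relationships VALUES (?, ?, ?, ?, ?)", triples)

# Find triples mentioning "Spolsky", then follow each object one hop further.
two_hops = conn.execute(
    """
    SELECT a.subject, a.relation, a.object, b.relation, b.object
    FROM relationships AS a
    JOIN relationships AS b ON a.canonical_object = b.canonical_subject
    WHERE a.subject LIKE ? OR a.object LIKE ?
    ORDER BY a.rowid, b.rowid
    """,
    ("%Spolsky%", "%Spolsky%"),
).fetchall()
```

Both spellings of the name are found by the text search, and the join on the canonical columns links each hit to what is known about its object.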
You might want to normalize your data table by making 2 tables.
----------------
| id | subject |
----------------
| 1 | Person |
| 2 | Company |
| 3 | Country |
----------------
-----------------------------------
| id | subject-id | object |
-----------------------------------
| 1 | 1 | Steve Ballmer |
| 2 | 1 | Steve Jobs |
| 3 | 2 | Microsoft |
| 4 | 2 | Apple |
| 5 | 1 | Obama |
| 6 | 3 | US |
-----------------------------------
This allows you to more easily see all the different subject types you have defined.