SQL - database design, need suggestion - mysql

I'm building a website where each user has different classes:
+----+-----------+---------+
| id | subject | user_id |
+----+-----------+---------+
| 1 | Math 140 | 2 |
| 2 | ART 240 | 2 |
+----+-----------+---------+
Each class will then have a bunch of homework files, class-paper files and so on.
This is where I need your help. Which is the better approach? Build one table like this:
+----+----------+-------------------------------------------------+--------------+
| id | subject  | Homework                                        | Class-Papers |
+----+----------+-------------------------------------------------+--------------+
| 1  | Math 140 | www.example.com/subjects/Math+140/file_name.pdf | bla-bla      |
| 2  | Math 140 | www.example.com/subjects/Math+140/file_name.pdf | bla-bla      |
| 3  | Math 140 | www.example.com/subjects/Math+140/file_name.pdf | bla-bla      |
| 4  | ART 240  | www.example.com/subjects/ART +240/file_name.pdf | bla-bla      |
| 5  | ART 240  | www.example.com/subjects/ART +240/file_name.pdf | bla-bla      |
+----+----------+-------------------------------------------------+--------------+
And then just separate the content when I want to display it,
OR build a table for every single subject and then just load the necessary table?
Or if you can suggest something better or more common/useful/efficient, please go ahead.

You should read about normalization and relational design before attempting this.
This is a one-to-many relationship - model it as such.
A table for every subject is crazy. You'll have to add a new table for every subject.
A better solution will make it possible to add new subjects simply by adding data. That's what the relational model is all about.
Don't worry about tables; think about it in natural language first.
A SUBJECT(calculus) can have many COURSES(differential, integral, multi-variable).
A COURSE(differential calculus) can have many SECTIONs (Mon 9-10 am in room 2 of the math building).
A STUDENT(first name, last name, student id) can sign up for zero or more SECTIONs. The list of SECTIONs for a given STUDENT is a TRANSCRIPT.
Each STUDENT has one TRANSCRIPT per semester (fall 2012).
A SECTION can have zero or more ASSIGNMENTs.
These are the tables you'll need for this simple problem. Worry about the names and how they relate before you start writing SQL. You'll be glad you did.

I most definitely would NOT create a separate table for each subject. If you did, then a query like "list all the homework for student X" would have to access different tables depending on which subjects that student was enrolled in. Worse, any time someone added a new subject, you would have to create a new table. And if down the road you decide you need a new attribute of homework, instead of updating one table you would have to update every one of these subject tables. It's just bad news all around.
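To make the one-to-many idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 only so that it runs standalone; the DDL should carry over to MySQL with minor type changes. The table and column names (classes, class_files, kind, url) and the file names are assumptions for illustration, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE classes (
    id      INTEGER PRIMARY KEY,
    subject TEXT    NOT NULL,
    user_id INTEGER NOT NULL
);
-- One row per file; 'kind' tells homework apart from class papers.
CREATE TABLE class_files (
    id       INTEGER PRIMARY KEY,
    class_id INTEGER NOT NULL REFERENCES classes(id),
    kind     TEXT    NOT NULL CHECK (kind IN ('homework', 'class_paper')),
    url      TEXT    NOT NULL
);
INSERT INTO classes VALUES (1, 'Math 140', 2), (2, 'ART 240', 2);
-- Hypothetical file names, for illustration only.
INSERT INTO class_files (class_id, kind, url) VALUES
    (1, 'homework',    'www.example.com/subjects/Math+140/hw1.pdf'),
    (1, 'class_paper', 'www.example.com/subjects/Math+140/paper1.pdf'),
    (2, 'homework',    'www.example.com/subjects/ART+240/hw1.pdf');
""")

# "List all the homework for user 2" is one join, whatever the subjects are.
homework = conn.execute("""
    SELECT c.subject, f.url
    FROM classes c
    JOIN class_files f ON f.class_id = c.id
    WHERE c.user_id = ? AND f.kind = 'homework'
    ORDER BY c.id
""", (2,)).fetchall()
print(homework)
```

Adding a new subject is then just another row in classes, and a new kind of file is just another allowed value of kind.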

Related

How do I handle linking a record to another table?

I'm very new to Access and my teacher is... hard to follow. So I feel like there's something pretty basic I'm probably missing here. I think the biggest problem I'm having with this question is that I'm struggling to find the words to communicate what I actually need to do, which is really putting a damper on my google-fu.
In terms of what I think I want to do, I want to make a record reference another table in its entirety.
Main
+----+-------+--------+-------+----------------------------+
| PK | Name | Phone# | [...] | Cards |
+----+-------+--------+-------+----------------------------+
| 1 | Bob | [...] | [...] | < Reference to 2nd table > |
| 2 | Harry | [...] | [...] | [...] |
| 3 | Ted | [...] | [...] | [...] |
+----+-------+--------+-------+----------------------------+
Bob's Cards
+----+-------------+-----------+-------+-------+-------+
| PK | Card Name | Condition | Year | Price | [...] |
+----+-------------+-----------+-------+-------+-------+
| 1 | Big Slugger | Mint | 1987 | .20 | [...] |
| 2 | Quick Pete | [...] | [...] | [...] | [...] |
| 3 | Mac Donald | [...] | [...] | [...] | [...] |
+----+-------------+-----------+-------+-------+-------+
This would necessitate an entire new table for each record in the main table though, if it's even possible.
But the only alternative solution I can think of is to add 'Card1, Condition1, [...], Card2, Condition2, [...], Card3, [...]' fields to the main table and having to add another set of fields any time someone increases the maximum number of cards stored.
So I'm sort of left believing there is some other approach I should be taking that our teacher has failed to properly explain. We haven't even touched on forms and reports yet so I don't need to worry about working them in.
Any pointers?
(Also, the entirety of this data and structure is only a rough facsimile of my own, as I'd rather learn how to do it and apply it myself than be like 'here's my data, pls fix.')
Third option successfully found in comments by the helpful Minty.
This depends on a number of things, but to keep it simple you
would normally add one field to the cards table, with a Number data
type, called CardOwnerID. In your example it would be 1, indicating Bob.
This is known as a foreign key (FK). However, if you have a table of
cards and multiple possible owners, then you need a third table: a
junction table. This would consist of the Main person ID and the card
ID. – Minty
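A rough sketch of what Minty describes, written with Python's built-in sqlite3 rather than Access so that it can be run as-is. The table names (people, cards) and the snake_case column names are my own assumptions; card_owner_id plays the role of CardOwnerID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- A single cards table for everyone; the owner is just another column.
CREATE TABLE cards (
    id            INTEGER PRIMARY KEY,
    card_name     TEXT    NOT NULL,
    card_owner_id INTEGER NOT NULL REFERENCES people(id)  -- the foreign key
);
INSERT INTO people VALUES (1, 'Bob'), (2, 'Harry'), (3, 'Ted');
INSERT INTO cards (card_name, card_owner_id) VALUES
    ('Big Slugger', 1), ('Quick Pete', 1), ('Mac Donald', 1);
""")

# "Bob's Cards" is now a filter on one table, not a table of its own.
bobs_cards = [row[0] for row in conn.execute(
    "SELECT card_name FROM cards WHERE card_owner_id = ? ORDER BY id", (1,))]
print(bobs_cards)
```

Neither a per-person table nor a growing set of Card1, Card2, ... fields is needed: giving Bob another card is one more row in cards.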

CRM Relationships MySQL

Our company is developing a CRM, and we have reached the point where we have to decide how to handle relationships. This is an important decision because there are going to be tons of them, and changing the structure later would not be cool.
I know three ways we could do it:
One relationship table:
I would do this by creating one table that holds all the relationships.
Table: releationships
+----+-------------+-----------+--------------+------------+
| id | record_type | record_id | belongs_type | belongs_id |
+----+-------------+-----------+--------------+------------+
| 1 | person | 42 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 2 | person | 43 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 3 | note | 23 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 4 | attachment | 13 | company | 12 |
+----+-------------+-----------+--------------+------------+
Multiple relationship tables:
I think this is how SugarCRM, for example, does it.
Table: company_realationships
+----+-----------+------------+--------+
| id | record_id | has_type | has_id |
+----+-----------+------------+--------+
| 1 | 12 | person | 42 |
+----+-----------+------------+--------+
| 2 | 12 | person | 43 |
+----+-----------+------------+--------+
| 3 | 12 | note | 23 |
+----+-----------+------------+--------+
| 4 | 12 | attachment | 13 |
+----+-----------+------------+--------+
All in the record table:
Table: person
+----+-----------+------------+
| id | name | company_id |
+----+-----------+------------+
| 42 | luke | 12 |
+----+-----------+------------+
| 43 | other guy | 12 |
+----+-----------+------------+
etc.
So my question is: which is the best way of handling lots of relationships?
Are there other ways to do it?
What are disadvantages / advantages?
Is there a special way that high-traffic sites handle their relationships?
Thanks for your help guys :)
So my question is: which is the best way of handling lots of relationships?
The third one, or a variation of it (see below).
Every "M:N" relationship should be represented by its own junction table. OTOH, a "1:N" relationship doesn't need an additional table - just a proper foreign key in the table on the "N" side.
If I understand your description correctly, the third option models a 1:N relationship between company and person. If by any chance you wanted to model a M:N relationship between them, you'd have a junction table: company_person ( company_id, person_id, PK (company_id, person_id) ).
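As an illustration of the company_person junction table, here is a small sketch run through Python's built-in sqlite3 (the DDL is close to what MySQL would accept). The company names are made up; the ids follow the question's examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- M:N junction table: the composite key allows each pair only once.
CREATE TABLE company_person (
    company_id INTEGER NOT NULL REFERENCES company(id),
    person_id  INTEGER NOT NULL REFERENCES person(id),
    PRIMARY KEY (company_id, person_id)
);
INSERT INTO company VALUES (12, 'Acme'), (13, 'Globex');  -- made-up names
INSERT INTO person  VALUES (42, 'luke'), (43, 'other guy');
INSERT INTO company_person VALUES (12, 42), (12, 43), (13, 42);
""")

# Everyone related to company 12.
people_at_12 = [r[0] for r in conn.execute("""
    SELECT p.name
    FROM person p
    JOIN company_person cp ON cp.person_id = p.id
    WHERE cp.company_id = ?
    ORDER BY p.id
""", (12,))]

# Every company related to person 42 (the other direction of the M:N).
companies_of_42 = [r[0] for r in conn.execute("""
    SELECT c.name
    FROM company c
    JOIN company_person cp ON cp.company_id = c.id
    WHERE cp.person_id = ?
    ORDER BY c.id
""", (42,))]
print(people_at_12, companies_of_42)
```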
Are there other ways to do it?
Sometimes, inheritance (aka. category, subtype, generalization hierarchy etc.) can be used to lower the number of possible "relatable" combinations. In a nutshell, make a relationship to a parent, then every child inherited from that parent is automatically involved in that relationship.
For an example, take a look at this post.
What are disadvantages / advantages?
Enforcing constraints (including FKs) declaratively is better (less prone to errors and probably more performant) than enforcing them through triggers, which is again better than enforcing them in the client code.
Choose a design that better adheres to that principle. For example, your options 1 and 2 don't allow the DBMS to enforce FKs declaratively.
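To show what "declaratively" buys you, here is a hedged sketch using Python's built-in sqlite3. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, whereas MySQL/InnoDB enforces them by default. The row with id 44 and company 999 is invented to trigger the failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
CREATE TABLE company (id INTEGER PRIMARY KEY);
CREATE TABLE person (
    id         INTEGER PRIMARY KEY,
    name       TEXT    NOT NULL,
    company_id INTEGER NOT NULL REFERENCES company(id)
);
INSERT INTO company VALUES (12);
INSERT INTO person  VALUES (42, 'luke', 12);
""")

# Company 999 does not exist, so the DBMS itself refuses the row.
try:
    conn.execute("INSERT INTO person VALUES (44, 'orphan', 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

With a generic (belongs_type, belongs_id) pair as in options 1 and 2, there is no single table the column could reference, so a check like this would have to live in triggers or client code.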
Is there a special way that high-traffic sites handle their relationships?
Good logical design followed by good physical implementation is the only solid basis for good performance. It's hard to "bolt-on" the performance on top of a bad design.
Perhaps, you'd like to take a look at:
ERwin Methods Guide
Use The Index, Luke!
And when it comes to performance, don't guess! Measure on realistic amounts of data.

Find records where CSV column values match

I am making a website. In the database I have a table of articles that kind of looks like this:
id | name | cats | etc.
------------------------------------------------------
1 | "alice" | "this, that, those, them" |
2 | "bob" | "this, that, those" |
3 | "carol" | "this, banana, cupcake" |
4 | "dave" | "other, unrelated, words" |
5 | "errol" | "those, them, fishstick" |
When viewing an article I also want to show some of the most related articles, based on the number of categories they have in common.
For example, if I was viewing the Alice article I would want to pick out (in order of preference) Bob (3 cats in common), Errol(2), Carol(1).
I am aware that this would be easier if the data was normalised (I could for example do this) but unfortunately that's not really an option.
I ended up creating a couple of extra tables and populating them with properly normalized data every time something was saved. These run alongside the existing tables so it's not the cleanest of solutions but it works and the query speeds are excellent.
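For anyone wondering what that side table might look like, here is a minimal sketch in Python with the built-in sqlite3. The article_cats table and its column names are assumptions, not the schema I actually used; the data is the sample from the question.

```python
import sqlite3

articles = {
    1: ("alice", "this, that, those, them"),
    2: ("bob",   "this, that, those"),
    3: ("carol", "this, banana, cupcake"),
    4: ("dave",  "other, unrelated, words"),
    5: ("errol", "those, them, fishstick"),
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, name TEXT, cats TEXT);
-- Normalized copy of the CSV column, kept alongside the original table.
CREATE TABLE article_cats (
    article_id INTEGER NOT NULL,
    cat        TEXT    NOT NULL,
    PRIMARY KEY (article_id, cat)
);
""")

# On every save: store the article, then re-split its CSV categories.
for article_id, (name, cats) in articles.items():
    conn.execute("INSERT INTO articles VALUES (?, ?, ?)",
                 (article_id, name, cats))
    conn.executemany("INSERT INTO article_cats VALUES (?, ?)",
                     [(article_id, c.strip()) for c in cats.split(",")])

# Other articles ranked by how many categories they share with article 1.
related = conn.execute("""
    SELECT a.name, COUNT(*) AS shared
    FROM article_cats mine
    JOIN article_cats theirs
      ON theirs.cat = mine.cat AND theirs.article_id <> mine.article_id
    JOIN articles a ON a.id = theirs.article_id
    WHERE mine.article_id = ?
    GROUP BY a.id, a.name
    ORDER BY shared DESC, a.id
""", (1,)).fetchall()
print(related)
```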

Database modeling for international and multilingual purposes

I need to create a large scale DB Model for a web application that will be multilingual.
One doubt I have every time I think about how to do it is how to handle multiple translations for a field. An example case:
The table for language levels, which administrators can edit from the backend, can have multiple items such as basic, advance, fluent, mattern... In the near future there will probably be one more type. The admin goes to the backend and adds a new level, and it is sorted into the right position... but how do I handle all the translations for the end users?
Another problem with internationalizing a database is that user studies will probably differ between the USA, the UK, DE... Every country will have its own levels (probably equivalent to another country's, but ultimately different). And what about billing?
How do you model this at a large scale?
Here is the way I would design the database:
[Schema diagram not reproduced here. Visualization by DB Designer Fork.]
The i18n table only contains a PK, so that any table just has to reference this PK to internationalize a field. The table translation is then in charge of linking this generic ID with the correct list of translations.
locale.id_locale is a VARCHAR(5) so that it can hold both the en and en_US ISO syntaxes.
currency.id_currency is a CHAR(3) to manage the ISO 4217 syntax.
You can find two examples: page and newsletter. Both of these admin-managed entities need to internationalize their fields: title/description and subject/content, respectively.
Here is an example query:
select
    t_subject.tx_translation as subject,
    t_content.tx_translation as content
from newsletter n
-- join for subject
inner join translation t_subject
    on t_subject.id_i18n = n.i18n_subject
-- join for content
inner join translation t_content
    on t_content.id_i18n = n.i18n_content
inner join locale l
    -- condition for subject
    on l.id_locale = t_subject.id_locale
    -- condition for content
    and l.id_locale = t_content.id_locale
-- locale condition
where l.id_locale = 'en_GB'
    -- other conditions
    and n.id_newsletter = 1
Note that this is a normalized data model. If you have a huge dataset, you could consider denormalizing it to optimize your queries. You can also add indexes to improve query performance (in some DBMSs, such as MySQL/InnoDB, foreign keys are indexed automatically).
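To try the idea end to end, here is a minimal sketch using Python's built-in sqlite3. The column names used by the query above are kept; everything else (the exact table definitions, the key name i18n.id_i18n, the sample texts) is an assumption. For brevity it filters on translation.id_locale directly instead of joining locale, which should give the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE i18n   (id_i18n   INTEGER    PRIMARY KEY);
CREATE TABLE locale (id_locale VARCHAR(5) PRIMARY KEY);
CREATE TABLE translation (
    id_i18n        INTEGER    NOT NULL REFERENCES i18n(id_i18n),
    id_locale      VARCHAR(5) NOT NULL REFERENCES locale(id_locale),
    tx_translation TEXT       NOT NULL,
    PRIMARY KEY (id_i18n, id_locale)
);
CREATE TABLE newsletter (
    id_newsletter INTEGER PRIMARY KEY,
    i18n_subject  INTEGER NOT NULL REFERENCES i18n(id_i18n),
    i18n_content  INTEGER NOT NULL REFERENCES i18n(id_i18n)
);
INSERT INTO i18n   VALUES (1), (2);
INSERT INTO locale VALUES ('en_GB'), ('fr_FR');
-- Made-up sample texts.
INSERT INTO translation VALUES
    (1, 'en_GB', 'Hello'),   (2, 'en_GB', 'Welcome aboard'),
    (1, 'fr_FR', 'Bonjour'), (2, 'fr_FR', 'Bienvenue');
INSERT INTO newsletter VALUES (1, 1, 2);
""")

def newsletter_text(newsletter_id, locale):
    """Return (subject, content) of a newsletter in the given locale."""
    return conn.execute("""
        SELECT t_subject.tx_translation, t_content.tx_translation
        FROM newsletter n
        JOIN translation t_subject ON t_subject.id_i18n = n.i18n_subject
        JOIN translation t_content ON t_content.id_i18n = n.i18n_content
        WHERE t_subject.id_locale = ?
          AND t_content.id_locale = ?
          AND n.id_newsletter = ?
    """, (locale, locale, newsletter_id)).fetchone()

print(newsletter_text(1, "en_GB"))
print(newsletter_text(1, "fr_FR"))
```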
Some previous StackOverflow questions on this topic:
What are best practices for multi-language database design?
What's the best database structure to keep multilingual data?
Schema for a multilanguage database
How to use multilanguage database schema with ORM?
Some useful external resources:
Creating multilingual websites: Database Design
Multilanguage database design approach
Propel Gets I18n Behavior, And Why It Matters
The best approach often is, for every existing table, create a new table into which text items are moved; the PK of the new table is the PK of the old table together with the language.
In your case:
The table for language levels, which administrators can edit from the backend, can have multiple items such as basic, advance, fluent, mattern... In the near future there will probably be one more type. The admin goes to the backend and adds a new level, and it is sorted into the right position... but how do I handle all the translations for the end users?
Your existing table probably looks something like this:
+----+-------+---------+
| id | price | type |
+----+-------+---------+
| 1 | 299 | basic |
| 2 | 299 | advance |
| 3 | 399 | fluent |
| 4 | 0 | mattern |
+----+-------+---------+
It then becomes two tables:
+----+-------+    +----+------+-------------+
| id | price |    | id | lang | type        |
+----+-------+    +----+------+-------------+
| 1  | 299   |    | 1  | en   | basic       |
| 2  | 299   |    | 2  | en   | advance     |
| 3  | 399   |    | 3  | en   | fluent      |
| 4  | 0     |    | 4  | en   | mattern     |
+----+-------+    | 1  | fr   | élémentaire |
                  | 2  | fr   | avance      |
                  | 3  | fr   | couramment  |
                  :    :      :             :
                  +----+------+-------------+
Another problem with internationalizing a database is that user studies will probably differ between the USA, the UK, DE... Every country will have its own levels (probably equivalent to another country's, but ultimately different). And what about billing?
All localisation can occur through a similar approach. Instead of just moving text fields to the new table, you could move any localisable fields - only those which are common to all locales will remain in the original table.
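A small sketch of that two-table split, run through Python's built-in sqlite3. The table names level and level_text are assumptions; the rows are the ones from the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Locale-independent fields stay in the original table.
CREATE TABLE level (
    id    INTEGER PRIMARY KEY,
    price INTEGER NOT NULL
);
-- Text moves out; the key is the old key plus the language.
CREATE TABLE level_text (
    id   INTEGER NOT NULL REFERENCES level(id),
    lang TEXT    NOT NULL,
    type TEXT    NOT NULL,
    PRIMARY KEY (id, lang)
);
INSERT INTO level VALUES (1, 299), (2, 299), (3, 399), (4, 0);
INSERT INTO level_text VALUES
    (1, 'en', 'basic'),       (2, 'en', 'advance'),
    (3, 'en', 'fluent'),      (4, 'en', 'mattern'),
    (1, 'fr', 'élémentaire'), (2, 'fr', 'avance'),
    (3, 'fr', 'couramment');
""")

def levels_for(lang):
    """Levels that have a translation in the given language."""
    return conn.execute("""
        SELECT l.id, l.price, t.type
        FROM level l
        JOIN level_text t ON t.id = l.id
        WHERE t.lang = ?
        ORDER BY l.id
    """, (lang,)).fetchall()

print(levels_for("fr"))
```

A level with no row for the requested language simply drops out of the inner join, so in practice you would probably want a LEFT JOIN with a fallback language.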

How can I save semantic information in a MySQL table?

I wish to save some semantic information about data in a table. How can I save this information in MySQL, such that I can access data and also search for the articles using the semantic data.
For example, I have an article about Apple and Microsoft. The semantic data will look like this:
Person : Steve Jobs
Person : Steve Ballmer
Company : Apple
Company : Microsoft
I want to save the information without losing the info that Steve Jobs and Steve Ballmer are persons and Apple and Microsoft are companies. I also want to search for articles about Steve Jobs / Apple.
Person and Company are not the only possible types, hence adding new fields is not viable. Since the type of the data is to be saved, I cannot use FullText field type directly.
Update - These are two options that I am considering.
Save the data in a full-text column as a serialized PHP array.
Create another table with 3 columns:
--------------------------------
| id | subject | object |
--------------------------------
| 1 | Person | Steve Ballmer |
| 1 | Person | Steve Jobs |
| 1 | Company | Microsoft |
| 1 | Company | Apple |
| 2 | Person | Obama |
| 2 | Country | US |
--------------------------------
You're working on a hard and interesting problem! You may get some interesting ideas from looking at the Dublin Core Metadata Initiative.
http://dublincore.org/metadata-basics/
To make it simple, think of your metadata items as all fitting in one table.
e.g.
Ballmer employed-by Microsoft
Ballmer is-a Person
Microsoft is-a Organization
Microsoft run-by Ballmer
SoftImage acquired-by Microsoft
SoftImage is-a Organization
Joel Spolsky is-a Person
Joel Spolsky formerly-employed-by Microsoft
Spolsky, Joel dreamed-up StackOverflow
StackOverflow is-a Website
Socrates is-a Person
Socrates died-on (some date)
The trick here is that some, but not all, of your first and third column values need to be BOTH arbitrary text AND serve as indexes into the first and third columns. Then, if you're trying to figure out what your database has on Spolsky, you can full-text search your first and third columns for his name. You'll get back a bunch of triplets. The values you find will tell you a lot. If you want to know more, you can search again.
To pull this off you'll probably need to have five columns, as follows:
Full text subject (whatever your user puts in)
Canonical subject (what your user puts in, massaged into a standard form)
Relation (is-a etc)
Full text object
Canonical object
The point of the canonical forms of your subject and object is to allow queries like this one to work even if your user enters "Joel Spolsky" in one place and "Spolsky, Joel" in another while meaning the same person.
SELECT *
FROM relationships a
JOIN relationships b ON a.canonical_object = b.canonical_subject
WHERE MATCH (a.subject, a.object) AGAINST ('Spolsky')
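Here is a runnable sketch of the five-column idea in Python with the built-in sqlite3. Two things are assumptions on my part: the canonical() helper is a toy normalizer, and LIKE stands in for MySQL's MATCH ... AGAINST, which plain SQLite does not have.

```python
import sqlite3

def canonical(name):
    """Toy canonical form: 'Last, First' -> 'first last', lowercased."""
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    return " ".join(name.lower().split())

facts = [
    ("Joel Spolsky",  "is-a",                 "Person"),
    ("Joel Spolsky",  "formerly-employed-by", "Microsoft"),
    ("Spolsky, Joel", "dreamed-up",           "StackOverflow"),
    ("StackOverflow", "is-a",                 "Website"),
    ("Microsoft",     "is-a",                 "Organization"),
]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE relationships (
        subject  TEXT, canonical_subject TEXT,
        relation TEXT,
        object   TEXT, canonical_object  TEXT
    )""")
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, ?, ?, ?)",
    [(s, canonical(s), r, o, canonical(o)) for s, r, o in facts])

# Direct facts: both spellings resolve to the same canonical subject.
direct = conn.execute("""
    SELECT relation, object FROM relationships
    WHERE canonical_subject = ? ORDER BY rowid
""", (canonical("Spolsky, Joel"),)).fetchall()

# One hop further: facts about the things Spolsky is linked to.
one_hop = conn.execute("""
    SELECT b.subject, b.relation, b.object
    FROM relationships a
    JOIN relationships b ON a.canonical_object = b.canonical_subject
    WHERE a.subject LIKE ? OR a.object LIKE ?
    ORDER BY b.subject
""", ("%Spolsky%", "%Spolsky%")).fetchall()
print(direct)
print(one_hop)
```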
You might want to normalize your data table by making 2 tables.
----------------
| id | subject |
----------------
| 1 | Person |
| 2 | Company |
| 3 | Country |
----------------
-----------------------------------
| id | subject-id | object |
-----------------------------------
| 1 | 1 | Steve Ballmer |
| 2 | 1 | Steve Jobs |
| 3 | 2 | Microsoft |
| 4 | 2 | Apple |
| 5 | 1 | Obama |
| 6 | 3 | US |
-----------------------------------
This allows you to more easily see all the different subject types you have defined.