MySQL database design of an online video education website - mysql

I am planning a website that offers users training videos on multiple subjects. Users can view videos, check their view history, comment on videos, take related exams and gain badges/certificates if they pass. They can also submit live questions when they run into a problem during a video. There will also be a forum where users can help each other out, add friends, send messages, etc. With this idea in mind, I am now designing the database, but I have no prior experience in database design, so please review the following design and give me some suggestions on performance and best practices. Thank you all in advance.
These are the tables:
USER PROFILE
username VARCHAR
id INT
Type ENUM
Email VARCHAR
Date_created TIMESTAMP
Date_modified TIMESTAMP
Pass VARBINARY
Name VARCHAR
certificates VARBINARY
Ranking_points INT
Gender ENUM
DOB DATE
Avatar_url VARCHAR
VIDEOS
Title VARCHAR
id INT
Category ENUM
Description TINYTEXT
Date_created TIMESTAMP
Tmp_name CHAR
File_name VARCHAR
Size MEDIUMINT
Subtitle_url VARCHAR
Liked_count INT
Shared_count INT
Tags ENUM
CATEGORY
Category VARCHAR
id INT
Description TINYTEXT
RecentlyViewed
User_id INT
Video_id INT
Viewed_date TIMESTAMP
FRIEND_LIST
User_id INT
Friend_id INT
USER_QUESTIONS
User_id INT
Question_id INT
Question_title VARCHAR
Question_content LONGTEXT
Date_asked TIMESTAMP
VIDEO_COMMENTS
Title VARCHAR
Video_id INT
Comment_content LONGTEXT
User_id TINYTEXT
Date_created TIMESTAMP
USER_MESSAGES
User_id INT
Message_id INT
Message_content LONGTEXT
Sender_id INT
Date_created TIMESTAMP
ONLINE EXAMS/ASSESSMENTS
id VARCHAR
type ENUM
exam_url VARCHAR
description TINYTEXT
Date_created TIMESTAMP
EXAMS_TAKEN_BY_USER
exam_id INT
User_id INT
Exam_result SMALLINT
Date_taken TIMESTAMP
I have two more questions:
1. I would like to allow comments on user comments. How should I design the database structure for that?
2. One video can belong to multiple categories. Is it necessary to create a new table for video categories, or can I just put a series of categories in the category field of the VIDEOS table?

You can use a self-referencing table for user comments. The table would look like this:
Id    comments               parent_comment_Id
1     root comments          null
2     child comments         1
3     grandchild comments    2
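A minimal sketch of the self-referencing layout above, written in Python against an in-memory SQLite database so it can be run as-is. The table name `video_comments`, the extra columns and the helper `fetch_thread` are assumptions for illustration; the recursive query relies on `WITH RECURSIVE`, which SQLite supports and which MySQL only supports from version 8.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE video_comments (
        id                INTEGER PRIMARY KEY,
        video_id          INTEGER NOT NULL,
        user_id           INTEGER NOT NULL,
        comment_content   TEXT NOT NULL,
        -- NULL for a top-level comment, otherwise the comment being replied to
        parent_comment_id INTEGER REFERENCES video_comments(id)
    )
""")
conn.executemany(
    "INSERT INTO video_comments VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 100, "root comment", None),
        (2, 10, 101, "child comment", 1),
        (3, 10, 102, "grandchild comment", 2),
    ],
)


def fetch_thread(conn, root_id):
    """Return (id, depth, text) for a comment and every reply below it."""
    return conn.execute(
        """
        WITH RECURSIVE thread(id, depth, comment_content) AS (
            SELECT id, 0, comment_content
            FROM video_comments WHERE id = ?
            UNION ALL
            SELECT c.id, t.depth + 1, c.comment_content
            FROM video_comments c
            JOIN thread t ON c.parent_comment_id = t.id
        )
        SELECT id, depth, comment_content FROM thread ORDER BY depth, id
        """,
        (root_id,),
    ).fetchall()


thread = fetch_thread(conn, 1)
```

The depth column makes it easy to indent replies when rendering the thread.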
As for video categories, it is better to have a separate table just for them. You can then extend the category information (name, description, etc.) in the future. From a user-experience point of view, it is also quite common for users to check the category list first and then click through to all the related videos.

The overall structure seems to be okay, at least that looks pretty much like something I would do.
1.- About comments on comments:
I guess the simplest design would be to add a parent_comment_id column to the comments table. But that's a tricky one, because there would be a lot of NULL values in either parent_comment_id or video_id. Another way around this would be to allow NULL values for video_id in VIDEO_COMMENTS. In this scenario, when you find a NULL in this column, you know you are looking at a reply to a user comment. You would then have to find the parent comment in a second table:
USER_COMMENT
user_comment_id (primary key)
video_comment_id (foreign key to VIDEO_COMMENT)
comment (text)
created_date (date)
This is definitely not the prettiest solution, but if you're worried about performance, it could be worth a shot. You still have the same VIDEO_COMMENTS table (clean as before, with no modifications to its structure). Replies to earlier comments might not be that frequent, so you keep them in a second structure instead of spreading them all over the VIDEO_COMMENTS table.
2.- About Video Categories
Here I would create 2 additional tables: one for video categories (or tags) and a second for linking each video to its categories.
VIDEOS
Here I would drop the CATEGORY column
CATEGORIES
category_id (primary key)
title
VIDEO_CATEGORIES
video_category_id
category_id
video_id
In this case you can have a list of categories already stored, and the user only has to select the ones he/she thinks fit his/her video.
I don't recommend putting multiple values in the same column, since that is really bad design. You couldn't provide the user with a list of categories (like the one I mentioned before). You would also end up with many duplicated values, typos, etc. In addition, having several values stored in the same column makes editing unnecessarily difficult: think about how you would implement a user request to delete one category tag and add another. Of course, it isn't that hard, but it is a lot simpler if you only have to delete individual references.
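Here is a rough sketch of the linking-table idea, again in Python with an in-memory SQLite database rather than MySQL. The sample rows and the helpers `videos_in_category` and `remove_category` are made up for illustration, and the linking table uses a composite primary key instead of a separate video_category_id column, which is one possible variation rather than the only way to do it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE videos (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE categories (
        category_id INTEGER PRIMARY KEY,
        title       TEXT NOT NULL UNIQUE
    );
    -- One row per (video, category) pair; the composite key blocks duplicates.
    CREATE TABLE video_categories (
        video_id    INTEGER NOT NULL REFERENCES videos(id),
        category_id INTEGER NOT NULL REFERENCES categories(category_id),
        PRIMARY KEY (video_id, category_id)
    );
    INSERT INTO videos VALUES (1, 'Intro to SQL'), (2, 'Joins in depth');
    INSERT INTO categories VALUES (1, 'Databases'), (2, 'Beginner');
    INSERT INTO video_categories VALUES (1, 1), (1, 2), (2, 1);
""")


def videos_in_category(conn, category_title):
    """List the titles of all videos linked to the given category."""
    rows = conn.execute(
        """
        SELECT v.title
        FROM videos v
        JOIN video_categories vc ON vc.video_id = v.id
        JOIN categories c ON c.category_id = vc.category_id
        WHERE c.title = ?
        ORDER BY v.id
        """,
        (category_title,),
    ).fetchall()
    return [title for (title,) in rows]


def remove_category(conn, video_id, category_id):
    """Untag a video by deleting a single link row."""
    with conn:
        conn.execute(
            "DELETE FROM video_categories WHERE video_id = ? AND category_id = ?",
            (video_id, category_id),
        )
```

Removing a tag is a one-row delete, with no string parsing of a multi-value column involved.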

Related

Table 3 foreign key, either has a value and a null

So I have a users table. A user can be an author or a normal user. There's no column that says a user is an author; that is only implied by the story table, where there's a foreign key (author_id = user.user_id).
Now I want to add a review function. A user can review either a story or an author, and can also rate a chapter.
So my table looks like this:
**review**
review_id(PK AI)
user_id(FK - user.user_id)//reviewer NOT NULL
author_id(FK - user.user_id)//reviewee NULL
story_id(FK - story.story_id)//reviewee NULL
chapter_id(FK - chapter.chapter_id)//reviewee NULL
**review_content**
review_content_id(PK AI)
review_id(FK)
rating decimal
content text
date_added datetime
Since user_id in the user table, story_id in the story table and chapter_id in the chapter table are all PK AI, there is a 100% chance of duplicate/same values across them.
For example: user_id(5) reviewed author(10). user_id(5) also reviewed story_id(10), or user_id(5) rated chapter_id(10).
So to identify whether a review is for a story, an author or a chapter, I plan to set the other FKs to NULL and insert a value only into the relevant FK.
If a review is for a story, then story_id (FK) will have a value, while author_id and chapter_id will be NULL.
so in my query it will be:
//fetch review for story
$this->db->from('review');
$this->db->where('author_id', NULL);
$this->db->where('story_id', $story_id);
$this->db->where('chapter_id', NULL);
$this->db->get()->result();
I want to know if my table design is correct, or whether I should stick to making different review tables (author, story, chapter).
I really want to write reusable methods for all 3 instead of writing one per type.
For example: checking whether a review exists, inserting reviews, fetching, etc. If I were to make 3 different tables, then I would have to write 3 different methods for each function.
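One way the single-table idea could be kept safe is a CHECK constraint that requires exactly one of the three target columns to be set. The sketch below is in Python with SQLite, not CodeIgniter/MySQL; the helper names `add_review` and `fetch_reviews` and the `TARGET_COLUMNS` whitelist are hypothetical. It assumes the database enforces CHECK constraints: SQLite does, MySQL does from 8.0.16, and older MySQL versions parse CHECK but ignore it.

```python
import sqlite3

# Whitelist mapping a review target type to its FK column (hypothetical names).
TARGET_COLUMNS = {
    "author": "author_id",
    "story": "story_id",
    "chapter": "chapter_id",
}

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE review (
        review_id  INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id    INTEGER NOT NULL,   -- reviewer
        author_id  INTEGER,            -- reviewee, nullable
        story_id   INTEGER,            -- reviewee, nullable
        chapter_id INTEGER,            -- reviewee, nullable
        -- Exactly one of the three targets must be filled in.
        CHECK ((author_id IS NOT NULL)
             + (story_id IS NOT NULL)
             + (chapter_id IS NOT NULL) = 1)
    )
""")


def add_review(conn, user_id, target_type, target_id):
    """Insert a review for an author, story or chapter; returns review_id."""
    column = TARGET_COLUMNS[target_type]  # KeyError on an unknown type
    cur = conn.execute(
        f"INSERT INTO review (user_id, {column}) VALUES (?, ?)",
        (user_id, target_id),
    )
    return cur.lastrowid


def fetch_reviews(conn, target_type, target_id):
    """Fetch (review_id, user_id) pairs for one target."""
    column = TARGET_COLUMNS[target_type]
    return conn.execute(
        f"SELECT review_id, user_id FROM review WHERE {column} = ? "
        "ORDER BY review_id",
        (target_id,),
    ).fetchall()


add_review(conn, 5, "author", 10)
add_review(conn, 5, "story", 10)
story_reviews = fetch_reviews(conn, "story", 10)
```

Because the constraint guarantees the other two columns are NULL, the fetch only needs to filter on the relevant column, and the same two functions cover all three review types.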

Best Practice: find row for unique id from multiple tables

Our database contains 5+ tables
user
----------
user_id (PK) int NOT NULL
name varchar(50) NOT NULL
photo
--------
photo_id (PK) int NOT NULL
user_id (FK) int NOT NULL
title varchar(50) NOT NULL
comment
-------
comment_id (PK) int NOT NULL
photo_id int NOT NULL
user_id int NOT NULL
message varchar(50) NOT NULL
All primary key IDs are unique across all tables.
All data is linked to http://domain.com/{primary_key_id}
A user visits the link with an ID, which is unique across all tables.
How should I find out which table this ID belongs to?
solution 1
select user_id from user where user_id = {primary_key_id}
// if not found, then move next
select photo_id from photo where photo_id = {primary_key_id}
... continue on, until we find which table this primary key belongs to.
solution 2
Create an object table to hold all the unique IDs and their data types.
Create an AFTER INSERT trigger on every table that adds a row to the object table, recording the new ID and the data type (table) it was inserted into.
When required, run a select statement to find the table name the ID belongs to.
The second solution means a double insert: one insert for the row in the actual table with the complete data, and a second insert for the unique ID and table name in the object table.
select type from object_table where id = {primary_key_id}
solution 3
Combine table name + ID and encode them into a new unique integer, using PHP.
Decode that integer to get back the original ID and the table name (even if the table name is just a number code).
I don't know how to implement this in PHP, but this solution sounds better!? What are your suggestions?
I don't know what you mean by the Facebook reference in the comments, but I'll explain my comment a little further.
You don't need IDs that are unique across five DB tables, just unique within each table. You have a couple of options for how to create your links (you can create the links yourself, can't you?):
using GET variables: http://domain.com/page.html?pk={id}&table={table}
using plain URL: http://domain.com/{id}{table}
Depending on the syntax of the link you choose the function to parse it. You can for example use one or both of the following:
http://php.net/manual/en/function.explode.php
http://www.php.net/manual/en/function.parse-url.php
When you get the simple model working you may add encoding/decoding/hashing functions. But do you really need them? And in what level? (I have no experience in that area so I'll shut up now.)
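For the encode/decode idea (solution 3 in the question), a minimal sketch could look like the following. It is written in Python rather than PHP, and the table codes, the `BASE` value and the function names are all assumptions made up for illustration. The last decimal digit of the public ID carries the table code and the remaining digits carry the per-table primary key.

```python
# Hypothetical numeric codes for the tables that can appear in a link.
TABLE_CODES = {"user": 0, "photo": 1, "comment": 2}
CODE_TABLES = {code: name for name, code in TABLE_CODES.items()}

# Leaves room for up to ten table codes; pick a larger base if more are needed.
BASE = 10


def encode_id(table, row_id):
    """Pack a table name and its primary key into one integer."""
    return row_id * BASE + TABLE_CODES[table]


def decode_id(public_id):
    """Unpack an integer built by encode_id into (table, row_id)."""
    row_id, code = divmod(public_id, BASE)
    return CODE_TABLES[code], row_id  # KeyError on an unknown table code


public_id = encode_id("photo", 42)
table, row_id = decode_id(public_id)
```

This is only an encoding, not a security measure: anyone can guess neighbouring IDs, so access checks are still needed on the page itself.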
Is it actually important to maintain uniqueness across tables?
If no, just implement solution 3 if you can (e.g. using URL encoding).
If yes, you'll need the "parent" table in any case, so the DBMS can enforce the uniqueness.
You can still try to implement solution 3 on top of that,
or add a type discriminator [1] there and you'll be able to (quickly) know which table is referenced for any given ID.
[1] Take a look at the lower part of this answer. This is in fact a form of inheritance.
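A small sketch of the "parent" table with a type discriminator, using Python and SQLite as a stand-in for MySQL. The table name `object`, the `type` column values and the helpers `new_object` and `table_for` are assumptions; the sketch inserts into the parent table explicitly instead of using triggers, which is just one possible way to keep the two inserts together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Parent table: hands out IDs that are unique across all child tables.
    CREATE TABLE object (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,
        type TEXT NOT NULL CHECK (type IN ('user', 'photo'))
    );
    CREATE TABLE user (
        user_id INTEGER PRIMARY KEY REFERENCES object(id),
        name    TEXT NOT NULL
    );
    CREATE TABLE photo (
        photo_id INTEGER PRIMARY KEY REFERENCES object(id),
        user_id  INTEGER NOT NULL REFERENCES user(user_id),
        title    TEXT NOT NULL
    );
""")


def new_object(conn, type_name):
    """Reserve a globally unique ID and record which table it belongs to."""
    cur = conn.execute("INSERT INTO object (type) VALUES (?)", (type_name,))
    return cur.lastrowid


def table_for(conn, object_id):
    """Return the table name for an ID, or None if the ID is unknown."""
    row = conn.execute(
        "SELECT type FROM object WHERE id = ?", (object_id,)
    ).fetchone()
    return row[0] if row else None


# Both inserts for each entity run in one transaction.
with conn:
    alice_id = new_object(conn, "user")
    conn.execute("INSERT INTO user VALUES (?, ?)", (alice_id, "alice"))
    photo_id = new_object(conn, "photo")
    conn.execute(
        "INSERT INTO photo VALUES (?, ?, ?)", (photo_id, alice_id, "sunset")
    )
```

A single indexed lookup on the parent table then replaces probing every child table in turn.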

How to set up two MySQL data fields so one or the other can be null but not both?

I have 5 MySQL data fields for a votes table:
post id
poll id
vote id
voter
voteid
You can vote in a poll or vote for a post. If you vote in a poll, the post/person field will be null. If you vote for a post, the vote field will be null.
I want to set up the table so that either the post id or the vote id can be null, but not both. I'm using phpMyAdmin to manage my database.
Any ideas?
I have to agree with jmilloy above: the best thing to do is to create separate tables. This is an example of how this would work:
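Table structure and sample data:
CREATE TABLE post (
post_id INT AUTO_INCREMENT PRIMARY KEY
);
CREATE TABLE poll (
poll_id INT AUTO_INCREMENT PRIMARY KEY
);
CREATE TABLE vote (
vote_id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(30)
);
-- Bridge tables: one row per post/vote or poll/vote combination
CREATE TABLE post_vote (
post_id INT NOT NULL,
vote_id INT NOT NULL,
PRIMARY KEY (post_id, vote_id)
);
CREATE TABLE poll_vote (
poll_id INT NOT NULL,
vote_id INT NOT NULL,
PRIMARY KEY (poll_id, vote_id)
);
INSERT INTO post (post_id) VALUES (1),(2),(3),(4);
INSERT INTO poll (poll_id) VALUES (1),(2),(3),(4);
INSERT INTO vote (name) VALUES ('bob'),
('Jack'),
('Joe'),
('Shara'),
('Hillary'),
('Steven'),
('Sandra');
INSERT INTO post_vote (post_id, vote_id) VALUES (1,1),(2,2),(3,3),(4,6);
INSERT INTO poll_vote (poll_id, vote_id) VALUES (1,3),(2,5),(3,4),(4,7);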
To retrieve the voters that have voted for a post you have to use a JOIN. This is an example of what this would look like if you want to find the voters for post number 2.
SELECT post.post_id, vote.name
FROM (post
JOIN post_vote
ON post_vote.post_id = post.post_id)
JOIN vote
ON vote.vote_id = post_vote.vote_id
WHERE post.post_id = 2;
Some explanation: if you have a poll and a vote, you have a many-to-many relationship, i.e. one voter can vote in more than one poll and one poll can have more than one voter. To bridge between the vote and poll tables you use a bridge table. This table contains all the poll number and vote combinations. So when you want to know who has voted in a particular poll, you need to link the poll_id with the poll_id in the poll_vote table. The result is then matched with the vote table using the vote_id in the poll_vote table and the vote table. Hope this helps. Good luck with your project.
Look at the MySQL CREATE TABLE syntax http://dev.mysql.com/doc/refman/5.1/en/create-table.html
Notice that NOT NULL or NULL are part of a column definition. The default is NULL. This can only be applied to columns, not pairs of columns.
The solution here is to make two separate tables, one for post votes and one for poll votes. Then you can put the relevant fields in each table. This will also save you space, and make your data less error prone.
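As a rough illustration of the two-table approach, here is a sketch in Python with SQLite standing in for MySQL. The table names `post_votes` and `poll_votes`, the `voter` column and the helper functions are assumptions, since the question does not spell out the exact fields. Each table only has the ID it needs, declared NOT NULL, so the "one or the other but not both" rule no longer has to be expressed at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post_votes (
        vote_id INTEGER PRIMARY KEY AUTOINCREMENT,
        post_id INTEGER NOT NULL,
        voter   TEXT NOT NULL
    );
    CREATE TABLE poll_votes (
        vote_id INTEGER PRIMARY KEY AUTOINCREMENT,
        poll_id INTEGER NOT NULL,
        voter   TEXT NOT NULL
    );
""")


def vote_for_post(conn, post_id, voter):
    """Record a vote for a post; returns the new vote_id."""
    cur = conn.execute(
        "INSERT INTO post_votes (post_id, voter) VALUES (?, ?)",
        (post_id, voter),
    )
    return cur.lastrowid


def vote_in_poll(conn, poll_id, voter):
    """Record a vote in a poll; returns the new vote_id."""
    cur = conn.execute(
        "INSERT INTO poll_votes (poll_id, voter) VALUES (?, ?)",
        (poll_id, voter),
    )
    return cur.lastrowid
```

A vote without a post ID (or poll ID) is rejected by the plain NOT NULL column constraint in the corresponding table.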

Use varbinary as foreign key

Is it ok to use data type varbinary for foreign keys?
Why?
I have an EvalAnswer table with a FK to a Score table.
The score is sensitive and should be encrypted. The encrypt/decrypt happens in the asp.net (4.0) project and not in sql server (2008), so the data type needs to be varbinary.
EDIT: more info
Of course.
I have these columns: Id, Score, ScoreText, Description, Index
The Id is an incremental counter. (PK)
The Score is the score as number (such as 1).
The ScoreText is the score as a letter (Score 1 equals letter A).
The Description is a comment for every score.
The reason I have it like this is also that there are special situations,
such as one of the questions having scores only from 1-4 while the rest have 1-5.
So every question has a score 1, but the description differs from another question's score 1.
So if I have 5 questions, this gives 5*5 rows in the Score table (all with different descriptions).
When the page loads I get the correct scoring (with description) for every dropdown list. Normally 1-5.
But once the user has saved the scoring, I need to know the previously saved score for every question when the page loads.
Therefore I have a relation between EvalAnswer and the scoring.
There are questions whose relation to the Score table is NOT sensitive.
But some are. And for them I need to hide the relation between EvalAnswer and Score.
What might be bad design is the fact that I use the same table (the Score table) as the
one to show the available scoring for every question,
and also as the one to hold what the user has chosen (this is the FK from EvalAnswer to Score).
Please advise.
I suggest adding an ID int column to the Score table and reference this field from the EvalAnswer table.
This means your table scripts will change to
CREATE TABLE Score (
Id int not null identity primary key
, Code varbinary(max) -- the new field containing the encrypted information
, Score int
, ScoreText varchar(5)
, Description varchar(max)
, [Index] int) -- Index is a reserved word in T-SQL, so it is bracketed
CREATE TABLE EvalAnswer (
Id int not null identity,
ScoreId int not null references Score(Id)
...
)
As you can see, the "Old" Id field has now become the Code field. The new Id field is an identity column containing a unique number.
There is nothing against using a varbinary column in a foreign key, but it will make querying and debugging much harder.
Also note there is a 900 byte limit on the width of an index which might be easier to hit when storing an encrypted blob.

opinions and advice on database structure

I'm building this tool for classifying data. Basically I will be regularly receiving rows of data in a flat-file that look like this:
a:b:c:d:e
a:b:c:d:e
a:b:c:d:e
a:b:c:d:e
And I have a list of categories to break these rows up into, for example:
Original Cat1 Cat2 Cat3 Cat4 Cat5
---------------------------------------
a:b:c:d:e a b c d e
As of right this second, the category names are known, as well as the number of categories to break the data down by. But this might change over time (for instance, categories added/removed, or the total number of categories changed).
Okay so I'm not really looking for help on how to parse the rows or get data into a db or anything...I know how to do all that, and have the core script mostly written already, to handle parsing rows of values and separating into variable amount of categories.
Mostly I'm looking for advice on how to structure my database to store this stuff. So I've been thinking about it, and this is what I came up with:
Table: Generated
generated_id int - unique id for each row generated
generated_timestamp datetime - timestamp of when row was generated
last_updated datetime - timestamp of when row last updated
generated_method varchar(6) - method in which row was generated (manual or auto)
original_string varchar (255) - the original string
Table: Categories
category_id int - unique id for category
category_name varchar(20) - name of category
Table: Category_Values
category_map_id int - unique id for each value (not sure if I actually need this)
category_id int - id value to link to table Categories
generated_id int - id value to link to table Generated
category_value varchar (255) - value for the category
Basically the idea is when I parse a row, I will insert a new entry into table Generated, as well as X entries in table Category_Values, where X is however many categories there currently are. And the category names are stored in another table Categories.
What my script will immediately do is process rows of raw values and output the generated category values to a new file to be sent somewhere. But then I have this db I'm making to store the data generated so that I can make another script, where I can search for and list previously generated values, or update previously generated entries with new values or whatever.
Does this look like an okay database structure? Anything obvious I'm missing or potentially gimping myself on? For example, with this structure...well...I'm not a sql expert, but I think I should be able to do like
select * from Generated where original_string = '$string'
// id is put into $id
and then
select * from Category_Values where generated_id = '$id'
...and then I'll have my data to work with for search results or form to alter data...well I'm fairly certain I can even combine this into one query with a join or something but I'm not that great with sql so I don't know how to actually do that..but point is, I know I can do what I need from this db structure..but am I making this harder than it needs to be? Making some obvious noob mistake?
My suggestion:
Table: Generated
id int unsigned auto_increment primary key
generated_timestamp timestamp
last_updated timestamp default '0000-00-00' ON UPDATE CURRENT_TIMESTAMP
generated_method ENUM('manual','auto')
original_string varchar (255)
Table: Categories
id int unsigned auto_increment primary key
category_name varchar(20)
Table: Category_Values
id int unsigned auto_increment primary key
category_id int unsigned
generated_id int unsigned
category_value varchar (255) - value for the category
FOREIGN KEY `fk_cat`(category_id) REFERENCES Categories(id)
FOREIGN KEY `fk_gen`(generated_id) REFERENCES Generated(id)
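The two separate lookups from the question can be combined into a single join. The sketch below uses Python with an in-memory SQLite database instead of MySQL, keeps only the columns needed for the query, and uses made-up sample rows; the helper name `values_for` is hypothetical. The search string is passed as a bound parameter rather than pasted into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Generated (
        id              INTEGER PRIMARY KEY,
        original_string TEXT NOT NULL
    );
    CREATE TABLE Categories (
        id            INTEGER PRIMARY KEY,
        category_name TEXT NOT NULL
    );
    CREATE TABLE Category_Values (
        id             INTEGER PRIMARY KEY,
        category_id    INTEGER NOT NULL REFERENCES Categories(id),
        generated_id   INTEGER NOT NULL REFERENCES Generated(id),
        category_value TEXT NOT NULL
    );
    INSERT INTO Generated VALUES (1, 'a:b:c');
    INSERT INTO Categories VALUES (1, 'Cat1'), (2, 'Cat2'), (3, 'Cat3');
    INSERT INTO Category_Values (category_id, generated_id, category_value)
        VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 1, 'c');
""")


def values_for(conn, original_string):
    """Return (category_name, category_value) pairs for one original string."""
    return conn.execute(
        """
        SELECT c.category_name, cv.category_value
        FROM Generated g
        JOIN Category_Values cv ON cv.generated_id = g.id
        JOIN Categories c ON c.id = cv.category_id
        WHERE g.original_string = ?
        ORDER BY c.id
        """,
        (original_string,),
    ).fetchall()


result = values_for(conn, "a:b:c")
```

The same SELECT should work unchanged in MySQL; only the placeholder style depends on the client library.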
Links
Timestamps: http://dev.mysql.com/doc/refman/5.1/en/timestamp.html
Create table syntax: http://dev.mysql.com/doc/refman/5.1/en/create-table.html
Enums: http://dev.mysql.com/doc/refman/5.1/en/enum.html
I think this solution is perfect for what you want to do. The Categories list is now flexible, so you can add new categories or retire old ones (I would recommend thinking long and hard before agreeing to delete a category: would you orphan its records or remove them too, etc.)
Basically, I'm saying you are right on target. The structure is simple but it will work well for you. Great job (and great job giving exactly the right amount of information in the question).