I have an app that will allow an admin to upload an article and share it with many users to edit. The article is then broken down into sentences, which are stored as individual rows in a MySQL DB. Each user can edit article sentences one at a time. How does one structure the database to allow admins to adjust the article sentences (merge, move, delete, edit, add) and still maintain the integrity of the user's relationship to the article sentences?
Here is the basic structure:
article_sentences
---------------
-id (auto_increment)
-article_id (FK)
-paragraph_id
-content
user_article_sentences
---------------
-user_id (FK)
-article_id (FK)
-article_sentence_id (FK)
-user_content
One problem I see is the change in article_sentence ID. If the admin moves an article around, the ID will need to change along with the paragraph_id possibly changing if we want the article content to be in the correct order. To solve this, maybe we can add an article_sentence_order column? That way the id will never change but the order of the content is dictated by the article_sentence_order column.
What about merging and deleting? Those will cause some problems as well because fragmentation of the different IDs will start to happen.
Any ideas on a new schema design that will help solve these issues? How does an app like Google Docs deal with this type of issue?
Edit:
To solve the issue of moving sentences around, we can use a new column called order_id, which can be either a varchar or an int. Some tradeoffs: if it is an int, I will have to increment the order_id of every subsequent sentence by 1. If it is a varchar, the order_id can simply be something like '3a' if I want to insert between 3 and 4. The problem with this is that in my application code, using numeric indexes to traverse to the next and previous sentences will be a bit of a problem.
Are there other alternatives?
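One alternative to both options is to keep the id stable and store the order in a separate integer column that leaves gaps, so an insert between two sentences takes the midpoint and no other row is touched. The following is a minimal sketch, not a definitive design: it uses Python's sqlite3 as a stand-in for MySQL, and the column name position, the GAP value and the helper functions are all assumptions made for illustration.

```python
import sqlite3

GAP = 1024  # assumed spacing between neighbouring positions

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE article_sentences (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,  -- stable, never changes
        article_id INTEGER NOT NULL,
        position   INTEGER NOT NULL,                   -- display order only
        content    TEXT NOT NULL
    )
""")

def append_sentence(article_id, content):
    """Add a sentence at the end of the article and return its id."""
    (last,) = conn.execute(
        "SELECT COALESCE(MAX(position), 0) FROM article_sentences WHERE article_id = ?",
        (article_id,),
    ).fetchone()
    cur = conn.execute(
        "INSERT INTO article_sentences (article_id, position, content) VALUES (?, ?, ?)",
        (article_id, last + GAP, content),
    )
    return cur.lastrowid

def insert_between(article_id, before_id, after_id, content):
    """Insert a sentence between two neighbours by taking the midpoint position."""
    lo, hi = (
        conn.execute(
            "SELECT position FROM article_sentences WHERE id = ?", (sentence_id,)
        ).fetchone()[0]
        for sentence_id in (before_id, after_id)
    )
    if hi - lo < 2:
        raise ValueError("no gap left; renumber this article's positions first")
    cur = conn.execute(
        "INSERT INTO article_sentences (article_id, position, content) VALUES (?, ?, ?)",
        (article_id, (lo + hi) // 2, content),
    )
    return cur.lastrowid

def ordered(article_id):
    """Return the article's sentences in display order."""
    rows = conn.execute(
        "SELECT content FROM article_sentences WHERE article_id = ? ORDER BY position",
        (article_id,),
    )
    return [row[0] for row in rows]

first = append_sentence(1, "First.")
second = append_sentence(1, "Second.")
middle = insert_between(1, first, second, "Inserted.")
print(ordered(1))
```

Because user_article_sentences points at id rather than at the position, moving a sentence only updates its position and the users' rows stay valid. When a gap runs out, the positions of that one article can be renumbered in a single pass.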
What about holding only full version of content, with a version number for each record so you will have a complete history of the article edited and by whom it was modified?
User:
- id
- name
User_article:
- id
- user_id (fk on user, this is the current editor)
- article_id
- version_number
- article_content (the full content of the article)
Article:
- id
- created_date
- user_id (the creator, or main owner )
- category_id
This way, it is very easy to revert an article's content to a previous point in history, to see which user made which modifications, etc.
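A minimal sketch of this versioning idea follows, using Python's sqlite3 as a stand-in for MySQL. The table and column names come from the outline above; the helper functions save_version, current and revert are hypothetical, and reverting is modelled here as writing the old content as a new version so that the history is never rewritten.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_article (
        id              INTEGER PRIMARY KEY,
        user_id         INTEGER NOT NULL,   -- editor who produced this version
        article_id      INTEGER NOT NULL,
        version_number  INTEGER NOT NULL,
        article_content TEXT NOT NULL,      -- the full content of the article
        UNIQUE (article_id, version_number)
    )
""")

def save_version(article_id, user_id, content):
    """Store the full content as the next version and return its number."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version_number), 0) FROM user_article WHERE article_id = ?",
        (article_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO user_article (user_id, article_id, version_number, article_content) "
        "VALUES (?, ?, ?, ?)",
        (user_id, article_id, latest + 1, content),
    )
    return latest + 1

def current(article_id):
    """Return (version_number, user_id, article_content) of the newest version."""
    return conn.execute(
        "SELECT version_number, user_id, article_content FROM user_article "
        "WHERE article_id = ? ORDER BY version_number DESC LIMIT 1",
        (article_id,),
    ).fetchone()

def revert(article_id, user_id, to_version):
    """Revert by copying an old version forward as a new one."""
    (content,) = conn.execute(
        "SELECT article_content FROM user_article "
        "WHERE article_id = ? AND version_number = ?",
        (article_id, to_version),
    ).fetchone()
    return save_version(article_id, user_id, content)

v1 = save_version(7, 1, "Draft")
v2 = save_version(7, 2, "Draft, edited")
v3 = revert(7, 1, 1)
print(current(7))
```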
My question is actually about the usability and performance of a concept I had:
The Setup:
Throughout my database, two (actually three) fields constantly reappear: title and description (and created). The title is always a VARCHAR(100) and the description always a TEXT.
Now, to simplify those tables, I thought about something (and changed it that way): wouldn't it be more useful to just create a table named content, with id, title, description and created as its only fields, and always point to that table from all the others?
Example:
table tab has id, key and content_id (instead of title, description and created)
table chapter has id, story_id and content_id (" ")
etc
The Question:
Everything works fine so far, but my only fear is performance. Will I run into a bottleneck, doing it this way, or should I be fine? I have about 23 different tables pointing to content right now, and some of them will hold user-defined content (journals, comments, etc) - so the number of entries in content could get quite high.
Is this setup better, or equal to having title and description in every separate table?
Edit: And if it turns out to be a bad idea, what are the alternatives for maintaining/copying certain fields like title and description across ~25 tables?
Thanks in advance for the help!
There is no clear answer for your question because it mainly depends on usage of the tables, so just consider following points:
How often will you need to write to the tables? With many inserts/updates, having the data in one big table can cause problems because all write operations will target the same table.
How often do you need the data stored in the table with the common data? If title or description are not needed most of the time for your selects, this can be OK. If you need the title every time, then take into account that you will always have to JOIN the table with the common data.
How do you manage your database schema? It can be easier to write some simple tool for creation/checking table structure. In MySQL you can easily access data dictionary with DESCRIBE table_name or through INFORMATION_SCHEMA database.
I'm working on a project with 700+ tables where some of the fields have to be present in every table (when the record was created, timestamp of last modification). We have a simple script that helps with this, because having all the data in one table would be disastrous.
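The kind of checking script mentioned above could look roughly like this. It is only a sketch under stated assumptions: it runs against SQLite through Python's sqlite3 and reads the schema with sqlite_master and PRAGMA table_info, whereas on MySQL the same information would come from INFORMATION_SCHEMA.COLUMNS. The REQUIRED set and the two example tables are made up for illustration.

```python
import sqlite3

# Columns that every table is expected to carry (assumed for this sketch).
REQUIRED = {"title", "description", "created"}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab (
        id INTEGER PRIMARY KEY, title TEXT, description TEXT, created TEXT
    );
    CREATE TABLE chapter (
        id INTEGER PRIMARY KEY, story_id INTEGER, title TEXT
    );
""")

def missing_columns(conn, required):
    """Map each table name to the sorted list of required columns it lacks."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    report = {}
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        present = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
        missing = required - present
        if missing:
            report[table] = sorted(missing)
    return report

print(missing_columns(conn, REQUIRED))
```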
As I continue to work on my social networking site (which I'll probably never finish), I've decided I probably should revise my "Updates" table. If you think of this like Facebook, the Updates table stores stories for the newsfeed, such as User_123 changed his status, or SomeOtherUser added a new photo/video, or YetAnotherUser joined a group.
My current table structure is as follows:
UPDATES
PK Update_ID
Type
Update_Content
FK Photo_ID
FK Video_ID
FK Owner_ID
FK Group_Wall_ID
FK Friend_Wall_ID
Upvotes
Downvotes
Timestamp
As a note, Type refers to the kind of update it is (1 is a status update, 2 is a user joined a group, 3 is a new photo, etc...) and Update_Content is the status text, or a message like "User_123 joined a group"
Right now the way I have it, when a user posts an update to their own "wall", Group_Wall_ID and Friend_Wall_ID are 0 by default. Whereas if that user posts an update to a Group, Group_Wall_ID has a value and Friend_Wall_ID doesn't.
Also, if the update is only a status update, Photo_ID and Video_ID are 0 by default. However, if the update is a new photo, Photo_ID would have a value that corresponds with a PK in the Photos table.
I feel like the structure of this table is pretty inefficient and can use some revisions. Can anyone suggest any revisions to make this table better? Any feedback would be great! Thanks and Happy Holidays!
I don't think this application is a good fit for MySQL. With MySQL you are pulling all the resources together on every read. It also doesn't seem that a feed will need to span very far back chronologically.
I think a better solution is to push an activity to the appropriate feed on write. So if you post a video, it gets appended to all of your friends news feeds. You could limit each feed to 100 items to keep the lists smaller.
I think using Redis would be more appropriate. You could have a list for each user's activity feed: LPUSH user_id 'John just added a video'
This solution requires you to have a lot of memory though, and also it may be problematic if a user deletes something from their feed.
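The push-on-write idea can be sketched without a Redis server. In the following illustration a bounded collections.deque stands in for a Redis list that is trimmed after each push; the friends mapping, the publish helper and the user names are hypothetical, and only the limit of 100 items comes from the suggestion above.

```python
from collections import defaultdict, deque

FEED_LIMIT = 100  # keep at most this many items per feed

# Hypothetical friend lists; in a real app these would come from the database.
friends = {"john": ["ann", "bob"], "ann": ["john"], "bob": ["john"]}

# One bounded feed per user. With maxlen set, pushing on the left
# silently drops the oldest item from the right once the feed is full.
feeds = defaultdict(lambda: deque(maxlen=FEED_LIMIT))

def publish(author, message):
    """Fan the activity out to every friend's feed at write time."""
    for friend in friends.get(author, []):
        feeds[friend].appendleft(message)  # newest first

publish("john", "John just added a video")
for n in range(150):
    publish("john", f"John posted status {n}")

print(len(feeds["ann"]), feeds["ann"][0])
```

Reading a feed is then a plain list read with no joins, which is the point of moving the work to write time. The cost is one write per friend for every activity.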
I'm creating a users table in my database. I want to let users choose which notifications they receive by email.
Now I have the next columns in table users (boolean type):
notification_comment_photo.
notification_comment_comment.
notification_vote_photo.
notification_vote_comment.
notification_pm.
notification_followed.
notification_news.
What do you think, should I normalise table users and create another table notifications, considering that this table would have one-to-one relationship to table users?
Also I have the same problem with social links (twitter, facebook, google+, etc). Is it better to make a separate table links?
upd. Thanks all, I'll add the separate tables.
It's hard to answer your question, because you're not telling us what problem you're trying to solve.
One issue with your current design is that it requires a schema change for every new type of notification you want to store - if you want to notify users when they've been un_followed, you have to add a column to your users table.
I'd consider a schema like:
TABLE: users
------------------
ID
...
TABLE: notification_types
----------------------
ID
Description
TABLE: user_notification_subscriptions
-----------------------------------------
user_id
notification_type_id
subscribed (bool)
You could leave the "subscribed" column out of user_notification_subscriptions and decide that any record linking a user to a notification type means they have subscribed.
This design allows you to add new subscription types without changing the schema. I believe it's similar to the design #Daniel suggests, but he doesn't include the notification_type table, relying instead on name-value pairs. I'm not a fan of this - it can lead to silly, hard-to-find bugs when typos slip into the TYPE column.
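Here is a minimal sketch of that three-table design, using Python's sqlite3 as a stand-in for the real database. It follows the variant without the "subscribed" column, where the existence of a row means the user is subscribed. The sample descriptions and the is_subscribed helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE notification_types (
        id          INTEGER PRIMARY KEY,
        description TEXT NOT NULL UNIQUE
    );
    CREATE TABLE user_notification_subscriptions (
        user_id              INTEGER NOT NULL REFERENCES users (id),
        notification_type_id INTEGER NOT NULL REFERENCES notification_types (id),
        PRIMARY KEY (user_id, notification_type_id)
    );
    INSERT INTO users (id) VALUES (1);
    INSERT INTO notification_types (id, description)
        VALUES (1, 'comment_photo'), (2, 'pm');
    INSERT INTO user_notification_subscriptions VALUES (1, 2);
""")

def is_subscribed(user_id, description):
    """True if a row links the user to the named notification type."""
    row = conn.execute(
        "SELECT 1 FROM user_notification_subscriptions AS s "
        "JOIN notification_types AS t ON t.id = s.notification_type_id "
        "WHERE s.user_id = ? AND t.description = ?",
        (user_id, description),
    ).fetchone()
    return row is not None

# A new notification type is just a new row, not a schema change.
conn.execute("INSERT INTO notification_types (id, description) VALUES (3, 'unfollowed')")
conn.execute("INSERT INTO user_notification_subscriptions VALUES (1, 3)")

print(is_subscribed(1, "pm"), is_subscribed(1, "comment_photo"), is_subscribed(1, "unfollowed"))
```

The foreign key on notification_type_id is what guards against the typo problem: a subscription to a type that does not exist is rejected by the database instead of being stored silently.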
You could (and probably should) create a separate table "notification_settings" or something.
ID
USER_ID
TYPE
VALUE
This allows you to easily add notification settings without messing with the database tables. Having a "strict" structure as you suggested sometimes gets in the way in the end and would be harder to expand.
For your social links, you should do the same. Another table named "user_social_accounts"
ID
USER_ID
NETWORK_ID
Assuming I want to have a web application that requires storing user information, images, etc as well as storing status updates or posts/comments would I want to separate tables?
For example, if I have a "users" table that contains user information like passwords, emails, and typical social networking info like age, location etc., would it be a good idea to create a second table ("posts") that handles user content such as comments and/or posts?
Table one: "users"
UserID
Username
Age
etc.
Table Two: "posts"
PostID
PostContent
PostAuthor
PostDate
etc
Is this a valid organization? Furthermore if I wanted to keep track of media should I do this in ANOTHER table?
Table Three: "media"
ID
Type
Uploader
etc.
Any help is much appreciated. I'm curious to see if I'm on the right track or just completely lost. I am mostly wondering if I should have many tables or if I should have larger less segregated tables.
Also of note thus far I planned on keeping information such as followers(or friends) in the 'users' table but I'm not sure that's a good idea in retrospect.
thanks in advance,
Generally speaking, to design a database you create a table for each object you will be dealing with. In your example you have Users, Posts, Comments and Media. From that you can flesh out what it is you want to store for each object. Each item you want to store is a field in the table:
[Users]
ID
Username
PasswordHash
Age
Birthdate
Email
JoinDate
LastLogin
[Posts]
ID
UserID
Title
Content
CreateDate
PostedDate
[Comments]
ID
PostID
UserID
Content
[Media]
ID
Title
Description
FileURI
Taking a look above, you can see a basic structure for holding the information for each object. From the field names you can even tell the relationships between the objects. A post has a UserID, so the post was created by that user. A comment has a PostID and a UserID, so you can see that the comment was written by a person for a specific post.
Once you have the general fields identified you can look at some other aspects of the design. For example, right now the Email field under the Users table means that a user can have one (1) email address, no more. You can solve this in one of two ways. The first is to add more email fields (EmailA, EmailB, EmailC); this generally works if you know there are specific types of emails you are dealing with, for example EmailWork or EmailHome. It doesn't work if you do not know how many emails there will be in total. To solve that, you can pull the emails out into their own table:
[Users]
ID
Username
PasswordHash
Age
Birthdate
JoinDate
LastLogin
[Emails]
ID
UserID
Email
Now you can have any number of emails for a single user. You can do this for just about any database you are trying to design. Take it in small steps and break your bigger objects into smaller ones as needed.
Update
To deal with friends you should think about the relationship you are dealing with. There is one (1) person with many friends. In relation to the tables above, it's one User to many Users. This can be done with a special table that holds no information other than the relationship you are looking for.
[Friends]
[UserA]
[UserB]
So if the current user's ID is in A, his friend's ID is in B, and vice versa. This sets up the friendship so that if you are my friend, then I am your friend. There is no way for me to be your friend without you being mine. If you want to allow one-way friendships, you can set up the table like this:
[Friends]
[UserID]
[FriendID]
So if we are both friends with each other there would have to be 2 records, one for my friendship to you and one for your friendship to me.
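The two-way variant can be sketched as follows, with Python's sqlite3 standing in for the real database. Storing each pair once with the smaller ID first is an assumption of this sketch, not something the table outline requires; the column names user_a and user_b and both helper functions are likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE friends (
        user_a INTEGER NOT NULL,
        user_b INTEGER NOT NULL,
        PRIMARY KEY (user_a, user_b),
        CHECK (user_a < user_b)   -- each pair is stored once, smaller ID first
    )
""")

def add_friendship(x, y):
    """Record a mutual friendship; repeating the same pair is a no-op."""
    a, b = sorted((x, y))
    conn.execute(
        "INSERT OR IGNORE INTO friends (user_a, user_b) VALUES (?, ?)", (a, b)
    )

def friends_of(user_id):
    """Look in both columns, because the user may sit on either side."""
    rows = conn.execute(
        "SELECT user_b FROM friends WHERE user_a = ? "
        "UNION "
        "SELECT user_a FROM friends WHERE user_b = ? "
        "ORDER BY 1",
        (user_id, user_id),
    )
    return [row[0] for row in rows]

add_friendship(2, 1)
add_friendship(1, 3)
add_friendship(1, 2)  # same pair again, ignored

print(friends_of(1), friends_of(2), friends_of(3))
```

For the one-way design no ordering rule is needed: a row (UserID, FriendID) means exactly that direction, and a mutual friendship is simply two rows.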
You need to use multiple tables.
The number of tables depends on how complex you want your interactive site to be. Based on what you have posted, you would need a table that stores information about the users, a table for comments, and more, such as a table to store status types.
For example tbl_Users should store:
1. UserID
2. First Name
3. Last name
4. Email
5. Password (stored as a salted hash, never in plain text)
6. Address
7. City
8. State
9. Country
10. Date of Birth
11. UserStatus
12. Etc
This project sounds like it should be using a relational DB that will pull up records, such as comments, by relative userIDs.
This means that you will need a table that stores the following:
1. CommentID (primary key, int, auto-increment)
2. Comment (text)
3. UserID (foreign key, int)
The comment is attached to a user through a foreign key, which is essentially the userId from the tbl_Users table. You would need to combine these tables in an SQL statement with your script to query the information as a single piece of information. See example code
$userID = (int) $userID; // cast to an integer so user input cannot inject SQL
$sql_userWall = "SELECT tbl_Users.*, tbl_Comments.*, tbl_userStatus.* FROM tbl_Users
INNER JOIN tbl_Comments ON tbl_Users.userID = tbl_Comments.userID
INNER JOIN tbl_userStatus ON tbl_Users.UserStatus = tbl_userStatus.statusID
WHERE tbl_Users.userID = $userID";
This statement essentially says: get the information of the provided user from the users table, also get all the comments that have the same userID attached to them, and get the user's status from the table of user statuses.
Therefore you would need a table called tbl_userStatus that holds unique statusIDs (primary key, int, auto-incrementing) along with a text column (varchar) of a chosen maximum length that may say, for example, "online" or "offline". When you write the info out from a record using PHP, ASP or a similar language, the joined query has already retrieved the information from tbl_userStatus for you, so you can print it from the fetched row with a simple line like
<?php echo htmlspecialchars($row['statusText']); // $row is one fetched result row; statusText is an assumed column name ?>
No extra work necessary. Most of your project time will be spent developing the DB structure and writing SQL statements that correctly retrieve the info you want for each page.
There are many great YouTube video series that describe relational DBs and drawing entity-relationship diagrams. This is what you should look into to learn more about creating the type of project you were describing.
One last note, if you wanted comments to be visible for all members of a group this would describe what is known as a many-to-many relationship which would require additional tables to allow for multiple users to 'own' a relationship to a single table. You could store a single groupID that referred to a table of groups.
tbl_groups
1. GroupID
2. GroupName
3. More group info, etc
And a table of users registered for the group
Tbl_groupMembers
1. membershipCountID (primary key, int, auto-increment)
2. GroupID (foreign key, int)
3. UserID (foreign key, int)
This allows users to register for a group, and lets you inner join them to group-based comments. These relationships take a little more time to understand; the videos will help greatly.
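A rough sketch of that many-to-many join follows, using Python's sqlite3 as a stand-in database. tbl_groups and tbl_groupMembers follow the outlines above; tbl_groupComments, the sample rows and the visible_comments helper are assumptions added only to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_groups (
        GroupID   INTEGER PRIMARY KEY,
        GroupName TEXT NOT NULL
    );
    CREATE TABLE tbl_groupMembers (
        membershipCountID INTEGER PRIMARY KEY AUTOINCREMENT,
        GroupID INTEGER NOT NULL REFERENCES tbl_groups (GroupID),
        UserID  INTEGER NOT NULL,
        UNIQUE (GroupID, UserID)          -- a user joins a group at most once
    );
    -- Hypothetical table: each group comment records the group it was posted to.
    CREATE TABLE tbl_groupComments (
        CommentID INTEGER PRIMARY KEY AUTOINCREMENT,
        GroupID   INTEGER NOT NULL REFERENCES tbl_groups (GroupID),
        UserID    INTEGER NOT NULL,
        Comment   TEXT NOT NULL
    );
    INSERT INTO tbl_groups VALUES (1, 'Hikers'), (2, 'Bakers');
    INSERT INTO tbl_groupMembers (GroupID, UserID) VALUES (1, 10), (1, 11), (2, 11);
    INSERT INTO tbl_groupComments (GroupID, UserID, Comment) VALUES
        (1, 10, 'Trail is open'),
        (2, 11, 'New sourdough recipe');
""")

def visible_comments(user_id):
    """Comments from every group the user is a member of, oldest first."""
    rows = conn.execute(
        """
        SELECT g.GroupName, c.Comment
        FROM tbl_groupMembers AS m
        INNER JOIN tbl_groups AS g ON g.GroupID = m.GroupID
        INNER JOIN tbl_groupComments AS c ON c.GroupID = m.GroupID
        WHERE m.UserID = ?
        ORDER BY c.CommentID
        """,
        (user_id,),
    )
    return rows.fetchall()

print(visible_comments(10))
print(visible_comments(11))
```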
I hope this helps, I'll come back and post some YouTube links later that I found helpful learning this stuff.
I am trying to normalize my database but I'm having a headache getting to grips with it. I am developing a CMS where Facebook users can create a page on my site. So far this is what I have
page
----
uid - PK AI
slug - Slug URL
title - Page title
description - Page description
image - Page image
imageThumbnail - Thumbnail of image
owner - The ID of the user that created the page
views - Page views
timestamp - Date page was created
user
----
uid - PK AI
fbid - Facebook ID
(at a later date may add profile options i.e name, website etc)
tags
----
uid - PK AI
tag - String (tag name)
page_tag
--------
pid - Page id (uid from page table)
tid - Tag id (uid from tag table)
page_user
---------
pid - Page id (uid from page table)
uid - User ID (uid from user table)
I've tried to separate as much information as needed without going over the top. I created a separate table for tags because I don't want tag names being repeated. If the database holds 100,000+ pages, repeated tags would no doubt cost both storage and speed.
Are there any problems with the design? Or anything I'm doing wrong? I remember learning this at university but I've done very little database design since then.
I'd rather get it right the first time than have the headache later on.
Looks fine to me. How bad can it be with five tables?
You have users, pages, and tags. Users can have many pages; pages can be referred to by many users. A page can have many tags; a tag can be associated with many pages.
Sums it up for me. I wouldn't worry about it.
Your next concern is indexes. You'll want an index on the columns used in each WHERE clause (and JOIN condition) that you'll query with.
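To see whether a query can actually use an index, most databases let you ask for the query plan. The sketch below does this with Python's sqlite3 and EXPLAIN QUERY PLAN as a stand-in; MySQL has its own EXPLAIN with different output. The index names and the plan helper are assumptions for illustration, and the page table is cut down to a few columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (
        uid   INTEGER PRIMARY KEY,
        slug  TEXT NOT NULL,
        title TEXT,
        owner INTEGER
    );
    -- Pages are looked up by slug and listed by owner, so index both.
    CREATE UNIQUE INDEX idx_page_slug  ON page (slug);
    CREATE INDEX        idx_page_owner ON page (owner);
""")

def plan(sql, params=()):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " ".join(row[3] for row in rows)  # column 3 holds the plan text

by_slug = plan("SELECT title FROM page WHERE slug = ?", ("about",))
by_title = plan("SELECT uid FROM page WHERE title = ?", ("About",))

print(by_slug)   # mentions idx_page_slug: an index lookup
print(by_title)  # no index on title, so the whole table is scanned
```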