I'm building a forum and I'm wondering whether I should have one table where I store all main posts and then all the answers in another table.
I've always stored everything in one table, which makes it easy to count and lets users comment on every post (comments in another table).
What should I do? Pros and cons? Tried to google but didn't find anything.
Thanks for your help!
I usually follow the rule that every type of dataset gets its own table. This way you can cleanly define relationships.
You have types like
userTypes (e.g. guest,user,mod,admin)
users (has one userType_id)
posts (has one user_id)
answers (has one post_id)
comments (has one post_id, has one answer_id)
Since comments can be added to both a post and an answer, you could add two bridge tables to define this relationship.
comment_to_answer (has one comment_id, has one answer_id)
comment_to_question (has one comment_id, has one question_id)
If you save both posts and answers in one table, the posts table would need to reference itself to define the one-to-many relationship between posts and answers, which would make querying more complicated.
If you want nesting, where a post can have an answer, an answer can have an answer, and so on, you are probably better off with one posts table and a parent_id pointing at the id of posts.
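A minimal sketch of the single-table variant with parent_id, using Python's sqlite3 module as a stand-in for whatever database you run. The body column and the sample rows are assumptions made for illustration; the recursive query walks from one post down through all nested answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES posts(id),  -- NULL for a top-level post
    body      TEXT NOT NULL                  -- assumed column, for illustration
);
INSERT INTO posts (id, parent_id, body) VALUES
    (1, NULL, 'main post'),
    (2, 1,    'answer to post 1'),
    (3, 2,    'answer to the answer'),
    (4, NULL, 'another main post');
""")

def thread(conn, root_id):
    """Return (id, depth) for a post and everything nested below it."""
    return conn.execute("""
        WITH RECURSIVE tree(id, depth) AS (
            SELECT id, 0 FROM posts WHERE id = ?
            UNION ALL
            SELECT p.id, tree.depth + 1
            FROM posts AS p JOIN tree ON p.parent_id = tree.id
        )
        SELECT id, depth FROM tree ORDER BY depth, id
    """, (root_id,)).fetchall()
```

This is the extra query complexity mentioned above: a plain join only reaches one level, so arbitrary nesting needs a recursive query (or several round trips).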
Hope this helps.
This is highly subjective, given that it depends on your needs and intentions. Let me explain...
Forums, such as they are, usually take one of two forms. They are sometimes in the form of a main post, with subsequent comments. The main post, therefore logically would live in one table, and the comments in another. Facebook and Stack Exchange sites are examples of this.
In other cases, the content may take the form of a list of comments. More traditional forums take this form. In the case where hierarchy is called for, rather than pure date ordering, the single table approach makes more sense.
In both cases, hierarchy can be dealt with by creating a parent and child column.
My personal preference would be to go with one table, unless the main post carries a lot of additional data that isn't needed for comments. That's just an efficiency thing, but you definitely don't want thousands or millions of rows with NULLs that take up space to no avail. To distinguish a post from a comment you can employ any number of logical schemes, such as distinct IDs or flag columns.
Ultimately, an architectural situation like this depends on the project in question. There are advantages and disadvantages to both approaches. Using multiple tables offers a little more in the way of 'future proofing' your project in the case where complexity is added at a later date.
Well - since one post can have multiple answers I would go for a separate table.
Pro:
Less redundant information
Answer can be updated/deleted independent from the post
Easier counting of questions / answers
Con:
None (well, perhaps that you need to join the tables for certain queries, but hey, that does not really count)
Same reason why you have comments already in another table...
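To make the "easier counting" and "join for certain queries" points concrete, here is a small sketch with two tables, again using sqlite3 as a stand-in. The column names title and body are assumptions, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE answers (
    id      INTEGER PRIMARY KEY,
    post_id INTEGER NOT NULL REFERENCES posts(id),
    body    TEXT NOT NULL
);
INSERT INTO posts VALUES (1, 'first question'), (2, 'unanswered question');
INSERT INTO answers (post_id, body) VALUES (1, 'answer a'), (1, 'answer b');
""")

def answer_counts(conn):
    """Return (post id, number of answers), including posts without answers."""
    return conn.execute("""
        SELECT p.id, COUNT(a.id)
        FROM posts AS p
        LEFT JOIN answers AS a ON a.post_id = p.id
        GROUP BY p.id
        ORDER BY p.id
    """).fetchall()

def total_posts(conn):
    # Counting main posts needs no filter because answers live elsewhere.
    return conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```

The LEFT JOIN is the one place where the split costs something: it is needed so that posts with zero answers still show up.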
I have the following database schema (I don't know if it's perfect, but I think it's all right?)
It's a system where a User has many Surveys, the Users provide Answers for Questions in the Survey_Answers table.
Now a User can have multiple Surveys: the questions are the same, but later in the year they have to fill in the survey again.
I'm nearly there, I'm just wondering how to connect the answers to the survey. Should I make a relation between survey_answers and user_surveys, thus adding an id to the user_surveys table?
Or do you think it's ok to make a relation to the surveys table? I'm not sure which one is correct.
I outlined the 2 possibilities in the second screenshot.
Looking forward to your responses!
Thank you.
This probably depends on how your system is most likely/most frequently going to navigate the relationship.
If you are more likely to be looking at a User's answers and saying "hey, let me see when this question was answered, as part of which dated survey", then you should join on user_surveys (I am assuming that the employee_id you are storing would match the user_id in user_surveys).
If you're more likely to be looking at a User's answers and saying "hey, what survey did this question belong to", then you should join on Surveys.
You can still answer either question whichever join you use, it will just be a matter of more optimal performance (fewer table joins when trying to answer the most common query).
In reality there probably isn't much in it, so you could always toss a coin :)
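For what it's worth, here is a rough sketch of the first option (survey_answers pointing at user_surveys). The schema screenshot isn't reproduced here, so every column name below (id, user_id, survey_id, taken_on, user_survey_id, question_id, answer) is an assumption made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_surveys (
    id        INTEGER PRIMARY KEY,   -- the id that would be added
    user_id   INTEGER NOT NULL,
    survey_id INTEGER NOT NULL,
    taken_on  TEXT NOT NULL          -- assumed date column
);
CREATE TABLE survey_answers (
    id             INTEGER PRIMARY KEY,
    user_survey_id INTEGER NOT NULL REFERENCES user_surveys(id),
    question_id    INTEGER NOT NULL,
    answer         TEXT NOT NULL
);
INSERT INTO user_surveys VALUES
    (1, 7, 100, '2020-01-15'),
    (2, 7, 100, '2020-07-15');
INSERT INTO survey_answers (user_survey_id, question_id, answer) VALUES
    (1, 1, 'yes'),
    (2, 1, 'no');
""")

def answers_with_context(conn, user_id, question_id):
    """When was this question answered, and as part of which survey?"""
    return conn.execute("""
        SELECT us.survey_id, us.taken_on, sa.answer
        FROM survey_answers AS sa
        JOIN user_surveys AS us ON us.id = sa.user_survey_id
        WHERE us.user_id = ? AND sa.question_id = ?
        ORDER BY us.taken_on
    """, (user_id, question_id)).fetchall()
```

With this layout a single join answers both "when" and "which survey", because user_surveys already carries the survey_id.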
I guess this is a very basic question and others must have had similar issues, but the truth is that I have found very little information. I am developing a website with multiple types of content: articles, threads, recipes, etc. All these content types can be commented on and "liked". Comments may also receive "likes". I am no specialist in database architecture, and for the prototype I created different tables: comment_article, comment_thread, comment_recipe ... and like_article, like_thread, like_recipe, like_comment.
Now I want to simplify the structure to the minimum possible number of tables: comments and likes.
I would like to know the most performance efficient way to accomplish this:
Fields content_type and parent_id, to specify the type of content and the id of the row it refers to.
Fields content_type, thread_id, article_id, recipe_id.
Any others?
Note: We are using relational database with InnoDB storage engine.
I'm not an expert on this, but no answers yet, so I will try to come up with an answer:
You'd only want different tables/columns if the content types are very different from each other, i.e. if the fields need different types. If all are, for example, 'text', you just add a 'type' column in the content table, which you can later handle independently in your code.
Comments will be a separate table with a relation (FK) to Content (PK).
Likes too. This table will have a reference (FK) column to content (PK) and a reference (FK) column to comments (PK), of which only one is set per like, of course.
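A rough sketch of that likes table, using sqlite3 as a stand-in. The column names and the CHECK constraint are my own assumptions: the constraint is one way to enforce "only one of the two references is set" inside the database instead of in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE content (
    id   INTEGER PRIMARY KEY,
    type TEXT NOT NULL,              -- 'article', 'thread', 'recipe', ...
    body TEXT
);
CREATE TABLE comments (
    id         INTEGER PRIMARY KEY,
    content_id INTEGER NOT NULL REFERENCES content(id),
    body       TEXT NOT NULL
);
CREATE TABLE likes (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL,
    content_id INTEGER REFERENCES content(id),
    comment_id INTEGER REFERENCES comments(id),
    -- exactly one of the two references must be set
    CHECK ((content_id IS NULL) <> (comment_id IS NULL))
);
INSERT INTO content VALUES (1, 'article', 'hello');
INSERT INTO comments VALUES (1, 1, 'nice article');
""")

def try_like(conn, user_id, content_id=None, comment_id=None):
    """Insert a like; return False if the CHECK constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO likes (user_id, content_id, comment_id) VALUES (?, ?, ?)",
            (user_id, content_id, comment_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A like on a piece of content or on a comment goes through; a like with neither or both targets set is rejected.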
For those with a similar problem, a well explained answer can be found here: Implementing Comments and Likes in database
I am developing a forum in PHP MySQL. I want to make my forum as efficient as I can.
I have made these two tables
tbl_threads
tbl_comments
Now, the problem is that there is a Like and a Dislike button under each comment. I have to store the user_name that clicked the Like or Dislike button together with the comment_id. I have made a column user_likes and a column user_dislikes in tbl_comments to store comma-separated user_names. But on this forum I have read that this is not an efficient way. I have been advised to create a third table to store the likes and dislikes and to make my database design comply with 1NF.
But the problem is, If I make a third table tbl_user_opinion and make two fields like this
1. comment_id
2. type (like or dislike)
So, will I have to run as many SQL queries as there are comments on my page to get the like and dislike data for each comment? Would that not be inefficient? I think there is some confusion on my part here. Can someone clarify this?
You have a Relational Scheme like this:
There are two ways to solve this. The first one, the "clean" one is to build your "like" table, and do "count(*)'s" on the appropriate column.
The second one would be to store in each comment a counter, indicating how many up's and down's have been there.
If you want to check whether a specific user has voted on a comment, you only have to check one entry, which you can easily handle as its own query and merge the two results outside of your database (for this, use a query returning the comment_id and the kind of vote the user cast in a specific thread).
Your approach with a comma-separated list does not perform well, because you cannot query it without extra logic or a huge amount of string parsing. If you have a database, use it!
("One Information - One Dataset"!)
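On the worry about running one query per comment: with a separate opinion table, a single grouped query can return the counts for every comment in a thread. A minimal sketch with sqlite3 as a stand-in; the thread_id, body and user_name columns and the composite primary key are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_comments (
    comment_id INTEGER PRIMARY KEY,
    thread_id  INTEGER NOT NULL,
    body       TEXT NOT NULL
);
CREATE TABLE tbl_user_opinion (
    comment_id INTEGER NOT NULL REFERENCES tbl_comments(comment_id),
    user_name  TEXT NOT NULL,
    type       TEXT NOT NULL CHECK (type IN ('like', 'dislike')),
    PRIMARY KEY (comment_id, user_name)  -- one opinion per user per comment
);
INSERT INTO tbl_comments VALUES
    (1, 10, 'first'), (2, 10, 'second'), (3, 11, 'other thread');
INSERT INTO tbl_user_opinion VALUES
    (1, 'ann', 'like'), (1, 'bob', 'like'), (1, 'cid', 'dislike'),
    (2, 'ann', 'dislike');
""")

def opinion_counts(conn, thread_id):
    """Return (comment_id, likes, dislikes) for all comments of one thread."""
    return conn.execute("""
        SELECT c.comment_id,
               COALESCE(SUM(o.type = 'like'), 0),
               COALESCE(SUM(o.type = 'dislike'), 0)
        FROM tbl_comments AS c
        LEFT JOIN tbl_user_opinion AS o ON o.comment_id = c.comment_id
        WHERE c.thread_id = ?
        GROUP BY c.comment_id
        ORDER BY c.comment_id
    """, (thread_id,)).fetchall()
```

So the page needs one query per thread rather than one per comment, however many comments are shown.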
The comma-separated list violates the principle of atomicity, and therefore 1NF. You'll have a hard time maintaining referential integrity and, for the most part, querying as well.
Here is one way to do it in a normalized fashion:
This is very clustering-friendly: it groups up-votes belonging to the same comment physically close together (ditto for down-votes), making the following query rather efficient:
SELECT
    COMMENT.COMMENT_ID,
    <other COMMENT fields>,
    COUNT(DISTINCT UP_VOTE.USER_ID) - COUNT(DISTINCT DOWN_VOTE.USER_ID) SCORE
FROM COMMENT
    LEFT JOIN UP_VOTE
        ON COMMENT.COMMENT_ID = UP_VOTE.COMMENT_ID
    LEFT JOIN DOWN_VOTE
        ON COMMENT.COMMENT_ID = DOWN_VOTE.COMMENT_ID
WHERE
    COMMENT.COMMENT_ID = <whatever>
GROUP BY
    COMMENT.COMMENT_ID,
    <other COMMENT fields>;
[SQL Fiddle]
Please measure on realistic amounts of data whether that works fast enough for you. If not, then denormalize the model and cache the total score in the COMMENT table, and keep it current through triggers every time a new row is inserted into or deleted from the *_VOTE tables.
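The trigger-maintained cache could look roughly like the sketch below. It uses SQLite trigger syntax through Python's sqlite3 module as a stand-in, so the exact trigger syntax will differ in MySQL; the trigger names and the SCORE column are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE COMMENT (
    COMMENT_ID INTEGER PRIMARY KEY,
    SCORE      INTEGER NOT NULL DEFAULT 0   -- cached up-votes minus down-votes
);
CREATE TABLE UP_VOTE (
    COMMENT_ID INTEGER NOT NULL REFERENCES COMMENT(COMMENT_ID),
    USER_ID    INTEGER NOT NULL,
    PRIMARY KEY (COMMENT_ID, USER_ID)
);
CREATE TABLE DOWN_VOTE (
    COMMENT_ID INTEGER NOT NULL REFERENCES COMMENT(COMMENT_ID),
    USER_ID    INTEGER NOT NULL,
    PRIMARY KEY (COMMENT_ID, USER_ID)
);

CREATE TRIGGER up_vote_ins AFTER INSERT ON UP_VOTE BEGIN
    UPDATE COMMENT SET SCORE = SCORE + 1 WHERE COMMENT_ID = NEW.COMMENT_ID;
END;
CREATE TRIGGER up_vote_del AFTER DELETE ON UP_VOTE BEGIN
    UPDATE COMMENT SET SCORE = SCORE - 1 WHERE COMMENT_ID = OLD.COMMENT_ID;
END;
CREATE TRIGGER down_vote_ins AFTER INSERT ON DOWN_VOTE BEGIN
    UPDATE COMMENT SET SCORE = SCORE - 1 WHERE COMMENT_ID = NEW.COMMENT_ID;
END;
CREATE TRIGGER down_vote_del AFTER DELETE ON DOWN_VOTE BEGIN
    UPDATE COMMENT SET SCORE = SCORE + 1 WHERE COMMENT_ID = OLD.COMMENT_ID;
END;

INSERT INTO COMMENT (COMMENT_ID) VALUES (1);
INSERT INTO UP_VOTE VALUES (1, 100), (1, 101), (1, 102);
INSERT INTO DOWN_VOTE VALUES (1, 103);
DELETE FROM UP_VOTE WHERE COMMENT_ID = 1 AND USER_ID = 101;
""")

def cached_score(conn, comment_id):
    """Read the denormalized score without touching the vote tables."""
    return conn.execute(
        "SELECT SCORE FROM COMMENT WHERE COMMENT_ID = ?", (comment_id,)
    ).fetchone()[0]
```

Reads become a single-row lookup; the price is a little extra work on every vote insert or delete.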
If you also need to get which comments a particular user voted on, you'll need indexes on *_VOTE {USER_ID, COMMENT_ID}, i.e. the reverse of the primary/clustering key above. [1]
[1] This is one of the reasons why I didn't go with just one VOTE table containing an additional field that can be either 1 (for up-vote) or -1 (for down-vote): it's less efficient to cover with secondary indexes.
I'm a software developer. I love to code, but I hate databases... Currently, I'm creating a website on which a user will be allowed to mark an entity as liked (like in FB), tag it and comment.
I got stuck on the database table design for handling this functionality. The solution is trivial if we only do this for one type of thing (e.g. photos). But I need to enable this for 5 different things (for now, but I also assume that this number can grow as the whole service grows).
I found some similar questions here, but none of them have a satisfying answer, so I'm asking this question again.
The question is how to properly, efficiently and flexibly design the database, so that it can store comments for different tables, likes for different tables and tags for them. A design pattern as an answer would be best ;)
Detailed description:
I have a table User with some user data, and 3 more tables: Photo with photographs, Articles with articles, Places with places. I want to enable any logged user to:
comment on any of those 3 tables
mark any of them as liked
tag any of them with some tag
I also want to count the number of likes for every element and the number of times that particular tag was used.
1st approach:
a) For tags, I will create a table Tag [TagId, tagName, tagCounter], then I will create many-to-many relationships tables for: Photo_has_tags, Place_has_tag, Article_has_tag.
b) The same goes for comments.
c) I will create a table LikedPhotos [idUser, idPhoto], LikedArticles[idUser, idArticle], LikedPlace [idUser, idPlace]. Number of likes will be calculated by queries (which, I assume is bad). And...
I really don't like this design for the last part; it smells bad to me ;)
2nd approach:
I will create a table ElementType [idType, TypeName == some table name] which will be populated by the administrator (me) with the names of tables that can be liked, commented or tagged. Then I will create tables:
a) LikedElement [idLike, idUser, idElementType, idLikedElement] and the same for Comments and Tags with the proper columns for each. Now, when I want to make a photo liked I will insert:
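typeId = SELECT idType FROM ElementType WHERE TypeName = 'Photo'
INSERT INTO LikedElement (idUser, idElementType, idLikedElement) VALUES (userId, typeId, photoId)
and for places:
typeId = SELECT idType FROM ElementType WHERE TypeName = 'Place'
INSERT INTO LikedElement (idUser, idElementType, idLikedElement) VALUES (userId, typeId, placeId)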
and so on... I think that the second approach is better, but I also feel like something is missing in this design as well...
Lastly, I also wonder where the best place is to store the counter for how many times an element was liked. I can think of only two ways:
in element (Photo/Article/Place) table
by select count().
I hope that my explanation of the issue is more thorough now.
The most extensible solution is to have just one "base" table (connected to "likes", tags and comments), and "inherit" all other tables from it. Adding a new kind of entity involves just adding a new "inherited" table - it then automatically plugs into the whole like/tag/comment machinery.
Entity-relationship term for this is "category" (see the ERwin Methods Guide, section: "Subtype Relationships"). The category symbol is:
Assuming a user can like multiple entities, the same tag can be used for more than one entity, but a comment is entity-specific, your model could look like this:
BTW, there are roughly 3 ways to implement the "ER category":
All types in one table.
All concrete types in separate tables.
All concrete and abstract types in separate tables.
Unless you have very stringent performance requirements, the third approach is probably the best (meaning the physical tables match 1:1 the entities in the diagram above).
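As a rough illustration of the third approach, here is a sketch with one base table and one table per concrete type, using sqlite3 as a stand-in. The names Entity, Liked, url and title are assumptions, not taken from the missing diagram.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Entity (
    idEntity INTEGER PRIMARY KEY           -- the abstract "base" row
);
CREATE TABLE Photo (
    idEntity INTEGER PRIMARY KEY REFERENCES Entity(idEntity),
    url      TEXT NOT NULL
);
CREATE TABLE Article (
    idEntity INTEGER PRIMARY KEY REFERENCES Entity(idEntity),
    title    TEXT NOT NULL
);
CREATE TABLE Liked (                       -- likes attach to the base table only
    idUser   INTEGER NOT NULL,
    idEntity INTEGER NOT NULL REFERENCES Entity(idEntity),
    PRIMARY KEY (idUser, idEntity)
);
""")

def add_photo(conn, url):
    # SQLite syntax; MySQL would use INSERT INTO Entity () VALUES ()
    entity_id = conn.execute("INSERT INTO Entity DEFAULT VALUES").lastrowid
    conn.execute("INSERT INTO Photo (idEntity, url) VALUES (?, ?)", (entity_id, url))
    return entity_id

def add_article(conn, title):
    entity_id = conn.execute("INSERT INTO Entity DEFAULT VALUES").lastrowid
    conn.execute("INSERT INTO Article (idEntity, title) VALUES (?, ?)", (entity_id, title))
    return entity_id

def like(conn, user_id, entity_id):
    conn.execute("INSERT INTO Liked (idUser, idEntity) VALUES (?, ?)", (user_id, entity_id))

def like_count(conn, entity_id):
    return conn.execute(
        "SELECT COUNT(*) FROM Liked WHERE idEntity = ?", (entity_id,)
    ).fetchone()[0]

photo = add_photo(conn, "sunset.jpg")
article = add_article(conn, "On schema design")
like(conn, 1, photo)
like(conn, 2, photo)
like(conn, 1, article)
```

Adding a new kind of entity, say Place, would mean one new table keyed on idEntity; the Liked table and the like functions stay untouched.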
Since you "hate" databases, why are you trying to implement one? Instead, solicit help from someone who loves and breathes this stuff.
Otherwise, learn to love your database. A well designed database simplifies programming, engineering the site, and smooths its continuing operation. Even an experienced d/b designer will not have complete and perfect foresight: some schema changes down the road will be needed as usage patterns emerge or requirements change.
If this is a one man project, program the database interface into simple operations using stored procedures: add_user, update_user, add_comment, add_like, upload_photo, list_comments, etc. Do not embed the schema into even one line of code. In this manner, the database schema can be changed without affecting any code: only the stored procedures should know about the schema.
You may have to refactor the schema several times. This is normal. Don't worry about getting it perfect the first time. Just make it functional enough to prototype an initial design. If you have the luxury of time, use it some, and then delete the schema and do it again. It is always better the second time.
This is a general idea
Please don't pay much attention to the styling of the field names, but more to the relations and structure.
This pseudocode will get all the comments of photo with ID 5
SELECT * FROM actions
WHERE actions.id_Stuff = 5
  AND actions.typeStuff = 'photo'
  AND actions.typeAction = 'comment'
This pseudocode will get all the likes or users who liked photo with ID 5
(you may use count() to just get the amount of likes)
SELECT * FROM actions
WHERE actions.id_Stuff = 5
  AND actions.typeStuff = 'photo'
  AND actions.typeAction = 'like'
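A runnable version of the same idea, as a sketch only: the diagram isn't reproduced here, so the id_User and body columns and the index are assumptions, and sqlite3 stands in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actions (
    id         INTEGER PRIMARY KEY,
    id_User    INTEGER NOT NULL,   -- assumed: who performed the action
    id_Stuff   INTEGER NOT NULL,
    typeStuff  TEXT NOT NULL,      -- 'photo', 'article', ...
    typeAction TEXT NOT NULL,      -- 'comment', 'like', ...
    body       TEXT                -- assumed: comment text, NULL for likes
);
-- Assumed index so lookups by item do not scan the whole table.
CREATE INDEX actions_lookup ON actions (typeStuff, id_Stuff, typeAction);
INSERT INTO actions (id_User, id_Stuff, typeStuff, typeAction, body) VALUES
    (1, 5, 'photo',   'comment', 'great shot'),
    (2, 5, 'photo',   'like',    NULL),
    (3, 5, 'photo',   'like',    NULL),
    (1, 5, 'article', 'like',    NULL);
""")

def count_actions(conn, id_stuff, type_stuff, type_action):
    """Count actions of one kind on one item, e.g. likes on photo 5."""
    row = conn.execute(
        "SELECT COUNT(*) FROM actions"
        " WHERE id_Stuff = ? AND typeStuff = ? AND typeAction = ?",
        (id_stuff, type_stuff, type_action),
    ).fetchone()
    return row[0]
```

Note that id_Stuff alone is ambiguous (photo 5 and article 5 both exist), which is why every query has to filter on typeStuff as well.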
As far as I understand, several tables are required, with a many-to-many relation between them.
A table which stores user data such as name, surname and birth date, with an identity field.
A table which stores data types. These types may be photos, shares, links. Each type must have its own table, so there is a relation between the individual tables and this table.
Each different data type has its own table, for example status updates, photos, links.
The last table is for the many-to-many relation, storing an id, user id, data type and data id.
Look at the access patterns you are going to need. Do any of them seem to be made particularly difficult or inefficient by one design choice or the other?
If not, favour the one that requires fewer tables.
In this case:
Add Comment: you either pick a particular many/many table or insert into a common table with a known specific identifier for what is being commented on. I think client code will be slightly simpler in your second case.
Find comments for item: here it seems using a common table is slightly easier - we just have a single query parameterised by type of entity
Find comments by a person about one kind of thing: simple query in either case
Find all comments by a person about all things: this seems a little gnarly either way.
I think your "discriminated" approach, option 2, yields simpler queries in some cases and doesn't seem much worse in the others so I'd go with it.
Consider using a table per entity for comments and the like. More tables mean better sharding and scaling. Managing many similar tables is not a problem in any framework I know.
One day you'll need to optimize reads from such a structure. You can easily create aggregating tables over the base ones and lose a bit on writes.
One big table with a dictionary may become uncontrollable one day.
Definitely go with the second approach, where you have one table and store the element type for each row; it will give you a lot more flexibility. Basically, when something can logically be done with fewer tables, it is almost always better to go with fewer tables. Two advantages come to mind for your particular case. If you want to delete all liked elements of a certain user, the first approach needs one query per element type, while the second needs only one query. And if you want to add a new element type, the first approach involves creating a new table for each new type, while the second requires no schema change at all.
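A small sketch of both advantages, built on the ElementType and LikedElement tables from the question, with sqlite3 as a stand-in. The UNIQUE constraints and the sample data are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ElementType (
    idType   INTEGER PRIMARY KEY,
    TypeName TEXT NOT NULL UNIQUE
);
CREATE TABLE LikedElement (
    idLike         INTEGER PRIMARY KEY,
    idUser         INTEGER NOT NULL,
    idElementType  INTEGER NOT NULL REFERENCES ElementType(idType),
    idLikedElement INTEGER NOT NULL,
    UNIQUE (idUser, idElementType, idLikedElement)  -- one like per user per element
);
INSERT INTO ElementType (TypeName) VALUES ('Photo'), ('Place');
""")

def like(conn, user_id, type_name, element_id):
    # Look up the type id and insert the like in one statement.
    conn.execute("""
        INSERT INTO LikedElement (idUser, idElementType, idLikedElement)
        SELECT ?, idType, ? FROM ElementType WHERE TypeName = ?
    """, (user_id, element_id, type_name))

def delete_user_likes(conn, user_id):
    """Remove every like of one user, whatever the element type."""
    return conn.execute(
        "DELETE FROM LikedElement WHERE idUser = ?", (user_id,)
    ).rowcount

like(conn, 1, "Photo", 5)
like(conn, 1, "Place", 9)
like(conn, 2, "Photo", 5)

# A new element type is a new row, not a new table.
conn.execute("INSERT INTO ElementType (TypeName) VALUES ('Article')")
like(conn, 1, "Article", 3)

removed = delete_user_likes(conn, 1)
remaining = conn.execute("SELECT COUNT(*) FROM LikedElement").fetchone()[0]
```

The trade-off is that idLikedElement cannot be a real foreign key, since it points into a different table depending on the type.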
This is a tough design question for an application I'm working on. I have 2 different items in my app that will both use comments, but I can't decide how to design my database.
There are 2 possibilities here. The first is a different comment table for every table that needs comments (normalized way):
movies -> movie_comments
articles -> article_comments
The second way I was thinking of was to use a generic comments table and then have a many-to-many relationship for the comment and movie|article relations. E.g.
comments
comments_movies (movie_id, comment_id)
comments_articles (article_id, comment_id)
In your opinion, which method would be best, and can you give a good reason so I can decide?
I personally opt for the 2nd solution:
comments
comments_movies (movie_id, comment_id)
comments_articles (article_id, comment_id)
It is much simpler to maintain only one table for the logical Comment model. For example, when you want to add a feature to comments you do it just once, and counting comments for a specific user is much easier because they are all in one table.
Of course, someone else could list the advantages of keeping them in multiple tables, but you asked for opinions, so here is mine :)
Keeping them separate has the benefit of supporting change without impacting the comments for the other entity (movie vs. article), assuming there are differences in attributes for a comment against an article vs. a movie. Otherwise...
I suppose there could be a need for displaying a comment with an article and a movie. But the consolidation would also support if you want to provide comment functionality for other entities in the future.
The answer depends on what you need currently, and a best guess of what you want to do in the future. More details help us to know what to suggest.
There is no "best" method, because it is a straight-forward Normalisation question: the proposal is either correctly Normalised or it is not.
Actually, the first option is not Normalised, the Normalisation is not complete. You have identical repeating groups of columns in two tables which have not been identified and grouped into a single table.
The second option is Normalised. You have identified that, and placed them in a single table.
At the logical level, then, you have a many-to-many relation (not a table) between Movie and Comment, and between Article and Comment. End of story at the logical level.
At the physical level, where n::n relations are implemented as Associative tables, you have CommentMovie and CommentArticle.
As the Db expands and grows, life is simple, because:
any new column that is 1::1 with Movie.PK is placed in Movie
any new column that is 1::1 with Article.PK is placed in Article
any new column that is 1::1 with Comment.PK is placed in Comment
any new column that is 1::1 with CommentArticle.PK (the relation; PK is as shown (ArticleId, CommentId) ) is placed in CommentArticle. This (adding attributes to an n::n relation) will now cause the table to show up on the Logical model.
any new column that is 1::1 with CommentMovie.PK (the relation; PK is as shown (MovieId, CommentId) ) is placed in CommentMovie. This (adding attributes to an n::n relation) will now cause the table to show up on the Logical model.
I would suggest your second choice:
movies -> movie_comments -> comments
articles -> article_comments -> comments
One comments table, two pivot tables(many to many).
This keeps all the comment data in one table and only loosely links it to movies and articles. I usually recommend the join-based design for things that don't need to scale, because joins can be a performance hit and a nightmare in some cases. For your case, though, this would be the best option.
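A minimal sketch of the one-comments-table-plus-pivot-tables layout, with sqlite3 as a stand-in. The title, user_id and body columns are assumptions; the pivot table names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies   (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE comments (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    body    TEXT NOT NULL
);
CREATE TABLE comments_movies (
    movie_id   INTEGER NOT NULL REFERENCES movies(id),
    comment_id INTEGER NOT NULL REFERENCES comments(id),
    PRIMARY KEY (movie_id, comment_id)
);
CREATE TABLE comments_articles (
    article_id INTEGER NOT NULL REFERENCES articles(id),
    comment_id INTEGER NOT NULL REFERENCES comments(id),
    PRIMARY KEY (article_id, comment_id)
);
INSERT INTO movies VALUES (1, 'Alien');
INSERT INTO articles VALUES (1, 'On schemas');
INSERT INTO comments VALUES (1, 7, 'scary'), (2, 7, 'useful'), (3, 8, 'classic');
INSERT INTO comments_movies VALUES (1, 1), (1, 3);
INSERT INTO comments_articles VALUES (1, 2);
""")

def movie_comments(conn, movie_id):
    """Comment bodies for one movie, reached through the pivot table."""
    rows = conn.execute("""
        SELECT c.body
        FROM comments AS c
        JOIN comments_movies AS cm ON cm.comment_id = c.id
        WHERE cm.movie_id = ?
        ORDER BY c.id
    """, (movie_id,)).fetchall()
    return [body for (body,) in rows]

def comments_by_user(conn, user_id):
    # All comments live in one table, so no UNION across entity types.
    return conn.execute(
        "SELECT COUNT(*) FROM comments WHERE user_id = ?", (user_id,)
    ).fetchone()[0]
```

Per-entity listings cost one join; per-user queries need none, which is the main payoff of keeping a single comments table.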
comment_table
-------------
comment_id (int)
object_id (int)
comment (varchar(max))
type (int)
--------------
object_id refers to the object being commented on, such as a movie, an article, and so on.
type equals 1: the comment was made on a movie
type equals 2: the comment was made on an article
You can design your tables like this.