Social Network: Delete Post in Database Without Breaking Application - mysql

Introduction
I am designing a social networking website for college students. To facilitate moderation, some students will be designated as moderators with elevated abilities to delete inappropriate posts and review flagged content.
Question
When a moderator finds a post or an event page that is inappropriate - how should the website proceed to hide or delete it without breaking the entire application?
Background
The issue becomes increasingly difficult when an event page needs to be deleted but is linked to possibly a couple thousand users. Connected users should be able to tell that an event page was deleted - so there would be no confusion.
An ability to "undo" deletion would be a huge plus. Since students will be the ones moderating, I would want the ability for superusers (i.e. website admins) to override moderator actions.
Considerations
The backend runs on MySQL with the InnoDB engine.
Data integrity is important:
Foreign keys come into mind - but as it currently stands - the database schema does not employ them.
Deleted posts/pages should not come up in general search.
The ability to "undo" a delete action is preferable. Deleting an entire event page could provoke anger from the masses.
Current Solution
When content is found to violate the terms of use - add a special flag in the database (i.e. column status is changed to 1 -> indicating "inappropriate"). Content is then hidden according to this flag.
However - how does this tactic work in the whole scheme of a RELATIONAL database? Especially when it comes to search?
Case Study
Users can search and view all events that their friends are attending.
Table: Users
user_id
name
Table: Events
event_id
event_name
datetime
status (0 -> okay, 1 -> hide)
Table: Users_events
user_id
event_id
Issue with the Current Solution
How should the website filter out all the hidden or flagged content? Should there be a special covering index?
Conclusion
Designing a social networking website that has advanced moderation capabilities. How to flag content and hide potentially harmful and abusive posts/event pages while maintaining integral relationships in the database? How to optimize search to show only approved (non-flagged) content?

You have the right approach.
By setting a status value and coding the application so that it behaves differently based on the value of the status, you preserve the integrity of the data and get the behavior you want.
Your searches, when you want just visible entries, just have to have "WHERE status = 0" in them so you don't get the ones that are hidden.
One small change - if you are only determining whether to hide or not, a better name for the column is hide. Then the values make sense: 0 is don't hide, and 1 is do hide.
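As a rough sketch of the status-flag approach applied to the case study above: the snippet uses Python's built-in sqlite3 module as a stand-in for MySQL, so the SQL is kept plain. The Friends table, the index name, and the sample rows are assumptions made for illustration, since the question does not show how friendships are stored. It also shows that "undo" is just flipping the flag back, because the Users_events rows are never touched.

```python
import sqlite3

# In-memory SQLite stands in for MySQL/InnoDB here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users        (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Events       (event_id INTEGER PRIMARY KEY, event_name TEXT,
                           datetime TEXT, status INTEGER NOT NULL DEFAULT 0);
CREATE TABLE Users_events (user_id INTEGER, event_id INTEGER);
-- Hypothetical table: the question mentions friends but gives no schema.
CREATE TABLE Friends      (user_id INTEGER, friend_id INTEGER);
-- Assumed index so the status filter and date ordering can share one index.
CREATE INDEX idx_events_status_datetime ON Events (status, datetime);

INSERT INTO Users VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Friends VALUES (1, 2);
INSERT INTO Events VALUES (10, 'Study group',   '2024-05-01 18:00', 0),
                          (11, 'Flagged party', '2024-05-02 21:00', 1);
INSERT INTO Users_events VALUES (2, 10), (2, 11);
""")


def friends_events(conn, user_id):
    """Return visible (status = 0) events attended by the user's friends."""
    return conn.execute(
        """
        SELECT DISTINCT e.event_id, e.event_name, e.datetime
        FROM Friends f
        JOIN Users_events ue ON ue.user_id = f.friend_id
        JOIN Events e        ON e.event_id = ue.event_id
        WHERE f.user_id = ? AND e.status = 0
        ORDER BY e.datetime
        """,
        (user_id,),
    ).fetchall()


def set_event_status(conn, event_id, status):
    """Hide (1) or restore (0) an event; attendance rows stay untouched."""
    conn.execute("UPDATE Events SET status = ? WHERE event_id = ?",
                 (status, event_id))


visible_before = friends_events(conn, 1)   # the flagged event is filtered out
set_event_status(conn, 11, 0)              # an admin "undoes" the moderator
visible_after = friends_events(conn, 1)    # both events are visible again
```

Whether a composite index like the one above actually helps depends on the real query mix and data distribution, so it is worth checking with EXPLAIN on the MySQL side before relying on it.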

Related

Database design & normalization

I'm creating a messaging system for an e-learning platform and there are some design concerns that I'd like some feedback on.
First of all, it is important for me and my system to be highly modifiable in the future. As such, maintaining a fairly high normalization across my tables is important.
On to how my system will work:
All members (students or teachers) are part of a virtual classroom.
Teachers can create tasks and exercises in these classrooms and assign them to one or multiple students (member_task table not illustrated).
A student can request help for a specific task or exercise by sending a message to the teachers of the classroom.
Messages sent by students are sent to all the teachers. They cannot address a message to a specific teacher.
Messages sent by teachers can be addressed to one or more students.
Students cannot send messages to other students.
Messages behave like chat, meaning that a private conversation starts between a student and all teachers when they send a message.
Here's the ER diagram I made:
So my question is, is this table normalized properly for my purpose? Is there anything that can be done to reduce redundancy of data across my tables? And out of curiosity, is it in BCNF?
Another question: I don't intend to ever implement delete features anywhere in my system. Only "archiving" where said classroom/task/member/message/whatever is simply hidden/deactivated. So is there any reason to actually use FK?
EDIT: Also, a friend brought to my attention that the Conversations table might be redundant, and it kinda feels so. Thoughts?
Thanks.
In response to your emphasis on "modifiability", which I'm taking to mean with respect to application and schema evolution, I'm actually going to suggest a fairly extreme solution. Before that, some notes on some aspects you've mentioned. First, foreign keys represent meaningful constraints in your data. They should always be defined and enforced. Foreign keys are not there just for cascading delete. Second, the Conversations table is arguably redundant. It would make sense if you had a notion of a "session" of chat which would correspond to a Conversation. Otherwise, you just have a bunch of messages throughout time. The Conversation table could also enable a many-to-many relation between messages and tasks/exercises if you wanted to have chats that simultaneously covered multiple exercises, for example.
Now for the extreme suggestion. You could use 6NF. In particular, you might look at its incarnation in anchor modeling. The most notable difference in this approach is each attribute is modeled as a different table. 6NF supports temporal databases (supported in anchor modeling via "historized" attributes/ties). This means handling situations like a student being associated to a task now but not later won't cause all their messages to disappear. Most relevant to you, all schema modifications are non-destructive and additive, so no old code breaks when you make a change.
There are downsides. First, it's a bit weird, and in particular anchor modeling (somewhat gratuitously?) introduces a bunch of new terms. Second, it produces weird queries for most relational databases which they may not optimize well. This can sometimes be resolved with materialized views. Third, at the physical level, every attribute is effectively nullable. Finally, the tooling and support, while present, is pretty young. In particular, for MySQL, you may only be "inspired by" what's provided on the anchor modeling site.
As far as the actual database model would go, it would look roughly similar. Anchor modeling uses the term "anchor" for roughly the same thing as an entity, and "tie" for roughly the same thing as a relation. For simplicity, dropping the Conversation relation (and thus directly connecting Message to Task), the image would be similar: you'd have an anchor for Classroom, Member, Message, and Task, and a tie replacing Recipient that you might call ReceivedMessage representing the relation of "member received message message". The attributes on your entities would be attribute nodes. Making the message attribute on the Message anchor historized would allow messages to be edited if desired and support a history of revisions.
One concern I have is that I don't see a Users table which will hold all the students' and teachers' info (login, email, system id, role, etc.), but I assume there is something similar in your system?
Now, looking into the Members table: usually students change classes every semester or so and you don't want last semesters' students to receive new messages. I would suggest the following:
Members
=============
PK member_id
FK class_id
FK user_id
--------------
join_date
leave_date
active
role
The last two fields might be redundant:
active: an alternative solution if you want to avoid using dates. This will become false when a user stops being a member of this class. Since there is no delete feature, the Members entry has to be preserved for archive purposes (and as a historical log).
role: Depends on how you setup Users table and roles in your system. If a user entry has role field(s) then this is not needed. However, this field allows for the same user to assume different roles in different classes. Example: a 3rd year student, who was a member of this class 2 years ago, is now working as TA/LA (teaching/lab assistant) for the same class. This depends on how the institution works... in my BSc we had the "rule": anyone with grade > 8.5/10 in Java could volunteer to do workshops to other students (using uni's labs). Finally, this field if used as a mask or a constant, allows for roles to be extended (future-proof)
As for FKs I will always suggest using them for data consistency. Things can get really ugly really fast without FKs. The limitations they impose can be worked around and they are usually needed: What is the purpose of archiving a message with sender_id if the sender has been deleted by accident? Also, note that in most systems FKs are indexed which improves the performance of queries/joins.
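To illustrate the consistency point, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The Users and Messages tables and their columns are simplified assumptions, not the asker's actual schema; only sender_id comes from the text above. SQLite needs the foreign_keys pragma switched on, whereas InnoDB checks foreign keys by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
-- Simplified, hypothetical tables for the demo.
CREATE TABLE Users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Messages (
    message_id INTEGER PRIMARY KEY,
    sender_id  INTEGER NOT NULL REFERENCES Users(user_id),
    body       TEXT NOT NULL
);
INSERT INTO Users VALUES (1, 'Ann');
INSERT INTO Messages VALUES (100, 1, 'Need help with task 3');
""")


def try_sql(conn, sql):
    """Run one statement and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A message from a sender that does not exist is rejected.
orphan_insert_ok = try_sql(
    conn, "INSERT INTO Messages VALUES (101, 99, 'from nobody')")

# Deleting a sender who still has archived messages is rejected too,
# which is the accidental-delete case described above.
delete_sender_ok = try_sql(conn, "DELETE FROM Users WHERE user_id = 1")
```

With the constraint in place, both mistakes surface as errors at write time instead of as orphaned rows discovered later.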
Hope the above helps and does not confuse things :)

Usability problems with empty tables

I have a question about web usability related with tables, this is my use case:
I have a view with more than 1 table, I mean, I have N>0 tables in the view and each table has a title (for example "Photo list", "Video list", "Sound list").
Using JavaScript, users can change the "view level", I mean, the detail level of the view. This means that by clicking on different action buttons (basic, medium, advanced view) the users can modify the number of rows in each table. So it could be that some of the tables end up empty (no rows).
My question: What is the best usability practice to manage empty tables?
When you have identified tables that show certain information, you shouldn't hide them when they are empty, at least not without indicating in some way that there's no data for the empty table.
If you don't show the table, your users may not perceive that there is an entity of data that's empty; if you show it, they will. This is important.
It could, however, be less important depending on the way you are showing your data. Let's say, for example, that your view shows on top a list of the different data types with the number of records in each one. If you keep a reminder there that X data type has 0 records, you can hide the table header in the view body, as all the info your user needs is in the view.
On the contrary, if your users have no way to know that a specific data type is empty other than seeing an empty table, you need to keep it in your view to avoid them losing information.
Keep in mind that information is the key in our world. Design is important to help and improve user experience, but you shouldn't put it before information.

Proper way to store requests in Mysql (or any) database

What is the "proper" (most normalized?) way to store requests in the database? For example, a user submits an article. This article must be reviewed and approved before it is posted to the site.
Which is the more proper way:
A) store it in the Articles table with an "Approved" field which is either a 0, 1, 2 (denied, approved, pending)
OR
B) Have an ArticleRequests table which has the same fields as Articles, and upon approval, move the row data from ArticleRequests to Articles.
Thanks!
Since every article is going to have an approval status, and each time an article is requested you're very likely going to need to know that status - keep it inline with the table.
Do consider calling the field ApprovalStatus, though. You may want to add a related table to contain each of the statuses unless they aren't going to change very often (or ever).
EDIT: Reasons to keep fields in related tables are:
If the related field is not always applicable, or may frequently be null.
If the related field is only needed in rare scenarios and is better described by using a foreign key into a related table of associated attributes.
In your case those above reasons don't apply.
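A minimal sketch of option A with an inline ApprovalStatus column plus the optional related status table, again using Python's sqlite3 module in place of MySQL. The 0/1/2 values follow the question; the ApprovalStatuses table name, the other column names, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
-- Optional lookup table describing each status value (hypothetical name).
CREATE TABLE ApprovalStatuses (
    status_id INTEGER PRIMARY KEY,
    label     TEXT NOT NULL UNIQUE
);
INSERT INTO ApprovalStatuses VALUES (0, 'denied'), (1, 'approved'), (2, 'pending');

-- The status lives inline with the article; new rows start as pending.
CREATE TABLE Articles (
    article_id     INTEGER PRIMARY KEY,
    title          TEXT NOT NULL,
    ApprovalStatus INTEGER NOT NULL DEFAULT 2
                   REFERENCES ApprovalStatuses(status_id)
);
INSERT INTO Articles (article_id, title) VALUES (1, 'Draft A'), (2, 'Draft B');
""")


def approve(conn, article_id):
    """Approval is a single UPDATE; no row moves between tables."""
    conn.execute("UPDATE Articles SET ApprovalStatus = 1 WHERE article_id = ?",
                 (article_id,))


def published_titles(conn):
    """Titles the public site would show: approved articles only."""
    rows = conn.execute(
        "SELECT title FROM Articles WHERE ApprovalStatus = 1 ORDER BY article_id"
    ).fetchall()
    return [title for (title,) in rows]


approve(conn, 2)
titles = published_titles(conn)
pending = conn.execute(
    "SELECT COUNT(*) FROM Articles WHERE ApprovalStatus = 2").fetchone()[0]
```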
Definitely do 'A'.
If you do B, you'll be creating a new table with the same fields as the other one and that means you're doing something wrong. You're repeating yourself.
I think it's better to store the data in the main table with a specific status, because then there is no need to move data between tables when an article is approved, and the article appears on the site at the same moment. If you don't want to keep disapproved articles, create a cron script which removes the unnecessary data or moves it to an archive table. That way you put less load on your DB, because you can schedule the removal of old articles for a suitable time, for example at night.
Regarding the problem of using the approval status in each query: if you are planning a very popular, high-load site for searching or listing articles, you will use a standalone search server like Sphinx or Solr (MySQL is not a good solution for this purpose) and feed it only the rows with status='Approved'. Delta indexing helps you keep that data up to date.

Database user table design, for specific scenario

I know this question has been asked and answered many times, and I've spent a decent amount of time reading through the following questions:
Database table structure for user settings
How to handle a few dozen flags in a database
Storing flags in a DB
How many database table columns are too many?
How many columns is too many columns?
The problem is that there seem to be a somewhat even distribution of supporters for a few classes of solutions:
Stick user settings in a single table as long as it's normalized
Split it into two tables that are 1 to 1, for example "users" and "user_settings"
Generalize it with some sort of key-value system
Stick setting flags in bitfield or other serialized form
So at the risk of asking a duplicate question, I'd like to describe my specific scenario, and hopefully get a more specific answer.
Currently my site has a single user table in mysql, with around 10-15 columns(id, name, email, password...)
I'd like to add a set of per-user settings for whether to send email alerts for different types of events (notify_if_user_follows_me, notify_if_user_messages_me, notify_when_friend_posts_new_stuff...)
I anticipate that in the future I'd be infrequently adding one off per-user settings which are mostly 1 to 1 with users.
I'm leaning towards creating a second user_settings table and sticking "non-essential" information such as email notification settings there, for the sake of keeping the main user table more readable, but I'm very curious to hear what experts have to say.
Seems that your dilemma is to vertically partition the user table or not. You may want to read this SO Q/A too.
I'm gonna cast my vote for adding two tables... (some sort of key-value system)
it is preferable (to me) to add data instead of columns... so,
add a new table that links users to settings, then add a table for the settings...
these things: notify_if_user_follows_me, notify_if_user_messages_me, notify_when_friend_posts_new_stuff. would then become row insertions with an id, and you can reference them at any time and extend them as needed without changing the schema.
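A rough sketch of that two-table layout, using Python's sqlite3 module in place of MySQL. The settings and user_settings table names, their columns, and the helper functions are assumptions for illustration; only the three notify_* names come from the question. In MySQL the upsert would more likely be written with INSERT ... ON DUPLICATE KEY UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- One row per kind of setting; adding a setting is an INSERT, not an ALTER.
CREATE TABLE settings (
    setting_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
-- Link table: which user has which setting, and its value.
CREATE TABLE user_settings (
    user_id    INTEGER NOT NULL REFERENCES users(id),
    setting_id INTEGER NOT NULL REFERENCES settings(setting_id),
    value      INTEGER NOT NULL,
    PRIMARY KEY (user_id, setting_id)
);
INSERT INTO users VALUES (1, 'Ann');
INSERT INTO settings (name) VALUES
    ('notify_if_user_follows_me'),
    ('notify_if_user_messages_me'),
    ('notify_when_friend_posts_new_stuff');
""")


def set_setting(conn, user_id, name, value):
    """Insert or overwrite one setting for one user (SQLite upsert syntax)."""
    conn.execute(
        """
        INSERT OR REPLACE INTO user_settings (user_id, setting_id, value)
        SELECT ?, setting_id, ? FROM settings WHERE name = ?
        """,
        (user_id, int(value), name),
    )


def get_settings(conn, user_id):
    """Return the user's explicitly stored settings as a dict of booleans."""
    rows = conn.execute(
        """
        SELECT s.name, us.value
        FROM user_settings us
        JOIN settings s ON s.setting_id = us.setting_id
        WHERE us.user_id = ?
        """,
        (user_id,),
    ).fetchall()
    return {name: bool(value) for name, value in rows}


set_setting(conn, 1, "notify_if_user_follows_me", True)
set_setting(conn, 1, "notify_if_user_messages_me", False)
set_setting(conn, 1, "notify_if_user_follows_me", False)  # overwrites the first
ann = get_settings(conn, 1)
```

One trade-off of this design is that a setting with no row has no value, so the application needs a default for each setting name.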

Keeping Drop-downs DRY in a web app

I'm writing a CMS for various forms and such, and I find I'm creating a lot of drop-downs. I don't really feel like mucking up my database with tons of random key/string value tables for simple drop-downs with 2-4 options that change very infrequently. What do you do to manage this in a responsible way?
This is language-agnostic, but I'm working in Rails, if anyone has specific advice.
We put everything into a single LookUp table in the database, with a column that mapped to an enum that described which lookup it was for (title, country, etc.).
This enabled us to add the flexibility of an "Other, please specify" option in lookup dropdowns. We made a control that encapsulated this, with a property to turn this behaviour on or off on a case-by-case basis.
If the end user picked "Other, please specify", a textbox would appear for them to enter their own value. This would be added to the lookup table, but flagged as an ad hoc item.
The table contained a flag denoting the status of each lookup value: Active, Inactive, AdHoc. Only Active ones would appear in the dropdown; AdHoc ones were those created via the "Other, please specify" option.
An admin page showed the frequency of usage of the AdHoc values, allowing the administrators of the site to promote common popular values into general usage (i.e. changing their Status flag to Active).
This may well be overkill for your app, but it worked really well for ours: the app was basically almost entirely CRUD operations on very business-specific data. We had dozens of lookups throughout the site that the customer wanted to be able to maintain themselves. This gave them total flexibility with no intervention from us.
You could have one single dropdown table with an extra column to say what the drop-down is for... limit the results with a where clause...
At my current position, we implemented a LookupCode table that contains CodeGroup, Code, and Meaning columns, as well as some others (like active). That way all of your lookup values are in a single table, and you can do some quick lookups to bind to your dropdown lists.
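A small sketch of such a single lookup table, using Python's sqlite3 module as a stand-in database. The column names CodeGroup, Code, Meaning, and active come from the description above; the column types, the composite key, the sample rows, and the dropdown_options helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LookupCode (
    CodeGroup TEXT NOT NULL,              -- which drop-down the row belongs to
    Code      TEXT NOT NULL,              -- value stored in the referencing row
    Meaning   TEXT NOT NULL,              -- label shown to the user
    active    INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (CodeGroup, Code)
);
INSERT INTO LookupCode VALUES
    ('title',   'MR', 'Mr.',    1),
    ('title',   'MS', 'Ms.',    1),
    ('title',   'SR', 'Sir',    0),       -- retired option, kept for old rows
    ('country', 'CA', 'Canada', 1);
""")


def dropdown_options(conn, code_group):
    """Return (Code, Meaning) pairs for the active options of one drop-down."""
    return conn.execute(
        """
        SELECT Code, Meaning
        FROM LookupCode
        WHERE CodeGroup = ? AND active = 1
        ORDER BY Meaning
        """,
        (code_group,),
    ).fetchall()


title_options = dropdown_options(conn, "title")
country_options = dropdown_options(conn, "country")
```

Each drop-down then becomes one call with a different group name, which is where the DRY benefit comes from.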