Proper way to store requests in MySQL (or any) database - mysql

What is the "proper" (most normalized?) way to store requests in the database? For example, a user submits an article. This article must be reviewed and approved before it is posted to the site.
Which is the more proper way:
A) Store it in the Articles table with an "Approved" field that is either 0, 1, or 2 (denied, approved, pending)
OR
B) Have an ArticleRequests table which has the same fields as Articles, and upon approval, move the row data from ArticleRequests to Articles.
Thanks!

Since every article is going to have an approval status, and each time an article is requested you're very likely going to need to know that status - keep it inline with the table.
Do consider calling the field ApprovalStatus, though. You may also want a related lookup table containing each of the statuses, unless they are rarely (or never) going to change.
EDIT: Reasons to keep fields in related tables are:
If the related field is not always applicable, or may frequently be null.
If the related field is only needed in rare scenarios and is better described by using a foreign key into a related table of associated attributes.
In your case those above reasons don't apply.
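For illustration, a minimal sketch of what that might look like; the lookup table and the exact columns are just one possible layout, not something from the question:

CREATE TABLE ApprovalStatus (
    StatusId TINYINT PRIMARY KEY,
    Name VARCHAR(20) NOT NULL          -- e.g. 'Denied', 'Approved', 'Pending'
);

CREATE TABLE Articles (
    ArticleId INT AUTO_INCREMENT PRIMARY KEY,
    Title VARCHAR(255) NOT NULL,
    Body TEXT,
    ApprovalStatus TINYINT NOT NULL,   -- which numeric code means what is up to you
    FOREIGN KEY (ApprovalStatus) REFERENCES ApprovalStatus (StatusId)
);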

Definitely do 'A'.
If you do B, you'll be creating a new table with the same fields as the other one and that means you're doing something wrong. You're repeating yourself.

I think it's better to store the data in the main table with a specific status, because then it's not necessary to move data between tables; once an article is approved it appears on the site immediately. If you don't want to keep disapproved articles, you can create a cron script that removes the unnecessary rows or moves them to an archive table. That way you also put less load on your database, because you can schedule the cleanup for a quiet time, for example at night.
Regarding the concern about checking approval status in every query: if you are planning a very popular site with heavy search or listing traffic, you will end up using a standalone search server such as Sphinx or Solr (MySQL is not a good solution for those purposes), and you will only push rows with status = 'Approved' into it. Delta indexing helps keep that data up to date.
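As a rough sketch of the cron cleanup idea (the updated_at column is an assumption; adjust it to whatever timestamp column your Articles table actually has):

-- run nightly from cron: remove articles that were denied more than 30 days ago
DELETE FROM Articles
WHERE ApprovalStatus = 0                     -- 0 = denied in the asker's scheme
  AND updated_at < NOW() - INTERVAL 30 DAY;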

Related

How to design a MySQL table that tracks the Status of each Asset, as well as every old Status?

I would like to create a table that tracks the status of each asset as well as each past status. Basically I want to keep a log of all status changes.
Do I create a timestamp for each updated status and have every update be its own separate row, linked back to the asset through the assetid? Then sort by the timestamp to get these statuses in order? I can see this table getting unwieldy if there are tons of rows for each asset and the table grows linearly over time.
This is for a MySQL database.
Here is an example of how I have designed a database table for tracking/logging purposes.
Columns:
auto-increment pk (if you don't have a better pk)
timestamp
tracked object id (asset_id in your case)
event type (you probably don't need this, but it is explained below)
content (this could also be named status in your case)
My example is very simplified, but the main idea is to insert each record into its own row. Create the table with proper primary keys and indexes to get good search performance.
Using this structure you should be able to search by asset, by status, or get the latest changes, etc. The exact structure depends on your needs, so I have usually modified it to fit them.
Don't worry too much about the event column. I included it because most of my implementations are based on event sourcing. Here is a link to one article that explains it: http://scottlobdell.me/2017/01/practical-implementation-event-sourcing-mysql/
I suggest reading more about event sourcing to see whether that design could work in your case. Look only at the database example, because it is similar to my example.
In the results, you should have a journal of status changes. Then it depends on your code how to handle/read data and show results.
About the linear growth: I would say it is not a big problem. Of course, if you have more information about what “tons of rows” means, then ask. I have not seen any scaling problems; the same structure works very well with relational and with NoSQL databases, and MySQL also has features to optimize this kind of structure if data size becomes a problem.
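As a sketch only (all names here are placeholders), the log table described above could look roughly like this:

CREATE TABLE asset_status_log (
    id         BIGINT AUTO_INCREMENT PRIMARY KEY,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    asset_id   INT NOT NULL,
    event_type VARCHAR(50) NULL,                  -- optional, see the note about event sourcing
    status     VARCHAR(50) NOT NULL,
    INDEX idx_asset_time (asset_id, created_at)   -- keeps "latest status per asset" lookups fast
);

-- latest status for one asset
SELECT status
FROM asset_status_log
WHERE asset_id = 42
ORDER BY created_at DESC, id DESC
LIMIT 1;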

User submitted content to mysql with moderation: separate table?

In a MySQL table I would like to store data from users, but the data needs to be moderated by an admin first. My question is: is it normal to just insert into the original table and use a field as a flag for moderation status? Or should I have a separate table of pre-moderated posts and do the insertions only at moderation?
I think both methods would work, but I am not sure if I am missing other considerations here. I hope someone experienced can tell me the established/preferred way to do this.
If you're working with a not-huge data set I'd recommend just adding a flag column that allows you to show or hide user data. This will require fewer and easier queries to work with and should make your life a lot easier than juggling the data between multiple identical tables. Additionally, if you want to add something like a button for "report this content as BAD" you could remove the content from other results while only "soft deleting" it from public visibility.
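A minimal sketch of that flag-column approach, assuming a hypothetical posts table (the names and codes are only an example):

ALTER TABLE posts
    ADD COLUMN moderation_status TINYINT NOT NULL DEFAULT 0;  -- 0 = pending, 1 = approved, 2 = rejected/reported

-- public pages only ever show approved content
SELECT * FROM posts WHERE moderation_status = 1;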

Implementing Comments and Likes in database

I'm a software developer. I love to code, but I hate databases... Currently, I'm creating a website on which a user will be allowed to mark an entity as liked (like in FB), tag it and comment.
I am stuck on the database table design for handling this functionality. The solution is trivial if we only need to do this for one type of thing (e.g. photos). But I need to enable it for 5 different things (for now, but I also assume that this number can grow as the whole service grows).
I found some similar questions here, but none of them have a satisfying answer, so I'm asking this question again.
The question is: how do I properly, efficiently and flexibly design the database so that it can store comments, likes and tags for different tables? A design pattern as an answer would be best ;)
Detailed description:
I have a table User with some user data, and 3 more tables: Photo with photographs, Articles with articles, Places with places. I want to enable any logged user to:
comment on any of those 3 tables
mark any of them as liked
tag any of them with some tag
I also want to count the number of likes for every element and the number of times that particular tag was used.
1st approach:
a) For tags, I will create a table Tag [TagId, tagName, tagCounter], then I will create many-to-many relationship tables: Photo_has_tags, Place_has_tag, Article_has_tag.
b) The same goes for comments.
c) I will create a table LikedPhotos [idUser, idPhoto], LikedArticles[idUser, idArticle], LikedPlace [idUser, idPlace]. Number of likes will be calculated by queries (which, I assume is bad). And...
I really don't like this design for the last part; it smells bad to me ;)
2nd approach:
I will create a table ElementType [idType, TypeName == some table name] which will be populated by the administrator (me) with the names of tables that can be liked, commented or tagged. Then I will create tables:
a) LikedElement [idLike, idUser, idElementType, idLikedElement] and the same for Comments and Tags with the proper columns for each. Now, when I want to mark a photo as liked I will insert:
SET @typeId = (SELECT idType FROM ElementType WHERE TypeName = 'Photo');
INSERT INTO LikedElement (idUser, idElementType, idLikedElement) VALUES (@userId, @typeId, @photoId);
and for places:
SET @typeId = (SELECT idType FROM ElementType WHERE TypeName = 'Place');
INSERT INTO LikedElement (idUser, idElementType, idLikedElement) VALUES (@userId, @typeId, @placeId);
and so on... I think that the second approach is better, but I also feel like something is missing in this design as well...
Finally, I also wonder what the best place is to store the counter for how many times an element was liked. I can think of only two ways:
in the element (Photo/Article/Place) table
by SELECT COUNT().
I hope that my explanation of the issue is more thorough now.
The most extensible solution is to have just one "base" table (connected to "likes", tags and comments), and "inherit" all other tables from it. Adding a new kind of entity involves just adding a new "inherited" table - it then automatically plugs into the whole like/tag/comment machinery.
The entity-relationship term for this is a "category" (see the ERwin Methods Guide, section "Subtype Relationships").
Assuming a user can like multiple entities, the same tag can be used for more than one entity, but a comment is entity-specific, your model could look like this:
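A rough SQL sketch of that shape (a simplified interpretation with placeholder names; Article and Place would follow the same pattern as Photo, and tags would get a Tag table plus a Tag-Entity junction):

CREATE TABLE Entity (
    EntityId   INT AUTO_INCREMENT PRIMARY KEY,
    EntityType VARCHAR(20) NOT NULL              -- 'Photo', 'Article', 'Place', ...
);

CREATE TABLE Photo (
    EntityId INT PRIMARY KEY,                    -- also the FK into the base table
    Url      VARCHAR(255) NOT NULL,
    FOREIGN KEY (EntityId) REFERENCES Entity (EntityId)
);

CREATE TABLE `Like` (
    UserId   INT NOT NULL,
    EntityId INT NOT NULL,
    PRIMARY KEY (UserId, EntityId),
    FOREIGN KEY (EntityId) REFERENCES Entity (EntityId)
);

CREATE TABLE Comment (
    CommentId INT AUTO_INCREMENT PRIMARY KEY,
    UserId    INT NOT NULL,
    EntityId  INT NOT NULL,
    Body      TEXT NOT NULL,
    FOREIGN KEY (EntityId) REFERENCES Entity (EntityId)
);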
BTW, there are roughly 3 ways to implement the "ER category":
All types in one table.
All concrete types in separate tables.
All concrete and abstract types in separate tables.
Unless you have very stringent performance requirements, the third approach is probably the best (meaning the physical tables match 1:1 the entities in the diagram above).
Since you "hate" databases, why are you trying to implement one? Instead, solicit help from someone who loves and breathes this stuff.
Otherwise, learn to love your database. A well designed database simplifies programming, engineering the site, and smooths its continuing operation. Even an experienced d/b designer will not have complete and perfect foresight: some schema changes down the road will be needed as usage patterns emerge or requirements change.
If this is a one man project, program the database interface into simple operations using stored procedures: add_user, update_user, add_comment, add_like, upload_photo, list_comments, etc. Do not embed the schema into even one line of code. In this manner, the database schema can be changed without affecting any code: only the stored procedures should know about the schema.
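For example, a minimal sketch of one such procedure (the table and column names are made up purely for illustration):

DELIMITER //
CREATE PROCEDURE add_comment (IN p_user_id INT, IN p_entity_id INT, IN p_body TEXT)
BEGIN
    -- calling code never needs to know which table(s) this touches
    INSERT INTO Comment (UserId, EntityId, Body)
    VALUES (p_user_id, p_entity_id, p_body);
END //
DELIMITER ;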
You may have to refactor the schema several times. This is normal. Don't worry about getting it perfect the first time. Just make it functional enough to prototype an initial design. If you have the luxury of time, use it some, and then delete the schema and do it again. It is always better the second time.
This is a general idea; please don't pay much attention to the field name styling, but more to the relations and structure.
This pseudocode will get all the comments of photo with ID 5
SELECT * FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff = 'photo'
AND actions.typeAction = 'comment'
This pseudocode will get all the likes or users who liked photo with ID 5
(you may use count() to just get the amount of likes)
SELECT * FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff = 'photo'
AND actions.typeAction = 'like'
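Counting instead of listing is the same query with an aggregate:

SELECT COUNT(*) AS like_count
FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff = 'photo'
AND actions.typeAction = 'like'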
As far as I understand, several tables are required, with a many-to-many relation between them:
A table which stores user data such as name, surname and birth date, with an identity field.
A table which stores data types. These types may be photos, shares, links. Each type must have a unique table, so there is a relation between their individual tables and this table.
Each different data type has its own table, for example status updates, photos, links.
The last table is for the many-to-many relation, storing an id, user id, data type and data id.
Look at the access patterns you are going to need. Do any of them seem to be made particularly difficult or inefficient by one design choice or the other?
If not, favour the one that requires fewer tables.
In this case:
Add Comment: you either pick a particular many-to-many table or insert into a common table with a known specific identifier for what is being liked; I think the client code will be slightly simpler in your second case.
Find comments for item: here using a common table seems slightly easier - we just have a single query parameterised by the type of entity.
Find comments by a person about one kind of thing: a simple query in either case.
Find all comments by a person about all things: this seems a little gnarly either way.
I think your "discriminated" approach, option 2, yields simpler queries in some cases and doesn't seem much worse in the others so I'd go with it.
Consider using a table per entity for comments and the like. More tables means better sharding and scaling. It's not a problem to manage many similar tables in any framework I know.
One day you'll need to optimize reads from such a structure. You can easily create aggregating tables over the base ones and lose a bit on writes.
One big table with a type dictionary may become unmanageable one day.
Definitely go with the second approach, where you have one table and store the element type for each row; it will give you a lot more flexibility. When something can logically be done with fewer tables, it is almost always better to go with fewer tables. One advantage that comes to mind for your particular case: if you want to delete all liked elements of a certain user, with the first approach you need to issue one query for each element type, but with the second approach it can be done with a single query. Or consider adding a new element type: with the first approach that means creating a new table for each new type, but with the second approach you don't need to do anything...
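To make that concrete with the asker's own table names, a sketch of the "delete everything a user liked" case:

-- second approach: one statement, no matter how many element types exist
DELETE FROM LikedElement WHERE idUser = 123;

-- first approach: one statement per element type
DELETE FROM LikedPhotos WHERE idUser = 123;
DELETE FROM LikedArticles WHERE idUser = 123;
DELETE FROM LikedPlace WHERE idUser = 123;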

Database user table design, for specific scenario

I know this question has been asked and answered many times, and I've spent a decent amount of time reading through the following questions:
Database table structure for user settings
How to handle a few dozen flags in a database
Storing flags in a DB
How many database table columns are too many?
How many columns is too many columns?
The problem is that there seem to be a somewhat even distribution of supporters for a few classes of solutions:
Stick user settings in a single table as long as it's normalized
Split it into two tables that are 1 to 1, for example "users" and "user_settings"
Generalize it with some sort of key-value system
Stick setting flags in bitfield or other serialized form
So at the risk of asking a duplicate question, I'd like to describe my specific scenario, and hopefully get a more specific answer.
Currently my site has a single user table in MySQL, with around 10-15 columns (id, name, email, password...).
I'd like to add a set of per-user settings for whether to send email alerts for different types of events (notify_if_user_follows_me, notify_if_user_messages_me, notify_when_friend_posts_new_stuff...)
I anticipate that in the future I'd infrequently be adding one-off per-user settings which are mostly 1 to 1 with users.
I'm leaning towards creating a second user_settings table and sticking "non-essential" information such as email notification settings there, for the sake of keeping the main user table more readable, but I am very curious to hear what experts have to say.
It seems your dilemma is whether or not to vertically partition the user table. You may want to read this SO Q/A too.
I'm gonna cast my vote for adding two tables... (some sort of key-value system)
It is preferable (to me) to add data instead of columns, so:
Add a new table that links users to settings, then add a table for the settings themselves.
Things like notify_if_user_follows_me, notify_if_user_messages_me and notify_when_friend_posts_new_stuff would then become row insertions with an id, and you can reference them at any time and extend them as needed without changing the schema.
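A rough sketch of that key-value layout (the names are only illustrative):

CREATE TABLE settings (
    setting_id INT AUTO_INCREMENT PRIMARY KEY,
    name       VARCHAR(100) NOT NULL UNIQUE     -- e.g. 'notify_if_user_follows_me'
);

CREATE TABLE user_settings (
    user_id    INT NOT NULL,
    setting_id INT NOT NULL,
    value      TINYINT(1) NOT NULL DEFAULT 1,
    PRIMARY KEY (user_id, setting_id),
    FOREIGN KEY (setting_id) REFERENCES settings (setting_id)
);

-- adding a new setting later is just a row, not a schema change
INSERT INTO settings (name) VALUES ('notify_when_friend_posts_new_stuff');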

Is it possible to have "folders" in a database?

I am going to have a database with several (fewer than 10) "main" tables. In addition to that I want to have hundreds or thousands of tables of the same type (say "user_1", "user_2", "user_3" and so on). Is it possible to put all these tables in a directory/folder? Or is the database itself already considered a "folder" for tables?
ADDED
Since I got a lot of questions about why I want to do this, I want to elaborate. I want to have many tables to optimize queries against the database. If I put everything in one table, the table is going to be huge. Then, if I want to extract information about a particular user, I first need to find the rows in the table that have that user in a given column, and that can be time consuming. So I decided to create a table for every user; if I need to know something about a user, I just read the required information from a "small" table.
To be more specific, I can have 10,000 users and the information about a given user can contain 10,000 lines. I do not want to have one table with 100,000,000 lines.
The answer is: you shouldn't be doing this in the first place.
Don't have separate tables for each user. Instead, use one table for all your user data, and add a column (e.g. userId) to store information on who it's about.
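As a sketch (names are placeholders), the single-table shape with an index on the user column looks like this; the index does the work that the 10,000 separate tables would have done:

CREATE TABLE user_data (
    id      BIGINT AUTO_INCREMENT PRIMARY KEY,
    user_id INT NOT NULL,
    payload TEXT,
    INDEX idx_user (user_id)     -- per-user lookups stay fast even at 100,000,000 rows
);

-- fetch one user's rows without scanning everyone else's
SELECT * FROM user_data WHERE user_id = 42;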
If you want separate tables based on the user, this tends to be done using an owner or schema concept. In other words, you use:
create table pax.table1 ...
and pax is then the owner of that table. Each user can then have their own data.
If you don't mind everyone seeing the data in each other's "folders", you can opt for a single table with a column specifying the particular user, but you tend to lose user-based protection in that case.
Having each user's data in their own schema (or owner) means that you can restrict access based on user name. Keep in mind that these are then separate tables so it becomes harder to consolidate data from them should you wish to do so.
It's pretty unusual to have hundreds of thousands of tables, even in the biggest database setups. You might want to consider the possibility that you're doing something unwise. Posting the "why" of this question instead of the "how" will help us in assisting you further.