I'm having trouble designing something in a database. I have two solutions for a problem, and I was wondering which one is better.
Problem :
Let's picture a table speciality with 2 fields : speciality_id and speciality_name.
So for example :
1 - Mage
2 - Warrior
3 - Priest
Now, I have a table user with fields such as user_id, name, firstname etc ...
In this table, there is a field called speciality. The speciality stores an integer, corresponding to the speciality_id of the table speciality.
That would be acceptable for users that have only one speciality. I want to improve the model to be able to have multiple specialities for a user.
Here are my two solutions :
Create a table 'solution1' which links the user_id with the speciality_id, and remove the speciality field from the user table. So for a user who has 2 specialities, 2 rows will be created in the table 'solution1'.
Change the type of the field speciality in the user table so that it can hold several specialities, separated by a delimiter.
For example 2;3
The problem I have with the second solution is creating foreign keys between my table user and my table speciality to link them. I may also have more difficulty in PHP later on when I want to get the specialities for a user (I'd need a parser, I guess).
Which solution do you think is best?
Thanks.
Absolutely go with your first solution.
Create a third "Many-to-Many" table that allows you to relate a user to multiple specialties. This is the only way to go in your case.
When designing tables, you always want to have each column contain one and only one data element. Think about what querying your second solution would look like. What would you do when you wanted to see all users who had a given specialty?
You might try something like this:
select * from user where specialty like '%2%'
Well, what happens when you have specialties that go to 12? Now "2" matches multiple entities. You could devolve further and try to be tricky, but...you really should just make your data design as normal as possible to avoid all the mess, headache, and errors. Go with Solution 1.
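To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name user_speciality, the speciality_csv column, the 'Druid' row and the sample users are assumptions made up for the illustration; speciality_csv is kept only to show what Solution 2 would look like next to Solution 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE speciality (
    speciality_id   INTEGER PRIMARY KEY,
    speciality_name TEXT NOT NULL
);
-- speciality_csv exists only to demonstrate Solution 2 (delimited list)
CREATE TABLE user (
    user_id        INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    speciality_csv TEXT
);
-- Solution 1: one row per (user, speciality) pair
CREATE TABLE user_speciality (
    user_id       INTEGER NOT NULL REFERENCES user(user_id),
    speciality_id INTEGER NOT NULL REFERENCES speciality(speciality_id),
    PRIMARY KEY (user_id, speciality_id)
);
INSERT INTO speciality VALUES (2, 'Warrior'), (3, 'Priest'), (12, 'Druid');
INSERT INTO user VALUES (1, 'Alice', '2;3'), (2, 'Bob', '12');
INSERT INTO user_speciality VALUES (1, 2), (1, 3), (2, 12);
""")

# Solution 2: substring matching also hits speciality 12
like_hits = [r[0] for r in conn.execute(
    "SELECT name FROM user WHERE speciality_csv LIKE '%2%' ORDER BY user_id")]

# Solution 1: an exact match on the linking table
join_hits = [r[0] for r in conn.execute(
    "SELECT u.name FROM user u "
    "JOIN user_speciality us ON us.user_id = u.user_id "
    "WHERE us.speciality_id = 2 ORDER BY u.user_id")]

print(like_hits)  # Bob is reported as a Warrior although he only has speciality 12
print(join_hits)
```

The composite primary key on the linking table also prevents the same speciality being attached to a user twice, which the delimited column cannot enforce.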
I think the best way is to follow solution 1, because solution 2 will end up with a lot of complexity later on.
I am creating two tables with an M:N relationship. One table is called user, the other is edit (an edit of an image or a text). A user can vote on an edit, and an edit can have multiple votes, hence the linking table. When a user votes on an edit, the vote compares 2 edits, so I want to store the edit it was compared against. I wonder what would be a nice way to implement that in a database.
So 2 edits get voted on by a user and one is judged better than the other. I want to store which edit was voted up, which was voted down, and the other edit that it was compared to.
Here is what my original design looks like:
and here is a solution I came up with. Please tell me if this is a good way of accomplishing what I want:
There is a UNIQUE index (alternate key, AK) on NewDocument (DocumentID, DocumentType); the foreign key from the Document table (DocumentID, DocumentType) points here. This is used to lock in the document type for a given DocumentID. Once you open a new document, place version 1 in the Document table.
Place a check constraint on EditVotes for Version_B > Version_A
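As a rough sketch of what that check constraint buys you, here it is in Python's sqlite3. Only EditVotes, Version_A and Version_B come from the answer above; the UserID, DocumentID and Winner columns and the key are assumptions for illustration. Forcing Version_B > Version_A means each pair of versions has exactly one stored ordering, so the same comparison cannot be recorded both as (1, 2) and as (2, 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE EditVotes (
    UserID     INTEGER NOT NULL,
    DocumentID INTEGER NOT NULL,
    Version_A  INTEGER NOT NULL,
    Version_B  INTEGER NOT NULL,
    Winner     INTEGER NOT NULL,  -- hypothetical column: the version voted up
    PRIMARY KEY (UserID, DocumentID, Version_A, Version_B),
    CHECK (Version_B > Version_A),            -- canonical ordering of the pair
    CHECK (Winner IN (Version_A, Version_B))  -- winner must be one of the two
)
""")

# User 1 compares versions 1 and 2 of document 10 and prefers version 2
conn.execute("INSERT INTO EditVotes VALUES (1, 10, 1, 2, 2)")

# The same comparison written the other way round violates the check
try:
    conn.execute("INSERT INTO EditVotes VALUES (1, 10, 2, 1, 2)")
    reversed_pair_rejected = False
except sqlite3.IntegrityError:
    reversed_pair_rejected = True

vote_count = conn.execute("SELECT COUNT(*) FROM EditVotes").fetchone()[0]
print(reversed_pair_rejected, vote_count)
```

The version that lost is simply whichever of Version_A and Version_B is not the Winner, so it does not need its own column.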
I'm a software developer. I love to code, but I hate databases... Currently, I'm creating a website on which a user will be allowed to mark an entity as liked (like in FB), tag it and comment.
I'm stuck on the database table design for handling this functionality. The solution is trivial if we only need this for one type of thing (e.g. photos). But I need to enable it for 5 different things (for now; I also assume this number can grow as the whole service grows).
I found some similar questions here, but none of them have a satisfying answer, so I'm asking this question again.
The question is how to properly, efficiently and flexibly design the database so that it can store comments, likes and tags for several different tables. An answer in the form of a design pattern would be best ;)
Detailed description:
I have a table User with some user data, and 3 more tables: Photo with photographs, Articles with articles, Places with places. I want to enable any logged user to:
comment on any of those 3 tables
mark any of them as liked
tag any of them with some tag
I also want to count the number of likes for every element and the number of times that particular tag was used.
1st approach:
a) For tags, I will create a table Tag [TagId, tagName, tagCounter], then I will create many-to-many relationships tables for: Photo_has_tags, Place_has_tag, Article_has_tag.
b) The same goes for comments.
c) I will create a table LikedPhotos [idUser, idPhoto], LikedArticles [idUser, idArticle], LikedPlace [idUser, idPlace]. The number of likes will be calculated by queries (which, I assume, is bad). And...
I really don't like this design for the last part, it smells bad to me ;)
2nd approach:
I will create a table ElementType [idType, TypeName == some table name] which will be populated by the administrator (me) with the names of tables that can be liked, commented or tagged. Then I will create tables:
a) LikedElement [idLike, idUser, idElementType, idLikedElement] and the same for Comments and Tags with the proper columns for each. Now, when I want to make a photo liked I will insert:
SELECT idType FROM ElementType WHERE TypeName = 'Photo';  -- keep the result as typeId
INSERT INTO LikedElement (idUser, idElementType, idLikedElement) VALUES (:userId, :typeId, :photoId);
and for places:
SELECT idType FROM ElementType WHERE TypeName = 'Place';  -- keep the result as typeId
INSERT INTO LikedElement (idUser, idElementType, idLikedElement) VALUES (:userId, :typeId, :placeId);
and so on... I think that the second approach is better, but I also feel like something is missing in this design as well...
Lastly, I also wonder where the best place is to store the counter for how many times an element was liked. I can think of only two ways:
in element (Photo/Article/Place) table
by select count().
I hope that my explanation of the issue is more thorough now.
The most extensible solution is to have just one "base" table (connected to "likes", tags and comments), and "inherit" all other tables from it. Adding a new kind of entity involves just adding a new "inherited" table - it then automatically plugs into the whole like/tag/comment machinery.
Entity-relationship term for this is "category" (see the ERwin Methods Guide, section: "Subtype Relationships"). The category symbol is:
Assuming a user can like multiple entities, the same tag can be used for more than one entity, but a comment is entity-specific, your model could look like this:
BTW, there are roughly 3 ways to implement the "ER category":
All types in one table.
All concrete types in separate tables.
All concrete and abstract types in separate tables.
Unless you have very stringent performance requirements, the third approach is probably the best (meaning the physical tables match 1:1 the entities in the diagram above).
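Here is a minimal sketch of that third approach in Python's sqlite3, limited to likes on two concrete types. All table and column names (entity, photo, place, user_like) are assumptions for illustration. Each concrete table shares its primary key with the base table, and likes point only at the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
-- Abstract base type: one row per likeable/taggable/commentable thing
CREATE TABLE entity (entity_id INTEGER PRIMARY KEY);

-- Concrete types "inherit" by using the base key as their own primary key
CREATE TABLE photo (
    entity_id INTEGER PRIMARY KEY REFERENCES entity(entity_id),
    title     TEXT
);
CREATE TABLE place (
    entity_id INTEGER PRIMARY KEY REFERENCES entity(entity_id),
    name      TEXT
);

-- Likes reference the base table, so they work for every concrete type
CREATE TABLE user_like (
    user_id   INTEGER NOT NULL,
    entity_id INTEGER NOT NULL REFERENCES entity(entity_id),
    PRIMARY KEY (user_id, entity_id)
);

INSERT INTO entity VALUES (1), (2);
INSERT INTO photo VALUES (1, 'Sunset');
INSERT INTO place VALUES (2, 'Harbour');
INSERT INTO user_like VALUES (100, 1), (101, 1), (100, 2);
""")

# Like counts per entity, regardless of concrete type
likes = dict(conn.execute(
    "SELECT entity_id, COUNT(*) FROM user_like GROUP BY entity_id"))

# A like on something that is not an entity is rejected by the foreign key
try:
    conn.execute("INSERT INTO user_like VALUES (100, 99)")
    orphan_like_rejected = False
except sqlite3.IntegrityError:
    orphan_like_rejected = True

print(likes, orphan_like_rejected)
```

Adding a new likeable type in this sketch means adding one more table keyed on entity_id; user_like itself does not change.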
Since you "hate" databases, why are you trying to implement one? Instead, solicit help from someone who loves and breathes this stuff.
Otherwise, learn to love your database. A well-designed database simplifies programming and engineering the site, and smooths its continuing operation. Even an experienced database designer will not have complete and perfect foresight: some schema changes down the road will be needed as usage patterns emerge or requirements change.
If this is a one man project, program the database interface into simple operations using stored procedures: add_user, update_user, add_comment, add_like, upload_photo, list_comments, etc. Do not embed the schema into even one line of code. In this manner, the database schema can be changed without affecting any code: only the stored procedures should know about the schema.
You may have to refactor the schema several times. This is normal. Don't worry about getting it perfect the first time. Just make it functional enough to prototype an initial design. If you have the luxury of time, use the prototype for a while, then delete the schema and do it again. It is always better the second time.
This is a general idea
Please don't pay much attention to the field name styling, but rather to the relations and structure.
This pseudocode will get all the comments of photo with ID 5
SELECT * FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff = 'photo'
AND actions.typeAction = 'comment';
This pseudocode will get all the likes or users who liked photo with ID 5
(you may use count() to just get the amount of likes)
SELECT * FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff = 'photo'
AND actions.typeAction = 'like';
As far as I understand, several tables are required, with a many-to-many relation between them.
A table which stores the user data such as name, surname and birth date, with an identity field.
A table which stores data types. These types may be photos, shares, links. Each type must have its own table, so there is a relation between those individual tables and this table.
Each data type has its own table, for example status updates, photos, links.
The last table is for the many-to-many relation, storing an id, user id, data type and data id.
Look at the access patterns you are going to need. Do any of them seem to be made particularly difficult or inefficient by one design choice or the other?
If not, favour the one that requires fewer tables.
In this case:
Add Comment: you either pick a particular many/many table or insert into a common table with a known specific identifier for what is being commented on. I think client code will be slightly simpler in your second case.
Find comments for item: here it seems using a common table is slightly easier - we just have a single query parameterised by type of entity
Find comments by a person about one kind of thing: simple query in either case
Find all comments by a person about all things: this seems a little gnarly either way.
I think your "discriminated" approach, option 2, yields simpler queries in some cases and doesn't seem much worse in the others so I'd go with it.
Consider using a table per entity for comments and so on. More tables mean easier sharding and scaling, and managing many similar tables is not a problem in any framework I know.
One day you'll need to optimize reads from such a structure. You can easily create aggregating tables over the base ones, at the cost of slightly slower writes.
One big table with a dictionary may become uncontrollable one day.
Definitely go with the second approach, where you have one table and store the element type for each row; it will give you a lot more flexibility. Basically, when something can logically be done with fewer tables, it is almost always better to go with fewer tables. Two advantages come to mind for your particular case. If you want to delete all liked elements of a certain user, with your first approach you need to issue one query per element type, but with the second approach it can be done with a single query. And if you want to add a new element type, the first approach involves creating new tables for it, while with the second approach you only need to add a row to ElementType.
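A small sketch of the second approach in Python's sqlite3 shows both the like count and the single-query delete. The table and column names follow the question; the add_like helper, the UNIQUE constraint and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ElementType (
    idType   INTEGER PRIMARY KEY,
    TypeName TEXT NOT NULL UNIQUE
);
CREATE TABLE LikedElement (
    idLike         INTEGER PRIMARY KEY,
    idUser         INTEGER NOT NULL,
    idElementType  INTEGER NOT NULL REFERENCES ElementType(idType),
    idLikedElement INTEGER NOT NULL,  -- id inside Photo/Place/...; no FK possible
    UNIQUE (idUser, idElementType, idLikedElement)  -- one like per user per item
);
INSERT INTO ElementType VALUES (1, 'Photo'), (2, 'Place');
""")

def add_like(user_id, type_name, element_id):
    """Hypothetical helper: look up the type id and insert the like in one statement.

    An unknown type_name matches no ElementType row, so nothing is inserted.
    """
    conn.execute(
        "INSERT INTO LikedElement (idUser, idElementType, idLikedElement) "
        "SELECT ?, idType, ? FROM ElementType WHERE TypeName = ?",
        (user_id, element_id, type_name),
    )

add_like(7, "Photo", 5)
add_like(7, "Place", 9)
add_like(8, "Photo", 5)

# Number of likes for photo 5, computed with COUNT(*)
photo_5_likes = conn.execute(
    "SELECT COUNT(*) FROM LikedElement le "
    "JOIN ElementType et ON et.idType = le.idElementType "
    "WHERE et.TypeName = 'Photo' AND le.idLikedElement = 5").fetchone()[0]

# All likes of user 7, across every element type, removed with one statement
deleted = conn.execute(
    "DELETE FROM LikedElement WHERE idUser = ?", (7,)).rowcount
remaining = conn.execute("SELECT COUNT(*) FROM LikedElement").fetchone()[0]
print(photo_5_likes, deleted, remaining)
```

The trade-off visible here is that idLikedElement cannot be declared as a foreign key, because the table it points into depends on idElementType.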
What is the "proper" (most normalized?) way to store requests in the database? For example, a user submits an article. This article must be reviewed and approved before it is posted to the site.
Which is the more proper way:
A) store it in the Articles table with an "Approved" field which is either 0, 1 or 2 (denied, approved, pending)
OR
B) Have an ArticleRequests table which has the same fields as Articles, and upon approval, move the row data from ArticleRequests to Articles.
Thanks!
Since every article is going to have an approval status, and each time an article is requested you're very likely going to need to know that status - keep it inline with the table.
Do consider calling the field ApprovalStatus, though. You may want to add a related table to contain each of the statuses unless they aren't going to change very often (or ever).
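For illustration, option A with a small status lookup table might look like this in Python's sqlite3. The ApprovalStatus field name and the 0/1/2 values come from this thread; the lookup table's columns, the default of 'pending' and the sample articles are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Lookup table holding the allowed statuses
CREATE TABLE ApprovalStatus (
    StatusId   INTEGER PRIMARY KEY,
    StatusName TEXT NOT NULL UNIQUE
);
INSERT INTO ApprovalStatus VALUES (0, 'denied'), (1, 'approved'), (2, 'pending');

-- The status lives inline in Articles; new submissions start as pending
CREATE TABLE Articles (
    ArticleId      INTEGER PRIMARY KEY,
    Title          TEXT NOT NULL,
    ApprovalStatus INTEGER NOT NULL DEFAULT 2
                   REFERENCES ApprovalStatus(StatusId)
);
INSERT INTO Articles (Title) VALUES ('First draft'), ('Second draft');

-- Approving an article is a single UPDATE; no rows move between tables
UPDATE Articles SET ApprovalStatus = 1 WHERE Title = 'First draft';
""")

# What the public site would list
published = [r[0] for r in conn.execute(
    "SELECT a.Title FROM Articles a "
    "JOIN ApprovalStatus s ON s.StatusId = a.ApprovalStatus "
    "WHERE s.StatusName = 'approved'")]

# What the review queue would list
pending = [r[0] for r in conn.execute(
    "SELECT Title FROM Articles WHERE ApprovalStatus = 2")]
print(published, pending)
```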
EDIT: Reasons to keep fields in related tables are:
If the related field is not always applicable, or may frequently be null.
If the related field is only needed in rare scenarios and is better described by using a foreign key into a related table of associated attributes.
In your case those above reasons don't apply.
Definitely do 'A'.
If you do B, you'll be creating a new table with the same fields as the other one and that means you're doing something wrong. You're repeating yourself.
I think it's better to store the data in the main table with a specific status, because then there is no need to move data between tables when an article is approved, and the article appears on the site at the same moment. If you don't want to keep disapproved articles, create a cron script which removes the unnecessary data or moves it to an archive table. That way you put less load on your db, because you can schedule the removal of old articles for a quiet time, for example at night.
Regarding the problem of using the approval status in each query: if you are planning a very popular, high-load site with searching or article listings, you will use a standalone search server like Sphinx or Solr (MySQL is not a good solution for this purpose) and feed it only the data with status='Approved'. Delta indexing helps keep that data up to date.
The website I'm building has a table which stores all the information of uploaded images on the site. These uploaded images can come from different resources such as a guestbook, news section or an item from an agenda.
Of course I want the image to inherit the rights of the resource it is part of. For example: if user A isn't allowed to view the guestbook, I don't want him to be able to view an image posted on the guestbook by going to image/view/id/12 (which would be the image request used in the guestbook).
What I have now is that the system remembers the resource used (in this case the guestbook): the image-id is coupled to the resource-id. However I don't know to which guestbook post the image is connected (I do of course know it the other way around).
Is there a way in SQL to connect one table field to a field in another table, where which table I connect to can vary based on one of the first table's field values?
In my case I would like to connect an image to a resource this could be a guestbook post in the table gb_posts or an agenda item in the table agenda_items.
Or is this all a stupid way of solving the problem and should I not use one table for the uploaded images but keep the image attached to the resource (as a column in the table for example)? It sounds like using one table is at least a lot slower in use (but I would have a great overview of all the images in one place).
I hope you guys can help me out.
EDIT: extra explanation: db model
I will try to explain how it all works the best I can.
First of all: I use Zend Framework, and therefore I also use Zend_Acl for working with privileges.
My DB structure:
- Users are connected to roles (directly or by being connected to a group that is connected to a role)
- There is a table resources containing all the resources, which is connected to privileges. For example: guestbook is a resource, view or edit are the privileges. Besides the controllers/actions, there can also be other resources in this table, such as a category within the agenda or a file location.
- roles are connected to a privilege
When for example the guestbook is requested for viewing I can check if the user is allowed to.
In short something like:
users -> roles -> privileges <- resources
When a user adds a guestbook post with an image, the resource used (in this case guestbook) is saved:
guestbook_posts -> images -> resources
I hope this explains my DB model for a bit, if it doesn't I will try to create an image of the tables.
I have to admit I'm failing to completely understand the model you wish to implement, but there is an interesting quote...
"However I don't know to which guestbook post the image is connected (I do of course know it the other way around)."
If you know an association one way, you should be able to use the association in both directions? I'm assuming you have a table that includes "post_id, image_id", or something?
It may be that the table is only indexed post_id first, in which case querying that table by image_id may be slow, but then you can just include a new index with image_id first?
If you can give examples of the table structure you have at present, and an example of the query you can't fulfil, we may be able to help you further.
Sounds like you want a foreign key constraint.
Update: Completely misunderstood the question, apparently.
There are two approaches here:
As it currently stands, there is nothing in the schema that would prohibit linking the same image from multiple resources. If that is desired, then a foreign key constraint and an index for the backreference is probably the best solution, although it will not scale well, and requires additional computation (because the rights on the image need to be the union of the rights of the referring resources).
The alternative is to create some kind of inheritance schema, where there is a table listing "resources" (that effectively just contains identifiers) that is referenced as a foreign key from the actual resource tables and the images table; the only constraint that cannot be expressed in plain SQL is that different resources may not share the same identifier.
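A rough sketch of that inheritance schema in Python's sqlite3 follows. The table names gb_posts and agenda_items are from the question; the identifier table is called resource_ids here (a made-up name, to keep it apart from the ACL resources table in the question), and the columns, sample rows and owner_of helper are likewise assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Identifier-only table that every concrete resource row registers in
CREATE TABLE resource_ids (resource_id INTEGER PRIMARY KEY);

CREATE TABLE gb_posts (
    resource_id INTEGER PRIMARY KEY REFERENCES resource_ids(resource_id),
    body        TEXT
);
CREATE TABLE agenda_items (
    resource_id INTEGER PRIMARY KEY REFERENCES resource_ids(resource_id),
    title       TEXT
);

-- Each image points at exactly one registered resource row
CREATE TABLE images (
    image_id    INTEGER PRIMARY KEY,
    resource_id INTEGER NOT NULL REFERENCES resource_ids(resource_id),
    path        TEXT
);

INSERT INTO resource_ids VALUES (1), (2);
INSERT INTO gb_posts VALUES (1, 'Nice site!');
INSERT INTO agenda_items VALUES (2, 'Board meeting');
INSERT INTO images VALUES (12, 1, 'gb.png'), (13, 2, 'agenda.png');
""")

def owner_of(image_id):
    """Hypothetical helper: return (owning table, resource_id) for an image."""
    return conn.execute("""
        SELECT CASE WHEN g.resource_id IS NOT NULL THEN 'gb_posts'
                    WHEN a.resource_id IS NOT NULL THEN 'agenda_items'
               END,
               i.resource_id
        FROM images i
        LEFT JOIN gb_posts g     ON g.resource_id = i.resource_id
        LEFT JOIN agenda_items a ON a.resource_id = i.resource_id
        WHERE i.image_id = ?""", (image_id,)).fetchone()

print(owner_of(12), owner_of(13))
```

As noted above, nothing in this schema stops a gb_posts row and an agenda_items row from claiming the same resource_id; that rule has to be enforced elsewhere.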
Create two SELECT statements, each with the correct joins to the correct tables, and then combine their output using the UNION operator.
SELECT field1, field2
FROM table1
JOIN table2 on table1.PK = table2.FK
WHERE table1.selector = 1
UNION SELECT field1, field2
FROM table1
JOIN table3 on table1.PK = table3.FK
WHERE table1.selector = 2
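Filled in with the tables from the question, the pattern might look like the following sqlite3 sketch. It assumes images plays the role of table1 with a source_type column as the selector (1 for guestbook, 2 for agenda), and that gb_posts and agenda_items each carry an image_id foreign key; none of these columns are confirmed by the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (
    image_id    INTEGER PRIMARY KEY,
    source_type INTEGER NOT NULL  -- selector: 1 = guestbook, 2 = agenda
);
CREATE TABLE gb_posts (
    post_id  INTEGER PRIMARY KEY,
    image_id INTEGER REFERENCES images(image_id),
    body     TEXT
);
CREATE TABLE agenda_items (
    item_id  INTEGER PRIMARY KEY,
    image_id INTEGER REFERENCES images(image_id),
    title    TEXT
);
INSERT INTO images VALUES (12, 1), (13, 2);
INSERT INTO gb_posts VALUES (1, 12, 'Nice site!');
INSERT INTO agenda_items VALUES (1, 13, 'Board meeting');
""")

# One SELECT per target table, chosen by the selector, merged with UNION
rows = conn.execute("""
    SELECT i.image_id, g.body AS label
    FROM images i
    JOIN gb_posts g ON i.image_id = g.image_id
    WHERE i.source_type = 1
    UNION
    SELECT i.image_id, a.title
    FROM images i
    JOIN agenda_items a ON i.image_id = a.image_id
    WHERE i.source_type = 2
    ORDER BY 1
""").fetchall()
print(rows)
```

Both branches have to return the same number of columns with compatible types, which is why each one selects a single text column next to image_id.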