Data structure : "likeable" object - mysql

I'm currently working on a app that has a lot of different Model which are "likeable". This means each of them can be liked/disliked.
Is it better to create an unique table "likes", and referencing in each row the table reference + the table reference id, or to create an unique "likes" associated table for each likeable Model ?

It depends. It depends on many things. What you've described is only a small portion of the problem space I'm assuming. There are important questions to be asked here, like, "Do you need to keep track of what user 'liked' what model?" Answers to questions like that will heavily influence the design of this database.
Having a separate 'like' table hanging off of each model table is what we call de-normalized. It's not necessarily a bad thing, it just depends on what your plan is for this data store. The normalized approach would be to have a single 'like' table that all the model tables relate to.
For this, you really need more information. Once you have ALL of the requirements in place, it should become a lot more clear TO YOU, what the best approach is.

Related

Asking opinion about table structure

I'm working on a project to make a digital form of this paper
this paper (can't post image)
and the data will displayed on a Web in a simple table view. There will be NO altering, deleting, updating. It's just displaying (via SELECT * of course) the data inputted.
The data will be inserted via android app and stored in a single table which has 30 columns in mysql.
and the question is, is it a good idea if i use a single table? because i think there will be no complex operation in the sql.
and the other question is, am i violating some rules for this method?
I need your opinion. thanks.
It's totally ok to use only one table, if that suits your needs. What you can do to make the database a little bit 'smarter' is add new tables for attributes in your paper that will be repeated. So, for example, the Soil Type could be another table where there are two columns, ID and Description, and you will use it as a foreign key in each record in the main table. You need this if you want your database to be in 3NF.
To sum up, yes you can have one table if that's all you need. However, adding more tables might help save some space and make your database more flexible. It's up to you to decide! :)

How do I join on the correct one-to-one table (super-type, sub-type model)?

I've been looking into first, second, and third normal forms, and I want to do a better job normalizing my tables. Part of this, I realized, was that I've never understood the purpose of one-to-one tables. From what I understand, "optional" data should be grouped into another table, leaving distinct entities intact, while avoiding the nuances of maintaining several NULL fields in one monolithic table.
So, a real-world scenario. In a CMS, I want to maintain several different types of "pages," making it extendable by additional plugins without affecting the original schema. I have these as sample tables so far:
Pages (title, path, type, etc.)
ContentPages (same as base page, but with keyword/description/content fields)
LinkedPages (same as base page, but contains a reference to another page)
ProductPages (same as base page, but with SKU and other ecomm-related info)
So far, so good. No NULLS. Self-documented design. Super-typing / Sub-typing is consistent between my PHP models and database. Everything's DANDY.
EXCEPT, given any page ID, I don't want to do a first query to get the base page info, figure out what type of page it is, and then get the corresponding sub-type information with another query. Do I have to keep track of this with application state (or URL), or is there a way to know which table to join on, while only knowing the page ID and nothing else?
This is really easy with only one table (obviously), as the NULL fields imply the type, or an ENUM can tell me what it is. Switching back to 1NF isn't an acceptable answer, as I already know how to do it. I want to learn this way ;)
UPDATE: Also wanted to mention that each of the sub-type properties is unique to that type. So, any common property shared by all types will, of course, go into the base page table. Sub-types won't share any other properties. This seemed like a logical way to group the sub-tables, but maybe I'm defeating the purpose of one-to-one tables with this arrangement...
It depends on who's asking the question. If your plugin is driving the query then it can start at its specific subtype and join in the supertype, which it knows must exist.
I don't know what your business requirements are, but it seems to me that if you are trying to keep things modular then you want to drive as many joins from the child side (i.e. the plugin side) as possible.
If you are going to have a query driven from the supertype to the subtype then you can use an outer join and just be ready in your code to handle null columns if the subtype in question isn't present. Obviously that approach is less modular, but I suppose there could be times when that is what you need or want to do.
you could create a view by left outer joining all the subtypes on the main Page table. The view could be queried by a single page_id and would return one row with many null values, the same as you'd get with one big 1st normal form page table.
is there a way to know which table to join on, while only knowing the
page ID and nothing else?
Well, in a supertype/subtype structure, you should know more than the page ID. You should also know the subtype.
Usually, a supertype/subtype structure for 'n' subtypes maps to
n + 1 tables, one for each subtype, plus one for the supertype, and
n updatable views, each of which joins the supertype with the appropriate subtype
So your application should usually be working with the views, not with the base tables. (Usually, but not always.)
If you're not using the views, then when you retrieve the page id numbers from the supertype, you should also retrieve the column that identifies the subtype. Don't have such a column? Fix that. And see this other relevant SO database design problem for a supertype/subtype with code, a description of the structure, and the logic behind it.

Implementing Comments and Likes in database

I'm a software developer. I love to code, but I hate databases... Currently, I'm creating a website on which a user will be allowed to mark an entity as liked (like in FB), tag it and comment.
I get stuck on database tables design for handling this functionality. Solution is trivial, if we can do this only for one type of thing (eg. photos). But I need to enable this for 5 different things (for now, but I also assume that this number can grow, as the whole service grows).
I found some similar questions here, but none of them have a satisfying answer, so I'm asking this question again.
The question is, how to properly, efficiently and elastically design the database, so that it can store comments for different tables, likes for different tables and tags for them. Some design pattern as answer will be best ;)
Detailed description:
I have a table User with some user data, and 3 more tables: Photo with photographs, Articles with articles, Places with places. I want to enable any logged user to:
comment on any of those 3 tables
mark any of them as liked
tag any of them with some tag
I also want to count the number of likes for every element and the number of times that particular tag was used.
1st approach:
a) For tags, I will create a table Tag [TagId, tagName, tagCounter], then I will create many-to-many relationships tables for: Photo_has_tags, Place_has_tag, Article_has_tag.
b) The same counts for comments.
c) I will create a table LikedPhotos [idUser, idPhoto], LikedArticles[idUser, idArticle], LikedPlace [idUser, idPlace]. Number of likes will be calculated by queries (which, I assume is bad). And...
I really don't like this design for the last part, it smells badly for me ;)
2nd approach:
I will create a table ElementType [idType, TypeName == some table name] which will be populated by the administrator (me) with the names of tables that can be liked, commented or tagged. Then I will create tables:
a) LikedElement [idLike, idUser, idElementType, idLikedElement] and the same for Comments and Tags with the proper columns for each. Now, when I want to make a photo liked I will insert:
typeId = SELECT id FROM ElementType WHERE TypeName == 'Photo'
INSERT (user id, typeId, photoId)
and for places:
typeId = SELECT id FROM ElementType WHERE TypeName == 'Place'
INSERT (user id, typeId, placeId)
and so on... I think that the second approach is better, but I also feel like something is missing in this design as well...
At last, I also wonder which the best place to store counter for how many times the element was liked is. I can think of only two ways:
in element (Photo/Article/Place) table
by select count().
I hope that my explanation of the issue is more thorough now.
The most extensible solution is to have just one "base" table (connected to "likes", tags and comments), and "inherit" all other tables from it. Adding a new kind of entity involves just adding a new "inherited" table - it then automatically plugs into the whole like/tag/comment machinery.
Entity-relationship term for this is "category" (see the ERwin Methods Guide, section: "Subtype Relationships"). The category symbol is:
Assuming a user can like multiple entities, a same tag can be used for more than one entity but a comment is entity-specific, your model could look like this:
BTW, there are roughly 3 ways to implement the "ER category":
All types in one table.
All concrete types in separate tables.
All concrete and abstract types in separate tables.
Unless you have very stringent performance requirements, the third approach is probably the best (meaning the physical tables match 1:1 the entities in the diagram above).
Since you "hate" databases, why are you trying to implement one? Instead, solicit help from someone who loves and breathes this stuff.
Otherwise, learn to love your database. A well designed database simplifies programming, engineering the site, and smooths its continuing operation. Even an experienced d/b designer will not have complete and perfect foresight: some schema changes down the road will be needed as usage patterns emerge or requirements change.
If this is a one man project, program the database interface into simple operations using stored procedures: add_user, update_user, add_comment, add_like, upload_photo, list_comments, etc. Do not embed the schema into even one line of code. In this manner, the database schema can be changed without affecting any code: only the stored procedures should know about the schema.
You may have to refactor the schema several times. This is normal. Don't worry about getting it perfect the first time. Just make it functional enough to prototype an initial design. If you have the luxury of time, use it some, and then delete the schema and do it again. It is always better the second time.
This is a general idea
please donĀ“t pay much attention to the field names styling, but more to the relation and structure
This pseudocode will get all the comments of photo with ID 5
SELECT * FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff="photo"
AND actions.typeAction = "comment"
This pseudocode will get all the likes or users who liked photo with ID 5
(you may use count() to just get the amount of likes)
SELECT * FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff="photo"
AND actions.typeAction = "like"
as far as i understand. several tables are required. There is a many to many relation between them.
Table which stores the user data such as name, surname, birth date with a identity field.
Table which stores data types. these types may be photos, shares, links. each type must has a unique table. therefore, there is a relation between their individual tables and this table.
each different data type has its table. for example, status updates, photos, links.
the last table is for many to many relation storing an id, user id, data type and data id.
Look at the access patterns you are going to need. Do any of them seem to made particularly difficult or inefficient my one design choice or the other?
If not favour the one that requires the fewer tables
In this case:
Add Comment: you either pick a particular many/many table or insert into a common table with a known specific identifier for what is being liked, I think client code will be slightly simpler in your second case.
Find comments for item: here it seems using a common table is slightly easier - we just have a single query parameterised by type of entity
Find comments by a person about one kind of thing: simple query in either case
Find all comments by a person about all things: this seems little gnarly either way.
I think your "discriminated" approach, option 2, yields simpler queries in some cases and doesn't seem much worse in the others so I'd go with it.
Consider using table per entity for comments and etc. More tables - better sharding and scaling. It's not a problem to control many similar tables for all frameworks I know.
One day you'll need to optimize reads from such structure. You can easily create agragating tables over base ones and lose a bit on writes.
One big table with dictionary may become uncontrollable one day.
Definitely go with the second approach where you have one table and store the element type for each row, it will give you a lot more flexibility. Basically when something can logically be done with fewer tables it is almost always better to go with fewer tables. One advantage that comes to my mind right now about your particular case, consider you want to delete all liked elements of a certain user, with your first approach you need to issue one query for each element type but with the second approach it can be done with only one query or consider when you want to add a new element type, with the first approach it involves creating a new table for each new type but with the second approach you shouldn't do anything...

A Beginner Question on database design

this is a follow-up question on my previous one.We junior year students are doing website development for the univeristy as volunteering work.We are using PHP+MySQL technique.
Now I am mainly responsible for the database development using MySQL,but I am a MySQL designer.I am now asking for some hints on writing my first table,to get my hands on it,then I could work well with other tables.
The quesiton is like this,the first thing our website is going to do is to present a Survey to the user to collect their preference on when they want to use the bus service.
and this is where I am going to start my database development.
The User Requirement Document specifies that for the survey,there should be
Customer side:
Survery will be available to customers,with a set of predefined questions and answers and should be easy to fill out
Business side:
Survery info. will be stored,outputed and displayable for analysis.
It doesnt sound too much work,and I dont need to care about any PHP thing,but I am just confused on :should I just creat a single table called " Survery",or two tables "Survey_business" and "Survey_Customer",and how can the database store the info.?
I would be grateful if you guys could give me some help so I can work along,because the first step is always the hardest and most important.
Thanks.
I would use multiple tables. One for the surveys themselves, and another for the questions. Maybe one more for the answer options, if you want to go with multiple-choice questions. Another table for the answers with a record per question per answerer. The complexity escalates as you consider multiple types of answers (choice, fill-in-the-blank single-line, free-form multiline, etc.) and display options (radio button, dropdown list, textbox, yada yada), but for a simple multiple-choice example with a single rendering type, this would work, I think.
Something like:
-- Survey info such as title, publish dates, etc.
create table Surveys
(
survey_id number,
survey_title varchar2(200)
)
-- one record per question, associated with the parent survey
create table Questions
(
question_id number,
survey_id number,
question varchar2(200)
)
-- one record per multiple-choice option in a question
create table Choices
(
choice_id number,
question_id number,
choice varchar2(200)
)
-- one record per question per answerer to keep track of who
-- answered each question
create table Answers
(
answer_id number,
answerer_id number,
choice_id number
)
Then use application code to:
Insert new surveys and questions.
Populate answers as people take the surveys.
Report on the results after the survey is in progress.
You, as the database developer, could work with the web app developer to design the queries that would both populate and retrieve the appropriate data for each task.
only 1 table, you'll change only the way you use the table for each ocasion
customers side insert data into the table
business side read the data and results from the same table
Survey.Customer sounds like a storage function, while Survey.Business sounds like a retrieval function.
The only tables you need are for storage. The retrieval operations will take place using queries and reports of the existing storage tables, so you don't need additional tables for those.
Use a single table only. If you were to use two tables, then anytime you make a change you would in effect have to do everything twice. That's a big pain for maintenance for you and anyone else who comes in to do it in the future.
most of the advice/answers so far are applicable but make certain (unstated!) assumptions about your domain
try to make a logical model of the entities and attributes that are required to capture the requirements, examine the relationships, consider how the data will be used on both sides of the process, and then design the tables. Talk to the users, talk to the people that will be running the reports, talk to whoever is designing the user interface (screens and reports) to get the complete picture.
pay close attention the the reporting requirements, as they often imply additional attributes and entities not extant in the data-entry schema
i think 2 tables needed:
a survey table for storing questions and choices for answer. each survey will be stored in one row with a unique survey id
other table is for storing answers. i think its better to store each customers answer in one row with a survey id and a customer id if necessary.
then you can compute results and store them in a surveyResults view.
Is the data you're presenting as the questions and answers going to be dynamic? Is this a long-term project that's going to have questions swapped in and out? If so, you'll probably want to have the questions and answers in your database as well.
The way I'd do it would be to define your entities and figure out how to design your tables so relationships are straightforward. Sounds to me like you have three entities:
Question
Answer
Completed Survey
Just a sample elaboration of what Steven and Chris has mentioned above.
There are gonna be multiple tables, if there are gonna be multiple surveys, and each survey has a different set of questions, and if same user can take multiple surveys.
Customer Table with CustID as the primary key
Questions Table with a Question ID as the primary key. If a question cannot belong to more than one survey (a N:1 relationship), then can also have Survey ID (of table Survey table mentioned in point 3) as one of the values in the table.
But if a Survey to Question relationship is N:M, then
(SurveryID, QuestionID) would become a composite key for the SurveyTable, else it would just have the SurveyID with the high level details of the survey like description.
UserSurvey table which would contain (USerID, SurveryID, QuestionID, AnswerGiven)
[Note: if same user can take the same survey again and again, either the old survey has to be updated or the repeat attempts have to stored as another rows with some serial number)

Guidelines for join/link/many to many tables

I have my own theories on the best way to do this, but I think its a common topic and I'd be interested in the different methods people use. Here goes
Whats the best way to deal with many-to-many join tables, particularly as far as naming them goes, what to do when you need to add extra information to the relationship, and what to do whene there are multiple relationships between two tables?
Lets say you have two tables, Users and Events and need to store the attendees. So you create EventAttendees table. Then a requirement comes up to store the organisers. Should you
create an EventOrganisers table, so each new relationship is modelled with a join table
or
rename EventAttendees to UserEventRelationship (or some other name, like User2Event or UserEventMap or UserToEvent), and an IsAttending column and a IsOrganiser column i.e. You have a single table which you store all relationship info between two attendees
or
a bit of both (really?)
or
something else entirely?
Thoughts?
The easy answer to a generic question like this is, as always, "It all depends on the details".
But in general, I try to create fewer tables when this can be done without abusing the data definitions unduly. So in your example, I would probably add an isOrganizer column to the table, or maybe an attendeeType to allow for easy future expansion from audience/organizer to audience/organizer/speaker/caterer or whatever may be needed. Creating an extra table with essentially identical columns, where the table name is in effect a flag identifying the "attendee type", seems to me the wrong way to go both from a pristine design perspective and also from a practical point of view.
A single table is more flexible. With one table and a type field, if we want to know just the organizers -- like when we're sending invitations to a planning meaning -- fine, we write "select userid from userevent where eventid=? and attendeetype='O'". If we want to know everyone who will be there -- like when we're printing name cards for the lunch tables -- we just don't include the attendeetype test.
But suppose we have two tables. Then if we want just the organizers, okay, that's easy, join on the organizer table. But if we want both organizers and audience, then we have to do a union, which makes for more complicated queries and is usually slow. And if you're thinking, What's the big deal doing a union?, note that there may be more to the query. Perhaps a person can have multiple phone numbers and we care about this, so the query is not just joining user and eventAttendee but also phone. Maybe we want to know if they've attended previous conferences because we give special deals to "alumni", so we have to join in eventAttendee a second time, etc etc. A ten-table join with a union can get very messy and confusing to read.