How do I restrict certain values in column 2 with the same value in column 1 in SQL? - mysql

Consider the following ERD for a MySQL database:
The table roles contains all kinds of (website-specific) roles that logged-in users could have. As you can see from the ERD, members can have multiple roles, and roles can have multiple members assigned to them.
The table members is dynamic, new members with custom roles can be made at any time, but the roles table is not. The current set-up of roles is final.
The records inside the roles table look like this:
id rolename
1 captain
2 cabin boy
3 buccaneer
4 parrot caretaker
5 cook
Now for the problem: I want members to have certain roles assigned to them, but certain combinations of roles cannot be chosen. For example, a captain can not also be a cabin boy, but he can also be a parrot caretaker. A cook can also be a cabin boy, but not a parrot caretaker.
I have done some research on Google regarding this topic, but I seem to fail in finding the right keywords to actually find usable information to solve this problem. All I seem to find are references and tutorials on how the SQL CHECK works, but not quite THAT in-depth.
Is there a way for me to use MySQL constraints to restrict some combinations from happening? If not, might this problem be solved using triggers or functions? I am generally looking for the most efficient solution to this, it does not necessarily have to be on the database side.

This depends on a few things.
Do you want the database to handle this logic or are you happy to have it at the application level?
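If you want the database to handle it, you are probably going to want a trigger. MySQL versions before 8.0.16 parse a CHECK constraint but don't enforce it, and even in versions that do enforce it, a CHECK cannot look at other rows, so it cannot see which roles a member already has.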
Either way you'll probably want to store the allowed combinations somewhere.
For simple cases I'd go for either a black-list or a white-list of other roles for each role depending on numbers. You can store this easily in another table.
Another option is a pre-requisite table, for example to be an admiral you must also be a captain.
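To make the black-list idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module only so that it runs anywhere; it is not MySQL syntax. In MySQL the same idea would be a BEFORE INSERT trigger that raises an error with SIGNAL. The names member_roles, role_conflicts and assign_role are assumptions for illustration, since the ERD's actual join table is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE roles (id INTEGER PRIMARY KEY, rolename TEXT NOT NULL);

-- Hypothetical join table between members and roles.
CREATE TABLE member_roles (
    member_id INTEGER NOT NULL,
    role_id   INTEGER NOT NULL REFERENCES roles(id),
    PRIMARY KEY (member_id, role_id)
);

-- Black-list: each row names two roles that may not be held together.
CREATE TABLE role_conflicts (
    role_a INTEGER NOT NULL REFERENCES roles(id),
    role_b INTEGER NOT NULL REFERENCES roles(id),
    PRIMARY KEY (role_a, role_b)
);

INSERT INTO roles VALUES (1, 'captain'), (2, 'cabin boy'), (3, 'buccaneer'),
                         (4, 'parrot caretaker'), (5, 'cook');
-- captain/cabin boy and cook/parrot caretaker are forbidden pairs.
INSERT INTO role_conflicts VALUES (1, 2), (5, 4);

-- Reject a new role if it conflicts (in either direction) with a role
-- the member already holds.
CREATE TRIGGER member_roles_no_conflict
BEFORE INSERT ON member_roles
BEGIN
    SELECT RAISE(ABORT, 'conflicting role combination')
    WHERE EXISTS (
        SELECT 1
        FROM member_roles AS mr
        JOIN role_conflicts AS rc
          ON (rc.role_a = mr.role_id AND rc.role_b = NEW.role_id)
          OR (rc.role_b = mr.role_id AND rc.role_a = NEW.role_id)
        WHERE mr.member_id = NEW.member_id
    );
END;
""")


def assign_role(member_id, role_id):
    """Return True if the role was stored, False if the database refused it
    (conflicting combination, or the member already has that role)."""
    try:
        with conn:
            conn.execute("INSERT INTO member_roles VALUES (?, ?)",
                         (member_id, role_id))
        return True
    except sqlite3.DatabaseError:
        return False
```

With this in place, assign_role(1, 1) followed by assign_role(1, 4) succeeds (captain plus parrot caretaker), while a later assign_role(1, 2) is refused. Adding a new forbidden pair is just another row in role_conflicts, which fits the requirement that the roles table itself stays fixed.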

Related

I am creating a database for a community to store details of all the members. What would be the best way to create such database?

I am creating a database for a community to store details of all their members along with those members' relations with each other.
For instance: there is a family of 4: Mother, Father, Son and Daughter. The Son gets married to a girl from another family in the same community (their data is also in the same database). The newly married couple soon has a new member. They also need to add their grandparents to the database at a later stage (parents of both the Mother and Father in this case).
What would be the best way to create a schema for such a database.
I have a schema called member_details that'll store all community members' data in a single table something like this.
member_details: ID | Name | Birthdate | Gender | Father | Mother | Spouse | Child
All members would have relations mapped to Father,Mother,Spouse,Child referenced in the same table.
Is this schema workable from a technical pov?
I just need to know if my solution is correct or whether there is a better way to do this. I know there are a lot of websites storing this kind of data, so if someone could point me in the right direction, that would help.
I'd advise you to use two tables: one for the members of the community and one for the relations between them. Something like this:
Members:
ID | Name | Birth | Gender
Relations:
First Member ID | Second Member ID | Relation
Here you use the IDs from the first table as foreign keys in the second. That way you'll be able to add more relation types when you need them. By the way, I'd add a third table to store relation types, so it can work as a dictionary. Same thing for genders.
As usual, "it depends".
The first question is "how will you use this data?". What sort of questions do you expect the database to answer? If you want to show a person's profile with their relationships, that's pretty easy. If you want to find out how many children a person has, or who is the grandfather of a person, or the age of someone's youngest child, that could be a little harder.
The second question is "how sure are you these are the only relationships you want to store?" Perhaps you also want to store "neighbour", "team member", "engaged_to" - or maybe you need to store that information later on. Maybe you need to take account of people getting divorced, or remarrying.
The schema you suggest works fine for most scenarios, but adding a new type of relationship means you have to add a new column. There are no hard and fast rules, but in general it's better to add rows than columns when faced with events in the problem domain. Asking "who is this person's grandfather" requires a couple of self joins, and that's okay.
@ba3a suggests splitting the information about people from the information about relationships. This is much "cleaner" and less likely to require new columns as you store more types of relationship. Showing a person's profile requires a query with lots of outer joins. Finding a grandparent requires self joins on the "relations" table.
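As a rough illustration of the two-table layout and the self join it needs for a grandparent lookup, here is a small sketch using Python's built-in sqlite3 module. The table and column names (members, relations, relation_types, grandparents) and the sample people are assumptions made up for the example; the gender dictionary table is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Dictionary of relation types, so new kinds are new rows, not new columns.
CREATE TABLE relation_types (id INTEGER PRIMARY KEY, label TEXT NOT NULL UNIQUE);

-- One row per directed relationship:
-- first_member_id is the <label> of second_member_id.
CREATE TABLE relations (
    first_member_id  INTEGER NOT NULL REFERENCES members(id),
    second_member_id INTEGER NOT NULL REFERENCES members(id),
    relation_type_id INTEGER NOT NULL REFERENCES relation_types(id),
    PRIMARY KEY (first_member_id, second_member_id, relation_type_id)
);

INSERT INTO members VALUES (1, 'Grandpa'), (2, 'Father'), (3, 'Mother'), (4, 'Son');
INSERT INTO relation_types VALUES (1, 'parent'), (2, 'spouse');
-- Grandpa is parent of Father; Father and Mother are parents of Son;
-- Father is spouse of Mother.
INSERT INTO relations VALUES (1, 2, 1), (2, 4, 1), (3, 4, 1), (2, 3, 2);
""")


def grandparents(member_id):
    """Names of the parents of the parents of member_id."""
    rows = conn.execute("""
        SELECT gp.name
        FROM relations AS p                        -- parent -> member
        JOIN relations AS g                        -- grandparent -> parent
          ON g.second_member_id = p.first_member_id
        JOIN relation_types AS t
          ON t.label = 'parent'
         AND p.relation_type_id = t.id
         AND g.relation_type_id = t.id
        JOIN members AS gp ON gp.id = g.first_member_id
        WHERE p.second_member_id = ?
        ORDER BY gp.name
    """, (member_id,)).fetchall()
    return [name for (name,) in rows]
```

Calling grandparents(4) walks Son to Father to Grandpa; the spouse row between Father and Mother is ignored because both joined rows must be of type 'parent'.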

Database design & normalization

I'm creating a messaging system for an e-learning platform and there are some design concerns that I'd like some feedback on.
First of all, it is important for me and my system to be highly modifiable in the future. As such, maintaining a fairly high normalization across my tables is important.
On to how my system will work:
All members (students or teachers) are part of a virtual classroom.
Teachers can create tasks and exercises in these classrooms and assign them to one or multiple students (member_task table not illustrated).
A student can request help for a specific task or exercise by sending a message to the teachers of the classroom.
Messages sent by students are sent to all the teachers. They cannot address a message to a specific teacher.
Messages sent by teachers can be addressed to one or more students.
Students cannot send messages to other students.
Messages behave like chat, meaning that a private conversation starts between a student and all teachers when they send a message.
Here's the ER diagram I made:
So my question is, is this table normalized properly for my purpose? Is there anything that can be done to reduce redundancy of data across my tables? And out of curiosity, is it in BCNF?
Another question: I don't intend to ever implement delete features anywhere in my system. Only "archiving" where said classroom/task/member/message/whatever is simply hidden/deactivated. So is there any reason to actually use FK?
EDIT: Also, a friend brought to my attention that the Conversations table might be redundant, and it kinda feels so. Thoughts?
Thanks.
In response to your emphasis on "modifiability", which I'm taking to mean with respect to application and schema evolution, I'm actually going to suggest a fairly extreme solution. Before that, some notes on aspects you've mentioned. First, foreign keys represent meaningful constraints in your data. They should always be defined and enforced. Foreign keys are not there just for cascading delete. Second, the Conversations table is arguably redundant. It would make sense if you had a notion of a chat "session" which would correspond to a Conversation. Otherwise, you just have a bunch of messages throughout time. The Conversation table could also enable a many-to-many relation between messages and tasks/exercises if you wanted to have chats that simultaneously covered multiple exercises, for example.
Now for the extreme suggestion. You could use 6NF. In particular, you might look at its incarnation in anchor modeling. The most notable difference in this approach is each attribute is modeled as a different table. 6NF supports temporal databases (supported in anchor modeling via "historized" attributes/ties). This means handling situations like a student being associated to a task now but not later won't cause all their messages to disappear. Most relevant to you, all schema modifications are non-destructive and additive, so no old code breaks when you make a change.
There are downsides. First, it's a bit weird, and in particular anchor modeling (somewhat gratuitously?) introduces a bunch of new terms. Second, it produces weird queries for most relational databases which they may not optimize well. This can sometimes be resolved with materialized views. Third, at the physical level, every attribute is effectively nullable. Finally, the tooling and support, while present, is pretty young. In particular, for MySQL, you may only be "inspired by" what's provided on the anchor modeling site.
As far as the actual database model would go, it would look roughly similar. Anchor modeling uses the term "anchor" for roughly the same thing as an entity, and "tie" for roughly the same thing as a relation. For simplicity, dropping the Conversation relation (and thus directly connecting Message to Task), the image would be similar: you'd have an anchor for Classroom, Member, Message, and Task, and a tie replacing Recipient that you might call ReceivedMessage representing the relation of "member received message message". The attributes on your entities would be attribute nodes. Making the message attribute on the Message anchor historized would allow messages to be edited if desired and support a history of revisions.
One concern I have is that I don't see a Users table which will hold all the students' and teachers' info (login, email, system id, role, etc.), but I assume there is something similar in your system?
Now, looking into the Members table: usually students change classes every semester or so and you don't want last semesters' students to receive new messages. I would suggest the following:
Members
=============
PK member_id
FK class_id
FK user_id
--------------
join_date
leave_date
active
role
The last two fields might be redundant:
active: an alternative solution if you want to avoid using dates. This will become false when a user stops being a member of this class. Since there is no delete feature, the Members entry has to be preserved for archive purposes (and as a historical log).
role: Depends on how you setup Users table and roles in your system. If a user entry has role field(s) then this is not needed. However, this field allows for the same user to assume different roles in different classes. Example: a 3rd year student, who was a member of this class 2 years ago, is now working as TA/LA (teaching/lab assistant) for the same class. This depends on how the institution works... in my BSc we had the "rule": anyone with grade > 8.5/10 in Java could volunteer to do workshops to other students (using uni's labs). Finally, this field if used as a mask or a constant, allows for roles to be extended (future-proof)
As for FKs I will always suggest using them for data consistency. Things can get really ugly really fast without FKs. The limitations they impose can be worked around and they are usually needed: What is the purpose of archiving a message with sender_id if the sender has been deleted by accident? Also, note that in most systems FKs are indexed which improves the performance of queries/joins.
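To show what the foreign keys buy you even without a delete feature, here is a small sketch using Python's built-in sqlite3 module (note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON; MySQL's InnoDB enforces them by default). The users and messages tables, the sample rows and the two helper functions are assumptions for illustration; class_id would normally reference a classrooms table, which is omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked to
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

CREATE TABLE members (
    member_id  INTEGER PRIMARY KEY,
    class_id   INTEGER NOT NULL,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    join_date  TEXT NOT NULL,      -- ISO dates compare correctly as text
    leave_date TEXT,               -- NULL while the membership is current
    role       TEXT NOT NULL
);

CREATE TABLE messages (
    message_id INTEGER PRIMARY KEY,
    sender_id  INTEGER NOT NULL REFERENCES members(member_id),
    body       TEXT NOT NULL
);

INSERT INTO users VALUES (1, 'Ada'), (2, 'Bob');
INSERT INTO members VALUES
    (10, 7, 1, '2023-09-01', NULL,         'teacher'),
    (11, 7, 2, '2023-09-01', '2024-01-31', 'student');
""")


def send_message(sender_id, body):
    """Store a message; return False if sender_id is not a known member."""
    try:
        with conn:
            conn.execute("INSERT INTO messages (sender_id, body) VALUES (?, ?)",
                         (sender_id, body))
        return True
    except sqlite3.IntegrityError:
        return False


def current_members(class_id, on_date):
    """Member ids whose membership of class_id covers on_date."""
    rows = conn.execute("""
        SELECT member_id FROM members
        WHERE class_id = ?
          AND join_date <= ?
          AND (leave_date IS NULL OR leave_date >= ?)
        ORDER BY member_id
    """, (class_id, on_date, on_date)).fetchall()
    return [member_id for (member_id,) in rows]
```

A message from a non-existent sender is refused by the database itself, and last semester's student simply drops out of current_members once leave_date has passed, while the row stays around for the archive.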
Hope the above helps and doesn't confuse things :)

Improving my Database Design for future scalability

Well, I am working on a project which might involve thousands of users & I don't have much experience in databases especially when it involves relationships between entities.
Let me explain my scenario. First there's a User who can log in to our system using his credentials. We have a module in our system which will enable him to create Projects. So that brings a relationship between the User table and the Projects table.
Now there's another module, namely Team Creation Module, it does what it says. Out of the list of available members, he can pick who he likes and add them to a team. So there are tables for that Members & Team. Furthermore, a member can be a part of many teams and a team can have many members & a "User" can be member as well.
I have designed the database myself but I am not sure if it is a good or bad one. Moreover, I would really appreciate it if someone could point me to good tutorials which show how to insert into or update tables involving relationships.
Here's my design till now:
Update
After a discussion with someone on IRC, I came up with a revised design. I merged "User" & "Members" table as User is also a Member.
My question still remains the same: am I on the right track?
It's great that you're thinking long-term, but your solution won't work long-term.
This is not the first time this sort of thing has been tried. Rely on the wisdom of those that have messed up before. Read data modeling pattern books.
Abstract and Normalize. That's how you get to a good long-term solution.
At least read up on The Party Model. A group and individual are actually the same (abstract) thing.
Put actually different things in different tables. An Address and Member don't belong in the same table.
"Am I on the right track" is not a useful question - we have no way of telling, because it depends on where you are headed.
A couple of things:
it's a good idea to name the relation columns after the relationship. For instance, in the first diagram, the "owner" of the project should not be called users_user_id - that's meaningless. Call it "owner_id" or something that meaningfully describes the relationship between the project and members table.
in the second diagram, you appear to have a "many to many" relationship between members and projects in the members table - but there's no efficient way of storing the id of more than one project in the members table. You need to factor that out into a joining table - projects_members, for instance, just like you did with teams_members.
the "teams_members" table has a primary key called tm_id. A purist would tell you this is wrong - the unique identifier for that table should be the combination of member_id and team_id. You don't need another unique identifier - and in fact it's harmful, because you must guarantee uniqueness of the member_id and team_id combination.
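A minimal sketch of the joining-table point, using Python's built-in sqlite3 module: the primary key of teams_members is the (team_id, member_id) pair itself, so the database guarantees that a member cannot be added to the same team twice. The members and teams columns, the sample rows and the add_to_team helper are assumptions for illustration; a projects_members table would follow the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (member_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE teams   (team_id   INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Joining table: the pair of foreign keys is the primary key,
-- so no separate tm_id column is needed.
CREATE TABLE teams_members (
    team_id   INTEGER NOT NULL REFERENCES teams(team_id),
    member_id INTEGER NOT NULL REFERENCES members(member_id),
    PRIMARY KEY (team_id, member_id)
);

INSERT INTO members VALUES (1, 'Ann'), (2, 'Raj');
INSERT INTO teams VALUES (1, 'Backend'), (2, 'Design');
""")


def add_to_team(team_id, member_id):
    """Return True if the membership was added, False if it already existed."""
    try:
        with conn:
            conn.execute("INSERT INTO teams_members VALUES (?, ?)",
                         (team_id, member_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

Ann can join both teams, but a second add_to_team(1, 1) is rejected by the key. If you keep a surrogate tm_id anyway, you would still want a UNIQUE constraint on the pair to get the same guarantee.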
As Neil says, you probably want to start reading up on this. I can recommend 'Database Systems: Design, Implementation, and Management' by Coronel et al.

Method To Create Database for Tv Shows

This is my first question on Stack Overflow, so if I do something wrong please let me know and I will fix it as soon as possible.
I am trying to make a database for TV shows, and I would like to know the best way to do it and how to make my current database simpler (normalization).
I would like to be able to have the following structure or similar.
Fringe
    Season 1
        Episodes 1 - 10 (whatever there are)
    Season 2
        Episodes 1 - 10 (whatever there are)
    ... (so on)
Burn Notice
    Season 1
        Episodes 1 - 10 (whatever there are)
    Season 2
        Episodes 1 - 10 (whatever there are)
    ... (so on)
... (More TV shows)
Sorry if this seems unclear. (Please ask for clarification)
The structure I have right now is 3 tables (tvshow_list, tvshow_episodes, tvshow_link):
//tvshow_list//
TvShow Name | Director | Company_Created | Language | TVDescription | tv_ID
//tvshow_episodes//
tv_ID | EpisodeNum | SeasonNum | EpTitle | EpDescription | Showdate | epid
//tvshow_link//
epid | ep_link
The Director and the company are linked by an id to another table with a list of companies and directors.
I am pretty sure that there is a simpler way of doing this.
Thanks for the help in advance,
Krishanthan Lingeswaran
The basic concept of Normalization is the idea that you should only store one copy of any item of data that you have. It looks like you've got a good start already.
There are two basic ways to model what you're trying to do here, with episodes and shows. In the database world, you might have heard the terms "one to many" and "many to many". Both are useful; it just depends on your specific situation which is the correct one to use. In your case, the big question to ask yourself is whether a single episode can belong to only one show, or whether an episode can belong to multiple shows at once. I'll explain the two forms, and why you need to know the answer to that question.
The first form is simply a foreign key relationship. If you have two tables, 'episodes' and 'shows', in the episodes table, you would have a column named 'show_id' that contains the ID of one (and only one!) show. Can you see how you could never have an episode belong to more than one show this way? This is called a "one to many" relationship, i.e. a show can have many episodes.
The second form is to use an association table, and this is the form you used in your example. This form would allow you to associate an episode with multiple shows and is therefore called a "many to many" relationship.
There is some benefit to using the first form, but it's not really that big of a deal in most cases. Your queries will be a little bit shorter because you only have to join 2 tables to get episodes->shows but the other table is just one more join. It really comes down to figuring out if you need a "one to many" or "many to many" type relationship.
An example of a situation where you would need a many-to-many relationship would be if you were modeling a library and had to keep track of who checked out which book. You'd have a table of books, a table of users, and then a table of "books to users" that would have an id, a book_id, and a user_id and would be a many-to-many relationship.
Hope that helps!
"I am pretty sure that there is a simpler way of doing this."
Not as far as I know. Your schema is close to the simplest you can make for what I presume is the functionality you're asking for. "Improvements" on it really only make it more complicated, and should be added as you judge the need emerges on your side. The following examples come to mind (none of which really simplify your schema).
I would standardize your foreign key and primary key names. An example would be to have the columns shows.id, episodes.id, episodes.show_id, link.id, link.episode_id.
Putting SeasonNum as what I presume will be an int in the Episodes table, in my opinion, violates the normalization constraint. This is not a major violation, but if you really want to stick to it, I would create a separate Seasons table and associate it many-to-one to the Shows table, and then have the Episodes associate only with the Seasons. This gives you the opportunity to, for instance, attach information to each season. Also, it prevents repetition of information (while the type of the season ID foreign key column in the Episodes table would ostensibly still be an INT, a foreign key philosophically stores an association, what you want, versus dumb data, what you have).
You may consider putting language, director, and company in their own tables rather than your TV show list. This is the same concern as above and in your case a minor violation of normalization.
Language, director, and company all have interesting issues attached to them regarding the level of the association. Most TV shows have different directors for different episodes. Many are produced in multiple languages and by several different companies and sometimes networks. So at what level do you plan on storing this information? I'm not a software architect, so someone else can better answer this question than me, but I'd set up a polymorphic many-to-many association for languages, directors, and companies and an inheritance model that allows for these values to be specified on an episode-by-episode, season-by-season, or show-by-show basis, inheriting the value from its parent if none are provided.
Bottom line concerning all these suggestions: Pick what's appropriate for your project. If you don't need the functionality afforded by this level of associations, and you don't mind manually entering in repetitive data (you might end up implementing an auto-complete system to help you), you can gloss over some of the normalization constraints.
Normalization is merely a suggestion. Pick what's right for you and learn from your mistakes.
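For what it's worth, the Seasons-table variant suggested above could look roughly like this. It is a sketch using Python's built-in sqlite3 module, with the standardized key names (shows.id, seasons.show_id, episodes.season_id); the seasons/episodes column names, the outline helper and the episode titles are placeholders invented for the example, not real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shows (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Each season belongs to exactly one show.
CREATE TABLE seasons (
    id      INTEGER PRIMARY KEY,
    show_id INTEGER NOT NULL REFERENCES shows(id),
    number  INTEGER NOT NULL,
    UNIQUE (show_id, number)
);

-- Each episode belongs to exactly one season (and through it, one show).
CREATE TABLE episodes (
    id        INTEGER PRIMARY KEY,
    season_id INTEGER NOT NULL REFERENCES seasons(id),
    number    INTEGER NOT NULL,
    title     TEXT NOT NULL,
    UNIQUE (season_id, number)
);

INSERT INTO shows VALUES (1, 'Fringe'), (2, 'Burn Notice');
INSERT INTO seasons VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
-- Placeholder titles, not real episode names.
INSERT INTO episodes VALUES
    (1, 1, 1, 'Episode A'), (2, 1, 2, 'Episode B'),
    (3, 2, 1, 'Episode C'), (4, 3, 1, 'Episode D');
""")


def outline():
    """Return {show name: {season number: [episode titles in order]}}."""
    tree = {}
    rows = conn.execute("""
        SELECT sh.name, se.number, ep.title
        FROM shows AS sh
        JOIN seasons  AS se ON se.show_id   = sh.id
        JOIN episodes AS ep ON ep.season_id = se.id
        ORDER BY sh.name, se.number, ep.number
    """)
    for show, season, title in rows:
        tree.setdefault(show, {}).setdefault(season, []).append(title)
    return tree
```

outline() gives back the show, season, episode nesting from the question, built from two plain one-to-many joins. Whether the extra table is worth it depends on whether you ever want to store anything per season.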

Organizational chart represented in a table

I have an Access application in which I have an employee table. The employees are part of several different levels in the organization. The organization has 1 GM, 5 department heads, and under each department head are several supervisors, and under those supervisors are the workers.
Depending on the position of the employee, they will only have access to records of those under them.
I wanted to represent the organization in a table with some sort of level system. The problem I saw with that was that there are many people on the same level (for example supervisors), but they shouldn't have access to the records of a supervisor in another department. How should I approach this problem?
One common way of keeping this kind of hierarchical data in a database uses only a single table, with fields something like this:
userId (primary key)
userName
supervisorId (self-referential "foreign key", refers to another userId in this same table)
positionCode (could be simple like 1=lackey, 2=supervisor; or a foreign key pointing to another table of positions and such)
...whatever else you need to store for each employee...
Then your app uses SQL queries to figure out permissions. To figure out the employees that supervisor 'X' (whose userId is '3', for example) is allowed to see, you query for all employees where supervisorId=3.
If you want higher-up bosses to be able to see everyone underneath them, the easiest way is just to do a recursive search. I.e. query for everyone that reports to this big boss, and for each of them query who reports to them, all the way down the tree.
Does that make sense? You let the database do the work of sorting through all the users, because computers are good at that kind of thing.
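Here is a rough sketch of that "query, then query again for each result" search. It uses Python's built-in sqlite3 module rather than Access, so treat it as an illustration of the idea and not as Access/VBA code. The employees table name, the sample rows and the subordinates/can_view helpers are assumptions for the example, and it assumes the supervisorId links contain no cycles.

```python
import sqlite3
from collections import deque

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    userId       INTEGER PRIMARY KEY,
    userName     TEXT NOT NULL,
    supervisorId INTEGER REFERENCES employees(userId)  -- NULL for the GM
);

INSERT INTO employees VALUES
    (1, 'GM', NULL),
    (2, 'Head A', 1),        (3, 'Head B', 1),
    (4, 'Supervisor A1', 2), (5, 'Supervisor B1', 3),
    (6, 'Worker A1a', 4),    (7, 'Worker B1a', 5);
""")


def subordinates(user_id):
    """All userIds below user_id in the tree, level by level."""
    found = []
    queue = deque([user_id])
    while queue:
        boss = queue.popleft()
        # One query per person: who reports directly to this boss?
        rows = conn.execute(
            "SELECT userId FROM employees WHERE supervisorId = ? ORDER BY userId",
            (boss,),
        ).fetchall()
        for (uid,) in rows:
            found.append(uid)
            queue.append(uid)  # look below this person on a later pass
    return found


def can_view(viewer_id, target_id):
    """True if target_id is somewhere underneath viewer_id."""
    return target_id in subordinates(viewer_id)
```

Supervisor A1 (userId 4) can see Worker A1a but not Worker B1a, even though both workers sit on the same level, because access follows the supervisorId chain rather than the level.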
I put the positionCode in this example in case you wanted some people to have different permissions... for example, you might have a code '99' for HR employees which have the right to see the list of all employees.
Maybe I'll let some other people try to explain it better...
Here's an article from Microsoft's Access Cookbook that explains these queries rather well.
And here is a somewhat chunky explanation of the same.
Here's a completely different method (the "adjacency list model") that you might find useful, and his explanation is pretty good. He also points out some difficulties with both methods (when he talks about the tables being "denormalized").