I am trying to implement a simple many-to-many relationship between two tables, User and Group.
User
---------
user_id
user_name
Group
----------
group_id
group_name
UserGroup
----------
user_id
group_id
Let's say the User and Group tables each have 1000 entries.
I have to create one admin user that belongs to all groups.
Should I create 1000 entries in the UserGroup table for the "admin" user?
Can I create a boolean column say "Applicable_to_all_groups" in User table that should be checked first before selecting from UserGroup table?
Any suggestion on doing this the correct way will be appreciated.
Well, I would say there's no "true" solution for this kind of case.
Let's look at some pros and cons.
Solution 1, all in UserGroup table
Pros
Queries to get the allowed groups are easier to write (no OR clause)
Cons
You will have to add an entry in this table every time you add an entry in the group table.
Doable, of course, but boring, and error-prone.
If you want a new User "which can also be related to all groups", you'll have to rewrite all your procedures / triggers / whatever you use to have "up-to-date" UserGroup table to add this new thing.
Solution 2, flag (= boolean column)
Pros
Avoid unnecessary entries in your db (well, minor point)
Always "up-to-date", without any additional work.
Easy to add a new User with "all groups" rights (just put the flag to true)
Cons
You'll have to add OR clauses when querying for allowed groups (based on the flag or on the UserGroup table)
A personal point of view
I would go for the flag solution...
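To make the OR clause concrete, here is a minimal sketch of the flag solution using Python's built-in sqlite3 module. The column name applicable_to_all_groups comes from the question; the sample rows and the allowed_groups helper are made up for illustration, and your real DBMS may need slightly different DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (
    user_id   INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL,
    applicable_to_all_groups INTEGER NOT NULL DEFAULT 0  -- the boolean flag
);
CREATE TABLE "Group" (              -- quoted because GROUP is an SQL keyword
    group_id   INTEGER PRIMARY KEY,
    group_name TEXT NOT NULL
);
CREATE TABLE UserGroup (
    user_id  INTEGER NOT NULL REFERENCES User(user_id),
    group_id INTEGER NOT NULL REFERENCES "Group"(group_id),
    PRIMARY KEY (user_id, group_id)
);
-- Hypothetical sample data: admin has the flag set and no UserGroup rows.
INSERT INTO User VALUES (1, 'admin', 1), (2, 'alice', 0);
INSERT INTO "Group" VALUES (10, 'sales'), (20, 'support'), (30, 'dev');
INSERT INTO UserGroup VALUES (2, 20);
""")


def allowed_groups(user_id):
    """Return the group names a user may access: all groups if flagged,
    otherwise only those listed in UserGroup."""
    rows = conn.execute("""
        SELECT g.group_name
        FROM "Group" AS g
        WHERE EXISTS (SELECT 1 FROM User AS u
                      WHERE u.user_id = ? AND u.applicable_to_all_groups = 1)
           OR EXISTS (SELECT 1 FROM UserGroup AS ug
                      WHERE ug.user_id = ? AND ug.group_id = g.group_id)
        ORDER BY g.group_id
    """, (user_id, user_id)).fetchall()
    return [name for (name,) in rows]
```

With this layout, adding a new group needs no extra work for the admin user, which is the main point of the flag.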
Related
I'm modeling a DB for an application where one of the functions is to get a user from the DB and display in a diagram the selected user and all the referrals of that selected user, and the referrals (if any) for the selected user's referrals, going that way up to 3 referral levels.
I have two theories on how to model a schema to accomplish this, but I don't know which one is the "best" (in terms of optimization, normalization, etc).
We have one schema where the referrals are stored in a different table, with only a BOOLEAN to show if the user is, in fact, referred from another user.
On the other hand, I can substitute the BOOLEAN with a nullable INT (if the user was referred, store the referrer's ID; NULL means the user was not referred by anyone).
If there is a better way to accomplish this, suggestions are also welcome. Thank you.
I'd suggest a third model. You're describing a model where a user can optionally have a referral, so perhaps use a table that joins twice to the users table: the first column holds the referred person's ID, and the second column holds the referring user's ID.
Then you know if there is a referral by joining to this table in your queries.
I would use the second design and perhaps add a closure table for easily finding the referrals across multiple levels (see: http://wiki.pentaho.com/display/EAI/Closure+Generator)
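As a rough illustration of the second design (nullable referrer column), here is a sketch in Python with sqlite3. Note that it walks the levels with a recursive CTE rather than a closure table, which is a swapped-in technique; the column name referred_by, the sample users, and the referrals helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    referred_by INTEGER REFERENCES users(id)  -- NULL means not referred
);
-- Hypothetical chain: root -> a -> c -> d -> e, plus root -> b.
INSERT INTO users VALUES
    (1, 'root', NULL),
    (2, 'a', 1),
    (3, 'b', 1),
    (4, 'c', 2),
    (5, 'd', 4),
    (6, 'e', 5);
""")


def referrals(user_id, max_depth=3):
    """Return (id, name, depth) for every referral of user_id,
    following the chain down to max_depth levels."""
    return conn.execute("""
        WITH RECURSIVE tree(id, name, depth) AS (
            SELECT id, name, 1 FROM users WHERE referred_by = ?
            UNION ALL
            SELECT u.id, u.name, t.depth + 1
            FROM users AS u
            JOIN tree AS t ON u.referred_by = t.id
            WHERE t.depth < ?
        )
        SELECT id, name, depth FROM tree ORDER BY depth, id
    """, (user_id, max_depth)).fetchall()
```

Not every DBMS version supports recursive CTEs, in which case three self-joins or a closure table would be the fallback.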
I have a table Things and I want to add ownership relations to a table Users. I need to be able to quickly query the owners of a thing and the things a user owns. If I know that there will be at most 50 owners, and the pdf for the number of owners will probably look like this, should I rather
add 50 columns to the Things table, like CoOwner1Id, CoOwner2Id, …, CoOwner50Id, or
should I model this with a Ownerships table which has UserId and ThingId columns, or
would it be better to create a table for each thing, for example Thing8321Owners with a row for each owner, or
perhaps a combination of these?
The second choice is the correct one; you should create an intermediate table between the table Things and the table Owners (that contains the details of each owner).
This table should have the thing_id and the owner_id as the primary key.
So finally, you will have 3 tables:
Things (the things details and data)
Owner (the owners details and data)
Ownerships (the assignment of each thing_id to an owner_id)
This is because a relational DB should not contain redundant data.
You should definitely go with option 2 because what you are trying to model is a many-to-many relationship. (Many owners can relate to a thing. Many things can relate to an owner.) This is commonly accomplished using what I call a bridging table. (Which is exactly what option 2 is.) It is a standard technique in a normalized database.
The other two options are going to give you nightmares trying to query or maintain.
With option 1 you'll need to join the User table to the Thing table on 50 columns to get all of your results. And what happens when you have a really popular thing that 51 people want to own?
Option 3 is even worse. The only way to easily query the data is to use dynamic sql or write a new query each time because you don't know which Thing*Owners table to join on until you know the ID value of the thing you're looking for. Or you're going to need to join the User table to every single Thing*Owners table. Adding a new thing means creating a whole new table. But at least a thing doesn't have a limit on the number of owners it could possibly have.
Now isn't this:
SELECT Users.Name, Things.Name
FROM Users
INNER JOIN Ownerships ON Users.UserId = Ownerships.UserId
INNER JOIN Things ON Things.ThingId = Ownerships.ThingId
much easier than any of those other scenarios?
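For completeness, a small sketch of option 2 in Python with sqlite3, covering both lookups the question asks for (owners of a thing, things of a user). The table and column names follow the question; the second index, the sample rows, and the helper functions are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users  (UserId  INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Things (ThingId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Ownerships (
    UserId  INTEGER NOT NULL REFERENCES Users(UserId),
    ThingId INTEGER NOT NULL REFERENCES Things(ThingId),
    PRIMARY KEY (ThingId, UserId)   -- serves "owners of a thing" lookups
);
-- Second index so "things a user owns" is equally cheap.
CREATE INDEX OwnershipsByUser ON Ownerships (UserId, ThingId);

-- Hypothetical sample data.
INSERT INTO Users  VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO Things VALUES (100, 'Boat'), (200, 'Car');
INSERT INTO Ownerships (UserId, ThingId) VALUES (1, 100), (2, 100), (1, 200);
""")


def owners_of(thing_id):
    """Names of all users who own the given thing."""
    rows = conn.execute("""
        SELECT Users.Name
        FROM Users
        INNER JOIN Ownerships ON Users.UserId = Ownerships.UserId
        WHERE Ownerships.ThingId = ?
        ORDER BY Users.Name
    """, (thing_id,)).fetchall()
    return [name for (name,) in rows]


def things_owned_by(user_id):
    """Names of all things the given user owns."""
    rows = conn.execute("""
        SELECT Things.Name
        FROM Things
        INNER JOIN Ownerships ON Things.ThingId = Ownerships.ThingId
        WHERE Ownerships.UserId = ?
        ORDER BY Things.Name
    """, (user_id,)).fetchall()
    return [name for (name,) in rows]
```

A 51st owner is just one more row here, with no schema change.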
I'm involved in the development of a tiny social network where users must be able to establish relations between them and also have permissions over content. For example: I add one user as my friend, but this user doesn't allow me to see all of his/her content, so I only have access to the content that user allows me to see (permissions). So I have a problem and need help designing the ER diagram for this part. I'm thinking of having these tables:
- users (id, name)
- relations_type (id, name, active)
- users_relations (id, id_user_1, id_user_2)
- users_permissions (id, id_relation, id_module, id_user, view, edit, delete)
This causes the following:
Two rows for every relationship: User 1 > User 2 and User 2 > User 1, because when I search (SELECT) I need to know which are User 1's friends and also which are User 2's friends. If I store only a one-way relationship then I need a UNION, and this might slow down my DB.
Is that correct? How do you handle CRUD operations on this? I'm using MySQL with MyISAM tables, by the way.
Don't use user1 and user2 as column names, you're going to forget which is which when the scope expands.
You can denormalize permissions onto the users table; you don't need separate tables for a few boolean bits.
Never count on UNION, it will betray you in the least expected situations.
I am currently working on restructuring my site's database. As the schema I have now is not one of the best, I thought it would be useful to hear some suggestions from you.
To start off, my site actually consists of widgets. For each widget I need a table for settings (where each instance of the widget has its user defined settings), a table for common (shared items between instances of the same widget) and userdata (users' saved data within an instance of a widget).
Until now, I had the following schema, consisting of 2 databases:
the first database, where I had all site-maintenance tables (e.g. users, widgets installed, logs, notifications, messages etc.) PLUS a table where I joined each widget instance to each user that instantiated it, having assigned a unique ID (so, I have the following columns: user_id, widget_id and unique_id).
the second database, where I kept all widget-related data. That means, for each widget (unique by its widget_id) I had three tables: [widget_id]_settings, [widget_id]_common and [widget_id]_userdata. In each of these tables, each row held the unique_id of the user's widget. This is where all of the users' widget data was actually stored.
To give a short example of how my databases worked:
First database:
In the users table I have user_id = 1
In the widgets table I have widget_id = 1
In the users_widgets table I have user_id = 1, widget_id = 1, unique_id = 1
Second database:
In the 1_settings I have unique_id = 1, ..., where ... represents the user's widget settings
In the 1_common I have several rows which represent shared data between instances of the same widget (so, no user specific data here)
In the 1_userdata I have unique_id = 1, ..., where ... represents the user's widget data. An important note here is that this table may contain several rows with the same unique_id (e.g. for a tasks widget, a user can have several tasks for a widget instance)
I hope this gives you a rough idea of my database schema.
Now, I want to develop a 'cleaner' schema, so it won't be necessary to have 2 databases and switch from one to the other each time in my application. It would also be great if I found a way NOT to dynamically generate tables in the second database (1_settings, 2_settings, ... , n_settings).
I will greatly appreciate any effort in suggesting any better way of achieving this. Thank you very much in advance!
EDIT:
Should I keep databases like MongoDB or CouchDB in mind when restructuring my databases? I mean for the second database, where it would be better if I didn't have a fixed schema.
Also, how would traditional SQL and NoSQL databases get along on the same site?
A possible schema for the users_widgets table could be:
id | user_id | widget_id
You don't need the unique_id field in the users_widgets table, unless you want to hide the primary key for some reason. In fact, I would rename this table to something a little more memorable like widget_instances, and use widget_instance_id in the remaining tables of the second database.
One way to handle the second set of tables is by using a metadata style:
widget_instance_settings
id | widget_instance_id | key | value
This would include the userdata, because user_id is related to the widget_instance_id, unless you want to allow a user to create multiple instances of the same widget, and have the same data across all instances for some reason.
widget_common_settings
id | widget_id | key | value
This type of schema can be seen in packages like Elgg.
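Here is a minimal sketch of that metadata-style layout in Python with sqlite3, replacing the per-widget 1_settings, 2_settings, ... tables with fixed ones. The table and column names follow the outline above; the sample rows and the two helper functions are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE widget_instances (
    id        INTEGER PRIMARY KEY,
    user_id   INTEGER NOT NULL,
    widget_id INTEGER NOT NULL
);
CREATE TABLE widget_instance_settings (
    id                 INTEGER PRIMARY KEY,
    widget_instance_id INTEGER NOT NULL REFERENCES widget_instances(id),
    "key"              TEXT NOT NULL,
    "value"            TEXT
);
CREATE TABLE widget_common_settings (
    id        INTEGER PRIMARY KEY,
    widget_id INTEGER NOT NULL,
    "key"     TEXT NOT NULL,
    "value"   TEXT
);

-- Hypothetical data: user 1 has one instance of widget 1 (a tasks widget).
INSERT INTO widget_instances VALUES (1, 1, 1);
INSERT INTO widget_instance_settings (widget_instance_id, "key", "value") VALUES
    (1, 'color', 'blue'),
    (1, 'task', 'buy milk'),
    (1, 'task', 'call Bob');      -- several rows may share a key
INSERT INTO widget_common_settings (widget_id, "key", "value") VALUES
    (1, 'title', 'Tasks');
""")


def instance_rows(widget_instance_id):
    """All (key, value) pairs stored for one widget instance."""
    return conn.execute("""
        SELECT "key", "value" FROM widget_instance_settings
        WHERE widget_instance_id = ? ORDER BY id
    """, (widget_instance_id,)).fetchall()


def common_rows(widget_id):
    """All (key, value) pairs shared by every instance of a widget."""
    return conn.execute("""
        SELECT "key", "value" FROM widget_common_settings
        WHERE widget_id = ? ORDER BY id
    """, (widget_id,)).fetchall()
```

The trade-off is that every value is stored as text, so type checking moves into the application.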
Do you know the settings a widget class and widget instance could have? In this case these settings could be made columns of the widget_class table (for common settings) and widget_instance (for instance specific settings).
If you don't know them, then you could have a widget_class_settings table that has a many to one relation with the widget_class table and a widget_instance_settings that has a many to one relation to the widget_instance table. Between the widget_instance and the widget_class you could, again, have a many to one relation. The widget_instance could also have a foreign key in the users table, so that you know which user created a specific widget.
What is the best way to store user relationships, e.g. friendships, that must be bidirectional (you're my friend, thus I'm your friend) in a relational database, e.g. MySQL?
I can think of two ways:
Every time a user friends another user, I'd add two rows to the database, row A consisting of the user ID of the initiating user followed by the UID of the accepting user in the next column. Row B would be the reverse.
You'd only add one row, UID(initiating user) followed by UID(accepting user); and then just search through both columns when trying to figure out whether user 1 is a friend of user 2.
Surely there is something better?
I would have a link table for friends, or whatever, with 2 columns that together form the PK, each also being an FK to the User table.
Both columns would hold a UID, and you would have two rows per friend relationship (A,B and B,A). As long as both columns are part of the PK, it should still be in normal form (although others are free to correct me on this).
It's a little more complex of a query, but nothing that can't be abstracted away by a stored procedure or some business logic, and it's in normal form, which is usually nice to have.
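A rough sketch of the two-rows-per-friendship approach, written in Python with sqlite3 instead of a stored procedure. The table and column names (friends, user_id, friend_id) and the helper functions are assumptions for the example; the point is only that both rows go in inside one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE friends (
    user_id   INTEGER NOT NULL REFERENCES users(id),
    friend_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (user_id, friend_id)
);
INSERT INTO users VALUES (1), (2), (3);
""")


def add_friendship(a, b):
    """Insert both directions (A,B) and (B,A) atomically."""
    with conn:  # commits both rows together, or rolls both back on error
        conn.executemany("INSERT INTO friends VALUES (?, ?)", [(a, b), (b, a)])


def friends_of(user_id):
    """Only one column has to be searched, since every pair is stored twice."""
    rows = conn.execute(
        "SELECT friend_id FROM friends WHERE user_id = ? ORDER BY friend_id",
        (user_id,),
    ).fetchall()
    return [fid for (fid,) in rows]


add_friendship(1, 2)
add_friendship(1, 3)
```

Removing a friendship would need the same care: delete both rows in one transaction.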
You could check which of the two user_id's is the lowest and store them in a specific order. This way you don't need double rows for one friendship and still keep your queries simple.
user_id_low | user_id_high
A simple query to check whether you're already friends with someone would be:
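<?php
// query() stands in for whatever database helper the application uses.
$my_id = 2999;
$friend_id = 500;
// Each pair is stored in a fixed order: lower ID first, higher ID second.
$lowest = min($my_id, $friend_id);
$highest = max($my_id, $friend_id);
query("SELECT * FROM friends WHERE user_id_low=$lowest AND user_id_high=$highest");
?>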
Or you could find the lowest/highest user ID using MySQL:
<?php
query("SELECT * FROM friends WHERE user_id_low=LEAST($my_id, $friend_id) AND user_id_high=GREATEST($my_id, $friend_id)");
?>
And to get all your friends' IDs:
<?php
query("SELECT IF(user_id_low=$my_id,user_id_high,user_id_low) AS friend_id FROM friends WHERE $my_id IN (user_id_low, user_id_high)");
?>
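The ordered-pair idea can also be enforced by the table itself. Below is a sketch in Python with sqlite3, using bound parameters instead of string interpolation; the CHECK constraint and the helper names are my own additions, not part of the queries above, and older MySQL versions ignore CHECK constraints, so treat that part as an assumption about your DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (
    user_id_low  INTEGER NOT NULL,
    user_id_high INTEGER NOT NULL,
    PRIMARY KEY (user_id_low, user_id_high),
    CHECK (user_id_low < user_id_high)   -- rejects rows stored the wrong way round
);
""")


def add_friend(a, b):
    """Store one row per friendship, lower ID first."""
    low, high = min(a, b), max(a, b)
    with conn:
        conn.execute("INSERT INTO friends VALUES (?, ?)", (low, high))


def are_friends(a, b):
    """True if the pair exists, regardless of argument order."""
    row = conn.execute(
        "SELECT 1 FROM friends WHERE user_id_low = ? AND user_id_high = ?",
        (min(a, b), max(a, b)),
    ).fetchone()
    return row is not None


def friend_ids(my_id):
    """Return the other side of every row that mentions my_id."""
    rows = conn.execute("""
        SELECT CASE WHEN user_id_low = ? THEN user_id_high ELSE user_id_low END
        FROM friends
        WHERE ? IN (user_id_low, user_id_high)
        ORDER BY 1
    """, (my_id, my_id)).fetchall()
    return [fid for (fid,) in rows]


add_friend(2999, 500)
```

Because the pair is the primary key, trying to add the same friendship again in either order fails instead of creating a duplicate.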
Using double rows, while it creates extra data, will greatly simplify your queries and allow you to index smartly. I also remember seeing info on Twitter's custom MySQL solution wherein they used an additional field (friend #, basically) to do automatic limiting and paging. It seems pretty smooth:
https://blog.twitter.com/2010/introducing-flockdb
Use a key value store, such as Cassandra for example.