I have been working on a social networking site. Here a user requests another user to be his friend (friend request). I thought of a 'Friends' table which looks like
Table Name: Friends
Columns:
User1 - Int - FK
User2 - Int - FK
Request - Enum('0','1')
Time - DateTime
PK - (User1, User2)
In Request field '0' is stored when a request is made by User1 to User2 and '1' is stored when the request has been approved by User2.
The problem arises when I want to retrieve all friends of a user: I have to check whether the Request field is '0' or '1' each time. Is there another way to do so? Would it be better to have another table which stores all the details of friend requests?
In a comment you basically state that an established friendship (as opposed to a request) is always symmetric in your setup. In that case, you have basically two options: you can either store it in two rows, or select it by matching either column. The former will yield simpler queries, but the latter will ensure that the symmetry is inherent in the database structure, and will avoid storing duplicate data as well. So I'd go for the latter, i.e. some form of WHERE (User1 = XX or User2 = XX ). The query might well take twice as long as a query for just one column would take on that same number of rows, but as the number of rows is only half that of the other storage scheme, the net effect in terms of performance should be negligible.
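A minimal sketch of the "match either column" option, using Python's sqlite3 module and an in-memory database. The Status column and its values 'requested' and 'established' are assumptions for illustration (the question's table uses Request with '0' and '1'), and friends_of is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Friends (
        User1  INTEGER NOT NULL,
        User2  INTEGER NOT NULL,
        Status TEXT NOT NULL CHECK (Status IN ('requested', 'established')),
        PRIMARY KEY (User1, User2)
    )
""")
conn.executemany(
    "INSERT INTO Friends VALUES (?, ?, ?)",
    [(1, 2, "established"), (3, 1, "established"), (1, 4, "requested")],
)

def friends_of(user_id):
    """Return the IDs of established friends, whichever column holds the user."""
    rows = conn.execute(
        """
        SELECT CASE WHEN User1 = :u THEN User2 ELSE User1 END
        FROM Friends
        WHERE (User1 = :u OR User2 = :u) AND Status = 'established'
        """,
        {"u": user_id},
    )
    return sorted(row[0] for row in rows)

print(friends_of(1))
```

Each friendship is stored once, so the symmetry cannot get out of sync; the cost is the CASE expression needed to pick the "other" user.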
Whether you want a separate table for requests or established friendships depends on how similar those two are, both in terms of associated data and of the control flow in your application. So for example, if you want to present a single list to a user which shows both his established friendships and his pending requests, perhaps with different colors or whatever, but in the same list, then having a single table in the database would be more appropriate. If, on the other hand, you mostly treat requests and friendships separately, then having two tables would come more naturally. And if, at some time, you decide that a friendship needs attributes like share_calendar whereas a request needs attributes like confirmation_key or whatever, then you'll be better off with different tables.
If you decide to make this a single table, I'd suggest more descriptive values for the enum, like calling the column status and the values requested and established. I, for one, would at first glance interpret a value of request = 1 as "this is a request only, not an established friendship", exactly the opposite of the meaning you associate with it. This ambiguity can lead to errors when different people need to maintain the code. And in several years you'll be enough of a different person from who you are now that even you yourself might misinterpret your old code. So be descriptive there.
One more note: you may always use views to tweak the way your database appears to your queries. For example, you can create a view
CREATE VIEW SymmetricEstablishedFriends AS
SELECT User1 AS Me, User2 AS Friend, Time
FROM Friends
WHERE Status = 'established'
UNION
SELECT User2 AS Me, User1 AS Friend, Time
FROM Friends
WHERE Status = 'established'
This will restrict the data to established friendships only, and will take care of symmetrizing things for you. Using such views in your queries, you can avoid having to deal with all the details of the table structure in every query. And if you ever change those details, there will be fewer places to change.
I would break your data into requests and friendships. When a request is approved, convert it into a friendship. They really are two different objects, and should be treated as such.
Requests ::
requesting_user_id : int()
requested_user_id : int()
date_requested : datetime()
status_id : int()
Statuses ::
(Active, Declined, Accepted, Ignored)
Friendships ::
friendship_id : int()
user_id : int()
friend_id : int()
Maybe delete the request if it's declined, or keep it and record the decline in a column (to keep people from repeatedly requesting the same user's friendship). You'd have to convert the request to two friendships (one in each direction) so that a user's friends can be found with a simple indexed lookup:
SELECT friend_id FROM friendships WHERE user_id = ?
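The conversion step could look roughly like the following sketch (Python with sqlite3). It assumes a plain text status column instead of the status_id lookup table, and accept_request is a hypothetical helper; the point is only that the status update and both friendship rows go into one transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE requests (
        requesting_user_id INTEGER NOT NULL,
        requested_user_id  INTEGER NOT NULL,
        status             TEXT NOT NULL DEFAULT 'Active',
        PRIMARY KEY (requesting_user_id, requested_user_id)
    );
    CREATE TABLE friendships (
        friendship_id INTEGER PRIMARY KEY,
        user_id       INTEGER NOT NULL,
        friend_id     INTEGER NOT NULL,
        UNIQUE (user_id, friend_id)
    );
""")

def accept_request(requester, requested):
    """Mark an active request as accepted and add one row per direction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE requests SET status = 'Accepted' "
            "WHERE requesting_user_id = ? AND requested_user_id = ? "
            "AND status = 'Active'",
            (requester, requested),
        )
        if cur.rowcount != 1:
            raise ValueError("no active request to accept")
        conn.executemany(
            "INSERT INTO friendships (user_id, friend_id) VALUES (?, ?)",
            [(requester, requested), (requested, requester)],
        )

conn.execute(
    "INSERT INTO requests (requesting_user_id, requested_user_id) VALUES (1, 2)"
)
accept_request(1, 2)
friends_of_2 = [row[0] for row in conn.execute(
    "SELECT friend_id FROM friendships WHERE user_id = ?", (2,))]
print(friends_of_2)
```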
Related
I'm creating a file hosting service, but right now I am creating the account email activation part of registering. So I had to come up with a database structure.
And right now it's:
users
id
first_name
last_name
email
password
since
active
hash_activate
But I can do it like a relational database too:
users
id
first_name
last_name
email
password
since
activation
id
user_id
hash
active
What would be the best way to go about it? And why?
If every person has only one activation hash active at a time, then it's feasible to store it in the same table as the users.
However, one advantage of separating it is that users only have an activation hash for a brief period of time, so to keep the user records smaller, you could store the hashes in a separate table. Smaller user records mean less data to read per row, which helps performance. In this case, you wouldn't have an active column. You'd just delete hashes once they are no longer needed.
If you do store the activation columns in the user table, just be sure to select the columns by name. E.g. in most cases, you'll want to do this:
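A rough sketch of the separate-table variant in Python with sqlite3. It assumes that a user counts as active exactly when no pending activation row exists, which is one reading of "you'd just delete hashes"; activate and is_active are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE activation (
        user_id INTEGER PRIMARY KEY REFERENCES users(id),
        hash    TEXT NOT NULL UNIQUE
    );
""")
conn.execute("INSERT INTO users (id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO activation (user_id, hash) VALUES (1, 'abc123')")

def activate(hash_value):
    """Consume an activation hash; True if a pending row was deleted."""
    with conn:
        cur = conn.execute(
            "DELETE FROM activation WHERE hash = ?", (hash_value,))
    return cur.rowcount == 1

def is_active(user_id):
    """Assumption: a user is active when no activation row is pending."""
    row = conn.execute(
        "SELECT 1 FROM activation WHERE user_id = ?", (user_id,)).fetchone()
    return row is None
```

Calling activate with the hash from the emailed link removes the row, after which is_active reports the account as active and the users table never carried the extra columns.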
SELECT id, first_name, last_name, email, password
FROM users
Instead of:
SELECT *
FROM users
You'd only want to select the activation columns when you needed them.
The second would only be sensible if one user could have multiple activations. You don't say whether this is true or false, so I couldn't possibly advise you.
If activations are a temporary thing, or having a hash defines someone as active, then make them different. Otherwise, that really won't matter.
However, neither is necessarily more or less relational than the other, without much more information. If you put a unique constraint on the combination of values in each row, and set each column up with a NOT NULL constraint, your first one would be quite relational.
You use a relational design when correctness of data, over time, is as important, if not more important, than what the application does with that data, and/or when data structure correctness/consistency is critical to the correct operation of an application, but might not necessarily be guaranteed by the application's own operation.
Assuming I want to have a web application that requires storing user information, images, etc as well as storing status updates or posts/comments would I want to separate tables?
For example if I have a "users" table that contains users information like passwords, emails, and typical social networking info like age, location etc. Would it be a good idea do create a second table("posts") that handles user content such as comments and/or post?
Table one: "users"
UserID
Username
Age
etc.
Table Two: "posts"
PostID
PostContent
PostAuthor
PostDate
etc
Is this a valid organization? Furthermore if I wanted to keep track of media should I do this in ANOTHER table?
Table Three: "media"
ID
Type
Uploader
etc.
Any help is much appreciated. I'm curious to see if I'm on the right track or just completely lost. I am mostly wondering if I should have many tables or if I should have larger less segregated tables.
Also of note thus far I planned on keeping information such as followers(or friends) in the 'users' table but I'm not sure that's a good idea in retrospect.
thanks in advance,
Generally speaking, to design a database you create a table for each object you will be dealing with. In your example you have Users, Posts, Comments and Media. From that you can flesh out what it is you want to store for each object. Each item you want to store is a field in the table:
[Users]
ID
Username
PasswordHash
Age
Birthdate
Email
JoinDate
LastLogin
[Posts]
ID
UserID
Title
Content
CreateDate
PostedDate
[Comments]
ID
PostID
UserID
Content
[Media]
ID
Title
Description
FileURI
Taking a look above you can see a basic structure for holding the information for each object. By the field names you can even tell the relationships between the objects. That is, a post has a UserID, so the post was created by that user. The comments have a PostID and a UserID, so you can see that a comment was written by a person for a specific post.
Once you have the general fields identified you can look at some other aspects of the design. For example, right now the Email field in the Users table means that a user can have exactly one (1) email address. You can solve this in one of two ways. You can add more email fields (EmailA, EmailB, EmailC); this generally works if you know there are specific types of emails you are dealing with, for example EmailWork or EmailHome. It doesn't work if you do not know how many emails there will be in total. To solve that, you can pull the emails out into their own table:
[Users]
ID
Username
PasswordHash
Age
Birthdate
JoinDate
LastLogin
[Emails]
ID
UserID
Email
Now you can have any number of emails for a single user. You can do this for just about any database you are trying to design. Take it in small steps and break your bigger objects into smaller ones as needed.
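As an illustration, here is a small sketch of that one-to-many split in Python with sqlite3. Only a subset of the Users columns is shown, and the sample users and addresses are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (
        ID       INTEGER PRIMARY KEY,
        Username TEXT NOT NULL UNIQUE
    );
    CREATE TABLE Emails (
        ID     INTEGER PRIMARY KEY,
        UserID INTEGER NOT NULL REFERENCES Users(ID),
        Email  TEXT NOT NULL UNIQUE
    );
    INSERT INTO Users (ID, Username) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO Emails (UserID, Email) VALUES
        (1, 'alice@home.example'),
        (1, 'alice@work.example'),
        (2, 'bob@home.example');
""")

# Count how many addresses each user has; the join follows Emails.UserID.
emails_per_user = dict(conn.execute("""
    SELECT u.Username, COUNT(e.ID)
    FROM Users u
    LEFT JOIN Emails e ON e.UserID = u.ID
    GROUP BY u.ID
    ORDER BY u.Username
"""))
print(emails_per_user)
```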
Update
To deal with friends you should think about the relationship you are dealing with. There is one (1) person with many friends. In relation to the tables above, it's one User to many Users. This can be done with a special table that holds no information other than the relationship you are looking for.
[Friends]
[UserA]
[UserB]
So if the current user's ID is in A, his friend's ID is in B, and vice versa. This sets up the friendship so that if you are my friend, then I am your friend. There is no way for me to be your friend without you being mine. If you want to set up the ability for one-way friendships you can set up the table like this:
[Friends]
[UserID]
[FriendID]
So if we are both friends with each other there would have to be 2 records, one for my friendship to you and one for your friendship to me.
You need to use multiple tables.
The amount of tables depends on how complex you want your interactive site to be. Based on what you have posted you would need a table that would store information about the users, a table for comments, and more such as a table to store status types.
For example tbl_Users should store:
1. UserID
2. First Name
3. Last name
4. Email
5. Password (hashed)
6. Address
7. City
8. State
9. Country
10. Date of Birth
11. UserStatus
12. Etc
This project sounds like it should be using a relational DB that will pull up records, such as comments, by relative userIDs.
This means that you will need a table that stores the following:
1. CommentID (primary key, int, auto-increment)
2. Comment (text)
3. UserID (foreign key, int)
The comment is attached to a user through a foreign key, which is essentially the userId from the tbl_Users table. You would need to combine these tables in an SQL statement with your script to query the information as a single piece of information. See example code
// Cast to int (or use a prepared statement) so user input cannot inject SQL.
$userID = (int) $userID;
$sql_userWall = "SELECT tbl_Users.*, tbl_Comments.*, tbl_UserStatus.*
FROM tbl_Users
INNER JOIN tbl_Comments ON tbl_Users.userID = tbl_Comments.userID
INNER JOIN tbl_UserStatus ON tbl_Users.userStatus = tbl_UserStatus.statusID
WHERE tbl_Users.userID = $userID";
This statement essentially says: get the information of the given user from the users table, get all the comments that have the same userID attached to them, and get the user's status from the table of user statuses.
Therefore you would need a table called tbl_UserStatus that holds unique statusIDs (primary key, int, auto-incrementing) along with a text column (varchar) of a fixed maximum length that may say, for example, "online" or "offline". When you write the info out from a record using PHP, ASP or a similar language, the joined query has already retrieved the text from tbl_UserStatus for you, so a simple line like the following is enough (assuming $row holds the fetched record and the text column is named statusText):
<?php echo $row['statusText']; ?>
No extra work necessary. Most of your project time will be spent developing the DB structure and writing SQL statements that correctly retrieve the info you want for each page.
There are many great YouTube video series that describe relational DBs and drawing entity-relationship diagrams. This is what you should look into for learning more on creating the type of project you were describing.
One last note, if you wanted comments to be visible for all members of a group this would describe what is known as a many-to-many relationship which would require additional tables to allow for multiple users to 'own' a relationship to a single table. You could store a single groupID that referred to a table of groups.
tbl_groups
1. GroupID
2. GroupName
3. More group info, etc
And a table of users registered for the group
Tbl_groupMembers
1. membershipCountID (primary key, int, auto-increment)
2. GroupID (foreign key, int)
3. UserID (foreign key, int)
This allows users to register for a group, and lets you inner join them to group-based comments. These relationships take a little more time to understand; the videos will help greatly.
I hope this helps, I'll come back and post some YouTube links later that I found helpful learning this stuff.
In Meetup.com, when you join a meetup group, you are usually required to complete a profile for that particular group. For example, if you join a movie meetup group, you may need to list the genres of movies you enjoy, etc.
I'm building a similar application, wherein users can join various groups and complete different profile details for each group. Assume the 2 possibilities:
Users can create their own groups and define what details to ask users that join that group (so, something a bit dynamic -- perhaps suggesting that at least an EAV design is required)
The developer decides now which groups to create and specify what details to ask users who join that group (meaning that the profile details will be predefined and "hard coded" into the system)
What's the best way to model such data?
More elaborate example:
The "Movie Goers" group request their members to specify the following:
Name
Birthdate (to be used to compute member's age)
Gender (must select from "male" or "female")
Favorite Genres (must select 1 or more from a list of specified genres)
The "Extreme Sports" group request their member to specify the following:
Name
Description of Activities Enjoyed (narrative form)
Postal Code
The bottom line is that each group may require different details from members joining their group. Ideally, I would like anyone to create a group (ala MeetUp.com). However, I also need the ability to query for members fairly well (e.g. find all women movie goers between the ages of 25 and 30).
For something like this you'd want maximum normalization, so you wouldn't have duplicate data anywhere. Because your user-defined tables could possibly contain the same type of record, I think that you might have to go above 3NF for this.
My suggestion would be this - explode your tables so that you have something close to 6NF with EAV, so that each question that users must answer will have its own table. Then, your user-created tables will all reference one of your question tables. This avoids the duplication of data issue. (For instance, you don't want an entry in the "MovieGoers" group with the name "John Brown" and one in the "Extreme Sports" group with the name "Johnny B." for the same user; you also don't want his "what is your favorite color" answer to be "Blue" in one group and "Red" in another. Any data that can span across groups, like common questions, would be normalized in this form.)
The main drawback to this is that you'd end up with a lot of tables, and you'd probably want to create views for your statistical queries. However, in terms of pure data integrity, this would work well.
Note that you could probably get away with only factoring out the common fields, if you really wanted to. Examples of common fields would include Name, Location, Gender, and others; you could also do the same for common questions, like "what is your favorite color" or "do you have pets" or something to that extent. Group-specific questions that don't span across groups could be stored in a separate table for that group, un-exploded. I wouldn't advise this because it wouldn't be as flexible as the pure 6NF option and you run the risk of duplication (how do you predetermine which questions won't be common questions?) but if you really wanted to, you could do this.
There's a good question about 6NF here: Would like to Understand 6NF with an Example
I hope that made some sense and I hope it helps. If you have any questions, leave a comment.
Really, this is exactly a problem for which SQL is not a right solution. Forget normalization. This is exactly the job for NoSQL document stores. Every user as a document, having some essential fields like id, name, pwd etc. And every group adds possibility to add some fields. Unique fields can have names group-id-prefixed, shared fields (that grasp some more general concept) can have that field name free.
Besides users (and groups), you will then have field descriptions with name, type, possible values, and so on, which is also a very good fit for a document store.
If you use key-value document store from the beginning, you gain this freeform possibility of structuring your data plus querying them (though not by SQL, but by the means this or that NoSQL database provides).
First I'd like to note that the following structure is just a basis for your DB, and you will need to expand or reduce it.
There are the following entities in DB:
user (just user)
group (any group)
template (list of requirement united into template to simplify assignment)
requirement (single requirement. For example: date of birth, gender, favorite sport)
"Modeling":
**User**
user_id
user_name
**Group**
name
group_id
user_group
user_id (FK)
group_id (FK)
**requirement**:
requirement_id
requirement_name
requirement_type (FK) (the type: combo, free string, date; should refer to a dictionary table)
**template**
template_id
template_name
**template_requirement**
r_id (FK)
t_id (FK)
The next step is to model appropriate schema for storing restrictions, i.e. validating rule for any requirement in any template. We have to separate it because for different groups the same restrictions can be different (for example: "age"). You can use the following table:
**restrictions**
group_id
template_id
requirement_id (should be here as well as template_id, because the same requirement can exist in different templates and any group can consist of many templates)
restriction_type (FK) (points to another dict: value, length, regexp, at_least_one_value_chosen and so on)
So, as I said, it is the basis. Feel free to simplify this schema (wipe out tables, drop multiple templates per group). Or you can make it more general by adding the ability to create and publish templates, requirements and so on.
Hope you find this idea useful
You could save such data as JSON or XML (Structure, Data)
User Table
Userid
Username
Password
Groups -> JSON Array of all Groups
GroupStructure Table
Groupid
Groupname
Groupstructure -> JSON Structure (with specified Fields)
GroupData Table
Userid
Groupid
Groupdata -> JSON Data
I think this covers most of your constraints:
users
user_id, user_name, password, birth_date, gender
1, Robert Jones, *****, 2011-11-11, M
group
group_id, group_name
1, Movie Goers
2, Extreme Sports
group_membership
user_id, group_id
1, 1
1, 2
group_data
group_data_id, group_id, group_data_name
1, 1, Favorite Genres
2, 2, Favorite Activities
group_data_value
id, group_data_id, group_data_value
1,1,Comedy
2,1,Sci-Fi
3,1,Documentaries
4,2,Extreme Cage Fighting
5,2,Naked Extreme Bike Riding
user_group_data
user_id, group_id, group_data_id, group_data_value_id
1,1,1,1
1,1,1,2
1,2,2,4
1,2,2,5
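To check that a layout like this can answer the question's sample query ("all women movie goers between the ages of 25 and 30"), here is a hedged sketch in Python with sqlite3. It uses only the users, group and group_membership tables; the extra sample users, the fixed today argument and the members helper are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT,
                        birth_date TEXT, gender TEXT);
    -- "group" is an SQL keyword, so the table name is quoted.
    CREATE TABLE "group" (group_id INTEGER PRIMARY KEY, group_name TEXT);
    CREATE TABLE group_membership (user_id INTEGER, group_id INTEGER,
                                   PRIMARY KEY (user_id, group_id));
    INSERT INTO users VALUES
        (1, 'Robert Jones', '1990-01-01', 'M'),
        (2, 'Ann Smith',    '1996-03-01', 'F'),
        (3, 'Eve Brown',    '1980-07-20', 'F');
    INSERT INTO "group" VALUES (1, 'Movie Goers'), (2, 'Extreme Sports');
    INSERT INTO group_membership VALUES (1, 1), (2, 1), (3, 1), (2, 2);
""")

def members(group_name, gender, min_age, max_age, today):
    """Names of group members with the given gender and age range on `today`."""
    rows = conn.execute(
        """
        SELECT u.user_name
        FROM users u
        JOIN group_membership m ON m.user_id = u.user_id
        JOIN "group" g ON g.group_id = m.group_id
        WHERE g.group_name = :g
          AND u.gender = :sex
          -- age = year difference, minus 1 if the birthday is still ahead
          AND ((strftime('%Y', :today) - strftime('%Y', u.birth_date))
               - (strftime('%m-%d', :today) < strftime('%m-%d', u.birth_date)))
              BETWEEN :lo AND :hi
        ORDER BY u.user_name
        """,
        {"g": group_name, "sex": gender, "lo": min_age, "hi": max_age,
         "today": today},
    )
    return [row[0] for row in rows]

print(members("Movie Goers", "F", 25, 30, "2024-06-15"))
```

Filtering on group-specific answers would add joins through user_group_data and group_data_value in the same style.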
I've had similar issues to this. I'm not sure if this would be the best recommendation for your specific situation but consider this.
Provide a means of storing data as XML, or JSON, or some other format that delimits the data, but basically stores it in field that has no specific format.
Provide a way to store the definition of that data
Provide a lookup/index table for the data.
This is a combination of techniques indicated already.
Essentially, you would create some interface for your clients to create a "form" for what they want saved. This form would indicate what pieces of information they want from the user. It would also indicate what pieces of information you want to search on.
Save this information to the definition table.
The definition table is then used to describe the user interface for entering data.
Once user data is entered, save the data (as xml or whatever) to one table with a unique id. At the same time, another table will be populated as an index with
id where the xml data was saved
name of field data is stored in
value of field data stored.
id of data definition.
Now when a search commences, there should be no issue in searching the index table by name, value and definition id, and getting back the id of the XML/JSON (or whatever) data you stored in the table where the form data was saved.
That data should be transformable once it is retrieved.
I was seriously sketchy on the details here, I hope this is enough of an answer to get you started. If you would like any explanation or additional details, let me know and I'll be happy to help.
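To make the idea a little more concrete, here is a minimal sketch in Python with sqlite3 and JSON. The table names form_data and form_index and the helpers save and find are hypothetical; the definition table is reduced to a bare definition_id.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE form_data (
        id            INTEGER PRIMARY KEY,
        definition_id INTEGER NOT NULL,
        body          TEXT NOT NULL      -- the whole form as JSON
    );
    CREATE TABLE form_index (
        data_id       INTEGER NOT NULL REFERENCES form_data(id),
        definition_id INTEGER NOT NULL,
        field_name    TEXT NOT NULL,
        field_value   TEXT NOT NULL
    );
    CREATE INDEX form_index_lookup
        ON form_index (definition_id, field_name, field_value);
""")

def save(definition_id, data, searchable):
    """Store the form as one JSON blob and index only the searchable fields."""
    with conn:
        cur = conn.execute(
            "INSERT INTO form_data (definition_id, body) VALUES (?, ?)",
            (definition_id, json.dumps(data)),
        )
        data_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO form_index VALUES (?, ?, ?, ?)",
            [(data_id, definition_id, name, str(data[name]))
             for name in searchable],
        )
    return data_id

def find(definition_id, name, value):
    """Look up stored forms through the index table, then decode the JSON."""
    rows = conn.execute(
        """
        SELECT d.body
        FROM form_index i
        JOIN form_data d ON d.id = i.data_id
        WHERE i.definition_id = ? AND i.field_name = ? AND i.field_value = ?
        ORDER BY d.id
        """,
        (definition_id, name, str(value)),
    )
    return [json.loads(row[0]) for row in rows]

save(1, {"name": "Ann", "postal_code": "12345"}, searchable=["postal_code"])
print(find(1, "postal_code", "12345"))
```

Fields that were not marked searchable live only inside the JSON body, so they cannot be found through the index.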
If you're not tied to MySQL, I suggest using PostgreSQL, which provides built-in array datatypes.
You can define an array-of-varchar field in your groups table to store the group-specific fields. To store the values you can do the same in the membership table.
Compared to XML types that rely on string parsing, this array approach will be really fast.
If you don't like the array approach, you can check out the xml datatype and the optional hstore datatype, which is a key-value store.
I want to create a two-level status message system. What is the best way to design the tables?
Scope:
User sets a Status Message
Users Reply to the status message
Tables i have created
users (id, name .... )
status_messages (id, message, time, user_id)
status_message_replies (id, message, time, status_message_id, user_id)
Some one suggested this can be done in a single table format
status_messages (id, pid, message, time, user_id)
where pid = selfId or ParentId of the status.
I want to know which is the best method to create the system.
As long as the original messages and the responses have the same structure (set of attributes, or columns) then you can use the single table approach. It has the advantage that you can search over original messages and responses with a single query.
The set of original messages can be found where pid = selfid and the responses where pid <> selfid. If it's important to be able to see the original and response messages separately (without knowledge of the storage mechanism) you can encapsulate the above conditions in two VIEWs: OriginalMessages and Responses.
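A short sketch of the single-table approach with the two views, in Python with sqlite3. The time column is left out for brevity and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE status_messages (
        id      INTEGER PRIMARY KEY,
        pid     INTEGER NOT NULL,   -- own id for originals, parent id for replies
        message TEXT NOT NULL,
        user_id INTEGER NOT NULL
    );
    CREATE VIEW OriginalMessages AS
        SELECT * FROM status_messages WHERE pid = id;
    CREATE VIEW Responses AS
        SELECT * FROM status_messages WHERE pid <> id;
    INSERT INTO status_messages VALUES
        (1, 1, 'Hello world',   10),
        (2, 1, 'Hi there',      11),
        (3, 1, 'Welcome',       12),
        (4, 4, 'Second status', 11);
""")

originals = [row[0] for row in conn.execute(
    "SELECT id FROM OriginalMessages ORDER BY id")]
replies_to_1 = [row[0] for row in conn.execute(
    "SELECT message FROM Responses WHERE pid = 1 ORDER BY id")]
print(originals, replies_to_1)
```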
If the originals and responses have different attributes (for instance, if you want the original to allow links to URLs, photos, etc) you might consider using two separate tables. But even there, I'd probably argue for the one table structure with a separate, extender table for the additional attributes. That means you don't have to store often-empty columns for those original messages that don't use the extended attributes, and you can later easily add the extended attributes to the response messages as well (if desired).
A classical IS-A relationship: every reply is a message with an extra attribute (the message it is a reply to).
This is probably not the best way to model it. You'll be running the risk of having to write a lot of UNION queries over those two tables.
Alternatives:
just one table: status_messages (id, message, time, status_message_id, user_id), and allowing status_message_id to be NULL
use a HAS-A: one table status_messages (id, message, time, user_id) and one table replies (reply_id, replies_to_id)
The former has the disadvantage that working with NULL is tricky in SQL.
The latter will necessitate joins when you want to query replies specifically.
BTW it's much clearer (IMO) to name columns after the relationship they stand for, not the table they refer to.
I am just trying to figure out how Facebook's database is structured for tracking notifications.
I won't go into as much complexity as Facebook does. If we imagine a simple table structure for notifications:
notifications (id, userid, update, time);
We can get the notifications of friends using:
SELECT `userid`, `update`, `time`
FROM `notifications`
WHERE `userid` IN
(... query for getting friends...)
However, what should be the table structure to check out which notifications have been read and which haven't?
I don't know if this is the best way to do this, but since I got no ideas from anyone else, this is what I would be doing. I hope this answer might help others as well.
We have 2 tables
notifications
-----------------
id (pk)
userid
notification_type (for complexity like notifications for pictures, videos, apps etc.)
notification
time
notificationsRead
--------------------
id (pk) (I don't think this field is required, anyway)
lasttime_read
userid
The idea is to select notifications from the notifications table, look up the user's row in the notificationsRead table, and keep only the notifications newer than that user's lasttime_read. Each time the notifications page is opened, update that user's row in the notificationsRead table.
The query for unread notifications I guess would be like this..
SELECT `userid`, `notification`, `time` FROM `notifications`
WHERE
`notifications`.`userid` IN ( ... query to get a list of friends ...)
AND
(`notifications`.`time` > (
SELECT `notificationsRead`.`lasttime_read` FROM `notificationsRead`
WHERE `notificationsRead`.`userid` = ...$userid...
))
The query above is not checked.
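For what it's worth, here is a runnable sketch of the same last-read idea in Python with sqlite3. It assumes integer timestamps, takes the friend list as a ready-made argument instead of a subquery, and uses the hypothetical helpers unread and mark_all_read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE notifications (
        id           INTEGER PRIMARY KEY,
        userid       INTEGER NOT NULL,
        notification TEXT NOT NULL,
        time         INTEGER NOT NULL
    );
    CREATE TABLE notificationsRead (
        userid        INTEGER PRIMARY KEY,
        lasttime_read INTEGER NOT NULL
    );
    INSERT INTO notifications (userid, notification, time) VALUES
        (2, 'posted a photo', 100),
        (3, 'liked a video',  200),
        (2, 'joined a group', 300);
    INSERT INTO notificationsRead VALUES (1, 150);
""")

def unread(reader_id, friend_ids):
    """Friends' notifications newer than the reader's last-read time."""
    marks = ",".join("?" * len(friend_ids))
    rows = conn.execute(
        f"""
        SELECT notification
        FROM notifications
        WHERE userid IN ({marks})
          AND time > COALESCE(
              (SELECT lasttime_read FROM notificationsRead WHERE userid = ?),
              0)   -- a reader with no row yet has read nothing
        ORDER BY time
        """,
        (*friend_ids, reader_id),
    )
    return [row[0] for row in rows]

def mark_all_read(reader_id, now):
    """Move the reader's last-read marker forward to `now`."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO notificationsRead (userid, lasttime_read) "
            "VALUES (?, ?)",
            (reader_id, now),
        )

print(unread(1, [2, 3]))
```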
Thanks to the idea of db design from #espais
You could add another table...
tblUserNotificationStatus
-------------------------
- id (pk)
- notification_id
- user_id
- read_status (boolean)
If you wanted to keep a history, you could keep the X latest notifications and delete the rest that are older than your last notification in the list....
If, when you give notifications, you give all relevant notifications available at that time, you can make this simpler by attaching timestamps to notifiable events, and keeping track of when each user last received notifications. If you are in a multi-server environment, though, you do have to be careful about synchronization. Note that this approach doesn't require true date-time stamps, just something that increases monotonically.
I see no one here addresses the fact that notifications are usually recurring, i.e. the notification of an upcoming transaction is always going to be the same, but with a different transaction ID or date in it, like so: { You have a new upcoming payment: #paymentID, with a due date of #dueDate }.
Having the texts in a separate table also helps in two ways:
If you want to change the notification text later on, you only change it in one place.
Making the app multilingual is easier, because I can just layer the notifications table with a language code and retrieve the appropriate string.
Thus I also made a table for those abstract notifications, which are linked to the user through a middle table, so that one notification type can be sent to one user multiple times. I also linked the notifications to the user not by a foreign key ID; instead I made notification codes for all notifications and put a full-text index on the varchar field of those codes, for faster reads. Because these notifications need to be sent at specific times, it is also easier for the developer to write
NotificationService::sendNew(Notification::NOTE_NEW_PAYMENT, ['paymentId' => 123, 'dueDate' => Carbon::now()], $userIdToSendTo);
Now, since my messages are going to have custom data inserted into the string (the second argument above), I will store those values in a database blob, like this:
$values = base64_encode(serialize($valuesInTextArray));
This is because I want to decouple the notifications from other tables, and as such I don't want to create unnecessary FK relations from and to the notifications table just so that I can, for example, say notification 234 is attached to transaction 23 and then join and get that transaction ID. Decoupling this takes away the overhead of managing these relations. The downside is that it is nigh impossible to delete notifications when, for example, a transaction is deleted, but in my use case I decided this is not needed anyway.
I will retrieve and fill the texts on the app side as follows. P.S. I am using someone's vksprintf function (https://github.com/washingtonpost/datawrapper/blob/master/lib/utils/vksprintf.php), props to them!
$valuesToFillInString = unserialize(base64_decode($notification->values));
vksprintf( $notificationText->text, $valuesToFillInString )
Notice also which fields I index, because I am going to find or sort by them
My Database design is as follows
==============================
TABLE: Users
id (pk)
==============================
TABLE: Notifications
id (pk)
user_id (fk, indexed)
text_id (fk - NotificationTexts table)
values (blob) [containing the array of values, to input into the text string]
createdDateTime (DateTime)
read (boolean)
[ClusterIndex] => (user_id, createdDateTime)
==============================
TABLE: NotificationTexts
id (pk)
text_id (unique, indexed)
text (varchar) [{ You have a new upcoming payment: #paymentID, with a due date of #dueDate }]
note (varchar, nullable) [notes for developers, informational column]
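The same design can be sketched in Python with sqlite3. This is not the PHP code above: it swaps serialize/base64 for JSON, replaces vksprintf with a plain string replace, and renames the values column to values_json; send_new and render are hypothetical helpers.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE NotificationTexts (
        id      INTEGER PRIMARY KEY,
        text_id TEXT NOT NULL UNIQUE,
        text    TEXT NOT NULL
    );
    CREATE TABLE Notifications (
        id          INTEGER PRIMARY KEY,
        user_id     INTEGER NOT NULL,
        text_id     TEXT NOT NULL REFERENCES NotificationTexts(text_id),
        values_json TEXT NOT NULL,
        read        INTEGER NOT NULL DEFAULT 0
    );
""")
conn.execute(
    "INSERT INTO NotificationTexts (text_id, text) VALUES (?, ?)",
    ("NOTE_NEW_PAYMENT",
     "You have a new upcoming payment: #paymentID, with a due date of #dueDate"),
)

def send_new(text_id, values, user_id):
    """Store one notification: a text code plus the values to fill in."""
    with conn:
        cur = conn.execute(
            "INSERT INTO Notifications (user_id, text_id, values_json) "
            "VALUES (?, ?, ?)",
            (user_id, text_id, json.dumps(values)),
        )
    return cur.lastrowid

def render(notification_id):
    """Fetch the template and fill each #name placeholder with its value."""
    template, raw = conn.execute(
        """
        SELECT t.text, n.values_json
        FROM Notifications n
        JOIN NotificationTexts t ON t.text_id = n.text_id
        WHERE n.id = ?
        """,
        (notification_id,),
    ).fetchone()
    values = json.loads(raw)
    # Longest names first, so '#payment' cannot clobber '#paymentID'.
    for name in sorted(values, key=len, reverse=True):
        template = template.replace("#" + name, str(values[name]))
    return template

nid = send_new("NOTE_NEW_PAYMENT",
               {"paymentID": 123, "dueDate": "2024-07-01"}, user_id=7)
print(render(nid))
```

Changing the wording later only touches the NotificationTexts row; already stored notifications pick up the new text the next time they are rendered.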
I am also trying to figure out how to design a notification system. Regarding notification status (read, unread, deleted, archived, etc.), I think it would be a good candidate for an ENUM. It is possible that there will be more statuses than just READ and UNREAD, such as deleted, archived, seen, dismissed, etc.
That will allow you to expand as your needs evolve.
Also I think it may make sense (at least in my case) to have a field to store an action url or a link. Some notifications could require or prompt the user to follow a link.
It also may make sense to have a notification type as well if you want different types. I am thinking there could be system notifications (such as a verify email notification) and user prompted notifications (such as a friend request).
Here is the structure I think would be a minimum to have a decent notification system.
users
-------------
id
username
password
email
notifications
-------------
id
user_id (fk)
notification_type (enum)
notification_status (enum)
notification_action (link)
notification_text
date_created (timestamp)
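As a rough sketch of the status column, here is a version in Python with sqlite3. SQLite has no ENUM type, so a CHECK constraint stands in for it; the set of allowed values and the set_status helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE notifications (
        id                  INTEGER PRIMARY KEY,
        user_id             INTEGER NOT NULL,
        notification_status TEXT NOT NULL DEFAULT 'unread'
            CHECK (notification_status IN
                   ('unread', 'read', 'archived', 'deleted')),
        notification_text   TEXT NOT NULL
    )
""")
conn.execute(
    "INSERT INTO notifications (user_id, notification_text) VALUES (1, 'Hi')"
)

def set_status(notification_id, status):
    """Update the status; False if the database rejects an unknown value."""
    try:
        with conn:
            conn.execute(
                "UPDATE notifications SET notification_status = ? WHERE id = ?",
                (status, notification_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Adding a new status later means changing the constraint (or the ENUM definition in MySQL) in one place, rather than adding another boolean column.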
The tables are as follows:
User
userId (Integer)
fullName(VarChar)
Notification
notificationId (Integer)
creationDate (Date)
notificationDetailUrl (VarChar)
isRead (boolean)
description (VarChar)
userId (F.K)