I'm building a database for what will soon be my version of a social networking site. I'd like to store friend relations, sort of like Facebook does. I should mention that I'm using MySQL for this.
I'm thinking of doing something like this:
CREATE TABLE UserFriends
(
UserFriendID SOME_DATA_TYPE NOT NULL AUTO_INCREMENT PRIMARY KEY,
UserID BIGINT(20) UNSIGNED NOT NULL,
FriendID BIGINT(20) UNSIGNED NOT NULL -- This is basically the same as UserID
)Engine=InnoDB;
Now, I'm looking for a data type to use for this table's primary key. I expect there will be a ton of records, and I'd like some kind of indexing to speed up the look-ups I might do on them, such as for a friend suggestion feature.
I'm open to suggestions. Another option, in my opinion, is to dynamically create a separate table for each user and store their friends in it, but that would be much more difficult to manage and sort of a nightmare code-wise.
If you do something like this
create table UserFriends
(
UserFriendID SOME_DATA_TYPE NOT NULL AUTO_INCREMENT PRIMARY KEY,
UserID BIGINT(20) UNSIGNED NOT NULL,
FriendID BIGINT(20) UNSIGNED NOT NULL -- This is basically the same as UserID
) Engine=InnoDB;
then you'll probably end up with data that looks like this.
UserFriendID  UserID  FriendID
------------  ------  --------
1             100     201
2             100     201
3             201     100
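The problem with that should be obvious: rows 1 and 2 are exact duplicates, and row 3 records the same friendship again with the ids swapped.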
If you don't need to know who friended whom, then something like this would make more sense. (Standard SQL, not MySQL.)
create table UserFriends (
UserID BIGINT(20) UNSIGNED NOT NULL,
FriendID BIGINT(20) UNSIGNED NOT NULL,
primary key (UserID, FriendID),
check (UserID < FriendID),
foreign key (UserID) references users (UserID),
foreign key (FriendID) references users (UserID)
);
The primary key constraint guarantees that you don't have multiple identical rows for a single "friendship". The check() constraint guarantees that you don't have two rows, differing only in the order of the id numbers, for a single "friendship".
But MySQL versions before 8.0.16 parse check() constraints without enforcing them, so on those versions you'll have to write a trigger to make sure that UserID is less than FriendID.
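The same guarantee can also be approached from application code by always storing the smaller id first. The following is a minimal sketch, not the answer's own method: it uses Python's built-in sqlite3 module as a stand-in for MySQL (SQLite does enforce CHECK), and the helper names add_friendship and are_friends are hypothetical.

```python
import sqlite3


def add_friendship(conn, a, b):
    """Store one row per friendship, smaller id first (hypothetical helper)."""
    if a == b:
        raise ValueError("a user cannot befriend themselves")
    low, high = min(a, b), max(a, b)
    # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    conn.execute(
        "INSERT OR IGNORE INTO UserFriends (UserID, FriendID) VALUES (?, ?)",
        (low, high),
    )


def are_friends(conn, a, b):
    """Look the pair up in the same canonical order it was stored in."""
    low, high = min(a, b), max(a, b)
    row = conn.execute(
        "SELECT 1 FROM UserFriends WHERE UserID = ? AND FriendID = ?",
        (low, high),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE UserFriends (
        UserID   INTEGER NOT NULL,
        FriendID INTEGER NOT NULL,
        PRIMARY KEY (UserID, FriendID),
        CHECK (UserID < FriendID)
    )
    """
)

# All three calls describe the same friendship; only one row is stored.
add_friendship(conn, 100, 201)
add_friendship(conn, 201, 100)
add_friendship(conn, 100, 201)
friendship_count = conn.execute("SELECT COUNT(*) FROM UserFriends").fetchone()[0]
```

Normalizing the pair in one place means every reader and writer agrees on which column holds which id, whether or not the database itself enforces the ordering.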
Use the same pattern, BIGINT(20).
Avoid a table per user like the plague :)
Just use INT. There are lots of ways to optimize performance; choosing an unusual primary key data type is not one of them.
Don't create one table per user. If you really have a lot of users, you can split them by some shard key later when you know where your bottlenecks are.
If you expect to have enough records to fill an INT, MySQL is not the right solution, especially for recommendations, multi-level friend-of-friend-of-friend queries, etc. This might be better suited to one of the graph databases out there. Neo4j is a good example, designed specifically for social networks. Check out http://neo4j.org; it might be a good alternative. You don't have to get rid of MySQL; it will most likely be a hybrid approach.
Consider two tables, User and UserDetails:
User (UserID,Name,Password)
UserDetails(UserID,FullName, Mobile Number,EMail)
First I will enter details into the User table.
Afterwards I wish to enter details into the UserDetails table using the primary key of the first table, i.e., UserID, which is auto-incremented.
Consider this scenario:
User: (101, abc, xyz), (102,asd,war)
Now I want to store details in the second table for the row whose primary key is UserID = 102.
How can I accomplish this?
Start over with the design. Here is a start that runs through and doesn't blow up. Do the same for email. Keep data normalized and don't cause unnecessary lookups. Having a lot of constraints is a sign that you care about the quality of your data. That's not to say you don't care when constraints are missing, if the data simply can't be constrained.
We all read on the internet how we should keep main info in one table and details in another. That's nice as a broad brush stroke, but your case does not rise to that level; you would end up with way too many tables. See Note1 at the bottom about Entities. See Note2 at the bottom about performance. Ask any of us any broad or specific question you may have.
create table user
( userId int auto_increment primary key,
fullName varchar(100) not null
-- other columns
);
create table phoneType
( phoneType int auto_increment primary key, -- here is the code
long_description varchar(100) not null
-- other columns
);
create table userPhone
( id int auto_increment primary key,
userId int not null,
phone varchar(20) not null,
phoneType int not null,
-- other columns
CONSTRAINT fk_up_2_user FOREIGN KEY (userId) REFERENCES user(userId),
CONSTRAINT fk_up_2_phoneType FOREIGN KEY (phoneType) REFERENCES phoneType(phoneType)
);
Note1:
I suspect that your second table as you call it is really a third table, as you try to bring in missing information that really belongs in the Entity.
Entities
Many have come before you crafting our ideas as we slug it out in design. Many bad choices have been made and by yours truly. A good read is third normal form (3NF) for data normalization techniques.
Note2:
Performance. Performance needs to be measured both in real-time user and in developer problem solving of data that has run amok. Many developers spend significant time doing data patches for schemas that did not enforce data integrity. So factor that into performance, because those hours add up in those split seconds of User Experience (UX).
You can try this:
-- 'Mob_No' and 'EMail' are placeholder values; columns are filled in table order
INSERT INTO UserDetails
SELECT UserID, Name, 'Mob_No', 'EMail' FROM User WHERE UserID = 102;
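When the details are inserted right after the user row, the generated key can be read back from the same connection instead of being looked up again. Here is a rough sketch using Python's built-in sqlite3 module as a stand-in for MySQL; cursor.lastrowid plays the role that LAST_INSERT_ID() plays in MySQL. The create_user helper, the MobileNumber column name and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE User (
        UserID   INTEGER PRIMARY KEY AUTOINCREMENT,
        Name     TEXT NOT NULL,
        Password TEXT NOT NULL
    );
    CREATE TABLE UserDetails (
        UserID       INTEGER PRIMARY KEY REFERENCES User (UserID),
        FullName     TEXT,
        MobileNumber TEXT,
        EMail        TEXT
    );
    """
)


def create_user(conn, name, password, full_name, mobile, email):
    """Insert a User row, then a UserDetails row keyed by the generated UserID."""
    with conn:  # commit both inserts together, or roll both back on error
        cur = conn.execute(
            "INSERT INTO User (Name, Password) VALUES (?, ?)",
            (name, password),
        )
        user_id = cur.lastrowid  # key generated by the insert above
        conn.execute(
            "INSERT INTO UserDetails (UserID, FullName, MobileNumber, EMail) "
            "VALUES (?, ?, ?, ?)",
            (user_id, full_name, mobile, email),
        )
    return user_id


first_id = create_user(conn, "abc", "xyz", "Full Name One", "555-0100", "abc@example.com")
second_id = create_user(conn, "asd", "war", "Full Name Two", "555-0101", "asd@example.com")
details = conn.execute(
    "SELECT UserID, FullName FROM UserDetails WHERE UserID = ?",
    (second_id,),
).fetchone()
```

Doing both inserts in one transaction avoids a User row being left without its details if the second insert fails.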
Straight into this one. I have a table for a sort of "like" feature. This table naturally has the following:
Name | Type | Attributes | (Comment)
Post ID | int | index | ID of the post which was "Liked"
Topic ID | int | index | ID of the topic which contains the "Liked" post
Member ID | int | index | ID of the member who "Liked" the post
Date | bigint | index | Date/time of "Like"
As you can see, there's no primary key. This seems natural. The only functions which need performing are the INSERT (for "Like"), DELETE (for "Unlike") and searching for likes in order of most recent by the post or member who gave them.
Each entry will obviously be very 'UNIQUE' - as only one like is needed per person per post. There seems absolutely no need for a unique primary index, as if duplicates occur (somehow) I will want to DELETE them all, not just one with a particular ID. Same with insertion, no one can like the same thing twice. And these "likes" will only ever be selected using the indexes from other tables.
Yet, phpMyAdmin now forbids me from any manual editing, copying or deleting. This is also fine, but it prompted me to look further into the logistics of not having a primary key. When I found a Stack Overflow answer, the general opinion was that it's "very rare" to not need a primary key.
So, either I've found one of these very rare moments, or it's not that rare at all. My scenario seems quite simple and common, so there should be a more definite answer. Everything seems natural this way; I will never ever need to actually use a primary key. Therefore, I'd think it'd be simpler not to have one. Are there any really mysterious (and somewhat magical) ways of MySQL I'm overlooking? Or am I safe to leave out a useless auto-incrementing primary ID key (which could reach its limit way before any of the currently used IDs would, anyway), at least until such time as I find a use for it (never)?
You've said that Post ID and Member ID define the uniqueness of a row (and that Topic ID is secondary, included only for convenience).
So, why not have a primary key on (Post ID, Member ID)? If you already have UNIQUEness constraints on them, then this is not a big leap.
CREATE TABLE `Likes` (
`PostID` INT UNSIGNED NOT NULL,
`TopicID` INT UNSIGNED NOT NULL,
`MemberID` INT UNSIGNED NOT NULL,
`Date` DATETIME NOT NULL,
PRIMARY KEY (`PostID`, `MemberID`),
FOREIGN KEY (`PostID`) REFERENCES `Posts` (`ID`) ON DELETE CASCADE,
FOREIGN KEY (`MemberID`) REFERENCES `Members` (`ID`) ON DELETE CASCADE
) Engine=InnoDB;
(I don't know enough about TopicID to suggest key constraints for it, but you may wish to add some.)
Certainly adding an arbitrary auto-incrementing field is pointless, but that doesn't mean that you can't have a meaningful primary key.
As an aside, I'd consider removing the TopicID field; if you have your foreign keys set up properly then it should be trivial to do post<->topic lookup without it, and in this instance you're duplicating data and violating the relational model!
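To see how a composite primary key covers the "Like", "Unlike" and "most recent" operations from the question, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The helper names, the extra index name and the integer timestamps are assumptions for illustration; TopicID is left out, in line with the aside above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Likes (
        PostID   INTEGER NOT NULL,
        MemberID INTEGER NOT NULL,
        Date     INTEGER NOT NULL,
        PRIMARY KEY (PostID, MemberID)
    )
    """
)
# Secondary index (hypothetical name) for "recent likes by member" look-ups.
conn.execute("CREATE INDEX idx_likes_member_date ON Likes (MemberID, Date)")


def like(conn, post_id, member_id, when):
    """Return True if the like was stored, False if it already existed."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Likes (PostID, MemberID, Date) VALUES (?, ?, ?)",
                (post_id, member_id, when),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejects a second like for the same post and member.
        return False


def unlike(conn, post_id, member_id):
    with conn:
        conn.execute(
            "DELETE FROM Likes WHERE PostID = ? AND MemberID = ?",
            (post_id, member_id),
        )


def recent_likes_by_member(conn, member_id, limit=10):
    rows = conn.execute(
        "SELECT PostID FROM Likes WHERE MemberID = ? ORDER BY Date DESC LIMIT ?",
        (member_id, limit),
    ).fetchall()
    return [post_id for (post_id,) in rows]


first = like(conn, 7, 42, 1000)
duplicate = like(conn, 7, 42, 2000)  # same post, same member
like(conn, 8, 42, 3000)
recent = recent_likes_by_member(conn, 42)
```

With the key in place, the "delete all duplicates" scenario from the question cannot arise, because a duplicate never gets stored.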
I'm trying to achieve a "One to one" relationship in a MySQL database. For example, let's say I have a Users table and an Accounts table. I want to be sure that a User can have only one Account, and that there can be only one Account per User.
I found two solutions for this, but I don't know which to use or whether there are any other options.
First solution:
DROP DATABASE IF EXISTS test;
CREATE DATABASE test CHARSET = utf8 COLLATE = utf8_general_ci;
USE test;
CREATE TABLE users(
id INT NOT NULL AUTO_INCREMENT,
user_name VARCHAR(45) NOT NULL,
PRIMARY KEY(id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8;
CREATE TABLE accounts(
id INT NOT NULL AUTO_INCREMENT,
account_name VARCHAR(45) NOT NULL,
user_id INT UNIQUE,
PRIMARY KEY(id),
FOREIGN KEY(user_id) REFERENCES users(id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8;
In this example, I define the foreign key in accounts pointing to the primary key in users.
And then I make foreign key UNIQUE, so there can't be two identical users in accounts.
To join tables I would use this query:
SELECT * FROM users JOIN accounts ON users.id = accounts.user_id;
Second solution:
DROP DATABASE IF EXISTS test;
CREATE DATABASE test CHARSET = utf8 COLLATE = utf8_general_ci;
USE test;
CREATE TABLE users(
id INT NOT NULL AUTO_INCREMENT,
user_name VARCHAR(45) NOT NULL,
PRIMARY KEY(id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8;
CREATE TABLE accounts(
id INT NOT NULL AUTO_INCREMENT,
account_name VARCHAR(45) NOT NULL,
PRIMARY KEY(id),
FOREIGN KEY(id) REFERENCES users(id)
) ENGINE = InnoDB DEFAULT CHARSET = utf8;
In this example, I create a foreign key that points from the primary key to a primary key in another table. Since Primary Keys are UNIQUE by default, this makes this relation One to One.
To join tables I can use this:
SELECT * FROM users JOIN accounts ON users.id = accounts.id;
Now the questions:
What is the best way to create One to One relation in MySQL?
Are there any other solutions other than these two?
I'm using MySQL Workbench, and when I design One To One relation in EER diagram and let MySQL Workbench produce SQL code, I get One to Many relation :S That's what's confusing me :S
And if I import any of these solutions into MySQL Workbench EER diagram, it recognizes relations as One to Many :S That's also confusing.
So, what would be the best way to define One to One relation in MySQL DDL? And what options are there to achieve this?
"Since Primary Keys are UNIQUE by default, this makes this relation One to One."
No, that makes the relation "one to zero or one". Is that what you actually need?
If yes, then your "second solution" is better:
it's simpler,
takes less storage (note 1), and therefore makes the cache effectively "larger",
has fewer indexes to maintain (note 2), which benefits data manipulation,
and (since you are using InnoDB) naturally clusters the data, so users that are close together will have their accounts stored close together as well, which may benefit cache locality and certain kinds of range scans.
BTW, you'll need to make accounts.id an ordinary integer (not auto-increment) for this to work.
If no, see below...
"What is the best way to create One to One relation in MySQL?"
Well, "best" is an overloaded word, but the "standard" solution would be the same as in any other database: put both entities (user and account in your case) in the same physical table.
"Are there any other solutions other than these two?"
Theoretically, you could make circular FKs between the two PKs, but that would require deferred constraints to resolve the chicken-and-egg problem, which are unfortunately not supported under MySQL.
"And if I import any of these solutions into MySQL Workbench EER diagram, it recognizes relations as One to Many :S That's also confusing."
I don't have much practical experience with that particular modeling tool, but I'm guessing that's because it is "one to many" where "many" side was capped at 1 by making it unique. Please remember that "many" doesn't mean "1 or many", it means "0 or many", so the "capped" version really means "0 or 1".
1 Not just in the storage expense for the additional field, but for the secondary index as well. And since you are using InnoDB which always clusters tables, beware that secondary indexes are even more expensive in clustered tables than they are in heap-based tables.
2 InnoDB requires indexes on foreign keys.
Your first approach creates two candidate keys in the accounts table: id and user_id.
I therefore suggest the second approach i.e. using the foreign key as the primary key. This:
uses one less column
allows you to uniquely identify each row
allows you to match account with user
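As a rough illustration of what using the foreign key as the primary key buys you, here is a sketch built on Python's sqlite3 module standing in for MySQL/InnoDB. The create_account helper and the sample names are hypothetical. It shows that a user ends up with zero or one account, never two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(
    """
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,
        user_name TEXT NOT NULL
    );
    CREATE TABLE accounts (
        -- Not auto-generated: the value is always the owning user's id.
        id           INTEGER PRIMARY KEY REFERENCES users (id),
        account_name TEXT NOT NULL
    );
    """
)


def create_account(conn, user_id, account_name):
    """Return True if the account was created, False if a constraint blocked it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO accounts (id, account_name) VALUES (?, ?)",
                (user_id, account_name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


with conn:
    alice = conn.execute("INSERT INTO users (user_name) VALUES ('alice')").lastrowid
    bob = conn.execute("INSERT INTO users (user_name) VALUES ('bob')").lastrowid

created = create_account(conn, alice, "main")
second_for_alice = create_account(conn, alice, "spare")  # primary key violation
for_missing_user = create_account(conn, 999, "ghost")    # foreign key violation

# bob has no account yet: the relation is "one to zero or one".
pairs = conn.execute(
    "SELECT u.user_name, a.account_name "
    "FROM users u LEFT JOIN accounts a ON a.id = u.id ORDER BY u.id"
).fetchall()
```

The LEFT JOIN is what exposes the "zero" side: users without an account still appear, with an empty account column.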
What about the following approach
Create Table user
CREATE TABLE `user` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(45) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Create Table account
CREATE TABLE `account` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(45) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Create Table user2account with a primary key on user_id and account_id, a unique index on each of user_id and account_id, and foreign key relations to user and account
CREATE TABLE `user2account` (
`user_id` int(11) NOT NULL,
`account_id` int(11) NOT NULL,
PRIMARY KEY (`user_id`,`account_id`),
UNIQUE KEY `FK_account_idx` (`account_id`),
UNIQUE KEY `FK_user_idx` (`user_id`),
CONSTRAINT `FK_account` FOREIGN KEY (`account_id`) REFERENCES `account` (`id`),
CONSTRAINT `FK_user` FOREIGN KEY (`user_id`) REFERENCES `user` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
While this solution has the largest footprint in the database, there are some advantages.
Putting the foreign key in either the user table or the account table is something I would expect to signal a one to many relation (user has many accounts ...).
While this user2account approach is mainly used to define a many to many relationship, adding a UNIQUE constraint on user_id and on account_id prevents anything other than a one to one relation.
The main advantage I see in this solution is that you can divide the work among different code layers or departments in a company:
Department A is responsible for creating users; this is possible even without write permission to the account table.
Department B is responsible for creating accounts; this is possible even without write permission to the user table.
Department C is responsible for creating the mapping; this is possible even without write permission to the user or account table.
Once Department C has created a mapping, neither the user nor the account can be deleted by Department A or B without asking Department C to delete the mapping first.
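The behaviour described for the mapping table can be checked with a short sketch. It uses Python's sqlite3 module as a stand-in for MySQL/InnoDB; the attempt helper and the sample ids are hypothetical, and the tables are trimmed to the columns that matter here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(
    """
    CREATE TABLE user    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user2account (
        user_id    INTEGER NOT NULL UNIQUE REFERENCES user (id),
        account_id INTEGER NOT NULL UNIQUE REFERENCES account (id),
        PRIMARY KEY (user_id, account_id)
    );
    INSERT INTO user (id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO account (id, name) VALUES (10, 'acct-a'), (20, 'acct-b');
    """
)


def attempt(conn, sql, params=()):
    """Run one statement; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


insert_mapping = "INSERT INTO user2account (user_id, account_id) VALUES (?, ?)"
mapped = attempt(conn, insert_mapping, (1, 10))
second_account = attempt(conn, insert_mapping, (1, 20))  # user 1 is already mapped
second_user = attempt(conn, insert_mapping, (2, 10))     # account 10 is already mapped

# While the mapping exists, the user row cannot be deleted.
delete_blocked = not attempt(conn, "DELETE FROM user WHERE id = ?", (1,))
attempt(conn, "DELETE FROM user2account WHERE user_id = ?", (1,))
delete_after_unmap = attempt(conn, "DELETE FROM user WHERE id = ?", (1,))
```

The two UNIQUE constraints do the one to one work; the foreign keys are what force the mapping to be removed before either side can be deleted.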
I was wondering what else I should add to my friends table. How can I stop a user from adding the same friend twice, and how can I record when the user is online? Any other suggestions on how I can make my friends table better are welcome.
Here is my MySQL table.
CREATE TABLE friends (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
user_id INT UNSIGNED NOT NULL,
friend_id INT UNSIGNED NOT NULL,
PRIMARY KEY (id)
);
You don't need an id on a pivot table. You could define your primary key as PRIMARY KEY(user_id, friend_id), which would prevent duplicates.
IMO there's no need for the id attribute.
You can add a timestamp, which can be useful sometimes.
Create a composite key on user_id and friend_id together; the pair will then be unique, which prevents you from creating the same tuple twice.
Well, if this is about the table design, you could make the combination of user_id and friend_id a unique key, or maybe make all three fields the primary key; not such good practice, but it works.
The other option would be to control this from PHP or whatever language you have already chosen to connect with.
Let me know if this helps.
I am writing a data warehouse, using MySQL as the back-end. I need to partition a table based on two integer IDs and a name string. I have read (parts of) the MySQL documentation regarding partitioning, and it seems the most appropriate partitioning scheme in this scenario would be either HASH or KEY partitioning.
I have elected for KEY partitioning because I (chickened out and) don't want to be responsible for providing a 'collision free' hashing algorithm for my fields. Instead, I am relying on MySQL's hashing to generate the keys required for partitioning.
I have included below a snippet of the schema of the table that I would like to partition based on the COMPOSITE of the following fields:
school_id, course_id, ssname (student surname).
BTW, before anyone points out that this is not the best way to store school related information, I'll have to point out that I am only using the case below as an analogy to what I am trying to model.
My Current CREATE TABLE statement looks like this:
CREATE TABLE foobar (
id int UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
school_id int UNSIGNED NOT NULL,
course_id int UNSIGNED NOT NULL,
ssname varchar(64) NOT NULL,
/* some other fields */
FOREIGN KEY (school_id) REFERENCES school(id) ON DELETE RESTRICT ON UPDATE CASCADE,
FOREIGN KEY (course_id) REFERENCES course(id) ON DELETE RESTRICT ON UPDATE CASCADE,
INDEX idx_fb_si (school_id),
INDEX idx_fb_ci (course_id),
CONSTRAINT UNIQUE INDEX idx_fb_scs (school_id,course_id,ssname(16))
) ENGINE=innodb;
I would like to know how to modify the statement above so that the table is partitioned using the three fields I mentioned at the beginning of this question (namely school_id, course_id and the starting letter of the student's surname).
Another question I would like to ask is this:
What happens in 'edge' situations? For example, if I attempt to insert a record that contains a valid* school_id, course_id or surname for which no underlying partitioned table file exists, will MySQL automatically create the underlying file?
Case in point: I have the following schools: New York Kindergarten, Belfast Elementary; and the following courses: Lie Algebra in Infinitesimal Dimensions, Entangled Entities.
Also assume I have the following students (surnames): Bush, Blair, Hussein
When I add a new school (or course, or student), can I insert them into the foobar table? (Actually, I can't think why not.) The reason I ask is that I foresee adding more schools and courses etc., which means that MySQL will have to create additional tables behind the scenes (as the hash will generate new keys).
I will be grateful if someone with experience in this area can confirm (preferably with links backing their assertion) that my understanding (i.e. no manual administration is required if I add new schools, courses or students to the database) is correct.
I don't know if my second question was well formed (clear) or not. If not, I will be glad to clarify further.
*VALID - by valid, I mean that it is valid in terms of not breaking referential integrity.
I doubt partitioning is as useful as you think. That said, there are a couple of other problems with what you're asking for (note: the entirety of this answer applies to MySQL 5; version 6 might be different):
Columns used in KEY partitioning must be part of the primary key. school_id, course_id and ssname are not part of the primary key.
More generally, every UNIQUE key (including the primary key) must include all columns in the partitioning expression. This means you can only partition on the intersection of the columns in the UNIQUE keys. In your example, the intersection is empty.
Most partitioning schemes (other than KEY) require integer or null values. If not NULL, ssname will not be an integer value.
Foreign keys and partitioning aren't supported simultaneously. This is a strong argument not to use partitioning.
Fortunately, collision free hashing is one thing you don't need to worry about, because partitioning is going to result in collisions (otherwise, you'd only have a single row in each partition). If you could ignore the above problems as well as the limitations on functions used in partitioning expressions, you could create a HASH partition with:
CREATE TABLE foobar (
...
) ENGINE=innodb
PARTITION BY HASH (school_id + course_id + ORD(ssname))
PARTITIONS 2
;
What should work is:
CREATE TABLE foobar (
id int UNSIGNED NOT NULL AUTO_INCREMENT,
school_id int UNSIGNED NOT NULL,
course_id int UNSIGNED NOT NULL,
ssname varchar(64) NOT NULL,
/* some other fields */
PRIMARY KEY (id, school_id, course_id),
INDEX idx_fb_si (school_id),
INDEX idx_fb_ci (course_id),
CONSTRAINT UNIQUE INDEX idx_fb_scs (school_id,course_id,ssname)
) ENGINE=innodb
PARTITION BY HASH (school_id + course_id)
PARTITIONS 2
;
or:
CREATE TABLE foobar (
id int UNSIGNED NOT NULL AUTO_INCREMENT,
school_id int UNSIGNED NOT NULL,
course_id int UNSIGNED NOT NULL,
ssname varchar(64) NOT NULL,
/* some other fields */
PRIMARY KEY (id, school_id, course_id, ssname),
INDEX idx_fb_si (school_id),
INDEX idx_fb_ci (course_id),
CONSTRAINT UNIQUE INDEX idx_fb_scs (school_id,course_id,ssname)
) ENGINE=innodb
PARTITION BY KEY (school_id, course_id, ssname)
PARTITIONS 2
;
As for the files that store tables, MySQL will create them, though it may do so when you define the table rather than when rows are inserted into it. You don't need to worry about how MySQL manages files. Remember, there are a limited number of partitions, defined when you create the table by the PARTITIONS *n* clause.
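To make the "limited number of partitions" point concrete: with PARTITION BY HASH, MySQL picks the partition as the expression's value modulo the number of partitions, so new schools or courses always land in one of the existing partitions. The Python sketch below mirrors that rule for the HASH (school_id + course_id) example; the function name and sample ids are made up for illustration. KEY partitioning uses MySQL's own internal hash instead, but it also maps into the same fixed set of partitions.

```python
def hash_partition(school_id, course_id, num_partitions=2):
    """Partition number for PARTITION BY HASH (school_id + course_id)."""
    return (school_id + course_id) % num_partitions


# Existing and brand-new (school_id, course_id) pairs alike fall into 0 or 1.
rows = [(1, 1), (1, 2), (2, 2), (1000, 2001)]
assignments = {row: hash_partition(*row) for row in rows}
```

No new partition is ever needed for a new id; rows simply share partitions, which is why collisions are expected rather than a problem.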