add friends logic in database with efficiency - mysql

I'm looking for a way to store a friends list in a database. I'm thinking of storing the friends' user IDs in an array.
For example, user A's row in the friends table would contain an array of friend user IDs such as 1,2,4,6,77,44, etc.
I want to know whether this is an efficient way of doing it. If not, what design should ideally be used for a large community?

You likely need a separate many-to-many join table. This works when both the user and their friends reside in the same user table. The table could look like this:
user_id - id of user the friend lookup is being done for, is foreign key to user_id field in user table
friend_id - id of friend associated with user, also is a foreign key to user_id field in user table
You would have a compound primary key across both fields, ensuring that each user_id to friend_id combination is unique.
This would be sample CREATE TABLE statement:
CREATE TABLE `friends` (
`user_id` INT(11) NOT NULL,
`friend_id` INT(11) NOT NULL,
PRIMARY KEY (`user_id`, `friend_id`)
)
ENGINE = INNODB;
And sample data may look like this:
INSERT INTO `friends` (`user_id`, `friend_id`)
VALUES
(1, 2), (1, 3), (1, 4), (2, 1), (2, 10), (4, 1), (4, 20);
Then say you wanted to do a lookup of all the user table data for the friends associated with a particular user (i.e. the logged in user). You could query that data like this:
SELECT users.*
FROM users
INNER JOIN friends ON users.user_id = friends.friend_id
WHERE friends.user_id = [The ID of current user]
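The join-table lookup above can be tried end to end with a small sketch. It assumes Python's built-in sqlite3 module with an in-memory SQLite database standing in for MySQL; the name column, the sample user names, and the friends_of helper are hypothetical and only there for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE friends (
        user_id   INTEGER NOT NULL REFERENCES users(user_id),
        friend_id INTEGER NOT NULL REFERENCES users(user_id),
        PRIMARY KEY (user_id, friend_id)   -- each pair can be stored only once
    );
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat'), (4, 'dan');
    INSERT INTO friends VALUES (1, 2), (1, 3), (1, 4), (2, 1), (4, 1);
""")

def friends_of(conn, user_id):
    """Return (user_id, name) rows for every friend of the given user."""
    return conn.execute(
        "SELECT users.user_id, users.name FROM users "
        "INNER JOIN friends ON users.user_id = friends.friend_id "
        "WHERE friends.user_id = ? ORDER BY users.user_id",
        (user_id,),  # bound parameter instead of string concatenation
    ).fetchall()

print(friends_of(conn, 1))
```

Because the lookup filters on the leading column of the compound primary key, it can be answered from the key's index rather than by parsing a list stored in a single field.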

No, no, no!
Don't store multiple values in one database field. It will cause you a lot of problems.
You could use a structure like this, for instance:
friends (user_id, user_id)
which indicates 2 friends.

Related

Inserting into one table and using the id to insert into other table

I am using MySql, and I want to implement a query.
I have 5 Tables and in MySql they look like this.
Table1- Site:
Site_ID
domain_name
site_name
Table2- Locations:
site_id (Same as from Site)
Table3- Users:
user_id (AI primary key)
site_id
Table4- Users_Roles:
role_id(AI Primary key)
site_id
Table5- Users_Addresss:
user_address_id(AI Primary Key)
user_id (Same as from Users)
site_id
With one single query, I want to insert into all of these tables. My Database is normalized
I am not able to think of the query that would do the operation.
I will be using this query in a php file and trigger it with the ajax.
First you need to insert a record into Site table
INSERT INTO Site (domain_name,site_name) VALUES ('www.google.com', 'Test site');
Then assign the last insert id i.e. Site_id into a variable like below
SET @site_id = LAST_INSERT_ID(); -- This is the Site_id
INSERT INTO Users (site_id) VALUES (@site_id);
Now do the same for all the tables.
Thanks
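As a rough illustration of the same chain from application code: the sketch below assumes Python's sqlite3 module with an in-memory SQLite database in place of MySQL, where cursor.lastrowid plays the role of LAST_INSERT_ID(). Only three of the five tables are shown, the column lists are trimmed, and create_site_with_user is a hypothetical helper. The inserts are wrapped in one transaction so that either all rows are written or none.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; table names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Site (Site_ID INTEGER PRIMARY KEY AUTOINCREMENT,
                       domain_name TEXT, site_name TEXT);
    CREATE TABLE Users (user_id INTEGER PRIMARY KEY AUTOINCREMENT,
                        site_id INTEGER NOT NULL);
    CREATE TABLE Users_Addresss (user_address_id INTEGER PRIMARY KEY AUTOINCREMENT,
                                 user_id INTEGER NOT NULL,
                                 site_id INTEGER NOT NULL);
""")

def create_site_with_user(conn, domain_name, site_name):
    """Insert a site, then a user and an address row that reference it."""
    # "with conn" commits on success and rolls back if any insert raises.
    with conn:
        cur = conn.execute(
            "INSERT INTO Site (domain_name, site_name) VALUES (?, ?)",
            (domain_name, site_name),
        )
        site_id = cur.lastrowid  # generated key of the Site row
        cur = conn.execute("INSERT INTO Users (site_id) VALUES (?)", (site_id,))
        user_id = cur.lastrowid  # generated key of the Users row
        conn.execute(
            "INSERT INTO Users_Addresss (user_id, site_id) VALUES (?, ?)",
            (user_id, site_id),
        )
    return site_id, user_id

ids = create_site_with_user(conn, "www.google.com", "Test site")
print(ids)
```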

List of user data in database?

Suppose I wanted to have a table of users for people who collect various rare video games. Further suppose I made a model for a list of video games that can be created and saved by those users. How would I save the list in the database?
For example, my thoughts are there would be a cell in the user table called game_list which is a comma separated list of game('s). The games being a separate table with static game data and statistics which the users can pick from to create their rare game list.
Is this the best way? BTW, I am using rails, in case of specific solutions. I know this question is rather general, but I have a hard time phrasing the question to google and this site to get the answer I am looking for. I'm quite new to web development (SQL and HTML5 parts mostly), but not programming (been programming for a long time).
Thank you!
This is a many-to-many relationship. One usually models this with a table for user information, a table for game information, and a table for each user-game relation:
create table user(
user_id int primary key,
user_name varchar(255));
create table game(
game_id int primary key,
game_name varchar(255));
create table user_game(
user_id int not null references user(user_id),
game_id int not null references game(game_id));
insert into user values (1, "ed"), (2, "bob");
insert into game values (1, "pacman"), (2, "poker");
insert into user_game values (1, 1), (1, 2), (2, 2);
select user_name, game_name
from user
natural join user_game
natural join game;
http://sqlfiddle.com/#!2/6b637/1
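One practical benefit of the relation table is that the reverse question ("which users have this game?") is an ordinary join. The following is a minimal sketch, assuming Python's sqlite3 module with an in-memory SQLite database instead of MySQL; owners_of is a hypothetical helper, and the compound primary key on user_game is an addition not present in the schema above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; data mirrors the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE game (game_id INTEGER PRIMARY KEY, game_name TEXT);
    CREATE TABLE user_game (
        user_id INTEGER NOT NULL REFERENCES user(user_id),
        game_id INTEGER NOT NULL REFERENCES game(game_id),
        PRIMARY KEY (user_id, game_id)   -- assumption: one row per user-game pair
    );
    INSERT INTO user VALUES (1, 'ed'), (2, 'bob');
    INSERT INTO game VALUES (1, 'pacman'), (2, 'poker');
    INSERT INTO user_game VALUES (1, 1), (1, 2), (2, 2);
""")

def owners_of(conn, game_name):
    """Return the names of all users whose list contains the given game."""
    rows = conn.execute(
        "SELECT user_name FROM user "
        "JOIN user_game USING (user_id) "
        "JOIN game USING (game_id) "
        "WHERE game_name = ? ORDER BY user_name",
        (game_name,),
    ).fetchall()
    return [name for (name,) in rows]

print(owners_of(conn, "poker"))
```

With a comma-separated or serialized game_list column, the same question would require reading and parsing every user's list.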
One thing you could do is make a form so the user can choose which games they want to store. Then you take the games the user has selected and put them in an array.
For each game in the array, you query the table in your database that contains the game data, get its id, and add it to an array of ids. Then, in PHP for example, you use $game_list = serialize($array) (http://php.net/manual/en/function.serialize.php) and update the user table with the value you got from serialize().
If you then want to get the data back as an array, you use $game_list = unserialize($row).

Defining a two-way link

I have a users table, and I want to define a "friends" relationship between two arbitrary users.
Up until now, I've used two different methods for this:
The friends table contains user1 and user2. Searching for users involves a query that looks like
... WHERE @userid IN (`user1`,`user2`), which is not terribly efficient
The friends table contains from and to fields. Initiating a friend request creates a row in that direction, and if it is accepted then a second row is inserted with the opposite direction. There is additionally a status column that indicates that this has happened, making the search something like:
... WHERE `user1`=@userid AND `status`=1
I'm not particularly satisfied with either of these solutions. The first one feels messy with that IN usage, and the second seems bloated having two rows to define a single link.
So that's why I'm here. What would you suggest for such a link? Note that I don't need any more information saved with it, I just need two user IDs associated with each other, and preferably some kind of status like ENUM('pending','accepted','blocked'), but that's optional depending on what the best design for this is.
There are in general two approaches:
Store each friend pair once, storing the friend with the least id first.
CREATE TABLE
friend
(
l INT NOT NULL,
g INT NOT NULL,
PRIMARY KEY
(l, g),
KEY (g)
)
Store each friend pair twice, both ways:
CREATE TABLE
friend
(
user INT NOT NULL,
friend INT NOT NULL,
PRIMARY KEY
(user, friend)
)
To store additional fields like friendship status, acceptance dates etc. you usually utilize a second table, for reasons I'll describe below.
To retrieve a list of friends for each user, you do:
SELECT CASE @myuserid WHEN l THEN g ELSE l END
FROM friend
WHERE l = @myuserid
OR
g = @myuserid
or
SELECT g
FROM friend
WHERE l = @myuserid
UNION
SELECT l
FROM friend
WHERE g = @myuserid
for the first solution; and
SELECT friend
FROM friend
WHERE user = @myuserid
for the second.
To check if two users are friends, you issue this:
SELECT NULL
FROM friend
WHERE (l, g) =
(
CASE WHEN @user1 < @user2 THEN @user1 ELSE @user2 END,
CASE WHEN @user1 > @user2 THEN @user1 ELSE @user2 END
)
or
SELECT NULL
FROM friend
WHERE (user, friend) = (@user1, @user2)
Storage-wise, the two solutions are almost the same. The first (least/greatest) solution stores half as many rows; however, for it to work fast you need a secondary index on g, which in fact has to store g plus the part of the table's primary key that is not in the secondary index (that is, l). Thus, each record is effectively stored twice: once in the table itself, and once again in the index on g.
Performance-wise, the solutions are almost the same too. The first one, though, requires two index seeks followed by index scans (for "all friends"), while the second one needs just one index seek, so for the L/G solution the amount of I/O might be slightly higher. This might be mitigated a little by the fact that the one single index may become one level deeper than two independent ones, so the initial search may take one more page read. This may slow down the "are they friends" query a little for the "both pairs" solution, compared to L/G.
As for the additional table for extra data, you most probably want it because it's usually used much less than the two queries I described above (and usually only for history purposes).
Its layout also depends on the kind of queries you are using. Say, if you want "show my last ten friendships", then you may want to store the timestamp in "both pairs" so that you don't have to do filesorts, etc.
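The least/greatest layout can be sketched as follows. This is only an illustration, assuming Python's sqlite3 module with an in-memory SQLite database in place of MySQL; the helper names add_friendship, are_friends, and friends_of are hypothetical. The ordering of each pair is done in application code with min() and max() before the row is written or looked up.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friend (
        l INTEGER NOT NULL,      -- lesser user id of the pair
        g INTEGER NOT NULL,      -- greater user id of the pair
        PRIMARY KEY (l, g)
    );
    CREATE INDEX friend_g ON friend (g);   -- secondary index for lookups by g
""")

def add_friendship(conn, a, b):
    """Store the pair once, lesser id first; a repeated pair is ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO friend (l, g) VALUES (?, ?)",
        (min(a, b), max(a, b)),
    )

def are_friends(conn, a, b):
    """Single primary-key lookup on the normalized pair."""
    row = conn.execute(
        "SELECT 1 FROM friend WHERE l = ? AND g = ?",
        (min(a, b), max(a, b)),
    ).fetchone()
    return row is not None

def friends_of(conn, user_id):
    """Union of the two index lookups described above."""
    rows = conn.execute(
        "SELECT g FROM friend WHERE l = ? "
        "UNION "
        "SELECT l FROM friend WHERE g = ?",
        (user_id, user_id),
    ).fetchall()
    return sorted(r[0] for r in rows)

add_friendship(conn, 3, 1)
add_friendship(conn, 1, 3)   # same pair in the other direction, ignored
add_friendship(conn, 4, 3)
print(friends_of(conn, 3), are_friends(conn, 1, 3), are_friends(conn, 1, 4))
```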
Consider the following schema:
CREATE TABLE `users` (
`uid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`username` varchar(30) NOT NULL,
PRIMARY KEY (`uid`)
);
INSERT INTO `users` (`uid`, `username`) VALUES
(1, 'h2ooooooo'),
(2, 'water'),
(3, 'liquid'),
(4, 'wet');
CREATE TABLE `friends` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`uid_from` int(10) unsigned NOT NULL,
`uid_to` int(10) unsigned NOT NULL,
`status` enum('pending','accepted','blocked') NOT NULL,
PRIMARY KEY (`id`),
KEY `uid_from` (`uid_from`),
KEY `uid_to` (`uid_to`)
);
INSERT INTO `friends` (`id`, `uid_from`, `uid_to`, `status`) VALUES
(1, 1, 3, 'accepted'), -- h2ooooooo sent a friend request to liquid - accepted
(2, 1, 2, 'pending'), -- h2ooooooo sent a friend request to water - pending
(3, 4, 1, 'pending'), -- wet sent a friend request to h2ooooooo - pending
(4, 4, 2, 'pending'), -- wet sent a friend request to water - pending
(5, 3, 4, 'accepted'); -- liquid sent a friend request to wet - accepted
I'd use something like the following:
SELECT
fu.username as `friend_username`,
fu.uid as `friend_uid`
FROM
`users` as `us`
LEFT JOIN
`friends` as `fr`
ON
(fr.uid_from = us.uid OR fr.uid_to = us.uid)
LEFT JOIN
`users` as `fu`
ON
(fu.uid = fr.uid_from OR fu.uid = fr.uid_to)
WHERE
fu.uid != us.uid
AND
fr.status = 'accepted'
AND
us.username = 'liquid'
Result:
friend_username | friend_uid
----------------|-----------
h2ooooooo | 1
wet | 4
Here us would be the user you want to query for friends, and fu would be the user's friends. You could easily change the WHERE clause to select the user in whatever way you want. The status could be changed to pending (and the join should then be on uid_to only) if you want to find friend requests that the user hasn't answered.
DEMO ON SQLFIDDLE
The EXPLAIN if we use us.uid to match the user (as it's indexed):
Performance considerations aside, another option might be a "friend" table in which one row represents a friend (does not matter which way around), together with a view which produces two result rows (one in each direction) for any friend row. In use, it would simplify queries because it could be used in the same way as the "two row" solution while only requiring one data row per "friendship".
The only drawback could be performance... depending on how the query optimizer works.
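A minimal sketch of the one-row-plus-view idea, assuming Python's sqlite3 module with an in-memory SQLite database in place of MySQL. The column names user1 and user2 and the view name friend_both are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friend (
        user1 INTEGER NOT NULL,
        user2 INTEGER NOT NULL,
        PRIMARY KEY (user1, user2)
    );
    -- The view exposes every stored row in both directions.
    CREATE VIEW friend_both AS
        SELECT user1 AS user_id, user2 AS friend_id FROM friend
        UNION ALL
        SELECT user2 AS user_id, user1 AS friend_id FROM friend;
    INSERT INTO friend VALUES (1, 3), (3, 4);   -- one row per friendship
""")

rows = conn.execute(
    "SELECT friend_id FROM friend_both WHERE user_id = ? ORDER BY friend_id",
    (3,),
).fetchall()
friends_of_3 = [r[0] for r in rows]
print(friends_of_3)
```

Queries against the view look the same as queries against the "two row" table. Whether the filter is pushed down into both halves of the UNION ALL depends on the optimizer, which is the performance caveat mentioned above.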
I tried to be creative, here are some results.
Easier drawn than said,
A simple trigger on table friends would do a nice service, ordering (user1,user2) without forgetting who requested the friendship.
DELIMITER $$
CREATE TRIGGER `friends_insert` BEFORE INSERT ON friends
FOR EACH ROW BEGIN
DECLARE X INT UNSIGNED;
-- Swap the ids so that user1 < user2, and flag that a swap happened
IF NEW.user1 > NEW.user2 THEN
SET X = NEW.user1;
SET NEW.user1 = NEW.user2;
SET NEW.user2 = X;
SET NEW.invited_by = 1;
END IF;
END$$
DELIMITER ;
Finally, let's say a user U has id = x. We can say U divides table users in two parts: users with id < x and ones with id > x. Before inserting a tuple into table friends, we order its ids, so the same information is never explicitly written twice.
We obtain friends of our user U (id = x) through union of U's friends with id < x and ones with id > x:
SELECT user1 AS `friend_id` FROM friends
WHERE user1<@id AND user2=@id
UNION
SELECT user2 AS `friend_id` FROM friends
WHERE user2>@id AND user1=@id;
The main goal here is query performance. Dividing in these two cases would help MySQL to use the right index for each situation.
[ Time for questions & disagreement. Perhaps you want the complete SQL; it's shown here ]
You could try something like this SQLFiddle: http://sqlfiddle.com/#!2/219dae/3/0
Here is the code:
The SCHEMA:
-- This is the users table:
CREATE TABLE users
(
u_id int auto_increment,
username varchar(20),
PRIMARY KEY (u_id)
);
INSERT INTO users (username)
VALUES ('user1'),
('user2'),
('user3'),
('user4'),
('user5');
-- This is the friends table:
CREATE TABLE friends
(
f_id int auto_increment,
r_name varchar(20), -- the name of the user that requests for friendship
a_name varchar(20), -- the name of the user that answers the friendship request
status varchar(20), -- the status of the request
PRIMARY KEY (f_id)
);
-- below, user1 sends friend requests to user2, user3, user4 and user5; and receives one from user2:
INSERT INTO friends (r_name, a_name, status)
VALUES ('user1','user2', 'pending');
INSERT INTO friends (r_name, a_name, status)
VALUES ('user1','user3', 'pending');
INSERT INTO friends (r_name, a_name, status)
VALUES ('user1','user4', 'pending');
INSERT INTO friends (r_name, a_name, status)
VALUES ('user1','user5', 'pending');
INSERT INTO friends (r_name, a_name, status)
VALUES ('user2','user1', 'pending');
-- user1 accepts user2 request to be his friend:
UPDATE friends
SET status='accepted'
WHERE a_name='user1' AND r_name='user2';
-- user3 accepts user1 request to be his friend:
UPDATE friends
SET status='accepted'
WHERE a_name='user3' AND r_name='user1';
and the SELECT:
-- here we select all friend requests that the user1 received and all friend requests that he made
SELECT r_name, a_name, status FROM users
INNER JOIN friends ON users.username=friends.a_name
WHERE username='user1'
UNION
SELECT r_name, a_name, status FROM users
INNER JOIN friends ON users.username=friends.r_name
WHERE username='user1'

faster way of executing SELECT ... WHERE id IN ... in mysql

I am in the middle of a uni project, when I discovered a huge problem with my database. Using wamp, and a massive (300Mb) database but with just a few tables my queries are very slow :( All tables are created with MyISAM engine. All settings are on default, I am not experienced in any optimisation. I will need to think of some better way to do it, but for now my question is what is the best substitute for the following query:
SELECT * FROM `payments` WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
I can't use a left join or any similar solution I have found here, because those IDs (1,2,3,4,5, ...) are not coming from the database. The user selects the payments he wants to delete, and on the next screen the payment details are displayed.
FYI, payments table has more than a million records :)
For a continuous range:
SELECT * FROM payments WHERE id BETWEEN 1 AND 10
If the range is disjoint:
Create an indexed memory table with the values in it.
CREATE TABLE mem_table (
pk INT UNSIGNED PRIMARY KEY
) ENGINE = MEMORY;
INSERT INTO mem_table (pk) VALUES (1),(2),...,(10);
SELECT p.* FROM payments p
INNER JOIN mem_table m ON (m.pk = p.id);
See: http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html
PS
Make sure you have an index on id (this should really be the primary key).
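When the ids come from user input rather than from another table, the IN list can be built with bound parameters. The sketch below is only an illustration, assuming Python's sqlite3 module with an in-memory SQLite database in place of MySQL; the amount column and the fetch_payments helper are hypothetical. With id as the primary key, each listed value is resolved through the key's index.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; "amount" is a made-up column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO payments (id, amount) VALUES (?, ?)",
    [(i, i * 10) for i in range(1, 21)],
)

def fetch_payments(conn, ids):
    """Fetch the rows whose primary key is in the user-supplied id list."""
    ids = list(ids)
    if not ids:
        return []  # "IN ()" is not valid SQL, so short-circuit
    # One "?" per id; the values are bound, never spliced into the SQL text.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT id, amount FROM payments WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(sql, ids).fetchall()

print(fetch_payments(conn, [3, 7, 11]))
```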

Composite primary key comprising two foreign keys referencing same table: SQL Server vs. MySQL

I've read over a number of posts regarding DB table design for a common one-to-many / users-to-friends scenario. One post included the following:
USERS
* user_id (primary key)
* username
FRIENDS
* user_id (primary key, foreign key to USERS(user_id))
* friend_id (primary key, foreign key to USERS(user_id))
> This will stop duplicates (IE: 1, 2) from happening, but won't stop reversals because (2, 1) is valid. You'd need a trigger to enforce that there's only one instance of the relationship...
The bold portion motivated me to post my question: is there a difference between how SQL Server and MySQL handle these types of composite keys? Do both require this trigger that the poster mentions, in order to ensure uniqueness?
I ask, because up until this point I've been using a similar table structure in SQL Server, without any such triggers. Have I just luckily not run into this data duplication snake that's lurking in the grass?
Yes, all DBMS will treat this the same. The reason is that the DBMS assumes that the column has meaning. I.e., the tuple is not comprised of meaningless numbers. Each attribute has meaning. user_id is assumed to have different meaning than friend_id. Thus, it is incumbent upon the designer to build a rule that claims that 1,2 is equivalent to 2,1.
You could just use a check constraint that friend_id > user_id to prevent "reversals". This would make it impossible to enter a pair such as (2, 1); such a relationship would have to be entered as (1, 2).
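The effect of such a check constraint can be seen in a small sketch. It assumes Python's sqlite3 module with an in-memory SQLite database, which enforces CHECK constraints; whether a given MySQL or SQL Server version enforces them should be verified separately. The try_insert helper is hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE friends (
        user_id   INTEGER NOT NULL,
        friend_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, friend_id),
        CHECK (friend_id > user_id)     -- rejects reversed pairs such as (2, 1)
    )
""")

def try_insert(conn, user_id, friend_id):
    """Return True if the pair was stored, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO friends (user_id, friend_id) VALUES (?, ?)",
                (user_id, friend_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# (1, 2) is accepted, (2, 1) violates the CHECK, a second (1, 2) violates the key.
results = [try_insert(conn, 1, 2), try_insert(conn, 2, 1), try_insert(conn, 1, 2)]
print(results)
```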
If your friendship relationship is symmetrical, you need to add a CHECK(user_id < friend_id) to the table definition and insert the data like this:
INSERT
INTO friends
VALUES (
(CASE WHEN user_id < friend_id THEN user_id ELSE friend_id END),
(CASE WHEN user_id > friend_id THEN user_id ELSE friend_id END)
)
In SQL Server, you can build a UNIQUE index on a pair of computed columns:
CREATE TABLE friends (
orestes INT,
pylades INT,
me AS CASE WHEN orestes < pylades THEN orestes ELSE pylades END,
friend AS CASE WHEN orestes > pylades THEN orestes ELSE pylades END
)
CREATE UNIQUE INDEX ux_friends_me_friend ON friends (me, friend)
INSERT
INTO friends
VALUES (1, 2)
INSERT
INTO friends
VALUES (2, 1)
-- Fails
To fetch all friends for a given user, you need to run this query:
SELECT friend_id
FROM friends
WHERE user_id = @myuser
UNION ALL
SELECT user_id
FROM friends
WHERE friend_id = @myuser
However, in MySQL, it may be more efficient to always keep both copies of each pair.
You may find these articles interesting:
Selecting friends
Six degrees of separation
If the relationship is symmetrical, then one alternative is to "define" the relationship as asymmetrical in the database, but just always add both tuples every time you add either one.
You are basically saying "The nature of friendship in the DB is asymmetrical: A can be a friend of B while B is not a friend of A, but the application will always add (or remove) BOTH records (A, B) and (B, A) any time I add (or remove) either." That simplifies the query logic as well, since you don't have to look in both columns anymore. It costs one extra insert / delete each time you modify data, but fewer reads when querying...
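The "always write both tuples" approach might look roughly like this. The sketch assumes Python's sqlite3 module with an in-memory SQLite database in place of MySQL or SQL Server; add_friendship, remove_friendship, and friends_of are hypothetical helpers. Each modification runs in one transaction so the two directions cannot get out of step.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE friends (
        user_id   INTEGER NOT NULL,
        friend_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, friend_id)
    )
""")

def add_friendship(conn, a, b):
    """Insert (a, b) and (b, a) together; pairs already present are ignored."""
    with conn:  # commits both rows or neither
        conn.executemany(
            "INSERT OR IGNORE INTO friends (user_id, friend_id) VALUES (?, ?)",
            [(a, b), (b, a)],
        )

def remove_friendship(conn, a, b):
    """Delete both directions of the pair together."""
    with conn:
        conn.executemany(
            "DELETE FROM friends WHERE user_id = ? AND friend_id = ?",
            [(a, b), (b, a)],
        )

def friends_of(conn, user_id):
    """Reads only ever look at the user_id column."""
    rows = conn.execute(
        "SELECT friend_id FROM friends WHERE user_id = ? ORDER BY friend_id",
        (user_id,),
    ).fetchall()
    return [r[0] for r in rows]

add_friendship(conn, 1, 2)
add_friendship(conn, 1, 3)
remove_friendship(conn, 3, 1)   # argument order does not matter
print(friends_of(conn, 1), friends_of(conn, 2), friends_of(conn, 3))
```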