MySQL retrieve friends of friends structure and performance - mysql

I would simply like to find a database structure in MySQL to get all users friends of friends and the corresponding query to retrieve them. (friend links are bi-directional)
I have found a couple posts related to that, but my concern is the performance:
Structure 1
Many posts suggest a structure where you have a table in which each row represents a friendship link e.g:
CREATE TABLE `friends` (
`user_id` int(10) unsigned NOT NULL,
`friend_id` int(10) unsigned NOT NULL
)
Say user '1' has three friends ('2', '3', '4') and user '2' has two friends ('1', '5'). The friends table would look like this:
user_id | friend_id
1 | 2
1 | 3
1 | 4
2 | 1
2 | 5
Friends of friends query: how to select friends of friends is covered in "SQL to get friends AND friends of friends of a user". The result of the query for user '1' is supposed to be (1,2,3,4,5).
My concern: the average fb-user has about 140 friends, and frequent users will have a lot more.
With 20,000 users this ends up at roughly 3 million rows (20,000 x 140 = 2.8 million).
Structure 2
If I could use a structure like this:
CREATE TABLE `friends` (
`user_id` int(10) unsigned NOT NULL,
`friend_1` int(10) unsigned NOT NULL,
`friend_2` int(10) unsigned NOT NULL,
`friend_3` int(10) unsigned NOT NULL,
`friend_4` int(10) unsigned NOT NULL,
....
)
My table would look like this (taking example from above):
user_id | friend_1 | friend_2 | friend_3 | ...
1 | 2 | 3 | 4 |
2 | 1 | 5 | |...
Now I have only 20,000 rows.
friends of friends query: To select user friends of friends I tried
Select * FROM friends as a
WHERE a.user_id
IN (
SELECT * FROM friends AS b
WHERE b.user_id = '1'
)
but I get the error "#1241 - Operand should contain 1 column(s)". I think the problem is that the sub-select returns a whole row, not a single column?
Questions
I hope you understand my concern. I would be really happy about any input on these questions:
1) Is there a query that returns all friends of friends for a specified user in structure 2?
2) Which structure allows me to return friends of friends quicker?
In structure 2 I think the "join row with column" could be slow, if it's even possible to use a join here. Thank you for any suggestions. If you can think of any other structures, maybe taking advantage of the small-world-network type, I'd be happy to hear them.
THANK YOU!!

Definitely use the first structure. Queries for the second structure will be huge, hard to maintain and slow because of complicated clauses.
A fast enough query for the first approach:
(
select friend_id
from friends
where user_id = 1
) union (
select distinct ff.friend_id
from
friends f
join friends ff on ff.user_id = f.friend_id
where f.user_id = 1
)
For the best performance you need this index:
ALTER TABLE `friends` ADD UNIQUE INDEX `friends_idx` (`user_id` ASC, `friend_id` ASC);
A separate index on `user_id` alone is not needed: `user_id` is the leftmost column of the composite index, so lookups by `user_id` can already use it.
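As a quick check of the UNION query above, here is a minimal sketch that runs it against the example rows from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the snippet is self-contained), and the helper name `friends_and_fof` is hypothetical.

```python
import sqlite3

def friends_and_fof(conn, user_id):
    """Return the sorted ids of a user's friends plus friends of friends."""
    rows = conn.execute(
        """
        SELECT friend_id FROM friends WHERE user_id = ?
        UNION
        SELECT ff.friend_id
        FROM friends AS f
        JOIN friends AS ff ON ff.user_id = f.friend_id
        WHERE f.user_id = ?
        """,
        (user_id, user_id),
    ).fetchall()
    return sorted(r[0] for r in rows)

conn = sqlite3.connect(":memory:")
# The composite primary key plays the role of the unique (user_id, friend_id) index.
conn.execute(
    "CREATE TABLE friends ("
    " user_id INTEGER NOT NULL,"
    " friend_id INTEGER NOT NULL,"
    " PRIMARY KEY (user_id, friend_id))"
)
# Example rows from the question.
conn.executemany(
    "INSERT INTO friends VALUES (?, ?)",
    [(1, 2), (1, 3), (1, 4), (2, 1), (2, 5)],
)

print(friends_and_fof(conn, 1))  # [1, 2, 3, 4, 5]
```

The user's own id shows up in the result because user 2 lists user 1 as a friend; add `AND ff.friend_id <> f.user_id` to the second SELECT if that is not wanted.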

I'd say you ought to use the first structure. It's more flexible in my opinion. My solution for the query would be a simple sub-query, like this:
SELECT friend_id FROM friends WHERE user_id IN (
SELECT friend_id FROM friends WHERE user_id='$USER_ID'
);
EDIT: Sorry I just woke up and realized after posting a reply that this wasn't at all what you were looking for. Sry.

Don't use "Structure 2": you cannot create a column for every possible friend. If just one user has, say, 100 friends (what about 10K friends or more?), every row needs that many columns, and performance suffers. For structure 1 you can do a simple join of the table to itself:
select u.user_id, f.friend_id
from friends as u
inner join friends as f
on (u.friend_id = f.user_id);
EDIT:
Your error #1241 means that you use * in the subselect and the table returns more than one column. Your subquery should return just one column (no matter how many rows), so replace the "*" with "user_id" (without quotes).
Solution 1 is not only faster, it is also more flexible. I don't recommend a subquery for a simple select like this; just join the table to itself (it is typically faster than a subselect).
Solution 2, in my opinion, is not a solution at all: it's not flexible, it's slower, it uses more disk space, and more columns mean less performance in MySQL. How can you index such a thing? And how can you select by friend_id rather than by user_id: do you look in every column for that friend_id?

As the other answers state, solution 1 is preferred to solution 2. Solution 1 will also work for a decent amount of data.
However, when things go bigger there is also a third solution - Graph Databases.
When your data model focuses on the "relations" instead of the "objects" RDBMSs don't scale well since they have to perform lookups through the tables concerned. DB Indexes make this easier but it was not enough so Graph Databases came to the rescue.
A Graph DB actually "stores" the relations next to each entity making it much faster to perform tasks like yours.
Here is some information to get you started:
http://www.slideshare.net/maxdemarzi/graph-database-use-cases
Neo4j or OrientDB are among the popular choices.

Related

How to correctly query my friend table with its many to many relationship with MySQL?

I am building a simple application using MySQL as the database. I have a user table with varying information, but I also have a friends table. Each user has its own ID, which is referenced in the friend table when a request is initiated.
My friends table is pretty simple:
Field | Type | Null | Key | Default | Extra
id | int | NO | PRI | | auto_increment
date | datetime | NO | | CURRENT_TIMESTAMP | DEFAULT_GENERATED
user_one_id | int | NO | | |
user_two_id | int | NO | | |
request_status | tinyint | NO | | |
My issue is when trying to query my table to find a specific user's friends. My two subqueries in the below query both work fine on their own, however when I include them in the main query, I get no results. No errors, but also no results.
If the logged in user's ID is 1, then -
SELECT id, display_name, join_date, profile_image_url
FROM user WHERE id IN
(SELECT user_one_id FROM friend WHERE user_two_id = 1 AND request_status = 1)
AND id IN
(SELECT user_two_id FROM friend WHERE user_one_id = 1 AND request_status = 1);
The query is supposed to account for the user's ID being in either column, depending on who actually initiated the friend request.
I apologize if this seems like a dumb question, but we all gotta learn sometime.
Please let me know if any additional information would help.
I believe that your code would work if you used OR instead of AND for the conditions in the WHERE clause, but I propose a simpler solution with EXISTS:
SELECT u.id, u.display_name, u.join_date, u.profile_image_url
FROM user u
WHERE EXISTS (
SELECT *
FROM friend f
WHERE f.request_status = 1
AND (f.user_one_id, f.user_two_id) IN ((u.id, ?), (?, u.id))
);
Replace ? with the id of the user that you want to search.
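To make the either-direction lookup concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. SQLite does not accept a list of row values on the right-hand side of IN, so the sketch spells the same condition out with OR. The sample rows and the helper name `friends_of` are hypothetical, and the `date` column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, display_name TEXT);
CREATE TABLE friend (
    id INTEGER PRIMARY KEY,
    user_one_id INTEGER NOT NULL,
    user_two_id INTEGER NOT NULL,
    request_status INTEGER NOT NULL
);
-- Hypothetical sample data: user 1 sent a request to 2 (accepted),
-- user 3 sent one to 1 (accepted), user 1 sent one to 4 (still pending).
INSERT INTO user VALUES (1, 'ann'), (2, 'bob'), (3, 'cat'), (4, 'dan');
INSERT INTO friend (user_one_id, user_two_id, request_status)
VALUES (1, 2, 1), (3, 1, 1), (1, 4, 0);
""")

def friends_of(conn, user_id):
    """Ids of users with an accepted friendship with user_id, in either column."""
    rows = conn.execute(
        """
        SELECT u.id
        FROM user AS u
        WHERE EXISTS (
            SELECT 1
            FROM friend AS f
            WHERE f.request_status = 1
              AND ((f.user_one_id = u.id AND f.user_two_id = ?)
                OR (f.user_one_id = ? AND f.user_two_id = u.id))
        )
        ORDER BY u.id
        """,
        (user_id, user_id),
    ).fetchall()
    return [r[0] for r in rows]

print(friends_of(conn, 1))  # [2, 3]
```

The pending request to user 4 is filtered out by `request_status = 1`, and it does not matter which side initiated the accepted requests.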

SQL - Join two distinct result sets

I'm new with SQL and just had my first assignment.
I have the following requirements:
Given is a database of two tables. The first one contains information about the user, like a unique ID per user, the phone number and the city. ID and phone number consist only of numeric digits. The second table contains data about so called „credits“, which a user can own. Again there is a column for the unique user ID, but also the number, the date and the type of credits. A user can have none, one or several entries in the credit table.
I'm still not sure if I got the part right where a user can have none, one or several entries in the credit table. I created these two tables:
CREATE table user
(
user_id INT NOT NULL UNIQUE AUTO_INCREMENT,
user_phone_number INT NOT NULL,
user_city VARCHAR(32) NOT NULL,
PRIMARY KEY (user_id)
);
CREATE table credit
(
credit_user_id INT,
credit_date date,
credit_number double,
credit_type char(10),
CONSTRAINT chk_type CHECK (credit_type in ('None','A','B','C')),
FOREIGN KEY (credit_user_id) REFERENCES user (user_id)
);
After creating this, I was asked the following questions:
a) The phone number of all users, who own credits of type „A“
SELECT user_phone_number
FROM user, credit
WHERE credit_type = 'A';
b) Like a), but additionally the credit_number of the credits is smaller than 2 or greater than 4
SELECT user_phone_number
FROM user, credit
WHERE (credit_type ='A')
AND (credit_number < 2 OR credit_number > 4);
C) Like a), but additionally the users also own credits of at least one other type.
SELECT user_phone_number
FROM user, credit
WHERE credit_type = 'A'
AND (
SELECT DISTINCT c1.credit_type FROM credit AS c1
JOIN credit a1 ON (c1.credit_type=a1.credit_type)
JOIN credit a2 ON (c1.credit_type=a2.credit_type)
WHERE a2.credit_type<>a1.credit_type);
My problem is that I can't make letter C work, even if both selects seem to work separately. Any ideas or suggestions would be appreciated, thank you!
I'm not sure I understand what you want in C), but there are several things to say.
You shouldn't use a table name like 'user', because it can be ambiguous (it is a reserved word in some DBMSs).
You should prefer an explicit 'join' over 'from table1, table2', and avoid mixing the two styles.
Have a look here.
You've got a ';' in your query in C); it must only be used to mark the end of the statement.
You can use a nested query, but not like that, directly after 'AND', because AND expects a condition such as a comparison. You have many possibilities: in the select list, after 'FROM', after 'IN', in a join, in a condition...
Quick search on google.
From another post:
PRIMARY KEY(x), UNIQUE(x) -- Since a PRIMARY KEY is by definition (in MySQL) UNIQUE...
Since you want to find everyone with 2 kinds of credit, I'd try to make a query like if I was looking for duplicates, here's two ways to do that:
With subquery
Find duplicate records in MySQL
Without
Finding duplicate values in MySQL
Welcome to SO! Here's an approach using the nested query style you're trying to use. I've used explicit JOINs rather than FROM user, credit in the FROM clause, because this makes it clearer that it's a join.
Say your users table looks like this -
user_id user_phone_number user_city
6 75771 Leeds
7 75772 Wakefield
8 75773 Dewsbury
9 75774 Heckmondwike
10 75775 Huddersfield
And your credit table looks like this -
credit_user_id credit_date credit_number credit_type
7 2017-02-13 2 A
7 2017-02-13 2 B
6 2017-02-13 2 A
8 2017-02-13 4 B
The nested query in the AND clause returns users that have a credit whose credit_type is not A, and the WHERE in the main query selects the records where the credit_type is A, so if a user appears in both, that user must have two types of credit -
SELECT user_phone_number
FROM `user` AS u
JOIN credit AS c ON u.user_id = c.credit_user_id
WHERE credit_type = 'A'
AND u.user_id IN (
SELECT user_id
FROM `user` AS u
JOIN credit AS c ON u.user_id = c.credit_user_id
WHERE credit_type <> 'A')
As you can see from the tables, the user with the id of 7 has credit both of type A and B, so we end up with -
user_phone_number
75772
I'd agree that you might want to consider some of the points others have raised above, but won't repeat.
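The nested-query approach above can be tried out with a minimal sketch like the following, which loads the sample users and credits from this answer into an in-memory SQLite database (Python's sqlite3 module is used here only as a runnable stand-in for MySQL). The helper name `phones_with_a_and_other` is hypothetical, and DISTINCT is added so a user with several type A credits is listed once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    user_id INTEGER PRIMARY KEY,
    user_phone_number INTEGER NOT NULL,
    user_city TEXT NOT NULL
);
CREATE TABLE credit (
    credit_user_id INTEGER REFERENCES user (user_id),
    credit_date TEXT,
    credit_number REAL,
    credit_type TEXT
);
INSERT INTO user VALUES
    (6, 75771, 'Leeds'), (7, 75772, 'Wakefield'), (8, 75773, 'Dewsbury'),
    (9, 75774, 'Heckmondwike'), (10, 75775, 'Huddersfield');
INSERT INTO credit VALUES
    (7, '2017-02-13', 2, 'A'), (7, '2017-02-13', 2, 'B'),
    (6, '2017-02-13', 2, 'A'), (8, '2017-02-13', 4, 'B');
""")

def phones_with_a_and_other(conn):
    """Phone numbers of users who own type A credits and at least one other type."""
    rows = conn.execute(
        """
        SELECT DISTINCT u.user_phone_number
        FROM user AS u
        JOIN credit AS c ON u.user_id = c.credit_user_id
        WHERE c.credit_type = 'A'
          AND u.user_id IN (
              SELECT credit_user_id FROM credit WHERE credit_type <> 'A')
        ORDER BY u.user_phone_number
        """
    ).fetchall()
    return [r[0] for r in rows]

print(phones_with_a_and_other(conn))  # [75772]
```

The subquery here reads the credit table directly instead of joining back to user, which is enough because only the user id is needed.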

Select rows from a table not matched in another table

So I have the following:
A lookup table that has two columns and looks like this for example:
userid moduleid
4 4
I also have a users table that has a primary key userid which the lookup table references. The users table has a few users, let's say, and looks like this:
userid
1
2
3
4
This example shows that the user with ID 4 is matched with module ID 4. The others are not matched to any moduleid.
I need a query that gets me data from the users table WHERE the moduleid is not 4. In my application, I know the module but I don't know the user. So the query should return the other userids apart from 4, because 4 is already matched with module ID 4.
Is this possible to do?
I think I understand your question correctly. You can use a sub-query to cross-check the data between both tables using the NOT IN operator.
The following will select all userid records from the user_tbl table that are not matched to module 4 in the lookup_tbl table:
SELECT `userid`
FROM `user_tbl`
WHERE `userid` NOT IN (
SELECT DISTINCT(`userid`) FROM `lookup_tbl` WHERE moduleid = 4
)
There are several ways to do this. One pretty intuitive way (in my opinion) is to use an IN predicate to exclude the users with moduleid 4 in the lookup table:
SELECT * FROM Users WHERE UserID NOT IN (SELECT UserID FROM Lookup WHERE ModuleID = 4)
There are other ways, with possibly better performance (using a correlated not exists query or a join for instance).
One other option is to use a LEFT JOIN so that you get the rows from the users table even when there is no match for module 4 in the lookup table. Then pick the rows where there is no userid value from the lookup table.
SELECT u.userid
FROM usersTable u
LEFT JOIN lookupTable lt ON u.userid = lt.userid AND lt.moduleid = 4
WHERE lt.userid IS NULL
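A minimal sketch comparing two of the anti-join styles mentioned in these answers, NOT EXISTS and LEFT JOIN ... IS NULL, on the example data. Python's sqlite3 module stands in for MySQL, the table and function names are hypothetical, and the extra lookup row (2, 7) is an assumption added to show that a user matched to a different module is still returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY);
CREATE TABLE lookup (
    userid INTEGER REFERENCES users (userid),
    moduleid INTEGER
);
INSERT INTO users VALUES (1), (2), (3), (4);
-- (4, 4) is from the question; (2, 7) is a hypothetical extra row.
INSERT INTO lookup VALUES (4, 4), (2, 7);
""")

def unmatched_not_exists(conn, moduleid):
    """Users with no lookup row for the given module (correlated NOT EXISTS)."""
    rows = conn.execute(
        """
        SELECT u.userid FROM users AS u
        WHERE NOT EXISTS (
            SELECT 1 FROM lookup AS lt
            WHERE lt.userid = u.userid AND lt.moduleid = ?)
        ORDER BY u.userid
        """,
        (moduleid,),
    ).fetchall()
    return [r[0] for r in rows]

def unmatched_left_join(conn, moduleid):
    """Same result via LEFT JOIN; the module filter must sit in the ON clause."""
    rows = conn.execute(
        """
        SELECT u.userid FROM users AS u
        LEFT JOIN lookup AS lt
               ON lt.userid = u.userid AND lt.moduleid = ?
        WHERE lt.userid IS NULL
        ORDER BY u.userid
        """,
        (moduleid,),
    ).fetchall()
    return [r[0] for r in rows]

print(unmatched_not_exists(conn, 4))  # [1, 2, 3]
print(unmatched_left_join(conn, 4))   # [1, 2, 3]
```

If the `moduleid` condition were dropped from the ON clause, user 2 would be excluded as well, because it has a lookup row for another module.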
Are you looking for a query like this?
select userid from yourtablename where moduleid<>4

MySQL - 2 way query check

I am implementing "Add Friend" in my web app, so users can add other users as friends.
We have 2 tables: tbl_users and tbl_relations. tbl_users has a unique ID for each registered user, and tbl_relations stores which users are friends. For example, some rows in tbl_relations are:
id user_id_1 user_id_2
1 4 6
2 4 8
3 8 23
4 12 84
5 3 4
...
In the above rows, id is the unique id of tbl_relations, and user_id_1 and user_id_2 are both foreign keys to tbl_users. Now imagine we want to check whether the user with id "4" is friends with the user with id "9". Here we need to check both directions, I mean:
SELECT * FROM tbl_relations WHERE (user_id_1 = '4' AND user_id_2 = '9') OR (user_id_1 = '9' AND user_id_2 = '4')
The above query seems a little weird to me, there should be another way for implementing this I guess, maybe a different database structure?
Or another query: we want to get the mutual friends of the users with ids "4" and "8". How am I supposed to get the mutual friends in this scenario? Is there any better database structure for this?
I would appreciate any kind of help.
I would de-normalize the relation such that it's symmetric. That is, if 1 and 2 are friends, I'd have two rows (1,2) and (2,1).
The disadvantage is that it's twice the size, and you have to do 2 writes when forming and breaking friendships. The advantage is all your read queries are simpler. This is probably a good trade-off because most of the time you are reading instead of writing.
This has the added advantage that if you eventually outgrow one database and decide to do user-sharding, you don't have to traverse every other db shard to find out who a person's friends are.
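The symmetric layout described above could look roughly like this. It is a minimal sketch using Python's sqlite3 module in place of MySQL; the column names `user_id`/`friend_id`, the helper functions, and the extra friendship (6, 8) are assumptions for illustration. `INSERT OR IGNORE` is SQLite syntax (MySQL spells it `INSERT IGNORE`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_relations ("
    " user_id INTEGER NOT NULL,"
    " friend_id INTEGER NOT NULL,"
    " PRIMARY KEY (user_id, friend_id))"
)

def add_friendship(conn, a, b):
    """Write both directions in one transaction so the table stays symmetric."""
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO tbl_relations VALUES (?, ?)",
            [(a, b), (b, a)],
        )

def are_friends(conn, a, b):
    """A single-direction lookup is enough because every pair is stored twice."""
    row = conn.execute(
        "SELECT 1 FROM tbl_relations WHERE user_id = ? AND friend_id = ?",
        (a, b),
    ).fetchone()
    return row is not None

def mutual_friends(conn, a, b):
    """Ids that appear in the friend lists of both a and b."""
    rows = conn.execute(
        """
        SELECT t0.friend_id
        FROM tbl_relations AS t0
        JOIN tbl_relations AS t1 ON t0.friend_id = t1.friend_id
        WHERE t0.user_id = ? AND t1.user_id = ?
        ORDER BY t0.friend_id
        """,
        (a, b),
    ).fetchall()
    return [r[0] for r in rows]

# Pairs from the question, plus a hypothetical (6, 8) so 4 and 8 share a friend.
for a, b in [(4, 6), (4, 8), (8, 23), (12, 84), (3, 4), (6, 8)]:
    add_friendship(conn, a, b)

print(are_friends(conn, 4, 9))     # False
print(mutual_friends(conn, 4, 8))  # [6]
```

The composite primary key prevents duplicate rows, so adding the same friendship twice is harmless.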
If you do it this way you'll have to check for duplicates every time you update. Why not have
user_id1 friend_id
and then query as
select * from tbl_relations where user_id1 in (4,9)
This still seems odd, as it implies that 'friend' relations are one-way.
To get the 'mutual' friends - if you do it this way -
select * from tbl_relations t0
join tbl_relations t1 on t0.friend_id = t1.friend_id
where t0.user_id1 = ? and t1.user_id1 = ?

SQL query with mutual user relationship

I'm making an SNS where users can follow each other. If user A follows user B and user B also follows user A, they become friends.
Also consider that some popular people (like movie stars) may be followed by hundreds of thousands of users, but a user can follow at most 1000 people.
So given the table below, what is the best SQL query to fetch all friends' ids of user 1?
PS: I'm using MySQL 5.5.
Here is what I have done so far:
SELECT followee_id AS friend_id FROM follow
WHERE follower_id = 1 AND
followee_id IN (SELECT follower_id FROM follow
WHERE followee_id = 1);
CREATE TABLE follow
(
follower_id INT UNSIGNED NOT NULL,
followee_id INT UNSIGNED NOT NULL,
PRIMARY KEY (follower_id, followee_id),
INDEX (followee_id, follower_id)
);
Assuming that by 'best' you mean most performant, and given that a following must be mutual in order to meet your 'friend' criteria:
A filter using followee_id will hit your index better than a filter on follower_id
select
me.follower_id
from
follow me inner join
follow you
on
me.follower_id = you.followee_id
and me.followee_id = you.follower_id
where
me.followee_id = ? -- the id of the user in question
(although note that RDBMSs like MSSQL will default to using your Primary Key as a clustered index, in which case it's much of a muchness really.)
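For illustration, the mutual-follow self-join from this answer can be exercised with a minimal sketch like the one below. It uses Python's sqlite3 module as a stand-in for MySQL 5.5, and the sample follow rows, the index name, and the helper name `friends_of` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE follow (
    follower_id INTEGER NOT NULL,
    followee_id INTEGER NOT NULL,
    PRIMARY KEY (follower_id, followee_id)
);
CREATE INDEX follow_reverse_idx ON follow (followee_id, follower_id);
-- Hypothetical data: 1 and 2 follow each other, 1 and 5 follow each other,
-- 1 follows 3 without being followed back, 4 follows 1 one-way.
INSERT INTO follow VALUES (1, 2), (2, 1), (1, 3), (4, 1), (5, 1), (1, 5);
""")

def friends_of(conn, user_id):
    """Ids of users who follow user_id and are followed back by user_id."""
    rows = conn.execute(
        """
        SELECT me.follower_id
        FROM follow AS me
        JOIN follow AS you
          ON me.follower_id = you.followee_id
         AND me.followee_id = you.follower_id
        WHERE me.followee_id = ?
        ORDER BY me.follower_id
        """,
        (user_id,),
    ).fetchall()
    return [r[0] for r in rows]

print(friends_of(conn, 1))  # [2, 5]
```

Users 3 and 4 are left out because each of those relationships exists in one direction only.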