I am implementing "Add Friend" in my web app, so users can add other users as friends.
We have 2 tables: tbl_users and tbl_relations. tbl_users has a unique ID for each registered user, and tbl_relations stores the pairs of users who are friends. For example, some rows in tbl_relations are:
id user_id_1 user_id_2
1 4 6
2 4 8
3 8 23
4 12 84
5 3 4
...
In the rows above, id is the unique ID of the tbl_relations row, and user_id_1 and user_id_2 are both foreign keys to tbl_users. Now imagine we want to check whether the user with id "4" is friends with the user with id "9". We have to test both orderings of the pair:
SELECT * FROM tbl_relations WHERE (user_id_1 = '4' AND user_id_2 = '9') OR (user_id_1 = '9' AND user_id_2 = '4')
The above query seems a little weird to me. I guess there should be another way to implement this, maybe a different database structure?
Another query: we want to get the mutual friends of the users with ids "4" and "8". How am I supposed to get the mutual friends in this scenario? Is there a better database structure for this?
I would appreciate any kind of help.
I would de-normalize the relation such that it's symmetric. That is, if 1 and 2 are friends, I'd have two rows (1,2) and (2,1).
The disadvantage is that it's twice the size, and you have to do 2 writes when forming and breaking friendships. The advantage is all your read queries are simpler. This is probably a good trade-off because most of the time you are reading instead of writing.
This has the added advantage that if you eventually outgrow one database and decide to do user-sharding, you don't have to traverse every other db shard to find out who a person's friends are.
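A minimal sketch of the symmetric layout described above, using Python's sqlite3 with an in-memory database as a stand-in for the real one. The column names user_id and friend_id and the helper names are assumptions for illustration, and INSERT OR IGNORE is SQLite syntax (MySQL spells it INSERT IGNORE).

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_relations ("
    " user_id INTEGER NOT NULL,"
    " friend_id INTEGER NOT NULL,"
    " PRIMARY KEY (user_id, friend_id))"
)


def add_friendship(conn, a, b):
    """Write both directions in one transaction so the table stays symmetric."""
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO tbl_relations (user_id, friend_id) VALUES (?, ?)",
            [(a, b), (b, a)],
        )


def remove_friendship(conn, a, b):
    """Delete both directions in one transaction."""
    with conn:
        conn.executemany(
            "DELETE FROM tbl_relations WHERE user_id = ? AND friend_id = ?",
            [(a, b), (b, a)],
        )


def are_friends(conn, a, b):
    """A single-direction lookup is enough because every pair is stored twice."""
    row = conn.execute(
        "SELECT 1 FROM tbl_relations WHERE user_id = ? AND friend_id = ?",
        (a, b),
    ).fetchone()
    return row is not None


# Sample pairs taken from the question's example rows.
for a, b in [(4, 6), (4, 8), (8, 23), (12, 84), (3, 4)]:
    add_friendship(conn, a, b)

print(are_friends(conn, 6, 4), are_friends(conn, 4, 9))
```

The composite primary key also rejects duplicate rows, so forming the same friendship twice does not grow the table.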
If you do it this way you'll have to check for duplicates every time you update. Why not have
user_id1 friend_id
and then query as
select * from tbl_relations where user_id1 in (4,9)
This still seems odd as it implies that 'friend' relations are one-way.
To get the 'mutual' friends - if you do it this way -
select * from tbl_relations t0
join tbl_relations t1 on t0.friend_id = t1.friend_id
where t0.user_id1 = ? and t1.user_id1 = ?
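To see the mutual-friends join above in action, here is a small sketch using Python's sqlite3 as a stand-in database. It assumes each friendship is stored in both directions, and the sample pairs are made up for illustration; mutual_friends is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_relations ("
    " user_id1 INTEGER NOT NULL,"
    " friend_id INTEGER NOT NULL,"
    " PRIMARY KEY (user_id1, friend_id))"
)

# Illustrative friendships (not from the original post), stored in both directions.
pairs = [(4, 6), (4, 8), (8, 23), (6, 8), (3, 4), (3, 8)]
conn.executemany(
    "INSERT INTO tbl_relations (user_id1, friend_id) VALUES (?, ?)",
    [row for a, b in pairs for row in ((a, b), (b, a))],
)


def mutual_friends(conn, a, b):
    """Self-join: keep friend_ids that appear for user a and for user b."""
    rows = conn.execute(
        "SELECT t0.friend_id"
        " FROM tbl_relations t0"
        " JOIN tbl_relations t1 ON t0.friend_id = t1.friend_id"
        " WHERE t0.user_id1 = ? AND t1.user_id1 = ?"
        " ORDER BY t0.friend_id",
        (a, b),
    ).fetchall()
    return [r[0] for r in rows]


print(mutual_friends(conn, 4, 8))  # [3, 6]
```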
I am new to SQL, so I will try to make this as simple as possible. (I am using MySQL Workbench)
I have a table for User and I want multiple users to be friends with each other. For that I created the friends table, which has a user_id and a friend_user_id. So let's say we have (1,2), (2,3). I want this to be read as: "2 is friends with 1 and 3, 1 is friends with 2, 3 is friends with 2". So when inserting into this friends table I never store both (1,2) and (2,1). I'm looking for a procedure that receives a user_id as a parameter and returns all of that user's friends, whether they are in the user_id column or the friend_user_id column. For example, if I look up user 2's friends, the result should be 1 column containing 1 and 3, because 1 and 3 are friends with 2.
To be more specific, when I call get_friends(2) it should appear
[1]
[3]
Even though these are in different columns on the friends table.
You can use IN to check whether either column equals the ID of the user you want to look up, and a CASE ... END to return whichever column does not equal that ID.
SELECT CASE
WHEN user_id = 2
THEN friend_user_id
WHEN friend_user_id = 2
THEN user_id
END friend
FROM friends
WHERE 2 IN (user_id, friend_user_id);
Alternatively, you could use a UNION ALL approach, which might perform better as it can use indexes on user_id and friend_user_id.
SELECT user_id friend
FROM friends
WHERE friend_user_id = 2
UNION ALL
SELECT friend_user_id friend
FROM friends
WHERE user_id = 2;
But this is only better if such indexes exist. If they don't, it may need two scans of the table, as opposed to the first approach needing only one, so it's worse in that case.
Use UNION ALL to get all friends, parameterize this query:
select friend_user_id as friend_id from friends f where f.user_id = 2 -- parameter
union all
select user_id as friend_id from friends f where f.friend_user_id = 2 -- parameter
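A parameterized version of the UNION ALL lookup could look like the sketch below. It uses Python's sqlite3 as a stand-in for MySQL, and get_friends is a hypothetical helper mirroring the procedure name from the question, not a stored procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends ("
    " user_id INTEGER NOT NULL,"
    " friend_user_id INTEGER NOT NULL)"
)
# One index per column so each half of the UNION ALL can use an index lookup.
conn.execute("CREATE INDEX friends_user_idx ON friends (user_id)")
conn.execute("CREATE INDEX friends_friend_idx ON friends (friend_user_id)")

# Each friendship is stored once, as in the question: (1,2) and (2,3).
conn.executemany("INSERT INTO friends VALUES (?, ?)", [(1, 2), (2, 3)])


def get_friends(conn, user_id):
    """Return the other side of every row that mentions user_id in either column."""
    rows = conn.execute(
        "SELECT friend_user_id AS friend_id FROM friends WHERE user_id = ?"
        " UNION ALL"
        " SELECT user_id AS friend_id FROM friends WHERE friend_user_id = ?",
        (user_id, user_id),
    ).fetchall()
    return sorted(r[0] for r in rows)


print(get_friends(conn, 2))  # [1, 3]
```

The same ID is bound twice because the query has two placeholders, one per branch.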
I'm trying to do things a bit differently with a database. I have a table called "services", which consists of pID, uID, serviceID.
Then I have a table called "user_profile", which of course has the same uID as used in the services table.
So a user can have multiple services, let's say
pID uID serviceID
1 1 101
2 1 102
3 1 104
4 2 105
So how do I join this to my user_profile data? I'm a bit confused about that.
Let's say somebody visits the profile with uID 1.
Then I need all of that user's services in the same SQL call, if that's possible somehow?
Hope I make a bit of sense.
To relate tables in SQL, both tables must share a column, in your example uID.
Then you write something like:
select a.uID,b.pID,b.serviceID from user_profile a left join services b on a.uID=b.uID
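Here is a runnable sketch of that LEFT JOIN restricted to one visited profile, using Python's sqlite3 as a stand-in database. The name column and the sample profiles are assumptions for illustration; the services rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "name" is a hypothetical profile column added for illustration.
conn.execute("CREATE TABLE user_profile (uID INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE services (pID INTEGER PRIMARY KEY, uID INTEGER, serviceID INTEGER)"
)
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol")],
)
conn.executemany(
    "INSERT INTO services VALUES (?, ?, ?)",
    [(1, 1, 101), (2, 1, 102), (3, 1, 104), (4, 2, 105)],
)


def profile_with_services(conn, uid):
    """One row per service; a profile without services still yields one row."""
    return conn.execute(
        "SELECT a.uID, a.name, b.pID, b.serviceID"
        " FROM user_profile a"
        " LEFT JOIN services b ON a.uID = b.uID"
        " WHERE a.uID = ?"
        " ORDER BY b.pID",
        (uid,),
    ).fetchall()


for row in profile_with_services(conn, 1):
    print(row)
```

Because it is a LEFT JOIN, a user with no services comes back as a single row whose service columns are NULL (None in Python), rather than disappearing from the result.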
I'm building a following system for my site. For example, a user follows 500 people, and when that user goes to the main page, it shows messages posted by those 500 people. I would write a query like:
SELECT UserComments FROM comments_table
WHERE UserName = user1 OR user2 OR user3(...)
ORDER BY PostDate DESC.
The problem is that I want to fetch comments from the followed users with the OR operator, but I would have to put a lot of OR operators between the usernames. Is it good practice to chain that many OR operators?
What should I do? I'm using MySQL.
You're better off storing these relationships in a table and joining to the table per user.
So you have a User table with an ID. A Following table with LoggedInUserID and FollowedUserID. Then join through the Following table to get comments. Insert 1 record into Following for each User the LoggedInUser is following. This way you keep track of the relationships between users and make good use of the indexes on these tables.
SELECT comments.UserComments FROM Following
INNER JOIN comments on comments.UserID = Following.FollowedUserID
WHERE Following.LoggedInUserID = ?; -- bind the ID of the logged-in user here
User
UserID Field1 Field2
1 "Stuff" "OtherStuff"
2 "More" "Blah"
3 "And" "Blerg"
Following
LoggedInUserID FollowedUserID
1 2
1 3
2 1
3 2
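The join through the Following table can be sketched as below, with Python's sqlite3 standing in for MySQL. The Following rows are the ones listed above; the comments rows, the PostDate values, and the feed helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Following ("
    " LoggedInUserID INTEGER NOT NULL,"
    " FollowedUserID INTEGER NOT NULL,"
    " PRIMARY KEY (LoggedInUserID, FollowedUserID))"
)
conn.execute(
    "CREATE TABLE comments (UserID INTEGER NOT NULL, UserComments TEXT, PostDate TEXT)"
)
conn.executemany("INSERT INTO Following VALUES (?, ?)", [(1, 2), (1, 3), (2, 1), (3, 2)])
# Hypothetical comments, one per user, with ISO dates so text ordering works.
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [
        (2, "hello from 2", "2024-01-01"),
        (3, "hello from 3", "2024-01-02"),
        (1, "hello from 1", "2024-01-03"),
    ],
)


def feed(conn, logged_in_user_id):
    """Comments of everyone the given user follows, newest first."""
    rows = conn.execute(
        "SELECT comments.UserComments"
        " FROM Following"
        " INNER JOIN comments ON comments.UserID = Following.FollowedUserID"
        " WHERE Following.LoggedInUserID = ?"
        " ORDER BY comments.PostDate DESC",
        (logged_in_user_id,),
    ).fetchall()
    return [r[0] for r in rows]


print(feed(conn, 1))  # ['hello from 3', 'hello from 2']
```

The query text stays the same size no matter how many people a user follows, which is the point of moving the list of followed users into a table.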
I think you should use IN in this case:
SELECT UserComments FROM comments_table
WHERE UserName IN (user1, user2, user3, .... userN)
ORDER BY PostDate DESC
Also, a long chain of OR operators can hurt performance.
I would simply like to find a database structure in MySQL that lets me get all of a user's friends of friends, and the corresponding query to retrieve them. (Friend links are bi-directional.)
I have found a couple posts related to that, but my concern is the performance:
Structure 1
Many posts suggest a structure where you have a table in which each row represents a friendship link e.g:
CREATE TABLE `friends` (
`user_id` int(10) unsigned NOT NULL,
`friend_id` int(10) unsigned NOT NULL
)
Say user '1' has three friends '2', '3', '4' and user '2' has two friends '1', '5'. The friends table would look like this:
user_id | friend_id
1 | 2
1 | 3
1 | 4
2 | 1
2 | 5
friends of friends query: How to select friends of friends can be seen in "SQL to get friends AND friends of friends of a user". The result of the query for user '1' is supposed to be (1,2,3,4,5).
My concern: The average fb-user has about 140 friends. Frequent users will have a lot more.
If I have 20,000 users this will end up at roughly 3 million rows (20,000 x 140 = 2.8 million).
Structure 2
If I could use a structure like this:
CREATE TABLE `friends` (
`user_id` int(10) unsigned NOT NULL,
`friend_1` int(10) unsigned NOT NULL,
`friend_2` int(10) unsigned NOT NULL,
`friend_3` int(10) unsigned NOT NULL,
`friend_4` int(10) unsigned NOT NULL,
....
)
My table would look like this (taking example from above):
user_id | friend_1 | friend_2 | friend_3 | ...
1 | 2 | 3 | 4 |
2 | 1 | 5 | |...
Now I have only 20,000 rows.
friends of friends query: To select user friends of friends I tried
Select * FROM friends as a
WHERE a.user_id
IN (
SELECT * FROM friends AS b
WHERE b.user_id = '1'
)
but I get an error "#1241 - Operand should contain 1 column(s) ". I think the problem is, that the sub-selection passes a row, not a column?
Questions
I hope you understand my concern. I would be really really happy about any input to these questions
1)
find a query that returns all friends of friends for a specified user in structure 2?
2)
Which structure allows me to return friends of friends quicker?
In structure 2 I think the "join row with column" could be slow, if it's even possible to use a join here. Thank you for any suggestions. If you can think of any other structures, maybe taking advantage of the small-world network type, I'd be happy to hear them.
THANK YOU!!
Definitely use the first structure. Queries for the second structure will be huge, hard to maintain and slow because of complicated clauses.
A fast enough query for the first approach:
(
select friend_id
from friends
where user_id = 1
) union (
select distinct ff.friend_id
from
friends f
join friends ff on ff.user_id = f.friend_id
where f.user_id = 1
)
For the best performance you need to have these indexes:
ALTER TABLE `friends` ADD UNIQUE INDEX `friends_idx` (`user_id` ASC, `friend_id` ASC);
ALTER TABLE `friends` ADD INDEX `friends_user_id_idx` (`user_id` ASC);
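The sketch below runs the friends plus friends-of-friends query against the sample rows from the question, using Python's sqlite3 as a stand-in for MySQL. Two differences are assumptions of this sketch: SQLite does not accept the parentheses around the UNION branches, so they are dropped, and only the composite unique index is created. friends_and_fof is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE friends (user_id INTEGER NOT NULL, friend_id INTEGER NOT NULL)"
)
# Composite unique index: prevents duplicate links and serves lookups by user_id.
conn.execute("CREATE UNIQUE INDEX friends_idx ON friends (user_id, friend_id)")

# Sample rows from the question: user 1 has friends 2, 3, 4; user 2 has friends 1, 5.
conn.executemany(
    "INSERT INTO friends VALUES (?, ?)",
    [(1, 2), (1, 3), (1, 4), (2, 1), (2, 5)],
)

FOF_SQL = """
SELECT friend_id FROM friends WHERE user_id = ?
UNION
SELECT ff.friend_id
FROM friends f
JOIN friends ff ON ff.user_id = f.friend_id
WHERE f.user_id = ?
"""


def friends_and_fof(conn, user_id):
    """Direct friends plus friends of friends; UNION removes duplicates."""
    rows = conn.execute(FOF_SQL, (user_id, user_id)).fetchall()
    return sorted(r[0] for r in rows)


print(friends_and_fof(conn, 1))  # [1, 2, 3, 4, 5]
```

Note that the result contains user 1 itself, because 1 is a friend of 2. That matches the (1,2,3,4,5) expected in the question; adding a filter on ff.friend_id would exclude the starting user if that is not wanted.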
I'd say you ought to use the first structure. It's more flexible in my opinion. My solution for the query would be a simple sub-query, like this:
SELECT friend_id FROM friends WHERE user_id IN (
SELECT friend_id FROM friends WHERE user_id='$USER_ID'
);
EDIT: Sorry I just woke up and realized after posting a reply that this wasn't at all what you were looking for. Sry.
Don't use "Structure 2": you cannot create a column for every friend. If even one user has, say, 100 friends (what about 10K friends or more?), the table needs that many columns, which results in poor performance. For structure 1 you can do a simple join of the table to itself:
select u.user_id, f.friend_id
from friends as u
inner join friends as f
on (f.user_id = u.friend_id);
EDIT:
Your error #1241 means that you used * in the subselect and the table returns more than 1 column. Your subquery should return just one column (no matter how many rows), so replace the "*" with "user_id" (without quotes).
Solution 1 is not only faster, it is also more flexible. I don't recommend a subquery for a simple select like this; just join the table to itself (it's much faster than a subselect).
Solution 2, in my opinion, is not a solution at all: it's not flexible, it's slower, and it uses more disk space; more columns means less performance in MySQL. How can you index such a thing? And how would you select by friend_id instead of user_id? Would you look in every column for that friend_id?
As the below answers state, solution 1 is preferred to solution 2. Also solution 1 will work out for a decent amount of data.
However, when things go bigger there is also a third solution - Graph Databases.
When your data model focuses on the "relations" instead of the "objects", RDBMSs don't scale well, since they have to perform lookups through the tables concerned. DB indexes make this easier, but that was not enough, so graph databases came to the rescue.
A Graph DB actually "stores" the relations next to each entity making it much faster to perform tasks like yours.
Here is some information to get you started:
http://www.slideshare.net/maxdemarzi/graph-database-use-cases
Neo4j or OrientDB are among the popular choices.
I've got 3 tables that are something like this (simplified here ofc):
users
user_id
user_name
info
info_id
user_id
rate
contacts
contact_id
user_id
contact_data
users has a one-to-one relationship with info, although info doesn't always have a related entry.
users has a one-to-many relationship with contacts, although contacts doesn't always have related entries.
I know I can grab the proper 'users' + 'info' with a left join, is there a way to get all the data I want at once?
For example, one returned record might be:
user_id: 5
user_name: tom
info_id: 1
rate: 25.00
contact_id: 7
contact_data: 555-1212
contact_id: 8
contact_data: 555-1315
contact_id: 9
contact_data: 555-5511
Is this possible with a single query? Or must I use multiple?
It is possible to do what you're asking in one query, but you'd either need a variable number of columns which is evil because SQL isn't designed for that, or you'd have to have a fixed number of columns, which is even more evil because there is no sensible fixed number of columns you could choose.
I'd suggest using one of two alternatives:
1. Return one row for each contact data, repeating the data in other columns:
5 tom 1 25.00 7 555-1212
5 tom 1 25.00 8 555-1315
5 tom 1 25.00 9 555-5511
The problem with this of course is that redundant data is normally a bad idea, but if you don't have too much redundant data it will be OK. Use your judgement here.
2. Use two queries. This means a slightly longer turnaround time, but less data to transfer.
In most cases I'd prefer the second solution.
You should try to avoid making a large number of queries inside a loop. This can almost always be rewritten to a single query. But if using two queries is the most natural way to solve your problem, just use two queries. Don't try to cram all the data you need into a single query just for the sake of reducing the number of queries.
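The two-query alternative can be sketched as follows, using Python's sqlite3 as a stand-in database. The sample values for user 5 come from the example record in the question; user 6 and the load_user helper are assumptions added to show the no-info, no-contacts case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT)")
conn.execute("CREATE TABLE info (info_id INTEGER PRIMARY KEY, user_id INTEGER, rate REAL)")
conn.execute(
    "CREATE TABLE contacts (contact_id INTEGER PRIMARY KEY, user_id INTEGER, contact_data TEXT)"
)
conn.executemany("INSERT INTO users VALUES (?, ?)", [(5, "tom"), (6, "ann")])
conn.execute("INSERT INTO info VALUES (1, 5, 25.00)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [(7, 5, "555-1212"), (8, 5, "555-1315"), (9, 5, "555-5511")],
)


def load_user(conn, user_id):
    """Query 1 fetches the one-to-one data; query 2 fetches the contact list."""
    head = conn.execute(
        "SELECT u.user_id, u.user_name, i.info_id, i.rate"
        " FROM users u"
        " LEFT JOIN info i ON i.user_id = u.user_id"
        " WHERE u.user_id = ?",
        (user_id,),
    ).fetchone()
    if head is None:
        return None  # no such user
    contacts = conn.execute(
        "SELECT contact_id, contact_data FROM contacts"
        " WHERE user_id = ? ORDER BY contact_id",
        (user_id,),
    ).fetchall()
    return {
        "user_id": head[0],
        "user_name": head[1],
        "info_id": head[2],
        "rate": head[3],
        "contacts": contacts,
    }


print(load_user(conn, 5))
```

This is a fixed two queries per user, not a query per contact, so it does not run into the query-in-a-loop problem mentioned above.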
Each row of a result set must have the same columns, so you can't return multiple contact rows without repeating the other columns as well.
Hopefully, this query would achieve what you need:
SELECT
u.user_id as user_id,
u.user_name as user_name,
i.info_id as info_id,
i.rate as rate,
c.contact_id as contact_id,
c.contact_data as contact_data
FROM users as u
LEFT JOIN info as i ON i.user_id = u.user_id
LEFT JOIN contacts as c ON c.user_id = u.user_id