Database choice: frequently querying 2nd degree connections - mysql

My web app needs to ALWAYS query 2nd degree connections. Each user has say 200 friends & those friends have 200 friends each. I could use some help in determining the right database (and table structure) to make this web app fast & responsive.
Business logic: Users search their 1st & 2nd degree connections to get a list of other users who use a specific service (stored in one column as unsigned int). That's the only functionality of this app.
Table structure:
User Table: User_ID (pk), Facebook_ID (sk), Name, Specific-service, Location
Relationship Table: still undecided.
Question: I read many posts & searched the web for "social networking database design". However, these applications feel much different than mine. I will have many users (+10 mil) but a small database & run only one query as described in business logic.
Additional info: Users can register (& subsequently log-in) only using their Facebook account. Their friends will be invited (via Facebook) to also register. The Relationship Table will be populated once friends register (only active/not-blocked/not-pending friends). Thus I can get rid of "friendship status" column from Relationship Table.

You need a table that has two ids in it; each row defines a 'Friend'. Is this relationship symmetric? That is, if A is a friend of B, is B a friend of A? I will assume the table holds two rows (A, B and B, A) when the friendship goes both ways.
Then
CREATE TABLE Friends (
user1 ...,
user2 ...,
PRIMARY KEY(user1, user2),
INDEX( user2, user1)
) ENGINE=InnoDB;
SELECT a.name, c.name
FROM Users AS a
JOIN Friends AS ab ON ab.user1 = a.user_id
JOIN Users AS b ON b.user_id = ab.user2
JOIN Friends AS bc ON bc.user1 = b.user_id
JOIN Users AS c ON c.user_id = bc.user2
WHERE a.user_id = ?
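As a rough sketch of how the two-row-per-friendship table supports the question's actual search (1st and 2nd degree connections who use a given service), the example below runs against an in-memory SQLite database as a stand-in for MySQL. The column types, the `service` column name, the sample rows, and the helper `connections_using` are assumptions made for illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (user_id INTEGER PRIMARY KEY, name TEXT, service INTEGER);
CREATE TABLE Friends (
    user1 INTEGER NOT NULL,
    user2 INTEGER NOT NULL,
    PRIMARY KEY (user1, user2)
);
CREATE INDEX friends_rev ON Friends (user2, user1);
""")
conn.executemany("INSERT INTO Users VALUES (?, ?, ?)", [
    (1, "Ann", 7), (2, "Bob", 7), (3, "Cid", 7), (4, "Dee", 9), (5, "Eve", 7),
])
# Symmetric friendships: store each pair in both directions.
pairs = [(1, 2), (2, 3), (2, 4), (1, 5)]
conn.executemany("INSERT INTO Friends VALUES (?, ?)",
                 pairs + [(b, a) for a, b in pairs])


def connections_using(conn, user_id, service):
    """Return 1st and 2nd degree connections of user_id that use `service`."""
    sql = """
    SELECT DISTINCT c.user_id, c.name
    FROM (
        -- 1st degree
        SELECT user2 AS friend_id FROM Friends WHERE user1 = :me
        UNION
        -- 2nd degree: friends of friends
        SELECT bc.user2
        FROM Friends AS ab
        JOIN Friends AS bc ON bc.user1 = ab.user2
        WHERE ab.user1 = :me
    ) AS reach
    JOIN Users AS c ON c.user_id = reach.friend_id
    WHERE c.service = :service AND c.user_id <> :me
    ORDER BY c.user_id
    """
    return conn.execute(sql, {"me": user_id, "service": service}).fetchall()


print(connections_using(conn, 1, 7))
```

Filtering on the service column after collecting the reachable ids keeps the join against Users to one lookup per distinct connection; whether that is fast enough at 200 x 200 friends per user would need measuring on real data.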

Related

Relating 2 tables (users and groups) via IDs

I have 2 tables in my MySQL database for users and groups. I need to relate users to groups and groups to users. The only way that came to mind is a group_ids column for users and a user_ids column for groups. I need both directions because I will show each user's groups on their profile, and each group's registered users on the group's users page.
With this option I need to store group IDs for users like "2,5,14", and likewise registered user IDs for groups like "22,24,15 ...".
It sounds okay to me, but parsing comma-separated IDs on the back end does not seem "professional". I also have performance concerns when a group has a huge number of users.
I know this seems like an opinion-based question, but I do not think it is.
Is there a usage like this in "data science"? I mean, is this common practice, or am I missing something here? I really can't think of anything else.
You could create a new table called user_group which stores user_id and group_id, both as foreign keys and together as the primary key.
Then you can get all groups for a user with the query below (group is a reserved word in MySQL, so the table name must be quoted with backticks):
SELECT item1, item2...
FROM user
INNER JOIN user_group ON user.user_id = user_group.user_id
INNER JOIN `group` ON user_group.group_id = `group`.group_id
WHERE user.user_id = id;
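A minimal sketch of the user_group junction table, run against in-memory SQLite rather than MySQL so it is self-contained. The `name` and `title` columns, the sample ids, and the helpers `groups_of` and `members_of` are made up for the example; SQLite accepts MySQL-style backticks around `group`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE `group` (group_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE user_group (
    user_id  INTEGER NOT NULL REFERENCES user(user_id),
    group_id INTEGER NOT NULL REFERENCES `group`(group_id),
    PRIMARY KEY (user_id, group_id)
);
INSERT INTO user VALUES (22, 'ana'), (24, 'ben');
INSERT INTO `group` VALUES (2, 'chess'), (5, 'hiking'), (14, 'jazz');
INSERT INTO user_group VALUES (22, 2), (22, 5), (24, 5), (24, 14);
""")


def groups_of(conn, user_id):
    """Titles of the groups a user has joined (for the profile page)."""
    sql = """
    SELECT g.title
    FROM user_group ug
    JOIN `group` g ON g.group_id = ug.group_id
    WHERE ug.user_id = ?
    ORDER BY g.group_id
    """
    return [row[0] for row in conn.execute(sql, (user_id,))]


def members_of(conn, group_id):
    """Names of the users registered in a group (for the group page)."""
    sql = """
    SELECT u.name
    FROM user_group ug
    JOIN user u ON u.user_id = ug.user_id
    WHERE ug.group_id = ?
    ORDER BY u.user_id
    """
    return [row[0] for row in conn.execute(sql, (group_id,))]


print(groups_of(conn, 22))
print(members_of(conn, 5))
```

Both directions read from the same table, so there is no comma-separated list to parse or keep in sync.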

mysql - Maintaining Subscription List of a Group in a Website

I'm creating a website where users can join certain groups. Now I need to maintain the set of users in each group and/or the set of groups that each user has joined. Since MySQL doesn't support arrays, I cannot maintain, say, an array of users in a group (as a field in the "groups" table) or an array of groups in a user (as a field in the "users" table). So how can I achieve this?
My current solution is to maintain a table of group-subscriptions which has fields for the userID and groupID. So when I need either of these two lists I can do,
SELECT USERID FROM SUBSCRIPTIONS WHERE GROUPID=3
or
SELECT GROUPID FROM SUBSCRIPTIONS WHERE USERID=4
This will get me the desired lists. Is this the most efficient/standard way to do this or is there a better way?
What you wrote is correct.
Normally there are 3 types of relations between records in relational databases:
One - one (e.g. user and profile linked via user.profile_id = profile.id)
One - many (user and messages linked via message.user_id = user.id)
Many - many
Your case is the last one, and it is always implemented via a third table.
In your case it can be users_subscriptions (user_id, subscription_id)
Example query to select all users with their subscriptions:
SELECT u.name, GROUP_CONCAT(s.name) as `subscriptions`
FROM users u
JOIN users_subscriptions us ON us.user_id = u.id
JOIN subscriptions s ON us.subscription_id = s.id
GROUP BY u.id
If I understand your question correctly, that is the standard way.
You've created a "pivot table" that sits between the user table and the groups table and it stores the relationships between the two. This is the way that many-to-many relationships are stored in relational databases. As you correctly stated, you can retrieve all members of a group or all groups for a member that way.
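The pivot-table approach from these answers can be sketched end to end as follows, using in-memory SQLite as a stand-in for MySQL (both provide GROUP_CONCAT). The sample rows and the composite primary key on users_subscriptions are assumptions added for illustration; the key is one way to stop the same user being subscribed twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE users_subscriptions (
    user_id         INTEGER NOT NULL,
    subscription_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, subscription_id)
);
INSERT INTO users VALUES (4, 'dana'), (9, 'eli');
INSERT INTO subscriptions VALUES (3, 'news'), (8, 'sports');
INSERT INTO users_subscriptions VALUES (4, 3), (4, 8), (9, 3);
""")

rows = conn.execute("""
    SELECT u.name, GROUP_CONCAT(s.name) AS subscriptions
    FROM users u
    JOIN users_subscriptions us ON us.user_id = u.id
    JOIN subscriptions s ON us.subscription_id = s.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# The order inside GROUP_CONCAT is not guaranteed, so sort after splitting.
subscriptions_by_user = {name: sorted(subs.split(",")) for name, subs in rows}
print(subscriptions_by_user)

# The composite primary key rejects a second copy of the same pair.
try:
    conn.execute("INSERT INTO users_subscriptions VALUES (4, 3)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```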

How do i get relationships between users?

I have a table called users, that looks like this for example:
Table: users
username id
Simon 6
Ida 7
And a relationships table
Table: Relationships
me partner
6 7
Now, for every relationship that is created, the me and partner columns get their ids depending on which of the users sent the request.
If Ida sends a request to Simon, the row looks like this: me: 7 partner: 6
because Ida is id number 7 and Simon is number 6.
But if Simon sends the request, the row looks like this: me: 6 partner: 7
What I want to do is write a query that gets the right id from the relationships table for every user with a relationship, and then uses that id to join the users table, get the partner's username and print it out.
My problem is that the me and partner columns can hold different values depending on which of the 2 users sent the request first.
How do I write a query that prints out the right information for each and every user and gives them the right id of their partner?
How the output should look:
From Simon's point of view: You are in a relationship with Ida.
From Ida's point of view: You are in a relationship with Simon.
I want to offer an alternative.
Join the user table to the relationship table on either the me column or the partner column.
Then join the user table back onto the relationship table using the same criteria, excluding the user's own row so that only the partner is returned.
SELECT u.username, u2.username AS partner
FROM users u
JOIN relationships r ON u.id = r.me OR u.id = r.partner
JOIN users u2 ON (r.me = u2.id OR r.partner = u2.id) AND u2.id <> u.id
WHERE u.id = 'THE USER YOU CARE ABOUT'
Untested. If you throw up some CREATE TABLEs and INSERT INTOs we can all have a play.
If you want to get all users which have a relationship with some user (ex.: id:2) you specify, I would use something like this:
SELECT id, username
FROM user,
(SELECT receiver_id as rel_id FROM relationship WHERE sender_id=2 union
SELECT sender_id as rel_id FROM relationship WHERE receiver_id=2) tab
WHERE id=rel_id
The inner statement means you are selecting all the relationships in which 2 was the sender and all the relationships in which 2 was the receiver and stacking them in the same column (union).
After that, the table user is used to get the user name from the id.
Select username as username,
(select username name from users uSubq where uSubq.id = r.partner) as partnername
from users u join relationships r on u.id=r.me
UNION
Select username as username,
(select username name from users uSubq where uSubq.id = r.me) as partnername
from users u join relationships r on u.id=r.partner;
Here is the test for the query: Sql Fiddle
If a pair can appear in only one direction (either Ida sent the request to Simon or the reverse, but not both), use UNION ALL for better performance, since it skips the duplicate-elimination step.
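Another way to handle the "either column" problem, shown here only as a sketch, is to pick the other side of each row with a CASE expression. It runs against in-memory SQLite in place of MySQL; the third user Nils and the helper `partners_of` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE relationships (me INTEGER NOT NULL, partner INTEGER NOT NULL);
INSERT INTO users VALUES (6, 'Simon'), (7, 'Ida'), (8, 'Nils');
INSERT INTO relationships VALUES (7, 6);  -- Ida sent the request to Simon
""")


def partners_of(conn, user_id):
    """Usernames on the other side of every relationship row of user_id."""
    sql = """
    SELECT p.username
    FROM relationships r
    JOIN users p
      ON p.id = CASE WHEN r.me = :uid THEN r.partner ELSE r.me END
    WHERE :uid IN (r.me, r.partner)
    ORDER BY p.id
    """
    return [row[0] for row in conn.execute(sql, {"uid": user_id})]


for uid in (6, 7):
    for name in partners_of(conn, uid):
        print(f"You are in a relationship with {name}.")
```

The result is the same regardless of who sent the request, because the CASE expression always returns the id that is not the current user's.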

how to find a user available online by checking two tables?

I have 2 tables 'Users' and 'Friends':
Table: Users(username,email,location,online_status)
Table: Friends(Friend_rq_from,Friend_request_to,status)
The 'Users' table holds the registered users of a website,
and the 'Friends' table holds each user's friends.
I want to count, e.g., mithun's friends from the 'Friends' table (status 1 means they are friends) who are online in the 'Users' table (online_status 1 means online).
It's a little bit difficult for me because the 'Friends' table has 2 fields, and 'mithun' can be in either the 'Friend_rq_from' field or the 'Friend_request_to' field.
How can I find this?
NB: Friend_rq_from: indicates who started the friend request,
Friend_request_to: indicates to whom the request was sent (or who is waiting to accept)
SELECT COUNT(DISTINCT u.username)
FROM Users u
INNER JOIN Friends f
ON (u.username = f.Friend_rq_from AND f.Friend_request_to = 'mithun' AND f.status = 1)
OR (u.username = f.Friend_request_to AND f.Friend_rq_from = 'mithun' AND f.status = 1)
WHERE u.online_status = 1 AND u.username <> 'mithun'
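To check the counting query on a small data set, here is a sketch against in-memory SQLite in place of MySQL. The column types, the sample users other than mithun, and the helper `online_friend_count` are assumptions for the example; the column names follow the question's table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (username TEXT PRIMARY KEY, online_status INTEGER);
CREATE TABLE Friends (
    Friend_rq_from    TEXT,
    Friend_request_to TEXT,
    status            INTEGER
);
INSERT INTO Users VALUES
    ('mithun', 1), ('asha', 1), ('ravi', 0), ('lena', 1), ('omar', 1);
INSERT INTO Friends VALUES
    ('mithun', 'asha', 1),   -- accepted, asha is online
    ('ravi', 'mithun', 1),   -- accepted, ravi is offline
    ('lena', 'mithun', 1),   -- accepted, lena is online
    ('mithun', 'omar', 0);   -- not accepted yet
""")


def online_friend_count(conn, username):
    """Count accepted friends of `username` whose online_status is 1."""
    sql = """
    SELECT COUNT(DISTINCT u.username)
    FROM Users u
    INNER JOIN Friends f
      ON (u.username = f.Friend_rq_from
          AND f.Friend_request_to = :me AND f.status = 1)
      OR (u.username = f.Friend_request_to
          AND f.Friend_rq_from = :me AND f.status = 1)
    WHERE u.online_status = 1 AND u.username <> :me
    """
    return conn.execute(sql, {"me": username}).fetchone()[0]


print(online_friend_count(conn, "mithun"))
```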

MySQL Multiple interests matching problem

I have a database where users enter their interests. I want to find people with matching interests.
The structure of the interest table is
interestid | username | hobby | location | level | matchinginterestids
Let's take two users to keep it simple.
User Joe may have 10 different interest records
User greg may have 10 different interest records.
I want to do the following algorithm
Take Joe's interest record 1 and look for matching hobbies and locations in the interest table. Put any matching interest ids in the matchinginterestids field. Then go to Joe's interest record 2, and so on.
I guess what I need is some sort of for loop that goes through all of Joe's interests and does an update each time it finds a match in the interest table. Is that even possible in MySQL?
Further example:
I am Dan. I have 3 interests. Each interest is composed of 3 subjects:
Dan cats,nutrition,hair
Dan superlens,dna,microscopes
Dan film,slowmotion,fightscenes
Other people may have other interests
Joe:
Joe cats,nutrition,strength
Joe superlens,dna,microscopes
Moe
Moe mysql,queries,php
Moe film,specialfx,cameras
Moe superlens,dna,microscopes
Now I want the query to return the following when I log in as Dan:
Here are your interest matches:
--- is interested in cats nutrition hair
Joe is interested in cats and nutrition
Joe and Moe are interested in superlens, dna, microscopes
Moe is interested in film
The query needs to iterate through all Dan's interests, and compare 3,2,1 subject matches.
I could do this in php from a loop but it would be calling the database all the time to get the results. I was wondering if there's a crafty way to do it using a single query Or maybe 3 separate queries one looking for 3 matches, one for 2 and one for 1.
This is definitely possible with MySQL, but I think you may be going about it in an awkward way. I would begin by structuring the tables as follows:
TABLE Users ( userId, username, location )
TABLE Interests( interestId, hobby )
TABLE UserInterests( userId, interestId, level )
When a user adds an interest, if it hasn't been added before, you add it to the Interests table, and then add it to the UserInterests table. When you want to check for other nearby folks with similar interests, you can simply query the UserInterests table for other people who have similar interests, which has all that information for you already:
SELECT DISTINCT userId
FROM UserInterests
WHERE interestId IN (
SELECT interestId
FROM UserInterests
WHERE userId = $JoesID
)
AND userId <> $JoesID
This can probably be done in a more elegant fashion without subqueries, but it's what I thought of now.
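For a quick check of the subquery approach, the sketch below runs it on in-memory SQLite rather than MySQL. The sample rows and the helper `users_sharing_interest` are made up, and the `$JoesID` placeholder becomes a bound parameter. The extra `userId <> :me` condition, an addition for this example, keeps the user out of their own results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserInterests (
    userId     INTEGER NOT NULL,
    interestId INTEGER NOT NULL,
    level      INTEGER,
    PRIMARY KEY (userId, interestId)
);
INSERT INTO UserInterests VALUES
    (1, 10, 3), (1, 11, 2),   -- user 1 (Joe)
    (2, 11, 1), (2, 12, 5),   -- user 2 shares interest 11 with Joe
    (3, 12, 4);               -- user 3 shares nothing with Joe
""")


def users_sharing_interest(conn, user_id):
    """Ids of other users who have at least one interest in common."""
    sql = """
    SELECT DISTINCT userId
    FROM UserInterests
    WHERE interestId IN (
        SELECT interestId FROM UserInterests WHERE userId = :me
    )
      AND userId <> :me
    ORDER BY userId
    """
    return [row[0] for row in conn.execute(sql, {"me": user_id})]


print(users_sharing_interest(conn, 1))
```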
Posted by special request from daniel, even though it partly duplicates the answer above.
The schema explained
TABLE User (id, username, location )
TABLE Interests(id, hobby )
TABLE UserInterest(userId, interestId, level )
Table User has just user data and a primary key field at the start: id.
The primary key field is a pure link field; the other fields are info fields.
Table Interests again has a primary key that is used to link against, and some info fields
(well, just one, but that's because this is an example).
Note that users and interests are not linked in any way whatsoever.
That's odd, why is that?
Well, there is a problem: one user can have multiple interests, and an interest can belong to multiple people.
We could try to solve this by changing the users table like so:
TABLE users (id, username, location, intrest1, intrest2, intrest3)
But this is a really bad idea, because:
This way only 3 interests per user are allowed
It's a waste of space if many users have 2, 1 or no interests
And most important, it makes queries difficult to write.
Example query for linking with the bad users table
SELECT * FROM users
INNER JOIN Interests ON (users.intrest1 = Interests.id) OR
(users.intrest2 = Interests.id) OR
(users.intrest3 = Interests.id);
And that's just for a simple query listing all users and their interests.
It quickly gets horribly complex as things progress.
Many-to-many relationships
The solution to the problem of a many-to-many relationship is to use a link table.
This reduces the many-to-many relationship to two 1-to-many relationships.
A: 1 user to many UserInterest rows
B: 1 interest to many UserInterest rows
Example query using a link-table
SELECT * FROM User
INNER JOIN UserInterest ON (User.id = UserInterest.userId) /* many-to-1 */
INNER JOIN Interests ON (Interests.id = UserInterest.interestId); /* many-to-1 */
Why is this better?
Unlimited number of interests per user and vice versa
No wasted space if a user has a boring life and few if any interests
Queries are simpler to maintain
Making it interesting
Just listing all users is not much fun, because we would still have to process the data in PHP or whatever. But there's no need to do that: SQL is a query language after all, so let's ask a question:
Give all users that share an interest with user Moe.
OK, let's make a cookbook and gather our ingredients. What do we need?
Well, we have a user "Moe" and we have other users, everybody but "Moe".
And we have the interests shared between them.
And we'll need the link table UserInterest as well, because that's the way we link users and interests.
Let's first list all of Moe's hobbies
SELECT i_Moe.hobby FROM Interests AS i_Moe
INNER JOIN UserInterest AS ui2 ON (ui2.interestId = i_Moe.id)
INNER JOIN User AS u_Moe ON (u_Moe.id = ui2.userId)
WHERE u_Moe.username = 'Moe';
Now we combine the select for all users against only Moe's hobbies.
SELECT DISTINCT u_Others.username FROM Interests AS i_Others
INNER JOIN UserInterest AS ui1 ON (ui1.interestId = i_Others.id)
INNER JOIN User AS u_Others ON (ui1.userId = u_Others.id)
/*up to this point this query is a list of all interests of all users*/
INNER JOIN Interests AS i_Moe ON (i_Moe.hobby = i_Others.hobby)
/*Here we link Moe's hobbies to other people's hobbies*/
INNER JOIN UserInterest AS ui2 ON (ui2.interestId = i_Moe.id)
INNER JOIN User AS u_Moe ON (u_Moe.id = ui2.userId)
/*And using the link table we link Moe's hobbies to Moe*/
WHERE u_Moe.username = 'Moe'
/*We limited user u_Moe to only 'Moe'*/
AND u_Others.username <> 'Moe';
/*and the rest to everybody except 'Moe'*/
Because we are using INNER JOINs on the link fields, only matches are considered and non-matches are thrown out.
If you read the query in English it goes like this.
Consider all users who are not Moe, call them u_Others.
Consider user Moe, call him u_Moe.
Consider user Moe's hobbies, call those i_Moe.
Consider other users' hobbies, call those i_Others.
Now link the i_Others hobbies to the i_Moe hobbies.
Return only users from u_Others that have a hobby that matches Moe's.
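The question also asked to rank matches by how many subjects are shared (3, 2 or 1). One possible way to get that from the link table, sketched here on in-memory SQLite instead of MySQL, is to join UserInterest to itself and count per user. The sample data is loosely based on the Dan/Joe/Moe example with each subject stored as its own hobby; that data layout and the helper `shared_hobbies` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE Interests (id INTEGER PRIMARY KEY, hobby TEXT UNIQUE);
CREATE TABLE UserInterest (
    userId     INTEGER NOT NULL,
    interestId INTEGER NOT NULL,
    PRIMARY KEY (userId, interestId)
);
INSERT INTO User VALUES (1, 'Dan'), (2, 'Joe'), (3, 'Moe');
INSERT INTO Interests VALUES
    (1, 'cats'), (2, 'nutrition'), (3, 'hair'),
    (4, 'film'), (5, 'strength'), (6, 'mysql');
INSERT INTO UserInterest VALUES
    (1, 1), (1, 2), (1, 3), (1, 4),   -- Dan
    (2, 1), (2, 2), (2, 5),           -- Joe
    (3, 4), (3, 6);                   -- Moe
""")


def shared_hobbies(conn, username):
    """Other users with the number of hobbies they share, most first."""
    sql = """
    SELECT other.username, COUNT(*) AS shared
    FROM User AS me
    JOIN UserInterest AS mine ON mine.userId = me.id
    JOIN UserInterest AS theirs
      ON theirs.interestId = mine.interestId
     AND theirs.userId <> me.id
    JOIN User AS other ON other.id = theirs.userId
    WHERE me.username = :name
    GROUP BY other.id, other.username
    ORDER BY shared DESC, other.username
    """
    return conn.execute(sql, {"name": username}).fetchall()


print(shared_hobbies(conn, "Dan"))
```

Because both sides reference the same interestId, the self-join does not need the Interests table at all; it is only needed if the hobby names should be listed too.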
Hope this helps.