Query for suggested friends based on mutual friend count? - mysql

I'd like to suggest users based on mutual friend count, like this:
Suggested Friends:
Amy Adams (42 mutual friends)
Brian Bautista (21 mutual friends)
Chris Cross (6 mutual friends)
Note: I have read several similar posts and answers, but the solutions depend heavily on the table structure, and I haven't been able to get anything to work with how our Friendships table is set up. See below.
USERS TABLE
id
name
FRIENDSHIPS TABLE
id
initiatingUserId (FK to Users Table)
targetUserId (FK to Users Table)
status ('active', 'denied', 'pending')
As you can see, there is a single row for each friendship. The user who sent the friend request is the initiating user, and the person who accepts the friend request is the target user. There are indexes on both those columns.
How can I efficiently get a list of users who are not yet friends with the current user, along with their mutual friend counts?
Here is a sample data set, as per @Strawberry's request...
CREATE TABLE `users` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`firstName` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`lastName` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `friends` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`initiatingUserId` int(10) unsigned NOT NULL,
`targetUserId` int(10) unsigned NOT NULL,
`status` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'pending',
PRIMARY KEY (`id`),
UNIQUE KEY `compositeInitiatingUserIdTargetUserIdIndex` (`initiatingUserId`,`targetUserId`),
KEY `friends_initiating_user_id` (`initiatingUserId`),
KEY `friends_target_user_id` (`targetUserId`),
CONSTRAINT `friends_ibfk_1` FOREIGN KEY (`initiatingUserId`) REFERENCES `users` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `friends_ibfk_2` FOREIGN KEY (`targetUserId`) REFERENCES `users` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
);
INSERT INTO `users` (`id`, `firstName`, `lastName`) VALUES
(1, 'A', 'A'),
(2, 'B', 'B'),
(3, 'C', 'C'),
(4, 'D', 'D'),
(5, 'E', 'E'),
(6, 'F', 'F'),
(7, 'G', 'G');
INSERT INTO `friends` (`id`, `initiatingUserId`, `targetUserId`, `status`) VALUES
( 1, 1, 2, 'active'),
( 2, 1, 3, 'active'),
( 3, 4, 1, 'active'),
( 4, 2, 3, 'active'),
( 5, 2, 4, 'active'),
( 6, 5, 2, 'active'),
( 7, 2, 6, 'active'),
( 8, 2, 7, 'active'),
( 9, 3, 5, 'active'),
(10, 6, 3, 'active'),
(11, 4, 5, 'active'),
(12, 6, 7, 'active');
...and SQL Fiddle of same
The desired output, assuming we're pulling suggested users for User ID 1, would be:
+----------------+-------------+
| strangerUserId | mutualCount |
+----------------+-------------+
| 5 | 2 |
+----------------+-------------+
| 6 | 2 |
+----------------+-------------+
| 4 | 1 |
+----------------+-------------+
| 7 | 1 |
+----------------+-------------+
I do have a solution, but it assumes we already have the ids of the user's friends, and I'm not sure how fast it'll run:
select
count(*) as 'mutualCount',
case when f.initiatingUserId in (2, 3) then f.targetUserId else f.initiatingUserId end as strangerUserId
from friends f
join users initiating on f.initiatingUserId=initiating.id
join users target on targetUserId=target.id
and (
(
initiatingUserId in (2, 3)
and targetUserId not in (1, 2, 3)
)
or
(
targetUserId in (2, 3)
and initiatingUserId not in (1, 2, 3)
)
)
group by strangerUserId
order by mutualCount desc;

I have come up with this quick (MySQL-only) solution.
Note: This query is not optimized for your existing indexes and will probably have performance issues on a larger data set. It will also be hard to modify should you need to add extra criteria. I would instead run the individual parts separately: pull the list of friends first, then the list of non-friends, and then run individual queries to get the mutual friend counts.
Here is the query to pull the list of non-friends with their mutual friend counts for userId 1. You will have to replace the hardcoded userId value of 1 with a dynamic variable:
SELECT COUNT(fnf.friendOfNonFriendId) AS mutualFriendCount, fnf.nonFriendId
FROM
-- Subquery to pull the list of non-friends and their friends
(SELECT
IF(f.initiatingUserId = nf.nonFriendId, f.targetUserId, f.initiatingUserId) AS friendOfNonFriendId,
nf.nonFriendId
FROM friends f
JOIN
(SELECT u.id AS nonFriendId
FROM users u
LEFT JOIN (
SELECT IF(f.initiatingUserId = 1, f.targetUserId, f.initiatingUserId) AS friendId
FROM friends f
WHERE f.status = 'active' AND (f.initiatingUserId = 1 OR f.targetUserId = 1)) f
ON f.friendId = u.id
WHERE f.friendId IS NULL AND u.id != 1) nf ON nf.nonFriendId = f.initiatingUserId
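OR nf.nonFriendId = f.targetUserId
-- Only count the non-friend's active friendships, not pending or denied ones
WHERE f.status = 'active') fnf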
-- Joining the subquery with the list of friends of a user with ID 1 to filter the above subquery to only mutual friends
JOIN
(SELECT IF(f.initiatingUserId = 1, f.targetUserId, f.initiatingUserId) AS friendId
FROM friends f
WHERE f.status = 'active' AND (f.initiatingUserId = 1 OR f.targetUserId = 1)) f
ON f.friendId = fnf.friendOfNonFriendId
GROUP BY fnf.nonFriendId
ORDER BY COUNT(fnf.friendOfNonFriendId) DESC;
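For comparison, here is a minimal sketch of a different approach, not the query above: treat each friendship row as two directed edges (a UNION ALL of both directions), then join the edge list to itself to walk from the current user to friends of friends. It runs on SQLite through Python's sqlite3 module purely so it can be executed anywhere; the `edges` CTE and the `suggest_friends` helper are names made up for illustration, and the schema is trimmed to the columns the query needs. One thing to note: in the sample data as posted, row 3 (4 -> 1, 'active') makes user 4 a friend of user 1 already, so the result differs from the desired table in the question, which treats only users 2 and 3 as friends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (
    id INTEGER PRIMARY KEY,
    initiatingUserId INTEGER NOT NULL,
    targetUserId INTEGER NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
);
INSERT INTO friends VALUES
    (1, 1, 2, 'active'), (2, 1, 3, 'active'), (3, 4, 1, 'active'),
    (4, 2, 3, 'active'), (5, 2, 4, 'active'), (6, 5, 2, 'active'),
    (7, 2, 6, 'active'), (8, 2, 7, 'active'), (9, 3, 5, 'active'),
    (10, 6, 3, 'active'), (11, 4, 5, 'active'), (12, 6, 7, 'active');
""")

# Each active friendship becomes two directed edges (a -> b and b -> a),
# so the rest of the query never has to ask who initiated the request.
SUGGEST_SQL = """
WITH edges AS (
    SELECT initiatingUserId AS a, targetUserId AS b
    FROM friends WHERE status = 'active'
    UNION ALL
    SELECT targetUserId, initiatingUserId
    FROM friends WHERE status = 'active'
)
SELECT fof.b AS strangerUserId, COUNT(*) AS mutualCount
FROM edges AS mine
JOIN edges AS fof ON fof.a = mine.b           -- friends of my friends
WHERE mine.a = :me
  AND fof.b <> :me                            -- not myself
  AND fof.b NOT IN (SELECT b FROM edges WHERE a = :me)  -- not already a friend
GROUP BY fof.b
ORDER BY mutualCount DESC, strangerUserId
"""


def suggest_friends(conn, user_id):
    """Return (strangerUserId, mutualCount) rows for the given user."""
    return conn.execute(SUGGEST_SQL, {"me": user_id}).fetchall()


suggestions = suggest_friends(conn, 1)
print(suggestions)
```

Pending and denied rows are simply ignored here; whether users with a pending request should be hidden from the suggestions is a product decision the sketch does not make.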

Related

MySQL Joining 3 Quiz Tables with Calculated Totals By Attribute/User

I have the following tables which I am trying to combine into one query (with the ultimate goal being to output a CSV file from PHP):
users
id, name, email
eg. data
1, John, email@email.com
2, Jane, email@email.com
questions - a static list of questions, each question is of one of 3 types stored as "attribute," which is a single letter [A, B, C]
id, text, attribute
eg. data
1, How cool are dogs?, A
2, How cool are cats?, B
3, How cool are fish?, A
4, How cool are mice?, C
5, How cool are birds?, B
users_questions - where answer is an integer [1-5]
id, user_id, question_id, answer
eg. data
1, 1, 1, 2
2, 1, 2, 5
3, 1, 3, 1
4, 1, 4, 1
5, 1, 5, 4
6, 2, 1, 4
7, 2, 2, 1
8, 2, 3, 3
9, 2, 4, 2
10, 2, 5, 2
Desired results:
I'm trying to combine all of this data in one query, with each user's answers totalled and grouped by attribute, so the output format is something like:
users.id, users.name, users.email , A_question_total, B_question_total, C_question_total
1, John , email@email.com, 3, 9, 1
2, Jane , email@email.com, 7, 3, 2
What I have currently:
I've tried the queries below which all give me almost what I need:
I'm able to select everything joined together, but this duplicates users and questions and doesn't give me the question totals by user/attribute:
Select * FROM users
JOIN users_questions ON users.id = users_questions.user_id
JOIN questions ON questions.id = users_questions.question_id;
I can also select all the question totals by user/attribute, but then I'd have to grab the users separately and then connect them together in PHP.
SELECT questions.attribute, users_questions.user_id, SUM(users_questions.answer) AS `total`
FROM `questions`
LEFT OUTER JOIN `users_questions` ON questions.id = users_questions.`question_id`
GROUP BY users_questions.user_id, questions.attribute;
I'm wondering if there is a way with complex joining, grouping, subqueries, etc. to be able to do this all in one query. I'm struggling as to how to combine the above two queries and also convert what is essentially separate calculated total "rows" in columns for each user.
Here's the sql dump of the example data:
DROP TABLE IF EXISTS `questions`;
CREATE TABLE `questions` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`text` varchar(30) DEFAULT NULL,
`attribute` enum('A','B','C') DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `questions` (`id`, `text`, `attribute`)
VALUES
(1,'How cool are dogs?','A'),
(2,'How cool are cats?','B'),
(3,'How cool are fish?','A'),
(4,'How cool are mice?','C'),
(5,'How cool are birds?','B');
DROP TABLE IF EXISTS `users`;
CREATE TABLE `users` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(10) DEFAULT NULL,
`email` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `users` (`id`, `name`, `email`)
VALUES
(1,'John','email@email.com'),
(2,'Jane','email@email.com');
DROP TABLE IF EXISTS `users_questions`;
CREATE TABLE `users_questions` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) unsigned DEFAULT NULL,
`question_id` int(11) unsigned DEFAULT NULL,
`answer` tinyint(1) unsigned DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `users_questions` (`id`, `user_id`, `question_id`, `answer`)
VALUES
(1,1,1,2),
(2,1,2,5),
(3,1,3,1),
(4,1,4,1),
(5,1,5,4),
(6,2,1,4),
(7,2,2,1),
(8,2,3,3),
(9,2,4,2),
(10,2,5,2);
Any help is appreciated!
I think this query (SQLFiddle) will solve your problem:
Select u.id, u.name, u.email,
SUM(case q.attribute when 'A' then uq.answer else 0 end) as A_question_total,
SUM(case q.attribute when 'B' then uq.answer else 0 end) as B_question_total,
SUM(case q.attribute when 'C' then uq.answer else 0 end) as C_question_total
FROM users u
JOIN users_questions uq ON u.id = uq.user_id
JOIN questions q ON q.id = uq.question_id
group by u.id
Output:
id name email A_question_total B_question_total C_question_total
1 John email@email.com 3 9 1
2 Jane email@email.com 7 3 2
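To check the conditional-aggregation idea end to end, here is a small sketch that runs it on SQLite through Python's sqlite3 module (an assumption made only so it is runnable without a MySQL server). It differs from the query above in two labelled ways: it uses LEFT JOINs so that a user with no answers still appears with zero totals, and it adds a hypothetical third user, Alex, who is not in the original data, to show that case. The email column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE questions (id INTEGER PRIMARY KEY, text TEXT, attribute TEXT);
CREATE TABLE users_questions (
    id INTEGER PRIMARY KEY, user_id INTEGER, question_id INTEGER, answer INTEGER
);
-- Alex (id 3) is hypothetical: a user who has answered nothing.
INSERT INTO users VALUES (1, 'John'), (2, 'Jane'), (3, 'Alex');
INSERT INTO questions VALUES
    (1, 'How cool are dogs?', 'A'), (2, 'How cool are cats?', 'B'),
    (3, 'How cool are fish?', 'A'), (4, 'How cool are mice?', 'C'),
    (5, 'How cool are birds?', 'B');
INSERT INTO users_questions VALUES
    (1, 1, 1, 2), (2, 1, 2, 5), (3, 1, 3, 1), (4, 1, 4, 1), (5, 1, 5, 4),
    (6, 2, 1, 4), (7, 2, 2, 1), (8, 2, 3, 3), (9, 2, 4, 2), (10, 2, 5, 2);
""")

# One SUM(CASE ...) per attribute turns the per-attribute rows into columns.
# With LEFT JOINs a user without answers yields a single all-NULL row,
# which falls into ELSE 0 and therefore sums to 0.
PIVOT_SQL = """
SELECT u.id, u.name,
       SUM(CASE q.attribute WHEN 'A' THEN uq.answer ELSE 0 END) AS A_question_total,
       SUM(CASE q.attribute WHEN 'B' THEN uq.answer ELSE 0 END) AS B_question_total,
       SUM(CASE q.attribute WHEN 'C' THEN uq.answer ELSE 0 END) AS C_question_total
FROM users AS u
LEFT JOIN users_questions AS uq ON uq.user_id = u.id
LEFT JOIN questions AS q ON q.id = uq.question_id
GROUP BY u.id, u.name
ORDER BY u.id
"""

totals = conn.execute(PIVOT_SQL).fetchall()
for row in totals:
    print(row)
```

If users without answers should be omitted instead, switching both joins back to plain JOIN gives the same shape as the original answer.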

mysql - grouping results taking the first in an ordered subquery

In MySQL, I'm having trouble pulling a single row for each foreign_id based on the largest value. Strangely, the query behaves differently across MySQL versions (listed below).
id foreign_id value
---------------------
1 1 1000
2 1 2000
3 2 2000
4 2 1000
5 3 2000
I'm trying to pull ids 2, 3, 5, not 1, 3, 5.
CREATE TABLE `docs` (
`id` int(11) NOT NULL,
`foreign_id` int(6) DEFAULT NULL,
`value` int(8) UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `docs`
ADD PRIMARY KEY (`id`),
ADD KEY `foreign_id_index` (`foreign_id`);
INSERT INTO `docs` (`id`, `foreign_id`, `value`) VALUES
(1, 1, 1000), (2, 1, 2000), (3, 2, 2000), (4, 2, 1000), (5, 3, 2000);
select
docs.id, docs.foreign_id, docs.value
FROM docs
INNER JOIN
(select id, max(value) from docs group by foreign_id) sub
ON sub.id = docs.id
# expected results are ids (2,3,5), not (1,3,5)
It's simpler than it looks
SELECT D.id, D.foreign_id, max_vals.max_val as value
FROM docs D
JOIN
(SELECT foreign_id, MAX(value) as max_val
FROM docs
GROUP BY foreign_id) max_vals
ON D.foreign_id=max_vals.foreign_id and D.value=max_vals.max_val
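Note that JOIN and INNER JOIN are equivalent in MySQL, so the join type is not the issue. The fix is to join on foreign_id and value: the id returned by your grouped subquery is not aggregated, so it is not guaranteed to come from the row that holds the maximum.
A harder case is when two or more rows with the same foreign_id share the maximum value; this query then returns all of them.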
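Here is a short sketch of both situations, run on SQLite through Python's sqlite3 module only so that it is easy to execute. The first query is the join on (foreign_id, MAX(value)) from the answer. The second is one possible tie-break, keeping the lowest id among tied rows; that rule is an assumption for illustration, not something the question specifies. The extra row (6, 3, 2000) is hypothetical and is added only to create a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE docs (id INTEGER PRIMARY KEY, foreign_id INTEGER, value INTEGER NOT NULL);
INSERT INTO docs VALUES
    (1, 1, 1000), (2, 1, 2000), (3, 2, 2000), (4, 2, 1000), (5, 3, 2000);
""")

# Join every row to its group's maximum; tied rows all survive.
ALL_MAX_ROWS = """
SELECT d.id, d.foreign_id, d.value
FROM docs AS d
JOIN (SELECT foreign_id, MAX(value) AS max_val
      FROM docs GROUP BY foreign_id) AS m
  ON d.foreign_id = m.foreign_id AND d.value = m.max_val
ORDER BY d.id
"""

# Same join, then collapse ties by keeping the smallest id per group.
ONE_ROW_PER_GROUP = """
SELECT MIN(d.id) AS id, d.foreign_id, d.value
FROM docs AS d
JOIN (SELECT foreign_id, MAX(value) AS max_val
      FROM docs GROUP BY foreign_id) AS m
  ON d.foreign_id = m.foreign_id AND d.value = m.max_val
GROUP BY d.foreign_id, d.value
ORDER BY d.foreign_id
"""

without_tie = conn.execute(ALL_MAX_ROWS).fetchall()

# Hypothetical extra row: a second document for foreign_id 3 with the same max.
conn.execute("INSERT INTO docs VALUES (6, 3, 2000)")
with_tie = conn.execute(ALL_MAX_ROWS).fetchall()
tie_broken = conn.execute(ONE_ROW_PER_GROUP).fetchall()

print(without_tie)
print(with_tie)
print(tie_broken)
```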

Joining table with min(amount) does not work

I have 3 tables, but data is only fetched from 2 of them.
I'm trying to get the lowest bid for each selected item and display the name of the user who placed it.
The query works until the user name is displayed: it shows a user name that does not match the lowest bid.
Below is a working example of the structure and the query.
SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE `bid` (
`id` int(11) NOT NULL,
`amount` float NOT NULL,
`user_id` int(11) NOT NULL,
`item_id` int(11) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;
INSERT INTO `bid` (`id`, `amount`, `user_id`, `item_id`) VALUES
(1, 9, 1, 1),
(2, 5, 2, 1),
(3, 4, 3, 1),
(4, 3, 4, 1),
(5, 4, 2, 2),
(6, 22, 5, 1);
-- --------------------------------------------------------
CREATE TABLE `item` (
`id` int(11) NOT NULL,
`name` varchar(100) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1;
INSERT INTO `item` (`id`, `name`) VALUES
(1, 'chair'),
(2, 'sofa'),
(3, 'table'),
(4, 'box');
-- --------------------------------------------------------
CREATE TABLE `user` (
`id` int(11) NOT NULL,
`name` varchar(100) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1;
INSERT INTO `user` (`id`, `name`) VALUES
(1, 'James'),
(2, 'Don'),
(3, 'Hipes'),
(4, 'Sam'),
(5, 'Zakam');
ALTER TABLE `bid`
ADD PRIMARY KEY (`id`);
ALTER TABLE `item`
ADD PRIMARY KEY (`id`);
ALTER TABLE `user`
ADD PRIMARY KEY (`id`);
ALTER TABLE `bid`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=7;
ALTER TABLE `item`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=5;
ALTER TABLE `user`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=5;
Query 1:
SELECT b.id, b.item_id, MIN(b.amount) as amount, b.user_id, p.name
FROM bid b
LEFT JOIN user p ON p.id = b.user_id
WHERE b.item_id in (1, 2)
GROUP BY b.item_id
ORDER BY b.amount, b.item_id
Results:
| id | item_id | amount | user_id | name |
|----|---------|--------|---------|-------|
| 5 | 2 | 4 | 2 | Don |
| 1 | 1 | 3 | 1 | James |
Explanation of query:
Get the selected items (1, 2).
Get the lowest bid for those items - MIN(b.amount)
Display the name of the user who placed the bid - LEFT JOIN user p on p.id = b.user_id (this is not working, or I'm doing something wrong)
[Note] I can't use a sub-query; I'm doing this in Doctrine 2 (PHP code), which limits MySQL sub-queries
No, you are not necessarily fetching the user_id who has given the bid. You group by item_id, so you get one result row per item. So you are aggregating and for every column you say what value you want to see for that item. E.g.:
MIN(b.amount) - the minimum amount of the item's records
MAX(b.amount) - the maximum amount of the item's records
AVG(b.amount) - the average amount of the item's records
b.amount - one of the amounts of the item's records, arbitrarily chosen (as there are many amounts and you don't specify which you want to see, the DBMS simply chooses one of them)
That said, b.user_id isn't necessarily the user who made the lowest bid, but just an arbitrary one of the users who made a bid on that item.
Instead, find the minimum bids and join again with your bid table to access the related records:
select bid.id, bid.item_id, bid.amount, user.id as user_id, user.name
from bid
join
(
select item_id, min(amount) as amount
from bid
group by item_id
) as min_bid on min_bid.item_id = bid.item_id and min_bid.amount = bid.amount
join user on user.id = bid.user_id
order by bid.amount, bid.item_id;
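As a quick check of the join-back approach on the sample data, here is a sketch that runs on SQLite through Python's sqlite3 module (chosen only so it can be executed without a MySQL server). It adds the item filter from the question, which the query above leaves out, and the aliases `b`, `u` and `min_bid` are just illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bid (
    id INTEGER PRIMARY KEY, amount REAL NOT NULL,
    user_id INTEGER NOT NULL, item_id INTEGER NOT NULL
);
CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO bid VALUES
    (1, 9, 1, 1), (2, 5, 2, 1), (3, 4, 3, 1),
    (4, 3, 4, 1), (5, 4, 2, 2), (6, 22, 5, 1);
INSERT INTO "user" VALUES
    (1, 'James'), (2, 'Don'), (3, 'Hipes'), (4, 'Sam'), (5, 'Zakam');
""")

# Step 1 (derived table): the minimum amount per selected item.
# Step 2: join back to bid on (item_id, amount) to recover the full row,
# so user_id is taken from the bid that actually holds the minimum.
LOWEST_BIDS_SQL = """
SELECT b.id, b.item_id, b.amount, u.id AS user_id, u.name
FROM bid AS b
JOIN (SELECT item_id, MIN(amount) AS amount
      FROM bid
      WHERE item_id IN (1, 2)
      GROUP BY item_id) AS min_bid
  ON min_bid.item_id = b.item_id AND min_bid.amount = b.amount
JOIN "user" AS u ON u.id = b.user_id
ORDER BY b.amount, b.item_id
"""

lowest_bids = conn.execute(LOWEST_BIDS_SQL).fetchall()
for row in lowest_bids:
    print(row)
```

On this data the lowest bid on item 1 belongs to Sam (user 4), not James as in the original result.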
You can solve this using a subquery. I am not 100% sure if this is the most efficient way, but at least it works.
SELECT b1.id, b1.item_id, b1.amount, b1.user_id, p.name
FROM bid b1
LEFT JOIN user p ON p.id = b1.user_id
WHERE b1.id = (
SELECT b2.id
FROM bid b2
WHERE b2.item_id IN (1, 2)
ORDER BY b2.amount LIMIT 1
)
This first selects the lowest bid for item 1 or 2 and then uses the id of that bid to find the information you need. Note that it returns a single row (the lowest bid across both items), not one row per item.
Edit
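You are saying that Doctrine does not support subqueries. I have not used Doctrine a lot, but something like the following may work. One caveat: setMaxResults() is applied to the Query object rather than to the DQL string, so the limit set on the subquery builder is not carried into the string returned by getDql():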
$subQueryBuilder = $entityManager->createQueryBuilder();
$subQuery = $subQueryBuilder
->select('b2.id')
->from('bid', 'b2')
->where('b2.item_id IN (:items)')
->orderBy('b2.amount')
->setMaxResults(1)
->getDql();
$queryBuilder = $entityManager->createQueryBuilder();
$query = $queryBuilder
->select('b1.id', 'b1.item_id', 'b1.amount', 'b1.user_id', 'p.name')
->from('bid', 'b1')
->leftJoin('user', 'p', 'with', 'p.id = b1.user_id')
->where('b1.id = (' . $subQuery . ')')
->setParameter('items', [1, 2])
->getQuery()->getSingleResult();
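Since the question rules out subqueries, one alternative worth sketching is a self anti-join: LEFT JOIN the bid table to itself looking for a strictly lower bid on the same item, and keep only the rows where no such bid exists. This is a different technique from both answers above, shown here on SQLite through Python's sqlite3 module so it can be run as is; the alias `lower` is an illustrative name, and whether the statement maps cleanly onto Doctrine's query builder has not been verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bid (
    id INTEGER PRIMARY KEY, amount REAL NOT NULL,
    user_id INTEGER NOT NULL, item_id INTEGER NOT NULL
);
CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
INSERT INTO bid VALUES
    (1, 9, 1, 1), (2, 5, 2, 1), (3, 4, 3, 1),
    (4, 3, 4, 1), (5, 4, 2, 2), (6, 22, 5, 1);
INSERT INTO "user" VALUES
    (1, 'James'), (2, 'Don'), (3, 'Hipes'), (4, 'Sam'), (5, 'Zakam');
""")

# A bid is the lowest for its item exactly when no other bid on the same
# item has a smaller amount, i.e. when the LEFT JOIN finds nothing.
ANTI_JOIN_SQL = """
SELECT b.id, b.item_id, b.amount, b.user_id, u.name
FROM bid AS b
LEFT JOIN bid AS lower
       ON lower.item_id = b.item_id AND lower.amount < b.amount
JOIN "user" AS u ON u.id = b.user_id
WHERE b.item_id IN (1, 2)
  AND lower.id IS NULL
ORDER BY b.amount, b.item_id
"""

lowest_without_subquery = conn.execute(ANTI_JOIN_SQL).fetchall()
for row in lowest_without_subquery:
    print(row)
```

If two users tie for the lowest amount on an item, both rows are returned, which matches the behaviour of the join-back query.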

Function to find first available option based on count of records and condition

I need to write an SQL statement to get the first 'free' poule (pool / collection of teams) for my team. Let me explain.
I have two tables, one table poules with 4 poules each having a TEAMQTY of 4 (the max. number of teams allowed in a poule):
ID TOURNID NAME TEAMQTY
1 1 Poule 1 4
2 1 Poule 2 4
3 1 Poule 3 4
4 1 Poule 4 4
and a table teams
ID TOURNID NAME POULEID
1 1 Team 1 1
2 1 Team 2 1
3 1 Team 3 1
4 1 Team 4 1
I want to write a function in MySQL which, based on the situation above, suggests a PouleID of 2, since poule 1 is completely filled up with teams. In other words, I should be able to insert 4 more teams into PouleID 2; after that, my function should return PouleID 3 as a suggestion.
I'm new to mysql (an sql noob) and I've tried:
SELECT id FROM POULES WHERE TOURNID = 1 AND
teamqty > (SELECT COUNT(ID) FROM TEAMS WHERE TOURNID = 1) LIMIT 1
Needless to say, my experimental SQL code does not work.
Do I need a while loop here or would an SQL statement do?
Here's my supporting code:
CREATE TABLE IF NOT EXISTS `poules` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`TOURNID` int(11) NOT NULL,
`NAME` varchar(20) NOT NULL,
`TEAMQTY` int(11) NOT NULL,
PRIMARY KEY (`ID`),
KEY `TOURNID` (`TOURNID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;
INSERT INTO `poules` (`ID`, `TOURNID`, `NAME`, `TEAMQTY`) VALUES
(1, 1, '1', 4),
(2, 1, '2', 4),
(3, 1, '3', 4),
(4, 1, '4', 4);
CREATE TABLE IF NOT EXISTS `teams` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`TOURNID` int(11) NOT NULL,
`NAME` varchar(50) NOT NULL,
`POULEID` int(11) DEFAULT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `NAME` (`NAME`),
KEY `TOURNID` (`TOURNID`))
ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=6 ;
INSERT INTO `teams` (`ID`, `TOURNID`, `NAME`, `POULEID`) VALUES
(1, 1, '1', 1),
(2, 1, '2', 1),
(3, 1, '3', 1),
(4, 1, '4', 1);
TIA Mike
You can do a left join with a subquery that counts the teams per poule and compare that count with the poule's TEAMQTY in the main table.
You can then add ORDER BY and LIMIT 1 to get a single suggestion.
select p.id as pouleid, ifnull(t.teamcount,0), p.tournid
from poules p
left join ( select count(pouleid) as teamcount, pouleid, tournid
from teams
group by pouleid, tournid
)t
on p.id = t.pouleid
and p.tournid = t.tournid
where ifnull(t.teamcount,0) < p.teamqty
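To show the whole thing as a single statement, here is a sketch that adds the tournament filter, ORDER BY and LIMIT 1 to the idea above. It runs on SQLite through Python's sqlite3 module so it can be tried without a MySQL server; `first_free_poule` is a hypothetical helper name, "first" is assumed to mean the lowest poule ID, and the four extra teams inserted at the end are made-up rows used only to fill poule 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE poules (
    ID INTEGER PRIMARY KEY, TOURNID INTEGER NOT NULL,
    NAME TEXT NOT NULL, TEAMQTY INTEGER NOT NULL
);
CREATE TABLE teams (
    ID INTEGER PRIMARY KEY, TOURNID INTEGER NOT NULL,
    NAME TEXT NOT NULL UNIQUE, POULEID INTEGER
);
INSERT INTO poules VALUES (1, 1, '1', 4), (2, 1, '2', 4), (3, 1, '3', 4), (4, 1, '4', 4);
INSERT INTO teams VALUES (1, 1, '1', 1), (2, 1, '2', 1), (3, 1, '3', 1), (4, 1, '4', 1);
""")

# Count teams per poule, left-join so empty poules count as 0,
# keep poules that still have room, and take the lowest ID.
FIRST_FREE_SQL = """
SELECT p.ID
FROM poules AS p
LEFT JOIN (SELECT POULEID, COUNT(*) AS teamcount
           FROM teams
           WHERE TOURNID = :tourn
           GROUP BY POULEID) AS t
       ON t.POULEID = p.ID
WHERE p.TOURNID = :tourn
  AND IFNULL(t.teamcount, 0) < p.TEAMQTY
ORDER BY p.ID
LIMIT 1
"""


def first_free_poule(conn, tourn_id):
    """Return the ID of the first poule with room, or None if all are full."""
    row = conn.execute(FIRST_FREE_SQL, {"tourn": tourn_id}).fetchone()
    return row[0] if row else None


before = first_free_poule(conn, 1)

# Hypothetical teams 5..8 fill poule 2 completely.
conn.executemany(
    "INSERT INTO teams (ID, TOURNID, NAME, POULEID) VALUES (?, 1, ?, 2)",
    [(i, str(i)) for i in range(5, 9)],
)
after = first_free_poule(conn, 1)
print(before, after)
```

No loop is needed: a single statement is enough, and it can be wrapped in a stored function if one is required.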

Mysql - Help me alter this search query to get desired results

Following is a dump of the tables and data needed to understand the system:
The system consists of tutors and classes.
The data in the table All_Tag_Relations stores tag relations for each tutor registered and each class created by a tutor. The tag relations are used for searching classes.
CREATE TABLE IF NOT EXISTS `Tags` (
`id_tag` int(10) unsigned NOT NULL auto_increment,
`tag` varchar(255) default NULL,
PRIMARY KEY (`id_tag`),
UNIQUE KEY `tag` (`tag`),
KEY `id_tag` (`id_tag`),
KEY `tag_2` (`tag`),
KEY `tag_3` (`tag`),
KEY `tag_4` (`tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `Tags` (`id_tag`, `tag`) VALUES
(1, 'Sandeepan'),
(2, 'Nath'),
(3, 'first'),
(4, 'class'),
(5, 'new'),
(6, 'Bob'),
(7, 'Cratchit');
CREATE TABLE IF NOT EXISTS `All_Tag_Relations` (
`id_tag` int(10) unsigned NOT NULL default '0',
`id_tutor` int(10) default NULL,
`id_wc` int(10) unsigned default NULL,
KEY `All_Tag_Relations_FKIndex1` (`id_tag`),
KEY `id_wc` (`id_wc`),
KEY `id_tag` (`id_tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `All_Tag_Relations` (`id_tag`, `id_tutor`, `id_wc`) VALUES
(1, 1, NULL),
(2, 1, NULL),
(3, 1, 1),
(4, 1, 1),
(6, 2, NULL),
(7, 2, NULL),
(5, 2, 2),
(4, 2, 2),
(8, 1, 3),
(9, 1, 3);
Following is my query:-
This query searches for "first class" (tag for first = 3 and for class = 4, in Tags table) and returns all those classes such that both the terms first and class are present in the class name.
SELECT wtagrels.id_wc,SUM(DISTINCT( wtagrels.id_tag =3)) AS
key_1_total_matches,
SUM(DISTINCT( wtagrels.id_tag =4)) AS
key_2_total_matches
FROM all_tag_relations AS wtagrels
WHERE ( wtagrels.id_tag =3
OR wtagrels.id_tag =4 )
GROUP BY wtagrels.id_wc
HAVING key_1_total_matches = 1
AND key_2_total_matches = 1
LIMIT 0, 20
And it returns the class with id_wc = 1.
But, I want the search to show all those classes such that all the search terms are present in the class name or its tutor name
So searching "Sandeepan class" (wtagrels.id_tag = 1,4) or "Sandeepan Nath" should also return the class with id_wc=1, while searching "Bob First" should not return any classes.
Please modify the above query or suggest a new one, if possible using MyISAM full-text search; either way, help me get this result.
I think this query would help you:
SET @tag1 = 1, @tag2 = 4; -- Setting some user variables to see where the ids go. (you can put the values in the query)
SELECT wtagrels.id_wc,
SUM(DISTINCT( wtagrels.id_tag =@tag1 OR wtagrels.id_tutor =@tag1)) AS key_1_total_matches,
SUM(DISTINCT( wtagrels.id_tag =@tag2 OR wtagrels.id_tutor =@tag2)) AS key_2_total_matches
FROM all_tag_relations AS wtagrels
WHERE ( wtagrels.id_tag =@tag1 OR wtagrels.id_tag =@tag2 )
GROUP BY wtagrels.id_wc
HAVING key_1_total_matches = 1 AND key_2_total_matches = 1
LIMIT 0, 20
It returns id_wc = 1.
For (6, 3) the query returns nothing.
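One thing to be aware of in the query above: it compares id_tutor with a tag id, and it finds class 1 for (1, 4) on the sample data because tutor 1 and the tag 'Sandeepan' happen to share the id 1. Below is a sketch of a different approach that avoids that comparison. It assumes, from the sample rows, that a tutor's own name tags are the rows with id_wc NULL, and it treats a class as matching when every search tag is attached either to the class itself or to its tutor. It runs on SQLite through Python's sqlite3 module only so it is easy to execute; `search_classes` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE All_Tag_Relations (
    id_tag INTEGER NOT NULL, id_tutor INTEGER, id_wc INTEGER
);
INSERT INTO All_Tag_Relations VALUES
    (1, 1, NULL), (2, 1, NULL), (3, 1, 1), (4, 1, 1),
    (6, 2, NULL), (7, 2, NULL), (5, 2, 2), (4, 2, 2),
    (8, 1, 3), (9, 1, 3);
""")


def search_classes(conn, tag_ids):
    """Return ids of classes whose own tags plus tutor tags cover all tag_ids."""
    wanted = sorted(set(tag_ids))
    placeholders = ",".join("?" for _ in wanted)
    sql = f"""
    SELECT c.id_wc
    FROM (SELECT DISTINCT id_wc, id_tutor
          FROM All_Tag_Relations
          WHERE id_wc IS NOT NULL) AS c            -- one row per class
    JOIN All_Tag_Relations AS t
      ON t.id_tag IN ({placeholders})
     AND (t.id_wc = c.id_wc                        -- tag on the class itself
          OR (t.id_wc IS NULL AND t.id_tutor = c.id_tutor))  -- tag on its tutor
    GROUP BY c.id_wc
    HAVING COUNT(DISTINCT t.id_tag) = ?            -- every search tag was found
    ORDER BY c.id_wc
    """
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]


sandeepan_class = search_classes(conn, [1, 4])   # "Sandeepan class"
sandeepan_nath = search_classes(conn, [1, 2])    # "Sandeepan Nath"
bob_first = search_classes(conn, [6, 3])         # "Bob First"
print(sandeepan_class, sandeepan_nath, bob_first)
```

Under this rule "Sandeepan Nath" matches every class taught by that tutor, so class 3 comes back along with class 1.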