Find the frequency of rows from multiple joined tables - mysql

I have this problem with SQL and I can't figure it out.
Imagine that I have 3 tables as follows
Names
Nameid name
1 Starbucks Coffee
2 Johns Restaurant
3 Davids Restaurant
user_likes
userid Nameid
1 1
2 1
2 3
user_visited
userid Nameid
1 2
I want to find the places with the highest combined number of likes and visits (likes + visited). I also want to list all places, not just those that have been liked or visited.
I do:
SELECT n.nameid, n.name , COUNT(f.nameid) AS freq
FROM names AS n
LEFT JOIN user_likes ON n.nameid=user_likes.nameid
LEFT JOIN user_visited ON n.nameid=user_visited.nameid
ORDER BY freq DESC
But it doesn't give me the total frequency. The problem is, if a place is both visited and liked, it is counted only once, while I want it to be counted twice.
Any suggestions?

I ran a quick test and, although I prefer Serge's solution, this one seemed to perform faster, because each table is aggregated before the join and fewer rows have to be joined:
SELECT n.nameId, n.name, coalesce(sum(likesCount), 0) totalCount FROM NAMES n
LEFT JOIN (
SELECT nameId, count(*) likesCount FROM user_likes
GROUP BY nameId
UNION ALL
SELECT nameId, count(*) visitsCount FROM user_visited
GROUP BY nameId
) s ON n.nameId = s.nameId
GROUP BY n.nameId
ORDER BY totalCount DESC
I'm assuming the following indexes:
alter table names add index(nameid);
alter table user_likes add index(nameid);
alter table user_visited add index(nameid);
Probably the OP can compare the efficiency of both queries with actual data and provide feedback.
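For anyone who wants to try the pre-aggregated UNION ALL query before running it on real data, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (this particular query uses the same syntax in both), and the 'Empty Cafe' row is a hypothetical addition to show that places without likes or visits still appear with a count of 0.

```python
import sqlite3

def total_counts(conn):
    """Return (nameid, name, totalCount) rows, busiest places first."""
    return conn.execute("""
        SELECT n.nameid, n.name, COALESCE(SUM(s.cnt), 0) AS totalCount
        FROM names n
        LEFT JOIN (
            SELECT nameid, COUNT(*) AS cnt FROM user_likes GROUP BY nameid
            UNION ALL
            SELECT nameid, COUNT(*) AS cnt FROM user_visited GROUP BY nameid
        ) s ON s.nameid = n.nameid
        GROUP BY n.nameid, n.name
        ORDER BY totalCount DESC, n.nameid
    """).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (nameid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_likes (userid INTEGER, nameid INTEGER);
    CREATE TABLE user_visited (userid INTEGER, nameid INTEGER);
    INSERT INTO names VALUES (1, 'Starbucks Coffee'), (2, 'Johns Restaurant'),
                             (3, 'Davids Restaurant'), (4, 'Empty Cafe');
    INSERT INTO user_likes VALUES (1, 1), (2, 1), (2, 3);
    INSERT INTO user_visited VALUES (1, 2);
""")
rows = total_counts(conn)
for row in rows:
    print(row)
```

With the sample data from the question, Starbucks Coffee gets 2, the two restaurants get 1 each, and the extra place gets 0.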

SELECT n.name, n.nameid, COUNT(t.nameid) AS freq
FROM Names n
LEFT JOIN (
SELECT nameid FROM user_likes
UNION ALL
SELECT nameid FROM user_visited
) t
ON n.nameid = t.nameid
GROUP BY n.nameid, n.name ORDER BY freq DESC

Mosty, your usage of coalesce() gave me an idea and I came up with this:
SELECT n.nameid, n.name ,
SUM((IFNULL(user_likes.userid,0)>0)+(IFNULL(user_visited.userid,0)>0) ) AS freq
FROM names AS n LEFT JOIN user_likes ON n.nameid=user_likes.nameid LEFT JOIN
user_visited ON n.nameid=user_visited.nameid ORDER BY freq DESC
Since my example here was a simplification of my problem (I have to join more than two tables to the main table) I'm reluctant to use SELECT inside SELECT, because I know it's not very efficient. Do you see any fundamental problem with my solution?
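One way to check the double LEFT JOIN idea is to run it on a place that has several likes and several visits. The sketch below assumes Python's sqlite3 as a stand-in for MySQL, adds the GROUP BY the query needs, and uses hypothetical rows (two likes and two visits for one place). Because every like row is paired with every visit row, the joined version counts 2 x 2 rows with 2 points each, while stacking the tables with UNION ALL counts each row once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (nameid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_likes (userid INTEGER, nameid INTEGER);
    CREATE TABLE user_visited (userid INTEGER, nameid INTEGER);
    INSERT INTO names VALUES (1, 'Starbucks Coffee');
    INSERT INTO user_likes VALUES (1, 1), (2, 1);      -- 2 likes
    INSERT INTO user_visited VALUES (3, 1), (4, 1);    -- 2 visits
""")

# Two LEFT JOINs against the same parent multiply the child rows.
joined = conn.execute("""
    SELECT SUM((IFNULL(l.userid, 0) > 0) + (IFNULL(v.userid, 0) > 0))
    FROM names n
    LEFT JOIN user_likes l ON n.nameid = l.nameid
    LEFT JOIN user_visited v ON n.nameid = v.nameid
    GROUP BY n.nameid
""").fetchone()[0]

# Stacking the child tables first keeps one row per like or visit.
stacked = conn.execute("""
    SELECT COUNT(t.nameid)
    FROM names n
    LEFT JOIN (
        SELECT nameid FROM user_likes
        UNION ALL
        SELECT nameid FROM user_visited
    ) t ON n.nameid = t.nameid
    GROUP BY n.nameid
""").fetchone()[0]

print(joined, stacked)  # prints "8 4"
```

The joined query happens to give the right answer on the sample data in the question only because no place there has more than one row in both tables at once.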

Related

Select ID of a row with max value

How can I select the ID of a row with the max value of another column in a query that joins multiple tables?
For example, say I have three tables. tblAccount stores a grouping of users, like a family. tblUsers stores the users, each tied to a record from tblAccount. Each user can be part of a plan, stored in tblPlans. Each plan has a Rank column that determines its sorting when comparing the levels of plans. For example, Lite is lower than Premium. So the idea is that each user can have a separate plan, like Premium, Basic, Lite etc., but the parent account does not have a plan.
How can I determine the highest plan in the account with a single query?
tblAccount
PKID Name
1 Adams Family
2 Cool Family
tblUsers
PKID Name AccountID PlanID
1 Bob 1 3
2 Phil 2 2
3 Suzie 2 1
tblPlans
PKID Name Rank
1 Premium 3
2 Basic 2
3 Elite 4
4 Lite 1
Here's the result I'm hoping to produce:
AccountID Name HighestPlanID PlanName
2 Cool Family 1 Premium
I've tried:
SELECT U.AccountID, A.Name, MAX(P.Rank) AS Rank, P.PKID as HighestPlanID, P.Name as PlanName
FROM tblPlans P
INNER JOIN tblUsers U ON U.PlanID = P.PKID
INNER JOIN tblAccounts A ON U.AccountID = A.PKID
WHERE U.AccountID = 2
but the query does not always work: selecting MAX(P.Rank) does not make the other selected columns come from the tblPlans row that holds that maximum.
I am looking for a solution that is compatible with mysql-5.6.10
You can join the tables and use ROW_NUMBER() to identify the row you want; filtering is then easy. Note that ROW_NUMBER() is a window function and requires MySQL 8.0 or later.
For example:
select *
from (
select a.pkid as AccountID, a.name as Name, p.pkid as HighestPlanID, p.name as PlanName,
row_number() over(partition by a.pkid order by p.rank desc) as rn
from tblaccount a
join tblusers u on u.accountid = a.pkid
join tblplans p on p.pkid = u.planid
) x
where rn = 1
Inside the subquery you can add where u.accountid = 2 to retrieve a single account of interest, instead of all of them.
With the help of #the-impaler, I massaged their answer a bit and came out with something very similar:
select a.pkid as AccountID, a.name as Name, p.pkid as HighestPlanID, p.name as PlanName
from tblaccount a
join tblusers u on u.accountid = a.pkid
join tblplans p on p.pkid = u.planid
where u.accountid = 2
order by p.rank desc
limit 1
The query sorts the account's users by plan rank from highest to lowest, and limit 1 keeps only the topmost row. It seems to work!
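Since the question asks for MySQL 5.6 compatibility, here is a sketch of one way to get the highest plan for every account at once without window functions: a correlated subquery that finds each account's maximum rank. It assumes Python's sqlite3 as a stand-in for MySQL and loads the sample tables from the question; the SQL itself uses only joins and MAX().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblAccount (PKID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE tblPlans (PKID INTEGER PRIMARY KEY, Name TEXT, Rank INTEGER);
    CREATE TABLE tblUsers (PKID INTEGER PRIMARY KEY, Name TEXT,
                           AccountID INTEGER, PlanID INTEGER);
    INSERT INTO tblAccount VALUES (1, 'Adams Family'), (2, 'Cool Family');
    INSERT INTO tblPlans VALUES (1, 'Premium', 3), (2, 'Basic', 2),
                                (3, 'Elite', 4), (4, 'Lite', 1);
    INSERT INTO tblUsers VALUES (1, 'Bob', 1, 3), (2, 'Phil', 2, 2),
                                (3, 'Suzie', 2, 1);
""")

# Keep a user's plan only if its rank equals the best rank in that account.
# DISTINCT collapses accounts where several users share the top plan.
top_plans = conn.execute("""
    SELECT DISTINCT a.PKID AS AccountID, a.Name AS Name,
                    p.PKID AS HighestPlanID, p.Name AS PlanName
    FROM tblAccount a
    JOIN tblUsers u ON u.AccountID = a.PKID
    JOIN tblPlans p ON p.PKID = u.PlanID
    WHERE p.Rank = (SELECT MAX(p2.Rank)
                    FROM tblUsers u2
                    JOIN tblPlans p2 ON p2.PKID = u2.PlanID
                    WHERE u2.AccountID = a.PKID)
    ORDER BY a.PKID
""").fetchall()

for row in top_plans:
    print(row)
```

Adding a condition on a.PKID to the outer WHERE clause narrows the result to a single account.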

Remove duplicates based on rank after join in SQL request

I am using MySQL 5.6.
I have a SQL table with a list of users:
id name
1 Alice
2 Bob
3 John
and a SQL table with the list of gifts for each user (numbered in order of preference):
id gift rank
1 balloon 2
1 shoes 1
1 seeds 3
1 video-game 1
2 computer 2
3 shoes 2
3 hat 1
And I would like a list of the preferred gift for each user (the highest rank - if two gifts have the same rank, pick only one randomly) (bonus: if the list could be randomized, that would be perfect!):
id name gift rank
2 Bob computer 2
1 Alice shoes 1
3 John hat 1
I tried to use the clause GROUP BY but without any success.
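Treating rank as part of your data, you can do this without window functions or complex subqueries. The self-join keeps a gift only when the same user has no gift with a lower rank number (a tie still returns both rows):
SELECT u.id, u.name, g.gift
FROM users u
JOIN gifts g ON g.id = u.id
LEFT JOIN gifts g2 ON g2.id = g.id AND g2.rank < g.rank
WHERE g2.id IS NULL;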
Added link http://sqlfiddle.com/#!9/62f59e/15/0
You can use ROW_NUMBER() to get one row for each user (MySQL 8.0+).
SELECT A.ID,NAME,GIFT,`RANK` FROM USERS A
LEFT JOIN (
SELECT ID,GIFT,`RANK` FROM
(SELECT *,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY `RANK` ASC) AS RN FROM GIFTS) X
WHERE RN =1
) B
ON A.ID= B.ID
I don't know which database you use, and I'm not an SQL expert, so the following may contain mistakes. Still, I don't think this is difficult if you build the query up step by step.
First, find each user's best rank (the lowest number, since the gifts are numbered in order of preference):
SELECT ID, MIN(RANK) AS BEST_RANK
FROM GIFT
GROUP BY ID
Then fetch the gifts that have this rank:
SELECT GIFT.*
FROM GIFT
INNER JOIN(
SELECT ID, MIN(RANK) AS BEST_RANK
FROM GIFT
GROUP BY ID
) filter ON GIFT.ID = filter.ID AND GIFT.RANK = filter.BEST_RANK
I think this is the table you want. Finally, join it to the users:
SELECT *
FROM USER
LEFT OUTER JOIN(
SELECT GIFT.* FROM GIFT
INNER JOIN (SELECT ID, MIN(RANK) AS BEST_RANK FROM GIFT GROUP BY ID) filter
ON GIFT.ID = filter.ID AND GIFT.RANK = filter.BEST_RANK
) GIFT ON USER.ID = GIFT.ID
As I said, I'm not an SQL expert, so there may be a better way.
Check out this query. It first stores the joined rows in a helper table, then keeps each user's lowest rank:
CREATE TABLE tblrslt AS
SELECT tbluser.id, name, gift, rank
FROM tbluser
LEFT JOIN tblgifts
ON tbluser.id = tblgifts.id;
SELECT tt.*
FROM tblrslt tt
INNER JOIN
(SELECT id, MIN(rank) AS rank
FROM tblrslt
GROUP BY id) groupedtt
ON tt.id = groupedtt.id
AND tt.rank = groupedtt.rank
ORDER BY tt.id;
In MySQL versions older than 8 you have no ranking functions available. You'll select the minimum rank per user instead and use these ranks to select the gift rows. This means you access the gifts table twice.
I suggest this:
select *
from users u
join gifts g
on g.id = u.id
and (g.id, g.rank) in (select id, min(rank) from gifts group by id)
order by u.id;
If you also want to show users without gifts, simply change the inner join to a left outer join.
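The minimum-rank approaches above still return two rows for a user whose best gifts are tied (Alice has both shoes and video-game at rank 1). A minimal sketch of one way to keep a single row per user without window functions is to group once more and take MIN(gift) as a tie-breaker. That choice is alphabetical rather than random, so it is an assumption that any single pick is acceptable. The example runs on Python's sqlite3 as a stand-in for MySQL 5.6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE gifts (id INTEGER, gift TEXT, rank INTEGER);
    INSERT INTO users VALUES (1, 'Alice'), (2, 'Bob'), (3, 'John');
    INSERT INTO gifts VALUES (1, 'balloon', 2), (1, 'shoes', 1), (1, 'seeds', 3),
                             (1, 'video-game', 1), (2, 'computer', 2),
                             (3, 'shoes', 2), (3, 'hat', 1);
""")

# b holds each user's best (lowest) rank; MIN(g.gift) collapses ties to one row.
preferred = conn.execute("""
    SELECT u.id, u.name, MIN(g.gift) AS gift, g.rank
    FROM users u
    JOIN gifts g ON g.id = u.id
    JOIN (SELECT id, MIN(rank) AS best_rank FROM gifts GROUP BY id) b
      ON b.id = g.id AND b.best_rank = g.rank
    GROUP BY u.id, u.name, g.rank
    ORDER BY u.id
""").fetchall()

for row in preferred:
    print(row)
```

In MySQL, replacing the final ORDER BY u.id with ORDER BY RAND() would shuffle the list, which covers the bonus request.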

Optimisation of subqueries

I have a relation between users and groups. Users can be in a group or not.
EDIT : Added some stuff to the model to make it more convenient.
Let's say I have a rule that adds users to a group when they live in a specific town and match a custom metadata value (for example, age 18).
Currently, I run this to find which users I have to add to the group of people living in Paris who are 18:
SELECT user.id AS 'id'
FROM user
LEFT JOIN
(
SELECT user_id
FROM user_has_role_group
WHERE role_group_id = 1 -- Group for Paris
)
AS T1
ON user.id = T1.user_id
WHERE
(
user.town = 'Paris' AND JSON_EXTRACT(user.custom_metadata, '$.age') = 18
)
AND T1.user_id IS NULL
It works and gives me the IDs of the users to insert into the group.
But when I have 50 groups to process, for example 50 towns or various ages, I have to run 50 queries, which is very slow and inefficient for my database.
How could I generate a result for each group?
Something like :
role_group_id user_to_add
1 1
1 2
2 1
2 3
The only way I know to do that for now is a UNION of several subqueries like the one above, but of course it's very slow.
Note that the custom_metadata field is a user defined field. I can't create specific columns or tables.
Thanks a lot for your help.
If I understood you correctly:
select user.id, grp.id
from user, role_group grp
where (user.id, grp.id) not in (select user_id, role_group_id from user_has_role_group) and user.town in ('Paris', 'Warsaw')
This returns every user from one of the listed towns, paired with each group they do not yet belong to.
To add the missing entries to user_has_role_group, you might want to have some mapping between those town names and their group_id's.
The example below is just using a subquery with unions for that.
But you could replace that with a select from a table.
Maybe even from role_group, if those names correlate with the user town names.
insert into user_has_role_group (user_id, role_group_id)
select u.id, g.group_id
from user u
join (
select 'Paris' as name, 1 as group_id union all
select 'Rome', 2
-- add more towns here
) g on (u.town = g.name)
left join user_has_role_group ug
on (ug.user_id = u.id and ug.role_group_id = g.group_id)
where u.town in ('Paris','Rome') -- add more towns here
and json_extract(u.custom_metadata, '$.age') = 18
and ug.user_id is null;
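To illustrate the mapping idea end to end, here is a minimal sketch that produces the (role_group_id, user_to_add) pairs for all groups in one query. It assumes Python's sqlite3 as a stand-in for MySQL. The group_rule table, its columns, and the age_of() helper are hypothetical: age_of() stands in for JSON_EXTRACT(custom_metadata, '$.age'), and group_rule plays the role of the town-to-group mapping.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for JSON_EXTRACT(custom_metadata, '$.age').
conn.create_function("age_of", 1, lambda doc: json.loads(doc).get("age"))
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, town TEXT, custom_metadata TEXT);
    CREATE TABLE user_has_role_group (user_id INTEGER, role_group_id INTEGER);
    -- Hypothetical mapping: one row per group with the rule that defines it.
    CREATE TABLE group_rule (role_group_id INTEGER, town TEXT, age INTEGER);
    INSERT INTO user VALUES (1, 'Paris', '{"age": 18}'), (2, 'Paris', '{"age": 18}'),
                            (3, 'Rome', '{"age": 18}'), (4, 'Paris', '{"age": 30}');
    INSERT INTO group_rule VALUES (1, 'Paris', 18), (2, 'Rome', 18);
    INSERT INTO user_has_role_group VALUES (2, 1);  -- user 2 is already in group 1
""")

# Join every rule to the users it matches, then drop pairs that already exist.
to_add = conn.execute("""
    SELECT r.role_group_id, u.id AS user_to_add
    FROM group_rule r
    JOIN user u ON u.town = r.town AND age_of(u.custom_metadata) = r.age
    LEFT JOIN user_has_role_group m
           ON m.user_id = u.id AND m.role_group_id = r.role_group_id
    WHERE m.user_id IS NULL
    ORDER BY r.role_group_id, u.id
""").fetchall()

print(to_add)  # [(1, 1), (2, 3)]
```

User 2 is skipped because the membership already exists, and user 4 is skipped because the age does not match the rule.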

count most occurrences with the WHERE from other table

I have a movie DB to organize my collection, but I'm also using it to learn more MySQL. Along the way I keep hitting bumps (plenty of them), and right now my problem is this:
Table ACTORS:
id_actor
name
sex
Table MOVIEACTORES:
id_movieactores
id_movie
id_actor
I want to count the TOP 5 (top10, top20 or whatever!) of actors with most movies and then the Top5 of actresses with most movies!
I have this:
SELECT filmesactores.id_actor,
COUNT( * ) AS contagem
FROM filmesactores
GROUP BY id_actor
ORDER BY contagem DESC
LIMIT 10
But this code doesn't distinguish actors from actresses. I feel the solution might be simple, but it is beyond my knowledge right now. Anyone?
Grouping by sex, name would separate actors' counts by gender, but since you want to apply the limit to each gender group (i.e. top 5 actors and top 5 actresses), perform two queries and UNION their results together. In MySQL, each SELECT must be wrapped in parentheses so that its ORDER BY and LIMIT apply to that SELECT alone:
(SELECT name, COUNT(*) AS moviecount
FROM actors
JOIN movieactores ON actors.id_actor = movieactores.id_actor
WHERE sex = 'Male'
GROUP BY actors.id_actor, name
ORDER BY moviecount DESC
LIMIT 5)
UNION ALL
(SELECT name, COUNT(*) AS moviecount
FROM actors
JOIN movieactores ON actors.id_actor = movieactores.id_actor
WHERE sex = 'Female'
GROUP BY actors.id_actor, name
ORDER BY moviecount DESC
LIMIT 5)
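A small runnable sketch of the per-gender top-N idea, assuming Python's sqlite3 as a stand-in for MySQL. SQLite does not accept parenthesized SELECTs around a UNION, so this variant wraps each limited query in a derived table instead, a form that MySQL also accepts. The actor names and movie rows are hypothetical sample data, and the limit is passed as a parameter.

```python
import sqlite3

def top_per_sex(conn, n):
    """Return the n busiest actors and the n busiest actresses."""
    one_sex = """
        SELECT sex, name, moviecount FROM (
            SELECT a.sex, a.name, COUNT(*) AS moviecount
            FROM actors a
            JOIN movieactores ma ON ma.id_actor = a.id_actor
            WHERE a.sex = ?
            GROUP BY a.id_actor, a.sex, a.name
            ORDER BY moviecount DESC, a.name
            LIMIT ?
        ) AS ranked
    """
    # The trailing ORDER BY sorts the combined result, not the inner top-N picks.
    query = one_sex + " UNION ALL " + one_sex + " ORDER BY sex, moviecount DESC"
    return conn.execute(query, ("Male", n, "Female", n)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actors (id_actor INTEGER PRIMARY KEY, name TEXT, sex TEXT);
    CREATE TABLE movieactores (id_movieactores INTEGER PRIMARY KEY,
                               id_movie INTEGER, id_actor INTEGER);
    INSERT INTO actors VALUES (1, 'Al', 'Male'), (2, 'Bo', 'Male'), (3, 'Cy', 'Male'),
                              (4, 'Di', 'Female'), (5, 'Eve', 'Female');
    INSERT INTO movieactores (id_movie, id_actor) VALUES
        (1, 1), (2, 1), (3, 1), (1, 2), (2, 2), (3, 3), (1, 4), (2, 4), (3, 5);
""")
top2 = top_per_sex(conn, 2)
for row in top2:
    print(row)
```

Grouping by id_actor as well as name keeps two different people who share a name from being merged into one count.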

Organizing data output with MySQL

I have the following table with data:
t1 (results): card_id group_id project_id user_id
The tables that contain actual labels are:
t2 (groups): id project_id label
t3 (cards): id project_id label
There could be multiple entries by different users.
I need help writing a query that displays the results in a table format, with total counts for each card/group combination. Here's my start, but I'm not sure that I'm on the right track...
SELECT COUNT(card_id) AS cTotal, COUNT(group_id) AS gTotal
WHERE project_id = $projID
Unless I'm mistaken, all you need to do is group by card_id and group_id for the given project_id and pull out the count for each combination:
SELECT card_id, group_id, COUNT(user_id) FROM mytable
WHERE project_id = 001
GROUP BY card_id, group_id;
EDIT:
Taking the card and group tables into account involves some joins, but the query is fundamentally the same: it still groups by card and group and constrains by project id.
SELECT c.label, g.label, COUNT(t1.user_id) FROM mytable t1
JOIN `groups` g ON t1.group_id=g.id
JOIN cards c ON t1.card_id=c.id
WHERE t1.project_id = 001
GROUP BY c.id, c.label, g.id, g.label
ORDER BY c.id, g.id;
I don't think you can get a table as you want with just SQL. You'll have to render the table in code by iterating over the results. How you do that depends on what language/platform you are using.
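As a rough illustration of rendering the table in code, here is a minimal Python sketch that pivots the grouped rows into a card-by-group grid. The pivot() helper and the sample rows are hypothetical; in practice the rows would come from the grouped query, one (card label, group label, count) tuple per combination.

```python
from collections import defaultdict

def pivot(rows):
    """Turn (card_label, group_label, count) rows into a card-by-group grid."""
    groups = sorted({group for _, group, _ in rows})
    grid = defaultdict(dict)
    for card, group, count in rows:
        grid[card][group] = count
    header = ["card"] + groups
    # Combinations that never occurred in the results are shown as 0.
    body = [[card] + [grid[card].get(g, 0) for g in groups] for card in sorted(grid)]
    return [header] + body

# Hypothetical output of the grouped query.
rows = [("Card A", "Group 1", 3), ("Card A", "Group 2", 1), ("Card B", "Group 2", 4)]
table = pivot(rows)
for line in table:
    print(line)
```

This keeps the SQL independent of how many groups exist, since the columns are derived from the data.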
If you know for a fact that there are nine groups, you can include one subquery per group, similar to this:
select g1.cTotal, g1.gTotal as Group1, g2.gTotal as Group2... etc
from
( SELECT COUNT(card_id) AS cTotal
, COUNT(group_id) AS gTotal
FROM results
WHERE project_id = $projID
AND group_id = 1 ) g1
, ( SELECT COUNT(card_id) AS cTotal
, COUNT(group_id) AS gTotal
FROM results
WHERE project_id = $projID
AND group_id = 2 ) g2
etc.