SQL distinct elements count in a group by same element - mysql

I'm having trouble counting distinct elements within a group defined by that same element.
Let me explain. I have two tables:
tb1
team amount
1 90
2 80
3 70
4 50
5 60
tb2
team player
5 1
1 1
3 2
1 2
2 2
1 3
3 3
4 3
5 3
2 4
The expected result is:
player nb_team Sum_amount nb_player
1 2 150 3
2 3 240 4
3 4 270 3
4 1 80 2
I'm doing this:
SELECT tb2.player, COUNT(DISTINCT tb1.team) as nb_team,
SUM(tb1.amount) AS sum,
(SELECT COUNT(DISTINCT tb2.player)
FROM tb2 where tb1.team=tb2.team) AS nb_player
FROM tb1, tb2
WHERE tb1.team=tb2.team
GROUP BY tb2.player
ORDER BY tb2.player ASC;
The first 3 columns are correct, but I can't get the right value for nb_player.
I need to count how many distinct players belong to the teams each player plays in.
For example, for the first result row:
player 1 plays in 2 teams, which involve 3 players in total (players #1, #2 and #3).
Any idea?

Counting teams and summing those teams' amounts for a player needs a different result set than counting the players who share a team with that player. So I suggest using two separate subqueries and then joining them on the player.
SELECT teams_total.player, teams_total.nb_team, teams_total.`sum`,
players_total.nb_players
FROM
( SELECT tb2.player, COUNT(DISTINCT tb1.team) as nb_team,
SUM(tb1.amount) AS `sum`
FROM tb1 JOIN tb2 ON tb1.team=tb2.team
GROUP BY tb2.player ) teams_total
JOIN
( SELECT tb2_1.player, COUNT(DISTINCT tb2_2.player) as nb_players
FROM tb2 tb2_1
JOIN tb1 ON tb2_1.team=tb1.team
JOIN tb2 tb2_2 ON tb2_2.team=tb1.team
GROUP BY tb2_1.player ) players_total
ON teams_total.player=players_total.player
ORDER BY teams_total.player ASC;
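To check the two-subquery idea against the sample data from the question, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL, and the aliases (t, p, a, b, sum_amount) are illustrative names, not part of the original answer. The second subquery joins tb2 to itself on team to find everyone who shares a team with each player.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb1 (team INTEGER, amount INTEGER);
CREATE TABLE tb2 (team INTEGER, player INTEGER);
INSERT INTO tb1 VALUES (1, 90), (2, 80), (3, 70), (4, 50), (5, 60);
INSERT INTO tb2 VALUES (5, 1), (1, 1), (3, 2), (1, 2), (2, 2),
                       (1, 3), (3, 3), (4, 3), (5, 3), (2, 4);
""")

query = """
SELECT t.player, t.nb_team, t.sum_amount, p.nb_player
FROM (
    -- teams per player and the total amount of those teams
    SELECT tb2.player,
           COUNT(DISTINCT tb1.team) AS nb_team,
           SUM(tb1.amount) AS sum_amount
    FROM tb1 JOIN tb2 ON tb1.team = tb2.team
    GROUP BY tb2.player
) AS t
JOIN (
    -- distinct players sharing at least one team with each player
    SELECT a.player, COUNT(DISTINCT b.player) AS nb_player
    FROM tb2 AS a JOIN tb2 AS b ON a.team = b.team
    GROUP BY a.player
) AS p ON t.player = p.player
ORDER BY t.player
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On this data the rows match the expected result table in the question.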

You can use the following query:
SELECT player,
COUNT(DISTINCT t2.team) AS nb_team,
SUM(amount) AS Sum_amount,
(SELECT COUNT(DISTINCT(player))
FROM tb2 AS t
WHERE INSTR(CONCAT(',',GROUP_CONCAT(DISTINCT t2.team), ','),
CONCAT(',',t.team,',')) <> 0) AS nb_player
FROM tb2 AS t2
INNER JOIN tb1 AS t1 ON t2.team = t1.team
GROUP BY player
GROUP_CONCAT is used in the correlated sub-query in order to get a comma separated list of all teams related to the player of the outer query. Using INSTR on this list, we can filter tb2 rows and count the DISTINCT number of players of these teams.
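The reason both the list and the searched value are wrapped in commas can be shown outside SQL. The sketch below mimics the INSTR/CONCAT test in plain Python; the function name in_delimited_list is hypothetical and only illustrates the string logic, not how MySQL evaluates the query.

```python
def in_delimited_list(csv_list: str, item: str) -> bool:
    """Mimic INSTR(CONCAT(',', list, ','), CONCAT(',', item, ',')) <> 0."""
    return f",{item}," in f",{csv_list},"


teams = "15,2"  # e.g. a GROUP_CONCAT result for some player

# A naive substring test wrongly finds team 5 inside team 15.
naive_match = "5" in teams

# The comma-wrapped test only matches whole list entries.
wrapped_match_5 = in_delimited_list(teams, "5")
wrapped_match_15 = in_delimited_list(teams, "15")

print(naive_match, wrapped_match_5, wrapped_match_15)
```

Without the surrounding commas, a team id that is a substring of another id would be counted by mistake.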

Related

SQL: How to join two tables and extract the data by timestamp?

I'm using MySQL. I have two tables: one holds movie types, and the other holds movie ratings with timestamps. I want to join these two tables on movie id to compute the average rating for each type of movie. I'm trying to extract only the movie types that have at least 10 ratings per film, with ratings made in December, ordered from the highest to the lowest average rating.
Table 'types'
movieId type
1 Drama
2 Adventure
3 Comedy
... ...
Table 'ratings'
movieId rating timestamp
1 1 851786086
2 1.5 1114306148
1 2 1228946388
3 2 850723898
1 2.5 1167422234
2 2.5 1291654669
1 3 851345204
2 3 944978286
3 3 965088579
3 3 1012598088
1 3.5 1291598726
1 4 1291779829
1 4 850021197
2 4 945362514
1 4.5 1072836909
1 5 881166397
1 5 944892273
2 5 1012598088
... ... ...
Expected result: (Nb ratings >= 10 and rating given in December)
type Avg_Rating
Drama 3.45
I wrote the query below, but I can't get it to execute (the original table has around 10 thousand rows).
What should I change in my query?
SELECT DISTINCT T.type, AVG(R.rating) FROM types AS T
INNER JOIN ratings AS R ON T.movieId = R.movieId
WHERE R.timestamp LIKE (
SELECT FROM_UNIXTIME(R.timestamp,'%M') AS Month FROM ratings
GROUP BY Month
HAVING Month = 'December')
GROUP BY T.type
HAVING COUNT(R.rating) >=10
ORDER BY AVG(R.rating) DESC;
I see two problems:
timestamp LIKE: what is that supposed to do?
and
the inner query uses GROUP BY without any aggregation. Perhaps you meant WHERE? In any case you don't need it at all: do the same check for December directly on the timestamp, without LIKE and without a subquery.
SELECT DISTINCT T.type, AVG(R.rating) FROM
types AS T INNER JOIN ratings AS R 
ON T.movieId = R.movieId
WHERE FROM_UNIXTIME(R.timestamp,'%M') = 'December'
GROUP BY T.type
HAVING COUNT(R.rating) >=10
ORDER BY AVG(R.rating) DESC;
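A small runnable sketch of the same filter follows. It assumes SQLite in place of MySQL, so strftime('%m', ts, 'unixepoch') replaces FROM_UNIXTIME; the sample rows, the ts helper and the MIN_RATINGS threshold of 2 are made up for illustration (the question uses 10). Because the December test sits in WHERE, only December ratings are counted and averaged.

```python
import sqlite3
from datetime import datetime, timezone


def ts(year, month, day):
    """Unix timestamp (UTC) for a date; helper for the made-up sample rows."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE types (movieId INTEGER, type TEXT);
CREATE TABLE ratings (movieId INTEGER, rating REAL, timestamp INTEGER);
INSERT INTO types VALUES (1, 'Drama'), (2, 'Adventure');
""")
conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", [
    (1, 4.0, ts(2010, 12, 5)),
    (1, 3.0, ts(2011, 12, 20)),
    (1, 1.0, ts(2011, 6, 1)),   # not December, dropped by WHERE
    (2, 5.0, ts(2010, 12, 7)),  # the only December rating for Adventure
    (2, 2.0, ts(2010, 3, 7)),
])

MIN_RATINGS = 2  # hypothetical threshold for this tiny data set

query = """
SELECT T.type, AVG(R.rating) AS avg_rating
FROM types AS T
JOIN ratings AS R ON T.movieId = R.movieId
WHERE strftime('%m', R.timestamp, 'unixepoch') = '12'  -- December only
GROUP BY T.type
HAVING COUNT(R.rating) >= ?
ORDER BY avg_rating DESC
"""
rows = conn.execute(query, (MIN_RATINGS,)).fetchall()
print(rows)
```

Adventure drops out here because only one of its ratings falls in December.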
You can try the following query.
SELECT DISTINCT T.type, AVG(R.rating) FROM types AS T
INNER JOIN ratings AS R ON T.movieId = R.movieId
GROUP BY T.type
HAVING
COUNT(R.rating) >= 10 -- have 10 or more rating records
AND SUM(MONTH(FROM_UNIXTIME(R.timestamp)) = 12) > 0 -- have at least one rating in December
ORDER BY AVG(R.rating) DESC;
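Note that a HAVING-based December check keeps every rating of a qualifying type in the count and the average, so it can return different numbers than a WHERE-based filter. A sketch of that behaviour, again assuming SQLite as a stand-in for MySQL, with made-up rows and a hypothetical threshold of 2:

```python
import sqlite3
from datetime import datetime, timezone


def ts(year, month, day):
    """Unix timestamp (UTC) for a date; helper for the made-up sample rows."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE types (movieId INTEGER, type TEXT);
CREATE TABLE ratings (movieId INTEGER, rating REAL, timestamp INTEGER);
INSERT INTO types VALUES (1, 'Drama'), (2, 'Adventure');
""")
conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", [
    (1, 4.0, ts(2010, 12, 5)),
    (1, 3.0, ts(2011, 12, 20)),
    (1, 1.0, ts(2011, 6, 1)),
    (2, 5.0, ts(2010, 12, 7)),
    (2, 2.0, ts(2010, 3, 7)),
])

query = """
SELECT T.type, AVG(R.rating) AS avg_rating
FROM types AS T
JOIN ratings AS R ON T.movieId = R.movieId
GROUP BY T.type
HAVING COUNT(R.rating) >= ?                                    -- all ratings counted
   AND SUM(strftime('%m', R.timestamp, 'unixepoch') = '12') > 0  -- at least one in December
ORDER BY avg_rating DESC
"""
rows = conn.execute(query, (2,)).fetchall()
print(rows)
```

Here both types qualify and the averages include the non-December ratings, so which form is right depends on what "ratings made in December" is meant to restrict.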

compare mysql numeric values group_concat of two columns with join

I have 3 tables:
users
user_id nationality
1 Egyptian
2 Palestinian
3 French
centers
id center_name
1 q
12 y
5 x
23 z
centers_users
student_id center_id
1 12
2 5
3 5
1 23
2 12
What I expect:
Nationality center_name count_of_users_from_this_country
Egyptian y,z 10
Palestinian x,y 33
French x,q 7
I have tried many mysql queries but I cannot get the result I want
Final query I execute:
SELECT * FROM (
SELECT (LENGTH(GROUP_CONCAT(DISTINCT user_id)) - LENGTH(REPLACE(GROUP_CONCAT(DISTINCT user_id), ',', ''))) AS ss,
GROUP_CONCAT(DISTINCT user_id), nationality
FROM user
WHERE user_id IN (SELECT student_id FROM `centers_users`)
GROUP BY nationality) a
But I only get the count per nationality.
When I join with centers I get duplicated rows, because I cannot use GROUP_CONCAT in an ON condition.
How can I implement it?
Thanks.
I think you want to join the tables and aggregate:
select u.nationality,
group_concat(distinct c.center_name) as center_names,
count(distinct u.user_id) as users_from_this_country
from users u join
centers_users uc
on u.user_id = uc.student_id join
centers c
on c.id = uc.center_id
group by u.nationality;
You may be able to use count(*) for users_from_this_country. It depends on how you want to count a user who is in multiple centers in the same country.
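The difference between the two counts can be seen on the sample rows from the question. This is a sketch assuming SQLite in place of MySQL (its group_concat also accepts DISTINCT with a single argument); the joined_rows column and the result dictionary are illustrative additions.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER, nationality TEXT);
CREATE TABLE centers (id INTEGER, center_name TEXT);
CREATE TABLE centers_users (student_id INTEGER, center_id INTEGER);
INSERT INTO users VALUES (1, 'Egyptian'), (2, 'Palestinian'), (3, 'French');
INSERT INTO centers VALUES (1, 'q'), (12, 'y'), (5, 'x'), (23, 'z');
INSERT INTO centers_users VALUES (1, 12), (2, 5), (3, 5), (1, 23), (2, 12);
""")

query = """
SELECT u.nationality,
       GROUP_CONCAT(DISTINCT c.center_name) AS center_names,
       COUNT(DISTINCT u.user_id) AS users_from_this_country,
       COUNT(*) AS joined_rows  -- one per (user, center) pair
FROM users AS u
JOIN centers_users AS cu ON u.user_id = cu.student_id
JOIN centers AS c ON c.id = cu.center_id
GROUP BY u.nationality
"""
# Concatenation order is not guaranteed, so the names are sorted for display.
result = {
    nationality: (sorted(names.split(",")), n_users, n_rows)
    for nationality, names, n_users, n_rows in conn.execute(query)
}
for nationality, summary in sorted(result.items()):
    print(nationality, summary)
```

A user enrolled in two centers contributes two joined rows but only one distinct user, which is why COUNT(*) and COUNT(DISTINCT ...) can disagree.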

SQL Query to return rows where a column value appears multiple times

I'm creating a simple database which will allow me to track snooker results, producing head to head results between players. Currently I have 3 tables: (Player, Fixture, Result)
PlayerID PlayerName
1 Michael Abraham
2 Ben Mullen
3 Mark Crozier
FixtureID Date TableNo Group
1 07/12/2015 19:00:00 12 0
2 08/12/2015 12:00:00 9 0
ResultID FixtureID PlayerID FramesWon
1 1 1 3
2 1 3 1
3 2 1 5
4 2 2 1
I would like a query which returns all rows in the result table for fixtures which took place between players 1 and 3. Currently my query is:
SELECT *
FROM Result
WHERE PlayerID IN (1,3);
This returns the first 3 rows of the result table - when I'm only looking for the top 2 rows because they share the same FixtureID. Is there an easy way to remove the third row from this query result, or should I reconsider my database design? Any help would be appreciated.
One solution is to use a GROUP BY query, grouping by FixtureID and counting the rows for each FixtureID. This query will select all FixtureIDs with both players 1 and 3:
select
FixtureID
from
Result
where
PlayerID IN (1,3)
group by
FixtureID
having
count(*)=2
Then, to get the rows from the Result table, you can use this query:
select *
from Result
where FixtureID IN (
select FixtureID
from Result
where PlayerID IN (1,3)
group by FixtureID
having count(*)=2
)
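The grouping approach can be tried on the sample rows with the sketch below, which assumes SQLite in place of MySQL. As a small variation it uses COUNT(DISTINCT PlayerID) instead of COUNT(*), an assumption that guards against a player appearing twice in the same fixture.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Result (ResultID INTEGER, FixtureID INTEGER,
                     PlayerID INTEGER, FramesWon INTEGER);
INSERT INTO Result VALUES (1, 1, 1, 3), (2, 1, 3, 1), (3, 2, 1, 5), (4, 2, 2, 1);
""")

query = """
SELECT *
FROM Result
WHERE FixtureID IN (
    -- fixtures in which both player 1 and player 3 have a row
    SELECT FixtureID
    FROM Result
    WHERE PlayerID IN (1, 3)
    GROUP BY FixtureID
    HAVING COUNT(DISTINCT PlayerID) = 2
)
ORDER BY ResultID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only the two rows of fixture 1 come back; the row for fixture 2 is excluded because player 3 did not play in it.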
You could join the Result table to itself, like this:
select
R1.*
from
Result as R1
join Result as R2 on R1.FixtureID = R2.FixtureID
where
R1.PlayerID in (1,3)
AND R2.PlayerID in (1,3)
AND R1.PlayerID != R2.PlayerID
;
Or, to get one row per fixture, laid out the way a snooker score is often displayed:
select
R1.FixtureID, R1.PlayerID as player1, R1.FramesWon as player1_frames, R1.FramesWon+R2.FramesWon as total_frames, R2.FramesWon as player2_frames, R2.PlayerID as player2
from
Result as R1
join Result as R2 on R1.FixtureID = R2.FixtureID
where
R1.PlayerID in (1,3)
AND R2.PlayerID in (1,3)
AND R1.PlayerID < R2.PlayerID
;
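A self-join that yields one head-to-head line per fixture can be sketched as follows, assuming SQLite in place of MySQL. Pinning R1 to player 1 and R2 to player 3 is a choice made for this example so that each fixture produces exactly one row.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Result (ResultID INTEGER, FixtureID INTEGER,
                     PlayerID INTEGER, FramesWon INTEGER);
INSERT INTO Result VALUES (1, 1, 1, 3), (2, 1, 3, 1), (3, 2, 1, 5), (4, 2, 2, 1);
""")

query = """
SELECT R1.FixtureID,
       R1.PlayerID AS player1,
       R1.FramesWon AS player1_frames,
       R1.FramesWon + R2.FramesWon AS total_frames,
       R2.FramesWon AS player2_frames,
       R2.PlayerID AS player2
FROM Result AS R1
JOIN Result AS R2 ON R1.FixtureID = R2.FixtureID
WHERE R1.PlayerID = 1   -- left side of the score line
  AND R2.PlayerID = 3   -- right side of the score line
"""
score_lines = conn.execute(query).fetchall()
print(score_lines)
```

Fixture 2 produces no row because no Result row for player 3 exists in it.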

SQL Incorrect SUMS from multiple JOINS

I'm trying to sum multiple tables using Joins and Sums in MySQL and not having much success.
My Tables (Unnecessary Columns Removed)
Students
idStudent studentname studentyear
1 foobar 11
2 barfoo 11
3 thing 8
Athletics_Results
idResult idStudent points
1 1 14
2 1 11
3 3 7
4 2 9
Team_Results
idTeamResults year points
1 11 9
2 8 8
3 7 14
So let me explain about the tables, because I admit they're poorly named and designed.
Students holds the basic info about each student, including their year and name. Each student has a unique ID.
Athletics_Results stores the results from athletics events. The idStudent column is a foreign key and relates to idStudent in the Students table. So student foobar (idStudent 1) has scored 14 and 11 points in the example.
Team_Results stores results from events that more than one student took part in. It just stores the year group and points.
The Aim
I want to be able to produce a sum of points for each year - combined from both athletics_results and team_results. EG:
year points
7 14 <-- No results in a_r, just 14 points in t_r
8 15 <-- 7 points in a_r (idResult 3) and 8 in t_r
11 43 <-- 14, 11, 9 points in a_r and 9 in t_r
What I've tried
For testing purposes, I've not tried combining the a_r scores and t_r scores yet but left them as two columns so I can see what's going on.
The first query I tried:
SELECT students.studentyear as syear, SUM(athletics_results.points) as score, SUM(team_results.points) as team_score
FROM students
JOIN team_results ON students.studentyear = team_results.year
JOIN athletics_results ON students.idStudent = athletics_results.idStudent
GROUP BY syear;
This gave different rows for each year (as desired) but had incorrect SUMS. I learnt this was due to not grouping the joins.
I then created this code:
SELECT studentyear as sYear, teamPoints, AthleticsPoints
FROM students st
JOIN (SELECT year, SUM(tm.points) as teamPoints
FROM team_results tm
GROUP BY year) tr ON st.studentyear = tr.year
JOIN (SELECT idStudent, SUM(atr.points) as AthleticsPoints
FROM athletics_results atr
) ar ON st.idStudent = ar.idStudent
Which gave correct SUMS but only returned one year group row (e.g the scores for Year 11).
EDIT - SQLFiddle here: http://sqlfiddle.com/#!9/dbc16/. This is with my actual test data which is a bigger sample than the data I posted here.
http://sqlfiddle.com/#!9/ad111/7
SELECT tr.`year`, COALESCE(tr.points,0)+COALESCE(SUM(ar.points),0)
FROM Team_Results tr
LEFT JOIN Students s
ON tr.`year`=s.studentyear
LEFT JOIN Athletics_Results ar
ON s.idStudent = ar.idStudent
GROUP BY tr.year
According to your comment and fiddle provided
check http://sqlfiddle.com/#!9/dbc16/3
SELECT tr.`year`, COALESCE(tr.points,0)+COALESCE(SUM(ar.points),0)
FROM (
SELECT `year`, SUM(points) as points
FROM Team_Results
GROUP BY `year`) tr
LEFT JOIN Students s
ON tr.`year`=s.studentyear
LEFT JOIN Athletics_Results ar
ON s.idStudent = ar.idStudent
GROUP BY tr.year
Try this http://sqlfiddle.com/#!9/2bfb1/1/0
SELECT
year, SUM(points)
FROM
((SELECT
a.year, SUM(b.points) AS points
FROM
student a
JOIN at_result b ON b.student_id = a.id
GROUP BY a.year) UNION ALL (SELECT
a.year, SUM(a.points) AS points
FROM
t_result a
GROUP BY a.year)) c
GROUP BY year;
On your data I get:
year points
7 14
8 15
11 43
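To see both the inflated sums and the per-source aggregation on the sample data from the question, here is a sketch assuming SQLite in place of MySQL. It uses UNION ALL rather than UNION so that two sources that happen to produce the same (year, points) pair are not merged into one row; the variable names naive and combined are illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (idStudent INTEGER, studentname TEXT, studentyear INTEGER);
CREATE TABLE Athletics_Results (idResult INTEGER, idStudent INTEGER, points INTEGER);
CREATE TABLE Team_Results (idTeamResults INTEGER, year INTEGER, points INTEGER);
INSERT INTO Students VALUES (1, 'foobar', 11), (2, 'barfoo', 11), (3, 'thing', 8);
INSERT INTO Athletics_Results VALUES (1, 1, 14), (2, 1, 11), (3, 3, 7), (4, 2, 9);
INSERT INTO Team_Results VALUES (1, 11, 9), (2, 8, 8), (3, 7, 14);
""")

# Joining both result tables first repeats each team row once per athletics row.
naive = conn.execute("""
SELECT s.studentyear, SUM(a.points), SUM(t.points)
FROM Students s
JOIN Team_Results t ON s.studentyear = t.year
JOIN Athletics_Results a ON s.idStudent = a.idStudent
GROUP BY s.studentyear
ORDER BY s.studentyear
""").fetchall()

# Aggregating each source per year, then stacking and summing, avoids that.
combined = conn.execute("""
SELECT year, SUM(points)
FROM (
    SELECT s.studentyear AS year, SUM(a.points) AS points
    FROM Students s
    JOIN Athletics_Results a ON a.idStudent = s.idStudent
    GROUP BY s.studentyear
    UNION ALL
    SELECT year, SUM(points)
    FROM Team_Results
    GROUP BY year
)
GROUP BY year
ORDER BY year
""").fetchall()

print(naive)
print(combined)
```

In the naive query the single team result for year 11 (9 points) is counted three times, and year 7 disappears because it has no students.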
Can be done in multiple ways. My first thought is:
SELECT idStudent, year, SUM(points) AS totalPoints FROM (
SELECT a.idStudent, c.year, b.points+c.points AS points
FROM students a
INNER JOIN Athletics_Results b ON a.idStudent=b.idStudent
INNER JOIN Team_Results c ON a.studentyear=c.year) d
GROUP BY idStudent,year

MySQL join with a subquery

I have three tables and am trying to get info from two and then perform a calculation on the third and display all the results in one query.
The (simplified) tables are:
table: employee_work
employee_id name
1 Joe
2 Bob
3 Jane
4 Michelle
table: carryover
employee_id days
1 5
2 10
3 3
table: timeoff
employee_id time_off_type days
1 Carryover 2
1 Leave 3
1 Carryover 1
2 Sick 4
2 Carryover 4
3 Leave 1
4 Sickness 4
The results I would like are:
employee_id, carryover.days, timeoff.days
1 5 3
2 10 4
3 3 0
However when I run the query, whilst I get the correct values in columns 1 and 2, I get the same number repeated in the third column for all entries.
Here is my query:
Select
employee_work.employee_id,
carryover.carryover,
(SELECT SUM(days) FROM timeoff WHERE timeoff.time_off_type = 'Carryover'
AND timeoff.start_date>='2013-01-01') AS taken
From
carryover Left Join
employee_work On employee_work.employee_id = carryover.employee_id Left Join
timeoff On employee_work.employee_id = timeoff.employee_id
Where
carryover.carryover > 0
Group By
employee_work.employee_id
I have tried to group by in the sub query but I then get told "Subquery returns more than one row" - how can I ensure that the sub query is respecting the join so it only looks at each employee at a time so I get my desired results?
The answer to your question is to use a correlated subquery. You don't need to mention the timeoff table twice in this case:
Select
employee_work.employee_id,
carryover.carryover,
(SELECT SUM(days)
FROM timeoff
WHERE timeoff.time_off_type = 'Carryover' and
timeoff.start_date>='2013-01-01' and
timeoff.employee_id = employee_work.employee_id
) AS taken
From
carryover Left Join
employee_work On employee_work.employee_id = carryover.employee_id
Where
carryover.carryover > 0
Group By
employee_work.employee_id;
An alternative structure is to do the grouping for all employees in the from clause. You can also remove the employee_work table, because it does not seem to be used. (You can use carryover.employee_id for the id.)
Select co.employee_id, co.carryover, et.taken
From carryover co Left Join
(SELECT employee_id, SUM(days) as taken
FROM timeoff
WHERE timeoff.time_off_type = 'Carryover' and
timeoff.start_date>='2013-01-01'
GROUP BY employee_id
) et
on co.employee_id = et.employee_id
Where co.carryover > 0;
I don't think the group by is necessary. If it is, then you should probably have an aggregation function in the original query.
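Both shapes, the correlated subquery and the grouped derived table, can be compared on the simplified tables from the question. This sketch assumes SQLite in place of MySQL, uses the days column shown in the simplified carryover table, leaves out the start_date filter because the sample data has no such column, and adds COALESCE (an assumption) so that employees with nothing taken show 0.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE carryover (employee_id INTEGER, days INTEGER);
CREATE TABLE timeoff (employee_id INTEGER, time_off_type TEXT, days INTEGER);
INSERT INTO carryover VALUES (1, 5), (2, 10), (3, 3);
INSERT INTO timeoff VALUES
    (1, 'Carryover', 2), (1, 'Leave', 3), (1, 'Carryover', 1),
    (2, 'Sick', 4), (2, 'Carryover', 4), (3, 'Leave', 1), (4, 'Sickness', 4);
""")

# Correlated subquery: evaluated once per carryover row.
correlated = conn.execute("""
SELECT co.employee_id, co.days,
       COALESCE((SELECT SUM(t.days)
                 FROM timeoff t
                 WHERE t.time_off_type = 'Carryover'
                   AND t.employee_id = co.employee_id), 0) AS taken
FROM carryover co
WHERE co.days > 0
ORDER BY co.employee_id
""").fetchall()

# Derived table: aggregate once per employee, then left join.
derived = conn.execute("""
SELECT co.employee_id, co.days, COALESCE(et.taken, 0) AS taken
FROM carryover co
LEFT JOIN (SELECT employee_id, SUM(days) AS taken
           FROM timeoff
           WHERE time_off_type = 'Carryover'
           GROUP BY employee_id) et
       ON co.employee_id = et.employee_id
WHERE co.days > 0
ORDER BY co.employee_id
""").fetchall()

print(correlated)
print(derived)
```

The GROUP BY inside the derived table is what keeps the sum per employee; without it the subquery collapses to a single total.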